vieriazeynana

HashPump – Exploit Hash Length Extension Attack: Vulnerable Algorithms and Countermeasures



A while back, my friend @mogigoma and I were doing a capture-the-flag contest at -ctf.com. One of the levels of the contest required us to perform a hash length extension attack. I had never even heard of the attack at the time, and after some reading I realized that not only is it a super cool (and conceptually easy!) attack to perform, there is also a total lack of good tools for performing said attack! After hours of adding the wrong number of null bytes or incorrectly adding length values, I vowed to write a tool to make this easy for myself and anybody else who's trying to do it. So, after a couple of weeks of work, here it is!








Now I'm gonna release the tool, and hope I didn't totally miss a good tool that does the same thing! It's called hash_extender, and implements a length extension attack against every algorithm I could think of:


An application is susceptible to a hash length extension attack if it prepends a secret value to a string, hashes it with a vulnerable algorithm, and entrusts the attacker with both the string and the hash, but not the secret. Then, the server relies on the secret to decide whether or not the data returned later is the same as the original data.
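To make that pattern concrete, here is a minimal sketch of the vulnerable construction in Python. The secret value and the function names `sign` and `verify` are assumptions made up for illustration, and SHA-1 simply stands in for any of the vulnerable algorithms.

```python
import hashlib

# Assumption: a made-up secret that only the server knows.
SECRET = b"hypothetical-secret"

def sign(data: bytes) -> str:
    # Vulnerable pattern: the secret is prepended and the result is hashed
    # directly with a length-extendable algorithm.
    return hashlib.sha1(SECRET + data).hexdigest()

def verify(data: bytes, signature: str) -> bool:
    # The server recomputes the hash and trusts the data if it matches.
    return sign(data) == signature

data = b"count=10"
signature = sign(data)           # both values are handed to the client
print(verify(data, signature))
```

The client holds `data` and `signature` but never `SECRET`, which is exactly the situation the attack needs.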


It turns out that even though the attacker doesn't know the value of the prepended secret, he can still generate a valid hash for the secret followed by the original data, the hash's own padding, and data of his choosing! This is done by simply picking up where the hashing algorithm left off; 100% of the state needed to continue a hash is in the output of most hashing algorithms! We simply load that state into the appropriate hash structure and continue hashing.
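As a rough illustration of "loading the state and continuing", the sketch below implements the SHA-1 compression function in pure Python and resumes hashing from a known digest. It is not the code of hash_extender or HashPump; the names `extend`, `_compress` and `_padding` are made up for this example, and only SHA-1 is handled.

```python
import hashlib
import struct

def _rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def _compress(state, block):
    # One round of the SHA-1 compression function over a 64-byte block.
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        a, b, c, d, e = (_rotl(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF, a, _rotl(b, 30), c, d
    return [(s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e))]

def _padding(total_len):
    # SHA-1 padding for a message of total_len bytes (length stored in bits).
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def extend(digest_hex, original, append, key_len):
    """Return (new_digest_hex, new_message) without knowing the key."""
    glue = _padding(key_len + len(original))
    # The digest is the internal state after the last block: load it back.
    state = list(struct.unpack(">5I", bytes.fromhex(digest_hex)))
    processed = key_len + len(original) + len(glue)
    tail = append + _padding(processed + len(append))
    for i in range(0, len(tail), 64):
        state = _compress(state, tail[i:i + 64])
    return struct.pack(">5I", *state).hex(), original + glue + append

key = b"KEYKEYKEY"                      # only used here to check the result
data = b"original_data"
sig = hashlib.sha1(key + data).hexdigest()
new_sig, new_msg = extend(sig, data, b"data_to_add", len(key))
print(hashlib.sha1(key + new_msg).hexdigest() == new_sig)
```

Note that `extend` only ever sees the digest, the known data, and the key length; the key itself is used solely to confirm that the forged digest matches what the server would compute.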


The server sends data and signature to the attacker. The attacker guesses that H is MD5 simply by its length (it's the most common 128-bit hashing algorithm), based on the source, or the application's specs, or any way they are able to.


With most algorithms (including MD4, MD5, RIPEMD-160, SHA-0, SHA-1, and SHA-256), the string is padded until its length is congruent to 56 bytes (mod 64). Or, to put it another way, it's padded until the length is 8 bytes less than a full (64-byte) block (the 8 bytes being the size of the encoded length field). There are two hashes implemented in hash_extender that don't use these values: SHA-512 uses a 128-byte blocksize and reserves 16 bytes for the length field, and WHIRLPOOL uses a 64-byte blocksize and reserves 32 bytes for the length field.
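The padding rule can be sketched as a small Python helper. The function name `md_padding` and its parameters are assumptions for illustration; the byte order of the length field differs between families (little-endian for MD4, MD5 and RIPEMD-160, big-endian for the SHA family), so it is left as a parameter.

```python
def md_padding(message_len: int, block_size: int = 64,
               length_bytes: int = 8, byteorder: str = "big") -> bytes:
    """Padding a hash would append to a message of message_len bytes."""
    # One 0x80 byte, then enough zero bytes to leave exactly
    # length_bytes free at the end of the final block.
    zeros = (block_size - length_bytes - 1 - message_len) % block_size
    # The length field stores the message length in bits, not bytes.
    length_field = (message_len * 8).to_bytes(length_bytes, byteorder)
    return b"\x80" + b"\x00" * zeros + length_field

# SHA-1 style padding for a 22-byte message (for example a 9-byte key plus 13 bytes of data).
print(md_padding(22).hex())
# SHA-512 style: 128-byte blocks with a 16-byte length field.
print(len(md_padding(22, block_size=128, length_bytes=16)))
```

Getting the bit-versus-byte length and the byte order right is where most hand-rolled attempts go wrong.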


This example took me hours to write. Why? Because I made about a thousand mistakes writing the code. Too many NUL bytes, not enough NUL bytes, wrong endianness, wrong algorithm, used bytes instead of bits for the length, and all sorts of other stupid problems. The first time I worked on this type of attack, I spent from 2300h till 0700h trying to get it working, and didn't figure it out till after sleeping (and with Mak's help). And don't even get me started on how long it took to port this attack to MD5. Endianness can die in a fire.


Arguments:
    hexdigest(str): Hex-encoded result of hashing key + original_data.
    original_data(str): Known data used to get the hash result hexdigest.
    data_to_add(str): Data to append
    key_length(int): Length of unknown data prepended to the hash

Returns:
    A tuple containing the new hex digest and the new message.

>>> hashpumpy.hashpump('ffffffff', 'original_data', 'data_to_add', len('KEYKEYKEY'))
('e3c4a05f', 'original_datadata_to_add')


For this to work, however, the extension attack must add junk data in between the original data and the appended payload. Remember the note from earlier? The last line of the original file is commented out, so the junk data will also be commented out. If our payload starts with a newline, our payload file will be both valid and executed.


The basic idea of hashing is to take the data provided to the method and create a standardized output regardless of what the source data is. This usually means creating a string of characters from a certain set (usually hexadecimal digits) that is the same length regardless of the input. For example, MD5 hashes are always 32 hexadecimal characters long and SHA-1 hashes are always 40.
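A quick way to see the fixed output length is to hash inputs of very different sizes with Python's standard `hashlib` module. The helper name `digest_lengths` is made up for this sketch.

```python
import hashlib

def digest_lengths(message: bytes) -> tuple:
    # Length of the hex-encoded MD5 and SHA-1 digests of the same input.
    return (len(hashlib.md5(message).hexdigest()),
            len(hashlib.sha1(message).hexdigest()))

for message in (b"", b"a", b"a" * 100000):
    print(len(message), digest_lengths(message))
```

Every line reports the same pair of lengths, whether the input is empty or 100,000 bytes long.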


This padding mostly consists of null bytes (0x00 in hex notation) and is added internally during the hash function's processing. The padding starts with a 0x80 byte and is then filled out with null bytes, leaving room for an 8-byte field at the end of the data that records the length of the message (in bits). You as a developer don't have to think about this processing; it just happens and you're provided the resulting hash of this combined data. The reason it's important to know, however, is that this knowledge is required to understand the attack.


Much like you'd expect from an attack with "length extension" in its name, the issue has to do with the length of the data involved in the processing. Since we now know that our hashes are padded out with a 0x80 byte, null bytes, and a length field, and that no checking is done on the input as a part of the hashing, it's possible to reproduce that padding yourself, add it to the end of the original content, and extend the length of the data being hashed. Because of how the hashing methods work, this can be used to append arbitrary content to the data used behind the scenes for the hashing. The hash processing sees this extra data and uses it as a part of the overall data; nothing in the original data is overwritten, but later values can take precedence over earlier ones depending on how the application parses the result. This means an attacker doesn't even need to know the secret to be able to produce a valid hash for the data they want.


In this example we've used a key length of 14, but you'll probably need to do a bit of testing if you're using this against a system with an unknown secret. The key length you pass in has to match the actual length of the secret for the attack to succeed.


Hashpump doesn't automatically go through the key lengths for you (other tools include that functionality), so you'll need to have some kind of wrapper around it to run through various lengths and try each. For my testing I wrote a quick Python script that did the hard work of calling hashpump and making the URL requests to determine the success or failure of each attempt. Here's that code:


In this code I'm calling the hashpump command line tool with various key lengths with a starting range of 1...10. The output from the command is pulled in, the padding is URL encoded and the Python requests library is used to make the request to the new URL. The range is there because, as an outsider, we wouldn't know what the secret value is so we need to run several tests to guess. Hopefully the secret is nice and short like ours but you never know. You might have to make the range quite a bit higher to find the sweet spot.
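A wrapper along those lines might look like the following minimal sketch. It is not the script described above: instead of shelling out to the hashpump command and using requests, it takes two callables as parameters so the loop itself stays self-contained. `extend` is assumed to behave like `hashpumpy.hashpump` (returning the new digest and the new message as bytes), and `is_accepted` is a hypothetical function that would send the forged values to the target and report whether they were accepted.

```python
from urllib.parse import quote

def find_key_length(extend, is_accepted, signature, data, extra,
                    lengths=range(1, 11)):
    """Try each candidate key length until a forged request is accepted.

    extend:      callable(hexdigest, original_data, data_to_add, key_length)
                 returning (new_digest, new_message)
    is_accepted: hypothetical callable(url_encoded_message, new_digest)
                 returning True when the server accepts the forgery
    """
    for key_length in lengths:
        new_digest, new_message = extend(signature, data, extra, key_length)
        # Percent-encode the message so the 0x80 and NUL padding bytes
        # survive being placed in a URL.
        if is_accepted(quote(new_message), new_digest):
            return key_length, new_digest, new_message
    return None  # no length in the range worked; widen the range and retry
```

In practice `is_accepted` would build the URL, make the request, and inspect the response status or body; how success is detected depends entirely on the target application.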


So what happened here? Well, when we make the request to that URL with the modified version of the value of to with all of the padding and the right length, the hash extension attack kicks in. As a result the null padding is appended to the current secret, to and from values and a new hash is created - an exact match for the one we provided as the signature: e07e769759795a082765b2b8c813015e95d45f6. Normally the matching hash for a to value of folder2 and a from value of folder1 would be e2dd854e78d8758bde1eeb3d2e640c3d0ef24611.


You can see where this is going, right? Well, this signature was an MD5 hash of the secret and the current values from the URL. This should sound pretty familiar from our previous examples. There was a bit more involved than just the hash length extension, but it definitely played a major role. With the MD5 hashing method being used and the user able to manipulate the values used in the hash directly, it's no shock that this bypass was discovered. Two researchers, Thai Duong and Juliano Rizzo, released a report of their findings and how the bug could be exploited (after they talked with Flickr/Yahoo first, naturally).


There was another issue coupled with the hash length extension that made it even worse. Because of how they were using the URL values (removing the delimiters) it was possible to override a URL value with one of your own. The hash length extension attack then allowed you to trick the server into generating the hash you want and giving it the signature it expects.


One of the easiest paths to migrate to is an HMAC hash. This hashing method uses a different process than those previously mentioned to compute the resulting hash. It was designed to allow for the verification of both the integrity and the authenticity of a message (as a signature). While the same hashing types mentioned before can be used along with it, the process used to generate the hash is no longer vulnerable to this padding attack. If you're using PHP, there's already a built-in function to help make using HMAC hashes easy, hash_hmac:
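For readers working in Python rather than PHP, the standard library's `hmac` module plays a similar role. The sketch below is only an illustration: the secret and the function name `sign` are assumptions, and SHA-256 is an arbitrary choice of underlying hash.

```python
import hashlib
import hmac

# Assumption: a made-up secret known only to the server.
SECRET = b"hypothetical-secret"

def sign(data: bytes) -> str:
    # HMAC mixes the key into an inner and an outer hash, so the digest
    # is not a resumable state of hashing "secret + data".
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()

print(sign(b"from=folder1&to=folder2"))
```

The call site looks almost the same as the naive prefix construction, which is part of why HMAC is such an easy migration path.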


One last recommendation that's not specifically related to this attack but is a good practice when it comes to comparing hashes: always use a timing-safe comparison. If you're in PHP that means using hash_equals rather than the double- or triple-equals (== or ===). This prevents an attacker from gaining any additional information about valid versus invalid hashes based on timing on the request.
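The Python counterpart to hash_equals is `hmac.compare_digest`. Here is a minimal sketch of a verification routine that uses it; the function name `verify` and its parameters are made up for illustration.

```python
import hashlib
import hmac

def verify(secret: bytes, data: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, data, hashlib.sha256).hexdigest()
    # compare_digest is designed not to short-circuit on the first
    # mismatching character the way == can.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())

secret = b"hypothetical-secret"   # assumption: example value only
good = hmac.new(secret, b"to=folder2", hashlib.sha256).hexdigest()
print(verify(secret, b"to=folder2", good))
print(verify(secret, b"to=folder3", good))
```

Both values are encoded to bytes before comparison so that an unexpected non-ASCII signature is simply rejected rather than raising an error.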


For more information about hashing, hash length extension vulnerabilities and an interesting write-up of when Stripe included it in one of their Capture the Flag contests check out the Resources section below.


I'll try to explain how the attack changes. Because the insert into the users table doesn't work, I moved to another approach and concentrated on the available query, exploiting the error on the insert in order to identify the password (secret) of the admin character by character. The string I'm sending may look complex, but the concept is really simple. I'm telling the query to insert a blob value of size 3000000000 into the message field if the secret satisfies the condition (secret like '4%'), and of size 30 otherwise. If you guess the character correctly, the query returns an error because the field is too small to contain the value; as long as you receive an OK message, the character is wrong. Put this process in a loop that cycles through all the characters and, after a small brute force, you'll have the admin's full secret. I used the same technique to identify the name of the administrator user.

