As blockchain and crypto enthusiasts, you deal with them a lot: hashes.
Your wallet address, a transaction id, a block id are all outputs of hashing functions. You’ll recognize them as a seemingly random set of characters; however, a hash isn’t random at all. The value makes little sense to a human reader because it’s intended to be read and interpreted by a computer.
While a hashing function is a cryptographic function, it’s not encryption.
Encryption works with a system where you have an input, an encryption formula, a key, and an encrypted output. By knowing the encrypted output and the key, you can decrypt it and obtain the original message. A hashing function, in contrast to encryption, works as a one-way function: you are not able to find the original message when the only thing you know is the hash.
While the input data can have any length, from a single character to the whole content of the 33 million books of the US Library of Congress, the output is always a discrete number of fixed length: 20 bytes for SHA-1, and 32 or 64 bytes for SHA-2 (SHA-256 and SHA-512). That’s a number between 0 and 2^256 - 1 (roughly 1.16 × 10^77) for a 32-byte hash and between 0 and 2^512 - 1 (roughly 1.34 × 10^154) for a 64-byte hash. There are far more possibilities than all the grains of sand in 10 Sahara deserts.
- Two different inputs can result in the same hash (a collision), but for a secure hashing function the probability of this event is so small that it doesn’t matter in practice.
- The fixed length of a hashing function’s output is provided by compression functions, which are a part of the hashing algorithm.
As with a blockchain itself, a hashing function is underpinned by the determinism principle: the same input will always return the same hashed result. An important aspect of secure hashing functions is that if you change only a single character of the input, the resulting hash will be something completely different. At the same time, two similar-looking hashes tell you nothing about how similar their inputs are; the inputs will usually be vastly different.
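The fixed output length is easy to check with Python's standard `hashlib` module. The sketch below is only an illustration; the helper name `digest_sizes` and the sample inputs are assumptions made for this example.

```python
import hashlib

def digest_sizes(data: bytes) -> tuple:
    """Return the digest lengths in bytes for SHA-1, SHA-256 and SHA-512."""
    return (
        len(hashlib.sha1(data).digest()),
        len(hashlib.sha256(data).digest()),
        len(hashlib.sha512(data).digest()),
    )

# Inputs of very different lengths: one byte, a short phrase, one megabyte.
for data in (b"a", b"hello world", b"x" * 1_000_000):
    print(len(data), "input bytes ->", digest_sizes(data))
```

Whatever the input length, each algorithm returns a digest of the same size: 20, 32 and 64 bytes respectively.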
A simple change in the input leads to a completely different hash
For example, the blockchain-based project LTO Network uses SHA-2 as its hashing algorithm. SHA is the abbreviation of “Secure Hash Algorithm”. It’s not a single hashing algorithm, but a family of functions, of which only SHA-2 and SHA-3 are recommended for use today. This family of hashing functions is standardized by NIST, the U.S. National Institute of Standards and Technology.
An output of a hashing function is always a discrete number described as a set of binary units: zeros and ones. Typically the hash is shown in hexadecimal form. So rather than having digits from 0 to 9, the digits run from 0 to f (where “a” is 10 and “f” is 15). That’s why, even though a transaction id or wallet address is just a number, you can spot some letters in it too.
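This behaviour can be observed with Python's standard `hashlib` module. The sketch below hashes two strings that differ in a single character and counts how many output bits differ; the helper `differing_bits` and the sample strings are assumptions made for this illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# The two inputs differ only in their last character.
h1 = hashlib.sha256(b"hello world").digest()
h2 = hashlib.sha256(b"hello worle").digest()

print(h1.hex())
print(h2.hex())
print(differing_bits(h1, h2), "of 256 bits differ")

# Determinism: hashing the same input again gives the same result.
print(hashlib.sha256(b"hello world").digest() == h1)
```

For a well-behaved hashing function, roughly half of the output bits are expected to change, which is why the two hex strings look unrelated.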
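To see that a hash really is just a number with several notations, the following sketch (using Python's standard `hashlib`; the variable names are chosen only for this example) converts one SHA-256 digest into an integer, a hexadecimal string and a binary string.

```python
import hashlib

digest = hashlib.sha256(b"hello world").digest()  # 32 raw bytes

as_int = int.from_bytes(digest, "big")   # the hash as one large integer
as_hex = digest.hex()                    # 64 hexadecimal digits (0-9, a-f)
as_bin = format(as_int, "0256b")         # 256 zeros and ones

print(as_hex)
print(len(as_hex), "hex digits,", len(as_bin), "bits")
print(int(as_hex, 16) == as_int)         # same number, different notation
```

Each hexadecimal digit stands for four bits, which is why a 256-bit hash is shown as 64 characters.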
Do you want to know more?
- Go and hash something using the Cryptii tool!
How the SHA algorithm works
SHA-1 bears a striking similarity in structure to the MD4 and MD5 hashing algorithms that were used earlier. SHA-1 produces 160 bits: five 32-bit words of four bytes each (definition from the standard). When hashing with SHA-2, the result is a string of zeroes and ones that is 256 or 512 bits long.
The input data is cut into blocks of 512 or 1024 bits, depending on the SHA-2 variant used (SHA-256 or SHA-512), and a loop feeds the blocks into the hashing function one at a time until the input is exhausted. If the message, including its padding, fits exactly into one block, for example one that is 512 bits long, the hash loop will run only once. This means that the internal state, which becomes the final output of the hashing function, will be updated once. If the message were longer and we needed more loops, each loop would bring a new block of data into the hashing function.
The compression function takes the current internal state and one block of the message, turns them into another set of “n” state values, and repeats this for as long as there is message left. In our case, this will happen only once, since our padded message has exactly the length of one hashing block. This repeated updating of the internal state with a compression function is called the Merkle–Damgård construction.
Because a message rarely fills a whole number of blocks exactly, padding has to be used. Padding means that the space that is left in the last block is filled in: in SHA-2, a single 1 bit is appended to the message, followed by zeros, and the final bits of the block hold the length of the message in binary notation. If there is not enough room left for this, an extra block is added. The padding scheme ensures that messages of different lengths that end in the same way, or in a very similar way, don’t share the same padded form, and thus the same final hash.
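The loop described above can be sketched in a few lines of Python. This is a toy illustration only: `toy_compress`, `md_iterate` and the all-zero initial state `IV` are hypothetical names and values invented for this example, and the stand-in compression function is not the real SHA-256 compression function. The input is assumed to be already cut to whole 64-byte (512-bit) blocks.

```python
import hashlib

BLOCK_SIZE = 64    # bytes per block (512 bits), matching SHA-256's block size
IV = b"\x00" * 32  # hypothetical initial state; real SHA-256 uses fixed constants

def toy_compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function: maps (state, block) to a new 32-byte
    # state. This is NOT the real SHA-256 compression function.
    return hashlib.sha256(state + block).digest()

def md_iterate(padded_message: bytes) -> bytes:
    """Feed whole blocks through the compression function, one at a time."""
    if len(padded_message) % BLOCK_SIZE != 0:
        raise ValueError("message must already be padded to whole blocks")
    state = IV
    for start in range(0, len(padded_message), BLOCK_SIZE):
        block = padded_message[start:start + BLOCK_SIZE]
        state = toy_compress(state, block)  # update the internal state
    return state                            # final state is the hash

one_block = b"A" * 64     # the loop runs once
two_blocks = b"A" * 128   # the loop runs twice
print(md_iterate(one_block).hex())
print(md_iterate(two_blocks).hex())
```

The state has the same size after every pass, which is how a message of any length ends up as a fixed-length output.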
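As an illustration, here is a minimal Python sketch of the padding used by SHA-256: the message is followed by a single 1 bit (the byte 0x80), then zeros, then the message length in bits as a 64-bit big-endian number, so that the total is a multiple of 64 bytes. The function name `sha256_pad` is an assumption made for this example.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message to a multiple of 64 bytes in the style of SHA-256."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                     # a 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)  # zeros up to 8 bytes short of a block
    padded += bit_length.to_bytes(8, "big")        # 64-bit message length in bits
    return padded

for size in (3, 55, 56, 64):
    padded = sha256_pad(b"a" * size)
    print(size, "message bytes ->", len(padded) // 64, "block(s)")
```

A 55-byte message still fits into one block, while a 56-byte message no longer leaves room for the length field and needs a second block. A message of exactly 64 bytes also needs a second block, because the padding is always added.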
Why is SHA-2 more secure than SHA-1?
The difference between SHA-1 and SHA-2 lies in a slightly different compression function and a longer internal state: 256 or 512 bits instead of SHA-1’s 160 bits.
SHA-1’s 160 bits, which is quite long, mean that stumbling across two messages that hash to the same thing takes roughly 2^80 attempts. That’s still a very long time, somewhere around 12 million GPU years. But the compression function of SHA-1 has weaknesses that let attackers reduce this to around 2^60, which is brute-force hackable by people with a lot of time and money.
With the 256 or 512 bits of SHA-2, the starting point is trying to brute force something like 2^128, which is a huge amount more than 2^80 and much, much more than 2^60.
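The numbers above follow from the birthday bound: a generic collision search on an n-bit hash needs about 2^(n/2) attempts. The short Python sketch below only does this arithmetic; the helper name `collision_work` is an assumption made for this example, and 2^60 is the figure quoted in the text.

```python
def collision_work(bits: int) -> int:
    """Approximate attempts for a generic (birthday) collision search."""
    return 2 ** (bits // 2)

sha1_generic = collision_work(160)    # 2**80
sha256_generic = collision_work(256)  # 2**128
sha1_attacked = 2 ** 60               # reduced effort quoted in the text

# How much cheaper the attack on SHA-1 is than a generic search:
print(sha1_generic // sha1_attacked)    # 2**20 = 1048576
# How much harder a generic search on SHA-256 is than one on SHA-1:
print(sha256_generic // sha1_generic)   # 2**48 = 281474976710656
```

In other words, the weakness in SHA-1 makes a collision about a million times cheaper than its output length suggests, while SHA-256 starts from a search space about 2^48 times larger than SHA-1’s generic bound.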