Hash Functions
A hash function is a one-way transformation, related to but distinct from encryption, that takes an input message of any length and produces a fixed-length output called the message digest. The digest is a fixed-size set of bits that serves as a “digital fingerprint” for the original message. If the original message is altered and hashed again, it will, with overwhelming probability, produce a different digest. Thus, hash functions can be used to detect altered and forged documents. They provide message integrity, assuring recipients that the contents of a message have not been altered or corrupted.
Hash functions are one-way, meaning that it is easy to compute the message digest but computationally infeasible to recover the original message from the digest (e.g., imagine trying to put a smashed pumpkin back to exactly the way it was). Hash function features are listed here:
- It should be computationally infeasible to find two different messages that produce the same message digest. Changing a single bit in a message should produce an entirely different message digest.
- It should be computationally infeasible to produce a message that hashes to some desired or predefined output (a target message digest).
- It should be computationally infeasible to reverse the results of a hash function. The digest alone cannot identify its input because a fixed-size message digest could have been produced by an almost infinite number of messages.
- The hash algorithm itself does not need to be kept secret. It is made available to the public. Its security comes from its ability to produce one-way hashes.
- The resulting message digest is a fixed size. A hash of a short message will produce the same size digest as a hash of a full set of encyclopedias.
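The fixed-size and change-sensitivity properties listed above can be observed directly. The following is a minimal sketch using Python's standard `hashlib` module with SHA-1; the sample messages and the helper `differing_bits` are illustrative assumptions, not part of any standard.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Fixed size: a 3-byte input and a 3,000,000-byte input both yield 20 bytes.
short_digest = hashlib.sha1(b"abc").digest()
long_digest = hashlib.sha1(b"abc" * 1_000_000).digest()
print(len(short_digest) * 8, len(long_digest) * 8)  # 160 160

# Sensitivity: the two messages differ in a single bit
# (the character '1' is 0x31 and '9' is 0x39).
original = hashlib.sha1(b"Pay Alice $100").digest()
altered = hashlib.sha1(b"Pay Alice $900").digest()
print(original.hex())
print(altered.hex())
print(differing_bits(original, altered), "of 160 digest bits differ")
```

Typically about half of the digest bits change even though only one input bit was altered, which is why a digest comparison reveals tampering.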
Hash functions may be used with or without a key. If a key is used, both symmetric (single secret key) and asymmetric keys (public/private key pairs) may be used. The two primary algorithms are listed next and the RFCs listed later provide more information on the protocols. Also see the list of Web sites on the related entries page.
- MD5 A hash function designed by Ron Rivest, one of the inventors of the RSA public-key encryption scheme. The MD5 algorithm produces a 128-bit output. Note that MD5 is known to have weaknesses and should be avoided if possible. SHA-1 was generally recommended in its place when the RFCs listed later were written, although collision attacks against SHA-1 have since been demonstrated as well. This is discussed later.
- SHA-1 (Secure Hash Algorithm-1) SHA-1 is an MD5-like algorithm that was designed to be used with the Digital Signature Standard (DSS). The United States agencies NIST (National Institute of Standards and Technology) and NSA (National Security Agency) are responsible for SHA-1. The SHA-1 algorithm produces a 160-bit message digest. This longer output is considered to be more secure than the 128-bit output of MD5.
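The output lengths of the two algorithms can be checked with a short sketch. It assumes a Python build in which `hashlib` exposes both `md5` and `sha1` (some restricted builds disable MD5); the helper name `digest_bits` is hypothetical.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Return the digest length, in bits, of the named hashlib algorithm."""
    return hashlib.new(name).digest_size * 8


sizes = {name: digest_bits(name) for name in ("md5", "sha1")}
print(sizes)  # {'md5': 128, 'sha1': 160}
```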
Keyed MD5 is a technique for using MD5 with a key. Basically, a sender appends a randomly generated key to the end of a message, and then hashes the message and key combination to create a message digest. Next, the key is removed from the message and encrypted with the sender’s private key. The message, message digest, and encrypted key are sent to the recipient, who decrypts the key with the sender’s public key (thus validating that the message is actually from the sender). The recipient then appends the key to the message and runs the same hash as the sender. The result should match the message digest sent with the message.
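The hashing part of this exchange can be sketched as follows. This is a simplified illustration only: the public-key step that protects the key in transit is omitted, so the key is simply assumed to reach the recipient intact, and the function names `keyed_md5` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def keyed_md5(message: bytes, key: bytes) -> bytes:
    """Hash the message with the key appended to its end."""
    return hashlib.md5(message + key).digest()


def verify(message: bytes, key: bytes, received_digest: bytes) -> bool:
    """Recompute the keyed digest and compare it with the one received."""
    return hmac.compare_digest(keyed_md5(message, key), received_digest)


# Sender side: generate a random key and compute the digest.
key = secrets.token_bytes(16)
message = b"Transfer approved"
digest = keyed_md5(message, key)

# Recipient side: the same key and message reproduce the digest,
# while an altered message does not.
print(verify(message, key, digest))              # True
print(verify(b"Transfer denied", key, digest))   # False
```

The comparison uses `hmac.compare_digest` rather than `==` so that the check takes the same time regardless of where the two values first differ.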
The result of a hash function that combines a message with a key is called a message authentication code, or MAC. A MAC is a “fingerprint” or “message digest” of the input in combination with a key available to parties in the message exchange.
Hash functions are used in authentication routines such as CHAP (Challenge Handshake Authentication Protocol). Both the client and server share a secret: the password used by the client, which has been previously exchanged but is never sent over the wire. When the client establishes a link to the server, the server sends a unique “challenge” value (sometimes called a nonce) to the client. The client combines his or her password with the challenge and then runs them through the hash function. The result is sent back to the server, which runs the same process and compares its result with the one received from the client. If they match, the client is considered authentic. Note that the actual password is never sent, only a hash of the challenge and password combination.
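The challenge-response exchange can be sketched in a few lines. The sketch follows the response calculation given in RFC 1994 (an MD5 hash over a one-byte identifier, the secret, and the challenge), but it models only the hashing step, not the PPP packet formats; the password, identifier value, and function name `chap_response` are illustrative assumptions.

```python
import hashlib
import hmac
import secrets


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Hash the identifier, shared secret, and challenge together."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


shared_secret = b"example-password"  # known to both sides, never transmitted

# Server: issue a fresh random challenge for this authentication attempt.
identifier = 1
challenge = secrets.token_bytes(16)

# Client: answer with a hash instead of the password itself.
client_response = chap_response(identifier, shared_secret, challenge)

# Server: run the same calculation and compare the two results.
expected = chap_response(identifier, shared_secret, challenge)
authenticated = hmac.compare_digest(client_response, expected)
print(authenticated)  # True
```

Because the challenge is different on every attempt, a response captured from one exchange is of no use in answering a later one.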
HMAC (Hashed Message Authentication Code) is a core mechanism that is considered essential for security on the Internet along with IPSec, according to RFC 2316 (Report of the IAB, April 1998). It is not itself a hash function, but a mechanism for message authentication that uses a hash function such as MD5 or SHA-1 in combination with a shared secret key (as opposed to a public/private key pair). Basically, the key is combined with the message and run through the hash function. The result is then combined with the key again and run through the hash function a second time. The output is as long as the underlying digest (128 bits for MD5, 160 bits for SHA-1). In the variants used with IPSec (HMAC-MD5-96 and HMAC-SHA-1-96), this result is truncated to 96 bits and becomes the MAC.
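The two hashing passes can be written out by hand and checked against Python's standard `hmac` module. The sketch below follows the construction in RFC 2104, in which the key is padded to the hash block size and XORed with the constants 0x36 and 0x5C before the inner and outer passes; the function name `hmac_manual` is hypothetical, and the key and message are the first HMAC-MD5 test case from RFC 2202.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # block size in bytes for both MD5 and SHA-1


def hmac_manual(key: bytes, message: bytes, hash_name: str = "md5") -> bytes:
    """Compute HMAC with two explicit hash passes, as described in RFC 2104."""
    if len(key) > BLOCK_SIZE:
        # Overlong keys are first hashed down to digest length.
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    # First pass: key material combined with the message.
    inner = hashlib.new(hash_name, inner_pad + message).digest()
    # Second pass: key material combined with the first result.
    return hashlib.new(hash_name, outer_pad + inner).digest()


key = b"\x0b" * 16
message = b"Hi There"

full_mac = hmac_manual(key, message, "md5")
library_mac = hmac.new(key, message, hashlib.md5).digest()
print(full_mac == library_mac)  # True

# HMAC-MD5-96: keep only the first 96 bits (12 bytes) of the 128-bit result.
truncated_mac = full_mac[:12]
print(len(truncated_mac) * 8)  # 96
```

In practice the library call is preferable to the hand-written version, which is shown only to make the two passes visible.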
According to RFC 2316, HMAC (defined in RFC 2104, HMAC: Keyed-Hashing for Message Authentication, February 1997) should be used in preference to older techniques, notably keyed hash functions. Keyed hashes based on MD5 are especially to be avoided, given the hints of weakness in MD5. HMAC is the preferred shared-secret authentication technique, and it should be used with SHA-1. It can be used to authenticate any arbitrary message and is suitable for logins.
The following RFCs provide important additional information about the hash functions used in the Internet environment. These RFCs are located on the CD-ROM.
- RFC 1321 (MD5 Message-Digest Algorithm, April 1992)
- RFC 1828 (IP Authentication using Keyed MD5, August 1995)
- RFC 1864 (The Content-MD5 Header Field, October 1995)
- RFC 1994 (PPP Challenge Handshake Authentication Protocol (CHAP), August 1996)
- RFC 2069 (An Extension to HTTP: Digest Access Authentication, January 1997)
- RFC 2085 (HMAC-MD5 IP Authentication with Replay Prevention, February 1997)
- RFC 2104 (HMAC: Keyed-Hashing for Message Authentication, February 1997)
- RFC 2316 (Report of the IAB, April 1998)
- RFC 2401 (Security Architecture for the Internet Protocol, November 1998)
- RFC 2403 (The Use of HMAC-MD5-96 within ESP and AH, November 1998)
- RFC 2404 (The Use of HMAC-SHA-1-96 within ESP and AH, November 1998)
- RFC 2537 (RSA/MD5 KEYs and SIGs in the Domain Name System (DNS), March 1999)
- RFC 2831 (Using Digest Authentication as a SASL Mechanism, May 2000)
- RFC 2857 (The Use of HMAC-RIPEMD-160-96 within ESP and AH, June 2000)