Creating Digital Fingerprints with Hash Values

Ensuring Data Integrity

RenewData has implemented industry-standard technology to ensure the integrity of our customers' data. RenewData uses hash values to:

  • Identify every message, document or file with a digital fingerprint at the start of the production process
  • De-duplicate messages, documents and files by comparing hash values for matching values
  • Validate that the final production output of messages, documents and files did not change during processing

By using hash values during validation of the files, RenewData can ensure the documents extracted from a client's backup medium are forensically identical to the files delivered in a production to the client.
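The three uses listed above (fingerprint, de-duplicate, validate) can be sketched in a few lines of Python. This is only an illustration, not RenewData's actual process: the function names, the sample file names and the choice of SHA-1 are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List


def fingerprint(data: bytes) -> str:
    """Return a hex digest that serves as the item's digital fingerprint.
    SHA-1 is an assumption here; any hash function would be used the same way."""
    return hashlib.sha1(data).hexdigest()


def deduplicate(items: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group item names by fingerprint. Names that share a fingerprint
    have matching hash values and are treated as duplicates."""
    groups: Dict[str, List[str]] = {}
    for name, data in items.items():
        groups.setdefault(fingerprint(data), []).append(name)
    return groups


def validate(original_fingerprint: str, delivered: bytes) -> bool:
    """Check that delivered content still matches the fingerprint
    recorded at the start of processing."""
    return fingerprint(delivered) == original_fingerprint


# Hypothetical sample data: two identical documents and one that differs.
items = {
    "a.txt": b"quarterly report",
    "b.txt": b"quarterly report",
    "c.txt": b"quarterly report (draft)",
}

# 1. Fingerprint every item at the start of processing.
start_fingerprints = {name: fingerprint(data) for name, data in items.items()}

# 2. De-duplicate by comparing hash values.
groups = deduplicate(items)
duplicates = [names for names in groups.values() if len(names) > 1]

# 3. Validate that nothing changed during processing.
unchanged = all(validate(start_fingerprints[n], d) for n, d in items.items())

print(duplicates, unchanged)
```

Grouping by digest means each item is hashed once and compared by a short string rather than byte by byte against every other item.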


What is a Hash value?

"A 'hash' (also called a 'digest') is a kind of 'signature' for a stream of data that represents the contents. The closest real-life analog we can think of is 'a tamper-evident seal on a software package': if you open the box (change the file), it's detected." [1] The algorithms used to create a hash can detect the smallest change in the data stream.

For example, the SHA-1 and MD5 hashes for the sentence below are:

"The quick brown fox jumped over the lazy dog"
SHA-1 HASH: f6513640f3045e9768b239785625caa6a2588842
MD5 HASH: 08a008a01d498c404b0c30852b39d3b8

A minor change, such as changing the "d" in "dog" to "c", results in an entirely different hash value:
"The quick brown fox jumped over the lazy cog"
SHA-1 HASH: 68361d3bafa5fc245e16c9654f4ec32531535207
MD5 HASH: bd266017bcad42c85a57cc3ae5851bf8

Hash values will even change for changes in capitalization:

"the quick brown fox jumped over the lazy dog"
SHA-1 HASH: 6039d1003323d48347ddfdb5ce2842df758fab5f
MD5 HASH: bb0fa6eff92c305f166803b6938dd33a
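The example strings above can be hashed with Python's standard hashlib module to see the effect directly. The sketch assumes each sentence is encoded as UTF-8 with no trailing newline; a different encoding or an extra newline would itself change the digests.

```python
import hashlib


def digests(text: str):
    """Return (SHA-1, MD5) hex digests of the UTF-8 bytes of text."""
    data = text.encode("utf-8")
    return hashlib.sha1(data).hexdigest(), hashlib.md5(data).hexdigest()


samples = [
    "The quick brown fox jumped over the lazy dog",
    "The quick brown fox jumped over the lazy cog",
    "the quick brown fox jumped over the lazy dog",
]

for sentence in samples:
    sha1_hex, md5_hex = digests(sentence)
    print(f'"{sentence}"')
    print("SHA-1 HASH:", sha1_hex)
    print("MD5 HASH:", md5_hex)
```

Each sentence differs from the first by a single character, yet the printed digests share no obvious pattern. The SHA-1 digests are always 40 hexadecimal characters and the MD5 digests 32.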

It should be noted that a hash is a "digest" of a file, not an "encryption" of it. Encryption is a two-way operation which, given the right keys, transforms data from clear text to cipher text and back. Hashing, on the other hand, is a one-way operation: it compiles a stream of data into a small, fixed-size "digest" that is an effectively unique representation of the data stream, and the original data cannot be recovered from it. Hashes are used to verify file integrity, to store passwords and to create digital signatures.
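One way to see why a digest is not an encryption of the data is to compare digest sizes for inputs of very different lengths. In this small sketch (the sample inputs are arbitrary), a one-byte message and a one-million-byte message produce digests of exactly the same size, so the digest cannot hold enough information to reconstruct the input.

```python
import hashlib

short_message = b"a"
long_message = b"a" * 1_000_000

# Collect the digest sizes (in bytes) seen for each algorithm.
md5_sizes = {len(hashlib.md5(m).digest()) for m in (short_message, long_message)}
sha1_sizes = {len(hashlib.sha1(m).digest()) for m in (short_message, long_message)}

print("MD5 digest sizes:", md5_sizes)    # a single size: 16 bytes
print("SHA-1 digest sizes:", sha1_sizes)  # a single size: 20 bytes
```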


Creating a Hash Value

Creating a hash value for a file means processing a variable-length message into a fixed-length output. In practice, this can be achieved by breaking the input up into a series of equal-size blocks and operating on them in sequence using a compression function. Most widely used hash functions take the form in the diagram below.

In the diagram, a message is segmented into fixed-length blocks, each of which is compressed to an output of a fixed size. The compressed blocks are combined in sequence, and the final result is the hash value for the entire message.
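The block-by-block construction described above can be sketched as a toy hash function. This is not MD5 or SHA-1 and is not secure: the block size, the initial value, the padding scheme and the arithmetic in compress are all assumptions chosen only to show the shape of the process, in which a running state is combined with each block in turn and the final state becomes the hash value.

```python
BLOCK_SIZE = 8                      # bytes per message block (toy value)
MASK = (1 << 64) - 1                # keep the running state at 64 bits
INITIAL_STATE = 0xCBF29CE484222325  # arbitrary fixed starting value


def pad(message: bytes) -> bytes:
    """Append a marker byte, zero bytes, and the message length so the
    padded message is a whole number of blocks."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK_SIZE)
    return padded + len(message).to_bytes(BLOCK_SIZE, "big")


def compress(state: int, block: bytes) -> int:
    """Toy compression function: mix one block into the running state.
    Illustrative only; real hash functions use far stronger mixing."""
    mixed = ((state ^ int.from_bytes(block, "big")) * 0x100000001B3) & MASK
    return mixed ^ (mixed >> 29)


def toy_hash(message: bytes) -> str:
    """Process the padded message one block at a time and return the
    final state as a 16-character hex string."""
    state = INITIAL_STATE
    padded = pad(message)
    for offset in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[offset:offset + BLOCK_SIZE])
    return f"{state:016x}"


print(toy_hash(b"The quick brown fox jumped over the lazy dog"))
print(toy_hash(b"The quick brown fox jumped over the lazy cog"))
```

Whatever the length of the input, the output is always 16 hexadecimal characters, and changing one character changes the state from that block onward.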


[Diagram: Hash Value Example]

[1] Steve Friedl, "An Illustrated Guide to Cryptographic Hashes," Unixwiz.net Tech Tips.


