A little bit more info inline:
On Jan 18, 2006, at 8:52 AM, David McGrew wrote:
Hi Matt,
nice summary of the KDF situation, more inline:
On Jan 5, 2006, at 3:03 PM, Matt Ball wrote:
Can anyone describe the NIST key-derivation algorithm? This
would be very helpful in guiding the group.
I went ahead and answered my question about the NIST key
derivation. It looks like this is a link to the current proposal
to NIST for a standard "Key Derivation Function":
<http://www.ietf.org/internet-drafts/draft-dang-nistkdf-00.txt>
There is also some discussion about this standard on the IETF
webpage. Here's a link to the e-mail list:
<http://www1.ietf.org/mail-archive/web/cfrg/current/threads.html>
Look for e-mails with the subject "Hash-Based Key Derivation".
Several people from the 1619 group are also members of the IETF
group, so we should be able to get some good feedback on pros and
cons of a KDF.
The NIST KDF spec above defines the key derivation as follows:
Compute Hash-i = H (counter || SV {|| algorithmID} || contextID
{|| SharedInfo}).
Where
counter is a 32-bit integer going from 1 to
ceiling(keydatalen / hashlen),
SV is the "Secret Value", or in this case, the 256-bit
AES key,
algorithmID is an optional algorithm identifier,
contextID is a combination of an identifier for both the sender
and receiver,
SharedInfo is an optional string with any additional data, and
|| is the concatenation operator
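As a rough illustration of the construction just described, here is a minimal Python sketch. It is not taken from the draft: the function name `concat_kdf`, the use of SHA-256 as H, the big-endian encoding of the counter, measuring `keydatalen` in bytes, and truncating the concatenated output to `keydatalen` are all assumptions made for the example.

```python
import hashlib
import math


def concat_kdf(secret_value: bytes, keydatalen: int, context_id: bytes,
               algorithm_id: bytes = b"", shared_info: bytes = b"") -> bytes:
    """Sketch of Hash-i = H(counter || SV {|| algorithmID} || contextID {|| SharedInfo}).

    Assumptions (not from the draft): H is SHA-256, the counter is a
    big-endian 32-bit integer, and keydatalen is given in bytes.
    """
    hashlen = hashlib.sha256().digest_size  # 32 bytes for SHA-256
    reps = math.ceil(keydatalen / hashlen)
    output = b""
    for counter in range(1, reps + 1):
        block = (counter.to_bytes(4, "big") + secret_value +
                 algorithm_id + context_id + shared_info)
        output += hashlib.sha256(block).digest()
    # Keep only as many bytes as were requested.
    return output[:keydatalen]
```

With a 32-byte request and SHA-256, the loop runs exactly once, with counter equal to 1.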
If we were to apply this standard to IEEE 1619.1, it would
probably look something like this:
NewKey = SHA-256(0x00000001 || UserKey || Len(FormatID) ||
FormatID || Len(Misc) || Misc)
Since the output of SHA-256 matches the 256-bit key size, it is
only necessary to run the hashing function once (that is, counter
only ever equals 0x00000001).
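The single-hash derivation above could be sketched in Python roughly as follows. The helper names `len32` and `derive_key_sha256` are hypothetical, and the sketch assumes that Len() is a big-endian 32-bit byte count and that the counter is encoded big-endian.

```python
import hashlib


def len32(field: bytes) -> bytes:
    """Assumed encoding of Len(): big-endian 32-bit count of bytes."""
    return len(field).to_bytes(4, "big")


def derive_key_sha256(user_key: bytes, format_id: bytes, misc: bytes) -> bytes:
    """NewKey = SHA-256(0x00000001 || UserKey || Len(FormatID) || FormatID
                        || Len(Misc) || Misc)"""
    counter = (1).to_bytes(4, "big")  # always 0x00000001 for a 256-bit key
    data = (counter + user_key +
            len32(format_id) + format_id +
            len32(misc) + misc)
    return hashlib.sha256(data).digest()
```

The result is 32 bytes, so it can be used directly as a 256-bit key.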
Can anyone from the IETF group make any comments on this
approach? What kind of pitfalls should we expect? Is it likely
to become a standard?
Since we're getting SV from an out-of-band method, we'll need to
conform to the requirements of Section 3.1. "To ensure that
distinct keying material is generated, a protocol supporting secret
values established out of band MUST include SharedInfo
substrings with transaction or application specific information
unique to this execution of the protocol." Perhaps the FormatID
could be defined as part of the sharedInfo.
As an aside, it is funny that the very definition of contextID
shows that the specification has been written with point-to-point
communication in scope, and not data storage :-)
Another nit: algorithmID is mandatory, and it appears to be an ASN.1
OID, which means that we need to get ASN.1 OIDs for all of our
algorithms if we want to conform, as far as I can tell.
As many of you probably already know, NIST maintains an ASN object
register for their crypto stuff. However, that registry appears to
be quite incomplete (I see only AES-128, AES-192, and AES-256 in ECB,
CBC, CFB, and OFB modes). The registry is online at
http://csrc.nist.gov/csor/algorithms.htm#modules
Does anyone know if there is a more complete registry somewhere?
David
Personally, I'd rather use HMAC-SHA-256 instead of plain SHA-256.
This construct would help to reduce the entropy loss and would
also make it harder to attack SHA256 if a weakness is later found
in the hashing function. We could also remove the 'counter' since
it's always 1. Here is how this new approach could look:
NewKey = HMAC-SHA-256(UserKey, Len(FormatID) || FormatID ||
Len(Misc) || Misc)
or, by the definition of HMAC:
NewKey = SHA-256(UserKey XOR 0x5C5C5C... || SHA-256(UserKey XOR
0x363636... || Len(FormatID) || FormatID || Len(Misc) || Misc))
where
UserKey is the user-provided 256-bit key,
NewKey is the derived key later used in the AES-GCM engine,
FormatID is a globally unique identifier of the tape format, and
Misc is any other useful vendor-specific information.
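To make the expansion concrete, here is a small Python sketch that builds HMAC-SHA-256 from two SHA-256 calls and can be compared against the standard library's `hmac` module. The function name `hmac_sha256_by_definition` is made up for this example. Note that the key is first zero-padded to the 64-byte SHA-256 block size before the XOR with the 0x36 and 0x5C pads.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 input block size in bytes


def hmac_sha256_by_definition(key: bytes, message: bytes) -> bytes:
    """HMAC written out as two nested SHA-256 calls."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")     # zero-pad to the block size
    ipad = bytes(b ^ 0x36 for b in key)      # inner pad
    opad = bytes(b ^ 0x5C for b in key)      # outer pad
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


# Compare against the library implementation with an arbitrary test key.
user_key = bytes(range(32))
message = b"example message"
matches = hmac.compare_digest(
    hmac_sha256_by_definition(user_key, message),
    hmac.new(user_key, message, hashlib.sha256).digest(),
)
```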
Each length field would be a Big-Endian 32-bit integer indicating
the number of 8-bit bytes within the following field. The
FormatID and Misc fields would either be a Big-Endian Integer, or
a variable length ASCII string. 'FormatID' would need to be a
registered, unique identifier that globally identifies the
format. 'Misc' would be any other vendor-specific information
that would be useful in uniquely identifying the media.
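A minimal Python sketch of this encoding and the resulting HMAC-based derivation follows. The names `encode_field` and `derive_key_hmac` are hypothetical, and the field values shown are placeholders rather than registered identifiers.

```python
import hashlib
import hmac


def encode_field(value: bytes) -> bytes:
    """Prefix a field with its length as a big-endian 32-bit byte count."""
    return len(value).to_bytes(4, "big") + value


def derive_key_hmac(user_key: bytes, format_id: bytes, misc: bytes) -> bytes:
    """NewKey = HMAC-SHA-256(UserKey, Len(FormatID) || FormatID
                                      || Len(Misc) || Misc)"""
    message = encode_field(format_id) + encode_field(misc)
    return hmac.new(user_key, message, hashlib.sha256).digest()


# Placeholder inputs for illustration only.
user_key = bytes(32)
key_a = derive_key_hmac(user_key, b"AB", b"C")
key_b = derive_key_hmac(user_key, b"A", b"BC")
```

The length prefixes are what keep the two example inputs distinct: without them, ("AB", "C") and ("A", "BC") would both concatenate to "ABC" and derive the same key.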
Is this approach secure? Is there too much entropy loss? Comments?
If we're deriving many keys from the master key, then I agree that
HMAC is preferable. I'm not sure if that's the case, though. Your
proposal looks fine to me (though I'm not exactly sure what
"FormatID" would contain).
David
Thanks,
-Matt