Hi Matt,
Nice summary of the KDF situation; more inline:
On Jan 5, 2006, at 3:03 PM, Matt Ball wrote:
Can anyone describe the NIST key-derivation algorithm? This would
be very helpful in guiding the group.
I went ahead and answered my question about the NIST key
derivation. It looks like this is a link to the current proposal
to NIST for a standard "Key Derivation Function":
<http://www.ietf.org/internet-drafts/draft-dang-nistkdf-00.txt>
There is also some discussion about this standard on the IETF
webpage. Here's a link to the e-mail list:
<http://www1.ietf.org/mail-archive/web/cfrg/current/threads.html>
Look for e-mails with the subject "Hash-Based Key Derivation".
Several people from the 1619 group are also members of the IETF
group, so we should be able to get some good feedback on pros and
cons of a KDF.
The NIST KDF spec above defines the key derivation as follows:
Compute Hash-i = H (counter || SV {|| algorithmID} || contextID
{|| SharedInfo}).
Where
counter is a 32-bit integer going from 1 to ceiling
(keydatalen / hashlen),
SV is the "Secret Value", or in this case, the 256-bit AES
key,
algorithmID is an optional algorithm identifier,
contextID is a combination of an identifier for both the sender
and receiver,
SharedInfo is an optional string with any additional data, and
|| is the concatenation operator
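For anyone who wants to experiment, here is a minimal Python sketch of the formula above. It is not taken from the draft: it assumes SHA-256 as H, a big-endian counter, lengths given in bytes, and truncation to the leftmost keydatalen bytes. The function name nist_kdf_sketch and its parameter names are made up for illustration.

```python
import hashlib
import math


def nist_kdf_sketch(sv: bytes, context_id: bytes, keydatalen: int,
                    algorithm_id: bytes = b"", shared_info: bytes = b"") -> bytes:
    """Hash-i = H(counter || SV || algorithmID || contextID || SharedInfo).

    Assumptions (not from the draft): H is SHA-256, the counter is a
    32-bit big-endian integer, and keydatalen is counted in bytes.
    """
    hashlen = hashlib.sha256().digest_size          # 32 bytes for SHA-256
    reps = math.ceil(keydatalen / hashlen)          # counter runs from 1 to reps
    output = b""
    for counter in range(1, reps + 1):
        block = (counter.to_bytes(4, "big") + sv + algorithm_id
                 + context_id + shared_info)
        output += hashlib.sha256(block).digest()
    return output[:keydatalen]                      # keep the leftmost bytes
```

With a 32-byte request the loop runs exactly once, so the result is a single SHA-256 digest.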
If we were to apply this standard to IEEE 1619.1, it would probably
look something like this:
NewKey = SHA-256(0x00000001 || UserKey || Len(FormatID) || FormatID
|| Len(Misc) || Misc)
Since the output of SHA-256 matches the 256-bit key size, it is
only necessary to run the hashing function once (that is, counter
only ever equals 0x00000001).
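As a rough sketch of the single-hash derivation above, assuming each Len() is a 32-bit big-endian byte count and that FormatID and Misc are already byte strings (the helper names are hypothetical, and the sample identifiers in the usage note are placeholders, not registered values):

```python
import hashlib


def len_prefixed(field: bytes) -> bytes:
    # Assumed encoding: 32-bit big-endian byte count, then the field itself.
    return len(field).to_bytes(4, "big") + field


def derive_key_sha256(user_key: bytes, format_id: bytes, misc: bytes) -> bytes:
    # NewKey = SHA-256(0x00000001 || UserKey || Len(FormatID) || FormatID
    #                  || Len(Misc) || Misc)
    if len(user_key) != 32:
        raise ValueError("UserKey must be a 256-bit (32-byte) key")
    counter = (1).to_bytes(4, "big")   # always 1: one SHA-256 block suffices
    data = counter + user_key + len_prefixed(format_id) + len_prefixed(misc)
    return hashlib.sha256(data).digest()
```

For example, derive_key_sha256(key, b"EXAMPLE-FORMAT", b"vendor-data") returns a 32-byte key. The length prefixes keep ("ab", "c") and ("a", "bc") from hashing to the same input.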
Can anyone from the IETF group make any comments on this approach?
What kind of pitfalls should we expect? Is it likely to become a
standard?
Since we're getting SV from an out-of-band method, we'll need to
conform to the requirements of Section 3.1. "To ensure that distinct
keying material is generated, a protocol supporting secret values
established out of band MUST include SharedInfo substrings with
transaction or application specific information unique to this
execution of the protocol." Perhaps the FormatID could be defined as
part of the SharedInfo.
As an aside, it is funny that the very definition of contextID shows
that the specification has been written with point-to-point
communication in scope, and not data storage :-)
Another nit: algorithmID is mandatory, and it appears to be an ASN.1
OID. Which means that we need to get ASN.1 OIDs for all of our
algorithms if we want to conform, as far as I can tell.
Personally, I'd rather use HMAC-SHA256 instead of just SHA256.
This construct would help to reduce the entropy loss and would also
make it harder to attack SHA256 if a weakness is later found in the
hashing function. We could also remove the 'counter' since it's
always 1. Here is how this new approach could look:
NewKey = HMAC-SHA-256(UserKey, Len(FormatID) || FormatID ||
Len(Misc) || Misc)
or, by the definition of HMAC (with UserKey zero-padded to the
64-byte SHA-256 block size before each XOR):
NewKey = SHA-256((UserKey XOR 0x5C5C5C...) || SHA-256((UserKey XOR
0x363636...) || Len(FormatID) || FormatID || Len(Misc) || Misc))
where
UserKey is the user-provided 256-bit key,
NewKey is the derived key later used in the AES-GCM engine,
FormatID is a globally unique identifier of the tape format, and
Misc is any other useful vendor-specific information.
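To double-check the nesting, the standard HMAC construction (RFC 2104) can be written out by hand and compared against Python's hmac module. This is only a sketch; hmac_sha256_manual is an illustrative name, and the key is zero-padded to the 64-byte SHA-256 block size before the XOR, as the HMAC definition requires.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_sha256_manual(key: bytes, msg: bytes) -> bytes:
    # Keys longer than one block are hashed first; shorter keys are
    # zero-padded to the block size.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)   # inner pad, 0x363636...
    opad = bytes(b ^ 0x5C for b in key)   # outer pad, 0x5C5C5C...
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()


def hmac_sha256_library(key: bytes, msg: bytes) -> bytes:
    # Reference result from the standard library.
    return hmac.new(key, msg, hashlib.sha256).digest()
```

Both functions should agree for any key and message; the whole message (all length-prefixed fields) sits inside the inner hash, and only the inner digest is fed to the outer hash.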
Each length field would be a Big-Endian 32-bit integer indicating
the number of 8-bit bytes within the following field. The FormatID
and Misc fields would either be a Big-Endian Integer, or a variable
length ASCII string. 'FormatID' would need to be a registered,
unique identifier that globally identifies the format. 'Misc'
would be any other vendor-specific information that would be useful
in uniquely identifying the media.
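Putting the encoding and the HMAC together, a minimal sketch might look like the following. It assumes the variable-length ASCII string option for both fields; encode_field and derive_key_hmac are hypothetical names, and the identifiers in the usage note are placeholders rather than registered FormatIDs.

```python
import hashlib
import hmac


def encode_field(value: bytes) -> bytes:
    # Big-endian 32-bit count of 8-bit bytes, followed by the field.
    return len(value).to_bytes(4, "big") + value


def derive_key_hmac(user_key: bytes, format_id: str, misc: str) -> bytes:
    # NewKey = HMAC-SHA-256(UserKey,
    #                       Len(FormatID) || FormatID || Len(Misc) || Misc)
    if len(user_key) != 32:
        raise ValueError("UserKey must be a 256-bit (32-byte) key")
    message = (encode_field(format_id.encode("ascii"))
               + encode_field(misc.encode("ascii")))
    return hmac.new(user_key, message, hashlib.sha256).digest()
```

For example, derive_key_hmac(key, "EXAMPLE-FORMAT", "vendor-info") yields a 32-byte key suitable for a 256-bit cipher key; changing either field or the master key gives an unrelated output.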
Is this approach secure? Is there too much entropy loss? Comments?
If we're deriving many keys from the master key, then I agree that
HMAC is preferable. I'm not sure if that's the case, though. Your
proposal looks fine to me (though I'm not exactly sure what
"Format" would contain).
David
Thanks,
-Matt