Hi RMs,

I suppose nobody can give example(s) that justify current API.
I'll leave this issue to RMs decision, since I think this result in no
conclusion.
Please consider carefully if the current API should be kept or not.
I wrote summary for discussion for you.

Regards,

Misunderstandings summary from previous discussion:

Nikita misunderstood what HKDF is made for.
He said key derivation that generates AES 128 bit key from AES 256 bit key
for
compatibility reason as follows is "textbook example" and
good(recommended?)
usecase.

$sha128key = hash_hkdf("sha256", $sha256key, 16);

Although this could be acceptable useage when salt cannot be used, but not
recommended. HKDF is specifically designed to discourage this kind of key
derivation because this is weak. The RFC strongly encourage salt use for
key security. Appropriate key derivation is

$sha128key = hash_hkdf("sha256", $sha256key, 32, "",
$csprng_generated_random_key);
// Store $cspring_generated_ random_key to somewhere.


The reason why latter is a lot more secure is related to Andrey's
misunderstanding.
He said "when ikm is cryptographically strong, salt wouldn't add no more
entropy.
so salt does not matter". (not exact sentence)
What he said partially true, but wrong in a sense of key security.

When ikm has enough entropy, HKDF does not add more entropy.
However, even when ikm is strong, additional salt adds significant
complexities
for key analysis. To make things simpler, suppose we have very simple KDF
as follows.

okm = VSKDF(ikm, salt)  where
 - okm, ikm, salt is 2 digit int
 - when salt is NULL, default to 2
 - VSKDF is simple function that multiply ikm and salt (ikm * salt) returns
last 2 digits
   as derived key.

When ikm=64, salt=NULL

28 = VSKDF(64, NULL);

Since algorithm is known, attackers are able to guess ikm very easily.
Now, suppose salt=16 is provided

24   = VSKDF(64, 16);

It is still easy to guess due to very simple KDF, but guessing ikm is a lot
harder
than previous example while it does not add any entropy.
HKDF has a lot more complex algorithm, therefore salt gives significantly
stronger
key protection even when salt is disclosed as noted in the RFC.

Other misunderstanding should be noted is what HKDF for. It's for general
purpose KDF as the RFC described in HKDF application section. Andrey said
"I'm cherry picking sentence", but the section should be what the HKDF for.
The section even describes obscure usage, HKDF for CSPRNG. This usage
is not even key derivation.

This one I'm not sure misunderstanding or not, but probably it is.
HKDF is designed for any ikm and works with appropriate usage. Very
weak ikm like user entered password can be handled relatively safely.

$safe_key = hash_hkdf("sha256", 'mypassword', 0, '',
$csprng_generated_random_key);
// $csprng_generated_random_key should be kept secret because ikm is too
weak

Without salt, it's disaster. Please note that salt is the last optional
parameter currently.

$dangerous_key = hash_hkdf("sha256", 'mypassword'); // Disaster!

With this usage, attackers can build pre hashed dictionary. Even when they
don't have
dictionary, usual brute force attack is very effective. One may think
additional hashing
would mitigate risk. i.e.

$dangerous_key = hash_hkdf("sha256", hash("sha256", 'mypassword')); //
Disaster!

This does not help much when algorithm is known to attackers. Brute force
attack is
effective still. Adding secret salt(key) helps with this usage also.

Please remember crypt() abuse history. We should imagine similar thing
could happen
with hash_hkdf() API. Current API encourages misuse/abuse too much by making
salt the last optional parameter.

Better documentation may help, "Users must use random $salt always whenever
it is possible even if it is the last optional parameter.".
However, do we really keep this mess forever? for a new function?


(BTW, Nikita, are you going to take your statement back or not. It's too
impolite)


--
Yasuo Ohgaki
yohg...@ohgaki.net

On Wed, Sep 6, 2017 at 10:15 AM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:

> Hi all,
>
> This is the last recommendation for hash_hkdf[1]. In fact,
> this would be the last chance to fix because we'll have 7.2 soon.
> The issue is secure usage and API consistency.
>
> Currently hash_hkdf() has following signature:
>
> <binary string> hash_hkdf(string $algo , string $ikm [, int $length = 0 [,
> string $info = '' [, string $salt = '' ]]] )
>
> These are rationals behind recommendation. There are many, but
> please read on.
>
> === Parameter Order ===
>
> HKDF[2] algorithm is:
>  1. General purpose key derivation function as per RFC 5869
>  2. "$salt" parameter is a "pre-shared _KEY_" in many cases as mentioned
> RFC 5869
>  3. "$salt" (or preshared key) is very strongly recommended for security
> season as per RFC 5869
>  4. Supplying salt that the same length of input key does not affect
> performance as per RFC 5969
>  5. "$info" is what makes HKDF useful, but it's less important as
> described in RFC 5869
>  6. "$length" is truly an optional parameter for very specific encryption
> algorithm or usage.
>
> Rationale for change:
>  1. Key derivations without secondary key ($salt) does not make sense when
>      secondary key could be available. HKDF is designed for best possible
> key
>      security with the key. Not using secondary key ($salt) simply
> downgrades
>      key security without no reason. i.e. HKDF performance is the same
>      when $salt has the same as hash is set.
>  2. HKDF is based on HMAC. When $info has no use, HMAC would be the best
>      choice for it. i.e. $newkey = hash_hmac($ikm, $key);
>  3. It does not make sense violating RFC recommendations for a RFC
> implementation.
>
> From these facts and reasons, $salt, $info and $length parameter order and
> requirement should be changed from
>
> string $algo , string $ikm [, int $length = 0 [, string $info = '' [,
> string $salt = '' ]]]
>
> to
>
> string $algo , string $ikm , string $salt, string $info = '' [,
> int $length = 0 ]
> Note: Users can set empty string if they really don't need $salt and/or
> $info.
>
> Conclusion:
> This way, users would have better chances to use hash_hkdf() more securely
> and
> properly.
>
> [1] http://php.net/hash_hkdf
> [2] http://www.faqs.org/rfcs/rfc5869.html
>
> === Return Value and Output Option ===
>
> The most common HKDF usage with PHP would be:
>  1. CSRF token generation that is specific to a request with expiration
> time.
>      (HEX return value would be appropriate, not BINARY)
>  2. API access token generation that does not transmit "The API Key", but
>      derived key by HKDF. It also should have expiration time.
>      (HEX return value would be appropriate, not BINARY)
>
> Consistency with other hash_*() functions:
>  1. All of other hash_*()  returns HEX string hash value.
>  2. hash_hkdf() is the only one returns BINARY hash value.
>
> Conclusion:
> hash_hkdf() should return HEX by default, not BINARY.
> Optional [, bool $raw_output = false ] should be added just like other
> hash_*().
>
>
> === Compatibility ===
>
> IMHO, current hash_hkdf() should not be added by PHP 7.1.2, but 7.2.0
> in the first place. The mess could be resolved by 7.2.
>
> Anyway, hash_hkdf() is added 7.1.2. Hopefully not many users are
> using it yet. If we change API with 7.2 release, there would be least
> possible confusions. (We may remove it from 7.1 to avoid confusions, too)
>
> Our choices:
>  - Keep the current insecure/inconsistent API forever.
>  - Change the API to have secure/consistent API forever.
>
> Conclusion:
> No conclusion for this. There would be conflicting options.
>
>
> I strongly think it is not worth to keep this insecure/inconsistent API
> forever.
> I prefer to change the API to what it should be.
>
> What should we do for this?
> Comments?
>
>
>
> P.S.
>
> Nikita, you've said following during HKDF discussion:
>  - HKDF for CSRF tokens/etc does not make sense at all.
>  - Current parameter order and return value makes perfect sense
>    and has valid/common usages.
>  - Everyone one this list, shouldn't listen to me because I'm insane and
>    totally misunderstood what the HKDF is.
>
> Phrases are not the exact, but it should be correct enough. Some part
> is my fault, w/o reading mail and/or poor English. I blindly assumed
> you've understand RFC 5869 and possible common usages with PHP. I
> apologized for the confusion and tried to explain why w/o success.
>
> If you still believe what you've said is correct, I don't request you to
> take these back, show us at least one common/reasonable hash_hkdf()
> usage example for current API, and point out what's wrong my
> recommendations
> and rationales above.
>
> If not, I request you to take it back. I respect your contributions much,
> but
> the last one you've said is out of tolerance.
>
> --
> Yasuo Ohgaki
> yohg...@ohgaki.net
>
>

Reply via email to