Dear all,

while going over my todo list I came across Dan Shumow's mail on the successor
of SHA-1 for Git. I know the decision has been made and it is not my intention
to change it, but please see below some comments on Dan's arguments.

In short, I argue below that SHA256 has no serious advantages over
KangarooTwelve. In that light, the fact that SHA2 was designed behind closed
doors (as SHA-1 was) should be reason enough to skip it entirely in an
undertaking that takes open source seriously.

Kind regards,

Joan

PS: In my comments below I use "we" because I discussed them with the other
members of the Keccak team, namely Gilles Van Assche, Michaël Peeters, Guido
Bertoni, Ronny Van Keer and Seth Hoffert, and we agree on all of them.

On 30/07/2018 22:01, Dan Shumow wrote:
> Hello all.   Johannes, thanks for adding me to this discussion.
>
> So, as one of the coauthors of the SHA-1 collision detection code, I just 
> wanted to chime in and say I'm glad to see the move to a longer hash 
> function.  Though, as a cryptographer, I have a few thoughts on the matter 
> that I thought I would share.
>
> I think that moving to SHA256 is a fine change, and I support it.
>
> I'm not anywhere near the expert in this that Joan Daeman is. 

Note that the correct spelling is "Daemen". But anyway, it is not a matter of 
being an expert, but a matter of taking the arguments at face value. 

> I am someone who has worked in this space more or less peripherally.  
> However, I agree with Adam Langley that basically all of the finalists for a 
> hash function replacement are about the same for the security needs of Git.  
> I think that, for this community, other software engineering considerations 
> should be more important to the selection process.

We also agree with Adam on the engineering considerations, and we see the 
parallelism that K12 can exploit adaptively (unlike SHA256) as an example of 
such a consideration.
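To illustrate the kind of adaptively exploitable parallelism meant here, below
is a minimal tree-hashing sketch in Python. It uses SHA-256 from hashlib as a
stand-in primitive; the chunk size, domain-separation prefixes and node layout
are made up for illustration and are not the actual K12 mode. The point is that
the leaves are independent, so a serial and a parallel implementation produce
the same digest, and an implementation can choose its degree of parallelism
adaptively.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 8192  # illustrative chunk size (K12 happens to use 8 KiB chunks)

def leaf_hash(chunk: bytes) -> bytes:
    # Each leaf depends only on its own chunk, so leaves can be
    # hashed independently and therefore in parallel.
    return hashlib.sha256(b"leaf" + chunk).digest()

def tree_hash(message: bytes) -> bytes:
    chunks = [message[i:i + CHUNK]
              for i in range(0, len(message), CHUNK)] or [b""]
    # A pool of workers hashes the leaves; a serial loop would
    # yield exactly the same result.
    with ThreadPoolExecutor() as pool:
        leaves = list(pool.map(leaf_hash, chunks))
    # The final node binds all leaf digests together.
    return hashlib.sha256(b"node" + b"".join(leaves)).digest()
```

A serial caller and a 16-core caller of tree_hash get identical digests; the
parallelism is purely an implementation choice, which is what "adaptively
exploitable" refers to.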

> I think Joan's survey of cryptanalysis papers and the numbers that he gives 
> are interesting, and I had never seen the comparison laid out like that.  So, 
> I think that there is a good argument to be made that SHA3 has had more 
> cryptanalysis than SHA2.  Though, Joan, are the papers that you surveyed only 
> focused on SHA2?  I'm curious if you think that the design/construction of 
> SHA2, as it can be seen as an iteration of MD5/SHA1, means that the 
> cryptanalysis papers on those constructions can be considered to apply to 
> SHA2?  

This argument works both ways: the knowledge and experience of the symmetric 
cryptography community in general has also contributed to our choices in Keccak 
and in K12 (including the experience gained with Rijndael/AES). But in the end, 
the only objective metric we have for comparing public scrutiny is the amount 
of cryptanalysis (and analysis) published, and there Keccak simply scores 
better.

> Again, I'm not an expert in this, but I do know that Marc Steven's techniques 
> for constructing collisions also provided some small cryptanalytic 
> improvements against the SHA2 family as well.  I also think that while the 
> paper survey is a good way to look over all of this, the more time in the 
> position of high profile visibility that SHA2 has can give us some confidence 
> as well.

High-profile visibility to implementers does not mean more cryptanalysis, since 
users and implementers are usually not cryptanalysts. Actually, one of the 
reasons that SHA2 has attracted much less cryptanalysis than its age would 
suggest is that during the SHA3 competition all cryptanalysts aimed their 
arrows at the SHA3 candidates.

> Also something worth pointing out is that the connection SHA2 has to SHA1 
> means that if Marc Steven's cryptanalysis of MD5/SHA-1 were ever successfully 
> applied to SHA2, the SHA1 collision detection approach could be applied there 
> as well, thus providing a drop in replacement in that situation.  That said, 
> we don't know that there is not a similar way of addressing issues with the 
> SHA3/Sponge design.  It's just that because we haven't seen any weaknesses of 
> this sort in similar designs, we just don't know what a similar approach 
> would be there yet.  I don't want to put too much stock in this argument, 
> it's just saying "Well, we already know how SHA2 is likely to break, and 
> we've had fixes for similar things in the past."  This is pragmatic but not 
> inspiring or confidence building.
>
> So, I also want to state my biases in favor of SHA2 as an employee of 
> Microsoft.  Microsoft, being a corporation headquartered in a America, with 
> the US Gov't as a major customer definitely prefers to defer to the US Gov't 
> NIST standardization process.  And from that perspective SHA2 or SHA3 would 
> be good choices.  I, personally, think that the NIST process is the best we 
> have.  It is relatively transparent, and NIST employs a fair number of very 
> competent cryptographers.  Also, I am encouraged by the widespread 
> international participation that the NIST competitions and selection 
> processes attract.

Of course, NIST has done (and is still doing) a great job at organizing public 
competitions, where all submissions have to include a design rationale and 
where the final selection is based on extensive openly published cryptanalysis 
and comparisons done by the cryptographic community. Obvious examples are AES 
and SHA3. However, NIST has also put forward NSA designs as standards, without 
any design rationale or public cryptanalysis, and in some cases even with 
built-in backdoors (Dual_EC_DRBG, as Dan probably remembers). Examples of the 
latter are DES, SHA-0, SHA-1 and, yes, SHA2. The former approach we would call 
open-source philosophy, the latter closed-source.

> As such, and reflecting this bias, in the internal discussions that Johannes 
> alluded to, SHA2 and SHA3 were the primary suggestions.  There was a slight 
> preference for SHA2 because SHA3 is not exposed through the windows 
> cryptographic APIs (though Git does not use those, so this is a nonissue for 
> this discussion.)

We find it cynical to bring up a Microsoft-internal argument that is actually 
not relevant to Git.

> I also wanted to thank Johannes for keeping the cryptographers that he 
> discussed this with anonymous.  After all, cryptographers are known for being 
> private.  And I wanted to say that Johannes did, in fact, accurately 
> represent our internal discussions on the matter.

Our experience is that the cryptographic community has many outspoken 
individuals who fearlessly voice their opinions (sometimes even controversial 
ones).

> I also wanted to comment on the discussion of the "internal state having the 
> same size as the output."  Linus referred to this several times.  This is 
> known as narrow-pipe vs wide-pipe in the hash function design literature.  
> Linus is correct that wide-pipe designs are more in favor currently, and 
> IIRC, all of the serious SHA3 candidates employed this.  That said, it did 
> seem that in the discussion this was being equated with "length extension 
> attacks."  And that connection is just not accurate.  Length extension 
> attacks are primarily a motivation of the HMAC liked nested hashing design 
> for MACs, because of a potential forgery attack.  Again, this doesn't really 
> matter because the decision has been made despite this discussion.  I just 
> wanted to set the record straight about this, as to avoid doing the right 
> thing for the wrong reason (T.S. Elliot's "greatest treason.")

Indeed, vulnerability to length extension attacks and the size of the internal 
state (the chaining value in compression-function-based designs, or the 
capacity in sponge functions) are two different things. Still, we would like to 
make a few points here.

1) There were SHA3 submissions that were narrow-pipe; e.g., finalist BLAKE is 
narrow-pipe.

2) SHA2 and its predecessors are vulnerable to length extension; SHA3 (and all 
of the SHA3 finalists) is not. Length extension is a problem when using the 
hash function for MAC computation, but it can be fixed by putting a 
construction on top of the hash function. That construction is HMAC, which 
comes with some fixed overhead (up to a factor of 4 for short messages).

3) The relatively large state in the sponge construction increases the generic 
strength against attacks when the input contains redundancy or has a certain 
form. For instance, if the input is restricted to ASCII text (such as source 
code), then the collision resistance grows beyond the nominal 2^{c/2}. Such an 
effect does not exist in narrow-pipe Merkle-Damgård designs. (This may be what 
Linus intuitively had in mind.)
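To make point 2 concrete, here is a minimal sketch of the HMAC construction
over SHA-256. The nested structure is what fixes length extension, and it is
also the source of the overhead mentioned above: for a short message, each of
the two hash calls processes at least one extra 64-byte block, so roughly four
compression-function calls are spent where an unkeyed hash would spend one.

```python
import hashlib
import hmac  # standard library reference, used below only to cross-check

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    block = 64  # SHA-256 block size in bytes
    if len(key) > block:
        # Overlong keys are first hashed down.
        key = hashlib.sha256(key).digest()
    key = key.ljust(block, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()   # first (inner) hash call
    return hashlib.sha256(opad + inner).digest()  # second (outer) hash call
```

This matches hmac.new(key, msg, hashlib.sha256).digest() from the standard
library. With a sponge-based hash such as SHA3 or K12, a MAC can instead be
computed as a single hash of key || message, with no such overhead.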

> One other thing that I wanted to throw out there for the future is that in 
> the crypto community there is currently a very large push to post quantum 
> cryptography.  Whether the threat of quantum computers is real or imagined 
> this is a hot area of research, with a NIST competition to select post 
> quantum asymmetric cryptographic algorithms.  That is not directly of concern 
> to the selection of a hash function.  However, if we take this threat as 
> legitimate, quantum computers reduce the strength of symmetric crypto, both 
> encryption and hash functions, by 1/2.  

This is not what the experts say. In [1] a quantum algorithm is given that 
reduces the effort to generate a hash collision to 2^{n/3} (instead of the 
classical 2^{n/2}). So according to [1] the strength is reduced to 2/3 of its 
classical value, not to 1/2. Moreover, in [2] Dan Bernstein takes a more 
detailed look at the actual cost of that algorithm and argues that it performs 
worse than classical collision search, so that there is no security reduction 
at all for collision resistance.
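To spell out the arithmetic for a 256-bit hash (a back-of-the-envelope
comparison of exponents only, not a cost model in the sense of [2]):

```python
n = 256                 # hash output size in bits
classical = n / 2       # classical birthday bound: 2^128
bht = n / 3             # Brassard-Hoyer-Tapp bound [1]: about 2^85.3
halved = n / 4          # the "strength halved" reading: 2^64
# The BHT exponent is 2/3 of the classical one, not 1/2:
print(bht / classical)
```

So even taking [1] at face value, a 256-bit hash retains about 85 bits of
quantum collision resistance rather than the 64 bits the "halving" rule of
thumb suggests, and [2] argues the classical 128-bit figure effectively stands.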

[1] Gilles Brassard, Peter Høyer, Alain Tapp, Quantum cryptanalysis of hash and 
claw-free functions, in LATIN'98 proceedings (1998), 163–169.

[2] Daniel J. Bernstein, Cost analysis of hash collisions: Will quantum 
computers make SHARCS obsolete? Workshop Record of SHARCS'09.

> So, if this is the direction that the crypto community ultimately goes in, 
> 512bit hashes will be seen as standard over the next decade or so.  I don't 
> think that this should be involved in this discussion, presently.   I'm just 
> saying that not unlike the time when SHA1 was selected, I think that the 
> replacement of a 256bit hash is on the horizon as well.
>
> Thanks,
> Dan Shumow
>
