Glossary of attacks (was: Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format)

Daniel Shahaf Thu, 26 Jan 2023 01:38:14 -0800

Definitions of attacks:

1. Collision attack:
   Given h(),
   find x₁ ≠ x₂ such that h(x₁) == h(x₂).

2. Second preimage attack:
   Given h() and x,
   find x′ ≠ x such that h(x) == h(x′).

3. First preimage attack:
   Given h() and y,
   find x such that h(x) == y.

4. Chosen prefix attack:
   Given h(), p₁, and p₂,
   find m₁, m₂ such that h(m₁) == h(m₂) and m₁.startswith(p₁) and
   m₂.startswith(p₂).
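
The four definitions above can be written as executable predicates. What
follows is only a sketch under stated assumptions: h() is a toy hash (SHA-1
truncated to 16 bits, so that brute force finishes quickly), and every
function name is made up for illustration. The inputs are required to be
distinct, since otherwise the equality conditions hold trivially. For an
ideal n-bit hash, a generic collision search costs about 2^(n/2) evaluations
and a generic preimage search about 2^n, which is why the toy digest is kept
this small.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size (assumption); real SHA-1 digests are 160 bits


def h(data: bytes) -> int:
    """Toy hash for illustration: the top BITS bits of SHA-1(data)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") >> (160 - BITS)


# One predicate per definition. Inputs must be distinct, since otherwise
# the equality conditions hold trivially.
def is_collision(x1: bytes, x2: bytes) -> bool:
    return x1 != x2 and h(x1) == h(x2)


def is_second_preimage(x: bytes, x_prime: bytes) -> bool:
    return x != x_prime and h(x) == h(x_prime)


def is_first_preimage(y: int, x: bytes) -> bool:
    return h(x) == y


def is_chosen_prefix_collision(p1: bytes, p2: bytes,
                               m1: bytes, m2: bytes) -> bool:
    return (m1 != m2 and h(m1) == h(m2)
            and m1.startswith(p1) and m2.startswith(p2))


def candidates():
    """Endless stream of distinct byte strings: b"0", b"1", b"2", ..."""
    return (str(i).encode() for i in count())


# Brute-force searches. They are feasible only because BITS is tiny.
def find_collision():
    seen = {}  # digest -> first input that produced it
    for x in candidates():
        d = h(x)
        if d in seen:
            return seen[d], x
        seen[d] = x


def find_second_preimage(x: bytes) -> bytes:
    return next(c for c in candidates() if is_second_preimage(x, c))


def find_first_preimage(y: int) -> bytes:
    return next(c for c in candidates() if is_first_preimage(y, c))


def find_chosen_prefix_collision(p1: bytes, p2: bytes):
    seen1, seen2 = {}, {}  # digest -> message starting with p1 / with p2
    for s in candidates():
        m1, m2 = p1 + s, p2 + s
        d1, d2 = h(m1), h(m2)
        if d1 in seen2 and seen2[d1] != m1:
            return m1, seen2[d1]
        if d2 in seen1 and seen1[d2] != m2:
            return seen1[d2], m2
        if d1 == d2 and m1 != m2:
            return m1, m2
        seen1[d1], seen2[d2] = m1, m2


if __name__ == "__main__":
    print("collision:", find_collision())
    print("chosen-prefix:", find_chosen_prefix_collision(b"good:", b"evil:"))
```

The collision search gets to pick both inputs, whereas the two preimage
searches are pinned to a given x or y.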

Daniel Shahaf wrote on Thu, Jan 26, 2023 at 09:33:59 +0000:
> Evgeny Kotkov via dev wrote on Mon, Jan 23, 2023 at 02:28:50 +0300:
> > However, with the feasibility of chosen-prefix attacks on SHA-1 [2], it's
> > probably only a matter of time until the situation becomes worse.
> > 
> 
> Quoting the third hunk of 
> <https://mail-archives.apache.org/mod_mbox/subversion-dev/202212.mbox/%3C20221220201300.GH32332%40tarpaulin.shahaf.local2%3E>:
> 
>     What's the acceptance test we use for candidate checksum algorithms?
>     
>     You say we should switch to a checksum algorithm that doesn't have known
>     collisions, but, why should we require that?  Consider the following
>     160-bit checksum algorithm:
>     .
>         1. If the input consists of 40 ASCII lowercase hex digits and
>            nothing else, return the input.
>         2. Else, return the SHA-1 of the input.
>     
>     This algorithm has a trivial first preimage attack.  If a wc used this
>     identity-then-sha1 algorithm instead of SHA-1, then… what?
> 
> > That could happen after a public disclosure of a pair of executable
> > files/scripts where the forged version allows for remote code execution.
> > Or maybe something similar with a file format that is often stored in
> > repositories and that can be executed or used by a build script, etc.
> > 
> 
> Err, hang on.  Your reference described a chosen-prefix attack, while
> this scenario concerns a single public collision.  These are two
> different things.
> 
> Disclosure of a pair of executable files/scripts isn't by itself
> a problem unless one of the pair ("file A") is in a repository
> somewhere.  Now, was the colliding file ("file B") generated _before_ or
> _after_ file A was committed?
> 
> - If _before_, then it would seem Mallory had somehow managed to:
> 
>   1. get a file of his choosing committed to Alice's repository; and
> 
>   2. get a wc of Alice's repository into one of the codepaths that
>      assume SHA-1 is one-to-one / collision-free (currently that's the
>      ra_serf optimization and the 1.15 wc status).
> 
>   Now, step #1 seems plausible enough.  As to step #2, it's not clear to
>   me how file B would reach the wc in step #2… but insofar as security
>   assumptions go, it seems reasonable to assume Mallory can make this
>   happen.
> 
>   So, I agree it's a scenario we should address.  What options do we
>   have to address it?  (I grant that migrating away from SHA-1 is one
>   option.)
> 
> - If _after_, then you're presuming not simply a collision attack but
>   a second preimage attack.  Should we assume Mallory to be able to
>   mount a second preimage attack?
> 
> Chosen-prefix collision attacks can help Mallory in a variant of the
> "before" case: Mallory computes a collision, sends file A to Alice (who
> commits it), and invokes his assumed ability to inject file B into
> Alice's wc.  This would work for file formats that ignore the unchosen
> suffix.
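
For concreteness, the identity-then-sha1 algorithm quoted above can be
sketched as follows. This is a minimal sketch, assuming the 160-bit checksum
is represented as 40 lowercase hex digits; the names identity_then_sha1 and
first_preimage are hypothetical and do not refer to anything in Subversion.

```python
import hashlib
import re

# Exactly 40 ASCII lowercase hex digits and nothing else.
_HEX40 = re.compile(rb"[0-9a-f]{40}")


def identity_then_sha1(data: bytes) -> str:
    """Return a 160-bit checksum as 40 hex digits (hypothetical helper)."""
    if _HEX40.fullmatch(data):
        return data.decode("ascii")        # step 1: input is its own checksum
    return hashlib.sha1(data).hexdigest()  # step 2: ordinary SHA-1


def first_preimage(y: str) -> bytes:
    """Trivial first preimage: the target checksum itself, as bytes."""
    return y.encode("ascii")


if __name__ == "__main__":
    y = hashlib.sha1(b"some file contents").hexdigest()
    assert identity_then_sha1(first_preimage(y)) == y
```

In this sketch, a file and the 40-byte hex rendering of its SHA-1 also get
the same checksum, so the first preimage comes with a collision for free.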
