On Sun, 16 Jun 2019 at 23:31, Kees Cook <[email protected]> wrote:
>
> On Sat, Jun 15, 2019 at 04:18:21PM +0200, Ard Biesheuvel wrote:
> > Yes, I am using the same saturation point as x86. In this example, I
> > am not entirely sure I understand why it matters, though: the atomics
> > guarantee that the write by CPU2 fails if CPU1 changed the value in
> > the meantime, regardless of which value it wrote.
> >
> > I think the concern is more related to the likelihood of another CPU
> > doing something nasty between the moment that the refcount overflows
> > and the moment that the handler pins it at INT_MIN/2, e.g.,
> >
> > > CPU 1                   CPU 2
> > > inc()
> > >   load INT_MAX
> > >   about to overflow?
> > >   yes
> > >
> > >   set to 0
> > >                          <insert exploit here>
> > >   set to INT_MIN/2
>
> Ah, gotcha, but the "set to 0" is really "set to INT_MAX+1" (not zero)
> if you're using the same saturation.
>

Of course. So there is no issue here: whatever manipulations are
racing with the overflow handler can never cause the counter to
unsaturate.
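
To make the arithmetic concrete, here is a minimal sketch in plain C. It
is only a model of the values involved, not the arm64 implementation:
the model_* names are hypothetical, the counter is held as uint32_t so
that wraparound is well defined, and a 32-bit int is assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(sizeof(int) == 4, "model assumes a 32-bit int");

/* Saturation value named in the thread: INT_MIN / 2 (0xC0000000 as raw bits). */
#define MODEL_SATURATED ((uint32_t)(INT_MIN / 2))

/* A set sign bit means "overflowed or saturated" in this model. */
static int model_is_negative(uint32_t raw)
{
	return (raw & 0x80000000u) != 0;
}

/* Raw wrapping add, standing in for the store that precedes the check. */
static uint32_t model_raw_add(uint32_t raw, int32_t delta)
{
	return raw + (uint32_t)delta;
}

/* Overflow handler in this model: pin the counter unconditionally. */
static uint32_t model_pin(void)
{
	return MODEL_SATURATED;
}
```

In this model, INT_MAX + 1 wraps to INT_MIN rather than 0, so the
intermediate value is already negative, and once pinned the counter
sits 2^30 steps away from both zero and the wraparound point.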

And actually, moving the checks before the stores is not as trivial as
I thought. E.g., for the LSE refcount_add case, we have

        "       ldadd           %w[i], w30, %[cval]\n"                  \
        "       adds            %w[i], %w[i], w30\n"                    \
        REFCOUNT_PRE_CHECK_ ## pre (w30))                               \
        REFCOUNT_POST_CHECK_ ## post                                    \

and changing this into load/test/store defeats the purpose of using
the LSE atomics in the first place.
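
The difference between the two shapes can be sketched with C11 atomics.
This is an illustration only, not the kernel code: the sketch_* names
are hypothetical, relaxed ordering is chosen for brevity, and whether
the fetch-add compiles to a single LDADD depends on the compiler and
target flags.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdint.h>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

#define SKETCH_SATURATED ((uint32_t)(INT_MIN / 2))

/* Post-check: one atomic add, then inspect the result and pin if negative. */
static void sketch_add_post_check(_Atomic uint32_t *r, uint32_t i)
{
	uint32_t old = atomic_fetch_add_explicit(r, i, memory_order_relaxed);
	uint32_t sum = old + i;

	if (sum & 0x80000000u)	/* overflowed, or was already saturated */
		atomic_store_explicit(r, SKETCH_SATURATED, memory_order_relaxed);
}

/* Pre-check: load, test, and only then store, via a compare-exchange loop. */
static void sketch_add_pre_check(_Atomic uint32_t *r, uint32_t i)
{
	uint32_t old = atomic_load_explicit(r, memory_order_relaxed);

	for (;;) {
		uint32_t sum = old + i;

		if (sum & 0x80000000u)
			sum = SKETCH_SATURATED;
		/* On failure, old is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak_explicit(r, &old, sum,
							  memory_order_relaxed,
							  memory_order_relaxed))
			break;
	}
}
```

The pre-check variant never stores an overflowed value, but it needs a
load plus a retry loop, which is the same shape as the cmpxchg-based
baseline measured below.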

On my single core TX2, the comparative performance is as follows:

Baseline: REFCOUNT_TIMING test using REFCOUNT_FULL (LSE cmpxchg)
      191057942484      cycles                    #    2.207 GHz
      148447589402      instructions              #    0.78  insn per cycle

      86.568269904 seconds time elapsed

Upper bound: ATOMIC_TIMING
      116252672661      cycles                    #    2.207 GHz
       28089216452      instructions              #    0.24  insn per cycle

      52.689793525 seconds time elapsed

REFCOUNT_TIMING test using LSE atomics
      127060259162      cycles                    #    2.207 GHz
                 0      instructions              #    0.00  insn per cycle

      57.243690077 seconds time elapsed
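
As a rough way to read these numbers, the elapsed times can be turned
into relative overheads against the ATOMIC_TIMING upper bound. The
helper below is a hypothetical illustration; only the three constants
come from the perf output above.

```c
#include <assert.h>

/* Elapsed times copied from the perf output above, in seconds. */
static const double baseline_full_s = 86.568269904; /* REFCOUNT_FULL (LSE cmpxchg) */
static const double atomic_s        = 52.689793525; /* ATOMIC_TIMING upper bound */
static const double lse_refcount_s  = 57.243690077; /* REFCOUNT_TIMING, LSE atomics */

/* Extra elapsed time relative to a reference run, as a percentage. */
static double overhead_percent(double measured_s, double reference_s)
{
	assert(reference_s > 0.0);
	return (measured_s / reference_s - 1.0) * 100.0;
}
```

With these inputs, overhead_percent(lse_refcount_s, atomic_s) comes out
at roughly 8.6, and overhead_percent(baseline_full_s, atomic_s) at
roughly 64.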
