On Thu, 18 Jan 2018 08:54:21 +0000, Otto Moerbeek wrote:
> Looking back, the rotor thing is ill-conceived indeed. I'm now
> testing the diff below. I still re-use r, because I really think a
> little bit of correlation does not hurt here.

Actually, I think it does hurt, because it introduces a lot of
regularity in the allocation patterns.  This could be exploited by an
attacker in some way, as it makes it much easier to reason about the
precise layout of those pages than an unbiased distribution would.  For
example, for chunks of size MALLOC_MAXCHUNK / 2, there should be 24
different ways to fill a given page, but currently only four of them are
possible: 0123, 1320, 2103 and 3012.  This is even worse in the new diff
that you suggest, where the only possible configurations are the most
trivial ones: 0123, 1230, 2301 and 3012.

Likewise, for all the other sizes (except MALLOC_MAXCHUNK of course),
the number of possible patterns that can arise from the current code (or
the one that you suggest) is considerably lower than the expected n!
permutations that would be generated by an unbiased approach.  Moreover,
these patterns tend to be very regular and easy to predict, as seen in
the above examples.

Are random bytes really so expensive as to justify introducing such
intentional biases?  I actually thought they were quite cheap,
especially seeing as, on average, more than 25% of them are merely
discarded (and never used) for the sole purpose of introducing noise in
the arc4random_buf(3) calls...

So what I would suggest is unconditionally initialising `r' with two
getrbyte() calls.  For all architectures, this gives enough random bits
to choose the list to use as well as the offset to start at, both in a
perfectly uniform way, without any correlation or bias of any kind, and
without the need for accumulating bits in an extra rotor variable or
doing some other conditional expansion of the resulting offset when
there are not enough bits in it.  Admittedly, this method does include
one additional getrbyte() call compared to the current code.  So the
question is: is that really a problem?

(By the way, your diff was also off by one in the check against
UCHAR_MAX, but this is a different issue.)

--- malloc.c.orig       Sun Jan 14 11:18:10 2018
+++ malloc.c    Thu Jan 18 09:45:39 2018
@@ -172,7 +172,6 @@ struct chunk_info {
        u_short free;                   /* how many free chunks */
        u_short total;                  /* how many chunks */
        u_short offset;                 /* requested size table offset */
-       u_short rotor;                  /* randomization rotor */
        u_short bits[1];                /* which chunks are free */
 };
 
@@ -941,7 +940,7 @@ malloc_bytes(struct dir_info *d, size_t size, void *f)
 
        j = find_chunksize(size);
 
-       r = getrbyte(d);
+       r = ((u_int)getrbyte(d) << 8) | getrbyte(d);
        listnum = r % MALLOC_CHUNK_LISTS;
        /* If it's empty, make a page more of that size chunks */
        if ((bp = LIST_FIRST(&d->chunk_dir[j][listnum])) == NULL) {
@@ -953,9 +952,7 @@ malloc_bytes(struct dir_info *d, size_t size, void *f)
        if (bp->canary != (u_short)d->canary1)
                wrterror(d, "chunk info corrupted");
 
-       if (bp->free > 1)
-               bp->rotor += r;
-       i = bp->rotor++ & (bp->total - 1);
+       i = (r / MALLOC_CHUNK_LISTS) & (bp->total - 1);
 
        /* start somewhere in a short */
        lp = &bp->bits[i / MALLOC_BITS];

Regards,

kshe
