Renaud Allard <[email protected]> wrote:
> The readhash() function in diff(1) uses a signed int for the
> hash accumulator.  The computation `sum = sum * 127 + t` overflows
> after about 5 characters on any line containing high byte values.
> 
> Trace with input "3\xd9100\n" compared against /dev/null:
> 
>   sum = 1
>   t = '3' (51):   sum = 1 * 127 + 51       =        178
>   t = 0xd9 (217): sum = 178 * 127 + 217     =     22,823
>   t = '1' (49):   sum = 22823 * 127 + 49    =  2,898,570
>   t = '0' (48):   sum = 2898570 * 127 + 48  = 368,118,438
>   t = '0' (48):   sum = 368118438 * 127 + 48  -> overflow (> INT_MAX)
> 
> Signed integer overflow is undefined behavior in C.  Since sum is
> only used as a hash value for equality comparison and sorting, the
> fix is to make it unsigned int so the multiplication wraps
> deterministically.
> 
> Found by AFL++ fuzzing with UBSan.

makes sense to me; okay op@

> Index: usr.bin/diff/diffreg.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/diff/diffreg.c,v
> retrieving revision 1.95
> diff -u -p -r1.95 diffreg.c
> --- usr.bin/diff/diffreg.c    24 Oct 2021 21:24:16 -0000      1.95
> +++ usr.bin/diff/diffreg.c
> @@ -1177,7 +1177,7 @@ static int
>  readhash(FILE *f, int flags)
>  {
>       int i, t, space;
> -     int sum;
> +     unsigned int sum;
> 
>       sum = 1;
>       space = 0;


Reply via email to