Hello, today I found a bug in the latest GMP (6.3) using uninitialized memory 
in /mpn/generic/mod_1_1.c while trying to understand some functions like 
mpn_mod_1_1p_cps.

Take a look at function mpn_mod_1() in /mpn/generic/ mod_1.c on lines 248 - 250

mp_limb_t pre[4];
mpn_mod_1_1p_cps (pre, b);
return mpn_mod_1_1p (ap, n, b, pre);

mp_limb_t pre[4] is not initialized and the mpn_mod_1_1p_cps() function never 
writes to pre[2]. So if we change that to mp_limb_t pre[4] = { -1, -1, -1, -1 
}; as a test we'll quickly see that inside mpn_mod_1_1p_cps() the value cps[2] 
(pre[2]) is never written to and if we print out pre[4] after running it we'll 
get output like: 0x21CFE6CFC938B36BU, 0x0000000000000000U, 0xFFFFFFFFFFFFFFFFU, 
0x9581CA13918612E1U (pre[2] is -1)

/mpn/generic/mod_1_1.c lines 260-266:

  if (LIKELY (cnt != 0))
    {
      mp_limb_t B1modb = -b;
      B1modb *= ((bi >> (GMP_LIMB_BITS-cnt)) | (CNST_LIMB(1) << cnt));
      ASSERT (B1modb <= b);             /* NB: not fully reduced mod b */
      cps[2] = B1modb >> cnt;
    }

cnt will ALWAYS equal 0 since there will be NO leading 0's when this is called 
since /mpn/generic/mod_1_1.c checks for the high bit being SET before calling.. 
which means this function only ever gets called with NO leading zeros!!!

ALSO there are unexplained differences between MOD_1_1P_METHOD 1 and 
MOD_1_1P_METHOD 2 when it comes to the function mpn_mod_1_1p_cps(). In one of 
them pre[4] is always set and in the other (method 2) pre[2] is skipped! Why 
are these two functions different? It looks like their job is to calculate the 
inverses for the following mod functions so one would expect them to be the 
same.

Method 2 is the function that was written in x64 asm.

ALSO the x64 assembly versions of these codes also does not always set all 4 
values of pre[4] and this has lead to incorrect results when using the exported 
function mpn_mod_1().

/mpn/x86_64/mod_1_1.asm

        neg     %r12
        mov     %r12, %r8
        mov     %rax, (%rbx)            C store bi
        mov     %rbp, 8(%rbx)           C store cnt
        imul    %rax, %r12
        mov     %r12, 24(%rbx)          C store B2modb
        mov     R32(%rbp), R32(%rcx)
        test    R32(%rcx), R32(%rcx)
        jz      L(z)

        mov     $1, R32(%rdx)
ifdef(`SHLD_SLOW',`
        C Destroys %rax, unlike shld. Otherwise, we could do B1modb
        C before B2modb, and get rid of the move %r12, %r8 above.

        shl     R8(%rcx), %rdx
        neg     R32(%rcx)
        shr     R8(%rcx), %rax
        or      %rax, %rdx
        neg     R32(%rcx)
        shld    R8(%rcx), %rax, %rdx
        imul    %rdx, %r8
        shr     R8(%rcx), %r8
        mov     %r8, 16(%rbx)           C store B1modb
L(z):
        pop     %r12
        pop     %rbx
        pop     %rbp
        FUNC_EXIT()
        ret

Notice on line 230 of the asm file (mov %r8, 16(%rbx) // C store B1modb) gets 
skipped over when count is 0 (which it will be 100% of the time when the 
high-bit is set).

I checked mod_1_2.c, mod_1_3.c, and mod_1_4.c and they do not seem to have this 
problem.

Also note that most compilers and run-times for me seem to have a flukey '0' 
here in memory (which will be the correct value when the high-bit is set) so 
that might be why no one has stumbled upon this bug yet or why it might pass 
tests.

I do not understand these functions enough to suggest a fix. I do not know if 
initializing pre[4] to all 0's will produce correct results.

Which is the correct version of mpn_mod_1_1p_cps() to use for now? (Method 1 or 
Method 2)

My apologies if this is not the best bug report ever.

-Brett Kuntz
_______________________________________________
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs

Reply via email to