ni...@lysator.liu.se (Niels Möller) writes:

  L(top):
        mov     (ap, n, 8), %rdx
        mulx    %r8, alo, hi
        adox    ahi, alo
        mov     hi, ahi                 C 2-way unroll.
        adox    zero, ahi               C Clears O
        
        mov     (bp, n, 8), %rdx
        mulx    %r9, blo, hi
        adox    bhi, blo
        mov     hi, bhi
        adox    zero, bhi               C clears O

        adc     blo, alo                C Or sbb, for addsubmul_1msb0
        mov     alo, (rp, n, 8)
        inc     n
        jnz     L(top)

Problem: the adc will write a useless value to the O flag.  That is then
read by the first adox of the next iteration, yielding incorrect results.
Clearing O without creating any (too costly) false dependencies could
perhaps be done with an additional dummy adox zero, zero.
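
To make the flag interaction concrete, here is a toy C model (not GMP
code) of how adc and adox treat the C and O flags.  The struct, the
function names and the operand values are made up for illustration.
Only these two instructions are modelled; the rest of the loop,
including inc, is left out.  The dummy add writes to a discarded
result, since in this model an adox destination picks up the old O bit.

```c
#include <stdint.h>

/* Toy model of the two flags involved; all names are hypothetical. */
struct flags { unsigned cf, of; };

/* adc src, dst: dst += src + C.  Writes C (unsigned carry) and
   O (signed overflow). */
static uint64_t model_adc(struct flags *f, uint64_t dst, uint64_t src)
{
    uint64_t t = dst + src;
    uint64_t r = t + f->cf;
    /* Signed overflow: operands share a sign that the result lacks. */
    f->of = (unsigned)((~(dst ^ src) & (dst ^ r)) >> 63);
    f->cf = (unsigned)((t < dst) | (r < t));
    return r;
}

/* adox src, dst: dst += src + O.  Writes O only, as an unsigned carry. */
static uint64_t model_adox(struct flags *f, uint64_t dst, uint64_t src)
{
    uint64_t t = dst + src;
    uint64_t r = t + f->of;
    f->of = (unsigned)((t < dst) | (r < t));
    return r;
}

/* "adc blo, alo" at the end of one iteration, then "adox ahi, alo" at
   the start of the next.  If clear_o is nonzero, a dummy adox of 0 + 0
   runs in between.  Returns what the second iteration's adox computes. */
static uint64_t adox_after_adc(uint64_t alo, uint64_t blo,
                               uint64_t next_alo, uint64_t next_ahi,
                               int clear_o)
{
    struct flags f = { 0, 0 };
    (void) model_adc(&f, alo, blo);
    if (clear_o)
        (void) model_adox(&f, 0, 0);    /* result discarded; O is now 0 */
    return model_adox(&f, next_alo, next_ahi);
}
```

With alo = blo = 0x4000000000000000 the adc sets O, so
adox_after_adc(alo, blo, 10, 5, 0) gives 16 rather than 15, while
passing clear_o = 1 gives 15.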

Another remedy would be to use adcx instead of adc, but then we cannot
easily make an addsubmul_1msb0 variant, since adcx has no subtracting
counterpart.
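
A toy C model (not GMP code; the struct and function names are
hypothetical) of why adcx sidesteps the problem: it reads and writes
only C, so whatever the adox chain left in O survives the final add.

```c
#include <stdint.h>

/* Toy model of the two flags involved; all names are hypothetical. */
struct flags { unsigned cf, of; };

/* adc src, dst: dst += src + C.  Writes both C and O. */
static uint64_t model_adc(struct flags *f, uint64_t dst, uint64_t src)
{
    uint64_t t = dst + src;
    uint64_t r = t + f->cf;
    /* Signed overflow: operands share a sign that the result lacks. */
    f->of = (unsigned)((~(dst ^ src) & (dst ^ r)) >> 63);
    f->cf = (unsigned)((t < dst) | (r < t));
    return r;
}

/* adcx src, dst: dst += src + C.  Writes C only; O is left untouched. */
static uint64_t model_adcx(struct flags *f, uint64_t dst, uint64_t src)
{
    uint64_t t = dst + src;
    uint64_t r = t + f->cf;
    f->cf = (unsigned)((t < dst) | (r < t));
    return r;
}

/* Returns the O flag after adding blo to alo, starting from C = O = 0. */
static unsigned o_after_add(uint64_t alo, uint64_t blo, int use_adcx)
{
    struct flags f = { 0, 0 };
    if (use_adcx)
        (void) model_adcx(&f, alo, blo);
    else
        (void) model_adc(&f, alo, blo);
    return f.of;
}
```

For alo = blo = 0x4000000000000000, o_after_add returns 1 with adc and
0 with adcx.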

-- 
Torbjörn
Please encrypt, key id 0xC8601622
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
