Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Torbjörn Granlund
t...@gmplib.org (Torbjörn Granlund) writes:

  A crude test generator:

A slightly better variant.  It expects an argument, either "add" or
"sub", allowing it to test both add_ss and sub_ddmmss.

This version should work fine on 32-bit systems without modification.

This code is not well-defined (as Vincent might point out) as it
converts between signed and unsigned quantities in the range where the
signed number is negative.  But it does the job I intended it to do.

I will integrate this in some form in the GMP test facility.



gen-longlong.c
Description: Binary data


-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Paul Zimmermann
   Dear Torbjörn,

> A crude test generator: [...]

great! On gcc115 it fails with the longlong.h shipped with GMP 6.2.0:

zimmerma@gcc115:/tmp/gmp-6.2.0$ /opt/cfarm/gcc-latest/bin/gcc test.c
zimmerma@gcc115:/tmp/gmp-6.2.0$ ./a.out > test1.c
zimmerma@gcc115:/tmp/gmp-6.2.0$ /opt/cfarm/gcc-latest/bin/gcc test1.c -lgmp
zimmerma@gcc115:/tmp/gmp-6.2.0$ ./a.out 
error for f2(0x1,0x0): want (0x0,0x1) got (0x,0x1)
...
error for f4034(0x7fff,0x0): want (0x0,0x7fff) got 
(0x,0x7fff)

With the patch you sent it works:

zimmerma@gcc115:/tmp/gmp-6.2.0$ ./a.out
zimmerma@gcc115:/tmp/gmp-6.2.0$ 

Paul
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Torbjörn Granlund
Vincent Lefevre  writes:

  Note: in the tests, it is important to test the macro on constants
  in order to test the "if" case.

A crude test generator:

#include 
#include 
#include "gmp-impl.h"

void
one (size_t ind, mp_limb_t m0, mp_limb_t s0)
{
  printf ("void f%zu(mp_limb_t* r1p, mp_limb_t* r0p) {\n", ind);
  printf ("mp_limb_t r1, r0;\n");
  printf ("sub_ddmmss (r1, r0, 0, %ld, 0, %ld);\n", (long) m0, (long) s0);
  printf ("*r1p = r1;  *r0p = r0;\n");
  printf ("}\n");
}

mp_limb_t ops[1000];
mp_limb_t res1[1000];
mp_limb_t res0[1000];

int
main ()
{
  size_t n_operands = 0;
  size_t n_functions = 0;

  for (int i = 0; i < 16; i++)
{
  ops[n_operands++] = 1 << i;
  ops[n_operands++] = -(1 << i);
  ops[n_operands++] = (1 << i) - 1;
  ops[n_operands++] = -(1 << i) - 1;
}

  printf ("#include \n");
  printf ("#include \n");
  printf ("#include \"gmp-impl.h\"\n");
  printf ("#include \"longlong.h\"\n");

  /* Generate functions and print them.  */
  for (int i = 0; i < n_operands; i++)
{
  for (int j = 0; j < n_operands; j++)
{
  one (n_functions++, ops[i], ops[j]);
}
}

#if 1
  /* Print out ops[] definition.  */
  printf ("int ops[%zu] = {\n", n_operands);
  for (int i = 0; i < n_operands; i++)
{
  printf ("%ld,", (long) ops[i]);
  if ((i + 1) % 4 == 0)
puts ("");
}
  printf ("};\n");
#endif

  /* Print out function pointer table.  */
  printf ("typedef void (*func_t) (mp_limb_t*, mp_limb_t*);\n");
  printf ("func_t funcs[%zu] = {\n", n_functions);
  for (size_t i = 0; i < n_functions; i++)
{
  printf ("f%zu,", i);
  if ((i + 1) % 16 == 0)
puts ("");
}
  printf ("};\n");

  /* Print out table of reference results.  */
  printf ("mp_limb_t ref[%zu][2] = {\n", n_functions);
  for (int i = 0; i < n_operands; i++)
{
  for (int j = 0; j < n_operands; j++)
{
  printf ("{0x%llx,0x%llx},\n", (unsigned long long) ( ops[i] - 
ops[j]), (unsigned long long) (-(mp_limb_t) (ops[i] < ops[j])));
}
}
  printf ("};\n");


  printf ("int main ()\n{\n");
  printf ("  mp_limb_t r1, r0;\n");
  printf ("  int err = 0;\n");
  printf ("  size_t ind = 0;\n");
  printf ("  for (size_t i = 0; i < %zu; i++)\n", n_functions);
  printf ("{\n");
  printf ("  int ii = i / %zu, jj = i %% %zu;\n", n_operands, n_operands);
  printf ("  funcs[i](, );\n");
  printf ("  if (r0 != ref[ind][0] || r1 != ref[ind][1]) {\n");
  printf (" gmp_printf (\"error for f%%zu(0x%%Mx,0x%%Mx): want 
(0x%%Mx,0x%%Mx) got (0x%%Mx,0x%%Mx)\\n\", i, ops[ii], ops[jj], ref[ind][1], 
ref[ind][0], r1, r0);\n");
  printf (" err++;\n");
  printf ("   }\n");
  printf ("  ind++;\n");
  printf ("}\n");


  printf ("  return err != 0;\n");
  printf ("}\n");
  return 0;
}


-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Niels Möller
Paul Zimmermann  writes:

>Dear Niels,
>
>> The excluded case,
>> 
>>   sub_ddmmss(ah, al, bh, /*compile time constant*/0) 
>> 
>> could clearly be optimized, in a different way, but I'd guess it's rare
>> enough in real code to not be worth the effort?

Sorry for being unclear, sub_ddmmss has six arguments, and the case I
wanted to single out was

   sub_ddmmss(rh, rl, ah, al, bh, /*compile time constant*/0)
 
> in MPFR we have 15 places where we call sub_ddmmss, among which 8 have bl=0:

Those seem to have al == 0 (in above notation), which is a different
case.

  sub_ddmmss (rh, rl, ah, 0, bh, bl)

should be the same as

  rh = ah - bh - (bl > 0);
  rl = -bl;

So we have a borrow to propagate except of also bl == 0, and hence some
runtime carry logic is needed.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Paul Zimmermann
   Dear Niels,

> The excluded case,
> 
>   sub_ddmmss(ah, al, bh, /*compile time constant*/0) 
> 
> could clearly be optimized, in a different way, but I'd guess it's rare
> enough in real code to not be worth the effort?

in MPFR we have 15 places where we call sub_ddmmss, among which 8 have bl=0:

$ grep sub_ddmmss *.[ch] | grep -v "mpfr-longlong" | grep -v "mpfr-gmp"
div.c:  sub_ddmmss (h, l, u0, 0, h, l);
div.c:sub_ddmmss (h, l, u0, 0, h, l);
div.c:sub_ddmmss (r3, r2, r3, r2, v1, v0);
div.c:  sub_ddmmss (s1, s0, s1, s0, v1, v0);
invsqrt_limb.h:sub_ddmmss (_h, _l, _u, 0, _h, _l); \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, _r >> 60, _r << 4); \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, 0, 64); \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, _r >> 61, _r << 3); \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, 0, 16); \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, _r >> 62, _r << 2); \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, 0, 4);  \
invsqrt_limb.h:sub_ddmmss (_h, _l, _h, _l, _r >> 63, (_r << 1) + 1);   \
rec_sqrt_limb.h:sub_ddmmss (_h, _l, 0x4000UL, 0, _h, _l);   
\
sqrt.c:  sub_ddmmss (rb, sb, u0, 0, rb, sb);
sqrt.c:  sub_ddmmss (rb, sb, u0, low, rb, sb);

Best regards,
Paul
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Torbjörn Granlund
ni...@lysator.liu.se (Niels Möller) writes:

  And, which I guess is more relevant in the sub_ddmmss context, it also
  means that there's little need for separate instructions for adding and
  subtracting constants.

The error we struck here for ARM and (one of the errors) for PPC was
that we added 0 instead of subtracting zero, the latter meaning adding
~0 + 1.

Using the ARM "subs rd,rm,imm12" instruction, we compute

{cout, rd} = rm + ~imm + 1

while the "adds rd,rm,imm12" instruction, we compute

{cout, rd} = rm + imm

.  which is quite different.  The former will for example always set
cout when rm = imm = 0 as in Vincent's example.  The latter will never
set carry when imm = 0 or rm = 0;


-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Niels Möller
t...@gmplib.org (Torbjörn Granlund) writes:

>   >   {cout, r} = a + ~b + cin
>
>   This is a - b - borrow, where the borrow is the complement of the
>   carry bit.
>
> Niels' definition is important as it captures the similarity with
> addition.  It is indeed how the instructions are described in the vendor
> manuals.

And it's closer to the hardware, since this makes it clear that there's
only one stage of carry propagation (~b is a totally parallel bit-by-bit
inversion, while logic to constuct -b is slower and more complex), done
identically as for addition. Likely using something like
https://en.wikipedia.org/wiki/Kogge%E2%80%93Stone_adder.

Then, the ISA could of course complement the carry flag at input and
output, regardless of the rest of the hardware, like it's done on x86.
But I've found the ARM convention more convenient, e.g., when writing
wraparound code for special-form ECC-related primes.

And, which I guess is more relevant in the sub_ddmmss context, it also
means that there's little need for separate instructions for adding and
subtracting constants.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Torbjörn Granlund
Vincent Lefevre  writes:

  On 2020-06-17 11:04:26 +0200, Niels Möller wrote:
  > t...@gmplib.org (Torbjörn Granlund) writes:
  > 
  > > The two - signs ought to be ~, I think.  Let me think a buit more about 
that.
  > 
  > If I remember ARM conventions correctly, subtract with carry is defined
  > as
  > 
  >   {cout, r} = a + ~b + cin

  This is a - b - borrow, where the borrow is the complement of the
  carry bit.

Niels' definition is important as it captures the similarity with
addition.  It is indeed how the instructions are described in the vendor
manuals.

-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Vincent Lefevre
On 2020-06-17 11:04:26 +0200, Niels Möller wrote:
> t...@gmplib.org (Torbjörn Granlund) writes:
> 
> > The two - signs ought to be ~, I think.  Let me think a buit more about 
> > that.
> 
> If I remember ARM conventions correctly, subtract with carry is defined
> as
> 
>   {cout, r} = a + ~b + cin

This is a - b - borrow, where the borrow is the complement of the
carry bit.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Torbjörn Granlund
ni...@lysator.liu.se (Niels Möller) writes:

  t...@gmplib.org (Torbjörn Granlund) writes:

  > The two - signs ought to be ~, I think.  Let me think a buit more about 
that.

  If I remember ARM conventions correctly, subtract with carry is defined
  as

{cout, r} = a + ~b + cin

Here we traded non-borrow subtract and non-carry addition on both ARM
and PPC.

Subtract is defined on both these architectures as

{cout, r} = a + ~b + 1

while addition is refined similarly as

{cout, r} = a + b

.

-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-17 Thread Niels Möller
t...@gmplib.org (Torbjörn Granlund) writes:

> The two - signs ought to be ~, I think.  Let me think a buit more about that.

If I remember ARM conventions correctly, subtract with carry is defined
as

  {cout, r} = a + ~b + cin

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Paul Zimmermann
   Dear Torbjörn,

> This bug comes untimely for me.  I consider a major purge along the
> lines of the patch below. [...]

I confirm all MPFR 4.1.0-rc1 tests pass on gcc115 with this patch applied to
mpfr-longlong.h, both with GCC 9.1.0 and 10.1.0 (they failed with GCC 9.1.0),
where mpfr-longlong.h was copied from GMP 6.2.0, with minor changes.

If not already present, maybe some tests of the longlong.h routines that
use builtin_constant_p would be useful?

Paul
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Torbjörn Granlund
Vincent Lefevre  writes:

  Note: in the tests, it is important to test the macro on constants
  in order to test the "if" case.

Indeed, and one macro expansion per trivial function or else gcc might
get the idea cse constants.

This bug comes untimely for me.  I consider a major purge along the
lines of the patch below.

If you Vincent have the time to go through these cases to see what can
safely remain, that would be useful.

*** /tmp/extdiff.sqiwiR/gmp-main.89ed9376523e/longlong.hThu Jun 11 
15:53:20 2020
--- /home/tege/prec/gmp-main/longlong.h Tue Jun 16 15:59:10 2020
***
*** 436,453 
  && W_TYPE_SIZE == 32
  #define add_ss(sh, sl, ah, al, bh, bl) \
!   do {
\
! if (__builtin_constant_p (bl) && -(USItype)(bl) < 0x100)  \
!   __asm__ ("subs\t%1, %4, %5\n\tadc\t%0, %2, %3"  \
   : "=r" (sh), "=" (sl)  \
!  : "r" (ah), "rI" (bh),   \
!"%r" (al), "rI" (-(USItype)(bl)) __CLOBBER_CC);\
! else  \
!   __asm__ ("adds\t%1, %4, %5\n\tadc\t%0, %2, %3"  \
!  : "=r" (sh), "=" (sl)  \
!  : "r" (ah), "rI" (bh), "%r" (al), "rI" (bl) __CLOBBER_CC);   \
!   } while (0)
  /* FIXME: Extend the immediate range for the low word by using both ADDS and
!SUBS, since they set carry in the same way.  Note: We need separate
!definitions for thumb and non-thumb to to th absense of RSC under thumb.  
*/
  #if defined (__thumb__)
  #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
--- 436,445 
  && W_TYPE_SIZE == 32
  #define add_ss(sh, sl, ah, al, bh, bl) \
!   __asm__ ("adds\t%1, %4, %5\n\tadc\t%0, %2, %3"  \
   : "=r" (sh), "=" (sl)  \
!  : "r" (ah), "rI" (bh), "%r" (al), "rI" (bl) __CLOBBER_CC)
  /* FIXME: Extend the immediate range for the low word by using both ADDS and
!SUBS, since they set carry in the same way.  We need separate definitions
!for thumb and non-thumb since thumb lacks RSC.  */
  #if defined (__thumb__)
  #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
***
*** 564,592 
 ADDS and SUBS, since they set carry in the same way.  */
  #define add_ss(sh, sl, ah, al, bh, bl) \
!   do {
\
! if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000) \
!   __asm__ ("subs\t%1, %x4, %5\n\tadc\t%0, %x2, %x3"   
\
!  : "=r" (sh), "=" (sl)  \
!  : "rZ" ((UDItype)(ah)), "rZ" ((UDItype)(bh)),\
!"%r" ((UDItype)(al)), "rI" (-(UDItype)(bl)) __CLOBBER_CC);\
! else  \
!   __asm__ ("adds\t%1, %x4, %5\n\tadc\t%0, %x2, %x3"   
\
!  : "=r" (sh), "=" (sl)  \
!  : "rZ" ((UDItype)(ah)), "rZ" ((UDItype)(bh)),\
!"%r" ((UDItype)(al)), "rI" ((UDItype)(bl)) __CLOBBER_CC);\
!   } while (0)
  #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
!   do {
\
! if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000) \
!   __asm__ ("adds\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3"   
\
!  : "=r,r" (sh), "=," (sl) \
!  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
!"r,Z"   ((UDItype)(al)), "rI,r" (-(UDItype)(bl)) 
__CLOBBER_CC);\
! else  \
!   __asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3"   
\
!  : "=r,r" (sh), "=," (sl) \
!  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
!"r,Z"   ((UDItype)(al)), "rI,r"  ((UDItype)(bl)) 
__CLOBBER_CC);\
!   } while(0);
  #if __GMP_GNUC_PREREQ (4,9)
  #define umul_ppmm(w1, w0, u, v) \
--- 556,568 
 ADDS and SUBS, since they set carry in the same way.  */
  #define add_ss(sh, sl, ah, al, bh, bl) \
!   __asm__ ("adds\t%1, %x4, %5\n\tadc\t%0, %x2, %x3"   \
!  : "=r" (sh), "=" (sl)  \
!  : "rZ" ((UDItype)(ah)), "rZ" ((UDItype)(bh)),\
!"%r" ((UDItype)(al)), "rI" ((UDItype)(bl)) __CLOBBER_CC)
  #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
!   __asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3"   \
!  : "=r,r" (sh), "=," (sl) \
!  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" 

Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Vincent Lefevre
On 2020-06-16 16:11:53 +0200, Vincent Lefevre wrote:
> Note that I'm rather surprised to see that the issue disappears
> if I add unrelated code. For instance, in the following testcase,
> with GCC 9.1.0 -O2, I get "h = 0x", which is
> incorrect, but if I define V, I get the correct "h = 0x0".

Defining V apparently disables some optimization, so that only
the "else" case of sub_ddmmss is used.

Note: in the tests, it is important to test the macro on constants
in order to test the "if" case.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Vincent Lefevre
On 2020-06-16 14:47:23 +0200, Torbjorn Granlund wrote:
> Vincent Lefevre  writes:
> 
>   On 2020-06-16 13:40:14 +0200, Torbjorn Granlund wrote:
>   > Vincent Lefevre  writes:
>   > 
>   >   #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
>   > do {  
> \
>   >   if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000)   
> \
>   > __asm__ ("adds\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" 
> \
>   >  : "=r,r" (sh), "=," (sl) 
> \
>   >  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  
> \
>   >"r,Z"   ((UDItype)(al)), "rI,r" (-(UDItype)(bl)) 
> __CLOBBER_CC);\
>   >   else
> \
>   > __asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" 
> \
>   >  : "=r,r" (sh), "=," (sl) 
> \
>   >  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  
> \
>   >"r,Z"   ((UDItype)(al)), "rI,r"  ((UDItype)(bl)) 
> __CLOBBER_CC);\
>   > } while(0);
>   > 
>   > The two - signs ought to be ~, I think.  Let me think a buit more about 
> that.
> 
>   Note that the "else" case, which doesn't have a - sign in its
>   arguments is affected too, AFAIK.
> 
> I cannot follow you here.
> 
> Are you saying that the asm in the else clause is broken too?
> Please explain.

Sorry, it seems OK (I thought that on (0,0), subs and adds would
be equivalent, but the carry out is actually different).

Note that I'm rather surprised to see that the issue disappears
if I add unrelated code. For instance, in the following testcase,
with GCC 9.1.0 -O2, I get "h = 0x", which is
incorrect, but if I define V, I get the correct "h = 0x0".

typedef unsigned int UDItype __attribute__ ((mode (DI)));
typedef unsigned long UWtype;
#define W_TYPE_SIZE 64
#define __GMP_DECLSPEC
#define __GMP_GNUC_PREREQ(maj, min) \
  ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#include 
#include "longlong.h"

int v = 0;

int main (void)
{
  volatile unsigned long u0 = 0x4000, r0 = 0x7ffd;
  unsigned long u, r, h, l;
  u = u0;
  r = r0;
  if (r < 0x8000)
r = 0x8000;
#ifdef V
  if (v)
fflush (stdout);
#endif
  umul_ppmm (h, l, r, r);
  sub_ddmmss (h, l, u, 0, h, l);
  printf ("h = 0x%lx\n", h);
  return 0;
}

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Torbjörn Granlund
Vincent Lefevre  writes:

  On 2020-06-16 13:40:14 +0200, Torbjorn Granlund wrote:
  > Vincent Lefevre  writes:
  > 
  >   #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
  > do {  \
  >   if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000)   \
  > __asm__ ("adds\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
  >  : "=r,r" (sh), "=," (sl) \
  >  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
  >"r,Z"   ((UDItype)(al)), "rI,r" (-(UDItype)(bl)) 
__CLOBBER_CC);\
  >   else\
  > __asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
  >  : "=r,r" (sh), "=," (sl) \
  >  : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
  >"r,Z"   ((UDItype)(al)), "rI,r"  ((UDItype)(bl)) 
__CLOBBER_CC);\
  > } while(0);
  > 
  > The two - signs ought to be ~, I think.  Let me think a buit more about 
that.

  Note that the "else" case, which doesn't have a - sign in its
  arguments is affected too, AFAIK.

I cannot follow you here.

Are you saying that the asm in the else clause is broken too?
Please explain.

-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Torbjörn Granlund
Vincent Lefevre  writes:

  #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
do {  \
  if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000)   \
__asm__ ("adds\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
 : "=r,r" (sh), "=," (sl) \
 : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
   "r,Z"   ((UDItype)(al)), "rI,r" (-(UDItype)(bl)) 
__CLOBBER_CC);\
  else\
__asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
 : "=r,r" (sh), "=," (sl) \
 : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
   "r,Z"   ((UDItype)(al)), "rI,r"  ((UDItype)(bl)) 
__CLOBBER_CC);\
} while(0);

The two - signs ought to be ~, I think.  Let me think a buit more about that.

-- 
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Vincent Lefevre
On 2020-06-16 12:55:07 +0200, Vincent Lefevre wrote:
> In longlong.h from GMP 6.2.0:
> 
> #define sub_ddmmss(sh, sl, ah, al, bh, bl) \
>   do {  \
> if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000)   \
>   __asm__ ("adds\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
>: "=r,r" (sh), "=," (sl) \
>: "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
>  "r,Z"   ((UDItype)(al)), "rI,r" (-(UDItype)(bl)) 
> __CLOBBER_CC);\
> else\
>   __asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
>: "=r,r" (sh), "=," (sl) \
>: "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
>  "r,Z"   ((UDItype)(al)), "rI,r"  ((UDItype)(bl)) 
> __CLOBBER_CC);\
>   } while(0);
> 
> I don't understand this code. For instance, on ah=al=bh=bl=0, you
> do adds(0,0), which unsets the carry. And since borrow = not(carry),
> the sbc(0,0) will return -1.

Here's a test without using MPFR:

#define W_TYPE_SIZE 64
#define __GMP_DECLSPEC
#define __GMP_GNUC_PREREQ(maj, min) \
  ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#include 
#include "longlong.h"

int main (void)
{
  unsigned long sh, sl;

  sub_ddmmss (sh, sl, 0, 0, 0, 0);
  printf ("%lx\n", sh);
  return 0;
}

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


bug in longlong.h for aarch64 sub_ddmmss

2020-06-16 Thread Vincent Lefevre
Hi,

In longlong.h from GMP 6.2.0:

#define sub_ddmmss(sh, sl, ah, al, bh, bl) \
  do {  \
if (__builtin_constant_p (bl) && -(UDItype)(bl) < 0x1000)   \
  __asm__ ("adds\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
   : "=r,r" (sh), "=," (sl) \
   : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
 "r,Z"   ((UDItype)(al)), "rI,r" (-(UDItype)(bl)) 
__CLOBBER_CC);\
else\
  __asm__ ("subs\t%1, %x4, %5\n\tsbc\t%0, %x2, %x3" \
   : "=r,r" (sh), "=," (sl) \
   : "rZ,rZ" ((UDItype)(ah)), "rZ,rZ" ((UDItype)(bh)),  \
 "r,Z"   ((UDItype)(al)), "rI,r"  ((UDItype)(bl)) 
__CLOBBER_CC);\
  } while(0);

I don't understand this code. For instance, on ah=al=bh=bl=0, you
do adds(0,0), which unsets the carry. And since borrow = not(carry),
the sbc(0,0) will return -1.

This is consistent with what I observe in MPFR with

#define MPFR_NEED_LONGLONG_H
#include "mpfr-test.h"
#include "invsqrt_limb.h"

int main (void)
{
  mp_limb_t h, l;

  sub_ddmmss (h, l, 0x4000, 0, 0x4000, 0);
  printf ("%lx\n", h);
  return 0;
}

which outputs  instead of 0.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs