Torbjorn Granlund t...@gmplib.org writes:
I optimised submul_1.asm, and then edited both addmul_1 and submul_1 to
use as similar operand order as possible.
So remaining differences are necessary? I don't remember much sparc
assembly, but it seems carry handling is done slightly differently.
Here's a patch that reorders the arguments for mpn_addcnd_n and
mpn_subcnd_n (I think it's best to keep this change separate from the
renaming, since the potential problems are quite different).
It's tested on x86_64, arm, and with --disable-assembly. I've run a
regular make check and
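
For readers unfamiliar with these functions, here is a minimal C sketch of what a conditional add over limb vectors does. It is not GMP's code: the name cnd_add_n_sketch, the use of uint64_t limbs, and the placement of cnd as the first argument are all assumptions made for illustration, since the message does not state the new argument order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, not GMP's implementation.
   Computes rp[] = ap[] + (cnd ? bp[] : 0) over n limbs and returns the
   carry out (0 or 1). The argument order here is an assumption. */
uint64_t cnd_add_n_sketch(uint64_t cnd, uint64_t *rp, const uint64_t *ap,
                          const uint64_t *bp, size_t n)
{
    /* All-ones mask when cnd is nonzero, zero otherwise, so the loop
       body does not depend on a branch over cnd. */
    uint64_t mask = -(uint64_t)(cnd != 0);
    uint64_t carry = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t b = bp[i] & mask;   /* bp[i] or 0 */
        uint64_t s = ap[i] + b;
        uint64_t c1 = s < b;         /* carry out of ap[i] + b */
        uint64_t r = s + carry;
        uint64_t c2 = r < s;         /* carry out of adding the carry in */
        rp[i] = r;
        carry = c1 | c2;             /* at most one of the two can be set */
    }
    return carry;
}
```

The masking is the point of the sketch: the same sequence of operations runs whether or not the condition holds, which is the usual reason to have such a function instead of an if around a plain add.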
I wrote 4-way unrolled mul_2 and addmul_2 for T3/T4.
The FAKE_T3 stuff includes missing.m4, which implements some
instructions missing from my old systems around here. I might retain
that stuff for a while to allow local regression testing, even if it is
a bit ugly.
Could you please run time
romes p romes_12...@yahoo.com writes:
Hello developers
I noticed that there is also a CUMP site
http://www.hpcs.cs.tsukuba.ac.jp/~nakayama/cump/
Sheesh, the guy has copied and edited the GMP webpages and now claims
the default all rights reserved with himself as owner. Not a serious
From: Torbjorn Granlund t...@gmplib.org
Date: Thu, 07 Mar 2013 20:58:51 +0100
I wrote 4-way unrolled mul_2 and addmul_2 for T3/T4.
The FAKE_T3 stuff includes missing.m4, which implements some
instructions missing from my old systems around here. I might retain
that stuff for a while to
I only now spotted FPMADDXHI and FPMADDX. No Sun/Oracle SPARC has been
a floating-point demon, and these integer multiply instructions are
performed in the fpu.
Multiply-accumulate instructions are tricky, since one may easily put
the accumulation on a carry recurrency path, and thereby kill
From: Torbjorn Granlund t...@gmplib.org
Date: Thu, 07 Mar 2013 20:58:51 +0100
I'm reasonably sure this is correct.
Needs some work still:
davem@patience:~/src/GMP/HG/build-sparc64-ultrasparct4/tests/devel$ ./try -s1-10 mpn_addmul_2
pagesize is 0x2000 bytes
s[0] 0xf80100048000 to