gmp-devel list

2013-03-05 Thread Zimmermann Paul
Hi all, the gmp-devel list is for "Technical discussions between developers". We have recently seen several patches posted, which I believe do not match the list definition. If there is no other way to transfer source code, maybe one should create a gmp-patch list? Paul Zimmermann

Re: [PATCH 1/2] Add 64-bit sparc multiply routines for T3 and later.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Wed, 06 Mar 2013 00:53:44 +0100 > I have finished reviewing the configure.ac changes. They are OK, but at > some point we need to clean up sparc config code. Agreed. > Your latest asm looks OK, but I will nag you again about running > tests/devel/try... Please le

Re: [PATCH 1/2] Add 64-bit sparc multiply routines for T3 and later.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: This is a respin of patch #2 from last night, it incorporates all of the improvements either explicitly or implicitly suggested :-) Torbjorn, I'm leaving out the configure regeneration from the patch, so that the patch is not so large, since I'm pretty sure you're

[PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
* mpn/sparc64/ultrasparct3/add_n.asm: New file. * mpn/sparc64/ultrasparct3/sub_n.asm: New file. --- diff --git a/mpn/sparc64/ultrasparct3/add_n.asm b/mpn/sparc64/ultrasparct3/add_n.asm new file mode 100644 index 000..16bd0c4 --- /dev/null +++ b/mpn/sparc64/ultrasparct3/add_n.

[PATCH 1/2] Add 64-bit sparc multiply routines for T3 and later.

2013-03-05 Thread David Miller
This is a respin of patch #2 from last night, it incorporates all of the improvements either explicitly or implicitly suggested :-) Torbjorn, I'm leaving out the configure regeneration from the patch, so that the patch is not so large, since I'm pretty sure you're going to regenerate it yourself.

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 23:43:00 +0100 > This should have the right instructions. Scheduling might be needed. > Obviously untested. I already had this done locally: dnl SPARC v9 mpn_mul_1 for T3/T4. dnl Copyright 2013 Free Software Foundation, Inc. dnl This file is

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Wed, 06 Mar 2013 00:08:09 +0100 > The addmul code could be similarly improved. Grumble... and I did this work already, I sent older versions of my T3/T4 changes, let me go see how I screwed this up: dnl SPARC v9 mpn_addmul_1 for T3/T4. dnl Copyright 2013 Free So

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
The addmul code could be similarly improved. But unlike mul_1, we cannot keep a recurrent carry alive, since we are to add 3 limbs at each column. Instead, one can add in two phases. See powerpc/mode64/aorsmul_1.asm for an example. With two-way unrolling, one would need these insns: mulxup
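Editorial sketch, not code from the thread: the following portable C shows what an addmul_1 routine computes (rp[] += up[] * v, returning the final carry limb) and why each column has three limbs to combine: the low product limb, the carry from the previous column, and the existing rp[i]. The function name `addmul_1_sketch` is hypothetical, and `unsigned __int128` is a GCC/Clang extension assumed here for the 64x64 product. The two additions per column are done as separate steps; this illustrates the arithmetic only and is not the scheduled two-phase loop from powerpc/mode64/aorsmul_1.asm or the T3/T4 assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rp[0..n-1] += up[0..n-1] * v, returns the carry-out limb. */
uint64_t addmul_1_sketch(uint64_t *rp, const uint64_t *up, size_t n, uint64_t v)
{
    assert(n >= 1);            /* assumption of this sketch: at least one limb */
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 64x64 -> 128-bit product, split into low and high limbs. */
        unsigned __int128 p = (unsigned __int128)up[i] * v;
        uint64_t lo = (uint64_t)p;
        uint64_t hi = (uint64_t)(p >> 64);

        /* Step 1: fold the incoming carry limb into the low product limb. */
        lo += carry;
        hi += (lo < carry);    /* cannot overflow: hi is at most 2^64 - 2 */

        /* Step 2: add the result into the destination limb. */
        uint64_t r = rp[i] + lo;
        hi += (r < lo);        /* total still fits in 128 bits */

        rp[i] = r;
        carry = hi;
    }
    return carry;
}
```

With all operands at their maximum value the column sum is exactly 2^128 - 1, which is why the high limb never overflows in either step.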

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: From: Torbjorn Granlund Date: Tue, 05 Mar 2013 23:27:45 +0100 > David Miller writes: > > diff --git a/mpn/sparc64/ultrasparct3/mul_1.asm b/mpn/sparc64/ultrasparct3/mul_1.asm > index df52647..6a3f193 100644 > --- a/mpn/sparc64/ultrasparct3/mul_1.asm

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 23:27:45 +0100 > David Miller writes: > > diff --git a/mpn/sparc64/ultrasparct3/mul_1.asm > b/mpn/sparc64/ultrasparct3/mul_1.asm > index df52647..6a3f193 100644 > --- a/mpn/sparc64/ultrasparct3/mul_1.asm > +++ b/mpn/sparc64/ultrasparct3/mu

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: diff --git a/mpn/sparc64/ultrasparct3/mul_1.asm b/mpn/sparc64/ultrasparct3/mul_1.asm index df52647..6a3f193 100644 --- a/mpn/sparc64/ultrasparct3/mul_1.asm +++ b/mpn/sparc64/ultrasparct3/mul_1.asm @@ -50,8 +50,7 @@ L(top): umulxhi %o4, v0, %o4 addcc

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Richard Henderson Date: Tue, 05 Mar 2013 12:25:53 -0800 > One extra add insn here (copy-paste from addmul)? > > addcc %o5, %g3, %g3 > addxccc %g2, %g1, %g1 > addxc %g0, %o4, %o5 Yep, that works perfectly fine, but makes no difference performance wise. I made the sam

Re: [PATCH 1/3] Optimize 32-bit sparc T1 multiply routines.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 13:42:58 +0100 > Note that ALIGN between ASM_START and PROLOGUE is ineffective on this > platform. If stricter alignment is needed for function starts (but not > loop starts?) then we need to override the default PROLOGUE_cpu. Thanks for pointing t

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: David Miller Date: Tue, 05 Mar 2013 16:36:12 -0500 (EST) > From: Torbjorn Granlund > Date: Tue, 05 Mar 2013 22:33:30 +0100 > >> David Miller writes: >> >> The versions I posted passed all of the tests. >> >> What does "all the tests" mean? >> >> I insist that you run tests/devel/

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 22:33:30 +0100 > David Miller writes: > > The versions I posted passed all of the tests. > > What does "all the tests" mean? > > I insist that you run tests/devel/try. Please send me the output of > the command I asked you to run. > > Runn

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: The versions I posted passed all of the tests. What does "all the tests" mean? I insist that you run tests/devel/try. Please send me the output of the command I asked you to run. Running GMP's test suite is *not* adequate for testing new assembly code. -- Torbjörn

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 22:14:07 +0100 > David Miller writes: > > From: Torbjorn Granlund > Date: Tue, 05 Mar 2013 21:35:19 +0100 > > > Richard Henderson writes: > > > > One extra add insn here (copy-paste from addmul)? > > > > addcc %

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: From: Torbjorn Granlund Date: Tue, 05 Mar 2013 21:35:19 +0100 > Richard Henderson writes: > > One extra add insn here (copy-paste from addmul)? > > addcc %o5, %g3, %g3 > addxccc %g2, %g1, %g1 > addxc %g0, %o4, %o5 > > Since

Re: [PATCH 06/20] Use "gmp-renamei.h" for renaming the internal routines

2013-03-05 Thread Richard Henderson
On 03/05/2013 12:51 PM, bodr...@mail.dm.unipi.it wrote: >> > +__GMP_INTERN (extern const mp_limb_t, __gmp_oddfac_table, []); >> > +__GMP_INTERN (extern const mp_limb_t, __gmp_odd2fac_table, []); >> > +__GMP_INTERN (extern const unsigned char, __gmp_fac2cnt_table, []); >> > +__GMP_INTERN (extern con

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 21:35:19 +0100 > Richard Henderson writes: > > One extra add insn here (copy-paste from addmul)? > > addcc %o5, %g3, %g3 > addxccc %g2, %g1, %g1 > addxc %g0, %o4, %o5 > > Since I cannot test this at all (qemu-system-sp

Re: [PATCH 06/20] Use "gmp-renamei.h" for renaming the internal routines

2013-03-05 Thread bodrato
Ciao, Il Lun, 4 Marzo 2013 7:41 pm, Richard Henderson ha scritto: > gmp-rename.h: gen-rename.c gen-rename.awk gmp.h > $(COMPILE) -E $< | $(AWK) -f $(srcdir)/gen-rename.awk > $@ || (rm > +gmp-renamei.h: gen-renamei.c gen-rename.awk gmp.h gmp-impl.h > + $(COMPILE) -E $< | $(AWK) -f $(sr

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
Richard Henderson writes: One extra add insn here (copy-paste from addmul)? addcc %o5, %g3, %g3 addxccc %g2, %g1, %g1 addxc %g0, %o4, %o5 Since I cannot test this at all (qemu-system-sparc64 persistently resists all my usage attempts) I need you to perform the f

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Richard Henderson
On 03/05/2013 11:43 AM, David Miller wrote: > +PROLOGUE(mpn_mul_1) > + subcc n, 1, n > + be L(final_one) > + clr %o5 > + > +L(top): > + ldx [up+0], %g1 > + ldx [up+8], %o4 > + mulx %g1, v0, %g3 > + add up, 16, up > + umulxhi %g1, v0, %g2 > +
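Editorial sketch, not part of the quoted patch: for readers unfamiliar with the routine, this is roughly what a mul_1 loop computes, written in portable C. Each source limb is multiplied by the scalar, the carry limb from the previous column is added, the low half is stored and the high half becomes the next carry. In the assembly, mulx and umulxhi supply the low and high 64 bits of each product. The name `mul_1_sketch` is hypothetical and `unsigned __int128` is a GCC/Clang extension assumed for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rp[0..n-1] = up[0..n-1] * v, returns the carry-out limb. */
uint64_t mul_1_sketch(uint64_t *rp, const uint64_t *up, size_t n, uint64_t v)
{
    assert(n >= 1);            /* assumption of this sketch: at least one limb */
    uint64_t carry = 0;        /* the single recurrent carry limb */
    for (size_t i = 0; i < n; i++) {
        /* (2^64-1)^2 + (2^64-1) < 2^128, so this sum cannot overflow. */
        unsigned __int128 p = (unsigned __int128)up[i] * v + carry;
        rp[i] = (uint64_t)p;           /* low 64 bits of the column */
        carry = (uint64_t)(p >> 64);   /* high 64 bits feed the next column */
    }
    return carry;
}
```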

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
From: Richard Henderson Date: Tue, 05 Mar 2013 12:11:56 -0800 > On 03/05/2013 12:05 PM, David Miller wrote: >> Which still hasn't made it to the list yet. I wonder what is >> rejecting it as I never receive any kind of notification. Torbjorn >> did you at least receive it this time as you'r

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread Richard Henderson
On 03/05/2013 12:05 PM, David Miller wrote: > Which still hasn't made it to the list yet. I wonder what is > rejecting it as I never receive any kind of notification. Torbjorn > did you at least receive it this time as you're on the CC:? > > BTW, I also noticed that Richard's 20 piece patch

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
From: David Miller Date: Tue, 05 Mar 2013 14:44:46 -0500 (EST) > From: Torbjorn Granlund > Date: Tue, 05 Mar 2013 13:54:37 +0100 > >> David Miller writes: >> >> * mpn/sparc64/ultrasparct3/add_n.asm: New file. >> * mpn/sparc64/ultrasparct3/sub_n.asm: New file. >> >> There is current

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 16:01:11 +0100 > They should of course either have applied the negation to the r1 operand, > or used an unsigned imm field. An unsigned imm field would eliminate a very useful existing construct. Right now you can compose any 32-bit constant sign ex

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
From: Richard Henderson Date: Tue, 05 Mar 2013 06:53:18 -0800 > On 03/05/2013 04:54 AM, Torbjorn Granlund wrote: >> There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For >> which CPUs are these new add_n/sub_n intended? Why not also for >> other CPUs? > > For T3 and T4. T

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
From: Torbjorn Granlund Date: Tue, 05 Mar 2013 13:54:37 +0100 > David Miller writes: > > * mpn/sparc64/ultrasparct3/add_n.asm: New file. > * mpn/sparc64/ultrasparct3/sub_n.asm: New file. > > There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For > which CPUs are th

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Richard Henderson
I should mention before someone else does that I've just now tried the "speed" program with these changes, and some adjustment is needed. I think it's not playing by the rules trying to get at internal symbols... r~ ___ gmp-devel mailing list gmp-devel

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread Torbjorn Granlund
Richard Henderson writes: For T3 and T4. This file makes use of new instructions: addxc(cc). Thanks. Honestly, why they didn't have a proper 64-bit with carry insn right from the very first v9 cpu is a mystery. The SPARC cpu is so full of design mistakes that I am not surprised. Cons

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread Richard Henderson
On 03/05/2013 04:54 AM, Torbjorn Granlund wrote: > There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For > which CPUs are these new add_n/sub_n intended? Why not also for > other CPUs? For T3 and T4. This file makes use of new instructions: addxc(cc). Honestly, why they did
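Editorial sketch, not code from the patch: the reason a 64-bit add-with-carry instruction matters for add_n can be seen in plain C. Without one, each limb needs two additions and two comparisons to recover the carry, whereas an instruction such as addxccc performs the add, consumes the incoming carry and produces the outgoing one in a single step. The function name `add_n_sketch` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rp[] = ap[] + bp[] over n limbs, returns the carry-out (0 or 1). */
uint64_t add_n_sketch(uint64_t *rp, const uint64_t *ap, const uint64_t *bp, size_t n)
{
    assert(n >= 1);            /* assumption of this sketch: at least one limb */
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t a = ap[i];
        uint64_t s = a + bp[i];      /* first add: the two source limbs */
        uint64_t c1 = (s < a);       /* carry out of the first add */
        uint64_t r = s + cy;         /* second add: the incoming carry */
        uint64_t c2 = (r < s);       /* carry out of the second add */
        rp[i] = r;
        cy = c1 | c2;                /* at most one of the two can be set */
    }
    return cy;
}
```

Reading `ap[i]` and `bp[i]` before writing `rp[i]` keeps the sketch correct when the destination is the same array as one of the sources.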

Re: [PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: * mpn/sparc64/ultrasparct3/add_n.asm: New file. * mpn/sparc64/ultrasparct3/sub_n.asm: New file. There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For which CPUs are these new add_n/sub_n intended? Why not also for other CPUs? I suppose

Re: [PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: This is a resubmit of the work I did 2 months ago now that my FSF assignment has finally been completed. Just the simple stuff, use of mulx/umulx/addxccc and 1 level of loop unrolling. We now got patches 1/3 and 3/3. Is there a 2/3 too? -- Torbjörn

Re: [PATCH 1/3] Optimize 32-bit sparc T1 multiply routines.

2013-03-05 Thread Torbjorn Granlund
David Miller writes: * mpn/sparc32/ultrasparct1/mul_1.asm (mpn_mul_1): Unroll main loop one time, align code on 32-byte boundary, add T2/T3/T4 timings. * mpn/sparc32/ultrasparct1/addmul_1.asm (mpn_addmul_1): Likewise. * mpn/sparc32/ultrasparct1/submul_1.asm (mpn_su

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Torbjorn Granlund
ni...@lysator.liu.se (Niels Möller) writes: That would certainly cause some additional confusion. Any suggestion for appropriate m4 quote characters to use? ;-) I think one should be kind and use [ and ]. The resulting C dialect, where indexing would be written arr[[i]] is not too bad...

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Niels Möller
Torbjorn Granlund writes: > Ehum, I don't understand with which cpp quirk that indirection is > coping... The point of the indirection is to get macro arguments expanded *before* substitution. Which matters only (I think) when using the # and ## cpp operators. Example: gcc -E on this file #de
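Editorial sketch, not the example from the message (which is cut off above): the effect of the extra macro level can be shown with the # and ## operators. An argument that is the operand of # or ## is substituted unexpanded; passing it through an outer macro first lets the preprocessor expand it before the inner macro sees it. All macro and function names below are hypothetical.

```c
#include <string.h>

#define VERSION 6
#define PREFIX  my_

/* Direct forms: the argument is stringified or pasted exactly as written. */
#define STR_DIRECT(x)    #x
#define CAT_DIRECT(a, b) a##b

/* Indirect forms: arguments are macro-expanded, then handed to the direct form. */
#define STR(x)    STR_DIRECT(x)
#define CAT(a, b) CAT_DIRECT(a, b)

/* CAT(PREFIX, value) expands PREFIX first, so this declares my_value. */
static const int CAT(PREFIX, value) = 42;

/* Returns 1 if the expansions behave as described above. */
int stringify_demo_ok(void)
{
    return strcmp(STR_DIRECT(VERSION), "VERSION") == 0  /* not expanded */
        && strcmp(STR(VERSION), "6") == 0               /* expanded first */
        && my_value == 42;
}
```

Running `gcc -E` on such a file shows the difference directly in the preprocessed output.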

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Niels Möller
Torbjorn Granlund writes: > ni...@lysator.liu.se (Niels Möller) writes: > > I would expect #line to cause syntax problems for machines where # is not > a comment character. Like ARM, where #17 is the small constant > argument 17. GNU as on my arm doesn't complain about #line as generated by m

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Torbjorn Granlund
ni...@lysator.liu.se (Niels Möller) writes: Actually, I think that's incorrect. Everyone has some *familiarity* with the C preprocessor, which surely is an advantage. And maybe most C programmers think they understand it. But in my experience, very few understand the fine details o

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Niels Möller
Richard Henderson writes: > But perhaps more importantly, everyone who programs in C understands > how the preprocessor works. Actually, I think that's incorrect. Everyone has some *familiarity* with the C preprocessor, which surely is an advantage. And maybe most C programmers think they

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Torbjorn Granlund
ni...@lysator.liu.se (Niels Möller) writes: David Miller writes: > And it causes the debugging problem Richard mentioned too. I really > want to step in the original source file, the thing I'm going to edit > to fix the bug, not some intermediate file. Does it help to just add -s

Re: [PATCH 00/20] Create and use hidden aliases in libgmp.so

2013-03-05 Thread Niels Möller
David Miller writes: > And it causes the debugging problem Richard mentioned too. I really > want to step in the original source file, the thing I'm going to edit > to fix the bug, not some intermediate file. Does it help to just add -s to the m4 invocation? Regards, /Niels -- Niels Möller. P

[PATCH 3/3] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
* mpn/sparc64/ultrasparct3/add_n.asm: New file. * mpn/sparc64/ultrasparct3/sub_n.asm: New file. --- diff --git a/mpn/sparc64/ultrasparct3/add_n.asm b/mpn/sparc64/ultrasparct3/add_n.asm new file mode 100644 index 000..16bd0c4 --- /dev/null +++ b/mpn/sparc64/ultrasparct3/add_n.

[PATCH 1/3] Optimize 32-bit sparc T1 multiply routines.

2013-03-05 Thread David Miller
* mpn/sparc32/ultrasparct1/mul_1.asm (mpn_mul_1): Unroll main loop one time, align code on 32-byte boundary, add T2/T3/T4 timings. * mpn/sparc32/ultrasparct1/addmul_1.asm (mpn_addmul_1): Likewise. * mpn/sparc32/ultrasparct1/submul_1.asm (mpn_submul_1): Likewise. ---

[PATCH 0/3] Resubmit of Sparc T3/T4 patches.

2013-03-05 Thread David Miller
This is a resubmit of the work I did 2 months ago now that my FSF assignment has finally been completed. Just the simple stuff, use of mulx/umulx/addxccc and 1 level of loop unrolling.