Hi all,
the gmp-devel list is for "Technical discussions between developers". We have
recently seen several patches posted which, I believe, do not match the list
definition. If there is no other way to transfer source code, maybe one should
create a gmp-patch list?
Paul Zimmermann
_
From: Torbjorn Granlund
Date: Wed, 06 Mar 2013 00:53:44 +0100
> I have finished reviewing the configure.ac changes. They are OK, but at
> some point we need to clean up sparc config code.
Agreed.
> Your latest asm looks OK, but I will nag you again about running
> tests/devel/try... Please le
David Miller writes:
This is a respin of patch #2 from last night, it incorporates all of
the improvements either explicitly or implicitly suggested :-)
Torbjorn, I'm leaving out the configure regeneration from the patch,
so that the patch is not so large, since I'm pretty sure you're
* mpn/sparc64/ultrasparct3/add_n.asm: New file.
* mpn/sparc64/ultrasparct3/sub_n.asm: New file.
---
diff --git a/mpn/sparc64/ultrasparct3/add_n.asm b/mpn/sparc64/ultrasparct3/add_n.asm
new file mode 100644
index 000..16bd0c4
--- /dev/null
+++ b/mpn/sparc64/ultrasparct3/add_n.
This is a respin of patch #2 from last night, it incorporates all of
the improvements either explicitly or implicitly suggested :-)
Torbjorn, I'm leaving out the configure regeneration from the patch,
so that the patch is not so large, since I'm pretty sure you're going
to regenerate it yourself.
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 23:43:00 +0100
> This should have the right instructions. Scheduling might be needed.
> Obviously untested.
I already had this done locally:
dnl SPARC v9 mpn_mul_1 for T3/T4.
dnl Copyright 2013 Free Software Foundation, Inc.
dnl This file is
From: Torbjorn Granlund
Date: Wed, 06 Mar 2013 00:08:09 +0100
> The addmul code could be similarly improved.
Grumble... and I did this work already, I sent older versions
of my T3/T4 changes, let me go see how I screwed this up:
dnl SPARC v9 mpn_addmul_1 for T3/T4.
dnl Copyright 2013 Free So
The addmul code could be similarly improved.
But unlike mul_1, we cannot keep a recurrent carry alive, since we have
to add 3 limbs at each column.
Instead, one can add in two phases. See powerpc/mode64/aorsmul_1.asm
for an example.
With two-way unrolling, one would need these insns:
mulxup
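To make the carry argument above concrete, here is a minimal C sketch of what the two loops compute. It is not the T3/T4 assembly from this thread and not the code in powerpc/mode64/aorsmul_1.asm; the names `limb_t`, `ref_mul_1` and `ref_addmul_1` are hypothetical, and `unsigned __int128` is assumed to be available (a GCC/Clang extension). In `ref_mul_1` a single carry limb is carried from column to column. In `ref_addmul_1` each column adds three limbs (low product, previous high product, and the destination limb), so the sketch uses two separate one-bit carry chains, which is one possible reading of "add in two phases".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;              /* hypothetical stand-in for mp_limb_t */

/* rp[0..n-1] = up[0..n-1] * v; returns the most significant limb.
   One recurrent carry limb is enough here. */
limb_t ref_mul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t cy = 0;
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        unsigned __int128 p = (unsigned __int128)up[i] * v;
        limb_t lo = (limb_t)p;
        limb_t hi = (limb_t)(p >> 64);
        lo += cy;
        hi += lo < cy;                /* hi <= 2^64 - 2, so this cannot wrap */
        rp[i] = lo;
        cy = hi;
    }
    return cy;
}

/* rp[0..n-1] += up[0..n-1] * v; returns the carry-out limb.
   Phase 1 adds the low product limb to the previous high limb,
   phase 2 adds that sum to rp[i]; each phase has its own carry bit. */
limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t hi_prev = 0;               /* high product limb of previous column */
    limb_t c1 = 0, c2 = 0;            /* carry bits of phase 1 and phase 2 */
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        unsigned __int128 p = (unsigned __int128)up[i] * v;
        limb_t lo = (limb_t)p;
        limb_t hi = (limb_t)(p >> 64);

        limb_t s = lo + hi_prev;      /* phase 1 */
        limb_t k = s < lo;
        s += c1;
        k += s < c1;
        c1 = k;

        limb_t r = rp[i] + s;         /* phase 2 */
        k = r < s;
        r += c2;
        k += r < c2;
        c2 = k;

        rp[i] = r;
        hi_prev = hi;
    }
    return hi_prev + c1 + c2;         /* fits in one limb for valid inputs */
}
```

A reference like this is only useful for checking results on small inputs; it says nothing about instruction scheduling or cycle counts on T3/T4.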
David Miller writes:
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 23:27:45 +0100
> David Miller writes:
>
> diff --git a/mpn/sparc64/ultrasparct3/mul_1.asm b/mpn/sparc64/ultrasparct3/mul_1.asm
> index df52647..6a3f193 100644
> --- a/mpn/sparc64/ultrasparct3/mul_1.asm
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 23:27:45 +0100
> David Miller writes:
>
> diff --git a/mpn/sparc64/ultrasparct3/mul_1.asm b/mpn/sparc64/ultrasparct3/mul_1.asm
> index df52647..6a3f193 100644
> --- a/mpn/sparc64/ultrasparct3/mul_1.asm
> +++ b/mpn/sparc64/ultrasparct3/mu
David Miller writes:
diff --git a/mpn/sparc64/ultrasparct3/mul_1.asm b/mpn/sparc64/ultrasparct3/mul_1.asm
index df52647..6a3f193 100644
--- a/mpn/sparc64/ultrasparct3/mul_1.asm
+++ b/mpn/sparc64/ultrasparct3/mul_1.asm
@@ -50,8 +50,7 @@ L(top):
umulxhi %o4, v0, %o4
addcc
From: Richard Henderson
Date: Tue, 05 Mar 2013 12:25:53 -0800
> One extra add insn here (copy-paste from addmul)?
>
> addcc %o5, %g3, %g3
> addxccc %g2, %g1, %g1
> addxc %g0, %o4, %o5
Yep, that works perfectly fine, but makes no difference
performance-wise. I made the sam
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 13:42:58 +0100
> Note that ALIGN between ASM_START and PROLOGUE is ineffective on this
> platform. If stricter alignment is needed for function starts (but not
> loop starts?) then we need to override the default PROLOGUE_cpu.
Thanks for pointing t
From: David Miller
Date: Tue, 05 Mar 2013 16:36:12 -0500 (EST)
> From: Torbjorn Granlund
> Date: Tue, 05 Mar 2013 22:33:30 +0100
>
>> David Miller writes:
>>
>> The versions I posted passed all of the tests.
>>
>> What does "all the tests" mean?
>>
>> I insist that you run tests/devel/
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 22:33:30 +0100
> David Miller writes:
>
> The versions I posted passed all of the tests.
>
> What does "all the tests" mean?
>
> I insist that you run tests/devel/try. Please send me the output of
> the command I asked you to run.
>
> Runn
David Miller writes:
The versions I posted passed all of the tests.
What does "all the tests" mean?
I insist that you run tests/devel/try. Please send me the output of
the command I asked you to run.
Running GMP's test suite is *not* adequate for testing new assembly
code.
--
Torbjörn
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 22:14:07 +0100
> David Miller writes:
>
> From: Torbjorn Granlund
> Date: Tue, 05 Mar 2013 21:35:19 +0100
>
> > Richard Henderson writes:
> >
> > One extra add insn here (copy-paste from addmul)?
> >
> > addcc %
David Miller writes:
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 21:35:19 +0100
> Richard Henderson writes:
>
> One extra add insn here (copy-paste from addmul)?
>
> addcc %o5, %g3, %g3
> addxccc %g2, %g1, %g1
> addxc %g0, %o4, %o5
>
> Since
On 03/05/2013 12:51 PM, bodr...@mail.dm.unipi.it wrote:
>> > +__GMP_INTERN (extern const mp_limb_t, __gmp_oddfac_table, []);
>> > +__GMP_INTERN (extern const mp_limb_t, __gmp_odd2fac_table, []);
>> > +__GMP_INTERN (extern const unsigned char, __gmp_fac2cnt_table, []);
>> > +__GMP_INTERN (extern con
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 21:35:19 +0100
> Richard Henderson writes:
>
> One extra add insn here (copy-paste from addmul)?
>
> addcc %o5, %g3, %g3
> addxccc %g2, %g1, %g1
> addxc %g0, %o4, %o5
>
> Since I cannot test this at all (qemu-system-sp
Ciao,
On Mon, 4 March 2013 7:41 pm, Richard Henderson wrote:
> gmp-rename.h: gen-rename.c gen-rename.awk gmp.h
> $(COMPILE) -E $< | $(AWK) -f $(srcdir)/gen-rename.awk > $@ || (rm
> +gmp-renamei.h: gen-renamei.c gen-rename.awk gmp.h gmp-impl.h
> + $(COMPILE) -E $< | $(AWK) -f $(sr
Richard Henderson writes:
One extra add insn here (copy-paste from addmul)?
addcc %o5, %g3, %g3
addxccc %g2, %g1, %g1
addxc %g0, %o4, %o5
Since I cannot test this at all (qemu-system-sparc64 persistently resists
all my usage attempts) I need you to perform the f
On 03/05/2013 11:43 AM, David Miller wrote:
> +PROLOGUE(mpn_mul_1)
> + subcc n, 1, n
> + be L(final_one)
> + clr %o5
> +
> +L(top):
> + ldx [up+0], %g1
> + ldx [up+8], %o4
> + mulx %g1, v0, %g3
> + add up, 16, up
> + umulxhi %g1, v0, %g2
> +
From: Richard Henderson
Date: Tue, 05 Mar 2013 12:11:56 -0800
> On 03/05/2013 12:05 PM, David Miller wrote:
>> Which still hasn't made it to the list yet. I wonder what is
>> rejecting it as I never receive any kind of notification. Torbjorn
>> did you at least receive it this time as you'r
On 03/05/2013 12:05 PM, David Miller wrote:
> Which still hasn't made it to the list yet. I wonder what is
> rejecting it as I never receive any kind of notification. Torbjorn
> did you at least receive it this time as you're on the CC:?
>
> BTW, I also noticed that Richard's 20 piece patch
From: David Miller
Date: Tue, 05 Mar 2013 14:44:46 -0500 (EST)
> From: Torbjorn Granlund
> Date: Tue, 05 Mar 2013 13:54:37 +0100
>
>> David Miller writes:
>>
>> * mpn/sparc64/ultrasparct3/add_n.asm: New file.
>> * mpn/sparc64/ultrasparct3/sub_n.asm: New file.
>>
>> There is current
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 16:01:11 +0100
> They should of course either have applied the negation to the r1 operand,
> or used an unsigned imm field.
An unsigned imm field would eliminate a very useful existing construct.
Right now you can compose any 32-bit constant sign ex
From: Richard Henderson
Date: Tue, 05 Mar 2013 06:53:18 -0800
> On 03/05/2013 04:54 AM, Torbjorn Granlund wrote:
>> There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For
>> which CPUs are these new add_n/sub_n intended? Why not also for
>> other CPUs?
>
> For T3 and T4. T
From: Torbjorn Granlund
Date: Tue, 05 Mar 2013 13:54:37 +0100
> David Miller writes:
>
> * mpn/sparc64/ultrasparct3/add_n.asm: New file.
> * mpn/sparc64/ultrasparct3/sub_n.asm: New file.
>
> There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For
> which CPUs are th
I should mention before someone else does that I've just now tried the "speed"
program with these changes, and some adjustment is needed. I think it's not
playing by the rules trying to get at internal symbols...
r~
___
gmp-devel mailing list
gmp-devel
Richard Henderson writes:
For T3 and T4. This file makes use of new instructions: addxc(cc).
Thanks.
Honestly, why they didn't have a proper 64-bit add-with-carry insn right
from the very first v9 cpu is a mystery.
The SPARC cpu is so full of design mistakes that I am not surprised.
Cons
On 03/05/2013 04:54 AM, Torbjorn Granlund wrote:
> There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For
> which CPUs are these new add_n/sub_n intended? Why not also for
> other CPUs?
For T3 and T4. This file makes use of new instructions: addxc(cc).
Honestly, why they did
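For readers unfamiliar with what add_n has to do, the following is a minimal C sketch of limb-wise addition with carry propagation. It is not the add_n.asm posted in this thread; `limb_t` and `ref_add_n` are hypothetical names used only for illustration. Each loop iteration corresponds to one 64-bit add-with-carry, which is the operation the addxc(cc) instructions mentioned above provide directly; in portable C the carry has to be recovered with comparisons.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;              /* hypothetical stand-in for mp_limb_t */

/* rp[0..n-1] = ap[0..n-1] + bp[0..n-1]; returns the carry-out (0 or 1). */
limb_t ref_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t cy = 0;
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        limb_t a = ap[i];
        limb_t s = a + bp[i];         /* may wrap around */
        limb_t c = s < a;             /* carry out of a + b */
        s += cy;
        c += s < cy;                  /* carry out of adding the carry-in */
        rp[i] = s;
        cy = c;                       /* at most one of the two carries is set */
    }
    return cy;
}
```

The two comparisons per limb are the overhead that a native 64-bit add-with-carry removes.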
David Miller writes:
* mpn/sparc64/ultrasparct3/add_n.asm: New file.
* mpn/sparc64/ultrasparct3/sub_n.asm: New file.
There is currently no mpn/sparc64/ultrasparct3, only ultrasparct1. For
which CPUs are these new add_n/sub_n intended? Why not also for
other CPUs?
I suppose
David Miller writes:
This is a resubmit of the work I did 2 months ago now
that my FSF assignment has finally been completed.
Just the simple stuff, use of mulx/umulx/addxccc and 1
level of loop unrolling.
We now got patches 1/3 and 3/3. Is there a 2/3 too?
--
Torbjörn
David Miller writes:
* mpn/sparc32/ultrasparct1/mul_1.asm (mpn_mul_1): Unroll main loop
one time, align code on 32-byte boundary, add T2/T3/T4 timings.
* mpn/sparc32/ultrasparct1/addmul_1.asm (mpn_addmul_1): Likewise.
* mpn/sparc32/ultrasparct1/submul_1.asm (mpn_su
ni...@lysator.liu.se (Niels Möller) writes:
That would certainly cause some additional confusion. Any suggestion for
appropriate m4 quote characters to use? ;-)
I think one should be kind and use [ and ]. The resulting C dialect,
where indexing would be written arr[[i]] is not too bad...
Torbjorn Granlund writes:
> Ehum, I don't understand with which cpp quirk that indirection is
> coping...
The point of the indirection is to get macro arguments expanded *before*
substitution. Which matters only (I think) when using the # and ## cpp
operators. Example: gcc -E on this file
#de
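The example in the message above is cut off, so here is a separate minimal sketch of the same point, not a reconstruction of the original file. All macro and function names (`LIMB_BITS`, `STR`, `STR_DIRECT`, `CAT`, `CAT_DIRECT`, `limb_64`, `stringify_demo`) are hypothetical. Operands of `#` and `##` are not macro-expanded, so a second macro level is needed when the argument should be expanded first.

```c
#include <assert.h>
#include <string.h>

#define LIMB_BITS 64                      /* hypothetical, for illustration */

#define STR_DIRECT(x)    #x               /* '#' sees the argument unexpanded */
#define STR(x)           STR_DIRECT(x)    /* extra level: x is expanded first */

#define CAT_DIRECT(a, b) a##b             /* '##' pastes the raw tokens */
#define CAT(a, b)        CAT_DIRECT(a, b) /* extra level: a and b expanded first */

/* Expands to: int limb_64(void) { return 64; } */
int CAT(limb_, LIMB_BITS)(void) { return LIMB_BITS; }

/* Expands to: int limb_LIMB_BITS(void) { return -1; } */
int CAT_DIRECT(limb_, LIMB_BITS)(void) { return -1; }

/* Checks the difference between the direct and the indirect forms. */
void stringify_demo(void)
{
    assert(strcmp(STR_DIRECT(LIMB_BITS), "LIMB_BITS") == 0);
    assert(strcmp(STR(LIMB_BITS), "64") == 0);
    assert(limb_64() == 64);
    assert(limb_LIMB_BITS() == -1);
}
```

Running the file through `gcc -E` shows the same thing at the token level: the direct forms keep the name `LIMB_BITS`, the indirect forms contain `64`.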
Torbjorn Granlund writes:
> ni...@lysator.liu.se (Niels Möller) writes:
>
> I would expect #line to cause syntax problems for machines where # is not
> a comment character. Like ARM, where #17 is the small constant
> argument 17.
GNU as on my arm doesn't complain about #line as generated by m
ni...@lysator.liu.se (Niels Möller) writes:
Actually, I think that's incorrect.
Everyone has some *familiarity* with the C preprocessor, which surely is
an advantage. And maybe most C programmers think they understand
it. But in my experience, very few understand the fine details o
Richard Henderson writes:
> But perhaps more importantly, everyone who programs in C understands
> how the preprocessor works.
Actually, I think that's incorrect.
Everyone has some *familiarity* with the C preprocessor, which surely is
an advantage. And maybe most C programmers think they
ni...@lysator.liu.se (Niels Möller) writes:
David Miller writes:
> And it causes the debugging problem Richard mentioned too. I really
> want to step in the original source file, the thing I'm going to edit
> to fix the bug, not some intermediate file.
Does it help to just add -s
David Miller writes:
> And it causes the debugging problem Richard mentioned too. I really
> want to step in the original source file, the thing I'm going to edit
> to fix the bug, not some intermediate file.
Does it help to just add -s to the m4 invocation?
Regards,
/Niels
--
Niels Möller. P
* mpn/sparc64/ultrasparct3/add_n.asm: New file.
* mpn/sparc64/ultrasparct3/sub_n.asm: New file.
---
diff --git a/mpn/sparc64/ultrasparct3/add_n.asm b/mpn/sparc64/ultrasparct3/add_n.asm
new file mode 100644
index 000..16bd0c4
--- /dev/null
+++ b/mpn/sparc64/ultrasparct3/add_n.
* mpn/sparc32/ultrasparct1/mul_1.asm (mpn_mul_1): Unroll main loop
one time, align code on 32-byte boundary, add T2/T3/T4 timings.
* mpn/sparc32/ultrasparct1/addmul_1.asm (mpn_addmul_1): Likewise.
* mpn/sparc32/ultrasparct1/submul_1.asm (mpn_submul_1): Likewise.
---
This is a resubmit of the work I did 2 months ago now
that my FSF assignment has finally been completed.
Just the simple stuff, use of mulx/umulx/addxccc and 1
level of loop unrolling.