Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-07 Thread Niels Möller
Torbjorn Granlund writes: > There is no 64-bit subtraction instruction that takes a useful > carry-in. > More recently, as part of a special "visual instruction set 3", Oracle > added 64-bit addition instructions which take a useful carry-in... And one can use the 64-bit add-with-carry to do a
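The message above is cut off, so the exact technique it goes on to describe is not known. As a general illustration only, the standard identity a - b = a + ~b + 1 lets a multi-limb subtraction be built from nothing but add-with-carry. The following is a minimal C sketch of that idea. The names `adc64` and `sub_n_via_adc` are hypothetical and are not taken from GMP or from the patch, and the carry is emulated with comparisons rather than a hardware carry flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit add with carry-in.
   Returns a + b + *carry (mod 2^64) and stores the carry-out in *carry. */
static uint64_t adc64(uint64_t a, uint64_t b, unsigned *carry)
{
    uint64_t s = a + b;
    unsigned c1 = s < a;          /* carry out of a + b */
    uint64_t r = s + *carry;
    unsigned c2 = r < s;          /* carry out of adding the carry-in */
    *carry = c1 | c2;             /* at most one of the two can be set */
    return r;
}

/* Hypothetical routine: rp[] = ap[] - bp[] over n limbs, least significant
   limb first, using only add-with-carry on the complemented subtrahend.
   Returns the borrow out of the most significant limb (0 or 1). */
static uint64_t sub_n_via_adc(uint64_t *rp, const uint64_t *ap,
                              const uint64_t *bp, size_t n)
{
    unsigned carry = 1;           /* carry-in of 1 means "no borrow" */
    for (size_t i = 0; i < n; i++)
        rp[i] = adc64(ap[i], ~bp[i], &carry);
    return 1u - carry;            /* carry-out of 0 means a borrow occurred */
}
```

In this formulation the carry flag holds the inverted borrow throughout the loop, so only the initial carry-in and the final return value need adjusting.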

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-07 Thread Torbjorn Granlund
ni...@lysator.liu.se (Niels Möller) writes: Torbjorn Granlund writes: > I optimised submul_1.asm, and then edited both addmul_1 and submul_1 to > use as similar operand order as possible. So remaining differences are necessary? Some differences need to remain, yes... I don't rem

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-07 Thread Niels Möller
Torbjorn Granlund writes: > I optimised submul_1.asm, and then edited both addmul_1 and submul_1 to > use as similar operand order as possible. So remaining differences are necessary? I don't remember much sparc assembly, but it seems carry handling is done slightly differently. But apparently

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread Torbjorn Granlund
David Miller writes: From: David Miller Date: Thu, 07 Mar 2013 01:06:55 -0500 (EST) > I'll test your routines with the obvious fix in a moment. With the one-liner fix both of your new implementations work. Thanks for testing! submul_1 is now much better, about 5.8 cycles per

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread David Miller
From: David Miller Date: Thu, 07 Mar 2013 01:06:55 -0500 (EST) > I'll test your routines with the obvious fix in a moment. With the one-liner fix both of your new implementations work. submul_1 is now much better, about 5.8 cycles per limb on T4.

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread David Miller
From: Torbjorn Granlund Date: Thu, 07 Mar 2013 07:00:00 +0100 > Why don't your functions use the 'return' insns, btw? They were expensive at one point, but I just checked and on T4 using return appears to be quite cheap. > PS. I have created a poor man's T3 with some m4 macros. This will all

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread Torbjorn Granlund
David Miller writes: > I optimised submul_1.asm, and then edited both addmul_1 and submul_1 to > use as similar operand order as possible. Please test these using > tests/devel/try, and please time this new submul_1. The testsuite starts failing very early with these changes. Sorry

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread David Miller
From: Torbjorn Granlund Date: Thu, 07 Mar 2013 00:51:24 +0100 > David Miller writes: > > From: Torbjorn Granlund > Date: Wed, 06 Mar 2013 12:36:34 +0100 > > > I think all your T3/T4 changes are now in. Please check that I didn't > > mess something up. > > > > Thanks for this con

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread Torbjorn Granlund
David Miller writes: From: Torbjorn Granlund Date: Wed, 06 Mar 2013 12:36:34 +0100 > I think all your T3/T4 changes are now in. Please check that I didn't > mess something up. > > Thanks for this contribution! Looks good, there is some trailing whitespace in the ChangeLog but

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread David Miller
From: Torbjorn Granlund Date: Wed, 06 Mar 2013 12:36:34 +0100 > I think all your T3/T4 changes are now in. Please check that I didn't > mess something up. > > Thanks for this contribution! Looks good, there is some trailing whitespace in the ChangeLog but that's probably my fault: diff -r 84dd2

Re: [PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-06 Thread Torbjorn Granlund
I think all your T3/T4 changes are now in. Please check that I didn't mess something up. Thanks for this contribution! -- Torbjörn

[PATCH 2/2] Optimize 64-bit mpn_add_N and mpn_sub_N for sparc T3 and later.

2013-03-05 Thread David Miller
* mpn/sparc64/ultrasparct3/add_n.asm: New file. * mpn/sparc64/ultrasparct3/sub_n.asm: New file. --- diff --git a/mpn/sparc64/ultrasparct3/add_n.asm b/mpn/sparc64/ultrasparct3/add_n.asm new file mode 100644 index 000..16bd0c4 --- /dev/null +++ b/mpn/sparc64/ultrasparct3/add_n.