Re: [mpir-devel] New assembler

2011-07-14 Thread Cactus
Interesting stuff Jason - I will have to look at the mod_1_2 code to see what might be different about it. Brian

Re: [mpir-devel] New assembler

2011-07-14 Thread Jason
On Wednesday 13 July 2011 15:50:10 Jason wrote: > On Wednesday 13 July 2011 14:26:02 Jason wrote: > > On Wednesday 13 July 2011 14:01:39 Cactus wrote: > > > Darn, I already did the conversion :-( > > > > > > I don't have enough registers to use only the 32-bit registers so I > > > have to put stuf

Re: [mpir-devel] New assembler

2011-07-13 Thread Jason
On Wednesday 13 July 2011 14:26:02 Jason wrote: > On Wednesday 13 July 2011 14:01:39 Cactus wrote: > > Darn, I already did the conversion :-( > > > > I don't have enough registers to use only the 32-bit registers so I have > > to put stuff in r8 and r9 instead. Given this involved prefix opcodes

Re: [mpir-devel] New assembler

2011-07-13 Thread Jason
On Wednesday 13 July 2011 14:01:39 Cactus wrote: > Darn, I already did the conversion :-( > > I don't have enough registers to use only the 32-bit registers so I have to > put stuff in r8 and r9 instead. Given this involved prefix opcodes, I am > wondering what I should do with your coded nops si

Re: [mpir-devel] New assembler

2011-07-13 Thread Cactus
Darn, I already did the conversion :-( I don't have enough registers to use only the 32-bit registers so I have to put stuff in r8 and r9 instead. Given this involved prefix opcodes, I am wondering what I should do with your coded nops since any alignment you are seeking won't be the same wit

Re: [mpir-devel] New assembler

2011-07-13 Thread Jason
On Wednesday 13 July 2011 10:54:53 Jason wrote: > On Wednesday 13 July 2011 10:06:50 Jason wrote: > > Hi > > > > New karasub/add for nehalem , I did do a re-shuffle and it was pretty > > much optimal , but the feedin/winddown code killed it , so at the mo > > this is just the existing K10 code wit

Re: [mpir-devel] New assembler

2011-07-13 Thread Jason
On Wednesday 13 July 2011 10:06:50 Jason wrote: > Hi > > New karasub/add for nehalem , I did do a re-shuffle and it was pretty much > optimal , but the feedin/winddown code killed it , so at the mo this is > just the existing K10 code with the inc's replaced by an add and lea's , > just do a diff.

[mpir-devel] New assembler

2011-07-13 Thread Jason
Hi New karasub/add for Nehalem, I did do a re-shuffle and it was pretty much optimal, but the feedin/winddown code killed it, so at the mo this is just the existing K10 code with the inc's replaced by an add and lea's, just do a diff. Going to a 3-way unroll on karasub frees up 2 registers

Re: [mpir-devel] New assembler

2010-12-03 Thread Jason
On Saturday 04 December 2010 02:01:48 Jason wrote: > On Saturday 04 December 2010 01:40:13 Bill Hart wrote: > > On 4 December 2010 00:52, Jason wrote: > > > Hi > > > > > > Heres the first lot of new assembler code for the x64 (in trunk) > > > > > > popcount/hamdist are not terribly useful for MP

Re: [mpir-devel] New assembler

2010-12-03 Thread Jason
On Saturday 04 December 2010 01:40:13 Bill Hart wrote: > On 4 December 2010 00:52, Jason wrote: > > Hi > > > > Heres the first lot of new assembler code for the x64 (in trunk) > > > > popcount/hamdist are not terribly useful for MPIR , but they do offer a > > simple way to practice stuff. > > >

Re: [mpir-devel] New assembler

2010-12-03 Thread Bill Hart
On 4 December 2010 00:52, Jason wrote: > Hi > > Heres the first lot of new assembler code for the x64 (in trunk) > > popcount/hamdist are not terribly useful for MPIR , but they do offer a simple > way to practice stuff. > > K8 popcount was 5.5c/l with 2way unroll now 4.66c/l with 3way > K8 hamdis

[mpir-devel] New assembler

2010-12-03 Thread Jason
Hi Here's the first lot of new assembler code for the x64 (in trunk). popcount/hamdist are not terribly useful for MPIR, but they do offer a simple way to practice stuff. K8 popcount was 5.5c/l with 2-way unroll, now 4.66c/l with 3-way. K8 hamdist was 5.5c/l with 2-way unroll, now 5.0c/l with 3-way. Th
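
For readers unfamiliar with the two routines mentioned above, here is a rough C sketch of what popcount and hamdist compute over an array of 64-bit limbs, with the loop unrolled three ways. This is not the assembler from trunk. The names `limb_popcount`, `sketch_popcount` and `sketch_hamdist` are made up for illustration, and the bit-counting step is a generic portable trick rather than whatever the K8 code actually does. It only shows the shape of a 3-way unroll; the c/l figures quoted in the post (presumably cycles per limb) depend on the hand-scheduled assembler and will not be reproduced by a compiler on this code.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable population count of one 64-bit limb (generic SWAR bit trick;
   an assumption for illustration, not the method used in the MPIR code). */
static uint64_t limb_popcount(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (x * 0x0101010101010101ULL) >> 56;
}

/* Count the set bits in p[0..n-1], three limbs per iteration.
   Three separate accumulators keep the three chains independent. */
uint64_t sketch_popcount(const uint64_t *p, size_t n)
{
    uint64_t c0 = 0, c1 = 0, c2 = 0;
    size_t i = 0;

    for (; i + 3 <= n; i += 3) {
        c0 += limb_popcount(p[i]);
        c1 += limb_popcount(p[i + 1]);
        c2 += limb_popcount(p[i + 2]);
    }
    /* Wind-down: handle the 0, 1 or 2 limbs left over. */
    for (; i < n; i++)
        c0 += limb_popcount(p[i]);

    return c0 + c1 + c2;
}

/* Hamming distance: count the bit positions where a and b differ,
   i.e. the popcount of a XOR b, again three limbs per iteration. */
uint64_t sketch_hamdist(const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t c0 = 0, c1 = 0, c2 = 0;
    size_t i = 0;

    for (; i + 3 <= n; i += 3) {
        c0 += limb_popcount(a[i] ^ b[i]);
        c1 += limb_popcount(a[i + 1] ^ b[i + 1]);
        c2 += limb_popcount(a[i + 2] ^ b[i + 2]);
    }
    for (; i < n; i++)
        c0 += limb_popcount(a[i] ^ b[i]);

    return c0 + c1 + c2;
}
```

The wind-down loop is the part that tends to cost the most relative to its size in routines like this, since it runs outside the unrolled body for any length that is not a multiple of three.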