[mpir-devel] Re: Core2

2009-05-05 Thread Jason Moxham
Deleting case1,2,3, so that we do the main loop and just fall through straight into case0, brings the time back to 3393, so there are no branches now to get in the way, i.e.
add $4,%r8
mov %rcx,-16(%rdi,%r8,8)
jnc lp # this is end of main loop
ALIGN(32)
skiplp:
#cmp $2,%r8
#j

[mpir-devel] Re: Core2

2009-05-05 Thread Jason Moxham
On Tuesday 05 May 2009 03:09:54 Gonzalo Tornaria wrote: > On Mon, May 4, 2009 at 11:27 AM, Jason Moxham wrote: > > Hi > > > > I've been playing with some assembler for the Intel Core2 chips and have > > come across this timing oddity which I can't explain. Any ideas? > > Maybe it's to do with th

[mpir-devel] Re: Core2

2009-05-05 Thread Jason Moxham
On Monday 04 May 2009 19:23:43 David Harvey wrote: > Does it make a difference if you permute the case0 block with any of > the others? > No difference > Does it make a difference if you insert a dummy read/write instruction > into the case0 block? > if I put a mov %r15,%r9 at the start of cas

[mpir-devel] Re: Core2

2009-05-04 Thread Gonzalo Tornaria
On Mon, May 4, 2009 at 11:27 AM, Jason Moxham wrote: > > Hi > > I've been playing with some assembler for the Intel Core2 chips and have come > across this timing oddity which I can't explain. Any ideas? Maybe it's to do with the branch predictor? Remarks: 1. It seems to me that this starts hap

[mpir-devel] Re: Core2

2009-05-04 Thread David Harvey
Does it make a difference if you permute the case0 block with any of the others? Does it make a difference if you insert a dummy read/write instruction into the case0 block? david On May 4, 1:39 pm, Jason Moxham wrote: > Making all cases the same ie using jmp case0 then all the times are fast

[mpir-devel] Re: Core2

2009-05-04 Thread Jason Moxham
If I make all cases the same, i.e. using jmp case0, then all the times are fast, and using jmp case1 all the times are slow. It looks like only the case0 epilogue is fast, and the case1,2,3 epilogues are taking 500 cycles. L1 cache is 32KB and our 2 srcs and 1 dst are 24K overall, so all data sh
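A rough check of the working-set arithmetic in the message above. This is only a sketch: it assumes 1000-limb operands (the size passed to the speed run elsewhere in this thread) and 8-byte limbs (suggested by the scale factor of 8 in the quoted addressing mode); neither is stated in this message.

```python
LIMB_BYTES = 8             # assumption: 64-bit limbs on x86-64
N_LIMBS = 1000             # assumption: operand size from the `-s 1000` speed run
OPERANDS = 3               # two sources and one destination
L1_DATA_BYTES = 32 * 1024  # 32KB L1 data cache, as quoted in the message

# Total bytes touched by one call, and whether it fits in L1.
working_set = OPERANDS * N_LIMBS * LIMB_BYTES
fits_in_l1 = working_set <= L1_DATA_BYTES
print(working_set, fits_in_l1)
```

Under those assumptions the three operands come to about 24K, which matches the figure quoted and leaves room in a 32KB L1.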

[mpir-devel] Re: Core2

2009-05-04 Thread David Harvey
What happens if you remove the epilogue, i.e. make it run the main loop exactly floor(n/4) times, so that it performs exactly the same sequence of instructions for e.g. n = 12, 13, 14, 15? david On May 4, 11:44 am, Jason Moxham wrote: > Yeah , the numbers are consistent , nice surprise for core
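The experiment suggested above can be sketched as a split of n into main-loop iterations and leftover limbs. The unroll factor of 4 comes from the thread (`add $4,%r8`, floor(n/4)); the name `split_work` is hypothetical, and mapping a leftover of k limbs to a "casek" epilogue is an assumption, not something the thread confirms.

```python
UNROLL = 4  # the main loop handles four limbs per iteration

def split_work(n):
    """Return (main-loop iterations, leftover limbs for the epilogue)."""
    return divmod(n, UNROLL)

# With the epilogue removed, only the first number matters, so these
# four sizes would execute the same sequence of instructions.
for n in (12, 13, 14, 15):
    iterations, leftover = split_work(n)
    print(n, iterations, leftover)
```

If the timings for n = 12..15 still differed with the epilogue removed, the epilogue code itself could be ruled out as the cause.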

[mpir-devel] Re: Core2

2009-05-04 Thread Jason Moxham
Yeah, the numbers are consistent, nice surprise for core2 :) And running tests on their own gives us the same numbers.
tune$ ./speed -c -s 1000 mpn_test_pppn
overhead 7.00 cycles, precision 100 units of 5.37e-10 secs, CPU freq 1861.91 MHz
        mpn_test_pppn
1000    2809.93
tune$
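To put the speed output above in per-limb terms, a small sketch. It assumes the reported 2809.93 is a cycle count for one call at size 1000, which is how this sketch reads the `-c` run; the variable names are illustrative only.

```python
cpu_mhz = 1861.91   # CPU frequency reported by speed
cycles = 2809.93    # reported time for mpn_test_pppn at size 1000
limbs = 1000        # operand size passed with -s

# One timer unit is one clock period at the reported frequency.
seconds_per_cycle = 1.0 / (cpu_mhz * 1e6)
cycles_per_limb = cycles / limbs

print(f"{seconds_per_cycle:.2e} s per cycle")
print(f"{cycles_per_limb:.2f} cycles per limb")
```

The clock period works out to the 5.37e-10 secs unit shown in the output, and the run corresponds to roughly 2.81 cycles per limb.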

[mpir-devel] Re: Core2

2009-05-04 Thread David Harvey
Do you get consistent numbers if you run only for a single value of n? i.e. it's not an artifact of the way the buffers are allocated or something? david On May 4, 10:27 am, Jason Moxham wrote: > Hi > > I've been playing with some assembler for the Intel Core2 chips and have come > across this

[mpir-devel] Re: core2

2009-02-23 Thread jason
Done On Monday 23 February 2009 17:21:25 Bill Hart wrote: > Yes we can. > > 2009/2/23 Cactus : > > On Feb 23, 3:22 pm, ja...@njkfrudils.plus.com wrote: > >> I've finished with the core2 and K8 branches now, so I can delete them, > >> I assume the svn log will be transferred/kept in trunk? > >>

[mpir-devel] Re: core2

2009-02-23 Thread Bill Hart
Yes we can. 2009/2/23 Cactus : > > > > On Feb 23, 3:22 pm, ja...@njkfrudils.plus.com wrote: >> I've finished with the core2 and K8 branches now, so I can delete them, I >> assume the svn log will be transferred/kept in trunk? >> >> Is this OK, Brian? >> >> Jason > > Yes, that is fine by me. I

[mpir-devel] Re: core2

2009-02-23 Thread Cactus
On Feb 23, 3:22 pm, ja...@njkfrudils.plus.com wrote: > I've finished with the core2 and K8 branches now, so I can delete them, I > assume the svn log will be transferred/kept in trunk? > > Is this OK, Brian? > > Jason Yes, that is fine by me. I think we can also delete the toom3 branch as w

[mpir-devel] Re: core2

2009-02-23 Thread jason
I've finished with the core2 and K8 branches now, so I can delete them. I assume the svn log will be transferred/kept in trunk? Is this OK, Brian? Jason

[mpir-devel] Re: core2

2009-02-23 Thread Bill Hart
If the sage.math machine is overloaded you could try ssh mod That machine is identical. It is officially for modular forms development, but if you have access and it is not loaded I am sure no one will mind you running some short timings. Bill. 2009/2/23 : > > On Monday 23 February 2009 12:12:

[mpir-devel] Re: core2

2009-02-23 Thread jason
On Monday 23 February 2009 12:12:59 Bill Hart wrote: > I think SkyNet/eno is a Core 2. It would be interesting to see what > score one obtains on a slower Core 2 machine to test the idea that > these scores scale linearly with clock speed. > > At the moment it looks like we'd be about 10% behind G

[mpir-devel] Re: core2

2009-02-23 Thread jason
On Monday 23 February 2009 12:12:59 Bill Hart wrote: > I think SkyNet/eno is a Core 2. It would be interesting to see what > score one obtains on a slower Core 2 machine to test the idea that > these scores scale linearly with clock speed. > > At the moment it looks like we'd be about 10% behind G

[mpir-devel] Re: core2

2009-02-23 Thread Bill Hart
I think SkyNet/eno is a Core 2. It would be interesting to see what score one obtains on a slower Core 2 machine to test the idea that these scores scale linearly with clock speed. At the moment it looks like we'd be about 10% behind GMP 4.3 if the benchmarks do scale linearly. Bill. 2009/2/23

[mpir-devel] Re: core2

2009-02-23 Thread jason
I've done the merge. It is only marginally faster than before, probably because there were not many inc/dec's, and there is probably still a fair amount of slack in the functions. I got a bench of 10364 on sage.math On Sunday 22 February 2009 22:34:11 Jason Martin wrote: > Sounds fine. >

[mpir-devel] Re: core2

2009-02-22 Thread Jason Martin
Sounds fine. Jason Worth Martin Asst. Professor of Mathematics http://www.math.jmu.edu/~martin On Sun, Feb 22, 2009 at 2:49 PM, wrote: > > > If there are no objections , I will merge the core-2 branch into trunk > tomorrow. All I have changed are inc/dec to add/sub . > I didn't do it for mpn

[mpir-devel] Re: core2

2009-02-22 Thread mabshoff
On Feb 22, 11:49 am, ja...@njkfrudils.plus.com wrote: > If there are no objections , I will merge the core-2 branch into trunk > tomorrow. All I have changed are inc/dec to add/sub . > I didn't do it for mpn-divexact_byff or the sub,add part of redc_basecase > because it was not trivial. > > Whe

[mpir-devel] Re: core2

2009-02-22 Thread jason
If there are no objections, I will merge the core-2 branch into trunk tomorrow. All I have changed are inc/dec to add/sub. I didn't do it for mpn_divexact_byff or the sub,add part of redc_basecase because it was not trivial. When I say a merge I will just copy the new files across , as the

[mpir-devel] Re: core2

2009-02-20 Thread Jason Martin
On Fri, Feb 20, 2009 at 3:18 PM, wrote: > > On Friday 20 February 2009 16:42:19 Jason Martin wrote: >> > On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote: >> > >> > What happened to core-2 mul_basecase and sqr_basecase ? , no-wonder >> > core-2 benchmarks are crap >> >> There

[mpir-devel] Re: core2

2009-02-20 Thread jason
On Friday 20 February 2009 20:37:50 Cactus wrote: > On Feb 20, 8:18 pm, ja...@njkfrudils.plus.com wrote: > > On Friday 20 February 2009 16:42:19 Jason Martin wrote: > > > > On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote: > > > > > > > > What happened to core-2 mul_basecase and

[mpir-devel] Re: core2

2009-02-20 Thread Cactus
On Feb 20, 8:18 pm, ja...@njkfrudils.plus.com wrote: > On Friday 20 February 2009 16:42:19 Jason Martin wrote: > > > > On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote: > > > > What happened to core-2 mul_basecase and sqr_basecase ? , no-wonder > > > core-2 benchmarks are crap

[mpir-devel] Re: core2

2009-02-20 Thread jason
On Friday 20 February 2009 16:42:19 Jason Martin wrote: > > On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote: > > > > What happened to core-2 mul_basecase and sqr_basecase ? , no-wonder > > core-2 benchmarks are crap > > There aren't any :-) I was just using Gaudry's code for t

[mpir-devel] Re: core2

2009-02-20 Thread William Stein
On Fri, Feb 20, 2009 at 8:45 AM, mabshoff wrote: > > > > On Feb 20, 8:45 am, ja...@njkfrudils.plus.com wrote: > > Jason: those numbers already *rock* > >> Some benchmarks on core2-unknown-linux-gnu (sage.math)Intel(R) Xeon(R) CPU >> X7460 @ 2.66GHz >> >> 8307 on trunk r1623 , should be same sco

[mpir-devel] Re: core2

2009-02-20 Thread mabshoff
On Feb 20, 8:45 am, ja...@njkfrudils.plus.com wrote: Jason: those numbers already *rock* > Some benchmarks on core2-unknown-linux-gnu  (sage.math)Intel(R) Xeon(R) CPU   >         > X7460  @ 2.66GHz > > 8307 on trunk r1623 , should be same score as mpir-0.9.0 > 10252 on core-2 branch r1623 > >

[mpir-devel] Re: core2

2009-02-20 Thread Jason Martin
> On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote: > > What happened to core-2 mul_basecase and sqr_basecase ? , no-wonder core-2 > benchmarks are crap There aren't any :-) I was just using Gaudry's code for those routines. Should be able to use your amd64 code for those eve

[mpir-devel] Re: core2

2009-02-20 Thread jason
Some benchmarks on core2-unknown-linux-gnu (sage.math), Intel(R) Xeon(R) CPU X7460 @ 2.66GHz:
8307 on trunk r1623, should be the same score as mpir-0.9.0
10252 on core-2 branch r1623
a 23.4% speedup.
I keep getting
make[5]: warning: Clock skew detected. Your build may be incomplete
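The quoted speedup follows directly from the two bench scores. A minimal check, assuming a higher score means a faster build (which is how the message treats it):

```python
trunk_score = 8307     # trunk r1623
branch_score = 10252   # core-2 branch r1623

# Relative improvement of the branch over trunk, in percent.
speedup_percent = (branch_score - trunk_score) / trunk_score * 100
print(f"{speedup_percent:.1f}% speedup")
```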

[mpir-devel] Re: core2

2009-02-20 Thread jason
On Friday 20 February 2009 14:12:19 ja...@njkfrudils.plus.com wrote: What happened to core-2 mul_basecase and sqr_basecase? No wonder core-2 benchmarks are crap > Running the k8/k10 asm code with no changes on the core2 machine sage we > get this > > popcount,hamdist no popcount instructi