Ha, GAP isn't fast at everything. I just found timings for their multiple polynomial quadratic sieve: it takes 2 hours to factor a 60-digit number, while my sieve takes about 9 seconds. But what's a factor of 800 between friends?
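The factor of 800 is just the ratio of the two quoted timings (the numbers below restate the figures above; nothing new is measured):

```python
gap_time_s = 2 * 60 * 60   # GAP's MPQS on a 60-digit number: 2 hours
my_time_s = 9              # Bill's sieve on the same size: about 9 seconds
print(gap_time_s / my_time_s)  # → 800.0
```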
Bill.

On 19 May, 22:23, Bill Hart <[EMAIL PROTECTED]> wrote:
> Martin,
>
> That's all excellent news!! So on the C2D we are caning Magma. But we
> should try and figure out whether your Magma version is optimised for
> C2D or for amd64, since that will make a big difference. Is your
> machine some kind of 64-bit Intel OS X machine? I don't see a specific
> Core 2 version of Magma on their current list. Of course, if you just
> had a generic Linux x86 version of Magma, that would be much slower
> than optimal.
>
> It's amazing how much difference SSE makes on your machine. The AMD
> essentially uses its MMX or SSE hardware to read in cache lines, I
> believe, so unless you are doing something requiring lots of wide
> arithmetic/logic, you aren't going to get anything more out of the
> chip.
>
> I look forward to seeing the new code now that you've cleaned it up.
>
> I'm going to try to figure out what GAP does, in case there are any
> ideas we missed. It's surely old code, but there might be lots of
> interesting things in there.
>
> Anyhow, who would have thought that one would see 1.22s for a
> 10000x10000 matrix multiply? That's pretty exciting.
>
> Bill.
>
> On 19 May, 21:39, Martin Albrecht <[EMAIL PROTECTED]> wrote:
> > On Monday 19 May 2008, Bill Hart wrote:
> > > You seemed to be getting up to 8% at points there. That's
> > > definitely worth it. I'll be interested to see this evening how it
> > > comes out, though I recommend optimising my combine3 function
> > > (which I suppose should now be combine8), even including it inline
> > > rather than having it in a separate file.
> > >
> > > Of course, on the Opteron SSE should be switched off, since it is
> > > definitely slower by about 5%-10%, even with careful optimisation.
> > >
> > > Bill.
> > Okay, I added SSE2 support again and the timings are pretty good on
> > the C2D:
> >
> > Dimension       Old     New
> > 10000 x 10000   2.270   1.720
> > 16384 x 16384   9.130   6.760
> > 20000 x 20000   16.110  12.310
> > 32000 x 32000   64.340  50.690
> >
> > Throwing parallelism into the mix (still a lame implementation):
> >
> > Dimension       Old     New
> > 10000 x 10000   1.470   1.220
> > 16384 x 16384   5.540   4.390
> > 20000 x 20000   11.800  8.580
> > 32000 x 32000   40.040  32.810
> >
> > Btw., Mike Hansen pointed out on IRC that GAP has a pretty fast
> > implementation of matrix multiplication too:
> >
> > GAP4, Version: 4.4.10 of 02-Oct-2007, x86_64-unknown-linux-gnu-gcc
> > gap> A := RandomMat(10000,10000,GF(2));
> > <a 10000x10000 matrix over GF2>
> > gap> B := RandomMat(10000,10000,GF(2));
> > <a 10000x10000 matrix over GF2>
> > gap> C := A*B;
> > <a 10000x10000 matrix over GF2>
> > gap> time;
> > 5951
> >
> > The unit here is ms, so this takes about 6 seconds. However, the
> > generation of random matrices takes forever. Mike also pointed out
> > that for the example he tried, GAP is twice as fast as the current
> > Sage code (i.e. the code before the improvements discussed in this
> > thread).
> >
> > On sage.math things don't improve as expected:
> >
> > sage: A = random_matrix(GF(2),32000,32000)
> > sage: B = random_matrix(GF(2),32000,32000)
> > sage: time C = A._multiply_strassen(B,cutoff=2^11)
> > CPU times: user 121.69 s, sys: 3.93 s, total: 125.62 s
> > Wall time: 125.62
> >
> > This was 114.620 before.
> >
> > Martin
> >
> > --
> > name: Martin Albrecht
> > _pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
> > _www: http://www.informatik.uni-bremen.de/~malb
> > _jab: [EMAIL PROTECTED]

--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
-~----------~----~----~----~------~----~------~--~---
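The multiplications discussed in this thread are over GF(2), where adding two rows is a single word-wide XOR; that is what the 128-bit SSE2 XORs and the combine3/combine8-style routines speed up. Below is a minimal Python sketch of the idea, including a table-based ("greasing", i.e. Method of the Four Russians) variant that combines 8 columns at a time. This is only an illustration of the technique, not the actual Sage, M4RI, or GAP code, and the function names are made up for the example:

```python
import random

def mul_gf2(A, B, n):
    # A, B: n x n matrices over GF(2), each row packed into a Python int
    # (bit j of A[i] is the entry in row i, column j).
    # Row i of the product is the XOR of the rows B[j] with A[i][j] = 1;
    # each XOR here is one word-wide operation, the analogue of the
    # 128-bit SSE2 XORs in the C code.
    C = []
    for i in range(n):
        acc, a, j = 0, A[i], 0
        while a:
            if a & 1:
                acc ^= B[j]
            a >>= 1
            j += 1
        C.append(acc)
    return C

def mul_gf2_grease(A, B, n, k=8):
    # Same product, but handle k columns of A at a time: precompute the
    # XOR of every subset of k consecutive rows of B, then process k
    # bits of each A-row with one table lookup instead of k bit tests.
    C = [0] * n
    for block in range(0, n, k):
        kk = min(k, n - block)
        # table[s] = XOR of B[block + j] for every bit j set in s,
        # built incrementally from smaller subsets.
        table = [0] * (1 << kk)
        for s in range(1, 1 << kk):
            low = s & -s  # lowest set bit of s
            table[s] = table[s ^ low] ^ B[block + low.bit_length() - 1]
        for i in range(n):
            C[i] ^= table[(A[i] >> block) & ((1 << kk) - 1)]
    return C

# The two routines agree on random inputs:
n = 64
A = [random.getrandbits(n) for _ in range(n)]
B = [random.getrandbits(n) for _ in range(n)]
assert mul_gf2(A, B, n) == mul_gf2_grease(A, B, n)
```

Packing rows into machine words is what turns one matrix entry per operation into 64 (or, with SSE2, 128) entries per XOR; the greasing table then trades memory for roughly a factor of k fewer XOR operations per row.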