Re: speed of unbalanced multiplication

2013-02-08 Thread bodrato
Ciao, On Fri, 8 February 2013 11:42 am, Torbjorn Granlund wrote: > bodr...@mail.dm.unipi.it writes: > I agree, but ... the only difference I could see on my netbook is not > memory alignment, but "position". > > Was this reproduced on any non-Linux system? Perhaps Linux somehow > messes

Re: speed of unbalanced multiplication

2013-02-08 Thread Torbjorn Granlund
bodr...@mail.dm.unipi.it writes: I agree, but ... the only difference I could see on my netbook is not memory alignment, but "position". Was this reproduced on any non-Linux system? Perhaps Linux somehow messes up caching and/or TLB for certain address ranges? -- Torbjörn

Re: speed of unbalanced multiplication

2013-02-07 Thread bodrato
Ciao Niels, On Thu, 7 February 2013 10:29 am, Niels Möller wrote: > But if we change the meaning of r, maybe it would make sense to > interpret it as follows: > > if r <= size, it's the size of the smaller operand (what your change > does). > > if r > size, it's the size of the *product*

Re: speed of unbalanced multiplication

2013-02-07 Thread bodrato
Ciao Torbjorn, On Thu, 7 February 2013 11:22 am, Torbjorn Granlund wrote: > It would be nice to understand the slowdown of before, though... I agree, but ... the only difference I could see on my netbook is not memory alignment, but "position". Before the patch, I obtain: $ tune/speed -o a

Re: speed of unbalanced multiplication

2013-02-07 Thread Torbjorn Granlund
bodr...@mail.dm.unipi.it writes: If other developers do not dislike the changed meaning of the . parameter to mpn_mul, this patch can be applied to the main repo... I don't mind, since I never remember which way it is anyway. Avoiding the local allocation is nice too. It would be nice to

Re: speed of unbalanced multiplication

2013-02-07 Thread bodrato
Ciao Niels, On Thu, 7 February 2013 10:29 am, Niels Möller wrote: > I don't understand the details, like the align parameter to the Unfortunately, it's the same for me... I only tried to mimic the MPN_MUL_N macro. > Makes sense to me to have the r parameter give the size of the smaller > op

Re: speed of unbalanced multiplication

2013-02-07 Thread Niels Möller
bodr...@mail.dm.unipi.it writes: > After the patch, only changing the way tune/speed allocates memory for the > operands, their results are comparable: I don't understand the details, like the align parameter to the allocation macros. > If other developers do not dislike the changed meaning of

Re: speed of unbalanced multiplication

2013-02-07 Thread Zimmermann Paul
cs, CPU freq 800.00 MHz mpn_mul_n mpn_mul mpn_mul_n mpn_mul 800.660041000 #0.656041000 0.660041000 0.660041000 > There is a side-effect: to measure the speed of unbalanced multiplication, > eg ## x ##, you used > > tune/speed -s ## mpn_mul.

Re: speed of unbalanced multiplication

2013-02-07 Thread Zimmermann Paul
> if the culprit is the macro used in speed, it should be fixed! > > I stared at it for an hour yesterday, and I cannot see any problems. > > Operand alignment will differ, but then we shouldn't get consistently > worse performance from mpn_mul. Strange indeed. Did you try to use the same op

Re: speed of unbalanced multiplication

2013-02-07 Thread bodrato
the speed of unbalanced multiplication, eg ## x ##, you used tune/speed -s ## mpn_mul.## now the roles of the two parameters are swapped, and you have to write tune/speed -s ## mpn_mul.## The transposed version of the matrix of times I suggested in the previous message, can now be

Re: speed of unbalanced multiplication

2013-02-07 Thread Torbjorn Granlund
Zimmermann Paul writes: if the culprit is the macro used in speed, it should be fixed! I stared at it for an hour yesterday, and I cannot see any problems. Operand alignment will differ, but then we shouldn't get consistently worse performance from mpn_mul. -- Torbjörn

Re: speed of unbalanced multiplication

2013-02-07 Thread Zimmermann Paul
Marco, > Date: Wed, 6 Feb 2013 17:59:44 +0100 (CET) > From: bodr...@mail.dm.unipi.it > > Ciao Paul! Ciao!!! > Of course. With current implementation, unbalanced multiplications need > some more memory and a few additions/subtractions, but this should not > give a measurable slow-down. Th

Re: speed of unbalanced multiplication

2013-02-06 Thread bodrato
Ciao Paul! On Sun, 27 January 2013 10:09 am, Zimmermann Paul wrote: > In the FFT range, multiplying n limbs by m limbs should not be more > expensive than multiplying two numbers of (n+m)/2 limbs. Of course. With the current implementation, unbalanced multiplications need some more memory and a

Re: speed of unbalanced multiplication

2013-01-27 Thread Zimmermann Paul
Marco, > Date: Sat, 26 Jan 2013 16:21:28 +0100 (CET) > From: bodr...@mail.dm.unipi.it > > Ciao, > > On Sat, 26 January 2013 4:01 pm, bodr...@mail.dm.unipi.it wrote: > > I mean, which timing do you obtain with the following? > > ./speed -s $((100+775660)/2) mpn_mul_n mpn_mul_n >

Re: speed of unbalanced multiplication

2013-01-26 Thread bodrato
Ciao, On Sat, 26 January 2013 4:01 pm, bodr...@mail.dm.unipi.it wrote: > I mean, which timing do you obtain with the following? > ./speed -s $((100+775660)/2) mpn_mul_n mpn_mul_n Sorry... I mean: ./speed -s $[(100+775660)/2] mpn_mul_n mpn_mul_n -- http://bodrato.it/
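
A note on the shell syntax in the correction above: `$((100+775660)/2)` is not parsed as the intended arithmetic expansion, and `$[...]` is a legacy bash form. A minimal sketch of the same computation in POSIX arithmetic expansion, assuming the two operand sizes 775660 and 100 limbs quoted in the thread (the variable names `n`, `m` and `balanced` are illustrative only, not part of GMP or of `speed`):

```shell
# Operand sizes in limbs, taken from the thread (775660 x 100 multiplication).
n=775660
m=100

# POSIX arithmetic expansion: $(( expression )), with the sum parenthesised
# so that the division applies to n + m rather than to m alone.
balanced=$(( (n + m) / 2 ))

echo "balanced size: $balanced limbs"

# Print (not run) the balanced comparison command suggested in the thread.
echo "./speed -s $balanced mpn_mul_n"
```

The last line only prints the command; running it requires a built `speed` binary from GMP's `tune` directory.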

Re: speed of unbalanced multiplication

2013-01-26 Thread bodrato
Ciao Paul, On Fri, 25 January 2013 9:24 pm, Zimmermann Paul wrote: > in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than > a multiplication of n x n limbs. Compare for example the line starting > mpn_mul.100 mpn_mul.100 > 775660 #0.740046000 0.74404

Re: speed of unbalanced multiplication

2013-01-26 Thread Torbjorn Granlund
Zimmermann Paul writes: in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than a multiplication of n x n limbs. Compare for example the line starting with 775660 in the first output from speed, and the one starting with 100 in the second one below. [snip] T

speed of unbalanced multiplication

2013-01-25 Thread Zimmermann Paul
Hi, in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than a multiplication of n x n limbs. Compare for example the line starting with 775660 in the first output from speed, and the one starting with 100 in the second one below. frite% ./speed -s 50-100 -f 1