Ciao,
On Fri, 8 February 2013 11:42 am, Torbjorn Granlund wrote:
> bodr...@mail.dm.unipi.it writes:
> I agree, but ... the only difference I could see on my netbook is not
> memory alignment, but "position".
>
> Was this reproduced on any non-Linux system? Perhaps Linux somehow
> messes
bodr...@mail.dm.unipi.it writes:
I agree, but ... the only difference I could see on my netbook is not
memory alignment, but "position".
Was this reproduced on any non-Linux system? Perhaps Linux somehow
messes up caching and/or TLB for certain address ranges?
--
Torbjörn
___
Ciao Niels,
On Thu, 7 February 2013 10:29 am, Niels Möller wrote:
> But if we change the meaning of r, maybe it would make sense to
> interpret it as follows:
>
> if r <= size, it's the size of the smaller operand (what your change
> does).
>
> if r > size, it's the size of the *product*
Ciao Torbjorn,
On Thu, 7 February 2013 11:22 am, Torbjorn Granlund wrote:
> It would be nice to understand the slowdown of before, though...
I agree, but ... the only difference I could see on my netbook is not
memory alignment, but "position".
Before the patch, I obtain:
$ tune/speed -o a
bodr...@mail.dm.unipi.it writes:
If other developers do not dislike the changed meaning of the .
parameter to mpn_mul, this patch can be applied to the main repo...
I don't mind, since I never remember which way it is anyway.
Avoiding the local allocation is nice too.
It would be nice to
Ciao Niels,
On Thu, 7 February 2013 10:29 am, Niels Möller wrote:
> I don't understand the details, like the align parameter to the
Unfortunately, it's the same for me... I only tried to mimic the MPN_MUL_N
macro.
> Makes sense to me to have the r parameter give the size of the smaller
> op
bodr...@mail.dm.unipi.it writes:
> After the patch, which only changes the way tune/speed allocates memory for
> the operands, their results are comparable:
I don't understand the details, like the align parameter to the
allocation macros.
> If other developers do not dislike the changed meaning of
cs, CPU freq
800.00 MHz
mpn_mul_n mpn_mul mpn_mul_n mpn_mul
800.660041000 #0.656041000 0.660041000 0.660041000
> There is a side-effect: to measure the speed of unbalanced multiplication,
> eg ## x ##, you used
>
> tune/speed -s ## mpn_mul.
> if the culprit is the macro used in speed, it should be fixed!
>
> I stared at it for an hour yesterday, and I cannot see any problems.
>
> Operand alignment will differ, but then we shouldn't get consistently
> worse performance from mpn_mul.
strange indeed. Did you try to use the same op
the speed of unbalanced multiplication,
eg ## x ##, you used
tune/speed -s ## mpn_mul.##
now the roles of the two parameters are swapped, and you have to write
tune/speed -s ## mpn_mul.##
The transposed version of the matrix of times I suggested in the previous
message, can now be
Zimmermann Paul writes:
if the culprit is the macro used in speed, it should be fixed!
I stared at it for an hour yesterday, and I cannot see any problems.
Operand alignment will differ, but then we shouldn't get consistently
worse performance from mpn_mul.
--
Torbjörn
___
Marco,
> Date: Wed, 6 Feb 2013 17:59:44 +0100 (CET)
> From: bodr...@mail.dm.unipi.it
>
> Ciao Paul!
Ciao!!!
> Of course. With current implementation, unbalanced multiplications need
> some more memory and a few additions/subtractions, but this should not
> give a measurable slow-down. Th
Ciao Paul!
On Sun, 27 January 2013 10:09 am, Zimmermann Paul wrote:
> In the FFT range, multiplying n limbs by m limbs should not be more
> expensive than multiplying two numbers of (n+m)/2 limbs.
Of course. With current implementation, unbalanced multiplications need
some more memory and a
Marco,
> Date: Sat, 26 Jan 2013 16:21:28 +0100 (CET)
> From: bodr...@mail.dm.unipi.it
>
> Ciao,
>
> On Sat, 26 January 2013 4:01 pm, bodr...@mail.dm.unipi.it wrote:
> > I mean, which timing do you obtain with the following?
> > ./speed -s $((100+775660)/2) mpn_mul_n mpn_mul_n
>
Ciao,
On Sat, 26 January 2013 4:01 pm, bodr...@mail.dm.unipi.it wrote:
> I mean, which timing do you obtain with the following?
> ./speed -s $((100+775660)/2) mpn_mul_n mpn_mul_n
Sorry... I mean:
./speed -s $[(100+775660)/2] mpn_mul_n mpn_mul_n
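
A note on the correction above: the first form, $((100+775660)/2), is not a complete arithmetic expansion because the closing "))" is missing, while $[...] is an older bash-only spelling. A minimal sketch of the same computation in POSIX arithmetic syntax follows; the variable names small, large and size are illustrative only and do not come from the thread, and the ./speed call is left as a comment rather than executed.

```shell
# Operand sizes (in limbs) taken from the command quoted above.
small=100
large=775660

# POSIX arithmetic expansion has the form $(( expression )).
# The inner parentheses group the sum before the division.
size=$(( (small + large) / 2 ))

echo "$size"   # prints 387880

# The benchmark invocation would then read (not run here):
#   ./speed -s "$size" mpn_mul_n mpn_mul_n
```

This should behave the same in bash, dash and other POSIX shells, which is the main reason to prefer it over $[...].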
--
http://bodrato.it/
_
Ciao Paul,
On Fri, 25 January 2013 9:24 pm, Zimmermann Paul wrote:
> in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than
> a multiplication of n x n limbs. Compare for example the line starting
> mpn_mul.100 mpn_mul.100
> 775660 #0.740046000 0.74404
Zimmermann Paul writes:
in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than
a multiplication of n x n limbs. Compare for example the line starting with
775660 in the first output from speed, and the one starting with 100 in
the second one below.
[snip]
T
Hi,
in GMP 5.1.0, a multiplication of n x m limbs for m < n can be slower than
a multiplication of n x n limbs. Compare for example the line starting with
775660 in the first output from speed, and the one starting with 100 in
the second one below.
frite% ./speed -s 50-100 -f 1