paul zimmermann writes:
> together with Raphaël Rieu-Helft (in cc), we believe we have found some dead
> code in
> mpn/generic/div_q.c around lines 173-182:
>
> else if (UNLIKELY (qh != 0))
> {
> /* This happens only when the quotient is close to B^n and
On Fri, 20 Apr 2018, Marc Glisse wrote:
On Fri, 20 Apr 2018, Marc Glisse wrote:
On Fri, 20 Apr 2018, Marc Glisse wrote:
On Fri, 20 Apr 2018, Vincent Lefevre wrote:
On 2018-04-20 04:14:15 +0200, Fredrik Johansson wrote:
For operands with 1-4 limbs, that is; on my machine, mpn_mul takes up
"Marco Bodrato" writes:
> On the generic (or no-asm) side, we could at least swap the first branches
> in mpn_mul. Currently we have:
>
> if (un == vn)
> {
> if (up == vp)
> mpn_sqr (prodp, up, un);
> else
> mpn_mul_n (prodp, up, vp, un);
> }
> else if (vn < MU
Interesting. I see that this paper compares to NTL as well.
I spent the morning seeing what I could do to improve the situation
for NTL, whose mul routine has essentially the same functionality
as mpz_mul (takes care of memory allocation and signs).
I reduced some overheads and for small inputs ju
On 2018-04-20 20:26:05 +0200, Vincent Lefevre wrote:
> Yes, but this just means that the user must not call these
> functions in such a case. But he can do some work before
> calling these functions. In particular, mpq_numref and
> mpq_denref should work.
In particular, the user can write wrappers
On 2018-04-20 18:29:55 +0200, Trevor Spiteri wrote:
> >>> Only 0 can have lazy allocation, and I think we document that it isn't
> >>> legal to put 0 on the denominator.
> >> where is this documented?
> > That was in a "I think" sentence. Now that I looked a bit more, I don't
> > find it... Well,
>>> Only 0 can have lazy allocation, and I think we document that it isn't
>>> legal to put 0 on the denominator.
>> where is this documented?
> That was in a "I think" sentence. Now that I looked a bit more, I don't
> find it... Well, you can't call any mpq function that reads that mpq_t,
> but
Ciao,
On Fri, 20 April 2018 7:36 pm, Marc Glisse wrote:
> On Fri, 20 Apr 2018, Marco Bodrato wrote:
>> On Fri, 20 April 2018 12:39 pm, Marc Glisse wrote:
>>> there is, the timings are:
>>>
>>> mpn_mul: .56
>>> mpn_mul_n: .36
>>> mpn_mul_basecase: .16
>>
>> Did you try also the document
Ciao,
On Fri, 20 April 2018 1:38 pm, Torbjörn Granlund wrote:
> ni...@lysator.liu.se (Niels Möller) writes:
> Fredrik Johansson writes:
>
> > It would be possible to have mpn_mul itself assembly-coded to do something
> > like this:
> >
> > case 1x1: ...
> > case 2x1: ...
> >
On Fri, 20 Apr 2018, Marco Bodrato wrote:
On Fri, 20 April 2018 12:39 pm, Marc Glisse wrote:
I just tried (LTO+PGO) on a trivial testcase, and gcc didn't manage to do
anything clever with it. Doing it by hand to see how much potential gain
there is, the timings are:
mpn_mul: .56
mpn_mul_
Ciao,
On Fri, 20 April 2018 12:39 pm, Marc Glisse wrote:
> I just tried (LTO+PGO) on a trivial testcase, and gcc didn't manage to do
> anything clever with it. Doing it by hand to see how much potential gain
> there is, the timings are:
>
> mpn_mul: .56
> mpn_mul_n: .36
> mpn_mul_basecase: .
On Fri, 20 Apr 2018, Marc Glisse wrote:
On Fri, 20 Apr 2018, Marc Glisse wrote:
On Fri, 20 Apr 2018, Vincent Lefevre wrote:
On 2018-04-20 04:14:15 +0200, Fredrik Johansson wrote:
For operands with 1-4 limbs, that is; on my machine, mpn_mul takes up to
twice as long as mpn_mul_basecase, and
Hi,
together with Raphaël Rieu-Helft (in cc), we believe we have found some dead
code in
mpn/generic/div_q.c around lines 173-182:
else if (UNLIKELY (qh != 0))
{
/* This happens only when the quotient is close to B^n and
mpn_*_d
ni...@lysator.liu.se (Niels Möller) writes:
Fredrik Johansson writes:
> It would be possible to have mpn_mul itself assembly-coded to do something
> like this:
>
> case 1x1: ...
> case 2x1: ...
> case 2x2: ...
> generic case, small n: (basecase loop)
> generic case, large n: (f
paul zimmermann writes:
>Niels,
>
>> Such an assembly routine would need access to the threshold between
>> basecase and generic, which in the case of fat builds isn't a compile
>> time constant.
>
> but you could determine at compile time a lower bound for fat builds, no?
If it's too i
Niels,
> Such an assembly routine would need access to the threshold between
> basecase and generic, which in the case of fat builds isn't a compile
> time constant.
but you could determine at compile time a lower bound for fat builds, no?
Paul
___
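A minimal sketch, in portable C, of the kind of size dispatch discussed in this thread: dedicated 1x1 and 2x1 cases first, then a generic schoolbook loop. The names toy_mul, toy_mul_basecase and limb_t are hypothetical and are not GMP's API. The sketch assumes 32-bit limbs so that a full product fits in a standard uint64_t; it omits the thresholds that select faster algorithms for large operands, and it says nothing about how an assembly version or a fat build would obtain those thresholds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical 32-bit limb, not GMP's mp_limb_t */

/* Schoolbook product: rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1].
   Assumes un >= vn >= 1 and that rp does not overlap the inputs. */
static void toy_mul_basecase(limb_t *rp, const limb_t *up, size_t un,
                             const limb_t *vp, size_t vn)
{
    assert(un >= vn && vn >= 1);
    for (size_t i = 0; i < un + vn; i++)
        rp[i] = 0;
    for (size_t j = 0; j < vn; j++) {
        uint64_t carry = 0;
        for (size_t i = 0; i < un; i++) {
            /* Cannot overflow: (B-1)^2 + 2(B-1) = B^2 - 1 with B = 2^32. */
            uint64_t t = (uint64_t)up[i] * vp[j] + rp[i + j] + carry;
            rp[i + j] = (limb_t)t;
            carry = t >> 32;
        }
        rp[j + un] = (limb_t)carry;
    }
}

/* Dispatch on operand sizes before falling back to the generic loop. */
static void toy_mul(limb_t *rp, const limb_t *up, size_t un,
                    const limb_t *vp, size_t vn)
{
    if (vn == 1) {
        if (un == 1) {                        /* case 1x1 */
            uint64_t p = (uint64_t)up[0] * vp[0];
            rp[0] = (limb_t)p;
            rp[1] = (limb_t)(p >> 32);
            return;
        }
        if (un == 2) {                        /* case 2x1 */
            uint64_t lo = (uint64_t)up[0] * vp[0];
            uint64_t hi = (uint64_t)up[1] * vp[0] + (lo >> 32);
            rp[0] = (limb_t)lo;
            rp[1] = (limb_t)hi;
            rp[2] = (limb_t)(hi >> 32);
            return;
        }
    }
    toy_mul_basecase(rp, up, un, vp, vn);     /* generic case */
}
```

Calling toy_mul with a two-limb and a one-limb operand takes the 2x1 path and should agree limb for limb with toy_mul_basecase on the same inputs. Whether such special cases pay off in practice depends on the call and branch overhead measured in the thread, which this sketch does not attempt to reproduce.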
Fredrik Johansson writes:
> It would be possible to have mpn_mul itself assembly-coded to do something
> like this:
>
> case 1x1: ...
> case 2x1: ...
> case 2x2: ...
> generic case, small n: (basecase loop)
> generic case, large n: (fall back to calling an mpn_mul_generic function
> that selects
On Fri, 20 Apr 2018, Marc Glisse wrote:
On Fri, 20 Apr 2018, Vincent Lefevre wrote:
On 2018-04-20 04:14:15 +0200, Fredrik Johansson wrote:
For operands with 1-4 limbs, that is; on my machine, mpn_mul takes up to
twice as long as mpn_mul_basecase, and inline assembly for 1x1, 2x1 or 2x2
multip
On Fri, 20 Apr 2018, Vincent Lefevre wrote:
On 2018-04-20 04:14:15 +0200, Fredrik Johansson wrote:
For operands with 1-4 limbs, that is; on my machine, mpn_mul takes up to
twice as long as mpn_mul_basecase, and inline assembly for 1x1, 2x1 or 2x2
multiplication is even faster. The problem is th
On 2018-04-20 04:14:15 +0200, Fredrik Johansson wrote:
> For operands with 1-4 limbs, that is; on my machine, mpn_mul takes up to
> twice as long as mpn_mul_basecase, and inline assembly for 1x1, 2x1 or 2x2
> multiplication is even faster. The problem is that there are three function
> calls (mpn_m
On Fri, 20 Apr 2018, paul zimmermann wrote:
Only 0 can have lazy allocation, and I think we document that it isn't
legal to put 0 on the denominator.
where is this documented?
That was in a "I think" sentence. Now that I looked a bit more, I don't
find it... Well, you can't call any mpq fun
> Only 0 can have lazy allocation, and I think we document that it isn't
> legal to put 0 on the denominator.
where is this documented? In mpfr_set_q we use the fact that the user can set q
to 1/0, for example, to represent +Inf.
Paul
___
gmp-devel mail
For operands with 1-4 limbs, that is; on my machine, mpn_mul takes up to
twice as long as mpn_mul_basecase, and inline assembly for 1x1, 2x1 or 2x2
multiplication is even faster. The problem is that there are three function
calls (mpn_mul -> mpn_mul_n -> mpn_mul_basecase) + branches between the
use
On Fri, 20 Apr 2018, Marco Bodrato wrote:
Ciao,
On Thu, 19 April 2018 4:37 pm, Marc Glisse wrote:
I finally pushed it. It seemed unsafe to keep mpq unaware of lazy
allocation, in case people start swapping the numerator of a rational with
a lazy 0 integer or something like that.
If we