bodr...@mail.dm.unipi.it writes:
> I suspect that one remaining optimisation for the smallest operands is
> to get rid of the final lshift. By being cleverer with 2 removal from
You can get rid of the final lshift (and all i2cnt and j2cnt update code)
by pre-computing the exponent of 2
bodr...@mail.dm.unipi.it writes:
There is one strength in the code I wrote for binomial: code recycling.
Yes, that's very nice.
For example: my basecase code to compute the odd component of the
factorial does not "shift out twos", it uses the following trick.
Let n be an integer. An
ni...@lysator.liu.se (Niels Möller) writes:
Would that work (I haven't thought very carefully about the math)? The
size of the needed tables is the same, and it ought to give a decent
speedup. fac_inv[k] holds the mod B inverse of the odd part of k!.
Sorry about the mail crossing...it wor
Torbjorn Granlund writes:
Unlike my bdiv_bin_uiui code, this code puts all multiplies and
"divides" on the same dependency chain. This might hurt performance on
some machines (with good multiply throughput and long latency).
Wait a minute...
Multiply is associative and commutative, eve
Torbjorn Granlund writes:
> c = n - k + 1;
> i = n - k + 2;
> for (j = 1; j < k; j++)
> {
> c = c * i;
> i++;
> c = c * zinv[j] >> ctz[j];
> }
> Is this the best way to handle really small numbers? This code cannot
> handle k > 32 (with present tables) and n grea
ni...@lysator.liu.se (Niels Möller) writes:
At least a table för k should take little memory. For each k < k', it
could tabulate not only k! but also a hensel inverse (of the odd part,
and the shift count) of one or two limbs. Should be useful also for the
case of large n and small k.
I
bodr...@mail.dm.unipi.it writes:
Ciao,
Il Dom, 11 Dicembre 2011 2:59 pm, Torbjorn Granlund ha scritto:
> bodr...@mail.dm.unipi.it writes:
>
> > Forgot to point to the updated code: See http://gmplib.org/devel/
>
> I can not find it...
>
> What happens? Cannot you find the
bodr...@mail.dm.unipi.it writes:
On shell, for binomial(1234,k) the faster algorithm is:
- with k < 14, the current GMP 5.0.1 basecase;
- with 14 < k < 540, your_bdiv_uiui code;
- with 540 < k (<620), the code using prime sieving.
Nice, more improvement that I had anticipated. It is
Ciao,
Il Dom, 11 Dicembre 2011 3:46 am, Torbjorn Granlund ha scritto:
> Now the code beats the repo code everywhere on my test machines. I
> suspect that your improved basecase code might still be somewhat better
> in some area, but perhaps the difference is negligible?
Great!
On shell, for bin
Ciao,
Il Dom, 11 Dicembre 2011 2:59 pm, Torbjorn Granlund ha scritto:
> bodr...@mail.dm.unipi.it writes:
>
> > Forgot to point to the updated code: See http://gmplib.org/devel/
>
> I can not find it...
>
> What happens? Cannot you find the header New binomial code, or are you
It probably was
bodr...@mail.dm.unipi.it writes:
> Forgot to point to the updated code: See http://gmplib.org/devel/ under
> New binomial code.
I can not find it...
What happens? Cannot you find the header New binomial code, or are you
saying that the link to the code does not work? (I checked the h
PS:
Il Dom, 11 Dicembre 2011 11:24 am, bodr...@mail.dm.unipi.it ha scritto:
> Let n be an integer. And let oddfac(n)= n! >> (n-popcount(n)) be the odd
> factor of the factorial of n. Then oddfac(n)=oddfac(floor(n/2))*(n-n%2)!!
> I.e. we can compute the product of odd numbers from 3 to n, times th
Ciao!
Torbjorn Granlund writes:
> (1) Now the number of allowable factors that fit into a limb is tabled.
> This table takes into account that the mulN functions now shift out
> twos.
There is one strength in the code I wrote for binomial: code recycling.
I think that this kind of
Torbjorn Granlund writes:
> We could actually walk down the factorial path and tabulate for n < n'
> and k < k', for some suitably small n', k'.
At least a table för k should take little memory. For each k < k', it
could tabulate not only k! but also a hensel inverse (of the odd part,
and the sh
14 matches
Mail list logo