Ciao,

On 2020-03-22 10:00, Seth Troisi wrote:
> I measure the regression as ~5-15% for input < ~10,000

And no regression for larger input? Good.

> I have a proposed patch which uses int, and trial div up to some
> threshold; this appears to be 5x faster.

A couple of questions arise.
5x faster than what? Than the patch I proposed or than the old/current code?

Which one is the good idea: using int, or avoiding the sieve?

Using unsigned long to compute the modulus is obviously faster than many calls to mpz_cdiv_ui. I tried a patch that is surely not as fast as yours, but it should speed up all numbers that fit in a single unsigned long. The speed-up seems small; I'm not sure it's worth it.
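
Something along these lines, say (a minimal sketch of the idea, not my actual patch; small_primes[] and nprimes stand for whatever table is at hand):

#include <gmp.h>
#include <stddef.h>

/* 1: n is composite, 0: n is prime, -1: fall back to the generic
   mpz_cdiv_ui loop.  Assumes n > 1.  */
static int
trial_div_ui (const mpz_t n, const unsigned short *small_primes,
              size_t nprimes)
{
  unsigned long m;
  size_t i;

  if (!mpz_fits_ulong_p (n))
    return -1;                  /* more than one word */
  m = mpz_get_ui (n);           /* fetch the value once */
  for (i = 0; i < nprimes; i++)
    {
      unsigned long p = small_primes[i];
      if (p * p > m)
        return 0;               /* no divisor up to sqrt(m): m is prime */
      if (m % p == 0)           /* native modulus, no mpz_cdiv_ui call */
        return m != p;
    }
  return -1;                    /* table exhausted before sqrt(m) */
}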

> The cut-off point as I measure it is ~80,000,000 but requires
> increasing the primegap_small list to ~1100 entries.
>
> I'm not sure what you think about a list of that length.

Well, the cut-off point is surely different on different architectures. It should be tuned. Moreover: a large list... can't it be generated on the fly? If we want a larger table, in my opinion it should be an improvement for all the functions that need primes (_primorial, _bin, _fac, ...).
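
Something like this could build the gaps at start-up, for instance (a rough sketch with the usual Eratosthenes sieve; names, byte representation and error handling are just illustrative):

#include <stdlib.h>

static unsigned char *
build_primegap_table (unsigned long limit, size_t *ngaps)
{
  unsigned char *composite = calloc (limit + 1, 1);
  unsigned char *gaps = malloc (limit / 4 + 16); /* loose bound on primepi(limit) */
  unsigned long i, j, last = 2;
  size_t k = 0;

  for (i = 3; i <= limit; i += 2)       /* even numbers never stored */
    if (!composite[i])
      {
        gaps[k++] = (unsigned char) (i - last); /* gaps fit in a byte up to ~4*10^8 */
        last = i;
        if (i <= limit / i)             /* avoid overflow of i*i */
          for (j = i * i; j <= limit; j += 2 * i)
            composite[j] = 1;
      }

  free (composite);
  *ngaps = k;
  return gaps;
}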

> It might also be useful to commit tune_nextprime_small, which
> dynamically selects a higher reps count for
> mpz_nextprime input and helps increase precision.

Uhm...
j = 200000 / s->size;

This means that:
for size=10 you compute 20000 primes: all primes between 2^10 and 2^18.
for size=11: all primes between 2^11 and 2^18.
for size=12... and so on up to 2^18.

This is not exactly representative of the speed of nextprime on operands around 2^10, 2^11... and so on.
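
To make the issue concrete, this is roughly what a loop with reps j = 200000 / size amounts to, if each repetition starts from the previous result (an illustrative sketch, not the actual tune_nextprime_small code):

#include <gmp.h>

void
bench_nextprime (mpz_t x, unsigned long size)
{
  unsigned long j = 200000 / size;

  mpz_set_ui (x, 1);
  mpz_mul_2exp (x, x, size - 1);        /* smallest size-bit value: 2^(size-1) */
  while (j-- > 0)
    mpz_nextprime (x, x);               /* by the last calls, x is near 2^18 for size=10 */
}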


PS: a comment in the current code says
 /* Larger threshold is faster but takes ~O(n/ln(n)) memory.
To be honest, the array sieve needs n/24 bytes, and the array primegap_tmp uses primepi(n) ~= n/ln(n) bytes; asymptotically, the sum of the two is O(n).
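
Just to attach numbers to that (rough figures of mine, not measurements): with n = 10^8, the sieve takes n/24 ~ 4.2 MB, and primepi(10^8) = 5761455, so primegap_tmp needs about 5.8 MB more; at these magnitudes the two terms are still comparable.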

Ĝis,
m