Comment #37 on issue 3809 by dana.jac...@gmail.com: isprime can be faster
http://code.google.com/p/sympy/issues/detail?id=3809
I've been working on primality tests and proofs for a Perl module, so this
is something I've been looking at a lot recently. Here are my long-winded
thoughts, with the caveat that I'm not a sympy developer.
Comment 1 & 2 mention AKS. AKS is theoretically important, but is still
basically useless in any practical sense. It is insanely slow, even with
Lenstra and Bernstein's improvements to the 'r' selection. Brent 2010
estimates 840 years for a 308 digit number (on a 1GHz machine, but even
with 100x faster hardware that's ridiculous). For the deterministic
variants I think you'll find that time isn't out of line.
Comment 3 mentions the "pypy ran a deterministic test in 18 minutes"
comment. He ran a single Fermat test, which is much less accurate than a
single M-R test. Try 561 through his function. or 1105, 1729, 2465, etc.
So after 18 minutes we've not accomplished very much. This is definitely
not the direction to go.
For small numbers (<2^64), you can use deterministic M-R tests, such as the
ones from https://miller-rabin.appspot.com/. That would be faster than the
existing code and remove the table of exceptions. BPSW would probably be
faster for the upper range. BPSW (standard, strong, extra strong, or the
oddball variants some packages implement) has no counterexamples < 2^64,
and this is typically the same speed or faster than running 4 M-R tests.
With my native (C without GMP) implementations I cut over to BPSW if the
number is over 32 bit in size -- your mileage may vary (this is purely a
benchmark-oriented performance decision at this size).
For large values, I think you really should use a variant of BPSW (see
Nicely's page or
http://en.wikipedia.org/w/index.php?title=Baillie%E2%80%93PSW_primality_test).
Here is a number from Arnault 1993 and another from Arnault 1995 that fool
sympy's current implementation:
8038374574536394912570796143419421081388376882875581458374889175222974273765333652186502336163960045457915042023603208766569966760987284043965408232928738791850869166857328267761771029389697739470167082304286871099974399765441448453411558724506334092790222752962294149842306881685404326457534018329786111298960644845216191652872597534901
2887148238050771212671429597130393991977609459279722700926516024197432303799152733116328983144639225941977803110929349655578418949441740933805615113979999421542416933972905423711002751042080134966731755152859226962916775325475044445856101949404200039904432116776619949629539250452698719329070373564032273701278453899126120309244841494728976885406024976768122077071687938121709811322297802059565867
Running lots of M-R tests to many fixed bases like sympy's current
implementation or libtommath does (and hence Perl 6) is problematic.
Arnault 1995 gives an example (the second above) that will pass MR tests
with all bases up to 306. That paper gives instructions on how to
construct such numbers. Using random bases mostly solves this problem.
However, we're still left with it being terribly inefficient -- running 46
M-R tests is 10-15 times slower than a BPSW test. If you're feeling
paranoid about letting a pseudoprime through, you can add a couple more
random-base M-R tests (only Mathematica and Math::Prime::Util seem to go to
this length that I'm aware of).
Why BPSW? Nicely's page should cover lots of the reasons. Almost all math
packages switched to using it, mostly in the 1990s. Maple, Mathematica
(with an added base 3 M-R test), Pari, SAGE, Maxima, FLINT, MPIR, PRIMO are
some examples that use a BPSW variant for their probable prime test. In 33
years nobody has published a concrete counterexample (though we know they
exist). For some reason the tests act anti-correlated.
Python's gmpy2 includes David Cleaver's C+GMP code for primality functions
including Lucas tests and a strong BPSW test, but is_prime still uses
mpz_probab_prime_p. I spoke to some of the Perl 6 people last month, and I
need to write up some code for NQP as well as start a discussion on the API.
Here are examples and times on my computer. The Perl tests use
Math::Prime::Util + Math::Prime::Util::GMP. This is C+GMP under the hood,
just like the gmpy and gmpy2 modules. Python 2.7. Based on the time
required for one run of mr with the latest git version (that uses pow(base,
t, n)), it looks like this could be done in a reasonable length of time.
# sympy from git, mr(int("1808010808"*1560 + "1"),[2])
12.2s
# sympy from git, isprime(int("1808010808"*1560 + "1"))
9 minutes 16.4s
Hmm, this is a few minutes. Of course I have the latest code so it's
presumably much more efficient than earlier, but I'm concerned about the
comment "[...]there is no way (with MR) that the prime given in the OP will
return in a few minutes." Let's look at some other systems.
# Pari/gp 2.6.0 ispseudoprime(....) uses a BPSW variant
60.12s
# gmpy 1.15's is_prime (mpz_probab_prime_p(n,25), or M-R with 25
fixed-random bases)
# Installed via Fedora
8 minutes 29.3s
# gmpy 2.0's is_prime (mpz_probab_prime_p(n,25), or M-R with 25
fixed-random bases)
# Built from sources along with latest MPFR, MPC, and GMP.
5 minutes 12.8s
# gmpy 2.0 is_strong_prp(int("1808010808"*1560 + "1"),2)
12.03s
# gmpy 2.0 is_selfridge_prp(int("1808010808"*1560 + "1"))
64.95s
# gmpy 2.0 is_strong_selfridge_prp(int("1808010808"*1560 + "1"))
64.85s
I wasn't able to get gmpy2's is_strong_bpsw_prp function to work, but it
would be about 77s based on the two tests it has to do.
My Perl module, with GMP implementations that are a little faster than the
ones in gmpy2:
# Just the SPRP-2 test:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_strong_pseudoprime($n,2);'
12.06s
# Strong Lucas-Selfridge test:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_strong_lucas_pseudoprime($n);'
47.17s
# Extra Strong Lucas test -- typically a bit faster:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_extra_strong_lucas_pseudoprime($n);'
31.84s
# is_prob_prime includes some quick tests for weeding out composites, then
SPRP-2 and strong Lucas-Selfridge:
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_prob_prime($n);'
59.32s
# is_prime is is_prob_prime plus 2-5 extra M-R tests (2 for this size):
time perl -Mbigint -MMath::Prime::Util=:all -E 'my $n = 0+("1808010808" x
1560 . "1"); say is_prime($n);'
82.97s
So it's possible to do a BPSW variant (SPRP-2 + Grantham Extra Strong test)
and have it done in under 44 seconds, assuming modular math including
exponentiation is fast. Add ~12s per extra M-R test if you want.
--
You received this message because this project is configured to send all
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings
--
You received this message because you are subscribed to the Google Groups
"sympy-issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to sympy-issues+unsubscr...@googlegroups.com.
To post to this group, send email to sympy-issues@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy-issues.
For more options, visit https://groups.google.com/groups/opt_out.