Re: [Haskell-cafe] Re: Code and Perf. Data for Prime Finders (was: Genuine Eratosthenes sieve)

Lennart Augustsson Sun, 25 Feb 2007 09:19:22 -0800

Here's another program you can add.  It's fairly short and efficient.


        -- Lennart

import System.Environment (getArgs)

infixr :>

data StreamInt = !Int :> StreamInt

(!>) :: StreamInt -> Int -> Int
(x :> _)  !> 0 = x
(_ :> xs) !> n = xs !> (n-1)

-- By replacing lprimes on the next line by '5 :> gen 7 4 2' this algorithm
-- runs in very little space, but is somewhat slower.
primes = 2 :> 3 :> lprimes
  where isPrime (p:>ps) n = n `rem` p /= 0 && (p*p > n || isPrime ps n)
        lprimes = 5 :> gen 7 4 2
        gen n a b = if isPrime lprimes n then n :> gen (n+a) b a else gen (n+a) b a


printNthPrime n = print (n, primes !> (n-1))

main = do
    args <- getArgs
    printNthPrime $ read $ head args
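
The comment in the program above mentions a lower-space variant. Read literally, it replaces lprimes in the definition of primes, so the output stream no longer shares its tail with the stream of trial divisors. The following is a minimal sketch of that reading, not code from the original post; the name primesLowSpace and the explicit fixity level are assumptions made for illustration.

```haskell
infixr 5 :>

-- A stream of Ints with a strict head, as in the program above.
data StreamInt = !Int :> StreamInt

-- Zero-based indexing into a stream.
(!>) :: StreamInt -> Int -> Int
(x :> _)  !> 0 = x
(_ :> xs) !> n = xs !> (n - 1)

-- Hypothetical name. The output stream is generated independently of
-- 'lprimes', which is only ever forced up to about the square root of the
-- candidate being tested.
primesLowSpace :: StreamInt
primesLowSpace = 2 :> 3 :> 5 :> gen 7 4 2
  where
    -- Trial division by the primes in the given stream, stopping once p*p > n.
    isPrime (p :> ps) n = n `rem` p /= 0 && (p * p > n || isPrime ps n)
    lprimes = 5 :> gen 7 4 2
    -- Candidates step by 4, 2, 4, 2, ... which skips multiples of 2 and 3.
    gen n a b
      | isPrime lprimes n = n :> gen (n + a) b a
      | otherwise         = gen (n + a) b a
```

In GHCi, primesLowSpace !> 9 evaluates to 29, the tenth prime. The candidates are tested exactly as before; the small primes are simply generated twice, which is presumably where the "somewhat slower" comes from.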



On Feb 25, 2007, at 12:51 , Melissa O'Neill wrote:

For those enjoying the fun with prime finding, I've updated the source at
    http://www.cs.hmc.edu/~oneill/code/haskell-primes.zip

I've tweaked my code a little to improve its space behavior when finding primes up to some limit, added an up-to-limit version of the Naive Primes algorithm, and added Oleg's prime finding code too.

I also got a chance to look at space usage more generally. I won't reproduce a table here, but the conclusions were more-or-less what you'd expect. The "unlimited list" algorithms used O(n) space to find n primes (except for Runciman's algorithm, which appeared to be much worse), and the "primes up to a limit" algorithms used O(sqrt(n)) space to find the nth prime.

Both of these are better than the classic C algorithm, which uses O(n log n) space to find the nth prime. For example, heap profiling shows that my own O(sqrt(n)) algorithm uses only 91200 bytes to find the 10^7th prime, whereas the classic C algorithm needs at least 11214043 bytes for its array -- a difference of more than a factor of 100, and one that gets worse for larger n.
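
The 11214043-byte figure is consistent with a bit array that stores one bit per odd number up to the 10^7th prime. The sketch below checks that arithmetic. It assumes the 10^7th prime is 179424673, a value not stated in the message, and the function name bitArrayBytes is hypothetical.

```haskell
-- Bytes needed for a bit array holding one bit per odd number up to 'limit'
-- (inclusive), rounded up to a whole byte.
bitArrayBytes :: Integer -> Integer
bitArrayBytes limit = (oddCount + 7) `div` 8
  where
    oddCount = (limit + 1) `div` 2

-- Assumed value of the 10^7th prime; it does not appear in the message.
tenMillionthPrime :: Integer
tenMillionthPrime = 179424673

-- Size ratio against the 91200 bytes quoted for the O(sqrt(n)) algorithm,
-- rounded down.
spaceRatio :: Integer
spaceRatio = bitArrayBytes tenMillionthPrime `div` 91200
```

Under that assumption, bitArrayBytes tenMillionthPrime evaluates to 11214043 and spaceRatio to 122, which matches "a factor of more than 100".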
Lennart Augustsson wrote:
> Another weird thing is that much of the Haskell code seems to work with Integer whereas the C code uses int.

Originally, I was comparing Haskell with Haskell, and for that purpose I wanted to have a level playing field, so going with Integer everywhere made sense.

> That doesn't seem fair.

Actually, to the extent that any of the comparisons are "fair", I think this one is too. After all, typical Haskell code uses Integer and typical C code uses int. I could use arrays in my Haskell code and never use laziness, but when I program in Haskell, I'm not trying to exactly recreate C programs, but rather write their Haskell equivalents. For example, to me, producing a lazy list was essential for a true Haskell feel. For some people, the "Haskell feel" also includes treating the language as a declarative specification language where brevity is everything -- but for me, other things (like fundamental algorithmic efficiency and faithfulness to the core ideas that make the Sieve of Eratosthenes an *efficient* algorithm) are universal and ought to be common to both C and Haskell versions.

But to allow a better comparison with C, I've added a run for an Int version of my algorithm. With that change, my code is closer to the speed of the C code. More interestingly, for larger n, I seem to be narrowing the gap. At 10^6, my code runs nearly 30 times slower than the classic C version, but at 10^8, I'm only about 20 times slower. This is especially interesting to me because there was some (reasonable looking) speculation from apfelmus several days ago that suggested that my use of a priority queue incurred an extra log(n) overhead, from which you would expect a worse asymptotic complexity, not equivalent or better.
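
For readers who have not seen the priority-queue idea mentioned above, here is a simplified sketch of an incremental sieve. It is not the code from the zip file: it uses Data.Map from the containers package as a stand-in for a real priority queue, and the name incrementalPrimes is hypothetical. Each map operation costs O(log n) in the size of the table, which is the kind of overhead the speculation refers to.

```haskell
import qualified Data.Map as Map

-- Hypothetical name. The table maps each upcoming composite to the primes
-- whose next tracked multiple it is.
incrementalPrimes :: [Integer]
incrementalPrimes = go [2 ..] Map.empty
  where
    go [] _ = []
    go (x : xs) table =
      case Map.lookup x table of
        -- x is not a recorded composite, so it is prime. Its first multiple
        -- worth tracking is x*x; smaller ones have smaller prime factors.
        Nothing -> x : go xs (Map.insert (x * x) [x] table)
        -- x is composite. Move each recorded prime on to its next multiple.
        Just ps -> go xs (foldl reinsert (Map.delete x table) ps)
          where
            reinsert t p = Map.insertWith (++) (x + p) [p] t
```

In GHCi, take 10 incrementalPrimes evaluates to [2,3,5,7,11,13,17,19,23,29]. Unlike trial division, no candidate is ever divided by anything; composites are found only by stepping each prime through its multiples.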
    Melissa.

Enc. (best viewed with a fixed-width font)

   -------------------------------------------------------------------
                        Time (in seconds) for Number of Primes
                  ----------------------------------------------------
   Algorithm         10^3     10^4     10^5     10^6     10^7     10^8
   -------------------------------------------------------------------
   C-Sieve           0.00     0.00     0.01     0.29     5.12    88.24
   O'Neill (#3)      0.01     0.04     0.55     8.34   122.62  1779.18
   O'Neill (#2)      0.01     0.06     0.95    13.85   194.96  2699.61
   O'Neill (#1)      0.01     0.07     1.07    15.95   230.11        -
   Bromage           0.02     0.39     6.50   142.85        -        -
   "sieve" (#3)      0.01     0.25     7.28   213.19        -        -
   Naive (#2)        0.02     0.59    14.70   386.40        -        -
   Naive (#1)        0.32     0.66    16.04   419.22        -        -
   Runciman          0.02     0.74    29.25        -        -        -
   Reinke            0.04     1.21    41.00        -        -        -
   Zilibowitz        0.02     2.50   368.33        -        -        -
   Gale (#1)         0.12    17.99        -        -        -        -
   "sieve" (#1)      0.16    32.59        -        -        -        -
   "sieve" (#2)      0.01    32.76        -        -        -        -
   Oleg              0.18    68.40        -        -        -        -
   Gale (#2)         1.36   268.65        -        -        -        -
   -------------------------------------------------------------------
- The dashes in the table mean "I gave up waiting" (i.e., > 500 seconds)
- "sieve" (#1) is the classic example we're all familiar with
- "sieve" (#2) is the classic example, but sieving a list without multiples of 2, 3, 5, or 7 -- notice how it makes no real difference
- "sieve" (#3) is the classic example, but generating a lazy-but-finite list (see below)
- O'Neill (#1) is basically the algorithm of mine discussed in http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf, with a few minor tweaks
- O'Neill (#2) is a variant of that algorithm that generates a lazy-but-finite list of primes
- O'Neill (#3) is a variant of that algorithm that uses Ints when it can get away with it
- Naive (#1) is the non-sieve-based "divide by every prime up to the square root" algorithm for finding primes (called SimplePrimes in the source)
- Naive (#2) is the same algorithm, with a limit on the number of primes
- Runciman is Colin Runciman's algorithm, from his _Lazy Wheel Sieves and Spirals of Primes_ paper
- Reinke is the ``applyAt'' algorithm Claus Reinke posted here
- Gale (#1) is Yitz Gale's deleteOrd algorithm
- Gale (#2) is Yitz Gale's crossOff algorithm
- Oleg is [EMAIL PROTECTED]'s algorithm, as posted to Haskell Cafe
- Zilibowitz is Ruben Zilibowitz's GCD-based primes generator, as posted on Haskell-Cafe
- Bromage is Andrew Bromage's implementation of the Atkin-Bernstein sieve. Like O'Neill (#2) and "sieve" (#3), it asks for some upper limit on the number of primes it generates. Unlike O'Neill (#2) and "sieve" (#3), it uses arrays, and the upper limit causes a large initial array allocation. Also, unlike the other Haskell algorithms, it does not produce a lazy list; no output is produced until sieving is complete
- C-Sieve is a "typical" simple implementation of the sieve in C found with Google; it skips multiples of 2 and uses a bit array. Also, obviously, it doesn't produce incremental output.
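
As a reminder of what two of the entries above refer to, here is a sketch of the widely quoted "classic" lazy sieve and of trial division by every prime up to the square root. These are the commonly seen textbook forms, assumed to correspond to "sieve" (#1) and Naive (#1); they are not copied from the zip file, and the names classicSieve and naivePrimes are hypothetical.

```haskell
-- The widely quoted "classic" lazy sieve: every surviving candidate is
-- filtered by every prime found so far.
classicSieve :: [Integer]
classicSieve = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
    sieve []       = []

-- Trial division: test each odd candidate against the primes found so far,
-- stopping at the square root of the candidate.
naivePrimes :: [Integer]
naivePrimes = 2 : filter isPrime [3, 5 ..]
  where
    isPrime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p * p <= n) naivePrimes)
```

Both definitions produce the same list; take 10 of either gives [2,3,5,7,11,13,17,19,23,29]. The difference is in how much work each candidate costs, which is what the timings in the table reflect.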
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

