Melissa, thank you for taking my comments as constructively as they were meant.

and not to worry, i'm sure i'm not the only one who enjoyed this little
exercise and learned a lot from it!-) and even if we've found more than
you expected, you're quite right, one doesn't get anywhere unless someone
starts: noticing that something interesting is going on, raising questions
about it, supplying the first answers, and documenting the results when
the discussion quiets down.

I think modulus is a bit of a red herring -- it's a symptom, not the disease.

you mean i'm distracting the primates with smoked fish?-) perhaps you're
right, but let me illustrate what i meant anyway. assuming that mod is not
a primitive, but has to be computed somehow, perhaps by a fragment of gcd,
here's a look at the recursion for computing x `mod` y:

   -- repeated subtraction, recording each (remainder, divisor) step
   mods :: Integer -> Integer -> [(Integer,Integer)]
   mods a b | b>a       = [(a,b)]
            | otherwise = (a,b) : mods (a-b) b
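for reference, the folklore code i have in mind is the familiar two-liner
(reconstructed from memory, so possibly not character-for-character the
version posted earlier in the thread):

```haskell
-- the "folklore sieve": each prime p spawns a filter removing
-- its multiples from the remaining candidates
primes :: [Integer]
primes = sieve [2..]
  where sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
```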

now, if we take that folklore code again, and have a look at the sieves
for 5 and 7, after 70 has already been filtered out by the sieve for 2
(first column has the input list, followed by the unfolded mod-recursion
for each element):

   Main> putStrLn $ unlines [show $ mods x 5|x<-[3,5..90],(x`mod`3)>0]
   [(5,5),(0,5)]
   [(7,5),(2,5)]
   ..
   [(67,5),(62,5),(57,5),(52,5),(47,5),(42,5),(37,5),(32,5),(27,5),(22,5),(17,5),(12,5),(7,5),(2,5)]
   [(71,5),(66,5),(61,5),(56,5),(51,5),(46,5),(41,5),(36,5),(31,5),(26,5),(21,5),(16,5),(11,5),(6,5),(1,5)]
   ..
   [(85,5),(80,5),(75,5),(70,5),(65,5),(60,5),..,(20,5),(15,5),(10,5),(5,5),(0,5)]
   [(89,5),(84,5),(79,5),(74,5),(69,5),(64,5),(59,5),..,(19,5),(14,5),(9,5),(4,5)]

   Main> putStrLn $ unlines [show $ mods x 7|x<-[3,5..90],(x`mod`3)>0,x`mod`5>0]
   [(7,7),(0,7)]
   [(11,7),(4,7)]
   ..
   [(67,7),(60,7),(53,7),(46,7),(39,7),(32,7),(25,7),(18,7),(11,7),(4,7)]
   [(71,7),(64,7),(57,7),(50,7),(43,7),(36,7),(29,7),(22,7),(15,7),(8,7),(1,7)]
   [(73,7),(66,7),(59,7),(52,7),(45,7),(38,7),(31,7),(24,7),(17,7),(10,7),(3,7)]
   [(77,7),(70,7),(63,7),(56,7),(49,7),(42,7),(35,7),(28,7),(21,7),(14,7),(7,7),(0,7)]
   [(79,7),(72,7),(65,7),(58,7),(51,7),(44,7),(37,7),(30,7),(23,7),(16,7),(9,7),(2,7)]
   [(83,7),(76,7),(69,7),(62,7),(55,7),(48,7),(41,7),(34,7),(27,7),(20,7),(13,7),(6,7)]
   [(89,7),(82,7),(75,7),(68,7),(61,7),(54,7),(47,7),(40,7),(33,7),(26,7),(19,7),(12,7),(5,7)]

we find that 70 is indeed revisited by both sieves (eg, via the recursions
for 85 and 77), even though it is no longer in the input list (first
columns).

by aligning the mod-recursion with the walk through the list of prime
candidates, we save ourselves a lot of recomputation by sharing: eg, the
recursion for mods 91 7 overlaps with, and can share, that of mods 77 7,
and so on

   Main> mods 91 7
   [(91,7),(84,7),(77,7),(70,7),(63,7),(56,7),(49,7),(42,7),(35,7),(28,7),(21,7),(14,7),(7,7),(0,7)]

which is what i meant by incrementally computing the modulus (if i know
the result of mods 77 7, i can use it to compute mods 91 7), instead of
doing it from scratch for each candidate. both versions have to pass by 70
at least three times, even though the first filter is sufficient to remove
that number as a candidate, again in both versions.
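a minimal sketch of the idea (the name remainders is mine, purely for
illustration): walking an arithmetic progression of candidates, each
remainder follows from the previous one by an addition and a small
reduction, rather than a division from scratch:

```haskell
-- sketch only: incremental modulus along the candidates x0, x0+step, ..
-- knowing r = x `mod` p, the next remainder is (r + step) `mod` p --
-- for step < p, that is one addition and at most one subtraction
remainders :: Integer -> Integer -> Integer -> [Integer]
remainders p x0 step = iterate next (x0 `mod` p)
  where next r = (r + step) `mod` p
```

so knowing that mods 77 7 ends in remainder 0, the remainder for 91 is
(0+14) `mod` 7 = 0, and so on, without redoing the whole recursion.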

different people think differently, and i found the idea of incremental vs
from-scratch modulus helpful in explaining the performance differences,
and the idea of explicit vs implicit sieves helpful in explaining the
resource usage difference between the folklore and the so-called naive
version. but others might find the cross-out footprint more helpful.

going beyond simple sieves, wheel sieves are all about exploiting the
advantages of incremental computation, but they still make their wheels
explicit, and suffer resource usage problems, which indirectly harm their
performance. i still suspect that there's an incremental and implicit
variant of the incremental, but explicit, wheel sieve.
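for concreteness, the smallest non-trivial explicit wheel (rolling past
the multiples of 2 and 3) can be written as a cyclic candidate stream -- a
standard textbook fragment, not code from this thread:

```haskell
-- explicit 2-3 wheel: the numbers > 3 coprime to 6, generated by
-- repeating the gap pattern 2,4,2,4,.. -- the wheel is an explicit
-- data structure, which is where the resource usage goes
wheel23 :: [Integer]
wheel23 = 5 : 7 : map (+ 6) wheel23
```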

[transforming sieve variants into each other]
At some point in the middle, you'll have something that can be viewed from both perspectives. But the existence of such a progression of algorithms between algorithm A and algorithm B doesn't mean that A and B are fundamentally the same.

i would hope that one of two things would happen: either one could get
from A to B via equivalence-preserving transformations (for a suitably
chosen form of equivalence), in which case A and B would be equivalent, or
one would need to do non-equivalence-preserving transformations at some
point in the middle, clearly exposing the implementation decisions that
distinguish B from A (if B is simply more concrete than A, prescribing
efficient behaviour where A merely allows it, then we are into the subject
of refinement).

Perhaps we're saying the same thing and misunderstanding each other,

quite possibly, so i'll stop here, with a closing remark

From an operational behavior standpoint, the ``folklore sieve'' just doesn't do the same thing. Every expectation people have about the behavior of the Sieve of Eratosthenes is confounded by the ``folklore sieve''. Although, if all you're looking for is a short and elegant-looking *specification* that brings to mind the Sieve of Eratosthenes, the ``folklore sieve'' will suit just fine. To me though, the Sieve of Eratosthenes doesn't exist as a specification for primality -- there are easier definitions -- it serves as a clever way to find primes (and factor composites) efficiently. The fundamental operational details matter.

indeed, there are specifications of properties (1), specifications of
algorithms (2), and specifications of operational behaviour (3), in order of
decreasing abstraction/increasing detail. functional languages are declarative,
but encourage/require us to be more concrete about operational behaviour
than, say, search-based languages. haskellers don't specify a property, then
hope that the implementation will find an algorithm, but neither do they usually
pin down the operational behaviour precisely enough to ensure efficient
realisation.
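to make the levels concrete with primes (standard textbook fragments, the
names are mine):

```haskell
-- (1) a property specification: primes are the numbers with no
--     smaller non-trivial divisor (executable, but hopeless)
primesSpec :: [Integer]
primesSpec = [n | n <- [2..], all (\d -> n `mod` d > 0) [2..n-1]]

-- towards (3): the same property, with one operational decision made
-- explicit -- stop testing divisors at the square root
primesOp :: [Integer]
primesOp = [n | n <- [2..], all (\d -> n `mod` d > 0)
                                (takeWhile (\d -> d*d <= n) [2..])]
```

same language, same property, very different operational behaviour.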

and it does often surprise people that (2) is easy and beautiful, but
potentially horribly inefficient, and that efficiency can be improved
substantially by being more concrete in the direction of (3), without
changing the language, but after investing some thought into the
difference between default behaviour and desired behaviour. and it did
surprise me to see so many different approaches, which got me started
thinking about how they might be linked to each other, rather than
invented separately. so, yes, i agree that a suitably expanded paper
documenting the issues for prime sieves, perhaps linking to similar cases,
such as the infamous quicksort, would be valuable.
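the quicksort case is the same story in miniature: the famous two-liner
(the usual list version, not anyone's code from this thread) is a lovely
algorithm specification, but operationally it is not Hoare's quicksort:

```haskell
-- beautiful as a specification of the algorithm, but no in-place
-- partitioning, and the appends and double traversals cost dearly
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (x:xs) = qsort [y | y <- xs, y < x]
               ++ [x] ++
               qsort [y | y <- xs, y >= x]
```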

claus

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
