apfelmus quantentunnel.de> writes:
>
~~ This is a repost, with apologies to anyone who sees this twice (I've replied
to a two years old thread, and it doesn't show up in GMANE as I thought it
would). ~~
> Dave Bayer wrote:
> > What I'm calling a "venturi"
> >
> >venturi :: Ord a => [[a]] -> [a]
> >
> > merges an infinite list of infinite lists into one list, under the
> > assumption that each list, and the heads of the lists, are in
> > increasing order.
> >
> > I wrote this as an experiment in coding data structures in Haskell's
> > lazy evaluation model, rather than as explicit data. The majority of the
> > work done by this code is done by "merge"; the multiples of each prime
> > percolate up through a tournament consisting of a balanced tree of
> > suspended "merge" function calls. In order to create an infinite lazy
> > balanced tree of lists, the datatype
> >
> >data List a = A a (List a) | B [a]
> >
> > is used as scaffolding. One thinks of the root of the infinite tree as
> > starting at the leftmost child, and crawling up the left spine as
> > necessary.
>
> After some pondering, the List a data structure for merging is really
> ingenious! :) Here's a try to explain how it works:
>
> The task is to merge a number of sorted and infinite lists when it's
> known that their heads are in increasing order. In particular, we want
> to write
>
> primes = (2:) $ diff [3..] $ venturi $ map multiple primes
>
> Thus, we have a (maybe infinite) list
>
> xss = [xs1, xs2, xs3, ...]
>
> of infinite lists with the following properties
>
> all sorted xss
> sorted (map head xss)
>
> where sorted is a function that returns True if the argument is a
> sorted list. A first try to implement the merging function is
>
> venturi xss = foldr1 merge xss
> = xs1 `merge` (xs2 `merge` (xs3 `merge` ...
>
> where merge is the standard function to merge to sorted lists.
>
> However, there are two problems. The first problem is that this doesn't
> work for infinite lists since merge is strict in both arguments. But
> the property head xs1 < head xs2 < head xs3 < ... we missed to exploit
> yet can now be used in the following way
>
> venturi xss = foldr1 merge' xss
>
> merge' (x:xt) ys = x : merge xt ys
>
> In other words, merge' is biased towards the left element
>
> merge' (x:_|_) _|_ = x : _|_
>
> which is correct since we know that (head xs < head ys).
>
> The second problem is that we want the calls to merge to be arranged
> as a balanced binary tree since that gives an efficient heap. It's not
> so difficult to find a good shape for the infinite tree, the real
> problem is to adapt merge' to this situation since it's not associative:
>
> ..
>
> The problem is that the second form doesn't realize that y is also
> smaller than the third argument. In other words, the second form has to
> treat more than one element as "privileged", namely x1,x2,... and y.
> This can be done with the aforementioned list data structure
>
> data People a = VIP a (People a) | Crowd [a]
>
> The people (VIPs and crowd) are assumed to be _sorted_. Now, we can
> start to implement
>
> merge' :: Ord a => People a -> People a -> People a
Hi,
... replying to a two-years-old post here, :) :) and after consulting the
full "VIP" version in haskellwiki/Prime_Numers#Implicit_Heap ...
It is indeed the major problem with the merged multiples removing code (similar
one to Richard Bird's code from Melissa O'Neill's JFP article) - the linear
nature of foldr, requiring an (:: a->b->b) merge function. To make it freely
composable to rearrange the list into arbitrary form tree it must indeed be
type uniform (:: a->a->a) first, and associative second.
The structure of the folded tree should be chosen to better suit the primes
multiples production. I guestimate the total cost as Sum (1/p)*d, where p is a
generating prime at the leaf, and d the leaf's depth, i.e. the amount of merge
nodes its produced multiple must pass on its way to the top.
The structure used in your VIP code, 1+(2+(4+(8+...))), can actually be
improved upon with another, (2+4)+( (4+8)+( (8+16)+...)), for which the
estimated cost is about 10%-12% lower.
This can be expressed concisely as the following:
primes :: () -> [Integer]
primes () = 2:primes'
where
primes'= [3,5] ++ drop 2 [3,5..] `minus` comps
mults = map (\p-> fromList [p*p,p*p+2*p..]) $ primes'
(comps,_) = tfold mergeSP (pairwise mergeSP mults)
fromList (x:xs) = ([x],xs)
tfold f (a: ~(b: ~(c:xs)))
= (a `f` (b `f` c)) `f` tfold f (pairwise f xs)
pairwise f (x:y:ys) = f x y : pairwise f ys
mergeSP (a,b) ~(c,d) = let (bc,b') = spMerge b c
in (a ++ bc, merge b' d)
where
spMerge u@(x:xs) w@(y:ys) = case compare x y of
LT -> (x:c,d) where (c,d) = spMerge xs w
EQ -> (x:c,d) where (c,d) = spMerge xs ys