I hacked together a version that I'm pretty happy with today. Started
off trying an algorithm with channels and forking, then realized that
in Haskell thanks to referential transparency we can get parallelism
almost for free, and redid it all in Control.Parallel (below).
Unfortunately, I don't have a multicore processor so I can't put this
through any special paces. However, its compactness and expressively
match or beat the simple Ruby, etc. scripts while it gets
(theoretically) most of the parallel benefits of the enormous and
unwieldy Erlang and JOcaml ones.
--S
module Main where
import qualified Data.ByteString.Lazy.Char8 as LB
import Data.List (foldl', unfoldr, insertBy)
import qualified Data.Map as M
import System.Environment (getArgs)
import Control.Parallel (par)
import Control.Parallel.Strategies (parMap, rwhnf)
count :: M.Map LB.ByteString Int -> LB.ByteString -> M.Map
LB.ByteString Int
count m line = if LB.pack "/ongoing/When" `LB.isPrefixOf` myLn then
M.insertWith' (+) (LB.drop 14 myLn) 1 m else m
where myLn = (LB.takeWhile (/=' ') . LB.dropWhile (/='/') .
LB.dropWhile (/='\"')) line
mapUnionPar :: (Ord k, Num a) => [M.Map k a] -> M.Map k a
mapUnionPar m = head $ until (null . tail) mapUnionPar' m
where a |:| b = par a . par b $ a : b
mapUnionPar' (x:x':xs) = (M.unionWith (+) x x' |:|
mapUnionPar' xs)
mapUnionPar' x = x
newPar :: FilePath -> IO (M.Map LB.ByteString Int)
newPar = ((mapUnionPar . parMap rwhnf (foldl' count M.empty) .
chunkify . LB.lines) `fmap`) . LB.readFile
where chunkify = unfoldr (\x -> if null x then Nothing else Just
(splitAt 512 x))
main = mapM_ ((print . fst . foldl' takeTop ([],[]) . M.toList =<<) .
newPar) =<< getArgs
where takeTop ac@(bs,low) a = if null low || (snd . head) low <
snd a then (splitAt 10 . insertBy ((. snd) . flip compare . snd) a)
bs else ac
On Nov 7, 2007, at 9:06 AM, Bayley, Alistair wrote:
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of manu
Haskell is conspicuously absent from the languages used to tackle Tim
Bray's Wide Finder problem
(http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results?
updated).
So far we have Ocaml, Erlang, Python, Ruby, etc...
Tim Bray mentions that GHC won't build on Solaris, so presumably that
problem would need to be solved before Haskell appears in his table. I
see that there are Solaris binary packages:
http://www.haskell.org/ghc/download_ghc_661.html#sparcsolaris
so perhaps he just needs to be pointed to them?
Alistair
*****************************************************************
Confidentiality Note: The information contained in this message,
and any attachments, may contain confidential and/or privileged
material. It is intended solely for the person(s) or entity to
which it is addressed. Any review, retransmission, dissemination,
or taking of any action in reliance upon this information by
persons or entities other than the intended recipient(s) is
prohibited. If you received this in error, please contact the
sender and delete the material from any computer.
*****************************************************************
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe