On 25 October 2005 11:59, Jon Fairbairn wrote: > On 2005-10-25 at 12:20+0200 Lemmih wrote: >> On 10/25/05, Charles SDudu <[EMAIL PROTECTED]> wrote: >>> Hello, I need to calculate the frequency of each >>> character in a String. And if I can do this really well >>> in C, I dont find a nice (and fast) answer in haskell. I >>> tried several functions, listed below, and even the >>> fastest do a lot of unnecessary things : >>> >>> calc :: String -> [ (Char, Int) ] >>> calc = filter (\p -> snd p > 0) . assocs . >>> foldl (\k c -> unsafeReplace k [(fromEnum c, >>> (unsafeAt k (fromEnum c))+1)] ) k where k = array >>> (toEnum 0, toEnum 255) [(toEnum i, 0) | i <- [0 .. 255]] >>>>> UArray Char Int > > [snip even more disagreable code] > > Ugh! These are all horrid. If something on the lines of > >> calc = accumArray (+) 0 (minBound, maxBound) . (map (\x->(x,1))) > > isn't fast enough, complain to the implementors! What's the > point of functional programming if one has to twist into a > shape that allows inspection of one's own fundament to get > stuff to run in decent time?
To make that go fast, you'd want to use a UArray. Could the compiler figure this out for itself? Not really, it might be able to figure out certain special cases where changing an Array to a UArray doesn't affect the semantics, but not in general (and GHC certainly makes no attempt to do it). Sometimes, you just need to give the compiler a bit of help. And you really want to read the file as a packed string, not a String. Or hope that GHC manages to deforest calc with whatever is generating the String. It's possible that calc deforests, but highly unlikely that hGetContents does, so I'm afraid you'll have to do the packed string conversion yourself. To make it go even faster, you need to inline & specialise accumArray. I'm not sure how much of this GHC currently does. Cheers, Simon _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe