On Tuesday, February 12, 2013, wrote: > On Tue, 12 Feb 2013 15:57:37 -0700 > Nicolas Bock <nicolasb...@gmail.com <javascript:;>> wrote: > > > > Here is haskell version that is faster than python, almost as fast as > c++. > > > You need to install bytestring-lexing package for readDouble. > > > I was hoping Branimir could comment on how the improvements were allocated. > > how much is due to text.regex.pcre (which looks to be a wrapper to > libpcre) ? > > how much can be attributed to using data.bytestring ? > > you have to admit, it's amazing how well a byte-compiled, _dynamically > typed_ interpreter can do against an actualy native code compiler. Can't > regex be done effectively in haskell ? Is it something that can't be done, > or is it just such minimal effort to link to pcre that it's not worth the > trouble ?
I think that there are two bottlenecks: the regex engine, and converting a bytestring to a double. There doesn't appear to be a fast and accurate strtod implementation for Haskell, and the faster regex implementations that I could find appear to be unmaintained. > > > Brian > > > > import Text.Regex.PCRE > > > import Data.Maybe > > > import Data.Array.IO > > > import Data.Array.Unboxed > > > import qualified Data.ByteString.Char8 as B > > > import Data.ByteString.Lex.Double (readDouble) > > > > > > strataBounds :: UArray Int Double > > > strataBounds = listArray (0,10) [ 0.0, 1.0e-8, 1.0e-7, 1.0e-6, 1.0e-5, > > > 1.0e-4, 1.0e-3, 1.0e-2, 1.0e-1, 1.0, 2.0 ] > > > > > > newStrataCounts :: IO(IOUArray Int Int) > > > newStrataCounts = newArray (bounds strataBounds) 0 > > > > > > main = do > > > l <- B.getContents > > > let a = B.lines l > > > strataCounts <- newStrataCounts > > > n <- calculate strataCounts a 0 > > > let > > > printStrataCounts :: IO () > > > printStrataCounts = do > > > let s = round $ sqrt (fromIntegral n::Double) :: Int > > > printf "read %d matrix elements (%dx%d = %d)\n" n s s n > > > printStrataCounts' 0 0 > > > printStrataCounts' :: Int -> Int -> IO () > > > printStrataCounts' i total > > > | i < (snd $ bounds strataBounds) = do > > > count <- readArray strataCounts i > > > let > > > p :: Double > > > p = (100.0*(fromIntegral count) :: > > > Double)/(fromIntegral n :: Double) > > > printf "[%1.2e, %1.2e) = %i (%1.2f%%) %i\n" > (strataBounds > > > ! i) (strataBounds ! (i+1)) > > > count p > > > (total + count) > > > printStrataCounts' (i+1) (total+count) > > > | otherwise = return () > > > printStrataCounts > > > > > > calculate :: IOUArray Int Int -> [B.ByteString] -> Int -> IO Int > > > calculate _ [] n = return n > > > calculate counts (l:ls) n = do > > > let > > > a = case getAllTextSubmatches $ l =~ B.pack "matrix.*= > > > ([0-9eE.+-]+)$" :: [B.ByteString] of > > > [_,v] -> Just (readDouble v) :: Maybe (Maybe > > > (Double,B.ByteString)) > > > _ -> Nothing > > > b = (fst.fromJust.fromJust) a > > > loop :: Int -> IO() > > > loop i > > > | i < (snd $ bounds strataBounds) = > > > if (b >= (strataBounds ! i)) && (b < (strataBounds ! > > > (i+1))) > > > then do > > > c <- readArray counts i > > > writeArray counts i (c+1) > > > else > > > loop (i+1) > > > | otherwise = return () > > > if isNothing a > > > then > > > calculate counts ls n > > > else do > > > loop 0 > > > calculate counts ls (n+1) > > > > > > > > > ------------------------------ > > > From: nicolasb...@gmail.com <javascript:;> > > > Date: Fri, 8 Feb 2013 12:26:09 -0700 > > > To: haskell-cafe@haskell.org <javascript:;> > > > Subject: [Haskell-cafe] performance question > > > > > > Hi list, > > > > > > I wrote a script that reads matrix elements from standard input, parses > > > the input using a regular expression, and then bins the matrix > elements by > > > magnitude. I wrote the same script in python (just to be sure :) ) and > find > > > that the python version vastly outperforms the Haskell script. > > > > > > To be concrete: > > > > > > $ time ./createMatrixDump.py -N 128 | ./printMatrixDecay > > > real 0m2.655s > > > user 0m2.677s > > > sys 0m0.095s > > > > > > $ time ./createMatrixDump.py -N 128 | ./printMatrixDecay.py - > > > real 0m0.445s > > > user 0m0.615s > > > sys 0m0.032s > > > > > > The Haskell script was compiled with "ghc --make printMatrixDecay.hs". > > > > > > Could you have a look at the script and give me some pointers as to > where > > > I could improve it, both in terms of performance and also generally, > as I > > > am very new to Haskell. > > > > > > Thanks already, > > > > > > nick > > > > > > > > > _______________________________________________ Haskell-Cafe mailing > list > > > Haskell-Cafe@haskell.org <javascript:;> > > > http://www.haskell.org/mailman/listinfo/haskell-cafe > > > > > >
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe