Re: [Haskell-cafe] my Fasta is slow ;(
I've already submitted it, thanks. The Fortran program commits the same sin as the C++ one, of doing floating point arithmetic in the inner loop; that's why it's slow. On Dec 27, 2012, at 18:05, Branimir Maksimovic wrote: Thank you. Your entry is great. Faster than fortran entry! Dou you want to contribute at the site, or you want me to do it for you? -- Date: Thu, 27 Dec 2012 15:58:40 -0800 Subject: Re: [Haskell-cafe] my Fasta is slow ;( From: b...@serpentine.com To: bm...@hotmail.com CC: haskell-cafe@haskell.org On Tue, Dec 18, 2012 at 12:42 PM, Branimir Maksimovic wrote: Seems to me that culprit is in function random as I have tested rest of code and didn't found speed related problems. The problem with your original program was that it was not pure enough. Because you stored your PRNG state in an IORef, you forced the program to allocate and case-inspect boxed Ints in its inner loop. I refactored it slightly to make genRand and genRandom pure, and combined with using the LLVM back end, this doubled the program's performance, so that the Haskell program ran at the same speed as your C++ version. The next bottleneck was that your program was performing floating point arithmetic in the inner loop. I changed it to precompute a small lookup table, followed by only using integer arithmetic in the inner loop (the same technique used by the fastest C fasta program). This further improved performance: the new Haskell code is 40% faster than the C++ program, and only ~20% slower than the C program that currently tops the shootout chart. The Haskell source is a little over half the size of the C source. You can follow the work I did here: https://github.com/bos/shootout-fasta ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] my Fasta is slow ;(
Thank you. Your entry is great. Faster than fortran entry!Dou you want to contribute at the site, or you want me to do it for you? Date: Thu, 27 Dec 2012 15:58:40 -0800 Subject: Re: [Haskell-cafe] my Fasta is slow ;( From: b...@serpentine.com To: bm...@hotmail.com CC: haskell-cafe@haskell.org On Tue, Dec 18, 2012 at 12:42 PM, Branimir Maksimovic wrote: Seems to me that culprit is in function random as I have tested rest of codeand didn't found speed related problems. The problem with your original program was that it was not pure enough. Because you stored your PRNG state in an IORef, you forced the program to allocate and case-inspect boxed Ints in its inner loop. I refactored it slightly to make genRand and genRandom pure, and combined with using the LLVM back end, this doubled the program's performance, so that the Haskell program ran at the same speed as your C++ version. The next bottleneck was that your program was performing floating point arithmetic in the inner loop. I changed it to precompute a small lookup table, followed by only using integer arithmetic in the inner loop (the same technique used by the fastest C fasta program). This further improved performance: the new Haskell code is 40% faster than the C++ program, and only ~20% slower than the C program that currently tops the shootout chart. The Haskell source is a little over half the size of the C source. You can follow the work I did here: https://github.com/bos/shootout-fasta ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] my Fasta is slow ;(
On Tue, Dec 18, 2012 at 12:42 PM, Branimir Maksimovic wrote: > > Seems to me that culprit is in function random as I have tested rest of > code > and didn't found speed related problems. > The problem with your original program was that it was not pure enough. Because you stored your PRNG state in an IORef, you forced the program to allocate and case-inspect boxed Ints in its inner loop. I refactored it slightly to make genRand and genRandom pure, and combined with using the LLVM back end, this doubled the program's performance, so that the Haskell program ran at the same speed as your C++ version. The next bottleneck was that your program was performing floating point arithmetic in the inner loop. I changed it to precompute a small lookup table, followed by only using integer arithmetic in the inner loop (the same technique used by the fastest C fasta program). This further improved performance: the new Haskell code is 40% faster than the C++ program, and only ~20% slower than the C program that currently tops the shootout chart. The Haskell source is a little over half the size of the C source. You can follow the work I did here: https://github.com/bos/shootout-fasta ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] my Fasta is slow ;(
I took your Haskell program as a base and have refactored it into a version that is about the same speed as your original C++ program. Will follow up with details when I have a little more time. On Tue, Dec 18, 2012 at 12:42 PM, Branimir Maksimovic wrote: > This time I have tried fasta benchmark since current entries does not > display correct output. > Program is copy of mine > http://benchmarksgame.alioth.debian.org/u64q/program.php?test=fasta&lang=gpp&id=1 > c++ benchmark, but unfortunately executes more than twice time. > > Seems to me that culprit is in function random as I have tested rest of > code > and didn't found speed related problems. > > bmaxa@maxa:~/shootout/fasta$ time ./fastahs 2500 > /dev/null > > real0m5.262s > user0m5.228s > sys 0m0.020s > > bmaxa@maxa:~/shootout/fasta$ time ./fastacpp 2500 > /dev/null > > real0m2.075s > user0m2.056s > sys 0m0.012s > > Since I am planning to contribute program, perhaps someone can > see a problem to speed it up at least around 3.5 secs which is > speed of bench that display incorrect result (in 7.6.1). > > Program follows: > > {-# LANGUAGE BangPatterns #-} > {- The Computer Language Benchmarks Game > > http://shootout.alioth.debian.org/ > > contributed by Branimir Maksimovic > -} > > import System.Environment > import System.IO.Unsafe > > import Data.IORef > import Data.Array.Unboxed > import Data.Array.Storable > import Data.Array.Base > import Data.Word > > import Foreign.Ptr > import Foreign.C.Types > > type A = UArray Int Word8 > type B = StorableArray Int Word8 > type C = (UArray Int Word8,UArray Int Double) > > foreign import ccall unsafe "stdio.h" > puts :: Ptr a -> IO () > foreign import ccall unsafe "string.h" > strlen :: Ptr a -> IO CInt > > main :: IO () > main = do > n <- getArgs >>= readIO.head > > let !a = (listArray (0,(length alu)-1) > $ map (fromIntegral. fromEnum) alu:: A) > make "ONE" "Homo sapiens alu" (n*2) $ Main.repeat a (length alu) > make "TWO" "IUB ambiguity codes" (n*3) $ random iub > make "THREE" "Homo sapiens frequency" (n*5) $ random homosapiens > > make :: String -> String -> Int -> IO Word8 -> IO () > {-# INLINE make #-} > make id desc n f = do > let lst = ">" ++ id ++ " " ++ desc > a <- (newListArray (0,length lst) > $ map (fromIntegral. fromEnum) lst:: IO B) > unsafeWrite a (length lst) 0 > pr a > make' n 0 > where > make' :: Int -> Int -> IO () > make' !n !i = do > let line = (unsafePerformIO $ > newArray (0,60) 0 :: B) > if n > 0 > then do > !c <- f > unsafeWrite line i c > if i+1 >= 60 > then do > pr line > make' (n-1) 0 > else > make' (n-1) (i+1) > else do > unsafeWrite line i 0 > l <- len line > if l /= 0 > then pr line > else return () > > pr :: B -> IO () > pr line = withStorableArray line (\ptr -> puts ptr) > len :: B -> IO CInt > len line = withStorableArray line (\ptr -> strlen ptr) > > repeat :: A -> Int -> IO Word8 > repeat xs !n = do > let v = unsafePerformIO $ newIORef 0 > !i <- readIORef v > if i+1 >= n > then writeIORef v 0 > else writeIORef v (i+1) > return $ xs `unsafeAt` i > > random :: C -> IO Word8 > random (a,b) = do > !rnd <- rand > let > find :: Int -> IO Word8 > find !i = > let > !c = a `unsafeAt` i > !p = b `unsafeAt` i > in if p >= rnd > then return c > else find (i+1) > find 0 > > rand :: IO Double > {-# INLINE rand #-} > rand = do > !seed <- readIORef last > let > newseed = (seed * ia + ic) `rem` im > newran = fromIntegral newseed * rimd > rimd = 1.0 / (fromIntegral im) > im, ia, ic :: Int > im = 139968 > ia = 3877 > ic = 29573 > writeIORef last newseed > return newran > where > last = unsafePerformIO $ newIORef 42 > > alu:: [Char] > alu = > "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG\ > \GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA\ > \CCAGCCTGGCCAACATGGTGAAAGTCTCTACTAT\ > \ACATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA\ > \GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG\ > \AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC\ > \AGCCTGGGCGACAGAGCGAGACTCCGTCTCA" > > mkCum :: [(Char,Double)] -> [(Word8,Double)] > mkCum lst = map (\(c,p) -> ((fromIntegral.fromEnum) c,p)) $ > scanl1 (\(_,p) (c',p') -> (c', p+p')) lst > > homosapiens, iub ::
[Haskell-cafe] my Fasta is slow ;(
This time I have tried fasta benchmark since current entries does notdisplay correct output.Program is copy of mine http://benchmarksgame.alioth.debian.org/u64q/program.php?test=fasta&lang=gpp&id=1c++ benchmark, but unfortunately executes more than twice time. Seems to me that culprit is in function random as I have tested rest of codeand didn't found speed related problems. bmaxa@maxa:~/shootout/fasta$ time ./fastahs 2500 > /dev/null real0m5.262suser0m5.228ssys 0m0.020s bmaxa@maxa:~/shootout/fasta$ time ./fastacpp 2500 > /dev/null real0m2.075suser0m2.056ssys 0m0.012s Since I am planning to contribute program, perhaps someone cansee a problem to speed it up at least around 3.5 secs which is speed of bench that display incorrect result (in 7.6.1). Program follows: {-# LANGUAGE BangPatterns #-}{- The Computer Language Benchmarks Game http://shootout.alioth.debian.org/ contributed by Branimir Maksimovic-} import System.Environmentimport System.IO.Unsafe import Data.IORefimport Data.Array.Unboxedimport Data.Array.Storableimport Data.Array.Baseimport Data.Word import Foreign.Ptrimport Foreign.C.Types type A = UArray Int Word8type B = StorableArray Int Word8type C = (UArray Int Word8,UArray Int Double) foreign import ccall unsafe "stdio.h" puts :: Ptr a -> IO ()foreign import ccall unsafe "string.h" strlen :: Ptr a -> IO CInt main :: IO () main = don <- getArgs >>= readIO.head let !a = (listArray (0,(length alu)-1) $ map (fromIntegral. fromEnum) alu:: A)make "ONE" "Homo sapiens alu" (n*2) $ Main.repeat a (length alu)make "TWO" "IUB ambiguity codes" (n*3) $ random iubmake "THREE" "Homo sapiens frequency" (n*5) $ random homosapiens make :: String -> String -> Int -> IO Word8 -> IO (){-# INLINE make #-}make id desc n f = dolet lst = ">" ++ id ++ " " ++ desca <- (newListArray (0,length lst) $ map (fromIntegral. fromEnum) lst:: IO B) unsafeWrite a (length lst) 0pr amake' n 0where make' :: Int -> Int -> IO ()make' !n !i = dolet line = (unsafePerformIO $ newArray (0,60) 0 :: B)if n > 0 then do!c <- funsafeWrite line i c if i+1 >= 60 then do pr linemake' (n-1) 0 else make' (n-1) (i+1)else do unsafeWrite line i 0l <- len line if l /= 0then pr line else return () pr :: B -> IO ()pr line = withStorableArray line (\ptr -> puts ptr)len :: B -> IO CIntlen line = withStorableArray line (\ptr -> strlen ptr) repeat :: A -> Int -> IO Word8repeat xs !n = dolet v = unsafePerformIO $ newIORef 0!i <- readIORef vif i+1 >= nthen writeIORef v 0else writeIORef v (i+1)return $ xs `unsafeAt` i random :: C -> IO Word8random (a,b) = do !rnd <- randlet find :: Int -> IO Word8find !i = let !c = a `unsafeAt` i!p = b `unsafeAt` i in if p >= rndthen return celse find (i+1)find 0 rand :: IO Double{-# INLINE rand #-}rand = do!seed <- readIORef lastlet newseed = (seed * ia + ic) `rem` imnewran = fromIntegral newseed * rimdrimd = 1.0 / (fromIntegral im)im, ia, ic :: Intim = 139968ia = 3877ic = 29573writeIORef last newseedreturn newranwhere last = unsafePerformIO $ newIORef 42 alu:: [Char]alu = "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG\ \GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA\ \CCAGCCTGGCCAACATGGTGAAAGTCTCTACTAT\ \ACATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA\ \GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG\ \AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC\ \AGCCTGGGCGACAGAGCGAGACTCCGTCTCA" mkCum :: [(Char,Double)] -> [(Word8,Double)]mkCum lst = map (\(c,p) -> ((fromIntegral.fromEnum) c,p)) $ scanl1 (\(_,p) (c',p') -> (c', p+p')) lst homosapiens, iub :: C iub' = mkCum [('a',0.27),('c',0.12),('g',0.12),('t',0.27),('B',0.02) ,('D',0.02),('H',0.02),('K',0.02),('M',0.02),('N',0.02) ,('R',0.02),('S',0.02),('V',0.02),('W',0.02),('Y',0.02)] homosapiens' = mkCum [('a',0.3029549426680),('c',0.1979883004921) ,('g',0.1975473066391),('t',0.3015094502008)] iub = (listArray (0, (length iub')-1) $ map fst iub',listArray (0, (length iub')-1) $ map snd iub') homosapiens = (listArray (0, (length homosapiens')-1) $ map fst homosapiens', listArray (0, (length homosapiens')-1) $ map snd homosapiens')