Re: [Haskell-cafe] Slower with ByteStrings?

2007-05-28 Thread Donald Bruce Stewart
bulat.ziganshin:
 Hello Bryan,
 
 Sunday, May 27, 2007, 3:30:50 AM, you wrote:
  I think, given my simple algorithm that means that (==) for
  ByteStrings is slower than (==) for String.  Is this possible?
 
  Yes indeed.  Over ByteStrings, (==) is implemented as a call to memcmp.
For small strings, this loses by a large margin because it has to go
  through the FFI.
 
 how about using *unsafe* memcmp import and more complex code for the
 case of large BS length?
 
 a==b | min (length a) (length b)  20   = memcmp a b
  
 

Good idea. I'll try to do this before the next bytestring comes out .

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Slower with ByteStrings?

2007-05-27 Thread Bulat Ziganshin
Hello Bryan,

Sunday, May 27, 2007, 3:30:50 AM, you wrote:
 I think, given my simple algorithm that means that (==) for
 ByteStrings is slower than (==) for String.  Is this possible?

 Yes indeed.  Over ByteStrings, (==) is implemented as a call to memcmp.
   For small strings, this loses by a large margin because it has to go
 through the FFI.

how about using *unsafe* memcmp import and more complex code for the
case of large BS length?

a==b | min (length a) (length b)  20   = memcmp a b
 


-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Slower with ByteStrings?

2007-05-26 Thread Jason Dagit

Hello,

We recently had a challenge as follows:

Given a word, find all the words in the dictionary which can be made
from the letters of that word.  A letter can be used at most as many
times as it appears in the input word.  So, letter can only match
words with 0, 1, or 2 t's in them.

I opted for simplicity in my implementation including hard coding the
input word and using /usr/share/dict/words as the dictionary:

-- Begin Words.hs
module Main where

import List

-- I was lazy and borrowed perms from the Haskell wiki
-- but I wrote everything else
perms [] = [[]]
perms xs = [ x : ps | x - xs, ps - perms (xs\\[x]) ]

-- creates permutations of all lengths then cleans up duplicates and
gets rid of the
-- empty list, this is probably the least efficient way possible
allPerms x = drop 1 $ sort $ nub $ concatMap inits $ perms x


main = do wordList - readFile /usr/share/dict/words
 let words = lines wordList
 mapM_ print $ filter (`elem` words) $ allPerms ubuntu
-- End Words.hs

Next I decided to try it with byte stings:

-- Begin ByteStringWords.hs
module Main where

import List
-- import qualified Data.ByteString.Char8 as C
import qualified Data.ByteString.Lazy.Char8 as C

-- I was lazy and borrowed perms from the Haskell wiki
-- but I wrote everything else
perms [] = [[]]
perms xs = [ x : ps | x - xs, ps - perms (xs\\[x]) ]

-- creates permutations of all lengths then cleans up duplicates and
gets rid of the
-- empty list, this is probably the least efficient way possible
allPerms x = drop 1 $ sort $ nub $ concatMap inits $ perms x


main = do wordList - C.readFile /usr/share/dict/words
 let words = C.lines wordList
 mapM_ print $ filter (`elem` words) $ map C.pack $ allPerms ubuntu
-- End ByteStringWords.hs

I don't think the overhead to compute the permutations matters here as
the input to the permutations calculation is so small.  Any ideas why
the byte string version is slower?  (Strict bytestrings appear to be
about 2 seconds slower and lazy bytestrings appear to be about 1
second slower).

I think, given my simple algorithm that means that (==) for
ByteStrings is slower than (==) for String.  Is this possible?  I
think the program might be spending more time cleaning up after
execution with the ByteString versions as it seems to stall after
printing the last match.

Thanks,
Jason
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Slower with ByteStrings?

2007-05-26 Thread Bryan O'Sullivan

Jason Dagit wrote:


I think, given my simple algorithm that means that (==) for
ByteStrings is slower than (==) for String.  Is this possible?


Yes indeed.  Over ByteStrings, (==) is implemented as a call to memcmp. 
 For small strings, this loses by a large margin because it has to go 
through the FFI.


b
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Slower with ByteStrings?

2007-05-26 Thread Donald Bruce Stewart
bos:
 Jason Dagit wrote:
 
 I think, given my simple algorithm that means that (==) for
 ByteStrings is slower than (==) for String.  Is this possible?
 
 Yes indeed.  Over ByteStrings, (==) is implemented as a call to memcmp. 
  For small strings, this loses by a large margin because it has to go 
 through the FFI.
 

Yes, a non-memcmp version can sometimes be profitably used here.

Something like this Core:

eq !n (Ptr p) (Ptr q) = inlinePerformIO $ IO $ go n p q
  where 
go !n p q s
| n == 0= (# s , True #)
| otherwise = case readInt8OffAddr# p 0# s of
(# s, a #) - case readInt8OffAddr# q 0# s of
(# s, b #) | a /=# b   - (# s, False #)
   | otherwise - go (n-1) (plusAddr# p 1#) 
(plusAddr# q 1#) s

Ok, so that's not Core, but it could be ;)

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe