jeff: > Alright, I've been hacking away at what I posted the other day, and I > have something that works for files that will fit entirely into memory. > And then I figured out why I've been restricted to files that fit > entirely into memory... One of my functions is causing the entire thing > to be read in, when, in the way I analyze the program, only a very small > portion of the file should be read in. Here are the functions I've used > to test this problem... > > import Data.Bits > import qualified Data.ByteString.Lazy as BS > import Foreign.C.Types > ... > > {-# INLINE decodeLengthBits #-} > decodeLengthBits :: BS.ByteString -> CInt > decodeLengthBits doc = (shift (pieces !! 3) 24) .|. > (shift (pieces !! 2) 16) .|. > (shift (pieces !! 1) 8) .|. > (pieces !! 0) > where pieces::[CInt] = map fromIntegral . BS.unpack . BS.take 4 $ doc > > breakIntoDocuments :: RawDocument -> [RawDocument] > breakIntoDocuments f | BS.length f > 0 = if len > 0
Argh! -- | /O(n\/c)/ 'length' returns the length of a ByteString as an 'Int64' length :: ByteString -> Int64 length (LPS ss) = L.foldl' (\n ps -> n + fromIntegral (P.length ps)) 0 ss that'll force the whole file (it is the sum of the length of each chunk). Try comparing against the null bytestring, -- | /O(1)/ Test whether a ByteString is empty. null :: ByteString -> Bool null (LPS []) = True null (_) = False :-) > then (BS.take bytes f) : > (breakIntoDocuments > (BS.drop bytes f)) > else (breakIntoDocuments > (BS.drop bytes f)) > | otherwise = [] > where len = decodeLengthBits f > bytes = fromIntegral (len * 2 + len * 4 + 4) > > > and a main function of: > > main = do > f <- B.readFile "Documents.bin" > print (take 1 (breakIntoDocuments f)) > > > Shouldn't the program only read in enough of the lazy byte-string to > create the first return value of breakIntoDocuments? The return value > of decodeLengthBits is only 277. I watched it, and it's reading in my > whole 2gb file... > > -- Jeff Got to be more lazy :-) -- Don _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe