On 7/7/11 3:38 AM, Aleksandar Dimitrov wrote:
> On Wed, Jul 06, 2011 at 07:27:10PM -0700, wren ng thornton wrote:
>> I definitely agree with the iteratees comment, but I'm curious about the
>> leaks you mention. I haven't run into leakiness issues (that I'm aware of)
>> in my use of ByteStrings for NLP.
>
> The issue is this: strict ByteStrings retain pointers to the original chunk. The
> chunk is probably bigger than you'd want to keep in memory, if you, say, wanted
> to just keep one or two words. In my case, the chunk was some 65K (that was my
> Iteratee chunk size.)

Oh, that issue. Yeah, I maintain an intern table and make sure that the copy in the table is a trimmed copy instead of keeping the whole string alive. I guess I should factor that part of my tagger out into a separate package :)

I didn't know if you meant there was a technical issue, e.g. something about the fact that ByteString uses pinned memory (whereas Text doesn't, IIRC).

--
Live well,
~wren
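
For illustration, the intern-table idea described above might look roughly like the following. This is only a minimal sketch, not the tagger's actual code: the names `InternTable`, `intern`, and `internWords` are hypothetical, and the table is assumed to be a plain `Data.Map`. The one library call that matters is `copy` from the bytestring package, which allocates a fresh buffer holding just the bytes of the slice, so the table no longer refers to the original chunk.

```haskell
import qualified Data.ByteString.Char8 as B
import           Data.List             (mapAccumL)
import qualified Data.Map.Strict       as M
import           Data.Tuple            (swap)

-- | Hypothetical intern table: maps a token to its canonical, trimmed copy.
type InternTable = M.Map B.ByteString B.ByteString

-- | Return the canonical copy of a token, adding it to the table on a miss.
-- On a miss the token is copied with 'B.copy' before it is stored, so the
-- table holds a buffer of exactly the token's length rather than a slice
-- that keeps the whole input chunk alive.
intern :: B.ByteString -> InternTable -> (B.ByteString, InternTable)
intern tok table =
  case M.lookup tok table of
    Just canonical -> (canonical, table)
    Nothing ->
      let trimmed = B.copy tok
      -- Force the copy before returning, so no thunk still refers to 'tok'.
      in trimmed `seq` (trimmed, M.insert trimmed trimmed table)

-- | Intern every whitespace-separated word of a chunk, threading the table
-- through from left to right.
internWords :: B.ByteString -> InternTable -> ([B.ByteString], InternTable)
internWords chunk table0 = swap (mapAccumL step table0 (B.words chunk))
  where
    step table tok = swap (intern tok table)
```

With something like this, a caller would keep only the values returned by `intern` and let the slices produced by `B.words` (and with them the 65K chunk) become garbage once the chunk has been processed. Repeated words share a single copy, which is the usual reason for interning in the first place.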