"Uwe Schmidt" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> it into HXT.
>
> This still does not solve the processing of "very very large"
> XML documents. I doubt whether we can do this with a DOM-like
> approach, as in HXT or HaXml. Lazy input does not solve all problems.
> A SAX-like parser could be a more useful choice for very large documents.
>
> Uwe

I think a step towards supporting medium-sized documents in HXT would be to store the tags and content more efficiently. If I understand the code correctly, every tag is stored as a separate Haskell String. As each character of a String under GHC takes 12 bytes, this alone leads to high memory usage. Tags tend to repeat, so you could store each one only once using a hash table. Content could be stored in compressed byte strings.

As I mentioned in an earlier post, 2GB of memory is not enough to process a 35MB XML document in HXT, as we need 30 x 2 x 12 = 720 MB for starters just to store the string data (once in the parser and once in the DOM). (Well, that was a machine with 2GB of memory; I guess I had somewhere around 1GB free for the program.) Other overheads most likely used up the remaining 300 MB.

Rene.
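
P.S. A minimal sketch of the "store each tag only once" idea, in case it helps the discussion. This is not HXT code: the names TagTable, intern and internAll are made up for illustration, and I use Data.Map from the containers package rather than a real hash table, only because it ships with GHC. Tag names are held as strict ByteStrings instead of Strings.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.List (mapAccumL)

-- Hypothetical names for illustration; none of this is HXT's API.
-- The table maps a tag name to the single shared copy of that name.
type TagTable = M.Map B.ByteString B.ByteString

-- Return the shared copy of a tag name, adding it on first sight.
intern :: TagTable -> B.ByteString -> (TagTable, B.ByteString)
intern table name =
  case M.lookup name table of
    Just shared -> (table, shared)
    Nothing ->
      -- B.copy detaches the name from any larger input buffer
      -- it may have been sliced from.
      let owned = B.copy name
      in (M.insert owned owned table, owned)

-- Intern a whole sequence of tag names, threading the table through.
internAll :: [B.ByteString] -> (TagTable, [B.ByteString])
internAll = mapAccumL intern M.empty

main :: IO ()
main = do
  let tags = map B.pack ["row", "cell", "cell", "row", "cell"]
      (table, shared) = internAll tags
  print (M.size table)  -- 2 distinct tag names are stored
  print shared          -- the same names, now pointing at shared copies
```

With this, a document with a million <cell> elements would hold one copy of "cell" plus a million references to it, instead of a million separate Strings. Whether that fits into HXT's tree types is a separate question which I have not looked into.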