On 20/02/14 11:30, Christian Maeder wrote:
> Hi,
>
> I've got some difficulties parsing "large" xml files (> 100MB).
> A plain SAX parser, as provided by hexpat, is fine. However,
> constructing a tree consumes too much memory on a 32bit machine.
>
> see http://trac.informatik.uni-bremen.de:8080
I think you need to import Data.OldTypeable to get Typeable2 and friends.
Pedro may remember more of the reasoning about backward compatibility.
| -Original Message-
| From: Christian Maeder [mailto:christian.mae...@dfki.de]
| Sent: 20 February 2014 16:09
| To: Simon Peyton Jones; Daniil
With "TypeableN " I mean, Typeable1, Typeable2, etc.
Typeable2 was not supported (below) by ghc-7.8-rc1.
Where is the "backward compat"?
In fact, Typeable and Typeable2 mean the same thing for ghc-7.8-rc1 and
ghc-7.6 resp.!
C.
Am 20.02.2014 17:00, schrieb Simon Peyton Jones:
| Can you not s
I'm afraid our use case is not a lazy prefix traversal.
I'm more shocked that about 100 MB xml content do not fit (as tree) into
3 GB memory.
Christian
Am 20.02.2014 16:49, schrieb malcolm.wallace:
Is your usage pattern over the constructed tree likely to be a lazy
prefix traversal? If so, t
| Can you not simply let ghc-7.8 interpret TypeableN as Typeable?
Not really: ghc-7.8 does still support TypeableN I think (for backward compat
reasons). So it can't take it to mean two different things.
| -Original Message-
| From: Glasgow-haskell-users [mailto:glasgow-haskell-users-
|
Is your usage pattern over the constructed tree likely to be a lazy prefix traversal? If so, then HaXml supports lazy construction of the parse tree. Some plots appear at the end of this paper, showing how memory usage can be reduced to a constant, even for very large inputs (1 million tree nodes
Yes, changing Typeable2 to Typeable in:
{-# LANGUAGE StandaloneDeriving, DeriveDataTypeable #-}
...
deriving instance Typeable Gr
goes through with ghc-7.8-rc1.
However, this change refuses to compile with ghc-7.6.3:
Expecting two more arguments to `Gr'
In the stand-alone deriving inst
Hi Christian,
as regards your question about sharing strings, there are a number of
libraries on Hackage to achieve this, e.g. in the context of compiler
symbols. To cite only a few: intern, stringtable-atom, simple-atom.
I'm sure there are others.
Best,
--
Mathieu Boespflug
Founder at http://twe
Ah, I'd misunderstood your question, and thought you were looking for a
sax-like alternative.
On Feb 20, 2014 6:57 AM, "Christian Maeder"
wrote:
> I've just tried:
>
> import Text.HTML.TagSoup
> import Text.HTML.TagSoup.Tree
>
> main :: IO ()
> main = getContents >>= putStr . renderTags .
I've just tried:
import Text.HTML.TagSoup
import Text.HTML.TagSoup.Tree
main :: IO ()
main = getContents >>= putStr . renderTags . flattenTree . tagTree .
parseTags
which also ends with the getMBlock error.
Only "renderTags . parseTags" works fine (like the hexpat SAX parser).
Why sh
Have you looked at tagsoup?
On Feb 20, 2014 3:30 AM, "Christian Maeder"
wrote:
Hi,
I've got some difficulties parsing "large" xml files (> 100MB).
A plain SAX parser, as provided by hexpat, is fine. However, constructing a
tree consumes too much memory on a 32bit machine.
see http://trac.inform
Hi,
I've got some difficulties parsing "large" xml files (> 100MB).
A plain SAX parser, as provided by hexpat, is fine. However,
constructing a tree consumes too much memory on a 32bit machine.
see http://trac.informatik.uni-bremen.de:8080/hets/ticket/1248
I suspect that sharing strings when
12 matches
Mail list logo