Nick is correct about the serializer, but the question was about the Turtle parser, and the concern is valid there too.
The Raptor turtle (n3, trig) parser relies on flex and bison (aka lex+yacc), of which bison:

a) has to have the entire input in memory in one block in order to parse
b) uses 32 bit unsigned int offsets

So Raptor has to assemble the input in memory (lots of alloc / realloc) and ends up with a max 2G size. A 5G file is not going to parse.

I have looked at fixing this several times but writing a streaming lexer and parser is damn hard - months of work. Using ANTLR and other things that do the same job looks like it would make things a lot more complex (it's C++). I've also tried looking at sqlite's lemon but it doesn't stream, so it seems the only road to this is a lot of work.

Dave

On 7/9/12 1:30 AM, Nicholas Humfrey wrote:
> Hello,
>
> Yes, the Turtle serialiser puts everything into RAM, in order to build a tree
> of the data and output a nice pretty file, with all the triples with the same
> subject next to each other.
>
> If you output as ntriples, then output will be much faster and it won't try
> and load everything into RAM.
>
> nick.
>
>
> On 9 Jul 2012, at 02:15, Medha Atre wrote:
>
>> Hello,
>>
>> I am trying to use the Raptor RDF parser library to parse a very large
>> RDF/XML file of LUBM dataset (synthetically generated) and convert it into
>> Turtle representation. The gzipped format of RDF/XML file itself is 5.1 GB (I
>> am reading its input through a fifo and "rapper" reads from this fifo).
>>
>> When I run "rapper" command to convert RDF/XML into Turtle on this file, the
>> memory utilization shoots up very high (it consumes almost all of my RAM
>> leaving me unable to do anything else on the computer).
>>
>> I was wondering if there is any option to restrict the memory used by
>> "rapper" tool? I checked "configure" and "rapper --help", but didn't find
>> any such option.
>>
>> Can someone please let me know what the best and easiest workaround is for this?
>>
>> Thanks.
>>
>> Medha
>>
>> _______________________________________________
>> redland-dev mailing list
>> [email protected]
>> http://lists.librdf.org/mailman/listinfo/redland-dev
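
As a rough illustration of the buffering problem described in this thread, here is a minimal Python sketch of what "assemble the whole input in one block before parsing" looks like. It is not Raptor's code. The names buffer_whole_input, InputTooLarge and MAX_BUFFER_BYTES are made up for the example, and the cap of 2**31 - 1 bytes is only an assumed reading of the "max 2G" figure.

```python
import io

# Assumption: the "max 2G" limit from the thread, modelled as 2**31 - 1 bytes.
MAX_BUFFER_BYTES = 2**31 - 1


class InputTooLarge(Exception):
    """Hypothetical error: the buffered input would exceed the size cap."""


def buffer_whole_input(stream, chunk_size=64 * 1024, limit=MAX_BUFFER_BYTES):
    """Read a binary stream to the end into one contiguous block.

    A non-streaming parser needs this whole block before it can start,
    so memory use grows with the size of the input.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of input
        if len(buf) + len(chunk) > limit:
            raise InputTooLarge(
                f"input exceeds {limit} bytes; cannot be held in one block"
            )
        buf.extend(chunk)  # grows the block, much like repeated realloc in C
    return bytes(buf)


if __name__ == "__main__":
    sample = io.BytesIO(b"<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n")
    block = buffer_whole_input(sample)
    print(len(block), "bytes buffered before parsing could begin")
```

With a scheme like this, a 5G input fails at the cap no matter how much RAM is available, which is why only a streaming lexer and parser would really fix it.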
