Hi Preston,

Do you have any indication that this is a limitation of just the free version?
I don't think a big memory blow-up would be completely surprising. Assuming
that the XML file is mostly single-byte UTF-8 text (which I think it is) and
that the JVM stores text as 2-byte UTF-16 characters, we already have a factor
of 2. On top of that, the document tree needs a number of objects and
references per node, which take up additional memory. So it may well be that
all versions of Saxon need a lot of heap for a document this size. But of
course it is also possible that the commercial version uses a more
memory-efficient representation.
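
If you want a concrete number, one thing to try on one of the smaller files
that still fits in memory is to build the document tree through the s9api and
compare heap usage before and after. Below is a rough, untested sketch; the
class name and the file argument are just placeholders I made up, and the
Runtime-based measurement is only approximate (GC timing and so on), but it
should give a ballpark expansion factor:

import java.io.File;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.DocumentBuilder;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XdmNode;

public class HeapUsageEstimate {
    public static void main(String[] args) throws SaxonApiException {
        File xml = new File(args[0]);        // path to a sample XML file
        Runtime rt = Runtime.getRuntime();

        // Rough heap baseline before the tree is built
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();

        // Build the in-memory tree the same way a query would
        Processor processor = new Processor(false);   // false = open-source edition
        DocumentBuilder builder = processor.newDocumentBuilder();
        XdmNode doc = builder.build(new StreamSource(xml));

        // Heap usage once the tree is resident
        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();

        double factor = (double) (after - before) / xml.length();
        System.out.printf("file: %d bytes, extra heap: ~%d bytes (factor ~%.1f)%n",
                xml.length(), after - before, factor);

        // Keep a live reference so the tree is not collected before we measure
        System.out.println("root node kind: " + doc.getNodeKind());
    }
}

Multiplying the measured factor by the 2.21 GB file size should give a rough
idea of how much heap the big data set would actually need.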

Cheers,
Till

On Feb 11, 2014, at 8:07 PM, Eldon Carman <[email protected]> wrote:

> In testing larger dataset sizes, Saxon has run into a memory limitation. A
> data set of 2.21 GB could not be queried by Saxon. Even with the Java heap
> size set larger than the data set, the application throws an error:
> "Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
> exceeded". Just to confirm, I used the following settings:
> JAVA_OPTS="-Xmx12g -Xms12g"
> 
> Several internet posts suggest allocating five times the XML data size as a
> rule of thumb, though this is not guaranteed to work. Some of my tests have
> worked with datasets up to 460 MB (which happens to be my tiny dataset
> size). I guess we have now confirmed the memory limitation of the free
> version of Saxon.
