While testing larger dataset sizes, Saxon ran into a memory limitation. It was not able to query a 2.21 GB dataset. Even with the Java heap size set well above the dataset size, the application throws an error: "Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded". For reference, I used the following settings: JAVA_OPTS="-Xmx12g -Xms12g"
Several internet posts suggest, as a rule of thumb, allocating 5 times as much memory as the size of the XML data. This is not guaranteed to work: 12 GB is already a bit more than 5 times 2.21 GB, and the query still failed. Some of my tests have worked with datasets up to 460 MB (which happens to be the size of my "tiny" dataset). I guess we have now confirmed the memory limitation of the free version of Saxon.
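To make the rule-of-thumb arithmetic concrete, here is a minimal sketch. The function names and the default factor of 5 are my own illustration (the factor comes from the internet posts, not from any Saxon documentation), and sizes are treated as GiB to match how the JVM reads "-Xmx12g".

```python
GIB = 1024 ** 3  # the JVM interprets "-Xmx12g" as 12 * 1024^3 bytes


def estimated_heap_bytes(xml_size_bytes, factor=5.0):
    """Heap estimate under the 'factor times document size' rule of thumb.

    The factor of 5 is an assumption taken from forum posts, not a guarantee.
    """
    return int(xml_size_bytes * factor)


def meets_rule_of_thumb(xml_size_bytes, heap_bytes, factor=5.0):
    """Return True if the configured heap is at least the estimated requirement."""
    return heap_bytes >= estimated_heap_bytes(xml_size_bytes, factor)


if __name__ == "__main__":
    xml_size = int(2.21 * GIB)  # the dataset that failed
    heap = 12 * GIB             # from JAVA_OPTS="-Xmx12g -Xms12g"
    needed = estimated_heap_bytes(xml_size)
    print(f"estimated heap: {needed / GIB:.2f} GiB, configured: {heap / GIB:.0f} GiB")
    print("meets rule of thumb:", meets_rule_of_thumb(xml_size, heap))
```

For the failing case the check passes on paper, which is exactly why the rule should be read as a rough lower bound and not as a promise that the query will run.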
