Hello, I am facing an issue while retrieving a large number of XML documents from a BaseX collection.
Each document (as an XML file) is around 10 KB, and in the problematic case
I must retrieve around 70000 of them.
I am using Session#query(String query) then Query#more() and Query#next()
to iterate through the result of my query.
try (final Query query = l_Session.query("query")) {
  while (query.more()) {
    String xml = query.next();
  }
}
If there are more than a certain number of XML documents in the result of my
query, I get an OutOfMemoryError (full stack trace in attached file) when
executing query.more().
I ran the test with BaseX 8.6.6 and 8.6.7, Java 8, and the VM argument -Xmx1024m.
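For a rough sense of scale, here is a back-of-the-envelope sketch in Java. It assumes the whole result is serialized into one growing in-memory byte array before the first item is returned, which is what the LocalQuery.cache, ArrayOutput.write and Arrays.copyOf frames in the stack trace suggest. The class name, the 512 MiB buffer size and the 1.5x growth factor are illustrative guesses of mine, not values taken from the BaseX source.

```java
public class HeapEstimate {
    public static final long MIB = 1024L * 1024L;

    /** Bytes needed to hold every serialized document at once. */
    public static long serializedBytes(long documents, long bytesPerDocument) {
        return documents * bytesPerDocument;
    }

    /**
     * Bytes alive while a full array is copied into a larger one:
     * the old array and the new array exist at the same time.
     */
    public static long peakWhileGrowing(long currentCapacity, double growthFactor) {
        return currentCapacity + (long) (currentCapacity * growthFactor);
    }

    public static void main(String[] args) {
        long heap = 1024 * MIB;                           // -Xmx1024m
        long total = serializedBytes(70_000, 10 * 1024);  // figures from this report

        System.out.println("whole result: " + total / MIB + " MiB");

        // Hypothetical moment: the buffer holds 512 MiB and has to grow.
        // The growth factor of 1.5 is an assumption, not a measured value.
        long peak = peakWhileGrowing(512 * MIB, 1.5);
        System.out.println("peak during one resize: " + peak / MIB
                + " MiB, heap: " + heap / MIB + " MiB");
    }
}
```

Under these assumptions, the roughly 683 MiB of serialized data would nominally fit in a 1024 MiB heap, but a single resize of a 512 MiB buffer would need about 1280 MiB, which would be consistent with the error being thrown inside Arrays.copyOf.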
Increasing the Xmx value is not a solution, as I don’t know the maximum
amount of data I will have to retrieve in the future. So what I need is a
reliable way of executing such queries and iterating through the result
without exhausting the heap.
I also tried using QueryProcessor and QueryProcessor#iter() instead of
Session#query(String query). But is it safe to use, knowing that my
application is multithreaded and that each thread has its own session to
query or add elements from/to multiple collections?
Moreover, for now all access to BaseX goes through a session, so my
application can run with an embedded BaseX or with a BaseX server. If I
start using QueryProcessor, then it will be embedded BaseX only, right?
I have also attached a simple example showing the problem.
Any advice would be much appreciated.
Thanks
Simon
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at org.basex.io.out.ArrayOutput.write(ArrayOutput.java:25)
at org.basex.io.out.ServerOutput.write(ServerOutput.java:31)
at java.io.OutputStream.write(OutputStream.java:116)
at org.basex.io.out.BufferOutput.flush(BufferOutput.java:60)
at org.basex.io.out.BufferOutput.write(BufferOutput.java:54)
at org.basex.io.out.PrintOutput.write(PrintOutput.java:66)
at org.basex.io.out.PrintOutput.print(PrintOutput.java:76)
at org.basex.io.out.NewlineOutput.print(NewlineOutput.java:33)
at org.basex.io.out.PrintOutput.print(PrintOutput.java:99)
at org.basex.io.serial.MarkupSerializer.finishClose(MarkupSerializer.java:226)
at org.basex.io.serial.Serializer.closeElement(Serializer.java:189)
at org.basex.io.serial.Serializer.node(Serializer.java:364)
at org.basex.io.serial.Serializer.node(Serializer.java:158)
at org.basex.io.serial.StandardSerializer.node(StandardSerializer.java:105)
at org.basex.io.serial.AdaptiveSerializer.node(AdaptiveSerializer.java:75)
at org.basex.io.serial.Serializer.serialize(Serializer.java:109)
at org.basex.io.serial.AdaptiveSerializer.serialize(AdaptiveSerializer.java:66)
at org.basex.server.ServerQuery.execute(ServerQuery.java:138)
at org.basex.api.client.LocalQuery.cache(LocalQuery.java:48)
at org.basex.api.client.Query.more(Query.java:75)
at basextest.OutOfMemoryWithQueryMore.main(OutOfMemoryWithQueryMore.java:42)
OutOfMemoryWithQueryMore.java