Hello,

I am facing an issue while retrieving a large number of XML documents from
a BaseX collection.

Each document (as an XML file) is around 10 KB, and in the problematic case
I must retrieve around 70000 of them.

I am using Session#query(String query) then Query#more() and Query#next()
to iterate through the result of my query.



    try (final Query query = l_Session.query("query")) {
        while (query.more()) {
            String xml = query.next();
        }
    }

If the result of my query contains more than a certain number of XML
documents, I get an OutOfMemoryError (full stack trace in the attached file)
when executing query.more().



I tested with BaseX 8.6.6 and 8.6.7 on Java 8, with the VM argument -Xmx1024m.
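As a rough sanity check on these numbers, here is a small back-of-the-envelope sketch. The stack trace shows the error being raised in Arrays.copyOf, called from ArrayOutput.write, which suggests the serialized result is accumulated in a growing byte array. The initial capacity (1024 bytes) and the growth factor (2.0) below are my own assumptions for illustration, not values taken from the BaseX source, and the class and method names are hypothetical.

```java
public class HeapEstimate {

    /** Approximate total payload in bytes for docCount documents of docSizeBytes each. */
    static long payloadBytes(long docCount, long docSizeBytes) {
        return docCount * docSizeBytes;
    }

    /**
     * Peak number of bytes held while a byte buffer grows until it can hold
     * `payload` bytes. During each copy the old and the new array are both
     * alive, so the peak is the largest (old + new) pair seen.
     * initialCapacity and growthFactor are assumed values, not BaseX internals.
     */
    static long peakBytesWhileGrowing(long payload, long initialCapacity, double growthFactor) {
        long capacity = initialCapacity;
        long peak = capacity;
        while (capacity < payload) {
            // Always grow by at least one byte so the loop terminates.
            long next = Math.max(capacity + 1, (long) (capacity * growthFactor));
            peak = Math.max(peak, capacity + next);
            capacity = next;
        }
        return peak;
    }

    public static void main(String[] args) {
        long payload = payloadBytes(70_000, 10 * 1024); // ~70000 documents of ~10 KB
        long heap = 1024L * 1024 * 1024;                // -Xmx1024m
        long peak = peakBytesWhileGrowing(payload, 1024, 2.0);

        System.out.println("payload bytes: " + payload);
        System.out.println("peak bytes:    " + peak);
        System.out.println("exceeds heap:  " + (peak > heap));
    }
}
```

Under these assumptions the raw payload alone is about 700 MB, and the last buffer copy would need more than the 1024 MB heap, which would be consistent with the error appearing inside Arrays.copyOf.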



Increasing the Xmx value is not a solution, as I don't know the maximum
amount of data I will have to retrieve in the future. What I need is a
reliable way of executing such queries and iterating through the result
without exhausting the heap.

I also tried using QueryProcessor and QueryProcessor#iter() instead of
Session#query(String query). But is it safe to use, given that my
application is multithreaded and that each thread has its own session to
query or add elements from/to multiple collections?

Moreover, for now all access to BaseX is done through a session, so my
application can run with an embedded BaseX or with a BaseX server. If I
start using QueryProcessor, it will work with embedded BaseX only, right?



I also attached a simple example showing the problem.



Any advice would be much appreciated.



Thanks

Simon
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3236)
        at org.basex.io.out.ArrayOutput.write(ArrayOutput.java:25)
        at org.basex.io.out.ServerOutput.write(ServerOutput.java:31)
        at java.io.OutputStream.write(OutputStream.java:116)
        at org.basex.io.out.BufferOutput.flush(BufferOutput.java:60)
        at org.basex.io.out.BufferOutput.write(BufferOutput.java:54)
        at org.basex.io.out.PrintOutput.write(PrintOutput.java:66)
        at org.basex.io.out.PrintOutput.print(PrintOutput.java:76)
        at org.basex.io.out.NewlineOutput.print(NewlineOutput.java:33)
        at org.basex.io.out.PrintOutput.print(PrintOutput.java:99)
        at org.basex.io.serial.MarkupSerializer.finishClose(MarkupSerializer.java:226)
        at org.basex.io.serial.Serializer.closeElement(Serializer.java:189)
        at org.basex.io.serial.Serializer.node(Serializer.java:364)
        at org.basex.io.serial.Serializer.node(Serializer.java:158)
        at org.basex.io.serial.StandardSerializer.node(StandardSerializer.java:105)
        at org.basex.io.serial.AdaptiveSerializer.node(AdaptiveSerializer.java:75)
        at org.basex.io.serial.Serializer.serialize(Serializer.java:109)
        at org.basex.io.serial.AdaptiveSerializer.serialize(AdaptiveSerializer.java:66)
        at org.basex.server.ServerQuery.execute(ServerQuery.java:138)
        at org.basex.api.client.LocalQuery.cache(LocalQuery.java:48)
        at org.basex.api.client.Query.more(Query.java:75)
        at basextest.OutOfMemoryWithQueryMore.main(OutOfMemoryWithQueryMore.java:42)

Attachment: OutOfMemoryWithQueryMore.java