Hello,

we're seeing an apparent deadlock (followed by a "GC overhead limit
exceeded" error) on one machine when we start processing a collection
of over 800,000 records. Profiling it with YourKit yields the
following:

application-akka.actor.default-dispatcher-110 <--- Frozen for at least 3m 7s
org.basex.server.Query.cache(InputStream)
org.basex.server.ClientQuery.cache()
org.basex.server.Query.more()
eu.delving.basex.client.Implicits$RichClientQuery.hasNext()
scala.collection.Iterator$$anon$19.hasNext()
scala.collection.Iterator$$anon$29.hasNext()
scala.collection.Iterator$class.foreach(Iterator, Function1)
scala.collection.Iterator$$anon$29.foreach(Function1)
core.processing.CollectionProcessor$$anonfun$process$2.apply(ClientSession)
core.processing.CollectionProcessor$$anonfun$process$2.apply(Object)
core.storage.BaseXStorage$$anonfun$withSession$1.apply(ClientSession)
core.storage.BaseXStorage$$anonfun$withSession$1.apply(Object)
eu.delving.basex.client.BaseX$$anonfun$withSession$1.apply(ClientSession)
eu.delving.basex.client.BaseX$$anonfun$withSession$1.apply(Object)
eu.delving.basex.client.BaseX.withSession(Function1)
eu.delving.basex.client.BaseX.withSession(String, Function1)
core.storage.BaseXStorage$.withSession(Collection, Function1)
core.processing.CollectionProcessor.process(Function0, Function1,
Function1, Function3)
core.processing.DataSetCollectionProcessor$.process(DataSet)
actors.Processor$$anonfun$receive$1.apply(Object)<2 recursive calls>
akka.actor.Actor$class.apply(Actor, Object)
actors.Processor.apply(Object)
akka.actor.ActorCell.invoke(Envelope)
akka.dispatch.Mailbox.processMailbox(int, long)
akka.dispatch.Mailbox.run()
akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec()
akka.jsr166y.ForkJoinTask.doExec()
akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinTask)
akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool$WorkQueue)
akka.jsr166y.ForkJoinWorkerThread.run()



In the server logs, I can observe the following:

09:10:45.129    [192.168.1.214:47530]:
dimcon____geheugen-van-nederland        QUERY(3)        for $i in 
/record[@version =
0] order by $i/system/index return $i   OK      0.06 ms
09:10:45.129    [192.168.1.214:47530]:
dimcon____geheugen-van-nederland        QUERY(3)        OK      0.03 ms
09:13:23.155    [192.168.1.214:47530]:
dimcon____geheugen-van-nederland        ITER(3) Error: Connection reset
09:13:23.155    [192.168.1.214:47530]:
dimcon____geheugen-van-nederland        LOGOUT admin    OK



I looked at the code, and it appears that the whole query result is
cached in memory upon retrieval. Since the database is over 1.2 GB in
size, our client server has a hard time (it only has 1.5 GB of Xmx).
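For reference, the effect described above amounts to reading the entire result stream into memory before iteration begins. Here is a minimal sketch of that pattern (an illustration only, not the actual BaseX implementation):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}

// Illustration only: buffering an entire input stream in memory before
// handing it to the iterator, as opposed to consuming it incrementally.
// With a result set approaching 1.2 GB and only 1.5 GB of heap, the
// buffered variant exhausts memory.
object CachingSketch {
  // Reads the whole stream into a byte array up front (the "cache" pattern).
  def cacheAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](8192)
    var n = in.read(buf)
    while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val data = "record1\nrecord2\n".getBytes("UTF-8")
    val cached = cacheAll(new ByteArrayInputStream(data))
    println(cached.length) // the full payload is resident before iteration
  }
}
```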

Is there any preferred way of dealing with this?

What I am going to do for the moment, I think, is to override or
intercept the creation of the ClientQuery and provide an implementation
with a different caching strategy. Another approach might be to limit
the query output and implement some custom iteration behavior, but if
this can be handled directly at the query level, I think it would make
things easier.
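For the second approach, one way to limit output at the query level is to page through the results with XQuery's subsequence() and issue one query per chunk, so the client never caches more than one chunk at a time. A hedged sketch (the object, helper name, and chunk size are my own for illustration, not an existing API):

```scala
// Sketch: paging through a large result set with subsequence(). The
// helper builds one XQuery string per page over the query from the
// server log above; names here are illustrative, not an existing API.
object PagedQuery {
  // Builds the XQuery for one page: items [offset+1, offset+chunkSize].
  def pageQuery(offset: Int, chunkSize: Int): String =
    s"subsequence(for $$i in /record[@version = 0] " +
      s"order by $$i/system/index return $$i, ${offset + 1}, $chunkSize)"

  def main(args: Array[String]): Unit = {
    // First page of 1000 records starts at position 1.
    println(pageQuery(0, 1000))
  }
}
```

Each chunk would then be fetched with its own session.query(...) call, iterated, and closed, advancing the offset by the chunk size until a chunk comes back empty.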

Manuel
_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk