FYI we have locally applied the patch mentioned below and this has fixed the problem.

Many thanks
Holger

On Sep 21, 2011, at 1:54 AM, Stephen Allen wrote:

> Hi Holger,
>
> I believe you are correct that Query objects with aggregators cannot be
> reused by different threads. They *can* be reused by the same thread or by
> different threads that synchronize the compile step, but even then there is
> a problem with the Query object hanging onto references to a new aggregator
> for each query execution.
>
> The thing causing this appears to be in AlgebraGenerator.java line 562,
> where the aggregators added to a Query object are referenced directly by the
> compiled query plan. Instead, we should make a copy of the aggregators so
> that the original Query object remains immutable.
>
> I've created a JIRA issue and submitted a patch, JENA-120:
> https://issues.apache.org/jira/browse/JENA-120
>
> As a work-around until the patch is applied, I think you can synchronize
> around the QueryExecutionFactory.create() method. Or, you can decide not to
> cache Group By queries (test for this with Query.hasGroupBy()).
>
> I don't know if there are other issues that may prevent reusing Query
> objects, maybe Andy can chime in here.
>
> -Stephen
>
> P.S. Your strategy of caching Query objects does avoid having to reparse
> the query string, which can be quite beneficial. Along these same lines, a
> better enhancement to ARQ would be a mechanism to cache the query plans
> after the optimizer step. Query optimization itself can get quite expensive
> (n! for left-deep trees, and even worse for bushy trees).
>
>> -----Original Message-----
>> From: Holger Knublauch [mailto:[email protected]]
>> Sent: Tuesday, September 20, 2011 1:14 AM
>> To: [email protected]
>> Subject: Aggregators and concurrent use of Query object
>>
>> Hi Andy,
>>
>> we have (unreliably) run into exceptions like the one below, and my
>> suspicion is that the ARQ Query class is not meant to be re-used by
>> multiple threads. Although each step in the Query is converted into a
>> corresponding Algebra objects for execution, the Aggregators seem to be
>> shared between multiple objects. Is this correct and do I need to
>> create a new Query each time I want a QueryExecution? This would slow
>> down things quite a lot, as we currently cache all Queries that were
>> created from string representation. If this is the case, are there any
>> ways to tell which particular queries are not thread-safe, e.g. all
>> queries involving aggregations?
>>
>> If I am totally off the mark, do you know what else could cause the
>> exception below, only sometimes in multi-threading conditions?
>>
>> Thank you,
>> Holger
>>
>>
>> com.hp.hpl.jena.sparql.ARQInternalErrorException: Null for accumulator
>>     at com.hp.hpl.jena.sparql.expr.aggregate.AggregatorBase.getValue(AggregatorBase.java:61)
>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterGroup.calc(QueryIterGroup.java:121)
>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterGroup.<init>(QueryIterGroup.java:32)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:413)
>>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:255)
>>     at com.hp.hpl.jena.sparql.algebra.op.OpGroup.visit(OpGroup.java:37)
>>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:33)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.executeOp(OpExecutor.java:107)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:441)
>>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:241)
>>     at com.hp.hpl.jena.sparql.algebra.op.OpExtend.visit(OpExtend.java:107)
>>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:33)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.executeOp(OpExecutor.java:107)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:393)
>>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:213)
>>     at com.hp.hpl.jena.sparql.algebra.op.OpProject.visit(OpProject.java:34)
>>     at com.hp.hpl.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:33)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.executeOp(OpExecutor.java:107)
>>     at com.hp.hpl.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:80)
>>     at com.hp.hpl.jena.sparql.engine.main.QC.execute(QC.java:40)
>>     at com.hp.hpl.jena.sparql.engine.main.QueryEngineMain.eval(QueryEngineMain.java:52)
>>     at com.hp.hpl.jena.sparql.engine.QueryEngineBase.evaluate(QueryEngineBase.java:138)
>>     at com.hp.hpl.jena.sparql.engine.QueryEngineBase.createPlan(QueryEngineBase.java:109)
>>     at com.hp.hpl.jena.sparql.engine.QueryEngineBase.getPlan(QueryEngineBase.java:97)
>>     at com.hp.hpl.jena.sparql.engine.main.QueryEngineMain$1.create(QueryEngineMain.java:91)
>>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.getPlan(QueryExecutionBase.java:266)
>>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.startQueryIterator(QueryExecutionBase.java:243)
>>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execResultSet(QueryExecutionBase.java:248)
>>     at com.hp.hpl.jena.sparql.engine.QueryExecutionBase.execSelect(QueryExecutionBase.java:94)
>>     at org.topbraid.spin.arq.SPINARQFunction.executeBody(SPINARQFunction.java:121)
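
For anyone who cannot apply the JENA-120 patch, the second work-around from Stephen's reply (do not cache Group By queries) might look roughly like the sketch below. It is only an illustration, not code from ARQ or from the patch: the class name SelectiveQueryCache and its methods are made up here, and the cache is written generically so it compiles without Jena on the classpath. With ARQ one would presumably pass QueryFactory.create as the parser and a predicate such as q -> !q.hasGroupBy() as the "safe to share" test; whether hasGroupBy() catches every query that uses aggregators is an assumption that has not been verified here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.Predicate;

/**
 * Hypothetical helper (not part of ARQ): caches parsed queries by their
 * string form, but only when the predicate marks them as safe to share
 * between threads. Unsafe queries are re-parsed on every call.
 */
final class SelectiveQueryCache<Q> {
    private final ConcurrentMap<String, Q> cache = new ConcurrentHashMap<>();
    private final Function<String, Q> parser;
    private final Predicate<Q> safeToShare;

    SelectiveQueryCache(Function<String, Q> parser, Predicate<Q> safeToShare) {
        this.parser = parser;
        this.safeToShare = safeToShare;
    }

    /** Returns a shared instance for safe queries, a fresh one otherwise. */
    Q get(String queryString) {
        Q cached = cache.get(queryString);
        if (cached != null) {
            return cached;
        }
        Q parsed = parser.apply(queryString);
        if (!safeToShare.test(parsed)) {
            // Assumed unsafe (e.g. a Group By query): never stored, so
            // every caller gets its own private object.
            return parsed;
        }
        // Two threads may parse the same string at once; keep whichever
        // instance was stored first so all callers share one object.
        Q previous = cache.putIfAbsent(queryString, parsed);
        return previous != null ? previous : parsed;
    }

    /** Number of cached (shareable) queries. */
    int size() {
        return cache.size();
    }
}
```

The trade-off is that Group By queries pay the parsing cost on every execution, while all other queries keep the benefit of the cache. The first work-around (synchronizing the compile step) would keep the cache for all queries at the price of serializing that step across threads.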
