Mark - here is another way.

This query:

SELECT ?score ?ent
WHERE {
   { ?ent spatial:nearby ( .... ) }
   { ?ent text:query ( ..... ) }
   # No ?ent rdf:type iotic:Entity .
   # This focuses the query on the presenting issue.
}

and then run Fuseki with the following flags:

  --set arq:optIndexJoinStrategy=false --set arq:optMergeBGPs=false

for however you are running the server.

You need both --set arguments.

The service script will not do this very easily. If the environment variable FUSEKI_ARGS is set, it might pass the flags through, but that is untested.

It is easier to run the server standalone:

(Linux, Mac)

The "fuseki-server" script should pass these in:

fuseki-server \
  --set arq:optIndexJoinStrategy=false --set arq:optMergeBGPs=false \
  .. other args ..

(Windows or any platform)

You can call the server Java code directly, all on one line:


java -Xmx1200M -jar fuseki-server.jar --set arq:optIndexJoinStrategy=false --set arq:optMergeBGPs=false .. other args ..

You'll need to give the full path name of fuseki-server.jar.
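
To make it harder to drop one of the two settings, the command line can be built in a small wrapper. This is only a sketch, untested against a real install: FUSEKI_JAR, OPT_FLAGS and fuseki_cmd are names made up for illustration, and the jar path is a placeholder you must replace. It only prints the command (a dry run) so you can check it before launching.

```shell
#!/bin/sh
# Hypothetical location of the all-in-one jar; replace with your real full path.
FUSEKI_JAR="${FUSEKI_JAR:-/path/to/fuseki-server.jar}"

# Both settings are needed, so keep them together in one place.
OPT_FLAGS="--set arq:optIndexJoinStrategy=false --set arq:optMergeBGPs=false"

# Print the full java command line, followed by whatever other
# server arguments were given to this script.
fuseki_cmd() {
    echo java -Xmx1200M -jar "$FUSEKI_JAR" $OPT_FLAGS "$@"
}

# Dry run: show the command rather than starting the server.
fuseki_cmd "$@"
```

Once the printed line looks right, run it by hand, or change "echo" to "exec" in fuseki_cmd. The echo form does not preserve quoting, so treat its output as a check rather than something to paste if your paths contain spaces.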

Sorry - I don't have your setup to test this fully. I did make sure that the reworked query does lead to an execution plan that is different and should yield some information about the situation.

    Andy

On 22/12/15 09:50, Andy Seaborne wrote:
On 22/12/15 07:06, Mark Wharton wrote:
Ah, wheels within wheels.

The formulation with the filter in it is fine, except that if you want
to search for more than one word or you match in label and comment then
the UNION formulation returns you duplicate rows.  This isn't a problem
with the Lucene search which is why (I now remember) I used it in the
first place.

I'm not sure what version of jena I'm using - I just use the fuseki
release at 2.3.0.  Is there a way to find out?

3.0.0

Many of the Java commands support --version and the fuseki-server jar
is an all-in-one jar:

java -cp <YourInstall>/fuseki-server.jar arq.sparql --version

What's the status on the JENA-999 and JENA-1093 issues?  I see there's
been some activity on 999 in the last few days. Andy Seaborne's last
comment seems encouraging.

I don't want to adopt a single version as I'll be stuck forever patching
back and forward and it will break eventually.

Many thanks for your continued help.

JENA-999 may sort of help but I'm not that positive because each ?ent
from the first part will be different going into the second part.  It
looks to me as if it is the overhead of going out to Lucene. (This is
Lucene right? not Solr?)

The ideal is some super compilation of the text:query and spatial query
into one big Lucene query.

What would also be good is to stop the general optimizer (this is
nothing to do with TDB) using an index join.  Except that is the better
choice for the rdf:type.  This is what the additional {} were trying for,
except the optimizer outsmarted them:

SELECT ?score ?ent
WHERE {
  ?ent spatial:nearby( ...) .
  (?ent ?score) text:query (...) .
  ?ent rdf:type iotic:Entity .
}



Mark - can you run the query from Java?  If so, add
"Optimize.noOptimizer();" before executing the query.  I can't see
a way to do that by setting the environment for Fuseki.

Or (the effect on time of this is version specific and whether it does
anything useful is a big "maybe") you could try this:

SELECT ?score ?ent
WHERE {
  { OPTIONAL { ?ent spatial:nearby "ABC" . }}
  { OPTIONAL { ?ent  text:query "DEF" } }
}

     Andy


