Re: Different parse times for similar result sets

Andy Seaborne Thu, 28 Feb 2013 09:52:22 -0800

On 28/02/13 17:22, Stephen Allen wrote:

The results you are seeing indicate that this is probably 4store
executing the query slowly, and not anything to do with the Jena
client.  You could even take Jena out of the mix and test getting the
results directly from the endpoint:


    time curl --data-binary "@query1.txt" \
         -H "Content-Type: application/sparql-query" \
         "http://localhost:3030/ds/query" > /dev/null

Unfortunately, databases are notorious for handling IN clauses poorly
(even many SQL databases).  If 4store supports all of SPARQL 1.1, then
you can try changing the IN clause to a VALUES clause [1] and see if
that helps.

-Stephen

[1] http://www.w3.org/TR/sparql11-query/#inline-data


or even writing

FILTER(?x = <uri1> || ?x = <uri2> || ... )

which is logically the same but might (just might) trigger the optimizer to do something.


But I'm guessing that Stephen's suggested test will show it's how 4Store executes the query.
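For anyone wanting to try both rewrites without hand-editing a long IN list, here is a minimal sketch that generates the VALUES form and the `||` form as SPARQL text. The class and method names are hypothetical (they are not part of Jena or 4store), the URIs are placeholders, and the list is assumed to be non-empty and to contain well-formed URIs that need no escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Jena or 4store: builds the two
// alternative forms of an IN test as SPARQL text.
// Assumes a non-empty list of URIs that need no escaping.
final class InClauseRewrite {

    // Produces: VALUES ?x { <uri1> <uri2> }   (SPARQL 1.1 inline data)
    static String asValues(String var, List<String> uris) {
        String body = uris.stream()
                .map(u -> "<" + u + ">")
                .collect(Collectors.joining(" "));
        return "VALUES ?" + var + " { " + body + " }";
    }

    // Produces: FILTER(?x = <uri1> || ?x = <uri2>)
    static String asDisjunction(String var, List<String> uris) {
        String body = uris.stream()
                .map(u -> "?" + var + " = <" + u + ">")
                .collect(Collectors.joining(" || "));
        return "FILTER(" + body + ")";
    }

    public static void main(String[] args) {
        // Placeholder URIs for illustration only.
        List<String> uris = List.of("http://example.org/a", "http://example.org/b");
        System.out.println(asValues("x", uris));
        System.out.println(asDisjunction("x", uris));
    }
}
```

Either string would replace the original `FILTER(?x IN (...))` in the query text. Whether either form is actually faster depends on the store, so it is worth timing each one with the curl command rather than assuming.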




On Thu, Feb 28, 2013 at 10:30 AM, Burak Yönyül <[email protected]> wrote:

Hi,

When I reduce the FILTER block, the execution time of the query gets shorter,
but I receive fewer results than with the original query. So the result set
shrinks too.


Sounds like it's probing to see if the variable has one of the values.


I recorded the elapsed time for each pass round the while loop, and some
iterations take noticeably longer than others. The code that records the times:

    int i = 0;
    long before = System.currentTimeMillis();
    while (resultSet.hasNext()) {
        i++;
        resultSet.next();
        long after = System.currentTimeMillis();
        fileWriter.append("Time of " + i + ". result: " + (after - before) + " ms\n");
        before = System.currentTimeMillis();
    }

The example output:

Time of 1. result: 4 ms
Time of 2. result: 0 ms
Time of 3. result: 1 ms
...
Time of 20. result: 14 ms
Time of 21. result: 0 ms
Time of 22. result: 1 ms
Time of 23. result: 1 ms
...
Time of 27. result: 17 ms
Time of 28. result: 1 ms
...
Time of 34. result: 10 ms
... and so on.

So the server is sending rows back in bursts. That is not Java GC at 10-20 rows, or the cost of sending the query. It's 4Store.
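To make the bursts easier to spot than in a long log, a variant of the timing loop can collect only the rows that were slow to arrive. This is a minimal sketch: `RowGapTimer` and `slowRows` are hypothetical names, and it works on any `Iterator` (a Jena `ResultSet` is an `Iterator<QuerySolution>`). It uses `System.nanoTime()` because that clock is intended for measuring elapsed time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: measures the gap before each row of an Iterator
// and reports which rows were slow to arrive.
final class RowGapTimer {

    // Returns the 1-based indexes of rows whose hasNext() + next()
    // took at least thresholdMillis.
    static List<Integer> slowRows(Iterator<?> rows, long thresholdMillis) {
        List<Integer> slow = new ArrayList<>();
        int i = 0;
        long before = System.nanoTime();
        while (rows.hasNext()) {
            rows.next();
            i++;
            long elapsedMillis = (System.nanoTime() - before) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                slow.add(i);
            }
            // Restart the clock so each gap covers exactly one row.
            before = System.nanoTime();
        }
        return slow;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a result set; no row here should be slow.
        Iterator<String> rows = List.of("a", "b", "c").iterator();
        System.out.println("Slow rows: " + slowRows(rows, 10));
    }
}
```

If the slow rows fall at fairly regular intervals for the FILTER query and never for the LIMIT query, that points at how the server produces rows rather than at the client.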


But when I execute the LIMIT query, these times are all 0 or 1 ms.

I don't know why, in the FILTER query, there are time differences when
getting some results. Do you have any idea about that?

It really does look like the cost of the FILTER having to get the lexical form of the URI to do the comparison on a high number of items. Maybe also probing to see if it is a value, not getting all the choices once per query and testing.

(ARQ+TDB can go mad on these as well - it's a tricky thing to optimize in all situations.)
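The difference between probing per row and collecting the choices once per query can be sketched in plain Java. This is an illustration only and not 4store's or ARQ's actual code: the class and method names are hypothetical, and rows and choices are modelled as plain strings standing in for lexical forms of URIs.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: not 4store's or ARQ's actual code.
final class FilterStrategies {

    // Per-row probing: each row is compared against every choice in turn,
    // so the work grows with rows * choices.
    static long countByProbing(List<String> rows, List<String> choices) {
        long matches = 0;
        for (String row : rows) {
            for (String choice : choices) {
                if (row.equals(choice)) {
                    matches++;
                    break;
                }
            }
        }
        return matches;
    }

    // Collect the choices once per query, then do one hash lookup per row.
    static long countBySet(List<String> rows, List<String> choices) {
        Set<String> wanted = new HashSet<>(choices);
        long matches = 0;
        for (String row : rows) {
            if (wanted.contains(row)) {
                matches++;
            }
        }
        return matches;
    }
}
```

Both methods return the same count. Only the amount of comparison work per row differs, which is the kind of cost that would show up as the FILTER list grows.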


        Andy


Best,
Burak Yönyül
