Re: Achieving reasonably performing federated queries

2013-07-26 Thread Sarven Capadisli
On 07/26/2013 12:51 PM, Andy Seaborne wrote: Not silly - with federated query there isn't enough information available for an optimizer to make sensible choices, and, worse, it's an area where ordering of operations is much more noticeable (the operations are inherently more expensive). In pract

Re: Achieving reasonably performing federated queries

2013-07-26 Thread Andy Seaborne
On 26/07/13 09:07, Olivier Rossel wrote: Usually the author knows the data distribution and can hint the query planner with some info. Maybe such hints could be keywords ahead of the SERVICE keyword. Example of chaining queries: SELECT... WHERE { SERVICE {} THEN SERVICE { ... } } Example o

Re: Achieving reasonably performing federated queries

2013-07-26 Thread Olivier Rossel
Usually the author knows the data distribution and can hint the query planner with some info. Maybe such hints could be keywords ahead of the SERVICE keyword. Example of chaining queries: SELECT... WHERE { SERVICE {} THEN SERVICE { ... } } Example of parallel queries: SELECT... WHERE { SERVI
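To make the chaining versus parallel distinction concrete, here is a minimal Python sketch. It is not SPARQL and not how ARQ evaluates SERVICE; the two in-memory lists stand in for remote endpoints, and every name in it (ENDPOINT_A, ENDPOINT_B, query_a, query_b, chained, parallel) is hypothetical. Chaining sends the second request only after the first has returned, so the first result can restrict the second; parallel evaluation fetches both sides at once and joins locally.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for two remote endpoints (assumption: a real SERVICE call goes over HTTP).
ENDPOINT_A = [{"s": "a", "x": 1}, {"s": "b", "x": 2}]
ENDPOINT_B = [{"s": "a", "y": 10}, {"s": "c", "y": 12}]


def query_a():
    """Fetch every row from the first endpoint."""
    return list(ENDPOINT_A)


def query_b(only_s=None):
    """Fetch rows from the second endpoint, optionally restricted to given values of s."""
    return [r for r in ENDPOINT_B if only_s is None or r["s"] in only_s]


def join(left, right):
    """Join two result sets on the shared variable s."""
    return [{**l, **r} for l in left for r in right if l["s"] == r["s"]]


def chained():
    """Run A first, then ask B only about the values of s that A returned."""
    left = query_a()
    right = query_b(only_s={r["s"] for r in left})
    return join(left, right)


def parallel():
    """Run A and B at the same time, then join the complete results locally."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(query_a)
        future_b = pool.submit(query_b)
        return join(future_a.result(), future_b.result())
```

Both functions return the same rows. They differ in how much data the second endpoint has to ship and in how long the caller waits, which is the kind of knowledge the query author would be supplying through a hint.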

Re: Achieving reasonably performing federated queries

2013-07-25 Thread Andy Seaborne
It may help if ARQ did a hash join in this case - getting the data from the two SERVICEs could even be done in parallel (except that in turn may be unacceptable). The advantage of the current approach is that it does not run out of memory - it does not consume temporary RAM in proportion to th
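The trade-off described above can be sketched in a few lines of Python. This is an illustration only, not ARQ code: rows are plain dicts, and the names nested_loop_join, hash_join and fetch_right are hypothetical. The first function contacts the right-hand side once per left row and keeps nothing in memory beyond the current row; the second reads each side once but holds a hash table of one entire input.

```python
from collections import defaultdict


def nested_loop_join(left, fetch_right):
    """For each left row, ask the right side for matching rows.

    Memory use stays flat, but the right side is contacted once per left row.
    Returns the joined rows and the number of calls made to fetch_right.
    """
    out = []
    calls = 0
    for row in left:
        calls += 1
        for match in fetch_right(row):
            out.append({**row, **match})
    return out, calls


def hash_join(left, right, key):
    """Build a hash table over the left rows, then probe it with each right row.

    Each side is read once, but the table grows with the size of the left input.
    """
    table = defaultdict(list)
    for row in left:
        table[row[key]].append(row)
    # table.get avoids inserting empty entries for keys that have no match.
    return [{**l, **r} for r in right for l in table.get(r[key], [])]
```

With two remote SERVICE blocks, every call to fetch_right would be a network round trip, which is why the per-row approach becomes expensive even though it is safe on memory.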

Re: Achieving reasonably performing federated queries

2013-07-25 Thread Sarven Capadisli
On 07/25/2013 11:49 AM, Rob Vesse wrote: Can you provide examples of the query plan (the algebra) with the optimizer on and off? The issue is likely down to ARQ's index join linearization optimization; this works great for local data and small federated queries but can work poorly for large feder

Re: Achieving reasonably performing federated queries

2013-07-25 Thread Rob Vesse
Yes, you should be able to add the following: --set arq:optIndexJoinStrategy=false I'm not 100% sure that the short form will work; you may need to use the fully expanded form: --set http://jena.hpl.hp.com/ARQ#optIndexJoinStrategy=false However, as noted in my email, this is new in 2.10.2-SNAPSHOT

Re: Achieving reasonably performing federated queries

2013-07-25 Thread Diogo FC Patrao
Hello The better plan for the query you posted would be (1), simply because of the cost of accessing a remote service. But if the first SERVICEd query returns just a few rows, it may be better to run the same query a couple of times, as in (2), than to fetch all results.

Re: Achieving reasonably performing federated queries

2013-07-25 Thread Sarven Capadisli
On 07/23/2013 04:57 PM, Diogo FC Patrao wrote: Hello, I observed the same behaviour as you did and have some thoughts that follow from it. I haven't checked the Jena sources, so I may be wrong here. As stated before on this list, ARQ doesn't have any federation-specific optimization,

Re: Achieving reasonably performing federated queries

2013-07-25 Thread Rob Vesse
Can you provide examples of the query plan (the algebra) with the optimizer on and off? The issue is likely down to ARQ's index join linearization optimization; this works great for local data and small federated queries but can work poorly for large federated queries. If this is the case then the

Re: Achieving reasonably performing federated queries

2013-07-24 Thread Claude Warren
I did something like this a year ago (I should probably write it up). In our case we had what we called a "roadmap" that could identify properties across various SPARQL endpoints that were logically the same (e.g. foo:molecularWeight, bar:molecular_weight and baz:atomic_weight might all be the same). We

Re: Achieving reasonably performing federated queries

2013-07-23 Thread Diogo FC Patrao
Hello, I observed the same behaviour as you did and have some thoughts that follow from it. I haven't checked the Jena sources, so I may be wrong here. As stated before on this list, ARQ doesn't have any federation-specific optimization, so it behaves as if the cost of accessing local a

Re: Achieving reasonably performing federated queries

2013-07-23 Thread Olivier Rossel
Same questions here. So I +1 this question immensely! On Tue, Jul 23, 2013 at 11:48 AM, Sarven Capadisli wrote: Hi all, This is partly a summary of my recent experiences with federated queries and partly a request for your feedback on making /reasonably/ performing federated quer

Achieving reasonably performing federated queries

2013-07-23 Thread Sarven Capadisli
Hi all, This is partly a summary of my recent experiences with federated queries and partly a request for your feedback on making /reasonably/ performing federated queries. The query in question is here [1]. Essentially there are two endpoints (which may or may not be the same), and they ret