Dear Rob, Thanks a lot for the reply again.
But I am curious does Jena implement Hash joins -- which are essentially blocking in nature. If Jena does not, then how is a join between two unsorted lists (of intermediate results) done in Jena? On Tue, Mar 10, 2015 at 4:34 PM, Rob Vesse <rve...@dotnetrdf.org> wrote: > Yes that is pretty much what happens > > Note though that most query evaluation in ARQ is done in a streaming > fashion so the full set of solutions is typically never held in memory for > any query unless an operator which requires full /partial materialisation > e.g. DISTINCT is encountered > > Rob > > On 10/03/2015 10:24, "Rose Beck" <rosebeck...@gmail.com> wrote: > >>Dear Rob, >> >>First and foremost thank you for such a wonderful explanation. >> >>Just to clarify, say my example query is: >> >>select ?a?b?c where{?a <pred1> ?c. ?a <pred2> ?b} order by ?c limit 10 >> >>Then for the query above all the solutions are generated just as it >>would be the case for the SPARQL query: select ?a?b?c where{?a <pred1> >>?c. ?a <pred2> ?b}. But within the priority queue (which is the last >>operation which is applied before results are output to the users) at >>any point just 10 solutions ordered by ?c are placed. Please correct >>me if I am wrong. >> >>Cheers, >>Rose >> >> >>On Tue, Mar 10, 2015 at 3:38 PM, Rob Vesse <rve...@dotnetrdf.org> wrote: >>> Query execution in ARQ is based on nested iterators so QueryIterTopN >>>will >>> always apply over another iterator >>> >>> A PriorityQueue is used internally as temporary storage within >>> QueryIterTopN while it exhausts the inner iterator allowing it to only >>>use >>> at most the limit amount of storage in the priority queue plus whatever >>> temporary storage the inner iterator(s) may need. >>> >>> There is still a "total sort" in the sense that every possible solution >>> has to be compared to see if it should be placed into the priority queue >>> however there is not a "total sort" in the sense of needing to >>>materialise >>> all possible solutions into memory and then sort. >>> >>> Rob >>> >>> p.s. Please don't post identical questions to both users@ and dev@ - one >>> list is sufficient as the developers are on both lists. As a general >>>rule >>> general support questions should go to users@ and technical/architecture >>> questions like this should go to dev@ >>> >>> >>> >>> On 10/03/2015 05:54, "Rose Beck" <rosebeck...@gmail.com> wrote: >>> >>>>Hi, >>>> >>>>I saw the following issue posted on Jena website (which has been >>>>recently resolved): >>>>Avoid a total sort for ORDER BY + LIMIT queries >>>>(https://issues.apache.org/jira/browse/JENA-89). >>>> >>>>I am very interested in understanding as to how does Jena-ARQ avoids >>>>total sort for ORDER BY + LIMIT queries. In the post it is mentioned >>>>that Jena-ARQ uses priority queue for avoiding a final sort, however >>>>it is also mentioned that "ARQ's algebra package contains already a >>>>OpTopN [3] operator. The OpExecutor [4] will need to use a new >>>>QueryIterTopN instead of QueryIterSort + QueryIterSlice." It is not >>>>clear now does the priority queue benefit from OpTopN operator and >>>>QueryIterTopN as the links [3] and [4] mentioned on the website does >>>>not work, so I am not able to understand their operation and as to how >>>>do they help in avoiding a total sort. >>>> >>>>Can someone please explain how does Jena-ARQ execute the queries >>>>containing ORDER BY + LIMIT clause. >>>> >>>>With Warm Regards, >>>>Rose >>> >>> >>> >>> >> >> >> >>-- >>With Warm Regards, >>Rose > > > > -- With Warm Regards, Rose