[jira] [Created] (JENA-119) Eliminate memory bounds during query execution

Stephen Allen (JIRA) Mon, 19 Sep 2011 16:23:33 -0700

Eliminate memory bounds during query execution
----------------------------------------------


                 Key: JENA-119
                 URL: https://issues.apache.org/jira/browse/JENA-119
             Project: Jena
          Issue Type: New Feature
          Components: ARQ
            Reporter: Stephen Allen


It would be nice to eliminate all memory bounds on queries.  Similar to 
JENA-44, it would involve modifying all of the QueryIterator objects that 
maintain unbounded collections of Bindings.
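
To make the idea concrete, here is a minimal sketch of a collection that keeps rows in memory up to a threshold and then spills them to a temporary file. The class name SpillingBag, its threshold parameter, and the use of plain strings in place of Bindings are all assumptions for illustration; this is not the ARQ DataBag API, only the general shape of it.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical spill-to-disk bag; rows are plain strings standing in for Bindings. */
public class SpillingBag implements AutoCloseable {
    private final int threshold;                    // max rows held in memory at once
    private final List<String> memory = new ArrayList<>();
    private Path spillFile;                         // created lazily on the first spill
    private BufferedWriter spillWriter;

    public SpillingBag(int threshold) {
        this.threshold = threshold;
    }

    /** Adds a row. Rows must not contain line breaks in this sketch. */
    public void add(String row) {
        memory.add(row);
        if (memory.size() >= threshold) {
            spill();
        }
    }

    public boolean hasSpilled() {
        return spillFile != null;
    }

    /** Appends the in-memory rows to the spill file and clears the buffer. */
    private void spill() {
        try {
            if (spillFile == null) {
                spillFile = Files.createTempFile("bag", ".spill");
                spillWriter = Files.newBufferedWriter(spillFile, StandardCharsets.UTF_8);
            }
            for (String row : memory) {
                spillWriter.write(row);
                spillWriter.newLine();
            }
            memory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Streams rows in insertion order; assumes all writes have completed. */
    public void forEach(Consumer<String> action) {
        try {
            if (spillFile != null) {
                spillWriter.flush();
                try (BufferedReader r = Files.newBufferedReader(spillFile, StandardCharsets.UTF_8)) {
                    for (String line; (line = r.readLine()) != null; ) {
                        action.accept(line);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        memory.forEach(action);
    }

    /** Closes the writer and deletes the temporary file, if any. */
    @Override
    public void close() {
        try {
            if (spillWriter != null) spillWriter.close();
            if (spillFile != null) Files.deleteIfExists(spillFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading back goes through a callback rather than a returned list, so the spilled rows are never all held in memory at once.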


The ones I've identified (let me know if I've missed any):

QueryIterSort
    Complete!

QueryIterGroup
    Probably one of the more complicated implementations.  I think it can be 
done with a DistinctDataBag.

QueryIterDistinct
    Can be implemented trivially using DistinctDataBag, but would lose
streaming capability.  We could stream just until the first spill, which
would be a little more difficult but not bad.  If we wanted streaming even
after spilling, then we would need an on-disk hash table or B-tree (which could
get expensive for perhaps limited benefit; do you really need streaming after
10,000 results?).

QueryIteratorCopy
    Only appears to be used by QueryIterService.  Simple implementation using
DefaultDataBag.

QueryIteratorCaching
    Does not match DataBag's assumption of completing all writes before 
iterating.  But it isn't used anywhere, so maybe we remove it?

QueryIterDiff
QueryIterMinus
    Both of these materialize the RHS into a collection.  Can be implemented
with DefaultDataBag.  As an aside, is this necessary to do for all queries?
What if the RHS is cheap (e.g. a single TriplePattern)?

QueryIterJoin
QueryIterLeftJoin
    Both materialize the RHS.  Are they used anywhere?  I was under the
impression that ARQ only considered left-deep plans with indexed joins on the
RHS TriplePatterns.

SubQueries
    I'm not sure how these are handled.  Are they materialized somewhere?
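
For the QueryIterDistinct option above (stream only until the first spill), the control flow could look roughly like the sketch below. The class DistinctUntilSpill and its threshold parameter are hypothetical, rows are plain strings rather than Bindings, and the blocking phase uses an in-memory TreeSet purely as a stand-in for a disk-backed DistinctDataBag so that the example stays self-contained.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical DISTINCT iterator: streams until a threshold, then blocks. */
public class DistinctUntilSpill implements Iterator<String> {
    private final Iterator<String> input;
    private final int threshold;                        // max rows remembered while streaming
    private final Set<String> emitted = new HashSet<>(); // never grows past threshold
    private Iterator<String> blocked;                   // non-null once streaming has stopped
    private String next;                                // lookahead; null rows are not supported

    public DistinctUntilSpill(Iterator<String> input, int threshold) {
        this.input = input;
        this.threshold = threshold;
    }

    /** True while results are still produced as the input arrives. */
    public boolean isStreaming() {
        return blocked == null;
    }

    @Override
    public boolean hasNext() {
        if (next == null) next = advance();
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String row = next;
        next = null;
        return row;
    }

    private String advance() {
        // Streaming phase: emit each unseen row as soon as it arrives.
        while (blocked == null) {
            if (!input.hasNext()) return null;
            if (emitted.size() >= threshold) {
                blocked = drainDistinct();   // the "first spill": stop streaming
                break;
            }
            String row = input.next();
            if (emitted.add(row)) return row;
        }
        // Blocking phase: serve rows from the de-duplicated remainder.
        return blocked.hasNext() ? blocked.next() : null;
    }

    /** Consumes the rest of the input, dropping rows that were already emitted. */
    private Iterator<String> drainDistinct() {
        // Stand-in for a disk-backed distinct bag (assumption, see lead-in).
        Set<String> rest = new TreeSet<>();
        while (input.hasNext()) {
            String row = input.next();
            if (!emitted.contains(row)) rest.add(row);
        }
        return rest.iterator();
    }
}
```

The set of already-emitted rows has to be kept after the switch so that the blocking phase can filter them out, but its size is capped by the threshold, so it does not reintroduce an unbounded collection.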



