[ 
https://issues.apache.org/jira/browse/JENA-108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096215#comment-13096215
 ] 

Paolo Castagna commented on JENA-108:
-------------------------------------

Andy, I am sorry but I am not sure I understand what you are suggesting.

> Optimise DISTINCT + ORDER BY + LIMIT queries
> --------------------------------------------
>
>                 Key: JENA-108
>                 URL: https://issues.apache.org/jira/browse/JENA-108
>             Project: Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Paolo Castagna
>            Assignee: Paolo Castagna
>            Priority: Minor
>              Labels: optimization, sparql
>
> When users what to find the top 10 most 'something' 'things' in their data 
> write queries of this form: SELECT DISTINCT * { ... } ORDER BY ... LIMIT 10 
> We can add a specific low level optimization for this type of queries using a 
> new QueryIterTopNDistinct iterator.
> This SPARQL query: 
> SELECT * { ... } ORDER BY ... LIMIT 10
> has this algebra:
> (slice _ 10
>   (order (...)
>     (bgp (...))))
> which is now transformed into (see JENA-89):
> (top (10 ...)
>   (bgp (...)))
> This query: 
> SELECT DISTINCT * { ... } ORDER BY ... 
> has this algebra:
> (distinct
>   (order (...)
>     (bgp (...))))
> which is now transformed into (see JENA-90):
> (reduced
>   (order (...)
>     (bgp (...))))
> What about: 
> SELECT DISTINCT * { ... } ORDER BY ... LIMIT 10 ?
> The algebra expression is:
> (slice _ 10
>   (distinct
>     (order (?p ?o)
>       (bgp (triple ?s ?p ?o)))))
> which is (currently) transformed into:
> (slice _ 10
>   (reduced
>     (order (?p ?o)
>       (bgp (triple ?s ?p ?o)))))
> The optimization in JENA-90 kicks in, but it would be better to use the top 
> operator for small limits. 
> However, it must be a slightly different top implementation (which accepts 
> only distinct bindings). 
> As Andy suggested in a comment on JENA-90 (12 August 2011), we can transform 
> the algebra into:
> (top (10 ...)
>   (distinct
>     (bgp (...))))
> and then use a QueryIterTopNDistinct when OpExecutor(OpTop) sees an 
> opTop.getSubOp() instanceof OpDistinct.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to