[ https://issues.apache.org/jira/browse/JENA-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084105#comment-13084105 ]
Andy Seaborne commented on JENA-90:
-----------------------------------
There are a number of ways to approach this:
1/ Add a parameter to the "top" operator, e.g. (top distinct 10 (sort) (sub))
2/ Add a new operator, e.g. top_distinct
3/ Decide when choosing the QueryIterator, based on (sub)
In (3), if OpExecutor, when executing an OpTop, sees the algebra pattern:
(top N (?x)
  (distinct
    ....))
which it can detect with
opTop.getSubOp() instanceof OpDistinct
then it chooses QueryIterTopNDistinct over the sub-expression of the OpDistinct;
otherwise it chooses QueryIterTopN.
This treats it as a lower-level optimization.
I think (3) is neater.
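
A minimal sketch of the dispatch in (3), to make the idea concrete. It does not use the real ARQ classes: Op, OpRows, OpDistinct, OpTop, topN and topNDistinct below are hypothetical stand-ins, rows are plain integers, and the sort key is assumed to be natural ordering. Only the "instanceof OpDistinct" test mirrors the comment above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

// Hypothetical stand-ins for the algebra and executor; not the Jena API.
class TopDispatchSketch {
    interface Op {}

    static final class OpRows implements Op {          // leaf: a fixed list of rows
        final List<Integer> rows;
        OpRows(List<Integer> rows) { this.rows = rows; }
    }

    static final class OpDistinct implements Op {
        final Op sub;
        OpDistinct(Op sub) { this.sub = sub; }
        Op getSubOp() { return sub; }
    }

    static final class OpTop implements Op {
        final int limit;
        final Op sub;
        OpTop(int limit, Op sub) { this.limit = limit; this.sub = sub; }
        Op getSubOp() { return sub; }
    }

    static List<Integer> eval(Op op) {
        if (op instanceof OpRows) return ((OpRows) op).rows;
        if (op instanceof OpDistinct) {
            // Generic DISTINCT: remembers every row seen so far.
            return new ArrayList<>(new LinkedHashSet<>(eval(((OpDistinct) op).getSubOp())));
        }
        if (op instanceof OpTop) return executeTop((OpTop) op);
        throw new IllegalArgumentException("unknown op: " + op);
    }

    // Option (3): pick the iterator by looking at the sub-op of OpTop.
    static List<Integer> executeTop(OpTop opTop) {
        if (opTop.getSubOp() instanceof OpDistinct) {
            // Skip the generic DISTINCT and run directly over its sub-expression.
            Op inner = ((OpDistinct) opTop.getSubOp()).getSubOp();
            return topNDistinct(eval(inner), opTop.limit);
        }
        return topN(eval(opTop.getSubOp()), opTop.limit);
    }

    // Keeps the n smallest rows, duplicates included, in a bounded max-heap.
    static List<Integer> topN(List<Integer> rows, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (Integer r : rows) {
            heap.add(r);
            if (heap.size() > n) heap.poll();           // drop the current largest
        }
        List<Integer> out = new ArrayList<>(heap);
        Collections.sort(out);
        return out;
    }

    // Keeps the n smallest distinct rows; memory is bounded by n, not by input size.
    static List<Integer> topNDistinct(List<Integer> rows, int n) {
        TreeSet<Integer> best = new TreeSet<>();
        for (Integer r : rows) {
            best.add(r);                                // no-op if r is already kept
            if (best.size() > n) best.pollLast();       // drop the current largest
        }
        return new ArrayList<>(best);
    }

    public static void main(String[] args) {
        Op rows = new OpRows(List.of(3, 1, 1, 2, 3));
        System.out.println(eval(new OpTop(3, new OpDistinct(rows))));
        System.out.println(eval(new OpTop(3, rows)));
    }
}
```

With the input 3, 1, 1, 2, 3 and N = 3, the distinct path yields [1, 2, 3] and the plain path yields [1, 1, 2]. The algebra itself is left unchanged; only the choice of iterator differs.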
> Use OpReduce instead of OpDistinct for DISTINCT + ORDER BY queries
> ------------------------------------------------------------------
>
> Key: JENA-90
> URL: https://issues.apache.org/jira/browse/JENA-90
> Project: Jena
> Issue Type: Improvement
> Components: ARQ
> Reporter: Paolo Castagna
> Priority: Trivial
> Labels: arq, optimizer, sparql
>
> ARQ's optimizer could use an OpReduce instead of OpDistinct if a query is
> DISTINCT + ORDER BY.
> OpReduce removes adjacent duplicates and it does not require a set of already
> seen bindings as the current OpDistinct implementation does.
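
The difference described in the issue can be sketched as follows. This is an illustration only, not ARQ code: distinct and reduceAdjacent are hypothetical helpers over plain lists, and the sketch assumes the sort has already made equal rows adjacent.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical helpers; they only model the memory behaviour described in the issue.
class ReduceVsDistinctSketch {
    // DISTINCT-style: needs a set of every row seen so far.
    static <T> List<T> distinct(List<T> rows) {
        Set<T> seen = new HashSet<>();
        List<T> out = new ArrayList<>();
        for (T r : rows) {
            if (seen.add(r)) out.add(r);   // add() returns false for a repeat
        }
        return out;
    }

    // REDUCE-style: remembers only the previous row, so memory use is constant.
    static <T> List<T> reduceAdjacent(List<T> rows) {
        List<T> out = new ArrayList<>();
        T prev = null;
        boolean first = true;
        for (T r : rows) {
            if (first || !Objects.equals(r, prev)) out.add(r);
            prev = r;
            first = false;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sorted = List.of("a", "a", "b", "c", "c");
        System.out.println(distinct(sorted));
        System.out.println(reduceAdjacent(sorted));
        // On unsorted input the two differ: non-adjacent repeats survive reduceAdjacent.
        System.out.println(reduceAdjacent(List.of("a", "b", "a")));
    }
}
```

On the sorted input both helpers return [a, b, c]. On the unsorted input a, b, a, reduceAdjacent keeps all three rows, which is why the substitution only makes sense when the rows arrive sorted.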