[ https://issues.apache.org/jira/browse/PHOENIX-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321443#comment-15321443 ]
James Taylor commented on PHOENIX-2965:
---------------------------------------

I'd remove the statement.isDistinct() check here, as I don't think it's needed and it might even lead to an issue:
{code}
--- a/phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
+++ b/phoenix-core/src/main/java/org/apache/phoenix/iterate/BaseResultIterators.java
@@ -230,7 +230,7 @@ public abstract class BaseResultIterators extends ExplainTable implements Result
                 !plan.getStatement().getHint().hasHint(HintNode.Hint.RANGE_SCAN) &&
                 cols < plan.getTableRef().getTable().getRowKeySchema().getFieldCount() &&
                 plan.getGroupBy().isOrderPreserving() &&
-                (plan.getStatement().isDistinct() || context.getAggregationManager().isEmpty()))
+                (plan.getStatement().isDistinct() || context.getAggregationManager().isEmpty() || plan.getGroupBy().isUngroupedAggregate()))
{code}
One more test to add would be an aggregate query that also does a DISTINCT (in which case plan.getStatement().isDistinct() would be true). In this case, the DISTINCT is executed by deduping on the client side. I don't think you'd want to use the optimization there, but it might kick in unless the check above is changed.
{code}
SELECT DISTINCT sum(pk2) FROM t GROUP BY pk1;
{code}

> Use DistinctPrefixFilter logic for COUNT(DISTINCT ...) and COUNT(...) GROUP BY
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2965
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2965
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 2965-v2.txt, 2965-v3.txt, 2965-v4.txt, 2965-v5.txt, 2965.txt
>
>
> Parent uses skip scanning to optimize DISTINCT and certain GROUP BY operations along the row key.
> COUNT queries are optimized differently and could be sped up significantly as well.
> [~giacomotaylor], I might need some help with where COUNT(DISTINCT) queries are planned and optimized.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
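
The concern about the isDistinct() check in the comment above can be sketched as plain boolean logic. This is only an illustrative model, not Phoenix code: the class DistinctPrefixGuard and its method names are hypothetical, and the flag values assumed for the example query (DISTINCT set, a non-empty aggregation manager because of sum(pk2), and not an ungrouped aggregate because of GROUP BY pk1) are assumptions rather than values read from an actual plan.

```java
// Hypothetical model of the last clause of the guard in the quoted diff.
// None of these names exist in Phoenix; they only mirror the three booleans.
final class DistinctPrefixGuard {
    private DistinctPrefixGuard() {}

    /** Last clause as written on the '+' line of the quoted diff. */
    public static boolean clauseWithDistinctCheck(boolean isDistinct,
                                                  boolean aggregationManagerEmpty,
                                                  boolean ungroupedAggregate) {
        return isDistinct || aggregationManagerEmpty || ungroupedAggregate;
    }

    /** Same clause with the isDistinct() check removed, as suggested in the comment. */
    public static boolean clauseWithoutDistinctCheck(boolean aggregationManagerEmpty,
                                                     boolean ungroupedAggregate) {
        return aggregationManagerEmpty || ungroupedAggregate;
    }

    public static void main(String[] args) {
        // Assumed flags for: SELECT DISTINCT sum(pk2) FROM t GROUP BY pk1
        boolean isDistinct = true;               // DISTINCT keyword present
        boolean aggregationManagerEmpty = false; // assumption: sum(pk2) registers an aggregator
        boolean ungroupedAggregate = false;      // assumption: GROUP BY pk1 is a grouped aggregate

        // With the isDistinct() check the clause passes, so the optimization could kick in.
        System.out.println("with isDistinct check: "
                + clauseWithDistinctCheck(isDistinct, aggregationManagerEmpty, ungroupedAggregate));
        // Without it the clause fails, so the optimization stays off for this query.
        System.out.println("without isDistinct check: "
                + clauseWithoutDistinctCheck(aggregationManagerEmpty, ungroupedAggregate));
    }
}
```

Under those assumed flags, the clause passes with the isDistinct() check and fails without it, which is why the query seems worth adding as a test.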