[ https://issues.apache.org/jira/browse/PHOENIX-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326107#comment-15326107 ]
James Taylor commented on PHOENIX-2965:
---------------------------------------

+1, but if there's not already a test, can you add one for {{SELECT COUNT(DISTINCT nonPKCol)}}? I just want to make sure that having GroupBy.expressions on an ungrouped aggregation doesn't throw any logic off for this case.

Also, one more optimization that would really benefit the {{SELECT COUNT(DISTINCT pkCol)}} case: if there's only a single {{COUNT(DISTINCT pkCol)}} and the GroupBy ends up being order preserving, you can replace the {{COUNT(DISTINCT pkCol)}} with a {{COUNT(pkCol)}} in the select expression nodes. Just pass through {{select}} in the call to groupBy.compile() in QueryCompiler and you can do the replacement in place. That'll prevent the DistinctValueWithCountServerAggregator, which keeps a Map of all unique values, from being used. Instead we'd just keep a single overall count, which is all we need thanks to your DistinctPrefixFilter.

> Use DistinctPrefixFilter logic for COUNT(DISTINCT ...) and COUNT(...) GROUP BY
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2965
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2965
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 2965-v2.txt, 2965-v3.txt, 2965-v4.txt, 2965-v5.txt, 2965-v6.txt, 2965-v7.txt, 2965-v8.txt, 2965-v9.txt, 2965.txt, PHOENIX-2965_wip.patch
>
>
> Parent uses skip scanning to optimize DISTINCT and certain GROUP BY operations along the row key.
> COUNT queries are optimized differently and could be sped up significantly as well.
> [~giacomotaylor], I might need help with where COUNT(DISTINCT) queries are planned and optimized.
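
As a rough illustration of why the {{COUNT(DISTINCT pkCol)}} to {{COUNT(pkCol)}} rewrite suggested in the comment should be safe, here is a minimal, self-contained Java sketch. It does not use any Phoenix or HBase APIs. The class and method names (DistinctCountSketch, countDistinctWithMap, firstOfEachRun, countPlain) are hypothetical, and firstOfEachRun is only a stand-in for the effect of DistinctPrefixFilter, under the assumption that rows arrive sorted by the counted key and that the values are non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Phoenix.
class DistinctCountSketch {

    // Map-based approach: has to remember every unique value it sees,
    // so memory grows with the number of distinct values.
    static long countDistinctWithMap(List<String> values) {
        Map<String, Integer> seen = new HashMap<>();
        for (String v : values) {
            seen.merge(v, 1, Integer::sum);
        }
        return seen.size();
    }

    // Stand-in for a distinct-prefix filter over key-ordered rows:
    // keeps only the first row of each run of equal values.
    // Assumes the input is sorted and contains no nulls.
    static List<String> firstOfEachRun(List<String> sortedValues) {
        List<String> out = new ArrayList<>();
        String prev = null;
        for (String v : sortedValues) {
            if (!v.equals(prev)) {
                out.add(v);
                prev = v;
            }
        }
        return out;
    }

    // Plain counter: a single long, no per-value state.
    static long countPlain(List<String> values) {
        long n = 0;
        for (String ignored : values) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Rows as they might come back in row key order, with duplicates.
        List<String> rows = Arrays.asList("a", "a", "b", "c", "c", "c");

        long viaMap = countDistinctWithMap(rows);
        long viaFilter = countPlain(firstOfEachRun(rows));

        System.out.println("map-based distinct count: " + viaMap);
        System.out.println("filter + plain count:     " + viaFilter);
    }
}
```

Once the filter has reduced each run of equal keys to a single row, the plain counter and the map-based count agree, so the per-value map is no longer needed. The sketch only covers the single-column, order-preserving case described in the comment.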