[ 
https://issues.apache.org/jira/browse/PHOENIX-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325450#comment-15325450
 ] 

James Taylor commented on PHOENIX-2965:
---------------------------------------

It's fine if there's a single count(distinct) with no other aggregations. For 
example, the following query could use your filter:
{code}
select count(distinct pk1) from t
{code}
You'd end up grouping by pk1 after applying your filter and counting the rows 
with a non-null pk1 value. You should add a test that has null values too. It's 
a nice optimization.
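
To make the null case concrete, here is a minimal sketch of what such a test 
might exercise. The table definition and the upserted values are assumptions 
for illustration only and are not taken from the patch; pk1 is assumed to be a 
nullable leading row key column.
{code}
-- hypothetical table, not from the patch: pk1 is a nullable leading key column
create table t (pk1 varchar, pk2 varchar not null, v integer
    constraint pk primary key (pk1, pk2));
upsert into t values ('a', 'x', 1);
upsert into t values ('a', 'y', 2);
upsert into t values ('b', 'x', 3);
upsert into t values (null, 'z', 4);
-- the row with a null pk1 is not counted, so this should return 2
select count(distinct pk1) from t;
-- the same count written as an explicit group by over non-null pk1 values
select count(*) from (select pk1 from t where pk1 is not null group by pk1);
{code}
Both queries should agree, whether or not the filter is used.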

> Use DistinctPrefixFilter logic for COUNT(DISTINCT ...) and COUNT(...) GROUP BY
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2965
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2965
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 2965-v2.txt, 2965-v3.txt, 2965-v4.txt, 2965-v5.txt, 
> 2965-v6.txt, 2965.txt, PHOENIX-2965_wip.patch
>
>
> Parent uses skip scanning to optimize DISTINCT and certain GROUP BY 
> operations along the row key.
> COUNT queries are optimized differently and could be sped up significantly as 
> well.
> [~giacomotaylor], I might need some help finding where COUNT(DISTINCT) queries 
> are planned and optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
