[ 
https://issues.apache.org/jira/browse/PHOENIX-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325041#comment-15325041
 ] 

James Taylor commented on PHOENIX-2965:
---------------------------------------

No, the logic is wrong in GroupByCompiler. We shouldn't be adding all the 
select expressions to the group by list; we should do that only if 
statement.isDistinct() is true. I think that for your optimization to kick 
in, we can check whether all select expressions are instances of 
DistinctCountAggregateFunction, and then add the *child node* of each 
DistinctCountAggregateFunction as the group by expression (I'm surprised it 
worked before, as you'd have a GROUP BY COUNT(DISTINCT...) ). I'm not sure 
whether this will be more performant in the non-order-preserving case; it'll 
probably be more or less the same, so I suppose it's ok to always do it (but 
only for DistinctCountAggregateFunction).
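
To make the rule above concrete, here is a minimal sketch in plain Java. It is 
not the Phoenix code: Node, Column, DistinctCount and groupByFor are 
hypothetical stand-ins for the real parse/expression classes and for whatever 
GroupByCompiler actually does, and the isDistinct flag only mirrors 
statement.isDistinct().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; these are NOT Phoenix's real classes or APIs.
class GroupBySketch {
    interface Node {}

    // A plain column reference in the select list.
    static final class Column implements Node {
        final String name;
        Column(String name) { this.name = name; }
    }

    // Stand-in for a COUNT(DISTINCT child) aggregate.
    static final class DistinctCount implements Node {
        final Node child;
        DistinctCount(Node child) { this.child = child; }
    }

    // Returns the implicit group by expressions for a select list.
    static List<Node> groupByFor(boolean isDistinct, List<Node> select) {
        if (isDistinct) {
            // SELECT DISTINCT: every select expression becomes a group by key.
            return new ArrayList<>(select);
        }
        List<Node> groupBy = new ArrayList<>();
        for (Node n : select) {
            if (!(n instanceof DistinctCount)) {
                // Mixed select list: add nothing implicitly.
                return Collections.emptyList();
            }
            // Group by the child of COUNT(DISTINCT ...), not the aggregate.
            groupBy.add(((DistinctCount) n).child);
        }
        return groupBy;
    }

    public static void main(String[] args) {
        Node a = new Column("a");
        List<Node> select = Arrays.<Node>asList(new DistinctCount(a));
        System.out.println(groupByFor(false, select).size());
    }
}
```

The point of the sketch is the last branch: the group by key is the child 
expression, so the query never ends up with GROUP BY COUNT(DISTINCT ...).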

> Use DistinctPrefixFilter logic for COUNT(DISTINCT ...) and COUNT(...) GROUP BY
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2965
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2965
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 2965-v2.txt, 2965-v3.txt, 2965-v4.txt, 2965-v5.txt, 
> 2965-v6.txt, 2965.txt, PHOENIX-2965_wip.patch
>
>
> Parent uses skip scanning to optimize DISTINCT and certain GROUP BY 
> operations along the row key.
> COUNT queries are optimized differently and could be sped up significantly 
> as well.
> [~giacomotaylor], I might need some help finding where COUNT(DISTINCT) 
> queries are planned and optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
