James Taylor created PHOENIX-2794:
-------------------------------------

             Summary: Optimize aggregates of aggregates when possible
                 Key: PHOENIX-2794
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2794
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor


The following query:
{code}
SELECT TRUNC(ts,'HOUR'), AVG(avg_val)
FROM (SELECT AVG(val) AS avg_val, ts FROM T GROUP BY ts)
GROUP BY TRUNC(ts,'HOUR');
{code}
will run much more efficiently if flattened so that the hourly bucketing is
done on the server side, like this:
{code}
SELECT TRUNC(ts,'HOUR'), AVG(val)
FROM T
GROUP BY TRUNC(ts,'HOUR');
{code}
We should flatten when possible. I'm not sure what the general rule is, but
perhaps flattening is always valid when the inner and outer aggregate
functions match? Or maybe only for some aggregate functions, such as SUM,
MIN, MAX, and AVG?
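
One way to probe the candidate rule is a small numeric check outside Phoenix. The sketch below is plain Python, not Phoenix code; the sample rows and the helper names nested and flattened are made up for illustration. It compares an aggregate of per-ts aggregates with the same aggregate applied directly to the rows of one hourly bucket.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (ts, val) rows; both timestamps fall into the same hourly bucket,
# and the two timestamps deliberately have different row counts.
rows = [("10:05", 1.0), ("10:05", 2.0), ("10:05", 3.0), ("10:40", 10.0)]

def nested(agg, rows):
    """Aggregate per ts first, then aggregate the per-ts results (the subquery form)."""
    groups = defaultdict(list)
    for ts, val in rows:
        groups[ts].append(val)
    return agg([agg(vals) for vals in groups.values()])

def flattened(agg, rows):
    """Aggregate all rows of the bucket directly (the flattened form)."""
    return agg([val for _, val in rows])

for name, agg in [("SUM", sum), ("MIN", min), ("MAX", max), ("AVG", mean)]:
    print(name, nested(agg, rows), flattened(agg, rows))
```

On this data SUM, MIN and MAX agree in both forms, while AVG does not: the nested form weights each ts equally and the flattened form weights each row equally. That suggests matching functions alone is not a sufficient rule. AVG would need equal row counts per ts, or a rewrite that carries SUM and COUNT separately, for the two queries to return the same result.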

This comes up in time series queries in particular.




