Hello all,

I've submitted a patch for this issue: https://github.com/apache/phoenix/pull/308

The JIRA ticket is https://issues.apache.org/jira/browse/PHOENIX-4751

Thanks,
Gerald

On Thu, Jun 14, 2018 at 8:33 AM, Gerald Sangudi <[email protected]> wrote:

> Thanks James. Looking into that.
>
> Gerald
>
>
> On Thu, Jun 14, 2018 at 6:30 AM, James Taylor <[email protected]> wrote:
>
>> Hi Gerald,
>> No further suggestions than my comments on the JIRA. Maybe a good next
>> step would be a patch?
>> Thanks,
>> James
>>
>> On Tue, Jun 12, 2018 at 8:15 PM, Gerald Sangudi <[email protected]> wrote:
>>
>>> Hi Maryann and James,
>>>
>>> Any further guidance on PHOENIX-4751
>>> <https://issues.apache.org/jira/browse/PHOENIX-4751>?
>>>
>>> Thanks,
>>> Gerald
>>>
>>> On Wed, May 23, 2018 at 11:00 AM, Gerald Sangudi <[email protected]> wrote:
>>>
>>>> Hi Maryann,
>>>>
>>>> I filed PHOENIX-4751
>>>> <https://issues.apache.org/jira/browse/PHOENIX-4751>.
>>>>
>>>> Is this likely to be reviewed soon (say next few weeks), or should I
>>>> look at the Phoenix source to estimate the scope / impact?
>>>>
>>>> Thanks,
>>>> Gerald
>>>>
>>>> On Tue, May 22, 2018 at 11:12 AM, Maryann Xue <[email protected]> wrote:
>>>>
>>>>> Since the performance running a group-by aggregation on client side is
>>>>> most likely bad, it’s usually not desired. The original implementation was
>>>>> for functionality completeness only so it chose the easiest way, which
>>>>> reused some existing classes. In some cases, though, the client group-by
>>>>> can still be tolerable if there aren’t many distinct keys. So yes, please
>>>>> open a JIRA for implementing hash aggregation on client side. Thank you!
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>> On Tue, May 22, 2018 at 10:50 AM Gerald Sangudi <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Any guidance or thoughts on the thread below?
>>>>>>
>>>>>> Thanks,
>>>>>> Gerald
>>>>>>
>>>>>>
>>>>>> On Fri, May 18, 2018 at 11:39 AM, Gerald Sangudi <[email protected]> wrote:
>>>>>>
>>>>>>> Maryann,
>>>>>>>
>>>>>>> Can Phoenix provide hash aggregation on the client side? Are there
>>>>>>> design / implementation reasons not to, or should I file a ticket for
>>>>>>> this?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Gerald
>>>>>>>
>>>>>>> On Fri, May 18, 2018 at 11:29 AM, Maryann Xue <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Gerald,
>>>>>>>>
>>>>>>>> Phoenix does have hash aggregation. The reason why sort-based
>>>>>>>> aggregation is used in your query plan is that the aggregation happens
>>>>>>>> on the client side. And that is because sort-merge join is used (as
>>>>>>>> hinted), which is a client-driven join, and after that join stage all
>>>>>>>> operations can only be on the client side.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Maryann
>>>>>>>>
>>>>>>>> On Fri, May 18, 2018 at 10:57 AM, Gerald Sangudi <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Does Phoenix provide hash aggregation? If not, is it on the
>>>>>>>>> roadmap, or should I file a ticket? We have aggregation queries that
>>>>>>>>> do not require sorted results.
>>>>>>>>>
>>>>>>>>> For example, this EXPLAIN plan shows a CLIENT SORT.
>>>>>>>>>
>>>>>>>>> CREATE TABLE unsalted (
>>>>>>>>>     keyA BIGINT NOT NULL,
>>>>>>>>>     keyB BIGINT NOT NULL,
>>>>>>>>>     val SMALLINT,
>>>>>>>>>     CONSTRAINT pk PRIMARY KEY (keyA, keyB));
>>>>>>>>>
>>>>>>>>> EXPLAIN
>>>>>>>>> SELECT /*+ USE_SORT_MERGE_JOIN */
>>>>>>>>>     t1.val v1, t2.val v2, COUNT(*) c
>>>>>>>>> FROM unsalted t1 JOIN unsalted t2
>>>>>>>>> ON (t1.keyA = t2.keyA)
>>>>>>>>> GROUP BY t1.val, t2.val;
>>>>>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>>>>> | PLAN                                                       | EST_BYTES_READ  | EST_ROWS_READ  |  |
>>>>>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>>>>> | SORT-MERGE-JOIN (INNER) TABLES                             | null            | null           |  |
>>>>>>>>> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED      | null            | null           |  |
>>>>>>>>> | AND                                                        | null            | null           |  |
>>>>>>>>> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED      | null            | null           |  |
>>>>>>>>> | CLIENT SORTED BY [TO_DECIMAL(T1.VAL), T2.VAL]              | null            | null           |  |
>>>>>>>>> | CLIENT AGGREGATE INTO DISTINCT ROWS BY [T1.VAL, T2.VAL]    | null            | null           |  |
>>>>>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Gerald
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>
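
For anyone reading this thread later, here is a minimal Python sketch of the difference being discussed: a sort-based COUNT(*) ... GROUP BY versus a hash-based one. It is only an illustration of the two general techniques. It is not Phoenix code and not the patch in the pull request, and the function names and sample rows are made up for the example. Each row stands for the (t1.val, t2.val) pair coming out of the join.

```python
from collections import Counter
from itertools import groupby


def sort_based_count(rows):
    """Sort rows by group key, then count each run of equal keys.

    Hypothetical helper: mirrors the idea of a sort step followed by an
    aggregation over adjacent equal keys.
    """
    ordered = sorted(rows)  # O(n log n) sort over all joined rows
    return [(key, sum(1 for _ in run)) for key, run in groupby(ordered)]


def hash_based_count(rows):
    """Count rows per group key in a hash table, without sorting.

    Hypothetical helper: one pass over the input, one entry per distinct key.
    """
    counts = Counter()
    for key in rows:
        counts[key] += 1
    return counts


# Made-up sample of (v1, v2) pairs, as if produced by the join.
rows = [(1, 2), (3, 4), (1, 2), (3, 4), (1, 2)]

print(sort_based_count(rows))
print(dict(hash_based_count(rows)))
```

Both functions produce the same counts per group. The hash version skips the sort and keeps one entry per distinct key in memory, which fits the point made above that client-side grouping is tolerable when there are not many distinct keys. Its output order is not sorted by key, which is fine for queries that do not require sorted results.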
