[
https://issues.apache.org/jira/browse/PHOENIX-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218205#comment-15218205
]
Samarth Jain commented on PHOENIX-2758:
---------------------------------------
+1, looks good.
> Ordered GROUP BY not occurring with leading PK equality expression
> ------------------------------------------------------------------
>
> Key: PHOENIX-2758
> URL: https://issues.apache.org/jira/browse/PHOENIX-2758
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Attachments: PHOENIX-2758.patch, PHOENIX-2758_wip.patch
>
>
> The following query:
> {code}
> SELECT SUM(DUP_COUNT) FROM (
> SELECT COUNT(1) As DUP_COUNT
> FROM DATACLOUD.DATA_ASSESSMENT_RECORD
> WHERE JOURNEY_ID='07ixx000000004J' AND
> DATASOURCE=0 AND MATCH_STATUS <= 1 and
> ORGANIZATION_ID='07ixx000000004J'
> GROUP BY MATCH_STATUS, EXTERNAL_DATASOURCE_KEY
> HAVING COUNT(1) > 1);
> {code}
> Should use an ORDERED DISTINCT, but doesn't.
> This is the DDL:
> {code}
> CREATE TABLE DATACLOUD.DATA_ASSESSMENT_RECORD (ORGANIZATION_ID char(15) not
> null,
> JOURNEY_ID char(15) not null,
> DATASOURCE SMALLINT not null,
> MATCH_STATUS TINYINT not null,
> EXTERNAL_DATASOURCE_KEY varchar(30),
> ENTITY_ID char(15) not null,
> CONSTRAINT PK PRIMARY KEY (
> ORGANIZATION_ID,
> JOURNEY_ID,
> DATASOURCE,
> MATCH_STATUS,
> EXTERNAL_DATASOURCE_KEY,
> ENTITY_ID))
> {code}
> The optimization does occur if you include the PK columns in the GROUP BY
> like this:
> {code}
> SELECT SUM(DUP_COUNT) FROM (
> SELECT COUNT(1) As DUP_COUNT
> FROM DATACLOUD.DATA_ASSESSMENT_RECORD
> WHERE JOURNEY_ID='07ixx000000004J' AND
> DATASOURCE=0 AND MATCH_STATUS <= 1 and
> ORGANIZATION_ID='07ixx000000004J'
> GROUP BY ORGANIZATION_ID, JOURNEY_ID, DATASOURCE, MATCH_STATUS,
> EXTERNAL_DATASOURCE_KEY
> HAVING COUNT(1) > 1);
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)