James Taylor created PHOENIX-2758: ------------------------------------- Summary: Ordered GROUP BY not occurring with leading PK equality expression Key: PHOENIX-2758 URL: https://issues.apache.org/jira/browse/PHOENIX-2758 Project: Phoenix Issue Type: Bug Reporter: James Taylor
The following query: {code} SELECT SUM(DUP_COUNT) FROM ( SELECT COUNT(1) As DUP_COUNT FROM DATACLOUD.DATA_ASSESSMENT_RECORD WHERE JOURNEY_ID='07ixx000000004J' AND DATASOURCE=0 AND MATCH_STATUS <= 1 and ORGANIZATION_ID='07ixx000000004J' GROUP BY MATCH_STATUS, EXTERNAL_DATASOURCE_KEY HAVING COUNT(1) > 1); {code} Should use an ORDERED DISTINCT, but doesn't. This is the DDL: {code} CREATE TABLE DATACLOUD.DATA_ASSESSMENT_RECORD (ORGANIZATION_ID char(15) not null, JOURNEY_ID char(15) not null, DATASOURCE SMALLINT not null, MATCH_STATUS TINYINT not null, EXTERNAL_DATASOURCE_KEY varchar(30), ENTITY_ID char(15) not null, CONSTRAINT PK PRIMARY KEY ( ORGANIZATION_ID, JOURNEY_ID, DATASOURCE, MATCH_STATUS, EXTERNAL_DATASOURCE_KEY, ENTITY_ID)) {code} The optimization does occur if you include the PK columns in the GROUP BY like this: {code} SELECT SUM(DUP_COUNT) FROM ( SELECT COUNT(1) As DUP_COUNT FROM DATACLOUD.DATA_ASSESSMENT_RECORD WHERE JOURNEY_ID='07ixx000000004J' AND DATASOURCE=0 AND MATCH_STATUS <= 1 and ORGANIZATION_ID='07ixx000000004J' GROUP BY ORGANIZATION_ID, JOURNEY_ID, DATASOURCE, MATCH_STATUS, EXTERNAL_DATASOURCE_KEY HAVING COUNT(1) > 1); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)