James Taylor created PHOENIX-2758:
-------------------------------------

             Summary: Ordered GROUP BY not occurring with leading PK equality 
expression
                 Key: PHOENIX-2758
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2758
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor


The following query:
{code}
SELECT SUM(DUP_COUNT) FROM (
    SELECT COUNT(1) As DUP_COUNT
    FROM DATACLOUD.DATA_ASSESSMENT_RECORD 
   WHERE JOURNEY_ID='07ixx000000004J' AND 
                 DATASOURCE=0 AND MATCH_STATUS <= 1 and 
                 ORGANIZATION_ID='07ixx000000004J' 
    GROUP BY MATCH_STATUS, EXTERNAL_DATASOURCE_KEY 
    HAVING COUNT(1) > 1);
{code}
Should use an ORDERED DISTINCT, but doesn't.

This is the DDL:
{code}
CREATE TABLE DATACLOUD.DATA_ASSESSMENT_RECORD (ORGANIZATION_ID char(15) not 
null, 
JOURNEY_ID char(15) not null, 
DATASOURCE SMALLINT not null, 
MATCH_STATUS TINYINT not null, 
EXTERNAL_DATASOURCE_KEY varchar(30), 
ENTITY_ID char(15) not null, 
CONSTRAINT PK PRIMARY KEY (
    ORGANIZATION_ID, 
    JOURNEY_ID, 
    DATASOURCE, 
    MATCH_STATUS,
    EXTERNAL_DATASOURCE_KEY,
    ENTITY_ID))
{code}

The optimization does occur if you include the PK columns in the GROUP BY like 
this:
{code}
SELECT SUM(DUP_COUNT) FROM (
    SELECT COUNT(1) As DUP_COUNT
    FROM DATACLOUD.DATA_ASSESSMENT_RECORD 
   WHERE JOURNEY_ID='07ixx000000004J' AND 
                 DATASOURCE=0 AND MATCH_STATUS <= 1 and 
                 ORGANIZATION_ID='07ixx000000004J' 
    GROUP BY ORGANIZATION_ID, JOURNEY_ID, DATASOURCE, MATCH_STATUS, 
EXTERNAL_DATASOURCE_KEY 
    HAVING COUNT(1) > 1);
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to