[jira] [Commented] (PHOENIX-3452) Secondary index and query using distinct: ORDER BY doesn't work correctly

chenglei (JIRA) Tue, 08 Nov 2016 20:42:25 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649794#comment-15649794
 ]


chenglei commented on PHOENIX-3452:
-----------------------------------

I think the problem is cause by the GroupByCompiler, when GroupBy.compile 
method called the OrderPreservingTracker.track method to track the groupBy 
expression's order,just as following(in GroupByCompiler.java):
{code:borderStyle=solid} 
144    if (isOrderPreserving) {
145         OrderPreservingTracker tracker = new 
OrderPreservingTracker(context, GroupBy.EMPTY_GROUP_BY, Ordering.UNORDERED, 
expressions.size(), tupleProjector);
146          for (int i = 0; i < expressions.size(); i++) {
147                    Expression expression = expressions.get(i);
148                    tracker.track(expression);
149           }
{code}


The track method inappropriately used the sortOrder != SortOrder.getDefault() 
as the thrid "isNullsLast" parameter as following(in 
OrderPreservingTracker.java):

{code:borderStyle=solid} 
  101     public void track(Expression node) {
  102           SortOrder sortOrder = node.getSortOrder();
  103           track(node, sortOrder, sortOrder != SortOrder.getDefault());
  104     }
  105    
  106    public void track(Expression node, SortOrder sortOrder, boolean 
isNullsLast) {
{code}

Once the node's SortOrder is SortOrder.DESC,  the "isNullsLast" is true. it 
affected the GroupBy 's isOrderPreserving  as following(in 
OrderPreservingTracker.java) :

{code:borderStyle=solid}
  141    if (node.isNullable()) {
  142        if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
  143              isOrderPreserving = false;
  144              isReverse = false;
  145              return;
  146        }
  147  }
{code}

Actually, the "isNullsLast"  parameter is just related to orderBy ,it  should 
just affected the display order of "Null " in the sorted results , groupBy 
should not be affetced by "isNullsLast". I wrote a simple unit test to 
reproduce this problem in my patch:

{code:borderStyle=solid}
    @Test
    public void testGroupByDesc() throws Exception {
        Connection conn = DriverManager.getConnection(getUrl());
        try {
            conn.createStatement().execute("DROP TABLE IF EXISTS 
GROUPBYDESC_TEST");

            String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
                    "ORGANIZATION_ID VARCHAR,"+
                    "CONTAINER_ID VARCHAR,"+
                    "CONSTRAINT TEST_PK PRIMARY KEY ( "+
                    "ORGANIZATION_ID DESC,"+
                    "CONTAINER_ID DESC"+
                    "))";
            conn.createStatement().execute(sql);


            sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM 
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
            PhoenixPreparedStatement statement = 
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);

            QueryPlan queryPlan = statement.optimizeQuery(sql);
            queryPlan.iterator();
            assertTrue(queryPlan.getGroupBy().isOrderPreserving());

        } finally {
            conn.close();
        }
    }
{code}

I uploaded my patch, [~jamestaylor], please review.


> Secondary index and query using distinct: ORDER BY doesn't work correctly
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-3452
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3452
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>            Reporter: Joel Palmert
>            Assignee: chenglei
>
> This may be related to PHOENIX-3451 but the behavior is different so filing 
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
>     ORGANIZATION_ID CHAR(15) NOT NULL,
>     CONTAINER_ID CHAR(15) NOT NULL,
>     ENTITY_ID CHAR(15) NOT NULL,
>     SCORE DOUBLE,
>     CONSTRAINT TEST_PK PRIMARY KEY (
>         ORGANIZATION_ID,
>         CONTAINER_ID,
>         ENTITY_ID
>     )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, 
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org1','container1','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org1','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org1','container1','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org1','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org1','container1','entityId2',1.5);
> UPSERT INTO test.test VALUES ('org1','container1','entityId1',1.6);
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org1'
> AND container_id = 'container1'
> ORDER BY score DESC
> Notice that the returned results are not returned in descending score order. 
> Instead they are returned in descending entity_id order. If I remove the 
> DISTINCT or remove the secondary index the result is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (PHOENIX-3452) Secondary index and query using distinct: ORDER BY doesn't work correctly

Reply via email to