[
https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649650#comment-15649650
]
chenglei edited comment on PHOENIX-3451 at 11/9/16 3:19 AM:
------------------------------------------------------------
I think the problem is cause by the GroupByCompiler, when GroupByCompiler
called the OrderPreservingTracker.track method, it inappropriately used the
sortOrder != SortOrder.getDefault() as the thrid "isNullsLast" parameter as
following(in OrderPreservingTracker.java):
{code:borderStyle=solid}
101 public void track(Expression node) {
102 SortOrder sortOrder = node.getSortOrder();
103 track(node, sortOrder, sortOrder != SortOrder.getDefault());
104 }
105
106 public void track(Expression node, SortOrder sortOrder, boolean
isNullsLast) {
{code}
Once the node's SortOrder is SortOrder.DESC, the "isNullsLast" is true. it
affected the GroupBy 's isOrderPreserving as following(in
OrderPreservingTracker.java) :
{code:borderStyle=solid}
141 if (node.isNullable()) {
142 if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
143 isOrderPreserving = false;
144 isReverse = false;
145 return;
146 }
147 }
{code}
Actually, the "isNullsLast" parameter is just related to orderBy ,it should
just affected the display order of "Null " in the sorted results , groupBy
should not be affetced by "isNullsLast". I wrote a simple unit test to
reproduce this problem in my patch:
{code:borderStyle=solid}
@Test
public void testGroupByDesc() throws Exception {
Connection conn = DriverManager.getConnection(getUrl());
try {
conn.createStatement().execute("DROP TABLE IF EXISTS
GROUPBYDESC_TEST");
String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
"ORGANIZATION_ID VARCHAR,"+
"CONTAINER_ID VARCHAR,"+
"CONSTRAINT TEST_PK PRIMARY KEY ( "+
"ORGANIZATION_ID DESC,"+
"CONTAINER_ID DESC"+
"))";
conn.createStatement().execute(sql);
sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
PhoenixPreparedStatement statement =
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);
QueryPlan queryPlan = statement.optimizeQuery(sql);
queryPlan.iterator();
assertTrue(queryPlan.getGroupBy().isOrderPreserving());
} finally {
conn.close();
}
}
{code}
I uploaded my patch, please review.
was (Author: comnetwork):
I think the problem is cause by the GroupByCompiler, when GroupByCompiler
called the OrderPreservingTracker.track method, it inappropriately used the
sortOrder != SortOrder.getDefault() as the thrid "isNullsLast" parameter as
following(in OrderPreservingTracker.java):
{code:borderStyle=solid}
101 public void track(Expression node) {
102 SortOrder sortOrder = node.getSortOrder();
103 track(node, sortOrder, sortOrder != SortOrder.getDefault());
104 }
105
106 public void track(Expression node, SortOrder sortOrder, boolean
isNullsLast) {
{code}
Once the node's SortOrder is SortOrder.DESC, the "isNullsLast" is true. it
affected the GroupBy 's isOrderPreserving as following(in
OrderPreservingTracker.java) :
{code:borderStyle=solid}
141 if (node.isNullable()) {
142 if (!Boolean.valueOf(isNullsLast).equals(isReverse)) {
143 isOrderPreserving = false;
144 isReverse = false;
145 return;
146 }
147 }
{code}
Actually, the "isNullsLast" parameter is just related to orderBy ,it should
just affected the display order of "Null " in the sorted results , groupBy
should not be affetced by "isNullsLast". I wrote a simple unit test to
reproduce this problem in my patch:
{code:borderStyle=solid}
@Test
public void testGroupByDesc() throws Exception {
Connection conn = DriverManager.getConnection(getUrl());
try {
conn.createStatement().execute("DROP TABLE IF EXISTS
GROUPBYDESC_TEST");
String sql="CREATE TABLE IF NOT EXISTS GROUPBYDESC_TEST ( "+
"ORGANIZATION_ID VARCHAR,"+
"CONTAINER_ID VARCHAR,"+
"CONSTRAINT TEST_PK PRIMARY KEY ( "+
"ORGANIZATION_ID DESC,"+
"CONTAINER_ID DESC"+
"))";
conn.createStatement().execute(sql);
sql="SELECT ORGANIZATION_ID, CONTAINER_ID,count(*) FROM
GROUPBYDESC_TEST group by ORGANIZATION_ID, CONTAINER_ID";
PhoenixPreparedStatement statement =
conn.prepareStatement(sql).unwrap(PhoenixPreparedStatement.class);
QueryPlan queryPlan = statement.optimizeQuery(sql);
queryPlan.iterator();
assertTrue(queryPlan.getGroupBy().isOrderPreserving());
} finally {
conn.close();
}
}
{code}
> Secondary index and query using distinct: LIMIT doesn't return the first rows
> -----------------------------------------------------------------------------
>
> Key: PHOENIX-3451
> URL: https://issues.apache.org/jira/browse/PHOENIX-3451
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.8.0
> Reporter: Joel Palmert
> Assignee: chenglei
> Attachments: PHOENIX-3451_v1.patch
>
>
> This may be related to PHOENIX-3452 but the behavior is different so filing
> it separately.
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
> ORGANIZATION_ID CHAR(15) NOT NULL,
> CONTAINER_ID CHAR(15) NOT NULL,
> ENTITY_ID CHAR(15) NOT NULL,
> SCORE DOUBLE,
> CONSTRAINT TEST_PK PRIMARY KEY (
> ORGANIZATION_ID,
> CONTAINER_ID,
> ENTITY_ID
> )
> ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC,
> ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1);
> UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2);
> UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3);
> UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4);
> UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35);
> UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45);
> EXPLAIN
> SELECT DISTINCT entity_id, score
> FROM test.test
> WHERE organization_id = 'org2'
> AND container_id IN ( 'container1','container2','container3' )
> ORDER BY score DESC
> LIMIT 2
> OUTPUT
> entityId5 1.2
> entityId3 1.4
> The expected out out would be
> entityId8 1.45
> entityId3 1.4
> You will get the expected output if you remove the secondary index from the
> table or remove distinct from the query.
> As described in PHOENIX-3452 if you run the query without the LIMIT the
> ordering is not correct. However, the 2first results in that ordering is
> still not the onces returned by the limit clause, which makes me think there
> are multiple issues here and why I filed both separately. The rows being
> returned are the ones assigned to container1. It looks like Phoenix is first
> getting the rows from the first container and when it finds that to be enough
> it stops the scan. What it should be doing is getting 2 results for each
> container and then merge then and then limit again.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)