[ https://issues.apache.org/jira/browse/PHOENIX-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656799#comment-15656799 ]
chenglei edited comment on PHOENIX-3451 at 11/11/16 1:14 PM: ------------------------------------------------------------- [~jamestaylor], In order to remove the effect of MULTI_TENANT, I add the "ORGANIZATION_ID" column as the first column of the index TEST_SCORE , so this problem can be reproduced for a ordinary table,just as following: {code:borderStyle=solid} CREATE TABLE IF NOT EXISTS TEST.TEST ( ORGANIZATION_ID CHAR(15) NOT NULL, CONTAINER_ID CHAR(15) NOT NULL, ENTITY_ID CHAR(15) NOT NULL, SCORE DOUBLE, CONSTRAINT TEST_PK PRIMARY KEY ( ORGANIZATION_ID, CONTAINER_ID, ENTITY_ID ) ) CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC); UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1); UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2); UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3); UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4); UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35); UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45); SELECT DISTINCT entity_id, score FROM test.test WHERE organization_id = 'org2' AND container_id IN ( 'container1','container2','container3' ) ORDER BY score DESC LIMIT 2 {code} and I also wrote a simple unit case to reproduce this problem without index: {code:borderStyle=solid} @Test public void testBug3451() throws Exception { Connection conn = DriverManager.getConnection(getUrl()); try { String tableName="GROUPBY3451_TEST"; conn.createStatement().execute("DROP TABLE if exists "+tableName); String sql="CREATE TABLE "+tableName+" ( "+ "ORGANIZATION_ID CHAR(15) NOT NULL,"+ "CONTAINER_ID CHAR(15) NOT NULL,"+ "SCORE DOUBLE NOT NULL,"+ "ENTITY_ID CHAR(15) NOT NULL,"+ "CONSTRAINT TEST_PK PRIMARY KEY ( "+ "ORGANIZATION_ID,"+ "CONTAINER_ID,"+ "SCORE DESC,"+ "ENTITY_ID DESC"+ "))"; conn.createStatement().execute(sql); sql="SELECT DISTINCT entity_id, score FROM "+tableName+" WHERE organization_id = 'org1' AND container_id IN ( 'container1','container2','container3' ) ORDER BY score DESC LIMIT 2"; QueryPlan queryPlan=getQueryPlan(conn, sql); assertFalse(queryPlan.getOrderBy()==OrderBy.FWD_ROW_KEY_ORDER_BY); } finally{ if(conn!=null) { conn.close(); } } } {code} [~jamestaylor], this problem can only be reproduced with GroupBy,so in order not to interfere with IT tests, maybe we should fix PHOENIX-3452 first.I will upload my first patch soon. was (Author: comnetwork): [~jamestaylor], In order to remove the effect of MULTI_TENANT, I create a new index , this problem can be reproduced with a ordinary table as following: {code:borderStyle=solid} CREATE TABLE IF NOT EXISTS TEST.TEST ( ORGANIZATION_ID CHAR(15) NOT NULL, CONTAINER_ID CHAR(15) NOT NULL, ENTITY_ID CHAR(15) NOT NULL, SCORE DOUBLE, CONSTRAINT TEST_PK PRIMARY KEY ( ORGANIZATION_ID, CONTAINER_ID, ENTITY_ID ) ) CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST(ORGANIZATION_ID,CONTAINER_ID, SCORE DESC, ENTITY_ID DESC); UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1); UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2); UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3); UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4); UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35); UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45); SELECT DISTINCT entity_id, score FROM test.test WHERE organization_id = 'org2' AND container_id IN ( 'container1','container2','container3' ) ORDER BY score DESC LIMIT 2 {code} and I also wrote a simple unit case to reproduce this problem without index: {code:borderStyle=solid} @Test public void testBug3451() throws Exception { Connection conn = DriverManager.getConnection(getUrl()); try { String tableName="GROUPBY3451_TEST"; conn.createStatement().execute("DROP TABLE if exists "+tableName); String sql="CREATE TABLE "+tableName+" ( "+ "ORGANIZATION_ID CHAR(15) NOT NULL,"+ "CONTAINER_ID CHAR(15) NOT NULL,"+ "SCORE DOUBLE NOT NULL,"+ "ENTITY_ID CHAR(15) NOT NULL,"+ "CONSTRAINT TEST_PK PRIMARY KEY ( "+ "ORGANIZATION_ID,"+ "CONTAINER_ID,"+ "SCORE DESC,"+ "ENTITY_ID DESC"+ "))"; conn.createStatement().execute(sql); sql="SELECT DISTINCT entity_id, score FROM "+tableName+" WHERE organization_id = 'org1' AND container_id IN ( 'container1','container2','container3' ) ORDER BY score DESC LIMIT 2"; QueryPlan queryPlan=getQueryPlan(conn, sql); assertFalse(queryPlan.getOrderBy()==OrderBy.FWD_ROW_KEY_ORDER_BY); } finally{ if(conn!=null) { conn.close(); } } } {code} [~jamestaylor], this problem can only be reproduced with GroupBy,so in order not to interfere with IT tests, maybe we should fix PHOENIX-3452 first. > Secondary index and query using distinct: LIMIT doesn't return the first rows > ----------------------------------------------------------------------------- > > Key: PHOENIX-3451 > URL: https://issues.apache.org/jira/browse/PHOENIX-3451 > Project: Phoenix > Issue Type: Bug > Affects Versions: 4.8.0 > Reporter: Joel Palmert > Assignee: chenglei > > This may be related to PHOENIX-3452 but the behavior is different so filing > it separately. > Steps to repro: > CREATE TABLE IF NOT EXISTS TEST.TEST ( > ORGANIZATION_ID CHAR(15) NOT NULL, > CONTAINER_ID CHAR(15) NOT NULL, > ENTITY_ID CHAR(15) NOT NULL, > SCORE DOUBLE, > CONSTRAINT TEST_PK PRIMARY KEY ( > ORGANIZATION_ID, > CONTAINER_ID, > ENTITY_ID > ) > ) VERSIONS=1, MULTI_TENANT=TRUE, REPLICATION_SCOPE=1, TTL=31536000; > CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (CONTAINER_ID, SCORE DESC, > ENTITY_ID DESC); > UPSERT INTO test.test VALUES ('org2','container2','entityId6',1.1); > UPSERT INTO test.test VALUES ('org2','container1','entityId5',1.2); > UPSERT INTO test.test VALUES ('org2','container2','entityId4',1.3); > UPSERT INTO test.test VALUES ('org2','container1','entityId3',1.4); > UPSERT INTO test.test VALUES ('org2','container3','entityId7',1.35); > UPSERT INTO test.test VALUES ('org2','container3','entityId8',1.45); > EXPLAIN > SELECT DISTINCT entity_id, score > FROM test.test > WHERE organization_id = 'org2' > AND container_id IN ( 'container1','container2','container3' ) > ORDER BY score DESC > LIMIT 2 > OUTPUT > entityId5 1.2 > entityId3 1.4 > The expected out out would be > entityId8 1.45 > entityId3 1.4 > You will get the expected output if you remove the secondary index from the > table or remove distinct from the query. > As described in PHOENIX-3452 if you run the query without the LIMIT the > ordering is not correct. However, the 2first results in that ordering is > still not the onces returned by the limit clause, which makes me think there > are multiple issues here and why I filed both separately. The rows being > returned are the ones assigned to container1. It looks like Phoenix is first > getting the rows from the first container and when it finds that to be enough > it stops the scan. What it should be doing is getting 2 results for each > container and then merge then and then limit again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)