[ https://issues.apache.org/jira/browse/PHOENIX-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934214#comment-15934214 ]
chenglei edited comment on PHOENIX-3745 at 3/21/17 2:11 PM: ------------------------------------------------------------ {quote} 1. Checking if the sub-query is already ordered on the join key might be redundant, since that should be taken care of by the sub-query compilation itself later on. So could you please verify? {quote} After verifying I think checking if the sub-query is already ordered on the join key is not redundant , just see following unit tests: {noformat} @Test 1 public void testSortMergeJoinSubQueryBugForOrderByPrefix() throws Exception { 2 Connection conn = null; 3 try { 4 conn= DriverManager.getConnection(getUrl()); 5 6 String tableName1="MERGE1"; 7 String tableName2="MERGE2"; 8 9 conn.createStatement().execute("DROP TABLE if exists "+tableName1); 10 11 String sql="CREATE TABLE IF NOT EXISTS "+tableName1+" ( "+ 12 "AID INTEGER PRIMARY KEY,"+ 13 "AGE INTEGER"+ 14 ")"; 15 conn.createStatement().execute(sql); 16 conn.createStatement().execute("DROP TABLE if exists "+tableName2); 17 sql="CREATE TABLE IF NOT EXISTS "+tableName2+" ( "+ 18 "BID INTEGER PRIMARY KEY,"+ 19 "CODE INTEGER"+ 20 ")"; 21 conn.createStatement().execute(sql); 22 23 sql="select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from (select aid,age from "+tableName1+" where age >=11 and age<=33 order by age limit 3) a inner join"+ "(select bid,code from "+tableName2+" order by bid,code limit 1) b on a.aid=b.bid "; 24 25 QueryPlan queryPlan=getQueryPlan(conn, sql); 26 SortMergeJoinPlan sortMergeJoinPlan=(SortMergeJoinPlan)((ClientScanPlan)queryPlan).getDelegate(); 27 28 ClientScanPlan rhsOuterPlan=(ClientScanPlan)((TupleProjectionPlan)(sortMergeJoinPlan.getRhsPlan())).getDelegate(); 29 OrderBy orderBy=rhsOuterPlan.getOrderBy(); 30 assertTrue(orderBy.getOrderByExpressions().isEmpty()); 31 } 32 finally { 33 if(conn!=null) { 34 conn.close(); 35 } 36 } 37 } {noformat} the line 30 in above code is failed after I remove the SubselectRewriter.isOrderByPrefix method in my patch, the RHS: "(select bid,code from (select bid,code from merge2 order by bid,code limit 1) order by bid)" still has {{order by bid}} after compilation even if the inner query " (select bid,code from merge2 order by bid,code limit 1)". is already ordered on the bid. The problem is caused by following line 520 in QueryCompiler.compileSingleQuery method,the judgment of {{isInRowKeyOrder}} is too coarse, and obviously inner query " (select bid,code from merge2 order by bid,code limit 1)" is not {{isInRowKeyOrder}} by this judgment ,so the outer {{order by bid}} can not be compiled out. May be we can file a new JIRA to improve it : {code} 515 TableRef tableRef = context.getResolver().getTables().get(0); 516 ColumnResolver resolver = FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(), tableRef, innerPlan.getProjector()); 517 context.setResolver(resolver); 518 tableRef = resolver.getTables().get(0); 519 context.setCurrentTable(tableRef); 520 boolean isInRowKeyOrder = innerPlan.getGroupBy() == GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY; 521 522 return compileSingleFlatQuery(context, select, binds, asSubquery, allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder); {code} \\ {quote} 2. Shouldn't the subselectAsTableNode always be a DerivedTableNode? Is if(subselectAsTableNode instanceof DerivedTableNode) necessary? Shall we use assert or Preconditions check instead? {quote} Yes, the {{subselectAsTableNode}} is always a {{DerivedTableNode}}, we indeed should use assert instead of if. I uploaded my second patch with two minor modifications: 1. change the if statement to "assert subselectAsTableNode instanceof DerivedTableNode". 2. add more unit tests and IT tests for checking if the sub-query is already ordered on the join key. I also create a new pull request,please help me have a review, thanks. was (Author: comnetwork): {quote} 1. Checking if the sub-query is already ordered on the join key might be redundant, since that should be taken care of by the sub-query compilation itself later on. So could you please verify? {quote} I think checking if the sub-query is already ordered on the join key is not redundant after verifying , just see following unit tests: {noformat} @Test 1 public void testSortMergeJoinSubQueryBugForOrderByPrefix() throws Exception { 2 Connection conn = null; 3 try { 4 conn= DriverManager.getConnection(getUrl()); 5 6 String tableName1="MERGE1"; 7 String tableName2="MERGE2"; 8 9 conn.createStatement().execute("DROP TABLE if exists "+tableName1); 10 11 String sql="CREATE TABLE IF NOT EXISTS "+tableName1+" ( "+ 12 "AID INTEGER PRIMARY KEY,"+ 13 "AGE INTEGER"+ 14 ")"; 15 conn.createStatement().execute(sql); 16 conn.createStatement().execute("DROP TABLE if exists "+tableName2); 17 sql="CREATE TABLE IF NOT EXISTS "+tableName2+" ( "+ 18 "BID INTEGER PRIMARY KEY,"+ 19 "CODE INTEGER"+ 20 ")"; 21 conn.createStatement().execute(sql); 22 23 sql="select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from (select aid,age from "+tableName1+" where age >=11 and age<=33 order by age limit 3) a inner join"+ "(select bid,code from "+tableName2+" order by bid,code limit 1) b on a.aid=b.bid "; 24 25 QueryPlan queryPlan=getQueryPlan(conn, sql); 26 SortMergeJoinPlan sortMergeJoinPlan=(SortMergeJoinPlan)((ClientScanPlan)queryPlan).getDelegate(); 27 28 ClientScanPlan rhsOuterPlan=(ClientScanPlan)((TupleProjectionPlan)(sortMergeJoinPlan.getRhsPlan())).getDelegate(); 29 OrderBy orderBy=rhsOuterPlan.getOrderBy(); 30 assertTrue(orderBy.getOrderByExpressions().isEmpty()); 31 } 32 finally { 33 if(conn!=null) { 34 conn.close(); 35 } 36 } 37 } {noformat} the line 30 in above code is failed after I remove the SubselectRewriter.isOrderByPrefix method in my patch, the RHS: "(select bid,code from (select bid,code from merge2 order by bid,code limit 1) order by bid)" still has {{order by bid}} after compilation even if the inner query " (select bid,code from merge2 order by bid,code limit 1)". is already ordered on the bid. The problem is caused by following line 520 in QueryCompiler.compileSingleQuery method,the judgment of {{isInRowKeyOrder}} is too coarse, and obviously inner query " (select bid,code from merge2 order by bid,code limit 1)" is not {{isInRowKeyOrder}} by this judgment ,so the outer {{order by bid}} can not be compiled out. May be we can file a new JIRA to improve it : {code} 515 TableRef tableRef = context.getResolver().getTables().get(0); 516 ColumnResolver resolver = FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(), tableRef, innerPlan.getProjector()); 517 context.setResolver(resolver); 518 tableRef = resolver.getTables().get(0); 519 context.setCurrentTable(tableRef); 520 boolean isInRowKeyOrder = innerPlan.getGroupBy() == GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY; 521 522 return compileSingleFlatQuery(context, select, binds, asSubquery, allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder); {code} \\ {quote} 2. Shouldn't the subselectAsTableNode always be a DerivedTableNode? Is if(subselectAsTableNode instanceof DerivedTableNode) necessary? Shall we use assert or Preconditions check instead? {quote} Yes, the {{subselectAsTableNode}} always is a {{DerivedTableNode}}, we indeed should use assert instead of if. I uploaded my second version patch, the modifications of patch are minor: 1. change the if statement to "assert subselectAsTableNode instanceof DerivedTableNode". 2. add more unit tests and IT tests for Checking if the sub-query is already ordered on the join key. > SortMergeJoin might incorrectly override the OrderBy of LHS or RHS > ------------------------------------------------------------------ > > Key: PHOENIX-3745 > URL: https://issues.apache.org/jira/browse/PHOENIX-3745 > Project: Phoenix > Issue Type: Bug > Affects Versions: 4.9.0 > Reporter: chenglei > Assignee: chenglei > Attachments: PHOENIX-3745_v2.patch > > > Let us look a simple test case: > h4. 1. Create two tables > {noformat} > CREATE TABLE IF NOT EXISTS MERGE1 ( > AID INTEGER PRIMARY KEY > AGE INTEGER > ); > CREATE TABLE IF NOT EXISTS MERGE2 ( > BID INTEGER PRIMARY KEY, > CODE INTEGER > ); > {noformat} > h4. 2. Upsert values > {noformat} > UPSERT INTO MERGE1(AID,AGE) VALUES (1,11); > UPSERT INTO MERGE1(AID,AGE) VALUES (2,22); > UPSERT INTO MERGE1 (AID,AGE) VALUES (3,33); > UPSERT INTO MERGE2 (BID,CODE) VALUES (1,66); > UPSERT INTO MERGE2 (BID,CODE) VALUES (2,55); > UPSERT INTO MERGE2 (BID,CODE) VALUES (3,44); > {noformat} > h4. 3. Execute query > {noformat} > select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from > (select aid,age from merge1 where age >=11 and age<=33) a inner join > (select bid,code from merge2 order by code limit 1) b on a.aid=b.bid > {noformat} > h4. (/) Expected result > {noformat} > 3,44 > {noformat} > h4. (!) Incorrect actual result > {noformat} > 1,66 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)