[ 
https://issues.apache.org/jira/browse/PHOENIX-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934214#comment-15934214
 ] 

chenglei edited comment on PHOENIX-3745 at 3/21/17 2:13 PM:
------------------------------------------------------------

{quote}
     1. Checking if the sub-query is already ordered on the join key might be 
redundant, since that should be taken care of by the sub-query compilation 
itself later on. So could you please verify?
{quote}

After verifying I think checking if the sub-query is already ordered on the 
join key is not redundant , just see following unit tests:
{noformat}
    @Test
1    public void testSortMergeJoinSubQueryBugForOrderByPrefix() throws 
Exception  {     
2       Connection conn = null;
3       try {
4              conn= DriverManager.getConnection(getUrl());
5
6               String tableName1="MERGE1";
7               String tableName2="MERGE2";
8
9                conn.createStatement().execute("DROP TABLE if exists 
"+tableName1);
10
11             String sql="CREATE TABLE IF NOT EXISTS "+tableName1+" ( "+
12                        "AID INTEGER PRIMARY KEY,"+
13                        "AGE INTEGER"+
14                        ")";
15               conn.createStatement().execute(sql);
16             conn.createStatement().execute("DROP TABLE if exists 
"+tableName2);
17       sql="CREATE TABLE IF NOT EXISTS "+tableName2+" ( "+
18             "BID INTEGER PRIMARY KEY,"+
19              "CODE INTEGER"+
20              ")";
21        conn.createStatement().execute(sql);
22
23        sql="select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from (select 
aid,age from "+tableName1+" where age >=11 and age<=33 order by age limit 3) a 
inner join"+
"(select bid,code from "+tableName2+" order by bid,code limit 1) b on 
a.aid=b.bid ";
24
25        QueryPlan queryPlan=getQueryPlan(conn, sql);
26        SortMergeJoinPlan 
sortMergeJoinPlan=(SortMergeJoinPlan)((ClientScanPlan)queryPlan).getDelegate();
27
28        ClientScanPlan 
rhsOuterPlan=(ClientScanPlan)((TupleProjectionPlan)(sortMergeJoinPlan.getRhsPlan())).getDelegate();
29        OrderBy orderBy=rhsOuterPlan.getOrderBy();
30        assertTrue(orderBy.getOrderByExpressions().isEmpty());        
31    }
32    finally {
33        if(conn!=null) {
34              conn.close();
35      }
36      }
37    }
 {noformat}

the line 30 in above code is failed after I remove the 
SubselectRewriter.isOrderByPrefix method in my patch, the RHS:
"(select bid,code from (select bid,code from merge2 order by bid,code limit 1) 
order by bid)" still has {{order by bid}} after compilation even if the inner 
query " (select bid,code from merge2 order by bid,code limit 1)". is already 
ordered on the bid.

The problem is caused by following  line 520 in 
QueryCompiler.compileSingleQuery method,the judgment of  {{isInRowKeyOrder}} is 
too coarse, and obviously inner query  " (select bid,code from merge2 order by 
bid,code limit 1)" is not {{isInRowKeyOrder}} by this judgment ,so the outer 
{{order by bid}}  can not be compiled out. May be we can file a new JIRA to 
improve it  :
{code}
515        TableRef tableRef = context.getResolver().getTables().get(0);
516        ColumnResolver resolver = 
FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(), 
tableRef, innerPlan.getProjector());
517        context.setResolver(resolver);
518        tableRef = resolver.getTables().get(0);
519        context.setCurrentTable(tableRef);
520        boolean isInRowKeyOrder = innerPlan.getGroupBy() == 
GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
521
522        return compileSingleFlatQuery(context, select, binds, asSubquery, 
allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder);
{code}

\\
{quote}
2. Shouldn't the subselectAsTableNode always be a DerivedTableNode? Is 
if(subselectAsTableNode instanceof DerivedTableNode) necessary? Shall we use 
assert or Preconditions check instead?
{quote}

Yes, the {{subselectAsTableNode}} is always  a {{DerivedTableNode}}, we indeed 
should use assert instead of if.


I uploaded my second patch with two minor modifications:
1. change the if statement to "assert subselectAsTableNode instanceof 
DerivedTableNode".
2. add more unit tests and IT tests for checking if the sub-query is already 
ordered on the join key.

I also create a new pull request, please help me have a review, thanks.
https://github.com/apache/phoenix/pull/235


was (Author: comnetwork):
{quote}
     1. Checking if the sub-query is already ordered on the join key might be 
redundant, since that should be taken care of by the sub-query compilation 
itself later on. So could you please verify?
{quote}

After verifying I think checking if the sub-query is already ordered on the 
join key is not redundant , just see following unit tests:
{noformat}
    @Test
1    public void testSortMergeJoinSubQueryBugForOrderByPrefix() throws 
Exception  {     
2       Connection conn = null;
3       try {
4              conn= DriverManager.getConnection(getUrl());
5
6               String tableName1="MERGE1";
7               String tableName2="MERGE2";
8
9                conn.createStatement().execute("DROP TABLE if exists 
"+tableName1);
10
11             String sql="CREATE TABLE IF NOT EXISTS "+tableName1+" ( "+
12                        "AID INTEGER PRIMARY KEY,"+
13                        "AGE INTEGER"+
14                        ")";
15               conn.createStatement().execute(sql);
16             conn.createStatement().execute("DROP TABLE if exists 
"+tableName2);
17       sql="CREATE TABLE IF NOT EXISTS "+tableName2+" ( "+
18             "BID INTEGER PRIMARY KEY,"+
19              "CODE INTEGER"+
20              ")";
21        conn.createStatement().execute(sql);
22
23        sql="select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from (select 
aid,age from "+tableName1+" where age >=11 and age<=33 order by age limit 3) a 
inner join"+
"(select bid,code from "+tableName2+" order by bid,code limit 1) b on 
a.aid=b.bid ";
24
25        QueryPlan queryPlan=getQueryPlan(conn, sql);
26        SortMergeJoinPlan 
sortMergeJoinPlan=(SortMergeJoinPlan)((ClientScanPlan)queryPlan).getDelegate();
27
28        ClientScanPlan 
rhsOuterPlan=(ClientScanPlan)((TupleProjectionPlan)(sortMergeJoinPlan.getRhsPlan())).getDelegate();
29        OrderBy orderBy=rhsOuterPlan.getOrderBy();
30        assertTrue(orderBy.getOrderByExpressions().isEmpty());        
31    }
32    finally {
33        if(conn!=null) {
34              conn.close();
35      }
36      }
37    }
 {noformat}

the line 30 in above code is failed after I remove the 
SubselectRewriter.isOrderByPrefix method in my patch, the RHS:
"(select bid,code from (select bid,code from merge2 order by bid,code limit 1) 
order by bid)" still has {{order by bid}} after compilation even if the inner 
query " (select bid,code from merge2 order by bid,code limit 1)". is already 
ordered on the bid.

The problem is caused by following  line 520 in 
QueryCompiler.compileSingleQuery method,the judgment of  {{isInRowKeyOrder}} is 
too coarse, and obviously inner query  " (select bid,code from merge2 order by 
bid,code limit 1)" is not {{isInRowKeyOrder}} by this judgment ,so the outer 
{{order by bid}}  can not be compiled out. May be we can file a new JIRA to 
improve it  :
{code}
515        TableRef tableRef = context.getResolver().getTables().get(0);
516        ColumnResolver resolver = 
FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(), 
tableRef, innerPlan.getProjector());
517        context.setResolver(resolver);
518        tableRef = resolver.getTables().get(0);
519        context.setCurrentTable(tableRef);
520        boolean isInRowKeyOrder = innerPlan.getGroupBy() == 
GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
521
522        return compileSingleFlatQuery(context, select, binds, asSubquery, 
allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder);
{code}

\\
{quote}
2. Shouldn't the subselectAsTableNode always be a DerivedTableNode? Is 
if(subselectAsTableNode instanceof DerivedTableNode) necessary? Shall we use 
assert or Preconditions check instead?
{quote}

Yes, the {{subselectAsTableNode}} is always  a {{DerivedTableNode}}, we indeed 
should use assert instead of if.


I uploaded my second patch with two minor modifications:
1. change the if statement to "assert subselectAsTableNode instanceof 
DerivedTableNode".
2. add more unit tests and IT tests for checking if the sub-query is already 
ordered on the join key.

I also create a new pull request,please help me have a review, thanks.

> SortMergeJoin might incorrectly override the OrderBy of LHS or RHS
> ------------------------------------------------------------------
>
>                 Key: PHOENIX-3745
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3745
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: chenglei
>            Assignee: chenglei
>         Attachments: PHOENIX-3745_v2.patch
>
>
> Let us look a simple test case:
> h4. 1. Create two tables
>  {noformat}
>    CREATE TABLE IF NOT EXISTS MERGE1 (
>          AID INTEGER PRIMARY KEY
>          AGE INTEGER
>     );
>   CREATE TABLE IF NOT EXISTS MERGE2 (
>          BID INTEGER PRIMARY KEY,
>          CODE INTEGER
>   );
>  {noformat}
>  h4. 2. Upsert values
>  {noformat}
>           UPSERT INTO MERGE1(AID,AGE) VALUES (1,11);
>           UPSERT INTO MERGE1(AID,AGE) VALUES (2,22);
>           UPSERT INTO MERGE1 (AID,AGE) VALUES (3,33);
>           UPSERT INTO MERGE2 (BID,CODE) VALUES (1,66);
>           UPSERT INTO MERGE2 (BID,CODE) VALUES (2,55);
>           UPSERT INTO MERGE2 (BID,CODE) VALUES (3,44);
>  {noformat}
>  h4. 3. Execute query
>  {noformat}
>     select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from
>      (select aid,age from merge1  where age >=11 and age<=33) a inner join 
>      (select bid,code from merge2  order by code limit 1) b on a.aid=b.bid 
>  {noformat}
>  h4. (/) Expected result
>  {noformat}
>     3,44
>  {noformat}
>  h4. (!) Incorrect actual result
>  {noformat}
>     1,66 
>  {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to