[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 10:25 AM:
-

This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} is called to push {{m.id in (1,2)}} 
into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}
\\
Finally, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id are 1, the rows which 
id are 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  where

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 10:22 AM:
-

This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} is called to push {{m.id in (1,2)}} 
into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}
\\
In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id are 1, the rows which 
id are 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  wh

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:59 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}
\\
In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id are 1, the rows which 
id are 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  whe

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:57 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}
\\
In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
{{[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF)}},
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  w

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:58 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}
\\
In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  where

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:56 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}
\\
In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  where

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:56 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

\\
\\
In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  wher

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:55 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}



In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  where m.

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:54 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method,so the Scan would use 
SkipScanFilter: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  where m.id

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:50 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{noformat}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
{noformat}

and modify the sql as :
{noformat}
   select m.id, m.col1,d.col2 from master m, detail d  where m.id = d.id  and 
d.id in (3,5,7) order by

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:47 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :

{code}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
(code}

and modify the sql as :
{code}
   select m.id, m.col1,d.col2 from master m, detail d  where m.id = d.id  and 
d.id in (3,5,7) order by m.id desc
{

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:46 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
(code}

and modify the sql as :
{code}
   select m.id, m.col1,d.c

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:43 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,
\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
(code}

and modify the sql as :
{code:borderSty

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:43 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the rows of {{master}} table which id is 1, the rows which 
id is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
(code}

and modify the sql as :
{code:borderStyl

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:41 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\x80\x00\x00\x01,\x80\x00\x00\x02\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF,\x80\x00\x00\x00\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF),
 so we can only get the row of {{master}} table which id is 1, the row which id 
is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\x80\x00\x00\x02\x00}},which should return {{\x80\x00\x00\x02}},not 
{{\x80\x00\x00\x01\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INTO detail VALUES (8, 8, 'B8');
(code}

and modify the sql as :
{code:borderStyle=

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:37 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is {{\x80\x00\x00\x02\x00}}, and 
correspondingly,the Scan's startRow is {{\x80\x00\x00\x01}}, and Scan's endRow 
is {{\x80\x00\x00\x02\x00}}:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\\x80\\x00\\x00\\x01,\\x80\\x00\\x00\\x02\\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF,\\x80\\x00\\x00\\x00\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF),
 so we can only get the row of {{master}} table which id is 1, the row which id 
is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\\x80\\x00\\x00\\x02\\x00}},which should return {{\\x80\\x00\\x00\\x02}},not 
{{\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT INT

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:35 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\x80\x00\x00\x01}},and maxKey is \\x80\\x00\\x00\\x02\\x00, and 
correspondingly,the Scan's startRow is \\x80\\x00\\x00\\x01, and Scan's endRow 
is \\x80\\x00\\x00\\x02\\x00:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\\x80\\x00\\x00\\x01,\\x80\\x00\\x00\\x02\\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF,\\x80\\x00\\x00\\x00\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF),
 so we can only get the row of {{master}} table which id is 1, the row which id 
is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\\x80\\x00\\x00\\x02\\x00}},which should return {{\\x80\\x00\\x00\\x02}},not 
{{\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSERT I

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:35 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
 {{\\x80\\x00\\x00\\x01}},and maxKey is \\x80\\x00\\x00\\x02\\x00, and 
correspondingly,the Scan's startRow is \\x80\\x00\\x00\\x01, and Scan's endRow 
is \\x80\\x00\\x00\\x02\\x00:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\\x80\\x00\\x00\\x01,\\x80\\x00\\x00\\x02\\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF,\\x80\\x00\\x00\\x00\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF),
 so we can only get the row of {{master}} table which id is 1, the row which id 
is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\\x80\\x00\\x00\\x02\\x00}},which should return {{\\x80\\x00\\x00\\x02}},not 
{{\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSE

[jira] [Comment Edited] (PHOENIX-3578) Incorrect query results when applying inner join and orderby desc

2017-02-21 Thread chenglei (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875665#comment-15875665
 ] 

chenglei edited comment on PHOENIX-3578 at 2/21/17 9:30 AM:


This issue is caused by the join dynamic filter, from following RHS, we get 
d.id is in (1,2):
{code:borderStyle=solid} 
  select d.seq,d.col2,d.id from detail d  where d.id between 1 and 2
{code} 
so with join dynamic filter, m.id is also in (1,2). Before applying join 
dynamic filter,LHS is:
{code:borderStyle=solid}
  select m,id,m.col1,d.seq,d.col2 from master m order by m.id desc
{code}
Obviously, LHS's OrderBy is {{OrderBy.REV_ROW_KEY_ORDER_BY}},after applying 
join dynamic filter LHS turns to :
{code:borderStyle=solid} 
select m,id,m.col1,d.seq,d.col2 from master m where m.id in (1,2) order by 
m.id desc
{code} 
Notice LHS's OrderBy is still {{OrderBy.REV_ROW_KEY_ORDER_BY}} now,then 
{{WhereOptimizer.pushKeyExpressionsToScan}} was called to push {{m.id in 
(1,2)}} into Scan , and useSkipScan is true in following line 274 of 
{{WhereOptimizer.pushKeyExpressionsToScan}} method: 
{code:borderStyle=solid}
273stopExtracting |= (hasUnboundedRange && !forcedSkipScan) || 
(hasRangeKey && forcedRangeScan);
274useSkipScan |= !stopExtracting && !forcedRangeScan && 
(keyRanges.size() > 1 || hasRangeKey);
{code} 

next step the {{startRow}} and {{endRow}} of LHS's Scan was computed in 
{{ScanRanges.create}} method, in following line 112 the LHS's RowKeySchema is 
turned to SchemaUtil.VAR_BINARY_SCHEMA: 

{code:borderStyle=solid} 
111  if (keys.size() > 1 || 
SchemaUtil.getSeparatorByte(schema.rowKeyOrderOptimizable(), false, 
schema.getField(schema.getFieldCount()-1)) == 
QueryConstants.DESC_SEPARATOR_BYTE) {
112schema = SchemaUtil.VAR_BINARY_SCHEMA;
113slotSpan = ScanUtil.SINGLE_COLUMN_SLOT_SPAN;
114   } else { 
{code}

so in following line 135 and line 136 of {{ScanRanges.create}} method,minKey is 
{{\\x80\\x00\\x00\\x01}},and maxKey is \\x80\\x00\\x00\\x02\\x00, and 
correspondingly,the Scan's startRow is \\x80\\x00\\x00\\x01, and Scan's endRow 
is \\x80\\x00\\x00\\x02\\x00:
{code:borderStyle=solid}
134if (nBuckets == null || !isPointLookup || !useSkipScan) {
135byte[] minKey = ScanUtil.getMinKey(schema, sortedRanges, 
slotSpan);
136byte[] maxKey = ScanUtil.getMaxKey(schema, sortedRanges, 
slotSpan);
{code}

In summary, when we scan the LHS {{master}} table, the Scan range is 
{{[\\x80\\x00\\x00\\x01,\\x80\\x00\\x00\\x02\\x00)}} ,and the Scan uses 
{{SkipScanFilter}}.Furthermore,because the LHS's OrderBy is 
{{OrderBy.REV_ROW_KEY_ORDER_By}},so the Scan range should be reversed.In 
{{BaseScannerRegionObserver.preScannerOpen}} method,following 
{{ScanUtil.setupReverseScan}} method is called to reverse the Scan's startRow 
and endRow.Unfortunately, the reversed Scan's range computed by  
{{ScanUtil.setupReverseScan}} method is 
[\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF,\\x80\\x00\\x00\\x00\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF),
 so we can only get the row of {{master}} table which id is 1, the row which id 
is 2 is excluded.

{code:borderStyle=solid} 
621  public static void setupReverseScan(Scan scan) {
622if (isReversed(scan)) {
623byte[] newStartRow = getReversedRow(scan.getStartRow());
624byte[] newStopRow = getReversedRow(scan.getStopRow());
625scan.setStartRow(newStopRow);
626scan.setStopRow(newStartRow);
627scan.setReversed(true);
628}
629}  
{code}

In conclusion, following two problems causes this issue:
(1) the {{ScanUtil.getReversedRow}} method is not right for 
{{\\x80\\x00\\x00\\x02\\x00}},which should return {{\\x80\\x00\\x00\\x02}},not 
{{\\x80\\x00\\x00\\x01\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF}}.
(2) even though {{ScanUtil.getReversedRow}} method is right,there may be 
another problem,if I change the table data as following :
{code:borderStyle=solid}
UPSERT INTO master VALUES (1, 'A1');
UPSERT INTO master VALUES (2, 'A2');
UPSERT INTO master VALUES (3, 'A3');
UPSERT INTO master VALUES (4, 'A4');
UPSERT INTO master VALUES (5, 'A5');
UPSERT INTO master VALUES (6, 'A6');
UPSERT INTO master VALUES (8, 'A8');

UPSERT INTO detail VALUES (1, 1, 'B1');
UPSERT INTO detail VALUES (2, 2, 'B2');
UPSERT INTO detail VALUES (3, 3, 'B3');
UPSERT INTO detail VALUES (4, 4, 'B4');
UPSERT INTO detail VALUES (5, 5, 'B5');
UPSERT INTO detail VALUES (6, 6, 'B6');
UPSERT INTO detail VALUES (7, 7, 'B7');
UPSER