[ https://issues.apache.org/jira/browse/PHOENIX-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895471#comment-15895471 ]
chenglei commented on PHOENIX-3705: ----------------------------------- Thank you for the review, [~jamestaylor]. Pushed to the master, 4.x-HBase-0.98, and 4.x-HBase-1.1 branches. > SkipScanFilter may repeatedly copy rowKey Columns to startKey > ------------------------------------------------------------- > > Key: PHOENIX-3705 > URL: https://issues.apache.org/jira/browse/PHOENIX-3705 > Project: Phoenix > Issue Type: Bug > Affects Versions: 4.9.0 > Reporter: chenglei > Priority: Blocker > Fix For: 4.10.0 > > Attachments: PHOENIX-3705_v2.patch > > > See following simple unit test first,the rowKey is composed of three PInteger > columns,and the slots of SkipScanFilter are: > [ [[1 - 4]], [5, 7], [[9 - 10]] ], > When SkipScanFilter.filterKeyValue method is invoked on a KeyValue whose > rowKey is [2,7,11], obviously SkipScanFilter.filterKeyValue > returns ReturnCode.SEEK_NEXT_USING_HINT and SkipScanFilter.getNextCellHint > returns [3,5,9], but unfortunately, SkipScanFilter.getNextCellHint actually > returns [2,8,5,9] , a very strange value, the unit tests failed. > {code} > @Test > public void testNavigate() { > RowKeySchemaBuilder builder = new RowKeySchemaBuilder(3); > for(int i=0;i<3;i++) { > builder.addField( > new PDatum() { > @Override > public boolean isNullable() { > return false; > } > @Override > public PDataType getDataType() { > return PInteger.INSTANCE; > } > @Override > public Integer getMaxLength() { > return PInteger.INSTANCE.getMaxLength(null); > } > @Override > public Integer getScale() { > return PInteger.INSTANCE.getScale(null); > } > @Override > public SortOrder getSortOrder() { > return SortOrder.getDefault(); > } > }, false, SortOrder.getDefault()); > } > > List<List<KeyRange>> rowKeyColumnRangesList=Arrays.asList( > Arrays.asList( > > PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(1), true, > PInteger.INSTANCE.toBytes(4), true)), > Arrays.asList( > KeyRange.getKeyRange(PInteger.INSTANCE.toBytes(5)), > KeyRange.getKeyRange(PInteger.INSTANCE.toBytes(7))), > Arrays.asList( > > PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(9), true, > PInteger.INSTANCE.toBytes(10), true)) > ); > > SkipScanFilter skipScanFilter=new > SkipScanFilter(rowKeyColumnRangesList, builder.build()); > > System.out.println(skipScanFilter); > > byte[] rowKey=ByteUtil.concat( > PInteger.INSTANCE.toBytes(2), > PInteger.INSTANCE.toBytes(7), > PInteger.INSTANCE.toBytes(11)); > KeyValue keyValue=KeyValue.createFirstOnRow(rowKey); > ReturnCode returnCode=skipScanFilter.filterKeyValue(keyValue); > assertTrue(returnCode == ReturnCode.SEEK_NEXT_USING_HINT); > Cell nextCellHint=skipScanFilter.getNextCellHint(keyValue); > > assertTrue(Bytes.toStringBinary(CellUtil.cloneRow(nextCellHint)).equals( > "\\x80\\x00\\x00\\x03\\x80\\x00\\x00\\x05\\x80\\x00\\x00\\x09")); > } > {code} > Let us see what's wrong, first column of rowKey [2,7,11 ] is 2, which is in > SkipScanFilter's first slot range [1-4], so position[0] is 0 and we go to the > second column 7, which match the second range [7] of SkipScanFilter's second > slot [5, 7],so position[1] is 1 and we go to the third column 11, which is > bigger than the third slot range [9 - 10],so position[2] is 0 and the > {{SkipScanFilter.ptr}} which points to current column still stay on the third > column. Now we begin to backtrack to second column, because the second range > [7] of SkipScanFilter's second slot is singleKey and there is no more > range,so position[1] is 0 and we continue to backtrack to first column, > because the first slot range [1-4] is not singleKey so we stop backtracking > at first column. > Now the problem comes, in following line 448 of {{SkipScanFilter.navigate}} > method,{{SkipScanFilter.setStartKey}} method is invoked,first copy rowKey > columns before {{SkipScanFilter.ptr}} to {{SkipScanFilter.startKey}}, because > now {{SkipScanFilter.ptr}} still point to the third column, so copy the first > and second columns to {{SkipScanFilter.startKey}},{{SkipScanFilter.startKey}} > is [2,7] after this step , then setStartKey method copy the lower bound > {{SkipScanFilter.slots}} from {{j+1}} column, accoring to > {{SkipScanFilter.position}} array,because now j is 0, and both position[1] > and position[2] are 0, so {{SkipScanFilter.startKey}} becomes [2,7,5,9], and > in following line 457, {{ByteUtil.nextKey}} is invoked on [2,7], [2,7] is > incremented to [2,8], finally {{SkipScanFilter.startKey}} is [2,8,5,9]. > {code} > 448 int currentLength = setStartKey(ptr, minOffset, j+1, > nSlots, false); > 449 // From here on, we use startKey as our buffer > (resetting minOffset and maxOffset) > 450 // We've copied the part of the current key above that > we need into startKey > 451 // Reinitialize the iterator to be positioned at > previous slot position > 452 minOffset = 0; > 453 maxOffset = startKeyLength; > 454 schema.iterator(startKey, minOffset, maxOffset, ptr, > ScanUtil.getRowKeyPosition(slotSpan, j)+1); > 455 // Do nextKey after setting the accessor b/c otherwise > the null byte may have > 456 // been incremented causing us not to find it > 457 ByteUtil.nextKey(startKey, currentLength); > {code} > From above analysis, we can see the second column is repeatedly copied to > {{SkipScanFilter.startKey}}, copy 7 to {{SkipScanFilter.startKey}} is error > ,before {{SkipScanFilter.setStartKey}} method is invoked in line 448, we > should first move {{SkipScanFilter.ptr}} to {{j+1}}, that is to say > ,following code might be inserted before line 448: > {code} > schema.reposition( > ptr, > ScanUtil.getRowKeyPosition(slotSpan, i), > ScanUtil.getRowKeyPosition(slotSpan, j+1), > minOffset, > maxOffset, > slotSpan[j+1]); > {code} > Another problem is why the existing SkipScanFilter unit tests are ok? That > is because in all existing unit test cases, backtrack column is just before > current column which ptr points to, i.e. i==j+1, so the unit tests are ok. > This issue may not affect the final query result , but obviously, it may lead > to the erroneous startKey and cause the Skip Scan inefficient. -- This message was sent by Atlassian JIRA (v6.3.15#6346)