[ 
https://issues.apache.org/jira/browse/PHOENIX-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated PHOENIX-5996:
------------------------------
    Description: 
With PHOENIX-5748, 
{{IndexRebuildRegionScanner.prepareIndexMutationsForRebuild}} is responsible 
for generating the index table mutations for a rebuild.
While processing the list of data table mutations, there can be a delete and a 
put mutation with the same timestamp. If so, the delete and the put are 
processed together in one iteration. First, the delete mutation is applied to 
the put mutation and to the current row state, and then the modified put 
mutation is processed.
But when the {{modified put mutation}} is empty, the whole index row is 
deleted even if the current row state is not empty after the delete mutation 
is applied, as at line 1191 of 
{{IndexRebuildRegionScanner.prepareIndexMutationsForRebuild}} below:
{code:java}
1189              } else {
1190                    if (currentDataRowState != null) {
1191                        Mutation del = indexMaintainer.buildRowDeleteMutation(indexRowKeyForCurrentDataRow,
1192                                IndexMaintainer.DeleteType.ALL_VERSIONS, ts);
1193                        indexMutations.add(del);
1194                        // For the next iteration of the for loop
1195                        currentDataRowState = null;
1196                        indexRowKeyForCurrentDataRow = null;
1197                    }
1198              }
{code} 

I think the above logic is wrong: when the current row state is not empty 
after the delete mutation is applied, we cannot delete the whole index row, 
but should instead reuse the logic for processing a delete mutation. I wrote a 
unit test in {{PrepareIndexMutationsForRebuildTest}} to reproduce the case:
{code:java}
    @Test
    public void testPutDeleteOnSameTimeStampAndPutNullifiedByDelete() throws Exception {
        SetupInfo info = setup(
                TABLE_NAME,
                INDEX_NAME,
                "ROW_KEY VARCHAR, CF1.C1 VARCHAR, CF2.C2 VARCHAR",
                "CF2.C2",
                "ROW_KEY",
                "");

        Put dataPut = new Put(Bytes.toBytes(ROW_KEY));
        addCellToPutMutation(
                dataPut,
                Bytes.toBytes("CF2"),
                Bytes.toBytes("C2"),
                1,
                Bytes.toBytes("v2"));
        addEmptyColumnToDataPutMutation(dataPut, info.pDataTable, 1);
        
        addCellToPutMutation(
                dataPut,
                Bytes.toBytes("CF1"),
                Bytes.toBytes("C1"),
                2,
                Bytes.toBytes("v1"));
        addEmptyColumnToDataPutMutation(dataPut, info.pDataTable, 2);

        Delete dataDel = new Delete(Bytes.toBytes(ROW_KEY));
        addCellToDelMutation(
                dataDel,
                Bytes.toBytes("CF1"),
                null,
                2,
                KeyValue.Type.DeleteFamily);

        List<Mutation> actualIndexMutations = IndexRebuildRegionScanner.prepareIndexMutationsForRebuild(
                info.indexMaintainer,
                dataPut,
                dataDel);

        List<Mutation> expectedIndexMutations = new ArrayList<>();
        byte[] idxKeyBytes = generateIndexRowKey("v2");

        
        Put idxPut1 = new Put(idxKeyBytes);
        addEmptyColumnToIndexPutMutation(idxPut1, info.indexMaintainer, 1);
        expectedIndexMutations.add(idxPut1);
        
        Put idxPut2 = new Put(idxKeyBytes);
        addEmptyColumnToIndexPutMutation(idxPut2, info.indexMaintainer, 2);
        expectedIndexMutations.add(idxPut2);

        assertEqualMutationList(expectedIndexMutations, actualIndexMutations);
    }
{code} 
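
To make the scenario in the unit test above easier to follow, here is a minimal standalone sketch that replays it on a toy model. It is not Phoenix code: the class {{SameTimestampSketch}}, the {{Action}} enum, and the methods {{reportedBranch}} and {{proposedBranch}} are hypothetical names used only for illustration, a row is modelled as a plain map from "family:qualifier" to value, and the empty column is left out for brevity. {{proposedBranch}} is only my reading of "reuse the logic of processing a delete mutation", not an actual patch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the same-timestamp put/delete case. Hypothetical, not Phoenix code. */
final class SameTimestampSketch {

    /** What the rebuild would do for the index row in this iteration. */
    enum Action { PROCESS_PUT, DELETE_WHOLE_INDEX_ROW, REUSE_DELETE_LOGIC, NOTHING }

    /** Removes every column of the given family, as a DeleteFamily marker would. */
    static void applyDeleteFamily(Map<String, String> cells, String family) {
        cells.keySet().removeIf(column -> column.startsWith(family + ":"));
    }

    /** Mirrors the quoted else-branch: an empty put plus a non-null row state deletes the index row. */
    static Action reportedBranch(Map<String, String> modifiedPut, Map<String, String> rowState) {
        if (!modifiedPut.isEmpty()) {
            return Action.PROCESS_PUT;
        }
        return rowState != null ? Action.DELETE_WHOLE_INDEX_ROW : Action.NOTHING;
    }

    /** Assumed alternative: if cells survive the delete, handle the iteration like a delete mutation. */
    static Action proposedBranch(Map<String, String> modifiedPut, Map<String, String> rowState) {
        if (!modifiedPut.isEmpty()) {
            return Action.PROCESS_PUT;
        }
        if (rowState == null) {
            return Action.NOTHING;
        }
        return rowState.isEmpty() ? Action.DELETE_WHOLE_INDEX_ROW : Action.REUSE_DELETE_LOGIC;
    }

    /** Replays the unit test: CF2.C2=v2 at ts 1, then CF1.C1=v1 and DeleteFamily(CF1) at ts 2. */
    static Action[] replayTestCase() {
        // Row state after the put at timestamp 1 (the indexed column CF2.C2 is set).
        Map<String, String> rowState = new TreeMap<>();
        rowState.put("CF2:C2", "v2");

        // Put at timestamp 2, which shares its timestamp with the DeleteFamily on CF1.
        Map<String, String> putAtTs2 = new TreeMap<>();
        putAtTs2.put("CF1:C1", "v1");

        // The delete is applied to both the put and the current row state.
        applyDeleteFamily(putAtTs2, "CF1");   // the put becomes empty
        applyDeleteFamily(rowState, "CF1");   // CF2:C2=v2 survives

        return new Action[] {
            reportedBranch(putAtTs2, rowState),
            proposedBranch(putAtTs2, rowState)
        };
    }

    public static void main(String[] args) {
        Action[] result = replayTestCase();
        System.out.println("reported: " + result[0]); // prints "reported: DELETE_WHOLE_INDEX_ROW"
        System.out.println("proposed: " + result[1]); // prints "proposed: REUSE_DELETE_LOGIC"
    }
}
```

In this model the put at timestamp 2 is emptied by the delete while {{CF2:C2=v2}} remains in the row state, so the indexed value still exists and the index row for "v2" should be kept, which is what the expected mutations in the unit test assert.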


  was:With PHOENIX-5748, IndexRebuildRegionScanner is responsible for 
generating index table mutations


> IndexRebuildRegionScanner.prepareIndexMutationsForRebuild may incorrectly 
> delete index row when a delete and put mutation with the same timestamp
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5996
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5996
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.1.0, 4.16.0
>            Reporter: chenglei
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
