[ https://issues.apache.org/jira/browse/PHOENIX-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated PHOENIX-5996: ------------------------------ Description: With PHOENIX-5748, {{IndexRebuildRegionScanner.prepareIndexMutationsForRebuild}} is responsible for generating index table mutations for rebuild. In the processing of data table mutations list, there can be a delete and put mutation with the same timestamp. If so, the delete and put are processed together in one iteration. First, the delete mutation is applied on the put mutation and current row state, and then the modified put mutation is processed. But when the {{modified put mutation}} is empty , even the current row state is not empty after the delete mutation is applied, the whole index row is deleted, just as following line 1191 in {{IndexRebuildRegionScanner.prepareIndexMutationsForRebuild}}: {code:java} 1189 } else { 1190 if (currentDataRowState != null) { 1191 Mutation del = indexMaintainer.buildRowDeleteMutation(indexRowKeyForCurrentDataRow, 1192 IndexMaintainer.DeleteType.ALL_VERSIONS, ts); 1193 indexMutations.add(del); 1194 // For the next iteration of the for loop 1195 currentDataRowState = null; 1196 indexRowKeyForCurrentDataRow = null; 1197 } 1198 } {code} I think above logical is wrong, when the current row state is not empty after the delete mutation is applied, we can not delete the whole index row, but instead we should reuse the logical of processing a delete mutation. I wrote a unit test in {{PrepareIndexMutationsForRebuildTest}} to produce the case: {code:java} @Test public void testPutDeleteOnSameTimeStampAndPutNullifiedByDelete() throws Exception { SetupInfo info = setup( TABLE_NAME, INDEX_NAME, "ROW_KEY VARCHAR, CF1.C1 VARCHAR, CF2.C2 VARCHAR", "CF2.C2", "ROW_KEY", ""); Put dataPut = new Put(Bytes.toBytes(ROW_KEY)); addCellToPutMutation( dataPut, Bytes.toBytes("CF2"), Bytes.toBytes("C2"), 1, Bytes.toBytes("v2")); addEmptyColumnToDataPutMutation(dataPut, info.pDataTable, 1); addCellToPutMutation( dataPut, Bytes.toBytes("CF1"), Bytes.toBytes("C1"), 2, Bytes.toBytes("v1")); addEmptyColumnToDataPutMutation(dataPut, info.pDataTable, 2); Delete dataDel = new Delete(Bytes.toBytes(ROW_KEY)); addCellToDelMutation( dataDel, Bytes.toBytes("CF1"), null, 2, KeyValue.Type.DeleteFamily); List<Mutation> actualIndexMutations = IndexRebuildRegionScanner.prepareIndexMutationsForRebuild( info.indexMaintainer, dataPut, dataDel); List<Mutation> expectedIndexMutations = new ArrayList<>(); byte[] idxKeyBytes = generateIndexRowKey("v2"); Put idxPut1 = new Put(idxKeyBytes); addEmptyColumnToIndexPutMutation(idxPut1, info.indexMaintainer, 1); expectedIndexMutations.add(idxPut1); Put idxPut2 = new Put(idxKeyBytes); addEmptyColumnToIndexPutMutation(idxPut2, info.indexMaintainer, 2); expectedIndexMutations.add(idxPut2); assertEqualMutationList(expectedIndexMutations, actualIndexMutations); } {code} was:With PHOENIX-5748, IndexRebuildRegionScanner is responsible for generating index table mutations > IndexRebuildRegionScanner.prepareIndexMutationsForRebuild may incorrectly > delete index row when a delete and put mutation with the same timestamp > ------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: PHOENIX-5996 > URL: https://issues.apache.org/jira/browse/PHOENIX-5996 > Project: Phoenix > Issue Type: Bug > Affects Versions: 5.1.0, 4.16.0 > Reporter: chenglei > Priority: Major > > With PHOENIX-5748, > {{IndexRebuildRegionScanner.prepareIndexMutationsForRebuild}} is responsible > for generating index table mutations for rebuild. > In the processing of data table mutations list, there can be a delete and put > mutation with the same timestamp. If so, the delete and put are processed > together in one iteration. First, the delete mutation is applied on the put > mutation and current row state, and then the modified put mutation is > processed. > But when the {{modified put mutation}} is empty , even the current row state > is not empty after the delete mutation is applied, the whole index row is > deleted, just as following line 1191 in > {{IndexRebuildRegionScanner.prepareIndexMutationsForRebuild}}: > {code:java} > 1189 } else { > 1190 if (currentDataRowState != null) { > 1191 Mutation del = > indexMaintainer.buildRowDeleteMutation(indexRowKeyForCurrentDataRow, > 1192 IndexMaintainer.DeleteType.ALL_VERSIONS, > ts); > 1193 indexMutations.add(del); > 1194 // For the next iteration of the for loop > 1195 currentDataRowState = null; > 1196 indexRowKeyForCurrentDataRow = null; > 1197 } > 1198 } > {code} > I think above logical is wrong, when the current row state is not empty after > the delete mutation is applied, we can not delete the whole index row, but > instead we should reuse the logical of processing a delete mutation. I wrote > a unit test in {{PrepareIndexMutationsForRebuildTest}} to produce the case: > {code:java} > @Test > public void testPutDeleteOnSameTimeStampAndPutNullifiedByDelete() throws > Exception { > SetupInfo info = setup( > TABLE_NAME, > INDEX_NAME, > "ROW_KEY VARCHAR, CF1.C1 VARCHAR, CF2.C2 VARCHAR", > "CF2.C2", > "ROW_KEY", > ""); > Put dataPut = new Put(Bytes.toBytes(ROW_KEY)); > addCellToPutMutation( > dataPut, > Bytes.toBytes("CF2"), > Bytes.toBytes("C2"), > 1, > Bytes.toBytes("v2")); > addEmptyColumnToDataPutMutation(dataPut, info.pDataTable, 1); > > addCellToPutMutation( > dataPut, > Bytes.toBytes("CF1"), > Bytes.toBytes("C1"), > 2, > Bytes.toBytes("v1")); > addEmptyColumnToDataPutMutation(dataPut, info.pDataTable, 2); > Delete dataDel = new Delete(Bytes.toBytes(ROW_KEY)); > addCellToDelMutation( > dataDel, > Bytes.toBytes("CF1"), > null, > 2, > KeyValue.Type.DeleteFamily); > List<Mutation> actualIndexMutations = > IndexRebuildRegionScanner.prepareIndexMutationsForRebuild( > info.indexMaintainer, > dataPut, > dataDel); > List<Mutation> expectedIndexMutations = new ArrayList<>(); > byte[] idxKeyBytes = generateIndexRowKey("v2"); > > Put idxPut1 = new Put(idxKeyBytes); > addEmptyColumnToIndexPutMutation(idxPut1, info.indexMaintainer, 1); > expectedIndexMutations.add(idxPut1); > > Put idxPut2 = new Put(idxKeyBytes); > addEmptyColumnToIndexPutMutation(idxPut2, info.indexMaintainer, 2); > expectedIndexMutations.add(idxPut2); > assertEqualMutationList(expectedIndexMutations, actualIndexMutations); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)