[ https://issues.apache.org/jira/browse/PHOENIX-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kadir OZDEMIR updated PHOENIX-5743:
-----------------------------------
    Attachment: PHOENIX-5743.4.x-HBase-1.3.001.patch

> Concurrent read repairs on the same index row should be idempotent
> ------------------------------------------------------------------
>
>                 Key: PHOENIX-5743
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5743
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.3
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Critical
>         Attachments: PHOENIX-5743.4.x-HBase-1.3.001.patch,
> PHOENIX-5743.master.001.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> It is possible for two or more read repairs to work on the same index row
> concurrently. Regardless of how many read repairs run concurrently on this
> row, the end result should be the same. The current implementation does not
> satisfy this property in one case, which can occur as follows:
> # An update on a data table row fails because the data table row write (the
> phase 2 write) fails. Since phase 1 (the unverified index write) has already
> completed, this leaves an unverified row in the index table.
> # Two (or more) concurrent queries on this table scan this unverified index
> row.
> # Each query triggers a separate read repair.
> # The first one deletes the unverified row correctly.
> # The subsequent ones may leave a wrong delete marker, which corrupts this
> index row.
> Step 5 can happen because of two bugs in deleteRowIfAgedEnough() in
> GlobalIndexChecker.GlobalIndexScanner:
> # "deleteRowScan.setTimeRange(0, ts + 1);" should read
> "deleteRowScan.setTimeRange(ts, ts + 1);". This ensures that the first read
> repair retrieves the cells of the unverified row with the timestamp ts, and
> that a subsequent read repair gets either the same set of cells the first
> one got or no cells (i.e., an empty row).
> # If the unverified row has already been deleted, deleteRowIfAgedEnough()
> should do nothing and return. However, in the current implementation, the
> read repair either retrieves the previous row version (i.e., the version
> preceding the unverified row) and leaves DeleteColumn markers for the wrong
> cells, or gets no cells (if no previous row version exists) and leaves a
> DeleteFamily marker, which will delete all previous versions of the row if
> such rows are inserted back by an index rebuild.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
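
The scenario in the issue description can be illustrated with a small in-memory model. This is only a sketch and not the Phoenix code: ToyRow, repair(), run() and the column names EMPTY and COL_A are hypothetical, repairs are run one after another rather than truly concurrently, and the model assumes that a repair places a marker on each cell its scan returned (at that cell's own timestamp) and a family marker at ts when the scan returns nothing. The "fixed" flag switches between the time range (0, ts + 1) with no early return and the range (ts, ts + 1) with the early return.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReadRepairSketch {
    // Toy index row: column name -> (timestamp -> value). Not an HBase type.
    static final class ToyRow {
        final Map<String, TreeMap<Long, String>> columns = new TreeMap<>();
        final List<String> markers = new ArrayList<>();

        void put(String column, long ts, String value) {
            columns.computeIfAbsent(column, c -> new TreeMap<>()).put(ts, value);
        }

        // Newest remaining version of each column with minTs <= timestamp < maxTs.
        Map<String, Long> scan(long minTs, long maxTs) {
            Map<String, Long> hits = new LinkedHashMap<>();
            for (Map.Entry<String, TreeMap<Long, String>> e : columns.entrySet()) {
                Long newest = e.getValue().lowerKey(maxTs);
                if (newest != null && newest >= minTs) {
                    hits.put(e.getKey(), newest);
                }
            }
            return hits;
        }
    }

    // One read repair for the unverified row written at timestamp ts.
    // fixed == false models the behavior described in the issue (assumption).
    static void repair(ToyRow row, long ts, boolean fixed) {
        long minTs = fixed ? ts : 0L;
        Map<String, Long> hits = row.scan(minTs, ts + 1);
        if (hits.isEmpty()) {
            if (!fixed) {
                // Buggy path: an empty scan still leaves a family marker.
                row.markers.add("DeleteFamily@" + ts);
            }
            return; // Fixed path: row already deleted, do nothing.
        }
        for (Map.Entry<String, Long> hit : hits.entrySet()) {
            // Assumption: the marker targets the cell the scan returned.
            row.markers.add("DeleteColumn " + hit.getKey() + "@" + hit.getValue());
            row.columns.get(hit.getKey()).remove(hit.getValue());
        }
    }

    // Builds a row with a verified version at 100 and an unverified one at 200,
    // runs the given number of repairs, and returns the markers left behind.
    static List<String> run(boolean fixed, int repairs) {
        ToyRow row = new ToyRow();
        row.put("EMPTY", 100L, "verified");
        row.put("COL_A", 100L, "old");
        row.put("EMPTY", 200L, "unverified");
        row.put("COL_A", 200L, "new");
        for (int i = 0; i < repairs; i++) {
            repair(row, 200L, fixed);
        }
        return row.markers;
    }

    public static void main(String[] args) {
        System.out.println("fixed, 1 repair:  " + run(true, 1));
        System.out.println("fixed, 3 repairs: " + run(true, 3));
        System.out.println("buggy, 1 repair:  " + run(false, 1));
        System.out.println("buggy, 3 repairs: " + run(false, 3));
    }
}
```

In this model the fixed variant leaves the same markers after three repairs as after one, which is the idempotence the issue asks for. The buggy variant goes on to mark the cells of the version at timestamp 100 on the second repair and adds a family marker on the third.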