[ https://issues.apache.org/jira/browse/PHOENIX-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kadir OZDEMIR updated PHOENIX-5743:
-----------------------------------
Attachment: PHOENIX-5743.4.x-HBase-1.3.001.patch
> Concurrent read repairs on the same index row should be idempotent
> ------------------------------------------------------------------
>
> Key: PHOENIX-5743
> URL: https://issues.apache.org/jira/browse/PHOENIX-5743
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.0.0, 4.14.3
> Reporter: Kadir OZDEMIR
> Assignee: Kadir OZDEMIR
> Priority: Critical
> Attachments: PHOENIX-5743.4.x-HBase-1.3.001.patch,
> PHOENIX-5743.master.001.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> It is possible that two or more read repairs can work on the same row.
> Regardless of how many read repairs concurrently happen on this row, the end
> result should be the same. The current implementation does not satisfy this
> property in one case. This can happen with the following steps:
> # An update on a data table row fails because the data table row write (the
> phase 2 write) fails. Since phase 1 (the unverified index write) has already
> completed, this leaves an unverified row in the index table.
> # Two (or more) concurrent queries on this table scan this unverified index
> row.
> # Each query triggers a separate read repair activity.
> # The first one deletes the unverified row correctly.
> # The subsequent ones may leave a wrong delete marker, which corrupts this
> index row.
> Step 5 can happen because of two bugs in deleteRowIfAgedEnough() in
> GlobalIndexChecker.GlobalIndexScanner:
> # "deleteRowScan.setTimeRange(0, ts + 1);" should read
> "deleteRowScan.setTimeRange(ts, ts + 1);". This ensures that the first read
> repair retrieves only the cells of the unverified row with the timestamp ts,
> and that any subsequent read repair gets either the same set of cells the
> first one got, or no cells (i.e., an empty row).
> # If the unverified row has already been deleted, deleteRowIfAgedEnough()
> should do nothing and return. However, in the current implementation, the
> read repair either retrieves the previous row version (i.e., the version
> before the unverified row) and leaves DeleteColumn markers for the wrong
> cells, or it gets no cells (if no previous row version exists) and leaves a
> DeleteFamily marker, which will delete all previous versions of the row if
> such rows are inserted back by an index rebuild.
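
The two fixes described above can be illustrated with a small in-memory model. This is only a sketch, not the Phoenix or HBase code: ReadRepairSketch, ToyIndexRow, repair() and run() are hypothetical names, a row version is reduced to a single string keyed by its timestamp, and the "concurrent" repairs are replayed one after another, which corresponds to the interleaving where a later repair scans after an earlier one has finished its delete. With fixed == false the scan covers [0, ts + 1) and a delete is always written; with fixed == true the scan covers [ts, ts + 1) and the repair returns when it finds nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReadRepairSketch {

    /** Toy stand-in for one index row (timestamp -> row version). Not the HBase API. */
    public static final class ToyIndexRow {
        public final TreeMap<Long, String> versions = new TreeMap<>();
        public final List<String> markers = new ArrayList<>();
    }

    /** One read repair of the unverified version written at timestamp ts. */
    public static void repair(ToyIndexRow row, long ts, boolean fixed) {
        // Lower bound of the scan time range: 0 in the reported behavior, ts in the fix.
        long minTs = fixed ? ts : 0L;
        // Newest version with minTs <= timestamp <= ts, i.e. the range [minTs, ts + 1).
        Map.Entry<Long, String> hit =
                row.versions.subMap(minTs, true, ts, true).lastEntry();
        if (hit == null) {
            if (fixed) {
                return; // the unverified row is already gone: do nothing
            }
            // Reported behavior: a marker is written even though nothing was found.
            row.markers.add("DeleteFamily@" + ts);
            return;
        }
        // Delete exactly the version that the scan returned.
        row.versions.remove(hit.getKey());
        row.markers.add("DeleteColumn@" + hit.getKey());
    }

    /** Builds a row with a verified version at 10 and an unverified one at 20, then repairs it. */
    public static ToyIndexRow run(int repairs, boolean fixed) {
        ToyIndexRow row = new ToyIndexRow();
        row.versions.put(10L, "verified");
        row.versions.put(20L, "unverified");
        for (int i = 0; i < repairs; i++) {
            repair(row, 20L, fixed);
        }
        return row;
    }

    public static void main(String[] args) {
        for (boolean fixed : new boolean[] {false, true}) {
            ToyIndexRow row = run(2, fixed);
            System.out.println("fixed=" + fixed + " versions=" + row.versions
                    + " markers=" + row.markers);
        }
    }
}
```

In this model, repeating the repair with fixed == false also removes the verified version at timestamp 10, while with fixed == true any number of repairs leaves the same versions and the same single marker as one repair.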