Kadir OZDEMIR created PHOENIX-5813:
--------------------------------------

             Summary: Index read repair should not interfere with concurrent 
updates 
                 Key: PHOENIX-5813
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5813
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.14.3, 5.0.0
            Reporter: Kadir OZDEMIR


Let \{1, a, x, y} be a row in the data table. Let the first column be the only 
pk column and the second column be the only indexed column of the table, and 
finally let the fourth column be the only column covered by the index for this 
table. The corresponding row in the index table would be \{a, 1, y}. 
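
The mapping above can be sketched as follows (an illustrative helper, not 
Phoenix's actual code): the indexed column becomes the leading part of the 
index row key, followed by the data pk, followed by the covered column.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch (not Phoenix code): derive the index table row from a
// data table row shaped {pk, indexed, other, covered}, as in the example above.
class IndexRowSketch {
    // Index row layout: {indexed column, data pk, covered column}.
    static List<String> toIndexRow(List<String> dataRow) {
        return Arrays.asList(dataRow.get(1), dataRow.get(0), dataRow.get(3));
    }

    public static void main(String[] args) {
        // Data row {1, a, x, y} maps to index row {a, 1, y}.
        System.out.println(toIndexRow(Arrays.asList("1", "a", "x", "y")));
    }
}
```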

Now, let the same data table row be mutated so that the new state of the row is 
\{1, b, x, y}. The index row \{a, 1, y} is no longer valid in the index table 
and needs to be deleted. Thus, the prepared index mutations will include a 
delete mutation for the row key \{a, 1} and a put mutation, namely 
put \{b, 1, y}, for the new row. 
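
A minimal sketch of this mutation preparation (hypothetical names, not 
Phoenix's actual API): delete the index row derived from the old row state, 
put the index row derived from the new state.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch (not Phoenix's API): prepare the index mutations for a
// data row update, given the old and new data row states {pk, indexed, other,
// covered}.
class IndexMutationSketch {
    // Returns human-readable mutations: delete the index row for the old
    // state, put the index row for the new state.
    static List<String> prepareIndexMutations(List<String> oldRow, List<String> newRow) {
        String deleteKey = oldRow.get(1) + "," + oldRow.get(0);                   // old index row key
        String putRow = newRow.get(1) + "," + newRow.get(0) + "," + newRow.get(3); // new index row
        return Arrays.asList("DELETE " + deleteKey, "PUT " + putRow);
    }

    public static void main(String[] args) {
        // {1, a, x, y} -> {1, b, x, y} yields: DELETE a,1 and PUT b,1,y.
        System.out.println(prepareIndexMutations(
                Arrays.asList("1", "a", "x", "y"),
                Arrays.asList("1", "b", "x", "y")));
    }
}
```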

Let \{1, c, x, y} be another mutation on the same row that arrives before the 
previous mutation updates the data table. This means that the prepared index 
mutations will again include a delete mutation for the row key \{a, 1} and a 
put mutation, namely put \{c, 1, y}. However, this last update should have 
deleted the index row \{b, 1} instead of \{a, 1}. To prevent this, 
IndexRegionObserver maintains a collection of data table row keys for the 
pending data table row updates in order to detect concurrent updates, and skips 
the third write phase for them. In the first write phase, index rows are made 
unverified, and in the third write phase, they are made verified or deleted. 
The read-repair operation on these unverified rows then leads to the proper 
resolution of the concurrent updates. 
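
The detection described above can be sketched roughly as follows (hypothetical 
names and structure, not the actual IndexRegionObserver implementation): a 
count of pending updates per data row key, where a second in-flight update on 
the same key marks the updates as concurrent.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not the actual IndexRegionObserver code): track pending
// updates per data row key; a second pending update on the same key means the
// updates are concurrent, so their third write phase is skipped.
class PendingRowTracker {
    private final Map<String, Integer> pending = new HashMap<>();

    // First write phase: register the update and report whether another
    // update on the same data row is still in flight.
    synchronized boolean registerAndCheckConcurrent(String dataRowKey) {
        return pending.merge(dataRowKey, 1, Integer::sum) > 1;
    }

    // Called when the update finishes (or is abandoned).
    synchronized void unregister(String dataRowKey) {
        pending.merge(dataRowKey, -1, Integer::sum);
        if (pending.get(dataRowKey) <= 0) pending.remove(dataRowKey);
    }
}
```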

Therefore, two or more pending updates from different batches on the same data 
row are concurrent if and only if all of them have read the data table row 
state from HBase under a Phoenix-level row lock and none of them has acquired 
the row lock a second time to update the data table. In other words, all of 
them are in the first write phase concurrently. For concurrent updates, the 
first two write phases are done but the last write phase is skipped. This means 
the data table row will be updated by these updates, but the corresponding 
index table rows will be left in the unverified state. The read repair process 
will then repair these unverified index rows during scans.
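
The three write phases above can be summarized in a small skeleton (an 
illustrative structure, not Phoenix's actual code): for a concurrent update, 
phase 3 is skipped and the index row stays unverified until read repair 
resolves it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative skeleton of the three write phases (not Phoenix's actual code).
class WritePhasesSketch {
    static List<String> applyUpdate(boolean concurrent) {
        List<String> log = new ArrayList<>();
        log.add("phase 1: set index row unverified");
        log.add("phase 2: write data table row");
        if (!concurrent) {
            log.add("phase 3: set index row verified (or delete it)");
        } // else: leave the row unverified; read repair resolves it later
        return log;
    }

    public static void main(String[] args) {
        System.out.println(applyUpdate(false));
        System.out.println(applyUpdate(true));
    }
}
```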

For the example given above, \{1, b, x, y} and \{1, c, x, y} are concurrent 
updates (on the same data table row). As explained above, the index rows 
generated for these updates should be left unverified. Now assume that a scan 
on the index table detects that index row \{b, 1, y} is unverified while the 
concurrent updates are in progress, and the index row is repaired from the data 
table. It is possible that the read repair gets the row \{1, b, x, y} from the 
data table. It will then rebuild the corresponding index row, which is the row 
\{b, 1, y}, and make the row verified. This rebuild may happen just after the 
row \{b, 1, y} is made unverified by the concurrent updates. This means that 
the repair will overwrite the result of the concurrent updates. 

This scan will return \{b, 1, y} to the client. The same scan may then also 
detect that \{c, 1, y} is unverified. By the time this row is repaired, the 
data table row could be \{1, c, x, y}. This means the corresponding index row 
\{c, 1, y} will be made verified by the read repair and also returned to the 
client for the same scan. However, only one of these index rows should have 
been returned to the client.
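
The interleaving above can be replayed as a toy simulation (illustrative only, 
with made-up names): repairing each unverified index row against whatever the 
data row holds at that moment can leave two verified index rows for a single 
data row.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy replay of the race described above (illustrative only, not Phoenix
// code): each repair rebuilds the index row from the data row's state at that
// moment, so two unverified rows can both end up verified.
class RepairRaceSketch {
    static List<String> dataRow;                    // current data table row
    static Set<String> verified = new HashSet<>();  // verified index row keys

    // Repair: rebuild the index row key from the current data row; if it
    // matches the unverified row's key, mark that row verified.
    static void repair(String unverifiedIndexKey) {
        String rebuiltKey = dataRow.get(1) + "," + dataRow.get(0);
        if (rebuiltKey.equals(unverifiedIndexKey)) verified.add(unverifiedIndexKey);
    }

    public static void main(String[] args) {
        dataRow = Arrays.asList("1", "b", "x", "y");
        repair("b,1");                               // verifies {b, 1, y}
        dataRow = Arrays.asList("1", "c", "x", "y"); // concurrent update lands
        repair("c,1");                               // also verifies {c, 1, y}
        System.out.println(verified);                // two rows for one data row
    }
}
```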



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
