Kadir OZDEMIR created PHOENIX-5795:
--------------------------------------

             Summary: Supporting selective queries for index rows updated 
concurrently
                 Key: PHOENIX-5795
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5795
             Project: Phoenix
          Issue Type: Sub-task
            Reporter: Kadir OZDEMIR


From the consistent indexing design (PHOENIX-5156) perspective, two or more 
pending updates from different batches on the same data row are concurrent if 
and only if, for all of these updates, the data table row state has been read 
from HBase under the row lock and, for none of them, the row lock has been 
acquired a second time to update the data table. In other words, all of them 
are in the first update phase concurrently. For concurrent updates, the first 
two update phases are done but the last update phase is skipped. This means 
the data table row will be updated by these updates but the corresponding 
index table rows will be left in the unverified status. The read repair 
process will then repair these unverified index rows during scans.
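
The concurrency condition above can be sketched as a small predicate. This is 
an illustrative model only, not the IndexRegionObserver code: the class, the 
Phase enum, and the method names are hypothetical, and the sketch assumes 
"different batches" means that every update in the group carries a distinct 
batch id.

```java
import java.util.List;

class ConcurrentUpdateSketch {
    // Hypothetical phases of a pending update, named after the description above.
    enum Phase { NOT_STARTED, ROW_STATE_READ, LOCK_REACQUIRED }

    static final class PendingUpdate {
        final int batchId;
        final String rowKey;
        final Phase phase;

        PendingUpdate(int batchId, String rowKey, Phase phase) {
            this.batchId = batchId;
            this.rowKey = rowKey;
            this.phase = phase;
        }
    }

    // True if and only if all updates target the same data row, come from
    // different batches, have read the row state, and none of them has
    // re-acquired the row lock to update the data table.
    static boolean areConcurrent(List<PendingUpdate> updates) {
        if (updates.size() < 2) {
            return false;
        }
        String row = updates.get(0).rowKey;
        long distinctBatches =
            updates.stream().mapToInt(u -> u.batchId).distinct().count();
        return distinctBatches == updates.size()
            && updates.stream().allMatch(
                u -> u.rowKey.equals(row) && u.phase == Phase.ROW_STATE_READ);
    }

    // Concurrent updates skip the last update phase, so their index rows
    // keep the unverified status until read repair fixes them.
    static String indexRowStatus(boolean concurrent) {
        return concurrent ? "UNVERIFIED" : "VERIFIED";
    }

    public static void main(String[] args) {
        List<PendingUpdate> updates = List.of(
            new PendingUpdate(1, "row1", Phase.ROW_STATE_READ),
            new PendingUpdate(2, "row1", Phase.ROW_STATE_READ));
        System.out.println(indexRowStatus(areConcurrent(updates)));
    }
}
```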

In addition to leaving index rows unverified, concurrent updates may 
generate index rows with incorrect row keys. For example, consider that an 
application issues the very first two upserts on the same row concurrently 
and the second update does not include one or more of the indexed columns. When 
these updates arrive concurrently at IndexRegionObserver, the existing row 
state would be null for both of these updates. This means the index updates 
will be generated solely from the pending updates. The partial upsert with 
missing indexed columns will generate an index row by assuming the missing 
indexed columns have null values, and this assumption may not be true, as the 
other concurrent upsert may have non-null values for those columns. After 
issuing the concurrent updates, if the application attempts to read back the 
row using a selective query on the index table, and this selective query maps 
to an HBase scan that does not cover these unverified rows because of their 
incorrect row keys, the application will not get the row content back 
correctly.
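
The scenario above can be illustrated with a minimal sketch. Everything in it 
is an assumption made for illustration: a hypothetical index on columns 
(v1, v2) with the data table primary key appended, string row keys joined 
with '|', and the second upsert carrying the later timestamp. It does not 
reflect the actual Phoenix row key encoding.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConcurrentIndexKeySketch {
    // Hypothetical index definition: indexed columns v1 and v2.
    static final List<String> INDEXED_COLUMNS = List.of("v1", "v2");

    // Builds an index row key from the existing row state (may be null)
    // overlaid with the pending update. A missing indexed column is
    // treated as null, which is the assumption described above.
    static String indexRowKey(String pk, Map<String, String> existingRow,
                              Map<String, String> pendingUpdate) {
        Map<String, String> merged = new HashMap<>();
        if (existingRow != null) {
            merged.putAll(existingRow);
        }
        merged.putAll(pendingUpdate);
        StringBuilder key = new StringBuilder();
        for (String column : INDEXED_COLUMNS) {
            key.append(merged.get(column)).append('|');
        }
        return key.append(pk).toString();
    }

    public static void main(String[] args) {
        Map<String, String> first = Map.of("v1", "a", "v2", "b");
        Map<String, String> second = Map.of("v2", "c"); // v1 is not included

        // Both updates are the very first on the row and are concurrent,
        // so each one sees a null existing row state.
        String keyFromFirst = indexRowKey("1", null, first);
        String keyFromSecond = indexRowKey("1", null, second);

        // The data table row ends up with both updates applied.
        Map<String, String> dataRow = new HashMap<>(first);
        dataRow.putAll(second);
        String correctKey = indexRowKey("1", null, dataRow);

        System.out.println("from first upsert:  " + keyFromFirst);
        System.out.println("from second upsert: " + keyFromSecond);
        System.out.println("matching data row:  " + correctKey);
    }
}
```

Under these assumptions the two unverified index rows get the keys a|b|1 and 
null|c|1, while the data row corresponds to a|c|1. A selective query such as 
v1 = 'a' AND v2 = 'c' would scan only the a|c| prefix, so it would reach 
neither unverified row and read repair would not be triggered for them.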



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
