James Taylor created PHOENIX-4051:
-------------------------------------

             Summary: Prevent out-of-order updates for mutable index updates
                 Key: PHOENIX-4051
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4051
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor


Out-of-order processing of data rows during index maintenance causes mutable 
indexes to become out of sync with the data table. Here's a simple example to 
illustrate the issue:

# Assume table T(K,V) and index X(V,K).
# Upsert T(A, 1) at t10. Index updates: Put X(1,A) at t10.
# Upsert T(A, 3) at t30. Index updates: Delete X(1,A) at t29, Put X(3,A) at t30.
# Upsert T(A, 2) at t20. Index updates: Delete X(1,A) at t19, Put X(2,A) at t20, Delete X(2,A) at t29.

Ideally, we'd also want to remove the Delete X(1,A) at t29, since it is no 
longer correct in terms of timeline consistency, but we can't do that because 
HBase has no support for deleting or undoing Delete markers.

The above is not what is occurring, however. Instead, when T(A, 2) comes in, 
the Put X(2,A) occurs at t20, but the corresponding Delete does not. X(2,A) 
then stays live alongside X(3,A), so the index ends up with more rows than the 
data table, essentially making the index invalid.
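
The difference between the intended and the observed index state can be 
reproduced with a small toy model. This is only a sketch, not Phoenix code: 
IndexModel and build_index are hypothetical names, and it assumes that the 
Delete which fails to occur is the Delete X(2,A) at t29. It relies on one 
HBase property: a Delete marker masks every Put with a timestamp less than or 
equal to its own, regardless of the order in which they arrive.

```python
class IndexModel:
    """Toy index: per index row, the timestamps of its Puts and Delete markers."""

    def __init__(self):
        self.puts = {}
        self.deletes = {}

    def put(self, row, ts):
        self.puts.setdefault(row, set()).add(ts)

    def delete(self, row, ts):
        self.deletes.setdefault(row, set()).add(ts)

    def visible_rows(self):
        # A row is visible if some Put is newer than every Delete marker on it.
        # A marker masks Puts with timestamp <= its own, whatever the arrival order.
        live = []
        for row, put_ts in self.puts.items():
            newest_delete = max(self.deletes.get(row, ()), default=-1)
            if max(put_ts) > newest_delete:
                live.append(row)
        return sorted(live)


def build_index(include_delete_of_2):
    """Replay the index updates from the example, in arrival order."""
    x = IndexModel()
    x.put((1, "A"), 10)         # Upsert T(A, 1) at t10
    x.delete((1, "A"), 29)      # Upsert T(A, 3) at t30
    x.put((3, "A"), 30)
    x.delete((1, "A"), 19)      # Upsert T(A, 2) at t20 (arrives last)
    x.put((2, "A"), 20)
    if include_delete_of_2:     # assumption: this is the Delete that goes missing
        x.delete((2, "A"), 29)
    return x


intended = build_index(True).visible_rows()
observed = build_index(False).visible_rows()
print(intended)  # [(3, 'A')]
print(observed)  # [(2, 'A'), (3, 'A')]
```

With the Delete X(2,A) at t29 in place, only X(3,A) is visible, matching the 
single data row T(A, 3). Without it, the index exposes two rows for one data 
row.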

A quick fix is to reset the timestamp of the data table mutations to the 
current time within the preBatchMutate call, when the row is exclusively 
locked. This skirts the issue because timestamps for a given row are then 
assigned in the order the mutations are processed, so they can no longer 
arrive out of order.
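
The effect of the quick fix can be sketched with the same kind of toy model. 
Everything here is an assumption made for illustration: Model, run and 
server_clock are hypothetical names, the index maintainer is deliberately 
naive (it only deletes the index row of the prior data row version), and 
processing the upserts one after another stands in for holding the row lock 
inside preBatchMutate. None of it is the actual Phoenix implementation.

```python
import itertools


class Model:
    """Toy data table T(K,V) plus index X(V,K), both keyed by timestamp."""

    def __init__(self):
        self.data = {}            # key -> {ts: value}
        self.index_puts = {}      # (value, key) -> set of Put timestamps
        self.index_deletes = {}   # (value, key) -> set of Delete timestamps

    def upsert(self, key, value, ts):
        versions = self.data.setdefault(key, {})
        older = [t for t in versions if t < ts]
        if older:
            # Naive maintenance: delete the index row of the prior version only.
            prior_value = versions[max(older)]
            self.index_deletes.setdefault((prior_value, key), set()).add(ts - 1)
        versions[ts] = value
        self.index_puts.setdefault((value, key), set()).add(ts)

    def data_rows(self):
        # Latest version per key, written as (value, key) to compare with X(V,K).
        return sorted((v[max(v)], k) for k, v in self.data.items())

    def index_rows(self):
        # An index row is live if some Put is newer than all its Delete markers.
        live = []
        for row, put_ts in self.index_puts.items():
            newest_delete = max(self.index_deletes.get(row, ()), default=-1)
            if max(put_ts) > newest_delete:
                live.append(row)
        return sorted(live)


def run(upserts, server_clock=None):
    """Apply upserts in arrival order, optionally overriding client timestamps."""
    m = Model()
    for key, value, client_ts in upserts:
        ts = next(server_clock) if server_clock is not None else client_ts
        m.upsert(key, value, ts)
    return m


ARRIVAL_ORDER = [("A", 1, 10), ("A", 3, 30), ("A", 2, 20)]
broken = run(ARRIVAL_ORDER)
fixed = run(ARRIVAL_ORDER, server_clock=itertools.count(100, 10))
print(broken.data_rows(), broken.index_rows())  # [(3, 'A')] [(2, 'A'), (3, 'A')]
print(fixed.data_rows(), fixed.index_rows())    # [(2, 'A')] [(2, 'A')]
```

Note that in this model the reset also changes which value wins: the 
client-supplied timestamps are discarded, so the last upsert to arrive, 
T(A, 2), becomes the current data row.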



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
