James Taylor created PHOENIX-4051:
-------------------------------------
Summary: Prevent out-of-order updates for mutable index updates
Key: PHOENIX-4051
URL: https://issues.apache.org/jira/browse/PHOENIX-4051
Project: Phoenix
Issue Type: Bug
Reporter: James Taylor
Out-of-order processing of data rows during index maintenance causes mutable
indexes to become out of sync with the data table. Here's a simple example to
illustrate the issue:
# Assume table T(K,V) and index X(V,K).
# Upsert T(A, 1) at t10. Index updates: Put X(1,A) at t10.
# Upsert T(A, 3) at t30. Index updates: Delete X(1,A) at t29, Put X(3,A) at t30.
# Upsert T(A, 2) at t20. Index updates: Delete X(1,A) at t19, Put X(2,A) at t20, Delete X(2,A) at t29.
Ideally, we'd want to remove the Delete X(1,A) at t29 since this isn't correct
in terms of timeline consistency, but we can't do that with HBase without
support for deleting/undoing Delete markers.
The sequence above is the intended behavior, but it is not what currently
happens. Instead, when T(A, 2) comes in, the Put X(2,A) occurs at t20, but the
corresponding Delete X(2,A) at t29 does not. The index ends up with more rows
than the data table, which makes the index invalid.
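
To make the example concrete, here is a minimal sketch that replays the three
upserts against a toy model of index X. It is not Phoenix or HBase code: the
class IndexTimelineSketch and its methods are hypothetical, and the model
assumes only that a Delete marker at timestamp t masks Puts to the same row
with a timestamp at or before t.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of index table X (hypothetical, not Phoenix code). */
final class IndexTimelineSketch {
    // Per index row: timestamps of Puts and of Delete markers.
    private final Map<String, TreeSet<Long>> puts = new TreeMap<>();
    private final Map<String, TreeSet<Long>> deletes = new TreeMap<>();

    void put(String row, long ts) {
        puts.computeIfAbsent(row, k -> new TreeSet<>()).add(ts);
    }

    void delete(String row, long ts) {
        deletes.computeIfAbsent(row, k -> new TreeSet<>()).add(ts);
    }

    /** A row is visible if its newest Put is newer than its newest Delete marker. */
    List<String> visibleRows() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, TreeSet<Long>> e : puts.entrySet()) {
            long newestPut = e.getValue().last();
            TreeSet<Long> markers = deletes.get(e.getKey());
            long newestDelete = (markers == null) ? Long.MIN_VALUE : markers.last();
            if (newestPut > newestDelete) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    /** Replays the example; the flag controls whether Delete X(2,A) at t29 is issued. */
    static IndexTimelineSketch replay(boolean issueLateDelete) {
        IndexTimelineSketch x = new IndexTimelineSketch();
        x.put("1,A", 10);                        // Upsert T(A, 1) at t10
        x.delete("1,A", 29); x.put("3,A", 30);   // Upsert T(A, 3) at t30
        x.delete("1,A", 19); x.put("2,A", 20);   // Upsert T(A, 2) at t20, arriving late
        if (issueLateDelete) {
            x.delete("2,A", 29);                 // the Delete that is currently missing
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(replay(true).visibleRows());   // [3,A]
        System.out.println(replay(false).visibleRows());  // [2,A, 3,A]
    }
}
```

With the late Delete, the index holds one visible row for the one data row.
Without it, X(2,A) stays visible next to X(3,A).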
A quick fix is to reset the timestamp of the data table mutations to the
current time within the preBatchMutate call, while the row is exclusively
locked. This sidesteps the issue because timestamps then follow the order in
which mutations are processed, so they can no longer arrive out of order.
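
The idea behind the quick fix can be sketched as follows. This is only an
illustration under stated assumptions, not the actual coprocessor change: the
class TimestampResetSketch and its apply method are hypothetical, the current
time is passed in as a parameter, and bumping the timestamp past the last one
assigned (to break ties within the same millisecond) is an assumption that
the description above does not mention.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: assign mutation timestamps while the row lock is held. */
final class TimestampResetSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private long lastAssigned = 0;

    /**
     * Ignores the client-supplied timestamp and returns one based on the
     * current time, chosen while the row is exclusively locked.
     */
    long apply(long clientTs, long now) {
        rowLock.lock();
        try {
            // Assumption: never reuse or go below the last assigned timestamp.
            long ts = Math.max(now, lastAssigned + 1);
            lastAssigned = ts;
            return ts;
        } finally {
            rowLock.unlock();
        }
    }

    public static void main(String[] args) {
        TimestampResetSketch row = new TimestampResetSketch();
        // Client timestamps arrive out of order: t10, t30, t20.
        System.out.println(row.apply(10, 100)); // 100
        System.out.println(row.apply(30, 101)); // 101
        System.out.println(row.apply(20, 101)); // 102
    }
}
```

Because the assigned timestamps increase in processing order, each index
update only has to account for the row state that precedes it.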