Tanuj Khurana created PHOENIX-7758:
--------------------------------------

             Summary: Read repair with scan filters can give incorrect results
                 Key: PHOENIX-7758
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7758
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 5.3.0, 5.2.1, 5.1.2
            Reporter: Tanuj Khurana
            Assignee: Tanuj Khurana


When we scan an index table and find an unverified row, we trigger the read 
repair process. Read repair can end up deleting the index row, but if the scan 
has filters, the filter state is not reset afterwards. This can produce 
incorrect results. One example is DistinctPrefixFilter. Assume that the first 
row for a given prefix is unverified and is deleted by read repair. When the 
scan reaches the next row with the same prefix, DistinctPrefixFilter ignores it 
because it has already seen that prefix, and seeks to the next row key prefix, 
thereby skipping all remaining rows with that prefix. The prefix is then 
missing from the result even though valid rows exist for it.
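
The failure mode can be reproduced with a small, self-contained simulation. 
This is only a sketch under stated assumptions: ToyDistinctPrefixFilter, scan 
and the deletedByRepair set are hypothetical stand-ins, not the Phoenix or 
HBase classes, and row keys are plain strings whose first prefixLength 
characters form the prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class StalePrefixFilterSketch {

    /** Toy stand-in for a distinct-prefix filter (hypothetical, not Phoenix code). */
    static final class ToyDistinctPrefixFilter {
        private final int prefixLength;
        private String lastPrefix; // state that is kept across rows

        ToyDistinctPrefixFilter(int prefixLength) {
            this.prefixLength = prefixLength;
        }

        /** Returns true only for the first row seen with a given prefix. */
        boolean include(String rowKey) {
            String prefix = rowKey.substring(0, prefixLength);
            if (prefix.equals(lastPrefix)) {
                return false; // a real filter would seek to the next prefix here
            }
            lastPrefix = prefix;
            return true;
        }
    }

    /**
     * Simulates a scan over sorted row keys. Rows in deletedByRepair model
     * unverified index rows that read repair ends up deleting.
     */
    static List<String> scan(List<String> sortedRowKeys,
                             Set<String> deletedByRepair,
                             int prefixLength) {
        ToyDistinctPrefixFilter filter = new ToyDistinctPrefixFilter(prefixLength);
        List<String> results = new ArrayList<>();
        for (String rowKey : sortedRowKeys) {
            if (!filter.include(rowKey)) {
                continue; // filter has already recorded this prefix
            }
            if (deletedByRepair.contains(rowKey)) {
                continue; // row is dropped, but the filter still remembers its prefix
            }
            results.add(rowKey);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a1", "a2", "a3", "b1");
        System.out.println(scan(rows, Set.of(), 1));     // [a1, b1]
        System.out.println(scan(rows, Set.of("a1"), 1)); // [b1]
    }
}
```

In the second call the prefix "a" disappears from the result even though "a2" 
and "a3" are still present, which mirrors the behaviour described above.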

One solution is to add a _reinitialize_ API to the filter interface so that we 
can reset the state of the filter after read repair deletes a row. HBase 
already defines a _reset_ API on the filter interface, but it is called after 
every row to reset per-row state. The state that we want to _reinitialize_ is 
maintained across rows.
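
A minimal sketch of how such an API might look, assuming a simplified filter 
contract. ToyFilter, its reinitialize method and the scan loop are hypothetical 
illustrations of the proposal, not the HBase Filter class or an agreed design. 
Row keys are plain strings, and deletedByRepair models the rows that read 
repair removes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ReinitializeSketch {

    /** Hypothetical filter contract used only for this illustration. */
    interface ToyFilter {
        boolean include(String rowKey);

        /** Called after every row; clears per-row state only. */
        void reset();

        /** Proposed addition: clears state that is kept across rows. */
        void reinitialize();
    }

    static final class ToyDistinctPrefixFilter implements ToyFilter {
        private final int prefixLength;
        private String lastPrefix; // cross-row state

        ToyDistinctPrefixFilter(int prefixLength) {
            this.prefixLength = prefixLength;
        }

        @Override
        public boolean include(String rowKey) {
            String prefix = rowKey.substring(0, prefixLength);
            if (prefix.equals(lastPrefix)) {
                return false;
            }
            lastPrefix = prefix;
            return true;
        }

        @Override
        public void reset() {
            // No per-row state here; lastPrefix must survive across rows.
        }

        @Override
        public void reinitialize() {
            lastPrefix = null; // forget the prefix recorded for the deleted row
        }
    }

    /** Simulated scan that reinitializes the filter when repair deletes a row. */
    static List<String> scan(List<String> sortedRowKeys,
                             Set<String> deletedByRepair,
                             ToyFilter filter) {
        List<String> results = new ArrayList<>();
        for (String rowKey : sortedRowKeys) {
            boolean included = filter.include(rowKey);
            filter.reset(); // per-row reset, as happens after every row
            if (!included) {
                continue;
            }
            if (deletedByRepair.contains(rowKey)) {
                filter.reinitialize(); // row never reached the client
                continue;
            }
            results.add(rowKey);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a1", "a2", "a3", "b1");
        System.out.println(scan(rows, Set.of("a1"), new ToyDistinctPrefixFilter(1))); // [a2, b1]
    }
}
```

In this toy model, clearing all cross-row state is sufficient because rows 
arrive in sorted order. A real filter may need to restore its previous state 
rather than clear it.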



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
