Viraj Jasani created PHOENIX-7733:
-------------------------------------
Summary: Data integrity issue impacting uncovered indexes (and
potentially others) with rare occurrence
Key: PHOENIX-7733
URL: https://issues.apache.org/jira/browse/PHOENIX-7733
Project: Phoenix
Issue Type: Improvement
Reporter: Viraj Jasani
Phoenix provides two types of indexes: covered indexes and uncovered indexes.
While running some tests on uncovered indexes, we discovered a data integrity
issue that occurs when running against HBase 2.5 but not against HBase 2.6.
The issue is likely not limited to uncovered indexes.
While scanning rows in an uncovered index table, the corresponding full row is
scanned from the data table. If the user provides a condition expression, the
condition is evaluated on the data table row, as a server-side filter on the
table regions. In the test that discovered the issue, a very large number of
rows at the beginning of the scan do not satisfy the filter expression. In other
words, more than "hbase.client.scanner.max.result.size" bytes worth of rows do
not satisfy the filter expression. As a result, the scanner returns no rows on
HBase 2.5. However, increasing "hbase.client.scanner.max.result.size" to a
higher value made the scanner return the correct result.
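The symptom can be illustrated with a small, self-contained toy model. This is
only a sketch: the root cause has not been identified, so the code below does
not describe the actual HBase or Phoenix code path. ToyRegionScanner, Batch and
both consumer methods are hypothetical names, and the model assumes that rows
rejected by the filter still count toward the size limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model only. None of these classes are HBase or Phoenix APIs.
class ScanSizeLimitSketch {

    // One simulated RPC response: the rows that passed the filter, and
    // whether the region still has unscanned rows.
    static final class Batch {
        final List<Integer> rows;
        final boolean moreRows;

        Batch(List<Integer> rows, boolean moreRows) {
            this.rows = rows;
            this.moreRows = moreRows;
        }
    }

    // Hypothetical region scanner. ASSUMPTION: every examined row counts
    // toward the size limit, even when the filter rejects it.
    static final class ToyRegionScanner {
        private final int totalRows;
        private final int rowBytes;
        private final long maxResultBytes;
        private final IntPredicate filter;
        private int position = 0;

        ToyRegionScanner(int totalRows, int rowBytes, long maxResultBytes,
                         IntPredicate filter) {
            this.totalRows = totalRows;
            this.rowBytes = rowBytes;
            this.maxResultBytes = maxResultBytes;
            this.filter = filter;
        }

        // Scans forward until the size limit is reached or rows run out.
        Batch next() {
            List<Integer> passed = new ArrayList<>();
            long scannedBytes = 0;
            while (position < totalRows && scannedBytes < maxResultBytes) {
                if (filter.test(position)) {
                    passed.add(position);
                }
                scannedBytes += rowBytes;
                position++;
            }
            return new Batch(passed, position < totalRows);
        }
    }

    // 1000 rows of 100 bytes each; only the last 100 rows satisfy the filter.
    static ToyRegionScanner newScanner(long maxResultBytes) {
        return new ToyRegionScanner(1000, 100, maxResultBytes, row -> row >= 900);
    }

    // Flawed consumer: treats an empty batch as the end of the scan.
    static List<Integer> scanStoppingOnEmptyBatch(ToyRegionScanner scanner) {
        List<Integer> out = new ArrayList<>();
        while (true) {
            Batch batch = scanner.next();
            if (batch.rows.isEmpty()) {
                break; // wrong: an empty batch does not mean the region is done
            }
            out.addAll(batch.rows);
            if (!batch.moreRows) {
                break;
            }
        }
        return out;
    }

    // Consumer that keeps going until the scanner reports no more rows.
    static List<Integer> scanUntilExhausted(ToyRegionScanner scanner) {
        List<Integer> out = new ArrayList<>();
        Batch batch;
        do {
            batch = scanner.next();
            out.addAll(batch.rows);
        } while (batch.moreRows);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scanStoppingOnEmptyBatch(newScanner(10_000)).size());  // 0
        System.out.println(scanUntilExhausted(newScanner(10_000)).size());        // 100
        System.out.println(scanStoppingOnEmptyBatch(newScanner(200_000)).size()); // 100
    }
}
```

In this model, the 10,000-byte limit makes the first batch come back empty, so
the flawed consumer returns no rows. Raising the limit to 200,000 bytes lets one
batch cover the whole region, which hides the flaw. This mirrors the observed
behavior but is not evidence of where the real defect lies.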
This data correctness issue is not present on HBase 2.6 because HBASE-27558
happens to fix it, although fixing it was not the intention of that Jira.
The large number of changes between HBase 2.5 and 2.6 in the scan path (mostly
related to quotas and metrics) makes it difficult to find the root cause.
Jira for fixing the issue on HBase 2.5: HBASE-29722
The purpose of this Jira is to create tests that reproduce the rare occurrence
of the data correctness issue. We need to wait until a new HBase 2.5 release
that includes the above fix is available.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)