Viraj Jasani created HBASE-29722:
------------------------------------
Summary: Fix data integrity issues for scan with heavy filters
(backport portion of HBASE-27558)
Key: HBASE-29722
URL: https://issues.apache.org/jira/browse/HBASE-29722
Project: HBase
Issue Type: Bug
Reporter: Viraj Jasani
Phoenix provides two types of indexes: covered indexes and uncovered indexes.
While running some tests on uncovered indexes, we discovered a data integrity
issue that occurs when run against HBase 2.5 but not when run against HBase 2.6.
The issue is likely not limited to uncovered indexes.
While scanning rows in an uncovered index table, the corresponding full row is
scanned from the data table. If the user provides a condition expression, the
condition is evaluated on the data table row. The condition is evaluated
as a server-side filter on the table regions. The test that discovered the issue
has a very large number of rows at the beginning that do not satisfy the filter
expression. In other words, more than "hbase.client.scanner.max.result.size"
bytes worth of rows do not satisfy the filter expression. As a result, the
scanner returns no rows on HBase 2.5. However, increasing
"hbase.client.scanner.max.result.size" to a higher value made the scanner return
the correct result.
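The symptom above can be illustrated with a small toy model. This is only a
sketch and not HBase code: the class FilteredScanSketch, its methods, and the
"rows examined per call" budget (a stand-in for the byte limit) are all
hypothetical. It does not claim to reproduce the actual root cause in HBase 2.5,
which is still unknown. It only shows why a scan whose leading rows are all
filtered out can appear empty if an empty batch is mistaken for end-of-scan, and
why a larger limit masks that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model of a size-limited scan with a server-side filter. NOT HBase code. */
final class FilteredScanSketch {

    /** One response: rows that passed the filter, plus the row to resume from (-1 = table exhausted). */
    static final class Batch {
        final List<Integer> rows;
        final int nextRow;
        Batch(List<Integer> rows, int nextRow) {
            this.rows = rows;
            this.nextRow = nextRow;
        }
    }

    /** Hypothetical server call: examines at most maxExaminedPerCall rows, then returns what matched. */
    static Batch next(int start, int totalRows, int maxExaminedPerCall, IntPredicate filter) {
        if (maxExaminedPerCall <= 0) {
            throw new IllegalArgumentException("maxExaminedPerCall must be positive");
        }
        List<Integer> matched = new ArrayList<>();
        int row = start;
        int examined = 0;
        while (row < totalRows && examined < maxExaminedPerCall) {
            if (filter.test(row)) {
                matched.add(row);
            }
            row++;
            examined++;
        }
        return new Batch(matched, row < totalRows ? row : -1);
    }

    /** Expected behavior: an empty batch is not end-of-scan; continue until the table is exhausted. */
    static List<Integer> scanAll(int totalRows, int maxExaminedPerCall, IntPredicate filter) {
        List<Integer> result = new ArrayList<>();
        int cursor = 0;
        while (cursor != -1) {
            Batch b = next(cursor, totalRows, maxExaminedPerCall, filter);
            result.addAll(b.rows);
            cursor = b.nextRow;
        }
        return result;
    }

    /** Faulty behavior (assumed for illustration): the first empty batch ends the scan. */
    static List<Integer> scanStoppingOnEmpty(int totalRows, int maxExaminedPerCall, IntPredicate filter) {
        List<Integer> result = new ArrayList<>();
        int cursor = 0;
        while (cursor != -1) {
            Batch b = next(cursor, totalRows, maxExaminedPerCall, filter);
            if (b.rows.isEmpty()) {
                break; // wrongly treats "nothing matched yet" as "no more rows"
            }
            result.addAll(b.rows);
            cursor = b.nextRow;
        }
        return result;
    }

    public static void main(String[] args) {
        // The first 1000 rows fail the filter; only the last 5 rows match.
        IntPredicate filter = row -> row >= 1000;
        System.out.println(scanStoppingOnEmpty(1005, 100, filter));  // small limit: no rows
        System.out.println(scanAll(1005, 100, filter));              // small limit, correct client
        System.out.println(scanStoppingOnEmpty(1005, 2000, filter)); // large limit masks the fault
    }
}
```

In this model, raising the per-call budget above the number of leading
non-matching rows makes the faulty client return the right rows, which mirrors
the workaround described above without fixing the underlying problem.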
This data correctness issue is not present on HBase 2.6 because HBASE-27558
already fixed it as a side effect, although fixing it was not the intent of that
Jira. The large number of changes between HBase 2.5 and 2.6 in the scan path
(though mostly related to quotas and metrics) makes it difficult to find the
root cause.
The purpose of this Jira is to fix the issue for HBase 2.5.