Bryan Beaudreault created HBASE-27402:
-----------------------------------------
Summary: Clone Scan in ClientScanner to avoid errors with Scan re-used
Key: HBASE-27402
URL: https://issues.apache.org/jira/browse/HBASE-27402
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
This has come up before in https://issues.apache.org/jira/browse/HBASE-1774 and
https://issues.apache.org/jira/browse/HBASE-4891. The major pushback was around
ScanMetrics, which relied on sharing a mutable Scan object.
Since https://issues.apache.org/jira/browse/HBASE-17584, ScanMetrics are
available on ResultScanner and the method on Scan was deprecated (removed in
master).
I think this issue became pretty urgent in
https://issues.apache.org/jira/browse/HBASE-17167, when we started passing mvcc
into the Scan object. If a user unknowingly reuses the Scan object, this can
look like data loss, since the Scan will return none of the expected data. We
recently hit this in our upgrade from HBase client 1.2 to 2.4.6, where
use-cases that had worked in 1.2 suddenly started returning no results in
2.4.6. This failure mode is very hard to debug.
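To illustrate the failure mode, here is a minimal toy model, not HBase code. All class names (ToyScan, ToyRegion, ToyClientScanner) are hypothetical stand-ins, and the visibility rule is a deliberately simplified assumption: a scanner pins a read point on the Scan it is given, and cells newer than that read point are hidden. Under that assumption, reusing one mutable scan against a second region returns nothing, while copying the scan first avoids the problem.

```java
import java.util.ArrayList;
import java.util.List;

// Toy demo only: these classes are hypothetical and do not exist in HBase.
final class ScanReuseDemo {
    public static void main(String[] args) {
        ToyRegion a = new ToyRegion(1, 2, 3);
        ToyRegion b = new ToyRegion(10, 11);

        ToyScan shared = new ToyScan();
        System.out.println(ToyClientScanner.scan(a, shared, false).size()); // prints 3
        System.out.println(ToyClientScanner.scan(b, shared, false).size()); // prints 0 (stale read point)

        ToyScan fresh = new ToyScan();
        System.out.println(ToyClientScanner.scan(a, fresh, true).size());   // prints 3
        System.out.println(ToyClientScanner.scan(b, fresh, true).size());   // prints 2
    }
}

final class ToyScan {
    long mvccReadPoint = -1; // -1 means "no read point pinned yet"

    ToyScan() {}

    // Copy constructor: the clone carries its own read point.
    ToyScan(ToyScan other) {
        this.mvccReadPoint = other.mvccReadPoint;
    }
}

final class ToyRegion {
    final long[] cellSeqIds; // sequence id of each stored cell

    ToyRegion(long... cellSeqIds) {
        this.cellSeqIds = cellSeqIds;
    }

    // Assumed rule: the region's read point is its highest sequence id.
    long currentReadPoint() {
        long max = 0;
        for (long id : cellSeqIds) {
            max = Math.max(max, id);
        }
        return max;
    }
}

final class ToyClientScanner {
    // Pins the region's read point on the scan (or on a copy of it) if none
    // is set, then returns the cells visible at that read point.
    static List<Long> scan(ToyRegion region, ToyScan userScan, boolean cloneFirst) {
        ToyScan s = cloneFirst ? new ToyScan(userScan) : userScan;
        if (s.mvccReadPoint < 0) {
            s.mvccReadPoint = region.currentReadPoint();
        }
        List<Long> visible = new ArrayList<>();
        for (long id : region.cellSeqIds) {
            if (id <= s.mvccReadPoint) {
                visible.add(id);
            }
        }
        return visible;
    }
}
```

In the uncloned case the second scan inherits the read point pinned by the first, so every cell in the second region looks "too new" and is filtered out. The real client logic is more involved; this only shows why hidden mutable state on a shared Scan is dangerous.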
I suggest that we now add the clone in the master branch. For branch-2, I think we
could put it behind a config param to preserve backwards compatibility of
Scan.getScanMetrics. If the config param is enabled, scan cloning occurs and
Scan.getScanMetrics will be inaccurate. Personally I think this is a far better
trade-off, because the accuracy of returned data is more important than metrics.
But we can leave it to the user to decide, and provide a release note.
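A rough sketch of what the branch-2 gating could look like, using self-contained stand-in types rather than the real client classes. Everything here is an assumption for illustration: the SketchScan and SketchScanner classes, the config key name, and the default of "off" are all hypothetical, and a plain Map stands in for the client Configuration.

```java
import java.util.Map;

// Sketch only: hypothetical types and config key, not the actual HBase API.
final class CloneGateDemo {
    public static void main(String[] args) {
        // Cloning enabled: the scanner mutates its own copy.
        SketchScan userScan = new SketchScan();
        SketchScan used = SketchScanner.prepare(userScan, Map.of(SketchScanner.CLONE_KEY, "true"));
        used.readPoint = 42;
        System.out.println(userScan.readPoint); // prints -1

        // Cloning disabled (assumed default): legacy shared-object behaviour.
        SketchScan legacyScan = new SketchScan();
        SketchScan usedLegacy = SketchScanner.prepare(legacyScan, Map.of());
        usedLegacy.readPoint = 42;
        System.out.println(legacyScan.readPoint); // prints 42
    }
}

final class SketchScan {
    long readPoint = -1; // state the scanner writes during a scan

    SketchScan() {}

    SketchScan(SketchScan other) {
        this.readPoint = other.readPoint;
    }
}

final class SketchScanner {
    // Hypothetical key name, chosen only for this example.
    static final String CLONE_KEY = "example.client.scanner.clone.scan";

    // Returns the object the scanner will use internally: a copy when the
    // flag is on, otherwise the caller's own instance.
    static SketchScan prepare(SketchScan userScan, Map<String, String> conf) {
        boolean clone = Boolean.parseBoolean(conf.getOrDefault(CLONE_KEY, "false"));
        return clone ? new SketchScan(userScan) : userScan;
    }
}
```

With the flag on, nothing the scanner writes is visible through the caller's object, which is exactly why metrics read from the caller's Scan would no longer be accurate in that mode.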