Bryan Beaudreault created HBASE-27402:
-----------------------------------------

             Summary: Clone Scan in ClientScanner to avoid errors with Scan re-used
                 Key: HBASE-27402
                 URL: https://issues.apache.org/jira/browse/HBASE-27402
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault


This has come up before in https://issues.apache.org/jira/browse/HBASE-1774 and 
https://issues.apache.org/jira/browse/HBASE-4891. The major pushback was around 
ScanMetrics, which relied on sharing a mutable Scan object.

Since https://issues.apache.org/jira/browse/HBASE-17584, ScanMetrics are 
available on ResultScanner and the method on Scan was deprecated (removed in 
master).

I think this issue became pretty urgent in 
https://issues.apache.org/jira/browse/HBASE-17167, when we started passing mvcc 
into the Scan object. If a user unknowingly reuses a Scan object, the later 
scan can return none of the expected data, which looks like data loss. We 
recently hit this in our upgrade from HBase client 1.2 to 2.4.6, where 
use cases that had worked in 1.2 suddenly started returning no results in 
2.4.6. It's very hard to debug.
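
To make the failure mode concrete, here is a toy model in plain Java. It is only 
a sketch: ToyScan, ToyTable, scanSharing and scanCloning are hypothetical names 
invented for illustration, not HBase classes, and the read-point handling is a 
deliberate simplification of what the real client and server do. It assumes the 
scanner stores a read point on the caller's scan object on first use and that 
only cells at or below that read point are visible.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are HBase classes.
final class ScanReuseSketch {

    /** Mutable scan spec standing in for a client-side Scan (hypothetical). */
    static final class ToyScan {
        long readPoint = -1; // -1 means "no read point recorded yet"

        ToyScan() {}

        /** Copy constructor used by the cloning variant. */
        ToyScan(ToyScan other) {
            this.readPoint = other.readPoint;
        }
    }

    /** A table whose cells are represented only by their sequence ids. */
    static final class ToyTable {
        final List<Long> cellSeqIds = new ArrayList<>();

        ToyTable(long... seqIds) {
            for (long s : seqIds) {
                cellSeqIds.add(s);
            }
        }

        long latestSeqId() {
            long max = 0;
            for (long s : cellSeqIds) {
                max = Math.max(max, s);
            }
            return max;
        }
    }

    /** Scanner that mutates the caller's scan: the read point sticks to it. */
    static int scanSharing(ToyTable table, ToyScan scan) {
        if (scan.readPoint < 0) {
            scan.readPoint = table.latestSeqId();
        }
        int visible = 0;
        for (long s : table.cellSeqIds) {
            if (s <= scan.readPoint) {
                visible++;
            }
        }
        return visible;
    }

    /** Scanner that works on a private copy, leaving the caller's scan untouched. */
    static int scanCloning(ToyTable table, ToyScan scan) {
        return scanSharing(table, new ToyScan(scan));
    }

    public static void main(String[] args) {
        ToyTable first = new ToyTable(1, 2, 3);
        ToyTable second = new ToyTable(10, 11);

        ToyScan reused = new ToyScan();
        System.out.println(scanSharing(first, reused));  // 3
        System.out.println(scanSharing(second, reused)); // 0, stale read point 3

        ToyScan template = new ToyScan();
        System.out.println(scanCloning(first, template));  // 3
        System.out.println(scanCloning(second, template)); // 2
    }
}
```

In this model the second shared scan silently returns nothing because the read 
point left over from the first scan hides every cell, while the cloning variant 
behaves the same no matter how often the caller's object is reused.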

I suggest that we now add the clone in the master branch. For branch-2, I think 
we could put it behind a config param to preserve backwards compatibility of 
Scan.getScanMetrics. If the config param is enabled, scan cloning occurs and 
Scan.getScanMetrics will be inaccurate. Personally I think this is the far 
better trade-off, because accurate results matter more than metrics. But we 
can leave it to the user to decide, and provide a release note.
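
The branch-2 trade-off can be sketched the same way. Again this is a toy model 
under stated assumptions: the config key, ToyScan and runScan are hypothetical 
and invented for this example, and the single counter merely stands in for 
metrics that are stored on the scan object.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: the key and classes below are not part of HBase.
final class CloneFlagSketch {

    // Hypothetical config key, invented for this sketch.
    static final String CLONE_KEY = "toy.client.scanner.clone.scan";

    /** Scan object that also carries a metric, as the legacy API did. */
    static final class ToyScan {
        long rowsCounted = 0;
    }

    /**
     * Runs a scan over rowCount rows and returns the scanner-side row count.
     * With cloning enabled, the caller's scan object is never mutated.
     */
    static long runScan(Map<String, String> conf, ToyScan userScan, int rowCount) {
        boolean clone = Boolean.parseBoolean(conf.getOrDefault(CLONE_KEY, "false"));
        ToyScan working = userScan;
        if (clone) {
            working = new ToyScan();
            working.rowsCounted = userScan.rowsCounted;
        }
        for (int i = 0; i < rowCount; i++) {
            working.rowsCounted++;
        }
        return working.rowsCounted;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        ToyScan legacy = new ToyScan();
        runScan(conf, legacy, 5);
        System.out.println(legacy.rowsCounted); // 5, metric visible on the scan

        conf.put(CLONE_KEY, "true");
        ToyScan cloned = new ToyScan();
        long fromScanner = runScan(conf, cloned, 5);
        System.out.println(cloned.rowsCounted); // 0, metric no longer on the scan
        System.out.println(fromScanner);        // 5, still available scanner-side
    }
}
```

With the flag off, the metric lands on the caller's object as before. With it 
on, the caller's object stays pristine and the count has to be read from the 
scanner side instead, which mirrors the trade-off described above.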



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
