[ https://issues.apache.org/jira/browse/HBASE-25582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John resolved HBASE-25582.
------------------------------------
    Fix Version/s: 2.4.2
                   2.3.5
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Support setting scan ReadType to be STREAM at cluster level
> -----------------------------------------------------------
>
>                 Key: HBASE-25582
>                 URL: https://issues.apache.org/jira/browse/HBASE-25582
>             Project: HBase
>          Issue Type: Improvement
>   Affects Versions: 2.0.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.5, 2.4.2
>
>
> We have the cluster-level config 'hbase.storescanner.use.pread' to set
> ReadType to PREAD when it is not explicitly specified in the Scan object.
> In the same way, we can have a way to make scans STREAM type at cluster
> level (when not specified at the Scan object level).
> We do not need any new configs for this. The existing config
> 'hbase.storescanner.pread.max.bytes' specifies when to switch the read type
> to stream, and it defaults to 4 * HFile block size. If this value is
> configured as <= 0, it means the user wants the switch to happen when the
> scanner is created. With such handling we can support it, so every scan
> need not set the read type.
> The motivation is that in cloud-storage-based systems, stream reads might
> be better. We introduced the PRead-based scan with tests on HDFS-based
> storage. In my customer's case, Azure storage is in place and the WASB
> driver is used. It has a read-ahead mechanism (reading an entire block of a
> blob in one REST call) that buffers the data in the WASB driver. This helps
> a lot with longer scans. Yes, with the config
> 'hbase.storescanner.pread.max.bytes' we can make the switch happen early,
> but it is better to go with the 1.x way, where the scan starts with a
> stream read itself.
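
The selection rule described in the issue can be sketched roughly as below. This is a simplified, self-contained model written from the description only, not the actual HBase StoreScanner code: the class name `ReadTypeSketch`, the method `initialReadType`, the precedence of 'hbase.storescanner.use.pread' over the max-bytes check, and the 64 KB block size used in `main` are all assumptions made for illustration.

```java
// Simplified model of how a scanner's initial read type might be chosen.
// Hypothetical names; this is not the HBase implementation.
public class ReadTypeSketch {

    // Mirrors the three values of HBase's Scan.ReadType.
    enum ReadType { DEFAULT, STREAM, PREAD }

    /**
     * @param requested     read type set on the Scan object (DEFAULT if unset)
     * @param usePread      value of 'hbase.storescanner.use.pread'
     * @param preadMaxBytes value of 'hbase.storescanner.pread.max.bytes'
     */
    static ReadType initialReadType(ReadType requested, boolean usePread,
                                    long preadMaxBytes) {
        // An explicit choice on the Scan object always wins.
        if (requested != ReadType.DEFAULT) {
            return requested;
        }
        // Cluster-level pread setting (assumed here to take precedence).
        if (usePread) {
            return ReadType.PREAD;
        }
        // Proposed handling: a value <= 0 means "switch immediately",
        // so the scan starts with a stream read.
        if (preadMaxBytes <= 0) {
            return ReadType.STREAM;
        }
        // Otherwise start with pread and switch to stream later,
        // once more than preadMaxBytes have been read.
        return ReadType.DEFAULT;
    }

    public static void main(String[] args) {
        long blockSize = 64 * 1024; // assumed HFile block size
        System.out.println(initialReadType(ReadType.DEFAULT, false, 4 * blockSize));
        System.out.println(initialReadType(ReadType.DEFAULT, false, 0));
        System.out.println(initialReadType(ReadType.PREAD, false, 0));
    }
}
```

With this handling, a cluster that sets 'hbase.storescanner.pread.max.bytes' to 0 would get stream reads for every scan that does not pick a read type itself, while a scan that explicitly asks for PREAD would still get it.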