[ https://issues.apache.org/jira/browse/HBASE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508476#comment-13508476 ]
Lars Hofhansl commented on HBASE-7180:
--------------------------------------

How about another approach:
# Introduce a RawRegionScanner interface, which extends RegionScanner.
# RawRegionScanner has all the additional methods on it that we need.
# Add a getRawScanner() method to the RegionScanner interface.
# RegionScannerImpl would then implement RawRegionScanner.

To the coprocessor framework we'd still hand a RegionScanner, but now the coprocessor can get the raw scanner via getRawScanner().
RegionScannerImpl's implementation of getRawScanner() just returns "this".

Is that better? Or does anybody have a cleaner idea?

closeRegionOperation and startRegionOperation would still need to be public, so that coprocessors can start/stop region operations.

> RegionScannerImpl.next() is inefficient.
> ----------------------------------------
>
>                 Key: HBASE-7180
>                 URL: https://issues.apache.org/jira/browse/HBASE-7180
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 7180-0.94-SKETCH.txt, 7180-0.94-v1.txt
>
>
> We just came across a special scenario.
> For our Phoenix project (SQL runtime for HBase), we push a lot of work into
> HBase via coprocessors. One method is to wrap RegionScanner in coprocessor
> hooks and then do processing in the hook to avoid returning a lot of data to
> the client unnecessarily.
> In this specific case this is pretty bad. Since the wrapped RegionScanner's
> next() does not "know" that it is called this way, it still does all of this
> on each invocation:
> # Starts a RegionOperation
> # Increments the request count
> # Sets the current read point on a thread local (because generally each call
> could come from a different thread)
> # Finally does the next on its StoreScanner(s)
> # Ends the RegionOperation
> When this is done in a tight loop millions of times (as is the case for us)
> it starts to become significant.
> Not sure what to do about this, really. Opening this issue for discussion.
> One way is to extend the RegionScanner with an "internal" next() method of
> sorts, so that all this overhead can be avoided. The coprocessor could call
> the regular next() methods once and then just call the cheaper internal
> version.
> Are there better/cleaner ways?
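
The RawRegionScanner proposal from the comment could look roughly like the sketch below. It is only an illustration under stated assumptions, not the patch attached to the issue: the types are simplified stand-ins rather than the real HBase classes, rows are plain strings, and the method name nextRaw() and the counters on Region are hypothetical, since the comment does not name the additional methods. It shows a coprocessor-style loop that starts one region operation, iterates through the raw scanner, and closes the operation once at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RawScannerSketch {
    // Simplified stand-in for HBase's RegionScanner (rows are plain strings here).
    interface RegionScanner {
        boolean next(List<String> results);  // regular path, with per-call bookkeeping
        RawRegionScanner getRawScanner();    // accessor proposed in the comment
    }

    // Proposed extension. The method name nextRaw is an assumption.
    interface RawRegionScanner extends RegionScanner {
        boolean nextRaw(List<String> results);  // skips the per-call bookkeeping
    }

    // Stand-in for the region. The counters exist only to make the saving visible.
    static class Region {
        int regionOperations = 0;
        int requestCount = 0;
        public void startRegionOperation() { regionOperations++; }
        public void closeRegionOperation() { /* real code would release the region lock */ }
    }

    static class RegionScannerImpl implements RawRegionScanner {
        private final Region region;
        private final List<String> rows;
        private int pos = 0;

        RegionScannerImpl(Region region, List<String> rows) {
            this.region = region;
            this.rows = rows;
        }

        @Override
        public boolean next(List<String> results) {
            region.startRegionOperation();
            try {
                region.requestCount++;
                // Real code would also set the thread-local read point here.
                return nextRaw(results);
            } finally {
                region.closeRegionOperation();
            }
        }

        @Override
        public boolean nextRaw(List<String> results) {
            if (pos < rows.size()) {
                results.add(rows.get(pos++));
            }
            return pos < rows.size();  // true while more rows remain
        }

        @Override
        public RawRegionScanner getRawScanner() {
            return this;  // as suggested: the implementation just returns "this"
        }
    }

    // What a coprocessor hook might do: one region operation around the whole loop.
    static int scanAll(Region region, RegionScanner scanner) {
        RawRegionScanner raw = scanner.getRawScanner();
        List<String> results = new ArrayList<>();
        region.startRegionOperation();
        try {
            boolean more;
            do {
                more = raw.nextRaw(results);
            } while (more);
        } finally {
            region.closeRegionOperation();
        }
        return results.size();
    }

    public static void main(String[] args) {
        Region region = new Region();
        RegionScanner scanner = new RegionScannerImpl(region, Arrays.asList("r1", "r2", "r3"));
        int n = scanAll(region, scanner);
        // prints: rows=3 regionOperations=1
        System.out.println("rows=" + n + " regionOperations=" + region.regionOperations);
    }
}
```

Calling next() three times instead would start and end three region operations and bump the request count three times; the raw path pays that cost once per scan, which is the saving the issue is after. It also shows why startRegionOperation and closeRegionOperation have to be public: the coprocessor, not the scanner, now brackets the loop.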