[ https://issues.apache.org/jira/browse/HBASE-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011203#comment-13011203 ]
Sean Sechrist commented on HBASE-3686:
--------------------------------------

I did a little more testing and it turns out this problem isn't limited to the misconfiguration. You'll also lose rows if you kill -9 a region server in the middle of a scan.

In HTable.ClientScanner.next(), there's this skipFirst boolean that is supposed to skip the first row that was "already let out on a previous invocation". But instead of just skipping the first row, getConnection().getRegionServerWithRetries(callable) is called an extra time, which will skip [caching] rows.

So I think fixing it to only skip 1 row will also fix the problem if there's a misconfiguration, so sending the timeout to the server won't be needed.

> Scanner timeout on RegionServer but Client won't know what happened
> -------------------------------------------------------------------
>
>                 Key: HBASE-3686
>                 URL: https://issues.apache.org/jira/browse/HBASE-3686
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.89.20100924
>            Reporter: Sean Sechrist
>            Priority: Minor
>
> This can cause rows to be lost from a scan.
> See this thread where the issue was brought up:
> http://search-hadoop.com/m/xITBQ136xGJ1
> If hbase.regionserver.lease.period is higher on the client than the server we
> can get this series of events:
> 1. Client is scanning along happily, and does something slow.
> 2. Scanner times out on region server
> 3. Client calls HTable.ClientScanner.next()
> 4. The region server throws an UnknownScannerException
> 5. Client catches exception and sees that the elapsed time is not longer than its
> hbase.regionserver.lease.period config, so it doesn't throw a
> ScannerTimeoutException. Instead, it treats it like an NSRE.
> Right now the workaround is to make sure the configs are consistent.
> A possible fix would be to use whatever the region server's scanner timeout
> is, rather than the local one.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
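
The skipFirst behaviour described in the comment can be illustrated with a small toy model. This is only a sketch and not the HBase client code: the class SkipFirstSketch and the methods fetch and scan are hypothetical names, rows are plain integers, and one call to fetch stands in for one getConnection().getRegionServerWithRetries(callable) round trip. It assumes the scanner is reopened at the last row already handed to the caller, so the first row of the reopened batch is a duplicate.

```java
import java.util.ArrayList;
import java.util.List;

class SkipFirstSketch {
    // Hypothetical stand-in for one scanner round trip:
    // returns up to `caching` rows starting at row `from`.
    static List<Integer> fetch(int from, int caching, int totalRows) {
        List<Integer> batch = new ArrayList<>();
        for (int r = from; r < Math.min(from + caching, totalRows); r++) {
            batch.add(r);
        }
        return batch;
    }

    // Scans rows 0..totalRows-1. The scanner is lost once, at the first batch
    // boundary after `failAfter` rows have been delivered, and is reopened at
    // the last delivered row (inclusive).
    static List<Integer> scan(int totalRows, int caching, int failAfter,
                              boolean skipWholeBatch) {
        List<Integer> seen = new ArrayList<>();
        int next = 0;
        boolean failed = false;
        while (next < totalRows) {
            if (!failed && !seen.isEmpty() && seen.size() >= failAfter) {
                failed = true;
                int lastRow = seen.get(seen.size() - 1);
                List<Integer> batch = fetch(lastRow, caching, totalRows);
                next = lastRow + batch.size();
                if (!skipWholeBatch) {
                    // Intended behaviour: drop only the duplicate first row.
                    seen.addAll(batch.subList(1, batch.size()));
                }
                // Otherwise (modelled bug): the whole batch is discarded by an
                // extra fetch, so the new rows in it never reach the caller.
                continue;
            }
            List<Integer> batch = fetch(next, caching, totalRows);
            seen.addAll(batch);
            next += batch.size();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("skip one row:     " + scan(10, 3, 3, false));
        System.out.println("skip whole batch: " + scan(10, 3, 3, true));
    }
}
```

With 10 rows, a caching value of 3 and the scanner lost after the first batch, skipping one row delivers all 10 rows, while discarding the whole reopened batch loses rows 3 and 4.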