This problem is not related to the shell. I checked: 0.20.3 has the same code at HConnectionManager.java:1034, so I expect that to be broken too.
Miklos

2010/5/4 Jean-Daniel Cryans <jdcry...@apache.org>:
> Trunk is a work in progress and the shell was recently redone. This
> configuration was set tentatively by the author of that change but, as
> you can see, it doesn't work very well! The jira is here
> https://issues.apache.org/jira/browse/HBASE-2352
>
> J-D
>
> On Mon, May 3, 2010 at 3:12 PM, Miklós Kurucz <mkur...@gmail.com> wrote:
>> Hi!
>>
>> I'm using a fresh version of trunk.
>> I'm experiencing a problem where invalid region locations are not
>> removed from the cache of HCM.
>> I'm only using scanners on the table and I receive the following errors:
>>
>> 2010-05-03 23:42:52,574 DEBUG
>> org.apache.hadoop.hbase.client.HTable$ClientScanner: Advancing
>> internal scanner to startKey at
>> 'http://hu.gaabi.www/jordania/\x28041022\x29_jord-155_petra.jpg'
>> 2010-05-03 23:42:52,574 DEBUG
>> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache
>> hit for row <http://hu.gaabi.www/jordania/(041022)_jord-155_petra.jpg>
>> in tableName Test5: location server 10.1.3.111:60020, location region
>> name
>> Test5,http://hu.gaabi.www/jordania/\x28041022\x29_jord-155_petra.jpg,1272896369136
>> SEVERE: Trying to contact region server 10.1.3.111:60020 for region
>> Test5,http://hu.gaabi.www/jordania/\x28041022\x29_jord-155_petra.jpg,1272896369136,
>> row 'http://hu.gaabi.www/jordania/\x28041022\x29_jord-155_petra.jpg',
>> but failed after 1 attempts.
>> Exceptions:
>> java.net.ConnectException: Connection refused
>>
>> The exception itself is expected, as the 10.1.3.111:60020 regionserver
>> had been offline for hours at that time.
>> The cause of this problem is that I set hbase.client.retries.number to
>> 1, as I don't like the current retry options.
>> In this case the following code at HConnectionManager.java:1061
>> callable.instantiateServer(tries != 0);
>> will make scanners always use the cache, because the location is only
>> reloaded on the second and later tries.
>> This makes hbase.client.retries.number = 1 an unusable option.
>>
>> This is not intentional, am I correct?
>> Am I forced to use the retries, or is there another option?
>>
>> Also I would like to ask, when is it a good thing to retry an operation?
>> In my experience there exist two kinds of failures:
>> 1) org.apache.hadoop.hbase.NotServingRegionException : region is offline
>> This can be due to a compaction, in which case we probably need to
>> wait for a few seconds.
>> Or it can be due to a split, in which case we might need to wait for minutes.
>> In either case I would not want my client to wait that long
>> when I could schedule other things to do in that time.
>> It is also possible that the region has been transferred to another
>> regionserver, but that is rare compared to the other cases.
>>
>> 2) java.net.ConnectException : regionserver is offline
>> This is solved as soon as the master can reopen the regions on another
>> regionserver, but that can still take minutes.
>> Anyway, this exception is also rare (usually).
>>
>> Best regards,
>> Miklos
>>
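
To make the behaviour described in the quoted message concrete, here is a minimal sketch of a retry loop that only bypasses the location cache when tries != 0. It is not the HConnectionManager code: the class RetryCacheSketch, the methods locate, call and callWithRetries, and the second server address are all hypothetical names made up for illustration. Only the "reload when tries != 0" condition is taken from the message.

```java
import java.util.HashMap;
import java.util.Map;

public class RetryCacheSketch {
    // Hypothetical stand-in for the client's region location cache.
    public static final Map<String, String> cache = new HashMap<>();
    // Server from the log above; it is offline in this scenario.
    public static final String STALE = "10.1.3.111:60020";
    // Made-up address of the server that now hosts the region.
    public static final String LIVE = "10.1.3.112:60020";

    // Returns the cached location unless a reload is requested.
    public static String locate(String row, boolean reload) {
        if (reload || !cache.containsKey(row)) {
            cache.put(row, LIVE); // pretend a fresh lookup finds the live server
        }
        return cache.get(row);
    }

    // Pretend RPC: only the live server answers.
    public static boolean call(String server) {
        return LIVE.equals(server);
    }

    // Same shape as the condition quoted above: reload only when tries != 0.
    public static boolean callWithRetries(String row, int numRetries) {
        for (int tries = 0; tries < numRetries; tries++) {
            String server = locate(row, tries != 0);
            if (call(server)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        cache.put("row", STALE);
        // One attempt: tries is always 0, so the stale entry is never replaced.
        System.out.println("retries=1 succeeded: " + callWithRetries("row", 1));
        System.out.println("cached location: " + cache.get("row"));
        // Two attempts: the second one reloads the location and succeeds.
        System.out.println("retries=2 succeeded: " + callWithRetries("row", 2));
    }
}
```

In this sketch, with a single attempt the stale entry survives every call, so repeating the operation later fails in exactly the same way. With two attempts the second try reloads the location and recovers.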