If it's a single row, I would expect the server to return the error immediately. You would then hit the sleep I mentioned previously, but the cache should be cleaned before the sleep...
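
To make the ordering concrete, here is a rough sketch of the mechanism as I described it. It is a toy model, not the HBase client code: the class LocationCacheSketch, its methods, and NotServingException are all made-up names, and the cache is keyed by row instead of by region to keep it short. The only point it illustrates is that the stale entry is evicted before the sleep, so a caller that gives up during the sleep should not leave the old location behind.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Toy model of the client-side mechanism; none of these names are HBase classes.
class LocationCacheSketch {
    // Stand-in for the server answering "region is not online here".
    static class NotServingException extends RuntimeException {}

    // Simplification: keyed by row key, not by region.
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical stand-in for asking meta where the row lives now.
    private final Function<String, String> metaLookup;
    private final int maxRetries;
    private final long pauseMillis;

    LocationCacheSketch(Function<String, String> metaLookup, int maxRetries, long pauseMillis) {
        this.metaLookup = metaLookup;
        this.maxRetries = maxRetries;
        this.pauseMillis = pauseMillis;
    }

    void seed(String row, String server) { cache.put(row, server); }

    String cachedLocation(String row) { return cache.get(row); }

    <T> T callWithRetries(String row, BiFunction<String, String, T> call) {
        NotServingException last = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            // Try the cached server first; fall back to the meta lookup on a miss.
            String server = cache.computeIfAbsent(row, metaLookup);
            try {
                return call.apply(server, row);
            } catch (NotServingException e) {
                last = e;
                cache.remove(row); // evict the stale entry BEFORE sleeping
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    // Caller cancelled us mid-sleep; the stale entry is already gone.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException("retries exhausted", last);
    }

    public static void main(String[] args) {
        LocationCacheSketch client = new LocationCacheSketch(r -> "rs-new", 3, 10);
        client.seed("row1", "rs-old"); // location cached before the split
        String served = client.callWithRetries("row1", (server, r) -> {
            if (server.equals("rs-old")) throw new NotServingException();
            return server;
        });
        System.out.println(served); // prints "rs-new"
    }
}
```

In this model, the case that would leave a stale entry is a caller that is cancelled before the server has answered at all, since the eviction only happens once the error comes back.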

On Fri, Aug 10, 2012 at 1:32 PM, deanforwever2010
<deanforwever2...@gmail.com> wrote:
> Hi Keywal,
> My HBase version is 0.94.
> My query just gets a limited set of columns from one row.
> I run it as a callable task with a 1.5-second timeout, so maybe it did not
> fail but was canceled by my process, so the region cache was not cleared
> even after many requests.
> My question is: why does the failure take so long? It also behaves
> differently between my servers, and there is no problem with the network.
>
> 2012/8/10 N Keywal <nkey...@gmail.com>
>
>> Hi,
>>
>> What are your queries exactly? What's the HBase version?
>>
>> The mechanism is:
>> - There is a location cache, per HConnection, on the client
>> - The client first tries the region server in its cache
>> - If it fails, the client removes this entry from the cache and enters
>>   the retry loop
>> - There is a limited number of retries and a sleep between the retries
>> - Most of the time, the client will connect to meta to get the new
>>   location
>>
>> When there are multiple queries, before HBASE-5924, the errors will be
>> analyzed only after the other region servers have returned as well. It
>> could be an explanation. HBASE-5877 exists as well, but only for
>> moves, not for splits...
>>
>> Cheers,
>>
>> N.
>>
>>
>> On Fri, Aug 10, 2012 at 11:26 AM, deanforwever2010
>> <deanforwever2...@gmail.com> wrote:
>> > In the region server's log: 2012-08-10 11:49:50,796 DEBUG
>> > org.apache.hadoop.hbase.regionserver.HRegionServer:
>> > NotServingRegionException; Region is not online:
>> > test_list,zWPpyme,1342510667492.91486e7fa0ac39048276848a2618479b.
>> >
>> > After a region split, the client didn't get a result within the timeout
>> > setting (1.5 seconds), then the task was canceled by my program, so the
>> > HConnectionManager didn't delete the cachedLocation;
>> > the client still queries the old region id, which no longer exists.
>> >
>> > What's more, some of my processes updated the region location info and
>> > some did not. I'm sure the network is fine.
>> >
>> > How can I fix the problem? Why does it take so long to detect the new
>> > regions?
>>