As JG says, please update to 0.20.3. It's a waste of everyone's time to debug an issue only to find it was already resolved in a later release.
Also, as JG suggests, there is something up w/ your region balancer. This has been going on for a while now, going by the messages here and those you've been sending me off list. At a guess, the occasional long delay is because a region is being redeployed elsewhere (even the redeploy is faster in 0.20.3, another reason to update).

St.Ack

On Sat, Mar 13, 2010 at 3:48 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> We use hbase 0.20.1
> There are 3 region servers. 13 regions on each server.
>
> I don't see any exception in master log.
>
> I was able to run 10 successful get commands before hitting the following:
> I am attaching (partial) master log and region server log from 10.10.31.137
>
> hbase(main):014:0> get 'ruletable', 'com.about.acne'
> COLUMN CELL
> lpm_1.0:category timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
> 1 row(s) in 0.0120 seconds
> hbase(main):015:0> get 'ruletable', 'com.about.acne'
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Trying to contact region server 10.10.31.137:60020 for region
> ruletable,,1268083966723, row 'com.about.acne', but failed after 5 attempts.
> Exceptions:
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: ruletable,,1268083966723
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
>
> On Fri, Mar 12, 2010 at 9:08 PM, Jonathan Gray <jl...@streamy.com> wrote:
>>
>> Seems like something weird is going on with your regionservers and
>> balancing.
>>
>> Can you post big snippets from the regionserver and master logs? Have you
>> checked to see what's going on? Is there repetitive balancing going on
>> that never seems to reach steady-state?
>>
>> How many regions and how many nodes on which version of HBase?
>>
>> > -----Original Message-----
>> > From: Ted Yu [mailto:yuzhih...@gmail.com]
>> > Sent: Friday, March 12, 2010 8:24 PM
>> > To: hbase-user@hadoop.apache.org
>> > Subject: slow response in hbase shell
>> >
>> > Hi,
>> >
>> > > We sometimes saw over 5 second delay running get in hbase 0.20.1 shell:
>> > >
>> > > hbase(main):002:0> get 'ruletable', 'ca.tsn.www'
>> > > 0 row(s) in 10.1330 seconds
>> > >
>> > > From our 3 region servers there are a lot of such messages:
>> > >
>> > > 2010-03-12 00:00:00,996 INFO [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: crawltable,com.pandora.www:http\x2Finclude\x2FlyricsAdEmbed.html\x3Fgenre\x3Delectronica\x26artist\x3DR273847\x26webname\x3D\x26sz\x3D2000x8\x26ord\x3D125823029226371645\x26tile\x3D3\x26_id\x3Dbottom_leaderboard_container,1266944566406: Overloaded
>> > > 2010-03-12 00:00:00,997 INFO [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: domaincrawltable,,1268098564908: Overloaded
>> > >
>> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
>> > > 40428 343638 11523230
>> > >
>> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
>> > > 40430 343655 11307703
>> > >
>> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
>> > > 40466 343961 11340379
>> > >
>> > > Was the slow response due to the load balancing ?
>> > >
>> >
>> > The strange thing was that after several quick responses I would see:
>> >
>> > hbase(main):004:0> get 'ruletable', 'com.about.acne'
>> > COLUMN CELL
>> > lpm_1.0:category timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
>> > 1 row(s) in 0.0040 seconds
>> > hbase(main):005:0> get 'ruletable', 'com.about.acne'
>> > COLUMN CELL
>> > lpm_1.0:category timestamp=1268347483823, value=http://acne.about.com\t9002:0.86580086\thost\n
>> > 1 row(s) in 0.0040 seconds
>> > hbase(main):006:0> get 'ruletable', 'com.about.acne'
>> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
>> > Trying to contact region server 10.10.31.136:60020 for region
>> > ruletable,,1268083966723, row 'com.about.acne', but failed after 5 attempts.
>> > Exceptions:
>> > org.apache.hadoop.hbase.NotServingRegionException:
>> > org.apache.hadoop.hbase.NotServingRegionException: ruletable,,1268083966723
>> >     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
>> >     at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
>> >     at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>> >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>> >
>> > But 006:60010/master.jsp refreshes quickly and shows all three regionservers.
>> > Don't know why hbase shell encountered NSRE.
>> >
>> > >
>> > > Thanks
>> > >
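
To see whether the balancer keeps closing the same regions, the "Overloaded" messages counted with grep | wc in the thread above could be broken down per table. What follows is only a rough sketch: the regex is guessed from the two MSG_REGION_CLOSE lines quoted in this thread (other HBase versions may word them differently), and the function name, the sample text, and the shortened row key in its first line are made up for illustration.

```python
import re
from collections import Counter

# Pattern assumed from the two MSG_REGION_CLOSE lines quoted in this thread;
# other HBase versions may word these messages differently.
CLOSE_RE = re.compile(r"MSG_REGION_CLOSE:\s*(?P<region>\S+):\s*Overloaded")


def count_overloaded_closes(log_text):
    """Return a Counter mapping table name -> number of 'Overloaded' closes."""
    counts = Counter()
    for match in CLOSE_RE.finditer(log_text):
        # Region names in the thread look like table,startkey,id;
        # keep only the table part before the first comma.
        table = match.group("region").split(",", 1)[0]
        counts[table] += 1
    return counts


# Illustrative sample. The second line is taken from the thread; the first
# uses a shortened, made-up row key. The last line should not match.
SAMPLE = r"""
2010-03-12 00:00:00,996 INFO [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: crawltable,com.example.www:http\x2Fa,1266944566406: Overloaded
2010-03-12 00:00:00,997 INFO [regionserver/10.10.31.135:60020] regionserver.HRegionServer(493): MSG_REGION_CLOSE: domaincrawltable,,1268098564908: Overloaded
some unrelated log line
"""

if __name__ == "__main__":
    for table, n in count_overloaded_closes(SAMPLE).most_common():
        print(table, n)
```

Running this against successive snapshots of the regionserver log, as was done with the repeated grep above, would show whether the per-table counts keep climbing, which is what repetitive balancing that never reaches steady state would look like.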