As JG says, please update to 0.20.3.   Its a waste of everyone's time
spending our time debugging only to find the issue already resolved in
a later release.

Also as JG suggests, there is something up w/ your region balancer.
This has been going on for a while now going by the messages here and
those you've been sending me off list.  At a guess, the occasional
long delay is because a region is because region is being redeployed
elsewhere (even the redeploy is faster in 0.20.3 -- another reason so
to update).

St.Ack

On Sat, Mar 13, 2010 at 3:48 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> We use hbase 0.20.1
> There are 3 region servers. 13 regions on each server.
>
> I don't see any exception in master log.
>
> I was able to run 10 successful get commands before hitting the following:
> I am attaching (partial) master log and region server log from 10.10.31.137
>
> hbase(main):014:0> get 'ruletable', 'com.about.acne'
> COLUMN                       CELL
>  lpm_1.0:category            timestamp=1268347483823,
> value=http://acne.about.com\t9002:0.86580086\thost\n
> 1 row(s) in 0.0120 seconds
> hbase(main):015:0> get 'ruletable', 'com.about.acne'
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Trying to contact region server 10.10.31.137:60020 for region
> ruletable,,1268083966723, row 'com.about.acne', but failed after 5 attempts.
> Exceptions:
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: ruletable,,1268083966723
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2307)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1784)
>         at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>         at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>
>
> On Fri, Mar 12, 2010 at 9:08 PM, Jonathan Gray <jl...@streamy.com> wrote:
>>
>> Seems like something weird is going on with your regionservers and
>> balancing.
>>
>> Can you post big snippets from the regionserver and master logs?  Have you
>> checked to see what's going on?  Is there repetitive balancing going on
>> that
>> never seems to reach steady-state?
>>
>> How many regions and how many nodes on which version of HBase?
>>
>> > -----Original Message-----
>> > From: Ted Yu [mailto:yuzhih...@gmail.com]
>> > Sent: Friday, March 12, 2010 8:24 PM
>> > To: hbase-user@hadoop.apache.org
>> > Subject: slow response in hbase shell
>> >
>> > Hi,
>> >
>> > > We sometimes saw over 5 second delay running get in hbase 0.20.1
>> > shell:
>> > >
>> > > hbase(main):002:0> get 'ruletable', 'ca.tsn.www'
>> > > 0 row(s) in 10.1330 seconds
>> > >
>> > > From our 3 region servers there are a lot of such messages:
>> > >
>> >
>> >
>> > > 2010-03-12 00:00:00,996 INFO  [regionserver/10.10.31.135:60020]
>> > > regionserver.HRegionServer(493): MSG_REGION_CLOSE:
>> > >
>> > crawltable,com.pandora.www:http\x2Finclude\x2FlyricsAdEmbed.html\x3Fgen
>> > re\x3Delectronica\x26artist\x3DR273847\x26webname\x3D\x26sz\x3D2000x8\x
>> > 26ord\x3D125823029226371645\x26tile\x3D3\x26_id\x3Dbottom_leaderboard_c
>> > ontainer,1266944566406:
>> > > Overloaded
>> > > 2010-03-12 00:00:00,997 INFO  [regionserver/10.10.31.135:60020]
>> > > regionserver.HRegionServer(493): MSG_REGION_CLOSE:
>> > > domaincrawltable,,1268098564908: Overloaded
>> > >
>> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
>> > >   40428  343638 11523230
>> > >
>> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
>> > >   40430  343655 11307703
>> > >
>> > > grep Overloaded hbase-hbaseadmin-regionserver.log | wc
>> > >   40466  343961 11340379
>> > >
>> > > Was the slow response due to the load balancing ?
>> > >
>> >
>> > The strange thing was that after several quick responses I would see:
>> >
>> > hbase(main):004:0> get 'ruletable', 'com.about.acne'
>> > COLUMN
>> > CELL
>> >  lpm_1.0:category            timestamp=1268347483823, value=
>> > http://acne.about.com\t9002:0.86580       086\thost\n
>> > 1 row(s) in 0.0040 seconds
>> > hbase(main):005:0> get 'ruletable', 'com.about.acne'
>> > COLUMN
>> > CELL
>> >  lpm_1.0:category            timestamp=1268347483823, value=
>> > http://acne.about.com\t9002:0.86580       086\thost\n
>> > 1 row(s) in 0.0040 seconds
>> > hbase(main):006:0> get 'ruletable', 'com.about.acne'
>> > NativeException:
>> > org.apache.hadoop.hbase.client.RetriesExhaustedException:
>> > Trying to contact region server 10.10.31.136:60020 for region
>> > ruletable,,1268083966723, row 'com.about.acne', but failed after 5
>> > attempts.
>> > Exceptions:
>> > org.apache.hadoop.hbase.NotServingRegionException:
>> > org.apache.hadoop.hbase.NotServingRegionException:
>> > ruletable,,1268083966723
>> >         at
>> > org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionSer
>> > ver.java:2307)
>> >         at
>> > org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.ja
>> > va:1784)
>> >         at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
>> >         at
>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccesso
>> > rImpl.java:25)
>> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >         at
>> > org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>> >         at
>> > org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:91
>> > 5)
>> >
>> > But 006:60010/master.jsp refreshes quickly and shows all three
>> > regionservers.
>> > Don't know why hbase shell encountered NSRE.
>> >
>> > >
>> > > Thanks
>>
>
>

Reply via email to