Hey Schubert:

Just FYI: after noticing the mismatch, rather than restarting the whole
cluster, you might try closing just the one region.  That can jog the master
into noticing it has a bad assignment.  To do this, type 'tools' in the shell
and you'll see some admin facilities.

The root problem seems to be an issue fixed in the new hbase 0.19.1 release
candidate; see HBASE-1121, 'Cluster confused about where -ROOT- is'.

What worries me is that even after a restart, you cannot get to the
troublesome region.  Is it deployed on a regionserver?  If so, is there
anything pertinent in the logs regarding this region?
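
A minimal sketch, in Python, of one way to pull the lines that mention a
region out of a regionserver log.  It assumes you match on the trailing
region id (1236847258901 for the region in this thread).  The function name
and the sample lines are illustrative only, and `<startkey>` is a
placeholder, not a real key.

```python
def lines_mentioning(log_lines, needle):
    """Return (line_number, text) pairs for log lines that contain needle."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if needle in line:
            hits.append((number, line.rstrip("\n")))
    return hits


# Illustrative sample, shaped like the regionserver log lines quoted below.
SAMPLE_LOG = [
    "2009-03-12 17:12:49,298 INFO HRegion: Starting split of region TESTTABLE,<startkey>,1236847258901\n",
    "2009-03-12 17:12:50,809 INFO HRegion: Closed TESTTABLE,<startkey>,1236849169299\n",
    "2009-03-12 17:12:50,648 INFO HRegion: Closed TESTTABLE,<startkey>,1236847258901\n",
]

if __name__ == "__main__":
    for number, text in lines_mentioning(SAMPLE_LOG, "1236847258901"):
        print(number, text)
```

For a real log, pass an open file object instead of the sample list, since
iterating a file yields its lines one at a time.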

St.Ack

On Thu, Mar 12, 2009 at 4:31 AM, schubert zhang <[email protected]> wrote:

> oh, it is not fine.
> Now, I can find:
> TESTTABLE,13575565...@2008-12-01
> 17:16:55.117,1236847258901<
> http://nd0-rack0-cloud:60010/regionhistorian.jsp?regionname=WAPCDR,13575565...@2008-12-01%2017:16:55.117,1236847258901
> >
> nd1-rack0-cloud:60020 <http://nd1-rack0-cloud:60030/> 916003194
> 13575565...@2008-12-01 17:16:55.117 13576301...@2008-12-08 13:57:43.163
>
> but when I try to get 13575565...@2008-12-01 17:16:55.117, nothing is
> returned. It seems this region is gone.
>
>
> On Thu, Mar 12, 2009 at 7:09 PM, schubert zhang <[email protected]> wrote:
>
> > Hi all,
> > Today I encountered a new issue: a batchUpdate commit fails.
> >
> > I am running a program that inserts rows into an HBase table, but after a
> > long period of batchUpdating, the following exception occurs:
> >
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> > region server Some server for region
> > TESTTABLE,13575565...@2008-12-0117:16:55.117,1236847258901,
> > row '13575581...@2008-12-0606:15:48.077', but failed after 10 attempts.
> > Exceptions:
> >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:942)
> >         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
> >         at org.apache.hadoop.hbase.client.HTable.close(HTable.java:1385)
> >         ......
> >
> > And after waiting for a long time, I still cannot insert new data.
> >
> > Then I checked the HBase status: the master and all regionservers are
> > running.
> >
> > But I found a mismatch for region
> > "TESTTABLE,13575565...@2008-12-0117:16:55.117,1236847258901".
> > The metadata says this region is served by 10.24.1.12, but when I
> > checked 10.24.1.12, the region was not there.
> > I then stopped the whole HBase cluster and started it again. Region
> > locations were reassigned and everything seemed OK.
> >
> > In the log file of 10.24.1.12, I found the following exceptions:
> >
> > 836118938_60020/hlog.dat.1236849158178, entries=100010. New log writer:
> > /hbase/log_10.24.1.12_1236836118938_60020/hlog.dat.1236849168393
> > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region TESTTABLE,13575565...@2008-12-0117:16:55.117,1236847258901 in 48sec
> > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting split of region TESTTABLE,13575565...@2008-12-0117:16:55.117,1236847258901
> > 2009-03-12 17:12:50,648 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901
> > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TESTTABLE,13575565...@2008-12-0117:16:55.117,1236849169299/1762744366 available
> > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236849169299
> > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TESTTABLE,13575590...@2008-12-1615:49:40.143,1236849169299/1344805089 available
> > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575590...@2008-12-16 15:49:40.143,1236849169299
> > 2009-03-12 17:29:15,495 WARN org.apache.hadoop.hbase.RegionHistorian: Unable to 'Region split from: WAPCDR,13575565...@2008-12-0117:16:55.117,1236847258901'
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region , row 'TESTTABLE,13575565...@2008-12-0117:16:55.117,1236849169299', but failed after 11 attempts.
> > Exceptions:
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> >
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
> >         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> >
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> >
>
