I found that the "ulimit nofile" on one node of my cluster had not been raised.
My issue may be caused by that. I will retest.
Thank you very much, and thanks to J-D as well.

Refer to: item 6 of http://wiki.apache.org/hadoop/Hbase/FAQ
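
For anyone checking the same thing, the open-file limit that a process actually inherits can be inspected with a short script. This is only a sketch: the helper names and the 32768 threshold are illustrative assumptions of mine, not values taken from the FAQ, and the script should be run as the same user that starts the HBase daemons.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_limit_too_low(soft, minimum=32768):
    """Report whether a soft limit is below `minimum`.

    `minimum` is a hypothetical threshold chosen for illustration only;
    pick the value recommended for your own deployment.
    """
    # An unlimited soft limit can never be too low.
    if soft == resource.RLIM_INFINITY:
        return False
    return soft < minimum


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("nofile soft limit:", soft)
    print("nofile hard limit:", hard)
    if is_limit_too_low(soft):
        print("soft limit looks low; consider raising it on this node")
```

Running it on every node makes it easy to spot a single machine where the limit was not raised.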


On Fri, Mar 13, 2009 at 6:09 PM, schubert zhang <[email protected]> wrote:

> This time another region went missing, and I used close_region
> 'REGIONNAME' to close it. After that, all regions after this one were missing
> from the web GUI, although I can still find them when I scan '.META.' :-(
> Note: in this case, there is no log info from the -ROOT- table.
>
>
> On Fri, Mar 13, 2009 at 1:10 AM, schubert zhang <[email protected]> wrote:
>
>> Thank you, stack. It seems to be HBASE-1121. I will continue to track it.
>> Sorry, the log files have already been removed.
>>
>>
>> On Fri, Mar 13, 2009 at 12:29 AM, stack <[email protected]> wrote:
>>
>>> Hey Schubert:
>>>
>>> Just FYI, after noticing the mismatch, rather than restart the whole
>>> cluster, you might try closing the single region.  That can jog the master
>>> into noticing it has a bad assignment.  To do this, in the shell type
>>> 'tools' and you'll see some admin facility.
>>>
>>> The root problem seems to be an issue fixed in the new hbase 0.19.1 release
>>> candidate: See HBASE-1121 'Cluster confused about where -ROOT- is'.
>>>
>>> Worrying is that even after a restart, you cannot get to the troublesome
>>> region.  Is it deployed on a regionserver?  If so, anything pertinent in the
>>> logs regarding this region?
>>>
>>> St.Ack
>>>
>>> On Thu, Mar 12, 2009 at 4:31 AM, schubert zhang <[email protected]>
>>> wrote:
>>>
>>> > oh, it is not fine.
>>> > Now, I can find:
>>> > TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901
>>> > <http://nd0-rack0-cloud:60010/regionhistorian.jsp?regionname=WAPCDR,13575565...@2008-12-01%2017:16:55.117,1236847258901>
>>> > nd1-rack0-cloud:60020 <http://nd1-rack0-cloud:60030/> 916003194
>>> > 13575565...@2008-12-01 17:16:55.117 13576301...@2008-12-08 13:57:43.163
>>> >
>>> > but when I try to get 13575565...@2008-12-01 17:16:55.117, nothing
>>> > is returned. It seems this region is gone.
>>> >
>>> >
>>> > On Thu, Mar 12, 2009 at 7:09 PM, schubert zhang <[email protected]> wrote:
>>> >
>>> > > Hi all,
>>> > > Today I encountered a new issue: a batchUpdate commit fails.
>>> > >
>>> > > I am running a program that inserts rows into an HBase table, but after
>>> > > a long period of batch updating, the following exception occurs:
>>> > >
>>> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901, row '13575581...@2008-12-06 06:15:48.077', but failed after 10 attempts.
>>> > > Exceptions:
>>> > >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:942)
>>> > >         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
>>> > >         at org.apache.hadoop.hbase.client.HTable.close(HTable.java:1385)
>>> > >         ......
>>> > >
>>> > > And after waiting for a long time, I still cannot insert new data.
>>> > >
>>> > > Then I checked the HBase status: the master and all regionservers are
>>> > > running.
>>> > >
>>> > > But I found a mismatch for region
>>> > > "TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901".
>>> > > The metadata says this region is served by 10.24.1.12, but when I
>>> > > checked 10.24.1.12, the region was not there.
>>> > > I then stopped the whole HBase cluster and started it again. Region
>>> > > locations were reassigned and everything seemed OK.
>>> > >
>>> > > In the log file of 10.24.1.12, I found the following exceptions:
>>> > >
>>> > > 836118938_60020/hlog.dat.1236849158178, entries=100010. New log writer: /hbase/log_10.24.1.12_1236836118938_60020/hlog.dat.1236849168393
>>> > > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901 in 48sec
>>> > > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting split of region TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901
>>> > > 2009-03-12 17:12:50,648 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236847258901
>>> > > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236849169299/1762744366 available
>>> > > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236849169299
>>> > > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TESTTABLE,13575590...@2008-12-16 15:49:40.143,1236849169299/1344805089 available
>>> > > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575590...@2008-12-16 15:49:40.143,1236849169299
>>> > > 2009-03-12 17:29:15,495 WARN org.apache.hadoop.hbase.RegionHistorian: Unable to 'Region split from: WAPCDR,13575565...@2008-12-01 17:16:55.117,1236847258901'
>>> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region , row 'TESTTABLE,13575565...@2008-12-01 17:16:55.117,1236849169299', but failed after 11 attempts.
>>> > > Exceptions:
>>> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
>>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
>>> > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> > >         at java.lang.reflect.Method.invoke(Method.java:597)
>>> > >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> > >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > >
>>> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
>>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
>>> > >         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>>> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> > >         at java.lang.reflect.Method.invoke(Method.java:597)
>>> > >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> > >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > >
>>> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>>> > >
>>> >
>>>
>>
>>
>
