WrongRegionException
Hello everyone,

I am using a coprocessor to block the normal put and replace it with a put under another rowkey. The method I call is HRegion.put(). It works fine, but once the region has split, I get a WrongRegionException:

2018-01-28 09:32:51,528 WARN [B.DefaultRpcServer.handler=21,queue=3,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=\xF0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\x10\xC5r
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion GISdoop_GeoKey,,1517085124215.341534e84727245f1c67f345c3e467ac., startKey='', getEndKey()='\xE6G8\x00\x00\x00\x00\x00\x00\x00\x00\x00', row='\xF0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\x10\xC5r'
    at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:4677)
    at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:4695)
    at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2786)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2653)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2589)
    at org.apache.hadoop.hbase.regionserver.HRegion.doBatchMutate(HRegion.java:3192)
    at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2459)
    at site.luoyu.Core.Index.JavaTreeMap.insertPoint(JavaTreeMap.java:287)
    at site.luoyu.Core.Index.JavaTreeMap.insertRecord(JavaTreeMap.java:256)
    at site.luoyu.Core.Observer.IndexCopressor.prePut(IndexCopressor.java:130)
    at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.prePut(RegionCoprocessorHost.java:1122)
    at org.apache.hadoop.hbase.regionserver.HRegion.doPreMutationHook(HRegion.java:2674)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2649)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2589)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2593)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4402)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3584)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3474)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:3)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
    at java.lang.Thread.run(Thread.java:745)

It says the rowkey is outside the region's bounds. The exception is only logged as a warning in the regionserver's log, so I can't catch and handle it.

According to the source code:

    RowLock rowLock = null;
    try {
      rowLock = getRowLock(mutation.getRow(), shouldBlock);
    } catch (IOException ioe) {
      LOG.warn("Failed getting lock in batch put, row="
          + Bytes.toStringBinary(mutation.getRow()), ioe);
    }

HBase just catches and logs this exception. I guess it doesn't even remove the mutation from the batch, so I get a flood of exception logs and can't put data anymore.

Why does HBase handle WrongRegionException this way? Can anyone help?

Thanks very much.
Re: WrongRegionException
Another related question was also posted. Can you tell us the actual requirement? Do you want to change the rowkeys (RKs) of the incoming puts? Or do you want to insert those puts as well as some new cells with a changed RK?

-Anoop-

On Mon, Jan 29, 2018 at 3:49 PM, Yang Zhang wrote:
> Hello everyone,
>
> I am using a coprocessor to block the normal put and replace it
> with a put under another rowkey. The method I call is HRegion.put().
> It works fine, but once the region has split, I get a
> WrongRegionException.
> [...]
Re: WrongRegionException
Both are the same question.

I want to block the incoming puts and copy each of them as one or more new puts with different rowkeys. While there is only one region, my rowkeys always belong to it. But once the region has split, some rowkeys may not belong to the new region. I used to think HBase would stop accepting new puts, finish all of the puts in the batch, and only then try to split. Judging by the exception I got, that may not be right.

BTW, it seems that I can't add a put to MiniBatchOperationInProgress miniBatchOp. It only exposes some getter functions.

Thank you very much for your help.

2018-01-29 18:46 GMT+08:00 Anoop John :
> Another related question was also posted. Can you tell us the actual
> requirement? Do you want to change the RKs of the incoming puts?
> Or do you want to insert those puts as well as some new cells with a
> changed RK?
>
> -Anoop-
> [...]
Re: WrongRegionException
W.r.t. the region split: do you verify that the new rowkey is in the same region as the rowkey from the incoming Put?

If not, there is a chance that the new rowkey belongs to a different region, one that is going through a split.

FYI

On Mon, Jan 29, 2018 at 6:40 AM, Yang Zhang wrote:
> Both are the same question.
> I want to block the incoming puts and copy each of them as one or more
> new puts with different rowkeys.
> While there is only one region, my rowkeys always belong to it. But once
> the region has split, some rowkeys may not belong to the new region.
> [...]
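The check Ted asks about can be sketched in plain Java. This is a minimal sketch, assuming a region covers the rows from its start key (inclusive) up to its end key (exclusive), with keys compared as unsigned bytes and an empty end key meaning "no upper bound". RegionRangeCheck and its methods are hypothetical names for illustration, not HBase API; in a coprocessor the bounds would come from the region's own info object. The values in main are the startKey, getEndKey() and row printed in the log above.

```java
final class RegionRangeCheck {
    /** Compares two keys byte by byte, treating each byte as unsigned (0..255). */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // a shorter key sorts before its extensions
    }

    /** True if row lies in [startKey, endKey); an empty endKey means no upper bound. */
    static boolean rowInRegion(byte[] row, byte[] startKey, byte[] endKey) {
        boolean afterStart = compareUnsigned(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compareUnsigned(row, endKey) < 0;
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        // startKey='' and getEndKey()='\xE6G8\x00...' from the log (12 bytes).
        byte[] startKey = new byte[0];
        byte[] endKey = new byte[12];
        endKey[0] = (byte) 0xE6;
        endKey[1] = 'G';
        endKey[2] = '8';

        // row='\xF0\x00...\x0F\x10\xC5r' from the log (16 bytes).
        byte[] row = new byte[16];
        row[0] = (byte) 0xF0;
        row[12] = 0x0F;
        row[13] = 0x10;
        row[14] = (byte) 0xC5;
        row[15] = 'r';

        System.out.println(rowInRegion(row, startKey, endKey)); // prints "false"
    }
}
```

The first byte decides it: 0xF0 sorts after 0xE6, so the rewritten row is past the end key of the region that received the original Put.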
Re: WrongRegionException
Theoretically, I don't think what you are trying to do is correct. I mean, the incoming keys are fully ignored and new ones are made at the CP (coprocessor) layer. By the time the CP is invoked, the mutations have already reached that region. The new RK might be outside the boundary of this region (as happens after a split). I don't know your use case, but I feel you should handle this conversion of RKs and create the new Puts in a client tier, not in the CP.

-Anoop-

On Mon, Jan 29, 2018 at 9:44 PM, Ted Yu wrote:
> W.r.t. the region split: do you verify that the new rowkey is in the
> same region as the rowkey from the incoming Put?
>
> If not, there is a chance that the new rowkey belongs to a different
> region, one that is going through a split.
> [...]
Re: WrongRegionException
Or else you should do the following in prePut: create the new Put, get the Connection from the CP environment, and do a put op on the connection (not directly on the Region). Then call bypass() to bypass this Put op with the old RK.

Just a suggestion. But I somehow feel: why can't a middle client layer do this conversion?

-Anoop-

On Thu, Feb 1, 2018 at 1:07 PM, Anoop John wrote:
> Theoretically, I don't think what you are trying to do is correct. I
> mean, the incoming keys are fully ignored and new ones are made at the
> CP layer. By the time the CP is invoked, the mutations have already
> reached that region. The new RK might be outside the boundary of this
> region (as happens after a split).
> [...]
Puts failing with WrongRegionException
Hi there,

I wonder if anyone has seen an error like this:

2014-10-03 16:03:45,203 WARN [RpcServer.handler=7,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=65317d52abfedc8b94a19f6fbffe187c
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857., startKey='64d7e88463b88e7325b623fbd6629cda', getEndKey()='6516687f5dae26f529c53f309cb36fca', row='65317d52abfedc8b94a19f6fbffe187c'

Recently we added 10 more region servers to our cluster, and since then I have been seeing errors like the one above when doing puts via TableOutputFormat in an MR job.

Maybe the place where HBase stores the region info is corrupted?

Thanks in advance for your help,
thomas
WrongRegionException and inconsistent table found
Hi,

We're running an HBase cluster with 37 regionservers. Today we found lots of WrongRegionExceptions when putting objects into it.

hbase hbck -details reports:

    Chain of regions in table STable is broken; edges does not contain ztxrGmCwn-6BE32s3cX1TNeHU_I=
    ERROR: Found inconsistency in table STable

Running

    echo "scan '.META.'" | hbase shell &> meta.txt
    grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt

reports (shell output, row column on the left, wrapped value column on the right):

    Ck=,1308802977279.71ffb1   1ffb10b8b95fd47b3eff468d00ab4e9.', STARTKEY => 'ztn0ukLW
    0b8b95fd47b3eff468d00ab4   d1NSU3fuXKkkWq5ZVCk=', ENDKEY => 'ztqdVD8fCMP-dDbXUAydan
    e9.                        kboD4=', ENCODED => 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
    --
    D4=,1305619724446.c45191   45191821053d03537596f4a2e759718.', STARTKEY => 'ztqdVD8f
    821053d03537596f4a2e7597   CMP-dDbXUAydankboD4=', ENDKEY => 'ztxrGmCwn-6BE32s3cX1TN
    18.                        eHU_I=', ENCODED => c45191821053d03537596f4a2e759718, TABLE => {{NAME =
    --
    pA=,1309455605341.c5c5f5   5c5f578722ea3f8d1b099313bec8298.', STARTKEY => 'zu3zVaLc
    78722ea3f8d1b099313bec82   GDnnpjKCbnboXgAFspA=', ENDKEY => 'zu7qkr5fH6MMJ3GxbCv_0d
    98.                        6g8yI=', ENCODED => c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =

It looks like the meta indeed has a hole. (We ran scan '.META.' several times to confirm it is not a transient state.) We've tried hbase hbck -fix; it does not help.

We found a thread, 'wrong region exception', from about two months ago. Stack suggested a 'little surgery' like this:

> So, make sure you actually have a hole. Dump out your meta table:
>
> echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt
>
> Then look to ensure that there is a hole between the above regions
> (compare start and end keys... the end key of one region needs to
> match the start key of the next).
>
> If indeed a hole, you need to do a little surgery inserting a new
> missing region (hbck should fix this but it doesn't have the smarts
> just yet). Basically, you create a new region with start and end keys
> to fill the hole, then you insert it into .META. and then assign it.
> There are some scripts in our bin directory that do various parts of
> this. I'm pretty sure it's beyond any but a few figuring this mess out,
> so if you do the above footwork and provide a few more details, I'll
> hack up something for you (and hopefully something generalized to be
> used by others later, and later to be integrated into hbck).

Can anyone give a detailed example? Step-by-step instructions would be greatly appreciated.

My understanding is that we should:

1. Since we already know which region is lost, take its start and end keys.
2. Generate the row that represents the missing region. But how can I generate the encoded name? It looks like I need column=info:server, column=info:serverstartcode and column=info:regioninfo for the missing region, and column=info:regioninfo contains a lot of information. How do I generate these one by one? As for the row name, it consists of the table name, the start key, the encoded name, and one more long number. How do I get this number?
3. Use the assign command in the hbase shell.

We also tried check_meta.rb --fix. It reports:

11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME => 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.', STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY => 'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718, TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'userbucket', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'userpass', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
11/07/06 00:28:40 WARN check_meta: Missing .regioninfo: hdfs://hd0013.c.gj.com:9000/hbase/STable/3e6faca40a7ccad7ed8c0b5848c0f945/.regioninfo

The problem is still there. BTW, what about the blue warning? Is this a serious issue?

The situation is quite hard for us. It looks like even if we can fill the hole in the meta, we would lose all the data in the hole region, right?

Thanks and regards,

Mao Xu-Feng
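The rule in Stack's advice (the end key of one region needs to match the start key of the next) can be checked mechanically once the start and end keys are pulled out of the meta dump. Below is a minimal sketch in plain Java, assuming the regions are already listed in start-key order, as a .META. scan returns them. MetaChainCheck and findBreaks are hypothetical names, not part of HBase or hbck, and a mismatch found this way may be a hole or an overlap. The keys in main are the three regions from the dump above, with the wrapped columns joined back together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MetaChainCheck {
    /**
     * Takes regions as {startKey, endKey} pairs, already sorted by start key,
     * and returns every {endKey, nextStartKey} pair that does not line up.
     */
    static List<String[]> findBreaks(List<String[]> regions) {
        List<String[]> breaks = new ArrayList<>();
        for (int i = 0; i + 1 < regions.size(); i++) {
            String end = regions.get(i)[1];
            String nextStart = regions.get(i + 1)[0];
            if (!end.equals(nextStart)) {
                breaks.add(new String[] {end, nextStart});
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        List<String[]> regions = Arrays.asList(
            new String[] {"ztn0ukLWd1NSU3fuXKkkWq5ZVCk=", "ztqdVD8fCMP-dDbXUAydankboD4="},
            new String[] {"ztqdVD8fCMP-dDbXUAydankboD4=", "ztxrGmCwn-6BE32s3cX1TNeHU_I="},
            new String[] {"zu3zVaLcGDnnpjKCbnboXgAFspA=", "zu7qkr5fH6MMJ3GxbCv_0d6g8yI="});

        for (String[] b : findBreaks(regions)) {
            System.out.println("chain breaks between end key " + b[0]
                + " and next start key " + b[1]);
        }
    }
}
```

For these three regions the only break reported is between ztxrGmCwn-6BE32s3cX1TNeHU_I= and zu3zVaLcGDnnpjKCbnboXgAFspA=, which matches the key named in the hbck message.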
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: WrongRegionException:
Hi,

I have seen a similar issue handled here before, but I didn't fully understand how to fix it in my case. This is the earlier thread:
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/18410

I am seeing the following errors in my client:

org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: WrongRegionException: 3 times, servers with issues: ip-10-32-61-60.ec2.internal:44911,
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1220)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
    ...

Also, hbase hbck -details gives the following errors. -fix doesn't do anything, and neither does check_meta.rb.

ERROR: Region AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b. not deployed on any region server.
ERROR: Region file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839 on HDFS, but not listed in META or deployed on any region server.
ERROR: Region file:/media/ephemeral0/hbase-data/AkStats/3af08f3ba88d7c2785fd089702f89241 on HDFS, but not listed in META or deployed on any region server.
ERROR: Region file:/media/ephemeral0/hbase-data/AkStats/a522e78c7d018547b8979ed6fd381358 on HDFS, but not listed in META or deployed on any region server.
ERROR: Region file:/media/ephemeral0/hbase-data/AkStats/b89f7bb47e53ea9cc729c86978a7327c on HDFS, but not listed in META or deployed on any region server.
ERROR: Region file:/media/ephemeral0/hbase-data/AkStats/fd7ce07003bfd883f53a210bb0985065 on HDFS, but not listed in META or deployed on any region server.
Chain of regions in table AkStats is broken; edges does not contain 277808094:1314921600:daily:Volume
ERROR: Found inconsistency in table AkStats
Summary:
  -ROOT- is okay.
    Number of regions: 1
    Deployed on: ip-10-32-61-60.ec2.internal:55915
  .META. is okay.
    Number of regions: 1
    Deployed on: ip-10-32-61-60.ec2.internal:55915
Chain of regions in table AkStats is broken; edges does not contain 277808094:1314921600:daily:Volume
  Table AkStats is inconsistent.
    Number of regions: 133
    Deployed on: ip-10-32-61-60.ec2.internal:55915
7 inconsistencies detected.
Status: INCONSISTENT

My production system is on its knees and I need to get unblocked ASAP. Any help will be highly appreciated.

Thanks,
vinod
Re: Puts failing with WrongRegionException
Can you check the region server log to see whether region m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857. moved or split between the MR job launch and the time this error showed up?

Thanks

On Fri, Oct 3, 2014 at 4:08 PM, Thomas Kwan wrote:
> Hi there,
>
> Wonder if anyone has seen an error like this:
>
> 2014-10-03 16:03:45,203 WARN [RpcServer.handler=7,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=65317d52abfedc8b94a19f6fbffe187c
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion
> m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857.,
> startKey='64d7e88463b88e7325b623fbd6629cda',
> getEndKey()='6516687f5dae26f529c53f309cb36fca',
> row='65317d52abfedc8b94a19f6fbffe187c'
>
> Recently, we added 10 more region servers to our cluster, and then I
> started seeing errors like the one above when doing puts via TableOutputFormat in an
> MR job.
>
> Maybe the place where HBase stores the region info is corrupted?
>
> Thanks for your help in advance,
> thomas
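To make the quoted warning easier to read: a region serves rows in the half-open range [startKey, endKey), compared byte by byte as unsigned values, with an empty key meaning "unbounded". The sketch below is plain Java written for illustration, not HBase's own code; the class and method names (RowRangeSketch, rowInRange) are made up, and only the three keys come from the log above.

```java
import java.nio.charset.StandardCharsets;

class RowRangeSketch {
    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // True when start <= row < end; an empty start or end key is treated as unbounded.
    static boolean rowInRange(byte[] start, byte[] end, byte[] row) {
        boolean afterStart = start.length == 0 || compare(row, start) >= 0;
        boolean beforeEnd = end.length == 0 || compare(row, end) < 0;
        return afterStart && beforeEnd;
    }

    // Helper for the ASCII keys used in this example.
    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] start = ascii("64d7e88463b88e7325b623fbd6629cda");
        byte[] end = ascii("6516687f5dae26f529c53f309cb36fca");
        byte[] row = ascii("65317d52abfedc8b94a19f6fbffe187c");
        // The row sorts after the region's end key, so it belongs to a later region.
        System.out.println("row in range: " + rowInRange(start, end, row));
    }
}
```

With the keys from the log, the row sorts after the end key ("6531..." is greater than "6516..."), which fits the suggestion that the put was routed using region boundaries that were no longer current.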
Re: WrongRegionException and inconsistent table found
We also checked the master log; nothing interesting was found.

On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao wrote:
> Hi,
>
> We're running an HBase cluster with 37 regionservers. Today, we found
> lots of WrongRegionException when putting objects into it.
>
> hbase hbck -details
> reports that
>
> Chain of regions in table STable is broken; edges does not contain ztxrGmCwn-6BE32s3cX1TNeHU_I=
> ERROR: Found inconsistency in table STable
>
> echo "scan '.META.'"| hbase shell &> meta.txt
> grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
> reports that
>
> Ck=,1308802977279.71ffb1 1ffb10b8b95fd47b3eff468d00ab4e9.',
> STARTKEY => 'ztn0ukLW
> 0b8b95fd47b3eff468d00ab4 d1NSU3fuXKkkWq5ZVCk=', ENDKEY =>
> 'ztqdVD8fCMP-dDbXUAydan
> e9.kboD4=', ENCODED =>
> 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
> --
> D4=,1305619724446.c45191 45191821053d03537596f4a2e759718.',
> STARTKEY => ztqdVD8f
> 821053d03537596f4a2e7597 CMP-dDbXUAydankboD4=', ENDKEY => '
> ztxrGmCwn-6BE32s3cX1TN
> 18.eHU_I=', ENCODED =>
> c45191821053d03537596f4a2e759718, TABLE => {{NAME =
> --
> pA=,1309455605341.c5c5f55c5f578722ea3f8d1b099313bec8298.',
> STARTKEY => 'zu3zVaLc
> 78722ea3f8d1b099313bec82 GDnnpjKCbnboXgAFspA=', ENDKEY =>
> 'zu7qkr5fH6MMJ3GxbCv_0d
> 98.6g8yI=', ENCODED =>
> c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =
>
> It looks like the meta indeed has a hole. (We tried scan '.META.' several
> times to confirm it's not a transient status.)
> We've tried hbase hbck -fix; it does not help.
>
> We found a thread 'wrong region exception' from about two months ago. Stack
> suggested a 'little surgery' like
>
> *So, make sure you actually have a hole. Dump out your meta table:
>
> echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt
>
> Then look ensure that there is a hole between the above regions
> (compare start and end keys... the end key of one region needs to
> match the start key of the next).
>
> If indeed a hole, you need to do a little surgery inserting a new
> missing region (hbck should fix this but it doesn't have the smarts
> just yet).
>
> Basically, you create a new region with start and end keys to fill the
> hole then you insert it into .META. and then assign it. There are
> some scripts in our bin directory that do various parts of this. I'm
> pretty sure its beyond any but a few figuring this mess out so if you
> do the above foot work and provide a few more details, I'll hack up
> something for you (and hopefully something generalized to be use by
> others later, and later to be integrated into hbck).*
>
> Can anyone give a detailed example? Step-by-step instructions would be
> greatly appreciated.
> My understanding is that we should:
> 1. Since we have already identified the lost region, we now have its start and end keys.
> 2. Generate the row that represents the missing region. But how can I generate
> the encoded name?
> It looks like I need
> column=info:server, column=info:serverstartcode and column=info:regioninfo
> for the missing region.
> And column=info:regioninfo includes so much information. How do I generate
> it piece by piece?
> As for the name of the row, it consists of the table name, start key, encoded name, and one
> more long number;
> how do I get this number?
> 3. Use the assign command in the hbase shell.
>
> We also tried check_meta.rb --fix; it reports
>
> 11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME =>
> 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.',
> STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY =>
> 'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718,
> TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER =>
> 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3',
> TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE',
> REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '
> 2
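The check Stack describes in the quoted advice (the end key of one region must match the start key of the next) can be sketched in a few lines. This is plain Java for illustration only, not part of hbck or check_meta.rb; the class and method names are hypothetical, and the three start/end key pairs are reassembled by hand from the column-wrapped shell output quoted above, so treat them as an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RegionChainSketch {
    // Minimal stand-in for one .META. row: just the region's start and end keys.
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Expects regions sorted by start key. Returns one {previousEnd, nextStart}
    // pair for every place where neighbouring regions do not line up. A mismatch
    // may be a hole or an overlap; this sketch does not tell them apart.
    static List<String[]> findBreaks(List<Region> sorted) {
        List<String[]> breaks = new ArrayList<>();
        for (int i = 1; i < sorted.size(); i++) {
            String prevEnd = sorted.get(i - 1).endKey;
            String nextStart = sorted.get(i).startKey;
            if (!prevEnd.equals(nextStart)) {
                breaks.add(new String[] {prevEnd, nextStart});
            }
        }
        return breaks;
    }

    // Keys reassembled from the wrapped scan output quoted in the message above.
    static List<Region> sampleRegions() {
        return Arrays.asList(
            new Region("ztn0ukLWd1NSU3fuXKkkWq5ZVCk=", "ztqdVD8fCMP-dDbXUAydankboD4="),
            new Region("ztqdVD8fCMP-dDbXUAydankboD4=", "ztxrGmCwn-6BE32s3cX1TNeHU_I="),
            new Region("zu3zVaLcGDnnpjKCbnboXgAFspA=", "zu7qkr5fH6MMJ3GxbCv_0d6g8yI="));
    }

    public static void main(String[] args) {
        for (String[] b : findBreaks(sampleRegions())) {
            System.out.println("chain broken between " + b[0] + " and " + b[1]);
        }
    }
}
```

On the sample keys this reports a single break starting at ztxrGmCwn-6BE32s3cX1TNeHU_I=, the same key hbck names in its "edges does not contain" message. The break's two keys are the start and end keys a replacement region would need.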
Re: WrongRegionException and inconsistent table found
I forgot the version; we are using cdh3u0.

Mao Xu-Feng
RE: WrongRegionException and inconsistent table found
egionInfo(bytes)
  if oldHRI
    if oldHRI.isOffline() && Bytes.equals(oldHRI.getStartKey(), hri.getStartKey())
      # Presume offlined parent
    elsif Bytes.equals(oldHRI.getEndKey(), hri.getStartKey())
      # Start key of next matches end key of previous
    else
      LOG.info("hole after " + oldHRI.toString())
      if fixup
        bad = 1 unless fixup(oldHRI, hri, metatable, conf)
      else
        bad = 1
      end
    end
  else
    if not Bytes.toString(hri.getStartKey()) == ""
      bad = 1 unless fixup("", hri, metatable, conf)
    end
  end
  oldHRI = hri
end
if not Bytes.toString(hri.getEndKey()) == ""
  bad = 1 unless fixup(hri, "", metatable, conf)
end
scanner.close()
if bad
  LOG.info(".META. has holes")
else
  LOG.info(".META. is healthy")
end
# Return 0 if meta is good, else non-zero.
exit bad
===END CODE

-----Original Message-----
From: Xu-Feng Mao [mailto:m9s...@gmail.com]
Sent: Tuesday, July 05, 2011 6:21 PM
To: Xu-Feng Mao
Cc: user@hbase.apache.org; hbase-u...@hadoop.apache.org
Subject: Re: WrongRegionException and inconsistent table found

I forgot the version, we are using cdh3u0.

Mao Xu-Feng
Dead loop for batch put when get WrongRegionException
Hi all,

We are using batch put to insert rows, and sometimes get the following WARN in the region server log:

2015-07-23 10:08:49,684 WARN [B.defaultRpcServer.handler=5,queue=5,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=BHXYHZFIHHR3ECON10100215072399
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion atpco:ttf_fare,C,1437145538123.9c2b8cb846b318045f2ad6b5c87fef21., startKey='C', getEndKey()='D', row='BHXYHZFIHHR3ECON10100215072399'
    at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:3456)
    at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:3474)
    at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2394)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2261)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2213)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2217)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4386)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3588)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3477)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29593)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)

The WARN message is logged non-stop. I think the batch put has fallen into a dead loop.

I looked into the source code and found that the batch put never stops if it gets a WrongRegionException for some row.

Does anybody know how to avoid this situation?

Any ideas will be appreciated!
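As a toy illustration of the termination concern raised above: a batch loop only finishes if it moves past a row it cannot handle. The sketch below is a simplified model in plain Java, not HBase's batchMutate or doMiniBatchMutation; every name in it is hypothetical, String comparison stands in for HBase's byte comparison, and only the region bounds ('C', 'D') and the out-of-range row come from the log.

```java
import java.util.Arrays;
import java.util.List;

class BatchLoopSketch {
    // Toy range check: start <= row < end (ASCII keys only).
    static boolean inRange(String start, String end, String row) {
        return row.compareTo(start) >= 0 && row.compareTo(end) < 0;
    }

    // Walks the batch once. An out-of-range row is counted as failed and skipped.
    // Returns {accepted, failed}.
    static int[] process(List<String> rows, String start, String end) {
        int accepted = 0;
        int failed = 0;
        int next = 0;
        while (next < rows.size()) {
            if (inRange(start, end, rows.get(next))) {
                accepted++;
            } else {
                failed++; // record the failure instead of retrying the same row
            }
            next++; // advancing on failure too is what guarantees the loop ends
        }
        return new int[] {accepted, failed};
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("C001", "BHXYHZFIHHR3ECON10100215072399", "C002");
        int[] result = process(batch, "C", "D");
        System.out.println("accepted=" + result[0] + " failed=" + result[1]);
    }
}
```

If the `next++` ran only on success, the loop would stay on the bad row forever and log the same warning each pass, which is the shape of the behaviour described in the message. Whether the server code of a given HBase version actually does this has to be checked against that version's source.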
Possible solution to 'WrongRegionException and inconsistent table found'
Hi,

It looks like we've lost a region, including its directory on HDFS and its meta record as well. We need some more time to dig through the logs to figure out the root cause.

But first of all, we need to recover the meta so that we can put keys in that region. My understanding is that check_meta.rb and add_table.rb can fix some meta issues when the directory on HDFS and its .regioninfo still exist.

In our situation, however, since we could not find the region directory any longer, it seems that all we can do is insert a record into the meta and then assign it.

I modified check_meta.rb to do the insertion. I've tried it in our environment and it seems to work; at least hbase hbck tells me the table is okay. I attached it to this message. Any comments are greatly appreciated.

I have one more question. I create the new region record with both startkey and endkey set. It seems possible that, if we're unlucky, a split happens during the insertion and we end up with overlapping regions. I wonder how HBase handles this sort of problem generally.

When I was playing with the test environment, I saw a message saying some region 'is multiply assigned to region servers'. That is also an inconsistent scenario; how can I recover from it?

Thanks and regards,
Mao Xu-Feng

---------- Forwarded message ----------
From: Xu-Feng Mao
Date: Wed, Jul 6, 2011 at 7:20 AM
Subject: Re: WrongRegionException and inconsistent table found
To: Xu-Feng Mao
Cc: "user@hbase.apache.org", "hbase-u...@hadoop.apache.org"

I forgot the version, we are using cdh3u0.

Mao Xu-Feng
Re: Dead loop for batch put when get WrongRegionException
Any chance that this would be your problem?
https://issues.apache.org/jira/browse/HBASE-13896
Re: Dead loop for batch put when get WrongRegionException
It seems that HBASE-13896 <https://issues.apache.org/jira/browse/HBASE-13896> is a client-side dead loop, but my problem is a deadlock on the region server side when getting the row lock.
Re: Dead loop for batch put when get WrongRegionException
My HBase version is 0.98.6. I will try updating the client to 0.98.14 and keeping the server at 0.98.6.
Thanks very much!

2015-07-23 15:00 GMT+08:00 Victor Xu:
> A client-side dead loop can cause wrong read/write requests to be sent to the
> region servers, and you've got exactly the same log output as I did when
> the bug happens. However, this only occurs when you are using a 0.98.x
> version. 1.0 and above do not have this problem.
Re: Dead loop for batch put when get WrongRegionException
This patch hasn't been merged into 0.98.14 yet. You can apply it to your version; just recompiling hbase-client is enough.

On Thu, Jul 23, 2015 at 3:04 PM Louis Hust wrote:
> My HBase version is 0.98.6. I will try updating the client to 0.98.14 and
> keeping the server at 0.98.6.
> Thanks very much!
>
> 2015-07-23 15:00 GMT+08:00 Victor Xu:
>> A client-side dead loop can cause wrong read/write requests to be sent to
>> the region servers, and you've got exactly the same log output as I did
>> when the bug happened. However, this only occurs on 0.98.x versions;
>> 1.0 and above do not have this problem.
>>
>> On Thu, Jul 23, 2015 at 2:55 PM Louis Hust wrote:
>>> It seems that HBASE-13896
>>> <https://issues.apache.org/jira/browse/HBASE-13896> is a client-side
>>> dead loop, but my problem is a regionserver-side dead lock on getting
>>> the row lock.
>>>
>>> 2015-07-23 11:23 GMT+08:00 Victor Xu:
>>>> FYI
>>>>
>>>> -- Forwarded message --
>>>> From: Victor Xu
>>>> Date: Thu, Jul 23, 2015 at 11:22 AM
>>>> Subject: Re: Dead loop for batch put when get WrongRegionException
>>>> To: user@hbase.apache.org
>>>>
>>>> Any chance that this would be your problem?
>>>> https://issues.apache.org/jira/browse/HBASE-13896
>>>>
>>>> On Thu, Jul 23, 2015 at 11:17 AM Louis Hust wrote:
>>>>> Hi all,
>>>>>
>>>>> We are using batch put to insert rows, and sometimes get the following
>>>>> WARN in the region server log:
>>>>>
>>>>> 2015-07-23 10:08:49,684 WARN [B.defaultRpcServer.handler=5,queue=5,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=BHXYHZFIHHR3ECON10100215072399
>>>>> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion atpco:ttf_fare,C,1437145538123.9c2b8cb846b318045f2ad6b5c87fef21., startKey='C', getEndKey()='D', row='BHXYHZFIHHR3ECON10100215072399'
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:3456)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:3474)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2394)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2261)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2213)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2217)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4386)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3588)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3477)
>>>>>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29593)
>>>>>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>>>>>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>>     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>>
>>>>> The WARN message is logged non-stop. I think the batch put has gone
>>>>> into a dead loop.
>>>>>
>>>>> I looked into the source code and found that the batch put will never
>>>>> stop if it gets a WrongRegionException for some row.
>>>>>
>>>>> Does anybody know how to avoid this situation?
>>>>>
>>>>> Any ideas will be appreciated!
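The WARN message quoted above reports a row that falls outside the region's key range (startKey='C', getEndKey()='D'). As a rough illustration only, here is a minimal plain-Ruby sketch of that kind of range check. It is not HBase code: `row_in_range?` is a hypothetical helper, Ruby's byte-wise string comparison stands in for HBase's byte[] comparison, and an empty key is assumed to mean "unbounded" on that side.

```ruby
# Hypothetical helper, not part of HBase: a region is assumed to serve rows
# with start_key <= row < end_key, where an empty key means "no bound".
def row_in_range?(row, start_key, end_key)
  after_start = start_key.empty? || (row.b <=> start_key.b) >= 0
  before_end  = end_key.empty?   || (row.b <=> end_key.b) < 0
  after_start && before_end
end

# Values taken from the WARN message in this thread.
puts row_in_range?('BHXYHZFIHHR3ECON10100215072399', 'C', 'D')  # false: 'B...' sorts before 'C'
# A made-up row that would belong to the same region.
puts row_in_range?('CHXY', 'C', 'D')                            # true
```

Under these assumptions the row from the log fails the check because it sorts before the region's start key, which matches the "Requested row out of range" wording of the exception.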
Re: Possible solution to 'WrongRegionException and inconsistent table found'
The attachment didn't go through.
Can you put the file on pastebin?

Or you can open a JIRA and attach it there.

Thanks

On Jul 6, 2011, at 5:37 AM, Xu-Feng Mao wrote:

> Hi,
>
> It looks like we've lost a region, including its directory on HDFS and its
> meta record as well. We need some more time to dig into the sea of logs to
> figure out the root cause.
>
> But first of all, we need to recover the meta, so that we can put keys in
> that region. My understanding is that check_meta.rb and add_table.rb can fix
> some meta issues as long as the directory on HDFS and its .regioninfo still
> exist.
>
> In our situation, however, since we can no longer find the region directory,
> it seems that all we can do is insert a record into the meta and then
> assign it.
>
> I modified check_meta.rb to do the insertion. I've tried it in our
> environment and it seems to work; at least hbase hbck tells me it's okay.
> I attached it to this message. Any comments are greatly appreciated.
>
> I have one more question. I create the new region record with both startkey
> and endkey set. It seems possible that, if we're unlucky, a split happens
> during the insertion and we end up with overlapping regions. I wonder how
> HBase handles this sort of problem in general.
>
> When I was playing with the test environment, I saw messages saying that
> some region 'is multiply assigned to region servers'. That is also an
> inconsistent scenario; how can I recover from it?
>
> Thanks and regards,
>
> Mao Xu-Feng
>
> -- Forwarded message --
> From: Xu-Feng Mao
> Date: Wed, Jul 6, 2011 at 7:20 AM
> Subject: Re: WrongRegionException and inconsistent table found
> To: Xu-Feng Mao
> Cc: "user@hbase.apache.org", "hbase-u...@hadoop.apache.org"
>
> I forgot the version: we are using cdh3u0.
>
> Mao Xu-Feng
>
> On 2011-7-6, at 0:59, Xu-Feng Mao wrote:
>
>> We also checked the master log and found nothing interesting.
>>
>> On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao wrote:
>> Hi,
>>
>> We're running an HBase cluster with 37 regionservers. Today we found
>> lots of WrongRegionExceptions when putting objects into it.
>>
>> hbase hbck -details
>> reports that
>>
>> Chain of regions in table STable is broken; edges does not contain ztxrGmCwn-6BE32s3cX1TNeHU_I=
>> ERROR: Found inconsistency in table STable
>>
>> echo "scan '.META.'"| hbase shell &> meta.txt
>> grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
>> reports that
>>
>> Ck=,1308802977279.71ffb1 1ffb10b8b95fd47b3eff468d00ab4e9.', STARTKEY => 'ztn0ukLW
>> 0b8b95fd47b3eff468d00ab4 d1NSU3fuXKkkWq5ZVCk=', ENDKEY => 'ztqdVD8fCMP-dDbXUAydan
>> e9.kboD4=', ENCODED => 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
>> --
>> D4=,1305619724446.c45191 45191821053d03537596f4a2e759718.', STARTKEY => ztqdVD8f
>> 821053d03537596f4a2e7597 CMP-dDbXUAydankboD4=', ENDKEY => 'ztxrGmCwn-6BE32s3cX1TN
>> 18.eHU_I=', ENCODED => c45191821053d03537596f4a2e759718, TABLE => {{NAME =
>> --
>> pA=,1309455605341.c5c5f55c5f578722ea3f8d1b099313bec8298.', STARTKEY => 'zu3zVaLc
>> 78722ea3f8d1b099313bec82 GDnnpjKCbnboXgAFspA=', ENDKEY => 'zu7qkr5fH6MMJ3GxbCv_0d
>> 98.6g8yI=', ENCODED => c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =
>>
>> It looks like the meta indeed has a hole. (We ran scan '.META.' several
>> times to confirm it's not a transient state.)
>> We've tried hbase hbck -fix; it does not help.
>>
>> We found a thread 'wrong region exception' from about two months ago. Stack
>> suggested a 'little surgery' like
>>
>> So, make sure you actually have a hole. Dump out your meta table:
>>
>> echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt
>>
>> Then look ensure that there is a hole between the above regions
>> (compare start and end keys... the end key of one region needs to
>> match the start key of the next).
>>
>> If indeed a hole, you need to do a little surgery inserting a new
>> missin
Re: Possible solution to 'WrongRegionException and inconsistent table found'
Thanks, Ted, for your attention.

I do not know how to open a JIRA, so I am pasting the script here. It is not meant to be general purpose; all I want to do here is insert the missing meta record.

===
# ${HBASE_HOME}/bin/hbase org.jruby.Main test.rb --help
include Java
import org.apache.commons.logging.LogFactory
import org.apache.hadoop.hbase.util.VersionInfo
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HConstants
import org.apache.hadoop.hbase.util.FSUtils
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.util.Writables
import org.apache.hadoop.hbase.HRegionInfo
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.HTableDescriptor
import org.apache.hadoop.hbase.client.Put
import com.google.common.base.Objects

# Name of this script
NAME = 'test'

# Print usage for this script
def usage
  puts 'Usage: %s.rb ' % NAME
  exit!
end

def getConfiguration
  hbase_twenty = VersionInfo.getVersion().match('0\.20\..*')
  # Get configuration to use.
  if hbase_twenty
    c = HBaseConfiguration.new()
  else
    c = HBaseConfiguration.create()
  end
  # Set hadoop filesystem configuration using the hbase.rootdir.
  # Otherwise, we'll always use localhost though the hbase.rootdir
  # might be pointing at hdfs location. Do old and new key for fs.
  c.set("fs.default.name", c.get(HConstants::HBASE_DIR))
  c.set("fs.defaultFS", c.get(HConstants::HBASE_DIR))
  return c
end

# Get configuration
conf = getConfiguration()
# Filesystem
fs = FileSystem.get(conf)
# Rootdir
rootdir = FSUtils.getRootDir(conf)
# Get a logger and a metautils instance.
LOG = LogFactory.getLog(NAME)

# Scan the .META. looking for holes
metatable = HTable.new(conf, HConstants::META_TABLE_NAME)
scan = Scan.new()
scanner = metatable.getScanner(scan)
oldHRI = nil
bad = nil
while (result = scanner.next())
  rowid = Bytes.toString(result.getRow())
  rowidStr = java.lang.String.new(rowid)
  bytes = result.getValue(HConstants::CATALOG_FAMILY, HConstants::REGIONINFO_QUALIFIER)
  hri = Writables.getHRegionInfo(bytes)
  endKey = Bytes.toString(hri.getEndKey())
  # Find the region that ends where the missing region should start.
  if Objects.equal(endKey, "text_7aea6698-2503-4624-a835-3ea641d52ba1")
    puts Bytes.toString(hri.getStartKey())
    # New region: starts at the hard-coded key; the empty end key is passed as-is.
    newhri = HRegionInfo.new(hri.getTableDesc(),
                             java.lang.String.new("text_7aea6698-2503-4624-a835-3ea641d52ba1").getBytes(),
                             java.lang.String.new("").getBytes(),
                             false)
    puts newhri.toString()
    # Insert the info:regioninfo cell for the new region into .META.
    p = Put.new(newhri.getRegionName())
    p.add(HConstants::CATALOG_FAMILY, HConstants::REGIONINFO_QUALIFIER, Writables.getBytes(newhri))
    metatable.put(p)
    break
  end
end
scanner.close()
exit 0
===
Re: Possible solution to 'WrongRegionException and inconsistent table found'
Have you read http://wiki.apache.org/hadoop/Hbase/HowToContribute ?

You can file an issue by starting from https://issues.apache.org/jira/secure/CreateIssue!default.jspa

An issue solves a general problem. So you should parametrize the end key.

Cheers
Re: Possible solution to 'WrongRegionException and inconsistent table found'
On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao wrote:
> I looks like we've lost a region, include the directory on hdfs and its meta
> record as well. We need some more time to dig into the log sea, to figure
> out the root cause.

You think it was https://issues.apache.org/jira/browse/HBASE-3872?

> But first of all, we need to recover the meta, so that we can put keys in
> that region. My understanding is the check_meta.rb and add_table.rb could
> fix some meta issues in case the directory on hdfs and its .regioninfo still
> exists.

Yes. add_table.rb will go out on the fs, find the regions for the table, and rewrite that portion of .META. In 0.90 it will not assign them, though; you will likely need to disable and then re-enable the table to get the regions out on the cluster.

check_meta is likely the same. It looks for the hole and, if you pass -fix, will create a new region to plug the hole. This is probably what you need (you may need to assign the region after running the script).

> I modified the check_meta.rb, to achieve the insertion. I've tried in our
> environment, it seems work, at least hbase hbck tells me okay. I attached it
> with this message.Any comments is great appreciated.

Good.

> I have one more question. I create the new region record with both startkey
> and endkey set, it seems possible that if we're unlucky, during the
> insertion, some split happens, then we might lead to overlap region. I
> wonder how hbase handles this sort of problems generally.

Well, you can't do cross-row transactions, which is sort of what you would need here, so yes, it's possible that there could be overlap. But didn't you say the region was missing? (If so, how could it split?)

> When I was playing with the test environment, I saw message like some region
> 'is multiply assigned to region servers', it is also a inconsistent
> scenario, how can I recover this problem?

Can you figure out how this double-assign happened?

To 'recover' you'd close it on one of the regionservers. Send a close_region 'REGION_NAME', 'SERVER_NAME' in the shell (read the shell's close_region help to be sure, for my memory is not reliable).

St.Ack
Re: Possible solution to 'WrongRegionException and inconsistent table found'
On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao wrote:
>> Can anyone give a detailed example, step by step instruction would be
>> greatly appreciated.
>> My understand is we should
>> 1.Since we already has the lost region, we now have start and end keys.
>> 2.generate the row represents the missing region. But how can I generate
>> the encoded name?

It's generated for you when you create an HRegionInfo instance (see its constructors; it takes start and end keys as well as an HTableDesc instance for your table).

>> It looks like I need
>> column=info:server,column=info:serverstartcode and column=info:regioninfo
>> for the missing region.

info:serverstartcode and info:server will be added for you on assign. You just add the info:regioninfo.

>> As for the name of row, it consists of tablename, startkey, encode, and
>> one more long number,
>> how to get this number?

The row key is the region name. Region name is made up of the startkey, encode, and a hash of the startkey, encode. It's made for you when you create the HRegionInfo instance. See below for more.

>> 3.use assing command in the hbase shell
>>
>> We also tried check_meta.rb --fix, it reports
>>
>> 11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME =>
>> 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.',
>> STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY =>
>> 'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718,
>> TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER =>
>> 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3',
>> TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE
>> => 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
>> => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647',
>> BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
>> 'userbucket', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION
>> => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'userpass',
>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE',
>> VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
>> 'false', BLOCKCACHE => 'true'}]}}
>> 11/07/06 00:28:40 WARN check_meta: Missing .regioninfo:
>> hdfs://hd0013.c.gj.com:9000/hbase/STable/3e6faca40a7ccad7ed8c0b5848c0f945/.regioninfo
>>
>> The problem is still there. BTW, what about the blue warning? Is this a
>> serious issue?

Yes. check_meta.rb can't find the missing region on the fs, so it can't repair it. You'll need to hack check_meta.rb so that it doesn't get the HRegionInfo by reading the filesystem. Instead, create the instance to insert in your own version of check_meta.rb.

>> The situation is quite hard to us, it looks like even we can fill the hole
>> in the meta, we would lost all the data in the hole region, right?

If it's the bug I cited in my previous message, yes. It's critical that we fix it and roll out a 0.90.4.

St.Ack
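The region names quoted in this thread appear to follow the layout `<table>,<start key>,<id>.<encoded name>.`. The plain-Ruby sketch below only splits such a name into its parts so they are easier to compare against STARTKEY and ENCODED in a .META. dump. `parse_region_name` is a hypothetical helper, not an HBase API; the layout is inferred from the names in the thread, and the sketch assumes the table name contains no comma. It does not try to recompute the encoded name.

```ruby
# Hypothetical helper, not an HBase API: split a region name of the assumed
# form "<table>,<start key>,<id>.<32 hex chars>." into its parts.
def parse_region_name(name)
  pattern = /\A([^,]+),(.*),(\d+)\.(\h{32})\.\z/m
  m = pattern.match(name)
  return nil unless m  # not in the assumed layout
  { table: m[1], start_key: m[2], id: m[3].to_i, encoded: m[4] }
end

# Region name taken from the check_meta WARN quoted in this thread.
info = parse_region_name(
  'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.'
)
puts info[:table]      # STable
puts info[:start_key]  # ztqdVD8fCMP-dDbXUAydankboD4=
puts info[:encoded]    # c45191821053d03537596f4a2e759718
```

When actually repairing .META., the name should still come from the HRegionInfo instance, as described above, rather than being assembled by hand.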
Re: Possible solution to 'WrongRegionException and inconsistent table found'
Thanks Stack and Ted,

Yes, it looks like exactly the case of HBASE-3872.

Regarding 'is multiply assigned to region servers':
I found these messages after running add_table.rb and assigning the regions.
Maybe we should disable the table before executing add_table.rb? Or use 'unassign'?

Regarding the recovery script I attached:
After I ran the script, I can insert values in that region now. But hbck reports

Chain of regions in table STable contains less elements than are listed in META; visited=64035, edges=64044
ERROR: Found inconsistency in table STable

I did check hbck before the execution, to set the most recent correct startkey and endkey of the missing meta record, but it looks like the execution introduced some short-cut path in the meta? I guess it might cause loss of data in those 9 regions.

Are there any tools to check the hfiles on the fs and validate the data, if we can find those 9 regions (we'll go through the .META.)?

Thanks and regards,

Mao Xu-Feng
Re: Possible solution to 'WrongRegionException and inconsistent table found'
On Wed, Jul 6, 2011 at 7:28 PM, Xu-Feng Mao wrote:
> Regarding 'is multiply assigned to region servers'
> I found these messages after running add_table.rb, and assign them.
> Maybe before executes add_table.rb, we should disable the table?
> Or use 'unassign'.

Yes. If some are already assigned, it'll likely reassign regions, though you are on 0.90.x and I didn't think regions added by add_table.rb in a 0.90 context would be assigned.

> Regarding the recovery script I attached.
> After I run the script, I can insert values in that region now. But hbck
> reports
>
> Chain of regions in table STable contains less elements than are listed in
> META; visited=64035, edges=64044
> ERROR: Found inconsistency in table STable

Hmm... this reads as though there are some regions not yet assigned. Is that possible? If you add -details to hbck, does it name the regions not assigned? If you try assigning one manually in the shell, does the count of visited edges go up?

St.Ack
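The two hbck messages seen in this thread ("edges does not contain ..." and "contains less elements than are listed in META; visited=..., edges=...") both read like the result of walking the regions as a chain from the empty start key. The following is a minimal plain-Ruby sketch of that idea, not the hbck source: `check_chain`, the hash layout, and the short keys in the examples are all hypothetical.

```ruby
# Hypothetical sketch, not the hbck implementation: follow each region's
# end key to the region that starts there, beginning at the empty start key.
def check_chain(regions)
  edges = {}
  regions.each { |r| edges[r[:start_key]] = r[:end_key] }  # start key => end key

  visited = 0
  key = ''          # a table is assumed to start at the empty key
  missing = nil
  while visited < edges.size
    unless edges.key?(key)
      missing = key  # hole: no region starts where the previous one ended
      break
    end
    visited += 1
    key = edges[key]
    break if key.empty?  # an empty end key is taken as the end of the table
  end
  { visited: visited, edges: edges.size, missing: missing }
end

# A hole: nothing starts at 'c', so the walk stops there.
p check_chain([{ start_key: '', end_key: 'b' },
               { start_key: 'b', end_key: 'c' },
               { start_key: 'd', end_key: '' }])

# A short-cut: a region with an empty end key ends the walk early,
# leaving later regions unvisited (visited < edges, nothing missing).
p check_chain([{ start_key: '', end_key: 'b' },
               { start_key: 'b', end_key: '' },
               { start_key: 'c', end_key: 'd' }])
```

Under these assumptions, a result with visited smaller than edges and no missing key would be consistent with a .META. record whose end key is empty while later regions still exist. Whether that is what happened here is only a guess; the -details output asked for above is the way to check.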
Re: Possible solution to 'WrongRegionException and inconsistent table found'
Thanks Stack,

My replies are inline, in italics.

On Thu, Jul 7, 2011 at 12:19 PM, Stack wrote:
> On Wed, Jul 6, 2011 at 7:28 PM, Xu-Feng Mao wrote:
> > Regarding 'is multiply assigned to region servers'
> > I found these messages after running add_table.rb, and assign them.
> > Maybe before executes add_table.rb, we should disable the table?
> > Or use 'unassign'.
>
> Yes. If some already assigned, it'll likely reassign regions though
> you are on 0.90.x and I didn't think regions added by add_table.rb in
> 0.90 context would be assigned.

*Yes, I assigned them manually.*

> > Regarding the recovery script I attached.
> > After I run the script, I can insert values in that region now. But hbck
> > reports
> >
> > Chain of regions in table STable contains less elements than are listed in
> > META; visited=64035, edges=64044
> > ERROR: Found inconsistency in table STable
>
> Hmm... this reads as though there are some regions not yet assigned.
> Is that possible? If you add -details to hbck does it name the
> regions not assigned? If you try assigning one manually in the shell,
> does the could of visited edges go up?

*It's interesting that hbck -details now reports*

ERROR: Region STable,5i_aGOoK0oOTn8a4wmsuIiMcE_w=,1308035103936.feaedb6f49e83fff1e0cf498d3b4d734. listed in META on region server hd0040.sj.sd.com:60020 but found on region server hd0024.sj.sd.com:60020
Chain of regions in table S3Table contains less elements than are listed in META; visited=64100, edges=64109
ERROR: Found inconsistency in table S3Table

*Now there is only one problem region. Do I need to unassign the region in the shell? Or will this issue just take care of itself?*

Thanks and regards,

Mao Xu-Feng
Java client throws WrongRegionException but same key accessible via hbase shell
I find this hard to believe. For the same row key, my Java client is throwing a WrongRegionException, but I can query the same key using the hbase shell. I'm on version 0.90.2.

Also note that I have inconsistencies in my regions that I am still trying to figure out. But regardless, the inconsistencies should affect both methods of querying in the same way, right?

thanks
vinod
Re: Java client throws WrongRegionException but same key accessible via hbase shell
Hi vinod,

Yeah, WREs are never fun; hopefully we can help you fix it.

First, about the difference between querying from the shell and from your Java client.

- Is it a long-lived client? Did you restart it since you got the WREs?
- If not, this could just be due to the fact that it has a cache of regions that's different from what a new client would see.
- If you did restart it, then I would have to think a bit more about it to find the difference.

Either way, it'd be nice to see what you're doing. We need:

- A full dump of your .META. (in the shell: scan '.META.'); please put this on a web server or a pastebin
- The keys you are trying to reach
- The exceptions you are seeing that contain the row keys and regions it's trying to reach
- Another thing that would be nice is the output you get when reaching that row key from the shell, but with the shell started with the -d option (it will show a lot more debug info).
- The master log of the day the exception started happening.

J-D
Re: Java client throws WrongRegionException but same key accessible via hbase shell
J-D,

I was getting these errors even after restarting the client, so it probably is not straightforward.

Also, I was able to run a combination of check_meta.rb and hbck with their fix options and repair some of the inconsistencies. I still have 4 inconsistencies left (earlier it was 7), but check_meta thinks .META. is fine. After I did this, I am not getting Java client errors any more. Not sure how to explain that.

Do you still want me to send the information you requested? Maybe you can help with the remaining inconsistencies.

thanks
vinod
Re: Java client throws WrongRegionException but same key accessible via hbase shell
Yes, please send the info I asked for.

About the hbck errors you had, this is usually fixed with -fix:

Region AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b. not deployed on any region server.

This is "probably" a region that wasn't cleaned up, so it's not really a problem:

Region file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839 on HDFS, but not listed in META or deployed on any region server.

This is a real problem:

Chain of regions in table AkStats is broken; edges does not contain 277808094:1314921600:daily:Volume

J-D
Re: Java client throws WrongRegionException but same key accessible via hbase shell
Try running

    hbase org.jruby.Main add_table.rb /hbase/tablename

This will clean up the inconsistencies in the .META. table. If you run hbck again and still see holes in the table, you will have to put more effort into cleaning the table.

Rohit

On Tue, Sep 27, 2011 at 1:23 PM, Jean-Daniel Cryans wrote:
> Yes, please send the info I asked.
>
> About the hbck errors you had, this is usually fixed with -fix:
>
> Region
> AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b.
> not deployed on any region server.
>
> This is "probably" a region that wasn't cleaned up so it's not really a
> problem:
>
> Region
> file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839
> on HDFS, but not listed in META or deployed on any region server.
>
> This is a real problem:
>
> Chain of regions in table AkStats is broken; edges does not contain
> 277808094:1314921600:daily:Volume
>
> J-D
>
> On Tue, Sep 27, 2011 at 1:00 PM, Vinod Gupta Tankala wrote:
> > J-D,
> > I was getting these errors even after restarting the client. So it probably
> > is not straightforward.
> > Also, I was able to run a combination of check_meta.rb and hbck with their
> > fix options and restore some of the inconsistencies. I still have 4
> > inconsistencies left (earlier it was 7) but check_meta thinks .META. is
> > fine. After I did this, I am not getting java client errors any more. Not
> > sure how to explain that.
> >
> > Do you still want me to send the information you requested? Maybe you can
> > help with remaining inconsistencies.
> >
> > thanks
> > vinod
> >
> > On Tue, Sep 27, 2011 at 11:05 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> >> Hi vinod,
> >>
> >> Yeah WREs are never fun, hopefully we can help you fixing it.
> >>
> >> First, about the difference when querying from the shell and your java
> >> client.
> >>
> >> - Is it a long lived client? Did you restart it since you got the WREs?
> >> - If not, this could just be due to the fact that it has a cache of
> >> regions that's different from what a new client would see.
> >> - If you did restart it, then I would have to think a bit more about
> >> it to find the difference.
> >>
> >> Either way, it'd be nice to see what you're doing. We need:
> >>
> >> - A full dump of your .META. (in the shell: scan '.META.'), please
> >> put this on a web server or a pastebin
> >> - The keys you are trying to reach
> >> - The exceptions you are seeing that contain the row keys and regions
> >> it's trying to reach
> >> - Another thing that would be nice is to have the output of when you
> >> are reaching that row key from the shell, but with the shell started
> >> with the -d option (will show a lot more debug info).
> >> - The master log of the day the exception started happening.
> >>
> >> J-D
> >>
> >> On Tue, Sep 27, 2011 at 7:57 AM, Vinod Gupta Tankala wrote:
> >> > I find this hard to believe. For the same row key, my java client is
> >> > throwing wrong region exception. But I can query the same using hbase
> >> > shell. I'm on 0.90.2 version.
> >> >
> >> > Also note that I have inconsistencies in my regions that I am still
> >> > trying to figure out. But regardless, the inconsistencies should impact
> >> > both methods of querying similarly. right?
> >> >
> >> > thanks
> >> > vinod
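For readers unfamiliar with the "Chain of regions ... is broken" message quoted above: as far as I understand it, hbck walks the table from the empty start key, following each region's end key to the region that starts there, and complains when no region starts at the key it has reached. The following is only a rough sketch of that idea in plain Java, not hbck's code. The class name, the method name and the sample keys are all made up for illustration, and keys are plain strings rather than byte arrays.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a simplified chain walk over region boundaries.
// This is NOT the hbck implementation; all names here are hypothetical.
class RegionChainCheck {

    // edges maps each region's start key to its end key.
    // The empty string stands for "start of table" / "end of table".
    // Returns null when the chain is complete, otherwise the key where it breaks.
    static String findBreak(Map<String, String> edges) {
        String last = "";
        int steps = 0;
        do {
            String next = edges.get(last);
            if (next == null) {
                return last; // no region starts at this key: a hole in the chain
            }
            last = next;
            steps++;
            // the step limit guards against a cycle in bad metadata
        } while (!last.isEmpty() && steps <= edges.size());
        return last.isEmpty() ? null : last;
    }

    public static void main(String[] args) {
        // Made-up keys: regions ["", "m") and ["t", ""), nothing covers ["m", "t").
        Map<String, String> edges = new HashMap<>();
        edges.put("", "m");
        edges.put("t", "");
        System.out.println("chain breaks at: " + findBreak(edges)); // chain breaks at: m
    }
}
```

In this toy case the walk stops at "m" because no region starts there, which is the same shape as "edges does not contain 277808094:1314921600:daily:Volume" in the hbck output.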
Re: Java client throws WrongRegionException but same key accessible via hbase shell
J-D,
here is the meta scan file:
https://docs.google.com/document/d/1_g2Ce20H65rukrEe8i9UWaW9wskaV34_96ElpOQDsYE/edit?hl=en_US

I solved this problem, "Chain of regions in table AkStats is broken; edges does not contain 277808094:1314921600:daily:Volume", using a combination of check_meta and hbck -fix.

But every now and then I hit this problem where the java client hits a WRE but the hbase shell works. An example of the above is:

get 'AkStats', '26696569976:1317081600:weekly:AudEng'

I have a suspicion about why this could be happening. Please confirm whether I should be concerned about the following:

1) 5-10% of my rows are really large, ~2.5MB+. The remaining rows are smaller, a few KB.
2) Before I write the large rows, I delete the row if it exists and then write a new one. The reason I do it this way is that some of the columns in the existing row don't apply any more. So I delete the whole row and rewrite it. Of course, this happens every few hours, for 1-2K rows only, based on my current load.

Does hbase scale well for row deletes?

thanks
WrongRegionException: Requested row out of range for calculated split on HRegion => How is this possible?
(I have added line feeds to make it easier to read)

org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for calculated split on HRegion
work_proposed,\xF5\x9A\xEA&\x00\x00\x00\x00
http://video.mindentimes.ca/search/all/source/qmi-agency/kanye-west-spending-10-on-private-flights-to-see-pregnant-kim-kardashian/2319156767001/page/16,1376139517597.b39bf00b980b632901859761caafb9d0.,

startKey ='\xF5\x9A\xEA&\x00\x00\x00\x00
http://video.mindentimes.ca/search/all/source/qmi-agency/kanye-west-spending-10-on-private-flights-to-see-pregnant-kim-kardashian/2319156767001/page/16',

getEndKey()='\xF5\x9B@}\x00\x00\x00\x00
http://fr.video.sympatico.ca/accueil/les-plus-populaires/watch/kim-kardashian-rit-des-rumeurs-dinfidelite/2477090497001?sort=date&filter=Splash&page=5',

row='\xFA\xCDH?\x00\x00\x00\x00http://www.futur.

Start key is xF5 x9A xEA
End key is xF5 x9B x40

But I'm getting xFA xCD as the mid key... which is not in the range.

MidKey definition:

 * An approximation to the {@link HFile}'s mid-key. Operates on block
 * boundaries, and does not go inside blocks. In other words, returns the
 * first key of the middle block of the file

Does it mean that the blocks in my HFile are not correctly ordered??? I have just one store file for this region.

If I run bin/hbase org.apache.hadoop.hbase.io.hfile.HFile on this region, I get this:

firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...

But from the WebUI, I have those 2 regions at the end:

work_proposed,\xF5\x9A\xEA&\x00\x00\x00\x00... buldo:60030
\xF5\x9A\xEA&\x00\x00\x00\x00h... \xF5\x9B@}\x00\x00\x00\x00... 0
work_proposed,\xF5\x9B@}\x00\x00\x00\x00...

This is the same as what I got in the logs, but not the same as what the HFilePrettyPrinter is giving me. The provided midkey is fine if we consider the output of the HFilePrettyPrinter, but wrong if we consider the WebUI.

http://pastebin.com/dmtAnQtF

Version: 0.94.12-SNAPSHOT, but I have been facing this for weeks now. So not new.

I will continue to investigate. Most probably I will try to print the 58M keys in the HFile to see who's right and who's wrong, and why the two sources of information differ. I might also drop the entry in META to let HBCK rebuild it based on the HDFS file and see...

All ideas are welcome.

JM
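To make the "not in the range" observation concrete, here is a small sketch of the kind of test that fails. It is not HBase's own code. It assumes a region covers [startKey, endKey), that keys are compared byte by byte as unsigned values, and that an empty end key means "no upper bound". The class and method names are hypothetical, and only the leading bytes of the keys from the exception are used.

```java
// Hypothetical stand-in for the range test behind WrongRegionException;
// not HBase's actual implementation.
class RegionRangeCheck {

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // True when startKey <= row < endKey; an empty endKey means "no upper bound".
    static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean afterStart = compareUnsigned(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compareUnsigned(row, endKey) < 0;
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        // Key prefixes from the exception; the rest of each key is left out.
        byte[] start = {(byte) 0xF5, (byte) 0x9A, (byte) 0xEA};
        byte[] end = {(byte) 0xF5, (byte) 0x9B, (byte) 0x40};
        byte[] mid = {(byte) 0xFA, (byte) 0xCD};
        System.out.println(rowInRange(start, end, mid)); // false
    }
}
```

The first byte already decides it: 0xFA sorts after 0xF5, so the mid key lies beyond the end key no matter what follows.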
Re: WrongRegionException: Requested row out of range for calculated split on HRegion => How is this possible?
By looking at the HFile content, I can see that the information displayed on the WebUI is not correct. The last key printed by HFilePrettyPrinter is:

K: \xFF\xFF\xFF\xFE\x00\x00\x00\x00

The region after this one is listed by the same application as having:

firstKey=\xF5\x9BB\xF4\x00\x00\x00\x00...
lastKey=\xFF\xFF\xFF`\x00\x00\x00\x00...

And the concerned region:

firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...

Which means I have an overlap between the 2.

So now, what are the options?

1) HBCK doesn't report any issue.
2) HFile reports the right key information.
3) WebUI doesn't report the right information.

Since the WebUI displays the information based on META, my best guess is that the META content is not correct. So I can "simply" remove it and let HBCK repair that. Another option might be to copy the files from the 2nd region to the 1st one as another store file and re-compact the 2 together?

Should we have something to detect such a region overlap, or a disconnect between META and the HFiles? I will not do anything for now because I want to know your opinion, but I think we should at least have something to detect that in HBCK, and most probably something to fix it too.

JM
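The comparison done by hand in this message could be sketched roughly as follows. This is not an existing HBCK check, just an illustration under stated assumptions: region bounds from META are treated as [startKey, endKey), HFile boundaries as the closed range [firstKey, lastKey], and keys are compared as unsigned bytes. All class and method names are hypothetical, and only the first four bytes of each key from the thread are used.

```java
// Illustrative only: compares key ranges the way the message does by hand.
// Not part of HBase or HBCK; every name here is hypothetical.
class StoreFileBoundsCheck {

    // Builds a byte array from int literals such as 0xF5.
    static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // True when both HFile boundary keys fall inside the region's
    // [startKey, endKey) range taken from META; empty endKey = unbounded.
    static boolean fileFitsRegion(byte[] startKey, byte[] endKey,
                                  byte[] firstKey, byte[] lastKey) {
        boolean lowerOk = compareUnsigned(firstKey, startKey) >= 0;
        boolean upperOk = endKey.length == 0 || compareUnsigned(lastKey, endKey) < 0;
        return lowerOk && upperOk;
    }

    // True when two closed key ranges [first, last] share at least one key.
    static boolean rangesOverlap(byte[] firstA, byte[] lastA,
                                 byte[] firstB, byte[] lastB) {
        return compareUnsigned(firstA, lastB) <= 0
            && compareUnsigned(firstB, lastA) <= 0;
    }

    public static void main(String[] args) {
        // Leading bytes of the keys reported in the thread.
        byte[] metaStart = bytes(0xF5, 0x9A, 0xEA, 0x26); // \xF5\x9A\xEA&
        byte[] metaEnd = bytes(0xF5, 0x9B, 0x40, 0x7D);   // \xF5\x9B@}
        byte[] first1 = bytes(0xF5, 0x9A, 0xEA, 0x26);
        byte[] last1 = bytes(0xFF, 0xFF, 0xFF, 0xFE);
        byte[] first2 = bytes(0xF5, 0x9B, 0x42, 0xF4);    // \xF5\x9BB\xF4
        byte[] last2 = bytes(0xFF, 0xFF, 0xFF, 0x60);     // \xFF\xFF\xFF`

        System.out.println(fileFitsRegion(metaStart, metaEnd, first1, last1)); // false
        System.out.println(rangesOverlap(first1, last1, first2, last2));       // true
    }
}
```

With the prefixes from the thread, the first region's store file runs past the end key recorded in META, and the two store files' key ranges overlap, which matches the conclusion above.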