WrongRegionException

2018-01-29 Thread Yang Zhang
Hello Everyone

I am using a coprocessor to intercept the normal put and replace it
with a put to another rowkey, using HRegion.put(). It works fine, but once
the region has split, a WrongRegionException is thrown.

2018-01-28 09:32:51,528 WARN
[B.DefaultRpcServer.handler=21,queue=3,port=60020] regionserver.HRegion:
Failed getting lock in batch put,
row=\xF0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\x10\xC5r
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
out of range for row lock on HRegion
GISdoop_GeoKey,,1517085124215.341534e84727245f1c67f345c3e467ac.,
startKey='', getEndKey()='\xE6G8\x00\x00\x00\x00\x00\x00\x00\x00\x00',
row='\xF0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0F\x10\xC5r'
at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:4677)
at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:4695)
at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2786)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2653)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2589)
at org.apache.hadoop.hbase.regionserver.HRegion.doBatchMutate(HRegion.java:3192)
at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2459)
at site.luoyu.Core.Index.JavaTreeMap.insertPoint(JavaTreeMap.java:287)
at site.luoyu.Core.Index.JavaTreeMap.insertRecord(JavaTreeMap.java:256)
at site.luoyu.Core.Observer.IndexCopressor.prePut(IndexCopressor.java:130)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.prePut(RegionCoprocessorHost.java:1122)
at org.apache.hadoop.hbase.regionserver.HRegion.doPreMutationHook(HRegion.java:2674)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2649)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2589)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2593)
at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4402)
at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3584)
at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3474)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:3)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Thread.java:745)

It says the rowkey is outside the region's bounds. This exception is logged in
the regionserver's log as a warning, so I can't catch and handle it.

According to the source code,
RowLock rowLock = null;
try {
  rowLock = getRowLock(mutation.getRow(), shouldBlock);
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
+ Bytes.toStringBinary(mutation.getRow()), ioe);
}

HBase just catches and logs this exception; I guess it doesn't even remove the
mutation from the batch. So I get a flood of exception logs and can't put data
anymore.

Why does HBase handle this WrongRegionException like this? Can anyone help?
Thanks very much.


Re: WrongRegionException

2018-01-29 Thread Anoop John
There was another related question as well. Can you describe the actual
requirement? Do you want to change the RKs (rowkeys) of the incoming puts?
Or do you want to insert those puts as well as some new cells with a changed
RK?

-Anoop-



Re: WrongRegionException

2018-01-29 Thread Yang Zhang
Both are the same question.
I want to block the incoming puts and then copy each one as one or more new
puts with different rowkeys.
So when there is only one region, my rowkeys will belong to it. But once the
region has split, some rowkeys may not belong to the new region.
I used to think HBase would stop accepting new puts, finish all of the puts
in the batch, and then try to split.
But that may not be right, according to the exception I got.

BTW, it seems that I can't add a put
to MiniBatchOperationInProgress miniBatchOp. It only has some
getter functions.

Thank you very much for your help



Re: WrongRegionException

2018-01-29 Thread Ted Yu
Regarding the region split: do you verify that the new rowkey is in the same
region as the rowkey from the incoming Put?

If not, there is a chance that the new rowkey is in a different region, one
which is going through the split.

FYI
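
A minimal sketch of the range check suggested above, assuming the usual HBase convention that a region covers [startKey, endKey), that an empty key means "unbounded", and that rows are compared as unsigned bytes. The class and method names (RegionRangeCheck, rowInRange) are hypothetical and this is not the HBase implementation; in a real coprocessor the start and end keys would come from the region's own region info. The keys in main are transcribed from the log in the original post.

```java
/** Toy model of the row-range check behind WrongRegionException (not HBase code). */
final class RegionRangeCheck {

    /** Unsigned lexicographic comparison of two row keys. */
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True if row lies in [startKey, endKey); an empty key is unbounded on that side. */
    static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean afterStart = startKey.length == 0 || compareRows(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compareRows(row, endKey) < 0;
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        // Region bounds and rewritten row as they appear in the log.
        byte[] startKey = new byte[0];
        byte[] endKey = {(byte) 0xE6, 'G', '8', 0, 0, 0, 0, 0, 0, 0, 0, 0};
        byte[] row = {(byte) 0xF0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                      0x0F, 0x10, (byte) 0xC5, 'r'};
        // 0xF0 sorts after 0xE6, so the row is past the region's end key.
        System.out.println(rowInRange(startKey, endKey, row)); // false
    }
}
```

A coprocessor could run a check like this on every rewritten rowkey and only write locally when it passes; rows that fail would have to be sent to the owning region some other way.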


Re: WrongRegionException

2018-01-31 Thread Anoop John
In theory, I don't think what you are trying to do is correct. The
incoming keys are completely ignored and new ones are made at the CP
layer. By the time the CP is invoked, the mutations have already reached
that region, and the new RK might be outside the boundary of this
region (as happens after the split). I don't know your use case, but I feel
you should handle this conversion of RKs and create the new Puts in a
client tier, not in the CP.

-Anoop-


Re: WrongRegionException

2018-01-31 Thread Anoop John
Or else you should do the following in prePut:
create the new Put, get the Connection from the CP environment, and
do the put on that connection (not directly on the Region). Then call
bypass() to bypass the Put with the old RK. Just a suggestion. But I
still feel that a middle client layer could do this conversion.

-Anoop-
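
A toy model of why writing through a connection sidesteps the exception: a client-side write is first routed to whichever region currently owns the row, rather than being forced into the region that happened to receive the original Put. The sketch below imitates that lookup with a sorted map from region start key to region name. Everything here (RegionRouter, addRegion, locate, the region names) is hypothetical and is not the HBase client API; ASCII strings stand in for byte[] rowkeys.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of client-side region lookup (hypothetical names, not the HBase API). */
final class RegionRouter {
    // Region start key -> region name, kept in rowkey order.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String regionName) {
        regionsByStartKey.put(startKey, regionName);
    }

    /** The owning region is the one with the greatest start key that is <= row. */
    String locate(String row) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(row);
        if (entry == null) {
            throw new IllegalStateException("no region covers row " + row);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RegionRouter router = new RegionRouter();
        router.addRegion("", "region-A");   // first region has an empty start key
        router.addRegion("m", "region-B");  // daughter region created by a split at "m"
        System.out.println(router.locate("apple")); // region-A
        System.out.println(router.locate("zebra")); // region-B
    }
}
```

After a split, only the map changes; the caller keeps asking "who owns this row?" and never assumes the answer is the local region.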


Puts failing with WrongRegionException

2014-10-03 Thread Thomas Kwan
Hi there,

I wonder if anyone has seen an error like this:

2014-10-03 16:03:45,203 WARN  [RpcServer.handler=7,port=60020]
regionserver.HRegion: Failed getting lock in batch put,
row=65317d52abfedc8b94a19f6fbffe187c
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
out of range for row lock on HRegion
m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857.,
startKey='64d7e88463b88e7325b623fbd6629cda',
getEndKey()='6516687f5dae26f529c53f309cb36fca',
row='65317d52abfedc8b94a19f6fbffe187c'


We recently added 10 more region servers to our cluster, and since then I
have been seeing errors like the one above when doing puts via
TableOutputFormat in an MR job.

Maybe the place where HBase stores the region info is corrupted?

thanks for your help in advance
thomas


WrongRegionException and inconsistent table found

2011-07-05 Thread Xu-Feng Mao
Hi,

We're running an HBase cluster with 37 regionservers. Today we found
lots of WrongRegionExceptions when putting objects into it.

hbase hbck -details
reports that

Chain of regions in table STable is broken; edges does not contain
ztxrGmCwn-6BE32s3cX1TNeHU_I=
ERROR: Found inconsistency in table STable


echo "scan '.META.'"| hbase shell &> meta.txt
grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
reports that

 Ck=,1308802977279.71ffb10b8b95fd47b3eff468d00ab4e9.
   1ffb10b8b95fd47b3eff468d00ab4e9.', STARTKEY => 'ztn0ukLWd1NSU3fuXKkkWq5ZVCk=',
   ENDKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=',
   ENCODED => 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
--
 D4=,1305619724446.c45191821053d03537596f4a2e759718.
   45191821053d03537596f4a2e759718.', STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=',
   ENDKEY => 'ztxrGmCwn-6BE32s3cX1TNeHU_I=',
   ENCODED => c45191821053d03537596f4a2e759718, TABLE => {{NAME =
--
 pA=,1309455605341.c5c5f578722ea3f8d1b099313bec8298.
   5c5f578722ea3f8d1b099313bec8298.', STARTKEY => 'zu3zVaLcGDnnpjKCbnboXgAFspA=',
   ENDKEY => 'zu7qkr5fH6MMJ3GxbCv_0d6g8yI=',
   ENCODED => c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =


It looks like the meta indeed has a hole. (We scanned '.META.' several
times to confirm it is not a transient state.)
We tried hbase hbck -fix, but it did not help.

We found a thread, 'wrong region exception', from about two months ago, in
which Stack suggested a 'little surgery' like this:


*So, make sure you actually have a hole.  Dump out your meta table:

echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt

Then look ensure that there is a hole between the above regions
(compare start and end keys... the end key of one region needs to
match the start key of the next).

If indeed a hole, you need to do a little surgery inserting a new
missing region (hbck should fix this but it doesn't have the smarts
just yet).

Basically, you create a new region with start and end keys to fill the
hole then you insert it into .META. and then assign it.  There are
some scripts in our bin directory that do various parts of this.  I'm
pretty sure its beyond any but a few figuring this mess out so if you
do the above foot work and provide a few more details, I'll hack up
something for you (and hopefully something generalized to be use by
others later, and later to be integrated into hbck).*
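
The comparison described in the quoted advice (the end key of one region must match the start key of the next) can be sketched as follows. This is an illustration only: RegionChainCheck, Region and findHoles are hypothetical names, the three regions from the dump above are treated as if they were the whole table, and ASCII strings stand in for the byte-wise key comparison HBase uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy hole detector over (startKey, endKey) pairs read from a .META. dump. */
final class RegionChainCheck {

    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Returns "endKey -> nextStartKey" for every gap between consecutive regions. */
    static List<String> findHoles(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.startKey));
        List<String> holes = new ArrayList<>();
        for (int i = 0; i + 1 < sorted.size(); i++) {
            String end = sorted.get(i).endKey;
            String nextStart = sorted.get(i + 1).startKey;
            // A hole exists when the next region starts after this one ends.
            if (end.compareTo(nextStart) < 0) {
                holes.add(end + " -> " + nextStart);
            }
        }
        return holes;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("ztn0ukLWd1NSU3fuXKkkWq5ZVCk=", "ztqdVD8fCMP-dDbXUAydankboD4="));
        regions.add(new Region("ztqdVD8fCMP-dDbXUAydankboD4=", "ztxrGmCwn-6BE32s3cX1TNeHU_I="));
        regions.add(new Region("zu3zVaLcGDnnpjKCbnboXgAFspA=", "zu7qkr5fH6MMJ3GxbCv_0d6g8yI="));
        // Prints the single gap that starts at ztxrGmCwn-6BE32s3cX1TNeHU_I=
        System.out.println(findHoles(regions));
    }
}
```

The reported gap starts at the same key that hbck names in "edges does not contain", which is what a missing region at that start key would look like.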



Can anyone give a detailed example? Step-by-step instructions would be
greatly appreciated.
My understanding is that we should:
1. Take the start and end keys of the missing region, which we already know
from the hole.
2. Generate the .META. row that represents the missing region. But how can I
generate the encoded name? It looks like I need column=info:server,
column=info:serverstartcode and column=info:regioninfo for the missing
region, and column=info:regioninfo contains a lot of information. How do I
generate each part? As for the row name, it consists of the table name, start
key, encoded name, and one more long number. How do I get this number?
3. Use the assign command in the hbase shell.
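
On the question of the row name in step 2, here is a hedged sketch. It assumes, based on how HRegionInfo builds region names in HBase releases of that period, that the name has the form table,startKey,regionId.encodedName. where the long number (regionId) is simply the creation timestamp in milliseconds and the encoded name is the MD5 hex digest of the "table,startKey,regionId" prefix. RegionNameSketch and its methods are hypothetical names, and the assumption should be verified against the HBase version in use; in practice, constructing an HRegionInfo object should produce these values without hand computation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch of the assumed region-name layout; not taken from the HBase source. */
final class RegionNameSketch {

    /** MD5 hex digest of "table,startKey,regionId" (assumed encoded-name rule). */
    static String encodedName(String table, String startKey, long regionId) {
        String prefix = table + "," + startKey + "," + regionId;
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(prefix.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Full row name: table,startKey,regionId.encodedName. */
    static String regionName(String table, String startKey, long regionId) {
        return table + "," + startKey + "," + regionId + "."
                + encodedName(table, startKey, regionId) + ".";
    }

    public static void main(String[] args) {
        // Sanity check against a region that already exists in the dump: if the
        // assumption holds, the two printed values should be identical.
        System.out.println(encodedName("STable", "ztqdVD8fCMP-dDbXUAydankboD4=", 1305619724446L));
        System.out.println("c45191821053d03537596f4a2e759718");
        // Candidate name for the missing region, using "now" as the region id.
        System.out.println(regionName("STable", "ztxrGmCwn-6BE32s3cX1TNeHU_I=",
                System.currentTimeMillis()));
    }
}
```

If the first two lines printed by main differ, the naming assumption does not hold for this cluster and the sketch should not be used to build the .META. row.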

We also tried check_meta.rb --fix, it reports

11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME =>
'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.',
STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY =>
'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718,
TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER =>
'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3',
TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE
=> 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
=> '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
'userbucket', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION
=> 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'userpass',
BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE',
VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
'false', BLOCKCACHE => 'true'}]}}
11/07/06 00:28:40 WARN check_meta: Missing .regioninfo:
hdfs://hd0013.c.gj.com:9000/hbase/STable/3e6faca40a7ccad7ed8c0b5848c0f945/.regioninfo


The problem is still there. BTW, what about the blue warning? Is this a
serious issue?
The situation is quite hard for us: it looks like even if we can fill the hole
in the meta, we would lose all the data in the missing region, right?

Thanks and regards,

Mao Xu-Feng


org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: WrongRegionException:

2011-09-26 Thread Vinod Gupta Tankala
Hi,
I have seen a similar issue handled here, but I didn't fully understand
how to fix it in my case. This is the earlier thread:
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/18410

I am seeing the issues below in my client:
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: WrongRegionException: 3 times, servers with issues: ip-10-32-61-60.ec2.internal:44911,
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1220)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
...

Also, hbase hbck -details gives the following errors. -fix doesn't do
anything, and neither does check_meta.rb.

ERROR: Region
AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b.
not deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/3af08f3ba88d7c2785fd089702f89241
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/a522e78c7d018547b8979ed6fd381358
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/b89f7bb47e53ea9cc729c86978a7327c
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/fd7ce07003bfd883f53a210bb0985065
on HDFS, but not listed in META or deployed on any region server.
Chain of regions in table AkStats is broken; edges does not contain
277808094:1314921600:daily:Volume
ERROR: Found inconsistency in table AkStats
Summary:
  -ROOT- is okay.
Number of regions: 1
Deployed on:  ip-10-32-61-60.ec2.internal:55915
  .META. is okay.
Number of regions: 1
Deployed on:  ip-10-32-61-60.ec2.internal:55915
Chain of regions in table AkStats is broken; edges does not contain
277808094:1314921600:daily:Volume
Table AkStats is inconsistent.
Number of regions: 133
Deployed on:  ip-10-32-61-60.ec2.internal:55915
7 inconsistencies detected.
Status: INCONSISTENT

My production system is on its knees and I need to get unblocked ASAP.
Any help would be highly appreciated.

thanks
vinod
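
When triaging hbck output like the report above, it can help to separate the region directories that exist on HDFS but are missing from META from the regions that are listed but not deployed. The following is only a minimal sketch: the function name `classify_hbck_errors` is made up for illustration, it assumes the three-line wrapped layout shown in this mail, and the sample text is a shortened copy of the output quoted above.

```python
import re

# Shortened sample copied from the hbck -details output in this message.
HBCK_OUTPUT = """\
ERROR: Region
AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b.
not deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/3af08f3ba88d7c2785fd089702f89241
on HDFS, but not listed in META or deployed on any region server.
"""


def classify_hbck_errors(text):
    """Split 'ERROR: Region' entries into (orphan_dirs, undeployed_regions)."""
    # Undo the mail line wrapping so that each error is one line again.
    joined = re.sub(r"\n(?!ERROR:)", " ", text.strip())
    orphans, undeployed = [], []
    for line in joined.splitlines():
        match = re.match(r"ERROR: Region (\S+) (.*)", line)
        if not match:
            continue  # e.g. "ERROR: Found inconsistency in table ..."
        region, reason = match.groups()
        if reason.startswith("on HDFS, but not listed in META"):
            # Keep only the encoded region name (last path component).
            orphans.append(region.rsplit("/", 1)[-1])
        elif reason.startswith("not deployed"):
            undeployed.append(region)
    return orphans, undeployed


orphans, undeployed = classify_hbck_errors(HBCK_OUTPUT)
print("on HDFS only:", orphans)
print("not deployed:", undeployed)
```

The orphan directories are the ones whose .regioninfo would need to be read back into META; the sketch only lists them and changes nothing on the cluster.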


Re: Puts failing with WrongRegionException

2014-10-03 Thread Ted Yu
Can you check the region server log to see whether region
m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857.
was moved or split between the MR job launch and the time this error
showed up?

Thanks

On Fri, Oct 3, 2014 at 4:08 PM, Thomas Kwan  wrote:

> Hi there,
>
> I wonder if anyone has seen an error like this:
>
> 2014-10-03 16:03:45,203 WARN  [RpcServer.handler=7,port=60020]
> regionserver.HRegion: Failed getting lock in batch put,
> row=65317d52abfedc8b94a19f6fbffe187c
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
> out of range for row lock on HRegion
>
> m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857.,
> startKey='64d7e88463b88e7325b623fbd6629cda',
> getEndKey()='6516687f5dae26f529c53f309cb36fca',
> row='65317d52abfedc8b94a19f6fbffe187c'
>
>
> Recently, we have added 10 more region servers to our cluster and then I
> started seeing errors like above when doing puts via TableOutputFormat in a
> MR job.
>
> Maybe where hbase stores the region info is corrupted?
>
> thanks for your help in advance
> thomas
>
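
The exception text above reports a start key, an end key and the offending row. The check behind "Requested row out of range" amounts to a byte-wise comparison of the row against the half-open range [startKey, endKey). Below is a minimal sketch of that comparison, not HBase's own code; `row_in_region` is a name chosen for illustration, and an empty key is assumed to mean "unbounded" on that side.

```python
def row_in_region(row: bytes, start_key: bytes, end_key: bytes) -> bool:
    """Return True if start_key <= row < end_key in unsigned byte order.

    An empty start or end key is treated as unbounded on that side.
    """
    after_start = start_key == b"" or row >= start_key
    before_end = end_key == b"" or row < end_key
    return after_start and before_end


# Values copied from the log line quoted above.
start = b"64d7e88463b88e7325b623fbd6629cda"
end = b"6516687f5dae26f529c53f309cb36fca"
row = b"65317d52abfedc8b94a19f6fbffe187c"

# "6531..." sorts after "6516...", so the row lies past the region's end key.
print(row_in_region(row, start, end))
```

For these values the row sorts after the end key, so the put was sent to a region that does not own the row. That is what one would expect if the writer is working from stale region boundaries, which is why checking for a move or split is a reasonable first step.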


Re: WrongRegionException and inconsistent table found

2011-07-05 Thread Xu-Feng Mao
We also checked the master log and found nothing interesting.

On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao  wrote:

> Hi,
>
> We're running an HBase cluster with 37 regionservers. Today, we found
> lots of WrongRegionExceptions when putting objects into it.
>
> hbase hbck -details
> reports that
> 
> Chain of regions in table STable is broken; edges does not contain
> ztxrGmCwn-6BE32s3cX1TNeHU_I=
> ERROR: Found inconsistency in table STable
> 
>
> echo "scan '.META.'"| hbase shell &> meta.txt
> grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
> reports that
> 
> ...Ck=,1308802977279.71ffb10b8b95fd47b3eff468d00ab4e9.
>   STARTKEY => 'ztn0ukLWd1NSU3fuXKkkWq5ZVCk=',
>   ENDKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=',
>   ENCODED => 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
> --
> ...D4=,1305619724446.c45191821053d03537596f4a2e759718.
>   STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=',
>   ENDKEY => 'ztxrGmCwn-6BE32s3cX1TNeHU_I=',
>   ENCODED => c45191821053d03537596f4a2e759718, TABLE => {{NAME =
> --
> ...pA=,1309455605341.c5c5f578722ea3f8d1b099313bec8298.
>   STARTKEY => 'zu3zVaLcGDnnpjKCbnboXgAFspA=',
>   ENDKEY => 'zu7qkr5fH6MMJ3GxbCv_0d6g8yI=',
>   ENCODED => c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =
> 
>
> It looks like the meta indeed has a hole. (We scanned '.META.' several
> times to confirm it is not a transient state.)
> We've tried hbase hbck -fix; it does not help.
>
> We found a thread, 'wrong region exception', from about two months ago. Stack
> suggested a 'little surgery' like this:
> 
>
> *So, make sure you actually have a hole.  Dump out your meta table:
>
> echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt
>
> Then look to ensure that there is a hole between the above regions
> (compare start and end keys... the end key of one region needs to
> match the start key of the next).
>
> If indeed a hole, you need to do a little surgery inserting a new
> missing region (hbck should fix this but it doesn't have the smarts
> just yet).
>
> Basically, you create a new region with start and end keys to fill the
> hole then you insert it into .META. and then assign it.  There are
> some scripts in our bin directory that do various parts of this.  I'm
> pretty sure it's beyond any but a few to figure this mess out, so if you
> do the above footwork and provide a few more details, I'll hack up
> something for you (and hopefully something generalized to be used by
> others later, and later to be integrated into hbck).*
>
> 
>
> Can anyone give a detailed example? Step-by-step instructions would be
> greatly appreciated.
> My understanding is that we should:
> 1. Take the start and end keys of the hole; since we know which region is
> lost, we already have them.
> 2. Generate the row that represents the missing region. But how can I
> generate the encoded name?
> It looks like I need
> column=info:server, column=info:serverstartcode and column=info:regioninfo
> for the missing region.
> And column=info:regioninfo holds a lot of information. How do I generate
> each piece of it?
> As for the row name, it consists of the table name, the start key, the
> encoded name, and one more long number.
> How do I get this number?
> 3. Use the assign command in the hbase shell.
>
> We also tried check_meta.rb --fix, it reports
> 
> 11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME =>
> 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.',
> STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY =>
> 'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718,
> TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER =>
> 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3',
> TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE',
> REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '
> 2
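
The check described in the quoted advice (the end key of each region must match the start key of the next) can be sketched as follows. This is only an illustration: `find_holes` is a hypothetical helper, the keys are pieced together from the scan output quoted above, and treating the three grep hits as neighbouring rows of .META. is an assumption made for the example.

```python
def find_holes(regions):
    """Return (end_key, next_start_key) pairs where the region chain breaks.

    `regions` is a list of (start_key, end_key) tuples in .META. scan order,
    i.e. sorted by start key.
    """
    holes = []
    for (_, prev_end), (next_start, _) in zip(regions, regions[1:]):
        if prev_end != next_start:
            # Neighbours do not line up: a gap (or an overlap) in the chain.
            holes.append((prev_end, next_start))
    return holes


regions = [
    ("ztn0ukLWd1NSU3fuXKkkWq5ZVCk=", "ztqdVD8fCMP-dDbXUAydankboD4="),
    ("ztqdVD8fCMP-dDbXUAydankboD4=", "ztxrGmCwn-6BE32s3cX1TNeHU_I="),
    ("zu3zVaLcGDnnpjKCbnboXgAFspA=", "zu7qkr5fH6MMJ3GxbCv_0d6g8yI="),
]

for end_key, next_start in find_holes(regions):
    print("hole: no region covers [%s, %s)" % (end_key, next_start))
```

The pair it reports gives the start and end keys that a replacement region would need in order to close the gap, and the start of the gap is the same key that hbck names in its "edges does not contain" message.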

Re: WrongRegionException and inconsistent table found

2011-07-05 Thread Xu-Feng Mao
I forgot the version, we are using cdh3u0.

Mao Xu-Feng


RE: WrongRegionException and inconsistent table found

2011-07-14 Thread Robert Gonzalez
egionInfo(bytes)
  if oldHRI
    if oldHRI.isOffline() && Bytes.equals(oldHRI.getStartKey(), hri.getStartKey())
      # Presume offlined parent
    elsif Bytes.equals(oldHRI.getEndKey(), hri.getStartKey())
      # Start key of next matches end key of previous
    else
      LOG.info("hole after " + oldHRI.toString())
      if fixup
        bad = 1 unless fixup(oldHRI, hri, metatable, conf)
      else
        bad = 1
      end
    end
  else
    if not Bytes.toString(hri.getStartKey()) == ""
      bad = 1 unless fixup("", hri, metatable, conf)
    end
  end
  oldHRI = hri
end
if not Bytes.toString(hri.getEndKey()) == ""
  bad = 1 unless fixup(hri, "", metatable, conf)
end
scanner.close()
if bad
  LOG.info(".META. has holes")
else
  LOG.info(".META. is healthy")
end

# Return 0 if meta is good, else non-zero.
exit bad
===END CODE


-Original Message-
From: Xu-Feng Mao [mailto:m9s...@gmail.com] 
Sent: Tuesday, July 05, 2011 6:21 PM
To: Xu-Feng Mao
Cc: user@hbase.apache.org; hbase-u...@hadoop.apache.org
Subject: Re: WrongRegionException and inconsistent table found

I forgot the version, we are using cdh3u0.

Mao Xu-Feng


Fwd: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: WrongRegionException:

2011-09-27 Thread Vinod Gupta Tankala
Hi,
I have seen a similar issue being handled here, but I didn't fully understand
how to fix it in my case. This is the earlier thread:
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/18410

I am seeing the following errors in my client:
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 actions: WrongRegionException: 3 times, servers with issues: ip-10-32-61-60.ec2.internal:44911,
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1220)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
...

Also, hbase hbck -details reports the following errors. Running hbck with
-fix doesn't change anything, and neither does check_meta.rb.

ERROR: Region
AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b.
not deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/3af08f3ba88d7c2785fd089702f89241
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/a522e78c7d018547b8979ed6fd381358
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/b89f7bb47e53ea9cc729c86978a7327c
on HDFS, but not listed in META or deployed on any region server.
ERROR: Region
file:/media/ephemeral0/hbase-data/AkStats/fd7ce07003bfd883f53a210bb0985065
on HDFS, but not listed in META or deployed on any region server.
Chain of regions in table AkStats is broken; edges does not contain
277808094:1314921600:daily:Volume
ERROR: Found inconsistency in table AkStats
Summary:
  -ROOT- is okay.
Number of regions: 1
Deployed on:  ip-10-32-61-60.ec2.internal:55915
  .META. is okay.
Number of regions: 1
Deployed on:  ip-10-32-61-60.ec2.internal:55915
Chain of regions in table AkStats is broken; edges does not contain
277808094:1314921600:daily:Volume
Table AkStats is inconsistent.
Number of regions: 133
Deployed on:  ip-10-32-61-60.ec2.internal:55915
7 inconsistencies detected.
Status: INCONSISTENT

My production system is on its knees and I need to get unblocked ASAP.
Any help would be highly appreciated.

thanks
vinod


Dead loop for batch put when get WrongRegionException

2015-07-22 Thread Louis Hust
Hi all,

We are using batch puts to insert rows, and sometimes get the following WARN
in the region server log:


2015-07-23 10:08:49,684 WARN [B.defaultRpcServer.handler=5,queue=5,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=BHXYHZFIHHR3ECON10100215072399
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion atpco:ttf_fare,C,1437145538123.9c2b8cb846b318045f2ad6b5c87fef21., startKey='C', getEndKey()='D', row='BHXYHZFIHHR3ECON10100215072399'
at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:3456)
at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:3474)
at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2394)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2213)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2217)
at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4386)
at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3588)
at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3477)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29593)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)


The WARN message is logged non-stop. I think the batch put has fallen into
a dead (infinite) loop.

I looked into the source code and found that the batch put never stops
if it gets a WrongRegionException for some row.

Does anybody know how to avoid this situation?

Any ideas would be appreciated!
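
To make the suspected failure mode concrete, here is a simplified model of a batch loop. It is not HBase's code: `batch_mutate`, `try_lock` and the status strings are invented for illustration, and `ValueError` stands in for WrongRegionException. The point is only that the loop terminates when a row that fails the range check is marked as failed and skipped; a loop that logged the error and retried the same index would spin forever, which would be consistent with the non-stop WARN lines described above.

```python
def batch_mutate(rows, try_lock):
    """Apply a batch row by row and return one status string per row.

    `try_lock(row)` must raise ValueError when the row is outside the region
    (a stand-in for WrongRegionException).
    """
    status = ["NOT_RUN"] * len(rows)
    next_index = 0
    while next_index < len(rows):
        row = rows[next_index]
        try:
            try_lock(row)
            status[next_index] = "SUCCESS"
        except ValueError:
            # Record the failure instead of retrying the same row.
            status[next_index] = "FAILURE"
        # Always advance, so every iteration makes progress.
        next_index += 1
    return status


def lock_in_region_c_to_d(row):
    """Hypothetical lock function for a region with startKey='C', endKey='D'."""
    if not ("C" <= row < "D"):
        raise ValueError("Requested row out of range: " + row)


print(batch_mutate(["C100", "BHXYHZFIHHR3ECON10100215072399", "C200"],
                   lock_in_region_c_to_d))
```

Whether the server version in question behaves like the looping variant is something to confirm against the HRegion source for that release.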


Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
Hi,

It looks like we've lost a region, including its directory on HDFS and its
meta record. We need some more time to dig through the sea of logs to figure
out the root cause.

But first of all, we need to repair the meta so that we can put keys into
that region again. My understanding is that check_meta.rb and add_table.rb can
fix some meta issues as long as the region directory on HDFS and its
.regioninfo still exist.

In our situation, however, we could no longer find the region directory, so
it seems that all we can do is insert a record into the meta and then
assign it.
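
On the earlier question of where the parts of a .META. row name come from: as far as I understand the HRegionInfo naming scheme of that era, the long number is the region id (normally the creation time in milliseconds), and the encoded name is the MD5 hex digest of the "table,startkey,regionid" prefix. Treat both points as assumptions to verify against the HRegionInfo source for your version. The sketch below only builds the name string; `build_region_name` and the region id used are hypothetical, and it does not produce the serialized info:regioninfo value, which should come from HBase's own classes.

```python
import hashlib
import time


def build_region_name(table, start_key, region_id=None):
    """Return (region_name, encoded_name) for a new-style region name.

    Assumed layout: <table>,<startkey>,<regionid>.<md5 of that prefix>.
    """
    if region_id is None:
        # Assumption: the region id is the creation time in milliseconds.
        region_id = int(time.time() * 1000)
    prefix = "%s,%s,%d" % (table, start_key, region_id)
    encoded = hashlib.md5(prefix.encode("utf-8")).hexdigest()
    return "%s.%s." % (prefix, encoded), encoded


# Hypothetical region id; the start key is the beginning of the hole.
name, encoded = build_region_name("STable", "ztxrGmCwn-6BE32s3cX1TNeHU_I=",
                                  1309900000000)
print(name)
```

The table name in this thread appears to have been anonymized, so the sketch cannot be checked against the encoded names shown in the scan output.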

I modified check_meta.rb to perform the insertion. I've tried it in our
environment and it seems to work; at least hbase hbck now reports okay. I have
attached it to this message. Any comments are greatly appreciated.

I have one more question. I create the new region record with both the start
key and the end key set. If we're unlucky and a split happens during the
insertion, it seems we could end up with overlapping regions. I
wonder how HBase generally handles this sort of problem.

When I was playing with the test environment, I saw a message saying some region
'is multiply assigned to region servers'. That is also an inconsistent
state; how can I recover from it?

Thanks and regards,

Mao Xu-Feng

-- Forwarded message --
From: Xu-Feng Mao 
Date: Wed, Jul 6, 2011 at 7:20 AM
Subject: Re: WrongRegionException and inconsistent table found
To: Xu-Feng Mao 
Cc: "user@hbase.apache.org" , "
hbase-u...@hadoop.apache.org" 


I forgot the version, we are using cdh3u0.

Mao Xu-Feng


Re: Dead loop for batch put when get WrongRegionException

2015-07-22 Thread Victor Xu
Any chance that this would be your problem?
https://issues.apache.org/jira/browse/HBASE-13896



Re: Dead loop for batch put when get WrongRegionException

2015-07-22 Thread Louis Hust
It seems that HBASE-13896
<https://issues.apache.org/jira/browse/HBASE-13896> is a client-side dead
loop, but my problem is a region-server-side deadlock when getting the row lock.



Re: Dead loop for batch put when get WrongRegionException

2015-07-23 Thread Louis Hust
My HBase version is 0.98.6. I will try updating the client to 0.98.14 while
keeping the server at 0.98.6.
Thanks very much!

2015-07-23 15:00 GMT+08:00 Victor Xu :

> A client-side dead loop can cause wrong read/write requests to be sent to the
> region servers, and you've got exactly the same log output as I did when
> the bug happened. However, this only occurs when you are using a 0.98.x
> version. 1.0 and above do not have this problem.


Re: Dead loop for batch put when get WrongRegionException

2015-07-23 Thread Victor Xu
This patch hasn't been merged into 0.98.14 yet. You can apply it to
your version; recompiling just hbase-client is enough.

On Thu, Jul 23, 2015 at 3:04 PM Louis Hust  wrote:

> My hbase version is 0.98.6,  i will try update client to 0.98.14 and keep
> server at 0.98.6,
> Thanks very much!
>
> 2015-07-23 15:00 GMT+08:00 Victor Xu :
>
>> A client-side dead loop can cause wrong read/write requests to be sent to the
>> region servers, and you've got exactly the same log output as I did when
>> the bug happened. However, this only occurs when you are using a 0.98.x
>> version. 1.0 and above do not have this problem.
>>
>> On Thu, Jul 23, 2015 at 2:55 PM Louis Hust  wrote:
>>
>>> It seems that the HBASE-13896
>>> <https://issues.apache.org/jira/browse/HBASE-13896>  is client-side
>>> dead loop,
>>> but my problem is the regionserver-side dead lock for get row lock,
>>>
>>> 2015-07-23 11:23 GMT+08:00 Victor Xu :
>>>
>>>> FYI
>>>>
>>>> -- Forwarded message -
>>>> From: Victor Xu 
>>>> Date: Thu, Jul 23, 2015 at 11:22 AM
>>>> Subject: Re: Dead loop for batch put when get WrongRegionException
>>>> To: user@hbase.apache.org 
>>>>
>>>>
>>>> Any chance that this would be your problem?
>>>> https://issues.apache.org/jira/browse/HBASE-13896
>>>>
>>>> On Thu, Jul 23, 2015 at 11:17 AM Louis Hust 
>>>> wrote:
>>>>
>>>>> Hi ,all
>>>>>
>>>>> We are using batch put to insert rows, and sometimes get the following
>>>>> WARN
>>>>> in the region server log:
>>>>>
>>>>> 
>>>>> 2015-07-23 10:08:49,684 WARN
>>>>>  [B.defaultRpcServer.handler=5,queue=5,port=60020]
>>>>> regionserver.HRegion:
>>>>> Failed getting lock in batch put,
>>>>> row=BHXYHZFIHHR3ECON10100215072399
>>>>> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested
>>>>> row
>>>>> out of range for row lock on HRegion
>>>>> atpco:ttf_fare,C,1437145538123.9c2b8cb846b318045f2ad6b5c87fef21.,
>>>>> startKey='C', getEndKey()='D', row='BHXYHZFIHHR3ECON10100215072399'
>>>>> at
>>>>> org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:3456)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:3474)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2394)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2261)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2213)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2217)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4386)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3588)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3477)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29593)
>>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>>> at
>>>>>
>>>>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>>> 
>>>>>
>>>>> And the WARN message is logged no-stop. I think the batch put dived
>>>>> into
>>>>> the dead loop.
>>>>>
>>>>> And i look up into the source code, and find the batch put will never
>>>>> stop
>>>>> if got WrongRegionException for some row.
>>>>>
>>>>> Any body know how to avoid this situation?
>>>>>
>>>>> Any idea will be appreciated!
>>>>>
>>>>
>>>
>


Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Ted Yu
The attachment didn't go through. 
Can you put the file on pastebin ?

Or you can open a JIRA and attach it there. 

Thanks



On Jul 6, 2011, at 5:37 AM, Xu-Feng Mao  wrote:

> Hi, 
> 
> I looks like we've lost a region, include the directory on hdfs and its meta 
> record as well. We need some more time to dig into the log sea, to figure out 
> the root cause.
> 
> But first of all, we need to recover the meta, so that we can put keys in 
> that region. My understanding is the check_meta.rb and add_table.rb could fix 
> some meta issues in case the directory on hdfs and its .regioninfo still 
> exists.
> 
> In our situation however, since we could not find the region directory any 
> longer, it seems that all we could do is still insert a record into the meta, 
> then assign it.
> 
> I modified the check_meta.rb, to achieve the insertion. I've tried in our 
> environment, it seems work, at least hbase hbck tells me okay. I attached it 
> with this message.Any comments is great appreciated.
> 
> I have one more question. I create the new region record with both startkey 
> and endkey set, it seems possible that if we're unlucky, during the 
> insertion, some split happens, then we might lead to overlap region. I wonder 
> how hbase handles this sort of problems generally. 
> 
> When I was playing with the test environment, I saw message like some region
> 'is multiply assigned to region servers', it is also a inconsistent scenario, 
> how can I recover this problem?
> 
> Thanks and regards,
> 
> Mao Xu-Feng
> 
> ------ Forwarded message --
> From: Xu-Feng Mao 
> Date: Wed, Jul 6, 2011 at 7:20 AM
> Subject: Re: WrongRegionException and inconsistent table found
> To: Xu-Feng Mao 
> Cc: "user@hbase.apache.org" , 
> "hbase-u...@hadoop.apache.org" 
> 
> 
> I forgot the version, we are using cdh3u0.
> 
> Mao Xu-Feng
> 
> On 2011-7-6, at 0:59, Xu-Feng Mao  wrote:
> 
>> We also check the master log, nothing interesting found.
>> 
>> On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao  wrote:
>> Hi,
>> 
>> We're running a hbase cluster including 37 regionservers. Today, we found 
>> losts of WrongRegionException when putting object into it.
>> 
>> hbase hbck -details 
>> reports that
>> 
>> Chain of regions in table STable is broken; edges does not contain 
>> ztxrGmCwn-6BE32s3cX1TNeHU_I=
>> ERROR: Found inconsistency in table STable
>> 
>> 
>> echo "scan '.META.'"| hbase shell &> meta.txt 
>> grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
>> reports that
>> 
>>  Ck=,1308802977279.71ffb1 1ffb10b8b95fd47b3eff468d00ab4e9.', 
>> STARTKEY => 'ztn0ukLW
>>  0b8b95fd47b3eff468d00ab4 d1NSU3fuXKkkWq5ZVCk=', ENDKEY => 
>> 'ztqdVD8fCMP-dDbXUAydan
>>  e9.kboD4=', ENCODED => 
>> 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
>> --
>>  D4=,1305619724446.c45191   45191821053d03537596f4a2e759718.', 
>> STARTKEY => ztqdVD8f
>>  821053d03537596f4a2e7597   CMP-dDbXUAydankboD4=', ENDKEY => 
>> 'ztxrGmCwn-6BE32s3cX1TN
>>  18.eHU_I=', ENCODED => 
>> c45191821053d03537596f4a2e759718, TABLE => {{NAME =
>> --
>>  pA=,1309455605341.c5c5f55c5f578722ea3f8d1b099313bec8298.', 
>> STARTKEY => 'zu3zVaLc
>>  78722ea3f8d1b099313bec82   GDnnpjKCbnboXgAFspA=', ENDKEY => 
>> 'zu7qkr5fH6MMJ3GxbCv_0d
>>  98.6g8yI=', ENCODED => 
>> c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =
>> 
>> 
>> It looks like the meta indeed has a hole.(We tried scan '.META.' several 
>> times, to confirm it's not a transient status.)
>> We've tried hbase hbck -fix, does not help.
>> 
>> We found a thread 'wrong region exception' about two months ago. Stack 
>> suggested a 'little surgery' like
>> 
>> So, make sure you actually have a hole.  Dump out your meta table:
>> 
>> echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt
>> 
>> Then look ensure that there is a hole between the above regions
>> (compare start and end keys... the end key of one region needs to
>> match the start key of the next).
>> 
>> If indeed a hole, you need to do a little surgery inserting a new
>> missin

Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
Thanks Ted for your attention.

I do not know how to open a JIRA, so I am pasting the script here. It is not
meant for general-purpose use; all I want to do here is insert the
missing meta record.

===
#  ${HBASE_HOME}/bin/hbase org.jruby.Main test.rb --help
include Java
import org.apache.commons.logging.LogFactory
import org.apache.hadoop.hbase.util.VersionInfo
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HConstants
import org.apache.hadoop.hbase.util.FSUtils
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.util.Writables
import org.apache.hadoop.hbase.HRegionInfo
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.HTableDescriptor
import org.apache.hadoop.hbase.client.Put
import com.google.common.base.Objects

# Name of this script
NAME = 'test'

# Print usage for this script
def usage
  puts 'Usage: %s.rb ' % NAME
  exit!
end

def getConfiguration
  hbase_twenty = VersionInfo.getVersion().match('0\.20\..*')
  # Get configuration to use.
  if hbase_twenty
    c = HBaseConfiguration.new()
  else
    c = HBaseConfiguration.create()
  end
  # Set hadoop filesystem configuration using the hbase.rootdir.
  # Otherwise, we'll always use localhost though the hbase.rootdir
  # might be pointing at hdfs location. Do old and new key for fs.
  c.set("fs.default.name", c.get(HConstants::HBASE_DIR))
  c.set("fs.defaultFS", c.get(HConstants::HBASE_DIR))
  return c
end

# Get configuration
conf = getConfiguration()

# Filesystem
fs = FileSystem.get(conf)

# Rootdir
rootdir = FSUtils.getRootDir(conf)

# Get a logger and a metautils instance.
LOG = LogFactory.getLog(NAME)

# Scan the .META. looking for holes
metatable = HTable.new(conf, HConstants::META_TABLE_NAME)
scan = Scan.new()
scanner = metatable.getScanner(scan)
oldHRI = nil
bad = nil
while (result = scanner.next())
  rowid = Bytes.toString(result.getRow())
  rowidStr = java.lang.String.new(rowid)
  bytes = result.getValue(HConstants::CATALOG_FAMILY,
                          HConstants::REGIONINFO_QUALIFIER)
  hri = Writables.getHRegionInfo(bytes)
  endKey = Bytes.toString(hri.getEndKey())
  # The hard-coded key is the end key of the region just before the hole.
  if Objects.equal(endKey, "text_7aea6698-2503-4624-a835-3ea641d52ba1")
    puts Bytes.toString(hri.getStartKey())
    # The new region starts where the hole starts and has an empty end key.
    newhri = HRegionInfo.new(hri.getTableDesc(),
      java.lang.String.new("text_7aea6698-2503-4624-a835-3ea641d52ba1").getBytes(),
      java.lang.String.new("").getBytes(), false)
    puts newhri.toString()
    # Insert only info:regioninfo; the row key is the new region's name.
    p = Put.new(newhri.getRegionName())
    p.add(HConstants::CATALOG_FAMILY, HConstants::REGIONINFO_QUALIFIER,
          Writables.getBytes(newhri))
    metatable.put(p)
    break
  end
end
scanner.close()
exit 0

===

On Wed, Jul 6, 2011 at 9:01 PM, Ted Yu  wrote:

> The attachment didn't go through.
> Can you put the file on pastebin ?
>
> Or you can open a JIRA and attach it there.
>
> Thanks
>
>
>
> On Jul 6, 2011, at 5:37 AM, Xu-Feng Mao  wrote:
>
> Hi,
>
> I looks like we've lost a region, include the directory on hdfs and its
> meta record as well. We need some more time to dig into the log sea, to
> figure out the root cause.
>
> But first of all, we need to recover the meta, so that we can put keys in
> that region. My understanding is the check_meta.rb and add_table.rb could
> fix some meta issues in case the directory on hdfs and its .regioninfo still
> exists.
>
> In our situation however, since we could not find the region directory any
> longer, it seems that all we could do is still insert a record into the
> meta, then assign it.
>
> I modified the check_meta.rb, to achieve the insertion. I've tried in our
> environment, it seems work, at least hbase hbck tells me okay. I attached it
> with this message.Any comments is great appreciated.
>
> I have one more question. I create the new region record with both startkey
> and endkey set, it seems possible that if we're unlucky, during the
> insertion, some split happens, then we might lead to overlap region. I
> wonder how hbase handles this sort of problems generally.
>
> When I was playing with the test environment, I saw message like some
> region
> 'is multiply assigned to region servers', it is also a inconsistent
> scenario, how can I recover this problem?
>
> Thanks and regards,
>
> Mao Xu-Feng
>
> -- Forwarded message --
> From: Xu-Feng Mao < m9s...@gmail.com>
> Date: Wed, Jul 6, 2011 at 7:20 AM
> Subject: Re: WrongRegionException and inconsistent table found
> To: Xu-Feng Mao < m9s...@gmail.com>
> Cc: " user@hbase.apache.org" <
> user@hbase.apache.org>, " 
> hbase-u...@hadoop.apache.org" < 
>

Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Ted Yu
Have you read http://wiki.apache.org/hadoop/Hbase/HowToContribute ?

You can file an issue by starting from
https://issues.apache.org/jira/secure/CreateIssue!default.jspa

An issue solves a general problem. So you should parametrize the end key.
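
For illustration, one way to parametrize the end key is to take it from the
command line instead of hard-coding it. This is only a sketch in plain Ruby;
parse_end_key is a hypothetical helper, not part of the posted script or of
HBase, and the usage string is made up.

```ruby
# Hypothetical helper: returns the end key passed as the only argument,
# or raises ArgumentError when the argument list is not exactly one
# non-empty string.
def parse_end_key(argv)
  unless argv.length == 1 && !argv[0].empty?
    raise ArgumentError, 'Usage: test.rb END_KEY'
  end
  argv[0]
end

# Example call with the key that the posted script hard-codes.
end_key = parse_end_key(['text_7aea6698-2503-4624-a835-3ea641d52ba1'])
puts "Looking for the region whose end key is #{end_key.inspect}"
```

In the real script the argument would be ARGV, and the returned value would
replace both string literals in the scan loop.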

Cheers

On Wed, Jul 6, 2011 at 8:29 AM, Xu-Feng Mao  wrote:

> Thanks Ted for your attention.
>
> I do not know how to open a JIRA, so paste script here. This is not meant
> to work for general purpose, all I want to do here is just to insert the
> missing meta record.
>
> On Wed, Jul 6, 2011 at 9:01 PM, Ted Yu  wrote:
>
>> The attachment didn't go through.
>> Can you put the file on pastebin ?
>>
>> Or you can open a JIRA and attach it there.
>>
>> Thanks
>>
>>
>>
>> On Jul 6, 2011, at 5:37 AM, Xu-Feng Mao  wrote:
>>
>> Hi,
>>
>> I looks like we've lost a region, include the directory on hdfs and its
>> meta record as well. We need some more time to dig into the log sea, to
>> figure out the root cause.
>>
>> But first of all, we need to recover the meta, so that we can put keys in
>> that region. My understanding is the check_meta.rb and add_table.rb could
>> fix some meta issues in case the directory on hdfs and its .regioninfo still
>> exists.
>>
>> In our situation however, since we could not find the region directory any
>> longer, it seems that all we could do is still insert a record into the
>> meta, then assign it.
>>
>> I modified the check_meta.rb, to achieve the insertion. I've tried in our
>> environment, it seems work, at least hbase hbck tells me okay. I attached it
>> with this message.Any comments is great appreciated.
>>
>> I have one more question. I create the new region record with both
>> startkey and endkey set, it seems possible that if we're unlucky, during the
>> insertion, some split happens, then we might lead to overlap region. I
>> wonder how hbase handles this sort of problems generally.
>>
>> When I was playing with the test environment, I saw message like some
>> region
>> 'is multiply assigned to region servers', it is also a inconsistent
>> scenario, how can I recover this problem?
>>
>> Thanks and regards,
>>
>> Mao Xu-Feng
>>
>> -- Forwarded message --
>> From: Xu-Feng Mao < m9s...@gmail.com>
>> Date: Wed, Jul 6, 2011 at 7:20 AM
>> Subject: Re: WrongRegionException and inconsistent table found
>> To: Xu-Feng Mao < m9s...@gmail.com>
>> Cc: " user@hbase.apache.org" <
>> user@hbase.apache.org>, " 
>> hbase-u...@hadoop.apache.org" < 
>> hbase-u...@hadoop.apache.org>
>>
>>
>> I forgot the version, we are using cdh3u0.
>>
>> Mao Xu-Feng
>>
>> On 2011-7-6, at 0:59, Xu-Feng Mao < m9s...@gmail.com> wrote:
>>
>> We also check the master log, nothing interesting found.
>>
>> On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao < 
>> 
>> m9s...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We're running a hbase cluster including 37 regionservers. Today, we found
>>> losts of WrongRegionException when putting object into it.
>>>
>>> hbase hbck -details
>>> reports that
>>> 
>>> Chain of regions in table STable is broken; edges does not contain
>>> ztxrGmCwn-6BE32s3cX1TNeHU_I=
>>> ERROR: Found inconsistency in table STable
>>> 
>>>
>>> echo "scan '.META.'"| hbase shell &> meta.txt
>>> grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
>>> reports that
>>> 
>>>  Ck=,1308802977279.71ffb1 1ffb10b8b95fd47b3eff468d00ab4e9.',
>>> STARTKEY => 'ztn0ukLW
>>>  0b8b95fd47b3eff468d00ab4 d1NSU3fuXKkkWq5ZVCk=', ENDKEY =>
>>> 'ztqdVD8fCMP-dDbXUAydan
>>>  e9.kboD4=', ENCODED =>
>>> 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
>>> --
>>>  D4=,1305619724446.c45191   45191821053d03537596f4a2e759718.',
>>> STARTKEY => ztqdVD8f
>>>  821053d03537596f4a2e7597   CMP-dDbXUAydankboD4=', ENDKEY => '
>>> ztxrGmCwn-6BE32s3cX1TN
>>>  18.eHU_I=', ENCODED =>
>>> c45191821053d03537596f4a2e759718, TABLE => {{NAME =
>>> --
>>>  pA=,1309455605341.c5c5f55c5f578722ea3f8d1b099313bec8298.',
>>> STARTKEY

Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Stack
On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao  wrote:
> I looks like we've lost a region, include the directory on hdfs and its meta
> record as well. We need some more time to dig into the log sea, to figure
> out the root cause.
>

You think it was https://issues.apache.org/jira/browse/HBASE-3872?

> But first of all, we need to recover the meta, so that we can put keys in
> that region. My understanding is the check_meta.rb and add_table.rb could
> fix some meta issues in case the directory on hdfs and its .regioninfo still
> exists.
>

Yes.  add_table.rb will go out on fs and find regions for the table
and rewrite that portion of .META.  In 0.90 it will not assign them,
though; you will likely need to disable then re-enable the table to get
the regions out on the cluster.

Check_meta is likely the same.  It looks for the hole and, if you pass
-fix, it will create a new region to plug the hole.  This is probably
what you need (you may need to assign the region after running the
script).
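
To make "looks for the hole" concrete, here is a rough sketch of the idea in
plain Ruby. It is not the check_meta.rb code and uses no HBase classes;
Region and find_holes are names made up for this example, the keys are
invented, and plain string comparison stands in for HBase's byte-wise key
ordering.

```ruby
# Illustrative stand-in for a .META. row: just the start and end keys.
Region = Struct.new(:start_key, :end_key)

# Sorts regions by start key and returns [end_key, next_start_key] pairs
# wherever one region ends before the next one starts. Overlaps are not
# reported by this sketch.
def find_holes(regions)
  sorted = regions.sort_by(&:start_key)
  holes = []
  sorted.each_cons(2) do |prev, nxt|
    holes << [prev.end_key, nxt.start_key] if prev.end_key < nxt.start_key
  end
  holes
end

regions = [
  Region.new('', 'b'),
  Region.new('b', 'd'),
  Region.new('f', '') # nothing covers the keys from 'd' up to 'f'
]

find_holes(regions).each do |from, to|
  puts "hole: no region covers #{from.inspect} up to #{to.inspect}"
end
```

The pair returned for a hole is exactly the start key and end key a
replacement region would need.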

> I modified the check_meta.rb, to achieve the insertion. I've tried in our
> environment, it seems work, at least hbase hbck tells me okay. I attached it
> with this message.Any comments is great appreciated.
>

Good.

> I have one more question. I create the new region record with both startkey
> and endkey set, it seems possible that if we're unlucky, during the
> insertion, some split happens, then we might lead to overlap region. I
> wonder how hbase handles this sort of problems generally.
>

Well, you can't do cross-row transactions, which is sort of what you
would need here in this case, so, yes, it's possible that there could be
overlap. Though didn't you say the region was missing? (If so, how
could it split?)

> When I was playing with the test environment, I saw message like some region
> 'is multiply assigned to region servers', it is also a inconsistent
> scenario, how can I recover this problem?
>

Can you figure out how this double-assign happened?

To 'recover' you'd close it on one of the regionservers.  Send a
close_region 'REGION_NAME', 'SERVER_NAME' in the shell (read the shell
close_region help to be sure, for my memory is not reliable).

St.Ack


Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Stack
On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao  wrote:
>> Can anyone give a detailed example, step by step instruction would be
>> greatly appreciated.
>> My understand is we should
>> 1.Since we already has the lost region, we now have start and end keys.
>> 2.generate the row represents the missing region. But how can I generate
>> the encoded name?


It's generated for you when you create an HRegionInfo instance (see its
constructors; it takes start and end keys as well as an HTableDesc
instance for your table).

>> It looks like I need
>> column=info:server,column=info:serverstartcode and column=info:regioninfo
>> for the missing region.


info:serverstartcode and info:server will be added for you on assign.
You just add the info:regioninfo.


>> As for the name of row, it consists of tablename, startkey, encode, and
>> one more long number,
>> how to get this number?

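The row key is the region name.  The region name is made up of the table
name, the startkey, a region id (the long number; by default a creation
timestamp) and the encoded name, which is a hash of the preceding parts.
It's made for you when you create the HRegionInfo instance.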
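
As an illustration of that layout, the region name that check_meta.rb printed
in this thread can be split into its parts. This is a sketch in plain Ruby
and not an HBase API; parse_region_name is a hypothetical helper, and the
pattern assumes the 'table,startkey,id.encoded.' form seen in the logs here.

```ruby
# Hypothetical helper: splits a region name of the form
#   <table>,<start key>,<region id>.<32 hex chars>.
# into its parts. Returns nil when the name does not match that form.
def parse_region_name(name)
  # The start key may be empty or contain commas, so it is matched greedily
  # and the trailing ",<digits>.<hex>." anchors the rest.
  m = /\A([^,]+),(.*),(\d+)\.([0-9a-f]{32})\.\z/.match(name)
  return nil unless m

  { table: m[1], start_key: m[2], region_id: m[3].to_i, encoded: m[4] }
end

name = 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.' \
       'c45191821053d03537596f4a2e759718.'
parts = parse_region_name(name)
parts.each { |key, value| puts "#{key}: #{value}" }
```

None of these parts needs to be produced by hand for the repair; as noted
above, constructing the HRegionInfo generates them.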


See below for more.


>> 3.use assing command in the hbase shell
>>
>> We also tried check_meta.rb --fix, it reports
>> 
>> 11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME =>
>> 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.',
>> STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY =>
>> 'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718,
>> TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER =>
>> 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3',
>> TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE
>> => 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
>> => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647',
>> BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
>> 'userbucket', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION
>> => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'userpass',
>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE',
>> VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
>> 'false', BLOCKCACHE => 'true'}]}}
>> 11/07/06 00:28:40 WARN check_meta: Missing .regioninfo:
>> hdfs://hd0013.c.gj.com:9000/hbase/STable/3e6faca40a7ccad7ed8c0b5848c0f945/.regioninfo
>> 
>>
>> The problem is still there. BTW, what about the blue warning? Is this a
>> serious issue?

Yes.  check_meta.rb can't find the missing region on the fs so it
can't repair it.  You'll need to hack check_meta.rb so it doesn't get
the HRegionInfo to use by reading the filesystem.  Instead create the
instance to insert in your version of check_meta.rb.


>> The situation is quite hard to us, it looks like even we can fill the hole
>> in the meta, we would lost all the data in the hole region, right?
>>

If it's the bug I cited in my previous message, yes.  It's critical we
fix it and roll out a 0.90.4.

St.Ack


Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
Thanks Stack and Ted,

Yes, it looks like just the case of HBASE-3872.

Regarding *'is multiply assigned to region servers'*
I saw these messages after running add_table.rb and then assigning the
regions. Maybe we should disable the table before executing add_table.rb?
Or use 'unassign'?

Regarding *the recovery script I attached*.
After running the script, I can now insert values into that region. But hbck
reports

*Chain of regions in table STable contains less elements than are listed in
META; visited=64035, edges=64044
ERROR: Found inconsistency in table STable
*
I did check hbck before the execution, to set the most recent correct
startkey and endkey for the missing meta record, but it looks like the
execution introduced some short-cut path in the meta? I guess it might cause
loss of data in those 9 regions. Are there any tools to inspect the hfiles on
the fs, to validate the data, if we can find those 9 regions (we'll go
through the .META.)?

Thanks and regards,

Mao Xu-Feng

On Thu, Jul 7, 2011 at 3:21 AM, Stack  wrote:

> On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao  wrote:
> > I looks like we've lost a region, include the directory on hdfs and its
> meta
> > record as well. We need some more time to dig into the log sea, to figure
> > out the root cause.
> >
>
> You think it was https://issues.apache.org/jira/browse/HBASE-3872?
>
> > But first of all, we need to recover the meta, so that we can put keys in
> > that region. My understanding is the check_meta.rb and add_table.rb could
> > fix some meta issues in case the directory on hdfs and its .regioninfo
> still
> > exists.
> >
>
> Yes.  add_table.rb will go out on fs and find regions for the table
> and rewrite that portion of .META.  In 0.90 it will not assign them
> though you will likely need to disable then reenable the table to get
> the regions out on the cluster.
>
> Check_meta is likely the same.  It looks for the hole and if you pass
> the -fix, will create a new region to plug the hole.  This is probably
> what you need (You may need to assign the region post running the
> script).
>
> > I modified the check_meta.rb, to achieve the insertion. I've tried in our
> > environment, it seems work, at least hbase hbck tells me okay. I attached
> it
> > with this message.Any comments is great appreciated.
> >
>
> Good.
>
> > I have one more question. I create the new region record with both
> startkey
> > and endkey set, it seems possible that if we're unlucky, during the
> > insertion, some split happens, then we might lead to overlap region. I
> > wonder how hbase handles this sort of problems generally.
> >
>
> Well, you can't do cross-row transactions which is sort of what you
> would need here in this case so, yes, its possible that there could be
> overlap, though, didn't you say the region was missing? (If so, how
> could it split?).
>
> > When I was playing with the test environment, I saw message like some
> region
> > 'is multiply assigned to region servers', it is also a inconsistent
> > scenario, how can I recover this problem?
> >
>
> Can you figure how this double-assign happened?
>
> To 'recover' you'd close it on one of the regionservers.  Send a
> close_region 'REGION_NAME', 'SERVER_NAME' in the shell (Read the shell
> close_region help to be sure for my memory is not reliable).
>
> St.Ack
>


Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Stack
On Wed, Jul 6, 2011 at 7:28 PM, Xu-Feng Mao  wrote:
> Regarding 'is multiply assigned to region servers'
> I found these messages after running add_table.rb, and assign them.
> Maybe before executes add_table.rb, we should disable the table?
> Or use 'unassign'.
>

Yes.  If some are already assigned, it'll likely reassign regions, though
you are on 0.90.x and I didn't think regions added by add_table.rb in
a 0.90 context would be assigned.

> Regarding the recovery script I attached.
> After I run the script, I can insert values in that region now. But hbck
> reports
> 
> Chain of regions in table STable contains less elements than are listed in
> META; visited=64035, edges=64044
> ERROR: Found inconsistency in table STable
> 

Hmm... this reads as though there are some regions not yet assigned.
Is that possible?  If you add -details to hbck, does it name the
regions not assigned?  If you try assigning one manually in the shell,
does the count of visited edges go up?

St.Ack


Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
Thanks Stack, I have embedded my replies in italics.

On Thu, Jul 7, 2011 at 12:19 PM, Stack  wrote:

> On Wed, Jul 6, 2011 at 7:28 PM, Xu-Feng Mao  wrote:
> > Regarding 'is multiply assigned to region servers'
> > I found these messages after running add_table.rb, and assign them.
> > Maybe before executes add_table.rb, we should disable the table?
> > Or use 'unassign'.
> >
>
> Yes.  If some already assigned, it'll likely reassign regions though
> you are on 0.90.x and I didn't think regions added by add_table.rb in
> 0.90 context would be assigned.
>
*Yes, I assigned them manually.*


> > Regarding the recovery script I attached.
> > After I run the script, I can insert values in that region now. But hbck
> > reports
> > 
> > Chain of regions in table STable contains less elements than are listed
> in
> > META; visited=64035, edges=64044
> > ERROR: Found inconsistency in table STable
> > 
>
> Hmm... this reads as though there are some regions not yet assigned.
> Is that possible?  If you add -details to hbck does it name the
> regions not assigned?  If you try assigning one manually in the shell,
> does the could of visited edges go up?
>
*It's interesting that hbck -details now reports

ERROR: Region
STable,5i_aGOoK0oOTn8a4wmsuIiMcE_w=,1308035103936.feaedb6f49e83fff1e0cf498d3b4d734.
listed in META on region server hd0040.sj.sd.com:60020 but found on region
server hd0024.sj.sd.com:60020
Chain of regions in table S3Table contains less elements than are listed in
META; visited=64100, edges=64109
ERROR: Found inconsistency in table S3Table

Now there is only one problem region.
Do I need to unassign the region in the shell? Or will HBase just take
care of this issue by itself?*

Thanks and regards,

Mao Xu-Feng


Java client throws WrongRegionException but same key accessible via hbase shell

2011-09-27 Thread Vinod Gupta Tankala
I find this hard to believe. For the same row key, my Java client is
throwing WrongRegionException, but I can query the same key using the hbase
shell. I'm on version 0.90.2.

Also note that I have inconsistencies in my regions that I am still trying
to figure out. But regardless, the inconsistencies should affect both
methods of querying similarly, right?

thanks
vinod


Re: Java client throws WrongRegionException but same key accessible via hbase shell

2011-09-27 Thread Jean-Daniel Cryans
Hi vinod,

Yeah, WREs are never fun; hopefully we can help you fix it.

First, about the difference when querying from the shell and your java client.

 - Is it a long lived client? Did you restart it since you got the WREs?
 - If not, this could just be due to the fact that it has a cache of
regions that's different from what a new client would see.
 - If you did restart it, then I would have to think a bit more about
it to find the difference.
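
For reference, the check behind a WRE is a range test: a region serves rows
from its start key (inclusive) up to its end key (exclusive), and an empty
key means unbounded. A minimal sketch in plain Ruby, assuming string
comparison stands in for HBase's byte-wise comparison; row_in_range? is a
made-up name, not an HBase call. The 'C'/'D' values are the ones from the
"Requested row out of range" log in the batch put thread above; 'CAB123' is
invented.

```ruby
# True when row falls in [start_key, end_key); an empty key is unbounded.
def row_in_range?(start_key, end_key, row)
  after_start = start_key.empty? || row >= start_key
  before_end  = end_key.empty? || row < end_key
  after_start && before_end
end

# Region with startKey='C' and endKey='D', as in the logged exception.
puts row_in_range?('C', 'D', 'BHXYHZFIHHR3ECON10100215072399')
puts row_in_range?('C', 'D', 'CAB123')
```

A request routed with out-of-date region boundaries could fail this test on
the server even though a freshly started client would route the same row
correctly.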

Either way, it'd be nice to see what you're doing. We need:

 - A full dump of your .META. (in the shell: scan '.META.'), please
put this on a web server or a pastebin
 - The keys you are trying to reach
 - The exceptions you are seeing that contain the row keys and regions
it's trying to reach
 - Another thing that would be nice is to have the output of when you
are reaching that row key from the shell, but with the shell started
with the -d option (will show a lot more debug info).
 - The master log of the day the exception started happening.

J-D

On Tue, Sep 27, 2011 at 7:57 AM, Vinod Gupta Tankala
 wrote:
> I find this hard to believe. For the same row key, my java client is
> throwing wrong region exception. But I can query the same using hbase shell.
> Im on 0.90.2 version.
>
> Also note that I have inconsistencies in my regions that I am still trying
> to figure out. But regardless, the inconsistencies should impact both
> methods of querying similarly. right?
>
> thanks
> vinod
>


Re: Java client throws WrongRegionException but same key accessible via hbase shell

2011-09-27 Thread Vinod Gupta Tankala
J-D,
I was getting these errors even after restarting the client, so it is
probably not straightforward.
Also, I was able to run a combination of check_meta.rb and hbck with their
fix options and repair some of the inconsistencies. I still have 4
inconsistencies left (earlier it was 7), but check_meta thinks .META. is
fine. After I did this, I am not getting Java client errors any more. Not
sure how to explain that.

Do you still want me to send the information you requested? Maybe you can
help with the remaining inconsistencies.

thanks
vinod

On Tue, Sep 27, 2011 at 11:05 AM, Jean-Daniel Cryans wrote:

> Hi vinod,
>
> Yeah WREs are never fun, hopefully we can help you fixing it.
>
> First, about the difference when querying from the shell and your java
> client.
>
>  - Is it a long lived client? Did you restart it since you got the WREs?
>  - If not, this could just be due to the fact that it has a cache of
> regions that's different from what a new client would see.
>  - If you did restart it, then I would have to think a bit more about
> it to find the difference.
>
> Either way, it'd be nice to see what you're doing. We need;
>
>  - A full dump of your .META. (in the shell: scan '.META.'), please
> put this on a web server or a pastebin
>  - The keys you are trying to reach
>  - The exceptions you are seeing that contain the row keys and regions
> it's trying to reach
>  - Another thing that would be nice is to have the output of when you
> are reaching that row key from the shell, but with the shell started
> with the -d option (will show a lot more debug info).
>  - The master log of the day the exception started happening.
>
> J-D
>
> On Tue, Sep 27, 2011 at 7:57 AM, Vinod Gupta Tankala
>  wrote:
> > I find this hard to believe. For the same row key, my java client is
> > throwing wrong region exception. But I can query the same using hbase
> shell.
> > Im on 0.90.2 version.
> >
> > Also note that I have inconsistencies in my regions that I am still
> trying
> > to figure out. But regardless, the inconsistencies should impact both
> > methods of querying similarly. right?
> >
> > thanks
> > vinod
> >
>


Re: Java client throws WrongRegionException but same key accessible via hbase shell

2011-09-27 Thread Jean-Daniel Cryans
Yes, please send the info I asked for.

About the hbck errors you had, this is usually fixed with -fix:

 Region 
AkStats,277808094:1314921600:daily:Volume,1317052667861.1ecc871503cd827934c3a9077b44e52b.
not deployed on any region server.

This is "probably" a region that wasn't cleaned up so it's not really a problem:

Region 
file:/media/ephemeral0/hbase-data/AkStats/3520c0cfbedcf212084379c0e41e7839
on HDFS, but not listed in META or deployed on any region server.

This is a real problem:

Chain of regions in table AkStats is broken; edges does not contain
277808094:1314921600:daily:Volume
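
To illustrate what a "broken chain" means, here is a minimal sketch in Python. It is not hbck's code; the function name and the sample keys are hypothetical. It assumes each region is a (start_key, end_key) pair, that an empty key marks the open start or end of the table, and that a healthy table has every region's end key equal to the next region's start key.

```python
def find_chain_breaks(regions):
    """Return (prev_end, next_start) pairs where consecutive regions do not line up.

    `regions` is a list of (start_key, end_key) byte-string pairs; an empty
    key stands for the open start or end of the table.
    """
    ordered = sorted(regions, key=lambda r: r[0])
    breaks = []
    # The first region should start at the empty key.
    if ordered and ordered[0][0] != b"":
        breaks.append((b"", ordered[0][0]))
    # Each region's end key should be the next region's start key.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if prev_end != next_start:
            breaks.append((prev_end, next_start))
    # The last region should end at the empty key.
    if ordered and ordered[-1][1] != b"":
        breaks.append((ordered[-1][1], b""))
    return breaks


# Hypothetical table with a hole between keys "d" and "f".
sample = [(b"", b"b"), (b"b", b"d"), (b"f", b"")]
print(find_chain_breaks(sample))
```

A row key that falls into such a hole has no region to serve it, which is why this kind of inconsistency matters more than a leftover directory on HDFS.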

J-D



Re: Java client throws WrongRegionException but same key accessible via hbase shell

2011-09-27 Thread Rohit Nigam
Try running
hbase org.jruby.Main add_table.rb /hbase/tablename

This will clean up the inconsistencies in the .META. table. If you run hbck
again and still see holes in the table, then you will have to put more effort
into cleaning the table.
Rohit



Re: Java client throws WrongRegionException but same key accessible via hbase shell

2011-09-27 Thread Vinod Gupta Tankala
J-D,
here is the meta scan file -
https://docs.google.com/document/d/1_g2Ce20H65rukrEe8i9UWaW9wskaV34_96ElpOQDsYE/edit?hl=en_US

I solved this problem, "Chain of regions in table AkStats is broken; edges
does not contain 277808094:1314921600:daily:Volume", using a combination of
check_meta and hbck -fix.

But every now and then I hit this problem where the java client hits a WRE
but the hbase shell works. An example of the above:
get 'AkStats', '26696569976:1317081600:weekly:AudEng'

I have a suspicion about why this could be happening. Please confirm whether
I should be concerned about the following:
1) 5-10% of my rows are really large, ~2.5MB+. The remaining rows are
smaller, a few KB.
2) Before I write the large rows, I delete the row if it exists and then
write a new one. I do it this way because some of the columns in the existing
row don't apply any more, so I delete the whole row and rewrite it. Of
course, this happens only every few hours, for 1-2K rows, based on my
current load.

Does hbase scale well for row deletes?

thanks



WrongRegionException: Requested row out of range for calculated split on HRegion => How is this possible?

2013-08-24 Thread Jean-Marc Spaggiari
(I have added line feeds to make it easier to read)
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
out of range for calculated split on HRegion
work_proposed,\xF5\x9A\xEA&\x00\x00\x00\x00
http://video.mindentimes.ca/search/all/source/qmi-agency/kanye-west-spending-10-on-private-flights-to-see-pregnant-kim-kardashian/2319156767001/page/16,1376139517597.b39bf00b980b632901859761caafb9d0.,

startKey   ='\xF5\x9A\xEA&\x00\x00\x00\x00
http://video.mindentimes.ca/search/all/source/qmi-agency/kanye-west-spending-10-on-private-flights-to-see-pregnant-kim-kardashian/2319156767001/page/16',

getEndKey()='\xF5\x9B@}\x00\x00\x00\x00
http://fr.video.sympatico.ca/accueil/les-plus-populaires/watch/kim-kardashian-rit-des-rumeurs-dinfidelite/2477090497001?sort=date&filter=Splash&page=5',

row='\xFA\xCDH?\x00\x00\x00\x00http://www.futur.

Start key is xF5 x9A xEA
End key is xF5 x9B x40

But I'm getting xFA xCD as the mid key, which is not in the range.
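
A minimal sketch of that range check in Python, using only the leading bytes quoted above. The function name is hypothetical, and it assumes the usual convention that a region covers start key inclusive to end key exclusive, compared byte by byte as unsigned values, with an empty end key meaning "no upper bound".

```python
def row_in_region(row, start_key, end_key):
    """True if start_key <= row < end_key; an empty end_key means no upper bound."""
    # Python compares bytes objects lexicographically as unsigned values.
    return row >= start_key and (end_key == b"" or row < end_key)


start = b"\xF5\x9A\xEA"  # leading bytes of the region start key
end = b"\xF5\x9B\x40"    # leading bytes of the region end key
mid = b"\xFA\xCD"        # leading bytes of the reported mid key

print(row_in_region(mid, start, end))
```

The first byte already decides it: 0xFA is greater than 0xF5, so the mid key sorts after the end key.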

MidKey definition:

 * An approximation to the {@link HFile}'s mid-key. Operates on block
 * boundaries, and does not go inside blocks. In other words, returns the
 * first key of the middle block of the file
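
As a rough illustration of that definition (not the actual HFile code, which works from the block index), here is a sketch in Python. It assumes the file is modeled as a list of blocks, each block a sorted list of keys; the names and sample keys are hypothetical.

```python
def approximate_mid_key(blocks):
    """Return the first key of the middle block, or None for an empty file."""
    if not blocks:
        return None
    middle_block = blocks[len(blocks) // 2]
    return middle_block[0]


# Hypothetical file with three blocks of two keys each.
blocks = [[b"a", b"b"], [b"c", b"d"], [b"e", b"f"]]
print(approximate_mid_key(blocks))
```

Note that the result depends only on what is stored in the file. If the file holds keys beyond the range that META records for the region, the mid key can land outside that range too.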

Does it mean that the blocks in my HFile are not correctly ordered? I
have just one store file for this region.

If I run  bin/hbase org.apache.hadoop.hbase.io.hfile.HFile on this region,
I get this:

firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...

But from the WebUI, I have these 2 regions at the end:
work_proposed,\xF5\x9A\xEA&\x00\x00\x00\x00... buldo:60030
\xF5\x9A\xEA&\x00\x00\x00\x00h... \xF5\x9B@}\x00\x00\x00\x00... 0
work_proposed,\xF5\x9B@}\x00\x00\x00\x00...
This is the same as what I got in the logs, but not the same as what the
HFilePrettyPrinter is giving me. The provided midkey is fine if we consider
the output of the HFilePrettyPrinter, but wrong if we consider the WebUI.


http://pastebin.com/dmtAnQtF
Version: 0.94.12-SNAPSHOT, but I have been facing this for weeks now, so it
is not new.

I will continue to investigate. Most probably I will try to print the 58M
keys in the HFile to see who's right and who's wrong, and why the two sources
of information differ. I might also drop the entry in META to let HBCK
rebuild it based on the HDFS file and see...

All ideas are welcome.

JM


Re: WrongRegionException: Requested row out of range for calculated split on HRegion => How is this possible?

2013-08-24 Thread Jean-Marc Spaggiari
By looking at the HFile content, I can see that the information displayed on
the WebUI is not correct.
The last key printed by HFilePrettyPrinter is K:
\xFF\xFF\xFF\xFE\x00\x00\x00\x00

The region after this one is listed by the same application to have:
firstKey=\xF5\x9BB\xF4\x00\x00\x00\x00...
lastKey=\xFF\xFF\xFF`\x00\x00\x00\x00...

And the concerned region:
firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...

Which means I have an overlap between the 2.
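
A minimal sketch of that overlap test in Python, using the truncated key prefixes printed above (the full keys are longer, so this only shows the idea). The function name is hypothetical; both ranges are treated as inclusive because firstKey and lastKey are keys actually present in each file.

```python
def ranges_overlap(first_a, last_a, first_b, last_b):
    """True if the inclusive key ranges [first_a, last_a] and [first_b, last_b] intersect."""
    return first_a <= last_b and first_b <= last_a


# Truncated prefixes of firstKey/lastKey as reported for the two regions.
concerned = (b"\xF5\x9A\xEA&", b"\xFF\xFF\xFF\xFE")
following = (b"\xF5\x9BB\xF4", b"\xFF\xFF\xFF`")

print(ranges_overlap(*concerned, *following))
```

With these prefixes the second region's whole range sits inside the first one's, which matches the overlap described here.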

So now, what are the options?

1) HBCK doesn't report any issue.
2) HFile reports the right key information.
3) The WebUI doesn't report the right information.

Since the WebUI displays the information based on the META, my best guess
is that the META content is not correct. So I can "simply" remove it and let
HBCK repair that. Another option might be to copy the files from the 2nd
region to the 1st one as another store and re-compact the 2 together?

Should we have something to detect such region overlap or a disconnect
between the META and the HFiles? I will not do anything for now because I
want to know your opinion, but I think we should at least have something to
detect that in HBCK, and most probably something to fix it too.

JM


