Hi Jean,

This is the scenario I am talking about. Let me know if everything is OK
with this region chain.

Region name: RAW,GpsgQ,1393477054705.defb006868d8191e76a2ae7e9d203419.
Start key:   GpsgQ
End key:     G7stL

Region name: RAW,G7stL,1393477054705.0d123f246312f937e930fc76fc8d4b9c.
Start key:   G7stL
End key:     HCuv

Region name: RAW,HCuv,1393490697926.d1ea022c6fd534aaf89139ca0726cce5.
Start key:   HCuv
End key:     HCuv,1377721814384.87a47003ddb6f0e18a1b735c89bf8ac3.,1378197588060.bcae64eb2788208ab723eb2ad4f5925f.

Region name: RAW,HCuv,1377721814384.87a47003ddb6f0e18a1b735c89bf8ac3.,1378197588060.bcae64eb2788208ab723eb2ad4f5925f.,1393484971849.a71d77c695705d07a121340c07a1cc0c.
Start key:   HCuv,1377721814384.87a47003ddb6f0e18a1b735c89bf8ac3.,1378197588060.bcae64eb2788208ab723eb2ad4f5925f.
End key:     eE08
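
To make the question concrete, here is a minimal sketch of the adjacency check I have in mind for a region chain: each region's end key should equal the next region's start key. It assumes the (name, start key, end key) triples have already been exported by hand, for example from the table above. The function name `check_region_chain` and the sample keys are hypothetical and are not part of HBase; keys are compared as raw bytes, and an empty end key is treated as "end of table".

```python
from typing import List, Tuple

# (region name, start key, end key); keys are raw bytes, b"" means open-ended
Region = Tuple[str, bytes, bytes]


def check_region_chain(regions: List[Region]) -> List[str]:
    """Return a description of every hole or overlap between neighbouring regions."""
    problems = []
    # Order regions by start key, using plain lexicographic byte comparison.
    ordered = sorted(regions, key=lambda r: r[1])
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if end_a == start_b:
            continue  # the two regions are contiguous
        if end_a == b"" or end_a > start_b:
            # An empty end key on a non-last region covers everything after it.
            problems.append(f"overlap: {name_a} ends after {name_b} starts")
        else:
            problems.append(f"hole: {name_a} ends before {name_b} starts")
    return problems


if __name__ == "__main__":
    # Hypothetical keys, only to show the shape of the input.
    sample = [
        ("r1", b"a", b"c"),
        ("r2", b"c", b"f"),
        ("r3", b"g", b""),
    ]
    for line in check_region_chain(sample):
        print(line)
```

This only checks adjacency between neighbouring regions; it says nothing about whether the data inside each region is intact.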

The size in HDFS of the region hashes "d1ea022c6fd534aaf89139ca0726cce5
d1ea022c6fd534aaf89139ca0726cce5" is 600KB.

Also, initially we thought it was human error, that someone might have
deleted HDFS dirs under some regions. But, surprisingly, only some columns in
a column family for a row in the region are lost. If someone had deleted an
entire dir, then the entire column family for those region rows should be lost,
since HBase has store files for each column family.

We also have rows with millions of columns in the table; some of the columns
are present and some are lost. This happened in some regions, not across all
of the table's regions. All other tables in the cluster are good.

On Thu, Feb 27, 2014 at 9:40 PM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Hi Kiran,
>
> 2 things.
>
> 1) Is there any reason for you to use such an old HBase version? Any chance to
> migrate to a more recent one? 0.94.17 is out.
> 2) What do you mean by "I have never seen a weird start key and end key
> like this"? I don't see anything wrong with what you described. What do your
> keys look like? Can you do a get with the key being "K3.xyz,138798010000.xyp"?
>
> JM
>
>
> 2014-02-27 10:55 GMT-05:00 kiran <kiran.sarvabho...@gmail.com>:
>
> > Adding to that, there are many regions with 0MB size that have the CFs
> > specified in the table...
> >
> >
> > On Thu, Feb 27, 2014 at 9:23 PM, kiran <kiran.sarvabho...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > We have been experiencing severe data loss issues for a few hours. There
> > > are some weird things going on in the cluster. We were unable to locate the
> > > data even in HDFS.
> > >
> > > HBase version 0.94.1
> > >
> > > Here are the weird things that are going on:
> > >
> > > 1) The table, which was once 1TB, has now become 170GB, and many of the
> > > regions which were once 7GB are now shrinking to a few MB. We have no clue
> > > what is happening at all.
> > >
> > > 2) The table is splitting (or whatever) (100 regions have become 200 regions),
> > > and ours is ConstantSizeRegionSplitPolicy with region size 20GB. I don't know
> > > why it is even splitting.
> > >
> > > 3) The HDFS namenode dump, which we periodically back up, is decreasing in size.
> > >
> > > 4) And there is a region chain with start keys and end keys as follows (I can't
> > > copy-paste the exact thing). For example:
> > >
> > > K1.xxx K2.xyz
> > > K2.xyz K3.xyz,138798010000.xyp
> > > K3.xyz,138798010000.xyp K4.xyq
> > >
> > > I have never seen a weird start key and end key like this. We also suspect
> > > a failed split of a region around 20GB. We looked at the logs many times but
> > > were unable to make any sense of them. Please help us out; we can't afford
> > > data loss.
> > >
> > > Yesterday, there was a cluster crash of the root region, but we thought we
> > > had successfully restored it. But things didn't go that way. There has been
> > > consistent data loss after that.
> > >
> > >
> > > --
> > > Thank you
> > > Kiran Sarvabhotla
> > >
> > > -----Even a correct decision is wrong when it is taken late
> > >
> > >
> >
> >
> > --
> > Thank you
> > Kiran Sarvabhotla
> >
> > -----Even a correct decision is wrong when it is taken late
> >
>



-- 
Thank you
Kiran Sarvabhotla

-----Even a correct decision is wrong when it is taken late
