You should be able to kill -9 the regionserver and only lose data
written after the last sync (the default is to sync on each write,
IIRC).  If this is not the case for you, then something is broken.
Let's figure it out.

St.Ack

On Fri, Sep 3, 2010 at 1:55 AM, Gagandeep Singh
<[email protected]> wrote:
> I think I have figured out the problem, or maybe not.
>
> In order to simulate RegionServer failure my fellow programmer was killing
> the Regionserver with *kill -9 pid*. But when I used *kill pid* everything
> seems to be working fine. Obviously the region server is now going down
> gracefully, so there is no data loss.
>
> I also checked it on hadoop 0.20.2 (without append) with HBase 0.20.5 and
> found no data loss in the case of a simple kill command.
> Now my next question is: should it also work with the *kill -9* command?
>
> FYI - I am using VMs. In my current setup I am using 3 VMs, 1 for Namenode
> and HBase Master and both 2 and 3 have Data node and region servers running
> on them.
>
> Thanks,
> Gagan
>
>
>
> On Thu, Sep 2, 2010 at 8:18 PM, Stack <[email protected]> wrote:
>
>> On Thu, Sep 2, 2010 at 3:34 AM, Gagandeep Singh
>> <[email protected]> wrote:
>> > Hi Daniel
>> >
>> > I have downloaded hadoop-0.20.2+320.tar.gz from this location
>> > http://archive.cloudera.com/cdh/3/
>>
>>
>> That looks right, yes.
>>
>> > And also changed the *dfs.support.append* flag to *true* in my
>> > *hdfs-site.xml* as mentioned here
>> > http://wiki.apache.org/hadoop/Hbase/HdfsSyncSupport.
>> >
>>
>> That sounds right too.  As Ted suggests, put it into all configs
>> (though I believe it is enabled by default on that branch -- in the UI
>> you'd see a warning if it was NOT enabled).
>>
>> > But data loss is still happening. Am I using the right version?
>> > Are there any other settings that I need to change so that data gets
>> > flushed to HDFS?
>> >
>>
>> It looks like you are doing the right thing.  Can we see the master log,
>> please?
>>
>> Thanks,
>> St.Ack
>>
>>
>> > Thanks,
>> > Gagan
>> >
>> >
>> >
>> > On Thu, Aug 26, 2010 at 11:57 PM, Jean-Daniel Cryans
>> > <[email protected]> wrote:
>> >
>> >> That, or use CDH3b2.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Aug 26, 2010 at 11:22 AM, Gagandeep Singh
>> >> <[email protected]> wrote:
>> >> > Thanks Daniel
>> >> >
>> >> > It means I have to check out the code from the branch and build it on
>> >> > my local machine.
>> >> >
>> >> > Gagan
>> >> >
>> >> >
>> >> > On Thu, Aug 26, 2010 at 9:51 PM, Jean-Daniel Cryans
>> >> > <[email protected]> wrote:
>> >> >
>> >> >> Then I would expect some form of data loss, yes, because stock Hadoop
>> >> >> 0.20 doesn't have any form of fsync, so HBase doesn't know whether the
>> >> >> data made it to the datanodes when appending to the WAL. Please use
>> >> >> the 0.20-append Hadoop branch with HBase 0.89, or Cloudera's CDH3b2.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Thu, Aug 26, 2010 at 7:22 AM, Gagandeep Singh
>> >> >> <[email protected]> wrote:
>> >> >> > HBase - 0.20.5
>> >> >> > Hadoop - 0.20.2
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Gagan
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Aug 26, 2010 at 7:11 PM, Jean-Daniel Cryans
>> >> >> > <[email protected]> wrote:
>> >> >> >
>> >> >> >> Hadoop and HBase version?
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Aug 26, 2010 5:36 AM, "Gagandeep Singh"
>> >> >> >> <[email protected]> wrote:
>> >> >> >>
>> >> >> >> Hi Group,
>> >> >> >>
>> >> >> >> I am checking HBase/HDFS failover. I am inserting 1M records from
>> >> >> >> my HBase client application. I am batching my Put operations so
>> >> >> >> that 10 records are added to a List<Put> and then I call
>> >> >> >> table.put(). I have not modified the default settings of the Put
>> >> >> >> operation, which means all data is written to the WAL, and in case
>> >> >> >> of server failure my data should not be lost.
>> >> >> >>
>> >> >> >> But I noticed somewhat strange behavior: while adding records, if
>> >> >> >> I kill my Region Server, my application waits until the region
>> >> >> >> data is moved to another region. But I noticed that while doing so
>> >> >> >> all my data is lost and my table is emptied.
>> >> >> >>
>> >> >> >> Could you help me understand this behavior? Is there some kind of
>> >> >> >> cache also involved in writing, because of which my data is lost?
>> >> >> >>
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Gagan
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
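
For anyone reading this thread later, the difference between kill and
kill -9 discussed here can be pictured with a small toy model. This is
only a sketch under loud assumptions: ToyWal and all of its methods are
hypothetical names made up for illustration, it is not HBase or HDFS
code, and it reduces "sync" to moving edits from an in-memory list to a
list that stands in for the datanodes. Setting syncOnAppend to false is
meant to resemble stock Hadoop 0.20 without append support; true is meant
to resemble a sync on each write.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a write-ahead log. Hypothetical names; this is NOT HBase code.
class ToyWal {
    // Edits that reached stable storage (stands in for the datanodes).
    private final List<String> durable = new ArrayList<>();
    // Edits still held only in the process's memory.
    private final List<String> buffered = new ArrayList<>();
    // true models a filesystem with working sync; false models one without.
    private final boolean syncOnAppend;

    ToyWal(boolean syncOnAppend) {
        this.syncOnAppend = syncOnAppend;
    }

    // Record an edit; make it durable immediately if sync is available.
    void append(String edit) {
        buffered.add(edit);
        if (syncOnAppend) {
            sync();
        }
    }

    // Move everything buffered so far to stable storage.
    void sync() {
        durable.addAll(buffered);
        buffered.clear();
    }

    // Plain "kill": the process gets a chance to flush before exiting.
    List<String> shutdownGracefully() {
        sync();
        return recover();
    }

    // "kill -9": the process dies at once and buffered edits are gone.
    List<String> killDashNine() {
        buffered.clear();
        return recover();
    }

    // What a recovering server could replay from stable storage.
    List<String> recover() {
        return new ArrayList<>(durable);
    }

    public static void main(String[] args) {
        ToyWal withSync = new ToyWal(true);
        ToyWal noSyncHardKill = new ToyWal(false);
        ToyWal noSyncSoftKill = new ToyWal(false);
        for (int i = 0; i < 10; i++) {
            withSync.append("row-" + i);
            noSyncHardKill.append("row-" + i);
            noSyncSoftKill.append("row-" + i);
        }
        System.out.println(withSync.killDashNine().size());             // prints 10
        System.out.println(noSyncHardKill.killDashNine().size());       // prints 0
        System.out.println(noSyncSoftKill.shutdownGracefully().size()); // prints 10
    }
}
```

In this model a graceful shutdown never loses edits, a hard kill with
sync on each write loses nothing that was acknowledged, and a hard kill
without sync loses everything still buffered, which matches the pattern
reported in the thread. Real recovery involves log splitting and region
reassignment that the sketch leaves out.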
