Re: Recovering hbase after a failure

2014-10-02 Thread Esteban Gutierrez
On Thu, Oct 2, 2014 at 3:12 PM, Andrew Purtell wrote: > On Thu, Oct 2, 2014 at 3:02 PM, Esteban Gutierrez wrote: > > Another possibility is that we could > > live with createNonRecursive until FileSystem becomes fully deprecated and > > we can migrate to FileContext, perhaps for HBa

Re: Recovering hbase after a failure

2014-10-02 Thread Andrew Purtell
On Thu, Oct 2, 2014 at 3:02 PM, Esteban Gutierrez wrote: > I get that isDirectory is not atomic and not the best solution, but at > least can provide an alternative to fail the operation without using the > deprecated API or altering FileSystem > This is not an alternative solution because it's

Re: Recovering hbase after a failure

2014-10-02 Thread Andrew Purtell
On Thu, Oct 2, 2014 at 3:02 PM, Esteban Gutierrez wrote: > Another possibility is that we could > live with createNonRecursive until FileSystem becomes fully deprecated and > we can migrate to FileContext, perhaps for HBase 3.x? > Sure > HBASE-11045 goes in > the opposite direction to t

Re: Recovering hbase after a failure

2014-10-02 Thread Esteban Gutierrez
omeone else. > >> > > > >> > > To recover this, the first thing I tried was to just mv the /hbase > >> > > directory back. That doesn’t work. > >> > > > >> > > To get back going had to completely shut down and restart. > >> > > > >>

Re: Recovering hbase after a failure

2014-10-02 Thread Andrew Purtell
> >> > > Also, once the original /hbase got mv'd, a few of the region servers >> did >> > > some flush's before they aborted. Those RS's actually created a new >> > > /hbase, with new table directories, but only containing the da

Re: Recovering hbase after a failure

2014-10-02 Thread Andrew Purtell
vers > did > > > some flush's before they aborted. Those RS's actually created a new > > > /hbase, with new table directories, but only containing the data from > the > > > flush. > > > > > > > > > -Original Message- > >

Re: Recovering hbase after a failure

2014-10-02 Thread Nick Dimiduk
>> > >>> To get back going had to completely shut down and restart. > >>> > >>> Also, once the original /hbase got mv'd, a few of the region servers > did > >>> some flush's before they aborted. Those RS's actually created a

Re: Recovering hbase after a failure

2014-10-02 Thread Andrew Purtell
ctually created a new >>> /hbase, with new table directories, but only containing the data from the >>> flush. >>> >>> >>> -Original Message- >>> From: Buckley,Ron >>> Sent: Thursday, October 02, 2014 2:09 PM >>> To: hbase-use

RE: Recovering hbase after a failure

2014-10-02 Thread Buckley,Ron
There were a bunch of new WAL's, but they were all empty. -Original Message- From: Esteban Gutierrez [mailto:este...@cloudera.com] Sent: Thursday, October 02, 2014 2:27 PM To: user@hbase.apache.org Subject: Re: Recovering hbase after a failure Thanks for sharing the details Ron.

Re: Recovering hbase after a failure

2014-10-02 Thread Esteban Gutierrez
al /hbase got mv'd, a few of the region servers > did > > > some flush's before they aborted. Those RS's actually created a new > > > /hbase, with new table directories, but only containing the data from > the > > > flush. > > > > >

Re: Recovering hbase after a failure

2014-10-02 Thread Nick Dimiduk
h's before they aborted. Those RS's actually created a new > > /hbase, with new table directories, but only containing the data from the > > flush. > > > > > > -Original Message- > > From: Buckley,Ron > > Sent: Thursday, October 02, 20

Re: Recovering hbase after a failure

2014-10-02 Thread Esteban Gutierrez
's before they aborted. Those RS's actually created a new > > /hbase, with new table directories, but only containing the data from the > > flush. > > > > > > -Original Message- > > From: Buckley,Ron > > Sent: Thursday, October 02, 2014 2:09 PM

Re: Recovering hbase after a failure

2014-10-02 Thread Esteban Gutierrez
RS's actually created a new > /hbase, with new table directories, but only containing the data from the > flush. > > > -Original Message- > From: Buckley,Ron > Sent: Thursday, October 02, 2014 2:09 PM > To: hbase-user > Subject: RE: Recovering hbase after a

Re: Recovering hbase after a failure

2014-10-02 Thread Andrew Purtell
d a new > /hbase, with new table directories, but only containing the data from the > flush. > > > -Original Message- > From: Buckley,Ron > Sent: Thursday, October 02, 2014 2:09 PM > To: hbase-user > Subject: RE: Recovering hbase after a failure > > Nick, >

RE: Recovering hbase after a failure

2014-10-02 Thread Buckley,Ron
14 1:27 PM To: hbase-user Subject: Re: Recovering hbase after a failure Hi Ron, Yikes! Do you have any basic metrics regarding the amount of data in the system -- size of store files before the incident, number of records, &c? You could sift through the HDFS audit log and see if any files tha

RE: Recovering hbase after a failure

2014-10-02 Thread Buckley,Ron
-user Subject: Re: Recovering hbase after a failure Hi Ron, Yikes! Do you have any basic metrics regarding the amount of data in the system -- size of store files before the incident, number of records, &c? You could sift through the HDFS audit log and see if any files that were there previo

RE: Recovering hbase after a failure

2014-10-02 Thread Buckley,Ron
26 PM To: user@hbase.apache.org Subject: Re: Recovering hbase after a failure Hi Ron, Look into dropped snapshot exceptions in the logs and puts or deletes that skip the WAL. If everything is good there then clients should have handled the unavailability of HBase and there should not be any dat

Re: Recovering hbase after a failure

2014-10-02 Thread Nick Dimiduk
Hi Ron, Yikes! Do you have any basic metrics regarding the amount of data in the system -- size of store files before the incident, number of records, &c? You could sift through the HDFS audit log and see if any files that were there previously have not been restored. -n On Thu, Oct 2, 2014 at

Re: Recovering hbase after a failure

2014-10-02 Thread Esteban Gutierrez
Hi Ron, Look into dropped snapshot exceptions in the logs and puts or deletes that skip the WAL. If everything is good there then clients should have handled the unavailability of HBase and there should not be any data loss from the server side. Also double check if after the crash there were not e

Recovering hbase after a failure

2014-10-02 Thread Buckley,Ron
We just had an event where, on our main hbase instance, the /hbase directory got moved out from under the running system (Human error). HBase was really unhappy about that, but we were able to recover it fairly easily and get back going. As far as I can tell, all the data and tables came back c