On Thu, Oct 2, 2014 at 11:17 AM, Buckley,Ron <buckl...@oclc.org> wrote:
> Also, once the original /hbase got mv'd, a few of the region servers did
> some flushes before they aborted. Those RSs actually created a new
> /hbase, with new table directories, but only containing the data from the
> flush.

Sounds like we should be creating flush files with createNonRecursive (even
though it's deprecated).

On Thu, Oct 2, 2014 at 11:17 AM, Buckley,Ron <buckl...@oclc.org> wrote:

> FWIW, in case something like this happens to someone else.
>
> To recover this, the first thing I tried was to just mv the /hbase
> directory back. That doesn’t work.
>
> To get back going we had to completely shut down and restart.
>
> Also, once the original /hbase got mv'd, a few of the region servers did
> some flushes before they aborted. Those RSs actually created a new
> /hbase, with new table directories, but only containing the data from the
> flush.
>
>
> -----Original Message-----
> From: Buckley,Ron
> Sent: Thursday, October 02, 2014 2:09 PM
> To: hbase-user
> Subject: RE: Recovering hbase after a failure
>
> Nick,
>
> Good ideas. Compared file and region counts with our DR site. Things
> look OK. Going to run some rowcounters too.
>
> Feels like we got off easy.
>
> Ron
>
> -----Original Message-----
> From: Nick Dimiduk [mailto:ndimi...@gmail.com]
> Sent: Thursday, October 02, 2014 1:27 PM
> To: hbase-user
> Subject: Re: Recovering hbase after a failure
>
> Hi Ron,
>
> Yikes!
>
> Do you have any basic metrics regarding the amount of data in the system
> -- size of store files before the incident, number of records, &c?
>
> You could sift through the HDFS audit log and see if any files that were
> there previously have not been restored.
>
> -n
>
> On Thu, Oct 2, 2014 at 10:18 AM, Buckley,Ron <buckl...@oclc.org> wrote:
>
> > We just had an event where, on our main hbase instance, the /hbase
> > directory got moved out from under the running system (Human error).
> >
> > HBase was really unhappy about that, but we were able to recover it
> > fairly easily and get back going.
> >
> > As far as I can tell, all the data and tables came back correct. But,
> > I'm pretty concerned that there may be some hidden corruption or data
> > loss.
> >
> > 'hbase hbck' runs clean and there are no new complaints in the logs.
> >
> > Can anyone think of anything else we should look at?
> >
>

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
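
To make the createNonRecursive suggestion concrete, here is a minimal
sketch of the difference between a create that rebuilds missing parent
directories and one that refuses to. It is only an analogy: it uses
java.nio on a local temp directory as a stand-in for HDFS, it does not call
Hadoop's FileSystem API, and it is not HBase's flush code. The class name,
method names, and the table/region/flush file names are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NonRecursiveCreateSketch {

    /** Recursive-style create: silently rebuilds any missing parent directories. */
    public static void writeRecursive(Path file, byte[] data) throws IOException {
        Files.createDirectories(file.getParent());
        Files.write(file, data, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    /** Non-recursive create: fails if the parent directory no longer exists. */
    public static void writeNonRecursive(Path file, byte[] data) throws IOException {
        Files.write(file, data, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout standing in for /hbase/<table>/<region>.
        Path base = Files.createTempDirectory("sketch");
        Path root = base.resolve("hbase");
        Path region = root.resolve("table").resolve("region");
        Files.createDirectories(region);

        // Simulate the root directory being moved out from under the writer.
        Files.move(root, base.resolve("hbase.moved"));

        byte[] data = {1, 2, 3};
        try {
            writeNonRecursive(region.resolve("flush-a"), data);
            System.out.println("non-recursive create unexpectedly succeeded");
        } catch (NoSuchFileException e) {
            System.out.println("non-recursive create refused: parent directory is gone");
        }

        // The recursive variant quietly recreates the root with only this file in it.
        writeRecursive(region.resolve("flush-b"), data);
        System.out.println("recursive create rebuilt the root: " + Files.exists(root));
    }
}
```

In this analogy the recursive variant reproduces what Ron saw (a fresh root
directory holding only the flushed data), while the non-recursive variant
fails as soon as the parent directory is missing.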