Re: Best practices to recover from Corrupt Namenode

praveenesh kumar Mon, 16 Jan 2012 23:32:39 -0800

I have a replication factor of 2 because I cannot afford 3 replicas on
my cluster.
The fsck output reported missing block replicas for some files, and that
is why the Namenode was being reported as corrupt.
I don't have the output with me, but the issue was that block replicas
were missing. How can we tackle that?


Is there an internal mechanism for re-creating blocks that are found
missing, such as some kind of refresh command?


Thanks,
Praveenesh

On Tue, Jan 17, 2012 at 12:48 PM, Harsh J <ha...@cloudera.com> wrote:

> You ran into a corrupt files issue, not a namenode corruption (which
> generally refers to the fsimage or edits getting corrupted).
>
> Did your files not have adequate replication that they could not withstand
> the loss of one DN's disk? What exactly did fsck output? Did all block
> replicas go missing for your files?
>
> On 17-Jan-2012, at 12:08 PM, praveenesh kumar wrote:
>
> > Hi guys,
> >
> > I just faced a weird situation, in which one of my hard disks on DN went
> > down.
> > Because of that, when I restarted the namenode, some of the blocks went
> > missing and it reported that my namenode was CORRUPT and in safe mode,
> > which doesn't allow you to add or delete any files on HDFS.
> >
> > I know we can turn off safe mode.
> > The problem is how to deal with the Corrupt Namenode problem in this
> > case -- best practices.
> >
> > In my case, I was lucky that all the missing blocks belonged to the
> > outputs of M/R jobs I had run previously.
> > So I just deleted all the files with missing blocks from HDFS to go
> > from the CORRUPT --> HEALTHY state.
> >
> > But had it been the large input data files, deleting them would not
> > have been a good solution.
> >
> > So I wanted to know what should be the best practices to deal with above
> > kind of problems to go from CORRUPT NAMENODE --> HEALTHY NAMENODE?
> >
> > Thanks,
> > Praveenesh
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
>
>
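
The distinction raised in this thread (a block that has lost some of its
replicas versus a block that has lost all of them) can be illustrated
with a small toy model. This is only a sketch, not Hadoop code: the
function name classify_blocks, the block IDs, and the datanode names are
all made up for illustration. The underlying behaviour it models is that
HDFS re-replicates a block on its own as long as at least one live
replica survives, while a block with no surviving replica cannot be
re-created by HDFS and is what fsck reports as missing.

```python
# Toy model only (hypothetical names, not a Hadoop API): classify blocks
# after some storage locations have failed.
from typing import Dict, List, Set


def classify_blocks(
    replicas: Dict[str, Set[str]],
    failed: Set[str],
    target_replication: int = 2,
) -> Dict[str, List[str]]:
    """Sort block IDs into healthy, under-replicated, and missing.

    replicas maps a block ID to the set of locations holding a copy;
    failed is the set of locations that are no longer readable.
    """
    result: Dict[str, List[str]] = {
        "healthy": [],
        "under_replicated": [],
        "missing": [],
    }
    for block_id in sorted(replicas):
        live = replicas[block_id] - failed
        if not live:
            # No copy left anywhere: nothing to re-replicate from.
            result["missing"].append(block_id)
        elif len(live) < target_replication:
            # At least one copy survives: it can be copied again.
            result["under_replicated"].append(block_id)
        else:
            result["healthy"].append(block_id)
    return result


if __name__ == "__main__":
    example = {
        "blk_1": {"dn1", "dn2"},  # written with two replicas
        "blk_2": {"dn1"},         # only ever had one replica
        "blk_3": {"dn2", "dn3"},
    }
    print(classify_blocks(example, failed={"dn1"}))
```

In this model, only the blocks in the "missing" list correspond to files
that would have to be restored from another source or deleted; the
"under_replicated" ones would heal without intervention.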
