I have a replication factor of 2 because I cannot afford 3 replicas on my cluster. The fsck output said that block replicas were missing for some files, and that is what caused the namenode to be reported as corrupt. I don't have the output with me, but the issue was that block replicas were missing. How can we tackle that?
Is there an internal mechanism for re-creating blocks that are found missing, or some kind of refresh command?

Thanks,
Praveenesh

On Tue, Jan 17, 2012 at 12:48 PM, Harsh J <ha...@cloudera.com> wrote:

> You ran into a corrupt files issue, not a namenode corruption (which
> generally refers to the fsimage or edits getting corrupted).
>
> Did your files not have adequate replication that they could not withstand
> the loss of one DN's disk? What exactly did fsck output? Did all block
> replicas go missing for your files?
>
> On 17-Jan-2012, at 12:08 PM, praveenesh kumar wrote:
>
> > Hi guys,
> >
> > I just faced a weird situation in which one of my hard disks on a DN
> > went down. Because of that, when I restarted the namenode, some of the
> > blocks went missing, and it was saying my namenode is CORRUPT and in
> > safe mode, which doesn't allow you to add or delete any files on HDFS.
> >
> > I know we can turn off safe mode. The problem is how to deal with the
> > corrupt namenode problem in this case: what are the best practices?
> >
> > In my case, I was lucky that all the missing blocks belonged to the
> > outputs of M/R jobs I had run previously. So I just deleted all the
> > files with missing blocks from HDFS to go from the CORRUPT --> HEALTHY
> > state.
> >
> > But had it been the large input data files, deleting those files would
> > not have been a good solution.
> >
> > So I wanted to know what the best practices are for dealing with this
> > kind of problem, to go from CORRUPT NAMENODE --> HEALTHY NAMENODE?
> >
> > Thanks,
> > Praveenesh
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera