Did one datanode fail or did the namenode fail? By "fail" do you mean that the system was rebooted or was there a bad disk that caused the problem?
thanks,
dhruba

On Sun, May 11, 2008 at 7:23 PM, C G <[EMAIL PROTECTED]> wrote:
> Hi All:
>
> We had a primary node failure over the weekend. When we brought the node
> back up and I ran Hadoop fsck, I saw that the filesystem is corrupt. I'm
> unsure how best to proceed. Any advice is greatly appreciated. If I've
> missed a Wiki page or documentation somewhere, please feel free to tell me
> to RTFM and let me know where to look.
>
> Specific question: how do I clear under- and over-replicated files? Is the
> correct procedure to copy the file locally, delete it from HDFS, and then
> copy it back to HDFS?
>
> The fsck output is long, but the final summary is:
>
> Total size:    4899680097382 B
> Total blocks:  994252 (avg. block size 4928006 B)
> Total dirs:    47404
> Total files:   952070
> ********************************
>  CORRUPT FILES:   2
>  MISSING BLOCKS:  24
>  MISSING SIZE:    1501009630 B
> ********************************
> Over-replicated blocks:    1 (1.0057812E-4 %)
> Under-replicated blocks:   14958 (1.5044476 %)
> Target replication factor: 3
> Real replication factor:   2.9849212
>
> The filesystem under path '/' is CORRUPT
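For reference, a sketch of the usual fsck-based remedies (not from the original thread; commands assume a standard Hadoop installation with the `hadoop` script on the PATH, and `/path/to/file` is a placeholder):

```shell
# Report every file, block, and replica location for the whole namespace,
# to identify exactly which files hold the missing blocks:
hadoop fsck / -files -blocks -locations

# For corrupt files, salvage the remaining blocks by moving the files
# to /lost+found:
hadoop fsck / -move
# ...or delete the corrupt files outright once they are given up on:
hadoop fsck / -delete

# Under- and over-replicated blocks normally need no manual action: the
# namenode re-replicates (or trims) them in the background. To nudge a
# specific file back to the target factor, reset its replication and
# wait (-w) for the change to complete:
hadoop fs -setrep -w 3 /path/to/file
```

This avoids the copy-out/delete/copy-back round trip, since replication counts can be adjusted in place.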