I don't think a space problem led to it in my case. The corruption started after I bounced the cluster. At the time I tried to investigate what caused it, but didn't find anything useful in the logs besides this line:

saveLeases found path /tmp/temp623789763/tmp659456056/_temporary_attempt_200904211331_0010_r_000002_0/part-00002 but no matching entry in namespace
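(If the NameNode will still start, a quick way to see how far the corruption extends - before concluding a format is unavoidable - is HDFS's built-in checker:

    bin/hadoop fsck / -files -blocks

The flags are optional; they make fsck list each file and block it inspects rather than only printing a summary.)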
I also tried to recover from the SecondaryNameNode files, but the corruption was too wide-spread and I had to format.

Tamir

On Mon, May 4, 2009 at 4:48 PM, Stas Oskin <stas.os...@gmail.com> wrote:

> Hi.
>
> Same conditions - where the space ran out and the fs got corrupted?
>
> Or did it get corrupted by itself (which is even more worrying)?
>
> Regards.
>
> 2009/5/4 Tamir Kamara <tamirkam...@gmail.com>
>
> > I had the same problem a couple of weeks ago with 0.19.1. Had to
> > reformat the cluster too...
> >
> > On Mon, May 4, 2009 at 3:50 PM, Stas Oskin <stas.os...@gmail.com> wrote:
> >
> > > Hi.
> > >
> > > After rebooting the NameNode server, I found out the NameNode doesn't
> > > start anymore.
> > >
> > > The logs contained this error:
> > > "FSNamesystem initialization failed"
> > >
> > > I suspected filesystem corruption, so I tried to recover from the
> > > SecondaryNameNode. Problem is, it was completely empty!
> > >
> > > I had an issue that might have caused this - the root mount ran out
> > > of space. But both the NameNode and the SecondaryNameNode directories
> > > were on another mount point with plenty of space there, so it's very
> > > strange that they were impacted in any way.
> > >
> > > Perhaps the logs, which were located on the root mount and as a
> > > result could not be written, caused this?
> > >
> > > To get HDFS running again, I had to format it (including manually
> > > erasing the files from the DataNodes). While this is reasonable in a
> > > test environment, production-wise it would be very bad.
> > >
> > > Any idea why it happened, and what can be done to prevent it in the
> > > future? I'm using the stable 0.18.3 version of Hadoop.
> > >
> > > Thanks in advance!
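A couple of things that would have helped here, for the record:

dfs.name.dir accepts a comma-separated list, and the NameNode writes a full copy of the image and edits to every directory listed - so keeping a second copy on a different mount (or on NFS) protects against exactly this kind of single-volume loss. A sketch for hadoop-site.xml (the paths are made-up examples):

    <property>
      <name>dfs.name.dir</name>
      <!-- example paths - use directories on separate mounts -->
      <value>/disk1/hadoop/name,/disk2/hadoop/name</value>
    </property>

And when the image really is gone but the SecondaryNameNode holds a good checkpoint, the NameNode can be started from it instead of formatting:

    bin/hadoop namenode -importCheckpoint

That reads the checkpoint from fs.checkpoint.dir and saves it as a fresh image. I believe -importCheckpoint exists in the 0.18/0.19 line, but check the HDFS user guide for your exact release before depending on it.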