Re: Failed namenode restart, recovering from corrupt edits file?

Adam Phelps Wed, 12 Jan 2011 13:36:37 -0800

On 1/12/11 1:05 PM, Friso van Vollenhoven wrote:

If I am correct your proposed solution would set you back to an image
from about 15-30 minutes before the crash. I think it depends on what
you do with your HDFS (HBase, append only things, ?), whether that will
work out. In our case we are running HBase and going back in time with
the NN image is not very helpful then, because of splits and compactions
removing and adding files all the time. On append only workloads where
you have the option of redoing whatever it is that you did just before
the time of the crash, this could work. But, please verify with someone
with a better understanding of HDFS internals.

We do run HBase. It's our desire to avoid trashing the intervening data; however, ditching the particular MR output files that show up in the error would be fine.

Also, there apparently is a way of healing a corrupt edits file using
your favorite hex editor. There is a thread here:
http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201010.mbox/%3caanlktinbhmn1x8dlir-c4ibhja9nh46tns588cqcn...@mail.gmail.com%3e

Thanks for the link. Manually editing the edits file is our current thought; a little understanding of the format should save us some pain.
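
Before touching anything with a hex editor, it seems prudent to work on a copy and to look at the bytes around the suspect position first. A minimal Python sketch of that follows. It assumes nothing about the edits file format itself: the helper names (backup, hexdump, dump_region) are hypothetical, and the file path and byte offset are whatever you determine from your own NameNode error, not values taken from this thread.

```python
import shutil
from pathlib import Path


def backup(path: Path) -> Path:
    """Copy the file next to itself with a .bak suffix; refuse to overwrite."""
    dest = path.with_name(path.name + ".bak")
    if dest.exists():
        raise FileExistsError(dest)
    shutil.copy2(path, dest)
    return dest


def hexdump(data: bytes, start: int = 0, width: int = 16) -> str:
    """Format bytes as offset, hex columns, and printable ASCII."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the ASCII column.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start + i:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def dump_region(path: Path, offset: int, length: int = 64) -> str:
    """Read `length` bytes starting at `offset` and return a hex dump."""
    with path.open("rb") as f:
        f.seek(offset)
        return hexdump(f.read(length), start=offset)
```

Calling backup() on the edits file and then dump_region() on the copy at the offset of interest only reads data, so the original is never modified by this sketch.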

There is a thread about this (our) problem on the cdh-user Google group.
You could also try to post there.


Thanks, I'll go take a look there.

- Adam
