Thanks a lot for your reply, Todd. I added some answers below.
On Wed, Dec 16, 2009 at 6:06 PM, Todd Lipcon <[email protected]> wrote:
> Hi Erik,
>
> A few things to try:
>
> - does this FS store sensitive data or would it be possible to bzip2 the
> files and upload them somewhere?
>
Unfortunately, I think it'd be hard to do that. There is some sensitive data
in there. Also, fsimage and edits together are about 100 megs and edits.new
is 500 megs, so it's quite a bit.
What's the difference between edits and edits.new? I tried removing edits
but keeping edits.new, and we got some (expected) error messages such as:
09/12/16 15:10:09 DEBUG hdfs.StateChange: DIR* FSDirectory.unprotectedDelete: failed to remove /user_genres/delta_2009_11_05-temp-2009-11-16T13-00-20.006799/_temporary/_attempt_200911131430_0279_r_000066_0/part-00066 because it does not exist
09/12/16 15:10:09 ERROR namenode.NameNode: java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1006)
(...)
09/12/16 15:10:09 INFO namenode.NameNode: SHUTDOWN_MSG:
I guess this makes sense, since edits.new contains a list of operations
that refer to files created in edits, or something similar. But it seems
strange to me that edits is still there and hasn't been merged in a month.
> - can you add logging to the replay of FSEditLog so as to be aware of what
> byte offset is causing the issue?
>
Is there a way of doing that without having to recompile Hadoop?
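The closest thing I can think of without recompiling is bumping the relevant
namenode loggers to DEBUG in conf/log4j.properties, something like:

```properties
# Enable existing DEBUG output for edit log replay (logger names taken
# from the stack trace) -- this does NOT add new byte-offset logging.
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSEditLog=DEBUG
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSDirectory=DEBUG
```

though I assume that only turns on DEBUG statements that already exist, so
logging the exact byte offset would presumably still need a code change.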
> - DO take a backup of all of the state, immediately, before you
> accidentally
> lose the data.
>
Including the actual data on the datanodes? That's a lot of TBs, but you're
right, we should start looking into that.
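For the metadata itself, I guess something like this sketch would do as a
first step (paths are placeholders -- /tmp/name stands in for our real
dfs.name.dir; with the namenode stopped, a plain recursive copy that
preserves timestamps should be enough):

```python
# Sketch: snapshot the namenode metadata dir. /tmp/name is a stand-in for
# the real dfs.name.dir (an assumption -- substitute the actual path).
import shutil
from pathlib import Path

name_dir = Path("/tmp/name")            # stand-in for dfs.name.dir
name_dir.mkdir(parents=True, exist_ok=True)
for f in ("fsimage", "edits", "edits.new", "fstime", "VERSION"):
    (name_dir / f).touch()              # demo files only

backup = Path("/tmp/name.backup")
if backup.exists():
    shutil.rmtree(backup)               # don't nest into an old backup
shutil.copytree(name_dir, backup)       # copies contents, preserving mtimes
print(sorted(p.name for p in backup.iterdir()))
```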
Will blocks not belonging to files be deleted? When removing edits +
edits.new, we had loads of warnings looking like "block xxx does not belong
to any file, added to invalidSet", though at least the namenode seemed to
start. I didn't let it run for many seconds before killing it, however. If I
knew for certain that I could safely ignore these errors, maybe I could try
to fire up a namenode using an old fsimage, then try to merge edits.new in
there somehow.
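By the way, regarding the junk at the end of edits.new from my first mail
below: this is roughly the check I used before trimming -- just counting
trailing zero (padding) bytes, nothing that actually parses the edit log
format, so treat it as a heuristic only:

```python
# Heuristic, not a parser of the edit log format: count how many 0x00
# padding bytes a byte string ends with, to see where the junk starts.

def trailing_zeros(data: bytes) -> int:
    """Number of 0x00 bytes at the very end of data."""
    n = 0
    for b in reversed(data):
        if b != 0:
            break
        n += 1
    return n

# Demo on an in-memory stand-in for the tail of edits.new:
tail = b"\x09somepath" + b"\x00" * 6
print(trailing_zeros(tail))  # 6 trailing padding bytes
```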
> - Do you have a metadata backup from earlier this week?
>
Unfortunately not. I think we just found out the hard way that we've been
putting a bit too much trust in our setup.
My guess is that a backup from earlier this week might still have led to a
crash anyway, since it seems to be the edits file and not the edits.new file
that causes it, and since edits is a month old. Which brings me back to my
confusion regarding the roles of fsimage, edits and edits.new.
Thanks,
Erik Bernhardsson
> On Wed, Dec 16, 2009 at 6:54 AM, Erik Bernhardsson <[email protected]> wrote:
>
> > Hi,
> > we just encountered some problems when restarting our namenode. I'd really
> > appreciate if anyone has any clue of what is going on here.
> >
> > The error message is as follows:
> >
> > 09/12/16 14:25:03 INFO namenode.NameNode: STARTUP_MSG:
> > /************************************************************
> > STARTUP_MSG: Starting NameNode
> > STARTUP_MSG: host = ida.int.sto.spotify.net/78.31.15.83
> > STARTUP_MSG: args = []
> > STARTUP_MSG: version = 0.20.0
> > STARTUP_MSG: build =
> > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r
> > 763504;
> > compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
> > ************************************************************/
> > 09/12/16 14:25:03 DEBUG conf.Configuration: java.io.IOException: config()
> > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:210)
> > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:197)
> > at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:937)
> > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:964)
> >
> > 09/12/16 14:25:03 INFO metrics.RpcMetrics: Initializing RPC Metrics with
> > hostName=NameNode, port=54310
> > 09/12/16 14:25:03 INFO namenode.NameNode: Namenode up at:
> > ida.int.sto.spotify.net/78.31.15.83:54310
> > 09/12/16 14:25:03 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> > processName=NameNode, sessionId=null
> > 09/12/16 14:25:03 INFO metrics.NameNodeMetrics: Initializing
> > NameNodeMeterics using context
> > object:org.apache.hadoop.metrics.spi.NullContext
> > 09/12/16 14:25:03 DEBUG security.UserGroupInformation: Unix Login:
> > root,root
> > 09/12/16 14:25:03 INFO namenode.FSNamesystem: fsOwner=root,root
> > 09/12/16 14:25:03 INFO namenode.FSNamesystem: supergroup=supergroup
> > 09/12/16 14:25:03 INFO namenode.FSNamesystem: isPermissionEnabled=true
> > 09/12/16 14:25:03 INFO metrics.FSNamesystemMetrics: Initializing
> > FSNamesystemMetrics using context
> > object:org.apache.hadoop.metrics.spi.NullContext
> > 09/12/16 14:25:03 INFO namenode.FSNamesystem: Registered
> > FSNamesystemStatusMBean
> > 09/12/16 14:25:03 INFO common.Storage: Number of files = 835872
> > 09/12/16 14:25:20 INFO common.Storage: Number of files under construction = 26
> > 09/12/16 14:25:20 INFO common.Storage: Image file of size 128698266 loaded in 17 seconds.
> > 09/12/16 14:25:20 ERROR namenode.NameNode: java.lang.NullPointerException
> > at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1006)
> > at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:992)
> > at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:966)
> > at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:953)
> > at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:696)
> > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
> > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
> > at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
> > at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
> > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:302)
> > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:283)
> > at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
> > at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
> > at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:955)
> > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:964)
> >
> > 09/12/16 14:25:21 INFO namenode.NameNode: SHUTDOWN_MSG:
> >
> > For some reason the edits.new seems to have become surprisingly large. At
> > half a GB, it's more than five times as large as fsimage.
> >
> > # ls -al
> > drwxr-xr-x 3 root root 4096 2009-12-16 10:19 .
> > drwxr-xr-x 4 root root 4096 2009-12-16 10:36 ..
> > -rwxr-xr-x 1 root root 1615894 2009-12-16 10:19 edits
> > -rwxr-xr-x 1 root root 598054491 2009-12-16 10:19 edits.new
> > -rwxr-xr-x 1 root root 128698266 2009-12-16 10:19 fsimage
> > -rwxr-xr-x 1 root root 8 2009-12-16 10:19 fstime
> > -rwxr-xr-x 1 root root 113 2009-12-16 10:19 VERSION
> >
> > Before I accidentally changed the timestamps of these files, one could
> > clearly see that:
> > 1. None of the files except edits.new had been modified since 2009-11-16
> > (time of last restart)
> > 2. edits.new was modified this morning (2009-12-16)
> >
> > Unfortunately the secondary namenode contains no data that is more recent
> > than 2009-11-16. I'm not sure why this is the case - it might be because it
> > was configured badly. However, the fsimage files for the namenode and
> > secondary namenode are exactly identical. The edits file for the secondary
> > namenode is almost empty:
> >
> > drwxr-xr-x 2 root root 4096 2009-12-16 10:18 .
> > drwxr-xr-x 4 root root 4096 2009-12-16 10:18 ..
> > -rwxr-xr-x 1 root root 4 2009-12-16 10:18 edits
> > -rwxr-xr-x 1 root root 128698266 2009-12-16 10:18 fsimage
> > -rwxr-xr-x 1 root root 8 2009-12-16 10:18 fstime
> > -rwxr-xr-x 1 root root 113 2009-12-16 10:18 VERSION
> >
> > Tailing edits.new reveals some junk at the end of the file, which I think
> > I tried trimming away according to the format (I was taking a look at
> > FSEditLog.java), but I still get the NPE. Even removing edits.new and
> > restarting the namenode leads to the same NPE. I haven't seen any obvious
> > corruption in edits, though there might be something there that I can't
> > spot with my eyes.
> >
> > I also tried restarting the namenode with -importCheckpoint. It seems to
> > work, but starts outputting warning messages about blocks not belonging to
> > any specific file. I suppose this makes sense since the secondary namenode
> > contains data from 2009-11-16. I'm really afraid of losing data from the
> > past month (2009-11-16 - 2009-12-16) so I haven't let the -importCheckpoint
> > finish. Is there a risk of losing new blocks when running -importCheckpoint?
> > If not, it seems like we should probably start with that, then try to merge
> > the edits.new somehow.
> >
> > Does anyone know how to approach this problem? We've had similar problems
> > before, typically when the namenode ran out of disk space and the edits
> > file got damaged. However, the disk seems far from full right now, so I'm
> > not sure why we ran into it, though I can't swear the disk hasn't been
> > full during the past week or so.
> >
> > HDFS has also seemed quite slow over the past week; I'm not sure if that's
> > a coincidence or if it has anything to do with this.
> >
> > Regards,
> > Erik Bernhardsson
> >
>