[ 
https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529317#comment-13529317
 ] 

Eugene Koontz commented on HDFS-4302:
-------------------------------------

On the other hand, it is a shame to lose that apparently useful debug info.
                
> Precondition in EditLogFileInputStream's length() method is checked too early 
> in NameNode startup, causing fatal exception
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4302
>                 URL: https://issues.apache.org/jira/browse/HDFS-4302
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, namenode
>            Reporter: Eugene Koontz
>         Attachments: HDFS-4302.patch, HDFS-4302.patch
>
>
> When bringing up a namenode in standby mode, where DEBUG is enabled for 
> namenode, the namenode will hit the following code in 
> {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}:
> {code}
>  if (LOG.isDebugEnabled()) {
>       LOG.debug("edit log length: " + in.length() + ", start txid: "
>           + expectedStartingTxId + ", last txid: " + lastTxId);
>     }
> {code}.
> However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, 
> this code is hit before the {{EditLogFileInputStream}}'s {{advertizedSize}} 
> is initialized (before the HTTP client connects to the remote edit log server 
> (i.e. the journal node)). This causes the following precondition to fail in 
> {{EditLogFileInputStream:length()}}:
> {code}
>       Preconditions.checkState(advertisedSize != -1,
>           "must get input stream before length is available");
> {code}
> which shuts down the namenode with the following log messages and stack trace:
> {code}
> 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine 
> (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms
> 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager 
> (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream 
> manifests:
> 172.16.175.1:8485: [[1,3]]
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(605)) - Planning to load image :
> FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000,
>  cpktTxId=0000000000000000000)
> 2012-12-11 10:45:33,351 DEBUG namenode.FSImage 
> (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(168)) 
> - Loading image file 
> /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000 using no 
> compression
> 2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(171)) 
> - Number of files = 1
> 2012-12-11 10:45:33,356 INFO  namenode.FSImage 
> (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under 
> construction = 0
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImageFormat.java:load(193)) 
> - Image file of size 119 loaded in 0 seconds.
> 2012-12-11 10:45:33,357 INFO  namenode.FSImage 
> (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from 
> /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000
> 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) 
> - About to load edits:
>   org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
> 2012-12-11 10:45:33,359 INFO  namenode.FSImage (FSImage.java:loadEdits(694)) 
> - Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 
> expecting start txid #1
> 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping 
> client
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC 
> Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 
> from hdfs/eugenes-macbook-pro.lo...@example.com: closed
> 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client 
> (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from 
> hdfs/eugenes-macbook-pro.lo...@example.com: stopped, remaining connections 0
> 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - 
> Exception in namenode join
> java.lang.IllegalStateException: must get input stream before length is 
> available
>         at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>         at 
> org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.length(EditLogFileInputStream.java:405)
>         at 
> org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.length(EditLogFileInputStream.java:258)
>         at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256)
>         at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:125)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:88)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:697)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:642)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:259)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:604)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:447)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:409)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:400)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:434)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:606)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:591)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1219)
> 2012-12-11 10:45:33,470 INFO  util.ExitUtil (ExitUtil.java:terminate(84)) - 
> Exiting with status 1
> 2012-12-11 10:45:33,471 INFO  namenode.NameNode (StringUtils.java:run(620)) - 
> SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at Eugenes-MacBook-Pro.local/172.16.175.1
> ************************************************************/
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to