FSEditLog.selectInputStreams is reading through in-progress streams even when 
non-in-progress are requested
-----------------------------------------------------------------------------------------------------------

                 Key: HDFS-2738
                 URL: https://issues.apache.org/jira/browse/HDFS-2738
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: ha, name-node
    Affects Versions: HA branch (HDFS-1623)
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Critical


The new code in HDFS-1580 causes a problem with {{selectInputStreams}} in the 
HA context. While the active is writing to the shared edits, the standby calls 
{{selectInputStreams}}. That call reaches {{journalSet.getInputStream}} without 
passing the {{inProgressOk=false}} flag, so {{getInputStream}} reads and 
validates the in-progress stream unnecessarily. Because the validation results 
are no longer properly cached, {{findMaxTransaction}} re-validates the 
in-progress stream as well, which then breaks the corruption check in this 
code. The end result is a lot of errors like:

2011-12-30 16:45:02,521 ERROR namenode.FileJournalManager (FileJournalManager.java:getNumberOfTransactions(266)) - Gap in transactions, max txnid is 579, 0 txns from 578
2011-12-30 16:45:02,521 INFO  ha.EditLogTailer (EditLogTailer.java:run(163)) - Got error, will try again.
java.io.IOException: No non-corrupt logs for txid 578
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.getInputStream(JournalSet.java:229)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1081)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:115)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$0(EditLogTailer.java:100)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:154)
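
To illustrate the intended behavior described above, here is a minimal, self-contained sketch of what honoring an {{inProgressOk}} flag during stream selection could look like. It is not the actual FSEditLog or JournalSet code: the class {{SegmentSelectionSketch}}, the {{Segment}} type, and the {{select}} method are hypothetical names invented for this example, and the real fix may well be structured differently. The point is only that in-progress segments are skipped up front, before any reading or validation, when the caller asks for finalized edits only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the real HDFS edit log classes. */
final class SegmentSelectionSketch {

    /** Simplified stand-in for an edit log segment. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;       // only meaningful for finalized segments
        final boolean inProgress;

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /**
     * Returns the segments a reader should open to get edits from fromTxId on.
     * Assumes the input list is sorted by firstTxId.
     */
    static List<Segment> select(List<Segment> segments, long fromTxId,
                                boolean inProgressOk) {
        List<Segment> selected = new ArrayList<>();
        for (Segment s : segments) {
            if (s.inProgress && !inProgressOk) {
                // Skip without opening or validating the segment at all.
                continue;
            }
            if (!s.inProgress && s.lastTxId < fromTxId) {
                // Finalized segment lies entirely before the requested txid.
                continue;
            }
            selected.add(s);
        }
        return selected;
    }
}
```

Under these assumptions, a standby asking for txid 578 with {{inProgressOk=false}} while only an in-progress segment starts at 578 would get an empty selection (nothing to tail yet) and would never touch the in-progress file.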


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
