[ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083785#comment-13083785 ]
Ivan Kelly commented on HDFS-2018:
----------------------------------

{quote}
no need for the more complicated caching in FileJournalManager, since we only scan the directory once
{quote}
The caching in FileJournalManager is only for inprogress logs, which will be recovered the first time they are read while not being written to. This is the only caching. I don't think scanning a directory during load time is going to cause a significant performance hit.

{quote}
treats log recovery as an explicit step at startup - it's good to make it explicit since we need to not do recovery when a NN starts up in standby mode, for example.
{quote}
Proper fencing is the correct way to handle this.

{quote}
the EditLogReference interface will also make it easier to allow other types of journal managers to participate in edits-transfer, I think.
{quote}
Other journal types shouldn't need any edit transfer mechanism, as anything that isn't a local file will be shared storage.

Frankly, I don't see that this alternative approach brings enough to justify holding off this for longer. This patch has been held up for 6 weeks for various reasons, and changing the approach now is just going to delay it again. Jitendra and I have other patches which have been held up waiting for this. It would have been nice if this alternative approach had been proposed a month ago.

> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
> HDFS-2018.diff, hdfs-2018-otherapi.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in
> FileJournalManager and the code for input streams is in the inspectors. This
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.
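
For illustration only, the last change in the description (recovering an inprogress log by counting transactions rather than trusting the file length) might look roughly like the sketch below. This is not the code from the attached patch. The record layout, the padding marker, and every name in it (`TxnCountRecoverySketch`, `countValidTransactions`, `sampleLog`) are assumptions made up for the example. The idea it shows is that file length can be misleading, for instance when the file ends in padding or a partially written record, whereas counting contiguous, well-formed transactions gives the true end of the log.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TxnCountRecoverySketch {

    // Hypothetical record layout (NOT the real HDFS edit log format):
    //   [long txid][int payloadLength][payload bytes]
    // Padding is assumed to be 0xFF bytes, which read back as txid -1.

    /** Counts contiguous, complete transactions starting at firstTxId. */
    static long countValidTransactions(byte[] log, long firstTxId) {
        long expected = firstTxId;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            while (true) {
                long txid = in.readLong();
                if (txid != expected) {
                    break; // padding, a gap, or garbage: stop counting here
                }
                int len = in.readInt();
                if (len < 0 || len > in.available()) {
                    break; // payload was only partially written
                }
                in.readFully(new byte[len]);
                expected++; // only count a transaction once it was read in full
            }
        } catch (EOFException e) {
            // Ran out of bytes mid-record: everything before it still counts.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return expected - firstTxId;
    }

    /** Builds an in-memory log: txids 100..102 followed by 64 bytes of padding. */
    static byte[] sampleLog() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            String[] payloads = {"mkdir /a", "mkdir /b", "mkdir /c"};
            long txid = 100L;
            for (String p : payloads) {
                byte[] body = p.getBytes(StandardCharsets.UTF_8);
                out.writeLong(txid++);
                out.writeInt(body.length);
                out.write(body);
            }
            for (int i = 0; i < 64; i++) {
                out.writeByte(0xFF); // simulated padding after the last transaction
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] log = sampleLog();
        long count = countValidTransactions(log, 100L);
        System.out.println("file length: " + log.length + " bytes, valid transactions: " + count);
    }
}
```

In this sketch the padded file and a file cut off in the middle of a record both yield a transaction count that stops at the last complete record, which is the property a length-based check cannot give.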