[ 
https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083785#comment-13083785
 ] 

Ivan Kelly commented on HDFS-2018:
----------------------------------

{quote}
no need for the more complicated caching in FileJournalManager, since we only 
scan the directory once
{quote}
The caching in FileJournalManager applies only to inprogress logs, which are 
recovered the first time they are read while not being written to. That is the 
only caching. I don't think scanning a directory at load time will cause a 
significant performance hit.

{quote}
treats log recovery as an explicit step at startup - it's good to make it 
explicit since we need to not do recovery when a NN starts up in standby mode, 
for example.
{quote}
Proper fencing is the correct way to handle this. 

{quote}
the EditLogReference interface will also make it easier to allow other types of 
journal managers to participate in edits-transfer, I think.
{quote}
Other journal types shouldn't need any edit transfer mechanism, as anything 
that isn't a local file will be shared storage.

Frankly, I don't see that this alternative approach brings enough to justify 
holding this patch off any longer. It has been held up for 6 weeks for various 
reasons, and changing the approach now will only delay it again. Jitendra and 
I have other patches that are held up waiting for it. It would have been nice 
if this alternative approach had been proposed a month ago.

> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, hdfs-2018-otherapi.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in 
> FileJournalManager and the code for input streams is in the inspectors. This 
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of 
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.
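
The third item in the description above (recovering inprogress logs by 
counting transactions rather than by file length) could look roughly like the 
sketch below. This is not the code from the patch. The record layout (8-byte 
txid, 4-byte payload length, payload), the class TxnCountSketch, the Scan 
type, the MAX_PAYLOAD bound and the helper names are all assumptions made for 
illustration and do not match the real HDFS edit log format. The point it 
tries to show is that a preallocated or partially written tail makes file 
length unreliable, while a scan that stops at the first incomplete or 
out-of-sequence record gives the last transaction that can be trusted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Illustrative only: a hypothetical record layout, not the HDFS on-disk format.
final class TxnCountSketch {
    // Assumed upper bound on a single payload, so garbage lengths are rejected.
    static final int MAX_PAYLOAD = 1 << 20;

    /** Number of complete transactions found and the last valid txid. */
    static final class Scan {
        final long count;
        final long lastTxId; // equals firstTxId - 1 when the log holds no valid txn
        Scan(long count, long lastTxId) {
            this.count = count;
            this.lastTxId = lastTxId;
        }
    }

    /** Appends one record: 8-byte txid, 4-byte length, then the payload. */
    static void writeRecord(DataOutputStream out, long txId, byte[] payload)
            throws IOException {
        out.writeLong(txId);
        out.writeInt(payload.length);
        out.write(payload);
    }

    /**
     * Counts complete, in-sequence transactions starting at firstTxId.
     * Stops at end of stream, at a truncated record, or at a record whose
     * txid or length is not plausible (for example zero-filled padding).
     */
    static Scan countTransactions(InputStream raw, long firstTxId)
            throws IOException {
        DataInputStream in = new DataInputStream(raw);
        long count = 0;
        long expected = firstTxId;
        while (true) {
            try {
                long txId = in.readLong();
                int len = in.readInt();
                if (txId != expected || len < 0 || len > MAX_PAYLOAD) {
                    break; // padding or corruption: nothing after this is trusted
                }
                byte[] payload = new byte[len];
                in.readFully(payload); // throws EOFException on a partial record
                count++;
                expected++;
            } catch (EOFException e) {
                break; // clean end of log, or a partially written last record
            }
        }
        return new Scan(count, expected - 1);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeRecord(out, 1, new byte[] {10, 11});
        writeRecord(out, 2, new byte[] {12});
        out.write(new byte[64]); // simulate zero-filled preallocated space
        Scan scan = countTransactions(
                new ByteArrayInputStream(buf.toByteArray()), 1);
        System.out.println("transactions=" + scan.count
                + " lastTxId=" + scan.lastTxId
                + " fileLength=" + buf.size());
    }
}
```

In this sketch the file length includes the 64 bytes of padding, so it says 
nothing about where the last good transaction ends, whereas the scan stops 
after the second record.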

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
