[ 
https://issues.apache.org/jira/browse/HDFS-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026940#comment-13026940
 ] 

Ivan Kelly commented on HDFS-1799:
----------------------------------

My primary criticism of this patch is that it introduces the JournalManager 
interface which is unnecessary in my view. It creates a intermediatory layer 
between FSEditLog and OutputStream. In one sense this is a straight refactor. 
You've just moved revert and divert into a different class. However, this 
straight refactor has required that you change almost every method in FSEditLog 
to use a list of JournalManagers instead of EditLogOutputStreams. 

JournalManager currently deals only with output. Its not clear how input of the 
streams will be handled. open(), close(), abort(), restore(), 
getCurrentStream(), setBufferCapacity() are all output only. How do you plan to 
add input methods to this? If you don't, then you've added a new abstraction 
for output streams, which should be rolled directly into output stream. 

Lastly, this patch doesn't hide the fact that files are in use here. FSEditLog 
still knows that it's using edits and edits.new. Are different filenames ever 
used? The only other place I know of is the Backup spool, which is disabled on 
this branch.

In summary, this patch...
   1. Changes too much code.
   2. Unclear how input functions will be added in the future.
   3. Doesn't hide the fact that files are still in use.

Sorry for being a pain in the ass about this. I expect you're quite frustrated 
by it. I do think there needs to be unified view of how this will look in the 
future though, as getting the right abstractions now will prevent a lot of pain 
later. Regarding LogSegment, I'm still on the fence about this. I see some of 
the merits in it, and I can see why JournalMananger is possibly needed for it. 
If you retain the list of EditLogOutputStreams as it is now, you'll have to 
maintain a separate list of some other objects for the creation of new log 
segments. This could be problematic as these lists need to be kept in sync for 
error handling etc. 


> Refactor log rolling and filename management out of FSEditLog
> -------------------------------------------------------------
>
>                 Key: HDFS-1799
>                 URL: https://issues.apache.org/jira/browse/HDFS-1799
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: 0001-Added-state-management-to-FSEditLog.patch, 
> 0002-Standardised-error-pattern.patch, 
> 0003-Add-JournalFactory-and-move-divert-revert-out-of-FSE.patch, 
> HDFS-1799-all.diff, hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt
>
>
> This is somewhat similar to HDFS-1580, but less ambitious. While that JIRA 
> focuses on pluggability, this task is simply the minimum needed for HDFS-1073:
> - Refactor the filename-specific code for rolling, diverting, and reverting 
> log streams out of FSEditLog into a new class
> - Clean up the related code in FSEditLog a bit
> Notably, this JIRA is going to temporarily break the BackupNode. I plan to 
> circle back on the BackupNode later on this branch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to