[ 
https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041914#comment-13041914
 ] 

Jitendra Nath Pandey commented on HDFS-1580:
--------------------------------------------

@Sanjay
>Jitendra had mentioned to me why he preferred the getNumTransaction(sinceTx) 
>but I forget the reason.
 getNumTransaction(sinceTx) will throw an exception if it sees a gap (in the 
sequence of transactions, due to an earlier failure of the journal) after 
sinceTx. It will return a number only if the journal can actually serve 
that many transactions starting from sinceTx.
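
A minimal sketch of that behaviour, assuming the journal is a set of 
contiguous segments identified by first and last transaction id. The class 
name, the Segment type and the use of an unchecked IllegalStateException are 
illustrative assumptions, not the interface proposed in the attached patches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GapCheckSketch {
    /** Hypothetical contiguous run of transactions [firstTxId, lastTxId]. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;

        Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /**
     * Counts the transactions that can be served starting at sinceTxId.
     * Throws if the sequence has a gap at or after sinceTxId, so a returned
     * number is always one the journal can actually deliver.
     */
    static long getNumTransactions(List<Segment> segments, long sinceTxId) {
        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong((Segment s) -> s.firstTxId));

        long next = sinceTxId; // next transaction id we expect to find
        for (Segment s : sorted) {
            if (s.lastTxId < next) {
                continue; // segment lies entirely before the requested range
            }
            if (s.firstTxId > next) {
                // Assumption: unchecked exception used only to keep the sketch short.
                throw new IllegalStateException("Gap in journal: expected txid "
                        + next + " but next segment starts at " + s.firstTxId);
            }
            next = s.lastTxId + 1;
        }
        return next - sinceTxId;
    }
}
```

With segments [1,10] and [11,20], a call with sinceTxId 5 counts transactions 
5 through 20. If a third segment [31,40] is present, the same call throws 
because transactions 21 through 30 are missing.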

@Ivan
>Finalizing in getNumTransactions is a bit messy.
 getNumTransactions will also be called by readers of edit logs. Finalize or 
recover should happen only in the context of the writer. I think finalization 
might make sense at the creation of the output stream. For example, finalize 
the edit logs when the namenode comes back up after a crash and opens an 
output stream for writing. A separate recover method in the interface may 
also be useful.
  There are two distinct cases where getNumTransactions can be used:
  (a) At namenode startup, or at the backup on failover: 
       In this case the in_progress file must be read to capture all the 
transactions. This is the context of the writer.
  (b) Checkpointer, backup (non-failover case) or any other reader:
       In this case the in_progress file can be ignored, and the checkpoint 
covers only up to the last rolled/finalized edit log file. This is the 
context of a reader.
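
To illustrate the writer/reader split, here is a rough in-memory sketch in 
which recovery runs only when the writer opens an output stream, while the 
reader-side query never changes journal state. All names (WriterRecoverySketch, 
recover, openForWrite, lastFinalizedTxId) are hypothetical and chosen for this 
example only.

```java
import java.util.ArrayList;
import java.util.List;

final class WriterRecoverySketch {
    /** Hypothetical edit log segment; inProgress means not yet finalized. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;
        boolean inProgress;

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    private final List<Segment> segments = new ArrayList<>();

    void addFinalized(long firstTxId, long lastTxId) {
        segments.add(new Segment(firstTxId, lastTxId, false));
    }

    /** Simulates a segment left behind by a crashed writer. */
    void addInProgress(long firstTxId, long lastWrittenTxId) {
        segments.add(new Segment(firstTxId, lastWrittenTxId, true));
    }

    /** Writer context only: finalize whatever a crash left in progress. */
    void recover() {
        for (Segment s : segments) {
            s.inProgress = false;
        }
    }

    /** Case (a): the writer recovers first, then starts a new segment. */
    long openForWrite() {
        recover();
        long nextTxId = segments.isEmpty()
                ? 1
                : segments.get(segments.size() - 1).lastTxId + 1;
        // An empty in-progress segment has lastTxId == firstTxId - 1.
        segments.add(new Segment(nextTxId, nextTxId - 1, true));
        return nextTxId;
    }

    /** Case (b): a reader looks only at finalized segments and mutates nothing. */
    long lastFinalizedTxId() {
        long last = 0;
        for (Segment s : segments) {
            if (!s.inProgress) {
                last = Math.max(last, s.lastTxId);
            }
        }
        return last;
    }
}
```

With a finalized segment [1,10] and a crashed in_progress segment holding 
11 through 15, a reader sees 10 as the last finalized transaction until the 
writer calls openForWrite(), after which it sees 15.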

 I think we have the following options:
  1) getNumTransactions reads the in_progress file in both cases, up to 
whatever can be read successfully. Caveat: should the checkpointer download 
the in_progress file as well?
  2) Don't read the in_progress file. Handle case (a) by first calling a 
'recover' method that finalizes the edit logs, and handle case (b) by rolling 
the edit logs.
  3) Have two separate methods, one that counts the in_progress file and one 
that doesn't.

It seems to me option (1) is the simplest. The checkpointer doesn't need to 
download the in_progress file; however, with shared NFS storage it can read 
the in_progress file too.
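
As a sketch of what "up to whatever can be read successfully" could mean in 
option (1), the example below counts records in an in_progress file and stops 
at the first truncated or out-of-sequence record. The record format (a bare 
8-byte transaction id per record) and the name countReadable are assumptions 
made for this example; the real edit log format is more involved.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class InProgressCountSketch {
    /**
     * Counts consecutive readable transactions starting at firstTxId.
     * Assumed toy format: each record is just its txid as a big-endian long.
     */
    static long countReadable(byte[] inProgressBytes, long firstTxId) {
        long expected = firstTxId;
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(inProgressBytes))) {
            while (true) {
                long txId = in.readLong(); // EOFException at end or on a partial record
                if (txId != expected) {
                    break; // out-of-sequence record: stop counting here
                }
                expected++;
            }
        } catch (IOException e) {
            // End of file, or a record cut short by a crash: keep what was read.
        }
        return expected - firstTxId;
    }
}
```

A file holding complete records for transactions 11, 12 and 13 followed by a 
few trailing bytes of a half-written record would be counted as three 
transactions.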

> Add interface for generic Write Ahead Logging mechanisms
> --------------------------------------------------------
>
>                 Key: HDFS-1580
>                 URL: https://issues.apache.org/jira/browse/HDFS-1580
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ivan Kelly
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: EditlogInterface.1.pdf, EditlogInterface.2.pdf, 
> HDFS-1580+1521.diff, HDFS-1580.diff, HDFS-1580.diff, HDFS-1580.diff, 
> generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.pdf, 
> generic_wal_iface.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
