[ https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041914#comment-13041914 ]
Jitendra Nath Pandey commented on HDFS-1580:
--------------------------------------------

@Sanjay
>Jitendra had mentioned to me why he preferred the getNumTransaction(sinceTx) but I forget the reason.

getNumTransaction(sinceTx) will throw an exception if it sees a gap after sinceTx (a break in the sequence of transactions caused by an earlier failure of the journal). It will return a number only if the journal can actually serve that many transactions starting from sinceTx.

@Ivan
>Finalizing in getNumTransactions is a bit messy.

getNumTransactions will also be called by readers of edit logs. Finalize or recover should happen only in the context of the writer. I think finalization might make sense at the creation of the output stream. For example, finalize the edit logs when the namenode comes back up after a crash and opens an output stream for writing. A separate recover method in the interface may also be useful.

There are two distinct cases where getNumTransactions can be used:
(a) At namenode startup, or at the backup during failover: in this case the in_progress file must be read to capture all the transactions. This is the context of the writer.
(b) Checkpointer, backup (non-failover case) or any other reader: in this case the in_progress file can be ignored, and the checkpoint covers only up to the last rolled/finalized edit log file. This is the context of a reader.

I think we have the following options:
1) getNumTransactions reads the in_progress file in both cases, up to whatever can be read successfully. Caveat: should the checkpointer download the in_progress file as well?
2) Don't read the in_progress file. Handle case (a) by first calling a 'recover' method that finalizes the edit logs, and handle case (b) by rolling the edit logs.
3) Have two separate methods: one that counts the in_progress file and one that doesn't.

It seems to me option (1) is the simplest. The checkpointer doesn't need to download the in_progress file; however, for shared NFS storage it can read the in_progress file too.
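To make the gap semantics and the reader/writer distinction concrete, here is a minimal sketch in Java. It is not the interface proposed in the attached design documents: the class JournalSketch, the Segment type, and the includeInProgress flag are all hypothetical names introduced for illustration, and the flag merely stands in for the choice between counting and ignoring the in_progress file.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the HDFS-1580 patches.
class JournalSketch {
    /** One edit log file, described by an inclusive range of transaction ids. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress; // true for the unfinalized in_progress file

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    private final List<Segment> segments = new ArrayList<>();

    void addSegment(long firstTxId, long lastTxId, boolean inProgress) {
        segments.add(new Segment(firstTxId, lastTxId, inProgress));
        // Keep segments ordered by their first transaction id.
        segments.sort((a, b) -> Long.compare(a.firstTxId, b.firstTxId));
    }

    /**
     * Counts the transactions that can be served contiguously starting at
     * sinceTxId. Throws if a gap is seen after sinceTxId, so a returned
     * number always means "this many transactions are actually available".
     */
    long getNumTransactions(long sinceTxId, boolean includeInProgress)
            throws IOException {
        long expected = sinceTxId; // next transaction id we need to find
        for (Segment s : segments) {
            if (s.inProgress && !includeInProgress) {
                continue; // reader context: ignore the in_progress file
            }
            if (s.lastTxId < expected) {
                continue; // segment lies entirely before the requested range
            }
            if (s.firstTxId > expected) {
                throw new IOException("Gap in transactions: expected txid "
                        + expected + " but next segment starts at " + s.firstTxId);
            }
            expected = s.lastTxId + 1;
        }
        return expected - sinceTxId;
    }

    public static void main(String[] args) throws IOException {
        JournalSketch journal = new JournalSketch();
        journal.addSegment(1, 100, false);  // finalized edit log
        journal.addSegment(101, 150, true); // in_progress edit log
        System.out.println(journal.getNumTransactions(1, true));  // prints 150
        System.out.println(journal.getNumTransactions(1, false)); // prints 100
    }
}
```

In this sketch a writer at startup would pass includeInProgress = true (case (a)), while a checkpointer would pass false (case (b)). Under option (1) the flag would disappear and the in_progress file would always be counted.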
> Add interface for generic Write Ahead Logging mechanisms
> --------------------------------------------------------
>
>                 Key: HDFS-1580
>                 URL: https://issues.apache.org/jira/browse/HDFS-1580
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ivan Kelly
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: EditlogInterface.1.pdf, EditlogInterface.2.pdf, HDFS-1580+1521.diff, HDFS-1580.diff, HDFS-1580.diff, HDFS-1580.diff, generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.txt
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira