[ 
https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039769#comment-13039769
 ] 

Ivan Kelly commented on HDFS-1580:
----------------------------------

I've been working on implementing the design for HDFS-1580 on top of the 
HDFS-1073 branch and have run into a problem with 
#getNumberOfTransactions().

Specifically, I've been working on the input code in FSImage:
{code}
   protected boolean loadEdits(JournalManager journal) throws IOException {
     LOG.debug("About to load edits:\n  " + journal);

     FSEditLogLoader loader = new FSEditLogLoader(namesystem);
     long startingTxId = storage.getMostRecentCheckpointTxId() + 1;
     int numLoaded = 0;
     // Load latest edits

     long numTransactionsToLoad =
         journal.getNumberOfTransactions(startingTxId);

     while (numLoaded < numTransactionsToLoad) {
       EditLogInputStream editIn = journal.getInputStream(startingTxId);
       LOG.debug("Reading " + editIn + " expecting start txid #" +
           startingTxId);

       int thisNumLoaded = loader.loadFSEdits(editIn, startingTxId);

       startingTxId += thisNumLoaded;
       numLoaded += thisNumLoaded;
       editIn.close();
     }

     // update the counts
     getFSNamesystem().dir.updateCountForINodeWithQuota();

     // update the txid for the edit log
     editLog.setNextTxId(storage.getMostRecentCheckpointTxId() +
         numLoaded + 1);

     // If we loaded any edits, need to save.
     return numLoaded > 0;
   }
{code}

The load is in a loop now, as the output is still in LogSegment form, 
but even in a single stream implementation getNumberOfTransactions() 
presents a problem.

The problem is that sometimes it is impossible to return a number for 
getNumberOfTransactions(). This happens when the NameNode has crashed in 
the middle of writing an edit log. The edit log is named 
edits_inprogress_N, where N is the first transaction id in the edit log. 
But since the NN crashed, we don't know the last transaction, so we 
cannot return the number of transactions in the journal without scanning 
the file from the start.

Without getNumberOfTransactions() it's difficult to choose which journal 
has the most edits. HDFS-1073 uses the number of bytes in the file, but 
this doesn't feel very safe for anything that isn't a file. What's more, 
if the start transactions of two journal snippets are out of sync, then 
it becomes impossible to choose which journal has the most transactions 
using just file size. (This is an argument for log segments.)
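
To illustrate the out-of-sync case, here is a minimal sketch. The 
JournalChooser and Segment classes, their fields and the byte sizes are 
all hypothetical, made up for the example; none of this is code from the 
HDFS-1073 branch. It compares picking a journal by size with picking it 
by the number of transactions available from the txid we need.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of HDFS.
final class JournalChooser {

    /** Hypothetical summary of one journal snippet. */
    static final class Segment {
        final String name;
        final long firstTxId;
        final long lastTxId;
        final long sizeBytes;

        Segment(String name, long firstTxId, long lastTxId, long sizeBytes) {
            this.name = name;
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.sizeBytes = sizeBytes;
        }

        /** Transactions this segment can supply starting at startingTxId. */
        long transactionsFrom(long startingTxId) {
            if (startingTxId < firstTxId || startingTxId > lastTxId) {
                return 0;
            }
            return lastTxId - startingTxId + 1;
        }
    }

    /** Size-based choice: the largest segment wins, wherever it starts. */
    static Segment pickBySize(List<Segment> candidates) {
        Segment best = null;
        for (Segment s : candidates) {
            if (best == null || s.sizeBytes > best.sizeBytes) {
                best = s;
            }
        }
        return best;
    }

    /** Txid-based choice: the segment with most transactions from startingTxId. */
    static Segment pickByTransactions(List<Segment> candidates, long startingTxId) {
        Segment best = null;
        for (Segment s : candidates) {
            if (best == null
                    || s.transactionsFrom(startingTxId) > best.transactionsFrom(startingTxId)) {
                best = s;
            }
        }
        return best;
    }
}
```

For example, take a snippet covering txids 1 to 100 in 10000 bytes and 
another covering txids 51 to 130 in 8000 bytes. If we need to load from 
txid 51, the first is the larger file but supplies only 50 transactions, 
while the second supplies 80.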

The simplest solution I see is to actually scan the _inprogress file 
from the start to get the last transaction written. As this should only 
be needed after a NameNode crash, the delay for doing this shouldn't be 
prohibitive.
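
A rough sketch of such a scan follows. It assumes a simplified record 
layout (8-byte txid, 4-byte payload length, then the payload), which is 
NOT the real edit log format, and the InProgressScanner class and its 
MAX_RECORD_BYTES limit are hypothetical names. The idea is just to read 
records until the stream ends or stops making sense, and to report the 
last txid that was read completely.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical scanner for illustration only; not part of HDFS.
final class InProgressScanner {

    // Assumed sanity limit so a corrupt length cannot trigger a huge allocation.
    static final int MAX_RECORD_BYTES = 1 << 20;

    /**
     * Scans a log whose first record should carry firstTxId and returns the
     * txid of the last complete record, or firstTxId - 1 if there is none.
     */
    static long findLastTxId(InputStream in, long firstTxId) throws IOException {
        DataInputStream data = new DataInputStream(in);
        long lastTxId = firstTxId - 1;
        try {
            while (true) {
                long txId = data.readLong();
                int length = data.readInt();
                if (txId != lastTxId + 1 || length < 0 || length > MAX_RECORD_BYTES) {
                    break; // gap or garbage: treat it as the end of the valid log
                }
                data.readFully(new byte[length]); // throws EOFException if truncated
                lastTxId = txId;
            }
        } catch (EOFException e) {
            // Clean end of file or a record cut short by the crash; stop here.
        }
        return lastTxId;
    }

    /** Number of complete transactions in the log, starting at firstTxId. */
    static long countTransactions(InputStream in, long firstTxId) throws IOException {
        return findLastTxId(in, firstTxId) - firstTxId + 1;
    }
}
```

With something like this, getNumberOfTransactions() could fall back to 
countTransactions() for an edits_inprogress_N file, using N as the first 
txid, and use the cheaper path for finalized logs.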


> Add interface for generic Write Ahead Logging mechanisms
> --------------------------------------------------------
>
>                 Key: HDFS-1580
>                 URL: https://issues.apache.org/jira/browse/HDFS-1580
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ivan Kelly
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: EditlogInterface.1.pdf, EditlogInterface.2.pdf, 
> HDFS-1580+1521.diff, HDFS-1580.diff, HDFS-1580.diff, HDFS-1580.diff, 
> generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.pdf, 
> generic_wal_iface.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
