[ 
https://issues.apache.org/jira/browse/HDFS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247871#comment-13247871
 ] 

Todd Lipcon commented on HDFS-3212:
-----------------------------------

bq. Is it the case that JN will reject it since the old NN has a smaller epoch?

Right -- that's why the epoch needs to be persisted, IMO.

bq. 2. might be less optimal because now it consists of 2 operations. 1) 
rolling the log and creating a new segment 2) updating a metadata file.

I think it's just a matter of getting the ordering right. Before starting a log 
segment, you need to fence prior writers. The fencing step is what writes down 
the epoch. Then, when you create a new log segment, you tag it with that epoch 
(e.g. by storing it in a per-epoch directory, or by writing a metadata file next 
to it before you create the segment file). I think this is sufficiently atomic.
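
To make the ordering concrete, here is a rough sketch of "fence first, then tag the segment". Everything in it is hypothetical and for illustration only: the class EpochFencedJournal, the file name last-promised-epoch, and the epoch-N directory layout are assumptions, not the JournalService code. It also skips fsync of the file and directory, which a real implementation would need.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of epoch fencing; not the actual JournalService. */
class EpochFencedJournal {
    private final Path dir;
    private final Path epochFile;
    private long promisedEpoch;

    EpochFencedJournal(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
        this.epochFile = dir.resolve("last-promised-epoch"); // assumed file name
        // Reload the persisted epoch so a restart cannot "forget" a fence.
        this.promisedEpoch = Files.exists(epochFile)
            ? Long.parseLong(new String(Files.readAllBytes(epochFile),
                                        StandardCharsets.UTF_8).trim())
            : 0L;
    }

    /** Step 1: fence prior writers by writing down the new epoch. */
    synchronized void fence(long epoch) throws IOException {
        if (epoch <= promisedEpoch) {
            throw new IOException("epoch " + epoch
                + " is not newer than promised epoch " + promisedEpoch);
        }
        // Write to a temp file, then rename, so the epoch file is never half-written.
        Path tmp = dir.resolve("last-promised-epoch.tmp");
        Files.write(tmp, Long.toString(epoch).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, epochFile, StandardCopyOption.REPLACE_EXISTING,
                   StandardCopyOption.ATOMIC_MOVE);
        promisedEpoch = epoch;
    }

    /** Step 2: create the segment, tagged by living in a per-epoch directory. */
    synchronized Path startLogSegment(long epoch, long firstTxId) throws IOException {
        if (epoch != promisedEpoch) {
            // A fenced (older) writer ends up here and is rejected.
            throw new IOException("writer epoch " + epoch
                + " does not match promised epoch " + promisedEpoch);
        }
        Path epochDir = Files.createDirectories(dir.resolve("epoch-" + epoch));
        return Files.createFile(epochDir.resolve("edits_inprogress_" + firstTxId));
    }
}
```

With this shape, an old NN that still holds epoch 1 gets an IOException from startLogSegment() once a newer writer has called fence(2), and that stays true across a restart because the epoch is reloaded from disk.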

bq. So 2 edit logs with same txid but can be differentiated using epochs

I've had another idea, which I want to write up in the design doc. Basically, 
I think we can solve this problem more simply as follows:
- Currently, when FSEditLog starts a new segment, it calls 
journal.startLogSegment(), then journal.logEdit(StartLogSegmentOp), then 
journal.logSync(). So there is a point in time when the log segment is empty, 
with no transactions. If we instead made the startLogSegment() call 
responsible for atomically writing the first transaction (and only the first), 
then we might not have a problem. We would just have to add the restriction 
that the first transaction of any segment is always deterministic (e.g. just 
START_LOG_SEGMENT(txid) and nothing else).
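
One possible way to get that atomicity, sketched very roughly: write the START_LOG_SEGMENT record into a temporary file, sync it, and only then rename it to the segment name, so no reader ever sees an empty segment. The class name AtomicSegmentStarter, the opcode value, the 9-byte record layout, and the file names are all assumptions made for this sketch; they are not the real edit log format or the journal API.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: a segment that is never visible without its first txn. */
class AtomicSegmentStarter {
    // Placeholder opcode value, assumed for illustration only.
    static final byte OP_START_LOG_SEGMENT = 24;

    static Path startLogSegment(Path dir, long firstTxId) throws IOException {
        Files.createDirectories(dir);
        Path tmp = dir.resolve("edits_inprogress_" + firstTxId + ".tmp");
        Path segment = dir.resolve("edits_inprogress_" + firstTxId);

        // The first record depends only on the txid, so every journal that
        // starts this segment writes byte-identical content.
        ByteBuffer buf = ByteBuffer.allocate(1 + Long.BYTES);
        buf.put(OP_START_LOG_SEGMENT).putLong(firstTxId);
        buf.flip();

        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // make the record durable before it becomes visible
        }

        // Publish the segment in one step: it appears already containing
        // START_LOG_SEGMENT(txid), never empty.
        Files.move(tmp, segment, StandardCopyOption.ATOMIC_MOVE);
        return segment;
    }
}
```

Because the first record is a pure function of the txid, two segments that start at the same txid begin with identical bytes no matter which writer created them, which is the property the restriction above is meant to provide.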

Let me revise the design doc in HDFS-3077 with this idea to see if it works 
when fully fleshed out.

                
> Persist the epoch received by the JournalService
> ------------------------------------------------
>
>                 Key: HDFS-3212
>                 URL: https://issues.apache.org/jira/browse/HDFS-3212
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: Shared journals (HDFS-3092)
>            Reporter: Suresh Srinivas
>
> epoch received over JournalProtocol should be persisted by JournalService.
