[ https://issues.apache.org/jira/browse/HDFS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247871#comment-13247871 ]
Todd Lipcon commented on HDFS-3212:
-----------------------------------

bq. Is it the case that JN will reject it since the old NN has a smaller epoch?

Right -- that's why it needs to persist, IMO.

bq. 2. might be less optimal because now it consists of 2 operations. 1) rolling the log and creating a new segment 2) updating a metadata file.

I think it's just a matter of getting the ordering right. Before starting a log segment, you need to fence prior writers. The fencing step is what writes down the epoch. Then, when you create a new log segment, you tag it (e.g. by storing it in a directory per-epoch, or by writing a metadata file next to it before you create the file). I think this is sufficiently atomic.

bq. So 2 edit logs with same txid but can be differentiated using epochs

I've had another idea which I want to write up in the design doc. But, basically, I think we can solve this problem more simply by the following:

- Currently, when FSEditLog starts a new segment, it calls journal.startLogSegment(), then journal.logEdit(StartLogSegmentOp), then journal.logSync(). So there is a point in time when the log segment is empty, with no transactions. If instead we changed it so that the startLogSegment() call was responsible for writing the first transaction (and only the first), atomically, then we might not have a problem. We just have to make the restriction that the first transaction of any segment is always deterministic (e.g. just START_LOG_SEGMENT(txid) and nothing else).

Let me revise the design doc in HDFS-3077 with this idea to see if it works when fully fleshed out.
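
A rough sketch of the ordering described above: fence first (which persists the epoch), then create the segment in a per-epoch directory with its deterministic first transaction already in it. This is only an illustration, not the actual JournalService code. The class name EpochJournalSketch, the file names "epoch" and "edits_<txid>", and the text encoding of START_LOG_SEGMENT are all made up for the example. It also assumes the local filesystem supports atomic rename, and it leaves out fsync, which a real implementation would need for durability.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical journal-side sketch; names and on-disk layout are assumptions. */
class EpochJournalSketch {
    private final Path dir;
    private long promisedEpoch;

    EpochJournalSketch(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
        Path epochFile = dir.resolve("epoch");
        // Reload the persisted epoch so a restart does not forget the fence.
        this.promisedEpoch = Files.exists(epochFile)
            ? Long.parseLong(new String(Files.readAllBytes(epochFile),
                                        StandardCharsets.UTF_8).trim())
            : 0L;
    }

    synchronized long promisedEpoch() {
        return promisedEpoch;
    }

    /** Fencing step: write the new epoch to disk before accepting it. */
    synchronized void fence(long epoch) throws IOException {
        if (epoch <= promisedEpoch) {
            throw new IOException("epoch " + epoch + " is not newer than " + promisedEpoch);
        }
        writeAtomically(dir.resolve("epoch"), Long.toString(epoch));
        promisedEpoch = epoch;
    }

    /**
     * Creates the segment under a per-epoch directory. The file only becomes
     * visible under its final name once it already holds the first transaction,
     * so there is no window in which the segment exists but is empty.
     */
    synchronized Path startLogSegment(long epoch, long txid) throws IOException {
        if (epoch != promisedEpoch) {
            // A writer holding an older epoch has been fenced and is rejected here.
            throw new IOException("epoch " + epoch + " does not match " + promisedEpoch);
        }
        Path epochDir = Files.createDirectories(dir.resolve("epoch-" + epoch));
        Path segment = epochDir.resolve("edits_" + txid);
        if (Files.exists(segment)) {
            throw new IOException("segment already exists: " + segment);
        }
        // The first transaction is deterministic: START_LOG_SEGMENT(txid) and nothing else.
        writeAtomically(segment, "START_LOG_SEGMENT(" + txid + ")\n");
        return segment;
    }

    /** Writes to a temporary sibling file, then renames it into place. */
    private static void writeAtomically(Path target, String content) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

With this layout, two segments that start at the same txid end up in different epoch-N directories, and a writer that was fenced by a later epoch gets an IOException from startLogSegment() instead of creating a competing segment.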
> Persist the epoch received by the JournalService
> ------------------------------------------------
>
>                 Key: HDFS-3212
>                 URL: https://issues.apache.org/jira/browse/HDFS-3212
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: Shared journals (HDFS-3092)
>            Reporter: Suresh Srinivas
>
> epoch received over JournalProtocol should be persisted by JournalService.