[ 
https://issues.apache.org/jira/browse/HDFS-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-11292:
---------------------------------
    Attachment: HDFS-11292.002.patch

> log lastWrittenTxId in logSyncAll
> ---------------------------------
>
>                 Key: HDFS-11292
>                 URL: https://issues.apache.org/jira/browse/HDFS-11292
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-11292.001.patch, HDFS-11292.002.patch
>
>
> For the issue reported in HDFS-10943, the problem persists even after the 
> fix from HDFS-7964 is included, which suggests there may be a 
> synchronization issue.
> To diagnose that, this jira is created to report the lastWrittenTxId in the 
> {{logSyncAll()}} call, so that it can be compared against the error message 
> reported in HDFS-7964.
> Specifically, there are two possibilities for the HDFS-10943 issue:
> 1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all 
> requested txs for some reason.
> 2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked 
> in between A and B. It has been observed that the lastWrittenTxId values in 
> B and C are the same.
> This proposed reporting would help confirm whether 2 is true.
> {code}
>  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
>     LOG.info("Ending log segment " + curSegmentTxId);
>     Preconditions.checkState(isSegmentOpen(),
>         "Bad state: %s", state);
>     if (writeEndTxn) {
>       logEdit(LogSegmentOp.getInstance(cache.get(),
>           FSEditLogOpCodes.OP_END_LOG_SEGMENT));
>     }
>     // always sync to ensure all edits are flushed.
> A.    logSyncAll();
> B.    printStatistics(true);
>     final long lastTxId = getLastWrittenTxId();
>     try {
> C.      journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
>       editLogStream = null;
>     } catch (IOException e) {
>       //All journals have failed, it will be handled in logSync.
>     }
>     state = State.BETWEEN_LOG_SEGMENTS;
>   }
> {code}
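
The diagnostic idea above can be sketched with a toy model. This is not the 
attached patch and not the real FSEditLog: the class ToyEditLog, its fields, 
and the method finalizeSegment() are hypothetical names invented for 
illustration, and the "late" edit is simulated in a single thread rather than 
produced by a real race. It only shows how recording the txid covered by 
logSyncAll() lets one tell whether a later lastWrittenTxId has moved past it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an edit log; hypothetical, not the HDFS FSEditLog class. */
class ToyEditLog {
    private long lastWrittenTxId = 0; // last txid handed out to a writer
    final List<String> messages = new ArrayList<>(); // stand-in for LOG.info

    /** Assigns the next txid to a new edit. */
    synchronized long logEdit() {
        return ++lastWrittenTxId;
    }

    synchronized long getLastWrittenTxId() {
        return lastWrittenTxId;
    }

    /** Pretends to flush all edits written so far and records the txid covered. */
    synchronized long logSyncAll() {
        long syncedUpTo = lastWrittenTxId;
        messages.add("logSyncAll lastWrittenTxId=" + syncedUpTo);
        return syncedUpTo;
    }

    /**
     * Compares the txid seen at finalize time with the txid covered by the sync.
     * Returns true when no edit was written after the sync (possibility 2 ruled out).
     */
    boolean finalizeSegment(long syncedUpTo) {
        long lastTxId = getLastWrittenTxId();
        messages.add("finalize lastTxId=" + lastTxId + " syncedUpTo=" + syncedUpTo);
        return lastTxId == syncedUpTo;
    }

    public static void main(String[] args) {
        // Case 1: nothing is written between sync and finalize.
        ToyEditLog quiet = new ToyEditLog();
        quiet.logEdit();
        quiet.logEdit();
        System.out.println(quiet.finalizeSegment(quiet.logSyncAll()));

        // Case 2: one edit is written after the sync (simulated, not a real race).
        ToyEditLog busy = new ToyEditLog();
        busy.logEdit();
        long synced = busy.logSyncAll();
        busy.logEdit();
        System.out.println(busy.finalizeSegment(synced));
        busy.messages.forEach(System.out::println);
    }
}
```

If the two logged values differ in a real run, that would point toward 
possibility 2; if they match and the error still appears, possibility 1 
becomes the more likely explanation.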


