Yongjun Zhang created HDFS-11292:
------------------------------------

             Summary: log lastWrittenTxId in logSyncAll
                 Key: HDFS-11292
                 URL: https://issues.apache.org/jira/browse/HDFS-11292
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs
            Reporter: Yongjun Zhang

For the issue reported in HDFS-10943, the problem still exists even after the HDFS-7964 fix is included, which suggests there might be a synchronization issue.

To diagnose it, this jira proposes reporting the lastWrittenTxId in the {{logSyncAll()}} call, so that we can compare it against the error message reported in HDFS-7964.

Specifically, there are two possibilities for the HDFS-10943 issue:

1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all requested txs for some reason.
2. {{logSyncAll()}} does flush all requested txs, but some new txs sneaked in between A and B. It's observed that the lastWrittenTxId in B and C are the same.

The proposed reporting would help confirm whether 2 is true.

{code}
  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
    LOG.info("Ending log segment " + curSegmentTxId);
    Preconditions.checkState(isSegmentOpen(),
        "Bad state: %s", state);

    if (writeEndTxn) {
      logEdit(LogSegmentOp.getInstance(cache.get(),
          FSEditLogOpCodes.OP_END_LOG_SEGMENT));
    }
    // always sync to ensure all edits are flushed.
A.  logSyncAll();

B.  printStatistics(true);

    final long lastTxId = getLastWrittenTxId();

    try {
C.    journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
      editLogStream = null;
    } catch (IOException e) {
      //All journals have failed, it will be handled in logSync.
    }

    state = State.BETWEEN_LOG_SEGMENTS;
  }
{code}
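
As a rough illustration of how the proposed reporting could separate the two possibilities, here is a minimal, self-contained sketch. It is a toy model only and not the real FSEditLog code: {{ToyEditLog}}, its {{messages}} list, the {{txIdSeenBySyncAll}} field, and the {{Runnable}} hook that simulates activity between A and C are all assumptions made for the example. The idea is that if the txid reported inside {{logSyncAll()}} is lower than the one later passed to {{finalizeLogSegment()}}, then new txs arrived after the sync (possibility 2).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of an edit log; NOT the real FSEditLog API.
class ToyEditLog {
  private long txid = 0;               // last transaction id written
  private long txIdSeenBySyncAll = -1; // value reported inside logSyncAll()
  final List<String> messages = new ArrayList<>();

  // Assign the next transaction id to a new edit.
  synchronized long logEdit() {
    return ++txid;
  }

  synchronized long getLastWrittenTxId() {
    return txid;
  }

  // Statement A: report the last written txid at the time of the sync.
  synchronized void logSyncAll() {
    txIdSeenBySyncAll = txid;
    messages.add("logSyncAll: lastWrittenTxId=" + txIdSeenBySyncAll);
    // A real implementation would flush edits up to this txid here.
  }

  // Ends the segment. The hook simulates whatever happens between A and C.
  // Returns true if new txs appeared after logSyncAll() (possibility 2).
  synchronized boolean endCurrentLogSegment(Runnable betweenSyncAndFinalize) {
    logSyncAll();                               // A
    betweenSyncAndFinalize.run();               // window between A and C
    final long lastTxId = getLastWrittenTxId(); // value handed to C
    messages.add("finalizeLogSegment: lastTxId=" + lastTxId);
    return lastTxId != txIdSeenBySyncAll;
  }
}

class SyncAllTxIdSketch {
  public static void main(String[] args) {
    ToyEditLog log = new ToyEditLog();
    log.logEdit();
    log.logEdit();
    // Simulate one tx sneaking in after the sync.
    boolean sneakedIn = log.endCurrentLogSegment(() -> log.logEdit());
    for (String m : log.messages) {
      System.out.println(m);
    }
    System.out.println("new txs after logSyncAll: " + sneakedIn);
  }
}
```

If the two reported values match in a failing run, possibility 2 is ruled out for that run and attention would shift to possibility 1.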