Yongjun Zhang created HDFS-11292:
------------------------------------
Summary: log lastWrittenTxId in logSyncAll
Key: HDFS-11292
URL: https://issues.apache.org/jira/browse/HDFS-11292
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: Yongjun Zhang
For the issue reported in HDFS-10943, the problem persists even with the
HDFS-7964 fix included, which suggests there may be a synchronization issue.
To diagnose it, this JIRA proposes reporting the lastWrittenTxId in the
{{logSyncAll()}} call, so that it can be compared against the error message
reported in HDFS-7964.
Specifically, there are two possibilities for the HDFS-10943 issue:
1. {{logSyncAll()}} (statement A in the code quoted below) doesn't flush all
requested txs for some reason.
2. {{logSyncAll()}} does flush all requested txs, but some new txs sneak in
between A and B. It has been observed that the lastWrittenTxId values at B and C
are the same.
The proposed reporting would help confirm whether 2 is true.
{code}
  public synchronized void endCurrentLogSegment(boolean writeEndTxn) {
    LOG.info("Ending log segment " + curSegmentTxId);
    Preconditions.checkState(isSegmentOpen(),
        "Bad state: %s", state);
    if (writeEndTxn) {
      logEdit(LogSegmentOp.getInstance(cache.get(),
          FSEditLogOpCodes.OP_END_LOG_SEGMENT));
    }
    // always sync to ensure all edits are flushed.
A.  logSyncAll();
B.  printStatistics(true);
    final long lastTxId = getLastWrittenTxId();
    try {
C.    journalSet.finalizeLogSegment(curSegmentTxId, lastTxId);
      editLogStream = null;
    } catch (IOException e) {
      //All journals have failed, it will be handled in logSync.
    }
    state = State.BETWEEN_LOG_SEGMENTS;
  }
{code}
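The kind of reporting proposed above could look roughly like the following standalone sketch. It is not the FSEditLog implementation or the eventual patch: the class EditLogSketch, its fields, the message wording, and the Runnable hook that models activity between A and B are all hypothetical, and only the txid bookkeeping is modeled. The idea is that if the txid reported by the sync at A is lower than the lastTxId later passed to C, new txs arrived after the sync (possibility 2).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for FSEditLog: models only the txid bookkeeping,
// not the real journal I/O or locking.
class EditLogSketch {
    private long lastWrittenTxId = 0; // highest txid handed to the log
    private long lastSyncedTxId = 0;  // highest txid known to be flushed
    final List<String> messages = new ArrayList<>(); // stands in for LOG.info

    // Assigns the next txid to a new edit.
    synchronized long logEdit() {
        return ++lastWrittenTxId;
    }

    synchronized long getLastWrittenTxId() {
        return lastWrittenTxId;
    }

    // Statement A: flush everything written so far and report the txid reached.
    synchronized long logSyncAll() {
        lastSyncedTxId = lastWrittenTxId; // the real flush would happen here
        messages.add("logSyncAll: lastWrittenTxId=" + lastWrittenTxId
            + " lastSyncedTxId=" + lastSyncedTxId);
        return lastSyncedTxId;
    }

    // Runs A, then a hook modeling activity between A and B, then reads the
    // txid that would be passed to finalizeLogSegment (C).
    // Returns true if edits arrived after the sync.
    boolean endCurrentLogSegment(Runnable betweenAAndB) {
        long syncedAtA = logSyncAll();
        betweenAAndB.run();
        long lastTxId = getLastWrittenTxId();
        messages.add("finalizing: lastTxId=" + lastTxId);
        return lastTxId > syncedAtA;
    }

    public static void main(String[] args) {
        EditLogSketch log = new EditLogSketch();
        log.logEdit();
        // Simulate one edit sneaking in between A and B.
        boolean sneakedIn = log.endCurrentLogSegment(() -> log.logEdit());
        log.messages.forEach(System.out::println);
        System.out.println("new txs after sync: " + sneakedIn);
    }
}
```

With a log line like this at A, the two cases can be told apart from the logs alone: a txid at A lower than the one used at C points to possibility 2, while equal values would point back toward possibility 1.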