Kiran Kumar Maturi created HBASE-28665:
------------------------------------------
Summary: WALs not marked closed when there are errors in closing WALs
Key: HBASE-28665
URL: https://issues.apache.org/jira/browse/HBASE-28665
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 2.5.8
Reporter: Kiran Kumar Maturi
Assignee: Kiran Kumar Maturi
In our production clusters we have observed that when a WAL close fails, the
old WAL files are not marked as closed, which prevents them from being cleaned
up. When a WAL close fails in closeWriter, it increments the error count:
{code:java}
Span span = Span.current();
try {
  span.addEvent("closing writer");
  writer.close();
  span.addEvent("writer closed");
} catch (IOException ioe) {
  int errors = closeErrorCount.incrementAndGet();
  boolean hasUnflushedEntries = isUnflushedEntries();
  if (syncCloseCall && (hasUnflushedEntries || (errors > this.closeErrorsTolerated))) {
    LOG.error("Close of WAL " + path + " failed. Cause=\"" + ioe.getMessage() + "\", errors="
      + errors + ", hasUnflushedEntries=" + hasUnflushedEntries);
    throw ioe;
  }
  LOG.warn("Riding over failed WAL close of " + path
    + "; THIS FILE WAS NOT CLOSED BUT ALL EDITS SYNCED SO SHOULD BE OK", ioe);
}
{code}
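To make the accounting above easier to follow, here is a minimal, self-contained sketch of the same decision. It is a simplified model and not HBase code: the class name CloseErrorModel and the errors() accessor are hypothetical, and the tolerance of 2 used in main is an assumption chosen for illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified, hypothetical model of the close-error accounting; not HBase code. */
class CloseErrorModel {
  private final AtomicInteger closeErrorCount = new AtomicInteger();
  private final int closeErrorsTolerated;

  CloseErrorModel(int closeErrorsTolerated) {
    this.closeErrorsTolerated = closeErrorsTolerated;
  }

  /** Hypothetical accessor, added only so the counter can be inspected. */
  int errors() {
    return closeErrorCount.get();
  }

  /** Mirrors the catch block quoted above: every failed close bumps the counter. */
  void closeWriter(Closeable writer, boolean syncCloseCall, boolean hasUnflushedEntries)
      throws IOException {
    try {
      writer.close();
    } catch (IOException ioe) {
      int errors = closeErrorCount.incrementAndGet();
      if (syncCloseCall && (hasUnflushedEntries || errors > closeErrorsTolerated)) {
        throw ioe; // only a synchronous call can surface the failure
      }
      // Otherwise the failure is ridden over and the caller sees a normal return.
    }
  }

  public static void main(String[] args) {
    CloseErrorModel model = new CloseErrorModel(2); // assumed tolerance
    Closeable failing = () -> {
      throw new IOException("simulated close failure");
    };
    for (int attempt = 1; attempt <= 3; attempt++) {
      boolean sync = attempt == 3; // first two calls are asynchronous, the third is synchronous
      try {
        model.closeWriter(failing, sync, false);
        System.out.println("attempt " + attempt + ": ridden over, errors=" + model.errors());
      } catch (IOException e) {
        System.out.println("attempt " + attempt + ": rethrown, errors=" + model.errors());
      }
    }
  }
}
```

In this model the counter only ever grows, so after two asynchronous failures any later synchronous failure is rethrown.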
Once closing a WAL has failed just twice, doReplaceWALWriter enters this
code block:
{code:java}
if (isUnflushedEntries() || closeErrorCount.get() >= this.closeErrorsTolerated) {
  try {
    closeWriter(this.writer, oldPath, true);
  } finally {
    inflightWALClosures.remove(oldPath.getName());
  }
}
{code}
Here the WALs are not marked closed, unlike in the asynchronous path, which does mark them:
{code:java}
Writer localWriter = this.writer;
closeExecutor.execute(() -> {
  try {
    closeWriter(localWriter, oldPath, false);
  } catch (IOException e) {
    LOG.warn("close old writer failed", e);
  } finally {
    // call this even if the above close fails, as there is no other chance we can set
    // closed to true, it will not cause big problems.
    markClosedAndClean(oldPath); // the synchronous path above has no equivalent call
    inflightWALClosures.remove(oldPath.getName());
  }
});
{code}
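One possible direction, sketched below under stated assumptions, is to mark the WAL closed in the finally block of the synchronous path as well. This is an illustrative model only and not necessarily the change that should be committed: SyncCloseModel, closeSyncCurrent, closeSyncProposed and the closedWALs set are hypothetical names, and markClosedAndClean is reduced here to recording the WAL name.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of the synchronous close path; not the actual HBase code or patch. */
class SyncCloseModel {
  final Set<String> inflightWALClosures = ConcurrentHashMap.newKeySet();
  final Set<String> closedWALs = ConcurrentHashMap.newKeySet();

  /** Stand-in for the real method: here it only records that the WAL was marked closed. */
  void markClosedAndClean(String walName) {
    closedWALs.add(walName);
  }

  /** Synchronous path as quoted in the report: the WAL is never marked closed. */
  void closeSyncCurrent(Closeable writer, String walName) throws IOException {
    inflightWALClosures.add(walName);
    try {
      writer.close();
    } finally {
      inflightWALClosures.remove(walName);
    }
  }

  /** Possible change: mark closed in finally, as the asynchronous path already does. */
  void closeSyncProposed(Closeable writer, String walName) throws IOException {
    inflightWALClosures.add(walName);
    try {
      writer.close();
    } finally {
      markClosedAndClean(walName);
      inflightWALClosures.remove(walName);
    }
  }

  public static void main(String[] args) {
    SyncCloseModel model = new SyncCloseModel();
    Closeable failing = () -> {
      throw new IOException("simulated close failure");
    };
    try {
      model.closeSyncCurrent(failing, "wal.1");
    } catch (IOException e) {
      // expected: the simulated failure propagates to the caller
    }
    try {
      model.closeSyncProposed(failing, "wal.2");
    } catch (IOException e) {
      // expected: the failure still propagates, but the WAL was marked closed first
    }
    System.out.println("wal.1 marked closed: " + model.closedWALs.contains("wal.1"));
    System.out.println("wal.2 marked closed: " + model.closedWALs.contains("wal.2"));
  }
}
```

In both variants the exception still reaches the caller and the in-flight entry is removed. The only difference is whether the WAL ends up in the set of files that are eligible for cleaning.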
--
This message was sent by Atlassian Jira
(v8.20.10#820010)