Kiran Kumar Maturi created HBASE-28665:
------------------------------------------

             Summary: WALs not marked closed when there are errors in closing WALs
                 Key: HBASE-28665
                 URL: https://issues.apache.org/jira/browse/HBASE-28665
             Project: HBase
          Issue Type: Bug
          Components: wal
    Affects Versions: 2.5.8
            Reporter: Kiran Kumar Maturi
            Assignee: Kiran Kumar Maturi


In our production clusters we have observed that when a WAL close fails, the old WAL files are not marked as closed, which prevents them from being cleaned up.

When a WAL close fails in closeWriter, it increments the error count.
{code:java}
Span span = Span.current();
try {
  span.addEvent("closing writer");
  writer.close();
  span.addEvent("writer closed");
} catch (IOException ioe) {
  int errors = closeErrorCount.incrementAndGet();
  boolean hasUnflushedEntries = isUnflushedEntries();
  if (syncCloseCall && (hasUnflushedEntries || (errors > this.closeErrorsTolerated))) {
    LOG.error("Close of WAL " + path + " failed. Cause=\"" + ioe.getMessage() + "\", errors="
      + errors + ", hasUnflushedEntries=" + hasUnflushedEntries);
    throw ioe;
  }
  LOG.warn("Riding over failed WAL close of " + path
    + "; THIS FILE WAS NOT CLOSED BUT ALL EDITS SYNCED SO SHOULD BE OK", ioe);
}
{code}
After only two errors in closing WALs, doReplaceWALWriter enters this code block:
{code:java}
if (isUnflushedEntries() || closeErrorCount.get() >= this.closeErrorsTolerated) {
  try {
    closeWriter(this.writer, oldPath, true);
  } finally {
    inflightWALClosures.remove(oldPath.getName());
  }
}
{code}
The WALs are not marked closed in this block, unlike in the following block, where markClosedAndClean(oldPath) is called in the finally clause:
{code:java}
Writer localWriter = this.writer;
closeExecutor.execute(() -> {
  try {
    closeWriter(localWriter, oldPath, false);
  } catch (IOException e) {
    LOG.warn("close old writer failed", e);
  } finally {
    // call this even if the above close fails, as there is no other chance we can set
    // closed to true, it will not cause big problems.
    markClosedAndClean(oldPath);
    inflightWALClosures.remove(oldPath.getName());
  }
});
{code}
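
The report does not spell out a fix. One possible direction, sketched here purely as an illustration, is to do the same bookkeeping in the synchronous close path that the executor-based path already does in its finally clause. Everything in this sketch is a hypothetical stand-in rather than HBase code: the class WalCloseSketch, the Writer interface, the closedWALs set, and the method closeOldWriterSync are invented so that the example compiles on its own, and paths are plain strings.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the WAL roll logic; these are not the actual HBase classes. */
public class WalCloseSketch {

  /** Minimal writer abstraction so the sketch is self-contained. */
  public interface Writer {
    void close() throws IOException;
  }

  /** Names of old WALs whose close is still in progress. */
  public final Set<String> inflightWALClosures = ConcurrentHashMap.newKeySet();

  /** Names of old WALs that have been marked closed (assumed stand-in for the real state). */
  public final Set<String> closedWALs = ConcurrentHashMap.newKeySet();

  /** Stand-in for markClosedAndClean: records that the old WAL is eligible for cleaning. */
  public void markClosedAndClean(String oldPath) {
    closedWALs.add(oldPath);
  }

  /**
   * Synchronous close of the old writer. The assumption illustrated here is that
   * the WAL is marked closed in finally, so a failed close still leaves the file
   * eligible for cleanup, mirroring the executor-based path quoted in the report.
   */
  public void closeOldWriterSync(Writer writer, String oldPath) throws IOException {
    inflightWALClosures.add(oldPath);
    try {
      writer.close();
    } finally {
      // Runs whether or not close() threw; the IOException still propagates.
      markClosedAndClean(oldPath);
      inflightWALClosures.remove(oldPath);
    }
  }

  public static void main(String[] args) {
    WalCloseSketch wal = new WalCloseSketch();
    try {
      // Simulate a writer whose close always fails.
      wal.closeOldWriterSync(() -> {
        throw new IOException("simulated close failure");
      }, "wal.1");
    } catch (IOException e) {
      System.out.println("close failed: " + e.getMessage());
    }
    System.out.println("marked closed: " + wal.closedWALs.contains("wal.1"));
    System.out.println("still in flight: " + wal.inflightWALClosures.contains("wal.1"));
  }
}
```

Whether marking the file closed after a failed synchronous close is safe when there are unflushed entries is a question for the real code; the sketch only shows the shape of the bookkeeping.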