Kiran Kumar Maturi created HBASE-28801:
------------------------------------------
Summary: WALs remain unclosed for long
Key: HBASE-28801
URL: https://issues.apache.org/jira/browse/HBASE-28801
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 2.5.6
Reporter: Kiran Kumar Maturi
In our production fleet we have observed that WAL files are not cleaned up even
after all of their entries have been flushed. I fixed one cause of this, the
case where there is a problem closing the WAL, as part of
[HBASE-28665|https://issues.apache.org/jira/projects/HBASE/issues/HBASE-28665].
There is another case, involving unflushed entries, that can leave the WAL
uncleaned even after all of its entries have been flushed:
[FSHLog.java|https://github.com/apache/hbase/blob/branch-2.6/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L388]
{code:java}
if (isUnflushedEntries() || closeErrorCount.get() >= this.closeErrorsTolerated) {
  try {
    closeWriter(this.writer, oldPath, true);
  } finally {
    inflightWALClosures.remove(oldPath.getName());
    if (!isUnflushedEntries()) {
      markClosedAndClean(oldPath);
    }
  }
{code}
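If there are still unflushed entries when the writer is closed,
markClosedAndClean is skipped and the WAL is never marked closed. cleanOldLogs
then skips the file on every pass, so it is not cleaned up even after those
entries have been flushed.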
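To make the sequence of events concrete, here is a minimal, self-contained
sketch of the close path. It is a model under stated assumptions and not the
FSHLog implementation: the class WalCloseSketch, the Props holder, the
appended/flushed counters and the closeOldWal method are all hypothetical names
used only for illustration. Only the shape of the check (mark closed only when
nothing is unflushed at close time) is taken from the snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WalCloseSketch {
  // Hypothetical stand-in for per-file bookkeeping; only the closed flag is modelled.
  static final class Props {
    boolean closed;
  }

  final Map<String, Props> walFile2Props = new LinkedHashMap<>();
  private long appended; // entries handed to the WAL (assumed counter)
  private long flushed;  // entries already flushed (assumed counter)

  boolean isUnflushedEntries() {
    return flushed < appended;
  }

  void append() {
    appended++;
  }

  void flushAll() {
    flushed = appended;
  }

  // Same shape as the quoted branch: the closed flag is set only if nothing
  // is unflushed at the moment the old WAL is closed.
  void closeOldWal(String oldPath) {
    Props props = walFile2Props.computeIfAbsent(oldPath, k -> new Props());
    if (!isUnflushedEntries()) {
      props.closed = true;
    }
  }

  public static void main(String[] args) {
    WalCloseSketch wal = new WalCloseSketch();
    wal.append();
    wal.closeOldWal("wal.1"); // one entry is still unflushed at close time
    wal.flushAll();           // the entry is flushed afterwards
    // prints "pending=false closed=false"
    System.out.println("pending=" + wal.isUnflushedEntries()
        + " closed=" + wal.walFile2Props.get("wal.1").closed);
  }
}
```

In this model nothing revisits the closed flag after closeOldWal returns, which
is why the file stays unclosed even though no entries are pending any more.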
{code:java}
private synchronized void cleanOldLogs() {
  List<Pair<Path, Long>> logsToArchive = null;
  // For each log file, look at its Map of regions to the highest sequence id; if all sequence ids
  // are older than what is currently in memory, the WAL can be GC'd.
  for (Map.Entry<Path, WALProps> e : this.walFile2Props.entrySet()) {
    if (!e.getValue().closed) {
      LOG.debug("{} is not closed yet, will try archiving it next time", e.getKey());
      continue;
    }
    Path log = e.getKey();
    Map<byte[], Long> sequenceNums = e.getValue().encodedName2HighestSequenceId;
    if (this.sequenceIdAccounting.areAllLower(sequenceNums)) {
      if (logsToArchive == null) {
        logsToArchive = new ArrayList<>();
      }
      logsToArchive.add(Pair.newPair(log, e.getValue().logSize));
      if (LOG.isTraceEnabled()) {
        LOG.trace("WAL file ready for archiving " + log);
      }
    }
  }
{code}
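The effect of the closed check in the cleaner loop can be shown with a small
standalone sketch. It is a simplified model, not HBase code: the class
CleanOldLogsSketch, the Props holder, the selectArchivable method and the single
lowestUnflushedSequenceId threshold (a stand-in for the per-region comparison
done by sequenceIdAccounting.areAllLower) are all assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CleanOldLogsSketch {
  // Hypothetical stand-in for the per-file properties used by the cleaner.
  static final class Props {
    boolean closed;
    final long highestSequenceId;

    Props(boolean closed, long highestSequenceId) {
      this.closed = closed;
      this.highestSequenceId = highestSequenceId;
    }
  }

  // Returns the files the cleaner would pick for archiving. A file that is not
  // marked closed is skipped before its sequence ids are even looked at.
  static List<String> selectArchivable(Map<String, Props> files, long lowestUnflushedSequenceId) {
    List<String> logsToArchive = new ArrayList<>();
    for (Map.Entry<String, Props> e : files.entrySet()) {
      if (!e.getValue().closed) {
        continue; // "not closed yet, will try archiving it next time"
      }
      // Assumed simplification: one threshold instead of a per-region map.
      if (e.getValue().highestSequenceId < lowestUnflushedSequenceId) {
        logsToArchive.add(e.getKey());
      }
    }
    return logsToArchive;
  }

  public static void main(String[] args) {
    Map<String, Props> files = new LinkedHashMap<>();
    files.put("wal.1", new Props(false, 10L)); // fully flushed but never marked closed
    files.put("wal.2", new Props(true, 20L));  // fully flushed and marked closed
    System.out.println(selectArchivable(files, 100L)); // prints [wal.2]
  }
}
```

In this model wal.1 is older than wal.2 and every one of its sequence ids is
below the threshold, yet it is passed over on each run only because its closed
flag was never set.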