[ 
https://issues.apache.org/jira/browse/HBASE-28801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875358#comment-17875358
 ] 

Kiran Kumar Maturi commented on HBASE-28801:
--------------------------------------------

If we don't want to proceed with the above option, the next place to check is 
during cleanup, where we can make sure that the current file is not cleaned up 
and that a file is archived only once all of its entries have been flushed:

{code:java}
private synchronized void cleanOldLogs() {
    List<Pair<Path, Long>> logsToArchive = null;
    // For each log file, look at its Map of regions to the highest sequence id; if all sequence ids
    // are older than what is currently in memory, the WAL can be GC'd.
    for (Map.Entry<Path, WALProps> e : this.walFile2Props.entrySet()) {
      Path log = e.getKey();
      Map<byte[], Long> sequenceNums = e.getValue().encodedName2HighestSequenceId;
      // Never archive the file currently being written (compare Paths with equals(), since ==
      // only tests reference identity), and skip a WAL that is not closed yet for as long as
      // it still has unflushed entries.
      if (log.equals(this.getCurrentFileName())
        || (!e.getValue().closed && !this.sequenceIdAccounting.areAllLower(sequenceNums))) {
        LOG.debug("{} is not closed yet and has unflushed entries, will try archiving it next time", log);
        continue;
      }
      // check to make sure closed files have also drained all the entries
      if (this.sequenceIdAccounting.areAllLower(sequenceNums)) {
        if (logsToArchive == null) {
          logsToArchive = new ArrayList<>();
        }
        logsToArchive.add(Pair.newPair(log, e.getValue().logSize));
        if (LOG.isTraceEnabled()) {
          LOG.trace("WAL file ready for archiving " + log);
        }
      }
    }
{code}
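
To make the difference between the two checks easier to see, here is a minimal, self-contained sketch. It is not HBase code: WalState, archivableOld and archivableProposed are hypothetical names, and the per-region sequence id map is collapsed into a single number compared against one "lowest unflushed" value, which is only a rough stand-in for sequenceIdAccounting.areAllLower().

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WalCleanupSketch {

  /** Hypothetical stand-in for the per-WAL bookkeeping (not the real WALProps). */
  static final class WalState {
    final boolean closed;          // was the WAL marked closed after the roll?
    final long highestSequenceId;  // simplification: one id instead of a per-region map

    WalState(boolean closed, long highestSequenceId) {
      this.closed = closed;
      this.highestSequenceId = highestSequenceId;
    }
  }

  /** Simplified "all entries flushed" test: strictly lower than the lowest unflushed id. */
  static boolean allFlushed(WalState wal, long lowestUnflushedSeqId) {
    return wal.highestSequenceId < lowestUnflushedSeqId;
  }

  /** Existing rule described in the issue: a WAL that is not marked closed is always skipped. */
  static boolean archivableOld(WalState wal, long lowestUnflushedSeqId) {
    return wal.closed && allFlushed(wal, lowestUnflushedSeqId);
  }

  /** Proposed rule: skip the current file, otherwise archive once everything is flushed. */
  static boolean archivableProposed(String name, String currentName, WalState wal,
      long lowestUnflushedSeqId) {
    if (name.equals(currentName)) {
      return false; // the file being written must never be archived
    }
    return allFlushed(wal, lowestUnflushedSeqId);
  }

  public static void main(String[] args) {
    Map<String, WalState> wals = new LinkedHashMap<>();
    // Rolled while entries were unflushed, so it was never marked closed.
    wals.put("wal.1", new WalState(false, 100L));
    // Rolled cleanly and marked closed.
    wals.put("wal.2", new WalState(true, 200L));
    // The file currently being written.
    wals.put("wal.3", new WalState(false, 300L));

    String current = "wal.3";
    long lowestUnflushed = 250L; // everything below this id has been flushed

    for (Map.Entry<String, WalState> e : wals.entrySet()) {
      System.out.printf("%s old=%b proposed=%b%n", e.getKey(),
        archivableOld(e.getValue(), lowestUnflushed),
        archivableProposed(e.getKey(), current, e.getValue(), lowestUnflushed));
    }
  }
}
```

In this model wal.1 is the interesting case: its entries are all flushed, yet the existing rule never archives it because it was never marked closed, while the proposed rule picks it up and still leaves the current file alone.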


> WALs remain unclosed for long
> -----------------------------
>
>                 Key: HBASE-28801
>                 URL: https://issues.apache.org/jira/browse/HBASE-28801
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.5.6
>            Reporter: Kiran Kumar Maturi
>            Priority: Minor
>
> In our production fleet we have observed that WAL files are not cleaned up 
> even when all the entries have been flushed. I fixed the case where WAL 
> closures fail as part of 
> [HBASE-28665|https://issues.apache.org/jira/projects/HBASE/issues/HBASE-28665].
> There is another case, involving unflushed entries, that can lead to the WAL 
> not being cleaned up even after all the entries have been flushed:
> [FSHLog.java|https://github.com/apache/hbase/blob/branch-2.6/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L388]
> {code:java}
>  if (isUnflushedEntries() || closeErrorCount.get() >= this.closeErrorsTolerated) {
>           try {
>             closeWriter(this.writer, oldPath, true);
>           } finally {
>             inflightWALClosures.remove(oldPath.getName());
>             if (!isUnflushedEntries()) {
>               markClosedAndClean(oldPath);
>             }
>           }
> {code}
> If there are unflushed entries when the writer is closed, the WAL is never 
> marked closed, so cleanOldLogs() skips it on every later pass:
> {code:java}
> private synchronized void cleanOldLogs() {
>     List<Pair<Path, Long>> logsToArchive = null;
>     // For each log file, look at its Map of regions to the highest sequence id; if all sequence ids
>     // are older than what is currently in memory, the WAL can be GC'd.
>     for (Map.Entry<Path, WALProps> e : this.walFile2Props.entrySet()) {
>       if (!e.getValue().closed) {
>         LOG.debug("{} is not closed yet, will try archiving it next time", e.getKey());
>         continue;
>       }
>       Path log = e.getKey();
>       Map<byte[], Long> sequenceNums = e.getValue().encodedName2HighestSequenceId;
>       if (this.sequenceIdAccounting.areAllLower(sequenceNums)) {
>         if (logsToArchive == null) {
>           logsToArchive = new ArrayList<>();
>         }
>         logsToArchive.add(Pair.newPair(log, e.getValue().logSize));
>         if (LOG.isTraceEnabled()) {
>           LOG.trace("WAL file ready for archiving " + log);
>         }
>       }
>     }
> {code}



