[ https://issues.apache.org/jira/browse/HBASE-28184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783777#comment-17783777 ]
Rushabh Shah commented on HBASE-28184:
--------------------------------------

Committed the change to branch-2.4, branch-2.5, branch-2, branch-3 and master. Thank you [~zhangduo] [~vjasani] [~Xiaolin Ha] for the reviews!

> Tailing the WAL is very slow if there are multiple peers.
> ---------------------------------------------------------
>
>                 Key: HBASE-28184
>                 URL: https://issues.apache.org/jira/browse/HBASE-28184
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 2.0.0
>            Reporter: Rushabh Shah
>            Assignee: Rushabh Shah
>            Priority: Major
>             Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.7
>
>
> Noticed in one of our production clusters, which has 4 peers.
> Due to a sudden ingestion of data, the size of the log queue increased to a peak of 506. We have configured the log roll size to 256 MB. Most of the edits in the WAL were from a table for which replication is disabled, so all the ReplicationSourceWALReader thread had to do was read the WAL and NOT replicate the entries. Still, it took 12 hours to drain the queue.
> Took a few jstacks and found that the ReplicationSourceWALReader thread was waiting to acquire rollWriterLock
> [here|https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java#L1231]:
> {noformat}
> "regionserver/<rs>,1" #1036 daemon prio=5 os_prio=0 tid=0x00007f44b374e800 nid=0xbd7f waiting on condition [0x00007f37b4d19000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x00007f3897a3e150> (a java.util.concurrent.locks.ReentrantLock$FairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:872)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1202)
>         at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:228)
>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
>         at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.getLogFileSizeIfBeingWritten(AbstractFSWAL.java:1102)
>         at org.apache.hadoop.hbase.wal.WALProvider.lambda$null$0(WALProvider.java:128)
>         at org.apache.hadoop.hbase.wal.WALProvider$$Lambda$177/1119730685.apply(Unknown Source)
>         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
>         at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
>         at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
>         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>         at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.util.stream.ReferencePipeline.findAny(ReferencePipeline.java:536)
>         at org.apache.hadoop.hbase.wal.WALProvider.lambda$getWALFileLengthProvider$2(WALProvider.java:129)
>         at org.apache.hadoop.hbase.wal.WALProvider$$Lambda$140/1246380717.getLogFileSizeIfBeingWritten(Unknown Source)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:260)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:222)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:157)
> {noformat}
> All the peers contend for this lock during every batch read.
> Look at the code snippet below. We guard this section with rollWriterLock on the assumption that we are replicating the active WAL file. But in our case we are NOT replicating the active WAL file, yet we still acquire the lock only to return OptionalLong.empty().
> {noformat}
>   /**
>    * if the given {@code path} is being written currently, then return its length.
>    * <p>
>    * This is used by replication to prevent replicating unacked log entries. See
>    * https://issues.apache.org/jira/browse/HBASE-14004 for more details.
>    */
>   @Override
>   public OptionalLong getLogFileSizeIfBeingWritten(Path path) {
>     rollWriterLock.lock();
>     try {
>       ...
>       ...
>     } finally {
>       rollWriterLock.unlock();
>     }
> {noformat}
> We can check the size of the log queue first: if it is greater than 1, the WAL being read has already been rolled and cannot be the file currently being written, so we can return early without acquiring the lock. A sketch of this early return follows.
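> A minimal sketch of that early return, seen from the reader side. This illustrates the idea only, not the committed patch: the WalFileLengthProvider interface below is a hand-rolled stand-in, logQueue is a hypothetical field for the source's queued WALs, and the queue-size check is the only load-bearing part.
> {noformat}
> import java.util.OptionalLong;
> import java.util.Queue;
> import org.apache.hadoop.fs.Path;
>
> class WalLengthLookupSketch {
>   /** Stand-in for the provider whose real implementation takes rollWriterLock. */
>   interface WalFileLengthProvider {
>     OptionalLong getLogFileSizeIfBeingWritten(Path path);
>   }
>
>   private final Queue<Path> logQueue; // queued WALs for this source, oldest first
>   private final WalFileLengthProvider walFileLengthProvider;
>
>   WalLengthLookupSketch(Queue<Path> logQueue, WalFileLengthProvider provider) {
>     this.logQueue = logQueue;
>     this.walFileLengthProvider = provider;
>   }
>
>   OptionalLong fileLengthIfBeingWritten(Path currentPath) {
>     // More than one queued WAL means the file being read has already been
>     // rolled, so it cannot be the active writer's file; skip the provider
>     // call and the rollWriterLock it would take.
>     if (logQueue.size() > 1) {
>       return OptionalLong.empty();
>     }
>     // Only the newest WAL can still be open for writing; only then is the
>     // locked length lookup worth the contention.
>     return walFileLengthProvider.getLogFileSizeIfBeingWritten(currentPath);
>   }
> }
> {noformat}
> With this guard, readers that are draining already-rolled WALs never touch rollWriterLock, so multiple peers no longer serialize on the active WAL's writer lock for every batch.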