[jira] [Comment Edited] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-07-24 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386780#comment-17386780
 ] 

Michael Stack edited comment on HBASE-25984 at 7/24/21, 10:14 PM:
--

git bisect flags this pr as why we have 100% fail running 
TestPostIncrementAndAppendBeforeWAL on branch 2.3 (Run on mac or see bottom of 
[https://ci-hadoop.apache.org/view/HBase/job/HBase/job/HBase-Find-Flaky-Tests/job/branch-2.3/lastSuccessfulBuild/artifact/output/dashboard.html)]
 Let me see if can fix.


was (Author: stack):
git bisect flags this pr as why we have 100% fail running 
TestPostIncrementAndAppendBeforeWAL (See bottom of 
[https://ci-hadoop.apache.org/view/HBase/job/HBase/job/HBase-Find-Flaky-Tests/job/branch-2.3/lastSuccessfulBuild/artifact/output/dashboard.html)]
 Let me see if can fix.

> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>Reporter: Bharath Vissapragada
>Assignee: Bharath Vissapragada
>Priority: Critical
>  Labels: deadlock, hang
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 1.7.1, 2.4.5
>
> Attachments: HBASE-25984-unit-test.patch
>
>
> We use FSHLog as the WAL implementation (branch-1 based) and under heavy load 
> we noticed the WAL system gets locked up due to a subtle bug involving racy 
> code with sync future reuse. This bug applies to all FSHLog implementations 
> across branches.
> Symptoms:
> On heavily loaded clusters with large write load we noticed that the region 
> servers are hanging abruptly with filled up handler queues and stuck MVCC 
> indicating appends/syncs not making any progress.
> {noformat}
>  WARN  [8,queue=9,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=172383686, writePoint=172383690, 
> regionName=1ce4003ab60120057734ffe367667dca}
>  WARN  [6,queue=2,port=60020] regionserver.MultiVersionConcurrencyControl - 
> STUCK for : 296000 millis. 
> MultiVersionConcurrencyControl{readPoint=171504376, writePoint=171504381, 
> regionName=7c441d7243f9f504194dae6bf2622631}
> {noformat}
> All the handlers are stuck waiting for the sync futures and timing out.
> {noformat}
>  java.lang.Object.wait(Native Method)
> 
> org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:183)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1509)
> .
> {noformat}
> Log rolling is stuck because it was unable to attain a safe point
> {noformat}
>java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1799)
>  
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:900)
> {noformat}
> and the Ring buffer consumer thinks that there are some outstanding syncs 
> that need to finish..
> {noformat}
>   
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.attainSafePoint(FSHLog.java:2031)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1999)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1857)
> {noformat}
> On the other hand, SyncRunner threads are idle and just waiting for work 
> implying that there are no pending SyncFutures that need to be run
> {noformat}
>sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1297)
> java.lang.Thread.run(Thread.java:748)
> {noformat}
> Overall the WAL system is dead locked and could make no progress until it was 
> aborted. I got to the bottom of this issue and have a patch that can fix it 
> (more details in the comments due to word limit in the description).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-25984) FSHLog WAL lockup with sync future reuse [RS deadlock]

2021-06-08 Thread Bharath Vissapragada (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359575#comment-17359575
 ] 

Bharath Vissapragada edited comment on HBASE-25984 at 6/8/21, 9:39 PM:
---

Obtained a heap dump and poked around quite a bit and observed that in-memory 
state of the {{RingBufferEventHandler}} is corrupted. After analyzing it and 
eyeballing the code (long story short), came up with the following theory...

This is the current implementation of SyncFuture#reset()
{noformat}
1  synchronized SyncFuture reset(final long sequence, Span span) {
2if (t != null && t != Thread.currentThread()) throw new 
IllegalStateException();
3t = Thread.currentThread();
4if (!isDone()) throw new IllegalStateException("" + sequence + " " + 
Thread.currentThread());
5this.doneSequence = NOT_DONE;
6this.ringBufferSequence = sequence;
7this.span = span;
8this.throwable = null;
9return this;
10 }
{noformat}
We can see, there are guards against overwriting un-finished sync futures with 
checks on ‘current thread’ (L2) and !isDone() (L4). These are not tight checks, 
consider the following sequence of interleaved actions that can result in a 
deadlock..
 # RPC handler#1 → WAL sync - SyncFuture#1 (from ThreadLocalCache) → Add to 
RingBuffer - state: NOT_DONE
 # RingBufferConsumer#onEvent() → SyncFuture#1 dispatch → old state: NOT_DONE, 
new state: DONE (via releaseSyncFuture() in SyncRunner.run())
 # RPCHandler#1 (reused for another region operation) → ThreadLocal 
SyncFuture#1 NOT_DONE (via reset()) (*this goes through because isDone() now 
returns true and L2 always returns true because we reuse the ThreadLocal 
instance*).
 #  RingBufferConsumer#onEvent() - attainSafePoint() -> works on the 
overwritten sync future state (from step 3)
 #   DEADLOCK === (as the sync future remains in the state forever)

The problem here is that once a SyncFuture is marked DONE, it is eligible for 
reuse. Even though ring buffer still has references to it and is checking on it 
to attain a safe point, in the background another handler can just overwrite it 
resulting in ring buffer operating on a wrong future and deadlocking the 
system. Very subtle bug.. I'm able to reproduce this in a carefully crafted 
unit test (attached). I have a fix solves this problem, will create a PR for 
review soon.


was (Author: bharathv):
Obtained a heap dump and poked around quite a bit and observed that in-memory 
state is corrupted. After analyzing it and eyeballing the code (long story 
short), came up with the following theory...

This is the current implementation of SyncFuture#reset()
{noformat}
1  synchronized SyncFuture reset(final long sequence, Span span) {
2if (t != null && t != Thread.currentThread()) throw new 
IllegalStateException();
3t = Thread.currentThread();
4if (!isDone()) throw new IllegalStateException("" + sequence + " " + 
Thread.currentThread());
5this.doneSequence = NOT_DONE;
6this.ringBufferSequence = sequence;
7this.span = span;
8this.throwable = null;
9return this;
10 }
{noformat}
We can see, there are guards against overwriting un-finished sync futures with 
checks on ‘current thread’ (L2) and !isDone() (L4). These are not tight checks, 
consider the following sequence of interleaved actions that can result in a 
deadlock..
 # RPC handler#1 → WAL sync - SyncFuture#1 (from ThreadLocalCache) → Add to 
RingBuffer - state: NOT_DONE
 # RingBufferConsumer#onEvent() → SyncFuture#1 dispatch → old state: NOT_DONE, 
new state: DONE (via releaseSyncFuture() in SyncRunner.run())
 # RPCHandler#1 (reused for another region operation) → ThreadLocal 
SyncFuture#1 NOT_DONE (via reset()) (*this goes through because isDone() now 
returns true and L2 always returns true because we reuse the ThreadLocal 
instance*).
 #  RingBufferConsumer#onEvent() - attainSafePoint() -> works on the 
overwritten sync future state (from step 3)
 #   DEADLOCK === (as the sync future remains in the state forever)


The problem here is that once a SyncFuture is marked DONE, it is eligible for 
reuse. Even though ring buffer still has references to it and is checking on it 
to attain a safe point, in the background another handler can just overwrite it 
resulting in ring buffer operating on a wrong future and deadlocking the 
system. Very subtle bug.. I'm able to reproduce this in a carefully crafted 
unit test (attached). I have a fix solves this problem, will create a PR for 
review soon.

> FSHLog WAL lockup with sync future reuse [RS deadlock]
> --
>
> Key: HBASE-25984
> URL: https://issues.apache.org/jira/browse/HBASE-25984
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.5
>