horizonzy commented on PR #4467:
URL: https://github.com/apache/bookkeeper/pull/4467#issuecomment-2244187602
Stack info from #4457:
```
"pulsar-io-32-13":
    at org.apache.bookkeeper.client.PendingAddOp.unsetSuccessAndSendWriteRequest(PendingAddOp.java:181)
    - waiting to lock <0x00000403f63e7520> (a org.apache.bookkeeper.client.PendingAddOp)
    at org.apache.bookkeeper.client.LedgerHandle.unsetSuccessAndSendWriteRequest(LedgerHandle.java:2007)
    at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.handleBookieFailure(ReadOnlyLedgerHandle.java:227)
    - locked <0x0000044fc7914058> (a java.lang.Object)
    at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:353)
    - locked <0x00000401526088f8> (a org.apache.bookkeeper.client.PendingAddOp)
    at org.apache.bookkeeper.proto.BookieClientImpl.completeAdd(BookieClientImpl.java:284)
    at org.apache.bookkeeper.proto.BookieClientImpl.access$000(BookieClientImpl.java:78)
    at org.apache.bookkeeper.proto.BookieClientImpl$ChannelReadyForAddEntryCallback.operationComplete(BookieClientImpl.java:396)
    at org.apache.bookkeeper.proto.BookieClientImpl$ChannelReadyForAddEntryCallback.operationComplete(BookieClientImpl.java:356)
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ConnectionFutureListener.operationComplete(PerChannelBookieClient.java:2581)
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ConnectionFutureListener.operationComplete(PerChannelBookieClient.java:2486)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
    at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
    at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:675)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:694)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run([email protected]/Thread.java:833)
"pulsar-io-32-14":
    at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.handleBookieFailure(ReadOnlyLedgerHandle.java:216)
    - waiting to lock <0x0000044fc7914058> (a java.lang.Object)
    at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:353)
    - locked <0x00000403f63e7520> (a org.apache.bookkeeper.client.PendingAddOp)
    at org.apache.bookkeeper.proto.BookieClientImpl.completeAdd(BookieClientImpl.java:284)
    at org.apache.bookkeeper.proto.BookieClientImpl.access$000(BookieClientImpl.java:78)
    at org.apache.bookkeeper.proto.BookieClientImpl$ChannelReadyForAddEntryCallback.operationComplete(BookieClientImpl.java:396)
    at org.apache.bookkeeper.proto.BookieClientImpl$ChannelReadyForAddEntryCallback.operationComplete(BookieClientImpl.java:356)
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ConnectionFutureListener.operationComplete(PerChannelBookieClient.java:2581)
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ConnectionFutureListener.operationComplete(PerChannelBookieClient.java:2486)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
    at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
    at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:675)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:694)
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run([email protected]/Thread.java:833)
```
thread `pulsar-io-32-13`:
Timepoint 1: PendingAddOp(1) writeComplete acquires the method-level lock on the PendingAddOp(1) object (0x00000401526088f8).
Timepoint 3: PendingAddOp(1) handleBookieFailure acquires the object-level lock for ReadOnlyLedgerHandle#metadata (0x0000044fc7914058).
Timepoint 4: PendingAddOp(2) unsetSuccessAndSendWriteRequest waits for the method-level lock on the PendingAddOp(2) object (0x00000403f63e7520).
thread `pulsar-io-32-14`:
Timepoint 2: PendingAddOp(2) writeComplete acquires the method-level lock on the PendingAddOp(2) object (0x00000403f63e7520).
Timepoint 5: PendingAddOp(2) handleBookieFailure waits for the object-level lock for ReadOnlyLedgerHandle#metadata (0x0000044fc7914058).
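So `pulsar-io-32-13` holds the metadata lock while waiting for PendingAddOp(2), and `pulsar-io-32-14` holds PendingAddOp(2) while waiting for the metadata lock.
I think we should fix the Timepoint 4 behavior: it handles all the pendingAddOps and tries to send their data to the replacement bookie. I think this step shouldn't run on the same thread as before.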
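
To illustrate the idea, here is a minimal, self-contained sketch of handing the resend step off to another thread. It is not BookKeeper code: `Op`, `Handle`, `resendExecutor`, `register`, and the latch are hypothetical stand-ins that only mirror the lock shape in the stack trace (a synchronized method per op, plus one shared lock object in the handle). The latch forces both I/O threads to hold their own op lock before either one enters `handleBookieFailure`, which is the interleaving that deadlocks when the resend runs inline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ResendHandoffSketch {

    /** Hypothetical stand-in for PendingAddOp: its methods lock the op itself. */
    static final class Op {
        private final Handle handle;
        private final CountDownLatch bothHoldOpLock; // demo only: forces the risky interleaving
        private int resends;

        Op(Handle handle, CountDownLatch bothHoldOpLock) {
            this.handle = handle;
            this.bothHoldOpLock = bothHoldOpLock;
        }

        synchronized void writeComplete(boolean ok) {
            bothHoldOpLock.countDown();
            try {
                bothHoldOpLock.await(); // wait until the other thread holds its op lock too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (!ok) {
                handle.handleBookieFailure();
            }
        }

        synchronized void unsetSuccessAndSendWriteRequest() {
            resends++;
        }

        synchronized int resends() {
            return resends;
        }
    }

    /** Hypothetical stand-in for the ledger handle with its shared lock object. */
    static final class Handle {
        private final Object metadataLock = new Object();
        private final List<Op> pending = new ArrayList<>();
        private final ExecutorService resendExecutor;

        Handle(ExecutorService resendExecutor) {
            this.resendExecutor = resendExecutor;
        }

        void register(Op op) {
            synchronized (metadataLock) {
                pending.add(op);
            }
        }

        void handleBookieFailure() {
            List<Op> snapshot;
            synchronized (metadataLock) {
                snapshot = new ArrayList<>(pending);
            }
            // Handoff: the calling I/O thread never waits for another op's lock.
            // The executor thread takes op locks one at a time and holds nothing else.
            resendExecutor.execute(() -> snapshot.forEach(Op::unsetSuccessAndSendWriteRequest));
        }
    }

    /** Runs two concurrent failures and returns the total number of resends. */
    static int demo() {
        ExecutorService resendExecutor = Executors.newSingleThreadExecutor();
        Handle handle = new Handle(resendExecutor);
        CountDownLatch bothHoldOpLock = new CountDownLatch(2);
        Op op1 = new Op(handle, bothHoldOpLock);
        Op op2 = new Op(handle, bothHoldOpLock);
        handle.register(op1);
        handle.register(op2);

        Thread t13 = new Thread(() -> op1.writeComplete(false), "io-13");
        Thread t14 = new Thread(() -> op2.writeComplete(false), "io-14");
        t13.start();
        t14.start();
        try {
            t13.join();
            t14.join();
            resendExecutor.shutdown();
            if (!resendExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("resend tasks did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return op1.resends() + op2.resends();
    }

    public static void main(String[] args) {
        System.out.println("resends = " + demo()); // two failures x two pending ops
    }
}
```

In this sketch each failure resends both pending ops, so the run finishes with four resends in total. If `handleBookieFailure` instead called `unsetSuccessAndSendWriteRequest` inline while still inside the `synchronized (metadataLock)` block, the two threads could block each other exactly as in the dump above. Whether a dedicated executor is the right choice for the real client is a separate design question; the sketch only shows why moving the step off the calling thread breaks the cycle.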