2020-10-09 10:03:40 UTC - Amir Kadoorie: @Amir Kadoorie has joined the channel
----
2020-10-09 10:58:49 UTC - yaodr: @yaodr has joined the channel
----
2020-10-09 12:15:26 UTC - Marcio Martins: Ok, so I increased resources - they can now go up to 16 GB and 4 cores - and I removed all non-essential traffic from the cluster, so now there is only 1 topic with 1 producer, producing ~5 Mbit/s. I would say this is over-provisioned (9x bookie, 5x broker, 3x zk). Those 2 bookies are still consuming 100% CPU (1 core) and now about 3 GB of memory. If I turn off auto-recovery it goes back to normal. I think you are right that the issue is the CPU and memory usage, but it seems like the trigger for that is the auto-recovery.
----
2020-10-09 12:31:55 UTC - Marcio Martins: I just manually deleted those two 
ledgers and it all went back to normal...
----
2020-10-09 12:32:10 UTC - Marcio Martins: I guess there is a bug somewhere in 
the bookies, which might be fixed by that commit I linked.
----
2020-10-09 13:33:44 UTC - Marcio Martins: Thanks @Sijie Guo
----
2020-10-09 13:36:12 UTC - Frank Kelly: Hey Addison - good morning - I might have found something related to message ordering with async send. I set up a test program here <https://github.com/kellyfj/message-ordering> and can reproduce the problem. Before I file an issue I wanted to see if there is anything obvious that I'm doing incorrectly (probably!). Let me know if you have any questions.
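The core of the test is roughly the following (a simplified sketch, not the exact code from the repo - topic name and message count are made up): send numbered messages with sendAsync() from a single producer, then read them back and check the sequence.
```import org.apache.pulsar.client.api.*;

public class OrderingCheck {
    public static void main(String[] args) throws Exception {
        // Assumes a local standalone cluster; adjust serviceUrl/topic for your setup.
        String topic = "persistent://public/default/ordering-test";
        try (PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650").build()) {

            // Subscribe before producing so no messages are missed.
            Consumer<String> consumer = client.newConsumer(Schema.STRING)
                    .topic(topic).subscriptionName("order-check").subscribe();

            Producer<String> producer = client.newProducer(Schema.STRING)
                    .topic(topic).create();

            int n = 1000;
            for (int i = 0; i < n; i++) {
                producer.sendAsync(Integer.toString(i)); // async send, no per-message wait
            }
            producer.flush();

            // On a single partition, messages are expected back in send order.
            for (int expected = 0; expected < n; expected++) {
                Message<String> msg = consumer.receive();
                int actual = Integer.parseInt(msg.getValue());
                if (actual != expected) {
                    System.out.println("Out of order: expected " + expected + " got " + actual);
                }
                consumer.acknowledge(msg);
            }
            producer.close();
            consumer.close();
        }
    }
}```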
----
2020-10-09 14:37:11 UTC - Sijie Guo: @Penghui Li
+1 : Lari Hotari
----
2020-10-09 16:26:41 UTC - Alan Hoffmeister: I've created low-level bindings for calling libpulsar from Rust :slightly_smiling_face:

<https://docs.rs/pulsar-sys/2.6.1/pulsar_sys/>
+1 : flafik, Sijie Guo
----
2020-10-09 19:54:05 UTC - Andrew Jacobs: @Andrew Jacobs has joined the channel
----
2020-10-09 20:23:38 UTC - Lari Hotari: ```root@somehost/pulsar# 
/tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e 
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.startReadOperationOnLedger 
1
Started 
[org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.startReadOperationOnLedger]
 profiling
--- Execution profile ---
Total samples       : 13176152
skipped             : 57730 (0.44%)

Frame buffer usage  : 0.0011%

--- 13118422 calls (100.00%), 13118422 samples
  [ 0] 
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.startReadOperationOnLedger
  [ 1] 
org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$1
  [ 2] org.apache.bookkeeper.mledger.impl.OpReadEntry$$Lambda$822.800251664.run
  [ 3] org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun
  [ 4] org.apache.bookkeeper.common.util.SafeRunnable.run
  [ 5] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 6] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 7] java.util.concurrent.ThreadPoolExecutor.runWorker
  [ 8] java.util.concurrent.ThreadPoolExecutor$Worker.run
  [ 9] io.netty.util.concurrent.FastThreadLocalRunnable.run
  [10] java.lang.Thread.run

       calls  percent  samples  top
  ----------  -------  -------  ---
    13118422  100.00%  13118422  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.startReadOperationOnLedger```
I haven't been able to reproduce the issue yet in a development environment. In the k8s-based 3 zk, 3 bookie, 3 broker cluster, I still have one broker where the problem has been ongoing for about 36 hours. It seems that async-profiler (<https://github.com/jvm-profiling-tools/async-profiler>) is the most effective tool for debugging such problems. I copied the async-profiler binaries directly to the rootfs of the broker container under /tmp and then opened a shell inside the Pulsar broker container to run async-profiler commands. async-profiler can also observe a single method. In the example above, I observed org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.startReadOperationOnLedger method calls. async-profiler detected 13118422 calls in 15 seconds (~874561 ops/s). It seems like an extremely tight loop. It looks like 8 cores are running flat out, since top on the k8s worker node shows 700-800% CPU consumption.
----
2020-10-09 20:25:53 UTC - Addison Higham: Hey @Frank Kelly, I just wanted to acknowledge this. I downloaded the repo but haven't had a chance to dig in.
----
2020-10-09 20:26:31 UTC - Frank Kelly: No worries - I have a workaround. It's probably something silly I did in code, but just in case it's not, I figured I'd share.
----
2020-10-09 20:50:56 UTC - Lari Hotari: This verifies that no new calls are 
arriving. org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntries 
is not called at all.
```root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntries 1
Started [org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntries] profiling
--- Execution profile ---
Total samples       : 0

Frame buffer usage  : 0%

       calls  percent  samples  top
  ----------  -------  -------  ---```
neither is 
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntriesOrWait 
called:
```root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntriesOrWait 1
Started [org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntriesOrWait] profiling
--- Execution profile ---
Total samples       : 0

Frame buffer usage  : 0%

       calls  percent  samples  top
  ----------  -------  -------  ---```
This indicates that entries are circulating in a tight loop.

The tight loop can be seen in these stack traces and by looking at the code:
```root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion 1
Started [org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion] profiling
--- Execution profile ---
Total samples       : 13613401
skipped             : 48177 (0.35%)

Frame buffer usage  : 0.0026%

--- 13565114 calls (100.00%), 13565114 samples
  [ 0] org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion
  [ 1] org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.internalReadFromLedger
  [ 2] org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntries
  [ 3] org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$1
  [ 4] org.apache.bookkeeper.mledger.impl.OpReadEntry$$Lambda$1087.1120645357.run
  [ 5] org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun
  [ 6] org.apache.bookkeeper.common.util.SafeRunnable.run
  [ 7] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 8] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 9] java.util.concurrent.ThreadPoolExecutor.runWorker
  [10] java.util.concurrent.ThreadPoolExecutor$Worker.run
  [11] io.netty.util.concurrent.FastThreadLocalRunnable.run
  [12] java.lang.Thread.run

--- 110 calls (0.00%), 110 samples
  [ 0] org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion
  [ 1] org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.internalReadFromLedger
  [ 2] org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntries
  [ 3] org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$1
  [ 4] org.apache.bookkeeper.mledger.impl.OpReadEntry$$Lambda$822.800251664.run
  [ 5] org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun
  [ 6] org.apache.bookkeeper.common.util.SafeRunnable.run
  [ 7] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 8] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 9] java.util.concurrent.ThreadPoolExecutor.runWorker
  [10] java.util.concurrent.ThreadPoolExecutor$Worker.run
  [11] io.netty.util.concurrent.FastThreadLocalRunnable.run
  [12] java.lang.Thread.run

       calls  percent  samples  top
  ----------  -------  -------  ---
    13565224  100.00%  13565224  org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion```
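To make the shape of the loop concrete, here is a toy Java sketch (my own illustration with made-up names, not the actual Pulsar/BookKeeper code) of the pattern the stack traces imply: each executor task re-submits the next read because a cursor-wide "has more entries" style check never lets the read operation complete.
```import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the suspected cycle:
//   checkReadCompletion() -> "cursor still has more entries" -> re-submit read
//   -> read finds nothing new -> checkReadCompletion() -> ...
public class TightLoopDemo {
    static final ExecutorService executor = Executors.newSingleThreadExecutor();
    static final AtomicLong calls = new AtomicLong();

    // Stands in for a cursor-wide hasMoreEntries(): it answers "yes" based on
    // state this particular read operation can never reach, so the loop never ends.
    static boolean cursorSaysMoreEntries() {
        return true;
    }

    static void checkReadCompletion() {
        calls.incrementAndGet();
        if (cursorSaysMoreEntries()) {
            try {
                // Re-submit the "read" instead of completing the operation.
                executor.execute(TightLoopDemo::readFromLedger);
            } catch (RejectedExecutionException ignored) {
                // executor shut down; stop the demo loop
            }
        }
    }

    static void readFromLedger() {
        // Nothing new to read; immediately re-check completion.
        checkReadCompletion();
    }

    public static void main(String[] args) throws InterruptedException {
        executor.execute(TightLoopDemo::readFromLedger);
        Thread.sleep(2_000); // observe for a couple of seconds
        executor.shutdownNow();
        executor.awaitTermination(1, TimeUnit.SECONDS);
        // Millions of calls with one core pegged, just like the profiler run above.
        System.out.println("checkReadCompletion calls: " + calls.get());
    }
}```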
----
2020-10-09 20:50:56 UTC - Lari Hotari: Would it be possible for one of the authors of the "Managed Ledger" code to take a look? Is there something in the implementation that could cause the tight loop I'm observing? @Matteo Merli? @Rajan Dhabalia? @xiaolong.ran? There are some recent modifications in this area:
<https://github.com/apache/pulsar/commit/c635d08a448f7b2a4868bb6aefc13dd5f17000a1> (since 2.5.0)
<https://github.com/apache/pulsar/commit/45ee784f5ef92fc25e970bbf0be34412472173b1> (since 2.6.0)
I'm sorry I haven't yet been able to reproduce and isolate the issue. I seem to run into other bugs when I try to isolate and reproduce it... :slightly_smiling_face:
----
2020-10-09 20:52:59 UTC - Lari Hotari: There are a lot of bits of information about the issue in this thread. For example, I uploaded quite a few thread dumps to a GH gist.
----
2020-10-09 20:57:29 UTC - Lari Hotari: In case someone is wondering why I'm posting this information in this thread: I currently have one broker running in a tight loop, and I can easily observe it while it's in the problem state. I cannot attach a debugger to the process. The only way to debug it seems to be async-profiler and its awesome features.

One more async-profiler run, to verify whether OpReadEntry.readEntriesFailed gets called. It doesn't:
```root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.OpReadEntry.readEntriesFailed 1
Started [org.apache.bookkeeper.mledger.impl.OpReadEntry.readEntriesFailed] profiling
--- Execution profile ---
Total samples       : 0

Frame buffer usage  : 0%

       calls  percent  samples  top
  ----------  -------  -------  ---```
----
2020-10-09 21:10:22 UTC - Lari Hotari: I also extracted some data from the heap dump using Eclipse MAT's OQL: <https://gist.github.com/lhotari/41a4f86ca45018c0c54f479c9af3d4e5>. I believe that information can be used to understand why the code chooses certain branches and runs into an infinite loop. However, there might be something else going on, for example that the topic has been deleted. In the current case, topics are getting deleted using the forceful flag. I wonder if deleting a topic forcefully could trigger the infinite looping behavior?
----
2020-10-09 21:12:07 UTC - Alexandre DUVAL: Did you know about <https://github.com/wyyerd/pulsar-rs|https://github.com/wyyerd/pulsar-rs>?
----
2020-10-09 21:56:06 UTC - Lari Hotari: One more profiling run, to determine the branch chosen in org.apache.bookkeeper.mledger.impl.ManagedCursorImpl#hasMoreEntries():
```root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.getNextValidPosition 1
Started [org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.getNextValidPosition] profiling
--- Execution profile ---
Total samples       : 0

Frame buffer usage  : 0%

       calls  percent  samples  top
  ----------  -------  -------  ---
root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.hasMoreEntries 1
Started [org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.hasMoreEntries] profiling
--- Execution profile ---
Total samples       : 12220914
skipped             : 36586 (0.30%)

Frame buffer usage  : 0.0014%

--- 12184328 calls (100.00%), 12184328 samples
  [ 0] org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.hasMoreEntries
  [ 1] org.apache.bookkeeper.mledger.impl.OpReadEntry.checkReadCompletion
  [ 2] org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.internalReadFromLedger
  [ 3] org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntries
  [ 4] org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$checkReadCompletion$1
  [ 5] org.apache.bookkeeper.mledger.impl.OpReadEntry$$Lambda$1108.1400779947.run
  [ 6] org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun
  [ 7] org.apache.bookkeeper.common.util.SafeRunnable.run
  [ 8] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [ 9] org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run
  [10] java.util.concurrent.ThreadPoolExecutor.runWorker
  [11] java.util.concurrent.ThreadPoolExecutor$Worker.run
  [12] io.netty.util.concurrent.FastThreadLocalRunnable.run
  [13] java.lang.Thread.run

       calls  percent  samples  top
  ----------  -------  -------  ---
    12184328  100.00%  12184328  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.hasMoreEntries
root@somehost:/pulsar# /tmp/async-profiler-1.8.1-linux-x64/profiler.sh -d 15 -e org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.getNumberOfEntries 1
Started [org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.getNumberOfEntries] profiling
--- Execution profile ---
Total samples       : 0

Frame buffer usage  : 0%

       calls  percent  samples  top
  ----------  -------  -------  ---```
----
2020-10-09 22:05:28 UTC - Lari Hotari: By reading the code and trying to understand what it does, it seems that the "forceful topic deletion" changes on the broker side (<https://github.com/apache/pulsar/commit/35b67b0ec1fdba3baab1927095086a44c8ada174>) could be what triggers the behavior. In my case, the application code sets the force flag to true when some temporary topics get deleted; that is part of the load test in question. Is forceful topic deletion a "last resort" type of feature that shouldn't be used in normal cases at all? The forceful deletion feature seems to have been around forever (since <https://github.com/apache/pulsar/pull/1656>).

How could the managed ledger get into a state where it thinks that more entries are arriving, but nothing ever comes? :thinking_face:
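For context, the force deletion in the application code is roughly equivalent to this (a simplified sketch, not the actual application code; the admin URL and topic name are placeholders, and I'm assuming the `delete(topic, force)` overload of the admin API here):
```import org.apache.pulsar.client.admin.PulsarAdmin;

public class ForceDeleteTopic {
    public static void main(String[] args) throws Exception {
        // Placeholder admin URL and topic name.
        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")
                .build()) {
            // force=true disconnects producers/consumers and deletes the topic
            // even if it is still in use.
            admin.topics().delete("persistent://public/default/temp-topic", true);
        }
    }
}```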
----
2020-10-09 23:35:40 UTC - Lari Hotari: I'm trying to answer my own question. :slightly_smiling_face: The usage of `cursor.hasMoreEntries()` in `org.apache.bookkeeper.mledger.impl.OpReadEntry#checkReadCompletion` looks wrong.

In the table of data extracted from the heap dump, I can see that OpRead.readPosition.ledgerId != OpRead.cursor.readPosition.ledgerId and OpRead.nextReadPosition.ledgerId != OpRead.cursor.ledger.lastConfirmedEntry.ledgerId (the cursor write position).
Based on this data, my understanding is that calling cursor.hasMoreEntries() without passing the OpRead's readPosition ledger id is the wrong thing to do. I feel pretty confident, so I'll file a bug on GitHub. /cc @Sijie Guo @Addison Higham
----
2020-10-09 23:49:40 UTC - Alan Hoffmeister: Yup! I've already helped fix some bugs in that project :slightly_smiling_face:
----
2020-10-09 23:50:38 UTC - Alan Hoffmeister: but I'm trying to learn Rust, so I thought using the native libpulsar would be a good side project
----
2020-10-10 00:03:48 UTC - Addison Higham: Wow, this looks pretty comprehensive and sounds like a good path. We will get this in front of the team early next week.
+1 : Lari Hotari
----
2020-10-10 00:04:02 UTC - Alexandre DUVAL: Oh cool! It was just a tip in case you didn't know about it.
+1 : Alan Hoffmeister
----
2020-10-10 00:05:58 UTC - Lari Hotari: I created 
<https://github.com/apache/pulsar/issues/8229>
----
2020-10-10 02:15:00 UTC - Chloe: Only 5 days left to submit your presentation for Pulsar Summit Asia 2020: <https://pulsar-summit.org/en/event/asia-2020/cfp>. The CFP closes on Oct 14, 9am PT.
In the meantime, don't forget to sign up: <https://hopin.to/events/pulsar-summit-asia-2020>
----
2020-10-10 02:53:11 UTC - Sijie Guo: @Chloe don’t use `@ here`
----
2020-10-10 04:03:11 UTC - Rattanjot Singh: Let's say I update a bookie with a new configuration while the old bookies still have the older configuration. Which configuration will be picked? What about conflicts? What about the messages already stored?
----
2020-10-10 04:04:58 UTC - Rattanjot Singh: What about other components like ZooKeeper, broker, and proxy?
----
2020-10-10 05:09:57 UTC - stephen: @stephen has joined the channel
----
