2019-06-25 10:03:22 UTC - Richard Sherman: Just been doing some load testing
against a 3 node cluster and hit an issue where all writes were hitting this
error. `Received send error from server: PersistenceError :
org.apache.bookkeeper.mledger.ManagedLedgerException: Waiting for new ledger
creation to complete` I stopped the tests at this point. When I restarted my
consumers I'm now receiving `Received error from server:
org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty
bookies available`. I was using 2 topics each with 1 consumer and 24 producers.
I had performed a similar test yesterday, albeit with only 1 topic, and achieved
a sustained throughput of just over 4,000 messages per second, so I decided to
double the load.
----
2019-06-25 10:06:39 UTC - Richard Sherman: Additionally hitting
`admin/v2/broker-stats/topics` through the browser now returns `{}`
----
2019-06-25 10:10:15 UTC - Jon Bock: Hi Darren, we’ve heard from a number of
people who are implementing Pulsar in an event-sourcing scenario. When and
where event sourcing is a good design choice is a different discussion, but
when event sourcing is the design, Pulsar fits really well, in particular
because of its multi-tiered storage architecture since that makes it easy to
scale up to store long event histories efficiently.
slightly_smiling_face : Jeremy Taylor
----
2019-06-25 10:21:43 UTC - Richard Sherman: So it appears we've run out of disk
space on the BookKeeper nodes
----
2019-06-25 10:55:02 UTC - Guillaume Braibant: @Richard Sherman
Yes, you have run out of storage on your BookKeeper nodes. BookKeeper does not
allow new ledgers to be created when there is not enough storage space.
Keep in mind that used storage is not released immediately by BookKeeper, even
when it is eligible for release.
----
2019-06-25 11:02:37 UTC - Guillaume Braibant: @Richard Sherman
To solve your problem, you can do three things:
1. Add more storage, so that you have enough space to hold messages until they
are consumed and deleted.
2. Limit the production rate to avoid overloading.
3. Tune the BookKeeper configuration to release storage faster.
BookKeeper periodically runs compaction jobs (minor and major). A compaction
job "prepares" the data so that another periodic job (the garbage collector) can
release storage (i.e. delete the data that will no longer be used). Here are
three properties that you can tune: gcWaitTime (garbage collector period),
minorCompactionInterval and majorCompactionInterval (compaction job periods).
Take care that minorCompactionInterval MUST be GREATER than gcWaitTime (if not,
your bookies will crash at startup) :wink:
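
As a rough illustration of the constraint above, the sketch below checks a set of bookie settings before startup. It assumes that `gcWaitTime` is expressed in milliseconds and the compaction intervals in seconds, as in the stock `bookkeeper.conf`. The thread only states the rule for `minorCompactionInterval`; applying it to `majorCompactionInterval` as well is an assumption. The class name, method names and sample values are hypothetical, the sketch does not handle compaction being disabled, and it only approximates whatever check the bookie performs itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical pre-flight check for bookie GC and compaction settings. */
final class CompactionConfigCheck {

    /** Reads a setting as a long; the key must be present. */
    static long readLong(Properties conf, String key) {
        String value = conf.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing setting: " + key);
        }
        return Long.parseLong(value.trim());
    }

    /**
     * Returns true when both compaction intervals are longer than the GC period.
     * Assumption: gcWaitTime is in milliseconds, the intervals are in seconds.
     * Assumption: the rule stated for the minor interval also holds for the major one.
     */
    static boolean isConsistent(Properties conf) {
        long gcWaitMs = readLong(conf, "gcWaitTime");
        long minorMs = readLong(conf, "minorCompactionInterval") * 1000L;
        long majorMs = readLong(conf, "majorCompactionInterval") * 1000L;
        return minorMs > gcWaitMs && majorMs > gcWaitMs;
    }

    /** Parses key=value text in the same format as a .conf file. */
    static Properties parse(String text) {
        Properties conf = new Properties();
        try {
            conf.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory string
        }
        return conf;
    }

    public static void main(String[] args) {
        // Sample values only, chosen to shorten the GC cycle on a test cluster.
        Properties conf = parse(
            "gcWaitTime=300000\n"
            + "minorCompactionInterval=600\n"
            + "majorCompactionInterval=3600\n");
        System.out.println("consistent: " + isConsistent(conf));
    }
}
```

Shorter intervals release disk space sooner at the cost of more background I/O on the bookies, so the sample values are a starting point for a load test rather than a recommendation.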
----
2019-06-25 11:07:26 UTC - Richard Sherman: Thanks Guillaume. It looks like the
problem is not the BookKeeper data but the disks being too small. The
BookKeeper data directory is only taking 138 MB of a 3.8 GB partition that is 97%
used.
+1 : Guillaume Braibant
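
The numbers above can be compared against the bookie's disk threshold. The sketch below assumes the BookKeeper setting `diskUsageThreshold` is at a default of 0.95, above which a bookie stops accepting new writes; that default, the class name and the `/` path are assumptions to verify against your own `bookkeeper.conf`. Note that the threshold applies to the whole partition, so other files on the same disk count against it.

```java
import java.io.File;

/** Hypothetical helper comparing partition usage with a bookie disk threshold. */
final class DiskUsageCheck {

    /** Assumed default of BookKeeper's diskUsageThreshold; verify in bookkeeper.conf. */
    static final double ASSUMED_DISK_USAGE_THRESHOLD = 0.95;

    /** Fraction of the partition in use, between 0.0 and 1.0. */
    static double usedFraction(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        return 1.0 - ((double) usableBytes / totalBytes);
    }

    /** True when usage is above the given threshold. */
    static boolean overThreshold(long totalBytes, long usableBytes, double threshold) {
        return usedFraction(totalBytes, usableBytes) > threshold;
    }

    public static void main(String[] args) {
        // Partition that holds the ledger directory; "/" is only a placeholder.
        File partition = new File(args.length > 0 ? args[0] : "/");
        long total = partition.getTotalSpace();
        long usable = partition.getUsableSpace();
        if (total == 0) {
            System.out.println("cannot read partition size for " + partition);
            return;
        }
        System.out.printf("used %.1f%%, over threshold: %b%n",
            usedFraction(total, usable) * 100.0,
            overThreshold(total, usable, ASSUMED_DISK_USAGE_THRESHOLD));
    }
}
```

With a 3.8 GB partition that is 97% used, `overThreshold` returns true under the assumed threshold, which would be consistent with the "Not enough non-faulty bookies" error even though the ledger data itself is small.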
----
2019-06-25 11:14:54 UTC - Guillaume Braibant: Indeed, it is clearly not enough.
I am currently facing the same issue as you with two dedicated 10 GB spaces (one
for the journal, another for ledgers) for each bookie when testing at
10,000 messages per second.
----
2019-06-25 14:35:08 UTC - Bipul Kumar: @Bipul Kumar has joined the channel
----
2019-06-25 15:03:40 UTC - Alexandre DUVAL: Got this issue, but when I run `jar -tf
<jarfile> | grep Cellar` I do find the `.class`: ```yo-pulsar-c1-n7 /pulsar #
jar -tf conf/functions/target/pulsar-functions-0.1.0-SNAPSHOT.jar | grep Cellar
com/yo/pulsar/function/CellarC1AccessLog.class
```
----
2019-06-25 15:06:58 UTC - Alexandre DUVAL: Which class path? :smile:
----
2019-06-25 15:07:41 UTC - Mark Marijnissen: Should I be worried about closed
zookeeper connections every few seconds?
```15:06:45.439 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO
org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection
from /127.0.0.1:39142
15:06:45.439 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO
org.apache.zookeeper.server.NIOServerCnxn - Processing ruok command from
/127.0.0.1:39142
15:06:45.440 [Thread-8551] INFO org.apache.zookeeper.server.NIOServerCnxn -
Closed socket connection for client /127.0.0.1:39142 (no session established
for client)
15:06:50.991 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO
org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection
from /127.0.0.1:39160
15:06:50.991 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO
org.apache.zookeeper.server.NIOServerCnxn - Processing ruok command from
/127.0.0.1:39160
15:06:50.992 [Thread-8552] INFO org.apache.zookeeper.server.NIOServerCnxn -
Closed socket connection for client /127.0.0.1:39160 (no session established
for client)
15:06:55.437 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO
org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection
from /127.0.0.1:39172
15:06:55.437 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] INFO
org.apache.zookeeper.server.NIOServerCnxn - Processing ruok command from
/127.0.0.1:39172
15:06:55.437 [Thread-8553] INFO org.apache.zookeeper.server.NIOServerCnxn -
Closed socket connection for client /127.0.0.1:39172 (no session established
for client)```
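
The `ruok` lines in this log are ZooKeeper's four-letter-word health check: a client connects, sends the four bytes `ruok`, reads the reply `imok`, and the connection is closed without a session ever being created, which is what "no session established for client" reports. A periodic probe from localhost (for example a liveness check) would likely produce this pattern, so these entries on their own are expected. A minimal sketch of such a probe follows; the class name, host, port and timeout are illustrative, and newer ZooKeeper versions may require the command to be enabled through the `4lw.commands.whitelist` setting.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical probe that sends a four-letter command to a ZooKeeper server. */
final class ZkRuokProbe {

    /** Sends the command and returns whatever the server writes back before closing. */
    static String fourLetterWord(String host, int port, String command, int timeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // The server replies and then closes the connection, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buffer = new byte[64];
            int n;
            while ((n = in.read(buffer)) != -1) {
                reply.write(buffer, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) {
        try {
            String reply = fourLetterWord("127.0.0.1", 2181, "ruok", 2000);
            System.out.println("imok".equals(reply)
                ? "ZooKeeper is serving"
                : "unexpected reply: " + reply);
        } catch (IOException e) {
            System.out.println("probe failed: " + e.getMessage());
        }
    }
}
```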
----
2019-06-25 15:18:54 UTC - Mark Marijnissen: and it crashes with: ```Exception
in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /admin/clusters/pulsar
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
    at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1541)
    at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1569)
    at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:732)
    at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:600)
    at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:363)
    at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:291)```
----
2019-06-25 17:40:11 UTC - Gaurav Sheth: @Gaurav Sheth has joined the channel
----
2019-06-25 18:02:02 UTC - Gaurav Sheth: Hi guys, I am working with @Thor
Sigurjonsson and @Devin G. Bost and I am having a situation around
DeadLetterTopic.
The following is my consumer config in Java:
``` return client.newConsumer(Schema.STRING)
        .topic(topic)
        .subscriptionName(subscription)
        .subscriptionType(SubscriptionType.Shared)
        .messageListener(b2cProductStatusChangeConsumer)
        .deadLetterPolicy(DeadLetterPolicy.builder().maxRedeliverCount(5).build())
        .subscribe();```
I have the deadLetterPolicy set up, and when I receive a message it is retried
5 times and then it does not get acknowledged, as you can see in the
dashboard screenshot below.
----
2019-06-25 18:02:39 UTC - Gaurav Sheth: Pulsar Dashboard
----
2019-06-25 18:05:34 UTC - Gaurav Sheth: Questions on my mind:
1 - The documentation says that if no custom topic name is given, the `Default
dead letter topic name is {TopicName}-{Subscription}-DLQ`. Does the topic with
the default name have to be created separately, or does Pulsar take care of it?
2 - Once the messages reach the dead letter topic, how do I get a handle on them
and reprocess them?
----
2019-06-25 18:15:20 UTC - Devin G. Bost: Also, do messages reach the DLQ only
when an error occurs in the consumer?
Since there's an unacknowledged message, I'm wondering if it didn't trigger the
requirement for getting sent to the DLQ, and if nothing needed to be written to
the DLQ, I'm wondering if that's why the DLQ hasn't been created yet.
----
2019-06-25 18:26:27 UTC - Edmond B: @Edmond B has joined the channel
----
2019-06-25 18:29:24 UTC - Edmond B: Has anyone had any luck with getting the
pulsar python client to work on a raspberry pi?
----
2019-06-25 18:33:33 UTC - Grant Wu: I don’t see why it wouldn’t work, as long
as you’re running a version that has Glibc
----
2019-06-25 18:42:40 UTC - Edmond B: I've attempted to build it manually and ran
into this error: `error: [Errno 2] No such file or directory: '_pulsar.so'`.
----
2019-06-25 18:43:33 UTC - Grant Wu: Is there a particular reason why you need
to build it manually?
----
2019-06-25 18:43:40 UTC - Grant Wu: Doesn’t Pulsar provide manylinux packages?
----
2019-06-25 18:44:59 UTC - Grant Wu: I don’t have any experience with building
it manually, unfortunately
----
2019-06-25 18:45:06 UTC - Edmond B: I've tried installing with pip with no luck
----
2019-06-25 18:45:20 UTC - Grant Wu: What happened when you tried to do that?
----
2019-06-25 18:46:39 UTC - Edmond B: It can't find the package: `Could not find a
version that satisfies the requirement pulsar-client (from versions: )` `No
matching distribution found for pulsar-client`
----
2019-06-25 18:47:26 UTC - Grant Wu: What does `uname -a` say?
----
2019-06-25 18:48:15 UTC - Edmond B: `Linux IOTPI3 4.19.50+ #896 Thu Jun 20
16:09:52 BST 2019 armv6l GNU/Linux`
----
2019-06-25 18:48:44 UTC - Grant Wu: oh wait… silly me, I forgot Raspberry Pis
aren’t x86
----
2019-06-25 19:04:57 UTC - Devin G. Bost: I noticed that the pulsar-functions
jar is missing from upstream for 2.3.2:
<https://packages.atlassian.com/repository/public/org/apache/pulsar/pulsar-functions/2.3.2/>
What's up with that? It's causing one of our builds to fail.
----
2019-06-25 19:07:20 UTC - Jerry Peng: @Devin G. Bost there has never been a jar
for pulsar-functions
----
2019-06-25 19:07:26 UTC - Jerry Peng: that is just a parent pom
----
2019-06-25 19:09:29 UTC - Devin G. Bost: @Jerry Peng Hmm I'm wondering how that
build ever succeeded then. :thinking_face:
I'll check if something changed on our end.
----
2019-06-25 19:15:59 UTC - Devin G. Bost: I uncommented:
```
<dependency>
    <groupId>org.apache.pulsar</groupId>
    <artifactId>pulsar-functions</artifactId>
    <version>${pulsar.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.pulsar</groupId>
    <artifactId>pulsar-functions-utils</artifactId>
    <version>${pulsar.version}</version>
    <classifier>tests</classifier>
</dependency>
```
from my POM file, and it worked.
----
2019-06-25 19:17:05 UTC - Jerry Peng: gotcha i don’t think you need to include
```
<dependency>
    <groupId>org.apache.pulsar</groupId>
    <artifactId>pulsar-functions</artifactId>
    <version>${pulsar.version}</version>
</dependency>
```
----
2019-06-25 19:22:41 UTC - Sam Leung: I’ve been playing with DLQs as well. I
haven’t fully tested them, but I think:
1. If you set up your broker to turn off auto topic creation, then you need to
create the topics yourself. Otherwise, Pulsar automatically creates a topic for
you if it isn’t found, and the DLQ would behave in the same way.
2. I don’t think much functionality has been built out for this yet; you
probably have to use Pulsar Functions or have your app subscribe to
`{TopicName}-{Subscription}-DLQ` and do the processing you need.
+1 : Gaurav Sheth
----
2019-06-25 19:24:50 UTC - Sam Leung: By default, there is no ack timeout,
meaning your consumer has unlimited time to process and ack the message.
The only way it’s marked for redelivery is if your consumer disconnects. I
think the 2 ways to get the DLQ counter to work are either to set the ack
timeout and hit the threshold, or to use a 2.4.0+ client and call
`negativeAcknowledge` (only recently implemented in Java).
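
To make the counting concrete, here is a toy model of the redelivery logic described above. It is not Pulsar client code: the class and method names are hypothetical, the dead letter topic name follows the `{TopicName}-{Subscription}-DLQ` pattern quoted earlier in the thread, and the point at which a message is dead-lettered (one initial delivery plus `maxRedeliverCount` redeliveries) is an assumption of the model rather than a statement about the real client. A handler returning `false` stands in for anything that triggers redelivery, such as an expired ack timeout or a negative acknowledgement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of redelivery counting; not the Pulsar client implementation. */
final class DeadLetterModel {

    final String deadLetterTopic;
    final int maxRedeliverCount;
    final List<String> deadLetters = new ArrayList<>();

    DeadLetterModel(String topic, String subscription, int maxRedeliverCount) {
        // Default name pattern as quoted from the documentation in this thread.
        this.deadLetterTopic = topic + "-" + subscription + "-DLQ";
        this.maxRedeliverCount = maxRedeliverCount;
    }

    /**
     * Delivers one message until the handler accepts it (ack) or the
     * redelivery budget is spent. Returns the number of deliveries made.
     */
    int deliver(String message, Predicate<String> handler) {
        int deliveries = 0;
        for (int redeliveryCount = 0; redeliveryCount <= maxRedeliverCount; redeliveryCount++) {
            deliveries++;
            if (handler.test(message)) {
                return deliveries; // acknowledged, nothing is dead-lettered
            }
            // A false result models an ack timeout or a negative acknowledgement.
        }
        deadLetters.add(message); // budget spent: route to the dead letter topic
        return deliveries;
    }

    public static void main(String[] args) {
        DeadLetterModel model = new DeadLetterModel("my-topic", "my-sub", 5);
        int deliveries = model.deliver("payload", m -> false); // handler always fails
        System.out.println(deliveries + " deliveries, then routed to " + model.deadLetterTopic);
    }
}
```

The point of the model is that the counter only advances when something triggers a redelivery. With no ack timeout and no negative acknowledgement, a message that is simply left unacknowledged never reaches the threshold.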
----
2019-06-25 19:28:27 UTC - Devin G. Bost: @Jerry Peng Regarding the messages
from Gaurav
(<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1561485722348100>), he
mentioned that the documentation states that there's a way for a consumer to
send a negative acknowledgement to the broker when there's an error (so the
message doesn't remain unacknowledged), but he's not seeing it in the latest
release of the Java Client source code. Do you know anything about that?
----
2019-06-25 19:31:41 UTC - Jerry Peng: @Devin G. Bost Negative acknowledgements
is a feature that will be released in 2.4
----
2019-06-25 19:32:33 UTC - Sam Leung: It’s in the upcoming 2.4.0 release:
<https://github.com/apache/pulsar/pull/3703>
+1 : Gaurav Sheth
----
2019-06-25 19:34:20 UTC - Devin G. Bost: Gotcha.
----
2019-06-25 19:36:55 UTC - Devin G. Bost: @Gaurav Sheth FYI.
+1 : Gaurav Sheth
----
2019-06-25 20:10:03 UTC - Devin G. Bost: @Jerry Peng Is it currently in
`2.4.0-streamlio-24`?
----
2019-06-25 20:10:52 UTC - Jerry Peng: yes
----
2019-06-25 20:10:56 UTC - Devin G. Bost: Thanks.
----
2019-06-25 20:13:42 UTC - Devin G. Bost: Do you have an estimate on when 2.4.0
will ship as an Apache release? (I'm getting asked about it.)
----
2019-06-25 20:16:24 UTC - Alexandre DUVAL: @David Kjerrumgaard hi, do you have
an idea? :stuck_out_tongue:
----
2019-06-25 20:54:47 UTC - David Kjerrumgaard: @Devin G. Bost The 2.4.0 release
candidate is moving through the process now.
----
2019-06-25 20:59:03 UTC - David Kjerrumgaard: for the `--jar` switch can you
try adding the `file://` prefix
----
2019-06-25 20:59:35 UTC - David Kjerrumgaard: Also, the file protocol assumes
that file already exists on worker host, so make sure the jar is there :smiley:
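
One way to build the `file://` form of a local path is sketched below. The absolute path is an assumption pieced together from the shell prompt and the relative path in the earlier `jar -tf` output, and the class name is hypothetical; as noted above, the jar must already exist at that location on the worker host.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that turns a local jar path into a file:// URL string. */
final class JarUrl {

    /** Resolves the path to an absolute, normalized form and renders it as a URL. */
    static String toFileUrl(String path) {
        Path absolute = Paths.get(path).toAbsolutePath().normalize();
        return absolute.toUri().toString();
    }

    public static void main(String[] args) {
        // Assumed location of the jar on the worker host.
        String jar = "/pulsar/conf/functions/target/pulsar-functions-0.1.0-SNAPSHOT.jar";
        System.out.println(toFileUrl(jar));
    }
}
```

The resulting string is what would be passed to the `--jar` switch in place of the bare path.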
----
2019-06-25 21:07:01 UTC - Devin G. Bost: Thanks.
----
2019-06-25 21:17:39 UTC - David Kjerrumgaard:
<https://github.com/apache/pulsar/milestone/20?closed=1>
----
2019-06-25 21:17:54 UTC - David Kjerrumgaard: ^^ All the fixes in the 2.4.0 RC
----
2019-06-26 00:05:22 UTC - Devin G. Bost: Thanks!
----
2019-06-26 05:12:29 UTC - Sree Vaddi: Thank you to each and every one of you who
made it to our meetup yesterday :pray:
----