Slack digest for #general - 2020-02-25

Apache Pulsar Slack Tue, 25 Feb 2020 01:11:59 -0800

2020-02-24 11:03:45 UTC - Konstantinos Papalias: Thanks for sharing your 
experiences @Devin G. Bost, looking forward to the sample project and the Jira 
to track discussions like this
----
2020-02-24 11:38:52 UTC - Lewey: I have an issue where I have 6 consumers 
connecting to a partitioned topic with 18 partitions; however, the consumers 
only seem to be consuming from odd-numbered partitions
----
2020-02-24 11:42:55 UTC - Vladimir Shchur: Hi! A question about configuration: 
should the broker.conf configuration be synchronized between broker and 
BookKeeper? I want to increase maxMessageSize, but increasing it just for the 
broker (in the Helm k8s chart) doesn't change broker.conf for the BookKeeper 
pod, and I have to use nettyMaxFrameSizeBytes to make it work. I'm on 2.5.0.
----
2020-02-24 11:47:06 UTC - Roman Popenov: And just to add to that, I have set 
`maxMessageSize` only, and it did seem to be picked up by the system, but I 
also saw some exceptions like:
```13:30:42.559 [bookie-io-1-1] ERROR org.apache.bookkeeper.proto.BookieRequestHandler - Unhandled exception occurred in I/O thread or handler
io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds 5242880: 5313378 - discarded
        at io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:513) ~[io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]```
----
2020-02-24 11:50:00 UTC - Roman Popenov: And then without setting 
`nettyMaxFrameSizeBytes` anywhere in the config, I was able to exchange 
messages of roughly 40 MB in size. Is it possible that my `maxMessageSize` is 
picked up, but there is some overhead that is passed along with the message and 
when the limit is hit, the error message is generic?
----
2020-02-24 11:50:11 UTC - Roman Popenov: 
<https://github.com/apache/pulsar/issues/3832>
----
2020-02-24 12:04:08 UTC - Rattanjot Singh: Any wiki to deploy pulsar on 
multi-cluster using kubernetes on aws?
----
2020-02-24 12:06:07 UTC - Roman Popenov: I think helm install should work
----
2020-02-24 12:06:26 UTC - Roman Popenov: If you have already set up your 
Kubernetes cluster
----
2020-02-24 12:07:22 UTC - Roman Popenov: Or do you mean across multiple AZs?
----
2020-02-24 12:07:39 UTC - Roman Popenov: If it’s not one single cluster, then 
no, there are no docs for that
----
2020-02-24 12:07:43 UTC - Rattanjot Singh: I want it in east and west
----
2020-02-24 12:08:32 UTC - Roman Popenov: I think you would have to piece things 
together from
<http://pulsar.apache.org/docs/en/deploy-bare-metal-multi-cluster/>
----
2020-02-24 12:09:22 UTC - Rattanjot Singh: So there is no end to end wiki for 
kubernetes
----
2020-02-24 12:10:25 UTC - Roman Popenov: Not that I am aware of. I think it’s 
possible to piece things together from that doc
----
2020-02-24 12:38:05 UTC - Pavel Tishkevich: Hi All,

I’ve noticed that when one of three brokers fails, unloading topics from the
failed broker may take approximately 1 to 5 minutes, depending on the number of
topics. I’ve also noticed that during unloading, sending messages to topics
served by the live brokers is very slow and approaches the timeout.

- Is there a way to decrease the duration of topic unloading? Maybe by adding
more brokers? We have approximately 200000 short-lived topics on average.
- Why is sending messages to topics on the live brokers so slow during
unloading? Is there a way to improve this?

Thanks in advance!
----
2020-02-24 14:13:04 UTC - Steve Kim: done
<https://github.com/apache/pulsar/issues/6410>
----
2020-02-24 15:09:18 UTC - Manuel Mueller: thanks for all the answers! I think
it would be nice if the feature were marked “dev-preview” so people don't lose
too much time with it if they are aiming to have this in production. I will
reproduce it and file an issue with the proper logs
----
2020-02-24 15:32:25 UTC - Devin G. Bost: Function state currently is marked
“developer preview” :slightly_smiling_face:
----
2020-02-24 15:49:13 UTC - Rolf Arne Corneliussen: @Sijie Guo Thanks for the
information. I have done some tests (still on Windows 10), and it seems the
timer wheel thread will use 100% of a hyperthread. When running 8 consumers on
an i7-8700 (4 cores 2 threads per core), Task Manager reported 100% CPU usage
when there was no traffic and no messages on the topics.

I have written a simple test program, creating a `HashedWheelTimer` with 1
millisecond tick time, then adding a simple timer task that schedules itself
when it times out, every 500 millisecond, and the CPU load is the same as
running an idle Pulsar consumer.

My understanding of the `HashedWheelTimer` is that it is intended to be used
for a large number of approximated timeouts and not for millisecond precision
scheduling, but I may be mistaken.

Anyway, I tried using a `java.util.concurrent.ScheduledExecutorService`,
scheduling 1000 tasks at fixed interval 1 milliseconds, and that was lighter on
the CPU than the timer wheel with 1 millisecond tick time.

Should I raise a GitHub issue on this?
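
One rough way to picture the idle cost described above: a tick-driven timer wakes its worker once per tick no matter how many timeouts are pending, while a deadline-driven scheduler wakes only when a task is due. The Python sketch below is a simplified counting model under that assumption, not Netty's or the JDK's actual implementation, and the function names are made up for illustration.

```python
def tick_driven_wakeups(tick_ms: int, window_ms: int) -> int:
    """Wakeups for a loop that sleeps one tick at a time, busy or idle."""
    return window_ms // tick_ms


def deadline_driven_wakeups(task_period_ms: int, window_ms: int, tasks: int = 1) -> int:
    """Wakeups for a scheduler that sleeps until the next task is due.

    Assumes no two tasks share a deadline (worst case for the wakeup count).
    """
    return tasks * (window_ms // task_period_ms)


if __name__ == "__main__":
    window = 1000  # one second, in milliseconds
    # 1 ms tick, as in the test program described above
    print(tick_driven_wakeups(tick_ms=1, window_ms=window))
    # a single task that reschedules itself every 500 ms
    print(deadline_driven_wakeups(task_period_ms=500, window_ms=window))
```

The model only shows that the tick-driven cost is independent of workload; it does not by itself explain a fully loaded hyperthread, which would need profiling on the affected platform.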
----
2020-02-24 16:08:51 UTC - Sijie Guo: Yeah please create a GitHub issue.
----
2020-02-24 16:17:38 UTC - Joshua Dunham: Hey Everyone, Is commercial support
offered for Pulsar/BookKeeper? Can someone make a recommendation?
----
2020-02-24 16:18:01 UTC - Devin G. Bost: Please reach out to @Sijie Guo at
StreamNative. He can help you out!
----
2020-02-24 16:18:24 UTC - Joshua Dunham: Nice, thx!
----
2020-02-24 16:30:51 UTC - Tanner Nilsson: @Chris Bartholomew and
<http://kafkaesque.io> have been great for us!
----
2020-02-24 16:50:27 UTC - Sijie Guo: @Joshua Dunham yes. We offer different
types of services (developer support, managed service support, etc.) for
Pulsar and BookKeeper.
----
2020-02-24 16:53:28 UTC - Sijie Guo: The max message size has to be configured
in two places: one is the broker's `maxMessageSize` and the other is
`nettyMaxFrameSizeBytes`.
----
2020-02-24 16:55:54 UTC - Sijie Guo: `nettyMaxFrameSizeBytes` also needs to be
set on the bookie side.
+1 : Roman Popenov
----
2020-02-24 16:58:40 UTC - Sijie Guo: Unloading closes the topics of that
bundle, and loading the bundle opens those topics again. Closing the topics
requires metadata updates. If you have many topics within a bundle, it takes
time. You can consider increasing the number of bundles.
----
2020-02-24 17:07:14 UTC - Vladimir Shchur: Thank you for the confirmation, but
these lines tell us that it should be taken from broker config, are they
eventually ignored?
<https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/BookKeeperClientFactoryImpl.java#L106>
----
2020-02-24 17:13:14 UTC - Charmaine Keck: @Charmaine Keck has joined the channel
----
2020-02-24 17:16:05 UTC - Sijie Guo: there are two settings
----
2020-02-24 17:16:19 UTC - Sijie Guo: one in the client, the other one is the
bookie
----
2020-02-24 17:16:51 UTC - Sijie Guo: the client one is taken from the broker
configuration, and the server-side one is here:
<https://github.com/apache/bookkeeper/blob/master/conf/bk_server.conf#L201>
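
To make the two-setting relationship concrete, here is a minimal Python sketch that checks whether a bookie-side frame limit leaves room for the broker's maximum message size. The 10 KiB padding is an assumption for illustration (the real overhead depends on headers, metadata and batching), and both helper names are hypothetical, not Pulsar or BookKeeper APIs. The numbers come from the `TooLongFrameException` quoted earlier in the thread.

```python
ASSUMED_FRAME_PADDING = 10 * 1024  # assumption: headroom for headers and metadata


def bookie_limit_ok(max_message_size: int, netty_max_frame_size: int,
                    padding: int = ASSUMED_FRAME_PADDING) -> bool:
    """True if the bookie's frame limit can hold a max-size message plus padding."""
    return netty_max_frame_size >= max_message_size + padding


def frame_overshoot(frame_length: int, netty_max_frame_size: int) -> int:
    """Bytes by which a frame exceeds the limit (0 if it fits)."""
    return max(0, frame_length - netty_max_frame_size)


if __name__ == "__main__":
    limit = 5242880  # frame limit reported in the exception (5 MiB)
    # the rejected frame from the log
    print(frame_overshoot(5313378, limit))
    # broker maxMessageSize raised to 50 MiB while the bookie limit is unchanged
    print(bookie_limit_ok(50 * 1024 * 1024, limit))
```

A check like this could run against the rendered broker and bookie configs before a deployment, so a mismatch shows up before the first oversized message does.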
----
2020-02-24 17:20:39 UTC - Pavel Tishkevich: We have 4 namespaces each having
1024 bundles. So I believe when 1 of 3 brokers fails, around 1300 bundles need
to be unloaded and acquired by remaining 2 brokers.

This still proves to be quite slow and two issues described above occur.
Maybe something else may help here?
----
2020-02-24 17:27:02 UTC - Sijie Guo: Oh I see. If you only have 3 brokers, it
means that if one broker goes down, about 70k topics are impacted. It will take
some time to get those 70k re-owned by the other two brokers.
----
2020-02-24 17:28:32 UTC - Sijie Guo: Currently there is no good way to get
around this. The short-term solution is to increase the number of brokers, so
each broker owns fewer topics. The long-term solution is for the committers and
contributors to figure out how to support this use case.
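
The back-of-the-envelope numbers in this exchange can be reproduced with a short Python sketch. It assumes topics and bundles are spread evenly across brokers, which real load balancing only approximates, and `failover_load` is an illustrative name rather than anything in Pulsar.

```python
def failover_load(total_topics: int, total_bundles: int, brokers: int) -> dict:
    """Estimate what moves when one of `brokers` fails, assuming an even spread."""
    if brokers < 2:
        raise ValueError("need at least 2 brokers for failover")
    topics_on_failed = total_topics // brokers
    return {
        "topics_to_move": topics_on_failed,
        "bundles_to_move": total_bundles // brokers,
        # each surviving broker picks up an equal share of the failed broker's topics
        "extra_topics_per_survivor": topics_on_failed // (brokers - 1),
    }


if __name__ == "__main__":
    topics = 200_000    # "approximately 200000" topics, from the question
    bundles = 4 * 1024  # 4 namespaces x 1024 bundles
    for n in (3, 6, 12):
        print(n, failover_load(topics, bundles, n))
```

With 3 brokers the estimate is about 66k topics and 1365 bundles to move, in line with the figures quoted above; doubling the broker count roughly halves both.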
----
2020-02-24 17:35:53 UTC - Sree Vaddi: It’s happening tonight. I look forward to
seeing you all.
Those out of the area can participate using an online link (it will be shared
before the meeting starts).

<https://www.meetup.com/Apache-Heron-Bay-Area/events/nglzdrybcdbwb/>
----
2020-02-24 17:53:01 UTC - Vladimir Shchur: I see, much clearer now, thank you!
----
2020-02-24 17:54:35 UTC - RAMG: @RAMG has joined the channel
----
2020-02-24 17:55:16 UTC - Pavel Tishkevich: Thanks a lot!
----
2020-02-24 18:45:11 UTC - Kirill Podkov: [Functions Question] :wave: We've
deployed a Python function that consumes an input topic, in which the messages
contain a partition_key, however this key is not passed to the messages within
the output topic. Is this intended? We're using the default Identity SerDe, and
the function simply returns the input under certain conditions without any
custom classes.
----
2020-02-24 18:56:05 UTC - Sijie Guo: Yes, by default it doesn’t pass properties
from the source message to the output topic. There is a new flag added to Java
functions to do so, but Python functions don’t support it yet.
heavy_check_mark : Kirill Podkov
cry : Manuel Mueller
----
2020-02-24 19:14:42 UTC - Kiran Chitturi: @Kiran Chitturi has joined the channel
----
2020-02-24 20:10:35 UTC - Alexander Ursu: Hi, any tips on getting a Pulsar
cluster set up with Docker, looking for something more resilient than the
standalone version.
----
2020-02-24 20:28:44 UTC - Greg Gallagher: @Alexander Ursu - are you using
Docker compose, or...?
----
2020-02-24 20:28:54 UTC - Alexander Ursu: Docker Swarm
----
2020-02-24 20:30:10 UTC - Greg Gallagher: I'm trying to do something similar
for a POC of Pulsar, but to be honest I threw Docker under the bus and am just
deploying onto VMs. There don't seem to be explicit instructions on this yet,
and there's an issue open:
----
2020-02-24 20:30:38 UTC - Greg Gallagher:
<https://github.com/apache/pulsar/issues/5401>
----
2020-02-24 20:30:51 UTC - Greg Gallagher: (I'm not a Pulsar developer, I just
joined this Slack yesterday)
----
2020-02-24 20:30:59 UTC - Greg Gallagher: Worth noting, here is how I was going
to go about this:
----
2020-02-24 20:31:17 UTC - Greg Gallagher: 1. git clone
<https://github.com/apache/pulsar>
----
2020-02-24 20:32:47 UTC - Greg Gallagher: 2. check the yml files under the
deployment/kubernetes/generic/original directory, which would be used if you
deployed under Kubernetes. That'll give you the environment variables (top
section, ConfigMap stuff)
----
2020-02-24 20:34:11 UTC - Greg Gallagher: 3. create a stack.yml which mimics
this file. I haven't used Swarm in years and frankly wouldn't suggest it, but
you're not asking for advice on this front so I'll hold back. We use
Kubernetes, which is hideously complicated to run on-prem in a bare-metal
environment. If I could do it all over again I'd really suggest looking at
Hashicorp Nomad
----
2020-02-24 20:34:14 UTC - Greg Gallagher: hope that's helpful!
----
2020-02-24 20:40:09 UTC - Alexander Ursu: Thanks, will look into it, been
trying to decipher a lot of it myself recently and make yml files for swarm
stacks. Currently sticking to Swarm because I'm in a small team and k8s seems
overkill at the moment. However, Nomad has been on my radar for a bit and it
seems interesting
----
2020-02-24 20:43:19 UTC - Greg Gallagher: Yeah, Nomad also lacks documentation
coverage of how to properly set up a cluster, step-by-step. Everyone seems to
have two modes for documentation: "run this -dev standalone thing" or "figure
it out on your own" :confused: Took me a few hours this past Saturday to
figure out how to create a Nomad cluster using Consul and Nomad. It wasn't
anything close to the learning curve of k8s, and you definitely should consider
the size of the team when deciding what to manage. Unfortunately my
developers beat me into a corner when I started and demanded k8s from the
start. Wish I knew then what I know now to tell them "nah", but at the time it
seemed like what everyone was doing. k8s in the cloud is far, far better than
running on prem, I would suggest. Good luck!
----
2020-02-25 04:23:15 UTC - Justin Grimes: @Justin Grimes has joined the channel
----
2020-02-25 04:52:52 UTC - vanchhay: I've been wondering, "MessageId.latest"
does this field refer to the latest_message in the topic?
----
2020-02-25 04:55:08 UTC - Sijie Guo: No it doesn’t
----
2020-02-25 05:07:16 UTC - Rattanjot Singh: Has anyone deployed pulsar in east
and west region using kubernetes?
----
2020-02-25 05:52:48 UTC - Greg Gallagher: Basic question: I have a local
(baremetal) cluster running 4 bookkeeper nodes and 4 broker nodes (separate
VMs). In conf/broker.conf what is the right number to set for
managedLedgerDefaultEnsembleSize and managedLedgerDefaultWriteQuorum and
managedLedgerDefaultAckQuorum ? The user guide for baremetal
(<https://pulsar.apache.org/docs/en/deploy-bare-metal/>) says "1" for these
values, which doesn't make sense since the example configuration is similar to
what I'm listing. Should it be number of bookies - 1? Like:
```# Number of bookies to use when creating a ledger
managedLedgerDefaultEnsembleSize=3

# Number of copies to store for each message
managedLedgerDefaultWriteQuorum=3

# Number of guaranteed copies (acks to wait before write is complete)
managedLedgerDefaultAckQuorum=3```
----
2020-02-25 05:55:18 UTC - Sijie Guo: I think the guide says if you are
deploying a one-node cluster, you need to change those settings to 1. Otherwise
the default values are good enough for most of the use cases.
----
2020-02-25 05:56:41 UTC - Greg Gallagher: it does say 1, yes, but I'm deploying
a 4 node cluster (4 bookies, 4 brokers) ... is it still 1? thanks!
----
2020-02-25 06:14:54 UTC - Greg Gallagher: n/m I see I'm asking a complicated
question and need to carefully read this:
----
2020-02-25 06:14:55 UTC - Greg Gallagher:
<https://bookkeeper.apache.org/docs/4.10.0/development/protocol/#ensembles>
----
2020-02-25 07:23:41 UTC - Sijie Guo: ensemble size is how many bookies are used
for storing a ledger, write quorum size is how many copies are stored for each
entry, ack quorum size is how many responses to wait for before confirming write
success. So most of the time you can just stick to 2/2/2 (which is the default
setting); if you want higher guarantees you can use 3/3/2.

But if you have fewer nodes than your required replication settings
(for example, if you have one node), then you have to set it to 1/1/1.
Otherwise you are not able to create ledgers since you don’t have enough
bookies.

I wrote an article about this before. You can check it out to understand what
those settings mean in BookKeeper replication.
<https://streaml.io/blog/why-apache-bookkeeper>
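
The rules in this answer can be turned into a quick sanity check. The Python sketch below is only an illustration: `check_replication` is a hypothetical helper, not part of Pulsar or BookKeeper. It encodes two conditions, that ensemble size >= write quorum >= ack quorum, and that the ensemble size cannot exceed the number of bookies available.

```python
def check_replication(bookies: int, ensemble: int,
                      write_quorum: int, ack_quorum: int) -> list:
    """Return a list of problems with an ensemble/write/ack setting (empty if OK)."""
    problems = []
    if not (ensemble >= write_quorum >= ack_quorum >= 1):
        problems.append("need ensemble >= write quorum >= ack quorum >= 1")
    if ensemble > bookies:
        problems.append(
            f"ensemble size {ensemble} exceeds {bookies} available bookie(s)")
    return problems


if __name__ == "__main__":
    # 3/3/2 on the 4-bookie cluster from the question
    print(check_replication(bookies=4, ensemble=3, write_quorum=3, ack_quorum=2))
    # default 2/2/2 on a one-node cluster: not enough bookies
    print(check_replication(bookies=1, ensemble=2, write_quorum=2, ack_quorum=2))
    # 1/1/1 on a one-node cluster
    print(check_replication(bookies=1, ensemble=1, write_quorum=1, ack_quorum=1))
```

Under these rules the 3/3/3 setting proposed in the question is also valid on 4 bookies, though with an ack quorum equal to the write quorum every write waits for all three copies.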
----
2020-02-25 08:14:26 UTC - xue: pulsar 2.5.0, start broker
----
