2018-10-18 09:27:50 UTC - Wenfeng Wang: @Wenfeng Wang has joined the channel
----
2018-10-18 10:31:55 UTC - Nicolas Ha: I am trying to understand the 
recommendation you made last time to use a Daemonset instead of a StatefulSet 
(previous msg 
<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1538779207000100> )

My understanding is that in the pulsar helm chart:
- currently BK and ZK use a StatefulSet (and a volumeClaim) - they do not care 
where the storage is provided, but the kubernetes cluster has to have a way to 
provide storage (not sure where this bit is specified?)
- DaemonSet would make ZK and BK pods “stick” to one physical node, and use the 
local storage there (so would work on bare metal clusters too, the operator 
would have to keep number of machines >= number of ZK/BK DaemonSet replicas)

Also I see Volume claims, but I was expecting `PersistentVolumeClaim` in these:
- 
<https://github.com/apache/pulsar/blob/master/deployment/kubernetes/helm/pulsar/templates/bookkeeper-statefulset.yaml>
- 
<https://github.com/apache/pulsar/blob/master/deployment/kubernetes/helm/pulsar/templates/zookeeper-statefulset.yaml>

Or even something like what is described there: 
<https://kubernetes.io/blog/2018/04/13/local-persistent-volumes-beta/> and 
<https://kubernetes.io/docs/concepts/storage/volumes/#local> - it looks like 
this should allow not changing the statefulset and still use local volumes?

Am I missing something? Was there a specific reason to use a DaemonSet in 
"kubernetes/generic" and Statefulset in "helm"?
----
2018-10-18 12:54:31 UTC - Sijie Guo: > it looks like this should allow not 
changing the statefulset and still use local volumes?

yes, local volumes should be a better solution than a DaemonSet.

> Was there a specific reason to use a DaemonSet in “kubernetes/generic” and 
Statefulset in “helm”?

kubernetes/generic was added before k8s introduced local volumes, so we used a 
daemonset there. but it can probably be changed to use a stateful set with 
local volumes.

“helm” was added for deploying in cloud environments, hence a statefulset with 
persistent volumes is more reasonable there.
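
For reference, the local-volume approach from the linked Kubernetes docs looks roughly like the sketch below. All names, sizes, paths, and the node name are illustrative, not taken from the Pulsar chart; the idea is that the StatefulSet's `volumeClaimTemplates` would then reference `storageClassName: local-storage`:

```yaml
# A StorageClass that delays binding until a pod is scheduled,
# as recommended for local volumes
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
# A PersistentVolume backed by a local disk on one specific node
# (node name and path are placeholders)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: bookie-journal-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1
```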
----
2018-10-18 13:24:20 UTC - Martin Svensson: yeah, that’s right
----
2018-10-18 13:48:47 UTC - Nicolas Ha: ok that makes sense.
So if I understand correctly, there is nothing wrong with having a "generic" 
that would use local volumes, and even a "helm" version of the "generic" 
deployment?
----
2018-10-18 13:54:40 UTC - Sijie Guo: YES correct
----
2018-10-18 13:56:05 UTC - Nicolas Ha: thanks a lot - very helpful answers as 
usual :smile:
----
2018-10-18 16:44:10 UTC - Jerry Moore: @Jerry Moore has joined the channel
----
2018-10-18 18:33:57 UTC - Zuyu Zhang: Hi guys, I have a simple q regarding the 
multi-topic consumer in C++ client. When the consumer receives a message, how 
to know which topic it belongs to?
----
2018-10-18 18:40:25 UTC - Jerry Peng: MessageId::getTopicName()
----
2018-10-18 18:40:34 UTC - Jerry Peng: ```
/**
 * Get the topic name
 */
const std::string& getTopicName() const;
```
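
To illustrate, an untested sketch of what this could look like with a multi-topic consumer (the broker URL, topic, and subscription names are made up, and it needs the Pulsar C++ client built from master, as noted below):

```cpp
#include <pulsar/Client.h>
#include <iostream>
#include <string>
#include <vector>

int main() {
    pulsar::Client client("pulsar://localhost:6650");

    // Subscribe to several topics with a single consumer
    std::vector<std::string> topics = {"topic-a", "topic-b"};
    pulsar::Consumer consumer;
    client.subscribe(topics, "my-subscription", consumer);

    pulsar::Message msg;
    while (consumer.receive(msg, 3000) == pulsar::ResultOk) {
        // getTopicName() tells us which topic the message came from
        std::cout << msg.getMessageId().getTopicName()
                  << ": " << msg.getDataAsString() << std::endl;
        consumer.acknowledge(msg);
    }

    client.close();
    return 0;
}
```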
----
2018-10-18 18:40:52 UTC - Zuyu Zhang: ok. Thanks!
----
2018-10-18 18:42:00 UTC - Jerry Peng: FYI This is a feature not in an official 
release yet.  You will have to compile from master
----
2018-10-18 18:53:10 UTC - Zuyu Zhang: Yes, I did.
----
2018-10-18 22:03:51 UTC - Ryan Samo: Hey guys, I’m trying to load test Pulsar 
but keep receiving “Connection reset by peer” messages. It only happens when I 
start multiple producers and consumers on a topic. If the topic is not 
partitioned then it does not seem to happen as often. I have gone through 
broker.conf and set many of the limits to 0 so that I can push the boundaries 
of the hardware. Any ideas on what may be throwing out all of my connections? 
All of them drop at once, consumers and producers, and then immediately 
reconnect.

Thanks!
----
2018-10-18 22:05:34 UTC - Ali Ahmed: @Ryan Samo what’s your setup and how much 
traffic are you pushing through
----
2018-10-18 22:09:14 UTC - Ryan Samo: 3 brokers, 3 zookeepers each on 40 core 
servers with plenty of RAM. If I do a topic that’s not partitioned, I can spin 
up 25 producers with Pulsar-perf and generate around 1 million msg/s. The 
consumer side lags in that case so I decided to partition and that’s when the 
connection issues popped up. I want to sustain 1 million msg/s without lag if I 
can get there.
----
2018-10-18 22:09:55 UTC - Ali Ahmed: how many bookies ?
----
2018-10-18 22:13:02 UTC - Ryan Samo: 3, they are shared on the same servers as 
the brokers. 3 brokers, 3 bookies, 3 zookeepers
----
2018-10-18 22:13:26 UTC - Ryan Samo: Will reply back soon, need to drive :)
----
2018-10-18 23:23:34 UTC - Rodrigo Malacarne: @Ryan Samo, were you able to start 
a cluster with the bookies? Are you using v2.1.1-incubating?
----
2018-10-18 23:31:53 UTC - Ryan Samo: Yes I was able to get it all running. 
V2.1.1-incubating yup. Yeah so the cluster works fine until I try to have the 
multiple connections and then it just keeps dropping them with timeouts. The 
servers themselves are hardly being taxed with the load so I thought maybe a 
setting needs adjustment or the JVM is running into issues maybe.
----
2018-10-18 23:32:30 UTC - Matteo Merli: @Ryan Samo Are you reaching any 
bandwidth limit ?
----
2018-10-18 23:33:32 UTC - Matteo Merli: There are certainly some settings that 
can be used when testing for high throughput. The defaults are conservative, 
meant to work out of the box in all scenarios
----
2018-10-18 23:33:44 UTC - Ryan Samo: Well I was able to do 25 producers and a 
couple consumers with a non-partitioned topic so I don’t think so.
----
2018-10-18 23:34:29 UTC - Ryan Samo: If you have any recommended settings for 
beating the crap out of the cluster I’m all for it. Need to prove out how far 
we can go
----
2018-10-18 23:34:58 UTC - Matteo Merli: One of the reasons for “connection 
reset by peer” is that one end appears to be unresponsive, so the broker cuts 
off the TCP connection after 30-60 sec
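
For context, the broker-side knob for this liveness check lives in broker.conf (default value shown; a peer that doesn't answer the probes gets its connection closed):

```
# broker.conf -- how often the broker pings connected clients to
# check liveness; unresponsive peers have their connection dropped
keepAliveIntervalSeconds=30
```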
----
2018-10-18 23:36:16 UTC - Ryan Samo: Hmm yeah I saw a setting for that in the 
broker config. I might try the Prometheus and grafana monitoring too to see if 
I see anything network wise
----
2018-10-18 23:36:31 UTC - Ryan Samo: CPU, RAM, etc all good
----
2018-10-18 23:36:58 UTC - Matteo Merli: For settings, take a look at 
<https://github.com/openmessaging/openmessaging-benchmark/blob/master/driver-pulsar/deploy/templates>
----
2018-10-18 23:37:21 UTC - Ryan Samo: I am using Pulsar-perf produce and consume 
to perform the tests. Cool thanks, I’ll check that out!
----
2018-10-18 23:37:56 UTC - Matteo Merli: Are the machines with 10 Gbps NICs ?
----
2018-10-18 23:40:41 UTC - Ryan Samo: Yes they sure are. In the same rack too 
actually 
----
2018-10-18 23:41:39 UTC - Matteo Merli: Ok, then, apart from metrics, take a 
look at broker logs for any additional clue
----
2018-10-18 23:43:04 UTC - Ryan Samo: Ok, I’ll keep searching. It feels like a 
broker.conf issue because when I started the initial testing it was 
conservative like you said. I tweaked a few values to unlimited and that helped 
a ton. Thanks for all the support!
----
2018-10-19 07:15:43 UTC - Matti-Pekka Laaksonen: Hi! I tried running my cluster 
for the first time with production-level workloads, and I ended up running out 
of memory with the brokers. I have a few questions regarding this:
1. Is the sum of maxDirectMemory and maxHeapMemory the maximum memory the JVM 
can use? Or is the heap memory a subset of the direct memory?
2. What is the recommended total memory (I assume the sum of the two memory 
sizes) for brokers?
3. What is the recommended share of an instance's total memory used by Pulsar? 
The sample deployment scripts in the Git repo show that c5.2xlarge instances 
are used for the brokers. The instance has 16 GB of memory, but the direct and 
heap memory are both set to 12 GB, which sums to 24 GB, more than the available 
memory
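
For what it's worth, heap and direct memory are separate pools, so the JVM's footprint is roughly their sum (plus metaspace and other native overhead). Both are set through the `PULSAR_MEM` variable in conf/pulsar_env.sh; the sizes below are illustrative for a 16 GB machine, not an official recommendation:

```shell
# conf/pulsar_env.sh -- heap (-Xmx) plus direct memory must fit in
# physical RAM with room left over for the OS and page cache
PULSAR_MEM="-Xms6g -Xmx6g -XX:MaxDirectMemorySize=6g"
```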
----
2018-10-19 07:27:38 UTC - Matti-Pekka Laaksonen: Oh, and I could swear that 
I've seen a guide to upgrading a running Pulsar cluster, but I can't seem to 
find it anymore. Anyone know about this?
----
