Slack digest for #general - 2019-01-23

Apache Pulsar Slack Wed, 23 Jan 2019 01:11:21 -0800

2019-01-22 13:16:27 UTC - Paul van der Linden: I'm experimenting with how Pulsar 
handles load, but I have some surprising results so far:
- high round-trip time (sending a message and getting one back): 18ms (compared 
to 8ms for some other systems). This is for the baseline test: 10 msgs/s in, 30 
msgs/s out, 7kb message size (our current average msg size)
- already having trouble with a message throughput of 1500 msgs/s out (with 500 
msgs/s in): throughput unstable, round trip: 50-150ms
----
2019-01-22 13:17:06 UTC - Paul van der Linden: are there things I can tweak 
(like not using sync flush)?
----
2019-01-22 13:17:29 UTC - Pratik Narode: @Pratik Narode has joined the channel
----
2019-01-22 15:30:22 UTC - Brian: any recommendations on setting 
MaxDirectMemorySize?
----
2019-01-22 15:30:44 UTC - Brian: 4g default is causing us OOM and while I can 
bump it up, seems like a tmpfix
----
2019-01-22 15:55:27 UTC - Brian: also why doesn't proxy say "Hey, i have a 
broker who's OOM, lemme try another broker"
----
2019-01-22 15:57:49 UTC - Ali Ahmed: @naga you can use pulsar io to ingest the 
csv file
----
2019-01-22 15:58:16 UTC - Ali Ahmed: they either use pulsar sql or pulsar 
function to run the downstream job
----
2019-01-22 16:21:20 UTC - Romain Castagnet: Hi, I have a weird bug.
I use a namespace isolation policy and region awareness.
While a consumer (pulsar-client consumer) and a producer (pulsar-perf produce) 
are running, I stop a complete region. Consumer and producer keep working.
When I restart the previously stopped region, consumer and producer are still 
working but I can't do "namespaces unload mytopic". A 500 error appears, and in 
journald I see a 401 error.
I don't understand why...
If I stop the producer and consumer and restart the broker and bookkeeper, 
unload works again.
Do you have any idea?
----
2019-01-22 17:14:48 UTC - Paul van der Linden: Hi, Is there a way to speed up 
bookkeeper or pulsar in general. I'm recording pretty bad performance on 
latency compared to other brokers
----
2019-01-22 17:15:45 UTC - Paul van der Linden: Something like kafka maybe, where 
kafka doesn't synchronously flush?
----
2019-01-22 17:16:13 UTC - Paul van der Linden: I'm already seeing slow 
performance on the baseline test with 10 msgs/s in, 30 msgs/s out, 7kb message 
size
----
2019-01-22 17:24:48 UTC - Matteo Merli: @Paul van der Linden you can set 
`journalSyncData=false` in `bookkeeper.conf` 
(<https://github.com/apache/pulsar/blob/master/conf/bookkeeper.conf#L304>)
----
2019-01-22 17:25:44 UTC - Paul van der Linden: Thanks, exactly what I couldn't 
find so far
----
2019-01-22 17:26:08 UTC - Matteo Merli: There are several other tunables that 
can be used to adjust the performance for different conditions. Can you expand 
a bit on what you’re trying to achieve?
----
2019-01-22 17:26:55 UTC - Paul van der Linden: Sure
----
2019-01-22 17:27:29 UTC - Paul van der Linden: I'm basically comparing some 
messaging systems to see what we can replace Kafka with
----
2019-01-22 17:28:05 UTC - Paul van der Linden: One of the tests is a 
performance test, to see how the brokers handle high load
----
2019-01-22 17:29:22 UTC - Paul van der Linden: We are replacing kafka, mainly 
because the python clients are quite bad and the majority of our code is python
----
2019-01-22 17:30:41 UTC - Matteo Merli: Sure, can you describe the deployment 
setup you’re using, any config you changed from defaults and how you’re sending 
traffic and measuring latency?
----
2019-01-22 17:31:44 UTC - Paul van der Linden: I'm deploying it to kubernetes
----
2019-01-22 17:31:52 UTC - Paul van der Linden: GKE to be exact
----
2019-01-22 17:32:34 UTC - Matteo Merli: Ok, the ~20 millis avg latency is 
typical on GKE
----
2019-01-22 17:32:44 UTC - Matteo Merli: (when fsyncing data)
----
2019-01-22 17:32:50 UTC - Paul van der Linden: compared to the examples 
directory in github for gke:
- 3 bookies instead of 2
- running everything in 3 nodes with 15GB ram, 4 cpu's
- non-local ssd storage
----
2019-01-22 17:33:13 UTC - Paul van der Linden: ok, it shoots up quite quickly 
with some tests though
----
2019-01-22 17:33:41 UTC - Paul van der Linden: with 1500 msgs/s it's struggling 
to even manage the 1500 msgs/s
----
2019-01-22 17:34:12 UTC - Matteo Merli: Is it publishing synchronously?
----
2019-01-22 17:34:27 UTC - Paul van der Linden: I'm measuring latency by sending 
a message with a python client, then pinging back with a "PID" queue basically
----
2019-01-22 17:34:52 UTC - Paul van der Linden: is that a setting in the client?
----
2019-01-22 17:35:20 UTC - Matteo Merli: there are 2 methods on the `Producer` 
class, `send()` and `send_async()` 
<http://pulsar.apache.org/api/python/#pulsar.Producer.send_async>
----
2019-01-22 17:35:36 UTC - Paul van der Linden: ah I missed that one
----
2019-01-22 17:36:11 UTC - Matteo Merli: if you want to get any decent 
throughput you need to use the async variant, otherwise the throughput is 
limited by the latency (since there will be only 1 message in flight)
----
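Editor's note: the point about a single message in flight can be put into 
numbers. The sketch below is a back-of-the-envelope model, not part of the 
Pulsar client; both function names are made up for illustration, and the 20 ms 
figure is the GKE publish latency quoted earlier in the thread.

```python
def max_throughput(latency_ms: float, in_flight: int = 1) -> float:
    """Upper bound on msgs/s when `in_flight` sends may be outstanding at once."""
    return in_flight * 1000.0 / latency_ms


def in_flight_needed(target_msgs_per_s: float, latency_ms: float) -> float:
    """Outstanding messages needed to sustain the target rate at this latency."""
    return target_msgs_per_s * latency_ms / 1000.0


# Synchronous send(): one message in flight at ~20 ms per publish.
print(max_throughput(20))          # 50.0
# Sustaining 1500 msgs/s at ~20 ms needs about 30 messages in flight.
print(in_flight_needed(1500, 20))  # 30.0
```

Under this model a synchronous producer tops out around 50 msgs/s at 20 ms, 
which is why the async variant matters for the 1500 msgs/s test.
----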
2019-01-22 17:37:45 UTC - Matteo Merli: I’d also suggest to enable batching and 
to block when producer queue is full (for easier backpressure handling):


```
producer = client.create_producer(
                'my-topic',
                block_if_queue_full=True,
                batching_enabled=True,
                batching_max_publish_delay_ms=10
            )
```
----
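Editor's note: a minimal sketch of how `send_async()` might be used with a 
producer configured as above. The helper name `publish_all` is hypothetical. It 
only assumes that `producer` offers `send_async(content, callback)` and 
`flush()`, as the Python `pulsar.Producer` does, and that the callback receives 
a result code and a message ID.

```python
def publish_all(producer, payloads):
    """Queue every payload with send_async(), then wait for the acks.

    `producer` is expected to behave like pulsar.Producer: it must offer
    send_async(content, callback) and flush().
    Returns the result codes collected from the callbacks.
    """
    results = []

    def on_sent(result, msg_id):
        # Invoked by the client once the send completes or fails.
        results.append(result)

    for payload in payloads:
        # Returns immediately, so many messages can be in flight at once.
        producer.send_async(payload, on_sent)

    # Block until everything queued so far has been flushed to the broker.
    producer.flush()
    return results
```

With the real client, each returned code would be compared against 
`pulsar.Result.Ok` to detect failed sends.
----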
2019-01-22 17:38:11 UTC - Paul van der Linden: thanks I will test that 
tomorrow (just finished all of the tests)
----
2019-01-22 17:38:22 UTC - Matteo Merli: :+1:
----
2019-01-22 17:39:27 UTC - Paul van der Linden: thanks for the help, It's good 
to do a fair test :slightly_smiling_face: The other systems I knew slightly 
better already, so it was easier to troubleshoot these kinds of throughput 
issues
----
2019-01-22 19:11:04 UTC - Brian: any reason why the pulsar-admin commands 
result in ```Server redirected too many times```
----
2019-01-22 20:14:47 UTC - Kendall Magesh-Davis: Hey guys, I’ve got pulsar 
deployed using your helm chart. I don’t see a `pulsar-admin` container, like 
the one that exists with the generic k8s deployment. Are the binaries hidden in 
one of these containers?
```
master ~/Code/pulsar/deployment/kubernetes/generic> kubectl get pods -n pulsar --show-labels
NAME                                       READY   STATUS      RESTARTS   AGE   LABELS
foo-pulsar-autorecovery-576c97dcf4-zrdgq   1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=autorecovery,pod-template-hash=1327538790,release=foo
foo-pulsar-bastion-9658ffbf4-bnvg8         1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bastion,pod-template-hash=521499690,release=foo
foo-pulsar-bookkeeper-0                    1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bookkeeper,controller-revision-hash=foo-pulsar-bookkeeper-7b64dd9c47,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-bookkeeper-0
foo-pulsar-bookkeeper-1                    1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bookkeeper,controller-revision-hash=foo-pulsar-bookkeeper-7b64dd9c47,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-bookkeeper-1
foo-pulsar-bookkeeper-2                    1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=bookkeeper,controller-revision-hash=foo-pulsar-bookkeeper-7b64dd9c47,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-bookkeeper-2
foo-pulsar-broker-74c6f7dcb-hq7x2          1/1     Running     3          1d    app=pulsar,cluster=foo-pulsar,component=broker,pod-template-hash=307293876,release=foo
foo-pulsar-broker-74c6f7dcb-kl8bb          1/1     Running     3          1d    app=pulsar,cluster=foo-pulsar,component=broker,pod-template-hash=307293876,release=foo
foo-pulsar-dashboard-5c8b94757f-4lb5l      1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=dashboard,pod-template-hash=1746503139,release=foo
foo-pulsar-grafana-77b945cdbd-g578f        1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=grafana,pod-template-hash=3365017868,release=foo
foo-pulsar-prometheus-767976df87-5w84n     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=prometheus,pod-template-hash=3235328943,release=foo
foo-pulsar-proxy-68f7576dcd-pkbz9          1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=proxy,pod-template-hash=2493132878,release=foo
foo-pulsar-zookeeper-0                     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=zookeeper,controller-revision-hash=foo-pulsar-zookeeper-5cf6ffdb4,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-zookeeper-0
foo-pulsar-zookeeper-1                     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=zookeeper,controller-revision-hash=foo-pulsar-zookeeper-5cf6ffdb4,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-zookeeper-1
foo-pulsar-zookeeper-2                     1/1     Running     0          1d    app=pulsar,cluster=foo-pulsar,component=zookeeper,controller-revision-hash=foo-pulsar-zookeeper-5cf6ffdb4,release=foo,statefulset.kubernetes.io/pod-name=foo-pulsar-zookeeper-2
foo-pulsar-zookeeper-metadata-zl4ns        0/1     Completed   0          1d    controller-uid=80248831-1db3-11e9-aaa6-02b3804acdb4,job-name=foo-pulsar-zookeeper-metadata
```
----
2019-01-22 20:15:39 UTC - Sijie Guo: foo-pulsar-bastion-9658ffbf4-bnvg8 - 
‘bastion’ is the ‘pulsar-admin’ container
----
2019-01-22 20:16:02 UTC - Kendall Magesh-Davis: nice. thanks @Sijie Guo 
:slightly_smiling_face: I’ll poke at that one
----
2019-01-22 23:15:16 UTC - Grant Wu: @Matteo Merli @Jerry Peng Could I get a 
response to 
<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1547770573542400> ?
----
2019-01-22 23:16:03 UTC - Grant Wu: I’m about to wrap my entire `process` 
method in a catch-all try except to avoid my Pulsar function going down - would 
be great to see if there’s a better alternative that leverages Pulsar’s 
existing capabilities
----
2019-01-22 23:16:18 UTC - Grant Wu: Because I know `pulsar-admin functions 
getstatus` is a thing…
----
2019-01-22 23:17:27 UTC - Grant Wu: Like - if I throw an exception when 
processing a message, do I reprocess the same message after my function 
restarts, or does it drop that message
----
2019-01-22 23:21:14 UTC - Grant Wu: Sorry, corrected a typo
----
2019-01-22 23:21:37 UTC - Jerry Peng: @Grant Wu that depends on the processing 
guarantee set for the function.  If the function’s processing guarantee is set 
to:
1. AT_MOST_ONCE - If there is an uncaught exception in the function code, the 
current message will be dropped and not submitted for reprocessing
2. AT_LEAST_ONCE - If there is an uncaught exception in the function code, the 
current message will not be dropped and will be submitted for reprocessing
----
2019-01-22 23:22:08 UTC - Grant Wu: Ah, okay.  So basically “`process` function 
ran without throwing exception” == success?
----
2019-01-22 23:22:33 UTC - Jerry Peng: 3. EXACTLY_ONCE - If there is an uncaught 
exception in the function code, the function will fail and restart to maintain 
ordering
----
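Editor's note: the difference between dropping and resubmitting a message on an 
uncaught exception can be illustrated with a toy delivery loop. This is not how 
Pulsar Functions are implemented; `run`, `flaky`, and the `max_attempts` cap are 
hypothetical names introduced only for this sketch.

```python
def run(messages, process, redeliver_on_error, max_attempts=3):
    """Toy delivery loop: returns (outputs, dropped).

    redeliver_on_error=False models at-most-once (drop on failure);
    redeliver_on_error=True models at-least-once (retry the same message).
    max_attempts is only a safeguard for this toy loop.
    """
    outputs, dropped = [], []
    for msg in messages:
        attempts = 0
        while True:
            attempts += 1
            try:
                outputs.append(process(msg))
                break
            except Exception:
                if not redeliver_on_error or attempts >= max_attempts:
                    dropped.append(msg)
                    break
    return outputs, dropped


failed_once = set()


def flaky(msg):
    # Fails the first time it sees "b", succeeds afterwards.
    if msg == "b" and msg not in failed_once:
        failed_once.add(msg)
        raise ValueError("transient failure")
    return msg.upper()


print(run(["a", "b", "c"], flaky, redeliver_on_error=False))  # (['A', 'C'], ['b'])
failed_once.clear()
print(run(["a", "b", "c"], flaky, redeliver_on_error=True))   # (['A', 'B', 'C'], [])
```
----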
2019-01-22 23:22:42 UTC - Jerry Peng: @Grant Wu correct
----
2019-01-22 23:25:13 UTC - Grant Wu: Thanks for clarifying!
----
2019-01-22 23:25:43 UTC - Grant Wu: Not 100% sure I understand the difference 
between #2 and #3 though
----
2019-01-22 23:26:50 UTC - Grant Wu: In what instances would we get processing 
more than once with #2?
----
2019-01-22 23:27:02 UTC - Grant Wu: Or are the only differences in processing 
order
----
2019-01-22 23:37:39 UTC - Grant Wu: @Jerry Peng
----
2019-01-23 01:29:43 UTC - Jerry Peng: @Grant Wu when there is a failure, the 
function may execute process on the same message more than once
----
2019-01-23 01:31:44 UTC - Jerry Peng: That will happen for both #2 and #3, but 
#3 will also ensure ordering and use idempotent producing to make sure that 
outputs derived from a distinct message will only be written once to an output 
topic
----
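Editor's note: the idea behind idempotent producing can be sketched as an output 
topic that remembers the highest sequence ID it has stored and discards anything 
at or below it. This is a simplified illustration with a hypothetical 
`DedupTopic` class, not Pulsar's actual broker logic.

```python
class DedupTopic:
    """Toy output topic that ignores sequence IDs it has already stored."""

    def __init__(self):
        self.messages = []
        self.last_seq = -1

    def publish(self, seq_id, value):
        if seq_id <= self.last_seq:
            return False  # duplicate from a reprocessed input; discard
        self.messages.append(value)
        self.last_seq = seq_id
        return True


topic = DedupTopic()
# Input message 1 is processed twice because of a failure and restart.
for seq_id, value in [(0, "out-0"), (1, "out-1"), (1, "out-1"), (2, "out-2")]:
    topic.publish(seq_id, value)
print(topic.messages)  # ['out-0', 'out-1', 'out-2']
```

The function still runs `process` twice for the reprocessed message; only the 
duplicate output is suppressed.
----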
2019-01-23 01:33:23 UTC - Grant Wu: Ah okay
----
2019-01-23 01:33:35 UTC - Grant Wu: But what if you're using context publishing?
----
2019-01-23 01:33:48 UTC - Grant Wu: And don't publish to a normal output topic 
at all
----
2019-01-23 06:39:38 UTC - bossbaby: I have 1 topic and many brokers, but only 1 
broker serves that topic. If many producers and consumers connect to this 
topic, it will be a bottleneck. How do I solve it?
----
2019-01-23 06:53:52 UTC - Samuel Sun: partitions ?
----
2019-01-23 06:54:49 UTC - bossbaby: i use normal topic
----
2019-01-23 06:58:08 UTC - bossbaby: I realize Pulsar supports partitioned 
topics across many brokers, but one question: how many partitions should I use 
if I have 6 brokers?
----
2019-01-23 07:03:22 UTC - Samuel Sun: I think it could be 6,12,18.. ?
----
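Editor's note: one way to read the "6, 12, 18" suggestion is that a partition 
count that is a multiple of the broker count can be spread evenly. The sketch 
below assumes an idealized round-robin placement, which is a simplification: 
Pulsar's load manager decides the real assignment. The `spread` helper is 
hypothetical.

```python
from collections import Counter


def spread(num_partitions, num_brokers):
    """Partitions per broker under an idealized round-robin placement."""
    counts = Counter(p % num_brokers for p in range(num_partitions))
    return [counts[b] for b in range(num_brokers)]


print(spread(6, 6))   # [1, 1, 1, 1, 1, 1]
print(spread(12, 6))  # [2, 2, 2, 2, 2, 2]
print(spread(8, 6))   # [2, 2, 1, 1, 1, 1]
```
----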
2019-01-23 07:08:46 UTC - bossbaby: thank you @Samuel Sun
----
2019-01-23 07:09:09 UTC - Samuel Sun: np
----
2019-01-23 08:04:49 UTC - Samuel Sun: hi, one question about the required 
parameters in broker.conf: is “brokerServicePort” a required one? And how do I 
know which parameters are required? thanks
----
2019-01-23 08:06:12 UTC - Samuel Sun: this question is related to this issue : 
<https://github.com/apache/pulsar/issues/3390>
----
2019-01-23 08:18:32 UTC - bossbaby: why does PartitionedConsumer not support 
acknowledgeCumulative in the Pulsar C++ client?
----
