2018-12-12 09:26:16 UTC - Maarten Tielemans: Morning all, I managed to get my
2-node Pulsar cluster set up yesterday. However, in an initial latency test,
latency seems to have increased rather than decreased.
----
2018-12-12 09:26:29 UTC - Maarten Tielemans:
----
2018-12-12 09:27:24 UTC - Maarten Tielemans: I have configured my broker as
follows
```
# Number of bookies to use when creating a ledger
managedLedgerDefaultEnsembleSize=2
# Number of copies to store for each message
managedLedgerDefaultWriteQuorum=2
# Number of guaranteed copies (acks to wait before write is complete)
managedLedgerDefaultAckQuorum=1
```
----
2018-12-12 09:28:01 UTC - Maarten Tielemans: For reference, these were the
numbers I hit Friday with a standalone run over a longer duration
`09:44:03.405 [Thread-2] INFO org.apache.pulsar.testclient.PerformanceProducer
- Aggregated latency stats --- Latency: mean: 3.335 ms - med: 3.216 -
95pct: 4.328 - 99pct: 7.007 - 99.9pct: 15.858 - 99.99pct: 128.418 -
99.999pct: 261.110 - Max: 278.307`
----
2018-12-12 09:28:36 UTC - Maarten Tielemans: Any obvious things I am doing
wrong?
----
2018-12-12 09:29:39 UTC - Romain Castagnet: @Romain Castagnet has joined the
channel
----
2018-12-12 09:34:03 UTC - Romain Castagnet: Hi, before the next 2.3 release, do
you have a snapshot/beta repository?
----
2018-12-12 10:02:54 UTC - Ivan Kelly: @Romain Castagnet for the client or
server? there's a snapshot maven repo anyhow
----
2018-12-12 10:06:46 UTC - Romain Castagnet: @Ivan Kelly it's for the server, and
we need the latest tar.gz snapshot; where can I find it?
----
2018-12-12 10:42:28 UTC - Ivan Kelly: @Romain Castagnet I'm not sure if we
generate one. why can't you build master directly yourself?
----
2018-12-12 10:42:54 UTC - Ivan Kelly: @Maarten Tielemans what has the latency
increased from?
----
2018-12-12 10:43:09 UTC - Ivan Kelly: ah, sorry, didn't read fully
----
2018-12-12 10:44:43 UTC - Ivan Kelly: how are you running the benchmark?
----
2018-12-12 10:47:14 UTC - Romain Castagnet: thanks @Ivan Kelly, that's already
what we do
----
2018-12-12 10:47:29 UTC - Ivan Kelly: i don't think there's an official one
----
2018-12-12 10:48:32 UTC - Romain Castagnet: maybe
----
2018-12-12 10:56:55 UTC - Maarten Tielemans: hi @Ivan Kelly, in both cases I
ran the client on (one of) the nodes. The nodes are also set up in a placement
group
----
2018-12-12 10:58:47 UTC - Ivan Kelly: does the two-node cluster have the bookie
and the Pulsar broker on separate nodes, or are they on the same nodes?
----
2018-12-12 11:02:01 UTC - Maarten Tielemans: each node runs a zookeeper, bookie
and broker (similar to the standalone)
----
2018-12-12 11:07:23 UTC - Ivan Kelly: but each is running individually? as in,
in its own process?
----
2018-12-12 11:08:01 UTC - Ivan Kelly: standalone has journalSyncData=false for
BookKeeper, while a regular bookie runs with journalSyncData=true (the safer
setting).
----
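(A minimal sketch of the setting being discussed; the value shown matches the
standalone behaviour described above, not a recommendation for production.)
```
# conf/bookkeeper.conf
# true (default for a regular bookie): fsync the journal before acknowledging writes (safer)
# false (Pulsar standalone): skip the fsync, trading durability for lower latency
journalSyncData=false
```
----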
2018-12-12 12:25:07 UTC - Maarten Tielemans: yes, I am running zookeeper,
bookie and broker as separate (daemon) processes
----
2018-12-12 12:27:57 UTC - Maarten Tielemans: I'm aware of the
journalSyncData=true/false difference; but considering that I am using instances
with NVMe disks, I am not sure it would explain the difference. I can try with
the setting set to false
----
2018-12-12 12:40:57 UTC - Ivan Kelly: with false, it skips the fsync. fsync is
a syscall so you have a jump from userspace to kernel space
----
2018-12-12 13:00:14 UTC - Maarten Tielemans: I added the journalSyncData=false
setting to my conf/bookkeeper.conf, restarted both bookies and brokers, and
re-ran the performance test. Results are more or less the same
----
2018-12-12 13:00:30 UTC - Maarten Tielemans:
----
2018-12-12 13:32:29 UTC - Mike Card: Hey @David Kjerrumgaard last night I tried
reducing the number of parallel tasks from 2 to 1 and even added
synchronized(){} blocks around the access to the static variables in the custom
serializer to rule out any kind of concurrency issue, and none of that made a
difference. The logging showed the producer sending 64 bytes and the consumer
getting 64 bytes, so I have no idea what is making this happen. I might try
rewriting the serializer to not use the Composite() type; baffled as to why we
see this behavior under heavy load in Pulsar but not Kafka.
----
2018-12-12 14:02:24 UTC - Ivan Kelly: median and p95 are the same (slightly
better in fact), it's the p99 where things go to hell
----
2018-12-12 14:08:39 UTC - Ivan Kelly: hmm, the messages are quite large. I'm
going to guess the issue is some file rolling over. there are local spikes and a
large spike every minute
----
2018-12-12 14:19:17 UTC - Shalin: Are there plans to support submitting
multi-file Pulsar functions? i.e., maybe accepting a Python module instead of a
.py file.
----
2018-12-12 14:24:59 UTC - Maarten Tielemans: @Ivan Kelly - I ran `iostat -x 1`
on one node and re-ran the pulsar-perf test on the other. At times, I do see
(massive) spikes in wkB/s. Normally I expect 300 kB * 60 (= 18,000 kB/s), but I
see spikes of up to 250,000 kB/s. Any way to pinpoint/resolve this? I have the
journal and ledger on the same disk.
----
2018-12-12 14:25:09 UTC - Maarten Tielemans: ```
Device:   rrqm/s  wrqm/s   r/s      w/s   rkB/s      wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm  %util
xvda        0.00    0.00  0.00     2.00    0.00       5.50      5.50      0.00    0.00     0.00     0.00   0.00   0.00
nvme0n1     0.00    0.00  0.00   302.00    0.00   17978.00    119.06      0.03    0.15     0.00     0.15   0.08   2.40

avg-cpu:  %user   %nice  %system  %iowait  %steal  %idle
           6.06    0.00     8.59    29.80    0.00  55.56

Device:   rrqm/s  wrqm/s   r/s      w/s   rkB/s      wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm  %util
xvda        0.00    0.00  0.00     0.00    0.00       0.00      0.00      0.00    0.00     0.00     0.00   0.00   0.00
nvme0n1     0.00    0.00  0.00  2026.00    0.00  250402.50    247.19    163.13   37.98     0.00    37.98   0.30  61.60

avg-cpu:  %user   %nice  %system  %iowait  %steal  %idle
           1.04    0.00     1.55    21.76    0.00  75.65

Device:   rrqm/s  wrqm/s   r/s      w/s   rkB/s      wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm  %util
xvda        0.00    0.00  0.00     0.00    0.00       0.00      0.00      0.00    0.00     0.00     0.00   0.00   0.00
nvme0n1     0.00    0.00  0.00   472.00    0.00   47229.00    200.12     76.40  344.76     0.00   344.76   0.96  45.20

avg-cpu:  %user   %nice  %system  %iowait  %steal  %idle
           1.02    0.00     1.02     0.00    0.00  97.97

Device:   rrqm/s  wrqm/s   r/s      w/s   rkB/s      wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm  %util
xvda        0.00    0.00  0.00     0.00    0.00       0.00      0.00      0.00    0.00     0.00     0.00   0.00   0.00
nvme0n1     0.00    0.00  0.00   304.00    0.00   18012.00    118.50      0.04    0.18     0.00     0.18   0.08   2.40
```
----
2018-12-12 14:48:58 UTC - Sijie Guo: a wheel file package is supported in the
upcoming 2.3.0 release. it requires some documentation. @Sanjeev Kulkarni or
@Jerry Peng can help with that.
----
2018-12-12 14:49:57 UTC - Ivan Kelly: journal and ledger are on the same disk
for standalone too.
----
2018-12-12 14:53:41 UTC - Maarten Tielemans: I could check the disk activity
with standalone too for comparison, but it would take a bit of time
----
2018-12-12 14:56:18 UTC - Olivier Chicha: yes, it is a possibility that I
envision.
I have never played with Debezium;
I was also envisioning using mysql-binlog-connector-java directly.
My only concern is the added load and latency this would introduce compared to
the mechanism we already have in place.
We will most probably give it a try, as it definitely simplifies the concepts
and the implementation.
----
2018-12-12 15:02:51 UTC - Ivan Kelly: what class of machine are you using for
these tests?
----
2018-12-12 15:03:41 UTC - Maarten Tielemans: 2 i3.large instances with the
local NVMe disks
----
2018-12-12 15:07:45 UTC - Matteo Merli: @Maarten Tielemans can you verify that
you are indeed writing to the NVMe disks and not the VM local disk? The numbers
you’re getting look more like the latter.
The spikes in IOPS happen when the write cache gets flushed. That’s normal and
expected.
----
2018-12-12 15:12:13 UTC - Sijie Guo: @Matteo Merli: I think the 3 ms comes from
1 ms of batching at the client side and 1 ms of batching at the journal group
commit. That’s kind of expected, because he is sending requests at 60
messages/sec. @Maarten Tielemans: you can do the following:
- add `--batch-time-window 0` to your perf command
- set `journalFlushWhenQueueEmpty` to true in your `conf/bookkeeper.conf`
----
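(A minimal sketch of the two tweaks suggested above; the topic name, rate, and
message size are placeholders standing in for the actual test parameters.)
```
# conf/bookkeeper.conf: flush the journal as soon as the write queue drains,
# instead of waiting out the group-commit interval (useful at low request rates)
journalFlushWhenQueueEmpty=true

# perf client: disable client-side batching so each message is sent immediately
bin/pulsar-perf produce my-topic --rate 60 --size 300000 --batch-time-window 0
```
----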
2018-12-12 15:13:30 UTC - Ivan Kelly: none of this explains why standalone
doesn't have this issue though
----
2018-12-12 15:17:00 UTC - Sijie Guo: I think I suggested the same thing for his
standalone setup
----
2018-12-12 15:21:11 UTC - Maarten Tielemans: @Matteo Merli I only see
(relevant) disk activity on the nvme0n1 disk, so that seems correct. iostat
shows write activity of about 18,000 kB/s, which seems correct for the number
of messages (60/s) and their respective size (300 kB). Apart from writing those
messages, where would the disk IO spike (~250,000 kB/s) come from?
@Sijie Guo I am running bin/pulsar-perf with --batch-time-window 0; I have not
yet set journalFlushWhenQueueEmpty=true (will do so now)
----
2018-12-12 15:22:28 UTC - Ivan Kelly: @Maarten Tielemans each message gets
written to the journal, and then the ledger disk. the spike is the ledger disk
write. it flushes every 60s
----
2018-12-12 15:24:21 UTC - Sijie Guo: @Maarten Tielemans I would suggest you set
journalFlushWhenQueueEmpty to true given your request rate is so low.
----
2018-12-12 15:32:47 UTC - Maarten Tielemans: Some numbers for a setup with
journalFlushWhenQueueEmpty=true on both nodes. Hitting lower mean and medians,
but spikes persist
----
2018-12-12 15:32:56 UTC - Maarten Tielemans:
----
2018-12-12 15:42:35 UTC - Christophe Bornet: I have seen that PIP25 was
implemented. Does it allow connecting Pulsar to an OIDC server such as Okta or
Keycloak?
----
2018-12-12 15:58:06 UTC - Matteo Merli: @Christophe Bornet not at this point,
though the intention is to have a pluggable source on the backend
----
2018-12-12 16:16:53 UTC - David Kjerrumgaard: @Mike Card Thanks for the update.
That is strange behavior
----
2018-12-12 16:17:54 UTC - Mike Card: Yes it is. I am taking today off and
tomorrow I’ll resume looking at this
----
2018-12-12 16:44:54 UTC - David Kjerrumgaard: At least we have isolated the
issue to the behavior of the serializer with Pulsar, as you are both reading
and writing 64 bytes on the producer and consumer side, correct?
----
2018-12-12 16:53:47 UTC - Mike Card: Yes, I had originally thought the messages
were being truncated, but my tests say that is not the case
----
2018-12-12 18:02:58 UTC - Matteo Merli: That was too early in the morning.. I
hadn’t seen the iostat output above :slightly_smiling_face:
----
2018-12-12 18:05:55 UTC - Matteo Merli: @Maarten Tielemans These numbers are
not consistent with what should be expected from NVMe disks.
Can you check the bookie stats, which will have the complete latency breakdown?
Eg: please post the output of `<http://BOOKIE:8000/metrics>`
2018-12-12 20:54:46 UTC - Grant Wu: Is it possible to have a Pulsar function
publish to an arbitrary number of output topics?
----
2018-12-12 21:02:23 UTC - Sanjeev Kulkarni: The question is how you decide
which topic the output should go to. Currently this can only be achieved by
explicitly using the context.publish() method inside the function
----
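(A minimal Python sketch of what Sanjeev describes, assuming the Pulsar
Functions Python SDK; the topic names and routing rule are made-up examples,
and the target topics are assumed to already exist.)
```python
from pulsar import Function

class RoutingFunction(Function):
    """Routes each incoming message to one of several pre-existing output topics."""

    def process(self, input, context):
        # Pick the destination topic from the message content
        # (this routing rule is a made-up example).
        if input.startswith("error"):
            topic = "persistent://public/default/errors"
        else:
            topic = "persistent://public/default/events"
        # Publish explicitly instead of returning a value, so the function
        # can fan out to more than one output topic.
        context.publish(topic, input)
        # Returning None means nothing is written to the configured --output topic.
        return None
```
----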
2018-12-12 21:03:06 UTC - Grant Wu: Hrm, assume that we can determine the topic
----
2018-12-12 21:41:30 UTC - Grant Wu: Also we would need to subscribe to a
multi-topic subscription :confused:
----
2018-12-12 21:41:33 UTC - David Kjerrumgaard: We would also have to assume that
all of the "target" topics already exist and that the function has permission
to write to them.....as they won't be created at function creation time.
----
2018-12-12 21:41:49 UTC - Grant Wu: Oh, hrm…
----
2018-12-12 21:52:10 UTC - Grant Wu: Yeah, I had a feeling this was unlikely to
work, and now I have some evidence…
----
2018-12-12 21:56:33 UTC - Matteo Merli: > I don’t think it’s a huge problem,
if everything else works as expected, meaning the client will get a transient
failure and recover.
You can have multiple topics as input for a function
----
2018-12-12 21:57:54 UTC - Grant Wu: er, why did you quote that
----
2018-12-12 21:58:26 UTC - Grant Wu: Additionally, can you listen on a regex of
topics?
----
2018-12-12 21:58:32 UTC - Grant Wu: like you can with multi-topic subscriptions?
----
2018-12-12 21:58:56 UTC - Sanjeev Kulkarni: yes you can
----
2018-12-12 22:00:13 UTC - Grant Wu: So we’d just use the standard multi-topic
subscription format here:
<https://pulsar.apache.org/docs/en/functions-deploying/#local-run-mode> ?
----
2018-12-12 22:00:21 UTC - Matteo Merli: oh, I meant to quote the other message…
:slightly_smiling_face:
> Is it possible to have a Pulsar function publish to an arbitrary number of
output topics?
----
2018-12-12 22:00:21 UTC - Grant Wu: as an argument to `--inputs`
----
2018-12-12 22:01:59 UTC - Grant Wu: so something like
```
$ bin/pulsar-admin functions create \
--py myfunc.py \
--classname myfunc.SomeFunction \
--inputs "<persistent://public/default/.*>" \
--output <persistent://public/default/output-1>
```
----
2018-12-12 22:03:22 UTC - Grant Wu: Hrm. But is it possible to publish to an
arbitrary number of outputs? It seems like I can use `context.publish` with
the following caveats - I need to specify the topic name explicitly, and I
can’t create new topics
----
2018-12-12 22:08:36 UTC - Grant Wu: (Just added quotes to avoid shell globbing)
----
2018-12-12 22:21:29 UTC - Matteo Merli: that is correct. It’s just not
automated in the same way as “returning a value from the function”
----
2018-12-12 22:29:18 UTC - Sanjeev Kulkarni: --inputs is a comma-separated list
of topic names (and cannot contain a regex). For a regex, use --topics-pattern
----
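(Adapting Grant's earlier command to what Sanjeev describes; a sketch only, with
the pattern and output topic carried over from the example above.)
```
$ bin/pulsar-admin functions create \
  --py myfunc.py \
  --classname myfunc.SomeFunction \
  --topics-pattern "persistent://public/default/.*" \
  --output persistent://public/default/output-1
```
----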
2018-12-12 22:30:34 UTC - Grant Wu: ah, okay
----
2018-12-12 22:33:22 UTC - Jonathan Budd: @Jonathan Budd has joined the channel
----
2018-12-12 22:37:51 UTC - Grant Wu: It would be nice to get this better
documented :sweat_smile: it seems like the only way to discover this is by
looking at the command line `--help`
----
2018-12-12 22:58:37 UTC - Grant Wu: Oh, I think I have hit a bug in the Pulsar
Functions Python SDK
----
2018-12-12 22:58:44 UTC - Grant Wu: ```
Traceback (most recent call last):
File
"/Users/grant.wu/fakebin/apache-pulsar-2.2.0/instances/python-instance/python_instance.py",
line 267, in actual_execution
output_object = self.function_class.process(input_object, self.contextimpl)
File "/Users/grant.wu/petuum/pulsar_functions_poc/test.py", line 8, in process
context.publish("<persistent://public/default/muhoutput>", input + '!')
File
"/Users/grant.wu/fakebin/apache-pulsar-2.2.0/instances/python-instance/contextimpl.py",
line 135, in publish
compression_type=compression_type
File
"/Users/grant.wu/fakebin/apache-pulsar-2.2.0/instances/python-instance/pulsar/__init__.py",
line 378, in create_producer
_check_type(CompressionType, compression_type, 'compression_type')
File
"/Users/grant.wu/fakebin/apache-pulsar-2.2.0/instances/python-instance/pulsar/__init__.py",
line 925, in _check_type
raise ValueError("Argument %s is expected to be of type '%s'" % (name,
var_type.__name__))
ValueError: Argument compression_type is expected to be of type
'CompressionType'
```
----
2018-12-12 22:58:56 UTC - Grant Wu: It appears that it does not like the
default `None` argument
----
2018-12-12 23:02:04 UTC - Grant Wu: Oh, this bug is fixed in master.
----
2018-12-12 23:02:18 UTC - Grant Wu: It was
<https://github.com/apache/pulsar/pull/2844>.
----
2018-12-12 23:03:44 UTC - Grant Wu: :disappointed: looks like this bugfix
missed the release window by 10 days
----
2018-12-12 23:04:06 UTC - Grant Wu: @Matteo Merli thoughts?
----
2018-12-12 23:04:12 UTC - Grant Wu: Is there a RC I can download?
----
2018-12-12 23:04:36 UTC - Matteo Merli: we’re preparing a 2.2.1 release
----
2018-12-12 23:05:47 UTC - Grant Wu: So just sit tight, then? :stuck_out_tongue:
----
2018-12-12 23:06:17 UTC - Matteo Merli: that should be soon
:slightly_smiling_face:
----
2018-12-12 23:06:20 UTC - Grant Wu: okay :slightly_smiling_face:
----
2018-12-13 03:38:37 UTC - wangwei: @wangwei has joined the channel
----
2018-12-13 05:58:49 UTC - 廖师虎: @廖师虎 has joined the channel
----