2019-12-10 13:43:44 UTC - juraj: build of `master` in a `pulsar-build`
container fails with:
----
2019-12-10 14:25:59 UTC - juraj: same on the `maven:3.6.3-jdk-8` image
----
2019-12-10 14:38:06 UTC - Martin Kunev: Hi,
I have a question regarding message ordering on a single persistent topic. I
couldn't figure it out from reading the documentation.
A topic has replication clusters: cluster0, cluster1 and cluster2. There is a
subscriber for the topic on each cluster. In the following scenario:
* cluster0 publishes messageA
* the subscriber on cluster1 receives messageA and publishes messageB as a
result
Is it possible that the subscriber on cluster2 receives messageB before
messageA?
----
2019-12-10 14:51:47 UTC - Joe Francis: Yes. Message ordering is guaranteed only
per topic (partition) per producer.
----
2019-12-10 15:25:23 UTC - Daniel Ferreira Jorge: Hello guys... I'm trying to
configure the GCS Offloader for a new deployment... I have the
`tiered-storage-jcloud-2.4.2.nar` inside the `offloaders` directory (I'm using
the `pulsar-all` docker image), but I keep getting `No offloader found for
driver 'google-cloud-storage'. Please make sure you dropped the offloader nar
packages under '${PULSAR_HOME}/offloaders'` and the broker won't initialize...
Below is my `broker.conf` offloading config.... Am I missing something?
```### --- Ledger Offloading --- ###
# The directory for all the offloader implementations
offloadersDirectory=./offloaders
# Driver to use to offload old data to long term storage (Possible values: S3,
aws-s3, google-cloud-storage)
# When using google-cloud-storage, Make sure both Google Cloud Storage and
Google Cloud Storage JSON API are enabled for
# the project (check from Developers Console -> Api&auth -> APIs).
managedLedgerOffloadDriver=google-cloud-storage
# Maximum number of thread pool threads for ledger offloading
managedLedgerOffloadMaxThreads=2
# Use Open Range-Set to cache unacked messages
managedLedgerUnackedRangesOpenCacheSetEnabled=true
# For Amazon S3 ledger offload, AWS region
s3ManagedLedgerOffloadRegion=
# For Amazon S3 ledger offload, Bucket to place offloaded ledger into
s3ManagedLedgerOffloadBucket=
# For Amazon S3 ledger offload, Alternative endpoint to connect to (useful for
testing)
s3ManagedLedgerOffloadServiceEndpoint=
# For Amazon S3 ledger offload, Max block size in bytes. (64MB by default, 5MB
minimum)
s3ManagedLedgerOffloadMaxBlockSizeInBytes=67108864
# For Amazon S3 ledger offload, Read buffer size in bytes (1MB by default)
s3ManagedLedgerOffloadReadBufferSizeInBytes=1048576
# For Google Cloud Storage ledger offload, region where offload bucket is
located.
# reference this page for more details:
<https://cloud.google.com/storage/docs/bucket-locations>
gcsManagedLedgerOffloadRegion=us-central1
# For Google Cloud Storage ledger offload, Bucket to place offloaded ledger into
gcsManagedLedgerOffloadBucket=pulsar-topic-offload
# For Google Cloud Storage ledger offload, Max block size in bytes. (64MB by
default, 5MB minimum)
gcsManagedLedgerOffloadMaxBlockSizeInBytes=67108864
# For Google Cloud Storage ledger offload, Read buffer size in bytes (1MB by
default)
gcsManagedLedgerOffloadReadBufferSizeInBytes=1048576
# For Google Cloud Storage, path to json file containing service account
credentials.
# For more details, see the "Service Accounts" section of
<https://support.google.com/googleapi/answer/6158849>
gcsManagedLedgerOffloadServiceAccountKeyFile=/tmp/gcp_access.json```
----
2019-12-10 15:40:20 UTC - Alexandre DUVAL: @Daniel Ferreira Jorge Hi, you need
to add
<https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.4.2/apache-pulsar-offloaders-2.4.2-bin.tar.gz>
----
2019-12-10 15:57:42 UTC - Nick Ruhl: @Nick Ruhl has joined the channel
----
2019-12-10 15:57:43 UTC - Daniel Ferreira Jorge: @Alexandre DUVAL Thanks for
the answer. Where should I add this file? I'm already using the `pulsar-all`
docker image, which is supposed to contain all the necessary files...
----
2019-12-10 16:24:39 UTC - Alexandre DUVAL: In /lib
----
2019-12-10 16:27:26 UTC - Nick Ruhl: Hi Pulsar Community. I am new to Pulsar
but just stood up a K8S cluster and plan to use it heavily in production soon.
I am currently chaos testing it in order to help my understanding of managing
the cluster and fixing issues when they arise. My main questions concern the
procedure for what to do if/when the ledger and/or journal disks fill up, as
this has been my largest issue and so far has required me to rebuild the
cluster.
```- How can I get things back on track and/or clear the ledger and/or journal
so the cluster is functional? (If persistence is not required)
- Can I extend the volumes and perform some actions to get things realigned? (If
persistence is required)
- Is there any documentation on these topics and what to do when things go
wrong?```
Thank you all and happy holidays!
----
2019-12-10 16:31:42 UTC - Sijie Guo: Can you provide more output of that
command? I can’t see anything from the screenshot.
----
2019-12-10 16:37:33 UTC - Sijie Guo: Can you create a GitHub issue or a
Stack Overflow question for this? It is better answered there, where the answer
can benefit the whole community.
+1 : Nick Ruhl
----
2019-12-10 16:38:15 UTC - Nick Ruhl: @Sijie Guo No problem. Thank you
----
2019-12-10 16:53:42 UTC - juraj: can you see better here?
----
2019-12-10 16:53:47 UTC - juraj: ```[ERROR] Failed to execute goal
org.codehaus.mojo:exec-maven-plugin:1.6.0:exec (rename-epoll-library) on
project managed-ledger: Command execution failed.: Process exited with an
error: 127 (Exit value: 127) -> [Help 1]```
----
2019-12-10 16:54:34 UTC - juraj: found the same issue here, but idk how it was
solved:
<http://mail-archives.apache.org/mod_mbox/pulsar-dev/201807.mbox/%3C680602365.2282.1532567683130.JavaMail.jenkins@jenkins01%3E>
----
2019-12-10 17:02:10 UTC - juraj: more context
----
2019-12-10 17:02:21 UTC - juraj: ohhhh `zip: command not found`
----
2019-12-10 17:03:01 UTC - juraj: `apt-get install zip` and trying again
----
2019-12-10 17:18:48 UTC - Daniel Ferreira Jorge: @Alexandre DUVAL The file you
told me to download only contains `tiered-storage-jcloud-2.4.2.nar`, which
is already inside my `/pulsar/offloaders` directory... anyway, I did download
it and put it inside `lib`, and I get the same results....
----
2019-12-10 17:24:13 UTC - Alexandre DUVAL: my bad, you have to put it in
pulsar/offloaders
----
2019-12-10 17:24:32 UTC - Alexandre DUVAL: ```~pulsar/offloaders # ls
tiered-storage-jcloud-2.4.0.nar```
----
2019-12-10 17:24:44 UTC - Alexandre DUVAL: @Daniel Ferreira Jorge
----
2019-12-10 17:24:56 UTC - Daniel Ferreira Jorge: it already is
----
2019-12-10 17:25:16 UTC - juraj: that worked, i'm on another issue involving
the docker-maven-plugin, will post later
----
2019-12-10 17:25:30 UTC - Daniel Ferreira Jorge: @Alexandre DUVAL
----
2019-12-10 17:35:28 UTC - Daniel Ferreira Jorge: From the logs, I can see an
exception when it tries to load the nar file:
----
2019-12-10 17:35:33 UTC - Daniel Ferreira Jorge:
```java.io.IOException:
/tmp/pulsar-nar/tiered-storage-jcloud-2.4.2.nar-unpacked/META-INF could not be
created
at
org.apache.pulsar.common.nar.FileUtils.ensureDirectoryExistAndCanReadAndWrite(FileUtils.java:51)
~[org.apache.pulsar-pulsar-common-2.4.2.jar:2.4.2]
at
org.apache.pulsar.common.nar.NarUnpacker.unpack(NarUnpacker.java:106)
~[org.apache.pulsar-pulsar-common-2.4.2.jar:2.4.2]
at
org.apache.pulsar.common.nar.NarUnpacker.unpackNar(NarUnpacker.java:66)
~[org.apache.pulsar-pulsar-common-2.4.2.jar:2.4.2]
at
org.apache.pulsar.common.nar.NarClassLoader.getFromArchive(NarClassLoader.java:141)
~[org.apache.pulsar-pulsar-common-2.4.2.jar:2.4.2]
at
org.apache.bookkeeper.mledger.offload.OffloaderUtils.getOffloaderDefinition(OffloaderUtils.java:109)
~[org.apache.pulsar-managed-ledger-original-2.4.2.jar:2.4.2]
at
org.apache.bookkeeper.mledger.offload.OffloaderUtils.lambda$searchForOffloaders$1(OffloaderUtils.java:130)
~[org.apache.pulsar-managed-ledger-original-2.4.2.jar:2.4.2]
at java.lang.Iterable.forEach(Iterable.java:75) [?:1.8.0_232]
at
org.apache.bookkeeper.mledger.offload.OffloaderUtils.searchForOffloaders(OffloaderUtils.java:128)
[org.apache.pulsar-managed-ledger-original-2.4.2.jar:2.4.2]
at
org.apache.pulsar.broker.PulsarService.createManagedLedgerOffloader(PulsarService.java:728)
[org.apache.pulsar-pulsar-broker-2.4.2.jar:2.4.2]
at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:382)
[org.apache.pulsar-pulsar-broker-2.4.2.jar:2.4.2]
at
org.apache.pulsar.PulsarBrokerStarter$BrokerStarter.start(PulsarBrokerStarter.java:273)
[org.apache.pulsar-pulsar-broker-2.4.2.jar:2.4.2]
at
org.apache.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:332)
[org.apache.pulsar-pulsar-broker-2.4.2.jar:2.4.2]```
----
2019-12-10 17:38:08 UTC - Daniel Ferreira Jorge: apparently it is trying to
unpack the nar into a temp folder, but for some reason it can't
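(Editor's note: the unpack path in the stack trace is fixed under `/tmp/pulsar-nar`, so a quick writability check inside the broker container — a sketch, with the directory name taken straight from the trace — can tell whether this is a `/tmp` permissions problem:)

```shell
# The broker unpacks offloader NARs under /tmp/pulsar-nar/ before loading
# them; the IOException above means a directory under that path could not
# be created. Check whether the broker's user can actually write there:
if mkdir -p /tmp/pulsar-nar/write-test 2>/dev/null; then
    echo "/tmp/pulsar-nar is writable"
    rmdir /tmp/pulsar-nar/write-test
else
    echo "/tmp/pulsar-nar is NOT writable"
fi
```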
----
2019-12-10 17:46:44 UTC - Brian Doran: Sorry @Sijie Guo I completely missed
your reply to this.
----
2019-12-10 17:47:05 UTC - Brian Doran: we are looking at improving the
throughput from our Pulsar Producer clients to a 3 node Pulsar cluster...
We cannot get the message throughput beyond about 230K/sec no matter how many
changes we make to the producer client settings:
Current client settings are:
2019-12-10 15:35:48.347Z INFO [Export-Pipeline-Queue-9]
o.a.p.c.i.ProducerStatsRecorderImpl - Pulsar client config: {
"serviceUrl" :
"pulsar://prod-fx3s1c.s.com:6650,prod-fx3s1a.s.com:6650,prod-fx3s1b.s.com:6650",
"authPluginClassName" : null,
"authParams" : null,
"operationTimeoutMs" : 30000,
"statsIntervalSeconds" : 60,
"numIoThreads" : 50,
"numListenerThreads" : 1,
"connectionsPerBroker" : 15,
"useTcpNoDelay" : true,
"useTls" : false,
"tlsTrustCertsFilePath" : "",
"tlsAllowInsecureConnection" : false,
"tlsHostnameVerificationEnable" : false,
"concurrentLookupRequest" : 5000,
"maxLookupRequest" : 50000,
"maxNumberOfRejectedRequestPerConnection" : 50,
"keepAliveIntervalSeconds" : 30,
"connectionTimeoutMs" : 10000,
"requestTimeoutMs" : 60000,
"defaultBackoffIntervalNanos" : 100000000,
"maxBackoffIntervalNanos" : 30000000000
}
We have 13 partitioned topics with 10 partitions each
bin/pulsar-admin topics list-partitioned-topics public/default
<persistent://public/default/TestTopic1>
<persistent://public/default/TestTopic2>
<persistent://public/default/TestTopic3>
<persistent://public/default/TestTopic4>
<persistent://public/default/TestTopic5>
<persistent://public/default/TestTopic6>
<persistent://public/default/TestTopic7>
<persistent://public/default/TestTopic8>
<persistent://public/default/TestTopic9>
<persistent://public/default/TestTopic10>
<persistent://public/default/TestTopic11>
<persistent://public/default/TestTopic12>
<persistent://public/default/TestTopic13>
<persistent://public/default/TestTopic14>
<persistent://public/default/TestTopic15>
As you can see from the picture we have lots of producers: 16 threads consuming
data, each one with a producer per partition; so it's quite a high producer
count.
2019-12-10 15:35:48.343Z INFO [Export-Pipeline-Queue-9]
o.a.p.c.i.ProducerStatsRecorderImpl - Starting Pulsar producer perf with
config: {
"topicName" : "<persistent://public/default/TestTopic1>",
"producerName" :
"<persistent://public/default/TestTopic1[Export-Pipeline-Queue-9]>",
"sendTimeoutMs" : 30000,
"blockIfQueueFull" : true,
"maxPendingMessages" : 5000,
"maxPendingMessagesAcrossPartitions" : 50000,
"messageRoutingMode" : "RoundRobinPartition",
"hashingScheme" : "JavaStringHash",
"cryptoFailureAction" : "FAIL",
"batchingMaxPublishDelayMicros" : 200000,
"batchingMaxMessages" : 1000,
"batchingEnabled" : true,
"batcherBuilder" : { },
"compressionType" : "LZ4",
"initialSequenceId" : null,
"autoUpdatePartitions" : true,
"properties" : { }
}
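(Editor's note: to put that setup in numbers, a back-of-the-envelope sketch using only the figures quoted above; the per-partition split of `maxPendingMessagesAcrossPartitions` is an approximation of how the Java client reduces the per-producer budget:)

```python
# Back-of-the-envelope numbers for the setup described above, taken from
# the pasted config: 16 consuming threads, each with one producer per
# partition of 13 topics x 10 partitions.
threads, topics, partitions = 16, 13, 10
total_producers = threads * topics * partitions
print("total producers:", total_producers)  # total producers: 2080

# The Java client caps in-flight messages per producer; the effective
# per-partition limit is roughly
# min(maxPendingMessages, maxPendingMessagesAcrossPartitions / partitions).
max_pending = 5000
max_pending_across = 50000
per_partition_cap = min(max_pending, max_pending_across // partitions)
print("per-partition pending cap:", per_partition_cap)  # per-partition pending cap: 5000
```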
----
2019-12-10 17:47:49 UTC - Brian Doran:
----
2019-12-10 17:49:34 UTC - Brian Doran: We run the same data through with the
destination as our Kafka broker (we've been doing this for a long time, so we
know what to expect there), but with Pulsar we've only really started
benchmarking over the last few weeks and are having trouble replicating the
Kafka throughput.
----
2019-12-10 18:06:36 UTC - Sijie Guo: Can you share more details about the
Pulsar setup (e.g. # brokers, # bookies, bookie disk and configuration)?
----
2019-12-10 19:06:32 UTC - Ryan Samo: Hey guys, is there a limit to the number
of subscriptions that can be active on a topic?
----
2019-12-10 19:18:56 UTC - Joe Francis: Not really. Keep in mind that every sub
adds 1X dispatch load. So you will need to consider scaling
----
2019-12-10 19:20:00 UTC - Addison Higham: no hard limit and they should be
fairly cheap in the case of tailing reads, but there is obviously bandwidth as
well as some CPU and mem used.
----
2019-12-10 19:20:58 UTC - Ryan Samo: Awesome, just thinking about having
devices subscribe to receive push notifications etc. thanks!
----
2019-12-10 19:21:34 UTC - Ryan Samo: I didn’t want to reach some arbitrary cap
----
2019-12-10 19:23:06 UTC - Joe Francis: Depends on the numbers. What numbers are
we talking about? 100s? 1000s? Millions?
----
2019-12-10 19:23:19 UTC - Ryan Samo: 1000s
----
2019-12-10 19:24:20 UTC - Ryan Samo: Loading a config to each client on
connection via Pulsar then all of the clients update with that config. For
example
----
2019-12-10 19:26:07 UTC - Ryan Samo: Trying to keep many clients in sync
----
2019-12-10 19:27:38 UTC - Addison Higham: since topics are tied to a single
broker, subscriptions don't split across multiple brokers, so your mechanism to
scale that out is to just use bigger brokers. You could consider fanning out the
data to multiple topics and keying the data, or, if not, one common technique in
systems like rabbitmq (that should work here as well) is a "tiered" fanout. For
example, if you had a single topic you produce to, have a Pulsar function fan
out to 10 other topics (with each topic holding a copy of the data); your
clients then randomly choose one of the 10 topics and subscribe to it. That way,
instead of 1000 subscriptions on a single broker, you get 200 subscriptions on
each of 5 brokers, for example
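(Editor's note: a sketch of the client-side half of that tiered fanout. The topic names and the helper are hypothetical, and the Pulsar function doing the copying is not shown; each client simply maps itself onto one of the N fanout topics and subscribes only to that one:)

```python
import zlib

# Hypothetical fanout topics, each assumed to carry a full copy of the
# data (populated by a Pulsar function reading the single ingest topic).
FANOUT = 10
fanout_topics = [
    f"persistent://public/default/config-fanout-{i}" for i in range(FANOUT)
]

def pick_topic(client_id: str) -> str:
    """Deterministically assign a client to one fanout topic.

    crc32 is stable across runs (unlike Python's salted hash()), so a
    client lands on the same topic after every reconnect; a plain
    random.choice() would spread load just as well but would not stick.
    """
    return fanout_topics[zlib.crc32(client_id.encode()) % FANOUT]
```

With, say, 1000 clients, they end up roughly evenly spread over the 10 topics, so each topic (and the broker owning it) sees on the order of 100 subscriptions.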
----
2019-12-10 19:28:47 UTC - Ryan Samo: Is this true for partitioned topics as
well?
----
2019-12-10 19:29:00 UTC - Ryan Samo: I thought they spanned brokers
----
2019-12-10 19:31:19 UTC - Addison Higham: they do, but a subscription also
spans multiple brokers as well (in the case of shared, exclusive, and
failover) so you don't buy yourself a ton there
----
2019-12-10 19:31:45 UTC - Addison Higham: not sure about the details of key
shared
----
2019-12-10 19:32:26 UTC - Joe Francis: It all depends on the throughput you plan
to push. 1X in means 1000sX out. If it's a config once in a while, it should be
possible, but test it out
----
2019-12-10 19:33:51 UTC - Joe Francis: For some context, I run subs in the
100s, without issues
----
2019-12-10 19:54:53 UTC - juraj: almost there, but have hit this one now:
----
2019-12-10 19:56:36 UTC - juraj: seems like starting the whole build from
within docker will not work
----
2019-12-10 19:57:46 UTC - Ryan Samo: I see, ok thanks!
----
2019-12-10 19:58:36 UTC - juraj: actually, seems like some images were built!
tada : Roman Popenov
----
2019-12-10 19:58:41 UTC - Roman Popenov: Can someone go into more detail about
key shared subscriptions?
----
2019-12-10 19:58:54 UTC - Roman Popenov: Or point me to the right KBA?
----
2019-12-10 20:00:49 UTC - Brian Doran: We have 3 brokers running in docker / 3
bookies running in docker / 3 zookeepers running in docker
----
2019-12-10 20:01:30 UTC - Brian Doran: which configurations are you looking
for.. bookie.conf?
----
2019-12-10 20:14:55 UTC - David Kjerrumgaard:
<https://github.com/apache/pulsar/wiki/PIP-34%3A-Add-new-subscribe-type-Key_shared>
thanks : Roman Popenov
----
2019-12-10 20:30:34 UTC - Brian Doran: *CPU*: 2 x Intel Xeon E5-2695 v3
2.3GHz,35M Cache,9.60GT/s QPI,Turbo,HT,14C/28T
*RAM*: 128GB RAM
*OS Disk*: 2 x 200GB SSD
*Network*: 10Gbps
FD332 Specs
16 x 1.2TB 10K SAS 6Gbps 2.5"
----
2019-12-10 20:31:34 UTC - Brian Doran: very few changes to the bookie config
----
2019-12-11 08:40:58 UTC - Jens Fosgerau: @Jens Fosgerau has joined the channel
----
2019-12-11 08:41:29 UTC - Dan Koopman: @Dan Koopman has joined the channel
----