2019-01-18 09:19:00 UTC - David Tinker: I have 2 services publishing the same messages to a persistent topic at more or less the same time. Another service is consuming the messages and it is not seeing duplicates as I would expect. As far as I can tell I haven't enabled de-duplication. Any idea whats happening? Pulsar server is 2.2.0. Producer client is 2.2.0, consumer client is 2.3.0-SNAPSHOT. ---- 2019-01-18 09:32:40 UTC - jia zhai: @David Tinker You are using 2 producer right? In messageMetadata, it could contain `producerName`, If setting it in your message produce code, `MessageMetadata msgMetadata = metadataBuilder.setProducerName(“test”).... `, then when you consume it, you could know the message is coming from which producer. ---- 2019-01-18 09:42:06 UTC - Gofu: Hello guys, I am using pulsar c++ client, somebody knows if when we are consuming a message there is a way to tell which partition it belongs to? I am using a partitionated topic, and I am only managing to get the topic, not the specific partition. ---- 2019-01-18 09:48:52 UTC - jia zhai: @Gofu In C++, you could also set/get PartitonKey for a message ---- 2019-01-18 09:49:57 UTC - Gofu: @jia zhai I know, that's when I am publishing, in order to route to a given partition, the thing is how to know which partition that the message belongs (when consuming) ---- 2019-01-18 09:50:40 UTC - Gofu: actually, I am using the golang client, not the c++ directly, but the golang client uses the c++ client ---- 2019-01-18 09:50:47 UTC - jia zhai: I see. ---- 2019-01-18 09:51:11 UTC - Gofu: that would be helpful in order to launch a thread for each partition ---- 2019-01-18 09:58:15 UTC - jia zhai: @Gofu what did you get from message.getTopicName? ---- 2019-01-18 09:59:22 UTC - jia zhai: It should contains the partition-id part ---- 2019-01-18 10:22:05 UTC - Gofu: @jia zhai so in the go client there is a Consumer interface in the message that has a method Topic(), that method doesnt return the partition-id part, i am assuming that method is the same than message.getTopicName, or am I wront? ---- 2019-01-18 10:31:02 UTC - jia zhai: @Gofu Are you using PartitionedConsumer.Topic()? ---- 2019-01-18 10:34:37 UTC - jia zhai: I remember, PartitionedConsumer.Topic() will not contains partition part, but each message is assigned with topicName get from sub-consumer that contained in PartitionedConsumer. It should contains that part. ---- 2019-01-18 10:41:59 UTC - Gofu: @jia zhai I am using this ---- 2019-01-18 10:43:24 UTC - Gofu: I can't find PartitionedConsumer ---- 2019-01-18 10:58:41 UTC - Bogdan BUNECI: Short version: This buggy behavior is fixed in snapshot by PR #2969 ---- 2019-01-18 10:59:27 UTC - Bogdan BUNECI: i will check also presto ---- 2019-01-18 11:05:15 UTC - jia zhai: @Gofu <https://github.com/apache/pulsar/pull/3346> ---- 2019-01-18 11:05:31 UTC - jia zhai: Here is the PR handling getTopicName for Go client ---- 2019-01-18 11:24:03 UTC - Gofu: @jia zhai Yeah, that's the method I am using ---- 2019-01-18 11:24:38 UTC - Gofu: but it returns the topic and not the partition. ---- 2019-01-18 11:25:21 UTC - Gofu: nevermind, I am using the one from consumer ---- 2019-01-18 11:26:03 UTC - Gofu: will try to update and see what happens ---- 2019-01-18 11:26:04 UTC - Gofu: thanks ---- 2019-01-18 11:46:45 UTC - David Tinker: Tx. I added logging for the producer name and I am receiving messages more or less evenly from each producer with no dups. The /admin/v2/../stats endpoint has "deduplicationStatus" : "Disabled". ---- 2019-01-18 11:49:15 UTC - Gofu: @jia zhai probably it could solve my problem, but its missing the release, can you tell me when you intend to do it? ---- 2019-01-18 11:52:25 UTC - jia zhai: The 2.3.0 release will be in 1-2 weeks. ---- 2019-01-18 11:52:53 UTC - jia zhai: @Gofu It depends on a bookkeeper release, and it will be soon ---- 2019-01-18 11:53:20 UTC - Gofu: @jia zhai thank you for the feedback ---- 2019-01-18 11:54:03 UTC - jia zhai: welcome ---- 2019-01-18 11:56:13 UTC - Gofu: @jia zhai There is another problem I have detected, perhaps you can help me with. In this moment there is no method to send a batch to pulsar (only sendAsync that holds in memory a bunch of messages and then on linger or max messages config it will send the batch, but the thing is that some messages could fail, and then the order will be lost), so I would need to send an atomic batch instead of having this buffer thing. This way it will either send all messages or none. Do you have plans to do that? ---- 2019-01-18 12:08:47 UTC - David Tinker: You could group your messages into one "batch" message? ---- 2019-01-18 12:13:28 UTC - Sijie Guo: @Gofu currently you can do this atomic-ish send with a batch of messages (total size is not larger than 5MB).
- you can set batchingMaxPublishDelay to disable periodical flush. so that you can control the batch. - then you can do following: ``` sendAsync(msg1); sendAsync(msg2); sendAsync(msg3); flushAsync().whenComplete(...); ``` ---- 2019-01-18 12:13:32 UTC - Gofu: @David Tinker We had that idea but the consumer it isnt expective a batch, it is expecting the individual message ---- 2019-01-18 12:15:00 UTC - Sijie Guo: this should work for java client. I believe cpp/go has similar settings, but not sure if we have added `flush` method or not. If there is no flush method, we can add one ---- 2019-01-18 12:18:04 UTC - Gofu: @Sijie Guo will look into and let you know, thanks ---- 2019-01-18 14:19:49 UTC - Gofu: @Sijie Guo The go client doesnt have the flush method ---- 2019-01-18 15:18:24 UTC - jia zhai: @Gofu If it is needed, Would you please help open an issue for this? we could handle it later. ---- 2019-01-18 15:20:24 UTC - Gofu: sure ---- 2019-01-18 15:33:31 UTC - Gofu: @jia zhai created the issue <https://github.com/apache/pulsar/issues/3388> ---- 2019-01-18 23:12:50 UTC - Emma Pollum: I'm having trouble getting one of my clusters to connect to its replication cluster. I have two clusters, C1 and C2. C2 is connecting to C1 fine, but C1 shows only an inbound connection and under pulsar-admin topic stats it says "connected":"false" Where would logs for this connection attempt be, so I can diagnose why it can't connect? ---- 2019-01-19 00:30:22 UTC - Emma Pollum: I'm seeing at least one topic in this namespace that is connected to its partner cluster, but many other topics are not connected. ---- 2019-01-19 02:10:46 UTC - Grant Wu: ``` 02:10:10.261 [Timer-1] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully Traceback (most recent call last): File "/pulsar/instances/python-instance/python_instance_main.py", line 33, in &lt;module&gt; from pulsar import Authentication File "/pulsar/instances/python-instance/pulsar/__init__.py", line 99, in &lt;module&gt; import _pulsar ImportError: No module named _pulsar ``` ---- 2019-01-19 02:11:04 UTC - Grant Wu: I’m getting this in Pulsar, on a broker deployed with an image based off of the 2.1.1 image :thinking_face: ---- 2019-01-19 02:12:00 UTC - Grant Wu: Just checked `pip freeze` - I don’t see `pulsar-client` :thinking_face: ---- 2019-01-19 02:20:04 UTC - Grant Wu: cc @Jonathan Budd ---- 2019-01-19 02:24:34 UTC - Matteo Merli: @Grant Wu is that `pulsar` or `pulsar-all` image? ---- 2019-01-19 02:24:41 UTC - Grant Wu: Not completely sure. ---- 2019-01-19 02:26:55 UTC - Grant Wu: ``` docker run -it apachepulsar/pulsar-all:2.1.1-incubating python -c "import pulsar_client" ``` ---- 2019-01-19 02:26:58 UTC - Grant Wu: Gives me no module found ---- 2019-01-19 02:27:12 UTC - Matteo Merli: Import pulsar ---- 2019-01-19 02:27:17 UTC - Grant Wu: oh, sorry ---- 2019-01-19 02:27:34 UTC - Grant Wu: ``` Traceback (most recent call last): File "<string>", line 1, in <module> ImportError: No module named pulsar ``` ---- 2019-01-19 02:27:36 UTC - Grant Wu: same thing ---- 2019-01-19 02:27:41 UTC - Matteo Merli: :) ---- 2019-01-19 02:27:47 UTC - Matteo Merli: Ok let me check ---- 2019-01-19 02:28:05 UTC - Grant Wu: Currently waiting for ``` docker run -it apachepulsar/pulsar-all:2.2.1 python -c "import pulsar" ``` to complete ---- 2019-01-19 02:28:52 UTC - Grant Wu: docker images too big ;-; ---- 2019-01-19 02:31:12 UTC - Matteo Merli: yes yes, part of it is that it would need to be squashed ---- 2019-01-19 02:31:48 UTC - Matteo Merli: the tgz that gets unpacked in the image is actually stored twice by docker build (untar and mv of the directory) ---- 2019-01-19 02:32:06 UTC - Grant Wu: oh dear :sweat_smile: ---- 2019-01-19 02:32:08 UTC - Matteo Merli: Anyway, the client should be there, per <https://github.com/apache/pulsar/blob/v2.2.1/docker/pulsar/Dockerfile#L38> ---- 2019-01-19 02:32:58 UTC - Matteo Merli: So yes, it wasn’t there in 2.1 but it’s there in 2.2 images ---- 2019-01-19 02:33:04 UTC - Grant Wu: okay. ---- 2019-01-19 02:33:24 UTC - Grant Wu: I’ll figure out something; we really ought to be on 2.2 anyways ----
