2018-11-12 09:11:06 UTC - Sijie Guo: @tuan nguyen anh can you please create a github issue with your hardware settings and testing steps? so we can look into it and help you understand this. ---- 2018-11-12 09:23:26 UTC - Kesav Kolla: Is there any way to create a persistent topic before subscribers come up, and start publishing messages into the topic? Right now there is no way to create a topic via the admin CLI, and if a publisher tries to put messages on the topic those messages are lost. There has to be at least one subscriber to start working with a persistent topic. ---- 2018-11-12 09:25:44 UTC - Ali Ahmed: @Kesav Kolla pulsar will create topics on the fly, you don’t need to create them in advance ---- 2018-11-12 09:27:41 UTC - Kesav Kolla: @Ali Ahmed I observe that topic creation only happens when I start a subscriber. Otherwise all messages are lost ---- 2018-11-12 09:28:26 UTC - Ali Ahmed: is this with persistent topics ? ---- 2018-11-12 09:29:25 UTC - Kesav Kolla: yes ---- 2018-11-12 09:30:23 UTC - Ali Ahmed: please create an issue with the steps in github and we will investigate; by design that shouldn’t happen ---- 2018-11-12 09:31:29 UTC - jia zhai: @Kesav Kolla If no subscription is created before producing messages, all the messages are treated as ACKed. ---- 2018-11-12 09:32:00 UTC - jia zhai: That may be the reason you find the messages are lost ---- 2018-11-12 09:36:07 UTC - Kesav Kolla: @jia zhai OMG all messages are ACKed.... Is there any setting I can configure to not do that? I have a simple use case: there will be one publisher who puts messages into the topic. Subscriptions come and go depending on my scheduling pattern. ---- 2018-11-12 09:38:07 UTC - Kesav Kolla: Right now I'm struggling with two things: first, keeping messages in the topic; second, even though I call acknowledge from my python code, lots of messages still sit in the backlog. I'd appreciate some help ---- 2018-11-12 09:41:44 UTC - Kesav Kolla: I'm trying to implement simple queue semantics using pulsar. 
I thought it'll support the queue semantics ---- 2018-11-12 09:42:29 UTC - Ali Ahmed: @Kesav Kolla have a look at this <https://pulsar.apache.org/docs/en/cookbooks-message-queue/> ---- 2018-11-12 09:46:03 UTC - Kesav Kolla: This is the exact document I've read before I wrote my code. I kept receiver_queue_size=0 in my subscribers and also set shared subscription ---- 2018-11-12 09:46:46 UTC - Kesav Kolla: This document doesn't mention anything about how to make the topic retain messages when there are no subscribers ---- 2018-11-12 09:47:38 UTC - Sijie Guo: @Kesav Kolla :
1) you can configure retention to keep the data when there is no subscription <http://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#set-retention-policy> 2) when you create the consumer, specify SubscriptionInitialPosition.earliest. ---- 2018-11-12 09:48:48 UTC - Kesav Kolla: @Sijie Guo What does retention mean? Does it retain messages after ACK as well? ---- 2018-11-12 09:50:13 UTC - Kesav Kolla: @Sijie Guo If I set earliest, will it re-process already-ACKed messages? ---- 2018-11-12 09:51:47 UTC - Sijie Guo: “retention” => <http://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#retention-policies> ---- 2018-11-12 09:53:14 UTC - Sijie Guo: “Setting earliest will it re-process already ACK messages?” it is not re-processing already-ACKed messages. it is telling pulsar that when the subscription is first created, it will start from the earliest position. ---- 2018-11-12 09:54:10 UTC - Sijie Guo: once the subscription is created, the parameter won’t take effect ---- 2018-11-12 09:54:49 UTC - Kesav Kolla: If I start 10 subscriptions then will there be any duplicates in each subscription? Assuming my receiver_queue_size=0. ---- 2018-11-12 09:55:20 UTC - Kesav Kolla: Also there is no argument to specify SubscriptionInitialPosition in the python client's `subscribe`. How do I set the SubscriptionInitialPosition? ---- 2018-11-12 09:57:43 UTC - Sijie Guo: > If I start 10 subscriptions then will there be any duplicates in each subscription? if you started 10 subscriptions, each subscription will receive a full copy of the messages. I guess you mean 10 consumers here? ---- 2018-11-12 09:58:17 UTC - Sijie Guo: I guess it is not supported in python yet. ---- 2018-11-12 10:03:07 UTC - Kesav Kolla: yes, 10 consumers I mean ---- 2018-11-12 10:03:24 UTC - Kesav Kolla: so I'll end up re-processing all the messages whenever a consumer starts ---- 2018-11-12 10:05:10 UTC - Sijie Guo: no ---- 2018-11-12 10:06:30 UTC - Sijie Guo: the setting only applies when there is no subscription. 
but once the subscription is created, that initial position won’t take effect. if you have 10 consumers, only the first connected consumer will create the subscription ---- 2018-11-12 10:07:51 UTC - Kesav Kolla: I mean SubscriptionInitialPosition.earliest: when all 10 consumers have earliest, all 10 will read the messages ---- 2018-11-12 10:10:52 UTC - Sijie Guo: sorry I don’t understand what you mean here “read the messages” ---- 2018-11-12 10:16:27 UTC - Kesav Kolla: all 10 consumers will start reading messages at earliest. All consumers will get messages which are already ACKed ---- 2018-11-12 10:19:23 UTC - Sijie Guo: > all 10 consumers will start reading messages at earliest. yes. > All consumers will get messages which are already ACKed it will get the messages starting from the earliest messages. ---- 2018-11-12 10:19:51 UTC - Kesav Kolla: Which means duplicate processing of messages. ---- 2018-11-12 10:20:20 UTC - Kesav Kolla: In fact anytime a new consumer starts it will re-process messages from the beginning ---- 2018-11-12 10:20:38 UTC - Kesav Kolla: Ideally when there is no consumer, messages should not be auto-ACKed ---- 2018-11-12 10:20:51 UTC - Kesav Kolla: or at least a setting should control that behavior ---- 2018-11-12 10:20:54 UTC - Sijie Guo: no ---- 2018-11-12 10:21:01 UTC - Sijie Guo: as there are no duplicates ---- 2018-11-12 10:21:19 UTC - Kesav Kolla: It's not duplicate messages.... It's re-processing of messages ---- 2018-11-12 10:21:30 UTC - Kesav Kolla: re-processing of messages by each consumer ---- 2018-11-12 10:21:36 UTC - Sijie Guo: no ---- 2018-11-12 10:23:57 UTC - Kesav Kolla: Let's say there is a topic with retention policy set to 1 day: Producer puts 100 messages (all of them are ACKed). Assumptions on consumer: shared subscription, queue size is 0. Consumer 1 starts (it will process message 1 and so on). Consumer 2 starts (where will it start receiving messages?) Consumer 3 starts (where will it start receiving messages?) 
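The semantics being discussed here (the initial position only matters the first time a subscription is created, and consumers sharing one subscription do not duplicate work) can be sketched with a toy model. This is plain Python with illustrative names, not the Pulsar client API:

```python
class ToyTopic:
    """Toy model of a Pulsar topic with one shared subscription."""

    def __init__(self):
        self.log = []             # retained messages (think: retention policy)
        self.subscription = None  # shared cursor, created lazily

    def publish(self, msg):
        self.log.append(msg)

    def subscribe(self, initial_position='latest'):
        # The initial position only applies the first time the subscription
        # is created; later subscribers reuse the existing cursor.
        if self.subscription is None:
            start = 0 if initial_position == 'earliest' else len(self.log)
            self.subscription = {'cursor': start}
        return self.subscription

    def dispatch_shared(self, consumers):
        """Round-robin the backlog over the connected consumers."""
        out = {c: [] for c in consumers}
        i = 0
        while self.subscription['cursor'] < len(self.log):
            msg = self.log[self.subscription['cursor']]
            out[consumers[i % len(consumers)]].append(msg)
            self.subscription['cursor'] += 1
            i += 1
        return out


topic = ToyTopic()
for n in range(1, 7):
    topic.publish(n)

topic.subscribe(initial_position='earliest')  # first subscriber creates the cursor
topic.subscribe(initial_position='latest')    # ignored: cursor already exists
got = topic.dispatch_shared(['c1', 'c2', 'c3'])
```

In this model each message lands on exactly one consumer (no duplicates within the subscription), and a `latest` passed after the subscription exists is a no-op, which mirrors why a late-set initial position does not rewind an already-created subscription.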
---- 2018-11-12 10:25:44 UTC - Sijie Guo: consumer 1, 2, 3 are in one subscription, 3 consumers (together) will receive messages starting from 1. if subscription mode is failover, only consumer 1 will receive the messages. if subscription mode is shared, consumer 2 will receive 2, consumer 3 will receive 3 ---- 2018-11-12 10:25:53 UTC - Sijie Guo: consumer 2 and 3 will not receive 1 ---- 2018-11-12 10:29:46 UTC - Sijie Guo: that is to say, once the first consumer 1 “created” the subscription, how the messages are consumed depends on the subscription mode. <http://pulsar.apache.org/docs/en/concepts-messaging/#subscription-modes> ---- 2018-11-12 10:30:08 UTC - Sijie Guo: (sorry I have to step out now, it is too late for me) ---- 2018-11-12 11:12:11 UTC - Kesav Kolla: When I try to set the receive queue size to 0 I'm getting an error:
```
2018-11-12 02:30:05.822 INFO ConnectionPool:63 | Created connection for pulsar://queues-production.weave.local:6650
2018-11-12 02:30:05.823 INFO ClientConnection:287 | [10.47.0.23:45586 -> 10.47.0.6:6650] Connected to broker through proxy. Logical broker: pulsar://queues-production.weave.local:6650
2018-11-12 02:30:05.824 INFO ConsumerImpl:168 | [persistent://hotelsoft/rateshop/jobs, spydr.queue, 0] Created consumer on broker [10.47.0.23:45586 -> 10.47.0.6:6650]
2018-11-12 02:30:05.826 INFO HandlerBase:53 | [persistent://hotelsoft/rateshop/jobs, ] Getting connection from pool
2018-11-12 02:30:05.826 INFO ProducerImpl:155 | [persistent://hotelsoft/rateshop/jobs, ] Created producer on broker [10.47.0.23:45586 -> 10.47.0.6:6650]
2018-11-12 02:30:05.826 WARN ConsumerImpl:541 | [persistent://hotelsoft/rateshop/jobs, spydr.queue, 0] Can't use this function if the queue size is 0
2018-11-12 02:30:05.827 INFO ProducerImpl:467 | [persistent://hotelsoft/rateshop/jobs, standalone-3-6995] Closed producer
2018-11-12 02:30:05.827 INFO ConsumerImpl:761 | [persistent://hotelsoft/rateshop/jobs, spydr.queue, 0] Closed consumer 0
```
There is a WARN saying "Can't use this function if the queue size is 0". I'm creating the consumer with the following code:
```
client.subscribe('persistent://hotelsoft/rateshop/jobs', 'spydr.queue',
                 consumer_type=ConsumerType.Shared,
                 unacked_messages_timeout_ms=5 * 60 * 1000,
                 receiver_queue_size=0)
```
---- 2018-11-12 11:25:25 UTC - Kesav Kolla: I call the `receive()` function expecting timeout_millis to be None. The error at ConsumerImpl:541 basically occurs when receive is called with a timeout. Has anyone faced this issue? I'm using Pulsar 2.2.0 ---- 2018-11-12 11:27:22 UTC - Kesav Kolla: I also attached a Python debugger to validate that timeout_millis is None ---- 2018-11-12 11:27:28 UTC - Kesav Kolla: ---- 2018-11-12 11:39:37 UTC - Ivan Kelly: why are you changing receiver_queue_size? 
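For context on what `receiver_queue_size` controls: the Python client docs describe it as a client-side prefetch buffer that the broker fills before the application ever calls receive(). A toy model (plain Python, not the client implementation; names are illustrative) of why a small queue keeps undispatched messages on the broker for whichever consumer frees up first:

```python
from collections import deque


def dispatch(num_messages, prefetch, consumers):
    """Toy model of the broker pushing into per-consumer receive queues.

    The broker eagerly fills each consumer's local queue up to
    `prefetch` slots, before the application calls receive().
    """
    queues = {c: deque() for c in consumers}
    backlog = deque(range(num_messages))
    while backlog:
        pushed = False
        for c in consumers:
            if backlog and len(queues[c]) < prefetch:
                queues[c].append(backlog.popleft())
                pushed = True
        if not pushed:
            break  # every local queue is full; the rest waits on the broker
    return {c: len(q) for c, q in queues.items()}, len(backlog)


# Large prefetch: all 100 messages get buffered client-side up front,
# even if each consumer then spends minutes processing one message.
buffered, on_broker_big = dispatch(100, prefetch=1000, consumers=['fast', 'slow'])

# prefetch=1: at most one message is parked per consumer; the broker
# keeps the rest and hands them to whichever consumer frees up first.
parked, on_broker_small = dispatch(100, prefetch=1, consumers=['fast', 'slow'])
```

With `prefetch=1` only one message sits at each consumer and the other 98 stay on the broker, which is the improved shared-subscription distribution the Python docstring quoted later in the thread refers to.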
---- 2018-11-12 11:39:51 UTC - Ivan Kelly: it should be at least 1 ---- 2018-11-12 11:53:14 UTC - Kesav Kolla: @Ivan Kelly to make the shared subscription semantics work like a traditional queue ---- 2018-11-12 11:54:13 UTC - Ivan Kelly: #receive isn't a request/response type call ---- 2018-11-12 11:54:56 UTC - Ivan Kelly: when you subscribe, you open a stream to the broker, and the broker sends messages to the client's internal queue. receive pulls off that queue. so there must be at least one space on it ---- 2018-11-12 11:56:28 UTC - Kesav Kolla: When I have receiver_queue_size set to `1` I'm seeing a few messages not getting ACKed. And over time the unacked messages creep up, and after some time messages are not even delivered to consumers at all ---- 2018-11-12 11:57:16 UTC - Ivan Kelly: that sounds like a problem with acking, not with the queue size ---- 2018-11-12 11:58:05 UTC - Kesav Kolla: Also, from the API documentation of the python client, here is the snippet:
```
receiver_queue_size: Sets the size of the consumer receive queue. The consumer receive
queue controls how many messages can be accumulated by the consumer before the
application calls receive(). Using a higher value could potentially increase the
consumer throughput at the expense of higher memory utilization. Setting the consumer
queue size to zero decreases the throughput of the consumer by disabling pre-fetching
of messages. This approach improves the message distribution on shared subscription
by pushing messages only to those consumers that are ready to process them.
```
---- 2018-11-12 11:59:24 UTC - Ivan Kelly: ok, one sec ---- 2018-11-12 11:59:35 UTC - Ivan Kelly: my assumptions must be wrong ---- 2018-11-12 11:59:52 UTC - Kesav Kolla: I have a piece of code like this:
```
while True:
    msg: _Message = self.jobsconsumer.receive()
    self.jobsconsumer.acknowledge(msg)
```
I'm doing the ACK right after receive. 
Still I see so many messages which are unACKed ---- 2018-11-12 12:01:36 UTC - Kesav Kolla: I'm checking the Python API implementation (python/src/consumer.cc, lines 30-56):
```
Message Consumer_receive(Consumer& consumer) {
    Message msg;
    Result res;
    while (true) {
        Py_BEGIN_ALLOW_THREADS
        // Use 100ms timeout to periodically check whether the
        // interpreter was interrupted
        res = consumer.receive(msg, 100);
        Py_END_ALLOW_THREADS
```
Here by default it's calling receive with `100` as the timeout ---- 2018-11-12 12:02:37 UTC - Ivan Kelly: hmm, so ---- 2018-11-12 12:02:47 UTC - Ivan Kelly: 2018-11-12 02:30:05.826 WARN ConsumerImpl:541 | [persistent://hotelsoft/rateshop/jobs, spydr.queue, 0] Can't use this function if the queue size is 0 ---- 2018-11-12 12:02:57 UTC - Ivan Kelly: Comes from: ---- 2018-11-12 12:02:58 UTC - Ivan Kelly:
```
Result ConsumerImpl::fetchSingleMessageFromBroker(Message& msg) {
    if (config_.getReceiverQueueSize() != 0) {
        LOG_ERROR(getName() << " Can't use receiveForZeroQueueSize if the queue size is not 0");
        return ResultInvalidConfiguration;
    }
```
---- 2018-11-12 12:03:10 UTC - Ivan Kelly: oh, actually no ---- 2018-11-12 12:04:59 UTC - Ivan Kelly: ok, bug ---- 2018-11-12 12:05:23 UTC - Ivan Kelly: you can't currently use queue size 0 with the python client ---- 2018-11-12 12:06:27 UTC - Ivan Kelly: <https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/lib/ConsumerImpl.cc#L505> ---- 2018-11-12 12:06:55 UTC - Ivan Kelly: it's only supported when you call receive with no timeout, but python always calls with the timeout, even when the calling code doesn't ---- 2018-11-12 12:07:13 UTC - Kesav Kolla: yes that is true ---- 2018-11-12 12:07:18 UTC - Ivan Kelly: <https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/python/src/consumer.cc#L38> ---- 2018-11-12 12:07:28 UTC - Kesav Kolla: yup, this is the bug ---- 2018-11-12 12:08:56 UTC - Kesav Kolla: Also, why do some messages go unACKed even if I call ```self.jobsconsumer.acknowledge(msg)``` ---- 2018-11-12 
12:09:33 UTC - Kesav Kolla: Can the python client return the ACK status back to the caller, so that we can debug why the ACK is not happening? ---- 2018-11-12 12:10:24 UTC - Ivan Kelly: that i don't know ---- 2018-11-12 12:11:01 UTC - Ivan Kelly: there is no ack status, ack is fire and forget ---- 2018-11-12 12:11:51 UTC - Ivan Kelly: are unacked messages getting redelivered? ---- 2018-11-12 12:12:51 UTC - Kesav Kolla: nope. I think the broker is marking the consumer as undeliverable and eventually stops sending messages ---- 2018-11-12 12:14:51 UTC - Ivan Kelly: you've set redeliver to 5 minutes. have you waited 5 minutes? ---- 2018-11-12 12:14:59 UTC - Ivan Kelly: I assume you're testing against standalone? ---- 2018-11-12 12:15:19 UTC - Kesav Kolla: yes ---- 2018-11-12 12:15:27 UTC - Kesav Kolla: it's standalone ---- 2018-11-12 12:15:41 UTC - Kesav Kolla: my consumers have been running for weeks ---- 2018-11-12 12:19:13 UTC - Ivan Kelly: oh, ok ---- 2018-11-12 12:19:20 UTC - Ivan Kelly: what sort of data rate? ---- 2018-11-12 12:20:37 UTC - Kesav Kolla: I don't know the data rate. But I do publish around 1000 messages onto the topic over a day ---- 2018-11-12 12:21:04 UTC - Ivan Kelly: ah, quite low then ---- 2018-11-12 12:21:23 UTC - Kesav Kolla: oh yes, very low ---- 2018-11-12 12:22:04 UTC - Kesav Kolla: my guess was it had something to do with the receive queue size, so I thought of reducing it to `0`. Unfortunately python doesn't support it ---- 2018-11-12 12:22:20 UTC - Ivan Kelly: sounds like a legit bug ---- 2018-11-12 12:22:52 UTC - Ivan Kelly: or at least something that warrants investigation. is the data confidential? ---- 2018-11-12 12:23:45 UTC - Kesav Kolla: not confidential ---- 2018-11-12 12:23:53 UTC - Kesav Kolla: but how to debug? ---- 2018-11-12 12:24:57 UTC - Ivan Kelly: I'm not 100% sure of the steps, but it seems that there should be a lot of traces ---- 2018-11-12 12:25:28 UTC - Kesav Kolla: where? In the pulsar standalone logs or my consumer logs? 
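Since acks are fire-and-forget with no broker response, one low-tech way to at least confirm what the application *attempted* to ack is a thin wrapper that records message IDs locally before forwarding the call. A sketch under the assumption that the wrapped consumer is any object exposing `receive`/`acknowledge` (stubs are used below instead of a real Pulsar consumer, so no broker is needed):

```python
class AckTrackingConsumer:
    """Wrap a consumer and record every message ID the app tried to ack.

    Pulsar acks carry no broker response, so this only shows that the
    client *sent* the ack, not that the broker processed it.
    """

    def __init__(self, consumer):
        self._consumer = consumer
        self.acked_ids = []

    def receive(self, *args, **kwargs):
        return self._consumer.receive(*args, **kwargs)

    def acknowledge(self, msg):
        self._consumer.acknowledge(msg)
        self.acked_ids.append(msg.message_id())


# Stubs standing in for pulsar.Consumer / pulsar.Message, purely to
# exercise the wrapper:
class _StubMsg:
    def __init__(self, mid):
        self._mid = mid

    def message_id(self):
        return self._mid


class _StubConsumer:
    def __init__(self, msgs):
        self._msgs = list(msgs)

    def receive(self):
        return self._msgs.pop(0)

    def acknowledge(self, msg):
        pass  # fire and forget, like the real client


tracked = AckTrackingConsumer(_StubConsumer([_StubMsg(i) for i in range(3)]))
for _ in range(3):
    m = tracked.receive()
    tracked.acknowledge(m)
```

Comparing `tracked.acked_ids` against the broker-side `unackedMessages` stat would at least narrow down whether acks are being dropped before or after leaving the client.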
---- 2018-11-12 12:25:41 UTC - Ivan Kelly: in the actual event data ---- 2018-11-12 12:26:05 UTC - Ivan Kelly: could you package up data/standalone and email it? ---- 2018-11-12 12:26:14 UTC - Ivan Kelly: [email protected] ---- 2018-11-12 12:27:18 UTC - Kesav Kolla: Curious how the actual data in the message can cause the ACK to fail? My message is a protobuf binary message ---- 2018-11-12 12:27:58 UTC - Kesav Kolla: I can generate messages and write them into a file, but I don't know how you will use that data ---- 2018-11-12 12:27:58 UTC - Ivan Kelly: not the message in the data. but I'd like to see what the subscription state actually looks like ---- 2018-11-12 12:28:24 UTC - Kesav Kolla: you mean topics stats and topics stats-internal ? ---- 2018-11-12 12:28:40 UTC - Ivan Kelly: one sec ---- 2018-11-12 12:30:57 UTC - Ivan Kelly: ya, give me the stats and stats-internal for the topic ---- 2018-11-12 12:32:31 UTC - Kesav Kolla: Thanks for offering to help. I'll email them to you ---- 2018-11-12 12:39:54 UTC - Ivan Kelly: what's the id of the last message you've received from the topic? ---- 2018-11-12 12:40:27 UTC - Kesav Kolla: I don't know ---- 2018-11-12 12:40:48 UTC - Kesav Kolla: I killed all my consumers just an hour ago and re-started them ---- 2018-11-12 12:40:58 UTC - Kesav Kolla: also cleared the backlog ---- 2018-11-12 12:41:17 UTC - Ivan Kelly: ok ---- 2018-11-12 12:41:26 UTC - Ivan Kelly: can you send me the logs from the broker? ---- 2018-11-12 12:55:58 UTC - Kesav Kolla: Can Pulsar SQL work against a PROTOBUF schema? I've uploaded a PROTOBUF schema to my topic. When I try to execute SQL I get an error saying the topic doesn't have a valid schema ---- 2018-11-12 12:57:13 UTC - Kesav Kolla: The way I've created the schema file is:
```
{
  "type": "PROTOBUF",
  "schema": ".....base64 encoded Proto file contents......"
}
``` ---- 2018-11-12 12:57:22 UTC - Julien Plissonneau Duquène: @Julien Plissonneau Duquène has joined the channel ---- 2018-11-12 13:00:26 UTC - Julien Plissonneau Duquène: hello there, has any work already been started/in progress/done for deb/rpm packaging? ---- 2018-11-12 13:05:23 UTC - Ivan Kelly: @Kesav Kolla have you always been using .acknowledge(messageId)? ---- 2018-11-12 13:07:11 UTC - Ivan Kelly: the logs have loads of lines like `[0] Received cumulative ack on shared subscription, ignoring` ---- 2018-11-12 13:18:40 UTC - Kesav Kolla: I changed to ack cumulative a couple of days ago ---- 2018-11-12 13:18:55 UTC - Kesav Kolla: To test if it makes any difference ---- 2018-11-12 13:19:19 UTC - Kesav Kolla: Maybe a week ago ---- 2018-11-12 13:20:13 UTC - Julien Plissonneau Duquène: there are deb and rpm packages available for the c++ client, but I was wondering if any work was started to package the broker ---- 2018-11-12 13:39:10 UTC - Ivan Kelly: @Kesav Kolla ok ---- 2018-11-12 13:46:08 UTC - Ivan Kelly: @Kesav Kolla are the consumers running in long-running processes? ---- 2018-11-12 13:55:29 UTC - Kesav Kolla: Yes ---- 2018-11-12 13:56:48 UTC - Ivan Kelly: i don't see anything suspicious in the logs ---- 2018-11-12 13:57:32 UTC - Ivan Kelly: and the unacked messages should be redelivered ---- 2018-11-12 14:01:12 UTC - Kesav Kolla: You must have seen a backlog of almost 19k ---- 2018-11-12 14:01:40 UTC - Kesav Kolla: All messages were in the backlog ---- 2018-11-12 14:04:14 UTC - Ivan Kelly: ya, there's an issue there, but nothing suspicious in the logs to point to the cause ---- 2018-11-12 14:04:40 UTC - Ivan Kelly: and it sounds like you've wiped the data now, no? ---- 2018-11-12 14:07:32 UTC - Kesav Kolla: Yes, I've wiped the data now ---- 2018-11-12 14:08:37 UTC - Ivan Kelly: did you restart the broker to see if the problem persisted? ---- 2018-11-12 14:10:56 UTC - Kesav Kolla: yes, I tried that before as well. Restarted the broker and yet no change. 
Once messages are in the backlog I couldn't get them moving again ---- 2018-11-12 14:12:52 UTC - Ivan Kelly: that's a good thing though, if you can trigger the issue again. You can package up the data and we can take a look directly ---- 2018-11-12 14:13:24 UTC - Ivan Kelly: also, enable DEBUG logging in your broker, so we can see more details ---- 2018-11-12 14:14:33 UTC - Ivan Kelly: add a logger:
```
- name: org.apache.pulsar
  level: DEBUG
  additivity: false
  AppenderRef:
    - ref: Console
```
---- 2018-11-12 14:14:40 UTC - Ivan Kelly: to conf/log4j2.yaml ---- 2018-11-12 14:17:49 UTC - Kesav Kolla: I'll enable DEBUG logging on the broker ---- 2018-11-12 14:25:22 UTC - Kesav Kolla: oh, I enabled DEBUG for org.apache.pulsar and now I see the logs flowing non-stop. With those messages I feel my log file is going to become huge in no time ---- 2018-11-12 14:32:40 UTC - Kesav Kolla: @Ivan Kelly I've generated some messages to test it out. Shall I send the logs and stats your way? ---- 2018-11-12 14:33:29 UTC - Ivan Kelly: you may need to narrow the log category ---- 2018-11-12 14:33:42 UTC - Ivan Kelly: do you have the problem reproducing? ---- 2018-11-12 14:34:52 UTC - Kesav Kolla: I don't know yet whether the problem is there or not. Right now the consumers are working on the messages. I do see things like "unackedMessages" : 30 on some subscribers. I don't know why there would be so many unackedMessages ---- 2018-11-12 14:35:38 UTC - Kesav Kolla: I've set receiver_queue_size to 1 ---- 2018-11-12 14:36:03 UTC - Ivan Kelly: ok, send me the stats ---- 2018-11-12 14:45:27 UTC - Ivan Kelly: what rate are you writing at? ---- 2018-11-12 14:48:00 UTC - Kesav Kolla: producer? ---- 2018-11-12 14:48:50 UTC - Kesav Kolla: the way my workflow goes: I put messages in the topic, then 50 consumers in shared subscription mode take one message at a time and work on it. 
If work on a message fails, they put the message back into the topic again ---- 2018-11-12 14:49:34 UTC - Kesav Kolla: when I put a message back onto the topic I increment a RETRY header, and I only do that retry in my application logic 5 times ---- 2018-11-12 14:50:13 UTC - Kesav Kolla: the rate the producer writes at is around 500 msgs every 5 min ---- 2018-11-12 14:50:46 UTC - Kesav Kolla: a consumer will take around 5-10 seconds to process a message. If it fails on the message it will put it back into the topic ---- 2018-11-12 14:51:44 UTC - Ivan Kelly: it's all on the jobs topic, no? ---- 2018-11-12 14:52:04 UTC - Kesav Kolla: yes ---- 2018-11-12 14:52:06 UTC - Ivan Kelly: could you post the code snippet doing the acks? ---- 2018-11-12 14:54:39 UTC - Ivan Kelly: I'm still seeing a lot of: `[0] Received cumulative ack on shared subscription, ignoring` ---- 2018-11-12 14:54:45 UTC - Kesav Kolla:
```
while True:
    msg: _Message = self.jobsconsumer.receive()
    self.jobsconsumer.acknowledge_cumulative(msg)
    try:
        # Logic to work on message
        ...
    except:
        if retrycnt < self.RETRY_COUNT:
            self.logger.info('Resubmitting job')
            self.jobsproducer.send(job.SerializeToString(),
                                   properties={'RETRY_COUNT': f'RETRY-{retrycnt + 1}'})
```
---- 2018-11-12 14:55:17 UTC - Ivan Kelly: use acknowledge() rather than acknowledge_cumulative() ---- 2018-11-12 14:55:33 UTC - Kesav Kolla: I'm doing it now ---- 2018-11-12 14:55:40 UTC - Kesav Kolla: do you want the DEBUG log? ---- 2018-11-12 14:56:13 UTC - Ivan Kelly: let it run for a while and just send the stats-internal to start ---- 2018-11-12 14:56:23 UTC - Kesav Kolla: ok, fine ---- 2018-11-12 16:36:22 UTC - Matteo Merli: @Kesav Kolla The SQL connector only works with JSON and Avro at the moment. We plan to add support for Protobuf soon (est. in 2.4 release). The “tricky” part is that we don’t have the protobuf generated code in the SQL connector, so we need to write a generic deserializer for the protobuf binary data. 
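For reference, the PROTOBUF schema payload Kesav described uploading earlier in the thread is a JSON envelope around the base64-encoded schema definition. A minimal sketch of building that payload, following the field names shown in his snippet (the `.proto` contents below are a stand-in, not from the source):

```python
import base64
import json


def build_schema_payload(proto_source: str) -> str:
    """Build the JSON body for uploading a PROTOBUF schema:
    the raw .proto contents, base64-encoded, in a 'schema' field."""
    encoded = base64.b64encode(proto_source.encode('utf-8')).decode('ascii')
    return json.dumps({'type': 'PROTOBUF', 'schema': encoded})


# Stand-in .proto definition (illustrative only):
proto = 'syntax = "proto3"; message Job { string id = 1; }'
payload = build_schema_payload(proto)
```

The resulting string can be saved to a file and uploaded with the admin tooling; decoding the `schema` field back should round-trip to the original `.proto` text.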
---- 2018-11-12 16:43:21 UTC - Beast in Black: @Matteo Merli reposting a comment I made on Friday: as a follow-up to all those annoying questions I asked yesterday, I have a couple more questions which popped up when I was making the change to have my consumer type be shared. These questions are due to what I understand is the round-robin nature of the `shared` consumer type and the documentation for the same available at `<https://pulsar.apache.org/docs/latest/getting-started/ConceptsAndArchitecture/#Exclusive-u84f1>` *To recap:* In my use case, each app instance (a C++ app using the pulsar CPP client) will subscribe to a global topic (persistent as well as non-persistent) using a unique consumer subscription name, i.e. each app instance will use a unique subscription name which is different from all the rest of the app instances. This unique name is persisted on each app instance, so when the app comes back after a crash it will continue to use the same subscription name as before. Thus, on each app instance there will be exactly one consumer and one subscriber for the topic. This consumer's type will be set to `shared` to allow the app to restart after a crash and re-subscribe to the topic using the same subscription name it used before, without experiencing pulsar client errors related to trying to resubscribe to an existing (i.e. unreaped) broker consumer connection. *Here come the questions:* 1. If new messages were published to that topic while the app instance was offline due to the crash/restart, when it connects back using the same subscription name to the same shared consumer (to which no other subscriptions will ever connect), will it receive/consume the messages which it missed while it was down? 2. After coming back up from the crash, will it receive/consume all new messages published after it came back up? 
Essentially, the question asks whether the round-robin nature of shared consumers will cause message receipt issues when using the same subscription name for the same shared consumer after an app crash/restart. 3. On each app instance, since there is only a single consumer/subscriber for the given topic, can message ordering still be guaranteed despite the round-robin nature of a shared consumer? +1 : Matteo Merli, Ali Ahmed ---- 2018-11-12 16:52:43 UTC - Julien Plissonneau Duquène: @Matteo Merli do you know about any ongoing work to make deb/rpm packages for pulsar-broker? ---- 2018-11-12 16:53:54 UTC - Matteo Merli: > 1. If new messages were published to that topic while the app instance was offline due to the crash/restart, when it connects back using the same subscription name to the same shared consumer (to which no other subscriptions will ever connect), will it receive/consume the messages which it missed while it was down? Yes, the subscription is durable. When consumers are slow or not connected, Pulsar will retain all the non-acknowledged messages > 2. After coming back up from the crash, will it receive/consume all new messages published after it came back up? Essentially, the question asks whether the round-robin nature of shared consumers will cause message receipt issues when using the same subscription name for the same shared consumer after an app crash/restart. It would come up and start receiving from older to newer messages (with occasional out of order if there are multiple consumers connected) > 3. On each app instance, since there is only a single consumer/subscriber for the given topic, can message ordering still be guaranteed despite the round-robin nature of a shared consumer? If there’s only 1 consumer connected, messages will come in order. If, for some reason, there are 2+ consumers connected for some time, there might be a blip of out of order there. 
+1 : Beast in Black ---- 2018-11-12 16:54:52 UTC - Matteo Merli: Hi @Julien Plissonneau Duquène. I don’t think there’s anyone working on that. It would be great to have! ---- 2018-11-12 17:05:05 UTC - Julien Plissonneau Duquène: ok thanks, there are some challenges to doing that, but I will suggest to my team that we contribute some packaging work to the project ---- 2018-11-12 17:09:21 UTC - Beast in Black: @Matteo Merli awesome, thanks! I might have some questions related to some producer errors I'm seeing in my app when it is restarted. But I will do my own due diligence on that before I pester you with questions :slightly_smiling_face: ---- 2018-11-12 18:24:19 UTC - Matteo Merli: I think the easier option is to build these inside a docker container. That way it’s easy for everyone to have the correct environment for RPMs and Debs ---- 2018-11-12 22:19:01 UTC - koushik chitta: @koushik chitta has joined the channel ---- 2018-11-12 22:34:11 UTC - koushik chitta: We are heavy Kafka users in production. I just started looking into Pulsar, which can solve a couple of scenarios that Kafka cannot as of now for us. Does anyone here run the Pulsar server on Windows OS without any issues? ---- 2018-11-12 22:38:02 UTC - Matteo Merli: Koushik, I’m not aware of any significant deployment on Windows, though I wouldn’t expect any major problem there ---- 2018-11-12 22:38:39 UTC - Ali Ahmed: @koushik chitta as long as you have java 8 it should be fine ---- 2018-11-12 22:38:49 UTC - Matteo Merli: (apart from converting bash scripts into corresponding batch or similar scripts for starting the JVM with proper args) ---- 2018-11-12 22:40:04 UTC - Matteo Merli: (actually there was someone some time back who mentioned they were running on Windows and promised to contribute the script... though that didn’t happen :slightly_smiling_face:) ---- 2018-11-12 22:41:23 UTC - Ali Ahmed: @koushik chitta you can install bash and unix shell utilities on windows, so you can run the scripts that way too. 
---- 2018-11-12 23:06:36 UTC - koushik chitta: Thanks @Ali Ahmed and @Matteo Merli. Unfortunately installing shell utilities is not an option for us. I would like to write corresponding bat scripts, and would surely contribute back if I pursue this. ----
