2019-07-10 09:21:37 UTC - liuyuan: @liuyuan has joined the channel
----
2019-07-10 13:30:46 UTC - Aaron: @Matteo Merli It is allocating 13.2 MB of data in UnAckedMessageTracker on line 109. I believe this is a result of my large ackTimeout value of 1 hour.
----
2019-07-10 14:08:07 UTC - Rao: @Rao has joined the channel
----
2019-07-10 14:14:13 UTC - Sam: @Sam has joined the channel
----
2019-07-10 14:15:50 UTC - Rao: I would appreciate it if someone could point me to any comparison between Azure Event Hub and Apache Pulsar. Thanks in advance.
----
2019-07-10 14:24:59 UTC - Sam: I am running in SubscriptionType/Shared, and when I have two consumers running, the round robin can deliver two or three messages to consumer1 and then two or three messages to consumer2. I was expecting one message on consumer1 and then one on consumer2. Is there a setting? v2.3.2 from the docker image apachepulsar/pulsar-standalone
----
2019-07-10 14:26:39 UTC - Jeremy Taylor: I don't know of any comparisons, but after taking a quick look it seems Event Hub doesn't support infinite retention (so it's no good to me!)
----
2019-07-10 14:37:37 UTC - vikash: How do I increase the size of messages sent by a producer in Pulsar?
----
2019-07-10 14:56:21 UTC - Rao: Thanks; Event Hub is not a pub/sub; one needs to write a data pump to pull messages from a given starting offset of a given partition. Consumers from different consumer groups can read the same sequence of messages if they start from the same offset in a given partition. Do you know if I can do the same with Pulsar?
----
2019-07-10 14:59:14 UTC - Rao: Let me rephrase the question - can I do an in-order broadcast of a set of messages to a set of subscribers?
----
2019-07-10 15:01:11 UTC - Jeremy Taylor: I'm definitely no expert on Pulsar, but I would be _very_ surprised if Pulsar cannot do exactly that
----
2019-07-10 15:04:37 UTC - Rao: Thanks. <#C5Z4T36F7|general> can anyone confirm?
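[Editor's note] Sam's "two or three in a row" pattern is usually an artifact of dispatch batching rather than broken round robin. A toy model (plain Python, not Pulsar client code) shows how the batch size drives the pattern; in a real consumer, a smaller `receiverQueueSize` (e.g. 1) should get closer to strict alternation at some throughput cost - treat that knob as a suggestion to verify, not a guarantee.

```python
from collections import defaultdict

def dispatch(messages, consumers, batch):
    """Toy model of shared-subscription dispatch: the broker hands out
    `batch` messages at a time to each consumer in round-robin order."""
    received = defaultdict(list)
    i, turn = 0, 0
    while i < len(messages):
        received[consumers[turn % len(consumers)]].extend(messages[i:i + batch])
        i += batch
        turn += 1
    return dict(received)

msgs = list(range(6))
# A dispatch batch of 3 produces the "two or three in a row" pattern Sam saw.
print(dispatch(msgs, ["consumer1", "consumer2"], 3))
# A batch of 1 gives the strict alternation he expected.
print(dispatch(msgs, ["consumer1", "consumer2"], 1))
```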
----
2019-07-10 15:05:38 UTC - Rao: I apologize if the usage of the general channel in this case is inappropriate :(
----
2019-07-10 15:20:05 UTC - Rao: Has anyone built a large-scale IoT system on Apache Pulsar? The scale: 200 thousand devices, with each device sending a telemetry message (size 2 KB) every 5 min to a cloud-hosted monitoring and control infrastructure.
----
2019-07-10 16:18:24 UTC - Gilberto Muñoz Hernández: Hey guys, can I do something like this: Spark Streaming App A with a shared subscription to topic X, consuming all data from different executors in parallel, and Spark Streaming App B with a shared subscription to topic X, consuming all data from different executors in parallel?
----
2019-07-10 16:18:55 UTC - Gilberto Muñoz Hernández: So consumption by A won't affect consumption by B; they both will consume everything?
----
2019-07-10 16:20:03 UTC - Gilberto Muñoz Hernández: And at the same time each app will be able to consume data from different executors?
----
2019-07-10 16:21:01 UTC - Sijie Guo: > let me rephrase the question - can I do an in-order broadcast of a set of messages to a set of subscribers?
@Rao yes
----
2019-07-10 16:21:20 UTC - Gilberto Muñoz Hernández: So, what an executor from app A consumes won't be seen by any other executor from app A, but will be seen by one of the executors from app B.
----
2019-07-10 16:21:28 UTC - Gilberto Muñoz Hernández: And only one.
----
2019-07-10 16:21:49 UTC - Sijie Guo: Message size for the producer? Can you give me more context?
----
2019-07-10 16:21:51 UTC - Gilberto Muñoz Hernández: Is this scenario possible with the current pulsar-spark receiver?
----
2019-07-10 16:22:46 UTC - Sijie Guo: @Gilberto Muñoz Hernández I believe so
----
2019-07-10 16:23:13 UTC - Gilberto Muñoz Hernández: Don't want to be negative here, but has anyone tried it yet?
----
2019-07-10 16:23:46 UTC - Gilberto Muñoz Hernández: I have seen plenty of cases where something doesn't work as advertised once it's actually tested.
----
2019-07-10 16:29:06 UTC - Sijie Guo: I don't know how you write the code or what the "executors" are here, but Pulsar is a pub/sub system. It guarantees that different subscriptions will receive separate copies of the data. Subscription A will not affect subscription B.
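[Editor's note] Sijie's guarantee can be illustrated with a toy model (plain Python, not the pulsar-spark receiver): every subscription gets its own full copy of the topic, while consumers within one Shared subscription split that copy among themselves (round-robin assumed here for simplicity).

```python
from itertools import cycle

def fan_out(messages, subscriptions):
    """Toy model of Pulsar pub/sub semantics: each subscription receives its
    own full copy of the topic, and within a Shared subscription each message
    goes to exactly one of that subscription's consumers."""
    delivered = {}
    for sub, consumers in subscriptions.items():
        turns = cycle(consumers)
        delivered[sub] = {c: [] for c in consumers}
        for msg in messages:
            delivered[sub][next(turns)].append(msg)
    return delivered

out = fan_out([1, 2, 3, 4], {"app-A": ["a1", "a2"], "app-B": ["b1", "b2"]})
# Each app collectively sees every message exactly once, and no consumer
# within an app re-consumes what a sibling already received.
print(out)
```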
----
2019-07-10 16:30:02 UTC - Gilberto Muñoz Hernández: That part is OK if I was using a simple consumer
----
2019-07-10 16:30:16 UTC - Gilberto Muñoz Hernández: But I intend to use the pulsar-spark receiver
----
2019-07-10 16:31:11 UTC - Gilberto Muñoz Hernández: That's when I need to know if the pulsar-spark receiver supports shared subscriptions with other pulsar-spark receivers
----
2019-07-10 16:31:52 UTC - Gilberto Muñoz Hernández: and if, for each receiver, I can do that "receiving" in parallel with different Spark executors
----
2019-07-10 16:32:41 UTC - Gilberto Muñoz Hernández: so each Spark executor behaves as a shared consumer in a single thread
----
2019-07-10 16:33:32 UTC - Gilberto Muñoz Hernández: and what's consumed by one executor won't be consumed again by one of its sibling executors
----
2019-07-10 16:52:01 UTC - Matteo Merli: @Rao Yes, we've seen such use cases at much higher ingestion rates
----
2019-07-10 17:12:19 UTC - Nagarjunreddy Gaddam: @Nagarjunreddy Gaddam has joined the channel
----
2019-07-10 17:14:52 UTC - Rao: @Sijie Guo thank you
----
2019-07-10 17:16:38 UTC - Rao: @Matteo Merli In such a system, do devices publish directly to a Pulsar topic? Or do they use some intermediate MQTT broker?
----
2019-07-10 17:17:27 UTC - Matteo Merli: I've seen either direct connectivity or an HTTP microservice in between
----
2019-07-10 17:49:08 UTC - Gilberto Muñoz Hernández: @Sijie Guo In Kafka you can have a topic with 12 partitions, then create a 12-thread receiver, and each of those threads will be distributed over your Spark worker nodes. arrow_up : Naveen Siddareddy
----
2019-07-10 17:49:21 UTC - Gilberto Muñoz Hernández: How can I do something like that in Pulsar?
----
2019-07-10 17:50:07 UTC - Gilberto Muñoz Hernández: So I'm actually receiving in parallel and avoiding a bottleneck while receiving messages.
----
2019-07-10 18:29:21 UTC - Sam Leung: Is there a way to auto-create the tenant/namespace when you specify the fully qualified topic name on a publish/consume (as in tenant/namespace/topic format)? It would be very nice for local development not to have to pre-provision tenants and namespaces
----
2019-07-10 18:36:40 UTC - Santiago Del Campo: Yeah... I've been thinking about this solution in particular... have some persistent volumes that are independent of the pods' life cycle. With that, it should not matter how many times the pods are replaced; consistency should remain, right?
----
2019-07-10 18:36:51 UTC - Ali Ahmed: @Sam Leung no, but “public/default” is created in standalone mode automatically
----
2019-07-10 18:38:04 UTC - Sam Leung: Yes, but I'm looking to not have a special `public/default/mytopic1` running locally while in staging and prod I'm using `engineering/service3/mytopic1`
----
2019-07-10 18:40:14 UTC - Abhimanyu Deora: @Abhimanyu Deora has joined the channel
----
2019-07-10 18:47:12 UTC - Constantin: @Constantin has joined the channel
----
2019-07-10 18:51:02 UTC - Rao: @Matteo Merli Thanks
----
2019-07-10 20:05:05 UTC - Ryan Samo: Hey guys, I have a Pulsar cluster stood up with function workers outside of the brokers as their own cluster. When we create functions, they work fine, no issues. If we create a sink, it creates but then never subscribes to the source topic. Also, if you try to delete the sink you get a 401 unauthorized exception. Any idea why functions would work fine but sinks do not? We can create, update, get, list, you name it. We just can't delete the sink or get it to subscribe
----
2019-07-10 20:20:41 UTC - Devin G. Bost: What's up with the search bar that does nothing? We tested it in Chrome and Firefox.
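[Editor's note] A sketch of the pre-provisioning this thread lands on: the tenant and namespace names are taken from Sam Leung's staging/prod example, so adjust them for your cluster.

```shell
# Pre-provision the tenant and namespace used in staging/prod.
bin/pulsar-admin tenants create engineering
bin/pulsar-admin namespaces create engineering/service3
# Topics under an existing namespace are auto-created on first
# produce/consume by default, so only tenant + namespace need creating.
bin/pulsar-admin namespaces list engineering
```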
:slightly_smiling_face:
----
2019-07-10 20:43:05 UTC - Andrew Meredith: @Andrew Meredith has joined the channel
----
2019-07-10 20:49:47 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández A Pulsar subscription that is running in shared mode can serve as many threads as you like. Each thread will get the "next" item from the topic and process it. So as long as the pulsar-spark receiver supports multi-threaded applications, you should be fine.
----
2019-07-10 20:51:48 UTC - David Kjerrumgaard: @Sam Leung Unfortunately there is not currently a way to have tenants and namespaces auto-created. You will have to issue the appropriate `create` commands
----
2019-07-10 20:53:41 UTC - David Kjerrumgaard: @Devin G. Bost That is definitely a bug that made it into the 2.4 documentation. I will create an issue in GitHub for this. Thanks for letting us know!! +1 : Devin G. Bost
----
2019-07-10 20:58:33 UTC - David Kjerrumgaard: <https://github.com/apache/pulsar/issues/4706>
----
2019-07-10 21:02:12 UTC - David Kjerrumgaard: @Santiago Del Campo You will also need to ensure that the same bookie IDs get attached to the same volumes; otherwise there will be issues. The metadata in ZK keeps track of which bookies have which ledger segments (data) written on them. So if your metadata says bookie-1 has ledger 123, segment 4, you will need to make sure that when you spin up bookie-1, it attaches to the EBS volume that contains ledger 123, segment 4.
----
2019-07-10 21:02:20 UTC - David Kjerrumgaard: Hope that makes sense
----
2019-07-10 21:31:16 UTC - Constantin: Fellow Pulsar experts - there are two connectors listed for MySQL in the Pulsar documentation: Debezium and Alibaba Canal. It seems that they both perform the same tasks, but the documentation states "Debezium and Canal". Is it supposed to be "Debezium or Canal"?
----
2019-07-10 22:11:22 UTC - Ali Ahmed: You can use either Debezium or Canal for your CDC workload
----
2019-07-10 22:23:13 UTC - Constantin: @Ali Ahmed Thank you!
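[Editor's note] One common way to get the bookie-to-volume affinity David describes on Kubernetes is a StatefulSet: `volumeClaimTemplates` give bookie-0, bookie-1, ... stable PVCs that re-attach across pod restarts. The names, image tag, and sizes below are illustrative assumptions, not a tested manifest.

```yaml
# Sketch: stable bookie identity + storage on Kubernetes.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: bookie
spec:
  serviceName: bookie
  replicas: 3
  selector:
    matchLabels:
      app: bookie
  template:
    metadata:
      labels:
        app: bookie
    spec:
      containers:
        - name: bookie
          image: apachepulsar/pulsar:2.4.0   # assumed tag
          command: ["bin/pulsar", "bookie"]
          volumeMounts:
            - name: ledgers
              mountPath: /pulsar/data/bookkeeper/ledgers
  volumeClaimTemplates:
    # bookie-N always gets ledgers-bookie-N back, so the ledger
    # segments ZK expects on that bookie stay attached to it.
    - metadata:
        name: ledgers
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi
```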
----
2019-07-10 22:24:24 UTC - Constantin: @Ali Ahmed And a follow-up question: "Does Pulsar support multiple consumers reading from different offsets?"
----
2019-07-10 22:26:49 UTC - Ali Ahmed: Pulsar has selective acks; there is no concept of offsets. Different consumers can read from different positions in the log +1 : Constantin, Ali Ahmed
----
2019-07-10 22:29:32 UTC - Matteo Merli: Maybe you're referring to using different subscriptions within a topic. Each subscription is independent. Within a subscription there can be 1 or more consumers: <https://pulsar.apache.org/docs/en/concepts-messaging/#subscription-modes>
----
2019-07-10 23:36:28 UTC - Karthik Ramasamy: If you are in Dallas or nearby, swing by the meetup on Modern Streaming Data Platform using Apache Pulsar, with a case study at Capital One +1 : Constantin, Devin G. Bost
----
2019-07-10 23:36:31 UTC - Karthik Ramasamy: <https://www.meetup.com/DFW-Data-Engineering-Meetup/events/262999302/?gj=co2&rv=co2&_xtd=gatlbWFpbF9jbGlja9oAJDVhYjc3OTFiLTU4MmItNDUyMy1hNTI3LWQ2NzEyMmQ4NTI1ZQ>
----
2019-07-11 01:04:21 UTC - Tarek Shaar: @Tarek Shaar has joined the channel
----
2019-07-11 01:05:25 UTC - Sijie Guo: @David Kjerrumgaard @Devin G. Bost: there was an issue when we upgraded the website to support code tabs. <https://github.com/apache/pulsar/issues/4695> We are looking into it.
----
2019-07-11 01:10:58 UTC - David Kjerrumgaard: Thanks @Sijie Guo
----
2019-07-11 04:29:29 UTC - vikash: I just need to increase the message size on both the producer and the consumer. I have bulk payloads that are more than 10 MB, so how do we handle producing and consuming such messages?
----
2019-07-11 05:04:53 UTC - Sijie Guo: @vikash you can set `max_message_size` in broker.conf to configure the max message size, but you need 2.4.0 for both client and broker to leverage this feature.
If you are interested, you can check out the section "PIP-36: Configure max message size at broker side" in this blog post: <https://medium.com/streamnative/whats-new-in-apache-pulsar-2-4-0-d646f6727642>
----
2019-07-11 06:07:41 UTC - vikash: Thanks for the info, but if I'm using standalone Pulsar, do I have to edit standalone.xml? And how do we set this property - is it max_message_size=10 for 10 MB? The documentation doesn't explain how to set it.
----
2019-07-11 06:08:41 UTC - Sijie Guo: For standalone, you can just modify standalone.conf
----
2019-07-11 08:51:20 UTC - ishara: Hello, I'm using Pulsar with the WebSocket API, but I'm having trouble with it as the websockets are closing by themselves. Is anyone experiencing the same issue? Is there something to configure to solve this issue?
----
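[Editor's note] A sketch of the setting discussed above. The value is in bytes, so 10 MB is not `10`; check your conf file for the exact key (in the 2.4.0 broker/standalone configuration it should appear as `maxMessageSize`).

```properties
# broker.conf / standalone.conf (Pulsar >= 2.4.0)
# 10 MB = 10 * 1024 * 1024 bytes
maxMessageSize=10485760
```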
