2019-10-25 13:25:54 UTC - Jacob O'Farrell: Hi all - I’m using Pulsar 2.3.0 and under load I’m encountering “Resolve error: system:24 : Too many open files” when trying to publish to a broker (via the broker proxy, if that makes a difference?)
----
2019-10-25 14:43:49 UTC - Addison Higham: :man-facepalming: zookeeper docs are not great... with the 3.5.x releases and the new dynamic config, the pattern of replacing your hostname with `0.0.0.0` just for your own server id, like this:
```
server.1=myhost1:2888:3888
server.2=0.0.0.0:2888:3888
server.3=myhost3:2888:3888
```
(which you see quite commonly on Stack Overflow, etc.) is a *TERRIBLE IDEA* and will break your cluster. In ZK, upon trying to join a peer, if your server id is lower than another quorum member's server id, you close the socket and the higher-numbered server id connects back to you. This used to just use the address defined in zoo.cfg; in the 3.5.x releases, however, the client now sends an initial message that says what its address is. If you have changed your address to `0.0.0.0` (which appears to be fairly widespread), obviously, the peer won't be able to connect back
----
2019-10-25 14:44:50 UTC - Addison Higham: AFAICT, this isn't called out in the docs anywhere :thinking_face: : Matteo Merli
----
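For reference, a minimal zoo.cfg sketch of the pattern Addison describes as safe: keep the real hostname on every `server.N` line, including the local server's own id. If the local server cannot bind to its advertised hostname, ZooKeeper's `quorumListenOnAllIPs` option is the usual alternative to the `0.0.0.0` rewrite. The hostnames below are placeholders.
```
# keep real hostnames on every server line, including the local server id
server.1=myhost1:2888:3888
server.2=myhost2:2888:3888
server.3=myhost3:2888:3888

# bind the election/quorum ports on all interfaces without touching the
# server.N entries (alternative to rewriting your own entry to 0.0.0.0)
quorumListenOnAllIPs=true
```
----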
2019-10-25 14:51:48 UTC - Pretlow Stevenson: @Pretlow Stevenson has joined the channel
----
2019-10-25 16:40:18 UTC - limadelrey: @limadelrey has joined the channel
----
2019-10-25 16:42:08 UTC - limadelrey: Hey guys. I've been playing with Apache Pulsar for the last couple of days, so I'm wondering: 1) is it on Pulsar's roadmap to include a pulsar-io-debezium-mongo connector? And 2) will transformations like dropping tombstone events be possible in the future with Apache Pulsar? Thank you in advance.
----
2019-10-25 16:55:11 UTC - Sijie Guo: 1) can you file a GitHub issue for it? 2) you can enable topic compaction to drop the tombstone events
----
2019-10-25 17:17:16 UTC - limadelrey: 1) Sure - I'll reply with the GitHub issue link later; 2) I'll give it a try, thank you!
----
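As a rough sketch of Sijie's topic-compaction suggestion: compaction can be triggered manually on a topic, or automatically per namespace once a topic's backlog passes a size threshold. The topic/namespace names and the 100M threshold below are just placeholders.
```
# trigger compaction once on a specific topic
bin/pulsar-admin topics compact persistent://public/default/mongo-cdc-topic

# or have the broker compact automatically whenever a topic's backlog
# grows past the given size (a threshold of 0 disables automatic compaction)
bin/pulsar-admin namespaces set-compaction-threshold --threshold 100M public/default
```
----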
2019-10-25 17:32:16 UTC - Anand Sinha: Hi @Matteo Merli, would you have any viewpoints on these, or any other alternative suggestions for the original problem? Do note that the number of topics (and hence subscriptions) would be very large.
----
2019-10-25 17:34:10 UTC - Matteo Merli: Hi, I missed that yesterday. The difficulty in providing a more comprehensive view of consumers joining/leaving (rather than just communicating “you’re active” to the selected consumer) is how to provide a meaningful guarantee
----
2019-10-25 17:34:58 UTC - Matteo Merli: e.g.: if a broker crashes, it won’t remember which consumers were connected before the crash, and it might not be able to see a consumer that crashes at the same time
----
2019-10-25 18:14:17 UTC - Anand Sinha: Thanks. Any alternative solutions to the original problem using the building blocks provided by Pulsar?
----
2019-10-25 18:36:46 UTC - Chris Maria: @Chris Maria has joined the channel
----
2019-10-25 21:10:25 UTC - Jacob O'Farrell: Just giving this a little bump... Sorry for the noise! Any help would be appreciated
----
2019-10-25 21:11:03 UTC - Matteo Merli: How many TCP connections are made to the broker? What’s the max number of file descriptors for that process?
----
2019-10-25 21:18:08 UTC - Jacob O'Farrell: Thanks @Matteo Merli - having a look now. This is running in k8s so it's a little obscured.
----
2019-10-25 21:22:34 UTC - Matteo Merli: you can ssh into the pod and check `ulimit -n`
----
2019-10-25 21:22:49 UTC - Matteo Merli: and also get `lsof -p 1`
----
2019-10-25 22:32:24 UTC - Jacob O'Farrell: ulimit returns `65536`
----
2019-10-25 22:32:37 UTC - Jacob O'Farrell:
```
root@broker-74ff588854-6tw9h:/pulsar# lsof -p 1
COMMAND PID USER  FD  TYPE DEVICE SIZE/OFF      NODE NAME
sh        1 root cwd   DIR  0,423       18  83891780 /pulsar
sh        1 root rtd   DIR  0,423       64 105912169 /
sh        1 root txt   REG  0,423   117208  94372230 /bin/dash
sh        1 root mem   REG  259,1           94372230 /bin/dash (path dev=0,423)
sh        1 root mem   REG  259,1            8405714 /lib/x86_64-linux-gnu/libc-2.28.so (path dev=0,423)
sh        1 root mem   REG  259,1            8405702 /lib/x86_64-linux-gnu/ld-2.28.so (path dev=0,423)
sh        1 root   0r FIFO   0,11      0t0     92941 pipe
sh        1 root   1w FIFO   0,11      0t0     92942 pipe
sh        1 root   2w FIFO   0,11      0t0     92943 pipe
```
----
2019-10-25 22:33:11 UTC - Matteo Merli: Seems pid 1 is bash instead of the Pulsar JVM process
----
2019-10-25 22:33:30 UTC - Matteo Merli: check the PID of the Pulsar broker and see how many fds are open
----
2019-10-25 22:36:46 UTC - Jacob O'Farrell: Okay, seems like it's running at PID 11? The output of `lsof -p 11` is huuuge
----
2019-10-25 22:37:24 UTC - Jacob O'Farrell: prints 965 lines
----
2019-10-25 22:40:54 UTC - Jacob O'Farrell: all other brokers appear to have a similar amount (thank you for your help btw)
----
2019-10-25 22:41:13 UTC - Allen Liang: @Allen Liang has joined the channel
----
2019-10-25 22:46:11 UTC - Matteo Merli: 965 is < 65K :slightly_smiling_face:
----
2019-10-25 22:57:43 UTC - Jacob O'Farrell: Correct! Strange.
----
2019-10-25 23:03:49 UTC - Jacob O'Farrell: We are seeing this error appear in the logs as well <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/PersistentTopicsBase.java#L1846>
----
2019-10-25 23:29:32 UTC - Jacob O'Farrell: `2019-10-25 23:22:01.106 ERROR ClientImpl:179 | Error Checking/Getting Partition Metadata while creating producer on persistent://[removed topic name] -- 5`
----
2019-10-25 23:33:45 UTC - Stephen Baynham: @Stephen Baynham has joined the channel
----
2019-10-25 23:38:24 UTC - Stephen Baynham: Hey, I'm evaluating Pulsar to use for pubsub with semi-transient data. That is, I'll have a lot of consumers on a lot of different topics popping in and out all the time. If my consumers fall out and reconnect within 30s-1m, I'd like everything to still be there waiting for them. But beyond that I'd like everything to be cleaned up, at least eventually. My understanding is that if I give my messages a 1 min TTL, then my messages will be cleaned up if the client stops reading them, so that's good. 1) Is there a way to clean up subscriptions when the client stops using them (is that desirable, if the alternative is having 10s of millions of subscriptions nobody cares about?) 2) Is there a way to clean up topics when the subscriptions attached to them are removed (is that desirable, if the alternative is having millions of topics nobody cares about, and they might even be receiving messages?)
----
2019-10-26 00:00:44 UTC - Jacob O'Farrell: Would this have anything to do with ZooKeeper or similar?
----
2019-10-26 00:47:53 UTC - Matteo Merli:
> But beyond that I'd like everything to be cleaned up, at least eventually. My understanding is that if I give my messages a 1 min TTL, then my messages will be cleaned up if the client stops reading them, so that's good.
The TTL is typically enforced every 5 min by default.
> 1) Is there a way to clean up subscriptions when the client stops using them (is that desirable, if the alternative is having 10s of millions of subscriptions nobody cares about?)
There’s a setting at the broker level to drop subscriptions that were inactive for some time. By default it’s off: `subscriptionExpirationTimeMinutes`
> 2) Is there a way to clean up topics when the subscriptions attached to them are removed (is that desirable, if the alternative is having millions of topics nobody cares about, and they might even be receiving messages?)
Yes, that’s the default behavior. When a topic is inactive and has no subscriptions, it will be deleted.
----
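For reference, a broker.conf sketch covering the behaviors Matteo describes; `subscriptionExpirationTimeMinutes` is named above, the other entries are the standard broker.conf settings for the TTL-check frequency and inactive-topic deletion, and the values shown are illustrative only.
```
# broker.conf (values below are illustrative, not recommendations)

# how often the message-expiry (TTL) check runs; 5 minutes is the default
messageExpiryCheckIntervalInMinutes=5

# drop subscriptions that have had no connected consumers for this long;
# 0 (the default) disables subscription expiration
subscriptionExpirationTimeMinutes=5

# delete topics that are inactive and have no subscriptions (default behavior),
# checked at this frequency
brokerDeleteInactiveTopicsEnabled=true
brokerDeleteInactiveTopicsFrequencySeconds=60
```
----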
2019-10-26 00:55:11 UTC - Stephen Baynham: Wow, that's great! Would you recommend having two clusters if we had mixed workloads between transient messaging & more Kafka-like pipeline stuff? I probably wouldn't want that subscription expiration on the more traditional workload
----
2019-10-26 06:23:06 UTC - Mahesh: Hi, I have consumers that are consuming in shared subscription mode. What load balancing does Pulsar do for the consumers in this case? I presume that it sprays messages in round-robin fashion. If so, are there any other load balancing techniques?
----
2019-10-26 06:27:58 UTC - Ali Ahmed: @Stephen Baynham use one cluster; isolate workloads under namespaces
----
2019-10-26 06:29:47 UTC - Ali Ahmed: @Mahesh by default, round robin is used
----
2019-10-26 06:30:11 UTC - Ali Ahmed: for customizing the distribution, take a look at the Key_Shared subscription
----
2019-10-26 06:36:03 UTC - Mahesh: @Ali Ahmed It's not quite clear in the documentation how Pulsar handles consumer disconnections. Could you please provide some details on that?
----
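Following Ali's suggestion of isolating workloads under namespaces in a single cluster, a sketch of giving each workload its own expiry/retention policy with pulsar-admin; the tenant/namespace names and the values are placeholders.
```
# transient pub/sub workload: expire unacknowledged messages after 60 seconds
bin/pulsar-admin namespaces set-message-ttl --messageTTL 60 my-tenant/transient

# pipeline-style workload: retain data even after it has been acknowledged
bin/pulsar-admin namespaces set-retention --time 7d --size 50G my-tenant/pipeline
```
----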
