2019-10-25 13:25:54 UTC - Jacob O'Farrell: Hi all - I’m using Pulsar 2.3.0 and under load I’m encountering “Resolve error: system:24 : Too many open files” when trying to publish to a broker (via the broker proxy, if that makes a difference?)
----
2019-10-25 14:43:49 UTC - Addison Higham: :man-facepalming: zookeeper docs are not great... with the 3.5.x releases and the new dynamic config, the pattern of replacing your hostname with `0.0.0.0` just for your own server id, like this:
```
server.1=myhost1:2888:3888
server.2=0.0.0.0:2888:3888
server.3=myhost3:2888:3888
```
(which you see quite commonly on Stack Overflow, etc.) is a *TERRIBLE IDEA* and will break your cluster. In ZK, upon trying to join a peer, if your server id is lower than another quorum member's server id, you close the socket and the higher-numbered server id connects back to you. This used to just use the address defined in zoo.cfg; in the 3.5.x releases, however, the client now sends an initial message that says what its address is. If you have changed your address to `0.0.0.0` (which appears to be fairly widespread), obviously, the peer won't be able to connect back
----
2019-10-25 14:44:50 UTC - Addison Higham: AFAICT, this isn't called out in the docs anywhere :thinking_face: : Matteo Merli
----
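For reference, a minimal zoo.cfg sketch of the pattern Addison describes as safe: keep the real hostname on every `server.N` line, including the local server's own id. If the local server cannot bind to its advertised hostname, ZooKeeper's `quorumListenOnAllIPs` option is the usual alternative to the `0.0.0.0` rewrite. The hostnames below are placeholders.
```
# keep real hostnames on every server line, including the local server id
server.1=myhost1:2888:3888
server.2=myhost2:2888:3888
server.3=myhost3:2888:3888

# bind the election/quorum ports on all interfaces without touching the
# server.N entries (alternative to rewriting your own entry to 0.0.0.0)
quorumListenOnAllIPs=true
```
----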
2019-10-25 14:51:48 UTC - Pretlow Stevenson: @Pretlow Stevenson has joined the channel
----
2019-10-25 16:40:18 UTC - limadelrey: @limadelrey has joined the channel
----
2019-10-25 16:42:08 UTC - limadelrey: Hey guys. I've been playing with Apache Pulsar for the last couple of days, so I'm wondering: 1) is it on Pulsar's roadmap to include a pulsar-io-debezium-mongo connector? And 2) will transformations like dropping tombstone events be possible in the future with Apache Pulsar? Thank you in advance.
----
2019-10-25 16:55:11 UTC - Sijie Guo: 1) can you file a GitHub issue for it? 2) you can enable topic compaction to drop the tombstone events
----
2019-10-25 17:17:16 UTC - limadelrey: 1) Sure - I'll reply with the GitHub issue link later; 2) I'll give it a try, thank you!
----
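As a rough sketch of Sijie's topic-compaction suggestion: compaction can be triggered manually on a topic, or automatically per namespace once a topic's backlog passes a size threshold. The topic/namespace names and the 100M threshold below are just placeholders.
```
# trigger compaction once on a specific topic
bin/pulsar-admin topics compact persistent://public/default/mongo-cdc-topic

# or have the broker compact automatically whenever a topic's backlog
# grows past the given size (a threshold of 0 disables automatic compaction)
bin/pulsar-admin namespaces set-compaction-threshold --threshold 100M public/default
```
----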
2019-10-25 17:32:16 UTC - Anand Sinha: Hi @Matteo Merli, would you have any viewpoints on these, or any other alternative suggestions for the original problem? Do note that the number of topics (and hence subscriptions) would be very large.
----
2019-10-25 17:34:10 UTC - Matteo Merli: Hi, I missed that yesterday. The difficulty in providing a more comprehensive view of consumers joining/leaving (rather than just communicating “you’re active” to the selected consumer) is how to provide a meaningful guarantee
----
2019-10-25 17:34:58 UTC - Matteo Merli: e.g.: if a broker crashes, it won’t remember which consumers were connected before the crash, and it might not be able to see a consumer that crashes at the same time
----
2019-10-25 18:14:17 UTC - Anand Sinha: Thanks. Any alternative solutions to the original problem using the building blocks provided by Pulsar?
----
2019-10-25 18:36:46 UTC - Chris Maria: @Chris Maria has joined the channel
----
2019-10-25 21:10:25 UTC - Jacob O'Farrell: Just giving this a little bump... Sorry for the noise! Any help would be appreciated
----
2019-10-25 21:11:03 UTC - Matteo Merli: How many TCP connections are made to the broker? What’s the max number of file descriptors for that process?
----
2019-10-25 21:18:08 UTC - Jacob O'Farrell: Thanks @Matteo Merli - having a look now. This is running in k8s so it's a little obscured.
----
2019-10-25 21:22:34 UTC - Matteo Merli: you can ssh into the pod and check `ulimit -n`
----
2019-10-25 21:22:49 UTC - Matteo Merli: and also get `lsof -p 1`
----
2019-10-25 22:32:24 UTC - Jacob O'Farrell: ulimit returns `65536`
----
2019-10-25 22:32:37 UTC - Jacob O'Farrell:
```
root@broker-74ff588854-6tw9h:/pulsar# lsof -p 1
COMMAND PID USER  FD  TYPE DEVICE SIZE/OFF      NODE NAME
sh        1 root cwd   DIR  0,423       18  83891780 /pulsar
sh        1 root rtd   DIR  0,423       64 105912169 /
sh        1 root txt   REG  0,423   117208  94372230 /bin/dash
sh        1 root mem   REG  259,1           94372230 /bin/dash (path dev=0,423)
sh        1 root mem   REG  259,1            8405714 /lib/x86_64-linux-gnu/libc-2.28.so (path dev=0,423)
sh        1 root mem   REG  259,1            8405702 /lib/x86_64-linux-gnu/ld-2.28.so (path dev=0,423)
sh        1 root   0r FIFO   0,11      0t0     92941 pipe
sh        1 root   1w FIFO   0,11      0t0     92942 pipe
sh        1 root   2w FIFO   0,11      0t0     92943 pipe
```
----
2019-10-25 22:33:11 UTC - Matteo Merli: Seems pid 1 is bash instead of the Pulsar JVM process
----
2019-10-25 22:33:30 UTC - Matteo Merli: check the PID of the Pulsar broker and see how many fds are open
----
2019-10-25 22:36:46 UTC - Jacob O'Farrell: Okay, seems like it's running at PID 11? The output of `lsof -p 11` is huuuge
----
2019-10-25 22:37:24 UTC - Jacob O'Farrell: prints 965 lines
----
2019-10-25 22:40:54 UTC - Jacob O'Farrell: all other brokers appear to have a similar amount (thank you for your help btw)
----
2019-10-25 22:41:13 UTC - Allen Liang: @Allen Liang has joined the channel
----
2019-10-25 22:46:11 UTC - Matteo Merli: 965 is < 65K :slightly_smiling_face:
----
2019-10-25 22:57:43 UTC - Jacob O'Farrell: Correct! Strange.
----
2019-10-25 23:03:49 UTC - Jacob O'Farrell: We are seeing this error appear in the logs as well <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/PersistentTopicsBase.java#L1846>
----
2019-10-25 23:29:32 UTC - Jacob O'Farrell: `2019-10-25 23:22:01.106 ERROR ClientImpl:179 | Error Checking/Getting Partition Metadata while creating producer on persistent://[removed topic name] -- 5`
----
2019-10-25 23:33:45 UTC - Stephen Baynham: @Stephen Baynham has joined the channel
----
2019-10-25 23:38:24 UTC - Stephen Baynham: Hey, I'm evaluating Pulsar to use for pubsub with semi-transient data. That is, I'll have a lot of consumers on a lot of different topics popping in and out all the time. If my consumers fall out and reconnect within 30s-1m, I'd like everything to still be there waiting for them. But beyond that I'd like everything to be cleaned up, at least eventually. My understanding is that if I give my messages a 1 min TTL, then my messages will be cleaned up if the client stops reading them, so that's good. 1) Is there a way to clean up subscriptions when the client stops using them (is that desirable, if the alternative is having 10s of millions of subscriptions nobody cares about?) 2) Is there a way to clean up topics when the subscriptions attached to them are removed (is that desirable, if the alternative is having millions of topics nobody cares about, and they might even be receiving messages?)
----
2019-10-26 00:00:44 UTC - Jacob O'Farrell: Would this have anything to do with ZooKeeper or similar?
----
2019-10-26 00:47:53 UTC - Matteo Merli:
> But beyond that I'd like everything to be cleaned up, at least eventually. My understanding is that if I give my messages a 1 min TTL, then my messages will be cleaned up if the client stops reading them, so that's good.
The TTL is typically enforced every 5 min by default.
> 1) Is there a way to clean up subscriptions when the client stops using them (is that desirable, if the alternative is having 10s of millions of subscriptions nobody cares about?)
There’s a setting at the broker level to drop subscriptions that were inactive for some time. By default it’s off: `subscriptionExpirationTimeMinutes`
> 2) Is there a way to clean up topics when the subscriptions attached to them are removed (is that desirable, if the alternative is having millions of topics nobody cares about, and they might even be receiving messages?)
Yes, that’s the default behavior. When a topic is inactive and has no subscriptions, it will be deleted.
----
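For reference, a broker.conf sketch covering the behaviors Matteo describes; `subscriptionExpirationTimeMinutes` is named above, the other entries are the standard broker.conf settings for the TTL-check frequency and inactive-topic deletion, and the values shown are illustrative only.
```
# broker.conf (values below are illustrative, not recommendations)

# how often the message-expiry (TTL) check runs; 5 minutes is the default
messageExpiryCheckIntervalInMinutes=5

# drop subscriptions that have had no connected consumers for this long;
# 0 (the default) disables subscription expiration
subscriptionExpirationTimeMinutes=5

# delete topics that are inactive and have no subscriptions (default behavior),
# checked at this frequency
brokerDeleteInactiveTopicsEnabled=true
brokerDeleteInactiveTopicsFrequencySeconds=60
```
----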
2019-10-26 00:55:11 UTC - Stephen Baynham: Wow, that's great! Would you recommend having two clusters if we had mixed workloads between transient messaging & more Kafka-like pipeline stuff? I probably wouldn't want that subscription expiration on the more traditional workload
----
2019-10-26 06:23:06 UTC - Mahesh: Hi, I have consumers that are consuming in shared subscription mode. What load balancing does Pulsar do for the consumers in this case? I presume that it sprays messages in round-robin fashion. If so, are there any other load balancing techniques?
----
2019-10-26 06:27:58 UTC - Ali Ahmed: @Stephen Baynham use one cluster; isolate workloads under namespaces
----
2019-10-26 06:29:47 UTC - Ali Ahmed: @Mahesh by default, round robin is used
----
2019-10-26 06:30:11 UTC - Ali Ahmed: for customizing the distribution, take a look at the Key_Shared subscription
----
2019-10-26 06:36:03 UTC - Mahesh: @Ali Ahmed It's not quite clear in the documentation how Pulsar handles consumer disconnections. Could you please provide some details on that?
----
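Following Ali's suggestion of isolating workloads under namespaces in a single cluster, a sketch of giving each workload its own expiry/retention policy with pulsar-admin; the tenant/namespace names and the values are placeholders.
```
# transient pub/sub workload: expire unacknowledged messages after 60 seconds
bin/pulsar-admin namespaces set-message-ttl --messageTTL 60 my-tenant/transient

# pipeline-style workload: retain data even after it has been acknowledged
bin/pulsar-admin namespaces set-retention --time 7d --size 50G my-tenant/pipeline
```
----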
