2018-11-20 09:31:39 UTC - Ganga Lakshmanasamy: No. It's going to be one single topic. ---- 2018-11-20 10:04:00 UTC - weibin.huang: @weibin.huang has joined the channel ---- 2018-11-20 12:09:34 UTC - Ivan Kelly: the producer is thread safe, so multiple threads can share one producer ---- 2018-11-20 12:12:33 UTC - Ganga Lakshmanasamy: In our case, each producer is connected to an email folder, so the producer waits for messages and sends them to the consumer when a message comes in. So if there are 1000 users connecting different email accounts, there would be 1000 producer instances created. And inside all these 1000 instances there will be a thread waiting for a message to arrive. Any suggestion on how this kind of scenario can be handled? ---- 2018-11-20 12:13:20 UTC - Ganga Lakshmanasamy: Is there any documentation available on how the message can be built with this new API? ---- 2018-11-20 12:22:56 UTC - Ivan Kelly: why are you creating a producer per user? couldn't you just use a pool of producers? ---- 2018-11-20 15:51:03 UTC - Ganga Lakshmanasamy: So here is a change we have made: there will be one producer created for the app, and that producer will be used in, say, 200,000 independent threads to send out messages. Will this have any impact? The producer will be created as a singleton and used by all those instances ---- 2018-11-20 16:08:04 UTC - Ryan Samo: Hey guys and gals, I was looking into PIP-20 for revoking TLS certs in case you get compromised or have a bad actor. In some systems you can concatenate certs into your root CA so that they can be picked up. Then if you want to revoke them you just delete them from the root CA concatenation. Does Pulsar support the ability to use concatenation in the root CA? I gave it a shot and on first pass wound up with a 500 error: “Valid Proxy Client role should be provided for getPartitionMetadataRequest”. The proxy role and client roles are both granted to the namespace; I just figured it was due to the root CA not supporting this in Pulsar. Any thoughts on trying to pull this off, or maybe an update on PIP-20? ---- 2018-11-20 17:14:51 UTC - Byron: Hi folks, has anyone come up with a series of checks to test each component of a fresh Pulsar deployment? I seem to run into subtle issues every time I deploy a new cluster. Usually they are communication- or memory-related ---- 2018-11-20 17:18:01 UTC - Sijie Guo: PIP-20 is still WIP. @Ivan Kelly might be able to incorporate your question here. ---- 2018-11-20 17:19:15 UTC - Sijie Guo: for bookies, you can use `bin/bookkeeper shell sanitycheck` for a single-bookie test and `bin/bookkeeper shell simpletest` for a cluster-wide test.
for brokers, there is WIP in master to add a health check ---- 2018-11-20 17:19:59 UTC - Sijie Guo: you can read this section: <http://pulsar.apache.org/docs/en/deploy-bare-metal/#deploying-a-bookkeeper-cluster> ---- 2018-11-20 17:20:42 UTC - Byron: thank you. I was running into a `BrokerPersistenceError`, which I assume is due to the broker not being able to communicate with the bookie ---- 2018-11-20 17:21:30 UTC - Sijie Guo: do you have a detailed stack trace? ---- 2018-11-20 17:22:10 UTC - Ryan Samo: Thanks @Sijie Guo! ---- 2018-11-20 17:22:32 UTC - Byron: this was returned by the (Go) client, but I can see if I can track it down in the logs of the broker ---- 2018-11-20 17:23:02 UTC - Sijie Guo: oh I see. ---- 2018-11-20 17:32:02 UTC - imteyaz ahmed khan: @imteyaz ahmed khan has joined the channel ---- 2018-11-20 17:35:30 UTC - Byron: I see `Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Error while recovering ledger` ---- 2018-11-20 17:35:45 UTC - Byron: In the broker stack trace ---- 2018-11-20 17:36:29 UTC - Matteo Merli: Is that in Kubernetes? ---- 2018-11-20 17:36:32 UTC - Byron: Yes ---- 2018-11-20 17:36:51 UTC - Matteo Merli: It might be related to Bookie pods having different IP addresses ---- 2018-11-20 17:37:13 UTC - Byron: I don’t see any errors in the bookies themselves ---- 2018-11-20 17:37:42 UTC - Matteo Merli: Is that deployed as a StatefulSet? ---- 2018-11-20 17:37:51 UTC - Byron: yes ---- 2018-11-20 17:38:04 UTC - Byron: I used the gcp YAML files ---- 2018-11-20 17:38:24 UTC - Byron: with very minimal changes (like cluster name) ---- 2018-11-20 17:39:11 UTC - Matteo Merli: Ok, let me check those yaml files ---- 2018-11-20 17:40:34 UTC - Matteo Merli: In brokers, does it say it fails to connect to bookie pods? ---- 2018-11-20 17:41:48 UTC - Matteo Merli: There’s a known problem (which we fixed in the Streamlio branch) with the caching of DNS names in the BK client. This was fixed and we plan to release it soon in BK-4.7.3 and cascade that into Pulsar-2.2.1 ---- 2018-11-20 17:43:38 UTC - Byron: `17:34:01.995 [bookkeeper-io-12-4] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x7da7f0f3]/bookkeeper-2.bookkeeper.default.svc.cluster.local:3181, current state CONNECTING` ---- 2018-11-20 17:43:58 UTC - Byron: from one broker pod ---- 2018-11-20 17:44:38 UTC - Byron: same one with the error above ---- 2018-11-20 17:44:47 UTC - Matteo Merli: If you bounce that pod, can it connect when it comes back up? ---- 2018-11-20 17:50:50 UTC - Byron: I bounced all three and get the same error ---- 2018-11-20 17:51:16 UTC - Matteo Merli: ok, then it’s something different ---- 2018-11-20 17:51:44 UTC - Matteo Merli: from a broker pod, are you able to `ping bookkeeper-2.bookkeeper.default.svc.cluster.local`? ---- 2018-11-20 17:54:33 UTC - Byron: ah nope ---- 2018-11-20 17:55:04 UTC - Matteo Merli: what about other bookies? ---- 2018-11-20 17:55:26 UTC - Byron: no ---- 2018-11-20 17:57:29 UTC - Byron: oh.. there is no bookie service ---- 2018-11-20 17:57:56 UTC - Byron: only the stateful set and deployment (for auto-recovery) ---- 2018-11-20 17:58:43 UTC - Byron: so I presume the k8s DNS wouldn't resolve ---- 2018-11-20 17:59:05 UTC - Byron: in the generic k8s config there is a service defined ---- 2018-11-20 17:59:28 UTC - Byron: it also uses a daemon set instead of a stateful set, I see ---- 2018-11-20 18:00:29 UTC - Matteo Merli: Yes, although a daemon set has its own set of tricky parts (the pod names and IPs change all the time..
) ---- 2018-11-20 18:00:56 UTC - Matteo Merli: So, yes, the K8S “service” needs to be defined to have the DNS working ---- 2018-11-20 18:01:14 UTC - Byron: ^ah. Ok, I will add a service and try it out ---- 2018-11-20 18:02:04 UTC - Matteo Merli: And we need to use pod names (rather than pod IPs) for bookies, because we need that bookie identifier to be stable (otherwise we cannot know where we wrote the data earlier). ---- 2018-11-20 18:02:29 UTC - Byron: I also discovered another minor issue with the gcp config: the `ledger-disk` referenced in bookie.yaml should be `ledgers-disk`, and also a topologyKey.. I can open a PR for these items if that works ---- 2018-11-20 18:02:46 UTC - Matteo Merli: Yes, please! ---- 2018-11-20 18:04:38 UTC - Byron: Now it works :slightly_smiling_face: ---- 2018-11-20 18:04:42 UTC - Byron: Thanks for the help ---- 2018-11-20 18:04:45 UTC - Byron: I will open a PR ---- 2018-11-20 18:04:46 UTC - Matteo Merli: :slightly_smiling_face: ---- 2018-11-20 18:04:59 UTC - Matteo Merli: Is the service spec missing from the Yaml in the repo? ---- 2018-11-20 18:05:03 UTC - Byron: Yes ---- 2018-11-20 18:05:07 UTC - Matteo Merli: ouch ---- 2018-11-20 18:05:31 UTC - Byron: I deliberately tried what was provided ---- 2018-11-20 18:06:25 UTC - Byron: other than the misspelling and setting resource limits (which aren’t necessarily required), everything else worked ---- 2018-11-20 18:17:36 UTC - Ivan Kelly: @Ryan Samo that looks like you would have to distribute a new cacert each time you generate a cert, no? ---- 2018-11-20 18:18:12 UTC - Ivan Kelly: 200,000 threads on a single machine? ---- 2018-11-20 18:18:24 UTC - Ivan Kelly: you'll likely hit a lot of lock contention ---- 2018-11-20 18:23:44 UTC - Byron: @Matteo Merli <https://github.com/apache/pulsar/pull/3026> ---- 2018-11-20 18:37:29 UTC - Matteo Merli: :+1: ---- 2018-11-20 19:58:26 UTC - Ryan Samo: Well, I’ve seen it where the root cert is always at the top and then you concat all new certs below. The incoming clients only care that the root cert at the top is there. Not sure how that works in a Pulsar environment or if it even would. ---- 2018-11-21 05:29:57 UTC - Samuel Sun: we want to know the latency from client --> broker --> storage, to tune our service. ----
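On the last question about latency from client to broker to storage: the broker acknowledges a publish only after the entry has been persisted to BookKeeper, so timing the completion of `sendAsync()` on the client gives a rough end-to-end publish latency. A minimal Java sketch of that idea; the `latency-test` topic and the `pulsar://localhost:6650` service URL are placeholders, not from the conversation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class PublishLatencyProbe {
    public static void main(String[] args) throws Exception {
        // Assumption: a broker reachable on the default binary port.
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Hypothetical topic used only for this measurement.
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("latency-test")
                .create();

        int numMessages = 1000;
        List<CompletableFuture<Long>> latencies = new ArrayList<>();

        for (int i = 0; i < numMessages; i++) {
            long start = System.nanoTime();
            // The future completes when the broker acknowledges the message,
            // i.e. after the entry has been written to the bookies.
            latencies.add(producer.sendAsync("msg-" + i)
                    .thenApply(msgId -> System.nanoTime() - start));
        }

        long totalNanos = 0;
        for (CompletableFuture<Long> latency : latencies) {
            totalNanos += latency.join();
        }
        System.out.printf("average publish latency: %.2f ms over %d messages%n",
                totalNanos / 1_000_000.0 / numMessages, numMessages);

        producer.close();
        client.close();
    }
}
```

For an out-of-the-box measurement, the bundled `bin/pulsar-perf produce` tool reports similar publish-latency percentiles, and the brokers expose storage write latency metrics that should isolate the broker-to-bookie leg.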

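Going back to the producer-per-user discussion earlier in the thread: since the producer is thread-safe, one option is a single shared client and producer driven by a bounded worker pool with non-blocking `sendAsync()` calls, instead of 200,000 dedicated threads. A rough sketch of that approach; the topic name, service URL, pool size, and the `fetchNextEmail()` helper are all illustrative placeholders:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class SharedProducerSketch {

    // Placeholder for whatever fetches the next message from one user's mailbox.
    static String fetchNextEmail(String userId) {
        return "email-for-" + userId;
    }

    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")   // assumed broker URL
                .build();

        // One producer shared by every task; the Pulsar producer is thread safe.
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("emails")                          // hypothetical single topic
                .blockIfQueueFull(true)                   // back-pressure instead of failing when the send queue fills
                .create();

        // A bounded pool of workers instead of one dedicated thread per user.
        ExecutorService workers = Executors.newFixedThreadPool(16);

        for (int i = 0; i < 1000; i++) {
            String userId = "user-" + i;
            workers.submit(() -> {
                String email = fetchNextEmail(userId);
                // sendAsync() does not block the worker; the client pipelines and
                // batches writes over a shared connection to the broker.
                producer.sendAsync(email).exceptionally(ex -> {
                    System.err.println("send failed for " + userId + ": " + ex);
                    return null;
                });
            });
        }

        workers.shutdown();
        // In a real application: await termination, then flush/close the producer and client.
    }
}
```

If a single producer ever becomes the bottleneck, a small fixed pool of producers (for example, selected by a hash of the user ID) spreads the contention without going anywhere near one producer, or one thread, per user.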