2018-11-20 09:31:39 UTC - Ganga Lakshmanasamy: No. It's going to be one single topic. ---- 2018-11-20 10:04:00 UTC - weibin.huang: @weibin.huang has joined the channel ---- 2018-11-20 12:09:34 UTC - Ivan Kelly: the producer is thread safe, so multiple threads can share one producer ---- 2018-11-20 12:12:33 UTC - Ganga Lakshmanasamy: In our case, each producer is connected to an email folder, so the producer waits for messages and sends them to the consumer when a message comes in. So if there are 1000 users connecting different email accounts, there would be 1000 producer instances created. And inside all these 1000 instances there will be a thread waiting for a message to arrive. Any suggestion on how this kind of scenario can be handled? ---- 2018-11-20 12:13:20 UTC - Ganga Lakshmanasamy: Is there any documentation available on how the message can be built with this new API? ---- 2018-11-20 12:22:56 UTC - Ivan Kelly: why are you creating a producer per user? couldn't you just use a pool of producers? ---- 2018-11-20 15:51:03 UTC - Ganga Lakshmanasamy: So here is a change we have made: there will be one producer created for the app, and that producer will be used in, say, 200,000 independent threads to send out messages. Will this have any impact? The producer will be created as a singleton and used by all those instances ---- 2018-11-20 16:08:04 UTC - Ryan Samo: Hey guys and gals, I was looking into PIP-20 for revoking TLS certs in case you get compromised or have a bad actor. In some systems you can concatenate certs into your root CA so that they can be picked up. Then if you want to revoke them you just delete them from the root CA concatenation. Does Pulsar support the ability to use concatenation in the root CA? I gave it a shot and on first pass wound up with a 500 error: “Valid Proxy Client role should be provided for getPartitionMetadataRequest”. The proxy role and client roles are both granted to the namespace; I just figured it was due to the root CA not supporting this in Pulsar. Any thoughts on trying to pull this off, or maybe an update on PIP-20? ---- 2018-11-20 17:14:51 UTC - Byron: Hi folks, has anyone come up with a series of checks to test each component of a fresh Pulsar deployment? I seem to run into subtle issues every time I deploy a new cluster. Usually they are communication- or memory-related ---- 2018-11-20 17:18:01 UTC - Sijie Guo: PIP-20 is still WIP. @Ivan Kelly might be able to incorporate your question here. ---- 2018-11-20 17:19:15 UTC - Sijie Guo: for bookies, you can use `bin/bookkeeper shell sanitycheck` for a single-bookie test and `bin/bookkeeper shell simpletest` for a cluster-wide test.
for brokers, there is WIP in master to add a health check ---- 2018-11-20 17:19:59 UTC - Sijie Guo: you can read this section: <http://pulsar.apache.org/docs/en/deploy-bare-metal/#deploying-a-bookkeeper-cluster> ---- 2018-11-20 17:20:42 UTC - Byron: thank you. I was running into a `BrokerPersistenceError`, which I assume is due to the broker not being able to communicate with the bookie ---- 2018-11-20 17:21:30 UTC - Sijie Guo: do you have a detailed stack trace? ---- 2018-11-20 17:22:10 UTC - Ryan Samo: Thanks @Sijie Guo! ---- 2018-11-20 17:22:32 UTC - Byron: this was returned by the (Go) client, but I can see if I can track it down in the logs of the broker ---- 2018-11-20 17:23:02 UTC - Sijie Guo: oh I see. ---- 2018-11-20 17:32:02 UTC - imteyaz ahmed khan: @imteyaz ahmed khan has joined the channel ---- 2018-11-20 17:35:30 UTC - Byron: I see `Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Error while recovering ledger` ---- 2018-11-20 17:35:45 UTC - Byron: In the broker stack trace ---- 2018-11-20 17:36:29 UTC - Matteo Merli: Is that in Kubernetes? ---- 2018-11-20 17:36:32 UTC - Byron: Yes ---- 2018-11-20 17:36:51 UTC - Matteo Merli: It might be related to Bookie pods having different IP addresses ---- 2018-11-20 17:37:13 UTC - Byron: I don’t see any errors in the bookies themselves ---- 2018-11-20 17:37:42 UTC - Matteo Merli: Is that deployed as a StatefulSet? ---- 2018-11-20 17:37:51 UTC - Byron: yes ---- 2018-11-20 17:38:04 UTC - Byron: I used the gcp YAML files ---- 2018-11-20 17:38:24 UTC - Byron: with very minimal changes (like cluster name) ---- 2018-11-20 17:39:11 UTC - Matteo Merli: Ok, let me check those yaml files ---- 2018-11-20 17:40:34 UTC - Matteo Merli: In brokers, does it say it fails to connect to bookie pods? ---- 2018-11-20 17:41:48 UTC - Matteo Merli: There’s a known problem (which we fixed in the Streamlio branch) with the caching of DNS names in the BK client. This was fixed and we plan to release it soon in BK-4.7.3 and cascade that into Pulsar-2.2.1 ---- 2018-11-20 17:43:38 UTC - Byron: `17:34:01.995 [bookkeeper-io-12-4] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x7da7f0f3]/bookkeeper-2.bookkeeper.default.svc.cluster.local:3181, current state CONNECTING` ---- 2018-11-20 17:43:58 UTC - Byron: from one broker pod ---- 2018-11-20 17:44:38 UTC - Byron: same one with the error above ---- 2018-11-20 17:44:47 UTC - Matteo Merli: If you bounce that pod, can it connect when it comes back up? ---- 2018-11-20 17:50:50 UTC - Byron: I bounced all three and get the same error ---- 2018-11-20 17:51:16 UTC - Matteo Merli: ok, then it’s something different ---- 2018-11-20 17:51:44 UTC - Matteo Merli: from a broker pod, are you able to `ping bookkeeper-2.bookkeeper.default.svc.cluster.local`? ---- 2018-11-20 17:54:33 UTC - Byron: ah nope ---- 2018-11-20 17:55:04 UTC - Matteo Merli: what about other bookies? ---- 2018-11-20 17:55:26 UTC - Byron: no ---- 2018-11-20 17:57:29 UTC - Byron: oh.. there is no bookie service ---- 2018-11-20 17:57:56 UTC - Byron: only the stateful set and deployment (for auto-recovery) ---- 2018-11-20 17:58:43 UTC - Byron: so I presume the k8s DNS wouldn't resolve ---- 2018-11-20 17:59:05 UTC - Byron: in the generic k8s config there is a service defined ---- 2018-11-20 17:59:28 UTC - Byron: it also uses a daemon set instead of a stateful set, I see ---- 2018-11-20 18:00:29 UTC - Matteo Merli: Yes, although a daemon set has its own set of tricky parts (the pod names and IPs change all the time..
) ---- 2018-11-20 18:00:56 UTC - Matteo Merli: So, yes, the K8S “service” needs to be defined to have the DNS working ---- 2018-11-20 18:01:14 UTC - Byron: ^ah. Ok, I will add a service and try it out ---- 2018-11-20 18:02:04 UTC - Matteo Merli: And we need to use pod names (rather than pod IPs) for bookies, because we need that bookie identifier to be stable (otherwise we cannot know where we wrote the data earlier). ---- 2018-11-20 18:02:29 UTC - Byron: I also discovered another minor issue with the gcp config: the `ledger-disk` referenced in bookie.yaml should be `ledgers-disk`, and also a topologyKey.. I can open a PR for these items if that works ---- 2018-11-20 18:02:46 UTC - Matteo Merli: Yes, please! ---- 2018-11-20 18:04:38 UTC - Byron: Now it works :slightly_smiling_face: ---- 2018-11-20 18:04:42 UTC - Byron: Thanks for the help ---- 2018-11-20 18:04:45 UTC - Byron: I will open a PR ---- 2018-11-20 18:04:46 UTC - Matteo Merli: :slightly_smiling_face: ---- 2018-11-20 18:04:59 UTC - Matteo Merli: Is the service spec missing from the Yaml in the repo? ---- 2018-11-20 18:05:03 UTC - Byron: Yes ---- 2018-11-20 18:05:07 UTC - Matteo Merli: ouch ---- 2018-11-20 18:05:31 UTC - Byron: I deliberately tried what was provided ---- 2018-11-20 18:06:25 UTC - Byron: other than the misspelling and setting resource limits (which aren’t necessarily required), everything else worked ---- 2018-11-20 18:17:36 UTC - Ivan Kelly: @Ryan Samo that looks like you would have to distribute a new cacert each time you generate a cert, no? ---- 2018-11-20 18:18:12 UTC - Ivan Kelly: 200,000 threads on a single machine? ---- 2018-11-20 18:18:24 UTC - Ivan Kelly: you'll likely hit a lot of lock contention ---- 2018-11-20 18:23:44 UTC - Byron: @Matteo Merli <https://github.com/apache/pulsar/pull/3026> ---- 2018-11-20 18:37:29 UTC - Matteo Merli: :+1: ---- 2018-11-20 19:58:26 UTC - Ryan Samo: Well, I’ve seen it where the root cert is always at the top and then you concat all new certs below. The incoming clients only care that the root cert at the top is there. Not sure how that works in a Pulsar environment or if it even would. ---- 2018-11-21 05:29:57 UTC - Samuel Sun: we want to know the latency from client --> broker --> storage, to tune our service. ----
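On the last question about latency from client to broker to storage: the broker acknowledges a publish only after the entry has been persisted to BookKeeper, so timing the completion of `sendAsync()` on the client gives a rough end-to-end publish latency. A minimal Java sketch of that idea; the `latency-test` topic and the `pulsar://localhost:6650` service URL are placeholders, not from the conversation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class PublishLatencyProbe {
    public static void main(String[] args) throws Exception {
        // Assumption: a broker reachable on the default binary port.
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Hypothetical topic used only for this measurement.
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("latency-test")
                .create();

        int numMessages = 1000;
        List<CompletableFuture<Long>> latencies = new ArrayList<>();

        for (int i = 0; i < numMessages; i++) {
            long start = System.nanoTime();
            // The future completes when the broker acknowledges the message,
            // i.e. after the entry has been written to the bookies.
            latencies.add(producer.sendAsync("msg-" + i)
                    .thenApply(msgId -> System.nanoTime() - start));
        }

        long totalNanos = 0;
        for (CompletableFuture<Long> latency : latencies) {
            totalNanos += latency.join();
        }
        System.out.printf("average publish latency: %.2f ms over %d messages%n",
                totalNanos / 1_000_000.0 / numMessages, numMessages);

        producer.close();
        client.close();
    }
}
```

For an out-of-the-box measurement, the bundled `bin/pulsar-perf produce` tool reports similar publish-latency percentiles, and the brokers expose storage write latency metrics that should isolate the broker-to-bookie leg.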

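Going back to the producer-per-user discussion earlier in the thread: since the producer is thread-safe, one option is a single shared client and producer driven by a bounded worker pool with non-blocking `sendAsync()` calls, instead of 200,000 dedicated threads. A rough sketch of that approach; the topic name, service URL, pool size, and the `fetchNextEmail()` helper are all illustrative placeholders:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class SharedProducerSketch {

    // Placeholder for whatever fetches the next message from one user's mailbox.
    static String fetchNextEmail(String userId) {
        return "email-for-" + userId;
    }

    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")   // assumed broker URL
                .build();

        // One producer shared by every task; the Pulsar producer is thread safe.
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("emails")                          // hypothetical single topic
                .blockIfQueueFull(true)                   // back-pressure instead of failing when the send queue fills
                .create();

        // A bounded pool of workers instead of one dedicated thread per user.
        ExecutorService workers = Executors.newFixedThreadPool(16);

        for (int i = 0; i < 1000; i++) {
            String userId = "user-" + i;
            workers.submit(() -> {
                String email = fetchNextEmail(userId);
                // sendAsync() does not block the worker; the client pipelines and
                // batches writes over a shared connection to the broker.
                producer.sendAsync(email).exceptionally(ex -> {
                    System.err.println("send failed for " + userId + ": " + ex);
                    return null;
                });
            });
        }

        workers.shutdown();
        // In a real application: await termination, then flush/close the producer and client.
    }
}
```

If a single producer ever becomes the bottleneck, a small fixed pool of producers (for example, selected by a hash of the user ID) spreads the contention without going anywhere near one producer, or one thread, per user.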