2020-07-10 09:19:41 UTC - Alan Hoffmeister: If I publish a message to a topic
without consumers, does it get deleted?
----
2020-07-10 09:22:39 UTC - Alan Hoffmeister: I'm trying to queue messages on a
topic on which the consumer isn't ready yet, but when I connect the consumer it
doesn't receive the messages prior to its connection :thinking_face:
----
2020-07-10 09:25:14 UTC - Ebere Abanonu: You need to set retention to a higher
number
----
2020-07-10 09:27:58 UTC - Alan Hoffmeister: ```# ./pulsar-admin namespaces
get-retention public/default
{
"retentionTimeInMinutes" : -1,
"retentionSizeInMB" : -1
}```
Retention is already set to unlimited
----
2020-07-10 09:42:54 UTC - Alan Hoffmeister: oh wait.. do I need a reader
instead of a consumer?
----
2020-07-10 09:56:20 UTC - Miguel Martins: You need to pass the subscription
initial position
----
2020-07-10 09:56:35 UTC - Miguel Martins: By default it's latest
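With the Python client, for example, it would look roughly like this (sketch; topic/subscription names and broker URL are placeholders, and it needs the `pulsar-client` package and a running broker):
```python
# Sketch: subscribe with the cursor starting at the earliest available
# message, so messages published before the consumer connected are delivered.
# Broker URL and names are placeholders.
import pulsar

client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe(
    "persistent://public/default/my-topic",
    subscription_name="my-sub",
    initial_position=pulsar.InitialPosition.Earliest,  # default is Latest
)
msg = consumer.receive()
consumer.acknowledge(msg)
client.close()
```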
----
2020-07-10 10:03:43 UTC - Alan Hoffmeister: hmm got it
----
2020-07-10 10:03:54 UTC - Alan Hoffmeister: I don't think websocket has that
option
----
2020-07-10 10:04:16 UTC - Jay/Fienna Liang: @Addison Higham May I ask you how
to authenticate through websocket? I did some research and it seems it's not
supported by Pulsar for now?
----
2020-07-10 10:17:15 UTC - Jay/Fienna Liang: Hi there, I’m wondering whether
there is any progress or plan on a pure JavaScript client? I need to distribute
my application on Raspberry Pi and Windows, but the native C++ client doesn’t
support Windows for now. I was using WebSocket until I tried to enable
authentication and authorization with JWT, and it seems WebSocket has no way to
support any authentication approach. So is there any solution where I could use
a JavaScript client with authentication enabled that won’t require clients to
have `pulsar-client-dev` installed?
----
2020-07-10 11:09:01 UTC - Alan Hoffmeister: @Jay/Fienna Liang I don't know
about pure js client but what you could do is to spawn a proxy server for
authenticating. Example:
<https://gist.github.com/alanhoff/a85f4edbbc10d71b389f21a80c9b8bde>
----
2020-07-10 11:36:04 UTC - Jay/Fienna Liang: But what I need is a client to send
or receive messages along with authentication credentials like JWT, not a
server to authenticate the client. Or am I misunderstanding the usage of your
example code?
----
2020-07-10 11:53:11 UTC - Alan Hoffmeister: you already have a client, the
WebSocket client
----
2020-07-10 11:54:06 UTC - Alan Hoffmeister: you can still send JWT/cookies with
your current client, then the proxy will authenticate and forward your
connection to pulsar if your credentials are correct
----
2020-07-10 12:14:16 UTC - Vil: I think this is a beginner’s question (sorry),
but what happens if one BookKeeper node dies forever, with all its data gone?
There will be other BookKeeper nodes that have a copy of my data, but now one
copy is missing due to the dead bookie. How will a new copy be created? I found
docs like <https://bookkeeper.apache.org/docs/4.10.0/development/protocol/> but
I still don't understand what happens
----
2020-07-10 12:33:00 UTC - Marcio Martins: AFAIK, there is a process called
auto-recovery, which you can enable to run embedded in a bookie or run as a
standalone service. It will periodically look for dead bookies and/or
under-replicated ledgers and take appropriate actions to fix the situation. Not
100% sure this is exactly correct, though.
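If it helps, the relevant knobs are roughly these (from memory, so check the docs for your BookKeeper version):
```
# conf/bookkeeper.conf - run the auto-recovery daemon inside each bookie
autoRecoveryDaemonEnabled=true
```
or run it as a standalone service with `bin/bookkeeper autorecovery`.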
----
2020-07-10 12:40:08 UTC - Vil: many thanks Marcio for your message
----
2020-07-10 12:42:46 UTC - Vil: do you know how to run this process or where it
is described in documentation?
----
2020-07-10 12:43:45 UTC - Marcio Martins: I believe it's documented in the
bookkeeper manuals
----
2020-07-10 12:44:19 UTC - Marcio Martins: They have operating guides on how to
repair, upgrade etc
----
2020-07-10 12:44:36 UTC - Rounak Jaggi: is there a way to enable tls in bookie
using pem? If so please let me know the configs. Thanks
----
2020-07-10 13:11:29 UTC - Vil: thank you again Marcio
----
2020-07-10 13:22:33 UTC - Vil: Found the information at
<https://bookkeeper.apache.org/archives/docs/r4.4.0/bookieRecovery.html>.
Adding this here for others
----
2020-07-10 13:33:28 UTC - Joshua Decosta: To add onto this. I’m trying to
integrate keycloak into pulsar AuthN authZ. The AuthorizationProvider is what i
need to create for my purposes. Would it be a security risk to rely on the
token claims for the permissions and simply bypass any internal checks?
----
2020-07-10 13:38:51 UTC - Frank Kelly: I don't think so. As long as you
validate that the (JWT?) token signature is correct, that makes sense
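For example, a stdlib-only sketch of that signature check (illustrative only; in production use a vetted JWT library like pyjwt, and also check claims such as exp/iss/aud):
```python
# Minimal sketch of HS256 JWT signature validation, to illustrate
# "validate the token signature before trusting the claims".
import base64
import hashlib
import hmac
import json

def b64url_decode(data: str) -> bytes:
    # JWTs strip base64 padding; add it back before decoding
    pad = "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(data + pad)

def verify_hs256(token: str, secret: bytes) -> dict:
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = (header_b64 + "." + payload_b64).encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    # Signature checked, so the claims can now be trusted
    return json.loads(b64url_decode(payload_b64))
```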
----
2020-07-10 13:39:40 UTC - Joshua Decosta: Do you think this would mess up proxy
to broker communications?
----
2020-07-10 13:40:21 UTC - Frank Kelly: I think they would be separate settings
- `brokerClient******`
----
2020-07-10 13:40:51 UTC - Frank Kelly: I'm in the same boat - creating a custom
Auth*n methodology, and there are a lot of settings involved :disappointed:
----
2020-07-10 13:41:00 UTC - Joshua Decosta: They are, but I’ve been reading in
the docs about needing the proxyRole to be used in some manner. The
proxy-to-broker authN/authZ is confusing for me
----
2020-07-10 14:03:07 UTC - Matteo Merli: BTW, we don’t have prebuilt packages
but the C++ client should be easily compilable on Windows
----
2020-07-10 14:03:53 UTC - Matteo Merli: Or you can pre-create the subscription
----
2020-07-10 14:04:14 UTC - Matteo Merli: There a REST handler for that
+1 : Alan Hoffmeister
----
2020-07-10 14:04:57 UTC - Matteo Merli: Curator is a library that wraps the ZK
client lib
----
2020-07-10 14:38:07 UTC - Dave Miller: @Dave Miller has joined the channel
----
2020-07-10 17:18:21 UTC - Addison Higham: First, thanks for the feedback on
AuthN/AuthZ, a more complete guide for custom auth would be useful. That might
be something we could make happen in the community.
As far as your specific questions:
The token provider *does* validate the JWT, so as long as your JWTs are signed
in such a way that you trust your claims, that should be safe.
As far as the proxy role, it is basically only used to create the privileged
channel between the broker and proxy; the rest of auth will still use the
normal role
+1 : Joshua Decosta, Chris Hansen
----
2020-07-10 17:18:42 UTC - Addison Higham: AFAIK (I haven't looked at that much
recently) the proxyRole isn't really used in the AuthZ part at all
----
2020-07-10 17:18:57 UTC - Addison Higham: (FYI @Joshua Decosta @Frank Kelly)
----
2020-07-10 17:22:26 UTC - Frank Kelly: Thanks @Addison Higham - it's on my "To
do" list to contribute back some documentation for AuthN / AuthZ for sure once
I figure it out - I'm close :-)
----
2020-07-10 17:22:47 UTC - Matt Mitchell: I’m having an issue related to message
acknowledgment. The topic is non-persistent, possibly that’s related, but the
issue is that the Pulsar’s `./pulsar-admin topics stats $TOPIC` output always
shows the subscriber to have an unacknowledged message count equal to the total
number of messages published. I have confirmed that the consumer is
acknowledging each message (at least the code is being called). I’ve tested
both synchronous and asynchronous methods, but they yield the same result. Out of
curiosity, I called `consumer.redeliverUnacknowledgedMessages()` and then the
output of `topics stats` shows 0. What am I doing wrong? Is this related to
non-persistent topics?
----
2020-07-10 17:22:52 UTC - Addison Higham: great! happy to help answer other
questions where I can
+1 : Frank Kelly, Joshua Decosta
----
2020-07-10 17:24:00 UTC - Matteo Merli: Yes, it's probably related to the stats
being off for a non-persistent topic
----
2020-07-10 17:25:48 UTC - Joshua Decosta: @Addison Higham so if I rely on the
token claims for authZ in the AuthorizationProvider class, would that be
sufficient? I just want to make sure I’m understanding what you’re referring
to.
----
2020-07-10 17:26:06 UTC - Matt Mitchell: Ok gtk. The real issue though, is that
my consumer is setting receiverQueueSize (currently 5, just for testing) and
when the `availablePermits` (reported by `topics stats`) becomes 1 less than
the queue size, the consumer stops receiving messages. I guess because its
receiver queue is full?
----
2020-07-10 17:26:43 UTC - Joshua Decosta: How do super user roles come into
play with custom authZ? I’m worried I’d be breaking the capability since I
don’t see it mentioned in the authZ interface
----
2020-07-10 17:29:35 UTC - Addison Higham: the AuthZ API basically has the
`isSuperUser` call (similar to `isTenantAdmin`), as long as you can get the
claims into the authZ provider and you have claims that map to those
capabilities it would work. The rest of the calls are more granular. It seems
like the biggest thing you would lose is that calls to add permissions wouldn't
do anything and instead you would need to reconnect with a new token with new
claims
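Conceptually, the claim-to-capability mapping could be as simple as this (illustrative Python only; Pulsar's actual provider interface is the Java `AuthorizationProvider`, and the claim names here are made up):
```python
# Hypothetical mapping from JWT claims to authZ decisions. The "roles" and
# "produce" claim names are assumptions for illustration.
def is_super_user(claims: dict) -> bool:
    return "superuser" in claims.get("roles", [])

def can_produce(claims: dict, topic: str) -> bool:
    if is_super_user(claims):
        return True  # super users bypass the per-topic check
    return topic in claims.get("produce", [])
```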
----
2020-07-10 19:36:37 UTC - Matteo Merli: Can you post a simple example that
reproduces the issue?
----
2020-07-10 21:11:38 UTC - Muljadi: I’m trying to figure out end-to-end
encryption; in the examples I’ve seen so far, the encryption key (public /
private key) is fetched on a 4-hour period. That seems to indicate that you
only have one single key to encrypt the messages. I’d like to be able to use a
different encryption key for different topics/tenants. Does Pulsar support
different encryption keys for different topics?
----
2020-07-10 22:49:41 UTC - Justin Israel: @Justin Israel has joined the channel
----
2020-07-10 23:02:28 UTC - Justin Israel: Hi All. I'm currently evaluating
Pulsar as an alternative to our current efforts to introduce Kafka at our
facility. I've got a question about encryption that seems to be a non-standard
problem, as I can't find examples in the Kafka world for this.
We have a need to implement film/show level security at our vfx studio. Clients
have different aggregations of show group access at different times. Now it may
seem like this would map easily to just using ACLs for show based topics, but
really this would overcomplicate our topic logic for producers and consumers.
Messages will be the same schema with only the show value being the security
factor. In an ideal world, I would be able to have a producer publish a
particular message type to a single topic where the show is a header or key
value. And then I would have some kind of consumer filter on the broker that is
auth aware, and it would be able to skip past messages that aren't authorised
and mark their offsets as read. That way I don't have to deal with encrypting
different messages per show or different topics. A reason for not using
encryption as well is that if messages are persistent then we can't easily roll
the encryption key on the old stuff.
Does anyone have input on how they might tackle this problem? Without some kind
of consumer filter plugin support on brokers I can only think to create a proxy
where consumers connect and auth, and then the proxy acts as the filter between
a subscription.
----
2020-07-10 23:27:00 UTC - Addison Higham: Hi @Justin Israel, welcome :)
Sounds like an interesting use case!
As things currently stand, I don't think Pulsar can do exactly that, but is
pretty close.
First off is the most straight forward approach: Pulsar topics are pretty
cheap, but instead of producing directly to multiple topics, you could still
produce to one topic and then have a Pulsar function consume that topic and
re-route the message to per-show topics, and then just use ACLs to protect on
the consume side. That would make your producer be pretty much as you describe,
with some consumers being able to consume from per-show topics, or even use a
regex subscription to grab multiple shows or just grab all the shows from the
original "unsplit" topic.
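With the Python Functions SDK, that splitting function could look roughly like this (sketch; the topic naming scheme and the "show" message property are assumptions for illustration):
```python
# Sketch of a Pulsar Function that re-routes messages from one shared topic
# to per-show topics based on a message property, so ACLs can then protect
# each per-show topic on the consume side.
from pulsar import Function

class ShowRouter(Function):
    def process(self, input, context):
        properties = context.get_message_properties()
        show = properties.get("show")
        if show is None:
            context.get_logger().warning("message without a show property")
            return
        # Hypothetical per-show topic naming scheme
        context.publish("persistent://studio/shows/%s" % show, input)
```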
Another option would be to use pulsar built in message encryption:
<http://pulsar.apache.org/docs/en/security-encryption/>
It works pretty close to what you describe; there is even an option on the
consumer where, if it can't decrypt a message, it will silently throw it away
and advance the cursor
(`PulsarClient.newConsumer().cryptoFailureAction(ConsumerCryptoFailureAction)`
is how you do that). This works pretty dang close to what you describe, except
the messages still technically make it to the client. You could combine this
with a `Key_Shared` subscription to potentially make it so that this happens
less. A Key_Shared subscription ensures that one consumer always gets the same
key. A consumer can also ask for a certain range of keys (via a hash, see
`KeySharedPolicySticky`) but this isn't quite as straight forward as you need
to make sure all keys are getting consumed.
Hopefully that helps put you on a good path, happy to answer any follow ups
:slightly_smiling_face:
----
2020-07-11 03:03:34 UTC - Thomas Delora: @Thomas Delora has joined the channel
----
2020-07-11 08:26:03 UTC - Justin Israel: Thanks for the fast reply and the
suggestions! The Pulsar Function seems like it might be a possible approach
since it can manage the creation of topics as needed. And then the auth
solution will just handle the per-show access when they wildcard subscribe.
The common case is that users have access to maybe 90% of the projects, so I
had thought if I could keep it simple and do some kind of message-level auth
filtering, that would prevent us from having to split into many topics. But at
least this way the producers and consumers don't have to think of the
different topics and we still have the original full topic.
I think the encryption approach might not work for long-retention show-based
key access unless we expected to re-stream a new encrypted topic from the
original if we ever wanted to roll the key.
----