Slack digest for #general - 2020-07-24

Apache Pulsar Slack Fri, 24 Jul 2020 02:11:25 -0700

2020-07-23 12:42:06 UTC - Arthur: I deployed Pulsar with the official Helm chart 
without TLS. How can I set up an ingress to expose it without TLS at first? Is 
it possible to expose it over pulsar+ssl using only the ingress?
----
2020-07-23 12:45:12 UTC - xiaolong.ran: Yes, next week I will start work on the 
2.6.1 release, and these changes will be included
----
2020-07-23 15:36:17 UTC - Jorge Miralles: Hi guys! Hope you’re doing well!
I have a question about peeking encrypted messages:
We’ve been using pulsar client (java) to peek messages, but now that we’re 
encrypting them, message peeking seems to be failing. I’ve read the docs but 
couldn’t find anything about it.


We’ve seen errors like this when trying to peek messages:

Caused by: org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException: Some error occourred on the server

Is there a way to decrypt messages in java client for peeking?
----
2020-07-23 15:38:51 UTC - Matteo Merli: The peek through REST API cannot show 
the message because brokers won't have the keys to decrypt it.

If you want to "peek" a message, you have a few options:
1. If using a shared subscription, just add one more consumer that receives a 
message, prints it, and returns without acking
2. Create a reader to peek the messages
----
2020-07-23 15:46:38 UTC - Jorge Miralles: Thanks, readers seem like a good option
----
2020-07-23 15:48:04 UTC - Jorge Miralles: I was hoping I could configure the 
admin with the key so it could decrypt the message
----
2020-07-23 15:48:16 UTC - Jorge Miralles: I’ll check the reader option
----
2020-07-23 15:48:19 UTC - Jorge Miralles: Thanks again
----
2020-07-23 15:55:44 UTC - Devin G. Bost: After upgrading to Pulsar 2.5.2, when 
I try to make a Get request to get a tenant, I'm getting:
`Get <https://subdomain-pulsar-ws-tls.domain.com:8443/admin/v2/tenants/ops>: 
x509: certificate signed by unknown authority`
I've double checked the certs. I'm not sure what else to check.
Do we need to set something on the http client before issuing the request?
We're already passing an admin token as an authorization header.
----
2020-07-23 16:25:29 UTC - Devin G. Bost: Or, if someone remembers where in the 
Pulsar source code the REST endpoints are, that would be helpful.
----
2020-07-23 17:11:05 UTC - anbutech17: Thanks Joshua
----
2020-07-23 17:27:56 UTC - Addison Higham: @Takahiro Hozumi on the restriction 
to persistent topics: compaction can't really happen on non-persistent topics, 
so that will be enforced by Pulsar.

Also, just to be clear, "a single active consumer" only applies to a single 
subscription, which means you can still have many different subscriptions, each 
with its own consumer. So while it will not be as efficient, you could use 
multiple exclusive consumers and then filter on keys, OR use a reader, which 
does support compaction, but the subscription isn't durable.


As to why you can't use it on shared or key_shared subscriptions, that is 
enforced by the broker. I am not completely clear on the details, but it is 
likely due to how compaction moves from a compacted topic to the "tail" of the 
raw topic, which is tricky to do with a shared consumer. However, it may be 
somewhat of a different story for a `key_shared` subscription.

If you wouldn't mind opening an issue, it would be a good thing to discuss on 
GitHub and see if it is something that could be supported for key_shared
----
2020-07-23 17:35:43 UTC - Addison Higham: Not quite sure I understand what you 
are asking.

Are you asking if you can do TLS termination via an ingress controller or just 
if an ingress controller  can expose pulsar without TLS? The answer to both of 
those should be yes, but you do need to use an ingress controller capable of 
exposing raw TCP as well as HTTP on the same service.

If you don't need TLS, what you might consider instead, depending on whether 
you use a cloud provider, is a "service" with annotations. For example, on AWS, 
<https://kubernetes.io/docs/concepts/services-networking/service/#aws-nlb-support> 
works well for that
----
2020-07-23 17:38:18 UTC - Addison Higham: Does this work? 
<https://pulsar.apache.org/admin-rest-api/?version=2.6.0&apiversion=v2> 
<- the REST api docs generated from swagger/openapi
----
2020-07-23 17:41:56 UTC - Devin G. Bost: I went through that, but it doesn't 
contain everything I need, such as headers, etc.
So, that's why I need to find where those endpoints are in the source code. I 
found them before, but I just can't remember where they're at.
----
2020-07-23 17:46:55 UTC - Devin G. Bost: I found it in the broker code
----
2020-07-23 18:16:42 UTC - Takahiro Hozumi: @Addison Higham Thank you for your 
comment.
I found the exact same issue on github.
<https://github.com/apache/pulsar/issues/7028>
----
2020-07-23 18:50:09 UTC - Chris Hansen: Some questions about function context 
secrets. It seems like the only way to add/modify a function’s “secretsMap” is 
by using a function config file (e.g. `pulsar-admin functions update 
--function-config-file <filename>`). Is that correct? Related: is there a 
way they can be a bit more dynamic? I suppose in my view it would be better if 
knowing which secrets were available was delegated to SecretsProvider instead, 
but `ContextImpl` always checks w/ this secretsMap before calling the 
SecretsProvider. Maybe there are reasons to do that but I can’t think of them.
`ContextImpl`:
```
    @Override
    public String getSecret(String secretName) {
        if (secretsMap.containsKey(secretName)) {
            return secretsProvider.provideSecret(secretName, secretsMap.get(secretName));
        } else {
            return null;
        }
    }
```
----
2020-07-23 18:58:31 UTC - Yezen: I've posted a few questions about Pulsar End 
to End Encryption so far.

All the examples I see are Java related 
<https://pulsar.apache.org/docs/en/security-encryption/>

I have multiple clients that are written in Go. Does anyone know if there is 
message encryption support for Go when producing and consuming messages? If 
not, what would I need to do to get this working for Go?
----
2020-07-23 19:03:13 UTC - Yezen: On another note: I want to set up Pulsar in 
AWS. I know that I can use Pulsar standalone for local dev 
<https://pulsar.apache.org/docs/en/standalone-docker/>

But what does the topology of a prod environment look like with Pulsar? How 
many docker containers would I need (one for Apache BookKeeper and one for 
Pulsar)? How many EC2 instances do you typically have running?
----
2020-07-23 19:05:01 UTC - Addison Higham: That is correct. As for why the API 
is the way it is: it is primarily for getting the metadata in the `secretsMap` 
back to the backend. The existing backends don't make use of it, but it was 
designed with things like HashiCorp Vault or AWS Secrets Manager in mind, where 
you need some metadata to look up the secret.

I suppose that it might make sense for the secrets provider to be able to work 
with secrets "unknown" to the system. Perhaps the API could be expanded such 
that you had a method on the SecretsProvider

`public boolean hasSecret(String secretName, Map<> secretMap)` and let the 
implementation decide, with the default being the current behavior and 
overridden implementations allowing for grabbing arbitrary secrets
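
A minimal sketch of how that proposal could look. Everything here is hypothetical and for illustration only: `SketchSecretsProvider`, `hasSecret`, `MapBackedSecretsProvider`, and `SketchContext` are made-up names, not Pulsar Functions classes. Only the `provideSecret` call and the `secretsMap` lookup mirror the `ContextImpl` snippet quoted earlier.

```java
import java.util.Map;

// Hypothetical stand-in for the secrets provider interface, reduced to what this sketch needs.
interface SketchSecretsProvider {
    String provideSecret(String secretName, Object pathToSecret);

    // Proposed hook (not an existing API): by default a secret is only
    // available if the function config declared it in secretsMap, which
    // matches the current ContextImpl behavior.
    default boolean hasSecret(String secretName, Map<String, Object> secretsMap) {
        return secretsMap.containsKey(secretName);
    }
}

// Hypothetical provider that opts in to serving secrets the config never declared.
class MapBackedSecretsProvider implements SketchSecretsProvider {
    private final Map<String, String> store;

    MapBackedSecretsProvider(Map<String, String> store) {
        this.store = store;
    }

    @Override
    public String provideSecret(String secretName, Object pathToSecret) {
        return store.get(secretName);
    }

    @Override
    public boolean hasSecret(String secretName, Map<String, Object> secretsMap) {
        // The backing store, not the function config, decides what exists.
        return store.containsKey(secretName);
    }
}

// Hypothetical context that delegates the availability check to the provider.
class SketchContext {
    private final Map<String, Object> secretsMap;
    private final SketchSecretsProvider secretsProvider;

    SketchContext(Map<String, Object> secretsMap, SketchSecretsProvider secretsProvider) {
        this.secretsMap = secretsMap;
        this.secretsProvider = secretsProvider;
    }

    String getSecret(String secretName) {
        if (secretsProvider.hasSecret(secretName, secretsMap)) {
            // Metadata is null when the secret was not declared in the config.
            return secretsProvider.provideSecret(secretName, secretsMap.get(secretName));
        }
        return null;
    }
}
```

With the default `hasSecret`, undeclared names still return `null`, so existing providers would keep their current behavior.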
----
2020-07-23 19:12:44 UTC - Chris Hansen: yep, something like that is what I had 
in mind. I can open an issue to discuss further
----
2020-07-23 19:13:53 UTC - Addison Higham: Pulsar currently has two Go clients, 
but at the moment I don't think either supports encryption. However, it should 
be relatively easy to add. You should consider opening an issue here 
<https://github.com/apache/pulsar-client-go/issues>
----
2020-07-23 19:14:55 UTC - Chris Hansen: also, it seemed to me that the 
SecretsProviderConfigurator can be another source of metadata, so maybe a 
secretsMap isn’t actually needed? edit: though it should be kept for 
backward-compat
----
2020-07-23 19:19:35 UTC - Addison Higham: If you are familiar with EKS, then 
the helm charts at <https://github.com/apache/pulsar-helm-chart> are a really 
rapid way to get a production ready cluster.

If you are looking to run on raw EC2 instances, Pulsar is pretty flexible and 
it all really depends on your needs. If you just have relatively low volume 
needs but want to be resilient to failure, you might consider deploying 
brokers, bookies, and ZooKeeper nodes all on the same machine, and only 3 
instances might be sufficient. For a larger install, you may want to separate 
out the components and then have a total of 9 nodes, 3 each for brokers, 
BookKeeper, and ZooKeeper.
+1 : Yezen, Muljadi
----
2020-07-23 19:30:52 UTC - Yezen: Thanks @Addison Higham.

I opened up an issue here 
<https://github.com/apache/pulsar-client-go/issues/333>
----
2020-07-23 19:31:42 UTC - Yezen: So I know that there is a C++ implementation 
of end-to-end encryption with Pulsar. Would that be compatible with the CGo 
client? <https://pulsar.apache.org/docs/en/client-libraries-cgo/>
----
2020-07-23 19:48:13 UTC - Addison Higham: @Yezen it likely just isn't exposed 
in the golang wrapper, it might be very simple to expose though, as yes, it is 
all supported in C++
+1 : Yezen
----
2020-07-23 20:17:07 UTC - Joshua Decosta: Regarding ProxyRoles, I noticed that 
for tenant operations many of the authZ methods check both the proxy and the 
original principal roles. Is this the case for canProduce/canConsume/canLookup?
----
2020-07-23 20:39:23 UTC - Addison Higham: @Joshua Decosta this section of the 
docs explains that best: 
<http://pulsar.apache.org/docs/en/security-authorization/#proxy-roles>
----
2020-07-23 20:44:40 UTC - Joshua Decosta: @Addison Higham I think my logic is 
failing due to the token aspect. I have a proxy configured and a client 
connects through that proxy. Are both tokens passed to authN and authZ? Do both 
tokens get used throughout the process?
----
2020-07-23 20:45:24 UTC - Muljadi: @Addison Higham what does the configuration 
look like if you have 3 instances, so that they can communicate with ZooKeeper? 
Would we also need S3 bucket access for storage?
----
2020-07-23 20:47:43 UTC - Addison Higham: you would also run ZooKeeper on those 
same 3 nodes, and the ZooKeeper connection string for BookKeeper and the 
brokers would each list the hostnames/IPs of all 3 instances.

As far as S3, that is only used for storage offloading, so you would still need 
some local disk for bookkeeper and zookeeper
+1 : Muljadi, Yezen
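
As a rough illustration of that connection string, here is a small sketch that builds it for three co-located nodes. The hostnames are placeholders and the class name is made up. Port 2181 is ZooKeeper's default client port. The keys `zookeeperServers` (broker.conf) and `zkServers` (bookkeeper.conf) are the names in the 2.6-era config files as far as I recall, so check them against your version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: joins host names into a ZooKeeper connection string.
class ZkConnectString {
    static String build(List<String> hosts, int clientPort) {
        return hosts.stream()
                .map(host -> host + ":" + clientPort)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Placeholder hostnames for the three instances running all components.
        List<String> hosts = Arrays.asList(
                "pulsar-1.example.internal",
                "pulsar-2.example.internal",
                "pulsar-3.example.internal");
        String zk = build(hosts, 2181);

        // The same string goes into both files on every node.
        System.out.println("broker.conf:     zookeeperServers=" + zk);
        System.out.println("bookkeeper.conf: zkServers=" + zk);
    }
}
```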
----
2020-07-23 20:54:35 UTC - Addison Higham: yes, both tokens will get passed, for 
both TCP and HTTP. The proxy takes the original auth data and moves it to a new 
field, then attaches its own data. The code typically checks the proxy first to 
make sure it is authenticated and then authorized (it must also have 
permission, which is commonly done by making the proxy a superuser), then the 
original principal is put through authN/authZ
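
The check order described above can be sketched as follows. This is a simplified illustration, not Pulsar's authorization code: `ProxyAuthSketch` and its fields are hypothetical, authentication is reduced to "the role is not null", and permissions are a plain role-to-topics map.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proxy-first, then original-principal check order.
class ProxyAuthSketch {
    private final Set<String> superUsers;
    private final Map<String, Set<String>> topicsByRole;

    ProxyAuthSketch(Set<String> superUsers, Map<String, Set<String>> topicsByRole) {
        this.superUsers = superUsers;
        this.topicsByRole = topicsByRole;
    }

    // A role passes if it is a superuser or holds an explicit grant on the topic.
    boolean isAuthorized(String role, String topic) {
        if (role == null) {
            return false; // stands in for a failed authentication in this sketch
        }
        return superUsers.contains(role)
                || topicsByRole.getOrDefault(role, Collections.emptySet()).contains(topic);
    }

    boolean canAccess(String proxyRole, String originalPrincipal, String topic) {
        // Step 1: the proxy's own credentials must pass.
        if (!isAuthorized(proxyRole, topic)) {
            return false;
        }
        // Step 2: the client identity forwarded by the proxy must pass too.
        return isAuthorized(originalPrincipal, topic);
    }
}
```

In this model a superuser proxy never widens what the client can do, because the second check still runs against the original principal.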
----
2020-07-23 20:54:36 UTC - Muljadi: thank you!
----
2020-07-23 20:56:15 UTC - Joshua Decosta: Do you know what class I could look 
at to find this flow? I’m not sure where the endpoint classes live to see this. 
----
2020-07-23 20:56:50 UTC - Muljadi: We’re looking into deploying with ECS 
Fargate in AWS to be able to scale up or down
----
2020-07-23 21:00:09 UTC - Addison Higham: @Muljadi the challenge with fargate 
is that there isn't any persistent storage, so while you can use offloading to 
s3, you usually still need at minimum enough disk for a few hours. That might 
be acceptable, but you would need to be careful
----
2020-07-23 21:01:06 UTC - Addison Higham: err, careful when doing any sort of 
deployment
----
2020-07-23 21:01:16 UTC - Muljadi: @Addison Higham what would be the minimum 
disk needed for low-volume usage? That’s a good point re: Fargate, thanks!
----
2020-07-23 21:08:23 UTC - Addison Higham: difficult to say, but here is how you 
could estimate it: when offloading happens and when offloaded data is deleted 
locally are controlled by `offload-threshold` and `offload-deletion-lag`. Those 
apply per topic, so if you set `offload-threshold` to 1 GB and 
`offload-deletion-lag` to 1 hour, write roughly 1 GB/hour, and have 10 topics, 
each node would need to have at least 20 GB ((up to 1 GB before offload 
triggers + 1 GB that waits an hour before deleting) * 10 topics). In reality, 
you would probably want perhaps double that (40 GB), as BookKeeper doesn't 
immediately free up disk and also needs room to compact ledgers.
+1 : Muljadi, charles
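
The estimate above can be written out as a small calculation. This is a sketch under assumptions: the class and method names are made up, the write rate is taken to be per topic, and the data waiting for deletion is modeled as write rate times deletion lag, which is one reading of the "1 GB that waits an hour" term.

```java
// Hypothetical helper reproducing the back-of-the-envelope disk estimate.
class OffloadDiskEstimate {
    // Per topic: up to the offload threshold not yet offloaded, plus whatever
    // was written during the deletion lag and still sits on local disk.
    static double minDiskGb(double thresholdGb, double writeRateGbPerHour,
                            double deletionLagHours, int topics) {
        double perTopicGb = thresholdGb + writeRateGbPerHour * deletionLagHours;
        return perTopicGb * topics;
    }

    // Extra room because disk is not freed immediately and compaction needs space.
    static double withHeadroom(double minGb, double factor) {
        return minGb * factor;
    }

    public static void main(String[] args) {
        // Numbers from the message: 1 GB threshold, 1 GB/hour, 1 hour lag, 10 topics.
        double min = minDiskGb(1.0, 1.0, 1.0, 10);
        System.out.println("minimum: " + min + " GB");                             // minimum: 20.0 GB
        System.out.println("with 2x headroom: " + withHeadroom(min, 2.0) + " GB"); // with 2x headroom: 40.0 GB
    }
}
```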
----
2020-07-24 04:07:43 UTC - Jennifer Huang: @Jennifer Huang set the channel 
topic: - Pulsar vs. Kafka: 
<https://streamnative.io/blog/tech/pulsar-vs-kafka-part-2>
- TGIP #016 Backlog and StorageSize
  1 PM PST, 7/24
  live streaming: 
<https://www.youtube.com/channel/UCywxUI5HlIyc0VEKYR4X9Pg/live>
----
2020-07-24 08:42:38 UTC - Arthur: I installed Pulsar without TLS. Currently I 
expose it with a NodePort, but I want to expose it with an Ingress.
I tried nginx in TCP LB mode but I think it needs an LB behind it. Do you know 
if I can expose TCP with the nginx ingress? What about Traefik?
Your link is about LBs, but I don't use a cloud provider.

Later, I'll need to use the pulsar+ssl protocol. In that case, I don't know 
whether SSL can be enabled on top of Pulsar (at the ingress) or whether I must 
enable it in Pulsar anyway.
----
