Slack digest for #general - 2020-07-21

Apache Pulsar Slack Tue, 21 Jul 2020 02:19:43 -0700

2020-07-20 09:20:25 UTC - Tolulope Awode: Hello, I am having similar issue
----
2020-07-20 09:21:04 UTC - Tolulope Awode: ```07-19 10:56:21.201 WARN  [140648026367744] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
2020-07-19 10:56:21.210 WARN  [140648009582336] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
2020-07-19 10:56:21.301 WARN  [140648026367744] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
2020-07-19 10:56:21.311 WARN  [140648009582336] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.```


----
2020-07-20 11:52:41 UTC - VanderChen: I have the same question. Additionally, 
is there any way to restart the failed broker automatically?
----
2020-07-20 12:59:24 UTC - Ebere Abanonu: @Sijie Guo this is important, very!!
----
2020-07-20 13:56:38 UTC - Jonas Kint: Is there a public release schedule for 
Pulsar? I’m currently running into issues with the GCS offloader that got 
fixed in master and need to do some capacity planning in order to survive our 
current topic growth :sweat_smile:
----
2020-07-20 13:57:27 UTC - Ebere Abanonu: September 2.7.0
----
2020-07-20 13:58:54 UTC - Frank Kelly: Thanks - The only solution I found was 
to delete my minikube environment and recreate it - I had assigned it about 
36GB of disk
----
2020-07-20 14:34:28 UTC - Jonas Kint: and are there plans on doing a minor 
2.6.1 release?
----
2020-07-20 14:36:46 UTC - Ebere Abanonu: Maybe, maybe not. I think, following 
the pattern, 2.6.1 should be expected
----
2020-07-20 14:52:57 UTC - rwaweber: Hey all! Given the thread is nearly a month 
old, I’ll start a new one here, since it kind of morphed from the original 
question:

Does anyone have an idea why the `pulsar_storage_size` metric for a topic 
appears to report double the occupied storage reported by the 
`pulsar-admin topics stats-internal <topic>` command?

original thread here: 
<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1592947146270700>
----
2020-07-20 15:17:31 UTC - Shivam Arora: @Shivam Arora has joined the channel
----
2020-07-20 15:37:41 UTC - Ashish Srivastava: @Ashish Srivastava has joined the 
channel
----
2020-07-20 16:43:45 UTC - Joshua Decosta: @Addison Higham in regards to using 
two or more AuthenticationProviders: if I have a custom AuthorizationProvider, 
do I need to configure it to deal with all the AuthenticationProviders that are 
enabled? This is meant to deal with brokerClients in one way and regular 
clients in a customized way.
----
2020-07-20 16:46:10 UTC - Joshua Decosta: Does any Authentication/Authorization 
occur on layer 4? 
----
2020-07-20 17:19:39 UTC - Joshua Eric: This slack is a fantastic resource and 
you'll likely get the answers you need. <https://pandio.com> also offers 
managed services and enterprise support.
----
2020-07-20 17:23:09 UTC - Joshua Eric: This is what I ended up putting 
together, taking into consideration the format of the `input` variable being 
passed into `def process(self, input, context)`:

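```class Schema:
    """Decorator: decode the function's input with a schema, encode its result."""

    def __init__(self, schema):
        # The schema object must provide encode() and decode() methods.
        self.schema = schema

    def __call__(self, f):
        def wrapped(*args):
            # args[0] is self, args[1] is `input`, which arrives as a str.
            args = list(args)
            args[1] = self.schema.decode(args[1].encode())
            # Encode the result and hand it back as a str again.
            return self.schema.encode(f(*args)).decode("utf-8")

        return wrapped```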
----
2020-07-20 17:24:23 UTC - Joshua Eric: This was necessary due to the type of 
`input` being a string. I expected it to be bytes, but I haven't dug deep into 
what is being done to it before it is passed into that function.
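
A minimal sketch of how the decorator above might be applied. `JsonSchema` and `AddGreeting` are hypothetical stand-ins invented for illustration (a real function would normally subclass `pulsar.Function` and use one of Pulsar's schema classes); the only assumption carried over from the snippet is that the schema exposes `encode()` returning bytes and `decode()` accepting bytes.

```python
import json


class JsonSchema:
    """Hypothetical schema: encode() returns bytes, decode() takes bytes."""

    def encode(self, obj):
        return json.dumps(obj).encode("utf-8")

    def decode(self, data):
        return json.loads(data.decode("utf-8"))


class Schema:
    """Same decorator as above, repeated so this sketch runs on its own."""

    def __init__(self, schema):
        self.schema = schema

    def __call__(self, f):
        def wrapped(*args):
            args = list(args)
            # args[1] is `input`; it arrives as a str, so turn it into bytes first.
            args[1] = self.schema.decode(args[1].encode())
            return self.schema.encode(f(*args)).decode("utf-8")

        return wrapped


class AddGreeting:
    """Plain class standing in for a Pulsar function."""

    @Schema(JsonSchema())
    def process(self, input, context):
        # Here `input` is already a dict, thanks to the decorator.
        input["greeting"] = "hello " + input["name"]
        return input


# The caller passes a str and gets a str back, as described in the thread.
result = AddGreeting().process('{"name": "pulsar"}', None)
```

The function body only ever sees decoded objects, and the str-to-bytes conversion stays in one place.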
----
2020-07-20 19:18:17 UTC - Addison Higham: I assume you are looking for an IP 
whitelist/blacklist?

There isn't any built-in support, but the `AuthenticationDataSource` has fields 
for getting the IP address and the like, so that might be possible to use
----
2020-07-20 19:19:41 UTC - Addison Higham: It wouldn't technically happen at L4, 
but you could use L4 information in your auth and authz decisions
----
2020-07-20 19:20:15 UTC - Addison Higham: @Joshua Decosta It will still get 
invoked, yes, but the `AuthenticationDataSource` would be unique depending on 
which auth provider allowed it (I believe, haven't ever actually tested it)
----
2020-07-20 19:36:57 UTC - Sijie Guo: If you are running in K8s, you can enable 
a liveness probe on the brokers.
----
2020-07-20 19:39:41 UTC - Sijie Guo: Is the metric you viewed in the grafana 
dashboard?
----
2020-07-20 19:40:08 UTC - Sijie Guo: If so, you might need to check if it sums 
the metrics both from the namespace level metrics as well as from the topic 
level metrics.
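
To illustrate how that kind of double counting could arise, here is a small sketch. The samples are made up, and the assumption that the same storage shows up once in a namespace-level series and again in the topic-level series is exactly the thing to verify against the real dashboard query.

```python
from collections import defaultdict

# Made-up samples shaped as (labels, value); not real broker output.
SAMPLES = [
    # Namespace-level aggregate (no `topic` label).
    ({"namespace": "public/default"}, 300.0),
    # Topic-level series for the same namespace.
    ({"namespace": "public/default", "topic": "persistent://public/default/a"}, 100.0),
    ({"namespace": "public/default", "topic": "persistent://public/default/b"}, 200.0),
]


def sum_by_namespace(samples, topic_level_only):
    """Sum sample values per namespace, optionally skipping aggregate series."""
    totals = defaultdict(float)
    for labels, value in samples:
        if topic_level_only and "topic" not in labels:
            continue  # skip the namespace-level aggregate
        totals[labels["namespace"]] += value
    return dict(totals)


# Summing everything counts the storage twice.
naive = sum_by_namespace(SAMPLES, topic_level_only=False)
# Restricting to series that carry a `topic` label counts it once.
filtered = sum_by_namespace(SAMPLES, topic_level_only=True)
```

With these samples the naive sum is exactly twice the filtered one, which matches the symptom described.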
----
2020-07-20 21:16:16 UTC - Joshua Decosta: So are the broker to broker 
communications always at the application level? 
----
2020-07-20 22:03:51 UTC - Addison Higham: yes
----
2020-07-20 23:54:59 UTC - Nick Rivera: Hi! My organization currently uses 
RabbitMQ as the message bus that powers our RPC layer but I was interested in 
evaluating Pulsar as a replacement and have a few questions.

1. We use a pattern where services dynamically create a request/response queue 
at startup. Is this pattern viable in Pulsar or is the expense of creating 
and/or deleting topics too much on the cluster?
2. It seems that messages within a topic can be given a TTL but the topics 
themselves remain. What negative effects are caused by having many empty topics 
on a cluster? Is there a reasonable way to clean these up?
3. non-persistent topics seem like an attractive option for the 
request/response queue, however since they are not persisted how are they load 
balanced across brokers as they go up or down? Are non-persistent topics always 
tied to a particular broker for the topic's lifetime? 

----
2020-07-20 23:57:17 UTC - Ali Ahmed: 1. The cost of creating and deleting a 
topic is negligible.
2. There is no real effect; topics can be set to autodelete
----
2020-07-20 23:58:17 UTC - Ali Ahmed: @Nick Rivera you should try evaluating this
<https://github.com/streamnative/aop>
----
2020-07-20 23:58:54 UTC - Nick Rivera: I should mention we are a C++ codebase
----
2020-07-20 23:59:18 UTC - Nick Rivera: ah but I see
----
2020-07-20 23:59:50 UTC - Nick Rivera: With regards to point 2, I was unable to 
find how to do that within the documentation. How is topic auto-deletion 
achieved?
----
2020-07-21 00:00:26 UTC - Ali Ahmed: it’s a config, I don’t remember it off the 
top of my head
----
2020-07-21 00:04:46 UTC - Shivam Arora: Hello all! In the documentation of 
minimum hardware requirements on bare metal, there is a reference to AWS 
i3.4xlarge -
1. Is there any recommended requirement for bare metal or VMs?
2. The AWS minimum requirement is baselined for how much load?
Thanks in advance
----
2020-07-21 00:06:12 UTC - Nick Rivera: do you recall if the config is on the 
namespace or the topic itself? I am having trouble finding what you are 
referring to in the docs
----
2020-07-21 00:14:59 UTC - Addison Higham: @Nick Rivera The setting is called 
`brokerDeleteInactiveTopicsEnabled` and it currently applies to the entire 
cluster, but I think there are plans to allow it to be set per namespace as well
----
2020-07-21 00:15:20 UTC - Nick Rivera: ahhh ok
----
2020-07-21 00:15:32 UTC - Nick Rivera: didn't think to check at the cluster 
level, but that's very useful. Thank you!
----
2020-07-21 00:16:56 UTC - Addison Higham: Coming from RabbitMQ, if you want a 
`transient` topic, that is pretty achievable with expiring subscriptions and 
that property
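
A rough sketch of what checking for that combination could look like. The sample text and the helper names (`parse_conf`, `transient_topics_configured`) are illustrative only; `brokerDeleteInactiveTopicsFrequencySeconds` and `subscriptionExpirationTimeMinutes` are the companion `broker.conf` settings assumed here for the check interval and subscription expiry, and the values shown are arbitrary.

```python
# Illustrative broker.conf fragment; the values are arbitrary examples.
SAMPLE_BROKER_CONF = """
# Let the broker delete inactive topics
brokerDeleteInactiveTopicsEnabled=true
brokerDeleteInactiveTopicsFrequencySeconds=60
# Expire idle subscriptions after this many minutes (0 disables expiry)
subscriptionExpirationTimeMinutes=30
"""


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def transient_topics_configured(settings):
    """True only if both settings are explicitly present and enabled.

    This inspects the given text alone; it does not know the broker's
    built-in defaults for keys that are missing.
    """
    deletes = settings.get("brokerDeleteInactiveTopicsEnabled", "").lower() == "true"
    expires = int(settings.get("subscriptionExpirationTimeMinutes", "0")) > 0
    return deletes and expires


settings = parse_conf(SAMPLE_BROKER_CONF)
```

Once the last subscription on a topic expires the topic becomes eligible for deletion, which is what gives the `transient` behaviour.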
----
2020-07-21 00:19:07 UTC - Addison Higham: And with regard to your question 
about non-persistent topics, your intuition is correct: any unconsumed messages 
would be lost when a topic is migrated.

And yes AoP is something to take a look at as well, for some more context, see 
this blog post 
<https://medium.com/streamnative/announcing-amqp-on-pulsar-bring-native-amqp-protocol-support-to-apache-pulsar-dc7bc10c106f>
----
2020-07-21 00:22:54 UTC - Nick Rivera: It is very interesting. I've actually 
already integrated against the pulsar-client-cpp library so I wish I had looked 
into this beforehand
----
2020-07-21 00:45:53 UTC - Addison Higham: :thumbsup: one more follow-up 
question for you: how many topics do you think you would be creating? While it 
is true that Pulsar can handle lots of topics, if you are talking multiple 
hundreds of thousands, you may need to tune the cluster more.
----
2020-07-21 02:15:22 UTC - Addison Higham: I don't think those machine sizes 
should be seen as minimums. Pulsar can effectively run on quite small machines 
(just with obviously less throughput), even down to 1 GB of heap and a single 
CPU for all components. It is probably best to work out from your requirements 
what amount of hardware you need. But in general, if you are looking for a 
guide on what to think about for hardware and where you bottleneck first:

- BookKeeper nodes benefit from fast SSDs (for the journal) and a moderately 
speedy larger disk for ledgers (HDDs are fine, and a bookie can use multiple 
disks to scale out). Memory is used for caches, but mostly BookKeeper does a 
good job of pushing the disk to where it is the bottleneck
- Brokers are a bit more varied and depend a fair amount on use case. For 
write-bound workloads, network saturation is often the bottleneck, but it can 
be CPU for more read-heavy workloads

If you want to get a rough idea of hardware, using `pulsar-perf` or the 
openmessaging benchmark is a pretty quick way to get an idea of performance on 
a given set of hardware
----
2020-07-21 03:05:07 UTC - markmborg: @markmborg has joined the channel
----
2020-07-21 03:07:12 UTC - VanderChen: Thanks a lot. And is there any solution 
for bare-metal deployment?
----
2020-07-21 03:31:41 UTC - Shivji Kumar Jha: @Addison Higham ^
----
2020-07-21 03:34:21 UTC - Shivji Kumar Jha: With 
bookkeeperTLSClientAuthentication=false and tlsClientAuthentication=false, I 
have this on the broker:
```Successfully connected to bookie using TLS: pulsar-node3.<..>.com:3181```
----
2020-07-21 03:34:25 UTC - Shivji Kumar Jha: My TLS is working, it seems, but 
it's when I enable auth that my bookies start crashing!
----
2020-07-21 03:35:49 UTC - Addison Higham: the bookie crash is the same 
exception above?
----
2020-07-21 03:38:37 UTC - Addison Higham: I actually haven't run TLS on bookies 
yet, so I am not super familiar with what that could be, but I would certainly 
think that clients not sending a cert causing a crash would be a bug... Does 
that stack trace have any more?
----
2020-07-21 03:39:36 UTC - Shivji Kumar Jha: Yes @Addison Higham I just tried 
<https://github.com/streamnative/charts/blob/c6064db23db3540bb4af78d34ae5408c4934592b/charts/pulsar/templates/broker/broker-configmap.yaml#L172|this>
 trying to use the V2 protocol, now I see only this in the bookie log:
```02:46:58.183 [BookKeeperClientWorker-OrderedExecutor-1-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L1241925 E34309 from bookie: pulsar-node3.<####>:3181
02:46:58.184 [bookkeeper-io-19-1] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0xaf47a3fe, L:/10.160.6.172:47340]/pulsar-node3.<####>.com:3181, current state CONNECTING :
io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: pulsar-node3.<####>/10.160.6.118:3181
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
        at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124) ~[io.netty-netty-transport-native-unix-common-4.1.48.Final.jar:4.1.48.Final]
        at io.netty.channel.unix.Socket.finishConnect(Socket.java:243) ~[io.netty-netty-transport-native-unix-common-4.1.48.Final.jar:4.1.48.Final]
        at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:672) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
        at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:649) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
        at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:529) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]```
----
2020-07-21 03:46:13 UTC - Addison Higham: Hrm @Sijie Guo would probably have 
more context around that specific change, but one thing, have you tried just 
using `openssl s_client -connect` to check the certs more directly? The error 
you had earlier and that error now would lead me to think you may have some 
cert issues? perhaps the server isn't presenting the right cert in its cert 
request to the client? If the client doesn't provide a cert, that usually means 
the client didn't trust the cert presented to it in the cert request
----
2020-07-21 03:47:09 UTC - Addison Higham: these are usually much easier to 
diagnose by just using the openssl tools
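
Where the openssl CLI is not at hand, a similar handshake check can be sketched with Python's standard `ssl` module. This is only a sketch: `check_tls` is a made-up helper name, and the host, port and CA file in the usage comment are placeholders rather than values from this thread. For a bookie that requires client certificates, one would presumably also call `context.load_cert_chain(...)` before connecting.

```python
import socket
import ssl


def check_tls(host, port, ca_file=None, timeout=3.0):
    """Attempt a TLS handshake and report what happened.

    ca_file: optional path to the CA bundle the server cert should chain to;
    when omitted, the system trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                return {
                    "ok": True,
                    "version": tls.version(),
                    "subject": cert.get("subject"),
                    "error": None,
                }
    except OSError as exc:
        # Covers refused connections, timeouts and ssl.SSLError
        # (including certificate verification failures).
        return {"ok": False, "version": None, "subject": None, "error": repr(exc)}


# Example call (placeholder host and CA path):
#   print(check_tls("bookie.example.com", 3181, ca_file="/path/to/ca.cert.pem"))
```

A verification failure in the `error` field points at a trust problem between client and server, while a refused connection means the bookie is not listening at all.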
----
2020-07-21 03:47:22 UTC - Addison Higham: (AFK for a bit, but feel free to ask 
more questions, can respond later)
----
2020-07-21 04:55:09 UTC - Nick Rivera: I am thinking more on the scale of 100s 
of topics. Perhaps in the low thousands
----
2020-07-21 07:41:15 UTC - Shivam Arora: Thanks Addison for your valuable input. 
I will share our configuration with expected load for others to reference.
----
2020-07-21 08:01:53 UTC - Daniel Ciocirlan: yes, we run bare metal with no k8s 
for the moment and it would really help. It seems restarting the proxies solves 
the issue.
----
2020-07-21 08:51:55 UTC - Frank.Z: @Frank.Z has joined the channel
----
