2020-07-20 09:20:25 UTC - Tolulope Awode: Hello, I am having a similar issue
---- 2020-07-20 09:21:04 UTC - Tolulope Awode:
```
07-19 10:56:21.201 WARN [140648026367744] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
2020-07-19 10:56:21.210 WARN [140648009582336] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
2020-07-19 10:56:21.301 WARN [140648026367744] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
2020-07-19 10:56:21.311 WARN [140648009582336] AckGroupingTrackerEnabled:99 | Connection is not ready, grouping ACK failed.
```
---- 2020-07-20 11:52:41 UTC - VanderChen: I have the same question. Additionally, is there any way to restart the failed broker automatically?
---- 2020-07-20 12:59:24 UTC - Ebere Abanonu: @Sijie Guo this is important, very!!
---- 2020-07-20 13:56:38 UTC - Jonas Kint: Is there a public release schedule for Pulsar? I’m currently running into issues with the GCS offloader that got fixed in master, and I need to do some capacity planning in order to survive our current topic growth :sweat_smile:
---- 2020-07-20 13:57:27 UTC - Ebere Abanonu: September 2.7.0
---- 2020-07-20 13:58:54 UTC - Frank Kelly: Thanks - The only solution I found was to delete my minikube environment and recreate it - I had assigned it about 36GB of disk
---- 2020-07-20 14:34:28 UTC - Jonas Kint: and are there plans on doing a minor 2.6.1 release?
---- 2020-07-20 14:36:46 UTC - Ebere Abanonu: Maybe, maybe not. I think, following the pattern, 2.6.1 should be expected
---- 2020-07-20 14:52:57 UTC - rwaweber: Hey all! Given the thread is nearly a month old, I’ll start a new one here, since it kind of morphed from the original question: Does anyone have an idea why the `pulsar_storage_size` metric for a topic appears to report double the occupied storage reported by the `pulsar-admin topics stats-internal <topic>` command? Original thread here: <https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1592947146270700>
---- 2020-07-20 15:17:31 UTC - Shivam Arora: @Shivam Arora has joined the channel
---- 2020-07-20 15:37:41 UTC - Ashish Srivastava: @Ashish Srivastava has joined the channel
---- 2020-07-20 16:43:45 UTC - Joshua Decosta: @Addison Higham in regards to using two or more AuthenticationProviders: if I have a custom AuthorizationProvider, do I need to configure it to deal with all the AuthenticationProviders that are enabled? This is meant to deal with brokerClients in one way and regular clients in a customized way.
---- 2020-07-20 16:46:10 UTC - Joshua Decosta: Does any Authentication/Authorization occur on layer 4?
---- 2020-07-20 17:19:39 UTC - Joshua Eric: This slack is a fantastic resource and you'll likely get the answers you need. <https://pandio.com> also offers managed services and enterprise support.
---- 2020-07-20 17:23:09 UTC - Joshua Eric: This is what I had to end up putting together, taking into consideration the format of the `input` variable being passed into `def(self, input, context)`:
```
class Schema:
    """Decorator that decodes the str input with a schema and encodes the result."""

    schema = None

    def __init__(self, *args):
        # The first positional argument is the schema object.
        self.schema = args[0]

    def __call__(self, f):
        def wrapped(*args):
            args = list(args)
            # args[1] is `input`; it arrives as a str, so convert it to bytes before decoding.
            args[1] = self.schema.decode(args[1].encode())
            # Encode the function's return value and hand it back as a str.
            return self.schema.encode(f(*args)).decode("utf-8")
        return wrapped
```
---- 2020-07-20 17:24:23 UTC - Joshua Eric: This was necessary due to the type of `input` being a string. I expected it to be bytes, but I haven't dug deep into what is being done to it before it is passed into that function.
---- 2020-07-20 19:18:17 UTC - Addison Higham: I assume you are looking for IP whitelist/blacklist? There isn't any built-in support, but the `AuthenticationDataSource` has fields for getting the IP address and the like, so that might be possible to use
---- 2020-07-20 19:19:41 UTC - Addison Higham: It wouldn't technically happen at L4, but you could use L4 information in your auth and authz decisions
---- 2020-07-20 19:20:15 UTC - Addison Higham: @Joshua Decosta It will still get invoked, yes, but the `AuthenticationDataSource` would be unique depending on which auth provider allowed it (I believe, haven't ever actually tested it)
---- 2020-07-20 19:36:57 UTC - Sijie Guo: If you are running in K8s, you can enable liveness probe on brokers.
---- 2020-07-20 19:39:41 UTC - Sijie Guo: Is the metric you viewed in the grafana dashboard?
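(Editor's note on the `Schema` decorator posted at 2020-07-20 17:23:09 UTC: the following is a minimal sketch of how such a decorator might be applied, in plain Python with no Pulsar runtime involved. `JsonDictSchema` and `Doubler` are hypothetical names invented for illustration; the only thing taken from the thread is the decorator's behaviour and the `(self, input, context)` signature with `input` arriving as a string.)

```python
import json


class Schema:
    """Same behaviour as the decorator in the thread, repeated so the sketch runs alone."""

    def __init__(self, schema):
        self.schema = schema

    def __call__(self, f):
        def wrapped(*args):
            args = list(args)
            # args[0] is self, args[1] is the str input.
            args[1] = self.schema.decode(args[1].encode())
            return self.schema.encode(f(*args)).decode("utf-8")
        return wrapped


class JsonDictSchema:
    """Hypothetical stand-in schema: converts between bytes and dict via JSON."""

    def encode(self, obj):
        return json.dumps(obj).encode("utf-8")

    def decode(self, data):
        return json.loads(data.decode("utf-8"))


class Doubler:
    """Hypothetical function class using the (self, input, context) signature."""

    @Schema(JsonDictSchema())
    def process(self, input, context):
        # Inside the function, `input` is already a decoded dict.
        input["value"] *= 2
        return input


# The caller passes a str and gets a str back; no context is needed here.
result = Doubler().process('{"value": 21}', None)
print(result)
```

The decorator keeps the string-to-bytes conversion in one place, so the body of `process` only ever deals with decoded objects.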
---- 2020-07-20 19:40:08 UTC - Sijie Guo: If so, you might need to check whether it sums the namespace-level metrics together with the topic-level metrics.
---- 2020-07-20 21:16:16 UTC - Joshua Decosta: So are the broker to broker communications always at the application level?
---- 2020-07-20 22:03:51 UTC - Addison Higham: yes
---- 2020-07-20 23:54:59 UTC - Nick Rivera: Hi! My organization currently uses RabbitMQ as the message bus that powers our RPC layer, but I was interested in evaluating Pulsar as a replacement and have a few questions. 1. We use a pattern where services dynamically create a request/response queue at startup. Is this pattern viable in Pulsar, or is the expense of creating and/or deleting topics too much for the cluster? 2. It seems that messages within a topic can be given a TTL but the topics themselves remain. What negative effects are caused by having many empty topics on a cluster? Is there a reasonable way to clean these up? 3. Non-persistent topics seem like an attractive option for the request/response queue; however, since they are not persisted, how are they load balanced across brokers as they go up or down? Are non-persistent topics always tied to a particular broker for the topic's lifetime?
---- 2020-07-20 23:57:17 UTC - Ali Ahmed: 1. The cost of creating and deleting a topic is negligible. 2. There is no real effect, topics can be set to autodelete
---- 2020-07-20 23:58:17 UTC - Ali Ahmed: @Nick Rivera you should try evaluating this <https://github.com/streamnative/aop>
---- 2020-07-20 23:58:54 UTC - Nick Rivera: I should mention we are a C++ codebase
---- 2020-07-20 23:59:18 UTC - Nick Rivera: ah but I see
---- 2020-07-20 23:59:50 UTC - Nick Rivera: With regards to point 2, I was unable to find how to do that within the documentation. How is topic auto-deletion achieved?
---- 2020-07-21 00:00:26 UTC - Ali Ahmed: it’s a config, I don’t remember it off the top of my head
---- 2020-07-21 00:04:46 UTC - Shivam Arora: Hello all! 
In the documentation of minimum hardware requirements on bare metal, there is a reference to AWS i3.4xlarge. 1. Is there any recommended requirement for bare metal or VMs? 2. What load is the AWS minimum requirement baselined for? Thanks in advance
---- 2020-07-21 00:06:12 UTC - Nick Rivera: do you recall if the config is on the namespace or the topic itself? I am having trouble finding what you are referring to in the docs
---- 2020-07-21 00:14:59 UTC - Addison Higham: @Nick Rivera The setting is called `brokerDeleteInactiveTopicsEnabled` and it currently applies to the entire cluster, but I think there are plans to allow it to be set per namespace as well
---- 2020-07-21 00:15:20 UTC - Nick Rivera: ahhh ok
---- 2020-07-21 00:15:32 UTC - Nick Rivera: didn't think to check at the cluster level, but that's very useful. Thank you!
---- 2020-07-21 00:16:56 UTC - Addison Higham: Coming from RabbitMQ, if you want a `transient` topic, that is pretty achievable with expiring subscriptions and that property
---- 2020-07-21 00:19:07 UTC - Addison Higham: And in regard to your question about non-persistent topics, your intuition is correct: any unconsumed messages would be lost when a topic is migrated. And yes, AoP is something to take a look at as well; for some more context, see this blog post <https://medium.com/streamnative/announcing-amqp-on-pulsar-bring-native-amqp-protocol-support-to-apache-pulsar-dc7bc10c106f>
---- 2020-07-21 00:22:54 UTC - Nick Rivera: It is very interesting. I've actually already integrated against the pulsar-client-cpp library so I wish I had looked into this beforehand
---- 2020-07-21 00:45:53 UTC - Addison Higham: :thumbsup: one more follow-up question for you: how many topics do you think you would be creating? While it is true that Pulsar can handle lots of topics, if you are talking multiple hundreds of thousands, you may need to tune the cluster more.
---- 2020-07-21 02:15:22 UTC - Addison Higham: I don't think those machine sizes should be seen as minimums. Pulsar can effectively run on quite small machines (just with obviously less throughput), even down to 1 GB of heap and a single CPU for all components. It is probably best to work out from your requirements what amount of hardware you need. But in general, if you are looking for a guide on what to think about for hardware and where you bottleneck first:
- BookKeeper nodes benefit from fast SSDs (for the journal) and moderately speedy larger disks for ledgers (HDDs are fine, BookKeeper can use multiple disks itself to scale out). Memory is used for caches, but mostly BookKeeper does a good job of pushing the disk to where it is the bottleneck
- Brokers are a bit more varied and depend a fair amount on use case: for write-bound workloads, network saturation is often the bottleneck, but it can be CPU for more read-heavy workloads
If you want to get a rough idea of hardware, using `pulsar-perf` or the OpenMessaging benchmark is a pretty quick way to get an idea of performance on a given set of hardware
---- 2020-07-21 03:05:07 UTC - markmborg: @markmborg has joined the channel
---- 2020-07-21 03:07:12 UTC - VanderChen: Thanks a lot. And is there any solution for bare-metal deployment?
---- 2020-07-21 03:31:41 UTC - Shivji Kumar Jha: @Addison Higham ^
---- 2020-07-21 03:34:21 UTC - Shivji Kumar Jha: With bookkeeperTLSClientAuthentication=false and tlsClientAuthentication=false, I have this on the broker: ```Successfully connected to bookie using TLS: pulsar-node3.<..>.com:3181```
---- 2020-07-21 03:34:25 UTC - Shivji Kumar Jha: My TLS is working, it seems, but it's when I enable auth that my bookies start crashing!
---- 2020-07-21 03:35:49 UTC - Addison Higham: the bookie crash is the same exception above?
---- 2020-07-21 03:38:37 UTC - Addison Higham: I actually haven't run TLS on bookies yet, so I am not super familiar with what that could be, but I would certainly think that clients not sending a cert causing a crash would be a bug... Does that stack trace have any more?
---- 2020-07-21 03:39:36 UTC - Shivji Kumar Jha: Yes @Addison Higham I just tried <https://github.com/streamnative/charts/blob/c6064db23db3540bb4af78d34ae5408c4934592b/charts/pulsar/templates/broker/broker-configmap.yaml#L172|this>, trying to use the V2 protocol; now I see only this in the bookie log:
```
02:46:58.183 [BookKeeperClientWorker-OrderedExecutor-1-0] INFO org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L1241925 E34309 from bookie: pulsar-node3.<####>:3181
02:46:58.184 [bookkeeper-io-19-1] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0xaf47a3fe, L:/10.160.6.172:47340]/pulsar-node3.<####>.com:3181, current state CONNECTING :
io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: pulsar-node3.<####>/10.160.6.118:3181
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
    at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124) ~[io.netty-netty-transport-native-unix-common-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.channel.unix.Socket.finishConnect(Socket.java:243) ~[io.netty-netty-transport-native-unix-common-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:672) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:649) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:529) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [io.netty-netty-transport-native-epoll-4.1.48.Final-linux-x86_64.jar:4.1.48.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
```
---- 2020-07-21 03:46:13 UTC - Addison Higham: Hrm, @Sijie Guo would probably have more context around that specific change, but one thing: have you tried just using `openssl s_client -connect` to check the certs more directly? 
The error you had earlier and that error now would lead me to think you may have some cert issues? Perhaps the server isn't presenting the right cert in its cert request to the client? If the client doesn't provide a cert, that usually means the client didn't trust the cert presented to it in the cert request
---- 2020-07-21 03:47:09 UTC - Addison Higham: these usually are much easier to diagnose with just the openssl tools
---- 2020-07-21 03:47:22 UTC - Addison Higham: (AFK for a bit, but feel free to ask more questions, can respond later)
---- 2020-07-21 04:55:09 UTC - Nick Rivera: I am thinking more on the scale of hundreds of topics. Perhaps in the low thousands
---- 2020-07-21 07:41:15 UTC - Shivam Arora: Thanks Addison for your valuable input. I will share our configuration with expected load for others to reference.
---- 2020-07-21 08:01:53 UTC - Daniel Ciocirlan: yes, we run bare metal with no k8s for the moment and it would really help. It seems restarting the proxies solves the issue.
---- 2020-07-21 08:51:55 UTC - Frank.Z: @Frank.Z has joined the channel
----
