2020-05-11 11:01:05 UTC - eric.olympe: Hi everybody ! Is it possible to use Keycloak for authentication/authorization to connect to Apache Pulsar ? ---- 2020-05-11 11:09:17 UTC - rani: *Proxy Log* ```01:18:12.193 [pulsar-external-web-4-6] INFO org.eclipse.jetty.server.RequestLog - 10.242.145.212 - - [11/May/2020:01:18:12 +0000] "GET /admin/v2/persistent/pulsar/functions/coordinate/stats?getPreciseBacklog=false HTTP/1.1" 502 0 "-" "Pulsar-Java-v2.5.1" 4``` *Broker Log* ```11:06:42.192 [pulsar-web-39-6] INFO org.eclipse.jetty.server.RequestLog - 10.242.145.191 - - [11/May/2020:11:06:42 +0000] "GET /admin/v2/persistent/pulsar/functions/coordinate/stats?getPreciseBacklog=false HTTP/1.1" 307 0 "-" "Pulsar-Java-v2.5.1" 2``` Please note that these requests only fail intermittently. Not all the time.
Yes, we do have JWT authentication enabled. The function worker communicates with brokers over pulsar-proxy and we have `functionWorkerWebServiceURL` set in `proxy.conf` ---- 2020-05-11 11:35:30 UTC - Thomas Jamet: @Thomas Jamet has joined the channel ---- 2020-05-11 11:55:07 UTC - Thomas Jamet: Is there a way to set / retrieve the key on a message via the Python client? I don't see any key method exposed by the <https://pulsar.apache.org/api/python/2.5.0-SNAPSHOT/#pulsar.Message|Message class>. ---- 2020-05-11 11:58:41 UTC - Damien Roualen: The problem has been solved: authentication for the broker itself was missing (“brokerClientAuthenticationParameters”). Thanks @Alexandre DUVAL. ---- 2020-05-11 12:12:23 UTC - Matej Šaravanja: Hi, I have a problem either with understanding or implementation. I'm testing `KeyShared` subscription mode in the Python library. There is one broker, 1 partitioned topic (5 partitions), 1 producer and 2 consumers. Consumer config is the same for both consumers: ```config={ "subscription_name": "test-subscription-key-shared", "consumer_type": pulsar.ConsumerType.KeyShared, "initial_position": pulsar.InitialPosition.Earliest }``` When I produce messages with: `producer.send_async(payload, cb, partition_key=random.choice(keys))` to test key-based routing, messages always end up in the consumer that started first. I expected that keys would be equally distributed among consumers. What did I miss? ---- 2020-05-11 14:48:54 UTC - JG: ah ok, yes I saw a difference when a topic already exists and when a topic does not exist yet ---- 2020-05-11 14:49:26 UTC - JG: according to the documentation, even if the topic does not exist, Pulsar will create it automatically ---- 2020-05-11 15:12:58 UTC - Penghui Li: How many keys are used by random.choice()? It’s related to the hash of the key. ---- 2020-05-11 16:07:49 UTC - Addison Higham: Pulsar has pluggable authentication and authorization APIs. 
The way to do that is laid out here: <https://pulsar.apache.org/docs/en/security-extending/> If all you care about is Authn, it is fairly straightforward to implement. The most complicated bit is that you will need to write a client plugin for each language you use with Pulsar and also distribute it. Another option, which is what we do, is to use our primary auth mechanism (in our case, AWS IAM) and have a small service where we can just retrieve tokens. We do that on app boot. ---- 2020-05-11 16:22:08 UTC - Matej Šaravanja: 6 keys. I discovered it was because of the producer batching option. Key-based batching is still not available in the Python client. When batching is disabled, the key shared option works perfectly, but with a very large decrease in throughput ---- 2020-05-11 17:32:57 UTC - Kirill Merkushev: <https://lanwen.ru/posts/pulsar-functions-how-to-debug-with-testcontainers/|https://lanwen.ru/posts/pulsar-functions-how-to-debug-with-testcontainers/> tag me as @delnariel on Twitter if you have anything to say +1 : Konstantinos Papalias, Ali Ahmed, Sijie Guo, David Kjerrumgaard ---- 2020-05-11 17:43:11 UTC - Sijie Guo: partition_key is the `key` ---- 2020-05-11 17:44:30 UTC - Thomas Jamet: Thanks for the confirmation. The name threw me off. ---- 2020-05-11 18:07:59 UTC - Tim Corbett: Greetings, I'm trying to estimate costs for a Pulsar cluster running on AWS. We would be running our producers/consumers in multiple AZs in a given region, and I was wondering how best to configure the Pulsar cluster so as to minimize cross-AZ bandwidth, as that appears to be a significant cost. In an ideal world the only cross-AZ transfer would be for replication, and the publisher/consumer could always write to/read from a local node, but with the understanding that that is likely infeasible, what options/best practices are around? +1 : Franck Schmidlin, Kirill Merkushev ---- 2020-05-11 19:09:08 UTC - Kai Levy: does message_ttl apply to _all_ messages? 
on both topics that have subscriptions (and backlogs), and those without subscriptions? ---- 2020-05-11 19:13:24 UTC - Gary Fredericks: my current understanding is it only applies to subscriptions; i.e., without subscriptions it has no effect. but I could be wrong ---- 2020-05-11 19:17:48 UTC - Kai Levy: i haven't been able to find a clear explanation in the docs.. would that mean that a retention policy is the only way to clear messages from a topic that has no subscriptions? ---- 2020-05-11 19:25:43 UTC - Tim Corbett: Could be very wrong but I think I've seen subscription-less topics simply discard any messages received, so there is nothing to clear because nothing is retained? ---- 2020-05-11 19:28:44 UTC - Gary Fredericks: yes, I believe the retention policy is the only thing that causes messages to get deleted ---- 2020-05-11 19:28:54 UTC - Gary Fredericks: for a persistent topic ---- 2020-05-11 19:32:38 UTC - Kai Levy: thanks -- so if i want to expire unacked messages for a namespace with both topics that _do_ and _do not_ have subscriptions, i would need to enable a retention policy _and_ a ttl? that seems odd to me, since the docs suggest that you should be using _either_ a retention policy _or_ a ttl ---- 2020-05-11 19:34:22 UTC - Gary Fredericks: where do they say that? ---- 2020-05-11 19:34:59 UTC - Kai Levy: <https://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#retention-and-ttl-solve-two-different-problems> ---- 2020-05-11 19:40:14 UTC - Gary Fredericks: when you say "with both topics that _do_ and _do not_ have subscriptions" do you mean that you have two different styles of topics? 
---- 2020-05-11 19:40:40 UTC - Kai Levy: no, same type of topic -- just if i have a topic that doesn't happen to have any subscriptions on it yet ---- 2020-05-11 19:40:45 UTC - Jarrod Johnson: @Jarrod Johnson has joined the channel ---- 2020-05-11 19:41:57 UTC - Gary Fredericks: I don't think I understand their claim that you'd want to use at most one ---- 2020-05-11 19:42:09 UTC - Gary Fredericks: I can imagine using both of those _and_ a backlog quota to boot ---- 2020-05-11 19:42:51 UTC - Kai Levy: yeah, agreed.. maybe somebody else can clear that up. thank you for your help! ---- 2020-05-11 20:39:39 UTC - Deepak Sah: Hi, @Liam Clarke I am using `c5.2xlarge` for pulsar and `m5.2xlarge` for producer. When I run my benchmark for 60 seconds, the average comes out to be 15k messages/s. I have checked the logs and I don't think I'm hitting IOPS or bandwidth limits. Am I missing something? How can I improve the throughput? ---- 2020-05-11 20:40:34 UTC - Deepak Sah: Also, I'm using 16 concurrent producers ---- 2020-05-11 23:46:24 UTC - David Kjerrumgaard: Agreed. I am in the process of adding that but ran into a fatal bug that is preventing the LocalRunner class from launching local instances of Pulsar Functions. ---- 2020-05-12 02:12:10 UTC - Ken Huang: Hi, I want to know about geo-replication. Say I have three Pulsar clusters in different places with geo-replication enabled. After a client publishes msgs to cluster A, cluster A's machine is destroyed and cannot be recovered. If the machine broke before the msgs were replicated to clusters B and C, does that mean I lost the msgs? Can I avoid this condition if I use a global zookeeper? ---- 2020-05-12 04:42:39 UTC - Toktok Rambo: How would you avoid this situation in any distributed system? i.e. A crashes before syncing to B or C. If you cannot recover from A’s persistence block/layer/volume, it is indeed lost forever. 
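For the scenario in this exchange, a producer can at least avoid treating an unacknowledged publish as delivered. The following is a minimal sketch, not a Pulsar API: `send_until_acked` and its parameters are hypothetical names, and it assumes the `send` callable blocks until the broker acknowledges and raises on failure, as the Python client's `Producer.send` is documented to do. An ack from cluster A only confirms that A accepted the message; it says nothing about replication to B or C.

```python
import time
from typing import Callable


def send_until_acked(send: Callable[[bytes], object], payload: bytes,
                     max_attempts: int = 5, base_delay_s: float = 0.5) -> int:
    """Call send(payload) until it returns without raising.

    Returns the number of attempts used. Re-raises the last error once
    max_attempts is exhausted. Hypothetical helper, not part of any client.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Assumption: send blocks until the broker acks and raises otherwise.
            send(payload)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise
            # Exponential backoff between attempts: base, 2*base, 4*base, ...
            time.sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

With a real client this might be called as `send_until_acked(producer.send, b"payload")`. A retry after a lost ack can publish the same payload twice, so consumers may need to tolerate duplicates.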
---- 2020-05-12 04:44:03 UTC - Toktok Rambo: you could build your app logic such that it will keep retrying until you get a successful ack. A successful ack would mean you should be able to recover A at least. ---- 2020-05-12 07:24:40 UTC - Abhilash Mandaliya: hi. I am facing an issue mentioned here: <https://github.com/apache/pulsar/issues/6768> It was not occurring before but started suddenly. I have the following JSON schema: ```{ "type": "JSON", "schema": "{\"name\":\"User\",\"type\":\"record\",\"namespace\":\"com.connect.pulsar.inbound.data\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"id\",\"type\":\"INT64\"}]}", "properties": {} }``` ---- 2020-05-12 07:24:40 UTC - Abhilash Mandaliya: while uploading this to a topic, I am getting the following exception (The default exception shown was simply ‘500 Internal Server Error’. I had to print the stack trace and use my custom build to get it) java.lang.IllegalArgumentException: Can not resolve JsonSchema ‘type’ id of “record”, not recognized as one of standard values: [STRING, NUMBER, INTEGER, BOOLEAN, OBJECT, ARRAY, NULL, ANY] at com.fasterxml.jackson.module.jsonSchema.JsonSchemaIdResolver.typeFromId(JsonSchemaIdResolver.java:66) at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:156) at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97) at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:254) at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68) at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4202) at 
com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3266) at org.apache.pulsar.broker.service.schema.validator.StructSchemaDataValidator.validate(StructSchemaDataValidator.java:60) at org.apache.pulsar.broker.service.schema.validator.SchemaDataValidator.validateSchemaData(SchemaDataValidator.java:42) at org.apache.pulsar.broker.service.schema.validator.SchemaRegistryServiceWithSchemaDataValidator.putSchemaIfAbsent(SchemaRegistryServiceWithSchemaDataValidator.java:92) at org.apache.pulsar.broker.admin.v2.SchemasResource.lambda$postSchema$6(SchemasResource.java:331) at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:714) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) at org.apache.pulsar.zookeeper.ZooKeeperDataCache.lambda$0(ZooKeeperDataCache.java:68) at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:714) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) at org.apache.pulsar.zookeeper.ZooKeeperCache.lambda$18(ZooKeeperCache.java:374) at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:714) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) at org.apache.pulsar.zookeeper.ZooKeeperCache.lambda$14(ZooKeeperCache.java:355) at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426) at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) at 
java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) Can anyone help me with this? The GitHub issue I mentioned above doesn’t have a solution. ---- 2020-05-12 08:16:13 UTC - Rattanjot Singh: How do we enable TLS from Pulsar Manager to the broker? ----
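On the schema upload error reported above: one plausible cause, which is an assumption and is not confirmed in this thread, is the field type `INT64`. It is not one of Avro's primitive type names (`null`, `boolean`, `int`, `long`, `float`, `double`, `bytes`, `string`); the 64-bit integer type is `long`. The Jackson JsonSchema frames in the stack trace may come from a fallback parse after the Avro-style definition was rejected. The sketch below is a simplified check using only the standard library; `unknown_field_types` is a hypothetical helper, and it only inspects top-level fields whose type is a plain string.

```python
import json

# Primitive type names defined by the Avro specification.
AVRO_PRIMITIVES = {"null", "boolean", "int", "long",
                   "float", "double", "bytes", "string"}


def unknown_field_types(schema_json: str) -> list:
    """Return (field name, type) pairs whose type is a plain string that is
    not an Avro primitive. Simplification: a plain string can also legally
    refer to a named type defined earlier in the schema; that case is not
    handled here, and nested or union types are skipped.
    """
    record = json.loads(schema_json)
    return [
        (field["name"], field["type"])
        for field in record.get("fields", [])
        if isinstance(field["type"], str) and field["type"] not in AVRO_PRIMITIVES
    ]


# The inner "schema" string from the message above, unescaped.
original = ('{"name":"User","type":"record",'
            '"namespace":"com.connect.pulsar.inbound.data",'
            '"fields":[{"name":"name","type":"string"},'
            '{"name":"id","type":"INT64"}]}')
# Candidate fix (assumption): use Avro's 64-bit integer type name.
fixed = original.replace('"INT64"', '"long"')

print(unknown_field_types(original))  # [('id', 'INT64')]
print(unknown_field_types(fixed))     # []
```

If this is the cause, re-uploading the schema with `"type":"long"` for the `id` field would be the change to try.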
