2020-05-19 09:26:41 UTC - Andreas Müller: Just a short note that we have integrated Pulsar with Flow Director 2.0 (release is next week, the link points to the new site): <https://newsite.flowdirector.io/integration/pulsar/> +1 : Chris Bartholomew ---- 2020-05-19 09:29:09 UTC - Ali Ahmed: Good to know ---- 2020-05-19 09:29:18 UTC - Ali Ahmed: is there a blog about the release? ---- 2020-05-19 09:29:55 UTC - Andreas Müller: <https://newsite.flowdirector.io/blog> ---- 2020-05-19 09:30:25 UTC - Andreas Müller: That will be moved to <http://www.flowdirector.io|www.flowdirector.io> next week. ---- 2020-05-19 11:25:29 UTC - Luke Stephenson: Hi @Sijie Guo. The docs under <https://pulsar.apache.org/docs/en/concepts-topic-compaction/#compaction> state: > Topic compaction _does_, however, respect retention. If retention has removed a message from the message backlog of a topic, the message will also not be readable from the compacted topic ledger. So it sounds like if I set a retention policy to remove old messages from the "original" topic, that will also remove messages from the compacted topic. ---- 2020-05-19 11:26:11 UTC - Luke Stephenson: Thank you @Sijie Guo. That is really useful to know. ---- 2020-05-19 11:34:08 UTC - Luke Stephenson: I can't find any mention of a `maxPendingRequests` as something that can be configured in <https://pulsar.apache.org/docs/en/reference-configuration/#bookkeeper>. Could you share a link to the configuration? Thanks ---- 2020-05-19 12:24:39 UTC - Jonathan: We’re getting Pulsar set up in our infrastructure and are kicking the tires a bit. One question that has come up that I can’t seem to find good documentation for is regarding consumers that disappear. We have auto-scaling in our system where the consumers from pulsar come and go quite regularly. When a consumer disappears, does the broker automatically detect this and redeliver the message, or do we have to configure something specifically to get this functionality? 
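A minimal Java sketch of the behavior Jonathan asks about (purely illustrative: it assumes a broker at pulsar://localhost:6650, and the topic and subscription names are made up). With a Shared subscription, the broker detects a dropped consumer and redelivers its unacknowledged messages to the remaining consumers; `ackTimeout` additionally covers a consumer that stays connected but never acknowledges:

```java
import java.util.concurrent.TimeUnit;

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionType;

public class AutoscaledConsumer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // assumed broker address
                .build();

        // A Shared subscription lets consumers come and go; when one
        // disconnects, the broker redistributes its unacknowledged
        // messages to the remaining consumers.
        Consumer<byte[]> consumer = client.newConsumer()
                .topic("my-topic")                      // hypothetical topic
                .subscriptionName("autoscaled-workers") // hypothetical name
                .subscriptionType(SubscriptionType.Shared)
                // Optional: also redeliver messages that a live but stuck
                // consumer never acknowledges.
                .ackTimeout(30, TimeUnit.SECONDS)
                .subscribe();

        Message<byte[]> msg = consumer.receive();
        consumer.acknowledge(msg); // only acked messages are not redelivered

        consumer.close();
        client.close();
    }
}
```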
---- 2020-05-19 12:44:18 UTC - Gilles Barbier: @Sijie Guo I think the potential issue can be described like this: 2 instances of the same function (processing each a different message, but using the same state) read the state at the same time (so getting the same value), then update it concurrently and save it. Everything added in the state by the function being the first to save will be ignored by the state saved by the second function instance. The question is: is this description correct? Is this the case also for counters? Thx ---- 2020-05-19 13:05:08 UTC - Gilles Barbier: Hi all. I’m trying to use the API to retrieve the counter value of a function and I cannot find any endpoint. There are some endpoints for function state but nothing for function counters. Am I wrong? ---- 2020-05-19 13:26:59 UTC - Alexandre DUVAL: Yup, but is prometheus a good way for millions of ns? :stuck_out_tongue_winking_eye: ---- 2020-05-19 14:01:37 UTC - ckdarby: Well, it is a factor of N ---- 2020-05-19 14:01:46 UTC - ckdarby: Not really going to get around that I guess ---- 2020-05-19 14:37:33 UTC - Alexandre DUVAL: Ok, nothing related, but did you see the answer I gave on pulsar-rs about implementing the latest on vector? ---- 2020-05-19 16:05:32 UTC - Ravi Shah: Hi Guys,
I am getting `java.util.concurrent.TimeoutException: Idle timeout expired: 300000/300000 ms`. I am using a websocket-based producer. Is there any way to set keepAlive or do a ping call to keep the socket alive? Or can I set the idle time to infinite? ---- 2020-05-19 16:13:56 UTC - Sijie Guo: @Luke Stephenson yes. you are right. Can you create a github issue? We can improve this behavior. +1 : Franck Schmidlin, Luke Stephenson ---- 2020-05-19 16:16:11 UTC - Sijie Guo: did you try - <https://github.com/apache/pulsar#eclipse> ---- 2020-05-19 16:16:51 UTC - Sijie Guo: Yes, the broker will automatically detect it ---- 2020-05-19 16:18:40 UTC - rwaweber: Hey all! Question on bookkeeper’s prometheus metrics settings: Is it possible to serve prometheus metrics over a different port from the web service port? Looking at the docs <https://pulsar.apache.org/docs/en/2.5.1/deploy-monitoring/#bookkeeper-stats|here>, it definitely appears like that’s how adding that setting would work. Is it possible that it might conflict with other settings, or vice versa? For instance, I’m seeing prometheus metrics available on the webservice admin port, at `/metrics` (they also seem a little wonky, but I figure I’d ask that as a separate question once we figure out where those metrics are coming from) ---- 2020-05-19 16:22:48 UTC - Matteo Merli: Not really, the logic there is to either spin up the HTTP service for prometheus metrics, or reuse the BK admin HTTP service. <https://github.com/apache/bookkeeper/blob/450f27ce3c80f78151bdd962c9deb31e92df6620/bookkeeper-stats-providers/prometheus-metrics-provider/src/main/java/org/apache/bookkeeper/stats/prometheus/PrometheusMetricsProvider.java#L126> ---- 2020-05-19 16:57:10 UTC - Olivier Chicha: No, I will definitely have a look at it, thanks a lot ---- 2020-05-19 17:04:58 UTC - Jeff Schneller: I have set up basic TLS Authentication/Authorization (at least I hope I did). 
I connect via SSL and use a producer cert to produce messages and a consumer cert to receive messages. This works. Then as a test, I used the consumer cert to produce messages, expecting it to fail; however, it did not. ```bin/pulsar-admin topics permissions persistent://public/default/logging Warning: Nashorn engine is planned to be removed from a future JDK release { "consumer" : [ "consume" ], "producer" : [ "produce" ] }``` ---- 2020-05-19 17:08:42 UTC - David Kjerrumgaard: <https://pulsar.apache.org/docs/en/functions-state/#query-state> ---- 2020-05-19 17:11:08 UTC - David Kjerrumgaard: @Jeff Schneller Can you confirm that the certs map to different roles in Pulsar? ---- 2020-05-19 17:18:37 UTC - Jeff Schneller: Here is the consumer cert ---- 2020-05-19 17:20:06 UTC - Jeff Schneller: Here is the producer cert ---- 2020-05-19 17:20:09 UTC - Jeff Schneller: ---- 2020-05-19 17:23:08 UTC - Jeff Schneller: ```client = PulsarClient.builder() .serviceUrl(serviceUrl) .enableTls(true) .tlsTrustCertsFilePath("e:\\certs\\ca.cert.pem") .authentication(AuthenticationFactory.TLS("e:\\certs\\consumer.cert.pem", "e:\\certs\\consumer.key-pk8.pem")) .build(); loggingProducer = client.newProducer(org.apache.pulsar.client.api.Schema.JSON(LogEvent.class)).topic("logging").create(); } catch (PulsarClientException e1) { // TODO Auto-generated catch block e1.printStackTrace(); } if (loggingProducer == null) { System.out.println("Someething went very wrong"); return; } LogEvent startLogEvent = new LogEvent("", "Log Test", Calendar.getInstance().getTime(), "Log Test"); try { loggingProducer.send(startLogEvent); System.out.println("Sent message"); } catch (PulsarClientException e1) { // TODO Auto-generated catch block e1.printStackTrace(); } System.out.println("Exited");``` And here is some basic code to produce a message that should fail based on the cert being used. ---- 2020-05-19 17:24:55 UTC - Franck Schmidlin: Doesn't look like there is a channel dedicated to pulsar functions. 
Can anyone point me to a router example with complex schemas? I can only find string examples. I'm trying to route messages that all inherit from a base schema, based on the value of a field from that base schema. I can serde on that base schema on the way in, but can't figure out how to publish the messages in their specific schema to target topics. Something like this example, but with schemas and in java? <https://pulsar.apache.org/docs/en/functions-overview/#content-based-routing-example|https://pulsar.apache.org/docs/en/functions-overview/#content-based-routing-example> ---- 2020-05-19 17:25:58 UTC - David Kjerrumgaard: Ok, thanks for confirming that. Based on what you have shared, this appears to be an issue that needs to be reported on Github so it can be fixed. ---- 2020-05-19 17:31:50 UTC - Olivier Chicha: @Sijie Guo In fact this is what I have already done. Here is a simple project that I run with no other project in my eclipse: pom.xml: ```<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>mygroup</groupId> <artifactId>mttest</artifactId> <version>0.0.1-SNAPSHOT</version> <properties> <maven.compiler.target>1.11</maven.compiler.target> <maven.compiler.source>1.11</maven.compiler.source> </properties> <dependencies> <dependency> <groupId>org.apache.pulsar</groupId> <artifactId>pulsar-client-original</artifactId> <version>2.5.1</version> </dependency> <!-- Most probably useless --> <dependency> <groupId>org.projectlombok</groupId> <artifactId>lombok</artifactId> <version>1.16.18</version> </dependency> </dependencies> </project>``` main class: ```package my.test; import org.apache.pulsar.client.api.Schema; public class MyTest { public static void main(String[] args) { try { Schema<String> stringSchema = Schema.STRING; System.out.println("shema = " + 
stringSchema); } catch (Throwable t) { t.printStackTrace(); } } }``` output: > java.lang.ExceptionInInitializerError > at my.test.MyTest.main(MyTest.java:9) > Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodError: 'org.apache.pulsar.common.schema.SchemaInfo org.apache.pulsar.common.schema.SchemaInfo.setName(java.lang.String)' > at org.apache.pulsar.client.internal.ReflectionUtils.catchExceptions(ReflectionUtils.java:46) > at org.apache.pulsar.client.internal.DefaultImplementation.newBytesSchema(DefaultImplementation.java:136) > at org.apache.pulsar.client.api.Schema.<clinit>(Schema.java:149) > ... 1 more > Caused by: java.lang.NoSuchMethodError: 'org.apache.pulsar.common.schema.SchemaInfo org.apache.pulsar.common.schema.SchemaInfo.setName(java.lang.String)' > at org.apache.pulsar.client.impl.schema.BytesSchema.<clinit>(BytesSchema.java:35) > at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490) > at java.base/java.lang.Class.newInstance(Class.java:584) > at org.apache.pulsar.client.internal.DefaultImplementation.lambda$10(DefaultImplementation.java:138) > at org.apache.pulsar.client.internal.ReflectionUtils.catchExceptions(ReflectionUtils.java:35) > ... 3 more ---- 2020-05-19 17:43:07 UTC - rwaweber: Awesome, thanks for the clarification Matteo! ---- 2020-05-19 17:49:38 UTC - rwaweber: So in other words, if you enable the admin HTTP port _and_ you enable the metrics port, the metrics port specified will be ignored and it will prefer the already allocated admin port, right? 
That’s just my read on this comment directly above: > `only start its own http server when prometheus http is enabled and bk http server is not enabled` ---- 2020-05-19 17:52:32 UTC - Franck Schmidlin: Am I right to assume that retention rules do not apply to messages sent to tiered storage? Or is there some magic that enables messages to be stored to s3 and still expire after a given time? ---- 2020-05-19 17:58:35 UTC - Adelina Brask: hi guys, has anyone seen this "`Maximum redirect reached: 5`" exception before? I get it often and out of the blue: " `"Pulsar-Java-v2.5.1" 0` `17:46:59.908 [pulsar-web-41-25] INFO org.eclipse.jetty.server.RequestLog - 10.220.37.191 - - [19/May/2020:17:46:59 +0000] "GET /admin/v3/sink/public/default/elastic-elastic/5/status HTTP/1.1" 307 0 "-" "Pulsar-Java-v2.5.1" 4` `17:46:59.908 [pulsar-web-41-32] ERROR org.apache.pulsar.functions.worker.rest.api.SinksImpl - public/default/elastic-elastic Got Exception Getting Status` `java.lang.RuntimeException: org.apache.pulsar.client.admin.PulsarAdminException: javax.ws.rs.ProcessingException: Maximum redirect reached: 5` `at org.apache.pulsar.functions.worker.rest.api.ComponentImpl$GetStatus.getComponentStatus(ComponentImpl.java:246) ~[org.apache.pulsar-pulsar-functions-worker-2.5.1.jar:2.5.1]` `at org.apache.pulsar.functions.worker.rest.api.SinksImpl.getSinkStatus(SinksImpl.java:629) ~[org.apache.pulsar-pulsar-functions-worker-2.5.1.jar:2.5.1]` `at org.apache.pulsar.broker.admin.impl.SinksBase.getSinkStatus(SinksBase.java:318) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]` `at sun.reflect.GeneratedMethodAccessor194.invoke(Unknown Source) ~[?:?]` `at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_252]"` ---- 2020-05-19 18:01:05 UTC - Ming: Retention policy applies to the tiered storage. You should set the right retention policy with respect to the local storage and S3. 
Pulsar already has a tiered storage feature for that purpose. If messages are due for deletion, then they will be deleted. ---- 2020-05-19 18:01:27 UTC - Matteo Merli: > it will prefer the already allocated admin port, right? Yes. I know it's a bit confusing.. Initially there was no HTTP admin in BK when we added the Prometheus metrics. Later it was adapted ---- 2020-05-19 18:01:40 UTC - Addison Higham: @Franck Schmidlin for your functions, yes, you can just use the `context.publish` for each topic. One thing to note though is that `context.publish` is sync, which means you can't use batching and will really limit throughput. If you need more throughput you can do `context.newOutputMessage` which returns `TypedMessageBuilder` that has an async publish method. The issue then is that handling failed messages becomes something you need to solve for. There are some plans to improve the functions API to make such use cases easier, but that is the state of the world for now :slightly_smiling_face: +1 : Franck Schmidlin, Sijie Guo, Konstantinos Papalias ---- 2020-05-19 18:06:57 UTC - David Kjerrumgaard: You still need to configure tiered storage offload in order for the messages to be moved to S3, etc. +1 : Ming ---- 2020-05-19 18:09:32 UTC - Franck Schmidlin: Good call about the async publish. My issue is more about schemas, not sure how to read on a generic schema (messageBase) and then publish messages as messageType1, messageType2, etc., without my function needing to know about these specific schemas. Unless I process raw byte[], but not sure how that plays with schemas? ---- 2020-05-19 18:11:38 UTC - Franck Schmidlin: That's impressive. Where is the magic happening? Or is it as simple as expiry mapping to a cursor position? ---- 2020-05-19 18:12:06 UTC - Sergey Rublev: @Sergey Rublev has joined the channel ---- 2020-05-19 18:14:24 UTC - Gilles Barbier: Thanks @David Kjerrumgaard this is for the state. Is it the same API for counters? 
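The async routing pattern Addison describes can be sketched as a Pulsar Function like the following (a sketch only: `MyEvent`, its `getType()` field, and the topic-naming scheme are hypothetical, and the error handling shown is the part the function author must solve themselves):

```java
import org.apache.pulsar.client.api.Schema;
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class RoutingFunction implements Function<RoutingFunction.MyEvent, Void> {

    // Hypothetical event type carrying the field used for routing.
    public static class MyEvent {
        private String type;
        public String getType() { return type; }
        public void setType(String type) { this.type = type; }
    }

    @Override
    public Void process(MyEvent event, Context context) throws Exception {
        // Pick the target topic from a field of the base schema (made-up scheme).
        String target = "persistent://public/default/route-" + event.getType();

        // newOutputMessage returns a TypedMessageBuilder; sendAsync() does not
        // block the function, unlike the synchronous context.publish.
        context.newOutputMessage(target, Schema.JSON(MyEvent.class))
                .value(event)
                .sendAsync()
                .exceptionally(t -> {
                    // Failed publishes must be handled by the function itself.
                    context.getLogger().error("Publish to " + target + " failed", t);
                    return null;
                });
        return null;
    }
}
```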
---- 2020-05-19 18:16:13 UTC - David Kjerrumgaard: The magic happens in the BookKeeper layer. Once a Ledger has been marked as CLOSED, and all of the messages have been acknowledged, then it is kept around based on the retention policy for the topic's namespace. When offloading is requested, these segments of the topic are copied, one-by-one, to tiered storage. The ledger metadata information is updated inside the managed ledger, so you can access them from Pulsar as you would any ledgers on the bookies. ---- 2020-05-19 18:17:25 UTC - David Kjerrumgaard: you have to configure the offloader to run automatically though, as it isn't an "automatic" thing built into the framework. <https://pulsar.apache.org/docs/en/2.5.1/cookbooks-tiered-storage/#configuring-offload-to-run-automatically> ---- 2020-05-19 18:19:05 UTC - David Kjerrumgaard: Good question. The counters are kept in Zookeeper, so you should be able to access them with the `/pulsar/bin/pulsar zookeeper-shell` tool ---- 2020-05-19 18:19:23 UTC - David Kjerrumgaard: `ls /counters` , and go from there.... :smiley: ---- 2020-05-19 18:28:45 UTC - Jeff Schneller: ok. Seems odd since it is such a basic test ---- 2020-05-19 19:04:22 UTC - Jeff Schneller: Issue reported on github +1 : David Kjerrumgaard ---- 2020-05-19 19:27:43 UTC - Karthik Ramasamy: <https://www.linkedin.com/posts/kramasamy_splunk-datatoeverything-pulsar-activity-6668592642433146880-f-Kq> +1 : Ming, David Kjerrumgaard, Sijie Guo, Matteo Merli, Konstantinos Papalias, Luke Lu, Julius S, Ali Ahmed, Penghui Li, Deepak Sah ---- 2020-05-19 19:28:37 UTC - Karthik Ramasamy: Apache Pulsar replaces Kafka in Splunk Distributed Stream Processor 1.1.0 ---- 2020-05-19 19:57:14 UTC - Sahil Sawhney: @Sahil Sawhney has joined the channel ---- 2020-05-19 20:10:20 UTC - Sahil Sawhney: Team, I have a question wrt the official pulsar helm chart for production. 
So in the values.yaml file, if we set `tls.enabled` to `true`, is it sufficient to make our cluster TLS enabled, or do we need to enable TLS encryption for individual components as well, like setting: `tls.proxy.enabled` to `true`, `tls.broker.enabled` to `true`, etc... ---- 2020-05-19 21:29:50 UTC - Sijie Guo: `tls.enabled` is the global flag. ---- 2020-05-19 21:30:13 UTC - Sijie Guo: You can override `tls.proxy.enabled` or `tls.broker.enabled` to disable tls for individual components. ---- 2020-05-19 21:31:16 UTC - Sijie Guo: Or did I misunderstand your requirements? ---- 2020-05-19 21:31:28 UTC - Sijie Guo: You are building your own project, not the pulsar project. ---- 2020-05-19 21:32:24 UTC - Sijie Guo: did you try `pulsar-client` instead of `pulsar-client-original`? ---- 2020-05-19 21:43:40 UTC - Sijie Guo: You can use GenericRecord and AUTO_CONSUME schema to read events. ---- 2020-05-19 21:44:02 UTC - Sijie Guo: The question is - Are you using Pulsar schemas? Or do you manage serializing and deserializing yourself? ---- 2020-05-19 21:45:17 UTC - Sijie Guo: It seems to be indicating that there are multiple redirections before the request is sent to the owner broker. ---- 2020-05-19 21:45:28 UTC - Sijie Guo: Do you have more context on this? ---- 2020-05-19 22:18:56 UTC - Sahil Sawhney: Thanks @Sijie Guo for the reply. So basically this global flag in itself does nothing. The helm chart needs this global flag to be true along with the flag of components like `tls.broker.enabled`. ---- 2020-05-19 22:19:52 UTC - Sijie Guo: correct. ---- 2020-05-19 22:20:17 UTC - Sijie Guo: The global flag just provides a convenient way to turn off tls completely ---- 2020-05-19 22:21:13 UTC - Sahil Sawhney: Okay got it. 
Thanks @Sijie Guo Basically after putting all the config values for my helm chart as per the Doc, I am getting ```root@pulsar-prod-toolset-0:/pulsar# bin/pulsar-admin tenants list null Reason: javax.ws.rs.ProcessingException: handshake timed out``` ---- 2020-05-19 22:23:03 UTC - Sahil Sawhney: I have tls enabled only for the broker and proxy. Authentication and authorization are also enabled ---- 2020-05-19 22:24:23 UTC - Luke Stephenson: Will do. ---- 2020-05-19 22:25:02 UTC - Luke Stephenson: @Sijie Guo Are you able to point me in the direction of the docs or source code regarding `maxPendingRequests` ? Can't see it anywhere. ---- 2020-05-19 22:26:49 UTC - Sahil Sawhney: The updated values.yaml file that I am using ---- 2020-05-19 23:31:39 UTC - Luke Stephenson: Raised <https://github.com/apache/pulsar/issues/6995> ---- 2020-05-20 02:21:41 UTC - Ebere Abanonu: Thanks ---- 2020-05-20 04:02:14 UTC - hangc: are you adding partitions to a topic when those exceptions occur? ---- 2020-05-20 04:24:06 UTC - Alexander Ursu: Was trying to use the Presto sql-worker in a standalone setup in Docker. Is there any way to reduce or limit CPU usage by the sql-worker to make it more bearable for a development environment? ---- 2020-05-20 04:59:01 UTC - Deepa: @Sijie Guo - can you please help with your inputs on this? ---- 2020-05-20 05:02:37 UTC - Sijie Guo: @Deepa why did you set `.keepAliveInterval(3000,TimeUnit.SECONDS)`? ---- 2020-05-20 05:03:04 UTC - Sijie Guo: This means the client only sends ping/pong messages every 3000 seconds ---- 2020-05-20 05:03:47 UTC - Sijie Guo: If the server side doesn’t receive any ping/pong messages from the client within 60 seconds, it will close the connection. ---- 2020-05-20 05:04:45 UTC - Deepa: even without that the connection gets closed in 60 seconds when I run in debug mode and pause it at line "System.out.println("wait!!!!");" ---- 2020-05-20 05:06:11 UTC - Sijie Guo: How did you pause at that line? 
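For reference, the keep-alive interval Sijie refers to is set on the Java client builder. A minimal sketch (the broker URL is an assumption; 30 seconds simply keeps pings well inside the 60-second server-side window mentioned above):

```java
import java.util.concurrent.TimeUnit;

import org.apache.pulsar.client.api.PulsarClient;

public class KeepAliveSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // assumed broker address
                // Send ping/pong frames every 30s, so the broker (which closes
                // connections that stay silent for 60s) never times the
                // client out.
                .keepAliveInterval(30, TimeUnit.SECONDS)
                .build();

        // ... create producers/consumers here ...

        client.close();
    }
}
```

Note that pausing the client JVM in a debugger stops the ping/pong thread as well, which is why the broker closes the connection after its keep-alive timeout in the scenario Deepa describes.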
---- 2020-05-20 05:07:28 UTC - Deepa: I run it in debug mode and don't proceed after reaching that line for more than a minute ---- 2020-05-20 05:08:11 UTC - Deepa: then I see the below log ```00:28:09.465 [pulsar-io-50-8] WARN org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0xfda1aeae, L:/127.0.0.1:6650 - R:/127.0.0.1:52145]] Forcing connection to close after keep-alive timeout 00:28:09.466 [pulsar-io-50-8] INFO org.apache.pulsar.broker.service.ServerCnx - Closed connection from /127.0.0.1:52145``` ---- 2020-05-20 05:08:32 UTC - Deepa: and a new connection is created when I proceed ---- 2020-05-20 05:08:45 UTC - Deepa: ```00:28:32.201 [pulsar-io-50-11] INFO org.apache.pulsar.broker.service.ServerCnx - New connection from /127.0.0.1:52171 00:28:32.203 [pulsar-io-50-11] INFO org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:52171][persistent://public/default/my-keepalive-topic] Creating producer. producerId=0``` ---- 2020-05-20 05:32:43 UTC - Olivier Chicha: @Sijie Guo: same issue with pulsar-client ---- 2020-05-20 07:17:11 UTC - Ken Huang: Hi, I'm testing full-mesh geo-replication. I use a configurationStore and 2 pulsar clusters. After the deployment is done, I can see the two clusters own each other, and so do the tenants and namespaces. When I produce a message on clusterA, only clusterA can receive it. What step did I do wrong? Here are the topic stats; I found connected is false, why... ```"replication" : { "pulsar-14" : { "msgRateIn" : 0.0, "msgThroughputIn" : 0.0, "msgRateOut" : 0.0, "msgThroughputOut" : 0.0, "msgRateExpired" : 0.0, "replicationBacklog" : 3, "connected" : false, "replicationDelayInSeconds" : 0 } },``` ---- 2020-05-20 08:29:15 UTC - Patrik Kleindl: Hi, Confluent has just posted a comparison between Kafka, Pulsar and RabbitMQ. <https://www.confluent.io/kafka-vs-pulsar/> As with any subjective comparison, some of the statements might be questionable or incomplete. I am interested in both technologies and would welcome any thoughts. 
Disclaimer: I don’t work for Confluent, I don’t favor Kafka or Pulsar and I don’t like bashing one or the other ---- 2020-05-20 08:45:13 UTC - Patrik Kleindl: > Bookkeeper’s retention capabilities are limited by the fine-grained metadata it stores in Zookeeper. Can anyone explain this please? ----
