2019-06-26 10:08:49 UTC - Alexandre DUVAL: I tried, and I'm running this from the current functions worker
----
2019-06-26 10:08:52 UTC - Alexandre DUVAL:
```
yo-pulsar-c1-n7 /pulsar # ls /pulsar/conf/functions/target
pulsar-functions-0.1.0-SNAPSHOT.jar
yo-pulsar-c1-n7 /pulsar # bin/pulsar-admin --admin-url https://192.168.10.16:2001 --auth-params "token:<TOKEN>" functions create --jar file:///pulsar/conf/functions/target/pulsar-functions-0.1.0-SNAPSHOT.jar --classname com.yo.pulsar.function.CellarC1AccessLogs --tenant yo --namespace functions --custom-schema-inputs "{ 'persistent://yo/logs/access-haproxy-cellar-c1-raw': 'STRING' }"
Function class com.yo.pulsar.function.CellarC1AccessLogs must be in class path
Reason: Function class com.yo.pulsar.function.CellarC1AccessLogs must be in class path
```
----
2019-06-26 10:09:30 UTC - Alexandre DUVAL: @Sijie Guo hi, maybe you have an idea? (I have another function class in this jar and it runs smoothly.)
----
2019-06-26 10:10:25 UTC - Sijie Guo: @Alexandre DUVAL is the jar an uber jar?
----
2019-06-26 10:11:44 UTC - Alexandre DUVAL: it is.
----
2019-06-26 10:12:06 UTC - Alexandre DUVAL: 5778 class files in it :stuck_out_tongue:
----
2019-06-26 10:12:17 UTC - Alexandre DUVAL: (jar -tf /pulsar/conf/functions/target/pulsar-functions-0.1.0-SNAPSHOT.jar | wc)
----
2019-06-26 10:12:39 UTC - Alexandre DUVAL:
```
yo-pulsar-c1-n7 /pulsar # jar -tf /pulsar/conf/functions/target/pulsar-functions-0.1.0-SNAPSHOT.jar | grep Cellar
com/yo/pulsar/function/CellarC1AccessLog.class
com/yo/pulsar/model/CellarAccessLog.class
```
----
2019-06-26 10:12:45 UTC - Alexandre DUVAL: and my class is in it.
----
2019-06-26 10:12:52 UTC - Sijie Guo: is it a typo?
----
2019-06-26 10:12:57 UTC - Sijie Guo: `com.yo.pulsar.function.CellarC1AccessLogs`
----
2019-06-26 10:13:05 UTC - Sijie Guo: there is no `s`
+1 : David Kjerrumgaard
----
2019-06-26 10:13:10 UTC - Sijie Guo: it should be `com.yo.pulsar.function.CellarC1AccessLog`
----
2019-06-26 10:13:15 UTC - Alexandre DUVAL: oh, just a miss, i hide clevercloud :stuck_out_tongue:
----
2019-06-26 10:13:16 UTC - Alexandre DUVAL: sorry
----
2019-06-26 10:13:22 UTC - Alexandre DUVAL: oh damn
----
2019-06-26 10:13:31 UTC - Alexandre DUVAL: I am so sorry.
----
2019-06-26 10:13:37 UTC - Alexandre DUVAL: Thanks for your time.
----
2019-06-26 10:13:38 UTC - Sijie Guo: no worries
----
2019-06-26 10:13:50 UTC - Sijie Guo: hope you have fun with pulsar functions :slightly_smiling_face:
----
2019-06-26 10:15:35 UTC - Alexandre DUVAL: I do!
----
2019-06-26 12:14:46 UTC - Randy: @Randy has joined the channel
----
2019-06-26 14:24:08 UTC - Aaron: Are non-authenticated producers assigned roles (or a default role name)?
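----
Note on the `must be in class path` error earlier in this thread: the value passed to `--classname` has to match a `.class` entry in the jar exactly, so a quick check is to turn the class name into its entry path and look for it in the `jar -tf` listing. A minimal sketch, assuming a POSIX shell; the helper name `class_in_listing` is made up for illustration and is not part of Pulsar:

```shell
# Hypothetical helper (not a Pulsar tool): succeeds only if the fully
# qualified class name given as $1 appears as an exact .class entry in a
# jar listing read from stdin (for example the output of `jar -tf`).
class_in_listing() {
  entry="$(printf '%s' "$1" | tr '.' '/').class"   # com.a.B -> com/a/B.class
  grep -Fxq -- "$entry"                            # fixed string, whole-line match
}

# Usage, with the jar path from the thread:
#   jar -tf /pulsar/conf/functions/target/pulsar-functions-0.1.0-SNAPSHOT.jar \
#     | class_in_listing com.yo.pulsar.function.CellarC1AccessLogs \
#     || echo "class not found in jar"
```

Against the two entries that the `grep Cellar` output in the thread shows, `com.yo.pulsar.function.CellarC1AccessLogs` would not match, while `com.yo.pulsar.function.CellarC1AccessLog` would.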
----
2019-06-26 15:33:27 UTC - Ryan Samo: Hey guys, on Pulsar 2.3.2 I am getting 401 unauthorized when running the pulsar-admin CLI to list the public/default topics. I’m using TLS and the admin cert is set to superuser on the brokers. Has anyone seen this before?
----
2019-06-26 15:34:26 UTC - Ryan Samo: Also, the Pulsar proxy is running but my client.conf is pointing to the broker ports, not the proxy
----
2019-06-26 16:19:52 UTC - Sam Leung: I have a question about subscription names: are they expected to be globally unique? Since there can be multi-topic subscriptions, it’s really bad news if I use a shared subscription mode to topicA with nameA and someone else subscribes to topicB with nameA. Will I start receiving topicB messages without any warnings to either party? (also assume we’re in the same namespace)
----
2019-06-26 16:32:10 UTC - Gilberto Muñoz Hernández: @Gilberto Muñoz Hernández has joined the channel
----
2019-06-26 16:36:28 UTC - David Kjerrumgaard: @Ryan Samo Does your `brokerServiceUrl=pulsar+ssl://localhost:6651/` property have `pulsar+ssl` in the URL?
----
2019-06-26 16:37:43 UTC - Gilberto Muñoz Hernández: Hi everybody, i was trying to find some sort of forum in which i could ask questions or know about future pulsar features and i ended up here.
----
2019-06-26 16:37:53 UTC - David Kjerrumgaard: @Ryan Samo Make sure you have all of the following properties set in your client config
----
2019-06-26 16:38:27 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández You came to the right place.
----
2019-06-26 16:38:40 UTC - Gilberto Muñoz Hernández: Is it ok to just ask anything about pulsar in here, or is this slack for developers only?
----
2019-06-26 16:39:26 UTC - David Kjerrumgaard: This general channel is fine for general questions.
There is a separate dev channel for dev questions
----
2019-06-26 16:39:40 UTC - Gilberto Muñoz Hernández: what about feature requests
----
2019-06-26 16:39:55 UTC - Gilberto Muñoz Hernández: do you accept something like that
----
2019-06-26 16:39:57 UTC - Gilberto Muñoz Hernández: ?
----
2019-06-26 16:40:49 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández We can discuss feature requests here, but the proper forum for submitting them is GitHub, <https://github.com/apache/pulsar/issues>
----
2019-06-26 16:41:25 UTC - Gilberto Muñoz Hernández: remember i am not a pulsar dev
----
2019-06-26 16:41:32 UTC - Gilberto Muñoz Hernández: just a mere user
----
2019-06-26 16:41:53 UTC - David Kjerrumgaard: or a more complex feature request might require a PIP, which can be filed here.... <https://github.com/apache/pulsar/wiki>
----
2019-06-26 16:42:31 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández What did you have in mind? We can discuss here and decide the best course of action
----
2019-06-26 16:43:03 UTC - Gilberto Muñoz Hernández: i really really really would like a PYTHON spark stream receiver
----
2019-06-26 16:43:22 UTC - Gilberto Muñoz Hernández: since most ai frameworks are implemented in java
----
2019-06-26 16:43:28 UTC - Gilberto Muñoz Hernández: i mean python sorry
----
2019-06-26 16:43:56 UTC - Gilberto Muñoz Hernández: but i've only seen that in java
----
2019-06-26 16:44:27 UTC - David Kjerrumgaard: So you want something like the following, but for Python? <http://pulsar.apache.org/docs/en/adaptors-spark/>
----
2019-06-26 16:44:43 UTC - Gilberto Muñoz Hernández: exactly
----
2019-06-26 16:45:09 UTC - Gilberto Muñoz Hernández: or maybe i don't need it and i could just use a simple consumer?
----
2019-06-26 16:45:34 UTC - Gilberto Muñoz Hernández: but if that is true, why don't you do that for java too, instead of using that adaptor?
----
2019-06-26 16:47:53 UTC - David Kjerrumgaard: The Spark adapter is for Spark streaming applications.
That means that the same code is run against all the elements in the topic and they are processed together, like this example <https://github.com/apache/pulsar/blob/master/examples/spark/src/main/java/org/apache/spark/streaming/receiver/example/SparkStreamingPulsarReceiverExample.java>
----
2019-06-26 16:49:15 UTC - David Kjerrumgaard: If you wanted to use a simple client, you would process each message individually and have to keep track of intermediate results, and decide when to terminate the client session.
----
2019-06-26 16:51:26 UTC - Gilberto Muñoz Hernández: ok, but that's java code, what if i am using pyspark and want to create a streaming app that consumes from a pulsar topic and does x or y with numpy or tensorflow.
----
2019-06-26 16:51:27 UTC - Ryan Samo: @David Kjerrumgaard thanks, double checking my client.conf
----
2019-06-26 16:52:26 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández Then we need to write a new Python adapter for that, which would require a PIP request. If you want to create one, I can help you complete it.
----
2019-06-26 16:53:11 UTC - Gilberto Muñoz Hernández: i would appreciate that a lot
----
2019-06-26 16:54:11 UTC - Gilberto Muñoz Hernández: i mean that's the only thing right now that may force us to switch to kafka
----
2019-06-26 16:55:10 UTC - Gilberto Muñoz Hernández: after all, if we are talking about ai, python is the preferred lang
----
2019-06-26 16:55:29 UTC - David Kjerrumgaard: I would recommend looking at some of the previous ones that got accepted as a guide. The more detail and specification you can provide, the better, as you are essentially giving A) a reason WHY someone should write this, and B) the software requirements to write and test against
----
2019-06-26 16:56:26 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández Does Kafka have such a feature?
----
2019-06-26 16:57:47 UTC - Gilberto Muñoz Hernández: <https://spark.apache.org/docs/2.3.1/streaming-kafka-0-8-integration.html>
----
2019-06-26 16:57:51 UTC - David Kjerrumgaard: `from pyspark.streaming.kafka import KafkaUtils`, etc?
----
2019-06-26 16:57:57 UTC - Gilberto Muñoz Hernández: yes
----
2019-06-26 16:58:05 UTC - Gilberto Muñoz Hernández: for python java and scala
----
2019-06-26 16:58:22 UTC - Gilberto Muñoz Hernández: both the receiver and direct approach i believe
----
2019-06-26 16:58:32 UTC - David Kjerrumgaard: I would definitely recommend adding that to the motivation section, asking for feature parity with Kafka.
----
2019-06-26 16:58:38 UTC - Gilberto Muñoz Hernández: i am new to these concepts so maybe i was misguided to believe that
----
2019-06-26 16:59:41 UTC - Gilberto Muñoz Hernández: ok i will check how to file a pip request
----
2019-06-26 16:59:58 UTC - Gilberto Muñoz Hernández: thank you a lot for your time
----
2019-06-26 17:00:23 UTC - Gilberto Muñoz Hernández: really appreciate it
----
2019-06-26 17:00:55 UTC - David Kjerrumgaard: No problem.....glad to help. I think this is a good feature to have!!
----
2019-06-26 17:02:40 UTC - Gilberto Muñoz Hernández: me too, i mean, there is no point in big data without processing, there is no better processing than machine learning algorithms, and there is no better language for that than python (right now)
----
2019-06-26 17:07:08 UTC - Ryan Samo: Hey @David Kjerrumgaard, should the client.conf target the ports on the Pulsar proxy or just the brokers when using the pulsar-admin CLI?
----
2019-06-26 17:12:01 UTC - David Kjerrumgaard: @Ryan Samo The brokers.
----
2019-06-26 17:12:42 UTC - David Kjerrumgaard:
----
2019-06-26 17:14:33 UTC - David Kjerrumgaard: If you add lines 4 & 5 you should be able to use the proxies instead of the brokers.
----
2019-06-26 17:18:41 UTC - Ryan Samo: Cool thanks!
----
2019-06-26 17:19:53 UTC - Yuvaraj Loganathan: @Shivji Kumar Jha ^^
----
2019-06-26 17:21:01 UTC - Sam Leung: Hmm doing some more testing here, might just be something odd I’m doing. Will check back in after I find out.
----
2019-06-26 18:03:36 UTC - Gilberto Muñoz Hernández: @David Kjerrumgaard the pip request was kind of too specific for my current pulsar knowledge (almost none) so i just submitted a simple feature request
----
2019-06-26 18:03:42 UTC - Gilberto Muñoz Hernández: @David Kjerrumgaard <https://github.com/apache/pulsar/issues/4608>
+1 : David Kjerrumgaard
----
2019-06-26 18:05:46 UTC - Gilberto Muñoz Hernández: @David Kjerrumgaard I hope i did it fine and you guys do it in the next releases
----
2019-06-26 18:06:24 UTC - David Kjerrumgaard: @Gilberto Muñoz Hernández Thanks for creating this. It will go for review and hopefully it will get prioritized.
----
2019-06-26 18:12:21 UTC - Ryan Samo: @David Kjerrumgaard, if you run functions workers on separate machines and not on the brokers, can you still run them in cluster mode or do they only work in localrun mode? Just trying out moving my functions workers off of my brokers to see the benefits
----
2019-06-26 18:16:21 UTC - Ryan Samo: I’m using the functions-worker.md in the docs as guidance, just wondering if the functions work or behave the same in and out of the brokers and which is a better path for production use. Stability, security, etc.
----
2019-06-26 18:17:49 UTC - David Kjerrumgaard: @Ryan Samo Yes, you can run them in cluster mode on non-broker nodes. The document you reference is the correct one to use. Pay close attention to the following: `When you are running functions-worker in a separate cluster, the admin rest endpoints are split into two clusters. functions, function-worker, source and sink endpoints are now served by the functions-worker cluster, while all the other remaining endpoints are served by the broker cluster. Hence you need to configure your pulsar-admin to use the right service URL accordingly.`
----
2019-06-26 18:18:56 UTC - David Kjerrumgaard: You can use your existing proxy as a central entry point for your admin service (recommended), and the steps are documented.
----
2019-06-26 18:19:54 UTC - David Kjerrumgaard: The biggest benefit to this approach is the reduction in resource contention on the brokers, and eliminating the risk of having a rogue function bringing down the broker
----
2019-06-26 18:21:44 UTC - Jon Bock: That approach also allows you to scale function workers independently of brokers.
+1 : David Kjerrumgaard, Ryan Samo
----
2019-06-26 18:24:19 UTC - Ryan Samo: Perfect, that’s exactly why I want to run them separate from the brokers. So following this doc and paying attention to the proxy setup, I have the admin CLI using the client.conf file, and that file targets my proxy. It works for non-functions commands. I then added the functionWorkerWebServiceURLTLS to point to my new functions worker nodes. I can reach the functions worker nodes at that URL via curl or a browser, but the admin CLI never reaches them from what I can tell. It always replies with “Function worker service is not done initializing.” 503 error
----
2019-06-26 18:25:15 UTC - Ryan Samo: It’s like it’s not figuring out they are not on the brokers but on separate machines
----
2019-06-26 18:27:33 UTC - David Kjerrumgaard: What is the value of your `functionsWorkerEnabled` property in the broker.conf?
----
2019-06-26 18:28:22 UTC - Ryan Samo: False
----
2019-06-26 18:28:29 UTC - Ryan Samo: On all 3 brokers
+1 : David Kjerrumgaard
----
2019-06-26 18:30:07 UTC - Ryan Samo: Pulsar 2.3.2
----
2019-06-26 18:31:15 UTC - David Kjerrumgaard: and what values do you have for the worker parameters, e.g. `workerHostname`, `workerPort`, `workerPortTls`, etc?
----
2019-06-26 18:32:36 UTC - Ryan Samo: workerHostname is not specified, workerPort 6750, workerPortTls 6751
----
2019-06-26 18:33:07 UTC - Ryan Samo: If you leave the host name blank, the logs show it being picked up automatically
----
2019-06-26 18:33:28 UTC - Ryan Samo: workerId is also not specified
----
2019-06-26 18:34:22 UTC - Ryan Samo: pulsarServiceUrl is pointing to the SSL port on the proxy
----
2019-06-26 18:34:24 UTC - David Kjerrumgaard: And for these? Are they all configured?
----
2019-06-26 18:36:07 UTC - David Kjerrumgaard: what is the value of `functionWorkerWebServiceURLTLS` in your `proxy.conf`?
----
2019-06-26 18:36:24 UTC - Ryan Samo: Yes they all are, and the certs for the auth are the admin certs / superuser for now
----
2019-06-26 18:36:53 UTC - David Kjerrumgaard: Sorry for the laundry list, just walking through the possible issues :smiley:
----
2019-06-26 18:38:03 UTC - Ryan Samo: <https://pulsar.dev.functions.int.com:6751>
----
2019-06-26 18:38:15 UTC - Ryan Samo: All good, I really appreciate the help
----
2019-06-26 18:38:28 UTC - David Kjerrumgaard: Any error messages in the proxy log?
----
2019-06-26 18:38:38 UTC - Ryan Samo: Checking
----
2019-06-26 18:41:06 UTC - Ryan Samo: It shows a POST to /admin/v3/functions/public/default
----
2019-06-26 18:41:15 UTC - David Kjerrumgaard: If we don't find any obvious errors in the logs, we might want to try disabling TLS first and see if we can get the functions to run, just to remove any TLS issues from the mix
----
2019-06-26 18:41:27 UTC - Ryan Samo: And a response of HTTP 1.1 503
----
2019-06-26 18:42:18 UTC - Ryan Samo: Yeah maybe so...
let me try one other thing with the worker certs, swap them out for a different pair
----
2019-06-26 18:42:24 UTC - David Kjerrumgaard: ok
----
2019-06-26 18:56:16 UTC - Ryan Samo: So if I reboot the functions worker node, you can see the producer and consumer fire up through the proxy and in the broker logs, so the functions worker is coming up fine and connecting to the brokers
----
2019-06-26 18:56:33 UTC - Ryan Samo: I just can’t talk back to the functions worker, weird
----
2019-06-26 19:08:11 UTC - David Kjerrumgaard: Any errors in the function-worker log?
----
2019-06-26 19:08:11 UTC - Ryan Samo: @David Kjerrumgaard, I just got it to work by explicitly specifying the `--admin-url` flag when running the pulsar-admin CLI, but it won’t work otherwise
clap : David Kjerrumgaard
----
2019-06-26 19:09:38 UTC - Ryan Samo: Is that how it is intended to work? I thought the proxy is supposed to check if it’s a functions command and then call the 6751 endpoint
----
2019-06-26 19:11:05 UTC - David Kjerrumgaard: `When you are running functions-worker in a separate cluster, the admin rest endpoints are split into two clusters. functions, function-worker, source and sink endpoints are now served by the functions-worker cluster, while all the other remaining endpoints are served by the broker cluster. Hence you need to configure your pulsar-admin to use the right service URL accordingly.` So that makes sense
----
2019-06-26 19:11:41 UTC - David Kjerrumgaard: so you had to override the `default` admin url to go to the proper endpoint?
----
2019-06-26 19:12:15 UTC - Ryan Samo: Yes it sure did
----
2019-06-26 19:12:31 UTC - Ryan Samo: But my default endpoint was the proxy
----
2019-06-26 19:14:04 UTC - Ryan Samo: And now I override it to the functions endpoint
----
2019-06-26 19:15:03 UTC - David Kjerrumgaard: yea, sounds like the proxy isn't working as expected......
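----
Note on the endpoint split quoted above: when the functions workers run as their own cluster and the proxy is not forwarding function requests, each `pulsar-admin` call has to be pointed at the cluster that serves that command, which is what the `--admin-url` override does. A minimal sketch of that choice, assuming a POSIX shell; `admin_url_for`, `BROKER_ADMIN_URL`, and `WORKER_ADMIN_URL` are hypothetical names, the broker URL is a placeholder, and the subcommand spellings may differ between Pulsar versions:

```shell
# Hypothetical wrapper logic (not part of pulsar-admin): print the admin URL
# that should serve a given pulsar-admin subcommand when functions workers
# run separately from the brokers.
BROKER_ADMIN_URL="${BROKER_ADMIN_URL:-https://broker.example.invalid:8443}"        # placeholder
WORKER_ADMIN_URL="${WORKER_ADMIN_URL:-https://pulsar.dev.functions.int.com:6751}"  # URL from the thread

admin_url_for() {
  case "$1" in
    functions|functions-worker|source|sources|sink|sinks)
      printf '%s\n' "$WORKER_ADMIN_URL" ;;   # served by the functions-worker cluster
    *)
      printf '%s\n' "$BROKER_ADMIN_URL" ;;   # all remaining endpoints: broker cluster
  esac
}

# Usage:
#   bin/pulsar-admin --admin-url "$(admin_url_for functions)" \
#     functions list --tenant public --namespace default
```

If the proxy does forward function requests to the workers, a single admin URL pointing at the proxy should be enough and this kind of per-command switch is unnecessary.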
----
2019-06-26 19:16:55 UTC - David Kjerrumgaard: maybe you need to change the proxy config
----
2019-06-26 19:19:19 UTC - Ryan Samo: Checking that again
----
2019-06-26 19:28:07 UTC - Ryan Samo: Yeah the function deployed but dies every 30 secs with an UNAVAILABLE IO exception
----
2019-06-26 19:28:16 UTC - David Kjerrumgaard: where did you start the function-worker? `bin/pulsar functions-worker`
----
2019-06-26 19:28:25 UTC - Ryan Samo: Using the exclamation function
----
2019-06-26 19:28:42 UTC - Ryan Samo: I started it on the standalone node outside the brokers
----
2019-06-26 19:29:20 UTC - David Kjerrumgaard: ok
----
2019-06-26 19:49:34 UTC - Ryan Samo: Ok the IO issue was because the functions worker was targeting the proxy instead of the broker... all solved except overriding the functions worker url
----
2019-06-26 20:03:53 UTC - Aaron: Is there any way to force authentication to produce/consume to a topic?
----
2019-06-26 20:18:49 UTC - Sam Leung: To your 2nd question, yes, that’s what authorization / permissions is for. After you turn it on, Pulsar has a built in system to manage permissions. <https://pulsar.apache.org/docs/en/admin-api-permissions/>
----
2019-06-26 20:56:34 UTC - Gilberto Muñoz Hernández: The pulsar documentation says it supports avro and protobuf schemas but the java client api doc only covers json. Then again, i saw there is a Schema.AVRO but i have no idea how to use it, does anyone know how?
----
2019-06-26 20:59:29 UTC - Matteo Merli: It’s similar to JSON: `Schema.AVRO(MyPojo.class)`
----
2019-06-27 08:34:45 UTC - Guillaume Braibant: Does it generate the Avro schema of your POJO at runtime?
----
2019-06-27 08:35:47 UTC - Ali Ahmed: yes
+1 : Guillaume Braibant
----
2019-06-27 09:03:25 UTC - Ritesh Chandra Nailwal: @Ritesh Chandra Nailwal has joined the channel
----
2019-06-27 09:06:13 UTC - Ritesh Chandra Nailwal: I need to install Apache Pulsar with 3 ZK, 3 BK, and 3 brokers. Can you please provide step by step instructions,
like how many broker daemons we require, and where we should specify the ZooKeeper and BookKeeper configuration (in which daemon). The documentation provided in <https://pulsar.apache.org/docs/en/deploy-bare-metal/> seems to be insufficient.
----
