2020-01-08 09:22:42 UTC - Suhan Duman: @Suhan Duman has joined the channel
----
2020-01-08 13:05:13 UTC - Roman Popenov: I’ve noticed that everywhere in the
documentation for Sources/Sinks, the JSON config file examples are given as:
```{
"bootstrapServers": "pulsar-kafka:9092",
"groupId": "test-pulsar-io",
"topic": "my-topic",
"sessionTimeoutMs": "10000",
"autoCommitEnabled": false
}```
whereas it should be
```{
"configs": {
"bootstrapServers": "pulsar-kafka:9092",
"groupId": "test-pulsar-io",
"topic": "my-topic",
"sessionTimeoutMs": "10000",
"autoCommitEnabled": false
}
}```
----
2020-01-08 13:06:57 UTC - Roman Popenov: Did I miss something?
----
2020-01-08 13:25:46 UTC - tuteng: The above is correct, it can be used like this
```bin/pulsar-admin source localrun --archive
connectors/pulsar-io-debezium-mysql-{{pulsar:version}}.nar --name
debezium-mysql-source --destination-topic-name debezium-mysql-topic --tenant
public --namespace default --source-config '{"database.hostname":
"localhost","database.port": "3306","database.user":
"debezium","database.password": "dbz","database.server.id":
"184054","database.server.name": "dbserver1","database.whitelist":
"inventory","database.history":
"org.apache.pulsar.io.debezium.PulsarDatabaseHistory","database.history.pulsar.topic":
"history-topic","database.history.pulsar.service.url":
"pulsar://127.0.0.1:6650","key.converter":
"org.apache.kafka.connect.json.JsonConverter","value.converter":
"org.apache.kafka.connect.json.JsonConverter","pulsar.service.url":
"pulsar://127.0.0.1:6650","offset.storage.topic": "offset-topic"}'```
thanks : Roman Popenov
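A minimal Python sketch of the two shapes discussed in this exchange, as I read it: the flat map is what gets passed inline to `--source-config`, while the nested form puts the same settings under a top-level `configs` key. The helper names `as_config_file` and `as_source_config_arg` are made up for illustration; only the `--source-config` flag and the setting names come from the messages above.

```python
import json
import shlex

# Connector settings copied from the example in the thread.
connector_settings = {
    "bootstrapServers": "pulsar-kafka:9092",
    "groupId": "test-pulsar-io",
    "topic": "my-topic",
    "sessionTimeoutMs": "10000",
    "autoCommitEnabled": False,
}


def as_config_file(settings):
    """Nested form: the settings sit under a top-level "configs" key."""
    return json.dumps({"configs": settings}, indent=2)


def as_source_config_arg(settings):
    """Flat form: the settings are passed inline, shell-quoted, to --source-config."""
    return "--source-config " + shlex.quote(json.dumps(settings))


if __name__ == "__main__":
    print(as_config_file(connector_settings))
    print(as_source_config_arg(connector_settings))
```

Using `shlex.quote` avoids hand-escaping the JSON when the argument is pasted into a shell command.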
----
2020-01-08 13:26:28 UTC - Roman Popenov: I see! Thanks!
----
2020-01-08 14:25:24 UTC - Kevin DETHELOT: @Kevin DETHELOT has joined the channel
----
2020-01-08 14:40:08 UTC - Fernando: Can I make http requests using the `Pulsar
Functions SDK` in python?
----
2020-01-08 14:41:32 UTC - Roman Popenov: Shouldn’t be a problem
----
2020-01-08 14:41:56 UTC - Fernando: but how do I use the requests library?
----
2020-01-08 14:42:26 UTC - Roman Popenov: python or java?
----
2020-01-08 14:42:33 UTC - Fernando: ah sorry, python
----
2020-01-08 14:43:15 UTC - Fernando: this is an external dependency
----
2020-01-08 14:45:55 UTC - Roman Popenov: Yeah, I haven’t tried running requests
in Python, sorry
----
2020-01-08 14:47:11 UTC - Roman Popenov: I would assume that importing a
package using the relative paths would work
----
2020-01-08 14:47:16 UTC - Roman Popenov: But I haven’t tried that
----
2020-01-08 14:48:17 UTC - Fernando: I find it a bit confusing since pulsar is
running on kubernetes so I’d have to install dependencies in the broker
container which doesn’t sound like good practice
----
2020-01-08 14:48:38 UTC - Roman Popenov: I wouldn’t install dependencies
----
2020-01-08 14:48:57 UTC - Roman Popenov: If you are running it in a broker
context, I would just CP the library itself
----
2020-01-08 14:49:25 UTC - Fernando: I see
----
2020-01-08 14:49:31 UTC - Roman Popenov: and then when importing modules, I
would use import lib or refer to the modules using relative paths
----
2020-01-08 14:49:56 UTC - Roman Popenov: It doesn’t seem like a very clean
solution, but that’s what I did once
----
2020-01-08 14:50:01 UTC - Fernando: ok so basically like deploying lambdas in
AWS
----
2020-01-08 14:57:05 UTC - Roman Popenov: Actually, I think you can just import
requests
----
2020-01-08 14:57:18 UTC - Roman Popenov:
----
2020-01-08 14:57:35 UTC - Roman Popenov: requests is an already installed module
----
2020-01-08 14:57:45 UTC - Fernando: interesting
----
2020-01-08 14:57:52 UTC - Fernando: this is in the broker
----
2020-01-08 14:58:06 UTC - Roman Popenov: Yeah
----
2020-01-08 14:58:17 UTC - Fernando: I’ll have a look thanks
----
2020-01-08 14:59:04 UTC - Fernando: you’re right, awesome!
+1 : Roman Popenov
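A rough sketch of what such a function could look like, assuming `requests` is importable in the function runtime as reported above. The class name `HttpLookup`, the `http_get` parameter, and the endpoint `LOOKUP_URL` are hypothetical and exist only for illustration; the `Function` base class and its `process(input, context)` method come from the Python Functions SDK.

```python
try:
    from pulsar import Function  # available inside the Pulsar Functions runtime
except ImportError:
    class Function:  # stand-in so the sketch can be imported outside Pulsar
        pass

# Hypothetical endpoint, used only for illustration.
LOOKUP_URL = "https://example.invalid/lookup"


class HttpLookup(Function):
    def __init__(self, http_get=None):
        # http_get can be injected for testing; it defaults to requests.get.
        self._http_get = http_get

    def process(self, input, context):
        if self._http_get is None:
            import requests  # imported lazily, when the first message arrives
            self._http_get = requests.get
        response = self._http_get(LOOKUP_URL, params={"q": input}, timeout=5)
        response.raise_for_status()  # raise on HTTP 4xx/5xx responses
        return response.text
```

The explicit timeout keeps a slow endpoint from blocking message processing indefinitely.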
----
2020-01-08 16:09:49 UTC - Adam: @Adam has joined the channel
----
2020-01-08 16:10:24 UTC - Adam: Hi! I'm wondering if state storage from pulsar
functions is still in developer preview
----
2020-01-08 16:11:00 UTC - Adam: I noticed that this blog post
<https://streaml.io/blog/eda-simple-event-processing> started referencing it
back in 2018
----
2020-01-08 16:11:49 UTC - Adam: But this doc seems to imply that it's still in
developer preview:
<https://pulsar.apache.org/docs/en/functions-state/#__docusaurus>
----
2020-01-08 16:16:35 UTC - Adam: Ah, I see an older message explaining that it
is still in developer preview. I have a further question then - in Kafka
Streams workers, you can have a transaction between the update to a state store
and the consumer's progress (since both are writing to Kafka, and Kafka has
transactions). I'm curious if there will be a similar capability for Pulsar
functions at some point in the future?
----
2020-01-08 16:17:31 UTC - Adam: And one further question - is there a place
that documents what work is remaining to take state storage out of developer
preview? I'm curious if there's any way to pitch in on that effort
----
2020-01-08 16:30:08 UTC - Ryan: Is there a specific reason clients are not
allowed to skip to a specific message in a topic, whether via Id or
otherwise? Currently, it appears clients have the choice of either starting at
the beginning of a stream or at the latest message. After taking a look at the
source code, there is support for a subscription to keep track of a client's
location within a stream, so if a client disconnects and reconnects it can
resume where it left off, but the initial connection options appear to be an
enum with the above two choices.
----
2020-01-08 16:45:22 UTC - Kohei Watanabe: @Kohei Watanabe has joined the channel
----
2020-01-08 21:40:21 UTC - Mathieu Druart: @Pedro Cardoso I tried the 2.5.0-RC2
version and added
```extraServerComponents:
"org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent"```
to the values-mini.yaml. After the cluster deployment I verified that the
property was correct in the `conf/bookkeeper.conf` file on every node, but
when I try a function I still get the same exception
`java.lang.IllegalStateException: State is not enabled.` when I try to access
the state. Any ideas? Thanks!
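The failing function is not shown in the thread. As a rough illustration only, this is the kind of state access involved, sketched in Python with the SDK's `context.incr_counter` and `context.get_counter` calls; the class name `CountEvents` is made up, and whether this exact path produces the exception above is an assumption.

```python
try:
    from pulsar import Function  # available inside the Pulsar Functions runtime
except ImportError:
    class Function:  # stand-in so the sketch can be imported outside Pulsar
        pass


class CountEvents(Function):
    def process(self, input, context):
        # Both calls go through the function's state store, so they depend on
        # state storage being enabled on the cluster.
        context.incr_counter(input, 1)
        return context.get_counter(input)
```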
----
2020-01-08 21:43:19 UTC - Pedro Cardoso: @Sijie Guo :point_up: ?
----
2020-01-08 21:55:15 UTC - Julien: @Julien has joined the channel
----
2020-01-08 22:55:06 UTC - Roman Popenov: Does anyone have an explanation for the
resources values in `values.yaml` for the helm chart?
----
2020-01-08 22:56:15 UTC - juraj: like, why proxy has 4 gb of ram? lol, i don't
----
2020-01-08 22:58:02 UTC - Roman Popenov: And why 4 nodes of bookies with 15 Gi
of ram and not 6 with 10 Gi
----
2020-01-08 22:59:14 UTC - Roman Popenov: Do you have any performance metrics
for Pulsar?
----
2020-01-08 22:59:25 UTC - Roman Popenov: Or any recommendations?
----
2020-01-08 23:25:17 UTC - juraj: nothing that i could say has been thoroughly
empirically validated / battle-tested yet
----
2020-01-08 23:26:39 UTC - Roman Popenov: Any future plans?
----
2020-01-08 23:30:47 UTC - juraj: i have estimated how much roughly to give each
component ram/cpu based on total aws/eks node ram and cpu specs, and how much
the k8s system components already allocated for themselves..
for the future would be nice to have a tool that would do this automatically
and then spit out the values.yaml accordingly
----
2020-01-08 23:31:14 UTC - Roman Popenov: What are your estimates?
----
2020-01-08 23:32:23 UTC - Roman Popenov: I was thinking of starting with:
```Scaled Pulsar Cluster without monitoring
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
---------------------------
Zookeeper:
3 Nodes:
memory: 1Gi
cpu: 1
volume: 20Gi
---------------------------
---------------------------
Bookkeeper:
4 Nodes:
memory: 1Gi
cpu: 1
volumes:
50Gi ledger
50Gi journal
---------------------------
---------------------------
Broker:
3 Nodes:
memory: 1Gi
cpu: 1
---------------------------
---------------------------
Proxy:
3 Nodes:
memory: 1Gi
cpu: 1
---------------------------
---------------------------
Auto-Recovery:
1 Node:
memory: 1Gi
cpu: 250m
---------------------------
---------------------------
Management:
1 Nodes:
memory: 250Mi
cpu: 0.1
---------------------------
---------------------------
Functions +IO:
1 Nodes:
memory: 1Gi
cpu: 0.5
---------------------------
---------------------------
Functions +IO:
1 Nodes:
memory: 1Gi
cpu: 0.5
-------------------------```
----
2020-01-08 23:32:59 UTC - Roman Popenov: Roughly with
RAM ~ 16 Gi
CPU ~ 16
DISK ~50 Gi
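A quick way to sanity-check these totals is to sum the plan above. This sketch takes the listed values at face value and makes three assumptions that are not stated in the thread: every figure is per node, each node gets its own volumes, and 250Mi is rounded to 0.25 Gi.

```python
# (name, nodes, memory Gi per node, CPU per node, disk Gi per node)
components = [
    ("Zookeeper", 3, 1.0, 1.0, 20),
    ("Bookkeeper", 4, 1.0, 1.0, 100),  # 50Gi ledger + 50Gi journal
    ("Broker", 3, 1.0, 1.0, 0),
    ("Proxy", 3, 1.0, 1.0, 0),
    ("Auto-Recovery", 1, 1.0, 0.25, 0),  # 250m CPU
    ("Management", 1, 0.25, 0.1, 0),     # 250Mi treated as 0.25 Gi
    ("Functions +IO", 1, 1.0, 0.5, 0),
    ("Functions +IO", 1, 1.0, 0.5, 0),   # listed twice in the plan
]


def totals(rows):
    """Return (memory Gi, CPU, disk Gi) summed over all nodes."""
    mem = sum(n * m for _name, n, m, _c, _d in rows)
    cpu = sum(n * c for _name, n, _m, c, _d in rows)
    disk = sum(n * d for _name, n, _m, _c, d in rows)
    return mem, cpu, disk


if __name__ == "__main__":
    mem, cpu, disk = totals(components)
    print(f"RAM ~ {mem} Gi, CPU ~ {round(cpu, 2)}, DISK ~ {disk} Gi")
```

Under those assumptions the sum comes to about 16.25 Gi of RAM, 14.35 CPU, and 460 Gi of disk, so the disk figure is the one worth double-checking if volumes really are per node.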
----
2020-01-08 23:33:10 UTC - Roman Popenov: And see how it fares
----
2020-01-08 23:33:25 UTC - juraj: i have planned for an EKS cluster of 4 worker
nodes (r5d.xlarge) and 1 system node (r5d.large)
----
2020-01-08 23:34:04 UTC - juraj: i'm placing the components using node taints /
tolerations and deployment/statefulset affinity rules
----
2020-01-08 23:35:11 UTC - juraj: (i'm not using functions yet, plus they got
broken in 2.4.2)
----
2020-01-08 23:35:21 UTC - Roman Popenov: Oh
----
2020-01-08 23:35:25 UTC - Roman Popenov: What broke?
----
2020-01-08 23:35:45 UTC - juraj: the broker :smile:
sweat_smile : Roman Popenov
----
2020-01-08 23:36:15 UTC - Roman Popenov: What is the issue exactly?
----
2020-01-08 23:36:25 UTC - juraj: <https://github.com/apache/pulsar/issues/5818>
----
2020-01-08 23:38:26 UTC - Roman Popenov: Oh yeah, I was working around it
----
2020-01-08 23:39:08 UTC - Roman Popenov: It shouldn’t prevent you from using
functions
----
2020-01-09 02:26:27 UTC - rmb: Hi all, I have some questions about pulsar
producers, specifically in the nodejs library:
• if sending a message fails, what are the possible error messages send() could
throw?
• if a broker has deduplication turned on, the docs recommend setting the
timeout to -1 --- why is that? and if there's no timeout, is there some other
mechanism for the client to decide that a message has failed?
• the nodejs documentation only lists methods send(), flush(), and close() for
the producer. is there a way to extract the producer's configuration data?
(for example, if I want to know the producerName or the lastSequenceId) those
functions seem to be implemented in the other client libraries; is there a
reason they're not in the nodejs library?
----
2020-01-09 03:59:13 UTC - vikash: Hello All,
I am also facing the same issue with the .NET client producer (Pulsar.Client.Api
Pulsar.Client, Version=0.12.0.0).
Here is the issue link:
<https://github.com/apache/pulsar/issues/5454>
java.lang.NullPointerException: null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java
That issue was closed for the Python Pulsar client, but I am hitting it with
the .NET Pulsar client while sending a message from a .NET client producer to a
Pulsar topic.
Is there any alternative way to pass a message from a .NET client producer to
the IO JDBC sink connector, with or without a
schema (Avro or JSON)?
----
2020-01-09 04:19:26 UTC - Sijie Guo: this is just fixed in
<https://github.com/apache/pulsar/pull/5930>
----
2020-01-09 04:36:47 UTC - vikash: @Sijie Guo I'm looking for a way for a .NET
client producer to send messages to the topic, with or without a schema, for
the io-JDBC sink... any solution from the .NET client or websocket side?
----
2020-01-09 05:35:41 UTC - Sijie Guo: I don’t think websocket supports schema yet.
----
2020-01-09 05:36:01 UTC - Sijie Guo: There are two .NET clients available. I am
not sure whether schema is supported or not.
----