2019-11-04 09:40:10 UTC - Jasper Li: Hello all,
I have a question about using Pulsar SQL after doing CDC with the Debezium
connector. It returns the error below:
```
2019-11-04T09:32:03.590Z WARN statement-response-2
com.facebook.presto.server.ThrowableMapper Request failed for
/v1/statement/20191104_093201_00002_562v2/3
java.lang.IllegalArgumentException: Unsupported schema : {
"name": "db.db.table",
"schema": {
"key": {
"name": "Bytes",
"schema": "",
"type": "BYTES",
"properties": {}
},
"value": {
"name": "Bytes",
"schema": "",
"type": "BYTES",
"properties": {}
}
},
"properties": {
"key.schema.name": "Bytes",
"key.schema.properties": "{}",
"key.schema.type": "BYTES",
"kv.encoding.type": "INLINE",
"value.schema.name": "Bytes",
"value.schema.properties": "{}",
"value.schema.type": "BYTES"
}
}
```
Does this mean there is no way to query Debezium CDC records with Pulsar SQL
unless I transform them first?
Thanks!
----
2019-11-04 09:48:12 UTC - Jasper Li: @xiaolong.ran Thank you very much! I have
now successfully offloaded to my GCS bucket. It is cool!
----
2019-11-04 10:09:52 UTC - Jasper Li: Hello again,
I have another question, about Debezium CDC in Pulsar compared with Kafka. I am
moving from Kafka to Pulsar. In Kafka Connect I used Debezium to capture the
change log from MySQL, with
`transforms.unwrap.type=io.debezium.transforms.UnwrapFromEnvelope` and
`key.converter=io.confluent.connect.avro.AvroConverter` /
`value.converter=io.confluent.connect.avro.AvroConverter`,
but I cannot use them in Pulsar IO directly. Is it possible to apply them in
Pulsar IO?
Thanks again!
----
2019-11-04 10:56:12 UTC - Kabeer Ahmed: @tuteng Thank you for tagging @yijie
----
2019-11-04 11:10:32 UTC - Sijie Guo: @Jasper Li:
For your first question, I think Pulsar SQL (Presto) doesn’t support key/value
schemas yet. @Penghui Li was looking into adding that support.
For the second question, I believe you can set those settings. Just add the
Debezium settings under the `configs:` section of the YAML file you used to
submit the connector.
+1 : Jasper Li, Penghui Li
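A rough sketch of that suggestion, assuming the Kafka Connect properties from the question are simply nested under `configs`. The helper `to_pulsar_io_config`, the connector name, and the topic name are hypothetical, and the `"transforms": "unwrap"` entry is an assumed companion setting. Whether Pulsar IO actually honors the `transforms.*` and Avro converter settings is not confirmed in this thread. The output is JSON, which is also valid YAML.

```python
import json


def to_pulsar_io_config(name, topic, kafka_connect_props):
    """Nest Kafka Connect style properties under `configs` (hypothetical helper)."""
    return {
        "tenant": "public",       # assumed tenant
        "namespace": "default",   # assumed namespace
        "name": name,
        "topicName": topic,
        "parallelism": 1,
        "configs": dict(kafka_connect_props),
    }


# Settings quoted in the question; "transforms": "unwrap" is an assumption.
props = {
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.UnwrapFromEnvelope",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
}

config = to_pulsar_io_config("debezium-mysql-source", "debezium-mysql-topic", props)
print(json.dumps(config, indent=2))
```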
----
2019-11-04 13:17:26 UTC - Berger: Good morning everyone. I’m looking for a
messaging solution that can be installed as a multi-cluster setup across two
different cloud providers at the start, with more clusters added later on
different infrastructures around the world :slightly_smiling_face:
I started with a basic installation on a k8s cluster, but now I wonder whether
it is possible to install Pulsar on multiple k8s clusters and connect them
into one multi-cluster instance. Is that something I can achieve with
Kubernetes, or do I have to use normal instances? Either way, I probably still
need some quorum to share configuration between clusters :open_mouth:
+1 : Jasper Li
----
2019-11-04 13:57:10 UTC - Matt Mitchell: I tried to run the latest dashboard
(last week and tip of master as of 5 mins ago) and it failed with
`sudo: initdb: command not found`. I found that the path to `initdb` in the
`init-postgres.sh` script references `9.6` but the image has `11`. After
updating the script, the dashboard seems to work. Is this a known issue or
possibly something related to my env?
----
2019-11-04 14:08:56 UTC - Matt Mitchell: Also, once it’s running, the UI
doesn’t list anything: no tenants, namespaces, etc. Maybe there’s a config/ENV
option missing from the README?
----
2019-11-04 14:15:07 UTC - tuteng: Can you try logging in to pg and querying the data?
----
2019-11-04 14:55:26 UTC - Sijie Guo: @Matt Mitchell the current dashboard uses
topic stats to display tenants and namespaces. If you don’t have any traffic,
that information might not show up. You can try the new management console:
<https://github.com/apache/pulsar-manager>
----
2019-11-04 14:57:12 UTC - Sijie Guo: You can install Pulsar in multiple k8s
clusters and expose the proxies through a load balancer, so each cluster can
connect to the others. In this way, you can build a global instance.
----
2019-11-04 15:10:58 UTC - Berger: @Sijie Guo Have you seen any examples
(articles or anything else) of such a configuration? I guess in this case I can
use this method to connect different installation types, such as k8s and
normal installations on instances.
----
2019-11-04 15:18:38 UTC - Matt Mitchell: will do. thanks @Sijie Guo
----
2019-11-04 15:45:00 UTC - Alexandre DUVAL: @Jerry Peng do you have an example
of a function config YAML file?
----
2019-11-04 15:46:15 UTC - Alexandre DUVAL: Hi, how can I inject an env var into
a Pulsar function?
----
2019-11-04 15:53:20 UTC - Alexandre DUVAL: Should env vars be passed in
PULSAR_EXTRA_OPTS?
----
2019-11-04 16:03:16 UTC - Matteo Merli: @Jared Mackey @Raman Gupta the sequence
id, apart from deduplication, is used to correlate a SendReceipt to a
particular Send request.
It’s not optional or ignored; rather, it’s stored within the message
metadata.
The sequence id is a per-producer, client-assigned identifier, while the
message id is a storage-assigned unique identifier.
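To make the distinction concrete, here is a toy model (not the Pulsar client API) in which the client picks the sequence id, the storage side assigns its own id on persist, and the receipt carries both so the client can match it to the original send. `ToyBroker`, the starting id of 100, and the duplicate handling are all assumptions made for illustration.

```python
from itertools import count


class ToyBroker:
    """Toy stand-in for the storage layer; not the Pulsar API."""

    def __init__(self):
        self._next_id = count(100)  # storage-assigned ids (arbitrary start)
        self._last_seq = {}         # producer name -> highest persisted sequence id
        self.log = []               # (message_id, producer, sequence_id, payload)

    def send(self, producer, sequence_id, payload):
        # Deduplication: ignore anything at or below the last persisted sequence id.
        if sequence_id <= self._last_seq.get(producer, -1):
            return sequence_id, None
        message_id = next(self._next_id)  # assigned by storage, not by the client
        self._last_seq[producer] = sequence_id
        self.log.append((message_id, producer, sequence_id, payload))
        return sequence_id, message_id    # the "send receipt"


broker = ToyBroker()
pending = {0: "a", 1: "b"}  # client side: sequence id -> payload in flight
receipts = [broker.send("producer-1", seq, body) for seq, body in pending.items()]
retry = broker.send("producer-1", 1, "b")  # a resend of sequence id 1

print(receipts)  # [(0, 100), (1, 101)]
print(retry)     # (1, None)
```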
----
2019-11-04 18:55:21 UTC - Addison Higham: reading the docs on retention
policies, want to make sure I understand something. The docs say "and" for size
and time, does that mean that if I set a policy for 10GB size and 3 hours of
time, that it could go beyond 10GB if I have more than 10GB of data in the 3
hour window?
----
2019-11-04 19:27:45 UTC - Jerry Peng: No, it’s whichever limit is reached first.
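A minimal sketch of that rule, assuming both limits are positive: data stays only while it is within the size limit and within the time limit, so crossing either one makes it eligible for removal. The function `is_retained` is hypothetical and ignores special values and how the broker actually trims storage.

```python
def is_retained(backlog_bytes, age_minutes, size_limit_bytes, time_limit_minutes):
    """Toy check: data stays only while BOTH limits are still satisfied."""
    return backlog_bytes <= size_limit_bytes and age_minutes <= time_limit_minutes


GB = 1024 ** 3
size_limit, time_limit = 10 * GB, 3 * 60  # 10 GB and 3 hours, as in the question

print(is_retained(4 * GB, 60, size_limit, time_limit))    # True: under both limits
print(is_retained(12 * GB, 60, size_limit, time_limit))   # False: size limit hit first
print(is_retained(4 * GB, 200, size_limit, time_limit))   # False: time limit hit first
```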
----
2019-11-04 19:45:06 UTC - Addison Higham: okay, that is what I thought was more
likely the case (and maybe I missed it) but it wasn't obvious at first glance
----
2019-11-04 20:31:36 UTC - Jerry Peng: ```
name: jerry-function
tenant: public
namespace: default
jar: /Users/jerrypeng/workspace/incubator-pulsar/pulsar-functions/java-examples/target/pulsar-functions-api-examples.jar
className: org.apache.pulsar.functions.api.examples.TestFunction
inputSpecs:
  persistent://jerry/default/jerry-input:
    receiverQueueSize: 1000
output: persistent://jerry/default/jerry-output
parallelism: 1
cleanupSubscription: true
```
----
2019-11-04 21:17:20 UTC - Alexandre DUVAL: Is there a way to define a custom
input schema?
----
2019-11-04 22:47:42 UTC - CTRL: hi everyone! :slightly_smiling_face:
wave : Chris Bartholomew, Matteo Merli, Karthik Ramasamy
----
2019-11-05 01:28:18 UTC - Jasper Li: Thanks for your reply! I am happy to know
that Pulsar SQL will support key/value schemas in the future, and I will try
to set up Debezium again. :slightly_smiling_face:
----
2019-11-05 01:53:21 UTC - kay pan: hi everyone
----
2019-11-05 01:54:34 UTC - kay pan: I have an issue:
[pulsar-ordered-OrderedExecutor-7-0-EventThread] INFO
org.apache.pulsar.zookeeper.ZooKeeperDataCache - [State:CONNECTED Timeout:30000
sessionid:0x20043268f0f000b
----
2019-11-05 01:54:45 UTC - kay pan: Please help, thanks.
----
2019-11-05 05:35:43 UTC - Gopi Krishna:
<https://github.com/PharosProduction/tutorial-pulsar-java> If we write Java
classes as in this link, how do we run the producer and consumer classes? I am
confused.
----
2019-11-05 06:52:01 UTC - Gopi Krishna: Are there any connectors that can
stream data from MongoDB to Pulsar? I can find the
<https://pulsar.apache.org/docs/en/next/io-mongo-sink/> Pulsar sink connector,
but no connector to pull data from MongoDB.
----
2019-11-05 06:52:48 UTC - Ali Ahmed: @Gopi Krishna You mean CDC for MongoDB?
----
2019-11-05 06:53:11 UTC - Gopi Krishna: What is CDC?
----
2019-11-05 06:53:52 UTC - Ali Ahmed: change data capture
----
2019-11-05 06:55:30 UTC - Gopi Krishna: Hmm, not exactly. Basically I am trying
to read the data streamed into mongodb through nifi. This data can be
historical or real-time
----
2019-11-05 06:58:09 UTC - Gopi Krishna: any idea ?
----
2019-11-05 07:16:30 UTC - tuteng: You can try
<https://pulsar.apache.org/docs/en/next/io-debug/> to debug
----
2019-11-05 07:16:50 UTC - tuteng:
<https://pulsar.apache.org/docs/en/next/io-debug/#debug-in-localrun-mode>
----
2019-11-05 07:19:54 UTC - Gopi Krishna: This is just for debugging of
mongo-connector-sink
----
2019-11-05 07:23:58 UTC - tuteng:
<https://github.com/apache/pulsar/issues/5474> We haven't added the MongoDB CDC
scenario yet.
----
2019-11-05 07:24:14 UTC - tuteng: pull data from mongo
----
2019-11-05 07:24:55 UTC - Gopi Krishna: thanks will go through
----
2019-11-05 07:28:58 UTC - Sijie Guo: @tuteng: @Gopi Krishna is asking for a
mongodb cdc.
----