2019-11-24 13:23:53 UTC - Fernando: how should I ingest data from the Debezium
postgres connector in pulsar SQL? The connector creates a schema:
```
{
  "name": "dbserver1.inventory.products",
  "schema": {
    "key": {
      "name": "Bytes",
      "schema": "",
      "type": "BYTES",
      "properties": {}
    },
    "value": {
      "name": "Bytes",
      "schema": "",
      "type": "BYTES",
      "properties": {}
    }
  },
  "type": "KEY_VALUE",
  "properties": {
    "key.schema.name": "Bytes",
    "key.schema.properties": "{}",
    "key.schema.type": "BYTES",
    "kv.encoding.type": "INLINE",
    "value.schema.name": "Bytes",
    "value.schema.properties": "{}",
    "value.schema.type": "BYTES"
  }
}
```
but this is not recognized by presto
----
2019-11-24 16:12:27 UTC - Sijie Guo: @leonidv by default, there is one log
topic and one output topic (the topic that the function's return values are
published to). You can publish results to as many topics as you want by
using Context.publish, as @Jasper Li pointed out.
Also, I would recommend reading the Pulsar documentation rather than the PIP:
<http://pulsar.apache.org/docs/en/functions-overview/>. The documentation is
updated as the code evolves.
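
A minimal sketch of publishing to more than one topic with `Context.publish`, written against the Python Functions SDK. The topic names and the routing rule are hypothetical, and the `ImportError` fallback is an assumption added only so the snippet can be read and exercised without the Pulsar client installed.

```python
try:
    from pulsar import Function  # Pulsar Functions SDK base class
except ImportError:
    # Assumption: fallback so the sketch runs without the Pulsar client.
    Function = object

# Hypothetical topic names, chosen only for illustration.
EVEN_TOPIC = "persistent://public/default/even-lengths"
ODD_TOPIC = "persistent://public/default/odd-lengths"


class FanOutFunction(Function):
    """Publishes each input to one of two extra topics via context.publish."""

    def process(self, input, context):
        # Route on an arbitrary rule; any number of topics could be used here.
        topic = EVEN_TOPIC if len(input) % 2 == 0 else ODD_TOPIC
        context.publish(topic, input)
        # Nothing is returned, so this sketch relies only on explicit publishes.
        return None
```

The routing rule here is a placeholder; the point is only that `context.publish` takes the destination topic per call, so one function can write to several topics.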
----
2019-11-24 16:13:06 UTC - Sijie Guo: I am not aware of any such study yet.
----
2019-11-24 16:13:53 UTC - Sijie Guo: There is a WIP adding key/value schema
support in Pulsar SQL.
----
2019-11-24 19:42:40 UTC - Thor Sigurjonsson: I have a topic in production that
is giving 500 errors on stats call, and producers can't produce either.
----
2019-11-24 19:43:04 UTC - Thor Sigurjonsson: Any ideas how one can fix the 500
errors on topics or what might be the cause?
----
2019-11-24 19:50:49 UTC - David Kjerrumgaard: Are there any errors in the
broker logs?
----
2019-11-24 19:52:08 UTC - Thor Sigurjonsson: Looking...
----
2019-11-24 19:56:10 UTC - Thor Sigurjonsson:
`[BookKeeperClientWorker-OrderedExecutor-7-0] WARN
org.apache.pulsar.broker.service.BrokerService - Failed to create topic
persistent://<tenant>/<ns>/<topic>`
----
2019-11-24 19:56:13 UTC - Thor Sigurjonsson: found this
----
2019-11-24 20:02:47 UTC - Thor Sigurjonsson: Is there a good way to "rebuild" a
topic that's getting 500 errors? I wonder if there is a missing ledger in BK or
something of that kind. We have a producer that is hard to re-deploy.
----
2019-11-24 20:03:16 UTC - Thor Sigurjonsson: I'm thinking some kind of forceful
deletion...
----
2019-11-24 20:03:37 UTC - Thor Sigurjonsson: or zookeeper surgery that makes it
happy again (no backlog needs saving)
----
2019-11-24 20:05:53 UTC - David Kjerrumgaard: Does the error say why it cannot
create the topic? Is it a permission issue or is it a ZK issue?
----
2019-11-24 20:06:18 UTC - Thor Sigurjonsson: The auth role has been working
----
2019-11-24 20:06:22 UTC - Thor Sigurjonsson: (token auth)
----
2019-11-24 20:06:40 UTC - Thor Sigurjonsson: and our admin cli gets 500's on
stats and other calls on the topic
----
2019-11-24 20:06:57 UTC - Thor Sigurjonsson: makes me think it's in a bad state
----
2019-11-24 20:07:00 UTC - Thor Sigurjonsson: of some kind
----
2019-11-24 20:11:35 UTC - David Kjerrumgaard: Is the behavior isolated to that
topic only?
----
2019-11-24 20:13:40 UTC - Thor Sigurjonsson: it would appear yes
----
2019-11-24 20:14:09 UTC - Thor Sigurjonsson: it's also quite an older topic and
we've done some migrations with little data in flight since then
----
2019-11-24 20:14:16 UTC - Thor Sigurjonsson: which might have caused an issue
in BK
----
2019-11-24 20:14:24 UTC - Thor Sigurjonsson: or ZK/BK agreement
----
2019-11-24 20:15:28 UTC - Thor Sigurjonsson: we're seeing good flows on our
other data flows
----
2019-11-24 20:15:48 UTC - David Kjerrumgaard: Is it possible that the Ledger
IDs associated with the topic and stored in ZK have been removed from BK?
----
2019-11-24 20:16:03 UTC - Thor Sigurjonsson: it is possible
----
2019-11-24 20:17:21 UTC - David Kjerrumgaard: and you get a 500 error when you
try to issue admin commands for that topic? Including delete, etc
----
2019-11-24 20:18:01 UTC - David Kjerrumgaard: Are there any active
subscriptions on the topic?
<https://pulsar.apache.org/docs/en/pulsar-admin/#subscriptions>
----
2019-11-24 20:19:55 UTC - Thor Sigurjonsson: we get 500s on some admin-cli
commands yes
----
2019-11-24 20:19:57 UTC - Thor Sigurjonsson: not all
----
2019-11-24 20:19:59 UTC - Thor Sigurjonsson: policies are ok
----
2019-11-24 20:20:36 UTC - Thor Sigurjonsson: we have a function subscribing
----
2019-11-24 20:21:20 UTC - Thor Sigurjonsson: and producer getting some errors
connecting too
----
2019-11-24 20:22:42 UTC - Thor Sigurjonsson: `reset-cursor` `stats`
`stats-internal` give 500s
----
2019-11-24 20:23:15 UTC - Thor Sigurjonsson: but `persistent lookup`
`permissions` `info-internal` work
----
2019-11-24 20:24:08 UTC - David Kjerrumgaard: If you want to "rebuild" the
topic then you should remove all the active subscribers, and delete the topic.
Can you try those 2 steps?
----
2019-11-24 20:24:39 UTC - David Kjerrumgaard:
<https://pulsar.apache.org/docs/en/pulsar-admin/#unsubscribe-1> then
<https://pulsar.apache.org/docs/en/pulsar-admin/#delete-4>
----
2019-11-24 20:25:03 UTC - Thor Sigurjonsson: is `delete` new?
----
2019-11-24 20:25:07 UTC - David Kjerrumgaard: You may need to stop the function
----
2019-11-24 20:25:29 UTC - David Kjerrumgaard: what version of Pulsar are you
running?
----
2019-11-24 20:26:51 UTC - Thor Sigurjonsson: 2.4.0-streamlio-24
----
2019-11-24 20:29:55 UTC - David Kjerrumgaard: The `delete` command has been
around since version 2.0 at least, if not sooner.
----
2019-11-24 20:30:19 UTC - David Kjerrumgaard: So your version should support it
----
2019-11-24 20:36:59 UTC - Thor Sigurjonsson: I'm getting 500 on unsubscribe and
on delete
----
2019-11-24 20:37:25 UTC - Thor Sigurjonsson: function is stopped though
----
2019-11-24 20:37:59 UTC - David Kjerrumgaard: Are there any remaining
subscriptions after you stopped the function?
----
2019-11-24 20:38:43 UTC - Thor Sigurjonsson: I don't think so
----
2019-11-24 20:38:57 UTC - Thor Sigurjonsson: but I can't verify with 500s
showing up on the calls
----
2019-11-24 20:46:17 UTC - David Kjerrumgaard: do you get a 500 on the
`subscriptions` call? I thought you were able to call that earlier to get the
list of subscriptions
----
2019-11-24 20:54:30 UTC - Thor Sigurjonsson: no we just have records of what
was deployed... function mainly...
----
2019-11-24 20:56:10 UTC - Thor Sigurjonsson: I'm sort of looking around in
zkCli to find things
----
2019-11-24 20:56:29 UTC - Thor Sigurjonsson: but it's hard to map without
knowing
----
2019-11-24 21:00:45 UTC - David Kjerrumgaard: Gotcha. The metadata is hard to
follow and not documented AFAIK.
----
2019-11-24 21:00:51 UTC - David Kjerrumgaard: sorry I couldn't be more help
----
2019-11-24 23:02:02 UTC - Thor Sigurjonsson: Thanks, we sorted it out. Stopped
function, removed managed-ledger for it in ZK and started it again. Things
fixed themselves then.
+1 : David Kjerrumgaard
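
For anyone trying to locate the same metadata, here is a rough sketch of how a topic name maps to its managed-ledger znode. It assumes the layout used by Pulsar 2.x for topics named `persistent://tenant/namespace/topic`, and it approximates Pulsar's URL-encoding of the topic's local name with `quote_plus`. The helper name is hypothetical, and the result should be checked in the ZooKeeper shell before anything is removed.

```python
from urllib.parse import quote_plus


def managed_ledger_znode(topic: str) -> str:
    """Map persistent://tenant/namespace/topic to its managed-ledger znode path."""
    scheme, sep, rest = topic.partition("://")
    if sep != "://" or scheme != "persistent":
        raise ValueError(f"expected a persistent:// topic name, got {topic!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic, got {rest!r}")
    tenant, namespace, local_name = parts
    # Assumption: the local name is URL-encoded in the znode path.
    return f"/managed-ledgers/{tenant}/{namespace}/persistent/{quote_plus(local_name)}"
```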
----
2019-11-25 05:20:58 UTC - Fernando: is there a place where I can track this?
----
2019-11-25 05:30:44 UTC - Fernando: also I’d prefer if the input messages from
the source would be properly typed and not really key value but I’m having a
hard time finding documentation on how to do this
----
2019-11-25 05:41:57 UTC - Fernando: maybe related question: how do I type the
key and value instead of bytes? It could be JSON or anything that allows me to
use it with SQL
----
2019-11-25 07:51:46 UTC - Sijie Guo: @jia zhai @tuteng ^
----
2019-11-25 08:25:17 UTC - tuteng: I will try fix this problem.
----
2019-11-25 08:34:40 UTC - tuteng: There was already an internal discussion of
how to solve this issue. It needs 2 main pieces of support:
1. Debezium is currently using KeyValueSchema, so we need to support
KeyValueSchema in Pulsar SQL;
2. Debezium currently does not support Avro Schema; there is an issue tracking
it (<https://github.com/apache/pulsar/issues/5633>)
----
2019-11-25 08:37:25 UTC - Fernando: is there a way to re-serialize the topic
coming from debezium into a new topic that Pulsar SQL can understand? It’s kind
of a blocker right now since I don’t know how to do this without using kafka
instead
----
2019-11-25 08:47:25 UTC - tuteng: You are right, we need to do this. We have
developed a pulsar-io-kafka <https://github.com/streamnative/pulsar-io-kafka>
before. The principle is similar, but there is still some additional work to be
done.
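
As a stopgap, the re-serialization step could be sketched roughly as below. It assumes the `INLINE` key/value encoding is a 4-byte big-endian length followed by the key bytes, then the same for the value, and that both Debezium payloads are JSON documents. Both function names are hypothetical, and publishing the result to a second topic with a JSON or Avro schema is left out.

```python
import json
import struct


def decode_inline_key_value(payload: bytes):
    """Split an INLINE-encoded KeyValue payload into (key_bytes, value_bytes)."""
    # Assumption: each part is prefixed by a 4-byte big-endian signed length.
    (key_len,) = struct.unpack_from(">i", payload, 0)
    key_end = 4 + key_len
    key = payload[4:key_end]
    (value_len,) = struct.unpack_from(">i", payload, key_end)
    value_start = key_end + 4
    value = payload[value_start:value_start + value_len]
    if len(key) != key_len or len(value) != value_len:
        raise ValueError("payload is shorter than its length prefixes claim")
    return key, value


def to_flat_json(payload: bytes) -> bytes:
    """Re-serialize one message as a single JSON document holding key and value."""
    key, value = decode_inline_key_value(payload)
    # Assumption: key and value are themselves JSON documents.
    merged = {"key": json.loads(key), "value": json.loads(value)}
    return json.dumps(merged).encode("utf-8")
```

A function or a small consumer/producer pair could call `to_flat_json` on each message from the Debezium topic and write the output to a new topic.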
----
2019-11-25 08:54:28 UTC - Fernando: Thanks I’ll have a look.
----