Hi,

Though I am not from the Kafka project, maybe I can join the discussion, as I have some experience here; usually we (I mean, the database) consume data from Kafka.
I am reading the current Kafka adapter's implementation, and will be back soon :D

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University
黄向东
清华大学 软件学院


Christofer Dutz <[email protected]> wrote on Sun, Nov 8, 2020 at 5:26 AM:

> Hi Ben,
>
> I have to admit that even if I created most of the connector, I have
> almost no experience with Kafka.
> I don't even understand the schema topic ;-)
>
> But it does make sense from your email, and I can imagine that the current
> way it's implemented could be an issue.
> I have no objections to changing this.
>
> Perhaps it would be cool to involve some people from the Kafka project?
> Perhaps call out on the [email protected] mailing list too? (I'd probably do
> both, if I were you)
>
> Chris
>
>
> On 07.11.20, 14:31, "Otto Fowler" <[email protected]> wrote:
>
>     Is it a breaking change? Is anyone using it?
>     I would be fine with changing it, but we would need to be clear on
>     those things.
>     Also, I think the names should reflect what the project uses:
>     plcfields, not fields, etc.
>
>     From: Ben Hutcheson <[email protected]>
>     Reply: [email protected]
>     Date: November 7, 2020 at 06:10:42
>     To: [email protected]
>     Subject: [DISCUSS] Changing Schema for Kafka Source Connector
>
>     Hi,
>
>     Putting together a doc for the schema for the source connector for
>     Kafka, it appears that the existing schema is something like this:
>
>     {
>       "type": "record",
>       "name": "source-connector",
>       "namespace": "com.apache.plc4x.kafka.config",
>       "fields": [
>         {
>           "name": "running",
>           "type": "int16"
>         },
>         {
>           "name": "conveyorLeft",
>           "type": "int16"
>         },
>         ....
>       ]
>     }
>
>     in which the schema needs to be modified depending on what tags are
>     being collected.
>
>     If we change it so that the tags are included in an array of name/value
>     pairs, then we wouldn't have to modify the schema when tags are
>     added/deleted. The new schema would be something like this:
>
>     {
>       "type": "record",
>       "name": "source-connector",
>       "namespace": "com.apache.plc4x.kafka.config",
>       "fields": [
>         {
>           "name": "tag",
>           "type": {
>             "type": "array",
>             "items": {
>               "name": "tag",
>               "type": "record",
>               "fields": [
>                 {"name": "name", "type": "string"},
>                 {"name": "value", "type": "string"} <- With this
>                     eventually being a union of different types.
>               ]
>             }
>           }
>         },
>         {
>           "name": "timestamp",
>           "type": "string"
>         }
>       ]
>     }
>
>     What are your thoughts on changing this? It would allow us not to have
>     to send the schema within each packet.
>
>     It does increase the packet size for the specific case where the tags
>     will never change and the schema isn't being included in each packet,
>     but I think such cases would be few and far between.
>
>     Ben
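To make the proposal concrete: the array-of-tags layout maps fairly
directly onto Kafka Connect's SchemaBuilder/Struct API. The sketch below
is only an illustration of that mapping, not code from the current
connector; the names TagSchemaSketch, TAG_SCHEMA, VALUE_SCHEMA and the
sample tag values are all made up for the example.

    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.data.SchemaBuilder;
    import org.apache.kafka.connect.data.Struct;

    import java.util.Arrays;

    public class TagSchemaSketch {

        // One entry of the "tag" array: a {name, value} pair. Both are
        // strings for now; per the proposal, "value" would eventually
        // become a union of different types.
        static final Schema TAG_SCHEMA = SchemaBuilder.struct()
            .name("tag")
            .field("name", Schema.STRING_SCHEMA)
            .field("value", Schema.STRING_SCHEMA)
            .build();

        // The record value: an array of tags plus a timestamp.
        static final Schema VALUE_SCHEMA = SchemaBuilder.struct()
            .name("source-connector")
            .field("tag", SchemaBuilder.array(TAG_SCHEMA).build())
            .field("timestamp", Schema.STRING_SCHEMA)
            .build();

        public static void main(String[] args) {
            Struct running = new Struct(TAG_SCHEMA)
                .put("name", "running")
                .put("value", "1");
            Struct conveyorLeft = new Struct(TAG_SCHEMA)
                .put("name", "conveyorLeft")
                .put("value", "0");

            // Adding or removing a tag only changes the list passed here;
            // VALUE_SCHEMA itself stays fixed, which is the point of the
            // proposed change.
            Struct value = new Struct(VALUE_SCHEMA)
                .put("tag", Arrays.asList(running, conveyorLeft))
                .put("timestamp", "2020-11-07T06:10:42Z");

            System.out.println(value);
        }
    }

With this shape, the schema can be registered once and reused for every
record, at the cost of repeating the tag names inside each message — the
trade-off Ben describes above.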
