Hi Ben,

I have to admit that even though I created most of the connector, I have almost 
no experience with Kafka.
I don't even understand the schema topic ;-)

But your email does make sense, and I can imagine that the way it's currently 
implemented could be an issue.
I have no objections to changing this. 

Perhaps it would be good to involve some people from the Kafka project? Perhaps 
also a call-out on the [email protected] mailing list? (I'd probably do both, if I 
were you.)

Chris



On 07.11.20, 14:31, "Otto Fowler" <[email protected]> wrote:

     Is it a breaking change?  Is anyone using it?
    I would be fine with changing it, but we would need to be clear on those
    things.
    Also, I think the names should reflect what the project uses:

    plcfields rather than fields, etc.

    From: Ben Hutcheson <[email protected]>
    Reply: [email protected]
    Date: November 7, 2020 at 06:10:42
    To: [email protected]
    Subject:  [DISCUSS] Changing Schema for Kafka Source Connector

    Hi,

    Putting together a doc on the schema for the Kafka source connector, it
    appears that the existing schema is something like this:

    {
        "type": "record",
        "name": "source-connector",
        "namespace": "com.apache.plc4x.kafka.config",
        "fields": [
            {
                "name": "running",
                "type": "int16"
            },
            {
                "name": "conveyorLeft",
                "type": "int16"
            },
            ....
        ]
    }

    With this layout, the schema needs to be modified depending on which tags
    are being collected.
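    To illustrate the problem, here is a minimal Python sketch (a hypothetical
    helper, not connector code) that builds the field-per-tag schema as a dict:
    any change to the tag list yields a different schema.

    ```python
    import json

    # Hypothetical helper, not part of the connector: builds the
    # field-per-tag schema, so the schema is tied to the tag list.
    def build_schema(tags):
        return {
            "type": "record",
            "name": "source-connector",
            "namespace": "com.apache.plc4x.kafka.config",
            "fields": [{"name": t, "type": "int16"} for t in tags],
        }

    before = build_schema(["running", "conveyorLeft"])
    after = build_schema(["running", "conveyorLeft", "conveyorRight"])
    # Adding one tag produces a different schema, which consumers must
    # then obtain again.
    print(json.dumps(before) == json.dumps(after))  # False
    ```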

    If we change it so that the tags are included in an array of name/value
    pairs, then we wouldn't have to modify the schema when tags are
    added or deleted. The new schema would be something like this:

    {
        "type": "record",
        "name": "source-connector",
        "namespace": "com.apache.plc4x.kafka.config",
        "fields": [
            {
                "name": "tag",
                "type": {
                    "type": "array",
                    "items": {
                        "name": "tag",
                        "type": "record",
                        "fields": [
                            {"name": "name", "type": "string"},
                            {"name": "value", "type": "string"} <- with this
                            eventually being a union of different types
                        ]
                    }
                }
            },
            {
                "name": "timestamp",
                "type": "string"
            }
        ]
    }

    What are your thoughts on changing this? It would mean we wouldn't have to
    send the schema within each packet.

    It does increase the packet size for the specific case where the tags
    never change and the schema isn't included in each packet, but I think
    such cases would be few and far between.
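    The size trade-off can be sketched in Python (the record contents below
    are made-up examples, not actual connector output): the array-of-pairs
    record stays valid under one fixed schema, but repeats the tag names in
    every record.

    ```python
    import json

    # Field-per-tag record: compact, but its schema changes with the tag set.
    fixed_record = {"running": 1, "conveyorLeft": 0}

    # Array-of-pairs record: one schema covers any tag set, but each
    # record carries the tag names, so the payload is larger.
    array_record = {
        "tag": [
            {"name": "running", "value": "1"},
            {"name": "conveyorLeft", "value": "0"},
        ],
        "timestamp": "2020-11-07T14:31:00Z",
    }

    print(len(json.dumps(fixed_record)))  # smaller payload
    print(len(json.dumps(array_record)))  # larger payload, stable schema
    ```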

    Ben
