This is just a query.

In general, Kafka Connect requires a means of registering the schema (for
example via a Schema Registry) so that producers and consumers agree on
it. It also allows schema evolution, i.e. changes to the metadata that
describes the structure of the data sent through a topic.
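
For illustration, a minimal sketch of registering and then evolving a
schema with the confluent-kafka Python client (the registry URL, subject
name and Avro schemas below are made up for the example):

from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

# Hypothetical registry URL and subject name
client = SchemaRegistryClient({"url": "http://localhost:8081"})

v1 = Schema(
    '{"type": "record", "name": "Event",'
    ' "fields": [{"name": "id", "type": "string"}]}',
    schema_type="AVRO",
)
client.register_schema("mytopic-value", v1)

# Schema evolution: adding a field with a default is a
# backward-compatible change, so this should print True
v2 = Schema(
    '{"type": "record", "name": "Event",'
    ' "fields": [{"name": "id", "type": "string"},'
    ' {"name": "amount", "type": "double", "default": 0.0}]}',
    schema_type="AVRO",
)
print(client.test_compatibility("mytopic-value", v2))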

When we stream a Kafka topic into Spark Structured Streaming (SSS), the
assumption is that by the time Spark processes the data, its structure
can be established. With foreachBatch, we create a DataFrame on top of
each incoming batch of JSON messages, and that DataFrame can be
interrogated. However, processing may fail if another column is added to
the topic and the consumer (in this case SSS) is not aware of it. How can
this change of schema be detected?
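
A minimal sketch of the kind of check I have in mind, assuming a PySpark
job (the broker address, topic name and expected_schema are placeholders):
inside foreachBatch, re-infer the schema from the batch's raw JSON
strings and compare it with the schema the job expects:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType

spark = SparkSession.builder.appName("schema-check").getOrCreate()

# Placeholder for the schema the job currently expects
expected_schema = StructType([StructField("id", StringType(), True)])

def process_batch(batch_df, batch_id):
    # batch_df is a static DataFrame with the usual Kafka columns
    if batch_df.rdd.isEmpty():
        return
    payload = batch_df.select(col("value").cast("string").alias("json"))
    # Re-infer the schema from this batch's raw JSON strings
    inferred = spark.read.json(payload.rdd.map(lambda r: r.json))
    new_cols = set(inferred.columns) - set(expected_schema.fieldNames())
    if new_cols:
        # A producer has started sending fields we do not know about
        print(f"Batch {batch_id}: unexpected columns {sorted(new_cols)}")
    parsed = (payload
              .select(from_json("json", expected_schema).alias("d"))
              .select("d.*"))
    # ... normal downstream processing on parsed ...

stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder
          .option("subscribe", "mytopic")                       # placeholder
          .load())

query = stream.writeStream.foreachBatch(process_batch).start()
query.awaitTermination()

Re-inferring the schema on every micro-batch has a cost (an extra pass
over the data), which is partly why I am asking whether there is a
standard way to verify schema changes.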

Thanks

LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
