Gian mentioned MSQ. The new MSQ work is exciting and powerful for Druid
ingestion. If the data needs cleaning, we would expect users to employ
something like Spark to do that task, then emit clean data to Kafka or files,
which Druid MSQ can ingest. That is:
Dirty data -> Spark -> Kafka/Files ->
Hi Julian,
Thank you so much for your contribution to Spark support. As an existing
committer, I would like to help get the Spark connector merged into OSS
(including PR reviews and any other development work that may be needed).
We can move the conversation regarding Spark support into a new thre