Re: [Spark SQL] Structured Streaming in Python can connect to Cassandra?

2022-03-25 Thread Gourav Sengupta
Hi, I completely agree with Alex. Also, if you are just writing to Cassandra, what is the purpose of writing to a Kafka broker? People often seem to think that adding more components to their architecture is great, but sadly it is not. Remove the Kafka broker, in case you are not

Re: [Spark SQL] Structured Streaming in Python can connect to Cassandra?

2022-03-25 Thread Alex Ott
You don't need to use foreachBatch to write to Cassandra. You just need to use Spark Cassandra Connector version 2.5.0 or higher - it supports native writing of stream data into Cassandra. Here is an announcement: https://www.datastax.com/blog/advanced-apache-cassandra-analytics-now-open-all
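
A minimal sketch of what such a native streaming write might look like in PySpark. It assumes Spark Cassandra Connector 2.5.0 or higher is available to the Spark job and that the target table already exists; the helper names and the keyspace, table, and checkpoint arguments are hypothetical, not taken from the original poster's code.

```python
# Sketch only: assumes Spark Cassandra Connector 2.5.0+ is available to Spark
# and that the target keyspace and table already exist in Cassandra.

def cassandra_sink_options(keyspace, table, checkpoint_dir):
    """Collect the options passed to the Cassandra streaming sink."""
    return {
        "keyspace": keyspace,
        "table": table,
        # Structured Streaming needs a checkpoint directory to track progress.
        "checkpointLocation": checkpoint_dir,
    }


def write_stream_to_cassandra(stream_df, keyspace, table, checkpoint_dir):
    """Start a streaming query that appends the rows of stream_df to Cassandra.

    stream_df is expected to be a streaming DataFrame whose column names
    match the columns of the target table.
    """
    options = cassandra_sink_options(keyspace, table, checkpoint_dir)
    return (
        stream_df.writeStream
        .format("org.apache.spark.sql.cassandra")
        .options(**options)
        .outputMode("append")
        .start()
    )
```

With a streaming DataFrame read from Kafka and parsed into the table's columns, a call such as write_stream_to_cassandra(parsed_df, "my_keyspace", "my_table", "/tmp/checkpoints") would return the running query, with no foreachBatch involved. All three argument values here are placeholders.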

Re: [Spark SQL] Structured Streaming in Python can connect to Cassandra?

2022-03-21 Thread Mich Talebzadeh
Dear student, check this article of mine on LinkedIn: "Processing Change Data Capture with Spark Structured Streaming". There is a link to GitHub

Re: [Spark SQL] Structured Streaming in Python can connect to Cassandra?

2022-03-21 Thread Sean Owen
Looks like you are trying to apply this class/function across Spark, but it contains a non-serializable object, the connection. That has to be initialized on use; otherwise you try to send it from the driver, and that can't work. On Mon, Mar 21, 2022 at 11:51 AM guillaume farcy <
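
The "initialize on use" idea can be sketched in plain Python. This is an illustration only: FakeConnection and LazyWriter are hypothetical stand-ins, not the classes from the original gist, and the standard pickle module is used here to mimic what happens when Spark ships an object from the driver to the executors.

```python
import pickle
import threading


class FakeConnection:
    """Hypothetical stand-in for a database session.

    It holds a lock, so, like a real network connection, it cannot be pickled.
    """

    def __init__(self, host):
        self.host = host
        self._lock = threading.Lock()
        self.rows = []

    def insert(self, row):
        with self._lock:
            self.rows.append(row)


class LazyWriter:
    """Carries only plain configuration; opens the connection on first use."""

    def __init__(self, host):
        self.host = host
        self._conn = None  # nothing is connected on the driver side

    def __getstate__(self):
        # Never ship a live connection: serialize the configuration only.
        state = self.__dict__.copy()
        state["_conn"] = None
        return state

    def _connection(self):
        # Created where the writer is actually used, not where it was built.
        if self._conn is None:
            self._conn = FakeConnection(self.host)
        return self._conn

    def process(self, row):
        self._connection().insert(row)
```

Because the connection is created inside process rather than in the constructor, a LazyWriter survives a round trip through pickle.dumps and pickle.loads, whereas a FakeConnection on its own does not. The same pattern applies to a class whose constructor opens a Cassandra session: move that call into the method that runs on the executors.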

[Spark SQL] Structured Streaming in Python can connect to Cassandra?

2022-03-21 Thread guillaume farcy
Hello, I am a student and I am currently doing a big data project. Here is my code: https://gist.github.com/Balykoo/262d94a7073d5a7e16dfb0d0a576b9c3 My project is to retrieve messages from a Twitch chat and send them into Kafka; Spark then reads the Kafka topic to perform the processing in the