Hello,

I am a student and I am currently doing a big data project.
Here is my code:
https://gist.github.com/Balykoo/262d94a7073d5a7e16dfb0d0a576b9c3

My project retrieves messages from a Twitch chat and sends them to Kafka. Spark then reads the Kafka topic and performs the processing shown in the gist.

I now want to write these messages to Cassandra.

I tested a first solution (line 72 of the gist), which works, but Spark crashes when there are too many messages. This is probably because my function opens a new connection to Cassandra each time it is called.

I also tried an object-based approach to share a single connection object, but without success: _pickle.PicklingError: Could not serialize object: TypeError: cannot pickle '_thread.RLock' object
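
For reference, this kind of PicklingError typically appears when a connection object created on the driver is captured by a function that Spark has to serialize and ship to the executors. A pattern often suggested for this is to open the connection inside the function that runs on each partition, so that nothing unpicklable crosses the driver/executor boundary and only one connection is opened per partition rather than per message. Below is a minimal sketch of that idea, not the code from the gist. It assumes the DataStax cassandra-driver package; the contact point, the keyspace "twitch", the table "messages", and the columns "user" and "message" are hypothetical names chosen for illustration.

```python
def default_session_factory():
    """Open a Cassandra connection on the executor (hypothetical settings)."""
    # Imported lazily so the driver is only loaded where the function runs.
    from cassandra.cluster import Cluster  # assumes DataStax cassandra-driver

    cluster = Cluster(["127.0.0.1"])     # hypothetical contact point
    session = cluster.connect("twitch")  # hypothetical keyspace
    return cluster, session


def write_partition(rows, session_factory=default_session_factory):
    """Write every row of one partition using a single connection."""
    cluster, session = session_factory()
    try:
        # Hypothetical table and columns; prepared once per partition.
        insert = session.prepare(
            "INSERT INTO messages (user, message) VALUES (?, ?)"
        )
        written = 0
        for row in rows:
            session.execute(insert, (row["user"], row["message"]))
            written += 1
        return written
    finally:
        # Always release the connection, even if a write fails.
        cluster.shutdown()


def write_batch(batch_df, batch_id):
    """Callback for foreachBatch: one connection per partition of the batch."""
    batch_df.foreachPartition(write_partition)
```

If the job uses Structured Streaming, this could be wired in with something like df.writeStream.foreachBatch(write_batch).start(). With the older DStream API, the equivalent would be calling rdd.foreachPartition(write_partition) inside foreachRDD. Which of the two applies depends on how the gist reads from Kafka.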

Can you please tell me how to do this?
Or at least give me some advice?

Sincerely,
FARCY Guillaume.


