Looks like you are trying to apply this class/function across Spark, but it
contains a non-serialized object, the connection. That has to be
initialized on use, otherwise you try to send it from the driver and that
can't work.

On Mon, Mar 21, 2022 at 11:51 AM guillaume farcy <
guillaume.fa...@imt-atlantique.net> wrote:

> Hello,
>
> I am a student and I am currently doing a big data project.
> Here is my code:
> https://gist.github.com/Balykoo/262d94a7073d5a7e16dfb0d0a576b9c3
>
> My project is to retrieve messages from a twitch chat and send them into
> kafka then spark reads the kafka topic to perform the processing in the
> provided gist.
>
> I will want to send these messages into cassandra.
>
> I tested a first solution on line 72 which works but when there are too
> many messages spark crashes. Probably due to the fact that my function
> connects to cassandra each time it is called.
>
> I tried the object approach to mutualize the connection object but
> without success:
> _pickle.PicklingError: Could not serialize object: TypeError: cannot
> pickle '_thread.RLock' object
>
> Can you please tell me how to do this?
> Or at least give me some advice?
>
> Sincerely,
> FARCY Guillaume.
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>

Reply via email to