Instead of foreach, try foreachPartition; that way the connector is
initialized once per partition rather than once per record.
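A rough sketch of the per-partition pattern, written in Python for brevity (the Scala and Java RDD APIs have a method of the same name). FakeConnector is a made-up stand-in for whatever non-serializable client you actually use, and its counters exist only to make the behaviour visible; treat this as an illustration under those assumptions, not as the connector's real API.

```python
class FakeConnector:
    """Hypothetical stand-in for a non-serializable client (e.g. a DB session)."""

    opened = 0   # how many connectors have been created (illustration only)
    sink = []    # records "written" so far (illustration only)

    def __init__(self):
        FakeConnector.opened += 1
        self.closed = False

    def write(self, record):
        FakeConnector.sink.append(record)

    def close(self):
        self.closed = True


def write_partition(records):
    """Write all records of one partition through a single connector."""
    connector = FakeConnector()  # created on the worker, once per partition
    try:
        for record in records:
            connector.write(record)
    finally:
        connector.close()  # always release the connection


# With a real RDD this would be passed as:
#     rdd.foreachPartition(write_partition)
```

Calling write_partition(iter([...])) locally on a couple of lists is enough to check that one connector is opened per list rather than one per element.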
Thanks
Best Regards
On Fri, Aug 14, 2015 at 1:13 PM, Dawid Wysakowicz
wysakowicz.da...@gmail.com wrote:
No, the connector does not need to be serializable because it is
-- Forwarded message --
From: Dawid Wysakowicz wysakowicz.da...@gmail.com
Date: 2015-08-14 9:32 GMT+02:00
Subject: Re: Using unserializable classes in tasks
To: mark manwoodv...@googlemail.com
I am not an expert, but first of all check whether a ready-made connector
already exists (you mentioned
I have a Spark job that computes some values and needs to write those
values to a data store. The classes that write to the data store are not
serializable (e.g., Cassandra session objects).
I don't want to collect all the results at the driver; I want each worker
to write the data - what is
No, the connector does not need to be serializable because it is constructed
on the worker. Only objects captured in a closure sent from the driver, or
shuffled across partitions, need to be serializable.
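To make the distinction concrete, here is a small Python sketch. It assumes a hypothetical Connector class whose lock stands in for a live socket or session, and it uses the standard pickle module as a stand-in for whatever serializer Spark applies to closures. The point is only that plain settings can travel to the worker while the live object is built there.

```python
import pickle
import threading


class Connector:
    """Hypothetical client holding a live resource that cannot be serialized."""

    def __init__(self, config):
        self.config = config
        self._lock = threading.Lock()  # stands in for a socket/session
        self.rows = []

    def write(self, record):
        with self._lock:
            self.rows.append(record)


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError):
        return False


def make_writer(config):
    """Build a per-partition function that captures only the plain config."""

    def write_partition(records):
        connector = Connector(config)  # constructed on the worker
        for record in records:
            connector.write(record)
        return len(connector.rows)  # returned only so it can be checked locally

    return write_partition


# Hypothetical settings: a plain dict is safe to capture in a closure.
config = {"host": "localhost", "port": 9042}
```

Here is_picklable(config) holds while is_picklable(Connector(config)) does not, so the closure returned by make_writer(config) carries only the dict and creates the connector where it runs.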
2015-08-14 9:40 GMT+02:00 mark manwoodv...@googlemail.com:
I guess I'm looking for a more general way to use complex graphs of
objects that