Lee, which cluster manager do you use: standalone, yarn-cluster, yarn-client, or Mesos?
In yarn-cluster mode the driver program is executed inside one of the nodes in
the cluster, so it might be that the driver code needs to be serialized to be
sent to that node.
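
This isn't Spark-specific — plain JVM serialization shows the same capture behavior. A minimal sketch (all names here are hypothetical stand-ins, not Spark APIs): an anonymous class that reads a field of its enclosing object implicitly captures that object, so serializing the "task" drags the whole enclosing object into the serialized graph:

```java
import java.io.ByteArrayOutputStream;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// A serializable task type, analogous to the closures Spark ships to executors.
interface Task extends Runnable, Serializable {}

// Stand-in for a driver-side object; deliberately NOT Serializable.
class Driver {
    String appName = "demo";

    Task makeTask() {
        // The anonymous class reads `appName`, so it captures `Driver.this`.
        return new Task() {
            public void run() { System.out.println(appName); }
        };
    }
}

public class CaptureDemo {
    public static void main(String[] args) throws Exception {
        Task task = new Driver().makeTask();
        try {
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(task);
            System.out.println("serialized ok");
        } catch (NotSerializableException e) {
            // Serialization fails on the captured Driver, not on the task itself.
            System.out.println("NotSerializableException: " + e.getMessage());
        }
    }
}
```

Here the task itself is Serializable, but serializing it still fails because the captured `Driver` is not — which is exactly the shape of error people hit when a closure accidentally references the object holding the SparkContext.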

On 5 June 2015 at 22:55, Lee McFadden <splee...@gmail.com> wrote:

>
> On Fri, Jun 5, 2015 at 12:30 PM Marcelo Vanzin <van...@cloudera.com>
> wrote:
>
>> Ignoring the serialization thing (seems like a red herring):
>>
>
> People seem surprised that I'm getting the Serialization exception at all
> - I'm not convinced it's a red herring per se, but on to the blocking
> issue...
>
>
>> You might be using this Cassandra library with an incompatible version of
>> Spark; the `TaskMetrics` class has changed in the past, and the method it's
>> looking for does not exist at least in 1.4.
>>
>>
> You are correct, I was being a bonehead.  We recently downgraded to Spark
> 1.2.1, and I was running the compiled jar using Spark 1.3.1 on my local
> machine.  Running the job with threading on my 1.2.1 cluster worked.  Thank
> you for finding the obvious mistake :)
>
> Regarding serialization, I'm still confused as to why I was getting a
> serialization error in the first place, as I'm executing these Runnable
> classes from a Java thread pool.  I'm fairly new to the Scala/JVM world, and
> there doesn't seem to be any Spark documentation explaining *why* I need to
> declare the sc variable as @transient (or even that I should).
>
> I was under the impression that objects only need to be serializable when
> they are sent over the network, and that doesn't seem to be occurring as
> far as I can tell.
>
> Apologies if this is simple stuff, but I don't like "fixing things"
> without understanding why the changes I made actually fixed them :)
>
> Thanks again for your time!
>
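
On the @transient question: Java serialization walks every non-transient field of an object, so if a Runnable that gets serialized holds a reference to the SparkContext, the context is dragged into the serialized graph and the write fails. Marking the field @transient (Scala) / `transient` (Java) tells the serializer to skip it. A minimal plain-JVM sketch — `Context` and `Job` here are hypothetical stand-ins, not Spark classes:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for a SparkContext: deliberately NOT Serializable.
class Context {}

class Job implements Runnable, Serializable {
    // `transient` makes serialization skip this field entirely.
    // Without it, writeObject below would throw NotSerializableException,
    // because Context is not Serializable.
    transient Context sc = new Context();

    public void run() { /* would use sc on the driver side */ }
}

public class TransientDemo {
    public static void main(String[] args) throws Exception {
        // Succeeds: the transient sc field is simply not written.
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(new Job());
        System.out.println("serialized ok");
    }
}
```

Note the flip side: after deserialization a transient field is null, which is fine for a SparkContext that should only ever be used on the driver anyway.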
