On Fri, Jun 5, 2015 at 12:30 PM Marcelo Vanzin <van...@cloudera.com> wrote:
> Ignoring the serialization thing (seems like a red herring):

People seem surprised that I'm getting the Serialization exception at all - I'm not convinced it's a red herring per se, but on to the blocking issue...

> You might be using this Cassandra library with an incompatible version of
> Spark; the `TaskMetrics` class has changed in the past, and the method it's
> looking for does not exist at least in 1.4.

You are correct, I was being a bonehead. We recently downgraded to Spark 1.2.1, and I was running the compiled jar with Spark 1.3.1 on my local machine. Running the job with threading on my 1.2.1 cluster worked. Thank you for finding the obvious mistake :)

Regarding serialization, I'm still confused as to why I was getting a serialization error in the first place, since I'm executing these Runnable classes from a Java thread pool. I'm fairly new to the Scala/JVM world, and there doesn't seem to be any Spark documentation explaining *why* I need to declare the sc variable as @transient (or even that I should). I was under the impression that objects only need to be serializable when they are sent over the network, and as far as I can tell that isn't happening here.

Apologies if this is simple stuff, but I don't like "fixing things" without knowing the full reason why the changes I made fixed things :)

Thanks again for your time!
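For what it's worth, the usual explanation is that Spark serializes the closures you pass to RDD operations so it can ship them to executors, and a closure defined inside a class captures `this` - so the whole enclosing object, including its SparkContext field, gets dragged into serialization even though you never meant to send it anywhere. Marking the field `@transient` tells Java serialization to skip it. Here is a minimal, Spark-free sketch of that mechanism using plain JVM serialization; `FakeContext` is a hypothetical stand-in for SparkContext, not a real Spark class:

```java
import java.io.*;

public class TransientDemo {

    // Hypothetical stand-in for SparkContext: holds live resources,
    // deliberately NOT Serializable.
    static class FakeContext { }

    // A "job" class whose field drags the context into serialization.
    // Writing this object out fails, because every non-transient field
    // must itself be serializable.
    static class JobWithContext implements Serializable {
        FakeContext ctx = new FakeContext();
    }

    // Same class, but the field is transient (Scala's @transient compiles
    // to this): serialization skips the field, so the write succeeds.
    static class JobWithTransientContext implements Serializable {
        transient FakeContext ctx = new FakeContext();
    }

    // Try to serialize an object the way closure shipping would;
    // report whether it worked.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new JobWithContext()));          // false
        System.out.println(serializes(new JobWithTransientContext())); // true
    }
}
```

So the object isn't serialized because of the thread pool at all - it's serialized because a closure referencing an instance field pulls its owner along for the ride, and `@transient` just carves the context back out of that payload (the field is then null after deserialization, which is fine on the driver where it's never needed remotely).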