Have you tried instantiating the instance inside the closure, rather than outside of it?
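For concreteness, here is a minimal, Spark-free sketch of that difference. Everything in it is a hypothetical stand-in (the `NonSerializableInstance` class mimics a non-serializable Scalaz instance like the one in the stack trace below; none of these names come from Scalaz or Spark). A closure that captures an instance built outside it drags that instance along when Java-serialized, which is exactly what Spark does to closures before shipping them; a closure that builds the instance in its own body captures nothing and serializes fine.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable typeclass instance,
// playing the role of scalaz.std.OptionInstances$$anon$1.
class NonSerializableInstance {
  def point[A](a: A): Option[A] = Some(a)
}

object ClosureDemo {
  // True if Java serialization (what Spark applies to closures) accepts obj.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
      true
    } catch { case _: NotSerializableException => false }

  // Instance created OUTSIDE the closure: the function value captures it,
  // so serializing the function fails.
  def capturing: Int => Option[Int] = {
    val g = new NonSerializableInstance
    x => g.point(x)
  }

  // Instance created INSIDE the closure body: the function value captures
  // nothing non-serializable, so it serializes fine.
  def selfContained: Int => Option[Int] =
    x => (new NonSerializableInstance).point(x)
}
```

On a real RDD the same idea would look like `rdd.mapPartitions { iter => val G = /* build instance here */; iter.map(...) }`, which also only builds the instance once per partition instead of once per element.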
If that works, you may need to switch to use mapPartitions / foreachPartition for efficiency reasons.

On Mon, Mar 23, 2015 at 3:03 PM, Adelbert Chang <adelbe...@gmail.com> wrote:

> Is there no way to pull out the bits of the instance I want before I send
> it through the closure for aggregate? I did try pulling things out, along
> the lines of
>
>     def foo[G[_], B](blah: Blah)(implicit G: Applicative[G]) = {
>       val lift: B => G[RDD[B]] = b =>
>         G.point(sparkContext.parallelize(List(b)))
>
>       rdd.aggregate(/* use lift in here */)
>     }
>
> But that doesn't seem to work either; it still seems to be trying to
> serialize the Applicative... :(
>
> On Mon, Mar 23, 2015 at 12:27 PM, Dean Wampler <deanwamp...@gmail.com>
> wrote:
>
>> Well, it's complaining about trait OptionInstances, which is defined in
>> Option.scala in the std package. Use scalap or javap on the scalaz
>> library to find out which member of the trait is the problem, but since
>> it says "$$anon$1", I suspect it's the first value member, "implicit val
>> optionInstance", which has a long list of mixin traits, one of which is
>> probably at fault. OptionInstances is huge, so there might be other
>> offenders.
>>
>> Scalaz wasn't designed for distributed systems like this, so you'll
>> probably find many examples of nonserializability. An alternative is to
>> avoid using Scalaz in any closures passed to Spark methods, but that's
>> probably not what you want.
>>
>> dean
>>
>> Dean Wampler, Ph.D.
>> Author: Programming Scala, 2nd Edition
>> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
>> Typesafe <http://typesafe.com>
>> @deanwampler <http://twitter.com/deanwampler>
>> http://polyglotprogramming.com
>>
>> On Mon, Mar 23, 2015 at 12:03 PM, adelbertc <adelbe...@gmail.com> wrote:
>>
>>> Hey all,
>>>
>>> I'd like to use the Scalaz library in some of my Spark jobs, but am
>>> running into issues where some stuff I use from Scalaz is not
>>> serializable.
>>> For instance, in Scalaz there is a trait
>>>
>>>     /** In Scalaz */
>>>     trait Applicative[F[_]] {
>>>       def apply2[A, B, C](fa: F[A], fb: F[B])(f: (A, B) => C): F[C]
>>>       def point[A](a: => A): F[A]
>>>     }
>>>
>>> But when I try to use it in, say, an `RDD#aggregate` call I get:
>>>
>>>     Caused by: java.io.NotSerializableException:
>>>         scalaz.std.OptionInstances$$anon$1
>>>     Serialization stack:
>>>     - object not serializable (class: scalaz.std.OptionInstances$$anon$1,
>>>       value: scalaz.std.OptionInstances$$anon$1@4516ee8c)
>>>     - field (class: dielectric.syntax.RDDOps$$anonfun$1, name: G$1,
>>>       type: interface scalaz.Applicative)
>>>     - object (class dielectric.syntax.RDDOps$$anonfun$1, <function2>)
>>>     - field (class: dielectric.syntax.RDDOps$$anonfun$traverse$extension$1,
>>>       name: apConcat$1, type: interface scala.Function2)
>>>     - object (class dielectric.syntax.RDDOps$$anonfun$traverse$extension$1,
>>>       <function2>)
>>>
>>> Outside of submitting a PR to Scalaz to make things Serializable, what
>>> can I do to make things Serializable? I considered something like
>>>
>>>     implicit def applicativeSerializable[F[_]](implicit F: Applicative[F]):
>>>         SomeSerializableType[F] =
>>>       new SomeSerializableType { ... } ??
>>>
>>> Not sure how to go about doing it - I looked at java.io.Externalizable
>>> but given `scalaz.Applicative` has no value members I'm not sure how to
>>> implement the interface.
>>>
>>> Any guidance would be much appreciated - thanks!
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Getting-around-Serializability-issues-for-types-not-in-my-control-tp22193.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> --
> Adelbert (Allen) Chang
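The wrapper the original post gropes toward (`SomeSerializableType`) can be sketched as follows. Since the instance itself cannot be made serializable from the outside, one workaround is to serialize a zero-argument factory instead and rebuild the instance lazily after deserialization with a `@transient lazy val`. Everything below is a hypothetical, self-contained stand-in: the `Applicative` trait mimics the Scalaz one from the post, and `SerializableApplicative` is an assumed name, not an existing Scalaz or Spark API.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
                ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in mimicking the Scalaz trait quoted in the post.
trait Applicative[F[_]] {
  def point[A](a: => A): F[A]
}

object Instances {
  // Like scalaz's optionInstance: perfectly usable, but the anonymous
  // class does NOT extend Serializable.
  def optionApplicative: Applicative[Option] = new Applicative[Option] {
    def point[A](a: => A): Option[A] = Some(a)
  }
}

// The wrapper: ship a (serializable) zero-argument factory instead of the
// instance. @transient skips the cached field during serialization; lazy
// re-runs the factory on first access after deserialization, so the
// receiving side rebuilds the instance locally.
class SerializableApplicative[F[_]](factory: () => Applicative[F])
    extends Serializable {
  @transient lazy val value: Applicative[F] = factory()
}

object Demo {
  // Java-serialization round trip, standing in for Spark shipping a
  // closure to an executor.
  def roundTrip[A <: AnyRef](a: A): A = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(a)
    out.close()
    new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      .readObject().asInstanceOf[A]
  }
}
```

In the `foo` example earlier in the thread, the idea would be to take a `SerializableApplicative[G]` implicitly instead of `Applicative[G]` and call `.value` inside the closure, so the executor rebuilds the instance rather than receiving it over the wire.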