Hi Piotr,

The easiest solution to this for now is to write all your code (including the case class) inside an object, with the execution part in a method on that object. Then you can call the method from the spark shell to execute your code.
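For example, something along these lines (just a sketch, using the cassandraTable/select/as API from your example below; the connector imports are omitted and I haven't run this):

import org.apache.spark.SparkContext

object MyAnalysis {
  // The row class lives inside a named object instead of being typed
  // directly at the REPL prompt.
  class MyRow(val key: String, val data: Int)

  // Put the execution code in a method and pass in the SparkContext
  // from the shell.
  def run(sc: SparkContext): Long =
    sc.cassandraTable("keyspace", "table").select("key", "data").as[MyRow].count()
}

Then in the spark shell (e.g. after pasting the object with :paste):

MyAnalysis.run(sc)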
Cheers,
Rohit

*Founder & CEO, **Tuplejump, Inc.*
____________________________
www.tuplejump.com
*The Data Engineering Platform*

On Fri, Apr 25, 2014 at 12:58 PM, Piotr Kołaczkowski <pkola...@datastax.com> wrote:

> Yeah, this is related.
>
> From
> https://groups.google.com/forum/#!msg/spark-users/bwAmbUgxWrA/HwP4Nv4adfEJ :
> "This is a limitation that will hopefully go away in Scala 2.10 or 2.10.1,
> when we'll use macros to remove the need to do this. (Or more generally if
> we get some changes in the Scala interpreter to do something smarter in
> this case.)"
>
> We're using Spark 0.9.0 and Scala 2.10.3, and the limitation is still there.
> Any ideas when it is going to be fixed?
>
> The workaround of embedding everything inside a singleton object does not
> work for me, because nested classes defined there are still inner classes
> and require an additional argument to the constructor (when invoked by
> reflection).
>
> If I only had some reliable way to obtain a reference to that outer object
> by reflection, we could somehow work around it, e.g. by saving it in some
> singleton object. However, a proper fix would be to make non-inner classes
> properly non-inner.
>
> Thanks,
> Piotr
>
>
> 2014-04-25 0:13 GMT+02:00 Michael Armbrust <mich...@databricks.com>:
>
> > The Spark REPL is slightly modified from the normal Scala REPL to prevent
> > work from being done twice when closures are deserialized on the workers.
> > I'm not sure exactly why this causes your problem, but it's probably worth
> > filing a JIRA about it.
> >
> > Here is another issue with classes defined in the REPL. Not sure if it is
> > related, but I'd be curious if the workaround helps you:
> > https://issues.apache.org/jira/browse/SPARK-1199
> >
> > Michael
> >
> >
> > On Thu, Apr 24, 2014 at 3:14 AM, Piotr Kołaczkowski
> > <pkola...@datastax.com> wrote:
> >
> > > Hi,
> > >
> > > I'm working on Cassandra-Spark integration and I hit a pretty severe
> > > problem. One piece of the provided functionality is mapping Cassandra
> > > rows into objects of user-defined classes, e.g. like this:
> > >
> > > class MyRow(val key: String, val data: Int)
> > > sc.cassandraTable("keyspace", "table").select("key", "data").as[MyRow]
> > > // returns CassandraRDD[MyRow]
> > >
> > > In this example CassandraRDD creates MyRow instances by reflection,
> > > i.e. it matches the selected fields from the Cassandra table and passes
> > > them to the constructor.
> > >
> > > Unfortunately this does not work in the Spark REPL.
> > > It turns out any class declared in the REPL is an inner class, and to
> > > be successfully created, it needs a reference to the outer object, even
> > > though it doesn't really use anything from the outer context:
> > >
> > > scala> class SomeClass
> > > defined class SomeClass
> > >
> > > scala> classOf[SomeClass].getConstructors()(0)
> > > res11: java.lang.reflect.Constructor[_] = public
> > > $iwC$$iwC$SomeClass($iwC$$iwC)
> > >
> > > I tried passing a null as a temporary workaround, and it also doesn't
> > > work - I get an NPE.
> > > How can I get a reference to the current outer object representing the
> > > context of the current line?
> > >
> > > Also, the plain non-Spark Scala REPL doesn't exhibit this behaviour -
> > > classes declared there are proper top-level classes, not inner ones.
> > > Why?
> > > Thanks,
> > > Piotr
>
>
> --
> Piotr Kolaczkowski, Lead Software Engineer
> pkola...@datastax.com
>
> 777 Mariners Island Blvd., Suite 510
> San Mateo, CA 94404
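P.S. On the idea of obtaining the outer reference by reflection: a rough, untested sketch along the lines of "saving it somewhere" would be to build one instance normally at the REPL and read its synthetic $outer field (assuming the compiler actually keeps that field on REPL-defined classes, which I haven't verified):

class MyRow(val key: String, val data: Int)   // declared at the REPL prompt

// Construct one instance the normal way, then pull out its outer pointer.
val sample = new MyRow("dummy", 0)
val outerField = sample.getClass.getDeclaredFields.find(_.getName == "$outer")
outerField.foreach(_.setAccessible(true))
val outer: AnyRef = outerField.map(_.get(sample)).orNull

// Reflective construction then supplies the outer reference as the first
// argument, just like the $iwC$$iwC$SomeClass($iwC$$iwC) constructor above expects.
val ctor = classOf[MyRow].getConstructors()(0)
val row  = ctor.newInstance(outer, "key1", Integer.valueOf(42)).asInstanceOf[MyRow]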