Hi Piotr,

The easiest solution for now is to write all your code (including the
case class) inside an object, with the execution part in a method of that
object. Then you can call the method from the Spark shell to execute your
code.
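
Something like this (just a rough sketch of the idea; the object and method
names are placeholders, and the body stands in for whatever your job does):

object Analysis {
  // the class and the code that uses it live together in one object
  case class MyRow(key: String, data: Int)

  // the execution part goes into a method you call from the shell
  def run(sc: org.apache.spark.SparkContext): Unit = {
    val rdd = sc.parallelize(Seq(MyRow("a", 1), MyRow("b", 2)))
    println(rdd.count())
  }
}

Then on the shell you just call Analysis.run(sc).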

Cheers,
Rohit


Founder & CEO, Tuplejump, Inc.
____________________________
www.tuplejump.com
The Data Engineering Platform


On Fri, Apr 25, 2014 at 12:58 PM, Piotr Kołaczkowski
<pkola...@datastax.com> wrote:

> Yeah, this is related.
>
> From
> https://groups.google.com/forum/#!msg/spark-users/bwAmbUgxWrA/HwP4Nv4adfEJ:
> "This is a limitation that will hopefully go away in Scala 2.10 or 2.10.1,
> when we'll use macros to remove the need to do this. (Or more generally if
> we get some changes in the Scala interpreter to do something smarter in
> this case.)"
>
> We're using Spark 0.9.0, Scala 2.10.3 and the limitation is there. Any
> ideas when it is going to be fixed?
>
> The workaround of embedding everything inside a singleton object does not
> work for me, because nested classes defined there are still inner and
> require an additional argument to the constructor (when invoked by
> reflection).
>
> If I only had some reliable way to obtain a reference to that outer object
> by reflection, we could work around it somehow, e.g. by saving it in some
> singleton object. However, a proper fix would be to make non-inner
> classes properly non-inner.
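>
> For illustration, the kind of hack I have in mind (purely a sketch; it
> relies on the synthetic $outer field that Scala generates for inner
> classes, and assumes we already hold one sample instance created in the
> REPL):
>
> // recover the REPL line's wrapper object from an existing sample instance
> def outerOf(sample: AnyRef): Option[AnyRef] =
>   sample.getClass.getDeclaredFields
>     .find(_.getName == "$outer")
>     .map { f => f.setAccessible(true); f.get(sample) }
>
> // prepend the outer reference when invoking the constructor reflectively
> def newInstance[T](clazz: Class[T], outer: Option[AnyRef], args: AnyRef*): T = {
>   val ctor = clazz.getConstructors()(0)
>   ctor.newInstance((outer.toSeq ++ args): _*).asInstanceOf[T]
> }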
>
> Thanks,
> Piotr
>
>
>
>
> 2014-04-25 0:13 GMT+02:00 Michael Armbrust <mich...@databricks.com>:
>
> > The Spark REPL is slightly modified from the normal Scala REPL to prevent
> > work from being done twice when closures are deserialized on the workers.
> > I'm not sure exactly why this causes your problem, but it's probably worth
> > filing a JIRA about it.
> >
> > Here is another issue with classes defined in the REPL.  Not sure if it is
> > related, but I'd be curious whether the workaround helps you:
> > https://issues.apache.org/jira/browse/SPARK-1199
> >
> > Michael
> >
> >
> > On Thu, Apr 24, 2014 at 3:14 AM, Piotr Kołaczkowski
> > <pkola...@datastax.com> wrote:
> >
> > > Hi,
> > >
> > > I'm working on Cassandra-Spark integration and I've hit a pretty severe
> > > problem. One piece of the provided functionality is mapping Cassandra
> > > rows to objects of user-defined classes, e.g. like this:
> > >
> > > class MyRow(val key: String, val data: Int)
> > > sc.cassandraTable("keyspace", "table").select("key", "data").as[MyRow]
> > >   // returns CassandraRDD[MyRow]
> > >
> > > In this example CassandraRDD creates MyRow instances by reflection, i.e.
> > > it matches the selected fields from the Cassandra table and passes them
> > > to the constructor.
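> > >
> > > A minimal sketch of that reflective construction (a hypothetical helper,
> > > not the actual connector code; it assumes the selected column values
> > > arrive already in constructor order):
> > >
> > > def instantiate[T](clazz: Class[T], columnValues: Seq[AnyRef]): T = {
> > >   // take the (only) public constructor and pass the column values to it
> > >   val ctor = clazz.getConstructors()(0)
> > >   ctor.newInstance(columnValues: _*).asInstanceOf[T]
> > > }
> > >
> > > // e.g. instantiate(classOf[MyRow], Seq("key1", Int.box(42)))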
> > >
> > > Unfortunately this does not work in the Spark REPL.
> > > It turns out that any class declared in the REPL is an inner class, and
> > > to be created successfully it needs a reference to the outer object, even
> > > though it doesn't really use anything from the outer context.
> > >
> > > scala> class SomeClass
> > > defined class SomeClass
> > >
> > > scala> classOf[SomeClass].getConstructors()(0)
> > > res11: java.lang.reflect.Constructor[_] = public
> > > $iwC$$iwC$SomeClass($iwC$$iwC)
> > >
> > > I tried passing null as a temporary workaround, but that doesn't work
> > > either - I get an NPE.
> > > How can I get a reference to the current outer object representing the
> > > context of the current line?
> > >
> > > Also, the plain non-Spark Scala REPL doesn't exhibit this behaviour -
> > > classes declared there are proper top-level classes, not inner ones.
> > > Why?
> > >
> > > Thanks,
> > > Piotr
> > >
> > > --
> > > Piotr Kolaczkowski, Lead Software Engineer
> > > pkola...@datastax.com
> > >
> > > 777 Mariners Island Blvd., Suite 510
> > > San Mateo, CA 94404
> > >
> >
>
>
>
> --
> Piotr Kolaczkowski, Lead Software Engineer
> pkola...@datastax.com
>
> 777 Mariners Island Blvd., Suite 510
> San Mateo, CA 94404
>
