On a more serious note -- yes, Datasets break with Scala value classes in
Spark 2.0.0 and Spark 1.6.1. I wrote up a JIRA bug and I hope some more
knowledgeable people can look at it.

Sean Owen has commented on other code generation errors before, so I put him
as shepherd in JIRA. Michael Armbrust has expressed interest in the code
generation / encoder problems I have found recently.

Here is the problem:

https://issues.apache.org/jira/browse/SPARK-17368
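For anyone following along, here is a minimal sketch of the failure and the workaround from the thread below (the `SparkSession` setup is my own boilerplate; names like `FeatureIdBoxed` are illustrative, not from the JIRA):

```scala
import org.apache.spark.sql.SparkSession

object ValueClassRepro {
  // Value class: compiles, but the Dataset encoder's generated code
  // fails at runtime in Spark 1.6.1 / 2.0.0 (see SPARK-17368).
  case class FeatureId(value: Int) extends AnyVal

  // Workaround: a plain case class (no `extends AnyVal`) encodes fine.
  case class FeatureIdBoxed(value: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("value-class-repro")
      .getOrCreate()
    import spark.implicits._

    // Breaks at runtime with a cryptic codegen error:
    // spark.createDataset(Seq(FeatureId(1), FeatureId(2), FeatureId(3))).count()

    // Works once the value class is replaced by a regular case class:
    val ds = spark.createDataset(Seq(FeatureIdBoxed(1), FeatureIdBoxed(2), FeatureIdBoxed(3)))
    println(ds.count())

    spark.stop()
  }
}
```

The boxing workaround costs the allocation-avoidance that value classes were meant to provide, but it keeps the same static type safety until the encoder bug is fixed.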

Thank you

On Thu, Sep 1, 2016 at 3:09 PM, Aris <arisofala...@gmail.com> wrote:

> Thank you Jakob on two counts
>
> 1. Yes, thanks for pointing out that spark-shell cannot take value
> classes, that was an additional confusion to me!
>
> 2. We have a Spark 2.0 project which is definitely breaking at runtime
> with a Dataset of value classes. I am not sure if this is also the case in
> Spark 1.6, I'm going to verify.
>
> Once I can make a trivial example with value classes breaking I will open
> a JIRA for Spark.
>
> Is Martin Odersky your father? Does this mean Scala is your sister? And
> does that mean Spark is your cousin? ;-)
>
>
> On Thu, Sep 1, 2016 at 2:44 PM, Jakob Odersky <ja...@odersky.com> wrote:
>
>> Hi Aris,
>> thanks for sharing this issue. I can confirm that value classes
>> currently don't work, however I can't think of reason why they
>> shouldn't be supported. I would therefore recommend that you report
>> this as a bug.
>>
>> (Btw, value classes also currently aren't definable in the REPL. See
>> https://issues.apache.org/jira/browse/SPARK-17367)
>>
>> regards,
>> --Jakob
>>
>> On Thu, Sep 1, 2016 at 1:58 PM, Aris <arisofala...@gmail.com> wrote:
>> > Hello Spark community -
>> >
>> > Does Spark 2.0 Datasets *not support* Scala Value classes (basically
>> > "extends AnyVal" with a bunch of limitations) ?
>> >
>> > I am trying to do something like this:
>> >
>> > case class FeatureId(value: Int) extends AnyVal
>> > val seq = Seq(FeatureId(1),FeatureId(2),FeatureId(3))
>> > import spark.implicits._
>> > val ds = spark.createDataset(seq)
>> > ds.count
>> >
>> >
>> > This will compile, but then it will break at runtime with a cryptic
>> error
>> > about "cannot find int at value". If I remove the "extends AnyVal" part,
>> > then everything works.
>> >
>> > Value classes are a great performance boost / static type checking
>> feature
>> > in Scala, but are they prohibited in Spark Datasets?
>> >
>> > Thanks!
>> >
>>
>
>
