[ 
https://issues.apache.org/jira/browse/SPARK-17248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832163#comment-15832163
 ] 

Leif Warner commented on SPARK-17248:
-------------------------------------

Enums allow much more efficient encodings than strings. That's a 
big reason I was looking to encode things as enums (or the Scala sum type 
equivalent, a sealed abstract class with a few case objects that extend it).
Consider:
{code}
object Bool extends Enumeration {
  val True, False = Value
}
{code}

It'd be a lot more efficient to encode a 1 or 0 instead of the strings "True" 
or "False" every time.
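
To make the comparison concrete, here is a minimal sketch of the two encodings for the Bool enum above. The helper names (EnumEncodingSketch, encodeAsId, and so on) are made up for illustration and are not Spark APIs; this only shows the idea, not how an Encoder would actually implement it.

```scala
// The enum from the comment above.
object Bool extends Enumeration {
  val True, False = Value
}

// Hypothetical helpers for illustration only; none of these are Spark APIs.
object EnumEncodingSketch {
  // String encoding: stores the full name for every value.
  def encodeAsString(v: Bool.Value): String = v.toString

  // Integer encoding: stores only the id that Enumeration assigns
  // (0, 1, ... in declaration order, so True is 0 and False is 1 here).
  def encodeAsId(v: Bool.Value): Int = v.id

  // Decoding looks the value up again by id or by name.
  def decodeFromId(id: Int): Bool.Value = Bool(id)
  def decodeFromString(s: String): Bool.Value = Bool.withName(s)

  def main(args: Array[String]): Unit = {
    val all = Bool.values.toSeq
    // Both encodings round-trip; the id encoding needs a single Int per value.
    assert(all.forall(v => decodeFromId(encodeAsId(v)) == v))
    assert(all.forall(v => decodeFromString(encodeAsString(v)) == v))
    all.foreach(v => println(s"$v -> id ${encodeAsId(v)}"))
  }
}
```

One caveat with the id encoding: the ids follow declaration order, so reordering or inserting values in the Enumeration changes the meaning of data that was already written.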

> Add native Scala enum support to Dataset Encoders
> -------------------------------------------------
>
>                 Key: SPARK-17248
>                 URL: https://issues.apache.org/jira/browse/SPARK-17248
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Silvio Fiorito
>
> Enable support for Scala enums in Encoders. Ideally, users should be able to 
> use enums as part of case classes automatically.
> Currently, this code...
> {code}
> object MyEnum extends Enumeration {
>   type MyEnum = Value
>   val EnumVal1, EnumVal2 = Value
> }
> case class MyClass(col: MyEnum.MyEnum)
> val data = Seq(MyClass(MyEnum.EnumVal1), MyClass(MyEnum.EnumVal2)).toDS()
> {code}
> ...results in this stacktrace:
> {code}
> java.lang.UnsupportedOperationException: No Encoder found for MyEnum.MyEnum
> - field (class: "scala.Enumeration.Value", name: "col")
> - root class: "line550c9f34c5144aa1a1e76bcac863244717.$read.$iwC.$iwC.$iwC.$iwC.MyClass"
>       at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:598)
>       at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:592)
>       at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:583)
>       at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>       at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>       at scala.collection.immutable.List.foreach(List.scala:318)
>       at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
>       at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
>       at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:583)
>       at org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:425)
>       at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:61)
>       at org.apache.spark.sql.Encoders$.product(Encoders.scala:274)
>       at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:47)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
