[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...

mt40 Wed, 17 Oct 2018 18:15:01 -0700

Github user mt40 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22309#discussion_r226142827
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -376,6 +387,23 @@ object ScalaReflection extends ScalaReflection {
               dataType = ObjectType(udt.getClass))
             Invoke(obj, "deserialize", ObjectType(udt.userClass), getPath :: Nil)
     
    +      case t if isValueClass(t) =>
    +        // nested value class is treated as its underlying type
    +        // top level value class must be treated as a product
    +        val underlyingType = getUnderlyingTypeOf(t)
    +        val underlyingClsName = getClassNameFromType(underlyingType)
    +        val clsName = t.typeSymbol.asClass.fullName
    +        val newTypePath = s"""- Scala value class: $clsName($underlyingClsName)""" +:
    +          walkedTypePath
    +
    +        val arg = deserializerFor(underlyingType, path, newTypePath)
    +        if (path.isDefined) {
    +          arg
    --- End diff --
    
    Take the `User` class above as an example. After compilation, the field `id` of type `Id` becomes an `Int`, so constructing a `User` requires `id` to be an `Int`.
    
    As for why we need `NewInstance` when `Id` is itself the schema: `Id` may remain an `Id` if it is treated as another type (see the [allocation rule](https://docs.scala-lang.org/overviews/core/value-classes.html#allocation-details)). For example, in [encodeDecodeTest](https://github.com/apache/spark/blob/a40806d2bd84e9a0308165f0d6c97e9cf00aa4a3/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala#L373), if we pass an instance of `Id` as input, it is not converted to `Int`. In the other case, when the required type is explicitly `Id`, both the input and the result returned from deserialization become `Int`.
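
To make the two cases concrete, here is a minimal sketch in plain Scala, independent of Spark. The `Id` and `User` definitions are hypothetical stand-ins for the classes discussed in this thread, and `ValueClassDemo`, `userConstructorParamTypes`, and `runtimeClassOf` are names made up for illustration. It inspects the JVM signature of the `User` constructor, then passes an `Id` where a non-`Id` type (`Any`) is expected, which per the allocation rule should force a real `Id` instance.

```scala
object ValueClassDemo {
  // Hypothetical stand-ins for the `Id` and `User` classes discussed in this thread.
  final case class Id(value: Int) extends AnyVal
  final case class User(id: Id, name: String)

  // JVM parameter types of the User constructor, read via Java reflection.
  // A nested value class field is expected to be represented by its underlying type.
  def userConstructorParamTypes: List[Class[_]] =
    classOf[User].getConstructors.head.getParameterTypes.toList

  // The parameter is typed as Any, so an Id argument is treated as another type
  // and has to be allocated as an actual Id instance before the call.
  def runtimeClassOf(x: Any): Class[_] = x.getClass

  def main(args: Array[String]): Unit = {
    println(userConstructorParamTypes)
    println(runtimeClassOf(Id(1)))
  }
}
```

If the reasoning above holds, the first line printed lists the primitive `int` class for the `id` parameter, and the second shows the `Id` class itself rather than `Int`, which is the situation in which the deserializer has to produce an `Id` via `NewInstance`.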



