GitHub user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9712#discussion_r44867544
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/RowEncoderSuite.scala ---
    @@ -68,7 +117,36 @@ class RowEncoderSuite extends SparkFunSuite {
           .add("structOfArray", new StructType().add("array", arrayOfString))
           .add("structOfMap", new StructType().add("map", mapOfString))
           .add("structOfArrayAndMap",
    -        new StructType().add("array", arrayOfString).add("map", mapOfString)))
    +        new StructType().add("array", arrayOfString).add("map", mapOfString))
    +      .add("structOfUDT", structOfUDT))
    +
    +  test(s"encode/decode: arrayOfUDT") {
    +    val schema = new StructType()
    +      .add("arrayOfUDT", arrayOfUDT)
    +
    +    val encoder = RowEncoder(schema)
    +
    +    val input: Row = Row(Seq(new ExamplePoint(0.1, 0.2), new ExamplePoint(0.3, 0.4)))
    +    val row = encoder.toRow(input)
    +    val convertedBack = encoder.fromRow(row)
    +    assert(input.getSeq[ExamplePoint](0) == convertedBack.getSeq[ExamplePoint](0))
    +  }
    +
    +  test(s"encode/decode: Product") {
    +    val schema = new StructType()
    +      .add("structAsProduct",
    +        new StructType()
    +          .add("int", IntegerType)
    +          .add("string", StringType)
    +          .add("double", DoubleType))
    +
    +    val encoder = RowEncoder(schema)
    +
    +    val input: Row = Row((100, "test", 0.123))
    --- End diff --
    
    Actually, I found this problem while working on ScalaUDF. ScalaUDF uses `schemaFor` to obtain the Catalyst types for a UDF's input and output. The Catalyst type that `schemaFor` returns for a `Product` is `StructType`, which is reasonable because, as far as I can see, we have no other type to represent a `Product`.
    
    So for a `StructType` field in an external `Row`, both `Row` and `Product` are possible values. When we call `extractorsFor` on the external `Row`, `externalDataTypeFor` returns `ObjectType(classOf[Row])` for this field. But the `get` accessor on the inputObject (i.e., the `Row`) may return a `Product` in the ScalaUDF case, and then an exception is thrown.
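
To make the failure mode concrete, here is a minimal plain-Scala sketch. It does not use Spark: `SimpleRow`, `readStructStrict`, and `readStructLenient` are hypothetical stand-ins invented for illustration, not the actual `Row` class or the code generated from `extractorsFor`. The strict reader assumes every struct field holds a row, much as `ObjectType(classOf[Row])` does, while the lenient reader also accepts a `Product`.

```scala
// Hypothetical stand-in for an external row; this is NOT Spark's Row class.
final class SimpleRow(values: Any*) {
  def get(i: Int): Any = values(i)
}

object StructFieldSketch {
  // Assumes the struct field always holds a row. A Product stored in the
  // field makes the cast fail with a ClassCastException.
  def readStructStrict(row: SimpleRow, i: Int): SimpleRow =
    row.get(i).asInstanceOf[SimpleRow]

  // Accepts both representations by turning a Product into a row first.
  def readStructLenient(row: SimpleRow, i: Int): SimpleRow = row.get(i) match {
    case r: SimpleRow => r
    case p: Product   => new SimpleRow(p.productIterator.toSeq: _*)
    case other =>
      throw new IllegalArgumentException(s"Unsupported struct value: $other")
  }
}
```

With `val input = new SimpleRow((100, "test", 0.123))`, mirroring the `Row((100, "test", 0.123))` input in the test above, `readStructStrict(input, 0)` throws, whereas `readStructLenient(input, 0)` returns a row with the tuple's three fields. Whether the real encoder should normalize values this way is a design question for the PR; the sketch only shows why a single expected class is not enough.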


