[GitHub] spark pull request #20163: [SPARK-22966][PySpark] Spark SQL should handle Py...

ueshin Tue, 09 Jan 2018 02:08:31 -0800

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20163#discussion_r160364055
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala ---
    @@ -120,10 +121,18 @@ object EvaluatePython {
         case (c: java.math.BigDecimal, dt: DecimalType) => Decimal(c, dt.precision, dt.scale)
     
         case (c: Int, DateType) => c
    +    // Pyrolite will unpickle a Python datetime.date to a java.util.Calendar
    +    case (c: Calendar, DateType) => DateTimeUtils.fromJavaCalendarForDate(c)
    --- End diff --
    
    Yeah, 2. should work for `StringType`.
    
    I'd also like to add some documentation like 1. so that users are careful about the return type. I've found that, in most cases where the return type does not match, `udf`s return `null` and `pandas_udf`s throw an exception.
    Of course we can try to bring the behavior of `udf` and `pandas_udf` as close together as possible in the future, but I think handling a mismatched return type can only be on a best-effort basis.
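
To make the behavior under discussion concrete, here is a minimal pure-Python sketch of the idea behind the `DateType` cases in the diff: an `Int` passes through unchanged, a date is converted to a day count, and anything else is treated as a mismatch and becomes `null` (`None` here), as described above for `udf`. This is not Spark's implementation. The helper name `to_internal_date` is hypothetical, and the sketch assumes that a `DateType` value is represented internally as the number of days since 1970-01-01.

```python
import datetime

# Assumption: DateType values are stored as days since the Unix epoch.
EPOCH = datetime.date(1970, 1, 1)


def to_internal_date(value):
    """Hypothetical helper: convert a Python UDF result for a DateType column.

    Returns the number of days since the epoch, or None when the value
    does not match the declared return type.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        return None
    # Mirrors `case (c: Int, DateType) => c`: an int is passed through as-is.
    if isinstance(value, int):
        return value
    # datetime.datetime is a subclass of datetime.date; this sketch treats it
    # as a mismatch to keep the example narrow.
    if isinstance(value, datetime.datetime):
        return None
    # Mirrors the new Calendar case: a plain date becomes a day count.
    if isinstance(value, datetime.date):
        return (value - EPOCH).days
    # Any other type is a mismatch and yields None (null).
    return None


if __name__ == "__main__":
    print(to_internal_date(datetime.date(1970, 1, 2)))  # 1
    print(to_internal_date("2018-01-09"))               # None
```

Returning `None` for every mismatch keeps the sketch aligned with the `udf` behavior described in the comment. A `pandas_udf`-style variant would raise an exception instead.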



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
