GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22795#discussion_r227198050
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -3023,6 +3023,42 @@ def pandas_udf(f=None, returnType=None, functionType=None):
             conversion on returned data. The conversion is not guaranteed to be correct and results
             should be checked for accuracy by users.
         """
    +
    +    # The following table shows most of Pandas data and SQL type conversions in Pandas UDFs that
    +    # are not yet visible to the user. Some of behaviors are buggy and might be changed in the near
    +    # future. The table might have to be eventually documented externally.
    +    # Please see SPARK-25798's PR to see the codes in order to generate the table below.
    +    #
    +    # +-----------------------------+----------------------+----------+-------+--------+--------------------+--------------------+--------+---------+---------+---------+------------+------------+------------+-----------------------------------+-----------------------------------------------------+-----------------+--------------------+-----------------------------+-------------+-----------------+------------------+-----------+--------------------------------+  # noqa
    +    # |SQL Type \ Pandas Value(Type)|None(object(NoneType))|True(bool)|1(int8)|1(int16)|            1(int32)|            1(int64)|1(uint8)|1(uint16)|1(uint32)|1(uint64)|1.0(float16)|1.0(float32)|1.0(float64)|1970-01-01 00:00:00(datetime64[ns])|1970-01-01 00:00:00-05:00(datetime64[ns, US/Eastern])|a(object(string))|  1(object(Decimal))|[1 2 3](object(array[int32]))|1.0(float128)|(1+0j)(complex64)|(1+0j)(complex128)|A(category)|1 days 00:00:00(timedelta64[ns])|  # noqa
    +    # +-----------------------------+----------------------+----------+-------+--------+--------------------+--------------------+--------+---------+---------+---------+------------+------------+------------+-----------------------------------+-----------------------------------------------------+-----------------+--------------------+-----------------------------+-------------+-----------------+------------------+-----------+--------------------------------+  # noqa
    +    # |                      boolean|                  None|      True|   True|    True|                True|                True|    True|     True|     True|     True|       False|       False|       False|                              False|                                                False|                X|                   X|                            X|        False|            False|             False|          X|                           False|  # noqa
    --- End diff --
    
    I think it's fine. Many conversions here look buggy, for instance
    `A(category)` with `tinyint` becoming `0`, or the string conversions.
    Let's just fix those, upgrade Arrow, and then update this chart later.
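
As a rough illustration of how the `A(category)` with `tinyint` cell could end up as `0`: pandas stores a categorical column as small integer codes plus a list of categories, and the first category gets code `0`. Whether the Pandas UDF conversion path actually goes through these codes is an assumption here, not something the comment states. The sketch below uses plain pandas only, and the helper name `category_codes` is hypothetical.

```python
import pandas as pd


def category_codes(values):
    """Return the integer codes pandas assigns to a categorical Series.

    Hypothetical helper for illustration; not part of PySpark.
    """
    series = pd.Series(values, dtype="category")
    # .cat.codes exposes the backing integer codes: each value maps to the
    # position of its category in series.cat.categories.
    return series.cat.codes


codes = category_codes(["A"])
print(codes.tolist())  # "A" is the only (and therefore first) category
print(codes.dtype)     # a small signed integer dtype
```

If an integer SQL type is filled from these codes rather than from the category labels, a lone `A` would surface as `0`, which would match the cell the comment calls out.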

