[GitHub] spark pull request #22655: [SPARK-25666][PYTHON] Internally document type co...

HyukjinKwon Sun, 07 Oct 2018 08:59:43 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22655#discussion_r223218561
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2733,6 +2733,33 @@ def udf(f=None, returnType=StringType()):
         |         8|      JOHN DOE|          22|
         +----------+--------------+------------+
         """
    +
    +    # The following table shows most of Python data and SQL type conversions in normal UDFs that
    +    # are not yet visible to the user. Some of behaviors are buggy and might be changed in the near
    +    # future. The table might have to be eventually documented externally.
    +    # Please see SPARK-25666's PR to see the codes in order to generate the table below.
    +    #
    +    # +-----------------------------+--------------+----------+------+-------+------+----------+--------------------+-----------------------------+----------+----------------------+---------+--------------------+--------------+----------+--------------+-------------+-------------+  # noqa
    +    # |SQL Type \ Python Value(Type)|None(NoneType)|True(bool)|1(int)|1(long)|a(str)|a(unicode)|    1970-01-01(date)|1970-01-01 00:00:00(datetime)|1.0(float)|array('i', [1])(array)|[1](list)|         (1,)(tuple)|ABC(bytearray)|1(Decimal)|{'a': 1}(dict)|Row(a=1)(Row)|Row(a=1)(Row)|  # noqa
    +    # +-----------------------------+--------------+----------+------+-------+------+----------+--------------------+-----------------------------+----------+----------------------+---------+--------------------+--------------+----------+--------------+-------------+-------------+  # noqa
    +    # |                         null|          None|      None|  None|   None|  None|      None|                None|                         None|      None|                  None|     None|                None|          None|      None|          None|            X|            X|  # noqa
    +    # |                      boolean|          None|      True|  None|   None|  None|      None|                None|                         None|      None|                  None|     None|                None|          None|      None|          None|            X|            X|  # noqa
    +    # |                      tinyint|          None|      None|     1|      1|  None|      None|                None|                         None|      None|                  None|     None|                None|          None|      None|          None|            X|            X|  # noqa
    +    # |                     smallint|          None|      None|     1|      1|  None|      None|                None|                         None|      None|                  None|     None|                None|          None|      None|          None|            X|            X|  # noqa
    +    # |                          int|          None|      None|     1|      1|  None|      None|                None|                         None|      None|                  None|     None|                None|          None|      None|          None|            X|            X|  # noqa
    +    # |                       bigint|          None|      None|     1|      1|  None|      None|                None|                         None|      None|                  None|     None|                None|          None|      None|          None|            X|            X|  # noqa
    +    # |                       string|          None|      true|     1|      1|     a|         a|java.util.Gregori...|         java.util.Gregori...|       1.0|           [I@7f1970e1|      [1]|[Ljava.lang.Objec...|   [B@284838a9|         1|         {a=1}|            X|            X|  # noqa
    --- End diff --
    
    Hmmmmm .. I see the type is not clear here. Let me think about this a bit more.
    
    `[B@284838a9` is quite buggy behaviour and we should fix it. So I was thinking
    of documenting this internally, since we already spent a lot of time figuring
    out how it works for each case individually (at
    https://github.com/apache/spark/pull/20163).
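
For illustration only: in the table above, a `bytearray` returned from a UDF declared with a `string` return type shows up as `[B@284838a9`, which looks like the JVM's default string form of a byte array rather than the text itself. A minimal sketch of a user-side workaround, assuming the bytes are UTF-8 text, is to decode inside the UDF body before returning. The names `as_sql_string` and `shout` are hypothetical and are not part of PySpark or of this PR.

```python
def as_sql_string(value):
    """Return a value suitable for a UDF whose declared return type is string.

    Hypothetical helper: it only handles the bytearray case from the table
    and assumes the bytes are UTF-8 encoded text.
    """
    if isinstance(value, bytearray):
        # Decode explicitly so the JVM side receives text, not a byte array.
        return value.decode("utf-8")
    return value


def shout(name):
    # Hypothetical UDF body that happens to build its result as a bytearray.
    return as_sql_string(bytearray(name.upper(), "utf-8"))


# With PySpark available, the function would be wrapped the usual way:
#   from pyspark.sql.functions import udf
#   from pyspark.sql.types import StringType
#   shout_udf = udf(shout, StringType())
```

This leaves every other return value untouched, so the remaining cells of the table behave exactly as documented.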


