Github user AlexanderKoryagin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22568#discussion_r221183502
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -5525,32 +5525,73 @@ def data(self):
                 .withColumn("v", explode(col('vs'))).drop('vs')
     
         def test_supported_types(self):
    -        from pyspark.sql.functions import pandas_udf, PandasUDFType, array, col
    -        df = self.data.withColumn("arr", array(col("id")))
    +        from decimal import Decimal
    +        from distutils.version import LooseVersion
    +        import pyarrow as pa
    +        from pyspark.sql.functions import pandas_udf, PandasUDFType
     
    -        # Different forms of group map pandas UDF, results of these are the same
    +        input_values_with_schema = [
    +            (1, StructField('id', IntegerType())),
    +            (2, StructField('byte', ByteType())),
    +            (3, StructField('short', ShortType())),
    +            (4, StructField('int', IntegerType())),
    +            (5, StructField('long', LongType())),
    +            (1.1, StructField('float', FloatType())),
    +            (2.2, StructField('double', DoubleType())),
    +            (Decimal(1.123), StructField('decim', DecimalType(10, 3))),
    +            ([1, 2, 3], StructField('array', ArrayType(IntegerType()))),
    +            (True, StructField('bool', BooleanType())),
    +            ('hello', StructField('str', StringType())),
    +        ]
    --- End diff --
    
    Are you sure that keeping the original way is more important than making the code more readable and maintainable?
    IMHO, written that way, it will be hard to visually match values with their types:
    ```python
    values = [
        1, 2, 3, 4, 5, 1.1, 2.2, Decimal(1.123),
        [1, 2, 3], True, 'hello'
    ]
    output_schema = [
        StructField('id', IntegerType()), StructField('byte', ByteType()),
        StructField('short', ShortType()), StructField('int', IntegerType()),
        StructField('long', LongType()), StructField('float', FloatType()),
        StructField('double', DoubleType()), StructField('decim', DecimalType(10, 3)),
        StructField('array', ArrayType(IntegerType())), StructField('bool', BooleanType()),
        StructField('str', StringType())
    ]
    ```
    But anyway, if you insist, I can change it.
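
For illustration, the paired layout does not prevent obtaining two parallel sequences when they are needed. The following is a minimal sketch, not the code from the pull request: plain `(name, type_name)` tuples are hypothetical stand-ins for `StructField` so that it runs without pyspark, and the names `values` and `fields` are assumptions.

```python
from decimal import Decimal

# Each entry keeps a value next to the field that describes it.
# The (name, type_name) tuples are stand-ins for StructField(name, type).
input_values_with_schema = [
    (1, ('id', 'int')),
    (1.1, ('float', 'float')),
    (Decimal('1.123'), ('decim', 'decimal(10,3)')),
    ([1, 2, 3], ('array', 'array<int>')),
    (True, ('bool', 'boolean')),
    ('hello', ('str', 'string')),
]

# zip(*pairs) transposes the list of pairs into two parallel tuples,
# so the separate "values" and "schema" sequences can be derived on demand.
values, fields = zip(*input_values_with_schema)
```

With this approach the readable pairing stays the single source of truth, and the two flat sequences are derived from it rather than maintained by hand.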


---
