[ https://issues.apache.org/jira/browse/SPARK-41746 ]
Ruifeng Zheng deleted comment on SPARK-41746:
---------------------------------------

was (Author: podongfeng):
{code:java}
**********************************************************************
File "/Users/ruifeng.zheng/Dev/spark/python/pyspark/sql/connect/functions.py", line 1423, in pyspark.sql.connect.functions.map_filter
Failed example:
    df.select(map_filter(
        "data", lambda _, v: v > 30.0).alias("data_filtered")
    ).show(truncate=False)
Expected:
    +--------------------------+
    |data_filtered             |
    +--------------------------+
    |{baz -> 32.0, foo -> 42.0}|
    +--------------------------+
Got:
    +--------------------------+
    |data_filtered             |
    +--------------------------+
    |{foo -> 42.0, baz -> 32.0}|
    +--------------------------+
    <BLANKLINE>
**********************************************************************
File "/Users/ruifeng.zheng/Dev/spark/python/pyspark/sql/connect/functions.py", line 1465, in pyspark.sql.connect.functions.map_zip_with
Failed example:
    df.select(map_zip_with(
        "base", "ratio", lambda k, v1, v2: round(v1 * v2, 2)).alias("updated_data")
    ).show(truncate=False)
Expected:
    +---------------------------+
    |updated_data               |
    +---------------------------+
    |{SALES -> 16.8, IT -> 48.0}|
    +---------------------------+
Got:
    +---------------------------+
    |updated_data               |
    +---------------------------+
    |{IT -> 48.0, SALES -> 16.8}|
    +---------------------------+
    <BLANKLINE>
**********************************************************************
1 of 2 in pyspark.sql.connect.functions.map_filter
1 of 2 in pyspark.sql.connect.functions.map_zip_with
{code}

> SparkSession.createDataFrame does not support nested datatypes
> --------------------------------------------------------------
>
>                 Key: SPARK-41746
>                 URL: https://issues.apache.org/jira/browse/SPARK-41746
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> {code}
> File "/.../spark/python/pyspark/sql/connect/group.py", line 183, in pyspark.sql.connect.group.GroupedData.pivot
> Failed example:
>     df2 = spark.createDataFrame([
>         Row(training="expert", sales=Row(course="dotNET", year=2012, earnings=10000)),
>         Row(training="junior", sales=Row(course="Java", year=2012, earnings=20000)),
>         Row(training="expert", sales=Row(course="dotNET", year=2012, earnings=5000)),
>         Row(training="junior", sales=Row(course="dotNET", year=2013, earnings=48000)),
>         Row(training="expert", sales=Row(course="Java", year=2013, earnings=30000)),
>     ])
> Exception raised:
>     Traceback (most recent call last):
>       File "/.../miniconda3/envs/python3.9/lib/python3.9/doctest.py", line 1336, in __run
>         exec(compile(example.source, filename, "single",
>       File "<doctest pyspark.sql.connect.group.GroupedData.pivot[3]>", line 1, in <module>
>         df2 = spark.createDataFrame([
>       File "/.../workspace/forked/spark/python/pyspark/sql/connect/session.py", line 196, in createDataFrame
>         table = pa.Table.from_pandas(pdf)
>       File "pyarrow/table.pxi", line 3475, in pyarrow.lib.Table.from_pandas
>       File "/.../miniconda3/envs/python3.9/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 611, in dataframe_to_arrays
>         arrays = [convert_column(c, f)
>       File "/.../miniconda3/envs/python3.9/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 611, in <listcomp>
>         arrays = [convert_column(c, f)
>       File "/.../miniconda3/envs/python3.9/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 598, in convert_column
>         raise e
"/.../miniconda3/envs/python3.9/lib/python3.9/site-packages/pyarrow/pandas_compat.py", > line 592, in convert_column > result = pa.array(col, type=type_, from_pandas=True, safe=safe) > File "pyarrow/array.pxi", line 316, in pyarrow.lib.array > File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array > File "pyarrow/error.pxi", line 123, in pyarrow.lib.check_status > pyarrow.lib.ArrowTypeError: ("Expected bytes, got a 'int' object", > 'Conversion failed for column 1 with type object') > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org