[ https://issues.apache.org/jira/browse/SPARK-36283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Darcy Shen updated SPARK-36283:
-------------------------------
    Description:
h2. Case 1
{code:python}
# Assumed imports: ExamplePoint/ExamplePointUDT taken from pyspark.testing.sqlutils.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType
from pyspark.testing.sqlutils import ExamplePoint, ExamplePointUDT
spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")
pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1, 1), ExamplePoint(2, 2)])})
df = spark.createDataFrame(pdf)
df.show()
{code}
h2. Case 2
{code}
# Same integer coordinates as Case 1, with an explicit UDT schema.
spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")
pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1, 1), ExamplePoint(2, 2)])})
schema = StructType([StructField('point', ExamplePointUDT(), False)])
df = spark.createDataFrame(pdf, schema)
df.show()
{code}
h2. Case 3
{code}
# Float coordinates, with an explicit UDT schema.
spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")
pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1.0, 1.0), ExamplePoint(2.0, 2.0)])})
schema = StructType([StructField('point', ExamplePointUDT(), False)])
df = spark.createDataFrame(pdf, schema)
df.show()
{code}

> Bug when creating dataframe without schema and with Arrow disabled
> ------------------------------------------------------------------
>
>                 Key: SPARK-36283
>                 URL: https://issues.apache.org/jira/browse/SPARK-36283
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.1.1
>            Reporter: Darcy Shen
>            Priority: Major