[ https://issues.apache.org/jira/browse/SPARK-33322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-33322.
----------------------------------
    Resolution: Cannot Reproduce

This is fixed as of Spark 3.0.0. The fix is a breaking change, so it cannot be backported.

> Dataframe: data is wrongly presented because of column name
> -----------------------------------------------------------
>
>                 Key: SPARK-33322
>                 URL: https://issues.apache.org/jira/browse/SPARK-33322
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5
>            Reporter: Mihaly Hazag
>            Priority: Major
>         Attachments: image-2020-11-03-14-57-09-433.png, image-2020-11-03-14-57-37-308.png
>
>
> Consider the code below: the `some_text` column got the `some_int` value, while its value is null in the dataframe.
> !image-2020-11-03-14-57-09-433.png!
>
> Renaming the field from `some_text` to `some_apple` fixed the problem! 🙂
> !image-2020-11-03-14-57-37-308.png!
>
>
> Here is the code to reproduce the problem:
> {code:python}
> from datetime import datetime
> from pyspark.sql import Row
> from pyspark.sql.types import StructType, StructField, DateType, StringType, IntegerType
>
> schema = StructType(
>     [
>         StructField('dfdt', DateType(), True),
>         StructField('some_text', StringType(), True),
>         StructField('some_int', IntegerType(), True),
>     ]
> )
>
> test_df = spark.createDataFrame([
>     Row(dfdt=datetime.strptime('2020-12-18', '%Y-%m-%d'), some_text='cdsvg', some_int=100)
> ], schema)
>
> display(test_df)
> {code}
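
For readers still on 2.4.x, a likely explanation of the symptom (not stated in the report itself) is that a `Row` built from keyword arguments in PySpark 2.x sorts its fields alphabetically by name, while the explicit schema is applied by position. The sketch below uses plain Python, with no Spark session, to show which of the reporter's field names change position under that assumption. `kwargs_row_order` and `misaligned_fields` are hypothetical helpers introduced only for illustration.

```python
# Pure-Python sketch; it does not import or call Spark.
# Assumption: a keyword-argument Row in PySpark 2.x orders its fields
# alphabetically by name, and the schema is then matched by position.

def kwargs_row_order(names):
    """Field order assumed for a Spark 2.x keyword-argument Row."""
    return sorted(names)


def misaligned_fields(schema_names):
    """Return the schema fields whose position changes after sorting."""
    row_order = kwargs_row_order(schema_names)
    return [s for s, r in zip(schema_names, row_order) if s != r]


# Schema from the report: 'some_int' sorts before 'some_text'.
print(misaligned_fields(["dfdt", "some_text", "some_int"]))
# → ['some_text', 'some_int']

# Renamed schema: 'some_apple' sorts before 'some_int', so nothing moves.
print(misaligned_fields(["dfdt", "some_apple", "some_int"]))
# → []
```

Under this assumption the renamed schema stays aligned only because `some_apple` happens to sort before `some_int`, which would match the reporter's observation. On 2.4.x, passing plain tuples in schema order instead of keyword-argument `Row` objects should avoid the reordering.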