[ 
https://issues.apache.org/jira/browse/SPARK-33322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-33322.
----------------------------------
    Resolution: Cannot Reproduce

This is fixed as of Spark 3.0.0. The fix is a breaking change, so it cannot be
backported.
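
For context, here is a minimal sketch of the likely mechanism. It assumes the
cause is that Row(**kwargs) in PySpark before 3.0 sorted its field names
alphabetically, after which the values were matched to the schema by position.
The helpers `legacy_row_values` and `match_by_position` are hypothetical names
that only mimic that assumed behaviour in plain Python; they are not PySpark
APIs, and the date is kept as a plain string for brevity.

```python
# Plain-Python illustration; no Spark session is needed.
# ASSUMPTION: pre-3.0 Row(**kwargs) orders fields by sorted name, and the
# resulting values are paired with the schema fields positionally.

def legacy_row_values(**kwargs):
    """Return the values ordered by alphabetically sorted field name."""
    return [kwargs[name] for name in sorted(kwargs)]

def match_by_position(fields, values):
    """Pair schema field names with values purely by position."""
    return dict(zip(fields, values))

# Schema order from the report: dfdt, some_text, some_int.
schema_fields = ["dfdt", "some_text", "some_int"]
values = legacy_row_values(dfdt="2020-12-18", some_text="cdsvg", some_int=100)
misaligned = match_by_position(schema_fields, values)
print(misaligned)  # some_text ends up holding 100

# After the rename, sorted order (dfdt, some_apple, some_int) equals schema order.
renamed_fields = ["dfdt", "some_apple", "some_int"]
values = legacy_row_values(dfdt="2020-12-18", some_apple="cdsvg", some_int=100)
aligned = match_by_position(renamed_fields, values)
print(aligned)
```

Under that assumption, the rename in the report helps only because
`some_apple` happens to sort before `some_int`. On 2.4.x, passing plain tuples
in schema order instead of keyword Rows should avoid the reordering.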

> Dataframe: data is wrongly presented because of column name
> -----------------------------------------------------------
>
>                 Key: SPARK-33322
>                 URL: https://issues.apache.org/jira/browse/SPARK-33322
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.4.5
>            Reporter: Mihaly Hazag
>            Priority: Major
>         Attachments: image-2020-11-03-14-57-09-433.png, 
> image-2020-11-03-14-57-37-308.png
>
>
> Consider the code below: the `some_text` column got the `some_int` value,
> while its value is null in the dataframe.
>    !image-2020-11-03-14-57-09-433.png!
>  
> Renaming the field from `some_text` to `some_apple` fixed the problem! 🙂
> !image-2020-11-03-14-57-37-308.png!
>  
>  
> Here is the code to reproduce the problem:
> {code:python}
> from datetime import datetime
> from pyspark.sql import Row
> from pyspark.sql.types import (
>     StructType, StructField, DateType, StringType, IntegerType,
> )
>  
> schema = StructType(
>   [
>     StructField('dfdt', DateType(), True),
>     StructField('some_text', StringType(), True),
>     StructField('some_int', IntegerType(), True),
>   ]
> )
>  
> test_df = spark.createDataFrame([
>   Row(dfdt=datetime.strptime('2020-12-18', '%Y-%m-%d'), some_text='cdsvg',
>       some_int=100)
> ], schema)
>  
> display(test_df)
> {code}
>  


