[ 
https://issues.apache.org/jira/browse/SPARK-32961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bui Bao Anh updated SPARK-32961:
--------------------------------
    Description: 
There are weird characters in the output when printing to the console or 
writing to files.

See the attached files for how it looks in a Spark DataFrame and a pandas DataFrame.

 

  was:
There are weird characters in the output when printing to the console or 
writing to files.

Below is how it looks in a Spark DataFrame:

 

However, this is not the case in a pandas DataFrame:

!image-2020-09-22-10-51-28-294.png!

 


> PySpark CSV read with UTF-16 encoding is not working correctly
> --------------------------------------------------------------
>
>                 Key: SPARK-32961
>                 URL: https://issues.apache.org/jira/browse/SPARK-32961
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4, 3.0.1
>         Environment: both spark local and cluster mode
>            Reporter: Bui Bao Anh
>            Priority: Major
>              Labels: Correctness
>         Attachments: pandas df.png, pyspark df.png
>
>
> There are weird characters in the output when printing to the console or 
> writing to files.
> See the attached files for how it looks in a Spark DataFrame and a pandas 
> DataFrame.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
