Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20949#discussion_r197643948

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -512,6 +512,43 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
         }
       }

    +  test("Save csv with custom charset") {
    +    Seq("iso-8859-1", "utf-8", "windows-1250").foreach { encoding =>
    --- End diff --

    Could you check the `UTF-16` and `UTF-32` encodings too? CSV files written in those encodings must contain [BOMs](https://en.wikipedia.org/wiki/Byte_order_mark). I am not sure that Spark's CSV datasource is able to read them in per-line mode (`multiLine` set to `false`). You probably need to switch to `multiLine` mode, or read the files back with Scala's standard library as JsonSuite does:
    https://github.com/apache/spark/blob/c7e2742f9bce2fcb7c717df80761939272beff54/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala#L2322-L2338
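The BOM behavior the reviewer mentions can be seen outside Spark entirely: on the JVM, the default `UTF-16` encoder prepends a big-endian byte order mark (`0xFE 0xFF`) to the output, while `UTF-8` output carries no BOM. A minimal sketch (not part of the PR; `BomDemo` is a hypothetical class name for illustration):

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    public static void main(String[] args) {
        // The JVM's UTF-16 encoder prepends a big-endian BOM: 0xFE 0xFF.
        byte[] utf16 = "a".getBytes(StandardCharsets.UTF_16);
        boolean hasBom = (utf16[0] & 0xFF) == 0xFE && (utf16[1] & 0xFF) == 0xFF;
        System.out.println("UTF-16 output starts with BOM: " + hasBom);

        // UTF-8 output has no BOM: a single ASCII character encodes to one byte.
        byte[] utf8 = "a".getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 output length for \"a\": " + utf8.length);
    }
}
```

This is why a per-line reader that splits the file on newline bytes and decodes each line independently can stumble on such files: only the first line carries the BOM, and in UTF-16/UTF-32 the newline itself is a multi-byte sequence.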