Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20949#discussion_r197662087

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -513,6 +513,43 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("Save csv with custom charset") {
    +    Seq("iso-8859-1", "utf-8", "windows-1250").foreach { encoding =>
    +      withTempDir { dir =>
    +        val csvDir = new File(dir, "csv").getCanonicalPath
    +        // scalastyle:off
    +        val originalDF = Seq("µà áâä ÃÃÃ").toDF("_c0")
    +        // scalastyle:on
    +        originalDF.write
    +          .option("header", "false")
    +          .option("encoding", encoding)
    +          .csv(csvDir)
    +
    +        val df = spark
    +          .read
    +          .option("header", "false")
    +          .option("encoding", encoding)
    --- End diff --
    
    Now it's fine. I think we decided to support encoding in CSV/JSON datasources. Ignore the comment above. We can proceed separately.
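The write-then-read round trip that the test above exercises (write a CSV in a given charset, read it back with the same charset, and expect identical data) can be sketched outside Spark with Python's standard-library `csv` module. This is only an illustration of the behavior, not the Spark API; the test string here is chosen so it is encodable in all three charsets:

```python
import csv
import os
import tempfile

def roundtrip(value: str, encoding: str) -> str:
    """Write a one-column, headerless CSV in the given encoding,
    read it back with the same encoding, and return the value."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.csv")
        # Write: analogous to .option("encoding", encoding).csv(csvDir)
        with open(path, "w", encoding=encoding, newline="") as f:
            csv.writer(f).writerow([value])
        # Read back with the same charset
        with open(path, "r", encoding=encoding, newline="") as f:
            return next(csv.reader(f))[0]

# Same encodings as in the Scala test; "áâä" exists in all three charsets.
for enc in ["iso-8859-1", "utf-8", "windows-1250"]:
    assert roundtrip("áâä", enc) == "áâä"
```

Reading the file back with a different charset than it was written with would generally corrupt or fail on the non-ASCII characters, which is why the test threads the same `encoding` option through both the writer and the reader.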
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org