[ https://issues.apache.org/jira/browse/SPARK-21442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089927#comment-16089927 ]
eugen yushin edited comment on SPARK-21442 at 7/17/17 2:45 PM:
---------------------------------------------------------------
Agree, thanks for the quick response. Feel free to close as a duplicate.

was (Author: eyushin): Agree, thanks for the quick response.

> Spark CSV writer trims trailing spaces
> --------------------------------------
>
>                 Key: SPARK-21442
>                 URL: https://issues.apache.org/jira/browse/SPARK-21442
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.1.0, 2.1.1
>        Environment: version 2.1.0-mapr-1703
> Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)
> and
> version 2.1.1
> Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)
>            Reporter: eugen yushin
>
> Looks like Spark truncates trailing spaces when saving data with the csv codec. Check the following example for more details (note the extra space at the end of the "Johny " field):
> {code}
> scala> case class SampleRow(field1: String, field2: String)
> defined class SampleRow
> scala> val fooDS = Seq(SampleRow("Johny ", "Doe"), SampleRow("Ivan", "Susanin")).toDS()
> fooDS: org.apache.spark.sql.Dataset[SampleRow] = [field1: string, field2: string]
> scala> fooDS.collect.foreach(println)
> SampleRow(Johny ,Doe)
> SampleRow(Ivan,Susanin)
> scala> fooDS.show()
> +------+-------+
> |field1| field2|
> +------+-------+
> |Johny |    Doe|
> |  Ivan|Susanin|
> +------+-------+
> scala> import org.apache.spark.sql.SaveMode
> import org.apache.spark.sql.SaveMode
> scala> fooDS.write.option("delimiter", "|").mode(SaveMode.Overwrite).csv("file:///tmp/spaces.txt")
> cat /tmp/spaces.txt/*
> Johny|Doe
> Ivan|Susanin
> {code}
> I expect a space before the pipe on the first line of the output file.
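For reference, a minimal sketch of the write-side workaround, assuming a Spark release (2.2.0 or later) where the CSV writer exposes the ignoreLeadingWhiteSpace / ignoreTrailingWhiteSpace write options (both default to true, i.e. whitespace is trimmed on write). These options are not mentioned in this ticket and the snippet has not been run against the 2.1.x builds listed above:

{code}
// Sketch only: assumes Spark 2.2.0+, where ignoreLeadingWhiteSpace and
// ignoreTrailingWhiteSpace are supported as CSV *write* options.
// Run in spark-shell; in a standalone app add `import spark.implicits._`
// so that .toDS() is available.
import org.apache.spark.sql.SaveMode

case class SampleRow(field1: String, field2: String)

val fooDS = Seq(SampleRow("Johny ", "Doe"), SampleRow("Ivan", "Susanin")).toDS()

fooDS.write
  .option("delimiter", "|")
  .option("ignoreTrailingWhiteSpace", "false") // keep the trailing space in "Johny "
  .option("ignoreLeadingWhiteSpace", "false")  // keep leading whitespace as well, if any
  .mode(SaveMode.Overwrite)
  .csv("file:///tmp/spaces.txt")

// Expected file contents:
// Johny |Doe
// Ivan|Susanin
{code}

On the 2.1.x versions reported here these writer options are not available, which matches the trimming shown in the description.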