[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings
[ https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913151#comment-16913151 ]

nicolas paris commented on SPARK-28779:
---------------------------------------

good to know thanks

> CSV writer doesn't handle older Mac line endings
> ------------------------------------------------
>
>                 Key: SPARK-28779
>                 URL: https://issues.apache.org/jira/browse/SPARK-28779
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: nicolas paris
>            Priority: Minor
>
> The spark csv writer does not consider "\r" as a newline in string type
> columns. As a result, the resulting csv are not quoted, and they get
> corrupted.
> All \n, \r\n and \r should be considered as newline to allow robust csv
> serialization.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
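The quoting rule requested in the issue description (treat \n, \r\n and \r alike as newlines inside string columns) can be illustrated with a small standalone sketch. This is not Spark's CSV writer; the helper names quote_field and to_csv_line are hypothetical and exist only for this illustration, and the comma delimiter and double-quote character are assumed defaults.

```python
import csv
import io

# Characters that force a field to be quoted. "\r" is listed next to "\n"
# so that an old Mac line ending inside a string does not split the record.
# (Assumption: "," is the delimiter and '"' is the quote character.)
SPECIAL_CHARS = (",", '"', "\n", "\r")


def quote_field(value: str) -> str:
    """Quote a field if it contains a delimiter, a quote, or any newline."""
    if any(ch in value for ch in SPECIAL_CHARS):
        # Embedded quotes are escaped by doubling them.
        return '"' + value.replace('"', '""') + '"'
    return value


def to_csv_line(fields) -> str:
    """Join fields into one CSV record terminated by a line feed."""
    return ",".join(quote_field(f) for f in fields) + "\n"


record = ["1", "first line\rsecond line", "plain"]
line = to_csv_line(record)

# Round trip through the standard library parser: the embedded "\r"
# stays inside the second field instead of starting a new record.
parsed = next(csv.reader(io.StringIO(line, newline="")))
```

Because the field containing "\r" is quoted, a CSV parser reads it back as a single value, which is the behaviour the report asks for.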
[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings
[ https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912813#comment-16912813 ]

Hyukjin Kwon commented on SPARK-28779:
--------------------------------------

That will be available from Spark 3.0.0 which will be released soon.
[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings
[ https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912655#comment-16912655 ]

nicolas paris commented on SPARK-28779:
---------------------------------------

I cannot find the lineSep option in the csv method of the DataFrameReader/DataFrameWriter API. It exists for the json method; maybe that's what you were thinking of?

https://spark.apache.org/docs/2.4.0/api/scala/index.html#org.apache.spark.sql.DataFrameReader@csv(paths:String*):org.apache.spark.sql.DataFrame
[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings
[ https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911857#comment-16911857 ]

Hyukjin Kwon commented on SPARK-28779:
--------------------------------------

You can set {{lineSep}} option.