[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings

2019-08-22 Thread nicolas paris (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913151#comment-16913151
 ] 

nicolas paris commented on SPARK-28779:
---

Good to know, thanks.

> CSV writer doesn't handle older Mac line endings
> 
>
> Key: SPARK-28779
> URL: https://issues.apache.org/jira/browse/SPARK-28779
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.3.0, 2.4.0
> Reporter: nicolas paris
> Priority: Minor
>
> The Spark CSV writer does not treat "\r" as a newline in string-type 
> columns. As a result, fields containing it are not quoted, and the 
> resulting CSV files get corrupted.
> All of \n, \r\n and \r should be treated as newlines to allow robust CSV 
> serialization.
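
A minimal sketch of the quoting rule requested above, written in plain Python rather than against Spark. The helper names `needs_quoting` and `format_row` are hypothetical and are not part of any Spark API; the sketch only illustrates why a field containing a lone \r has to be quoted, assuming the usual convention of doubling embedded quote characters.

```python
import csv


def needs_quoting(field, delimiter=",", quote='"'):
    """Return True if the field must be quoted to survive a round trip."""
    # Treat \n and \r as line breaks; this also covers \r\n.
    return any(ch in field for ch in (delimiter, quote, "\n", "\r"))


def format_row(fields, delimiter=",", quote='"'):
    """Serialize one record, quoting only the fields that need it."""
    out = []
    for field in fields:
        if needs_quoting(field, delimiter, quote):
            # Double any embedded quote characters, then wrap the field.
            out.append(quote + field.replace(quote, quote * 2) + quote)
        else:
            out.append(field)
    return delimiter.join(out)


row = ["1", "first line\rsecond line"]

# Without quoting, a reader that honours \r as a line break sees two records.
unquoted = ",".join(row)
print(unquoted.splitlines())  # ['1,first line', 'second line']

# With quoting, a CSV parser recovers the original two fields.
quoted = format_row(row)
parsed = list(csv.reader([quoted]))
print(parsed == [row])  # True
```

The unquoted form is what corrupts the output: any downstream tool that accepts \r as a record separator splits the row in the middle of the string column.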



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings

2019-08-21 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912813#comment-16912813
 ] 

Hyukjin Kwon commented on SPARK-28779:
--

The {{lineSep}} option for CSV will be available from Spark 3.0.0, which will be released soon.




[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings

2019-08-21 Thread nicolas paris (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912655#comment-16912655
 ] 

nicolas paris commented on SPARK-28779:
---

I cannot find the lineSep option in the DataFrameReader/DataFrameWriter csv method API. 
It exists for the json method; maybe that is what you were thinking of?

https://spark.apache.org/docs/2.4.0/api/scala/index.html#org.apache.spark.sql.DataFrameReader@csv(paths:String*):org.apache.spark.sql.DataFrame




[jira] [Commented] (SPARK-28779) CSV writer doesn't handle older Mac line endings

2019-08-20 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911857#comment-16911857
 ] 

Hyukjin Kwon commented on SPARK-28779:
--

You can set the {{lineSep}} option.
