[jira] [Comment Edited] (SPARK-33566) Incorrectly Parsing CSV file

2020-11-26 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239256#comment-17239256
 ] 

Yang Jie edited comment on SPARK-33566 at 11/26/20, 1:12 PM:
-

I think the reason for the bad case is that Spark uses "STOP_AT_DELIMITER" as the 
default "UnescapedQuoteHandling" when building the "CsvParser". Setting 
"UnescapedQuoteHandling" to "STOP_AT_CLOSING_QUOTE" seems to resolve this 
issue, but Spark does not currently support configuring this option. 
[~hyukjin.kwon] [~moresmores]
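
To illustrate the difference between the two strategies, here is a minimal, self-contained Java sketch. It is a toy single-line parser written only for illustration and is not univocity's implementation: the class name UnescapedQuoteDemo, the Handling enum and parseLine are hypothetical, the sample line is made up rather than taken from the reporter's file, and the real parser may produce different field contents in edge cases.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only (hypothetical names); not univocity's actual algorithm.
class UnescapedQuoteDemo {
    enum Handling { STOP_AT_DELIMITER, STOP_AT_CLOSING_QUOTE }

    // Splits one CSV line on ',' with '"' as both quote and quote-escape character.
    static List<String> parseLine(String line, Handling handling) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (i <= n) {
            StringBuilder field = new StringBuilder();
            if (i < n && line.charAt(i) == '"') {
                int start = i;            // remember where the quoted field began
                boolean fallBack = false; // set when we give up on quoted parsing
                i++;                      // skip the opening quote
                while (i < n) {
                    char c = line.charAt(i);
                    if (c != '"') { field.append(c); i++; continue; }
                    // A quote followed by a delimiter or end of line closes the field.
                    if (i + 1 == n || line.charAt(i + 1) == ',') { i++; break; }
                    // A doubled quote is an escaped quote.
                    if (line.charAt(i + 1) == '"') { field.append('"'); i += 2; continue; }
                    // Anything else is an unescaped quote inside a quoted field.
                    if (handling == Handling.STOP_AT_DELIMITER) { fallBack = true; break; }
                    field.append('"');    // STOP_AT_CLOSING_QUOTE: keep it and go on
                    i++;
                }
                if (fallBack) {
                    // Treat the field as unquoted: take raw text up to the next delimiter.
                    int end = line.indexOf(',', start);
                    if (end < 0) end = n;
                    field.setLength(0);
                    field.append(line, start, end);
                    i = end;
                }
            } else {
                // Unquoted field: everything up to the next delimiter.
                int end = line.indexOf(',', i);
                if (end < 0) end = n;
                field.append(line, i, end);
                i = end;
            }
            fields.add(field.toString());
            i++; // skip the delimiter, or step past the end to terminate
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "abc,\"Super smart \"guy\" he is, though shy\",xyz";
        for (Handling h : Handling.values()) {
            List<String> fields = parseLine(line, h);
            System.out.println(h + " -> " + fields.size() + " fields: " + fields);
        }
    }
}
```

With this sample line, the STOP_AT_DELIMITER branch splits the quoted text at its embedded comma and yields four fields, while the STOP_AT_CLOSING_QUOTE branch keeps the quoted text together and yields three.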


was (Author: luciferyang):
I think the reason for the bad case is that Spark uses "STOP_AT_DELIMITER" as the 
default "UnescapedQuoteHandling" when building the "CsvParser". Setting 
"UnescapedQuoteHandling" to "STOP_AT_CLOSING_QUOTE" seems to resolve this 
issue. [~hyukjin.kwon] [~moresmores]

> Incorrectly Parsing CSV file
> 
>
> Key: SPARK-33566
> URL: https://issues.apache.org/jira/browse/SPARK-33566
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.7
> Reporter: Stephen More
> Priority: Minor
>
> Here is a test case: 
> [https://github.com/mores/maven-examples/blob/master/comma/src/test/java/org/test/CommaTest.java]
> It shows how I believe Apache Commons CSV and opencsv correctly parse the 
> sample CSV file.
> Spark does not parse the sample CSV file correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-33566) Incorrectly Parsing CSV file

2020-11-25 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239049#comment-17239049
 ] 

Hyukjin Kwon edited comment on SPARK-33566 at 11/26/20, 3:57 AM:
-

Here is the output from running mvn clean test:

Running org.test.CommaTest

{code}
2020-11-25 17:55:45,728 INFO [CommaTest:12]
OpenCsv
2020-11-25 17:55:45,758 INFO [CommaTest:19] h1 h3 h2
2020-11-25 17:55:45,758 INFO [CommaTest:19] one three two
2020-11-25 17:55:45,760 INFO [CommaTest:19] abc xyz ^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!
2020-11-25 17:55:45,763 INFO [CommaTest:26]

spark
2020-11-25 17:55:46,464 WARN [NativeCodeLoader:62] Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2020-11-25 17:55:55,299 INFO [CommaTest:36] Count: 2
2020-11-25 17:55:55,449 INFO [CommaTest:41] one three two
2020-11-25 17:55:55,449 INFO [CommaTest:41] abc sans-serif;"">Referral from Joe 
Smith. Fred is hard working. Super smart "^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!
{code}



was (Author: moresmores):
Here is the output from running mvn clean test:

Running org.test.CommaTest
{code}
2020-11-25 17:55:45,728 INFO [CommaTest:12]
OpenCsv
2020-11-25 17:55:45,758 INFO [CommaTest:19] h1 h3 h2
2020-11-25 17:55:45,758 INFO [CommaTest:19] one three two
2020-11-25 17:55:45,760 INFO [CommaTest:19] abc xyz ^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!
2020-11-25 17:55:45,763 INFO [CommaTest:26]

spark
2020-11-25 17:55:46,464 WARN [NativeCodeLoader:62] Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2020-11-25 17:55:55,299 INFO [CommaTest:36] Count: 2
2020-11-25 17:55:55,449 INFO [CommaTest:41] one three two
2020-11-25 17:55:55,449 INFO [CommaTest:41] abc sans-serif;"">Referral from Joe 
Smith. Fred is hard working. Super smart "^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!
{code}




[jira] [Comment Edited] (SPARK-33566) Incorrectly Parsing CSV file

2020-11-25 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239049#comment-17239049
 ] 

Hyukjin Kwon edited comment on SPARK-33566 at 11/26/20, 3:57 AM:
-

Here is the output from running mvn clean test:

Running org.test.CommaTest
{code}
2020-11-25 17:55:45,728 INFO [CommaTest:12]
OpenCsv
2020-11-25 17:55:45,758 INFO [CommaTest:19] h1 h3 h2
2020-11-25 17:55:45,758 INFO [CommaTest:19] one three two
2020-11-25 17:55:45,760 INFO [CommaTest:19] abc xyz ^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!
2020-11-25 17:55:45,763 INFO [CommaTest:26]

spark
2020-11-25 17:55:46,464 WARN [NativeCodeLoader:62] Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
2020-11-25 17:55:55,299 INFO [CommaTest:36] Count: 2
2020-11-25 17:55:55,449 INFO [CommaTest:41] one three two
2020-11-25 17:55:55,449 INFO [CommaTest:41] abc sans-serif;"">Referral from Joe 
Smith. Fred is hard working. Super smart "^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!
{code}


was (Author: moresmores):
{{Here is the output from running mvn clean test}}

 

{{Running org.test.CommaTest}}
{{2020-11-25 17:55:45,728 INFO [CommaTest:12] }}
{{OpenCsv}}
{{2020-11-25 17:55:45,758 INFO [CommaTest:19] h1 h3 h2}}
{{2020-11-25 17:55:45,758 INFO [CommaTest:19] one three two}}
{{2020-11-25 17:55:45,760 INFO [CommaTest:19] abc xyz ^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!}}
{{2020-11-25 17:55:45,763 INFO [CommaTest:26] }}
{{spark}}
{{2020-11-25 17:55:46,464 WARN [NativeCodeLoader:62] Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable}}
{{2020-11-25 17:55:55,299 INFO [CommaTest:36] Count: 2}}
{{2020-11-25 17:55:55,449 INFO [CommaTest:41] one three two}}
{{2020-11-25 17:55:55,449 INFO [CommaTest:41] abc sans-serif;"">Referral from 
Joe Smith. Fred is hard working. Super smart "^@Referral from Joe Smith. Fred is 
hard working. Super smart, though you wouldnt know it at first. 6 
months, and we sold this project. Phooey he said to me! Whats up 
with you people. Youll say anything for a sale! Until he met me of 
coursehaar haar!Internet is spottyWorking while at home so. Will be 
applied this weekend. On Bill Recovery and 20 yr warranty 
added.Kindness made this deal happen!}}
