[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
@justinuang, this might affect existing users application. Although this 
matches the behaviour to non-miltiline mode, can we explicitly mention it in 
migration guide?

cc @cloud-fan and @gatorsmile 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
Merged to master.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97496/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #97496 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97496/testReport)**
 for PR 22503 at commit 
[`040047b`](https://github.com/apache/spark/commit/040047b696c58496ea3da274fa2c58166d31b100).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #97496 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97496/testReport)**
 for PR 22503 at commit 
[`040047b`](https://github.com/apache/spark/commit/040047b696c58496ea3da274fa2c58166d31b100).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-17 Thread justinuang
Github user justinuang commented on the issue:

https://github.com/apache/spark/pull/22503
  
done!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
@justinuang, okay. Mind rebasing this please?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-16 Thread justinuang
Github user justinuang commented on the issue:

https://github.com/apache/spark/pull/22503
  
So Hadoop's LineReader looks like it handles CR, LF, CRLF:


https://github.com/apache/hadoop/blob/f90c64e6242facf38c2baedeeda42e4a8293e642/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java#L36

Univocity handles CR, LF, CRLF (the logic is a bit convoluted but it looks 
like they have the same behavior in that if they see a CR, they will look for a 
LF next):


https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/input/LineSeparatorDetector.java

I do agree we should expose the option of `setLineSeparator`, but 
regardless of that, the default behavior of handling CR, LF, CRLF should be the 
same between single line and multiline mode.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97359/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #97359 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97359/testReport)**
 for PR 22503 at commit 
[`695f676`](https://github.com/apache/spark/commit/695f6760e239a781ad8fb0b1e428944e73f79563).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
I haven't checked what `setLineSeparatorDetectionEnabled` does explicitly 
yet in Univocity parser. Is this exactly same behaviour when we read it via 
Hadoop's `LineRecordReader`? Also how does it work with `setLineSeparator`? 
Essentially we should expose this option too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #97359 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97359/testReport)**
 for PR 22503 at commit 
[`695f676`](https://github.com/apache/spark/commit/695f6760e239a781ad8fb0b1e428944e73f79563).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96865/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96865 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96865/testReport)**
 for PR 22503 at commit 
[`695f676`](https://github.com/apache/spark/commit/695f6760e239a781ad8fb0b1e428944e73f79563).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96865 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96865/testReport)**
 for PR 22503 at commit 
[`695f676`](https://github.com/apache/spark/commit/695f6760e239a781ad8fb0b1e428944e73f79563).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-02 Thread mccheah
Github user mccheah commented on the issue:

https://github.com/apache/spark/pull/22503
  
@HyukjinKwon is this ready to be merged in, or is there more feedback to be 
addressed?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-28 Thread justinuang
Github user justinuang commented on the issue:

https://github.com/apache/spark/pull/22503
  
What does it take to get this to be merged in?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-26 Thread justinuang
Github user justinuang commented on the issue:

https://github.com/apache/spark/pull/22503
  
Sounds good, thanks guys =)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
Seems fine but I or someone else should take a closer look before getting 
this in.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96559/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96559 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96559/testReport)**
 for PR 22503 at commit 
[`812e4c5`](https://github.com/apache/spark/commit/812e4c58adfb73c6f90b08600dc43e4a0bd88921).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22503
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96556/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96556 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96556/testReport)**
 for PR 22503 at commit 
[`67d11f1`](https://github.com/apache/spark/commit/67d11f11201461b7d8f6389c7176c7e50f3fdd7b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96559 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96559/testReport)**
 for PR 22503 at commit 
[`812e4c5`](https://github.com/apache/spark/commit/812e4c58adfb73c6f90b08600dc43e4a0bd88921).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96556 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96556/testReport)**
 for PR 22503 at commit 
[`67d11f1`](https://github.com/apache/spark/commit/67d11f11201461b7d8f6389c7176c7e50f3fdd7b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
Mind explaining what `setLineSeparatorDetectionEnabled` does in the PR 
description as well?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-25 Thread justinuang
Github user justinuang commented on the issue:

https://github.com/apache/spark/pull/22503
  
It looks like a flake? Can someone retrigger it?

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96511/console


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-09-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22503
  
**[Test build #96511 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96511/testReport)**
 for PR 22503 at commit 
[`67d11f1`](https://github.com/apache/spark/commit/67d11f11201461b7d8f6389c7176c7e50f3fdd7b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org