Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216772060
Thank you @jbax. I will try to do so after checking whether I can identify any
useful changes for Spark.
---
If your project is set up for it, you can reply to this em
Github user jbax commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216770729
By the way, may I suggest that you upgrade to version 2.1.0? It comes
with substantial performance improvements for parsing and writing CSV.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216765934
Closing this. I may bring this up again in
https://github.com/apache/spark/pull/12268.
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/12818
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216765525
Yeah, I haven't spent much time looking, but it doesn't seem worth it to me to
remove this. We might also use comment in the future.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216765450
@rxin If it is too minor to merge, I can close this and do it in another
PR, maybe after investigating the newline issue discussed above.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216764368
@rxin In terms of functionality and performance, no.
But it shortens the code, and I thought it was confusing whether the `comment`
option in `CSVOptions` affects
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216763362
Are there any benefits to removing this?
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216763118
@rxin To cut it short, I got confirmation from the original author of
Univocity that `setComment()` has no effect as long as Spark does not write
comments from `DataFrame
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216761918
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216761919
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216761768
**[Test build #57720 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57720/consoleFull)**
for PR 12818 at commit
[`3b289d9`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216750238
**[Test build #57720 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57720/consoleFull)**
for PR 12818 at commit
[`3b289d9`](https://gi
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216750176
retest this please
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216749947
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216749940
**[Test build #57718 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57718/consoleFull)**
for PR 12818 at commit
[`3b289d9`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216749945
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216749347
**[Test build #57718 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57718/consoleFull)**
for PR 12818 at commit
[`3b289d9`](https://gi
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216748780
Oh, I misunderstood your first comment. I think I should not take out
`setLineSeparator()` here, but maybe I should open another issue ticket to set
`normalizeLineEnd
Github user jbax commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216747285
`foo` and `bar` are part of the same value; they just happen to have a line
ending in between. And yes, `setLineSeparator()` is related to the values
themselves when writing
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216743678
@jbax Ah, I guess `foo` and `bar` are separate rows, right? `stripLineEnd`
will be applied for each row.
If I got you wrong and `setLineSeparator()` is rela
Github user jbax commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216743260
What happens if you do this:
```
scala> "foo\r\nbar\r\n".stripLineEnd
```
Shouldn't the result be this?
```
res0: String = foo\r\n
bar
```
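For reference, a minimal sketch of the standard-library behaviour in question. It uses only Scala's `stripLineEnd`; the object name `StripLineEndSketch` and the helper `strip` are hypothetical names chosen for illustration and do not come from Spark or Univocity. As far as the Scala documentation describes it, `stripLineEnd` removes at most one trailing line terminator, so a line ending embedded inside a value is left alone.

```scala
object StripLineEndSketch {
  // Hypothetical helper: delegates to the standard library's stripLineEnd,
  // which drops a single trailing LF (and the CR before it, if present).
  def strip(value: String): String = value.stripLineEnd

  def main(args: Array[String]): Unit = {
    // One value that contains an embedded CRLF and ends with a CRLF.
    val multiLine = "foo\r\nbar\r\n"
    val stripped = strip(multiLine)

    // Escape the control characters so the result is readable on one line.
    println(stripped.replace("\r", "\\r").replace("\n", "\\n"))
    // prints: foo\r\nbar
  }
}
```

Only the trailing `\r\n` is removed; the one between `foo` and `bar` stays in the value, so its form in the output depends on how the writer handles line endings inside values.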
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216740931
@jbax Cool! Thank you for the detailed explanation.
So, this uses the OS default newline without `setLineSeparator()`, which is
trimmed
[here](https://github.com/a
Github user jbax commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216738292
I just read the rest of this ticket. Be careful with
`setLineSeparator()`. It uses the default OS line separator, but that's not
always desired.
By default, th
Github user jbax commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216737346
Confirmed. It is only used if you call `CsvWriter.commentRow()` or
`CsvWriter.commentRowToString()` to write comments to the output.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216734653
Please allow me to cc you, @jbax, who I guess is the author of the Univocity parser.
Could you please confirm that `Format.setComment()` has no effect if we
only call
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216733974
Hi @falaki, could you take a quick look? It won't take long!
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216024734
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216024726
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216024733
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216024727
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216024701
**[Test build #57470 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57470/consoleFull)**
for PR 12818 at commit
[`9b570db`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216024696
**[Test build #57468 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57468/consoleFull)**
for PR 12818 at commit
[`9b570db`](https://g
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216023502
cc @falaki
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216020146
**[Test build #57468 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57468/consoleFull)**
for PR 12818 at commit
[`9b570db`](https://gi
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216020147
**[Test build #57470 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57470/consoleFull)**
for PR 12818 at commit
[`9b570db`](https://gi
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216019892
test this please
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/12818
[MINOR][SQL] Remove not affected settings for writing in CSV.
## What changes were proposed in this pull request?
This PR removes settings that have no effect when writing CSV files.
-