Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21631
@MaxGekk, thanks. mind opening a PR to upgrade?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
> do we still hit the bug when parsing csv data?
I have checked uniVocity 2.7.2; there is no problem in this version.
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
oh, super quick fix ;) Thanks, @MaxGekk
In the master, do we still hit the bug when parsing csv data?
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
The bug has already been fixed in uniVocity `2.6.5-SNAPSHOT`
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
> please take a following action. Will help and check if it's needed.
I have opened the issue for uniVocity parser:
https://github.com/uniVocity/univocity-parsers/issues/250
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21631
Merged to master.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21631
LGTM.
@MaxGekk please take a following action. Will help and check if it's needed.
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
@HyukjinKwon BTW, can you check this?
@MaxGekk I think it would be better to file a new JIRA for the point
you're looking into.
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
> v2.5.9 also have the same behaviour?
yes, it is the same.
> Anyway, it'd be better to ask the author ;) I asked before and I got
quick response.
ok. I will create an
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
v2.5.9 also have the same behaviour? Anyway, it'd be better to ask the
author ;) I asked before and I got quick response.
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
Here is the test for uniVocity parser:
https://github.com/MaxGekk/univocity_tests . For the first line, `parseLine`
outputs an empty array, but it returns `null` for the next calls. What do you think should I
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92317/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92317 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92317/testReport)**
for PR 21631 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92317 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92317/testReport)**
for PR 21631 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/469/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test PASSed.
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
> But first we should be sure if it's a bug or not for this anyway.
I will try to reproduce it on a small example without Spark. I am not sure
what the expected behavior should be if set of
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21631
I mean `null is returned for valid input string "8"`. I thought this is a
bug. If there's valid case returning `null`, yea we should handle `null` of
course but the case you mentioned sounds
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
> So you mean it's a bug in Univocity?
No, I mean we don't handle `null` from Univocity's `parseLine` at all (in
other situations), and we just propagate `NullPointerException` to a user
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21631
So you mean it's a bug in Univocity? That's another fix for a bug existing
in Univocity then. We could work around this bug if it's clear that's a bug. I
would suggest opening a bug there if we
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
It seems that returning `null` from Univocity's `parseLine` is the normal way
to indicate an error. Should we handle `null`s and throw `BadRecordException`
instead of propagating the NPE to the user's app?
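
A minimal sketch of that idea in plain Scala, with no Spark or uniVocity dependency. `BadRecord` is a hypothetical stand-in for Spark's `BadRecordException`, and `requireTokens` and `parseAll` are names made up for illustration; the real classes and call sites may look quite different.

```scala
object BadRecordSketch {
  // Hypothetical stand-in for Spark's BadRecordException; not the real class.
  final case class BadRecord(input: String, reason: String)
    extends RuntimeException(s"$reason: $input")

  // Turns a null token array into a descriptive error instead of a later NPE.
  def requireTokens(input: String, tokens: Array[String]): Array[String] =
    if (tokens == null) throw BadRecord(input, "parser returned null")
    else tokens

  // A caller can then decide what to do with bad records; here they are dropped.
  def parseAll(lines: Seq[String], parse: String => Array[String]): Seq[Array[String]] =
    lines.flatMap { line =>
      try Some(requireTokens(line, parse(line)))
      catch { case _: BadRecord => None }
    }
}
```

The point of the dedicated exception type is that the caller can tell a malformed record apart from an unrelated failure and choose a policy for it, rather than seeing a bare `NullPointerException`.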
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
@MaxGekk yea, I noticed that behaviour. Probably, in case we set an empty
array in `CommonSettings.selectIndexes`, it seems `UnivocityParser` returns
null for valid input? I'm not sure setting an
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
I have found the places inside UnivocityParser where the `null` comes
from. It is interesting that `null` is returned for the valid input string `"8"`.
See the screenshot:
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21631
Wouldn't it be better to check the schema instead of the value for each record?
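
One possible reading of this suggestion, sketched in plain Scala: decide once, from the schema alone, whether the parser needs to be called at all, instead of inspecting each parsed value. `makeConverter`, `requiredColumns`, and `parse` are hypothetical names used only for illustration and are not taken from the PR.

```scala
object SchemaCheckSketch {
  // Builds the per-record function once, based on the schema alone.
  def makeConverter(requiredColumns: Seq[String],
                    parse: String => Array[String]): String => Seq[String] =
    if (requiredColumns.isEmpty) {
      // No columns are needed, so the parser is never called.
      (_: String) => Seq.empty[String]
    } else {
      (line: String) => parse(line).toSeq
    }
}
```

With an empty set of required columns the returned function never touches the parser, so whatever the parser would return for that configuration no longer matters.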
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21631
Looking at the `NullPointerException`, it comes from the line:
```scala
if (tokens.length != schema.length) {
```
where `tokens` is the `null` returned by `parseLine` of `UnivocityParser`.
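
A minimal sketch of why such a check fails on a `null` array, in plain Scala with no Spark or uniVocity dependency. `fakeParseLine` is a hypothetical stand-in for the parser call that returns `null` for some input; all names here are assumptions for illustration, not Spark's actual code.

```scala
import scala.util.Try

object NullTokensSketch {
  // Hypothetical stand-in for a parser call that may return null.
  def fakeParseLine(line: String): Array[String] =
    if (line.isEmpty) null else line.split(",", -1)

  // Mirrors the shape of the failing check: dereferences tokens unconditionally.
  def unsafeLengthMatches(tokens: Array[String], schemaLength: Int): Boolean =
    tokens.length == schemaLength

  // Option(null) is None, so the null case is handled without dereferencing.
  def safeLengthMatches(tokens: Array[String], schemaLength: Int): Boolean =
    Option(tokens).exists(_.length == schemaLength)
}
```

Wrapping the result in `Option` makes the `null` case explicit at the call site; whether a `null` should count as a mismatch or as an error is a separate decision.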
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92293/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92293 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92293/testReport)**
for PR 21631 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92293 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92293/testReport)**
for PR 21631 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/459/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test PASSed.
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
retest this please
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
yea, I checked the two queries with/without column pruning in the master;
```
./bin/spark-shell --conf spark.sql.csv.parser.columnPruning.enabled=true
(default)
scala> val dir =
```
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92282 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92282/testReport)**
for PR 21631 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92284/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92282/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92284 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92284/testReport)**
for PR 21631 at commit
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21631
Both?
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
yea, I think these are regressions because I checked that the query above
passed before[ this
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21631
@maropu Could you confirm whether these two bugs are regressions in the
master branch?
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
As I described in
https://github.com/apache/spark/pull/21625#discussion_r197679077, I found
another bug? (the case where
`spark.sql.csv.parser.columnPruning.enabled=false`) when working on this pr;
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92284 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92284/testReport)**
for PR 21631 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/451/
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
@HyukjinKwon sure, I'll do
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21631
**[Test build #92282 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92282/testReport)**
for PR 21631 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/449/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21631
Merged build finished. Test PASSed.
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21631
cc: @HyukjinKwon @MaxGekk
---