Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/17177
@ep1804 @jbax Thank you. I will cc you both when I see a PR bumping the
version to 2.4.0 (or, more likely, I will open it myself).
---
If your project is set up for it, you can
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
I agree with you, @HyukjinKwon. This PR will be closed for now and reopened later.
And thank you for the notice, @jbax!
Github user jbax commented on the issue:
https://github.com/apache/spark/pull/17177
2.4.0 released, thank you guys!
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/17177
I think it is good to add, as it is also described in the univocity library, but
I am not too sure it is worth exposing an option that currently has a small
bug. Maybe we could close for now and
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
As mentioned in https://github.com/uniVocity/univocity-parsers/issues/143,
for proper handling of escape characters, the uniVocity option
`escapeUnquotedValues` is also required.
I added
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/17177
Thank you so much both for your efforts and time.
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
The bug is fixed in uniVocity.
https://github.com/uniVocity/univocity-parsers/issues/143
This will be considered (on Monday).
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
@jbax
They are not CSV inputs. They are RAW TEXT.
Github user jbax commented on the issue:
https://github.com/apache/spark/pull/17177
Doesn't seem correct to me. All test cases use broken CSV and trigger
the parser's handling of unescaped quotes, where it tries to rescue the data and
produce something sensible. See my test case
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
Documentation for DataFrameReader, DataFrameWriter, DataStreamReader,
readwriter.py and streaming.py has been written. Please check.
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
An issue has been raised for the uniVocity parser:
https://github.com/uniVocity/univocity-parsers/issues/143
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
Thank you @HyukjinKwon .
I made changes following your comments:
* `escapeQuoteEscaping` instead of `escapeEscape`
* default value set to `\u` (unset)
* `withTempPath`
*