Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
OK. I'll deal with it after the new year break : )
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
Revisions made:
- comment on the default values.
- applied `charToEscapeQuoteEscaping` using the Option type
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
`CSVOptions` no longer sets `charToEscapeQuoteEscaping` to `\u` by
default, which lets the uniVocity parser use `escape` as the
`charToEscapeQuoteEscaping` character
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
When `charToEscapeQuoteEscaping` is not set and `quote` and `escape` are
different, the uniVocity parser uses the `escape` character as
`charToEscapeQuoteEscaping` by default. This is why the test passes
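The fallback described in this comment can be sketched in a few lines of Python. This only illustrates the rule as stated here; it is not uniVocity's or Spark's actual code. The function name `effective_char_to_escape_quote_escaping` and its parameters are hypothetical, and returning `None` when `quote` and `escape` are equal is an assumption, since the comment only covers the case where they differ.

```python
from typing import Optional


def effective_char_to_escape_quote_escaping(
    quote: str,
    escape: str,
    char_to_escape_quote_escaping: Optional[str] = None,
) -> Optional[str]:
    """Resolve which character escapes the quote-escape character.

    Hypothetical helper mirroring the rule described in the comment above.
    """
    # An explicitly configured value always wins.
    if char_to_escape_quote_escaping is not None:
        return char_to_escape_quote_escaping
    # Unset, and quote differs from escape: fall back to the escape character.
    if quote != escape:
        return escape
    # Assumption: when quote == escape, no separate character is needed.
    return None
```

With `quote='"'` and `escape='\\'` and nothing set explicitly, the helper returns the backslash, which matches the behaviour the comment relies on for the test.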
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
Option name changed to `charToEscapeQuoteEscaping`.
Github user ep1804 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20004#discussion_r157391252
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -482,6 +482,36 @@ class CSVSuite extends QueryTest
Github user ep1804 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20004#discussion_r157391233
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -482,6 +482,36 @@ class CSVSuite extends QueryTest
Github user ep1804 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20004#discussion_r157391102
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala
---
@@ -249,6 +249,8 @@ final class DataStreamReader private[sql
Github user ep1804 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20004#discussion_r157389005
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamReader.scala
---
@@ -249,6 +249,8 @@ final class DataStreamReader private[sql
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
This is the original thread: https://github.com/apache/spark/pull/17177
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/20004
@HyukjinKwon
This issue is re-opened. Please check this.
GitHub user ep1804 opened a pull request:
https://github.com/apache/spark/pull/20004
[SPARK-22818][SQL] csv escape of quote escape
## What changes were proposed in this pull request?
Escape of escape should be considered when using the UniVocity csv
encoding/decoding
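To illustrate why an escape for the quote escape matters, here is a deliberately simplified Python sketch of decoding a single quoted field. It is not the uniVocity algorithm; `unquote_field` and its parameters are hypothetical names chosen for this example, and the handling of edge cases is an assumption.

```python
from typing import Optional


def unquote_field(
    text: str,
    quote: str = '"',
    escape: str = "\\",
    escape_of_escape: Optional[str] = None,
) -> str:
    """Decode one quoted field (simplified sketch, hypothetical helper).

    Raises ValueError if the field does not start with the quote character
    or if no closing quote is found.
    """
    if not text.startswith(quote):
        raise ValueError("field must start with the quote character")
    out = []
    i = 1  # skip the opening quote
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else None
        if escape_of_escape is not None and ch == escape_of_escape and nxt == escape:
            # Escaped escape character: emit one literal escape character.
            out.append(escape)
            i += 2
        elif ch == escape and nxt == quote:
            # Escaped quote: emit one literal quote character.
            out.append(quote)
            i += 2
        elif ch == quote:
            # Unescaped quote closes the field.
            return "".join(out)
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated quoted field")
```

For the raw field `"a\\"` (a value ending in a backslash), passing `escape_of_escape="\\"` yields `a\`. Without it, the second backslash is read as escaping the closing quote, so the sketch reports an unterminated field.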
Github user ep1804 closed the pull request at:
https://github.com/apache/spark/pull/17177
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
I agree with you, @HyukjinKwon. This PR will be closed for now and re-opened later.
And thank you for the notice, @jbax!
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
As mentioned in https://github.com/uniVocity/univocity-parsers/issues/143,
for proper handling of escape characters, the uniVocity option
`escapeUnquotedValues` is also required.
I added
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
The bug is fixed in uniVocity.
https://github.com/uniVocity/univocity-parsers/issues/143
This will be considered (on Monday).
Github user ep1804 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17177#discussion_r105376506
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -693,8 +697,8 @@ def text(self, path, compression=None):
@since(2.0)
def csv(self
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
@jbax
They are not CSV inputs. They are RAW TEXT.
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
Documentation for DataFrameReader, DataFrameWriter, DataStreamReader,
readwriter.py and streaming.py has been written. Please check.
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
An issue has been raised for the uniVocity parser:
https://github.com/uniVocity/univocity-parsers/issues/143
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
Thank you @HyukjinKwon .
I made changes following your comments:
* `escapeQuoteEscaping` instead of `escapeEscape`
* default value to `\u` (unset)
* `withTempPath
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
Thank you for the early and detailed response, @HyukjinKwon.
1. About the purpose of the PR: yes, it's about using the escape-a-quote-escape
option. I used the wording 'encoding/decoding' with a general
Github user ep1804 commented on the issue:
https://github.com/apache/spark/pull/17177
cc @HyukjinKwon
GitHub user ep1804 opened a pull request:
https://github.com/apache/spark/pull/17177
[SPARK-19384][SQL] csv encoding/decoding using escape of escape
Escape of escape should be considered when using the UniVocity csv
encoding/decoding library.
I added lines