[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread jurriaan
GitHub user jurriaan opened a pull request: https://github.com/apache/spark/pull/13267 [SPARK-15493][SQL] Allow setting the quoteEscapingEnabled flag when writing CSV ## What changes were proposed in this pull request? See https://github.com/uniVocity/univocity-parsers/blo

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread jurriaan
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221115470 cc @rxin @HyukjinKwon --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have th

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221115849 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13267#discussion_r64312653 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -364,6 +364,33 @@ class CSVSuite extends QueryTes

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13267#discussion_r64312722 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -364,6 +364,33 @@ class CSVSuite extends QueryTes

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13267#discussion_r64312893 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -364,6 +364,33 @@ class CSVSuite extends QueryTes

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/13267#discussion_r64313317 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -364,6 +364,33 @@ class CSVSuite extends QueryTes

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221137806 @jurriaan Just to double check.. It does not escape `quote`s if `quote` and/or `escape` are not set? I think they might better be documented..

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221171295 **[Test build #3011 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3011/consoleFull)** for PR 13267 at commit [`caf8808`](https://g

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread jurriaan
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221176282 @HyukjinKwon If you don't supply those options they are set to the defaults. For the workings of the setQuoteEscapingEnabled see https://github.com/uniVocity/univocity

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-23 Thread jurriaan
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221176753 @HyukjinKwon Addressed your comments and improved the documentation a bit.

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221185335 **[Test build #3011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3011/consoleFull)** for PR 13267 at commit [`caf8808`](https://

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221370006 Can we explain using an example what this does when it is off?

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread jurriaan
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221380701 @rxin An example using the following dataframe: ``` spark.createDataFrame([['test "quote"', 123, 'it "works"!', '"very" well']]) ``` The d
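
A rough sketch of what the flag under discussion changes when writing, using the row from the example above. This is plain Python, not Spark's or univocity's code: `format_field` and its parameters are hypothetical names chosen for illustration, the backslash escape character and double-quote quote character are assumed, and escaping of the escape character itself is left out for brevity.

```python
def format_field(value, quote='"', escape="\\", delimiter=",", escape_quotes=True):
    """Hypothetical helper that formats one CSV field (illustration only)."""
    text = str(value)
    # A field must always be quoted if it contains the delimiter or a newline.
    needs_quoting = delimiter in text or "\n" in text
    # With the flag on, a field containing a quote character is quoted as well.
    if escape_quotes and quote in text:
        needs_quoting = True
    if not needs_quoting:
        # Flag off and nothing else forces quoting: the value is written as-is.
        return text
    # Inside a quoted field, each quote character is prefixed by the escape character.
    return quote + text.replace(quote, escape + quote) + quote


row = ['test "quote"', 123, 'it "works"!', '"very" well']

line_on = ",".join(format_field(v, escape_quotes=True) for v in row)
line_off = ",".join(format_field(v, escape_quotes=False) for v in row)

print(line_on)
print(line_off)
```

With the flag off, the embedded quotes reach the output unchanged, so a reader that treats `"` as the quote character may misinterpret fields such as `"very" well`. With the flag on, those fields are enclosed in quotes and the embedded quotes are escaped.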

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221385900 Thanks - a follow up question: should this flag ever be false?

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread jurriaan
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221387961 @rxin Good question, I'm not sure what's the best approach here. It looks like setting the flag to true by default could be a good choice. The comment at [htt

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221390796 @jbax can we get a 2nd opinion here about quoteEscapingEnabled?

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread jbax
Github user jbax commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221408197 It's disabled by default because earlier versions were slower when writing CSV and it helped a little bit. Also because parsing unquoted values is faster. Wi

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221449233 Thanks, @jbax. Given this I think we should just have it on by default. Some follow-up questions: 1. When will 2.2.x come out? 2. We should probably up

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread jbax
Github user jbax commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221454486 @rxin In your case I think it's better to have this turned on by default. Regarding your other questions: 1 - There's no timeline. 2.2.x will come out when new featu

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread falaki
Github user falaki commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221468338 @rxin and @jurriaan I agree to keep it set by default. However, I think it is better to leave it configurable. In two cases before, I assumed a reasonable default value

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221469276 @jurriaan want to do the change?

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221469255 Yea I agree with escapeQuotes.

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread jurriaan
Github user jurriaan commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221486596 @rxin Done :)

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13267#discussion_r64522811 --- Diff: python/pyspark/sql/readwriter.py --- @@ -787,6 +787,9 @@ def csv(self, path, mode=None, compression=None, sep=None, quote=None, escape=No

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13267#discussion_r64522859 --- Diff: sql/core/pom.xml --- @@ -39,7 +39,7 @@ com.univocity univocity-parsers - 2.1.0 + 2.1.1 --- End di

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221487940 **[Test build #3018 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3018/consoleFull)** for PR 13267 at commit [`8c4bef1`](https://g

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-24 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221489074 BTW don't forget to update the title too.

[GitHub] spark pull request: [SPARK-15493][SQL] Allow setting the quoteEsca...

2016-05-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/13267#issuecomment-221508989 **[Test build #3018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3018/consoleFull)** for PR 13267 at commit [`8c4bef1`](https://