[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/19448 > But, if I were working on a Spark distribution at a vendor, this is something I would definitely include because it's such a useful feature. I concur :)

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19448 Thank you :) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 Sure, I will, and I will note it in advance next time. I made a mistake while trying to think of reasons for this backport.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19448 I am not really worried about this particular change. It's already merged and it seems a small and safe change. I am not planning to revert it. But, in general, let's avoid merging changes

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/19448 I have a lot of sympathy for the argument that infrastructure software shouldn't have too many backports and that those should be generally bug fixes. But, if I were working on a Spark distribution

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 Okay. I am sorry for this trouble. Should we revert this if you guys feel strongly about it?

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/19448 @HyukjinKwon branch-2.2 is a maintenance branch. I am not sure it is appropriate to merge this change to branch-2.2 since it is not really a bug fix. If the doc is not accurate, we should fix the

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 @steveloughran Thanks for your input. I totally agree with your opinions. Spark is infrastructure software. We have to be very careful when backporting PRs.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 I guess we wouldn't change the docs in branch-2.2 alone as we have a safe fix here for this mismatch anyway. I think I just wanted to say this backport can be justified.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/19448 PS, for people who are interested in dynamic committers, [MAPREDUCE-6823](https://issues.apache.org/jira/browse/MAPREDUCE-6823) is something to look at. It allows you to switch committers

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/19448 Thanks for reviewing this/getting it in. Personally, I had it in the "improvement" category rather than bug fix. If it wasn't for that line in the docs, there'd be no ambiguity about

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 Will check it if I am not confident next time.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 Ok. Next time, please check it with the committers who are familiar with Spark SQL.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 I did this because I was confident it is a bug: the doc says it should work but it actually does not, and the fix does not break the previous support.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 This one has existed since at least Spark 1.5. If you are not confident whether this is a bug or not, please check before merging it.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 How come fixing the behaviour as documented is not a bug fix? I think that basically means we don't backport fixes for things not working as documented for other internal configurations.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 That conf is an internal one. The end users will not see it. This is not a bug fix. We should not extend the existing functions or introduce new behaviors/features in 2.2.x releases.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 Since the risk is low, I did not revert it.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 I think this is a bug to fix as the previous behaviour does not work as documented: ``` subclass of org.apache.hadoop.mapreduce.OutputCommitter... ``` and does not

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 This is not eligible for backporting. We should not do it next time.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 I didn't backport this one, respecting the JIRA issue type, `Improvement`, but yea, it sounds more like a bug fix.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19448 Hi, All. Can we have this in Apache Spark 2.2.1?

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 Merged to master.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82702/ Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Merged build finished. Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82702/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82700/ Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Merged build finished. Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82700/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82702/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/19448 Still +1 from me as well.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82692/ Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Merged build finished. Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82692 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82692/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82700/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19448 LGTM pending Jenkins

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19448 Still LGTM except for a few nits.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82692 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82692/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82680/ Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Merged build finished. Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82680/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82680/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19448 retest this please

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Merged build finished. Test FAILed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82643/ Test FAILed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82643/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82643/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/19448 +1 I completely agree that using a ParquetOutputCommitter should be optional.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82517/ Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82517 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82517/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19448 Merged build finished. Test PASSed.

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19448 **[Test build #82517 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82517/testReport)** for PR 19448 at commit

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/19448 + @rdblue