[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/21066

+1. One thing to consider here is to be ruthless when bits of the HDFS APIs/libraries don't suit: rather than think "how do we work around this", think "what do we need to do to get this fixed". This includes (based on the HBase & Hive experiences):

* what's marked stable
* serialization of classes
* pulling up of operations from HDFS to the public FileSystem API (a source of some contention between myself and the HDFS team as to what constitutes acceptable specification and tests)
* thread safety (HBase & encrypted IO)
* various constants in HDFS interfaces tagged as private, etc.

BTW, I'm thinking of retiring the MRv1 commit APIs, initially by marking them as deprecated. I'd match that with something to pre-emptively move Spark onto the V2 one. After all, it's all bridged internally.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21066 If we're considering only supporting Hadoop 3 in Spark 3 -- and I think we should -- this could even go into the main source tree.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/21066 The main barrier to this is the what-do-we-do-about-Hive problem, as without a fix ASF Spark doesn't run against Hadoop 3.x. It looks like "support Hive 2" is the plan there, *which is the right thing to do long term*. Short term, well, we're actually shipping this and the patched Hive 1.2.x artifacts in HDP-3.0, qualifying through our own tests, etc. I'm happy with it. It's also worth noting that there's work ongoing in Hadoop 3.2-3.3 to add multipart upload as an explicit API across filesystems, so you'll be able to write committers which can use multipart upload & commit across stores.
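To illustrate why multipart upload is useful to a committer, here is a toy Python sketch of the general idea: parts are uploaded while tasks run, but nothing becomes visible at the destination until the upload is completed at job commit. This is not the Hadoop API. `ToyStore`, `ToyCommitter` and all their methods are hypothetical names invented for this illustration, and the store is just an in-memory dict.

```python
import uuid
from typing import Dict, List, Tuple


class ToyStore:
    """In-memory stand-in for an object store with multipart upload (hypothetical)."""

    def __init__(self) -> None:
        self.visible: Dict[str, bytes] = {}  # objects readers can see
        self._pending: Dict[str, Tuple[str, Dict[int, bytes]]] = {}

    def initiate(self, dest: str) -> str:
        """Start an upload to dest and return its upload id."""
        upload_id = uuid.uuid4().hex
        self._pending[upload_id] = (dest, {})
        return upload_id

    def upload_part(self, upload_id: str, part_number: int, data: bytes) -> None:
        """Store one part; the destination object is still not visible."""
        self._pending[upload_id][1][part_number] = data

    def complete(self, upload_id: str) -> None:
        """Assemble the parts in order and make the object visible."""
        dest, parts = self._pending.pop(upload_id)
        self.visible[dest] = b"".join(parts[n] for n in sorted(parts))

    def abort(self, upload_id: str) -> None:
        """Discard an incomplete upload."""
        self._pending.pop(upload_id, None)


class ToyCommitter:
    """Defers completion of every upload until the job commits (hypothetical)."""

    def __init__(self, store: ToyStore) -> None:
        self.store = store
        self.pending: List[str] = []  # upload ids awaiting job commit

    def write_task_output(self, dest: str, chunks: List[bytes]) -> None:
        upload_id = self.store.initiate(dest)
        for number, chunk in enumerate(chunks, start=1):
            self.store.upload_part(upload_id, number, chunk)
        self.pending.append(upload_id)

    def commit_job(self) -> None:
        for upload_id in self.pending:
            self.store.complete(upload_id)
        self.pending.clear()

    def abort_job(self) -> None:
        for upload_id in self.pending:
            self.store.abort(upload_id)
        self.pending.clear()
```

With this shape, a failed or aborted job leaves nothing at the destination paths, and no rename is needed to publish the output.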
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user venkey-ariv commented on the issue: https://github.com/apache/spark/pull/21066 Are there any plans to merge this?
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90333/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #90333 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90333/testReport)** for PR 21066 at commit [`3e1bce3`](https://github.com/apache/spark/commit/3e1bce3b9163de836681c69a2eff8e67108ac7b7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3010/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #90333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90333/testReport)** for PR 21066 at commit [`3e1bce3`](https://github.com/apache/spark/commit/3e1bce3b9163de836681c69a2eff8e67108ac7b7).
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21066 cc @rxin @JoshRosen @zsxwing
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89856/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89856/testReport)** for PR 21066 at commit [`659a7a4`](https://github.com/apache/spark/commit/659a7a4378cf5afe539ef113faebb7a3f583b1ab). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class BindingParquetOutputCommitter(` * `class PathOutputCommitProtocol(`
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89856/testReport)** for PR 21066 at commit [`659a7a4`](https://github.com/apache/spark/commit/659a7a4378cf5afe539ef113faebb7a3f583b1ab).
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2678/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/21066 Hi, @mridulm . Could you review this PR please?
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/21066 Retest this please.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89844/ Test FAILed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test FAILed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89844/testReport)** for PR 21066 at commit [`659a7a4`](https://github.com/apache/spark/commit/659a7a4378cf5afe539ef113faebb7a3f583b1ab). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class BindingParquetOutputCommitter(` * `class PathOutputCommitProtocol(`
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2671/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89844/testReport)** for PR 21066 at commit [`659a7a4`](https://github.com/apache/spark/commit/659a7a4378cf5afe539ef113faebb7a3f583b1ab).
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89346/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89346/testReport)** for PR 21066 at commit [`9d02ae7`](https://github.com/apache/spark/commit/9d02ae731e0fe314da312a614baa5664e40eaf80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2321/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89346/testReport)** for PR 21066 at commit [`9d02ae7`](https://github.com/apache/spark/commit/9d02ae731e0fe314da312a614baa5664e40eaf80).
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/21066 The RAT test failure was on a 0-byte .keep file in `src/test/scala`, which is there because the Maven plugin that adds a profile-specific test source path needs an original one to exist. The easiest fix is just to add a real Scala file in the source tree, with an ASF comment. I don't want to add explicit instantiation tests (e.g. `new S3AFileSystem()`), because of some classpath conflict between S3AFS on Hadoop 2.8 and Spark's own classpath: there's a risk of failing on some test setups. It's a legit failure, but...
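For anyone wondering why an empty placeholder trips the license audit while a small Scala file with an ASF comment does not, here is a deliberately simplified Python stand-in for that kind of check. It is not how Apache RAT is implemented. The marker string, the function names and the `Placeholder.scala` file name are assumptions made for this illustration only.

```python
import os
import tempfile
from itertools import islice
from pathlib import Path
from typing import List

# Phrase from the standard ASF source header; used here as the only marker.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def has_asf_header(path: Path, max_lines: int = 20) -> bool:
    """Return True if the marker appears in the first max_lines lines."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        head = "".join(islice(handle, max_lines))
    return ASF_MARKER in head


def unlicensed_files(root: Path) -> List[Path]:
    """Walk root and collect every file that lacks the header."""
    missing = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            candidate = Path(directory) / name
            if not has_asf_header(candidate):
                missing.append(candidate)
    return sorted(missing)


def demo() -> List[str]:
    """Build a throwaway tree with an empty .keep and a licensed Scala file."""
    with tempfile.TemporaryDirectory() as tmp:
        source_dir = Path(tmp) / "src" / "test" / "scala"
        source_dir.mkdir(parents=True)
        (source_dir / ".keep").write_text("")
        (source_dir / "Placeholder.scala").write_text(
            "/*\n"
            " * Licensed to the Apache Software Foundation (ASF) under one or more\n"
            " * contributor license agreements.\n"
            " */\n"
            "package placeholder\n"
        )
        return [path.name for path in unlicensed_files(Path(tmp))]
```

In this sketch `demo()` flags only the 0-byte `.keep` file: an empty file has nowhere to carry a header, while the Scala file keeps the directory present and passes the check.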
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][Wip] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2320/ Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][Wip] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][Wip] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89343 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89343/testReport)** for PR 21066 at commit [`3da1f3f`](https://github.com/apache/spark/commit/3da1f3faa6601d38deb259203f2f48b17293f51d). * This patch **fails RAT tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class BindingParquetOutputCommitter(` * `class PathOutputCommitProtocol(`
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][Wip] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89343/ Test FAILed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][Wip] Add commit protocol binding to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test FAILed.
[GitHub] spark issue #21066: [SPARK-23977][CLOUD][Wip] Add commit protocol binding to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21066 **[Test build #89343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89343/testReport)** for PR 21066 at commit [`3da1f3f`](https://github.com/apache/spark/commit/3da1f3faa6601d38deb259203f2f48b17293f51d).