[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/2638 [SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to prevent breaking binary-compatibility. The original problem is [SPARK-3764](https://issues.apache.org/jira/browse/SPARK-3764). `AppendingParquetOutputFormat` calls `context.getTaskAttemptID`, a method that is not binary-compatible across Hadoop versions. This makes Spark itself binary-incompatible, i.e. if Spark is built against hadoop-1, the artifact works only with hadoop-1, and vice versa. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ueshin/apache-spark issues/SPARK-3771 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2638.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2638 commit ec213c160393698fa01c62469263f050f3668453 Author: Takuya UESHIN Date: 2014-10-02T14:22:46Z Use reflection to prevent breaking binary-compatibility. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-57729963 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21231/
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-57741192 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/259/consoleFull) for PR 2638 at commit [`ec213c1`](https://github.com/apache/spark/commit/ec213c160393698fa01c62469263f050f3668453). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-57746994 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/259/consoleFull) for PR 2638 at commit [`ec213c1`](https://github.com/apache/spark/commit/ec213c160393698fa01c62469263f050f3668453). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` println(s"Failed to load main class $childMainClass.")` * ` case class GetPeers(blockManagerId: BlockManagerId) extends ToBlockManagerMaster`
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-57791903 A particular instance of Spark will be built for a particular version of Hadoop and/or YARN. It is not at this point a universal binary anyway, and so, I do not think it is necessary to add this indirection via reflection. That is, if you are deploying on Hadoop 1, you need to build Spark for Hadoop 1, and similarly for Hadoop 2.
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-57800537 @srowen, Thank you for your comment. Indeed, when completed apps are deployed to a Spark cluster, there is a particular instance of Spark. But Spark app developers will use the artifacts in Maven Central while developing and unit-testing. Those artifacts seem to be built for Hadoop 2, so if developers want to test with Hadoop 1, it won't work. What do you think?
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58565195 @ueshin I'm not sure I fully understand. What are the two method signatures in question, such that it compiles but then fails at runtime? Can you perhaps include these details in a comment? @srowen are you satisfied with that explanation?
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58572880 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58596020 @marmbrus, Thank you for your comment. `TaskAttemptContext` is a class in [hadoop-1](https://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/TaskAttemptContext.html) but an interface in [hadoop-2](http://hadoop.apache.org/docs/r2.5.1/api/org/apache/hadoop/mapreduce/TaskAttemptContext.html). The signature of the method `TaskAttemptContext.getTaskAttemptID` is the same in both versions, so calls to it are source-compatible but NOT binary-compatible: the opcode for a method call on a class is [`INVOKEVIRTUAL`](http://cs.au.dk/~mis/dOvs/jvmspec/ref--35.html), while for an interface it is [`INVOKEINTERFACE`](http://cs.au.dk/~mis/dOvs/jvmspec/ref--32.html).
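To illustrate the opcode point, here is a minimal Java sketch; it is not the code from this pull request. It assumes two hypothetical stand-in types, `ClassShapedContext` (mimicking the hadoop-1 shape, a concrete class) and `InterfaceShapedContext` (mimicking the hadoop-2 shape, an interface), and a hypothetical helper `taskAttemptIdOf`. A direct call `context.getTaskAttemptID()` would be compiled to an opcode chosen from the static type seen at compile time, whereas the reflective lookup below only compiles calls to `Class.getMethod` and `Method.invoke`, so the same bytecode works with either shape.

```java
import java.lang.reflect.Method;

public class ReflectiveCallSketch {
    // Hypothetical stand-in for the hadoop-1 shape: the context is a concrete class.
    public static class ClassShapedContext {
        public String getTaskAttemptID() { return "attempt_class"; }
    }

    // Hypothetical stand-in for the hadoop-2 shape: the context is an interface.
    public interface InterfaceShapedContext {
        String getTaskAttemptID();
    }

    public static class InterfaceShapedContextImpl implements InterfaceShapedContext {
        @Override
        public String getTaskAttemptID() { return "attempt_interface"; }
    }

    // Resolves the method by name at runtime. The compiled bytecode of this
    // helper never references the context type, so it does not bake in
    // INVOKEVIRTUAL or INVOKEINTERFACE for getTaskAttemptID itself.
    public static Object taskAttemptIdOf(Object context) {
        try {
            Method method = context.getClass().getMethod("getTaskAttemptID");
            return method.invoke(context);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "getTaskAttemptID is not callable on " + context.getClass(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(taskAttemptIdOf(new ClassShapedContext()));
        System.out.println(taskAttemptIdOf(new InterfaceShapedContextImpl()));
    }
}
```

The trade-off in this sketch is that the reflective call gives up compile-time checking of the method name and return type, and adds a small lookup cost per call, in exchange for one artifact that tolerates both shapes.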
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58603463 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21566/consoleFull) for PR 2638 at commit [`efd3784`](https://github.com/apache/spark/commit/efd3784a756bd1a1b239496ed5a1c1b662c04ffa). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58606003 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21566/consoleFull) for PR 2638 at commit [`efd3784`](https://github.com/apache/spark/commit/efd3784a756bd1a1b239496ed5a1c1b662c04ffa). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58606008 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21566/
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2638
[GitHub] spark pull request: [SPARK-3771][SQL] AppendingParquetOutputFormat...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2638#issuecomment-58952072 Thanks! Merged.