[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14634 LGTM. Merging to master and branch 2.0. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65629/ Test PASSed.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Merged build finished. Test PASSed.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634

**[Test build #65629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65629/consoleFull)** for PR 14634 at commit [`90fbe4e`](https://github.com/apache/spark/commit/90fbe4e7bc8e80d7601eb020d428055a1a44797a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAndAfter`
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634 **[Test build #65629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65629/consoleFull)** for PR 14634 at commit [`90fbe4e`](https://github.com/apache/spark/commit/90fbe4e7bc8e80d7601eb020d428055a1a44797a).
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14634 This change looks good. Let's add a regression test.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65620/ Test PASSed.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Merged build finished. Test PASSed.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634

**[Test build #65620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65620/consoleFull)** for PR 14634 at commit [`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634 **[Test build #65620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65620/consoleFull)** for PR 14634 at commit [`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14634 @JoshRosen this is a regression in 2.0 (it works in 1.6), so I think we should target it to 2.0.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14634 cc @yhuai, in `InsertIntoHiveTable` we already call `newHadoopConf`, so I think it's safer to get these Hive confs from `hadoopConf` instead of `sqlConf`, in order to respect the Hive confs in `hive-site.xml`. We can check other places one by one later.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14634 retest this please
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/14634 @yhuai, @cloud-fan What's the status of this issue? Should this still be targeted for 2.0.1 (as it currently is in JIRA)?
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634 Based on my understanding, after this PR, we will respect the conf values of `hive.exec.dynamic.partition`, `hive.exec.dynamic.partition.mode` and `hive.exec.compress.output` that are specified in `hive-site.xml`?
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14634 Sorry. What's the necessity to make this change?
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634 uh, I see. Thank you! No more question. : )
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14634

1. `hadoopConf` contains all the confs from SQL conf, see `SessionState.newHadoopConf`.
2. Users can change `hadoopConf` at runtime.
3. The default value is from Hive; I'm OK with not following it.
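The layering described in this reply can be illustrated with a minimal, self-contained Scala sketch. It is not Spark code: plain `Map`s stand in for `hive-site.xml`, the SQL conf, and the Hadoop conf, and the object name `ConfLayering`, the `hiveSite` contents, and the helper `get` are all hypothetical. Only the name `newHadoopConf` and the idea that the Hadoop conf carries both the `hive-site.xml` values and every SQL conf entry are taken from the thread.

```scala
// Simplified model of the configuration layering discussed in this thread.
// Plain Maps stand in for hive-site.xml, the SQL conf, and the Hadoop conf;
// this is an illustration, not Spark's implementation.
object ConfLayering {
  type Conf = Map[String, String]

  // Values that would come from hive-site.xml (hypothetical contents).
  val hiveSite: Conf = Map(
    "hive.exec.dynamic.partition" -> "true",
    "hive.exec.dynamic.partition.mode" -> "nonstrict")

  // Build a Hadoop-style conf: start from hive-site.xml, then overlay
  // every SQL conf entry so that session-level settings take precedence.
  def newHadoopConf(sqlConf: Conf): Conf = hiveSite ++ sqlConf

  // Look up a key, falling back to the given default when it is absent.
  def get(conf: Conf, key: String, default: String): String =
    conf.getOrElse(key, default)
}
```

In this model, a lookup against the SQL conf alone falls back to the default and never sees the `hive-site.xml` value, while a lookup against the merged conf sees it and still lets a session-level setting override it.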
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634

Sorry, let me rephrase the potential issue.
- The `insertInto` API forces users to set `hive.exec.dynamic.partition` to `true` and `hive.exec.dynamic.partition.mode` to `nonstrict`. This might not be convenient. The default value of `hive.exec.dynamic.partition.mode` is `strict`.
- If we always read the setting from `hadoopConf`, does that mean users are unable to control these settings for different queries? Is it possible for users to change the setting values in `hadoopConf` at runtime?
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634

If users want to use the `DataFrameWriter`'s `insertInto` API for a partitioned Hive table, they have to set `hive.exec.dynamic.partition` to `true` and `hive.exec.dynamic.partition.mode` to `nonstrict`. Otherwise, it does not work. Users are unable to specify the partition values in the `insertInto` APIs. See [the code](https://github.com/apache/spark/blob/8c8acdec9365136cba13060ce36c22b28e29b59b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L258). Let me use a test case to show the issue.

```Scala
withTempDir { tmpDir =>
  val basePath = tmpDir.getCanonicalPath
  val externalTab = "extTable_with_partitions"
  withTable(externalTab) {
    assert(tmpDir.listFiles.isEmpty)
    sql(
      s"""
         |CREATE EXTERNAL TABLE $externalTab (key INT, value STRING)
         |PARTITIONED BY (ds STRING, hr STRING)
         |stored as Parquet
         |LOCATION '$basePath'
       """.stripMargin)

    for (ds <- Seq("2008-04-08", "2008-04-09"); hr <- Seq("11", "12")) {
      sql(
        s"""
           |INSERT OVERWRITE TABLE $externalTab
           |partition (ds='$ds',hr='$hr')
           |SELECT 1, 'a'
         """.stripMargin)
    }

    withSQLConf("hive.exec.dynamic.partition" -> "true",
      "hive.exec.dynamic.partition.mode" -> "nonstrict") {
      Seq((1, "2", "2008-04-09", "12")).toDF("key", "value", "ds", "hr").write
        .insertInto(externalTab)
    }
  }
}
```
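The constraint described in this comment can be modelled in a few lines of plain Scala. This is a hedged sketch of the usual Hive semantics for these two settings, not the check in `InsertIntoHiveTable`: the object `DynamicPartitionCheck`, the method `validate`, and the error strings are hypothetical, and the fallback of `true` for `hive.exec.dynamic.partition` is an assumption (the `strict` fallback for the mode is the default stated in the thread). A partition column with no value (`None`) is treated as dynamic, which is the situation `insertInto` produces because it cannot take partition values.

```scala
// Simplified model of the dynamic-partition checks discussed in this thread.
// Hypothetical names throughout; this is not Spark's or Hive's implementation.
object DynamicPartitionCheck {
  // Each partition column maps to Some(value) when static, None when dynamic.
  def validate(
      partitionSpec: Seq[(String, Option[String])],
      conf: Map[String, String]): Either[String, Unit] = {
    val numDynamic = partitionSpec.count(_._2.isEmpty)
    val numStatic = partitionSpec.size - numDynamic
    // Assumed fallback of "true"; the thread does not state this default.
    val enabled = conf.getOrElse("hive.exec.dynamic.partition", "true").toBoolean
    // The thread states that the default mode is "strict".
    val mode = conf.getOrElse("hive.exec.dynamic.partition.mode", "strict")

    if (numDynamic == 0) Right(()) // fully static insert: nothing to check
    else if (!enabled) Left("dynamic partitioning is disabled")
    else if (numStatic == 0 && mode.equalsIgnoreCase("strict"))
      Left("strict mode requires at least one static partition column")
    else Right(())
  }
}
```

Under this model, an insert in which both `ds` and `hr` are dynamic is rejected with the default `strict` mode and accepted once the mode is `nonstrict`, which matches the behaviour the test case above works around with `withSQLConf`.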
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14634

> hive.exec.dynamic.partition also impacts our regular writing paths

I think it's a Hive-only conf? A normal data source relation should not read this conf.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634 `hive.exec.dynamic.partition` also impacts our regular writing paths (DataFrameWriter APIs). I remember DataFrameWriter APIs always assume this conf is true, right? If it is controlled by hadoopConf, users might hit strange errors when using our DataFrameWriter APIs.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14634 Great! It resolves my original concern. Thanks!
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63746/ Test PASSed.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Merged build finished. Test PASSed.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634

**[Test build #63746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63746/consoleFull)** for PR 14634 at commit [`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634 **[Test build #63746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63746/consoleFull)** for PR 14634 at commit [`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).
[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14634 cc @yhuai