[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-20 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14634
  
LGTM. Merging to master and branch 2.0.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14634
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65629/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14634
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14634
  
**[Test build #65629 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65629/consoleFull)**
 for PR 14634 at commit 
[`90fbe4e`](https://github.com/apache/spark/commit/90fbe4e7bc8e80d7601eb020d428055a1a44797a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with 
BeforeAndAfter `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14634
  
**[Test build #65629 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65629/consoleFull)**
 for PR 14634 at commit 
[`90fbe4e`](https://github.com/apache/spark/commit/90fbe4e7bc8e80d7601eb020d428055a1a44797a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14634
  
This change looks good. Let's add a regression test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14634
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65620/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14634
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14634
  
**[Test build #65620 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65620/consoleFull)**
 for PR 14634 at commit 
[`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14634
  
**[Test build #65620 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65620/consoleFull)**
 for PR 14634 at commit 
[`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14634
  
@JoshRosen this is a regression in 2.0(it works in 1.6), so I think we 
should target it to 2.0


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14634
  
cc @yhuai , In `InsertIntoHiveTable` we already called `newHadoopConf`, I 
think it's safer to get these hive confs from `hadoopConf` instead  of 
`sqlConf`, to respect hive confs in `hive-site.xml`. We can check other places 
one by one later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14634
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/14634
  
@yhuai, @cloud-fan What's the status of this issue? Should this still be 
targeted for 2.0.1 (which it is currently in JIRA)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14634
  
Based on my understanding, after this PR, we will respect the conf values 
of `hive.exec.dynamic.partition`, `hive.exec.dynamic.partition.mode` and 
`hive.exec.compress.output` that are specified in `hive-site.xml`? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14634
  
Sorry. What's the necessity to make this change?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14634
  
uh, I see. Thank you! No more question. : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14634
  
1. `hadoopConf` contains all the confs from SQL conf, see 
`SessionState.newHadoopConf`
2. users can change `hadoopConf` at runtime.
3. the default value is from hive, I'm ok to not follow it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14634
  
Sorry, let me rephrase the potential issue. 

- `insertInto` API forces users to set `hive.exec.dynamic.partition` to 
`true` and `hive.exec.dynamic.partition.mode` to `nonstrict`. This might not be 
convinient. The default value of `hive.exec.dynamic.partition.mode` is 
`strict`. 

- If we always read the setting from `hadoopConf`, does that mean users are 
unable to control these settings for different queries? Is that possible users 
can change the setting values in `hadoopConf` at runtime?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14634
  
If users want to use the `DataFrameWriter`'s `insertInto` API for 
partitioned Hive table, they have to set `hive.exec.dynamic.partition` to 
`true` and `hive.exec.dynamic.partition.mode` to `nonstrict`. Otherwise, it 
does not work. Users are unable to specify the partition values in the 
`insertInto` APIs. See [the 
code](https://github.com/apache/spark/blob/8c8acdec9365136cba13060ce36c22b28e29b59b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala#L258).
 

Let me use a test case to show the issue. 
```Scala
withTempDir { tmpDir =>
  val basePath = tmpDir.getCanonicalPath
  val externalTab = "extTable_with_partitions"
  withTable(externalTab) {
assert(tmpDir.listFiles.isEmpty)
sql(
  s"""
 |CREATE EXTERNAL TABLE $externalTab (key INT, value STRING)
 |PARTITIONED BY (ds STRING, hr STRING)
 |stored as Parquet
 |LOCATION '$basePath'
   """.stripMargin)

for (ds <- Seq("2008-04-08", "2008-04-09"); hr <- Seq("11", "12")) {
  sql(
s"""
   |INSERT OVERWRITE TABLE $externalTab
   |partition (ds='$ds',hr='$hr')
   |SELECT 1, 'a'
 """.stripMargin)
}
withSQLConf("hive.exec.dynamic.partition" -> "true",
"hive.exec.dynamic.partition.mode" -> "nonstrict") {
  Seq((1, "2", "2008-04-09", "12")).toDF("key", "value", "ds", 
"hr").write
.insertInto(externalTab)
}
  }
}
```



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14634
  
> hive.exec.dynamic.partition also impacts our regular writing paths

I think it's hive only conf? Normal data source relation should not read 
this conf.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14634
  
`hive.exec.dynamic.partition` also impacts our regular writing paths 
(DataFrameWriter APIs). I remember DataFrameWriter APIs always assume this conf 
is true, right?

If it is controlled by hadoopConf, users might hit strange errors when 
using our DataFrameWriter APIs. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14634
  
Great! It resolves my original concern. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14634
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63746/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14634
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14634
  
**[Test build #63746 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63746/consoleFull)**
 for PR 14634 at commit 
[`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14634
  
**[Test build #63746 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63746/consoleFull)**
 for PR 14634 at commit 
[`64268f3`](https://github.com/apache/spark/commit/64268f34191d9f5447a63f34e53c9663aac714e2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-08-14 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14634
  
cc @yhuai


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org