[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17001



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105565416
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
--- End diff --

`default` -> `The default`
`use warehouse path` -> `uses the warehouse path`



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105565378
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(new Path(table1.location) == fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // the table in default database created in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
--- End diff --

->
```
For the table created by another job in the default database, the location 
of this table is not changed, even if the current job has different warehouse 
path.
```



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105565256
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(new Path(table1.location) == fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // the table in default database created in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table.location) != fs.makeQualified(new Path(warehousePath, "t")))
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE TABLE t3(d string)")
+      val table3 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
+      // the table in default database created here in this job, it will use the warehouse path
+      // of this job as its location
--- End diff --

->
```
When a job creates a table in the default database, the table location is 
under the warehouse path
that is configured for the local job.
```



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105565132
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(new Path(table1.location) == fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // the table in default database created in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table.location) != fs.makeQualified(new Path(warehousePath, "t")))
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE TABLE t3(d string)")
+      val table3 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
+      // the table in default database created here in this job, it will use the warehouse path
+      // of this job as its location
+      assert(new Path(table3.location) == fs.makeQualified(new Path(warehousePath, "t3")))
+
+      spark.sql("USE not_default")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // the table in not default database create in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table1.location) != fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+      assert(!new File(warehousePath.toString, "not_default.db/t1").exists())
--- End diff --

This scenario (line 993-1000) is not needed to test, IMO. Most of the test 
cases already cover it.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105565041
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(new Path(table1.location) == fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // the table in default database created in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table.location) != fs.makeQualified(new Path(warehousePath, "t")))
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE TABLE t3(d string)")
+      val table3 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
+      // the table in default database created here in this job, it will use the warehouse path
+      // of this job as its location
+      assert(new Path(table3.location) == fs.makeQualified(new Path(warehousePath, "t3")))
+
+      spark.sql("USE not_default")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // the table in not default database create in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table1.location) != fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+      assert(!new File(warehousePath.toString, "not_default.db/t1").exists())
+
+      spark.sql("CREATE TABLE t2(c string)")
+      val table2 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t2"))
+      // the table in not default database created here in this job, it will use the location
+      // of the database as its location, not the warehouse path in this job
--- End diff --

->
```
the table created in the non-default database (created in another job) is 
under the database location.
```



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105564983
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(new Path(table1.location) == fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // the table in default database created in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table.location) != fs.makeQualified(new Path(warehousePath, "t")))
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE TABLE t3(d string)")
+      val table3 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
+      // the table in default database created here in this job, it will use the warehouse path
+      // of this job as its location
+      assert(new Path(table3.location) == fs.makeQualified(new Path(warehousePath, "t3")))
+
+      spark.sql("USE not_default")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // the table in not default database create in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table1.location) != fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+      assert(!new File(warehousePath.toString, "not_default.db/t1").exists())
+
+      spark.sql("CREATE TABLE t2(c string)")
+      val table2 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t2"))
+      // the table in not default database created here in this job, it will use the location
+      // of the database as its location, not the warehouse path in this job
+      assert(new Path(table2.location) != fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t2")))
+
+      spark.sql("CREATE DATABASE not_default_1")
--- End diff --

-> `non_default_db1`



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r105564912
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -907,3 +936,97 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(new Path(table1.location) == fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = new Path(spark.sharedState.warehousePath)
+      val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(new Path(defaultDB.locationUri) == fs.makeQualified(warehousePath))
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // the table in default database created in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table.location) != fs.makeQualified(new Path(warehousePath, "t")))
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE TABLE t3(d string)")
+      val table3 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t3"))
+      // the table in default database created here in this job, it will use the warehouse path
+      // of this job as its location
+      assert(new Path(table3.location) == fs.makeQualified(new Path(warehousePath, "t3")))
+
+      spark.sql("USE not_default")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // the table in not default database create in job(SPARK_19667_CREATE_TABLE) above,
+      // which has different warehouse path from this job, its location still equals to
+      // the location when it's created.
+      assert(new Path(table1.location) != fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t1")))
+      assert(!new File(warehousePath.toString, "not_default.db/t1").exists())
+
+      spark.sql("CREATE TABLE t2(c string)")
+      val table2 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t2"))
+      // the table in not default database created here in this job, it will use the location
+      // of the database as its location, not the warehouse path in this job
+      assert(new Path(table2.location) != fs.makeQualified(
+        new Path(warehousePath, "not_default.db/t2")))
+
+      spark.sql("CREATE DATABASE not_default_1")
+      spark.sql("USE not_default_1")
+      spark.sql("CREATE TABLE t4(e string)")
+      val table4 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t4"))
+      // the table created in the database which created in this job, it will use the location
+      // of the database.
--- End diff --

->
```
the table created in the non-default database is under the database 
location.
```



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r104111279
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -74,7 +88,17 @@ abstract class ExternalCatalog {
    */
   def alterDatabase(dbDefinition: CatalogDatabase): Unit
 
-  def getDatabase(db: String): CatalogDatabase
+  final def getDatabase(db: String): CatalogDatabase = {
+    val database = getDatabaseInternal(db)
--- End diff --

put this in the else branch
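A minimal sketch of the requested restructuring, assuming the `getDatabaseInternal` hook and a `defaultDB` value built from the qualified warehouse path (both discussed later in this thread): with the call in the `else` branch, resolving the default database never costs a metastore round trip.

```
final def getDatabase(db: String): CatalogDatabase = {
  if (db == SessionCatalog.DEFAULT_DATABASE) {
    // derived once from the qualified warehouse path; no metastore call
    defaultDB
  } else {
    getDatabaseInternal(db)
  }
}
```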



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-02 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r104101806
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -905,3 +934,91 @@ object SPARK_18989_DESC_TABLE {
     }
   }
 }
+
+object SPARK_19667_CREATE_TABLE {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = s"file:${spark.sharedState.warehousePath.stripSuffix("/")}"
+      val defaultDB = spark.sessionState.catalog.getDatabaseMetadata("default")
+      // default database use warehouse path as its location
+      assert(defaultDB.locationUri.stripSuffix("/") == warehousePath)
+      spark.sql("CREATE TABLE t(a string)")
+
+      val table = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+      // table in default database use the location of default database which is also warehouse path
+      assert(table.location.stripSuffix("/") == s"$warehousePath/t")
+      spark.sql("INSERT INTO TABLE t SELECT 1")
+      assert(spark.sql("SELECT * FROM t").count == 1)
+
+      spark.sql("CREATE DATABASE not_default")
+      spark.sql("USE not_default")
+      spark.sql("CREATE TABLE t1(b string)")
+      val table1 = spark.sessionState.catalog.getTableMetadata(TableIdentifier("t1"))
+      // table in not default database use the location of its own database
+      assert(table1.location.stripSuffix("/") == s"$warehousePath/not_default.db/t1")
+    } finally {
+      spark.sql("USE default")
+    }
+  }
+}
+
+object SPARK_19667_VERIFY_TABLE_PATH {
+  def main(args: Array[String]): Unit = {
+    val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
+    try {
+      val warehousePath = s"file:${spark.sharedState.warehousePath.stripSuffix("/")}"
--- End diff --

I am making this modification now.
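For reference, the modification swaps the string-based comparison in the revision quoted above for the filesystem-qualified `Path` comparison used in the later revision of this test (both drawn from diffs in this thread); a minimal before/after sketch:

```
// Before: warehousePath is a plain String, so the comparison is sensitive
// to trailing slashes and to how the "file:" scheme is spelled out.
val warehousePathStr = s"file:${spark.sharedState.warehousePath.stripSuffix("/")}"
assert(table.location.stripSuffix("/") == s"$warehousePathStr/t")

// After: qualify both sides through the Hadoop FileSystem so the comparison
// is canonical (scheme and authority filled in consistently).
val warehousePath = new Path(spark.sharedState.warehousePath)
val fs = warehousePath.getFileSystem(spark.sessionState.newHadoopConf())
assert(new Path(table.location) == fs.makeQualified(new Path(warehousePath, "t")))
```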



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-02 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103891244
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

I have modified the code by adding

```
lazy val defaultDB = {
  val qualifiedWarehousePath = SessionCatalog
    .makeQualifiedPath(warehousePath, hadoopConf).toString
  CatalogDatabase("default", "", qualifiedWarehousePath, Map.empty)
}
```

in `ExternalCatalog`.

If it is not OK, I will revert it, thanks~



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-02 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103889802
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

@cloud-fan I found that if we add a `defaultDB` parameter to `ExternalCatalog` and its subclasses `InMemoryCatalog` and `HiveExternalCatalog`, the change forces a lot of related code to be modified, such as test cases and other logic that creates `InMemoryCatalog` and `HiveExternalCatalog`. For example, currently every parameter of `InMemoryCatalog` has a default value (`class InMemoryCatalog(conf: SparkConf = new SparkConf, hadoopConfig: Configuration = new Configuration)`), so we can create it without any arguments. If we add a `defaultDB`, we would have to supply a default value for it too, but we cannot construct a legal default because the warehouse path is not available at that point:

```
class InMemoryCatalog(
    conf: SparkConf = new SparkConf,
    hadoopConfig: Configuration = new Configuration,
    defaultDB: CatalogDatabase =
      CatalogDatabase("default", "", "${can not get the warehouse path}", Map.empty))
```

If we don't provide a default value for `defaultDB`, the change touches even more code, which I don't think is proper.

What about keeping the `protected def warehousePath` in `ExternalCatalog` and adding

```
lazy val defaultDB = {
  val qualifiedWarehousePath = SessionCatalog
    .makeQualifiedPath(warehousePath, hadoopConf).toString
  CatalogDatabase("default", "", qualifiedWarehousePath, Map.empty)
}
```

This also avoids calling `getDatabase`.




[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103865522
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

ok~ let me fix it~



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103865312
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

but it will only be used in `getDatabase`, and we can save a metastore call to get the default database.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103863975
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

If we pass a `defaultDB`, it seems like we introduce an instance of the default database, as we discussed above.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103863070
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

we still have conf/hadoopConf in `InMemoryCatalog` and 
`HiveExternalCatalog`, we can just add one more parameter.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103862862
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

I think conf/hadoopConf is more useful: later logic can use them, and the subclasses also have these two confs.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103861521
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -30,7 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
  *
  * Implementations should throw [[NoSuchDatabaseException]] when databases don't exist.
  */
-abstract class ExternalCatalog {
+abstract class ExternalCatalog(conf: SparkConf, hadoopConf: Configuration) {
--- End diff --

how about we just pass in a `defaultDB: CatalogDatabase`? then we don't 
need to add the `protected def warehousePath: String`



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-03-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103861355
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -74,7 +77,19 @@ abstract class ExternalCatalog {
    */
   def alterDatabase(dbDefinition: CatalogDatabase): Unit
 
-  def getDatabase(db: String): CatalogDatabase
+  def getDatabase(db: String): CatalogDatabase = {
+    val database = getDatabaseInternal(db)
+    // The default database's location always uses the warehouse path.
+    // Since the location of database stored in metastore is qualified,
+    // we also make the warehouse location qualified.
+    if (db == SessionCatalog.DEFAULT_DATABASE) {
--- End diff --

makes sense



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-28 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103605667
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -74,7 +77,19 @@ abstract class ExternalCatalog {
    */
   def alterDatabase(dbDefinition: CatalogDatabase): Unit
 
-  def getDatabase(db: String): CatalogDatabase
+  def getDatabase(db: String): CatalogDatabase = {
+    val database = getDatabaseInternal(db)
+    // The default database's location always uses the warehouse path.
+    // Since the location of database stored in metastore is qualified,
+    // we also make the warehouse location qualified.
+    if (db == SessionCatalog.DEFAULT_DATABASE) {
--- End diff --

I think if we create a default instance of `CatalogDatabase`, we will have to add more logic to other interfaces, such as `databaseExists`: we would need to make it always return true for the default database, even if there is no default database in Hive. A sketch of that special-casing follows.
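A minimal sketch of that extra special-casing, with `databaseExistsInternal` as a hypothetical subclass hook mirroring the `getDatabaseInternal` pattern discussed in this thread:

```
// If "default" were purely virtual, existence checks would need a special
// case so the default database always appears to exist, even when the
// underlying Hive metastore has no such database.
final def databaseExists(db: String): Boolean = {
  if (db == SessionCatalog.DEFAULT_DATABASE) true else databaseExistsInternal(db)
}
```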



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103420821
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -74,7 +77,19 @@ abstract class ExternalCatalog {
    */
   def alterDatabase(dbDefinition: CatalogDatabase): Unit
 
-  def getDatabase(db: String): CatalogDatabase
+  def getDatabase(db: String): CatalogDatabase = {
+    val database = getDatabaseInternal(db)
+    // The default database's location always uses the warehouse path.
+    // Since the location of database stored in metastore is qualified,
+    // we also make the warehouse location qualified.
+    if (db == SessionCatalog.DEFAULT_DATABASE) {
--- End diff --

Can we make the default database a totally virtual concept? I.e., we never create it or ask the metastore to retrieve it; we just keep an instance of `CatalogDatabase` for the default database and return it in `ExternalCatalog.getDatabase`.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103419380
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala ---
@@ -74,7 +77,19 @@ abstract class ExternalCatalog {
    */
   def alterDatabase(dbDefinition: CatalogDatabase): Unit
 
-  def getDatabase(db: String): CatalogDatabase
+  def getDatabase(db: String): CatalogDatabase = {
--- End diff --

final def



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-24 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103067748
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
--- End diff --

I'd prefer the first one. `defaultTablePath` in `SessionCatalog` is already used in `CreateDataSourceAsSelectCommand`, so I think it is useful to keep it in `SessionCatalog`; we may want to use it somewhere else later.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103046202
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
--- End diff --

or let's just move this `defaultTablePath` function to `ExternalCatalog` and use it in `InMemoryCatalog` and `HiveExternalCatalog`; a sketch is below.
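A rough sketch of this suggestion, with placement and visibility assumed: the helper would live once in `ExternalCatalog` and derive a managed table's default path from its database's location, much like the Hive implementation quoted in this thread.

```
// Hypothetical shared helper in ExternalCatalog, used by both
// InMemoryCatalog and HiveExternalCatalog.
protected def defaultTablePath(tableIdent: TableIdentifier): String = {
  val dbLocation = getDatabase(tableIdent.database.get).locationUri
  new Path(new Path(dbLocation), tableIdent.table).toString
}
```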



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r103045973
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,15 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
--- End diff --

how about we put this logic in `ExternalCatalog`? e.g. `ExternalCatalog.getDatabase` will check the default database and then call `getDatabaseInternal`, which needs to be implemented by subclasses.



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102873675
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

Please forgive me~
It seems I now understand: your description was to confirm which issue this change is going to resolve. Yes, it is to resolve the issue you described~



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102871210
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

o(╯□╰)o



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102870703
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

The issue was already explained above. : )



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102870643
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
--- End diff --

ok, it is better, thanks~



[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102870494
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.warehousePath, hadoopConf).toString
+    } else database.locationUri
--- End diff --

Yes. I knew it. We should create another function and rename the existing 
one. 





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102870072
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

@gatorsmile could you describe clearly what the issue is?





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102866593
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

Yes, it is duplicated. I have tried to merge them by unifying the `defaultTablePath` logic for managed tables in [HiveExternalCatalog](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L205-L216) and [InMemoryCatalog](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala#L194-L202) up into [SessionCatalog](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L253); then we can drop `defaultTablePath` from `HiveExternalCatalog`.

But this causes some test failures in `InMemoryCatalogSuite`, so those tests would have to be modified.

So I think we can keep the existing duplicated `defaultTablePath` for now, and merge the two (fixing the affected tests) in another PR. A rough sketch of what the merged helper could look like is shown below.
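
For reference, a minimal sketch of what such a merged helper in `SessionCatalog` might look like (the parameter names, the `getDbLocation` callback, and the hard-coded `"default"` check are assumptions for illustration, not code from this PR):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

object DefaultTablePathSketch {
  def defaultTablePath(
      db: Option[String],
      table: String,
      warehousePath: String,
      getDbLocation: String => String,
      hadoopConf: Configuration): String = {
    val dbName = db.getOrElse("default")
    val dbLocation = if (dbName == "default") {
      // Always derive the default database's location from the session's
      // warehouse path, qualified the same way metastore locations are.
      val p = new Path(warehousePath)
      p.getFileSystem(hadoopConf).makeQualified(p).toString
    } else {
      getDbLocation(dbName)
    }
    new Path(dbLocation, table).toString
  }
}
```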






[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102865566
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.warehousePath, hadoopConf).toString
+    } else database.locationUri
--- End diff --

`getDefaultDBPath` is used when we create a database without providing a location.
It is different from the logic here, which gets the default table path inside an
existing database.
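
Roughly, the two defaults differ like this (a sketch with assumed signatures; the `<db>.db` suffix is the usual convention for database directories under the warehouse):

```scala
import org.apache.hadoop.fs.Path

// getDefaultDBPath: where CREATE DATABASE puts a *new* database
// when no LOCATION clause is given.
def getDefaultDBPath(db: String, warehousePath: String): String =
  new Path(warehousePath, db + ".db").toString

// defaultTablePath: where a new *managed table* lands inside an
// existing database, whatever that database's location is.
def defaultTablePath(table: String, dbLocation: String): String =
  new Path(dbLocation, table).toString
```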





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102784103
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

Or call `SessionCatalog.getDatabaseMetadata`?





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102783506
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

Actually, this change fixes an existing issue, if my understanding is correct.

If the default database has already been created in the metastore (which could
be shared), any subsequent change of `spark.sql.warehouse.dir` can trigger an
issue when we create a managed table in the default database (assuming Hive
support is enabled). [The table path will be based on this function](https://github.com/windpiger/spark/blob/8f8063f8c4e11cb4f6158c90d21d8a0985d38584/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L209-L216).
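
To make the scenario concrete, a hedged illustration (the paths and the second session are invented for this example):

```scala
import org.apache.spark.sql.SparkSession

// Session B shares the metastore in which another session, configured with
// spark.sql.warehouse.dir=/warehouse/a, already created the `default`
// database, but session B points at a different warehouse directory:
val spark = SparkSession.builder()
  .config("spark.sql.warehouse.dir", "/warehouse/b")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("CREATE TABLE t(i INT)") // managed table in `default`
// Without this change, defaultTablePath reads the stale location stored in
// the shared metastore and places t under file:/warehouse/a/t; with it,
// the path is derived from this session's warehouse: file:/warehouse/b/t.
```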
 





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102781147
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

Why do we need to do this here? When we want to get the database metadata, we
should call `SessionCatalog.getDatabase`.





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102778675
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -408,7 +408,13 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
   }
 
   private def defaultTablePath(tableIdent: TableIdentifier): String = {
-    val dbLocation = getDatabase(tableIdent.database.get).locationUri
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (tableIdent.database.orNull == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.get(WAREHOUSE_PATH), hadoopConf).toString
+    } else getDatabase(tableIdent.database.get).locationUri
--- End diff --

This is duplicated code. We should call the same function.





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102778417
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.warehousePath, hadoopConf).toString
+    } else database.locationUri
--- End diff --

Do we also need to qualify the default location for the other databases in
`getDefaultDBPath`?





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102778093
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.warehousePath, hadoopConf).toString
+    } else database.locationUri
--- End diff --

Actually, maybe we can write a separate function for getting the location of the
default database. Just put it below the function [`getDefaultDBPath`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala#L227-L234).
We might also need to rename `getDefaultDBPath`; the name looks confusing now.
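
Such a helper might look like the following sketch (the name `defaultDBLocation` is hypothetical):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Qualify a path with its FileSystem's scheme and authority, mirroring how
// database locations come back from the metastore.
def makeQualifiedPath(path: String, hadoopConf: Configuration): Path = {
  val p = new Path(path)
  p.getFileSystem(hadoopConf).makeQualified(p)
}

// The default database's location is always the (qualified) warehouse path,
// never whatever the metastore recorded when `default` was first created.
def defaultDBLocation(warehousePath: String, hadoopConf: Configuration): String =
  makeQualifiedPath(warehousePath, hadoopConf).toString
```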





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102777281
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
--- End diff --

How about 

> // The default database's location always uses the warehouse path.
> // Since the location of database stored in metastore is qualified,
> // we also make the warehouse location qualified.





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102776886
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -193,7 +193,16 @@ class SessionCatalog(
   def getDatabaseMetadata(db: String): CatalogDatabase = {
     val dbName = formatDatabaseName(db)
     requireDbExists(dbName)
-    externalCatalog.getDatabase(dbName)
+    val database = externalCatalog.getDatabase(dbName)
+
+    // default database's location always use the warehouse path,
+    // and since the location of database stored in metastore is qualified,
+    // here we also make qualify for warehouse location
+    val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+      SessionCatalog.makeQualifiedPath(conf.warehousePath, hadoopConf).toString
+    } else database.locationUri
--- End diff --

Please fix the style. 





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102650536
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -339,10 +340,17 @@ private[hive] class HiveClientImpl(
 
   override def getDatabase(dbName: String): CatalogDatabase = withHiveState {
     Option(client.getDatabase(dbName)).map { d =>
+      // default database's location always use the warehouse path,
+      // and since the location of database stored in metastore is qualified,
+      // here we also make qualify for warehouse location
+      val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+        SessionCatalog.makeQualifiedPath(sparkConf.get(WAREHOUSE_PATH), hadoopConf).toString
--- End diff --

Agreed! 👍 





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102648988
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -339,10 +340,17 @@ private[hive] class HiveClientImpl(
 
   override def getDatabase(dbName: String): CatalogDatabase = withHiveState {
     Option(client.getDatabase(dbName)).map { d =>
+      // default database's location always use the warehouse path,
+      // and since the location of database stored in metastore is qualified,
+      // here we also make qualify for warehouse location
+      val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+        SessionCatalog.makeQualifiedPath(sparkConf.get(WAREHOUSE_PATH), hadoopConf).toString
--- End diff --

What I want is consistency. Now that we have decided to define the location of
the default database as the warehouse path, we should stick with it. The main
goal of this PR is not to fix the bug that shows up when sharing a metastore,
but to change the definition of the default database's location.





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102642601
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -339,10 +340,17 @@ private[hive] class HiveClientImpl(
 
   override def getDatabase(dbName: String): CatalogDatabase = withHiveState {
     Option(client.getDatabase(dbName)).map { d =>
+      // default database's location always use the warehouse path,
+      // and since the location of database stored in metastore is qualified,
+      // here we also make qualify for warehouse location
+      val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
+        SessionCatalog.makeQualifiedPath(sparkConf.get(WAREHOUSE_PATH), hadoopConf).toString
--- End diff --

This won't work for `InMemoryCatalog`, will it?

You should either implement this logic in all `ExternalCatalog`s, or put it in
`SessionCatalog`.





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102625694
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -339,10 +340,15 @@ private[hive] class HiveClientImpl(
 
   override def getDatabase(dbName: String): CatalogDatabase = withHiveState {
     Option(client.getDatabase(dbName)).map { d =>
+      // default database's location always use the warehouse path
+      val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
--- End diff --

Sorry, I don't get the point. If this logic is exactly what we expect, wouldn't
it be better to replace it at the beginning?





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102622702
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala ---
@@ -92,7 +92,8 @@ abstract class BucketedWriteSuite extends QueryTest with SQLTestUtils {
 
   def tableDir: File = {
     val identifier = spark.sessionState.sqlParser.parseTableIdentifier("bucketed_table")
-    new File(URI.create(spark.sessionState.catalog.defaultTablePath(identifier)))
+    new File(URI.create(s"file:${spark.sessionState.catalog.defaultTablePath(identifier)
--- End diff --

ok, let me try~





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102619806
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala ---
@@ -92,7 +92,8 @@ abstract class BucketedWriteSuite extends QueryTest with SQLTestUtils {
 
   def tableDir: File = {
     val identifier = spark.sessionState.sqlParser.parseTableIdentifier("bucketed_table")
-    new File(URI.create(spark.sessionState.catalog.defaultTablePath(identifier)))
+    new File(URI.create(s"file:${spark.sessionState.catalog.defaultTablePath(identifier)
--- End diff --

Shall we fix `SessionCatalog.defaultTablePath` to make it return a URI, maybe?
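
A sketch of that idea (assumed signature, not necessarily what the PR ends up with):

```scala
import java.io.File
import java.net.URI
import org.apache.hadoop.fs.Path

// If defaultTablePath returned a URI built from an already-qualified database
// location, callers would not need to prepend or strip "file:" by hand.
def defaultTablePath(dbLocationUri: String, table: String): URI =
  new Path(new Path(dbLocationUri), table).toUri

// e.g. in the test:
//   new File(defaultTablePath("file:/tmp/warehouse", "bucketed_table"))
```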





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102619596
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -339,10 +340,15 @@ private[hive] class HiveClientImpl(
 
   override def getDatabase(dbName: String): CatalogDatabase = withHiveState {
     Option(client.getDatabase(dbName)).map { d =>
+      // default database's location always use the warehouse path
+      val dbLocation = if (dbName == SessionCatalog.DEFAULT_DATABASE) {
--- End diff --

is it more logical to put this logic in `SessionCatalog`?





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102619396
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala ---
@@ -92,7 +92,8 @@ abstract class BucketedWriteSuite extends QueryTest with SQLTestUtils {
 
   def tableDir: File = {
     val identifier = spark.sessionState.sqlParser.parseTableIdentifier("bucketed_table")
-    new File(URI.create(spark.sessionState.catalog.defaultTablePath(identifier)))
+    new File(URI.create(s"file:${spark.sessionState.catalog.defaultTablePath(identifier)
+      .stripPrefix("file:")}"))
--- End diff --

?





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-21 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102392451
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -311,11 +313,12 @@ private[hive] class HiveClientImpl(
   override def createDatabase(
       database: CatalogDatabase,
       ignoreIfExists: Boolean): Unit = withHiveState {
+    // default database's location always use the warehouse path, here set to empty string
     client.createDatabase(
       new HiveDatabase(
         database.name,
         database.description,
-        database.locationUri,
+        if (database.name == SessionCatalog.DEFAULT_DATABASE) "" else database.locationUri,
--- End diff --

Sorry, actually it will throw an exception; my local `default` database had
already been created, so it did not hit the exception. I will just replace the
default database's location when it is reloaded from the metastore, and drop
the logic that sets the location to an empty string when creating the database.





[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17001#discussion_r102387112
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -311,11 +313,12 @@ private[hive] class HiveClientImpl(
   override def createDatabase(
       database: CatalogDatabase,
       ignoreIfExists: Boolean): Unit = withHiveState {
+    // default database's location always use the warehouse path, here set to empty string
     client.createDatabase(
       new HiveDatabase(
         database.name,
         database.description,
-        database.locationUri,
+        if (database.name == SessionCatalog.DEFAULT_DATABASE) "" else database.locationUri,
--- End diff --

If it is empty, the metastore will set it for us, right?

