[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14757


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-24 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r76102697
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala
 ---
@@ -21,26 +21,26 @@ import org.apache.hadoop.conf.Configuration
 
 import org.apache.spark.SparkConf
 import org.apache.spark.sql.catalyst.catalog._
-import org.apache.spark.sql.hive.client.HiveClient
 
 /**
  * Test suite for the [[HiveExternalCatalog]].
  */
 class HiveExternalCatalogSuite extends ExternalCatalogSuite {
 
-  private val client: HiveClient = {
-// We create a metastore at a temp location to avoid any potential
-// conflict of having multiple connections to a single derby instance.
-HiveUtils.newClientForExecution(new SparkConf, new Configuration)
+  private val externalCatalog: HiveExternalCatalog = {
+new HiveExternalCatalog(new SparkConf, new Configuration)
   }
 
   protected override val utils: CatalogTestUtils = new CatalogTestUtils {
 override val tableInputFormat: String = 
"org.apache.hadoop.mapred.SequenceFileInputFormat"
 override val tableOutputFormat: String = 
"org.apache.hadoop.mapred.SequenceFileOutputFormat"
-override def newEmptyCatalog(): ExternalCatalog =
-  new HiveExternalCatalog(client, new Configuration())
+// We create a metastore at a temp location to avoid any potential
--- End diff --

It is not valid now. Will remove it. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r76016944
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala
 ---
@@ -21,26 +21,26 @@ import org.apache.hadoop.conf.Configuration
 
 import org.apache.spark.SparkConf
 import org.apache.spark.sql.catalyst.catalog._
-import org.apache.spark.sql.hive.client.HiveClient
 
 /**
  * Test suite for the [[HiveExternalCatalog]].
  */
 class HiveExternalCatalogSuite extends ExternalCatalogSuite {
 
-  private val client: HiveClient = {
-// We create a metastore at a temp location to avoid any potential
-// conflict of having multiple connections to a single derby instance.
-HiveUtils.newClientForExecution(new SparkConf, new Configuration)
+  private val externalCatalog: HiveExternalCatalog = {
+new HiveExternalCatalog(new SparkConf, new Configuration)
--- End diff --

how about
```
val catalog = 
catalog.client.reset()
catalog
```

then we don't need 
https://github.com/apache/spark/pull/14757/files#diff-8c4108666a6639034f0ddbfa075bcb37R43,
 which is too far away from here and hard to connect them together.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r76016337
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala
 ---
@@ -21,26 +21,26 @@ import org.apache.hadoop.conf.Configuration
 
 import org.apache.spark.SparkConf
 import org.apache.spark.sql.catalyst.catalog._
-import org.apache.spark.sql.hive.client.HiveClient
 
 /**
  * Test suite for the [[HiveExternalCatalog]].
  */
 class HiveExternalCatalogSuite extends ExternalCatalogSuite {
 
-  private val client: HiveClient = {
-// We create a metastore at a temp location to avoid any potential
-// conflict of having multiple connections to a single derby instance.
-HiveUtils.newClientForExecution(new SparkConf, new Configuration)
+  private val externalCatalog: HiveExternalCatalog = {
+new HiveExternalCatalog(new SparkConf, new Configuration)
   }
 
   protected override val utils: CatalogTestUtils = new CatalogTestUtils {
 override val tableInputFormat: String = 
"org.apache.hadoop.mapred.SequenceFileInputFormat"
 override val tableOutputFormat: String = 
"org.apache.hadoop.mapred.SequenceFileOutputFormat"
-override def newEmptyCatalog(): ExternalCatalog =
-  new HiveExternalCatalog(client, new Configuration())
+// We create a metastore at a temp location to avoid any potential
--- End diff --

is it still valid?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75986531
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala
 ---
@@ -21,26 +21,26 @@ import org.apache.hadoop.conf.Configuration
 
 import org.apache.spark.SparkConf
 import org.apache.spark.sql.catalyst.catalog._
-import org.apache.spark.sql.hive.client.HiveClient
 
 /**
  * Test suite for the [[HiveExternalCatalog]].
  */
 class HiveExternalCatalogSuite extends ExternalCatalogSuite {
--- End diff --

Before the PR, what `HiveExternalCatalogSuite` uses is a 
[HiveUtils.newClientForExecution](https://github.com/apache/spark/blob/7bb64aae27f670531699f59d3f410e38866609b7/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala#L34)
 . The `newClientForExecution`'s configuration is `newTemporaryConfiguration`, 
[which makes a new path for 
metastore](https://github.com/apache/spark/blob/2ae7b88a07140e012b6c60db3c4a2a8ca360c684/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala#L366-L379).
 Thus, we can say it is pointing to a different metastore.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75974295
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala
 ---
@@ -21,26 +21,26 @@ import org.apache.hadoop.conf.Configuration
 
 import org.apache.spark.SparkConf
 import org.apache.spark.sql.catalyst.catalog._
-import org.apache.spark.sql.hive.client.HiveClient
 
 /**
  * Test suite for the [[HiveExternalCatalog]].
  */
 class HiveExternalCatalogSuite extends ExternalCatalogSuite {
--- End diff --

Why the changes in this file make hive client connect to a metastore that 
already has data?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75963484
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -40,6 +40,15 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 
   protected def resetState(): Unit = { }
 
+  // Clear the state before all tests
--- End diff --

Yeah, it connects to the same metastore. We can change it by setting 
`HiveConf.ConfVars.METASTORECONNECTURLKEY.varname` when creating a new external 
catalog.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75858056
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -40,6 +40,15 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 
   protected def resetState(): Unit = { }
 
+  // Clear the state before all tests
--- End diff --

Does that mean the newly created hive client already has data in it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75857364
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDataFrameSuite.scala ---
@@ -31,7 +31,7 @@ class HiveDataFrameSuite extends QueryTest with 
TestHiveSingleton {
   }
 
   test("SPARK-15887: hive-site.xml should be loaded") {
-val hiveClient = 
spark.sharedState.asInstanceOf[HiveSharedState].metadataHive
--- End diff --

oh sorry I misread the code, `metadataHive` only exist in `HiveContext` now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75813864
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala
 ---
@@ -51,7 +51,8 @@ class MetastoreDataSourcesSuite extends QueryTest with 
SQLTestUtils with TestHiv
 
   // To test `HiveExternalCatalog`, we need to read the raw table 
metadata(schema, partition
   // columns and bucket specification are still in table properties) from 
hive client.
-  private def hiveClient: HiveClient = 
sharedState.asInstanceOf[HiveSharedState].metadataHive
--- End diff --

See the above answer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75813871
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/ShowCreateTableSuite.scala ---
@@ -266,7 +266,7 @@ class ShowCreateTableSuite extends QueryTest with 
SQLTestUtils with TestHiveSing
   }
 
   private def createRawHiveTable(ddl: String): Unit = {
-
hiveContext.sharedState.asInstanceOf[HiveSharedState].metadataHive.runSqlHive(ddl)
--- End diff --

See the above answer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75813839
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -378,10 +378,9 @@ object SetMetastoreURLTest extends Logging {
 s"spark.sql.test.expectedMetastoreURL should be set.")
 }
 
-// HiveSharedState is used when Hive support is enabled.
+// HiveExternalCatalog is used when Hive support is enabled.
 val actualMetastoreURL =
-  spark.sharedState.asInstanceOf[HiveSharedState]
--- End diff --

See the above answer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75813787
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDataFrameSuite.scala ---
@@ -31,7 +31,7 @@ class HiveDataFrameSuite extends QueryTest with 
TestHiveSingleton {
   }
 
   test("SPARK-15887: hive-site.xml should be loaded") {
-val hiveClient = 
spark.sharedState.asInstanceOf[HiveSharedState].metadataHive
--- End diff --

`HiveSharedState` is removed, and thus, we have to do it. Right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75813591
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -40,6 +40,15 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 
   protected def resetState(): Unit = { }
 
+  // Clear all state before each test
+  override def beforeEach(): Unit = {
+try {
+  resetState()
--- End diff --

Since the test cases in `ExternalCatalog` do not drop the created tables, 
we need to drop it. To avoid duplicate cleanup, we can change `afterEach` to 
`afterAll`. Let me fix it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75812578
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala
 ---
@@ -51,7 +51,8 @@ class MetastoreDataSourcesSuite extends QueryTest with 
SQLTestUtils with TestHiv
 
   // To test `HiveExternalCatalog`, we need to read the raw table 
metadata(schema, partition
   // columns and bucket specification are still in table properties) from 
hive client.
-  private def hiveClient: HiveClient = 
sharedState.asInstanceOf[HiveSharedState].metadataHive
--- End diff --

do we need to change this line?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75812586
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/ShowCreateTableSuite.scala ---
@@ -266,7 +266,7 @@ class ShowCreateTableSuite extends QueryTest with 
SQLTestUtils with TestHiveSing
   }
 
   private def createRawHiveTable(ddl: String): Unit = {
-
hiveContext.sharedState.asInstanceOf[HiveSharedState].metadataHive.runSqlHive(ddl)
--- End diff --

do we need to change this line?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75812566
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala ---
@@ -378,10 +378,9 @@ object SetMetastoreURLTest extends Logging {
 s"spark.sql.test.expectedMetastoreURL should be set.")
 }
 
-// HiveSharedState is used when Hive support is enabled.
+// HiveExternalCatalog is used when Hive support is enabled.
 val actualMetastoreURL =
-  spark.sharedState.asInstanceOf[HiveSharedState]
--- End diff --

do we need to change this line?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75812417
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDataFrameSuite.scala ---
@@ -31,7 +31,7 @@ class HiveDataFrameSuite extends QueryTest with 
TestHiveSingleton {
   }
 
   test("SPARK-15887: hive-site.xml should be loaded") {
-val hiveClient = 
spark.sharedState.asInstanceOf[HiveSharedState].metadataHive
--- End diff --

do we need to change this line?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75812142
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -40,6 +40,15 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 
   protected def resetState(): Unit = { }
 
+  // Clear all state before each test
+  override def beforeEach(): Unit = {
+try {
+  resetState()
--- End diff --

Then can we remove the `afterEach` now?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75810819
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
-  // we are not really testing the reflection logic based on the setting of
-  // CATALOG_IMPLEMENTATION.
   @transient
-  override lazy val sharedState: HiveSharedState = {
-existingSharedState.getOrElse(new HiveSharedState(sc))
+  override lazy val sharedState: SharedState = {
--- End diff --

i see, thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75807732
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
-  // we are not really testing the reflection logic based on the setting of
-  // CATALOG_IMPLEMENTATION.
   @transient
-  override lazy val sharedState: HiveSharedState = {
-existingSharedState.getOrElse(new HiveSharedState(sc))
+  override lazy val sharedState: SharedState = {
--- End diff --

Although `existingSharedState` is using the same name, we did not override 
[the 
one](https://github.com/apache/spark/blob/18c2c92580bdc27aa5129d9e7abda418a3633ea6/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L55)
 in `SharedState`. Thus, they are different. 

Actually, I tried it. It breaks [one test 
case](https://github.com/apache/spark/blob/a117afa7c2d94f943106542ec53d74ba2b5f1058/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala#L1082)
 

BTW, it works if `TestHiveSparkSession` overrides both [`sparkContext` and 
`existingSharedState`](https://github.com/apache/spark/blob/18c2c92580bdc27aa5129d9e7abda418a3633ea6/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L54-L55).
 However, we have to change `existingSharedState` in `SparkSession` to 
non-private.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75806381
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
-  // we are not really testing the reflection logic based on the setting of
-  // CATALOG_IMPLEMENTATION.
   @transient
-  override lazy val sharedState: HiveSharedState = {
-existingSharedState.getOrElse(new HiveSharedState(sc))
+  override lazy val sharedState: SharedState = {
--- End diff --

But is `existingSharedState.getOrElse(new SharedState(sc))` same with what 
the parent do?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75804665
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
--- End diff --

The comment is moved to 
https://github.com/gatorsmile/spark/blob/d61ced8a2e81cef13afa5e0aa6203e505dd67570/sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala#L147-L148

Thanks!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75804066
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
-  // we are not really testing the reflection logic based on the setting of
-  // CATALOG_IMPLEMENTATION.
   @transient
-  override lazy val sharedState: HiveSharedState = {
-existingSharedState.getOrElse(new HiveSharedState(sc))
+  override lazy val sharedState: SharedState = {
--- End diff --

`sharedState` in `SparkSession` is unable to access the 
`existingSharedState` defined in `TestHiveSparkSession`.  We are unable to 
override it because [it is 
private](https://github.com/apache/spark/blob/18c2c92580bdc27aa5129d9e7abda418a3633ea6/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L55).
 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75800807
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
--- End diff --

Yeah. Let me move it back.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75800677
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -40,6 +40,15 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 
   protected def resetState(): Unit = { }
 
+  // Clear all state before each test
+  override def beforeEach(): Unit = {
+try {
+  resetState()
--- End diff --


https://github.com/apache/spark/blob/7bb64aae27f670531699f59d3f410e38866609b7/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogSuite.scala#L31-L35

Before this PR, we create a metastore at a temp location to avoid any 
potential conflict of having multiple connections to a single derby instance. 
This does not sound an issue now. Is my understanding right?

If we do not create a new metastore, we have to clean the existing 
metastore; otherwise, [the 
checking](https://github.com/apache/spark/blob/5effc016c893ce917d535cc1b5026d8e4c846721/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala#L60-L61)
 in the test case could fail.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75799684
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -41,13 +42,20 @@ import org.apache.spark.sql.types.{DataType, StructType}
  * A persistent implementation of the system catalog using Hive.
  * All public methods must be synchronized for thread-safety.
  */
-private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: 
Configuration)
+private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: 
Configuration)
   extends ExternalCatalog with Logging {
 
   import CatalogTypes.TablePartitionSpec
   import HiveExternalCatalog._
   import CatalogTableType._
 
+  /**
+   * A Hive client used to interact with the metastore.
+   */
+  private val client: HiveClient = {
--- End diff --

Sure, will do it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75791305
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
--- End diff --

is `TestHiveSessionState` already removed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75791279
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala ---
@@ -139,12 +139,9 @@ private[hive] class TestHiveSparkSession(
 
   assume(sc.conf.get(CATALOG_IMPLEMENTATION) == "hive")
 
-  // TODO: Let's remove HiveSharedState and TestHiveSessionState. 
Otherwise,
-  // we are not really testing the reflection logic based on the setting of
-  // CATALOG_IMPLEMENTATION.
   @transient
-  override lazy val sharedState: HiveSharedState = {
-existingSharedState.getOrElse(new HiveSharedState(sc))
+  override lazy val sharedState: SharedState = {
--- End diff --

do we need to override this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75790997
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala
 ---
@@ -40,6 +40,15 @@ abstract class ExternalCatalogSuite extends 
SparkFunSuite with BeforeAndAfterEac
 
   protected def resetState(): Unit = { }
 
+  // Clear all state before each test
+  override def beforeEach(): Unit = {
+try {
+  resetState()
--- End diff --

why do we need this now?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14757#discussion_r75790947
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -41,13 +42,20 @@ import org.apache.spark.sql.types.{DataType, StructType}
  * A persistent implementation of the system catalog using Hive.
  * All public methods must be synchronized for thread-safety.
  */
-private[spark] class HiveExternalCatalog(client: HiveClient, hadoopConf: 
Configuration)
+private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: 
Configuration)
   extends ExternalCatalog with Logging {
 
   import CatalogTypes.TablePartitionSpec
   import HiveExternalCatalog._
   import CatalogTableType._
 
+  /**
+   * A Hive client used to interact with the metastore.
+   */
+  private val client: HiveClient = {
--- End diff --

can we make it public and remove `getClient`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14757: [SPARK-17190] [SQL] Removal of HiveSharedState

2016-08-22 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/14757

[SPARK-17190] [SQL] Removal of HiveSharedState

### What changes were proposed in this pull request?
Since `HiveClient` is used to interact with the Hive metastore, it should 
be hidden in `HiveExternalCatalog`. After moving `HiveClient` into 
`HiveExternalCatalog`, `HiveSharedState` becomes a wrapper of 
`HiveExternalCatalog`. Thus, removal of `HiveSharedState` becomes 
straightforward. After removal of `HiveSharedState`, the reflection logic is 
directly applied on the choice of `ExternalCatalog` types, based on the 
configuration of `CATALOG_IMPLEMENTATION`. 

`HiveClient` is also used/invoked by the other entities besides 
HiveExternalCatalog, we defines the following two APIs:
```Scala
  /**
   * Return the existing [[HiveClient]] used to interact with the metastore.
   */
  def getClient: HiveClient

  /**
   * Return a [[HiveClient]] as a new session
   */
  def getNewClient: HiveClient
```

### How was this patch tested?
The existing test cases

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark removeHiveClient

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14757.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14757






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org