[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81431800
  
  [Test build #28638 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28638/consoleFull)
 for   PR 4885 at commit 
[`1c47b2a`](https://github.com/apache/spark/commit/1c47b2a3e88579b0868564a5c598f90465b6d84a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81431807
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28638/
Test PASSed.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81403533
  
  [Test build #28638 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28638/consoleFull)
 for   PR 4885 at commit 
[`1c47b2a`](https://github.com/apache/spark/commit/1c47b2a3e88579b0868564a5c598f90465b6d84a).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81376089
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28636/
Test FAILed.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81376063
  
  [Test build #28636 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28636/consoleFull)
 for   PR 4885 at commit 
[`815b27a`](https://github.com/apache/spark/commit/815b27acfb90a8ef23b11abf7e22ab2f9ea5b0e6).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81375018
  
Thank you @liancheng @guowei2 for the review; I've updated the code as 
suggested.

I am still thinking about how to handle temporary functions and tables, 
which should be isolated per `SQLSession`. It may be easier to settle that 
design alongside this PR (the implementation can go into a separate PR). 
Any suggestions, @liancheng @marmbrus @guowei2?





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-81373249
  
  [Test build #28636 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28636/consoleFull)
 for   PR 4885 at commit 
[`815b27a`](https://github.com/apache/spark/commit/815b27acfb90a8ef23b11abf7e22ab2f9ea5b0e6).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26457655
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: 
SparkContext)
*/
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
 
+  // TODO how to handle the temp table per user session?
   @transient
   protected[sql] lazy val catalog: Catalog = new SimpleCatalog(true)
 
+  // TODO how to handle the temp function per user session?
--- End diff --

Yes, same as with `Catalog`: we also need to think about how to handle 
temporary functions for the current session in `FunctionRegistry`.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26457610
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: 
SparkContext)
*/
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
 
+  // TODO how to handle the temp table per user session?
--- End diff --

Yes, we can keep it as a separate PR. In Spark SQL, however, temp tables are 
managed by `Catalog`, so we will probably also need to refactor the `Catalog` 
code a little.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-15 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26457577
  
--- Diff: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 ---
@@ -245,15 +377,22 @@ abstract class HiveThriftJdbcTest extends 
HiveThriftServer2Test {
 s"jdbc:hive2://localhost:$serverPort/"
   }
 
-  protected def withJdbcStatement(f: Statement => Unit): Unit = {
-val connection = DriverManager.getConnection(jdbcUri, user, "")
-val statement = connection.createStatement()
-
-try f(statement) finally {
-  statement.close()
-  connection.close()
+  def withMultipleConnectionJdbcStatement(fs: Seq[Statement => Unit]) {
--- End diff --

That's a really nice suggestion!





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-78857208
  
Hey @chenghao-intel, terribly sorry for the delay. In general this LGTM. 
Left some comments, mostly on styling issues. Thanks!





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369537
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends 
SQLContext(sc) {
 Nil
 }
 
+  override protected[sql] def createSession(): SQLSession = {
+new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
+protected[sql] override lazy val conf: SQLConf = new SQLConf {
+  override def dialect: String = getConf(SQLConf.DIALECT, "hiveql")
+}
+
+protected[hive] lazy val hiveconf: HiveConf = {
--- End diff --

Any real use cases here? Maybe your authentication patch? In general, it's 
not recommended to use `HiveConf` as a mutable configuration collection.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369170
  
--- Diff: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 ---
@@ -245,15 +377,22 @@ abstract class HiveThriftJdbcTest extends 
HiveThriftServer2Test {
 s"jdbc:hive2://localhost:$serverPort/"
   }
 
-  protected def withJdbcStatement(f: Statement => Unit): Unit = {
-val connection = DriverManager.getConnection(jdbcUri, user, "")
-val statement = connection.createStatement()
-
-try f(statement) finally {
-  statement.close()
-  connection.close()
+  def withMultipleConnectionJdbcStatement(fs: Seq[Statement => Unit]) {
--- End diff --

To make the syntax prettier, `fs` can be replaced with variable arguments:

```scala
def withMultipleConnectionJdbcStatement(fs: Statement => Unit*) {
  ...
}
```
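
As a rough illustration of what the variable-arguments form could look like end to end, here is a minimal sketch. `FakeStatement`, `VarargsLoanSketch`, and both helper names are hypothetical stand-ins so the example runs without a JDBC driver; this is not the code in the PR.

```scala
// Hypothetical stand-in for java.sql.Statement so the sketch needs no JDBC driver.
final class FakeStatement(val id: Int) {
  var closed = false
  def close(): Unit = closed = true
}

object VarargsLoanSketch {
  // One fresh resource per function; every resource is closed even if a function throws.
  def withMultipleStatements(fs: (FakeStatement => Unit)*): Seq[FakeStatement] = {
    val statements = fs.indices.map(i => new FakeStatement(i))
    try {
      statements.zip(fs).foreach { case (s, f) => f(s) }
    } finally {
      statements.foreach(_.close())
    }
    statements
  }

  // The single-statement helper becomes a plain one-argument call.
  def withStatement(f: FakeStatement => Unit): Seq[FakeStatement] =
    withMultipleStatements(f)
}
```

Callers can then write `withMultipleStatements(s => ..., s => ...)` directly, and an existing sequence can still be passed with `fs: _*`.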





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369178
  
--- Diff: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 ---
@@ -245,15 +377,22 @@ abstract class HiveThriftJdbcTest extends 
HiveThriftServer2Test {
 s"jdbc:hive2://localhost:$serverPort/"
   }
 
-  protected def withJdbcStatement(f: Statement => Unit): Unit = {
-val connection = DriverManager.getConnection(jdbcUri, user, "")
-val statement = connection.createStatement()
-
-try f(statement) finally {
-  statement.close()
-  connection.close()
+  def withMultipleConnectionJdbcStatement(fs: Seq[Statement => Unit]) {
+val user = System.getProperty("user.name")
+val connections = fs.map { _ => DriverManager.getConnection(jdbcUri, 
user, "") }
+val statements = connections.map(_.createStatement())
+
+try {
+  statements.zip(fs).map { case (s, f) => f(s) }
+} finally {
+  statements.map(_.close())
+  connections.map(_.close())
 }
   }
+
+  def withJdbcStatement(f: Statement => Unit) {
+  withMultipleConnectionJdbcStatement(Seq(f))
--- End diff --

Indentation is off.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369116
  
--- Diff: 
sql/hive-thriftserver/v0.12.0/src/main/scala/org/apache/spark/sql/hive/thriftserver/Shim12.scala
 ---
@@ -220,3 +227,42 @@ private[hive] class SparkExecuteStatementOperation(
 setState(OperationState.FINISHED)
   }
 }
+
+private[hive] class SparkSQLSessionManager(hiveContext: HiveContext)
+  extends SessionManager
+  with ReflectedCompositeService {
+
+  private lazy val sparkSqlOperationManager = new 
SparkSQLOperationManager(hiveContext)
+
+  override def init(hiveConf: HiveConf) {
+setSuperField(this, "hiveConf", hiveConf)
+
+val backgroundPoolSize = 
hiveConf.getIntVar(ConfVars.HIVE_SERVER2_ASYNC_EXEC_THREADS)
+setSuperField(this, "backgroundOperationPool", 
Executors.newFixedThreadPool(backgroundPoolSize))
+getAncestorField[Log](this, 3, "LOG").info(
+  s"HiveServer2: Async execution pool size $backgroundPoolSize")
+
+setSuperField(this, "operationManager", sparkSqlOperationManager)
+addService(sparkSqlOperationManager)
+
+initCompositeService(hiveConf)
+  }
+
+  override def openSession(
+  username: String,
+  passwd: String,
+  sessionConf: java.util.Map[String, String],
+  withImpersonation: Boolean,
+  delegationToken: String): SessionHandle = {
+hiveContext.openSession()
+
+super.openSession(username, passwd, sessionConf, withImpersonation, 
delegationToken)
+  }
+
+  override def closeSession(sessionHandle: SessionHandle) {
+super.closeSession(sessionHandle)
+sparkSqlOperationManager.sessionToActivePool -= sessionHandle
+
+hiveContext.detachSession()
+  }
+}
--- End diff --

Nit: add a newline.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369112
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -138,6 +142,14 @@ class SQLContext(@transient val sparkContext: 
SparkContext)
 
   protected[sql] def executePlan(plan: LogicalPlan) = new 
this.QueryExecution(plan)
 
+  @transient
+  protected[sql] val tss = new ThreadLocal[SQLSession]() {
--- End diff --

Maybe `currentSession` is a better name than `tss`.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369119
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends 
SQLContext(sc) {
 Nil
 }
 
+  override protected[sql] def createSession(): SQLSession = {
+new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
--- End diff --

@guowei2 I think either way is OK for now. Putting all session-specific 
stuff into a central place (`SQLSession`) seems cleaner to me. Making 
`SQLSession` a thread-local does look a little ugly, but right now it's 
not used anywhere other than the Thrift server. When we do decide to move Hive 
into a separate data source and build our own data-source-neutral Spark SQL 
server, we can handle the session problem in a cleaner way (e.g., using an 
actor for each session and keeping all session-specific stuff in the actor 
instance).
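
For readers unfamiliar with the thread-local approach discussed here, a minimal sketch follows. The `Session` class, the object name, and all method names are hypothetical and only illustrate the pattern; the only real API used is `java.lang.ThreadLocal`.

```scala
object ThreadLocalSessionSketch {
  // Hypothetical session holder: one mutable config map per session.
  final class Session {
    val conf = scala.collection.mutable.Map.empty[String, String]
  }

  // Each thread that touches `currentSession` lazily gets its own Session.
  private val currentSession = new ThreadLocal[Session] {
    override def initialValue(): Session = new Session
  }

  def setConf(key: String, value: String): Unit = currentSession.get().conf(key) = value
  def getConf(key: String): Option[String] = currentSession.get().conf.get(key)

  // Drop the calling thread's session, roughly what closing a session would do.
  def detachSession(): Unit = currentSession.remove()

  // Run `body` on a new thread and return its result to the caller.
  def onNewThread[A](body: => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable { def run(): Unit = result = Some(body) })
    t.start()
    t.join()
    result.get
  }
}
```

The trade-off is the one noted above: session state follows the thread implicitly, which is convenient when each connection is served by its own thread but becomes awkward once work for one session hops between threads.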





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369107
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: 
SparkContext)
*/
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
 
+  // TODO how to handle the temp table per user session?
--- End diff --

This is a good question. Ideally we want session isolation for temporary 
tables. However, we can leave this for another PR if you think it makes this 
PR too complicated, especially since `HiveMetastoreCatalog` handles both 
persisted and temporary tables.
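
One possible shape for such isolation, purely as a sketch: persisted tables live in a shared catalog while each session keeps its own temporary tables. `SharedCatalog`, `SessionCatalog`, and the string "plans" are hypothetical and do not correspond to the existing `Catalog` or `HiveMetastoreCatalog` APIs.

```scala
object SessionCatalogSketch {
  // Hypothetical shared catalog: persisted tables are visible to every session.
  final class SharedCatalog {
    private val persisted = scala.collection.concurrent.TrieMap.empty[String, String]
    def registerPersisted(name: String, plan: String): Unit = persisted.put(name, plan)
    def lookupPersisted(name: String): Option[String] = persisted.get(name)
  }

  // Hypothetical per-session view: temporary tables are private to one session.
  final class SessionCatalog(shared: SharedCatalog) {
    private val temp = scala.collection.mutable.Map.empty[String, String]
    def registerTemp(name: String, plan: String): Unit = temp(name) = plan

    // Temporary tables shadow persisted tables with the same name.
    def lookup(name: String): Option[String] =
      temp.get(name).orElse(shared.lookupPersisted(name))
  }
}
```

A temporary table registered through one `SessionCatalog` is then invisible to any other session, while persisted tables resolve the same way everywhere.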





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369110
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: 
SparkContext)
*/
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
 
+  // TODO how to handle the temp table per user session?
   @transient
   protected[sql] lazy val catalog: Catalog = new SimpleCatalog(true)
 
+  // TODO how to handle the temp function per user session?
--- End diff --

Same here. But this one should be simpler, since we don't handle persisted 
UDFs in Spark SQL for now.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-13 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r26369097
  
--- Diff: 
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 ---
@@ -195,6 +195,138 @@ class HiveThriftBinaryServerSuite extends 
HiveThriftJdbcTest {
   }
 }
   }
+
+  test("test multiple session") {
--- End diff --

Indentations are off in this test case.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77114487
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28256/
Test PASSed.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77114480
  
  [Test build #28256 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28256/consoleFull)
 for   PR 4885 at commit 
[`0ca4bbd`](https://github.com/apache/spark/commit/0ca4bbd2512dcdcfa3cf23556fde119050d6c85b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread guowei2
Github user guowei2 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r25756659
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends 
SQLContext(sc) {
 Nil
 }
 
+  override protected[sql] def createSession(): SQLSession = {
+new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
+protected[sql] override lazy val conf: SQLConf = new SQLConf {
+  override def dialect: String = getConf(SQLConf.DIALECT, "hiveql")
+}
+
+protected[hive] lazy val hiveconf: HiveConf = {
--- End diff --

Would `def` be better here? 
With a `lazy val`, if we set a new conf afterwards, we can't get the latest value.
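
The general difference can be shown with a small sketch. It assumes the value is computed as a snapshot of other mutable state; `Settings` and its members are hypothetical and are not the `hiveconf` member in the diff.

```scala
object LazyValVsDefSketch {
  // Hypothetical mutable settings store standing in for session configuration.
  final class Settings {
    private val values = scala.collection.mutable.Map("dialect" -> "hiveql")
    def set(key: String, value: String): Unit = values(key) = value

    // Evaluated once on first access; later `set` calls are not reflected.
    lazy val snapshotLazy: Map[String, String] = values.toMap

    // Re-evaluated on every call, so it always reflects the latest `set`.
    def snapshotDef: Map[String, String] = values.toMap
  }
}
```

The cost of `def` is that the value is rebuilt on every access, which matters if construction is expensive.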





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread guowei2
Github user guowei2 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4885#discussion_r25756484
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends 
SQLContext(sc) {
 Nil
 }
 
+  override protected[sql] def createSession(): SQLSession = {
+new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
--- End diff --

I think there's no need to override `SQLSession` and `createSession` here, 
because `SessionState` itself is a `ThreadLocal`. We just need to set 
`SessionState` in `openSession` in `SparkSQLSessionManager`.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77106224
  
  [Test build #28256 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28256/consoleFull)
 for   PR 4885 at commit 
[`0ca4bbd`](https://github.com/apache/spark/commit/0ca4bbd2512dcdcfa3cf23556fde119050d6c85b).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77102270
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28253/
Test FAILed.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77102265
  
  [Test build #28253 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28253/consoleFull)
 for   PR 4885 at commit 
[`5fea724`](https://github.com/apache/spark/commit/5fea724ba0682d60cedc00475f4ba5195e2fa357).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77102100
  
cc @liancheng @tianyi @guowei2
We now have two implementations for supporting multiple sessions in the 
Thrift server; could you review the code?





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4885#issuecomment-77101995
  
  [Test build #28253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28253/consoleFull) for PR 4885 at commit [`5fea724`](https://github.com/apache/spark/commit/5fea724ba0682d60cedc00475f4ba5195e2fa357).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-03-03 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/4885

[SPARK-2087] [SQL] [WIP] Multiple thriftserver sessions with single 
HiveContext instance

This is an alternative implementation to #4382. It keeps only a single 
HiveContext within ThriftServer, and the session-dependent objects are wrapped 
in an internal class `SQLSession`.
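
As a rough illustration of the single-context idea (a Python sketch, not code from this PR), the example below keeps one shared context object and stores per-session state in a map keyed by session id. The names `SharedContext`, `SessionState`, and `open_session` are hypothetical and do not correspond to Spark's `HiveContext` or `SQLSession` API.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical per-session settings (not Spark's SQLSession)."""
    current_database: str = "default"
    conf: dict = field(default_factory=dict)


class SharedContext:
    """One context instance that serves many sessions."""

    def __init__(self):
        # Maps a session id to that session's private state.
        self._sessions = {}

    def open_session(self, session_id):
        state = SessionState()
        self._sessions[session_id] = state
        return state

    def session(self, session_id):
        return self._sessions[session_id]

    def close_session(self, session_id):
        # Drop the state so it does not outlive the connection.
        self._sessions.pop(session_id, None)
```

Expensive resources would live on the shared object, while anything a user can change (current database, configuration) would live in the per-session entry.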

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark multisessions_singlecontext

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4885.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4885


commit 5fea724ba0682d60cedc00475f4ba5195e2fa357
Author: Cheng Hao 
Date:   2015-03-04T05:49:50Z

thriftservice with single context







[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-05 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-73195494
  
@liancheng It seems `HiveThriftServer2Suite` didn't run. Is it disabled by 
default?





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-05 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-73186905
  
@mallman This PR aims to fix exactly the bug you mentioned, and it passed 
testing on my local machine. However, I am still working through some of the 
unit test failures; hopefully I can remove the "WIP" from the title 
soon.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-05 Thread mallman
Github user mallman commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-73081758
  
FWIW I'd like to add my two cents. The main piece of functionality the 
installation at my company would benefit from is independent user sessions. I'm 
not familiar enough with the source to say exactly what that means in terms of 
a source patch, but one of the key use cases is the ability to set the session 
default database ("use ") and SQLConf settings independent of other 
beeline connections. Right now, setting the database sets it across all 
connections and that is a major impediment to wider use of a shared 
thriftserver.
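
To make the problem concrete, here is a minimal Python sketch of the behavior described above. It is only a model: `SharedServer` and `Connection` are hypothetical names, not thriftserver classes. It assumes the server keeps a single current database for all connections.

```python
class SharedServer:
    """Hypothetical server with one current database for everyone."""

    def __init__(self):
        self.current_database = "default"

    def connect(self):
        return Connection(self)


class Connection:
    def __init__(self, server):
        self._server = server

    def use(self, database):
        # Writes to server-wide state, so every connection sees the change.
        self._server.current_database = database

    def current_database(self):
        return self._server.current_database


server = SharedServer()
alice = server.connect()
bob = server.connect()
alice.use("sales")
# bob never changed databases, yet now observes alice's choice.
leaked = bob.current_database()
```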

Cheers!





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72992929
  
It seems `HiveThriftServer2Suite` didn't run. Is it disabled by default?





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72992740
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26816/
Test PASSed.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72992738
  
  [Test build #26816 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26816/consoleFull) for PR 4382 at commit [`403d6ec`](https://github.com/apache/spark/commit/403d6ec4e11e0e815a2b2f10ebd4e530d857074b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72992394
  
@guowei2 Probably not. HiveContext has its own internal metastore (temp 
metastore), `SQLConf` instance, etc. I don't think multiple users want to share 
that information when they connect to the same thriftserver. `SessionState` 
already keeps its internal state as a thread-local, hence I didn't change 
that in my PR. 
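
For readers unfamiliar with thread-local state, the following Python sketch shows the general mechanism using the standard `threading.local`. It is not Hive's `SessionState`; the helper names `set_database`, `get_database`, and `run_in_thread` are made up for illustration. Note the assumption that each session runs on its own thread; a thread-local alone does not isolate sessions that share or switch threads.

```python
import threading

# Each thread sees its own independent attributes on this object.
_state = threading.local()


def set_database(name):
    _state.database = name


def get_database():
    # Threads that never called set_database fall back to "default".
    return getattr(_state, "database", "default")


def run_in_thread(name, results):
    set_database(name)
    results[name] = get_database()


results = {}
threads = [
    threading.Thread(target=run_in_thread, args=(name, results))
    for name in ("sales", "hr")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker reads back only the value it set, and the main thread, which never set a value, still sees the default.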





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread guowei2
Github user guowei2 commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72991661
  
@chenghao-intel  
I think there is no need to make `HiveContext` a `ThreadLocal`; making 
`SessionState` a `ThreadLocal` is enough.
https://github.com/guowei2/spark/compare/SPARK-4815?expand=1
Are we fixing the same issue?





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72987398
  
@liancheng Do you have any idea how to collect the thrift server logs in 
the unit test? It reports a timeout exception, and I believe either there was a 
port error or the server process exited due to some error (such as a failure to 
create the Hive metastore client).





[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4382#issuecomment-72987184
  
  [Test build #26816 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26816/consoleFull) for PR 4382 at commit [`403d6ec`](https://github.com/apache/spark/commit/403d6ec4e11e0e815a2b2f10ebd4e530d857074b).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...

2015-02-04 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/4382

[SPARK-2087] [SQL] [WIP] Multiple thriftserver sessions with different 
HiveContext instances

Passed local binary deployment testing, but failed in the unit tests; 
will investigate what happened.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark multisessions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4382.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4382


commit 403d6ec4e11e0e815a2b2f10ebd4e530d857074b
Author: Cheng Hao 
Date:   2015-02-05T02:55:34Z

Multiple thriftserver sessions with different HiveContext instance



