[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81431800 [Test build #28638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28638/consoleFull) for PR 4885 at commit [`1c47b2a`](https://github.com/apache/spark/commit/1c47b2a3e88579b0868564a5c598f90465b6d84a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81431807 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28638/ Test PASSed.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81403533 [Test build #28638 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28638/consoleFull) for PR 4885 at commit [`1c47b2a`](https://github.com/apache/spark/commit/1c47b2a3e88579b0868564a5c598f90465b6d84a). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81376089 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28636/ Test FAILed.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81376063 [Test build #28636 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28636/consoleFull) for PR 4885 at commit [`815b27a`](https://github.com/apache/spark/commit/815b27acfb90a8ef23b11abf7e22ab2f9ea5b0e6). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81375018 Thank you @liancheng @guowei2 for the review; I've updated the code as suggested. I am still thinking about how to handle temporary functions and tables, which should be isolated per `SQLSession`. That may be easier once the design in this PR is in place (we can handle those in a separate PR). Any suggestions, @liancheng @marmbrus @guowei2?
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-81373249 [Test build #28636 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28636/consoleFull) for PR 4885 at commit [`815b27a`](https://github.com/apache/spark/commit/815b27acfb90a8ef23b11abf7e22ab2f9ea5b0e6). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26457655

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: SparkContext)
    */
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
+  // TODO how to handle the temp table per user session?
   @transient
   protected[sql] lazy val catalog: Catalog = new SimpleCatalog(true)
+  // TODO how to handle the temp function per user session?
--- End diff --

Yes, the same as with `Catalog`: we also need to think about how to handle temporary functions for the current session in `FunctionRegistry`.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26457610

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: SparkContext)
    */
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
+  // TODO how to handle the temp table per user session?
--- End diff --

Yes, we can keep it as a separate PR. In Spark SQL, however, temporary tables are managed by `Catalog`, so we will probably also need to refactor the `Catalog` code a little.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26457577

--- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala ---
@@ -245,15 +377,22 @@ abstract class HiveThriftJdbcTest extends HiveThriftServer2Test {
     s"jdbc:hive2://localhost:$serverPort/"
   }
-  protected def withJdbcStatement(f: Statement => Unit): Unit = {
-    val connection = DriverManager.getConnection(jdbcUri, user, "")
-    val statement = connection.createStatement()
-
-    try f(statement) finally {
-      statement.close()
-      connection.close()
+  def withMultipleConnectionJdbcStatement(fs: Seq[Statement => Unit]) {
--- End diff --

That's a really nice suggestion!
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-78857208 Hey @chenghao-intel, terribly sorry for the delay. In general this LGTM. Left some comments, mostly on styling issues. Thanks!
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369537

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends SQLContext(sc) {
     Nil
   }
+  override protected[sql] def createSession(): SQLSession = {
+    new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
+    protected[sql] override lazy val conf: SQLConf = new SQLConf {
+      override def dialect: String = getConf(SQLConf.DIALECT, "hiveql")
+    }
+
+    protected[hive] lazy val hiveconf: HiveConf = {
--- End diff --

Any real use cases here? Maybe your authentication patch? In general, it's not recommended to use `HiveConf` as a mutable configuration collection.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369170

--- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala ---
@@ -245,15 +377,22 @@ abstract class HiveThriftJdbcTest extends HiveThriftServer2Test {
     s"jdbc:hive2://localhost:$serverPort/"
   }
-  protected def withJdbcStatement(f: Statement => Unit): Unit = {
-    val connection = DriverManager.getConnection(jdbcUri, user, "")
-    val statement = connection.createStatement()
-
-    try f(statement) finally {
-      statement.close()
-      connection.close()
+  def withMultipleConnectionJdbcStatement(fs: Seq[Statement => Unit]) {
--- End diff --

To make the syntax prettier, `fs` can be replaced with variable arguments:

```scala
def withMultipleConnectionJdbcStatement(fs: Statement => Unit*) { ... }
```
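A minimal sketch of the variable-argument form suggested above, assuming a stand-in `FakeStatement` class in place of `java.sql.Statement` so that it runs without a Thrift server. `FakeStatement`, `VarargsLoan`, `withStatements`, and `withStatement` are hypothetical names used only for illustration; none of them come from the PR. The function type is parenthesized here only to make the scope of `*` explicit, and `foreach` is used because the results are not needed.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for java.sql.Statement (assumption, not PR code).
final class FakeStatement(val id: Int) {
  var closed = false
  def close(): Unit = { closed = true }
}

object VarargsLoan {
  // Records every statement handed out, so callers can inspect them later.
  val opened: ArrayBuffer[FakeStatement] = ArrayBuffer.empty

  // One statement per function; all statements are closed even if a body throws.
  def withStatements(fs: (FakeStatement => Unit)*): Unit = {
    val statements = fs.indices.map(i => new FakeStatement(i))
    opened ++= statements
    try {
      statements.zip(fs).foreach { case (s, f) => f(s) }
    } finally {
      statements.foreach(_.close())
    }
  }

  // The single-statement helper delegates to the varargs version.
  def withStatement(f: FakeStatement => Unit): Unit = withStatements(f)
}
```

With this shape, call sites can pass the per-connection bodies directly as separate arguments instead of wrapping them in `Seq(...)`.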
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369178

--- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala ---
@@ -245,15 +377,22 @@ abstract class HiveThriftJdbcTest extends HiveThriftServer2Test {
     s"jdbc:hive2://localhost:$serverPort/"
   }
-  protected def withJdbcStatement(f: Statement => Unit): Unit = {
-    val connection = DriverManager.getConnection(jdbcUri, user, "")
-    val statement = connection.createStatement()
-
-    try f(statement) finally {
-      statement.close()
-      connection.close()
+  def withMultipleConnectionJdbcStatement(fs: Seq[Statement => Unit]) {
+    val user = System.getProperty("user.name")
+    val connections = fs.map { _ => DriverManager.getConnection(jdbcUri, user, "") }
+    val statements = connections.map(_.createStatement())
+
+    try {
+      statements.zip(fs).map { case (s, f) => f(s) }
+    } finally {
+      statements.map(_.close())
+      connections.map(_.close())
     }
   }
+
+  def withJdbcStatement(f: Statement => Unit) {
+  withMultipleConnectionJdbcStatement(Seq(f))
--- End diff --

Indentation is off.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369116

--- Diff: sql/hive-thriftserver/v0.12.0/src/main/scala/org/apache/spark/sql/hive/thriftserver/Shim12.scala ---
@@ -220,3 +227,42 @@ private[hive] class SparkExecuteStatementOperation(
     setState(OperationState.FINISHED)
   }
 }
+
+private[hive] class SparkSQLSessionManager(hiveContext: HiveContext)
+  extends SessionManager
+  with ReflectedCompositeService {
+
+  private lazy val sparkSqlOperationManager = new SparkSQLOperationManager(hiveContext)
+
+  override def init(hiveConf: HiveConf) {
+    setSuperField(this, "hiveConf", hiveConf)
+
+    val backgroundPoolSize = hiveConf.getIntVar(ConfVars.HIVE_SERVER2_ASYNC_EXEC_THREADS)
+    setSuperField(this, "backgroundOperationPool", Executors.newFixedThreadPool(backgroundPoolSize))
+    getAncestorField[Log](this, 3, "LOG").info(
+      s"HiveServer2: Async execution pool size $backgroundPoolSize")
+
+    setSuperField(this, "operationManager", sparkSqlOperationManager)
+    addService(sparkSqlOperationManager)
+
+    initCompositeService(hiveConf)
+  }
+
+  override def openSession(
+      username: String,
+      passwd: String,
+      sessionConf: java.util.Map[String, String],
+      withImpersonation: Boolean,
+      delegationToken: String): SessionHandle = {
+    hiveContext.openSession()
+
+    super.openSession(username, passwd, sessionConf, withImpersonation, delegationToken)
+  }
+
+  override def closeSession(sessionHandle: SessionHandle) {
+    super.closeSession(sessionHandle)
+    sparkSqlOperationManager.sessionToActivePool -= sessionHandle
+
+    hiveContext.detachSession()
+  }
+}
--- End diff --

Nit: add a newline.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369112

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -138,6 +142,14 @@ class SQLContext(@transient val sparkContext: SparkContext)
   protected[sql] def executePlan(plan: LogicalPlan) = new this.QueryExecution(plan)
+  @transient
+  protected[sql] val tss = new ThreadLocal[SQLSession]() {
--- End diff --

Maybe `currentSession` is a better name than `tss`.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369119

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends SQLContext(sc) {
     Nil
   }
+  override protected[sql] def createSession(): SQLSession = {
+    new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
--- End diff --

@guowei2 I think either way is OK for now. Putting all session-specific stuff into a central place (`SQLSession`) seems cleaner to me. Making `SQLSession` a thread-local does look a little ugly; however, right now it's not used anywhere other than the Thrift server. When we do decide to move Hive into a separate data source and make our own data-source-neutral Spark SQL server, we can handle the session problem in a cleaner way (e.g., using an actor for each session and keeping all session-specific stuff in the actor instance).
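The thread-local approach discussed above can be illustrated with a small, self-contained sketch. It assumes a simplified `Session` class holding only a settings map; `Session`, `SessionRegistry`, `SessionDemo`, and the `"db"` key are hypothetical names, and only the method names `openSession` and `detachSession` echo the calls visible in the PR's diff. It is not the PR's implementation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the PR's SQLSession: only per-session settings.
final class Session {
  val conf: mutable.Map[String, String] = mutable.Map.empty
}

object SessionRegistry {
  // Each thread gets its own Session; a default one is created on first use.
  private val currentSession = new ThreadLocal[Session] {
    override def initialValue(): Session = new Session
  }

  // Bind a fresh session to the calling thread.
  def openSession(): Session = {
    val session = new Session
    currentSession.set(session)
    session
  }

  // Drop the calling thread's session so it cannot leak to a later user.
  def detachSession(): Unit = currentSession.remove()

  def setConf(key: String, value: String): Unit = {
    currentSession.get().conf(key) = value
  }

  def getConf(key: String): Option[String] = currentSession.get().conf.get(key)
}

object SessionDemo {
  // Runs `body` on a new thread and returns what that thread observed.
  private def onThread(body: => Option[String]): Option[String] = {
    var result: Option[String] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = body }
    })
    t.start()
    t.join()
    result
  }

  // The first thread sets "db"; the second thread must not see that value.
  def demo(): (Option[String], Option[String]) = {
    val first = onThread {
      SessionRegistry.openSession()
      SessionRegistry.setConf("db", "sales")
      SessionRegistry.getConf("db")
    }
    val second = onThread {
      SessionRegistry.openSession()
      SessionRegistry.getConf("db")
    }
    (first, second)
  }
}
```

The trade-off raised in the comment is visible here: state is tied to whichever thread happens to serve the request, which is convenient inside a server that dedicates a thread to each session but awkward anywhere else.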
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369107

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: SparkContext)
    */
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
+  // TODO how to handle the temp table per user session?
--- End diff --

This is a good question. Ideally we would want session isolation for temporary tables. However, we can leave this for another PR if you think it makes this PR too complicated. Note in particular that `HiveMetastoreCatalog` handles both persisted and temporary tables.
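One way to picture session isolation for temporary tables is a per-session overlay on top of a shared table store. The sketch below is only an illustration under that assumption: `SessionCatalog`, `registerTempTable`, and `lookup` are hypothetical names, plans are represented as plain strings, and nothing here reflects how `Catalog` or `HiveMetastoreCatalog` is actually structured.

```scala
import scala.collection.mutable

// Hypothetical catalog: persisted tables are shared across sessions,
// temporary tables live in a map owned by a single session.
// Not thread-safe; a real catalog would need synchronization on `shared`.
final class SessionCatalog(shared: mutable.Map[String, String]) {
  private val tempTables = mutable.Map.empty[String, String]

  def registerTempTable(name: String, plan: String): Unit = {
    tempTables(name) = plan
  }

  // Temporary tables shadow persisted tables with the same name.
  def lookup(name: String): Option[String] =
    tempTables.get(name).orElse(shared.get(name))
}
```

Two `SessionCatalog` instances built over the same shared map see the same persisted tables, while a temporary table registered through one is invisible to the other.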
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369110

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -103,9 +105,11 @@ class SQLContext(@transient val sparkContext: SparkContext)
    */
   def getAllConfs: immutable.Map[String, String] = conf.getAllConfs
+  // TODO how to handle the temp table per user session?
   @transient
   protected[sql] lazy val catalog: Catalog = new SimpleCatalog(true)
+  // TODO how to handle the temp function per user session?
--- End diff --

Same here. But this one should be simpler, since we don't handle persisted UDFs in Spark SQL for now.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r26369097

--- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala ---
@@ -195,6 +195,138 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
       }
     }
   }
+
+  test("test multiple session") {
--- End diff --

Indentations are off in this test case.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77114487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28256/ Test PASSed.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77114480 [Test build #28256 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28256/consoleFull) for PR 4885 at commit [`0ca4bbd`](https://github.com/apache/spark/commit/0ca4bbd2512dcdcfa3cf23556fde119050d6c85b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user guowei2 commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r25756659

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends SQLContext(sc) {
     Nil
   }
+  override protected[sql] def createSession(): SQLSession = {
+    new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
+    protected[sql] override lazy val conf: SQLConf = new SQLConf {
+      override def dialect: String = getConf(SQLConf.DIALECT, "hiveql")
+    }
+
+    protected[hive] lazy val hiveconf: HiveConf = {
--- End diff --

Would `def` be better here? With a `lazy val`, if we set some new conf afterwards, we can't get the latest value.
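The difference the reviewer is pointing at can be seen in a small sketch, assuming a hypothetical `ConfHolder` class with a mutable settings map; the class, its members, and the `"some.key"` setting are illustrative only and are not taken from `HiveContext`.

```scala
import scala.collection.mutable

// Hypothetical holder used only to contrast `lazy val` with `def`.
final class ConfHolder {
  val settings: mutable.Map[String, String] = mutable.Map("some.key" -> "old")

  // Evaluated once on first access; later changes to `settings` are not seen.
  lazy val snapshotLazy: Map[String, String] = settings.toMap

  // Re-evaluated on every call, so it always reflects the current settings.
  def snapshotDef: Map[String, String] = settings.toMap
}
```

The cost of `def` is that the value is rebuilt on every access, which matters if construction is expensive; that is the trade-off behind the question above.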
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user guowei2 commented on a diff in the pull request: https://github.com/apache/spark/pull/4885#discussion_r25756484

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -272,6 +244,44 @@ class HiveContext(sc: SparkContext) extends SQLContext(sc) {
     Nil
   }
+  override protected[sql] def createSession(): SQLSession = {
+    new this.SQLSession()
+  }
+
+  protected[hive] class SQLSession extends super.SQLSession {
--- End diff --

I think there's no need to override `SQLSession` and `createSession` here, because `SessionState` itself is a `ThreadLocal`. We just need to set the `SessionState` in `openSession` in `SparkSQLSessionManager`.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77106224 [Test build #28256 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28256/consoleFull) for PR 4885 at commit [`0ca4bbd`](https://github.com/apache/spark/commit/0ca4bbd2512dcdcfa3cf23556fde119050d6c85b). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77102270 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28253/ Test FAILed.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77102265 [Test build #28253 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28253/consoleFull) for PR 4885 at commit [`5fea724`](https://github.com/apache/spark/commit/5fea724ba0682d60cedc00475f4ba5195e2fa357). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77102100 cc @liancheng @tianyi @guowei2 We now have two implementations for supporting multiple sessions in the Thrift server; could you review the code for me?
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4885#issuecomment-77101995 [Test build #28253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28253/consoleFull) for PR 4885 at commit [`5fea724`](https://github.com/apache/spark/commit/5fea724ba0682d60cedc00475f4ba5195e2fa357). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/4885

[SPARK-2087] [SQL] [WIP] Multiple thriftserver sessions with single HiveContext instance

This is an alternative implementation to #4382. It keeps only a single HiveContext within the ThriftServer, and the session-dependent objects are wrapped in an internal class, `SQLSession`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenghao-intel/spark multisessions_singlecontext

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4885.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4885

commit 5fea724ba0682d60cedc00475f4ba5195e2fa357
Author: Cheng Hao
Date: 2015-03-04T05:49:50Z

    thriftservice with single context
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-73195494 @liancheng Seems HiveThriftServer2Suite didn't run, is it disabled by default?
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-73186905 @mallman this PR aims to fix exactly the bug you mentioned, and it passed testing on my local machine. However, I am still investigating some of the unit test failures; hopefully I can remove the "WIP" from the title soon.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user mallman commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-73081758 FWIW I'd like to add my two cents. The main piece of functionality the installation at my company would benefit from is independent user sessions. I'm not familiar enough with the source to say exactly what that means in terms of a source patch, but one of the key use cases is the ability to set the session default database ("use ") and SQLConf settings independent of other beeline connections. Right now, setting the database sets it across all connections and that is a major impediment to wider use of a shared thriftserver. Cheers!
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72992929 Seems `HiveThriftServer2Suite` didn't run, is it disabled by default?
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72992740 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26816/ Test PASSed.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72992738 [Test build #26816 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26816/consoleFull) for PR 4382 at commit [`403d6ec`](https://github.com/apache/spark/commit/403d6ec4e11e0e815a2b2f10ebd4e530d857074b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72992394 @guowei2 Probably not. `HiveContext` has its own internal metastore (temp metastore), `SQLConf` instance, etc. I don't think multiple users want to share that information when they connect to the same thriftserver. `SessionState` already keeps its internal state in a thread-local, hence I didn't change that in my PR.
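For readers unfamiliar with the mechanism under discussion, the following is a minimal Java sketch of thread-local state using the standard `java.lang.ThreadLocal` class. The class `ThreadLocalSessionSketch` and the key `current.database` are hypothetical names chosen for illustration; this is not how Hive's `SessionState` or Spark's `HiveContext` is implemented, only a demonstration that each thread sees its own copy of the state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of per-thread state; NOT Hive's SessionState.
final class ThreadLocalSessionSketch {

    // Every thread lazily receives its own, initially empty, settings map.
    private static final ThreadLocal<Map<String, String>> STATE =
        ThreadLocal.withInitial(HashMap::new);

    // Stores a value visible only to the calling thread.
    static void set(String key, String value) {
        STATE.get().put(key, value);
    }

    // Reads the calling thread's value, or null if that thread never set it.
    static String get(String key) {
        return STATE.get().get(key);
    }

    // Discards the calling thread's state, e.g. when its session ends.
    static void clear() {
        STATE.remove();
    }

    public static void main(String[] args) throws InterruptedException {
        set("current.database", "sales");
        Thread other = new Thread(() -> {
            // The other thread starts with an empty map, so this prints null.
            System.out.println("other thread sees: " + get("current.database"));
            set("current.database", "logs");
        });
        other.start();
        other.join();
        // The main thread's value is unaffected by the other thread's write.
        System.out.println("main thread sees: " + get("current.database"));
    }
}
```

Note that thread-local state is attached to the thread, not to the client connection. It isolates sessions only if each session is always served by the same thread and state is cleared before a pooled thread is reused; whether that holds for the thriftserver is not established in this thread.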
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user guowei2 commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72991661 @chenghao-intel I think there is no need to make `HiveContext` a `ThreadLocal`; a `ThreadLocal` `SessionState` is enough. https://github.com/guowei2/spark/compare/SPARK-4815?expand=1 Are we fixing the same issue?
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72987398 @liancheng Do you have any idea how to collect the thrift server logs in the unit test? It reports a timeout exception, and I believe the cause is either a port error or the server process exiting due to some error (such as a failure to create the Hive metastore client).
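One general way to see why a server process started by a test timed out is to capture the child process's output and dump it when the wait expires. The Java sketch below shows that technique with the standard `ProcessBuilder` API. It is not taken from `HiveThriftServer2Suite`; the class `ServerLogCapture`, its `Result` type, and the `runAndCapture` helper are hypothetical names, and the command to launch is whatever the caller supplies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; NOT part of Spark's test suites.
final class ServerLogCapture {

    // Outcome of one run: whether the process exited in time, plus its output.
    static final class Result {
        final boolean finished;
        final List<String> lines;

        Result(boolean finished, List<String> lines) {
            this.finished = finished;
            this.lines = lines;
        }
    }

    // Starts the command, collects stdout and stderr together, and waits up to
    // timeoutMillis for it to exit. On timeout the process is killed.
    static Result runAndCapture(List<String> command, long timeoutMillis)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        List<String> lines = Collections.synchronizedList(new ArrayList<>());
        Thread reader = new Thread(() -> {
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    lines.add(line);
                }
            } catch (IOException e) {
                lines.add("[log reader stopped: " + e + "]");
            }
        });
        reader.setDaemon(true);
        reader.start();

        boolean finished = process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS);
        if (!finished) {
            process.destroyForcibly();
        }
        reader.join(1000); // give the reader a moment to drain remaining output
        return new Result(finished, new ArrayList<>(lines));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ServerLogCapture <command> [args...]");
            return;
        }
        Result result = runAndCapture(Arrays.asList(args), 5000);
        System.out.println(result.finished ? "process exited" : "timed out; captured log:");
        result.lines.forEach(System.out::println);
    }
}
```

In a test, the captured lines could be included in the failure message when `finished` is false, so that a port conflict or a startup exception shows up alongside the timeout.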
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4382#issuecomment-72987184 [Test build #26816 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26816/consoleFull) for PR 4382 at commit [`403d6ec`](https://github.com/apache/spark/commit/403d6ec4e11e0e815a2b2f10ebd4e530d857074b). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2087] [SQL] [WIP] Multiple thriftserver...
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/4382 [SPARK-2087] [SQL] [WIP] Multiple thriftserver sessions with different HiveContext instances Passed local binary deployment testing but failed in the unit tests; I will investigate what happened. You can merge this pull request into a Git repository by running: $ git pull https://github.com/chenghao-intel/spark multisessions Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4382.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4382 commit 403d6ec4e11e0e815a2b2f10ebd4e530d857074b Author: Cheng Hao Date: 2015-02-05T02:55:34Z Multiple thriftserver sessions with different HiveContext instance