[GitHub] spark pull request #16099: [SPARK-18665][SQL] set statement state to "ERROR"...
Github user BruceXu1991 commented on a diff in the pull request: https://github.com/apache/spark/pull/16099#discussion_r165876866

--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ---
```
@@ -241,6 +241,8 @@ private[hive] class SparkExecuteStatementOperation(
         dataTypes = result.queryExecution.analyzed.output.map(_.dataType).toArray
       } catch {
         case e: HiveSQLException =>
+          HiveThriftServer2.listener.onStatementError(
+            statementId, e.getMessage, SparkUtils.exceptionString(e))
```
--- End diff --

This issue still occurs in Spark 2.2.1. A workaround is to use `String.valueOf(e.getMessage)` instead of `e.getMessage` to avoid the NPE.

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
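The workaround relies on `String.valueOf(Object)` mapping a null reference to the literal string `"null"` rather than throwing. A minimal, Spark-independent Java sketch of the idea (`safeMessage` is an illustrative helper, not a Spark method):

```java
public class NullMessageDemo {
    // Exceptions that wrap another throwable often have getMessage() == null;
    // String.valueOf turns that null into the string "null", so code that
    // later calls methods on the message (e.g. rendering it in a UI cell)
    // no longer dereferences a null reference.
    static String safeMessage(Throwable e) {
        return String.valueOf(e.getMessage());
    }

    public static void main(String[] args) {
        Throwable noMessage = new RuntimeException();        // getMessage() is null
        Throwable withMessage = new RuntimeException("boom");
        System.out.println(safeMessage(noMessage));   // prints "null", no NPE
        System.out.println(safeMessage(withMessage)); // prints "boom"
    }
}
```

The resulting `"null"` string in the UI is cosmetically ugly but harmless, which is why it is framed as a workaround rather than a fix.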
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Github user BruceXu1991 commented on a diff in the pull request: https://github.com/apache/spark/pull/20034#discussion_r158577749

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
```
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
   /** Returns the configuration for the current session. */
   def conf: HiveConf = state.getConf

-  private val userName = state.getAuthenticator.getUserName
+  private val userName = conf.getUser
```
--- End diff --

Well, Spark 2.2.1's current implementation is
```
private val userName = state.getAuthenticator.getUserName
```
When the implementation of `state.getAuthenticator` is **HadoopDefaultAuthenticator**, the default in the Hive conf, the username is obtained correctly. However, when the implementation is **SessionStateUserAuthenticator**, which is used in my case, the username will be null. The simplified code below explains the reason:

1) HadoopDefaultAuthenticator
```
public class HadoopDefaultAuthenticator implements HiveAuthenticationProvider {

  @Override
  public String getUserName() {
    return userName;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    UserGroupInformation ugi = null;
    try {
      ugi = Utils.getUGI();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
    this.userName = ugi.getShortUserName();
    if (ugi.getGroupNames() != null) {
      this.groupNames = Arrays.asList(ugi.getGroupNames());
    }
  }
}

public class Utils {
  public static UserGroupInformation getUGI() throws LoginException, IOException {
    String doAs = System.getenv("HADOOP_USER_NAME");
    if (doAs != null && doAs.length() > 0) {
      return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
    }
    return UserGroupInformation.getCurrentUser();
  }
}
```
This shows that HadoopDefaultAuthenticator obtains the username through `Utils.getUGI()`, so the username resolves to `HADOOP_USER_NAME` (as a proxy of the login user) or the current user.
2) SessionStateUserAuthenticator
```
public class SessionStateUserAuthenticator implements HiveAuthenticationProvider {

  @Override
  public void setConf(Configuration arg0) {
  }

  @Override
  public String getUserName() {
    return sessionState.getUserName();
  }
}
```
This shows that SessionStateUserAuthenticator obtains the username through `sessionState.getUserName()`, which is null. Here is the [instantiation of SessionState in HiveClientImpl](https://github.com/apache/spark/blob/1cf3e3a26961d306eb17b7629d8742a4df45f339/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L187).
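The two code paths above can be condensed into a small, dependency-free Java sketch. The class and field names below are simplified stand-ins for the Hive types (no real Hadoop/Hive API is used), chosen only to show why one authenticator yields a user and the other yields null:

```java
// Simplified stand-in for Hive's HiveAuthenticationProvider interface.
interface Authenticator {
    String getUserName();
}

// Stand-in for HadoopDefaultAuthenticator: prefers the HADOOP_USER_NAME
// environment variable, otherwise falls back to a "login user" (here the
// JVM's user.name property instead of Hadoop's UserGroupInformation).
class DefaultAuth implements Authenticator {
    public String getUserName() {
        String doAs = System.getenv("HADOOP_USER_NAME");
        if (doAs != null && doAs.length() > 0) {
            return doAs;                        // proxy-user path
        }
        return System.getProperty("user.name"); // current-user path
    }
}

// Stand-in for SessionStateUserAuthenticator: it only echoes whatever
// username was stored on the session state -- null when the SessionState
// was constructed without one, as in HiveClientImpl.
class SessionAuth implements Authenticator {
    String sessionUserName; // never set -> stays null

    public String getUserName() {
        return sessionUserName;
    }
}

public class AuthDemo {
    public static void main(String[] args) {
        System.out.println(new DefaultAuth().getUserName()); // a real user name
        System.out.println(new SessionAuth().getUserName()); // null -> null table owner
    }
}
```

This is why switching to `conf.getUser` fixes the owner: it resolves the user from the configuration rather than trusting whichever authenticator the session happens to carry.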
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
Github user BruceXu1991 commented on a diff in the pull request: https://github.com/apache/spark/pull/20034#discussion_r158472814

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
```
@@ -186,7 +186,7 @@ private[hive] class HiveClientImpl(
   /** Returns the configuration for the current session. */
   def conf: HiveConf = state.getConf

-  private val userName = state.getAuthenticator.getUserName
+  private val userName = conf.getUser
```
--- End diff --

Yes, I met this problem using MySQL as the Hive metastore. What's more, when I execute `DESCRIBE FORMATTED spark_22846`, a NullPointerException occurs:
```
> DESCRIBE FORMATTED offline.spark_22846;
Error: java.lang.NullPointerException (state=,code=0)
```
And the detailed stack trace:
```
17/12/22 18:18:10 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING, java.lang.NullPointerException
  at scala.collection.immutable.StringOps$.length$extension(StringOps.scala:47)
  at scala.collection.immutable.StringOps.length(StringOps.scala:47)
  at scala.collection.IndexedSeqOptimized$class.isEmpty(IndexedSeqOptimized.scala:27)
  at scala.collection.immutable.StringOps.isEmpty(StringOps.scala:29)
  at scala.collection.TraversableOnce$class.nonEmpty(TraversableOnce.scala:111)
  at scala.collection.immutable.StringOps.nonEmpty(StringOps.scala:29)
  at org.apache.spark.sql.catalyst.catalog.CatalogTable.toLinkedHashMap(interface.scala:301)
  at org.apache.spark.sql.execution.command.DescribeTableCommand.describeFormattedTableInfo(tables.scala:559)
  at org.apache.spark.sql.execution.command.DescribeTableCommand.run(tables.scala:537)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
  at org.apache.spark.sql.Dataset.(Dataset.scala:183)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:767)
  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)
```
The NPE happens because the owner is null. The relevant source code is below:
```
def toLinkedHashMap: mutable.LinkedHashMap[String, String] = {
  ...
  // line 301:
  if (owner.nonEmpty) map.put("Owner", owner)
}
```
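The failure mode at line 301 reduces to calling a method on a null string reference: Scala's `owner.nonEmpty` compiles down to a method call on `owner`, so a null owner dereferences null. The same pattern and a null-safe guard, reduced to plain Java (class and method names are illustrative, not Spark code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OwnerNpeDemo {
    // Mirrors the failing pattern: a method call on a possibly-null owner.
    static void putOwnerUnsafe(Map<String, String> map, String owner) {
        if (!owner.isEmpty()) { // throws NullPointerException when owner == null
            map.put("Owner", owner);
        }
    }

    // Null-safe variant: check the owner was actually recorded first.
    static void putOwnerSafe(Map<String, String> map, String owner) {
        if (owner != null && !owner.isEmpty()) {
            map.put("Owner", owner);
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        putOwnerSafe(props, null);    // no entry added, no NPE
        putOwnerSafe(props, "bruce");
        System.out.println(props);    // {Owner=bruce}

        try {
            putOwnerUnsafe(new LinkedHashMap<>(), null);
        } catch (NullPointerException e) {
            System.out.println("NPE, as seen in DESCRIBE FORMATTED");
        }
    }
}
```

The PR fixes the root cause (a null owner being written to the metastore) rather than adding such a guard at every read site.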
[GitHub] spark issue #20034: [SPARK-22846][SQL] Fix table owner is null when creating...
Github user BruceXu1991 commented on the issue: https://github.com/apache/spark/pull/20034

@cloud-fan @gatorsmile could you review this issue?
[GitHub] spark pull request #20034: [SPARK-22846][SQL] Fix table owner is null when c...
GitHub user BruceXu1991 opened a pull request: https://github.com/apache/spark/pull/20034

[SPARK-22846][SQL] Fix table owner is null when creating table through spark sql or thriftserver

## What changes were proposed in this pull request?

Fix the table owner being null when creating a new table through Spark SQL.

## How was this patch tested?

Manual test:
1. First create a table.
2. Select the table properties from the MySQL database backing the Hive metastore.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BruceXu1991/spark SPARK-22846

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20034.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20034

commit e8c3035028e6242005806476f5ce7cbdad5af889
Author: xu.wenchun <xu.wenchun@...>
Date: 2017-12-20T13:05:13Z

    fix SPARK-22846
[GitHub] spark pull request #16099: [SPARK-18665][SQL] set statement state to "ERROR"...
Github user BruceXu1991 commented on a diff in the pull request: https://github.com/apache/spark/pull/16099#discussion_r116693874

--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala ---
```
@@ -241,6 +241,8 @@ private[hive] class SparkExecuteStatementOperation(
         dataTypes = result.queryExecution.analyzed.output.map(_.dataType).toArray
       } catch {
         case e: HiveSQLException =>
+          HiveThriftServer2.listener.onStatementError(
+            statementId, e.getMessage, SparkUtils.exceptionString(e))
```
--- End diff --

I tried your code; there seems to be a bug. `e.getMessage` may return a null value, which can cause an NPE in `ThriftServerPage.errorMessageCell(detail)`.

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[GitHub] spark pull request #17269: SPARK-19927: Support the --hivevar in the spark-s...
GitHub user BruceXu1991 opened a pull request: https://github.com/apache/spark/pull/17269

SPARK-19927: Support the --hivevar in the spark-sql command line and in the beeline

## What changes were proposed in this pull request?

- Read the hivevar keys of the Hive variables into sqlContext.conf in SparkSQLOperationManager
- Add TestSQL.sql as a test resource
- Add a spark-sql CLI test case in CliSuite

## How was this patch tested?

Added a test case for spark-sql CLI mode; manual test for Beeline mode.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BruceXu1991/spark hivevar-bugfix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17269.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #17269

commit 586df4ee78faff5c8180abf4bde2c3f0bf4dc053
Author: xu.wenchun <xu.wenc...@immomo.com>
Date: 2017-03-12T17:50:41Z

    Support the --hivevar in the spark-sql command line and in the beeline
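Conceptually, `--hivevar key=value` defines variables that are substituted into SQL text wherever `${hivevar:key}` appears. A rough illustration of that substitution step in plain Java (this sketches the concept only; it is not Spark's or Hive's actual implementation, and `HivevarDemo`/`substitute` are invented names):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HivevarDemo {
    private static final Pattern HIVEVAR =
        Pattern.compile("\\$\\{hivevar:([A-Za-z0-9_]+)\\}");

    // Replace each ${hivevar:key} with its value; unknown keys are left
    // untouched so the downstream parser can report them.
    static String substitute(String sql, Map<String, String> hivevars) {
        Matcher m = HIVEVAR.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = hivevars.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // e.g. spark-sql --hivevar db=offline -e "SELECT * FROM ${hivevar:db}.t"
        String sql = "SELECT * FROM ${hivevar:db}.t";
        System.out.println(substitute(sql, Map.of("db", "offline")));
        // prints: SELECT * FROM offline.t
    }
}
```

The PR's change is to make the thriftserver and spark-sql CLI carry these `--hivevar` definitions into the session configuration so that this kind of substitution actually sees them.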