[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155022#comment-16155022 ]

Hu Liu, commented on SPARK-21918:
---------------------------------

[~mgaido] Yes, all the jobs are executed as the same user, but the problem
isn't in STS itself.
STS opens the session with impersonation when doAs is enabled:
{code:java}
if (cliService.getHiveConf().getBoolVar(ConfVars.HIVE_SERVER2_ENABLE_DOAS) &&
        (userName != null)) {
      String delegationTokenStr = getDelegationToken(userName);
      sessionHandle = cliService.openSessionWithImpersonation(protocol, userName,
          req.getPassword(), ipAddress, req.getConfiguration(), delegationTokenStr);
    } else {
{code}
The SQL is then run under the session UGI through HiveSessionProxy (a rough
doAs sketch follows).
For DDL operations, Spark SQL uses the Hive object in HiveClientImpl.scala to
communicate with the metastore.
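For illustration, the proxy's effect is roughly the following (a hand-written
sketch, not the HiveSessionProxy source; runAsSessionUser, sessionUgi and body
are invented names):
{code:scala}
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

// Run a block of work as the session user by wrapping it in UGI.doAs,
// which is roughly what HiveSessionProxy does for each HiveSession call.
def runAsSessionUser[T](sessionUgi: UserGroupInformation)(body: => T): T = {
  sessionUgi.doAs(new PrivilegedExceptionAction[T] {
    override def run(): T = body
  })
}
{code}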
Currently that Hive object is shared between different threads in
HiveClientImpl.scala, which is why all jobs are executed as the same user:
{code:scala}
  private def client: Hive = {
    if (clientLoader.cachedHive != null) {
      clientLoader.cachedHive.asInstanceOf[Hive]
    } else {
      val c = Hive.get(conf)
      clientLoader.cachedHive = c
      c
    }
  }
{code}
Actually the Hive class keeps a separate instance per thread (Hive.get and
Hive.set operate on a thread-local), and HiveSessionImplwithUGI already creates
a Hive object for the current user session:

{code:java}
    // create a new metastore connection for this particular user session
    Hive.set(null);
    try {
      sessionHive = Hive.get(getHiveConf());
    } catch (HiveException e) {
      throw new HiveSQLException("Failed to setup metastore connection", e);
    }
{code}
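As a quick illustration of that per-thread behavior (again a hand-written
sketch, not project code; it assumes a reachable metastore and the variable
names are made up):
{code:scala}
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.ql.metadata.Hive

val conf = new HiveConf()
val hiveOnThisThread = Hive.get(conf)        // instance bound to the calling thread

val worker = new Thread(new Runnable {
  override def run(): Unit = {
    // A new thread gets its own instance unless one is handed over via Hive.set.
    val hiveOnWorker = Hive.get(conf)
    println(hiveOnWorker eq hiveOnThisThread) // false: separate thread-local instances
  }
})
worker.start()
worker.join()
{code}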

If we pass the Hive object for the current user session to the worker thread,
we can fix this problem.
I have already fixed it and can run DDL operations as the session user.
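Roughly, the hand-over could look like this (a hand-written sketch of the idea
only, not the actual patch; runWithSessionHive and sessionHive are illustrative
names):
{code:scala}
import org.apache.hadoop.hive.ql.metadata.Hive

// Bind the session user's Hive object to the current worker thread before
// touching the metastore, so the DDL call uses that user's metastore client
// instead of a Hive instance cached for the server user.
def runWithSessionHive[T](sessionHive: Hive)(body: => T): T = {
  Hive.set(sessionHive)  // Hive.set stores the instance in Hive's thread-local
  try {
    body
  } finally {
    Hive.set(null)       // clear the binding so a pooled thread does not keep it
  }
}
{code}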



> HiveClient shouldn't share Hive object between different thread
> ---------------------------------------------------------------
>
>                 Key: SPARK-21918
>                 URL: https://issues.apache.org/jira/browse/SPARK-21918
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hu Liu,
>
> I'm testing the Spark Thrift Server and found that all the DDL statements are
> run as user hive even if hive.server2.enable.doAs=true.
> The root cause is that the Hive object is shared between different threads in
> HiveClientImpl:
> {code:scala}
>   private def client: Hive = {
>     if (clientLoader.cachedHive != null) {
>       clientLoader.cachedHive.asInstanceOf[Hive]
>     } else {
>       val c = Hive.get(conf)
>       clientLoader.cachedHive = c
>       c
>     }
>   }
> {code}
> But in impersonation mode, we should only share the Hive object within a
> thread so that the metastore client in Hive is associated with the right
> user.
> We can fix it by passing the Hive object of the parent thread to the child
> thread when running the SQL.
> I already have an initial patch for review and I'm glad to work on it if
> anyone could assign it to me.


