[ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16699562#comment-16699562
 ] 

Roohi Syeda commented on HIVE-20819:
------------------------------------

Logs

 

2018-10-23 10:52:58,516 DEBUG hive.ql.parse.ParseDriver: 
[HiveServer2-Handler-Pool: Thread-54]: Parsing command: drop table empdata

2018-10-23 10:52:58,516 DEBUG hive.ql.parse.ParseDriver: 
[HiveServer2-Handler-Pool: Thread-54]: Parse Completed

2018-10-23 10:52:58,566 INFO  hive.metastore: [HiveServer2-Handler-Pool: 
Thread-54]: Trying to connect to metastore with URI thrift://XXXXX:9083

*2018-10-23 10:52:58,567 INFO  hive.metastore: [**HiveServer2-Handler-Pool**: 
Thread-54]: Opened a connection to metastore, current connections: 4*

2018-10-23 10:52:58,569 INFO  hive.metastore: [HiveServer2-Handler-Pool: 
Thread-54]: Connected to metastore.

2018-10-23 10:52:58,698 INFO  org.apache.hadoop.hive.ql.Driver: 
[HiveServer2-Handler-Pool: Thread-54]: Semantic Analysis Completed

2018-10-23 10:52:58,699 INFO  hive.ql.metadata.Hive: [HiveServer2-Handler-Pool: 
Thread-54]: Dumping metastore api call timing information for : compilation 
phase

2018-10-23 10:52:58,699 DEBUG hive.ql.metadata.Hive: [HiveServer2-Handler-Pool: 
Thread-54]: Total time spent in each metastore function (ms): 
\{getTable_(String, String, )=129}

2018-10-23 10:52:58,699 INFO  org.apache.hadoop.hive.ql.Driver: 
[HiveServer2-Handler-Pool: Thread-54]: Completed compiling 
command(queryId=hive_20181023105252_d3247a1c-e343-470b-aa46-a692b5ade414); Time 
taken: 0.183 seconds

2018-10-23 10:52:58,699 DEBUG org.apache.hadoop.security.UserGroupInformation: 
*[HiveServer2-**Handler-Pool**: Thread-54]:* *PrivilegedAction as:hive 
(auth:SIMPLE)* 
*from:org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)*

2018-10-23 10:52:58,699 DEBUG org.apache.hive.service.cli.CLIService: 
[HiveServer2-Handler-Pool: Thread-54]: SessionHandle 
[44f74bd9-1a71-458e-9992-e9d8afc3a958]: executeStatementAsync()

2018-10-23 10:52:58,700 DEBUG org.apache.hadoop.security.UserGroupInformation: 
[HiveServer2-Background-Pool: Thread-56]: PrivilegedAction as:hive (auth:PROXY) 
via hive (auth:SIMPLE) 
from:org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)

2018-10-23 10:52:58,700 DEBUG org.apache.thrift.transport.TSaslTransport: 
[HiveServer2-Handler-Pool: Thread-54]: writing data length: 109

*2018-10-23 10:52:58,715 DEBUG hive.ql.metadata.Hive: 
[HiveServer2-Background-Pool: Thread-56]: Creating new db. 
db.isCurrentUserOwner = false*

*2018-10-23 10:52:58,715 DEBUG hive.ql.metadata.Hive: 
[HiveServer2-Background-Pool: Thread-56]: Closing current thread's connection 
to Hive Metastore.*

*2018-10-23 10:52:58,715 INFO  hive.metastore: [HiveServer2-Background-Pool: 
Thread-56]: Closed a connection to metastore, current connections: 3*

2018-10-23 10:52:58,716 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
[HiveServer2-Background-Pool: Thread-56]: </PERFLOG method=TimeToSubmit 
start=1540317178702 end=1540317178716 duration=14 
from=org.apache.hadoop.hive.ql.Driver>

2018-10-23 10:52:58,716 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
[HiveServer2-Background-Pool: Thread-56]: <PERFLOG method=runTasks 
from=org.apache.hadoop.hive.ql.Driver>

2018-10-23 10:52:58,716 INFO  org.apache.hadoop.hive.ql.Driver: 
[HiveServer2-Background-Pool: Thread-56]: Starting task [Stage-0:DDL] in serial 
mode

2018-10-23 10:52:58,717 INFO  hive.metastore: [HiveServer2-Background-Pool: 
Thread-56]: Trying to connect to metastore with URI thrift://XXXX:9083

*2018-10-23 10:52:58,717 INFO  hive.metastore: 
[HiveServer2-**Background-Pool**: Thread-56]: Opened a connection to metastore, 
current connections: 4*

2018-10-23 10:52:58,720 INFO  hive.metastore: [HiveServer2-Background-Pool: 
Thread-56]: Connected to metastore.

> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-20819
>                 URL: https://issues.apache.org/jira/browse/HIVE-20819
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Roohi Syeda
>            Assignee: Roohi Syeda
>            Priority: Minor
>         Attachments: HIVE-20819.1.patch
>
>
> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set.
> The connections created are in ESTABLISHED state and never closed
>  
> *More Details :* 
> When a new query is executed for a new session
>  
> The handler thread, calls line 66 HiveSessionImplwithUGI
> (UserGroupInformation.createProxyUser(
>           owner, UserGroupInformation.getLoginUser());
>  
> At *query compile time*, this sessionUgi is used to open MS connection by 
> *handler* thread
> Later at *query run time*, line 277 of SQLOperation
> Runnable work =
>           new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), 
> SessionState.get(),
>               asyncPrepare);
>  
> getCurrentUGI(); is used to create a new proxy user, which in turn calls 
> Utils.getUGI (see below) and passed to the *Background* thread
>  
>  public static UserGroupInformation *getUGI*() throws LoginException, 
> IOException {
>     String doAs = System.getenv("HADOOP_USER_NAME");
>     if(doAs != null && doAs.length() > 0) {
>      /*
>       * this allows doAs (proxy user) to be passed along across process 
> boundary where
>       * delegation tokens are not supported.  For example, a DDL stmt via 
> WebHCat with
>       * a doAs parameter, forks to 'hcat' which needs to start a Session that
>       * proxies the end user
>       */
>       return UserGroupInformation.createProxyUser(doAs, 
> UserGroupInformation.getLoginUser());
>     }
>     return UserGroupInformation.getCurrentUser();
>   }
>  
>  
> currentUGI creates a *new* proxyuser instance. This ugi is being set on the 
> background thread
> And when it is trying to get the Hive db in subsequent calls, we see that 
> since the ugi’s are not equal (See the equals code below), a new connection 
> is opened, which is never closed, by background thread.
> Line 318 in Hive.java
>  
>   private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean 
> isFastCheck,
>       boolean doRegisterAllFns) throws HiveException {
>     Hive db = hiveDB.get();
>     if (db == null || !db.*isCurrentUserOwner*() || needsRefresh
>         || (c != null && !isCompatible(db, c, isFastCheck))) {
>       db = create(c, false, db, doRegisterAllFns);
>     }
>     if (c != null) {
>       db.conf = c;
>     }
>     return db;
>   }
>  
>  private boolean isCurrentUserOwner() throws HiveException {
>     try {
>       return owner == null || 
> owner.equals(UserGroupInformation.getCurrentUser());
>     } catch(IOException e) {
>       throw new HiveException("Error getting current user: " + 
> e.getMessage(), e);
>     }
>   }
>  /**
>    * Compare the subjects to see if they are equal to each other.
>    */
>   @Override
>   public boolean *equals*(Object o) {
>     if (o == this) {
>       return true;
>     } else if (o == null || getClass() != o.getClass()) {
>       return false;
>     } else {
>       return subject == ((UserGroupInformation) o).subject;
>     }
>   }
>  
> Solution:
> When we assign *currentUGI* to the bg thread, we should call 
> UserGroupInformation.getCurrentUser() (see below) instead of calling *getUGI* 
> method listed above (which creates a new instance of proxy user and subject 
> and will always return isCurrentUserOwner as false, since both subject and 
> ugi instances are different and hence creates a new MS connection)
>  
> /**
>    * Return the current user, including any doAs in the current stack.
>  
>    */
>  
>   public synchronized
>   static UserGroupInformation getCurrentUser() throws IOException {
>     AccessControlContext context = AccessController.getContext();
>     Subject subject = Subject.getSubject(context);
>     if (subject == null || subject.getPrincipals(User.class).isEmpty()) {
>       return getLoginUser();
>     } else {
>       return new UserGroupInformation(subject);
>     }
>   }
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to