[ 
https://issues.apache.org/jira/browse/HIVE-23802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159197#comment-17159197
 ] 

gaozhan ding commented on HIVE-23802:
-------------------------------------

Could someone please review this patch?

> “merge files” job was submited to default queue when set hive.merge.tezfiles 
> to true
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-23802
>                 URL: https://issues.apache.org/jira/browse/HIVE-23802
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: gaozhan ding
>            Assignee: gaozhan ding
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 15940042679272.png, HIVE-23802.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We use Tez as the query engine. When hive.merge.tezfiles is set to true, the 
> merge-files task, which follows the original task, is submitted to the default 
> queue rather than to the queue used by the original task.
> I studied this issue for several days and found that every time a container is 
> started, "tez.queue.name" is unset in the current session. The code is as follows:
> {code:java}
> // TezSessionState.startSessionAndContainers()
> // sessionState.getQueueName() comes from cluster wide configured queue names.
> // sessionState.getConf().get("tez.queue.name") is explicitly set by user in a session.
> // TezSessionPoolManager sets tez.queue.name if user has specified one or use the one from
> // cluster wide queue names.
> // There is no way to differentiate how this was set (user vs system).
> // Unset this after opening the session so that reopening of session uses the correct queue
> // names i.e, if client has not died and if the user has explicitly set a queue name
> // then reopened session will use user specified queue name else default cluster queue names.
> conf.unset(TezConfiguration.TEZ_QUEUE_NAME);
> {code}
> So after the original task is submitted to YARN, "tez.queue.name" is unset. 
> When the merge-files task starts, it tries to reuse the session of the original 
> job, but canWorkWithSameSession() returns false because tez.queue.name is no 
> longer set. It seems that this property should not be unset.
> {code:java}
> // TezSessionPoolManager.canWorkWithSameSession()
> if (!session.isDefault()) {
>   String queueName = session.getQueueName();
>   String confQueueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
>   LOG.info("Current queue name is " + queueName + " incoming queue name is " + confQueueName);
>   return (queueName == null) ? confQueueName == null : queueName.equals(confQueueName);
> } else {
>   // this session should never be a default session unless something has messed up.
>   throw new HiveException("The pool session " + session + " should have been returned to the pool");
> }
> {code}
>    !15940042679272.png!
>  
>  
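
For reviewers who want to see the interaction in isolation, here is a minimal standalone sketch of the behaviour described in the quoted report. It does not use any Hive or Tez classes: the class `QueueMismatchSketch`, its nested `Session`, the plain `Map` standing in for the configuration, and the queue name "etl" are all hypothetical and exist only for illustration. It assumes the session remembers the queue it was opened with, which is a simplification of the real session state.

```java
import java.util.HashMap;
import java.util.Map;

public class QueueMismatchSketch {
    static final String TEZ_QUEUE_NAME = "tez.queue.name";

    // Hypothetical stand-in for a Tez session: remembers the queue it was opened with.
    static final class Session {
        private String queueName;

        void open(Map<String, String> conf) {
            queueName = conf.get(TEZ_QUEUE_NAME);
            // Mirrors the unset shown in the quoted startSessionAndContainers() excerpt.
            conf.remove(TEZ_QUEUE_NAME);
        }

        String getQueueName() {
            return queueName;
        }
    }

    // Same comparison as the non-default branch of the quoted canWorkWithSameSession().
    static boolean canWorkWithSameSession(Session session, Map<String, String> conf) {
        String queueName = session.getQueueName();
        String confQueueName = conf.get(TEZ_QUEUE_NAME);
        return (queueName == null) ? confQueueName == null : queueName.equals(confQueueName);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TEZ_QUEUE_NAME, "etl"); // user-specified queue (made-up name)

        Session session = new Session();
        session.open(conf); // original task: opens the session, then the key is removed

        // Follow-up "merge files" task checks whether it can reuse the session.
        System.out.println("session queue = " + session.getQueueName());
        System.out.println("conf queue = " + conf.get(TEZ_QUEUE_NAME));
        System.out.println("reusable = " + canWorkWithSameSession(session, conf));
    }
}
```

In this sketch the session keeps "etl" while the configuration no longer carries the key, so the reuse check fails for the follow-up task. Putting the key back into the map before the check makes it pass, which is consistent with the suggestion in the report that the property should not be unset.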



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
