[ 
https://issues.apache.org/jira/browse/HIVE-23802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaozhan ding updated HIVE-23802:
--------------------------------
    Description: 
We use Tez as the query engine. When hive.merge.tezfiles is set to true, the 
merge-files task that follows the original task is submitted to the default 
queue rather than to the same queue as the original task.

I studied this issue for days and found that every time a container is 
started, "tez.queue.name" is unset in the current session. The code is below:
{code:java}
// TezSessionState.startSessionAndContainers()

// sessionState.getQueueName() comes from cluster wide configured queue names.
// sessionState.getConf().get("tez.queue.name") is explicitly set by user in a session.
// TezSessionPoolManager sets tez.queue.name if user has specified one or use the one from
// cluster wide queue names.
// There is no way to differentiate how this was set (user vs system).
// Unset this after opening the session so that reopening of session uses the correct queue
// names i.e, if client has not died and if the user has explicitly set a queue name
// then reopened session will use user specified queue name else default cluster queue names.
conf.unset(TezConfiguration.TEZ_QUEUE_NAME);
{code}
So after the original task has been submitted to YARN, "tez.queue.name" is 
unset. When the merge-files task starts, it tries to reuse the same session as 
the original job, but the check returns false because "tez.queue.name" was 
unset.
{code:java}
// TezSessionPoolManager.canWorkWithSameSession()

if (!session.isDefault()) {
  String queueName = session.getQueueName();
  String confQueueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
  LOG.info("Current queue name is " + queueName + " incoming queue name is " + confQueueName);
  return (queueName == null) ? confQueueName == null : queueName.equals(confQueueName);
} else {
  // this session should never be a default session unless something has messed up.
  throw new HiveException("The pool session " + session + " should have been returned to the pool");
}
{code}
   !15940042679272.png!
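
The interaction between the two snippets above can be reproduced without a 
cluster. The following is only a minimal sketch, not Hive code: 
QueueReuseSketch, Session, openSession and the queue name "etl" are 
hypothetical, a plain Map stands in for the Hive configuration, and it assumes 
the session remembers the queue it was opened with. Only the unset step and 
the null-safe comparison are taken from the quoted code.

```java
import java.util.HashMap;
import java.util.Map;

public class QueueReuseSketch {
    static final String TEZ_QUEUE_NAME = "tez.queue.name";

    // Hypothetical stand-in for a pooled Tez session that remembers its queue.
    static final class Session {
        private final String queueName;

        Session(String queueName) {
            this.queueName = queueName;
        }

        String getQueueName() {
            return queueName;
        }
    }

    // Models opening a session: remember the queue, then unset the key,
    // mirroring conf.unset(TezConfiguration.TEZ_QUEUE_NAME) in the quoted code.
    static Session openSession(Map<String, String> conf) {
        Session session = new Session(conf.get(TEZ_QUEUE_NAME));
        conf.remove(TEZ_QUEUE_NAME);
        return session;
    }

    // Same null-safe comparison as the quoted canWorkWithSameSession() branch.
    static boolean canWorkWithSameSession(Session session, Map<String, String> conf) {
        String queueName = session.getQueueName();
        String confQueueName = conf.get(TEZ_QUEUE_NAME);
        return (queueName == null) ? confQueueName == null : queueName.equals(confQueueName);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TEZ_QUEUE_NAME, "etl"); // the user explicitly picks a queue

        Session session = openSession(conf); // original task opens the session

        System.out.println("session queue = " + session.getQueueName());         // etl
        System.out.println("conf queue after open = " + conf.get(TEZ_QUEUE_NAME)); // null
        // The merge-files task now asks whether it can reuse the session.
        System.out.println("reusable = " + canWorkWithSameSession(session, conf)); // false
    }
}
```

In this sketch the session still reports "etl" while the configuration no 
longer carries a queue name, so the comparison fails even though the user 
never changed the queue.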

 

 


> “merge files” job was submitted to the default queue when hive.merge.tezfiles 
> is set to true
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-23802
>                 URL: https://issues.apache.org/jira/browse/HIVE-23802
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: gaozhan ding
>            Assignee: gaozhan ding
>            Priority: Major
>         Attachments: 15940042679272.png
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)