[ 
https://issues.apache.org/jira/browse/HIVE-22527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-22527:
------------------------------
    Description: 
Hive on Tez. We enable small-file merging with set 
*hive.merge.tezfiles=true*, so another job is launched to merge files 
after the SQL job. However, the merge job is submitted to a different YARN 
queue, not the queue of the current beeline client session. It seems that the 
merge job starts a new Tez session with a new conf that differs from the 
current session conf, so the merge job goes into the default queue.

 

Attachment *hive logs.png* shows that the current session queue is 
*root.bdoc.production* (String queueName = session.getQueueName();) while the 
incoming queue name is *null* (String confQueueName = 
conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we are logged in to the 
same beeline client with *set tez.queue.name=root.bdoc.production*, so all 
jobs, including the file merge job, should be submitted to the same queue.

[https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]

[https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]
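The comparison behind the two quoted lines can be sketched as follows. This is a minimal, self-contained illustration, not the Hive source: the class QueueMismatchSketch and the helper sameQueue are hypothetical, a plain Map stands in for the Hive/Tez configuration object, and it is an assumption that a failed comparison is what keeps the merge job from reusing the current session. The only detail taken from Tez is that TezConfiguration.TEZ_QUEUE_NAME is the key "tez.queue.name".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class QueueMismatchSketch {
    // Same string value as TezConfiguration.TEZ_QUEUE_NAME.
    public static final String TEZ_QUEUE_NAME = "tez.queue.name";

    // Hypothetical helper: true when the session's queue equals the queue
    // named in the incoming conf. A missing key yields null, as in the report.
    public static boolean sameQueue(String sessionQueue, Map<String, String> conf) {
        String confQueueName = conf.get(TEZ_QUEUE_NAME);
        return Objects.equals(sessionQueue, confQueueName);
    }

    public static void main(String[] args) {
        String sessionQueue = "root.bdoc.production";

        // Conf as the merge job appears to receive it: queue name not set.
        Map<String, String> mergeJobConf = new HashMap<>();
        System.out.println(sameQueue(sessionQueue, mergeJobConf)); // prints "false"

        // Conf with the session's queue carried over.
        mergeJobConf.put(TEZ_QUEUE_NAME, sessionQueue);
        System.out.println(sameQueue(sessionQueue, mergeJobConf)); // prints "true"
    }
}
```

Under these assumptions, a conf that loses tez.queue.name never matches the session queue, which is consistent with the merge job landing in the default queue.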

 

Attachment *explain with merge files.png* shows that stage-4 is a 
separate file merge job, which is submitted to another YARN queue (the default 
queue), not the queue root.bdoc.production.

  was:
Hive on Tez. We enable small-file merging with set 
*hive.merge.tezfiles=true*, so another job is launched to merge files 
after the SQL job. However, the merge job is submitted to a different YARN 
queue, not the queue of the current beeline client session. It seems that the 
merge job starts a new Tez session with a new conf that differs from the 
current session conf, so the merge job goes into the default queue.

 

Attachment *hive logs.png* shows that the current session queue is 
*root.bdoc.production* (String queueName = session.getQueueName();) while the 
incoming queue name is *null* (String confQueueName = 
conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we are logged in to the 
same beeline client with *set tez.queue.name=root.bdoc.production*, so all 
jobs, including the file merge job, should be submitted to the queue.

[https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]

[https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]

 

Attachment *explain with merge files.png* shows that stage-4 is a 
separate file merge job, which is submitted to another YARN queue (the default 
queue), not the queue root.bdoc.production.


> Hive on Tez : Job of merging small files will be submitted into another queue 
> (default queue)
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22527
>                 URL: https://issues.apache.org/jira/browse/HIVE-22527
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.1.1
>            Reporter: zhangbutao
>            Priority: Blocker
>         Attachments: explain with merge files.png, file merge job.png, hive 
> logs.png
>
>
> Hive on Tez. We enable small-file merging with set 
> *hive.merge.tezfiles=true*, so another job is launched to merge files 
> after the SQL job. However, the merge job is submitted to a different YARN 
> queue, not the queue of the current beeline client session. It seems that 
> the merge job starts a new Tez session with a new conf that differs from the 
> current session conf, so the merge job goes into the default queue.
>  
> Attachment *hive logs.png* shows that the current session queue is 
> *root.bdoc.production* (String queueName = session.getQueueName();) while the 
> incoming queue name is *null* (String confQueueName = 
> conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we are logged in to the 
> same beeline client with *set tez.queue.name=root.bdoc.production*, so all 
> jobs, including the file merge job, should be submitted to the same queue.
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]
>  
> Attachment *explain with merge files.png* shows that stage-4 is a 
> separate file merge job, which is submitted to another YARN queue (the default 
> queue), not the queue root.bdoc.production.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
