Do you have the DAG plan? I mean the DAG topology, i.e. the dot file in the AM 
container.




Best Regard,
Jeff Zhang


From: Xiaoyong Zhu <xiaoy...@microsoft.com<mailto:xiaoy...@microsoft.com>>
Reply-To: "user@tez.apache.org<mailto:user@tez.apache.org>" 
<user@tez.apache.org<mailto:user@tez.apache.org>>
Date: Friday, September 11, 2015 at 1:25 PM
To: "user@tez.apache.org<mailto:user@tez.apache.org>" 
<user@tez.apache.org<mailto:user@tez.apache.org>>
Subject: RE: how to allocate more containers?

Thanks for the information. Here's my understanding of the resource allocation 
(please correct me if I am wrong) and my scenario:

1. Assume the cluster is dedicated to a single Tez application, so I want to 
maximize that application's resource usage (memory/CPU).

2. Assume I have changed all the configurations on the YARN side so that the 
memory/CPU allocation of each node is maximized (meaning each node can 
theoretically be fully utilized). The input is around 500 GB to 1 TB.

3. I then launch a Tez application (Hive on Tez). Tez chooses the number of 
tasks (in my case, there are usually about 3K tasks), and each task usually 
runs for about 10 to 20 seconds.

In this case, I don't think the number of Tez tasks should be increased (each 
of them runs for only a few seconds, so I think each task is able to process 
its data). The swimlane picture is attached (for a smaller data size, but the 
DAG plans are the same). The container reuse switch is also on.

In order to maximize the utilization, I would rather increase the number of 
containers so that more tasks can run in parallel, but I am not sure what the 
Tez AM bases the number of containers it requests from the RM on. Can I change 
the number of containers Tez asks for so that the job runs faster?
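
To make the reasoning above concrete, here is a rough back-of-the-envelope 
sketch of how the number of concurrently running containers would affect the 
run time for about 3K tasks of 10 to 20 seconds each. The function name and the 
container counts are made up for illustration, and the model assumes all tasks 
are independent and equally long; it ignores vertex dependencies, scheduling 
overhead and data skew.

```python
import math


def estimate_wall_time(num_tasks, task_seconds, concurrent_containers):
    """Estimate wall-clock seconds when tasks run in 'waves'.

    Hypothetical model: at most `concurrent_containers` tasks run at once,
    every task takes `task_seconds`, and there is no scheduling overhead.
    """
    if concurrent_containers <= 0:
        raise ValueError("concurrent_containers must be positive")
    waves = math.ceil(num_tasks / concurrent_containers)
    return waves * task_seconds


# Illustrative numbers only: 3000 tasks of ~15 seconds each.
for containers in (100, 300, 1000):
    seconds = estimate_wall_time(3000, 15, containers)
    print(f"{containers} containers -> about {seconds} s")
```

Under these assumptions the estimate only improves while more containers mean 
fewer waves, which is why the achievable container count matters more than the 
task count here.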

Xiaoyong

From: Jianfeng (Jeff) Zhang [mailto:jzh...@hortonworks.com]
Sent: Friday, September 11, 2015 1:19 PM
To: user@tez.apache.org<mailto:user@tez.apache.org>
Subject: Re: how to allocate more containers?


I think container reuse is enabled by default. You may disable it to get more 
containers, but that is a trade-off: resources are used less efficiently.

Set tez.am.container.reuse.enabled = false


Best Regard,
Jeff Zhang


From: Jianfeng Zhang <jzh...@hortonworks.com<mailto:jzh...@hortonworks.com>>
Reply-To: "user@tez.apache.org<mailto:user@tez.apache.org>" 
<user@tez.apache.org<mailto:user@tez.apache.org>>
Date: Friday, September 11, 2015 at 12:52 PM
To: "user@tez.apache.org<mailto:user@tez.apache.org>" 
<user@tez.apache.org<mailto:user@tez.apache.org>>
Subject: Re: how to allocate more containers?

Resource usage depends more on your cluster configuration (the resource 
scheduler configuration).
Do you intend to increase parallelism (more tasks) to get more containers?
There are also some configurations that you can use to get containers more 
quickly, at the cost of other trade-offs, but they would not give you more 
containers.
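
As a rough illustration of why the cluster configuration bounds the container 
count, the sketch below estimates how many containers of a given size fit on 
the cluster. All node and container sizes are hypothetical, the AM's own 
container is ignored, and whether vcores are counted at all depends on how the 
scheduler is configured, so treat this as an upper-bound estimate only.

```python
def containers_per_node(node_mem_mb, node_vcores,
                        container_mem_mb, container_vcores=1):
    """Containers that fit on one node, limited by memory and by vcores."""
    by_mem = node_mem_mb // container_mem_mb
    by_cpu = node_vcores // container_vcores
    return min(by_mem, by_cpu)


def cluster_capacity(num_nodes, node_mem_mb, node_vcores,
                     container_mem_mb, container_vcores=1):
    """Upper bound on concurrent containers for identical nodes."""
    per_node = containers_per_node(node_mem_mb, node_vcores,
                                   container_mem_mb, container_vcores)
    return num_nodes * per_node


# Hypothetical cluster: 10 nodes, 64 GB and 16 vcores available per node.
for container_mb in (2048, 4096, 8192):
    total = cluster_capacity(10, 65536, 16, container_mb)
    print(f"{container_mb} MB containers -> at most {total} at once")
```

In this made-up example, shrinking the container below 4096 MB does not add 
containers because vcores become the limit instead of memory.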


Best Regard,
Jeff Zhang


From: Xiaoyong Zhu <xiaoy...@microsoft.com<mailto:xiaoy...@microsoft.com>>
Reply-To: "user@tez.apache.org<mailto:user@tez.apache.org>" 
<user@tez.apache.org<mailto:user@tez.apache.org>>
Date: Friday, September 11, 2015 at 12:38 PM
To: "user@tez.apache.org<mailto:user@tez.apache.org>" 
<user@tez.apache.org<mailto:user@tez.apache.org>>
Subject: how to allocate more containers?

Hi

I am wondering if there is a configuration I can change to allocate more 
containers for a certain Tez application? I am using Hive on Tez.

Thanks!

Xiaoyong
