Do you have the DAG plan? I mean the DAG topology, i.e. the dot file in the AM container.
Best Regards,
Jeff Zhang

From: Xiaoyong Zhu <xiaoy...@microsoft.com>
Reply-To: "user@tez.apache.org" <user@tez.apache.org>
Date: Friday, September 11, 2015 at 1:25 PM
To: "user@tez.apache.org" <user@tez.apache.org>
Subject: RE: how to allocate more containers?

Thanks for the information. Here's my understanding of the resource allocation (please correct me if I am wrong) and my scenario:

1. Assume the cluster is dedicated to a single Tez application, so I want to maximize that application's resource usage (memory/CPU).
2. Assume I have changed all the configurations on the YARN side so that the memory/CPU allocation of each node is maximized (meaning each node can theoretically be fully utilized). The input is around 500 GB to 1 TB.
3. Then I launch a Tez application (Hive on Tez). Tez chooses the number of tasks (in my case, there are usually about 3K tasks), and each task usually runs for about 10 to 20 seconds. In this case, I don't think my Tez tasks need to be increased (each of them runs for only a few seconds, so I think each task is able to process its data). The swimlane picture is attached (for a smaller data size, but the DAG plans are the same). The container reuse switch is also on.

To maximize utilization, I would rather increase the number of containers so that more tasks can run in parallel, but I am not sure what the Tez AM bases the number of containers it requests from the RM on. Can I change the number of containers Tez asks for so that the job runs faster?

Xiaoyong

From: Jianfeng (Jeff) Zhang [mailto:jzh...@hortonworks.com]
Sent: Friday, September 11, 2015 1:19 PM
To: user@tez.apache.org
Subject: Re: how to allocate more containers?

By default, I think container reuse is enabled. You may disable it to get more containers, but this comes with a trade-off: resources are not used as efficiently.

Set tez.am.container.reuse.enabled = false

Best Regards,
Jeff Zhang

From: Jianfeng Zhang <jzh...@hortonworks.com>
Reply-To: "user@tez.apache.org" <user@tez.apache.org>
Date: Friday, September 11, 2015 at 12:52 PM
To: "user@tez.apache.org" <user@tez.apache.org>
Subject: Re: how to allocate more containers?

Resource usage depends more on your cluster configuration (the resource scheduler configuration). Do you intend to increase parallelism (more tasks) to get more containers? There are also some configurations you can use to get containers more quickly, at the cost of other trade-offs, but they would not give you more containers.

Best Regards,
Jeff Zhang

From: Xiaoyong Zhu <xiaoy...@microsoft.com>
Reply-To: "user@tez.apache.org" <user@tez.apache.org>
Date: Friday, September 11, 2015 at 12:38 PM
To: "user@tez.apache.org" <user@tez.apache.org>
Subject: how to allocate more containers?

Hi,

I am wondering if there is a configuration I can change to allocate more containers for a certain Tez application? I am using Hive on Tez.

Thanks!

Xiaoyong