We probably need to fix the docs that refer to "tez.am.container.session.delay-allocation-millis". Can you point out which doc you are referring to? This setting was removed in 0.5.x in favor of the min/max release timeouts (tez.am.container.idle.release-timeout-min.millis and tez.am.container.idle.release-timeout-max.millis). To achieve the same behavior as tez.am.container.session.delay-allocation-millis, just set the min and max to the same value.
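
As a rough sketch of what that could look like, assuming the properties are placed in tez-site.xml and that a 10-minute hold is wanted (600000 ms is only an illustrative value, the one tried earlier in this thread):

```xml
<!-- Illustrative fragment: min == max gives a fixed idle delay before release. -->
<property>
  <name>tez.am.container.idle.release-timeout-min.millis</name>
  <value>600000</value>
</property>
<property>
  <name>tez.am.container.idle.release-timeout-max.millis</name>
  <value>600000</value>
</property>
```

The scheduler log line quoted further down in this thread reports idleContainerMinTimeout and idleContainerMaxTimeout at initialization, so these values most likely need to be in place before the session AM starts.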
-- Hitesh

On Dec 9, 2014, at 7:01 AM, Fabio <anyte...@gmail.com> wrote:

> Thanks Rajesh, that really was the problem! Actually... for a moment I
> thought about those parameters, but I assumed they would have been ignored
> during a session.
> In my opinion, they should not be considered by the system while running in
> session mode, and tez.am.container.session.delay-allocation-millis should be
> the exact delay before releasing a container (at least when
> tez.am.container.session.delay-allocation-millis >
> tez.am.container.idle.release-timeout-min.millis)... Sure, this leads to the
> risk of accumulating containers up to the upper limit of the
> application/queue, if any. Or maybe devs could consider a warning if this
> condition is met, to alert the user that that parameter is going to be
> useless since containers will be released long before. What do you think?
>
> Thanks for the help
>
> Fabio
>
> On 12/09/2014 11:11 AM, Rajesh Balamohan wrote:
>> >>>
>> 2014-12-09 09:39:40,314 INFO
>> [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler]
>> rm.YarnTaskSchedulerService: TaskScheduler initialized with configuration:
>> maxRMHeartbeatInterval: 1000, containerReuseEnabled: true, reuseRackLocal:
>> true, reuseNonLocal: false, localitySchedulingDelay: 250,
>> idleContainerMinTimeout=5000, idleContainerMaxTimeout=10000,
>> sessionMinHeldContainers=0
>> >>>
>>
>> Can you try the following settings instead?
>>
>> tez.am.container.idle.release-timeout-min.millis=400000
>> tez.am.container.idle.release-timeout-max.millis=600000
>>
>> 600000 sets it to 10 minutes.
>>
>> ~Rajesh.B
>>
>>
>> On Tue, Dec 9, 2014 at 3:21 PM, Fabio <anyte...@gmail.com> wrote:
>> Hi everyone,
>> I'm currently running Hive on Tez, and in particular I am testing the
>> session mode.
>> I can actually submit different queries to the same Tez AM, and that's ok.
>> But for some reason containers are released a very short time after the end
>> of the assigned task, whenever no new task is pending. That leaves no
>> chance for container reuse among different queries. I already tried to
>> set tez.am.container.session.delay-allocation-millis=-1 (and before this, to
>> 600000), but this behavior persists.
>> In the logs I see these two suspicious lines:
>>
>> 2014-12-09 09:44:23,035 INFO [DelayedContainerManager]
>> rm.YarnTaskSchedulerService: Releasing unused container:
>> container_1418090991482_0008_01_000002
>>
>> and a few milliseconds later the container is stopped:
>>
>> 2014-12-09 09:44:23,274 INFO [TezChild] task.ContainerReporter: Got
>> TaskUpdate: 7439 ms after starting to poll. TaskInfo: shouldDie: true
>> 2014-12-09 09:44:23,276 INFO [main] task.TezChild: ContainerTask returned
>> shouldDie=true, Exiting
>>
>> It seems to me that the container is released as soon as it is no
>> longer required (regardless of what could happen in the future). Is that
>> so? How can I solve this?
>>
>> I attach the aggregated log and the swimlanes graph that highlight this
>> behavior.
>>
>> Thanks guys
>>
>> Fabio
>>
>>
>>
>> --
>> ~Rajesh.B