[
https://issues.apache.org/jira/browse/TEZ-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060202#comment-14060202
]
Bikas Saha commented on TEZ-1269:
---------------------------------
I was considering these things when writing the patch.
1) node local - it wasn't clear that the next query would have the same
locality as the last one.
2) For time of use, the patch already ignores new containers, but I am not
sure how to filter used ones. Ideally, we would like to use the newest
containers that have run long enough to get JIT going. The older containers may
have been used too much and may be close to a full GC, have stray side
effects, etc. Not quite sure how to filter those out.
IMO, ideally we would like to keep a good physical spread of containers - like
1 on every machine/rack if possible. That way in small clusters we would keep
node local containers and in large clusters we would at least be rack local.
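
To make the spread idea concrete, here is a rough sketch of one way to pick
which held containers to keep. It is not the code in the attached patches and
does not use Tez classes: HeldContainer, selectToKeep, minWarmMillis and
maxToKeep are all hypothetical names, and the age threshold is an assumed
stand-in for "has run long enough to get JIT going". It skips new containers,
prefers the newest of the warm ones, covers every rack first, and then spends
whatever budget is left on nodes that are not covered yet.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a held container; not a Tez class.
final class HeldContainer {
    final String id;
    final String node;
    final String rack;
    final long ageMillis; // assumed: time since the container first ran a task

    HeldContainer(String id, String node, String rack, long ageMillis) {
        this.id = id;
        this.node = node;
        this.rack = rack;
        this.ageMillis = ageMillis;
    }
}

final class ContainerSpreadSketch {
    // Returns at most maxToKeep containers, spread across racks first, then nodes.
    static List<HeldContainer> selectToKeep(List<HeldContainer> held,
                                            long minWarmMillis,
                                            int maxToKeep) {
        // Ignore new containers (assumed not warmed up yet).
        List<HeldContainer> warm = new ArrayList<>();
        for (HeldContainer c : held) {
            if (c.ageMillis >= minWarmMillis) {
                warm.add(c);
            }
        }
        // Newest warm containers first, since older ones may be worn out.
        warm.sort(Comparator.comparingLong((HeldContainer c) -> c.ageMillis));

        List<HeldContainer> keep = new ArrayList<>();
        Set<String> racks = new HashSet<>();
        Set<String> nodes = new HashSet<>();
        // Pass 1: one container per rack, so large clusters stay rack local.
        for (HeldContainer c : warm) {
            if (keep.size() < maxToKeep && racks.add(c.rack)) {
                nodes.add(c.node);
                keep.add(c);
            }
        }
        // Pass 2: fill the remaining budget with nodes not covered yet,
        // so small clusters end up with one container per machine.
        for (HeldContainer c : warm) {
            if (keep.size() < maxToKeep && nodes.add(c.node)) {
                keep.add(c);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        List<HeldContainer> held = new ArrayList<>();
        held.add(new HeldContainer("c1", "n1", "r1", 5000));
        held.add(new HeldContainer("c2", "n1", "r1", 2000));
        held.add(new HeldContainer("c3", "n2", "r1", 3000));
        held.add(new HeldContainer("c4", "n3", "r2", 100)); // too new, skipped
        held.add(new HeldContainer("c5", "n3", "r2", 9000));
        for (HeldContainer c : selectToKeep(held, 1000L, 3)) {
            System.out.println(c.id + " on " + c.node + "/" + c.rack);
        }
    }
}
```

With the sample data in main and a budget of 3, this keeps c2 and c5 (one per
rack) and then c3 (the remaining uncovered node). It does not address the open
question above of how to tell that an old container is close to a full GC; age
is only a crude proxy here.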
> TaskScheduler prematurely releases containers
> ---------------------------------------------
>
> Key: TEZ-1269
> URL: https://issues.apache.org/jira/browse/TEZ-1269
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-1269.1.patch, TEZ-1269.2.patch
>
>
> In non-session mode, if there are no outstanding requests, the TaskScheduler
> releases the containers before the container timeout has expired. If the
> state machine is on its way to scheduling new tasks during this time, then
> those tasks will not be able to reuse these containers.