Thanks Hitesh, very helpful as usual.

On 12/05/2014 05:56 PM, Hitesh Shah wrote:
Hi Fabio,

Regarding the second container assignment, the critical aspect is 
"reusedContainer=true". It is re-using the container used for the parent 
vertex's task, hence the priority is not relevant. In such cases, the 
priority 4 container will eventually be released without being used.
If you set tez.am.container.reuse.enabled to false, you will see the prio 4 
container being used as expected.

As for distance from root, the approach used in VertexImpl.java is:

       int distanceFromRoot = startEvent.getSourceDistanceFromRoot() + 1;
       if(vertex.distanceFromRoot < distanceFromRoot) {
         vertex.distanceFromRoot = distanceFromRoot;
       }

The above is done in SourceVertexStartedTransition, which will be invoked 
whenever a parent vertex has started.
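
To make the effect of that update concrete, here is a minimal standalone sketch, 
not the actual VertexImpl code. The class DistancePrioritySketch, its nested 
Vertex class, and the methods onSourceVertexStarted and priorityFor are 
hypothetical names used only for illustration. The priority formula is the one 
quoted in the question below, (vertexDistanceFromRoot + 1) * 2, and is assumed 
here rather than taken from the Tez source.

```java
public class DistancePrioritySketch {

    // Hypothetical stand-in for a vertex; it only tracks its distance from root.
    public static final class Vertex {
        public final String name;
        public int distanceFromRoot = 0;

        public Vertex(String name) {
            this.name = name;
        }

        // Same shape as the quoted snippet: keep the largest
        // (parent distance + 1) seen across all started parents.
        public void onSourceVertexStarted(int sourceDistanceFromRoot) {
            int candidate = sourceDistanceFromRoot + 1;
            if (distanceFromRoot < candidate) {
                distanceFromRoot = candidate;
            }
        }
    }

    // Assumption: formula as quoted in the question, not verified against Tez.
    public static int priorityFor(int distanceFromRoot) {
        return (distanceFromRoot + 1) * 2;
    }

    public static void main(String[] args) {
        Vertex x = new Vertex("X");
        x.onSourceVertexStarted(1); // parent at distance 1 (priority 4)
        x.onSourceVertexStarted(2); // parent at distance 2 (priority 6)
        // Prints "3 8"
        System.out.println(x.distanceFromRoot + " " + priorityFor(x.distanceFromRoot));
    }
}
```

Under these assumptions, a vertex X with parents of priority 4 and 6 ends up at 
distance 3 and priority 8: the deeper branch wins, regardless of the order in 
which the parents start.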

Hope that helps clarify what is happening.

— Hitesh


On Dec 5, 2014, at 6:11 AM, Fabio <[email protected]> wrote:

Hi all,
while reading the log from the join example (2 source vertexes and a sink 
vertex) I noticed the following:

2014-09-26 21:20:27,021 INFO [TaskSchedulerEventHandlerThread] 
org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Allocation request for task: 
attempt_1411734050933_0003_1_00_000000_0 with request: Capability[<memory:1024, 
vCores:1>]Priority[2] host: node02 rack: null
...
2014-09-26 21:20:28,237 INFO [DelayedContainerManager] 
org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Assigning container to task, 
container=Container: [ContainerId: container_1411734050933_0003_01_000002, NodeId: 
node02:34192, NodeHttpAddress: node02:8042, Resource: <memory:1024, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.56.102:34192 }, ], 
task=attempt_1411734050933_0003_1_00_000000_0, containerHost=node02, 
localityMatchType=NodeLocal, matchedLocation=node02, honorLocalityFlags=true, 
reusedContainer=false, delayedContainers=1, containerResourceMemory=1024, 
containerResourceVCores=1

And something similar for the other "parent" vertex, nothing strange here. But this is 
about the "joiner" vertex:

2014-09-26 21:21:24,292 INFO [TaskSchedulerEventHandlerThread] 
org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Allocation request for task: 
attempt_1411734050933_0003_1_02_000000_0 with request: Capability[<memory:1024, 
vCores:1>]Priority[4] host: null rack: null
...
2014-09-26 21:21:24,318 INFO [DelayedContainerManager] 
org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Assigning container to task, 
container=Container: [ContainerId: container_1411734050933_0003_01_000003, NodeId: 
node02:34192, NodeHttpAddress: node02:8042, Resource: <memory:1024, vCores:1>, 
Priority: 2, Token: Token { kind: ContainerToken, service: 192.168.56.102:34192 }, ], 
task=attempt_1411734050933_0003_1_02_000000_0, containerHost=node02, 
localityMatchType=NodeLocal, matchedLocation=node02, honorLocalityFlags=true, 
reusedContainer=true, delayedContainers=2, containerResourceMemory=1024, 
containerResourceVCores=1

Here the priority of the obtained container is still 2, but I was expecting to 
find the same priority as the request (4). So what does the priority of the 
obtained container mean, since it seems to be 2 regardless of the request? Is it 
used by Tez? How?

Another question I would like to ask is: I see the priority is calculated as 
(vertexDistanceFromRoot + 1) * 2, where vertexDistanceFromRoot is (I think) the distance 
from a vertex that gets its input from a file, or at least not from another vertex. But 
I haven't been able to understand how this value is set, especially when two (or more) 
branches converging in a common vertex X do not have the same "depth". In other 
words: what happens if X has two parents, one with priority 4 and one with priority 6? 
What will its priority be?

Thanks in advance

Fabio

