Hi Robert,

This is a bug (which has also been seen in other scenarios). You can follow some of the discussion related to this issue at https://issues.apache.org/jira/browse/TEZ-1522.
thanks
-- Hitesh

On Sep 12, 2014, at 11:20 AM, Grandl Robert <rgra...@yahoo.com> wrote:

> Hi guys,
>
> During some of my experiments, I realized that a vertex managed by a
> ShuffleVertexManager looks for finished tasks only in parent vertices
> whose data movement is SCATTER_GATHER.
>
> For example, in the attached DAG, Reducer 3 is able to start tasks by
> looking only at Map_5 and Reducer_2, so even if none of the tasks on the
> branch with Reducer 8 have finished, Reducer 3 still starts.
>
> The main reason seems to be that a ShuffleVertexManager looks for finished
> tasks only in bipartiteSources vertices, which seem to be only those that
> are SCATTER_GATHER. So a parent that is Broadcast is completely ignored.
>
> Am I missing something, i.e. is this as designed, or is it a bug?
>
> Thanks,
> Robert
> <query15.png>
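
To make the reported behavior concrete, here is a minimal Python sketch of the scheduling decision as Robert describes it. It is not the Tez implementation and does not use the Tez API: `SourceVertex`, `completed_fraction`, `can_start`, the task counts, and the 0.75 threshold are all hypothetical, and treating Reducer 8 as the Broadcast parent is an assumption drawn from the description above. It only shows how counting finished tasks over SCATTER_GATHER parents alone can let a vertex start while a Broadcast parent has finished nothing.

```python
from dataclasses import dataclass
from enum import Enum


class DataMovement(Enum):
    SCATTER_GATHER = "SCATTER_GATHER"
    BROADCAST = "BROADCAST"


@dataclass
class SourceVertex:
    """Hypothetical stand-in for a parent vertex; not a Tez class."""
    name: str
    movement: DataMovement
    total_tasks: int
    finished_tasks: int = 0


def completed_fraction(sources, only_scatter_gather=True):
    """Fraction of finished source tasks among the parents that are counted.

    With only_scatter_gather=True, Broadcast parents are skipped, which
    mirrors the behavior described in the report.
    """
    counted = [
        s for s in sources
        if not only_scatter_gather or s.movement is DataMovement.SCATTER_GATHER
    ]
    total = sum(s.total_tasks for s in counted)
    if total == 0:
        # Assumption made for this sketch only: nothing to wait for.
        return 1.0
    return sum(s.finished_tasks for s in counted) / total


def can_start(sources, min_fraction, only_scatter_gather=True):
    """True when the counted completion fraction reaches the threshold."""
    return completed_fraction(sources, only_scatter_gather) >= min_fraction


if __name__ == "__main__":
    # Task counts are invented for illustration.
    parents = [
        SourceVertex("Map_5", DataMovement.SCATTER_GATHER, 4, 4),
        SourceVertex("Reducer_2", DataMovement.SCATTER_GATHER, 2, 2),
        SourceVertex("Reducer 8", DataMovement.BROADCAST, 3, 0),
    ]
    print("ignoring broadcast parents:", can_start(parents, 0.75))
    print("counting all parents:", can_start(parents, 0.75, only_scatter_gather=False))
```

In this sketch the vertex is allowed to start when only the SCATTER_GATHER parents are counted, and is held back once the unfinished Broadcast parent is included in the count.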