Hey Gopal,

We use the task count for basically two things (in MR, for both the map stage 
and the reduce stage):
- Each task samples its output data up to a certain limit. That limit is the 
desired sample count divided by the number of tasks.
- In some scenarios we also use the task count, in combination with the task 
index, to let the last task (of a stage or a vertex) do some extra logic.
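
Roughly, the two uses look like the sketch below. The class and method names 
are made up for illustration (they are not Tez or Hadoop API), and it assumes 
0-based task indices and plain integer division for the per-task limit.

```java
// Illustrative sketch only: names are hypothetical, not Tez or Hadoop API.
final class TaskCountSketch {

    private TaskCountSketch() {}

    // Per-task sample limit: desired total sample count divided by the task count.
    // Uses integer division; what to do with the remainder is left open here.
    static int perTaskSampleLimit(int desiredSampleCount, int taskCount) {
        if (taskCount <= 0) {
            throw new IllegalArgumentException("taskCount must be positive");
        }
        return desiredSampleCount / taskCount;
    }

    // True if this task is the last one of its stage or vertex.
    // Assumes task indices run from 0 to taskCount - 1.
    static boolean isLastTask(int taskIndex, int taskCount) {
        return taskIndex == taskCount - 1;
    }
}
```

Both need the total task count of the stage or vertex to be visible from 
inside the task, next to the task index.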

Looking at your patch, it looks like it will do the job for the map-like kind 
of vertex but not for the aggregation vertex, right?
Also, which JIRA issue is that?

best
Johannes

On 24 Jul 2014, at 07:40, Gopal V <[email protected]> wrote:

> On 7/23/14, 6:07 PM, Johannes Zillmann wrote:
>> Hey Tez team,
>> 
>> is there some way to get the task count within a vertex from within a task ?
>> Some equivalent to mapred.map.tasks and mapred.reduce.tasks for map-reduce ?
> 
> Could you explain the use-case for this particular requirement?
> 
> I intend to add the vertex parallelism to the task context as part of one of 
> my WIP branches.
> 
> I uploaded my base patch-set as is (including the TODO markers).
> 
> https://issues.apache.org/jira/secure/attachment/12657536/TEZ-broadcast-shuffle%2Bvertex-parallelism.patch
> 
> If you can explain what you are actually looking to do with this information, 
> perhaps I can roll the two feature reqs together.
> 
> Cheers,
> Gopal
> 
