[ 
https://issues.apache.org/jira/browse/PIG-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996993#comment-13996993
 ] 

Daniel Dai commented on PIG-3846:
---------------------------------

Summary of changes:
1. TezOperDependencyParallelismEstimator: estimates parallelism based on the 
parallelism of the predecessors and the operators within the predecessors' 
physical plans.
2. PigOrderByVertexManager: a VertexManagerPlugin for the sort vertex of an 
order by. It receives an event from the partition node and decreases the 
parallelism of the sort vertex automatically (TEZ-1107 prevents increasing the 
parallelism of the sort job).
3. Changes to POReservoirSample, FindQuantilesTez, WeightedRangePartitionerTez, 
and PigProcessor to assist PigOrderByVertexManager: FindQuantilesTez estimates 
numQuantiles based on the samples sent from POReservoirSample (including stats 
of the previous job), WeightedRangePartitionerTez partitions the incoming data 
into the estimated numQuantiles partitions, and PigProcessor sends 
numQuantiles to PigOrderByVertexManager.
4. Set the auto-parallelism flag for ShuffleVertexManager to true for 
applicable vertices.
5. Add estimatedParallelism to TezOperator. If requestedParallelism is not set, 
TezOperDependencyParallelismEstimator estimates the parallelism and instructs 
the VertexManager to figure out the parallelism dynamically.
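
To make the decision flow in items 1, 2 and 5 concrete, here is a rough sketch. 
It is not the code in the patch: the class ParallelismSketch, both method 
names, and the scaleFactor parameter (a stand-in for the effect of the 
operators in the predecessors' physical plans) are hypothetical and only 
illustrate the idea of "use requestedParallelism if set, otherwise estimate 
from predecessors" and "the sort vertex may only shrink".

```java
public class ParallelismSketch {

    /**
     * Hypothetical estimator: honor an explicit request, otherwise derive a
     * value from the predecessors' parallelism.
     * Assumption: scaleFactor summarizes how the predecessors' operators
     * shrink or grow the data; the real estimator may weigh this differently.
     */
    static int estimateParallelism(int requestedParallelism,
                                   double scaleFactor,
                                   int... predecessorParallelism) {
        if (requestedParallelism > 0) {
            // The user set parallelism explicitly, so no estimate is needed.
            return requestedParallelism;
        }
        int maxPredecessor = 1;
        for (int p : predecessorParallelism) {
            maxPredecessor = Math.max(maxPredecessor, p);
        }
        // Never go below one task.
        return Math.max(1, (int) Math.ceil(maxPredecessor * scaleFactor));
    }

    /**
     * Hypothetical adjustment for the sort vertex: parallelism can be
     * lowered to numQuantiles but never raised above the configured value.
     */
    static int adjustSortParallelism(int configuredParallelism, int numQuantiles) {
        return Math.max(1, Math.min(configuredParallelism, numQuantiles));
    }

    public static void main(String[] args) {
        // No explicit request: estimate from predecessors with 10 and 20 tasks.
        System.out.println(estimateParallelism(-1, 0.5, 10, 20));
        // Sort vertex configured with 10 tasks, sampler suggests 4 quantiles.
        System.out.println(adjustSortParallelism(10, 4));
    }
}
```

In this sketch a numQuantiles larger than the configured parallelism is simply 
clamped, which mirrors the "decrease only" restriction noted in item 2.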

> Implement automatic reducer parallelism
> ---------------------------------------
>
>                 Key: PIG-3846
>                 URL: https://issues.apache.org/jira/browse/PIG-3846
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: tez-branch
>
>         Attachments: PIG-3846-1.patch, PIG-3846-3.patch
>
>
>  Tez has it built-in. We can start with reusing it and then look at 
> customization for better performance. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
