> On Sept. 24, 2014, 5:32 a.m., Daniel Dai wrote:
> > I think we might need a flag to enable/disable this feature. It might 
> > confuse some users when Pig overrides the parallelism they specified. In 
> > addition, our auto-parallel algorithm is not sophisticated enough yet, so we 
> > should provide a fallback in case auto-parallelism does not work.

I updated the patch to use pig.tez.auto.parallelism itself to turn off the 
overriding estimation for intermediate reducers, so I don't think we need a 
separate setting for that. A user can specify PARALLEL for all operators and 
turn off pig.tez.auto.parallelism to have the specified parallelism used.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25912/#review54390
-----------------------------------------------------------


On Sept. 24, 2014, 3:27 p.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25912/
> -----------------------------------------------------------
> 
> (Updated Sept. 24, 2014, 3:27 p.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Daniel Dai.
> 
> 
> Bugs: PIG-4162
>     https://issues.apache.org/jira/browse/PIG-4162
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> The following changes were made:
>     - Always estimate intermediate reducer parallelism, even if the user has 
> specified PARALLEL.
>     - intermediate reducer parallelism = Min(2 * userparallelism, 
> Math.max(userparallelism, Math.max(estimatedparallelism, 
> Math.max(2999, PigReducerEstimator.MAX_REDUCER_COUNT_PARAM)))), i.e. limiting 
> estimated parallelism to be not more than 2x userparallelism or 2999. 
> The value 2999 is hardcoded for now; it differs from the final reducer max 
> parallelism default of 999 and applies only to intermediate reducers. It will 
> be made configurable later if needed. 
>     - ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_DESIRED_TASK_INPUT_SIZE 
> is set to the block size for intermediate tasks (same as mapper behaviour) 
> instead of InputSizeReducerEstimator.DEFAULT_BYTES_PER_REDUCER, which 
> defaults to 1G.
> 
> The patch also has a few other minor unrelated fixes.
> 
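
As a rough illustration of the limit described above ("not more than 2x 
userparallelism or 2999"), here is a minimal sketch. It follows the prose 
reading, where the estimate is capped first and then kept between the user 
value and twice the user value. The class, method and parameter names are 
hypothetical and this is not the code in the patch. It assumes PARALLEL was 
specified (userParallelism > 0) and that configuredMaxReducers holds the value 
configured for PigReducerEstimator.MAX_REDUCER_COUNT_PARAM.

```java
// Hypothetical sketch of the intermediate reducer limit; not the patch code.
final class IntermediateParallelismSketch {

    // Cap for intermediate reducers, hardcoded as described in the review.
    static final int INTERMEDIATE_MAX = 2999;

    /**
     * Assumed reading of the rule: cap the estimate, never go below the
     * user-specified parallelism, and never exceed twice that value.
     *
     * @param userParallelism       PARALLEL given by the user (assumed > 0)
     * @param estimatedParallelism  value produced by the estimator
     * @param configuredMaxReducers configured max reducer count (assumption)
     */
    static int intermediateParallelism(int userParallelism,
                                       int estimatedParallelism,
                                       int configuredMaxReducers) {
        // Upper bound on the estimate: 2999 or the configured max, if larger.
        int cap = Math.max(INTERMEDIATE_MAX, configuredMaxReducers);
        int cappedEstimate = Math.min(estimatedParallelism, cap);
        // Keep the result within [userParallelism, 2 * userParallelism].
        int atLeastUser = Math.max(userParallelism, cappedEstimate);
        return Math.min(2 * userParallelism, atLeastUser);
    }

    public static void main(String[] args) {
        System.out.println(intermediateParallelism(10, 5, 999));
        System.out.println(intermediateParallelism(10, 15, 999));
        System.out.println(intermediateParallelism(10, 50, 999));
    }
}
```

Under this reading, with userParallelism = 10 an estimate of 50 is held to 20 
and an estimate of 5 is raised to 10.
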
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/Main.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezOperator.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/ParallelismSetter.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezOperDependencyParallelismEstimator.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezParallelismEstimator.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/ParallelConstantVisitor.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigImplConstants.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezStats.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/bigdata.conf 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestAlgebraicEval.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestForEachNestedPlan.java 1626640 
>   http://svn.apache.org/repos/asf/pig/trunk/test/tez-tests 1626640 
> 
> Diff: https://reviews.apache.org/r/25912/diff/
> 
> 
> Testing
> -------
> 
> The test-tez unit tests and e2e tests pass.
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>
