-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16928/#review32286
-----------------------------------------------------------


This is great work. Thank you so much!

I have two comments:

1) It doesn't seem to work for a map-only job. For example, I ran a load and
dump in Grunt as follows:

x = load '/user/cheolsoop/foo';
dump x;

This job isn't converted to local mode because the number of reducers is
reported as 21, which doesn't make sense for a map-only job. See the log
output below:

2014-01-20 10:05:30,578 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Size of input: 8 bytes.
2014-01-20 10:05:30,578 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - No of reducers: 21
2014-01-20 10:05:30,578 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
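
To make the expectation in (1) concrete, here is a minimal sketch of the kind of
check I would expect. The class, method, and parameter names are hypothetical and
are not taken from the patch. It assumes a map-only job can be modeled as having
zero reducers, and the 100 MB threshold is only a placeholder value.

```java
// Hypothetical sketch: none of these names come from the patch itself.
final class AutoLocalCheck {

    /**
     * Decides whether a job is small enough to run in-process.
     * reducers == 0 models a map-only job (assumption: no reduce phase at all).
     */
    static boolean canRunInProcess(boolean autoLocalEnabled, long inputBytes,
                                   long maxInputBytes, int reducers) {
        if (!autoLocalEnabled) {
            return false;            // feature switched off
        }
        if (inputBytes >= maxInputBytes) {
            return false;            // input must be smaller than the limit
        }
        return reducers <= 1;        // map-only (0) or single-reducer (1) jobs qualify
    }

    public static void main(String[] args) {
        long maxBytes = 100L * 1000 * 1000; // assumed placeholder threshold
        // 8-byte input, map-only job: should qualify.
        System.out.println(canRunInProcess(true, 8L, maxBytes, 0));  // prints true
        // Same input, but 21 reducers reported: does not qualify.
        System.out.println(canRunInProcess(true, 8L, maxBytes, 21)); // prints false
    }
}
```

With a check along these lines, the 8-byte load/dump above would qualify as long
as the reducer count for a map-only job is reported as 0 rather than 21.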

2) The changes in PigStats and PigStatsUtil might break backward compatibility. 
Perhaps we could avoid them if they're not necessary. Thoughts?



trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
<https://reviews.apache.org/r/16928/#comment61021>

    Do you mind replacing these with static variables too?



trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
<https://reviews.apache.org/r/16928/#comment61022>

    I think pseudo-distributed mode means a single node running multiple
processes. But you mean local mode (multi-threaded) here, don't you?



trunk/src/org/apache/pig/tools/pigstats/PigStats.java
<https://reviews.apache.org/r/16928/#comment61027>

    I like removing this from PigStats.
    
    But I am a bit worried that this might break backward compatibility with 
downstream applications since it is public.
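
    One common way to soften this kind of change is to keep the public member as
a deprecated shim that delegates to its new home. A minimal sketch follows; every
name in it is hypothetical and it is not the actual PigStats API.

```java
// Hypothetical names throughout; this is not the actual PigStats API.
abstract class StatsBase {

    /** Assumed new home of the logic after the refactoring. */
    static final class StatsUtil {
        static int countJobs(int[] jobIds) {
            return jobIds.length;
        }
    }

    /**
     * Kept only so that existing downstream callers still compile and link.
     *
     * @deprecated use {@link StatsUtil#countJobs(int[])} instead.
     */
    @Deprecated
    public static int countJobs(int[] jobIds) {
        return StatsUtil.countJobs(jobIds); // delegate to the new location
    }
}
```

    The shim could then be removed in a later release once downstream users have
had a chance to migrate.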



trunk/src/org/apache/pig/tools/pigstats/mapreduce/MRPigStatsUtil.java
<https://reviews.apache.org/r/16928/#comment61023>

    Update the comment to reflect the change.



trunk/src/org/apache/pig/tools/pigstats/mapreduce/MRPigStatsUtil.java
<https://reviews.apache.org/r/16928/#comment61024>

    Update the comment to reflect the change.


- Cheolsoo Park


On Jan. 16, 2014, 10:04 p.m., Aniket Mokashi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16928/
> -----------------------------------------------------------
> 
> (Updated Jan. 16, 2014, 10:04 p.m.)
> 
> 
> Review request for pig, Cheolsoo Park, Daniel Dai, Dmitriy Ryaboy, and Julien 
> Le Dem.
> 
> 
> Bugs: PIG-3463
>     https://issues.apache.org/jira/browse/PIG-3463
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> If pig.auto.local.enabled is set, JCC will modify Configuration of all the 
> jobs with one reducer and input size less than pig.auto.local.input.maxbytes, 
> so that they are forced to run in local mode. Output of local run is also 
> written to hdfs.
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/ExecTypeProvider.java 1558572
>   trunk/src/org/apache/pig/PigConfiguration.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceOper.java 1558572
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigInputFormat.java 1558572
>   trunk/src/org/apache/pig/impl/PigImplConstants.java 1558572
>   trunk/src/org/apache/pig/tools/pigstats/EmbeddedPigStats.java 1558572
>   trunk/src/org/apache/pig/tools/pigstats/PigStats.java 1558572
>   trunk/src/org/apache/pig/tools/pigstats/mapreduce/MRPigStatsUtil.java 1558572
>   trunk/src/org/apache/pig/tools/pigstats/mapreduce/SimplePigStats.java 1558572
>   trunk/test/org/apache/pig/test/TestAutoLocalMode.java PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/16928/diff/
> 
> 
> Testing
> -------
> 
> Tried a few scenarios with the patch:
> Load small data, group all, count - works in local mode.
> Load small data, another small data and replicated join - works in local mode.
> Load small data and order by key - all 3 jobs work in local mode and .
> Load small data and large data for replicated join - first job runs in local 
> mode, second runs in MR mode.
> Load large data and order by key - the first stages run in local mode and
> the last stage runs in MR mode.
> 
> 
> Thanks,
> 
> Aniket Mokashi
> 
>
