[ 
https://issues.apache.org/jira/browse/PIG-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093330#comment-13093330
 ] 

jirapos...@reviews.apache.org commented on PIG-2237:
----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1664/#review1684
-----------------------------------------------------------



trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/LimitAdjuster.java
<https://reviews.apache.org/r/1664/#comment3838>

    Please fix the indentation of the contents of this if block, and add braces ({}).



trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/LimitAdjuster.java
<https://reviews.apache.org/r/1664/#comment3839>

    mpLeaf is cast to POStore repeatedly in this block. Just add
    
    POStore storeOp = (POStore) mpLeaf;
    
    at the beginning of the block and use storeOp from there on; it'll clean up the code.
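
A minimal sketch of the cast-once pattern suggested above. The classes Operator and StoreStub and their fields are hypothetical stand-ins invented for illustration; they are not Pig's real classes, and the actual LimitAdjuster code may look quite different.

```java
// Hypothetical stand-ins for illustration only; these are NOT Pig classes.
class CastOnceDemo {
    static class Operator { }

    static class StoreStub extends Operator {
        final String path;        // hypothetical field
        final boolean temporary;  // hypothetical field

        StoreStub(String path, boolean temporary) {
            this.path = path;
            this.temporary = temporary;
        }
    }

    // Before: the same cast is repeated every time the leaf is used.
    static String describeWithRepeatedCasts(Operator leaf) {
        if (leaf instanceof StoreStub) {
            return ((StoreStub) leaf).path
                    + (((StoreStub) leaf).temporary ? " (tmp)" : "");
        }
        return "not a store";
    }

    // After: cast once at the top of the block, then use the typed local.
    static String describeWithSingleCast(Operator leaf) {
        if (leaf instanceof StoreStub) {
            StoreStub storeOp = (StoreStub) leaf;
            return storeOp.path + (storeOp.temporary ? " (tmp)" : "");
        }
        return "not a store";
    }

    public static void main(String[] args) {
        Operator leaf = new StoreStub("op1", true);
        System.out.println(describeWithRepeatedCasts(leaf)); // prints op1 (tmp)
        System.out.println(describeWithSingleCast(leaf));    // prints op1 (tmp)
    }
}
```

Both methods behave identically; the second simply keeps the cast in one place, which is the readability gain the comment asks for.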



trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java
<https://reviews.apache.org/r/1664/#comment3841>

    Please add documentation for Pig developers indicating when and how to use the methods in this helper class.



trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java
<https://reviews.apache.org/r/1664/#comment3840>

    Will using this interfere with projection push-down?


- Dmitriy


On 2011-08-29 23:34:23, Daniel Dai wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1664/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-29 23:34:23)
bq.  
bq.  
bq.  Review request for pig and Thejas Nair.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  See PIG-2237
bq.  
bq.  
bq.  This addresses bug PIG-2237.
bq.      https://issues.apache.org/jira/browse/PIG-2237
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/LimitAdjuster.java PRE-CREATION 
bq.    trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 1162260 
bq.    trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java PRE-CREATION 
bq.    trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java 1162260 
bq.    trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1162260 
bq.    trunk/test/org/apache/pig/test/TestMRCompiler.java 1162260 
bq.  
bq.  Diff: https://reviews.apache.org/r/1664/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Test-patch:
bq.       [exec] +1 overall.  
bq.       [exec] 
bq.       [exec]     +1 @author.  The patch does not contain any @author tags.
bq.       [exec] 
bq.       [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
bq.       [exec] 
bq.       [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
bq.       [exec] 
bq.       [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
bq.       [exec] 
bq.       [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
bq.       [exec] 
bq.       [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
bq.  
bq.  Unit test:
bq.      all pass.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Daniel
bq.  
bq.



> LIMIT generates wrong number of records if pig determines no of reducers as 
> more than 1
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-2237
>                 URL: https://issues.apache.org/jira/browse/PIG-2237
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Anitha Raju
>            Assignee: Daniel Dai
>             Fix For: 0.9.1, 0.10
>
>         Attachments: PIG-2237-1.patch, PIG-2237-2.patch, PIG-2237-3.patch
>
>
> Hi,
> For a script
> ========
> A = load 'test.txt' using PigStorage() as (a:int,b:int);
> B = order A by a ;
> C = limit B 2;
> store C into 'op1' using PigStorage();
> ========
> LIMIT and ORDER BY are done in the same MR job if no explicit PARALLEL clause 
> is specified.
> In this case, the number of reducers is determined by Pig and is sometimes 
> calculated to be > 1.
> Since the limit happens on the reduce side, each reduce task applies the limit 
> separately, generating n*2 records, where n is the number of reduce tasks 
> calculated by Pig.
> If the number of reduce tasks is specified explicitly using the PARALLEL 
> keyword on ORDER BY,
> ==========
> B = order A by a PARALLEL 4;
> ==========
> another MR job is created with 1 reduce task where the limit is done. 
> In short, the issue occurs when the number of reducers calculated by Pig is 
> greater than 1 and a limit is involved in the MR job.
> The issue can be replicated by specifying
> ==========
> -Dpig.exec.reducers.bytes.per.reducer
> ==========
> The issue is seen in versions 0.8 and 0.9. It works correctly in 0.7.
> Regards,
> Anitha
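
The behaviour described in the report can be illustrated with a small, self-contained simulation. This is only a sketch and not Pig code: the class LimitPerReducerDemo, its method names, and the even range partitioning are all assumptions made for illustration. It contrasts applying the limit inside every reduce task with applying it once in a single reduce task, which is what the report says happens when PARALLEL is given explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the reported behaviour; not Pig's implementation.
class LimitPerReducerDemo {

    // Split already-sorted input into contiguous ranges, one per reduce task
    // (assumed here to be an even split, purely for illustration).
    static List<List<Integer>> partition(List<Integer> sorted, int reducers) {
        List<List<Integer>> parts = new ArrayList<>();
        int size = sorted.size();
        for (int r = 0; r < reducers; r++) {
            int from = (int) ((long) size * r / reducers);
            int to = (int) ((long) size * (r + 1) / reducers);
            parts.add(new ArrayList<>(sorted.subList(from, to)));
        }
        return parts;
    }

    // Reported problem: every reduce task applies the limit on its own range,
    // so the combined output can hold up to reducers * limit records.
    static List<Integer> limitInEachReducer(List<List<Integer>> parts, int limit) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> part : parts) {
            out.addAll(part.subList(0, Math.min(limit, part.size())));
        }
        return out;
    }

    // Expected behaviour: all ranges pass through one reduce task that
    // applies the limit exactly once.
    static List<Integer> limitInSingleReducer(List<List<Integer>> parts, int limit) {
        List<Integer> merged = new ArrayList<>();
        for (List<Integer> part : parts) {
            merged.addAll(part);
        }
        return new ArrayList<>(merged.subList(0, Math.min(limit, merged.size())));
    }

    public static void main(String[] args) {
        List<Integer> sorted = new ArrayList<>();
        for (int i = 1; i <= 20; i++) {
            sorted.add(i);
        }
        List<List<Integer>> parts = partition(sorted, 4);
        System.out.println(limitInEachReducer(parts, 2));   // prints [1, 2, 6, 7, 11, 12, 16, 17]
        System.out.println(limitInSingleReducer(parts, 2)); // prints [1, 2]
    }
}
```

With 4 simulated reduce tasks and a limit of 2, the per-task variant yields 8 records (n*2 with n = 4), while the single-task variant yields the expected 2.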

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
