[ https://issues.apache.org/jira/browse/PIG-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704447#action_12704447 ]
Hadoop QA commented on PIG-741:
-------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12406856/PIG-741.patch
  against trunk revision 768923.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 3 new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/28/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/28/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/28/console

This message is automatically generated.

> Add LIMIT as a statement that works in nested FOREACH
> -----------------------------------------------------
>
>                 Key: PIG-741
>                 URL: https://issues.apache.org/jira/browse/PIG-741
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: David Ciemiewicz
>            Assignee: Alan Gates
>             Fix For: 0.3.0
>
>         Attachments: PIG-741.patch
>
>
> I'd like to compute the top 10 results in each group.
> The natural way to express this in Pig would be:
> {code}
> A = load '...' using PigStorage() as (
>     date: int,
>     count: int,
>     url: chararray
> );
> B = group A by ( date );
> C = foreach B {
>     D = order A by count desc;
>     E = limit D 10;
>     generate
>         FLATTEN(E);
> };
> dump C;
> {code}
> Yeah, I could write a UDF / PiggyBank function to take the top n results. But
> since LIMIT already exists as a statement, it seems like it should also work
> in the nested foreach context.
> Example workaround code.
> {code}
> C = foreach B {
>     D = order A by count desc;
>     E = util.TOP(D, 10);
>     generate
>         FLATTEN(E);
> };
> dump C;
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.