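Ideally, you should use the built-in TOP function. It will be more efficient because it is algebraic, so Pig can do part of the work in the combiner instead of shipping every record in the group to the reducer.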
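
A minimal sketch of what that could look like, reusing the names from the quoted example. The input path and the schema are assumptions made up for illustration: A is taken to have two fields, with orderingParam at column index 1 (TOP takes a 0-based column index, not a field name).

```
-- Hypothetical input and schema, for illustration only.
A = LOAD 'input.tsv' AS (key:chararray, orderingParam:int);

B = GROUP A BY key;

-- TOP(n, column, bag): keep up to 100 tuples per group, compared on
-- column 1 (orderingParam under the assumed schema).
C = FOREACH B GENERATE group, TOP(100, 1, A);
```

Note that TOP keeps the tuples with the largest values in that column and I would not rely on the order of tuples inside the returned bag, so if you need each group sorted you still have to order it afterwards.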

2012/6/29 Kris Coward <k...@melon.org>

> LIMIT and ORDER BY are both allowed nested ops for a FOREACH statement.
> These should be able to do what you want.
>
> e.g.
>
> B = GROUP A BY key;
> C = FOREACH B {
>     X = ORDER A BY orderingParam;
>     Y = LIMIT X 100;
>     GENERATE group, Y;}
>
> -Kris
>
> On Fri, Jun 29, 2012 at 04:19:18PM -0700, Benjamin Juhn wrote:
> > Hi there,
> >
> > I'm trying to write a group by statement, only returning the top 100
> > records from each group. Does pig support this?
> >
> > Thanks,
> > Ben
>
> --
> Kris Coward http://unripe.melon.org/
> GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3