Thanks Dmitriy, all sorted now.
James
On Mon, Sep 3, 2012 at 6:21 PM, Dmitriy Ryaboy wrote:
> That's cause you used "group all" which groups everything into one
> group, which by definition can only go to one reducer.
>
> What if instead you group into some large-enough number of buckets?
>
> A = LOAD 'records.txt' USING PigStorage('\t') AS (recordId:int);
> A_PRIME = FOREACH A generate *, ROUND(RAN
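
The quoted snippet is cut off in the archive, so what follows is only a minimal sketch of the bucketing idea, not necessarily Dmitriy's exact script. It assumes the goal is a total row count. The bucket count (100), the PARALLEL value (20), and the aliases B through F are illustrative choices, not taken from the thread.

```pig
-- Load the records as in the quoted message.
A = LOAD 'records.txt' USING PigStorage('\t') AS (recordId:int);

-- Assumption: tag each row with a pseudo-random bucket in 0..99.
-- RANDOM() returns a double in [0, 1).
B = FOREACH A GENERATE recordId, (int)(RANDOM() * 100) AS bucket;

-- Grouping by bucket spreads the rows over many reducers,
-- unlike GROUP ALL, which sends everything to a single one.
C = GROUP B BY bucket PARALLEL 20;
D = FOREACH C GENERATE group AS bucket, COUNT(B) AS partial;

-- The final GROUP ALL now sees at most 100 small rows, so a single
-- reducer is no longer a bottleneck.
E = GROUP D ALL;
F = FOREACH E GENERATE SUM(D.partial) AS total;
DUMP F;
```

This two-stage pattern only works when the per-bucket results can be combined afterwards, as partial counts or sums can.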