2011/4/26 Dmitriy Ryaboy
> This may be helpful in understanding what happens when you do a group-by:
> http://squarecog.wordpress.com/2010/05/11/group-operator-in-apache-pig/
>
> Thank you very much.
> Also, are you sure TOP doesn't give you items in order? It's a bag, but the
> implementation
This may be helpful in understanding what happens when you do a group-by:
http://squarecog.wordpress.com/2010/05/11/group-operator-in-apache-pig/
Also, are you sure TOP doesn't give you items in order? It's a bag, but the
implementation is such that flattening it should give you things in proper
o
A has changed. A outside the foreach is a relation (all the records
you loaded). Inside the foreach A is a bag created by the group by.
So what this does is order the bag A by the second input, and then
take the top 3 records. Actually, given that order by goes from least
to greatest th
2011/4/26 Alan Gates
> topResults = foreach D {
>     srtd = order A by second;
>     top3 = limit srtd 3;
>     generate flatten(top3);
> };
>
> Alan.
>
> Thank you Alan. It works perfectly.
I realize I didn't really understand the mechanism behind foreach.
Reading this piece of code I
topResults = foreach D {
    srtd = order A by second;
    top3 = limit srtd 3;
    generate flatten(top3);
};
Alan.
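
[Editor's sketch, not part of Alan's message.] Alan's explanation in this thread notes that order by goes from least to greatest, so taking the three largest values of second within each group would presumably need a descending sort. The version below is a minimal sketch under that assumption. It reuses the relation names and the LOAD statement from the original question; the DESC keyword and the final DUMP are additions for illustration.

```pig
-- Load the semicolon-delimited input, as in the original question.
A = LOAD 'datatest' USING PigStorage(';') AS (first: chararray, second: int);

-- After GROUP, each record of D holds the group key and a bag named A.
D = GROUP A BY first;

topResults = FOREACH D {
    -- Inside the FOREACH, A refers to the per-group bag, not the full relation.
    srtd = ORDER A BY second DESC;  -- DESC is an assumption: largest values first
    top3 = LIMIT srtd 3;            -- keep at most three tuples per group
    GENERATE FLATTEN(top3);
};

DUMP topResults;
```

Dropping DESC gives the three smallest values per group instead.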
On Apr 26, 2011, at 6:11 AM, ugo jardonnet wrote:
Hi. I am looking for a way to get the result of top ordered. Is it
possible?
Example:
A = LOAD 'datatest' USING
mmm
In fact TOP doesn't order its results. I was looking for a way to do this
from Pig.
The problem is that TOP returns a bag, which cannot be ordered. And of
course after the foreach it's too late.
2011/4/26 Sven Krasser
> At a glance it could be this: The first field in D.A is of type chararray,
> but TOP
At a glance it could be this: The first field in D.A is of type chararray,
but TOP orders based on long.
-Sven
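
[Editor's sketch, not part of Sven's message.] If the type of the compared column is the issue, as Sven suggests, one thing to try is declaring that column as long. This is a sketch under that assumption; whether it is needed depends on the Pig version in use. The schema change from int to long and the final DUMP are the only differences from the script in the original question. Note that the column argument of TOP is 0-based, so 1 selects second, not first.

```pig
-- Same input as in the original question, but `second` is declared as long
-- (an assumption made here to match the type Sven says TOP compares on).
A = LOAD 'datatest' USING PigStorage(';') AS (first: chararray, second: long);

D = GROUP A BY first;

topResults = FOREACH D {
    -- TOP(n, column, bag): column index 1 is `second` (indices start at 0).
    result = TOP(3, 1, A);
    GENERATE FLATTEN(result);
};

DUMP topResults;
```

TOP still returns a bag, so this sketch does not by itself guarantee any output order.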
On Tue, Apr 26, 2011 at 6:11 AM, ugo jardonnet wrote:
> Hi. I am looking for a way to get the result of top ordered. Is it possible?
>
> Example:
>
> A = LOAD 'datatest' USING PigStor
Hi. I am looking for a way to get the result of top ordered. Is it possible?
Example:
A = LOAD 'datatest' USING PigStorage(';') as (first: chararray, second: int);
D = GROUP A BY first;
topResults = FOREACH D {
    result = TOP(3, 1, A);
    GENERATE flatten(result); -- unordered
};
dump