From: Mehmet Tepedelenlioglu [mailto:mehmets...@yahoo.com]
Sent: Wednesday, May 28, 2014 4:27 PM
To: user@pig.apache.org
Subject: Re: How to sample an inner bag?
I have no experience with Python UDFs (I use Java), but I doubt the example
you supplied would work. First, I am
-----Original Message-----
From: Mehmet Tepedelenlioglu [mailto:mehmets...@yahoo.com]
Sent: Tuesday, May 27, 2014 5:09 PM
To: user@pig.apache.org
Subject: Re: How to sample an inner bag?
If you know how many items you want from each inner bag exactly, you can hack
it like this:
x = foreach x {
y = foreach x generate RANDOM() as rnd, *;
y = order y by rnd;
y = limit y $SAMPLE_NUM;
y = foreach y generate $1
Hi Pig users,
Is there an easy/efficient way to sample an inner bag? For example, with input
in a relation like
(id1,att1,{(a,0.01),(b,0.02),(x,0.999749968742)})
(id1,att2,{(a,0.03),(b,0.04),(x,0.998749217772)})
(id2,att1,{(b,0.05),(c,0.06),(x,0.996945334509)})
I’d like to sample 1/3 the
If you know how many items you want from each inner bag exactly, you can hack
it like this:
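x = foreach x {
    y = foreach x generate RANDOM() as rnd, *;  -- tag each inner tuple with a random key
    y = order y by rnd;                         -- shuffle the bag by sorting on that key
    y = limit y $SAMPLE_NUM;                    -- keep the first $SAMPLE_NUM tuples
    y = foreach y generate $1 ..;               -- drop rnd, keep the original fields
    generate group, y;
}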
Basically randomize the
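
For anyone who finds the nested foreach hard to read, the same idea can be sketched in plain Python: attach a random key to each tuple, sort on that key, keep the first N, and drop the key. This only illustrates the technique and is not a tested Pig UDF; the name sample_bag and its rng parameter are hypothetical, and wiring it into a Pig script is not shown.

```python
import random


def sample_bag(bag, n, rng=random):
    """Return up to n tuples drawn uniformly at random from bag.

    Hypothetical helper that mirrors the Pig hack step by step.
    """
    # RANDOM() as rnd, *  -> pair every tuple with a random key
    keyed = [(rng.random(), item) for item in bag]
    # order y by rnd      -> sort on the key only, never on the tuple itself
    keyed.sort(key=lambda pair: pair[0])
    # limit, then $1 ..   -> keep the first n and drop the key again
    return [item for _, item in keyed[:n]]


# Inner bag taken from the first row of the original question.
bag = [("a", 0.01), ("b", 0.02), ("x", 0.999749968742)]
# A seeded generator is used here only to make the run repeatable.
sample = sample_bag(bag, 2, random.Random(42))
```

As with the Pig version, asking for more items than the bag holds simply returns the whole bag in random order.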
@Mehmet... great hack! I like it :-P
On Tue, May 27, 2014 at 5:08 PM, Mehmet Tepedelenlioglu
mehmets...@yahoo.com wrote:
If you know how many items you want from each inner bag exactly, you can
hack it like this:
x = foreach x {
y = foreach x generate RANDOM() as rnd, *;
y =