DEFINE SRS datafu.pig.sampling.SimpleRandomSample('0.01');
examples = LOAD '/home/sreeveni/myfiles/FS/age.txt' as (id,age);
grouped = GROUP examples BY id;
sampled = FOREACH grouped GENERATE FLATTEN(SRS(examples));
DUMP sampled;
I supplied a file with 796 lines. When I dump the output, I get the
same 796 lines back instead of a roughly 1% sample.
Why is that?
Thanks in advance.
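
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the DataFu documentation describes SimpleRandomSample as returning a sample of exactly ceil(p*n) items from each input bag of n items. If every id in the file is distinct (an assumption; the file contents are not shown), GROUP ... BY id yields 796 bags of one row each, and ceil(0.01 * 1) = 1, so every row is kept. The Python below only imitates that arithmetic; `sample_per_bag` and the generated rows are hypothetical stand-ins, not DataFu code or the real data.

```python
import math
import random
from collections import defaultdict

def sample_per_bag(rows, key, p, seed=0):
    """Group rows by key, then keep ceil(p * n) random rows from each group.

    Hypothetical helper: it mimics the documented "exactly ceil(p*n) per bag"
    sample size, not DataFu's actual sampling algorithm.
    """
    rng = random.Random(seed)
    bags = defaultdict(list)
    for row in rows:
        bags[key(row)].append(row)
    sampled = []
    for bag in bags.values():
        k = math.ceil(p * len(bag))  # a bag of 1 row gives ceil(0.01) = 1
        sampled.extend(rng.sample(bag, k))
    return sampled

# Made-up data: 796 rows of (id, age), with every id assumed to be unique.
rows = [(i, 20 + i % 50) for i in range(796)]

# One bag per id, as with GROUP examples BY id.
by_id = sample_per_bag(rows, key=lambda r: r[0], p=0.01)

# One bag holding every row, as with GROUP examples ALL.
all_in_one = sample_per_bag(rows, key=lambda r: "all", p=0.01)

print(len(by_id), len(all_in_one))  # prints "796 8"
```

If that is the cause, grouping with GROUP examples ALL before calling SRS would put all 796 rows into a single bag, which under the same assumption should give about 8 rows.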
--
*Thanks & Regards *
*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/