Hi

This is my Pig script for sampling:

--Sampling.pig
--pig -x mapreduce -f Sampling.pig -param input=foo.csv -param output=OUT/pig -param delimiter="," -param fraction='0.05'

--Load data
inputdata = LOAD '$input' using PigStorage('$delimiter');

--Group data (not used by the sampling step below)
groupedByAll = group inputdata all;

--Sample data
sampled = SAMPLE inputdata $fraction;

--Output into HDFS
store sampled into '$output' using PigStorage('$delimiter');

I am passing the input parameters on the command line:
pig -x mapreduce -f Sampling.pig -param input=foo.csv -param output=OUT/pig -param delimiter="," -param fraction='0.05'

I would like to modify this so that the input parameters are taken from JSON instead.

Sample JSON:
{"Name":"sampling","elementInfo":{"fraction":"3"},"destination":"/user/sree/OUT","source":"/user/sree/foo.txt"}

Now I need to parse the above JSON and extract the required parameters.
How can I do this?
I know we can load JSON in Apache Pig, but how do I extract the required
values from it?

From this JSON I only need:
fraction, destination, source

Please suggest a way.
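
To make the goal concrete, here is a rough sketch of one possible approach, not a confirmed solution: a small Python wrapper that parses the JSON outside Pig and turns the three values into the same -param arguments used above. The function name build_pig_command and the default delimiter are hypothetical, chosen only for illustration. The fraction value is passed through unchanged; whether "3" should be converted (for example to 0.03) before reaching SAMPLE is an open question the sketch does not decide.

```python
import json
import shlex


def build_pig_command(config_text, script="Sampling.pig", delimiter=","):
    """Build the pig command line from a JSON config string.

    Hypothetical helper: the keys read here (source, destination,
    elementInfo.fraction) are the ones in the sample JSON above.
    """
    config = json.loads(config_text)

    # Map the JSON fields onto the parameter names Sampling.pig expects.
    params = {
        "input": config["source"],
        "output": config["destination"],
        "delimiter": delimiter,  # assumption: not present in the JSON
        "fraction": config["elementInfo"]["fraction"],  # passed through as-is
    }

    command = ["pig", "-x", "mapreduce", "-f", script]
    for name, value in params.items():
        command += ["-param", f"{name}={value}"]
    return command


sample = (
    '{"Name":"sampling","elementInfo":{"fraction":"3"},'
    '"destination":"/user/sree/OUT","source":"/user/sree/foo.txt"}'
)

command = build_pig_command(sample)
# Print the command instead of running it, so it can be checked first.
print(shlex.join(command))
```

The resulting list could then be handed to subprocess.run(command) on a machine where pig is installed. Doing the parsing outside the script keeps Sampling.pig itself unchanged.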

-- 
Thanks & Regards

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/
