One simple way is to write a UDF that acts as a JSON parser. Load your data, then call the UDF to parse the JSON and extract the fields you need. You have to build this yourself: Pig does not do it for you, it only gives you the capability. How you do it is up to you.
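Since the three values you want (fraction, destination, source) are script parameters rather than data, another option is to parse the JSON outside Pig and pass the values in with -param. This is not the UDF approach; it is only a rough sketch of the alternative. The function name build_pig_command, the constant SAMPLE_JSON, and the default script name and delimiter are my own assumptions taken from your command line, and the values are passed through unchanged.

```python
import json
import shlex

# Sample JSON copied from the original message.
SAMPLE_JSON = (
    '{"Name":"sampling","elementInfo":{"fraction":"3"},'
    '"destination":"/user/sree/OUT","source":"/user/sree/foo.txt"}'
)


def build_pig_command(config_json, script="Sampling.pig", delimiter=","):
    """Build the pig argument list from the JSON config (hypothetical helper)."""
    config = json.loads(config_json)

    # Map the JSON fields onto the parameter names used in Sampling.pig.
    params = [
        ("input", config["source"]),
        ("output", config["destination"]),
        ("delimiter", delimiter),
        ("fraction", config["elementInfo"]["fraction"]),
    ]

    command = ["pig", "-x", "mapreduce", "-f", script]
    for name, value in params:
        command += ["-param", "{}={}".format(name, value)]
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(" ".join(shlex.quote(arg) for arg in build_pig_command(SAMPLE_JSON)))
```

The list returned by build_pig_command can be handed to subprocess.call once the printed command looks right. Sampling.pig itself stays as it is.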

On Fri, Jul 25, 2014 at 12:09 PM, unmesha sreeveni <[email protected]> wrote:
> Hi
>
> This is my code for sampling
>
> --Sampling.pig
> --pig -x mapreduce -f Sampling.pig -param input=foo.csv -param output=OUT/pig -param delimiter="," -param fraction='0.05'
>
> --Load data
> inputdata = LOAD '$input' using PigStorage('$delimiter');
>
> --Group data
> groupedByAll = group inputdata all;
>
> --output into hdfs
> sampled = SAMPLE inputdata $fraction;
> store sampled into '$output' using PigStorage('$delimiter');
>
> I am taking input parameters as customized
> pig -x mapreduce -f Sampling.pig -param input=foo.csv -param output=OUT/pig -param delimiter="," -param fraction='0.05'
>
> I would like to do a modification in the same
> I am trying to take my input as json
>
> sample json:
>
> {"Name":"sampling","elementInfo":{"fraction":"3"},"destination":"/user/sree/OUT","source":"/user/sree/foo.txt"}
>
> Now I need to parse the above json and take the needful params.
> How to do the same
> I know we can load json in apache pig but how to extract the needful from the json
>
> from here I only need
> fraction,destination,source
>
> Please suggest a way
>
> --
> Thanks & Regards
>
> Unmesha Sreeveni U.B
> Hadoop, Bigdata Developer
> Center for Cyber Security | Amrita Vishwa Vidyapeetham
> http://www.unmeshasreeveni.blogspot.in/
