Hi, thanks for replying. I am new here, and I am trying to find out what a UDF is.
On Tue, Sep 25, 2012 at 10:41 PM, Cheolsoo Park <cheol...@cloudera.com> wrote:

> Hi,
>
> in = load 'in.txt' using PigStorage(',') as (merchant:int, customer:int, amount:float);
> perMerchant = group in by merchant;
> avg = foreach perMerchant generate group, AVG(in.amount);
> dump avg;
>
> This returns (merchant_id, avg of amount) as follows:
>
> (1233,203.1999969482422)
> (1234,264.6000061035156)
>
> Regarding standard deviation, you can write your own UDF that computes it.
> Please take a look at AVG.java to see how it compute the average.
> Basically, you need to modify the exec() method to compute standard
> deviation instead of average.
>
> Thanks,
> Cheolsoo
>
> On Tue, Sep 25, 2012 at 6:36 PM, jamal sasha <jamalsha...@gmail.com>
> wrote:
>
> > Hi,
> > I have a huge text file of form
> > data is saved in directory data/data1.txt, data2.txt and so on
> > merchant_id, user_id, amount
> > 1234, 9123, 299.2
> > 1233, 9199, 203.2
> > 1234, 0124, 230
> > and so on..
> >
> > What I want to do is for each merchant, find the average amount..
> > so basically in the end i want to save the output in file.
> > something like
> > merchant_id, average_amount
> > 1234, avg_amt_1234 a
> > and so on.
> > How do I calculate the standard deviation as well?
> >
> > Sorry for asking such a basic question. :(
> > Any help would be appreciated. :)
> > Jamal
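
For reference, a UDF is a user-defined function: custom code that Pig calls in the same way as a built-in such as AVG. The quoted reply points at the Java route (AVG.java and its exec() method). Pig can also register UDFs written in Python through Jython, so below is a rough sketch of only the computation such a function would perform. It is not taken from AVG.java. The function name stddev is made up for illustration, and it assumes the bag of amounts arrives as a list of one-field tuples. It computes the population standard deviation (dividing by n) and leaves out the output schema declaration that registering it with Pig would need, so that it runs as plain Python.

```python
import math


def stddev(bag):
    """Population standard deviation of a bag of one-field tuples.

    Hypothetical helper: 'bag' is assumed to look like [(299.2,), (230.0,)],
    i.e. one tuple per record with the amount as its only field.
    """
    # Skip null amounts, as Pig's built-in aggregates ignore nulls.
    values = [row[0] for row in bag if row[0] is not None]
    if not values:
        return None  # nothing to aggregate for this group

    n = float(len(values))
    mean = sum(values) / n
    # Average squared distance from the mean (population variance).
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance)


if __name__ == "__main__":
    # The two amounts listed for merchant 1234 in the sample data above.
    print(stddev([(299.2,), (230.0,)]))
```

If a sample standard deviation is wanted instead, the variance would be divided by n - 1 rather than n, and groups with a single record would need separate handling.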