Hi,
Thanks for replying.
I am new here.
I am trying to find out what a UDF is.


On Tue, Sep 25, 2012 at 10:41 PM, Cheolsoo Park <cheol...@cloudera.com> wrote:

> Hi,
>
> in = load 'in.txt' using PigStorage(',') as (merchant:int, customer:int,
> amount:float);
> perMerchant = group in by merchant;
> avg = foreach perMerchant generate group, AVG(in.amount);
> dump avg;
>
> This returns (merchant_id, avg of amount) as follows:
>
> (1233,203.1999969482422)
> (1234,264.6000061035156)
>
> Regarding standard deviation, you can write your own UDF that computes it.
> Please take a look at AVG.java to see how it computes the average.
> Basically, you need to modify the exec() method to compute the standard
> deviation instead of the average.
>
> Thanks,
> Cheolsoo
>
> On Tue, Sep 25, 2012 at 6:36 PM, jamal sasha <jamalsha...@gmail.com>
> wrote:
>
> > Hi,
> > I have a huge text file of the form shown below. The data is saved
> > in the directory data/ (data1.txt, data2.txt and so on).
> >  merchant_id, user_id, amount
> >   1234, 9123, 299.2
> >   1233, 9199, 203.2
> >   1234, 0124, 230
> >   and so on.
> >
> > What I want to do is find the average amount for each merchant,
> > and in the end save the output to a file, something like:
> > merchant_id, average_amount
> >  1234, avg_amt_1234
> >   and so on.
> > How do I calculate the standard deviation as well?
> >
> > Sorry for asking such a basic question. :(
> > Any help would be appreciated. :)
> > Jamal
> >
>
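
For readers following the thread: a UDF is a user-defined function, that is, a function you write yourself and call from a Pig script the same way as a built-in such as AVG. The calculation suggested above might look roughly like the sketch below. It is plain Python rather than a modified AVG.java, the name stddev is made up for illustration, and it assumes the bag of amounts arrives as a list of single-field tuples (plain numbers are accepted too). It computes the population standard deviation (dividing by n, not n - 1).

```python
import math


def stddev(bag):
    """Population standard deviation of the amounts in a bag.

    Hypothetical helper, not part of Pig. Each item may be a plain
    number or a single-field tuple such as (299.2,); the tuple form
    is an assumption about how a bag is handed to the function.
    """
    values = []
    for item in bag:
        # Unwrap single-field tuples, pass plain numbers through.
        value = item[0] if isinstance(item, tuple) else item
        if value is not None:  # skip nulls
            values.append(float(value))

    n = len(values)
    if n == 0:
        return None  # no data for this group

    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance)


# Amounts for merchant 1234 from the sample data in the thread.
print(stddev([(299.2,), (230.0,)]))
```

For merchant 1234 the mean is 264.6 and both amounts lie 34.6 away from it, so the result is about 34.6. A merchant with a single row gets 0.0.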
