What is the error? Function not found, or something like that?

What about this?

avg = generate myudfs.CalculateAvg(dividends);

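One more thing to check: in Pig Latin, GENERATE is only valid inside a FOREACH statement, so a bare "avg = generate ..." line will not parse whether or not the UDF is found. Here is a rough sketch of how the whole-dataset average might be computed with built-ins only. I am assuming each line of try.txt looks like the sample in the quoted message, and that the default TOKENIZE delimiters (which include space and comma) are enough to split it. The aliases lines, tokens, nums, grouped and average are just names I picked. I have not run this against your file.

```pig
-- Read each input line as one chararray field.
lines = LOAD 'try.txt' USING TextLoader() AS (line:chararray);

-- TOKENIZE returns a bag of tokens; FLATTEN turns it into one row per token.
tokens = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS token;

-- Cast every token to double so it can be averaged.
nums = FOREACH tokens GENERATE (double)token AS value;

-- Collect all values into a single group, then average that bag.
grouped = GROUP nums ALL;
average = FOREACH grouped GENERATE AVG(nums.value);
DUMP average;

-- If you still want your own UDF, it would be called the same way,
-- assuming CalculateAvg lives in package myudfs and accepts a bag:
-- average = FOREACH grouped GENERATE myudfs.CalculateAvg(nums.value);
```

The GROUP ... ALL step is what turns the whole relation into one bag, which is the shape of input that an aggregate such as AVG (or a custom averaging UDF) works on.
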
On Mon, Mar 4, 2013 at 4:56 PM, Preeti Gupta <preetigupt...@soe.ucsc.edu> wrote:
> Hello All,
>
> I have dataset like
>
> 0, 10.1, 20.1, 30, 40,
> 50, 60, 70, 80.1, 1,
> 2, 3, 4, 5, 6,
> 7, 8, 9, 10, 11,
> 12, 13, 14, 15, 16,
> 1, 2, 3, 4, 5,
> 56, 6, 7, 8, 9,
> 9, 9, 9, 12, 1,
> 3, 14, 1, 5, 6,
> 7, 8, 8, 9, 12
>
> So basically comma separated values. But I want to consider this as one
> data column and I want to calculate the average of the whole dataset.
>
> I believe I have to write UDF to calculate average. Pig is able to load
> this data
>
> ( 0, 10.1, 20.1, 30, 40,)
> ( 50, 60, 70, 80.1, 1,)
> ( 2, 3, 4, 5, 6,)
> ( 7, 8, 9, 10, 11,)
> ( 12, 13, 14, 15, 16,)
> ( 1, 2, 3, 4, 5,)
> ( 56, 6, 7, 8, 9,)
> ( 9, 9, 9, 12, 1,)
> ( 3, 14, 1, 5, 6,)
> ( 7, 8, 8, 9, 12 )
>
> and How do I invoke that UDF in my pig script? Say I implement
> CalculateAvg function.
>
> REGISTER ./myudfs.jar
> dividends = load 'try.txt';
> dump dividends
> --grouped = group dividends by symbol;
> avg = generate CalculateAvg(dividends);
> dump avg
> --store avg into 'average_dividend';
>
> It fails.