Ah nice. Thank you so much Zach!

On Tue, Dec 7, 2010 at 11:47 AM, Zach Bailey <zach.bai...@dataclip.com> wrote:
> You can pass parameters via the UDF constructor. For example:
>
> public MyUDF(boolean includeAge, boolean includeGender)
>
> then you would initialize it like so in your pig script:
>
> define MY_UDF_ONLY_AGE com.package.MyUDF(true, false)
>
> and use it like:
>
> data_with_age = FOREACH data GENERATE user_id, MY_UDF_ONLY_AGE(user_id);
>
> HTH,
> Zach
>
> On Tuesday, December 7, 2010 at 2:44 PM, Dexin Wang wrote:
>
> > Hi,
> >
> > This might be a dumb question. Is it possible to pass anything other than
> > the input tuple to a UDF Eval function?
> >
> > Basically in my UDF, I need to do some user info lookup. So the input will
> > be:
> >
> > (userid,f1,f2)
> >
> > with this UDF, I want to convert it to something like
> >
> > (userid,age,gender,location,f1,f2)
> >
> > where in the UDF I do a DB lookup on the userid and returns user's info
> > (age, gender, etc). But I don't necessarily want to pass back the same user
> > info fields, e.g. sometimes I only want age.
> >
> > I hope there is a way for me to tell the UDF that I only want "age", and
> > sometimes "age, location", etc.
> >
> > What's the best way to achieve this without having to write a separate UDF
> > for every case?
> >
> > Thanks.
> > Dexin
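
For anyone finding this thread later, here is a minimal sketch of the constructor-parameter approach Zach describes, written as plain Java so it runs without Pig on the classpath. The class name UserInfoUdfSketch, the third (location) flag, and the in-memory USERS map standing in for the DB lookup are all hypothetical. As far as I know, Pig hands DEFINE constructor arguments to the UDF as strings, so the sketch takes String parameters and parses them; please treat that as an assumption and check it against your Pig version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Pig UDF. A real one would extend
// org.apache.pig.EvalFunc<Tuple> and implement exec(Tuple) instead.
class UserInfoUdfSketch {
    // Hypothetical in-memory table replacing the DB lookup: userid -> {age, gender, location}.
    private static final Map<String, String[]> USERS = new HashMap<>();
    static {
        USERS.put("u1", new String[] {"34", "F", "Austin"});
    }

    private final boolean includeAge;
    private final boolean includeGender;
    private final boolean includeLocation;

    // Assumption: constructor arguments arrive as strings, so parse them once here.
    UserInfoUdfSketch(String includeAge, String includeGender, String includeLocation) {
        this.includeAge = Boolean.parseBoolean(includeAge);
        this.includeGender = Boolean.parseBoolean(includeGender);
        this.includeLocation = Boolean.parseBoolean(includeLocation);
    }

    // Input: (userid, f1, f2). Output: (userid, <selected user fields>, f1, f2).
    // Unknown users get null for each selected field.
    List<String> exec(List<String> input) {
        String userId = input.get(0);
        String[] info = USERS.get(userId);
        List<String> out = new ArrayList<>();
        out.add(userId);
        if (includeAge) {
            out.add(info == null ? null : info[0]);
        }
        if (includeGender) {
            out.add(info == null ? null : info[1]);
        }
        if (includeLocation) {
            out.add(info == null ? null : info[2]);
        }
        // Pass the remaining input fields through unchanged.
        out.addAll(input.subList(1, input.size()));
        return out;
    }
}
```

Each DEFINE alias in the Pig script would then correspond to one constructor call, so a single class covers "only age", "age and location", and so on. Under the string-argument assumption above, the script side would presumably quote the flags, e.g. define MY_UDF_ONLY_AGE com.package.MyUDF('true', 'false', 'false');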