You could dump the data into a file on the DFS and pass the file's location as a parameter to your UDF in the DEFINE statement, so that the UDF initializes itself from that data.
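A minimal sketch of that idea, kept free of Pig and Hadoop dependencies so it compiles on its own. The class name `LookupSketch`, the tab-separated file layout, and the use of `java.nio.file` in place of the Hadoop FileSystem API are all assumptions made for illustration. In an actual UDF the class would extend `org.apache.pig.EvalFunc` and `exec` would take a `Tuple`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Pig UDF. A real one would extend
// org.apache.pig.EvalFunc<String> and implement exec(Tuple).
class LookupSketch {
    private final String dataPath;      // the string argument given in DEFINE
    private Map<String, String> table;  // built lazily on the first exec call

    LookupSketch(String dataPath) {
        // Only remember the path here; the file is read on first use.
        this.dataPath = dataPath;
    }

    // Reads "key<TAB>value" lines (an assumed layout) into a hash map.
    // A local path is used here; a DFS file would be opened through the
    // Hadoop FileSystem API instead.
    private Map<String, String> load() throws IOException {
        Map<String, String> m = new HashMap<>();
        try (BufferedReader in = Files.newBufferedReader(
                Paths.get(dataPath), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.split("\t", 2);
                if (parts.length == 2) {
                    m.put(parts[0], parts[1]);
                }
            }
        }
        return m;
    }

    // Looks up one key; returns null when the key is absent.
    String exec(String key) throws IOException {
        if (table == null) {
            table = load();
        }
        return table.get(key);
    }
}
```

On the Pig side, the path would be supplied as a constructor argument, along the lines of `DEFINE lookup com.example.LookupSketch('/data/lookup.tsv');` where both the package name and the path are hypothetical. Loading on the first `exec` call rather than in the constructor is a design choice in this sketch: it avoids reading the file wherever the class is merely instantiated.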
- Mridul

> -----Original Message-----
> From: Dexin Wang [mailto:wangde...@gmail.com]
> Sent: Tuesday, June 26, 2012 10:58 PM
> To: user@pig.apache.org
> Subject: Passing a BAG to Pig UDF constructor?
>
> Is it possible to pass a bag to a Pig UDF constructor?
>
> Basically in the constructor I want to initialize some hash map so that
> on every exec operation, I can use the hashmap to do a lookup and find
> the value I need, and apply some algorithm to it.
>
> I realize I could just do a replicated join to achieve similar things
> but the algorithm is more than a few lines and there are some edge
> cases so I would rather wrap that logic inside a UDF function. I also
> realize I could just pass a file path to the constructor and read the
> files to initialize the hashmap but my files are on Amazon's S3 and I
> don't want to deal with S3 API to read the file.
>
> Is this possible or is there some alternative ways to achieve the same
> thing?
>
> Thanks.
> Dexin