Thanks for the reference. Yes, I am aware of it, but I can't use it as is. For future reference, it would also be good for me to know:
1. If I create a static member in a UDF class, is that one instance per mapper task?
2. Is there a method that gets called at the end of the mapper method that I can use for cleanup?

On the same subject, is it better to index in a UDF or a storefunc? I am trying to work out how to decide in a case like this, where you are interacting with an external system.

On Tue, May 15, 2012 at 6:03 PM, Russell Jurney <russell.jur...@gmail.com> wrote:

> Are you aware of Wonderdog, which already does this? Unfortunately,
> finding reusable pig components can be very hard, as they exist across
> many proprietary projects.
>
> https://github.com/infochimps/wonderdog
> A post explaining how to use it, end to end, is here:
>
> http://www.quora.com/Autocomplete/What-is-the-best-way-to-implement-an-autocomplete-search-feature-when-dealing-with-large-data-sets
>
> Russell Jurney http://datasyndrome.com
>
> On May 15, 2012, at 4:18 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>
> > I am trying to write an UDF that indexes data in elasticsearch after
> > converting it to JSON. I had 2 questions:
> >
> > 1. If I create a static member in UDF class is that one instance per
> > mapper task?
> > 2. Is there a method that gets called at the end of mapper method that I
> > can use for cleanup?
> >
> > I was wondering if I should rather write a storefunc that would index the
> > data. Need some help here, essentially I need some way to initialize search
> > Client once and then at the end close it out.
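
To make the two questions concrete, here is a rough sketch of the pattern I am after, with no Pig or elasticsearch dependency. SearchClient, IndexingFunction, exec and finish are hypothetical stand-ins I made up for illustration, not the real APIs. Whether the static really ends up as one instance per mapper task, and whether something calls a cleanup hook like finish() once at the end, is exactly what I am asking.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a search client; NOT the elasticsearch API. */
class SearchClient {
    final List<String> indexed = new ArrayList<>();
    boolean closed = false;

    void index(String json) {
        if (closed) {
            throw new IllegalStateException("client is closed");
        }
        indexed.add(json);
    }

    void close() {
        closed = true;
    }
}

/** Hypothetical UDF-like class: one shared client, created on first use. */
class IndexingFunction {
    // Static, so shared by every instance of this class in the same JVM.
    private static SearchClient client;
    static int clientsCreated = 0;

    private static synchronized SearchClient client() {
        if (client == null) {
            client = new SearchClient();
            clientsCreated++;
        }
        return client;
    }

    /** Called once per record: build a JSON document and index it. */
    String exec(String key, String value) {
        // Naive JSON building for illustration only; no escaping is done.
        String json = "{\"" + key + "\":\"" + value + "\"}";
        client().index(json);
        return json;
    }

    /** Cleanup hook (assumed): closes the shared client and returns it, or null if none was created. */
    static synchronized SearchClient finish() {
        SearchClient c = client;
        if (c != null) {
            c.close();
            client = null;
        }
        return c;
    }
}
```

The lazy static keeps the client out of the constructor, so creating several instances of the function does not open several connections. The open part is who calls finish(), and when.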