Are you aware of Wonderdog, which already does this? Unfortunately, finding reusable Pig components can be very hard, as they exist across many proprietary projects.
https://github.com/infochimps/wonderdog

A post explaining how to use it, end to end, is here:
http://www.quora.com/Autocomplete/What-is-the-best-way-to-implement-an-autocomplete-search-feature-when-dealing-with-large-data-sets

Russell Jurney
http://datasyndrome.com

On May 15, 2012, at 4:18 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> I am trying to write an UDF that indexes data in elasticsearch after
> converting it to JSON. I had 2 questions:
>
> 1. If I create a static member in UDF class is that one instance per mapper
> task?
> 2. Is there a method that gets called at the end of mapper method that I
> can use for cleanup?
>
> I was wondering if I should rather write a storefunc that would index the
> data. Need some help here, essentially I need some way to initialize search
> Client once and then at the end close it out.