How do you identify, from within the map method, that the map task is ending? Is it possible to know which call to the map method is the last one?

On Sat, Apr 18, 2009 at 10:59 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> I jumped into Hadoop at the 'deep end'. I know pig, hive, and hbase
> support the ability to max(). I am writing my own max() over a simple
> one column dataset.
>
> The best solution I came up with was using MapRunner. With maprunner I
> can store the highest value in a private member variable. I can read
> through the entire data set and only have to emit one value per mapper
> upon completion of the map data. Then I can specify one reducer and
> carry out the same operation.
>
> Does anyone have a better tactic. I thought a counter could do this
> but are they atomic?
>
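
For reference, here is a rough sketch of the pattern described in the quoted message: keep the running maximum in a private member, emit nothing from map(), and emit a single value once the input is exhausted. It is plain Java and deliberately uses no Hadoop classes; `Collector`, `MaxMapper`, and `runSplit` are hypothetical stand-ins for an output collector, a mapper, and a MapRunner-style loop, so treat it as an illustration of the idea rather than working job code. As far as I know, the default MapRunner in the old org.apache.hadoop.mapred API calls the mapper's close() after the last record, which is the hook this sketch imitates.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; none of these types are Hadoop classes.
public class MaxMapperSketch {

    // Hypothetical stand-in for an output collector: records emitted values.
    static class Collector {
        final List<Long> emitted = new ArrayList<>();

        void collect(long value) {
            emitted.add(value);
        }
    }

    // Hypothetical mapper: map() only updates state, close() emits once.
    static class MaxMapper {
        private boolean seenAny = false;
        private long max;
        private Collector out;

        void map(long value, Collector collector) {
            out = collector; // remember the collector for use in close()
            if (!seenAny || value > max) {
                max = value;
                seenAny = true;
            }
        }

        void close() {
            // Emit the single per-split maximum, if any record was seen.
            if (seenAny) {
                out.collect(max);
            }
        }
    }

    // Hypothetical run loop: feed every record to map(), then call close().
    static List<Long> runSplit(long[] records) {
        Collector collector = new Collector();
        MaxMapper mapper = new MaxMapper();
        for (long record : records) {
            mapper.map(record, collector);
        }
        mapper.close();
        return collector.emitted;
    }

    public static void main(String[] args) {
        // Two "map tasks", each emitting one value for its split.
        List<Long> first = runSplit(new long[] {3, 17, 5});
        List<Long> second = runSplit(new long[] {42, 8});

        // The single "reducer" repeats the same operation over the partial maxima.
        long[] partials = new long[] {first.get(0), second.get(0)};
        System.out.println(runSplit(partials)); // prints [42]
    }
}
```

With this shape the map method never needs to know which call is the last one; the end-of-input hook does the emitting.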