I jumped into Hadoop at the deep end. I know Pig, Hive, and HBase all support max(), but I am writing my own max() over a simple one-column dataset.
The best solution I came up with was using MapRunner. With MapRunner I can store the highest value seen so far in a private member variable, read through the entire input split, and emit only one value per mapper once the map input is exhausted. Then I can specify a single reducer that carries out the same operation over the per-mapper results. Does anyone have a better tactic? I thought a counter could do this, but are counter updates atomic?
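To make the idea concrete, here is a minimal sketch of the two-stage pattern in plain Java, without any Hadoop classes. The names TwoStageMax, localMax, and globalMax are made up for illustration, each inner list stands in for one mapper's input split, and the sample values are invented. It only shows the logic, not how it would be wired into a job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, dependency-free model of the approach; no Hadoop API is used.
class TwoStageMax {
    // Stage 1 (one call per mapper): scan the whole split, keep the running
    // maximum in a local variable, and return a single value at the end
    // instead of emitting one value per record.
    static long localMax(List<Long> split) {
        long max = Long.MIN_VALUE;
        for (long v : split) {
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    // Stage 2 (the single reducer): apply the same operation to the
    // per-mapper results. Empty splits contribute nothing. If every split
    // is empty, this returns Long.MIN_VALUE.
    static long globalMax(List<List<Long>> splits) {
        List<Long> perMapper = new ArrayList<>();
        for (List<Long> split : splits) {
            if (!split.isEmpty()) {
                perMapper.add(localMax(split));
            }
        }
        return localMax(perMapper);
    }

    public static void main(String[] args) {
        // Two made-up splits of a one-column dataset.
        List<List<Long>> splits = Arrays.asList(
                Arrays.asList(3L, 9L, 4L),
                Arrays.asList(12L, 7L));
        System.out.println(globalMax(splits)); // prints 12
    }
}
```

The point of the sketch is that the reducer only ever sees one value per mapper, so a single reducer should not become a bottleneck no matter how large the input is.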