Hello, I was wondering whether Hadoop provides thread-safe shared variables that individual mappers/reducers can access, along with a proper locking mechanism. To clarify, suppose that in the word count example I want to know which word has the highest frequency and how many times it occurred. I believe the latter can be done using the counters that come with the Hadoop framework, but I don't know how to get the word itself as a String. Of course, the problem could be more general, such as finding the top 100 words.
I considered writing a serial program that scans the final word count output, but that would not scale well if the output file gets very large. If there were a way to define and use shared variables, though, this would be easy to do on the fly during the word count's reduce phase.

Thanks,
Jim
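
P.S. For reference, here is a rough sketch of the serial pass I have in mind. It assumes each line of the word count output has the form word<TAB>count; the class name TopWords and the method topN are just placeholders I made up for illustration. It keeps a bounded min-heap so that only the N best entries are held at any time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.stream.Stream;

// Hypothetical helper: serial scan over word count output to find the top n words.
public class TopWords {

    /** Returns the n most frequent words, highest count first. */
    static List<Map.Entry<String, Long>> topN(Stream<String> lines, int n) {
        Comparator<Map.Entry<String, Long>> byCount =
                (a, b) -> Long.compare(a.getValue(), b.getValue());
        // Min-heap by count: the root is the weakest of the current candidates.
        PriorityQueue<Map.Entry<String, Long>> heap = new PriorityQueue<>(byCount);

        lines.forEach(line -> {
            // Assumption: each line is "word<TAB>count"; anything else is skipped.
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                return;
            }
            String word = line.substring(0, tab);
            long count = Long.parseLong(line.substring(tab + 1).trim());
            heap.add(new SimpleEntry<>(word, count));
            if (heap.size() > n) {
                heap.poll(); // evict the smallest so at most n entries remain
            }
        });

        List<Map.Entry<String, Long>> result = new ArrayList<>(heap);
        result.sort(byCount.reversed()); // highest count first
        return result;
    }

    // Usage: java TopWords <word-count-output-file> <n>
    public static void main(String[] args) throws IOException {
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            for (Map.Entry<String, Long> e : topN(lines, Integer.parseInt(args[1]))) {
                System.out.println(e.getKey() + "\t" + e.getValue());
            }
        }
    }
}
```

This holds only N entries in memory, but it still has to read the entire output sequentially after the job finishes, which is the part I would like to avoid.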