Shared thread safe variables?

Jim Twensky Wed, 24 Dec 2008 00:29:04 -0800

Hello,

I was wondering if Hadoop provides thread-safe shared variables that can be
accessed from individual mappers/reducers, along with a proper locking
mechanism. To clarify, let's say that in the word count example I want to
know which word has the highest frequency and how many times it occurred. I
believe the count can be obtained using the counters that come with the
Hadoop framework, but I don't know how to get the word itself as a String.
Of course, the problem can be more complicated, such as finding the top 100
words.


I thought of writing a serial program that scans the final output of the
word count, but this would not be a good idea if the output file gets too
large. However, if there is a way to define and use shared variables, this
would be really easy to do on the fly during the word count's reduce phase.
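
To make the serial approach concrete, here is a minimal sketch of what I
have in mind. It is plain Python, not Hadoop API code, and it assumes each
output line has the form word<TAB>count; the function name top_words and the
parameter k are made up for illustration. It keeps only k entries in memory
by using a bounded min-heap, but it still has to read the whole output in a
single process.

```python
import heapq
from typing import Iterable, List, Tuple


def top_words(lines: Iterable[str], k: int = 100) -> List[Tuple[str, int]]:
    """Return the k most frequent (word, count) pairs, highest count first.

    Assumes each line looks like "word<TAB>count" (an assumption about the
    word count output format, not something taken from the Hadoop API).
    """
    if k <= 0:
        return []

    heap: List[Tuple[int, str]] = []  # min-heap holding at most k entries
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the last tab so the count is always the final field.
        word, _, count = line.rpartition("\t")
        entry = (int(count), word)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new entry beats the smallest one kept so far: swap it in.
            heapq.heapreplace(heap, entry)

    # Largest count first; ties are ordered by word, descending.
    return [(word, count) for count, word in sorted(heap, reverse=True)]
```

It could be called as top_words(open(path), k=100), where path is whatever
file the word count job wrote. Because any iterable of lines works, several
output files could be chained together and fed through the same function.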

Thanks,
Jim
