Hi,
There are several options here. You can use the JobConf (0.18) or the
Configuration returned by context.getConfiguration() (0.20) to pass these
lines to all tasks (assuming they are not large) and read them back in
configure() / setup(). Or you can use the distributed cache to ship a file
containing these lines to every task and read it there (possibly with JVM
reuse, if you want that extra bit as well).
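
A rough sketch of the first option, using only the JDK so it runs on its own.
java.util.Properties stands in for the job configuration, the constructor
plays the role of configure() / setup(), and score() is what map() would call
for each record. The class name, the key names and the similarity measure
(fraction of matching whitespace-separated fields) are all made up for
illustration; your real scoring function would go there.

```java
import java.util.Properties;

// Plain-JDK stand-in for the pattern: the driver stores the two reference
// lines under string keys, and each task reads them back once before it
// starts processing records.
class CentroidScorer {
    // Hypothetical key names, not part of any Hadoop API.
    static final String KEY_A = "centric.line.a";
    static final String KEY_B = "centric.line.b";

    private final String[] refA;
    private final String[] refB;

    // In a real job this body would live in configure() / setup().
    CentroidScorer(Properties conf) {
        String a = conf.getProperty(KEY_A);
        String b = conf.getProperty(KEY_B);
        if (a == null || b == null) {
            throw new IllegalArgumentException("both reference lines must be set");
        }
        refA = a.trim().split("\\s+");
        refB = b.trim().split("\\s+");
    }

    // Placeholder similarity (an assumption): the fraction of field
    // positions at which the two lines hold the same value.
    static double similarity(String[] x, String[] y) {
        int longer = Math.max(x.length, y.length);
        int shorter = Math.min(x.length, y.length);
        int same = 0;
        for (int i = 0; i < shorter; i++) {
            if (x[i].equals(y[i])) {
                same++;
            }
        }
        return (double) same / longer;
    }

    // Called once per input line; returns the two scores.
    double[] score(String line) {
        String[] fields = line.trim().split("\\s+");
        return new double[] { similarity(fields, refA), similarity(fields, refB) };
    }

    public static void main(String[] args) {
        // Driver side: put the two lines into the configuration.
        Properties conf = new Properties();
        conf.setProperty(KEY_A, "1 2 3 4");
        conf.setProperty(KEY_B, "1 2 9 9");

        // Task side: build once, then score every record.
        CentroidScorer scorer = new CentroidScorer(conf);
        double[] s = scorer.score("1 2 3 4");
        System.out.println(s[0] + "\t" + s[1]);
    }
}
```

In an actual job the driver would call conf.set(key, line) before submitting,
and the mapper would read the values with job.get(key) in configure(JobConf)
(0.18) or context.getConfiguration().get(key) in setup(Context) (0.20).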

Thanks,
Amogh

On 10/26/09 6:17 AM, "Boyu Zhang" <boyuzhan...@gmail.com> wrote:

Dear All,

I am implementing a clustering algorithm in which I need to compare each
line to two specific lines (they all have the same format) and output two
scores denoting the similarity between each line and the two specific lines.

Can I define two global variables (the two specific lines) in the main()
method and pass those two variables to the mapper class?
Or can I store the two lines in a separate file (say, Centric) and have the
mapper class read that file and compare each line (from the other files,
say, Data, which hold the data to be processed) with the two lines from the
separate file Centric?

Thanks a lot for reading my email; I really appreciate any help!

Boyu Zhang (Emma)
University of Delaware
