Hi All,
I am writing a class (called Parser) with a couple of static methods,
because I don't want millions of instances of this class to be created
during the run.
However, I realized that Hadoop will eventually run jobs in parallel, and
if all of those jobs call the static methods of this Parser class, would
that be safe? In other words, will all Hadoop jobs share the same Parser
class, or will each of them have its own? In the former case, where all
jobs share the same class, making the methods synchronized would force the
jobs to wait for the locks on those methods to be released, which would
hurt performance. In the latter case, there would be no problem.
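To make the question concrete, here is a minimal sketch of the kind of
class I mean (the parsing logic is just a placeholder):

```java
public class Parser {

    // No instances needed; all state is passed in as arguments.
    private Parser() {}

    // "synchronized" on a static method locks on Parser.class, so
    // concurrent callers within the same JVM serialize here.
    public static synchronized String[] parse(String line) {
        return line.split(",");
    }
}
```

Note that parse() above keeps no shared mutable state, so the synchronized
keyword (and the contention it would cause) may not even be needed; whether
it is depends on the answer to my question about how the jobs share the
class.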
Can someone provide some insights?
Thanks
Huy