That code is in; unfortunately it doesn't quite solve the problem, and
you'd need to do some more work. You'd have to write subclasses that
emit the statistics you want, then set the appropriate options in
hadoop-site so that those classes get loaded.
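As a rough illustration of what "set the appropriate options in hadoop-site" might look like: the property name and class below are purely hypothetical placeholders, since the actual keys depend on which classes are being subclassed.

```xml
<!-- hadoop-site.xml -->
<!-- HYPOTHETICAL example: neither the property name nor the class
     is a real Hadoop key; substitute the ones for your subclasses. -->
<property>
  <name>mapred.statistics.collector.class</name>
  <value>com.example.MyStatisticsCollector</value>
</property>
```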
On Wed, Oct 8, 2008 at 12:30 PM, George
Hi!
I've developed a Map/Reduce algorithm to analyze some logs from a web
application.
We are basically ready to start the QA test phase, so now I would like
to know how efficient my application is from a performance point of
view. Is there any procedure I could use to do some profiling?
Just run your map reduce job locally and connect your profiler. I use
YourKit.
Works great!
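For reference, "run it locally" might look like the sketch below: the jar name, main class, and input/output paths are placeholders, and the `-D` overrides assume the job's driver uses Hadoop's GenericOptionsParser (i.e. implements Tool).

```shell
# Run the whole job in a single local JVM so a profiler can attach to it.
# mapred.job.tracker=local selects the LocalJobRunner;
# fs.default.name=file:/// keeps I/O on the local filesystem.
# Jar, class, and paths are placeholders.
hadoop jar myjob.jar com.example.LogAnalyzer \
  -D mapred.job.tracker=local \
  -D fs.default.name=file:/// \
  input/ output/
```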
You can profile your map reduce job by running the job in local mode,
just as you would any other Java app.
However, we have also profiled on a grid. You just need to install the
YourKit agent into the JVM of the node.
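A sketch of what "install the agent into the node's JVM" can look like for task JVMs: `mapred.child.java.opts` is the standard classic-Hadoop property for task JVM flags, but the jar name, class, agent path, and heap size below are assumptions you would adjust for your cluster.

```shell
# Attach the YourKit native agent to every task JVM via the child opts.
# The agent .so must be installed at this path on each tasktracker node;
# the path shown is an assumed install location, not a default.
hadoop jar myjob.jar com.example.LogAnalyzer \
  -D "mapred.child.java.opts=-Xmx512m -agentpath:/opt/yourkit/bin/linux-x86-64/libyjpagent.so" \
  input/ output/
```

Note that overriding `mapred.child.java.opts` replaces any heap settings the cluster config provided, so they need to be repeated here.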
Are you interested in simply profiling your own code (in which case you can
clearly use whatever Java profiler you want), or your construction of the
MapReduce job, i.e. how much time is being spent in the Map vs. the sort vs.
the shuffle vs. the Reduce? I am not aware of a good solution to the
Great, thanks for this info. Is there any chance that this information can
be exposed for streaming jobs as well?
(All of the jobs that we run in our lab are only via streaming...)
Thanks!
Ashish
On Wed, Oct 8, 2008 at 12:30 PM, George Porter [EMAIL PROTECTED] wrote:
Hi Ashish,
I