Measuring Hadoop Execution time automatically

Or Raz Fri, 19 Oct 2018 07:15:08 -0700

I am writing a MapReduce job against the native Hadoop API and I want to
measure its execution time. I collect the time using Date (I know the
execution time is also shown in the UI). To get a reliable measurement I
run the job 3 times (a single hadoop jar invocation with a for loop that
calls the job), and I am getting very strange results.


The first run seems to take much less time than the others, although all
runs produce the same output (I know that the location of the containers
can change the execution time, but I am not sure why the first run is
always the fastest). An example of the code I am using:

public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    conf.set(...)...
    for (int i = 0; i < 3; i++) {
        // Job.getInstance is a static factory; create a fresh Job per run
        Job job = Job.getInstance(conf, "OR-MR");
        ...
        if (!job.waitForCompletion(true)) {
            System.exit(1);
        }
    } // for
    return 0;
} // run
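
The per-run timing described above could be sketched roughly as follows. This is only an illustration, not the actual job code. The class name RunTimer and the helper timeRuns are hypothetical. The Callable stands in for job.waitForCompletion(true) so the sketch runs without a cluster. It uses System.nanoTime() instead of Date, because nanoTime is meant for measuring elapsed time and is not affected by wall-clock adjustments.

```java
import java.util.concurrent.Callable;

public class RunTimer {

    /**
     * Runs the task the given number of times and returns the elapsed
     * time of each run in milliseconds. The task returns true on success,
     * mirroring the boolean returned by job.waitForCompletion(true).
     */
    public static long[] timeRuns(int runs, Callable<Boolean> task) throws Exception {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            boolean ok = task.call();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000L;
            if (!ok) {
                throw new IllegalStateException("run " + i + " failed");
            }
        }
        return elapsedMs;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the Hadoop job: sleeps 50 ms and succeeds.
        long[] times = timeRuns(3, () -> {
            Thread.sleep(50);
            return true;
        });
        for (int i = 0; i < times.length; i++) {
            System.out.println("run " + i + ": " + times[i] + " ms");
        }
    }
}
```

Keeping one number per run, rather than a single total, makes it easy to see whether only the first iteration differs from the rest.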

By the way, if I run the job 3 times separately (one hadoop command per
run), then I get the fast execution time.

Is there any other way to measure the execution time without running the
job manually three times?
