I am writing a MapReduce job using native Hadoop and I want to measure
its execution time, which I collect using Date (I know that I can also
see the execution time in the UI). To get a reliable measurement I run
the job 3 times (using one hadoop jar invocation and a for loop that
calls the job), and I am getting very strange results.

The first run seems to take much less time than the others, although all
runs produce the same output. (I know that the placement of the containers
might explain differences in execution time, but I am not sure why the
first run is always the fastest.) Here is an example of the code I am using:

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        conf.set(...)
        ...
        for (int i = 0; i < 3; i++) {
            Job job = Job.getInstance(conf, "OR-MR");
            ...
            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
        } // for
        return 0;
    } // run

By the way, if I run the job 3 times separately (one hadoop command per
run), then I get the fast execution time.

Is there any other way to measure the execution time without running the
job manually three times?
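
For reference, here is a minimal sketch of how each run could be timed on its own with System.nanoTime() rather than Date, so that the three durations can be compared side by side. The class RunTimer and the method timeRuns are hypothetical names, and the sleeping task is only a stand-in. In the real driver the task would presumably create a fresh Job and return job.waitForCompletion(true). This only measures wall-clock time as seen by the client and does not explain why the runs differ.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hadoop: times repeated runs of a task.
public class RunTimer {

    /**
     * Calls the task `runs` times and returns the wall-clock duration of
     * each call in milliseconds. A task that returns false is treated as
     * a failed run and aborts the measurement.
     */
    public static long[] timeRuns(int runs, Callable<Boolean> task) throws Exception {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();   // monotonic clock, unaffected by clock changes
            boolean ok = task.call();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000L;
            if (!ok) {
                throw new IllegalStateException("run " + i + " failed");
            }
        }
        return elapsedMs;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in task (assumption): sleeps instead of submitting a job.
        long[] times = timeRuns(3, () -> {
            Thread.sleep(50);
            return true;
        });
        for (int i = 0; i < times.length; i++) {
            System.out.println("run " + i + ": " + times[i] + " ms");
        }
    }
}
```

Printing one duration per iteration makes it easy to see whether only the first run is short or whether the times keep drifting across runs.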
