I am writing a MapReduce job using native Hadoop and I want to measure its execution time. I collect the time using Date (I know I can also see the execution time in the UI). To get a reliable measurement I run the job 3 times (a single hadoop jar invocation with a for loop that calls the job), and I am getting very strange results.
The first run always seems to take much less time than the others, although every run produces the same output. (I know that the location of the containers might change the execution time, but I am not sure why the first run is always the fastest.) Here is an example of the code I am using:

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        conf.set(...)
        ...
        for (int i = 0; i < 3; i++) {
            Job job = Job.getInstance(conf, "OR-MR");
            ...
            if (!job.waitForCompletion(true)) {
                System.exit(1);
            }
        } // for
    } // run

By the way, if I run the job 3 times separately (one hadoop command per run), then I get the fast execution time. Is there another way to measure the execution time without running the job manually three times?
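
For reference, the per-iteration timing described above can be sketched without any Hadoop dependency. This is only a minimal illustration: `RunTimer` and `timeRuns` are hypothetical names, the job is replaced by a placeholder `Callable`, and `System.nanoTime()` stands in for `Date` because it is monotonic. In the real driver the callable would presumably build a fresh `Job` and return `job.waitForCompletion(true)`.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Hadoop): times each run of a job separately.
class RunTimer {

    // Runs `job` `runs` times and returns the wall-clock duration of each run in milliseconds.
    static long[] timeRuns(int runs, Callable<Boolean> job) {
        long[] elapsedMillis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // monotonic, unaffected by system clock changes
            boolean ok;
            try {
                ok = job.call();
            } catch (Exception e) {
                throw new IllegalStateException("run " + i + " threw an exception", e);
            }
            elapsedMillis[i] = (System.nanoTime() - start) / 1_000_000L;
            if (!ok) {
                throw new IllegalStateException("run " + i + " failed");
            }
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for the MapReduce job (assumption).
        long[] times = timeRuns(3, () -> {
            Thread.sleep(10);
            return true;
        });
        for (int i = 0; i < times.length; i++) {
            System.out.println("run " + i + ": " + times[i] + " ms");
        }
    }
}
```

Printing each run's duration separately, rather than only a total or an average, makes it easy to see whether the first iteration really differs from the later ones.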