After further investigation, and after looking at the re-run results of
the benchmarks, I am changing my vote to +1.
Sort on 500 nodes, re-run #1: 2.43 hrs
Sort on 500 nodes, re-run #2: 2.36 hrs
Thanks,
Mukund
Mukund Madhugiri wrote:
I am voting -1 for this as I am seeing performance degradation in the
Sort benchmarks with 500 nodes. I expect to have an update on the
performance problem by tomorrow.
Here is the data on the performance degradation:
Time taken for Sort benchmark on 500 nodes:
- Oct 23: 2.4 hrs
- Oct 25: 2.3 hrs (candidate 0)
- Nov 01: 2.7 hrs (candidate 1)
NOTE: 2.7 hrs is the longest time of all the runs over the last month.
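To put the three times above in relative terms, here is a minimal Python sketch of the arithmetic. It is illustrative only and is not part of the benchmark tooling: the helper name slowdown_pct and the dictionary layout are hypothetical, and the inputs are just the hours quoted above.

```python
# Sort benchmark wall-clock times on 500 nodes, in hours, as quoted above.
runs = {
    "Oct 23": 2.4,
    "Oct 25 (candidate 0)": 2.3,
    "Nov 01 (candidate 1)": 2.7,
}


def slowdown_pct(baseline_hrs, candidate_hrs):
    """Relative slowdown of a run versus a baseline run, in percent."""
    return (candidate_hrs - baseline_hrs) / baseline_hrs * 100.0


# Compare the candidate 1 run against each of the two earlier runs.
candidate_1_hrs = runs["Nov 01 (candidate 1)"]
slowdowns = {
    label: slowdown_pct(hrs, candidate_1_hrs)
    for label, hrs in runs.items()
    if label != "Nov 01 (candidate 1)"
}

for label, pct in slowdowns.items():
    print(f"candidate 1 vs {label}: {pct:.1f}% slower")
```

With these inputs, the candidate 1 run is about 12.5% slower than the Oct 23 run and about 17.4% slower than the candidate 0 run. A single run per build says little about run-to-run variance, so these figures are only a rough indication.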
Testing done:
Unit tests (on candidate 0 and candidate 1):
- Windows
- Linux
The following benchmark jobs were run on 20, 100 and 500 nodes with
candidate 1. The benchmarks were additionally run on 900 nodes with
candidate 0.
- TestDFSIO
- MRBench (aka SmallJobs)
- NNBench
- Sort (RandomWriter, Sort, SortValidation)
Thanks,
Mukund
Doug Cutting wrote:
I've created a new candidate build for Hadoop 0.15.0.
http://people.apache.org/~cutting/hadoop-0.15.0-candidate-1/
Should we release this?
Doug