Here are the results I got from my 4-node cluster (correction: I said 5 earlier). One of the 4 nodes runs both the namenode and a datanode.

GENERATE RANDOM DATA
Wrote out 40GB of random binary data:
Map output records=4088301
The job took 358 seconds (approximately 6 minutes).

SORT RANDOM GENERATED DATA
Map output records=4088301
Reduce input records=4088301
The job took 2136 seconds (approximately 36 minutes).

VALIDATION OF SORTED DATA
The job took 183 seconds.
SUCCESS! Validated the MapReduce framework's 'sort' successfully.
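
For comparison with other clusters, the timings above can be turned into rough throughput figures. This is only a back-of-the-envelope sketch: it assumes "40GB" means 40 * 1024**3 bytes and that each phase handles the full data set once, neither of which is stated in the job output. The function and variable names are made up for illustration.

```python
# Rough throughput from the timings reported above.
# Assumption: "40GB" = 40 * 1024**3 bytes (the job output does not say which unit).
# Assumption: every phase processes the full data set exactly once.
DATA_BYTES = 40 * 1024**3
NODES = 4  # cluster size reported in this post


def throughput_mb_per_s(seconds, data_bytes=DATA_BYTES):
    """Aggregate cluster throughput in MB/s, with 1 MB = 1024**2 bytes."""
    return data_bytes / 1024**2 / seconds


# Wall-clock seconds per phase, copied from the results above.
timings = {"generate": 358, "sort": 2136, "validate": 183}

for phase, secs in timings.items():
    total = throughput_mb_per_s(secs)
    print(f"{phase:9s} {secs:5d} s  {total:7.1f} MB/s total  "
          f"{total / NODES:6.1f} MB/s per node")
```

Dividing by the node count gives a per-node figure that is easier to compare across clusters of different sizes, although it ignores differences in disk count and network layout.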

It would be interesting to see what performance numbers others with a similar setup have obtained.

Thanks,
Usman

I am setting up a new cluster of 10 nodes, each a 2.83GHz quad-core (2x6MB
cache) with 8GB RAM and 2x500GB drives, and will do the same soon. I have
some issues though, so it won't start up yet...

Tim


On Wed, Oct 14, 2009 at 11:36 AM, Usman Waheed <usm...@opera.com> wrote:
Thanks Tim, I will check it out and post my results for comments.
-Usman
Might it be worth running the Sort benchmark (http://wiki.apache.org/hadoop/Sort)
and posting your results for comment?

Tim


On Wed, Oct 14, 2009 at 10:48 AM, Usman Waheed <usm...@opera.com> wrote:

Hi,

Is there a way to tell what kind of performance numbers one can expect
out of a cluster given a certain set of specs?

For example, I have 5 nodes in my cluster that all have the following
hardware configuration:
Quad-core 2.0GHz, 8GB RAM, 4x1TB disks, all on the same rack.

Thanks,
Usman




