On Aug 21, 2011, at 7:17 PM, Michel Segel wrote:

> Avi,
> First, why a 32-bit OS?
> You have a 64-bit processor with 4 cores; hyper-threaded, it looks like 8 CPUs.

With only 1.7 GB of memory, there likely isn't much of a reason to use a 64-bit OS. The machines (as you point out) are already tight on memory. 64-bit is only going to make it worse.

>>
>> 1.7 GB memory
>> 1 Intel(R) Xeon(R) CPU E5507 @ 2.27GHz
>> Ubuntu Server 10.10, 32-bit platform
>> Cloudera CDH3 Manual Hadoop Installation
>> (for the ones who are familiar with Amazon Web Services, I am talking about
>> Small EC2 Instances/Servers)
>>
>> Total job run time is +-15 minutes (+-50 files/blocks/mapTasks of up to 250
>> MB and 10 reduce tasks).
>>
>> Based on the above information, can anyone recommend a best-practice
>> configuration?

How many spindles? Are your tasks spilling?

>> Do you think that, when dealing with such a small cluster and
>> processing such a small amount of data,
>> it is even possible to optimize jobs so they would run much faster?

Most of the time, performance issues are with the algorithm, not Hadoop.
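
On the spilling question, one rough way to check is to compare the job's map-side "Spilled Records" counter against "Map output records". A ratio near 1 means each map output record was written to disk about once; a noticeably higher ratio suggests the sort buffer (io.sort.mb) is too small for the map output and records are being re-merged on disk. With a combiner the ratio can fall below 1. The sketch below is only an illustration in Python: the function names are hypothetical, the 1.05 tolerance is an arbitrary assumption, and the numbers are made up rather than taken from this job.

```python
# Hypothetical helpers for reading a job's counters by hand.
# The inputs correspond to the "Map output records" and map-side
# "Spilled Records" counters; nothing here talks to Hadoop itself.


def spill_ratio(map_output_records: int, map_spilled_records: int) -> float:
    """Average number of times each map output record was written to disk."""
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return map_spilled_records / map_output_records


def is_spilling_repeatedly(
    map_output_records: int,
    map_spilled_records: int,
    tolerance: float = 1.05,  # assumed threshold, not a Hadoop setting
) -> bool:
    """True when records hit disk noticeably more than once on the map side."""
    return spill_ratio(map_output_records, map_spilled_records) > tolerance


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    print(spill_ratio(1_000_000, 1_000_000))
    print(is_spilling_repeatedly(1_000_000, 2_600_000))
```

If the check comes back true, the options on a 1.7 GB machine are limited: raising io.sort.mb takes memory from the task heap, so emitting less map output (for example with a combiner) is usually the first thing to try.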