On 26/06/11 20:23, Scott Carey wrote:


On 6/23/11 5:49 AM, "Steve Loughran" <ste...@apache.org> wrote:


what's your HW setup? #cores/server, #servers, underlying OS?

CentOS 5.6.
4 cores / 8 threads per server (Nehalem-generation Intel processors).


That should be enough to find problems. I've just moved up to a 6-core/12-thread desktop, and it immediately surfaced problems in some non-Hadoop code, which shows that the more threads you have, and the faster the machines are, the more your race conditions show up. With Hadoop, the fact that you can have 10-1000 servers means that in a large cluster the probability of a given race condition surfacing rises accordingly.
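For what it's worth, the classic illustration of why more hardware threads surface more races is an unsynchronized read-modify-write counter. This is just a sketch (not Hadoop code, and all the numbers are arbitrary); on a single-core box the lost updates rarely show, while on 8-12 hardware threads the final count almost never matches the expected total:

```java
// Lost-update race: counter++ is a read-modify-write with no synchronization,
// so concurrent increments can overwrite each other. More hardware threads
// means more genuine parallelism, so more lost updates.
public class RaceDemo {
    static int counter = 0;  // deliberately not volatile, not synchronized

    static int run(int threads, int iters) throws InterruptedException {
        counter = 0;
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < iters; j++) counter++;  // the race
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        int total = run(12, 100_000);
        // Expected 1,200,000; on a multi-core machine it usually comes up short.
        System.out.println(total + " of " + 1_200_000);
    }
}
```

The fix is an AtomicInteger or synchronization, of course; the point is only that the bug is probabilistic, so bigger/faster clusters find it for you.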

We also run a smaller cluster with 2x quad-core Core 2 generation Xeons.

Off topic:
The single-proc Nehalem is faster than the dual Core 2s for most use
cases, and much lower power. Looking forward to single-proc 4- or 6-core
Sandy Bridge based systems for the next expansion; in 4-core vs. 4-core
testing these come out about 30% faster than the Nehalem-generation systems
on CPU-bound tasks, at lower power. Intel prices single-socket Xeons so much
lower than the dual-socket ones that the best value for us is more single-socket
servers rather than fewer dual-socket ones (with a similar processor-to-hard-drive
ratio).

Yes, in a large cluster the price of filling the second socket can buy a lot of storage, and a TB of storage is more tangible. I guess it depends on your application.

Regarding Sandy Bridge, I've no experience of those, but I worry that 10 Gbps networking is still bleeding edge, and it shouldn't be needed for code with good locality anyway; it's probably more cost-effective to stay at 1 Gbps/server. The issue there is that with many HDDs per server, a single server failure generates a lot of replication traffic...
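A quick back-of-envelope on that failure case. All the inputs here are assumptions (12 disks of 2 TB per node, a 100-node cluster, 1 Gbps NICs, and the rough rule that each surviving node spends half its NIC reading and half writing during re-replication), but it gives a feel for how long recovery takes on 1 Gbps links:

```java
// Rough estimate of time to re-replicate the data lost when one
// datanode fails, assuming re-replication traffic spreads evenly
// across the surviving nodes' NICs.
public class ReReplication {

    static double recoveryHours(double disks, double tbPerDisk,
                                double nicGbps, int nodes) {
        double dataGB = disks * tbPerDisk * 1024;       // data on the dead node, GB
        // Each survivor both reads and writes a share, so usable aggregate
        // bandwidth is roughly (nodes * NIC) / 2, converted Gbps -> GB/s.
        double aggGBps = nodes * nicGbps / 8.0 / 2.0;
        return dataGB / aggGBps / 3600.0;
    }

    public static void main(String[] args) {
        double h = recoveryHours(12, 2.0, 1.0, 100);
        System.out.printf("~%.1f hours to re-replicate a 24 TB node " +
                          "on a 100-node, 1 Gbps cluster%n", h);
    }
}
```

So a dense (24 TB) node on a 100-node, 1 Gbps cluster takes on the order of an hour to re-protect, and the window shrinks with cluster size; on a small cluster the same math is what starts to argue for fatter pipes.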
