On Tue, Apr 13, 2010 at 4:13 AM, stephen mulcahy
<stephen.mulc...@deri.org> wrote:

> Todd Lipcon wrote:
>
>> Most likely a kernel bug. In previous versions of Debian there was a buggy
>> forcedeth driver, for example, that caused it to drop off the network in
>> high load. Who knows what new bug is in 2.6.32 which is brand spanking
>> new.
>>
>
> Yes, it looks like it is a kernel bug alright (see thread on kernel netdev
> at http://marc.info/?t=127094288900001&r=1&w=2 if interested). To be fair,
> I don't think these bugs are confined to Debian - I did some initial testing
> with Scientific Linux and also ran into problems with forcedeth.


Interesting, good find. I try to avoid forcedeth now and have heard the same
from ops people at various large Linux deployments. Not sure why, but it's
traditionally had a lot of bugs/regressions.


> Sure, but I figured I'd go with a distro now that can be largely left
> untouched for the next 2-3 years and Debian lenny felt that bit old for
> that. I know RHEL/CentOS would fit that requirement also, will see. I'm also
> interested in using DRBD in some of our nodes for redundancy, again, running
> with a newer distro should reduce the pain of configuring that.
>
> Finally, I figured burning in our cluster was a good opportunity to give
> back to the community and do some testing on their behalf.
>

Very admirable of you :) It is good to have some people running new kernels
to suss these issues out before the rest of us check out modern technology
;-)


>
> With regard to our TeraSort benchmark time of ~23 minutes - is that in the
> right ballpark for a cluster of 45 data nodes and a nn and 2nn?
>
>
Yep, sounds about the right ballpark.

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera
