Re: losing network interfaces during long running map-reduce jobs

David Howell Mon, 05 Apr 2010 12:27:48 -0700

> But I haven't seen anything in the dmesg log. I'll have to try looking
> at the tcpdump output on Monday, once I can get console access again.
> My apologies that I'm so sketchy on details right now... so far, I
> haven't been able to find any evidence of something going wrong
> except for the hadoop log entries when the IOExceptions start.
>
> Thanks,
> -David
>


I just lost my networking again. This time, I had switched my cluster
back to the build I was using before I switched to CDH2.

It's Hadoop 0.20.1 with these patches applied (for Dumbo):

HADOOP-1722-v0.20.1
HADOOP-5450
MAPREDUCE-764
HADOOP-5528

Now I'm wondering if something about my job is the culprit. I have 2
nodes, both 8-core machines.
mapred.tasktracker.map|reduce.tasks.maximum are both set to 7.
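
For a rough sense of scale, here is a back-of-the-envelope sketch of what
those settings imply. The numbers come from this message; the function and
variable names are just for illustration, and nothing here establishes that
slot oversubscription is related to the network loss.

```python
import math

def slot_summary(nodes, cores_per_node, map_slots, reduce_slots, map_tasks):
    """Summarize task-slot arithmetic for a small cluster (illustrative only)."""
    # Worst case per node: every map slot and every reduce slot busy at once.
    peak_tasks_per_node = map_slots + reduce_slots
    cluster_map_slots = nodes * map_slots
    return {
        "peak_tasks_per_node": peak_tasks_per_node,
        "tasks_per_core_at_peak": peak_tasks_per_node / cores_per_node,
        "cluster_map_slots": cluster_map_slots,
        # Number of "waves" needed to run all map tasks through the map slots.
        "map_waves": math.ceil(map_tasks / cluster_map_slots),
    }

summary = slot_summary(
    nodes=2,
    cores_per_node=8,
    map_slots=7,      # mapred.tasktracker.map.tasks.maximum
    reduce_slots=7,   # mapred.tasktracker.reduce.tasks.maximum
    map_tasks=1400,   # approximate ("~1400 maps")
)
print(summary)
```

So whenever map and reduce tasks overlap, each 8-core node can be running up
to 14 task processes at once, and the map phase takes about 100 waves.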

The job I'm running is combining lots of gzipped Apache log files into
sequence files for later analysis... I'm going from one file per
virtual host per server per day to one file per virtual host per day. The
last attempt had ~1400 maps/10 reduces.
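
To illustrate the regrouping (this is not the code from the gist, only a
minimal sketch), assume a hypothetical input layout of
"server/vhost/YYYY-MM-DD.log.gz". Dropping the server component from the key
is what merges the per-server files into one group per virtual host per day:

```python
from collections import defaultdict

def output_key(path):
    """Map a hypothetical path 'server/vhost/YYYY-MM-DD.log.gz' to (vhost, day)."""
    server, vhost, filename = path.split("/")
    day = filename.split(".")[0]  # strip the '.log.gz' suffix
    return (vhost, day)           # the server is deliberately dropped

def group_inputs(paths):
    """Group input files by their output key, as the shuffle would group records."""
    groups = defaultdict(list)
    for path in paths:
        groups[output_key(path)].append(path)
    return dict(groups)

# Made-up example paths, not real data from the cluster.
example_paths = [
    "web1/example.com/2010-04-01.log.gz",
    "web2/example.com/2010-04-01.log.gz",
    "web1/example.org/2010-04-01.log.gz",
]
grouped = group_inputs(example_paths)
for key, files in sorted(grouped.items()):
    print(key, len(files))
```

In the real job each output group would become one sequence file; here the
grouping is only simulated in memory.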

Is this job some kind of map-reduce anti-pattern that's causing problems?

Here's the source for the mapper and reducer:
http://gist.github.com/356750

Cheers,
David
