https://issues.apache.org/jira/browse/HADOOP-4346 might explain this.
Raghu.
Bryan Duxbury wrote:
Ok, so, what might I do next to try and diagnose this? Does it sound
like it might be an HDFS/mapreduce bug, or should I pore over my own
code first?
Also, did any of the other exceptions look interesting?
Doug Cutting wrote:
Raghu Angadi wrote:
For the current implementation, you need around 3x fds. 1024 is too
low for Hadoop. The Hadoop requirement will come down, but 1024 would
be too low anyway.
1024 is the default on many systems. Shouldn't we try to make the default configuration work?
Bryan Duxbury wrote:
The most interesting one in my eyes is the "too many open files" one. My ulimit is 1024. How much should it be? I don't think that I have that many files open in my mappers. They should only be operating on a single file at a time. I can try to run the job again and get
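For reference, the knob being discussed is the per-process open-file limit. A minimal sketch of checking and raising it on a typical Linux box, assuming the daemons run as a dedicated "hadoop" user; 16384 is only an illustrative value, not a tuned recommendation:

  # Show the current soft limit for this shell's user
  ulimit -n

  # Raise it for the current shell (capped by the hard limit),
  # then restart the daemons from that shell
  ulimit -n 16384

  # To make it persistent, add lines like these to
  # /etc/security/limits.conf and log in again:
  #   hadoop  soft  nofile  16384
  #   hadoop  hard  nofile  16384

Going by Raghu's 3x figure, whatever number of files and connections a task or daemon touches concurrently should be multiplied by roughly three when sizing this limit.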
On Sep 29, 2008, at 10:40 AM, Raghu Angadi wrote:
The exceptions closest to the failure time would be the most interesting.
Does your failed map task open a lot of files to write? Could you please check the log of the datanode running on the machine where the map tasks failed? Do you see any error message containing "exceeds the limit of concurrent xcievers"?
Hairong
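For anyone who does find that message: the limit in question is the datanode's dfs.datanode.max.xcievers setting (the property name really is spelled that way), whose default is quite low. A minimal sketch of raising it in the datanodes' hadoop-site.xml, with 4096 purely as an illustrative value; restart the datanodes after changing it:

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
    <!-- Ceiling on concurrent DataXceiver threads (block senders
         and receivers) per datanode; requests beyond it fail with
         the "exceeds the limit of concurrent xcievers" error. -->
  </property>

Note that each xceiver thread also holds file descriptors, so this setting and the ulimit discussed above generally need to be raised together.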
Bryan Duxbury wrote:
Well, I did find some more errors in the datanode log. Here's a
sampling:
2008-09-26 10:43:57,287 ERROR org.apache.hadoop.dfs.DataNode:
DatanodeRegistration(10.100.11.115:50010,
storageID=DS-1784982905-10.100.11.115-50010-1221785192226,
infoPort=50075, ipcPort=50020):DataXceiver:
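One client-side culprit behind both the fd and xceiver symptoms is map code that opens HDFS streams and loses track of them on error paths; an unclosed stream keeps its sockets, and the datanode threads behind them, alive. A minimal, hypothetical sketch of the close-in-finally pattern (the helper name and usage are made up for illustration, not taken from Bryan's job):

  import java.io.IOException;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class SideFileWriter {
    // Write one side file and always release the stream, even when
    // write() throws; a leaked stream holds client fds and datanode
    // xceiver threads until the task JVM exits.
    public static void writeSideFile(FileSystem fs, Path path, byte[] data)
        throws IOException {
      FSDataOutputStream out = fs.create(path);
      try {
        out.write(data);
      } finally {
        out.close();
      }
    }
  }

If the mappers really do only touch one file at a time, funneling every open/close through a helper like this makes that easy to confirm from the code.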