There was an issue related to hung connections (HDFS-3357), but the JIRA indicates the fix is included in hadoop-2.0.0-alpha. Still, it would be worth checking Sandy's suggestion.
On Wed, Mar 20, 2013 at 11:09 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> Hi Kishore,
>
> 50010 is the datanode port. Does your lsof output indicate that the
> sockets are in CLOSE_WAIT? I came across an issue like this where that
> was a symptom.
>
> -Sandy
>
> On Wed, Mar 20, 2013 at 4:24 AM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
>> Hi,
>>
>> I am running a date command with YARN's distributed shell example in a
>> loop of 1000 iterations, like this:
>>
>> yarn jar \
>>   /home/kbonagir/yarn/hadoop-2.0.0-alpha/share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-2.0.0-alpha.jar \
>>   org.apache.hadoop.yarn.applications.distributedshell.Client \
>>   --jar /home/kbonagir/yarn/hadoop-2.0.0-alpha/share/hadoop/mapreduce/hadoop-yarn-applications-distributedshell-2.0.0-alpha.jar \
>>   --shell_command date --num_containers 2
>>
>> Around the 730th iteration or so, the node manager's log shows an error
>> saying it failed to launch a container because there are "Too many open
>> files". When I check with lsof, I find that one connection of this kind
>> is left behind for each run of the Application Master, and the count
>> keeps growing as the loop runs:
>>
>> node1:44871->node1:50010
>>
>> Is this a known issue? Or am I missing something? Please help.
>>
>> Note: I am working on hadoop-2.0.0-alpha.
>>
>> Thanks,
>> Kishore
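To check Sandy's CLOSE_WAIT theory, a quick way is to tally TCP connection states per remote port. Below is a minimal sketch: the sample input lines stand in for real `netstat -tn` (or `ss -tn`) output, and the `node1`/port-50010 values are taken from the thread above, not verified on any cluster.

```shell
# Tally TCP states for connections to the datanode port (50010).
# In practice, replace the printf with:  netstat -tn | grep ':50010'
printf '%s\n' \
  'tcp 0 0 node1:44871 node1:50010 CLOSE_WAIT' \
  'tcp 0 0 node1:44872 node1:50010 ESTABLISHED' \
  'tcp 0 0 node1:44873 node1:50010 CLOSE_WAIT' \
  | awk '{count[$NF]++} END {for (s in count) print s, count[s]}' \
  | sort
```

If the CLOSE_WAIT count grows by one per Application Master run, that points to a client-side socket leak (the local side never calling close) rather than a datanode problem.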