Just to be clear, I second Brian's opinion. Relying on finalizers is a very good way to run out of file descriptors.
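The fix is simply to close streams yourself in a finally block, i.e. something like this (a sketch only - the path is made up and exception handling is elided):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    FileSystem fs = FileSystem.get(new Configuration());
    FSDataInputStream in = null;
    try {
        in = fs.open(new Path("/some/file"));   // illustrative path
        // ... read from the stream ...
    } finally {
        // Release the descriptors deterministically instead of waiting
        // for finalize(); IOUtils.closeStream() is null-safe and
        // swallows close errors.
        IOUtils.closeStream(in);
    }

The same pattern applies to FSDataOutputStream on the write side - every stream you leave for the finalizer holds its descriptors until some future GC.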
On Sun, Jun 21, 2009 at 9:32 AM, <brian.lev...@nokia.com> wrote:
> IMHO, you should never rely on finalizers to release scarce resources, since
> you don't know when the finalizer will get called, if ever.
>
> -brian
>
> -----Original Message-----
> From: ext jason hadoop [mailto:jason.had...@gmail.com]
> Sent: Sunday, June 21, 2009 11:19 AM
> To: core-user@hadoop.apache.org
> Subject: Re: "Too many open files" error, which gets resolved after some time
>
> The HDFS/DFS client uses quite a few file descriptors for each open file.
>
> Many application developers (but not the hadoop core) rely on the JVM
> finalizer methods to close open files.
>
> This combination, especially when many HDFS files are open, can result in
> very large demands for file descriptors from Hadoop clients.
> As a general rule we never run a cluster with nofile less than 64k, and for
> larger clusters with demanding applications we have had it set 10x higher. I
> also believe there was a set of JVM versions that leaked file descriptors
> used for NIO in the HDFS core; I do not recall the exact details.
>
> On Sun, Jun 21, 2009 at 5:27 AM, Stas Oskin <stas.os...@gmail.com> wrote:
>
> > Hi.
> >
> > After some more tracing with the lsof utility, I managed to stop the
> > growth on the DataNode process, but I still have issues with my DFS client.
> >
> > It seems that my DFS client opens hundreds of pipes and eventpolls. Here is
> > a small part of the lsof output:
> >
> > java 10508 root 387w FIFO 0,6        6142565 pipe
> > java 10508 root 388r FIFO 0,6        6142565 pipe
> > java 10508 root 389u 0000 0,10     0 6142566 eventpoll
> > java 10508 root 390u FIFO 0,6        6135311 pipe
> > java 10508 root 391r FIFO 0,6        6135311 pipe
> > java 10508 root 392u 0000 0,10     0 6135312 eventpoll
> > java 10508 root 393r FIFO 0,6        6148234 pipe
> > java 10508 root 394w FIFO 0,6        6142570 pipe
> > java 10508 root 395r FIFO 0,6        6135857 pipe
> > java 10508 root 396r FIFO 0,6        6142570 pipe
> > java 10508 root 397r 0000 0,10     0 6142571 eventpoll
> > java 10508 root 398u FIFO 0,6        6135319 pipe
> > java 10508 root 399w FIFO 0,6        6135319 pipe
> >
> > I'm using FSDataInputStream and FSDataOutputStream, so this might be
> > related to the pipes?
> >
> > So, my questions are:
> >
> > 1) What causes these pipes/epolls to appear?
> >
> > 2) More importantly, how can I prevent their accumulation and growth?
> >
> > Thanks in advance!
> >
> > 2009/6/21 Stas Oskin <stas.os...@gmail.com>
> >
> > > Hi.
> > >
> > > I have an HDFS client and an HDFS datanode running on the same machine.
> > >
> > > When I try to access a dozen files at once from the client, several
> > > times in a row, I start to receive the following errors on the client
> > > and in the HDFS browse function.
> > >
> > > HDFS client: "Could not get block locations. Aborting..."
> > > HDFS browse: "Too many open files"
> > >
> > > I can increase the maximum number of files that can be opened, as I have
> > > it set to the default 1024, but I would like to solve the underlying
> > > problem first, as a larger value just means it would run out of files
> > > again later on.
> > >
> > > So my questions are:
> > >
> > > 1) Does the HDFS datanode keep any files open, even after the HDFS
> > > client has already closed them?
> > >
> > > 2) Is it possible to find out who keeps the files open - datanode or
> > > client (so I can pinpoint the source of the problem)?
> > >
> > > Thanks in advance!
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
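P.S. On the nofile settings mentioned above: on Linux these are usually raised
in /etc/security/limits.conf for the account that runs the Hadoop daemons. The
user name and values below are illustrative only - size them for your cluster:

    # /etc/security/limits.conf
    hadoop  soft  nofile  65536
    hadoop  hard  nofile  65536

Log out and back in, then verify the new limit with "ulimit -n".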