How many threads do you have? The number of active threads is very important. Normally,

#fds = (3 * #threads_blocked_on_io) + #streams

12 per stream is certainly way off.
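
For example (illustrative numbers of ours): with 20 threads blocked on I/O and 10 open streams, you would expect about

#fds = (3 * 20) + 10 = 70

fds in total, not 12 per stream.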

Raghu.

Stas Oskin wrote:
Hi.

In my case it was actually ~ 12 fd's per stream, which included pipes and
epolls.

Could it be that HDFS opens 3 x 3 (input - output - epoll) fds per thread, which
would make it close to the number I mentioned? Or is it always 3 at most per
thread / stream?
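
A quick way to check how many fds a single java.nio selector costs by itself (a minimal sketch, not Hadoop code; Linux-only, since it counts entries in /proc/self/fd):

import java.io.File;
import java.nio.channels.Selector;

public class SelectorFdDemo {
    public static void main(String[] args) throws Exception {
        int before = new File("/proc/self/fd").list().length;
        // On Linux this typically costs 3 fds: the epoll instance
        // plus the two ends of the selector's internal wakeup pipe.
        Selector sel = Selector.open();
        int after = new File("/proc/self/fd").list().length;
        System.out.println("fds used by one selector: " + (after - before));
        sel.close();
    }
}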

Up to 10 sec looks like the correct number; they do indeed seem to get freed
around that time.

Regards.

2009/6/23 Raghu Angadi <rang...@yahoo-inc.com>

To be more accurate, once you have HADOOP-4346,

fds for epoll and pipes = 3 * threads blocked on Hadoop I/O

Unless you have hundreds of threads at a time, you should not see hundreds
of these. These fds stay around for up to 10 sec even after the
threads exit.

I am a bit confused about your exact situation. Please check the number of
threads if you are still facing the problem.
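
One way to check both numbers from inside the JVM (a minimal sketch of ours; the fd count is Linux-only since it lists /proc/self/fd):

import java.io.File;
import java.lang.management.ManagementFactory;

public class FdAndThreadCheck {
    // Call this from inside the application being diagnosed.
    public static void report() {
        String[] fds = new File("/proc/self/fd").list(); // Linux-only
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        System.out.println("open fds: " + (fds == null ? -1 : fds.length)
            + ", live threads: " + threads);
    }
}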

Raghu.


Raghu Angadi wrote:

Since you have HADOOP-4346, you should not have excessive epoll/pipe fds
open. First of all, do you still have the problem? If yes, how many Hadoop
streams do you have open at a time?

System.gc() won't help if you have HADOOP-4346.

Raghu.

 Thanks for your opinion!
2009/6/22 Stas Oskin <stas.os...@gmail.com>

OK, it seems this issue is already patched in the Hadoop distro I'm using
(Cloudera).

Any idea whether I should still call GC manually/periodically to clean out
all the stale pipes / epolls?

2009/6/22 Steve Loughran <ste...@apache.org>

 Stas Oskin wrote:
 Hi.

So what would be the recommended approach for the pre-0.20.x series?

To ensure each file is used by only one thread, and then it's safe to close
the handle in that thread?

Regards.

Good question - I'm not sure. For anything you get with
FileSystem.get(),
it's now dangerous to close, so try just setting the reference to null
and
hoping that GC will do the finalize() when needed
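
A minimal sketch of that pattern (the class name and wiring are hypothetical): each thread opens and closes only its own stream, while the shared FileSystem from FileSystem.get() is never closed explicitly - the reference is simply dropped when the application is done with it.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OneThreadPerFile implements Runnable {
    private final FileSystem fs; // shared instance from FileSystem.get()
    private final Path path;     // this thread's own file

    public OneThreadPerFile(FileSystem fs, Path path) {
        this.fs = fs;
        this.path = path;
    }

    public void run() {
        FSDataInputStream in = null;
        try {
            in = fs.open(path);
            // ... read from 'in'; no other thread touches this stream ...
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close only the per-thread stream.
            if (in != null) {
                try { in.close(); } catch (IOException ignored) {}
            }
        }
        // Note: fs.close() is deliberately NOT called - the FileSystem is
        // shared. When the application is done with it, drop the reference
        // and let GC run finalize(), per the advice above.
    }
}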



