Hi, I'm running the current snapshot (-r709609), doing a simple word count in Python over streaming. I have a relatively modest setup of 17 nodes.
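
The job itself is nothing fancy; the mapper and reducer are essentially the canonical streaming word count, roughly like the following (a simplified sketch, not my exact scripts):

#!/usr/bin/env python
# mapper.py -- emit "<word> TAB 1" for every whitespace-separated token on stdin
import sys

for line in sys.stdin:
    for word in line.split():
        sys.stdout.write("%s\t1\n" % word)

#!/usr/bin/env python
# reducer.py -- sum the counts per word; streaming hands the reducer keys already sorted
import sys

current_word = None
current_count = 0
for line in sys.stdin:
    line = line.rstrip("\n")
    if not line:
        continue
    word, count = line.split("\t", 1)
    if word != current_word:
        if current_word is not None:
            sys.stdout.write("%s\t%d\n" % (current_word, current_count))
        current_word = word
        current_count = 0
    current_count += int(count)
if current_word is not None:
    sys.stdout.write("%s\t%d\n" % (current_word, current_count))
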
I'm getting this exception:

java.io.FileNotFoundException: /usr/local/hadoop/hadoop-hadoop/mapred/local/taskTracker/jobcache/job_200811041109_0003/attempt_200811041109_0003_m_000000_0/output/spill4055.out.index (Too many open files)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:137)
        at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:62)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:98)
        at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:168)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:359)
        at org.apache.hadoop.mapred.IndexRecord.readIndexFile(IndexRecord.java:47)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.getIndexInformation(MapTask.java:1339)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1237)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:857)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
        at org.apache.hadoop.mapred.Child.main(Child.java:155)

I see that AFTER I've reconfigured the maximum allowed open files to 4096! When I monitor the number of open files on a box running Hadoop, I see it fluctuating around 900 during the map phase, then going through the roof during the sort/shuffle phase. I see a lot of open files named like
/users/hadoop/hadoop-hadoop/mapred/local/taskTracker/jobcache/job_200811041109_0003/attempt_200811041109_0003_m_000000_1/output/spill2188.out

What is a poor user to do about this? Reconfigure Hadoop to allow 32K open files, as somebody suggested on an HBase forum I googled up? Or some other ridiculous number? If so, what should it be? Or is it a problem with my configuration, and there is a way to control this? Do I need to file a JIRA about this, or is it a problem people are already aware of? Because right now it looks to me like Hadoop scalability is broken: there is no way 4K descriptors should be insufficient.

Any feedback will be appreciated.

Thanks,
-Yuri

P.S. BTW, someone on this list suggested before that a similar-sounding problem goes away for a while after restarting Hadoop. That did not work for me.
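
P.P.S. In case the methodology matters: the ~900 figure above comes from counting descriptors roughly like this (a quick sketch; it assumes a Linux /proc layout and that the TaskTracker and its children run as user "hadoop", and it has to be run as root or as that same user so /proc/<pid>/fd is readable):

#!/usr/bin/env python
# fdcount.py -- rough count of file descriptors held by one user's processes
# (assumptions: Linux /proc layout; "hadoop" is the daemon user -- adjust as needed)
import os
import pwd
import sys

user = sys.argv[1] if len(sys.argv) > 1 else "hadoop"
uid = pwd.getpwnam(user).pw_uid

total = 0
for pid in os.listdir("/proc"):
    if not pid.isdigit():
        continue
    try:
        if os.stat("/proc/" + pid).st_uid != uid:
            continue
        # each entry in /proc/<pid>/fd is one open descriptor
        total += len(os.listdir("/proc/" + pid + "/fd"))
    except OSError:
        continue  # process exited, or we lack permission to look

print("%s processes are holding %d open file descriptors" % (user, total))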