I tried modifying the settings, and I'm still running into the same issue. I increased the xciever count (dfs.datanode.max.xcievers) in the hadoop-site.xml file. I also checked to make sure the file handle limit had been increased, but it was fairly high to begin with.

I don't think I'm dealing with anything out of the ordinary either. I'm processing three large log files, totaling around 5 GB, and producing around 8000 output files after some data processing, probably 6 or 7 GB in total. In the past, I've produced a lot fewer files, and that has been fine. When I change the process to output to just a few files, again no problem.
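For what it's worth, a back-of-the-envelope check (assuming roughly 6.5 GB spread across those ~8000 files, and the 64 MB default HDFS block size) suggests the average output file is well under one block, which is exactly the many-small-files pattern discussed below:

```python
# Rough arithmetic only; the 6.5 GB figure is an assumption based on
# the "6 or 7 GB" estimate above.
total_output_mb = 6.5 * 1024   # ~6-7 GB of output
num_files = 8000
default_block_mb = 64          # HDFS default block size in this era

avg_file_mb = total_output_mb / num_files
print(f"average file size: {avg_file_mb:.2f} MB")  # far below one 64 MB block
```

So each of the 8000 files occupies its own (mostly empty) block and its own namenode object, and each concurrently open file holds a writer pipeline on the datanodes.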

Anything else I should check beyond those limits? Is HDFS creating a substantial number of temp files as well?

On Feb 9, 2009, at 8:11 PM, Bryan Duxbury wrote:

Correct.

+1 to Jason's suggestion of more unix file handles. That's a must-have.
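For anyone following along, checking and raising that limit looks roughly like this (a sketch for a typical Linux box; the username and 65536 value are illustrative, and limits.conf changes need a re-login to take effect):

```shell
# Show the current per-process open-file limit for this shell
ulimit -n

# To raise it persistently, add lines like these to /etc/security/limits.conf
# (then log back in and restart the Hadoop daemons):
#   hadoop  soft  nofile  65536
#   hadoop  hard  nofile  65536
```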

-Bryan

On Feb 9, 2009, at 3:09 PM, Scott Whitecross wrote:

This would be an addition to the hadoop-site.xml file, to up dfs.datanode.max.xcievers?
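Something like the following in hadoop-site.xml on each datanode, using the 8000 value mentioned elsewhere in this thread (a sketch; restart the datanodes afterwards for it to take effect):

```xml
<!-- hadoop-site.xml (datanodes). Default is 256; 8000 is the value
     reported to work in this thread, not a universal recommendation. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8000</value>
</property>
```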

Thanks.



On Feb 9, 2009, at 5:54 PM, Bryan Duxbury wrote:

Small files are bad for Hadoop. You should avoid keeping a lot of small files if possible.

That said, that error is something I've seen a lot. It usually happens when the number of xcievers hasn't been adjusted upwards from the default of 256. We run with 8000 xcievers, and that seems to solve our problems. I think that if you have a lot of open files, this problem happens a lot faster.

-Bryan

On Feb 9, 2009, at 1:01 PM, Scott Whitecross wrote:

Hi all -

I've been running into this error the past few days:
java.io.IOException: Could not get block locations. Aborting...
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)

It seems to be related to trying to write too many files to HDFS. I have a class extending org.apache.hadoop.mapred.lib.MultipleOutputFormat, and if I output to a few file names, everything works. However, if I output to thousands of small files, the above error occurs. I'm having trouble isolating the problem, as it doesn't occur in the debugger, unfortunately.
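For context, a minimal sketch of the kind of MultipleOutputFormat subclass described here (the class name and the key-to-filename mapping are hypothetical, not from the original code; it needs the Hadoop jars on the classpath):

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Hypothetical example: route each record to a file named after its key.
// With thousands of distinct keys, this keeps thousands of HDFS writers
// open at once, which is what drives up xciever and file-handle usage.
public class KeyBasedOutputFormat extends MultipleTextOutputFormat<Text, Text> {
    @Override
    protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        // One output file per distinct key value
        return key.toString() + "/" + name;
    }
}
```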

Is this a memory issue, or is there an upper limit to the number of files HDFS can hold? Any settings to adjust?

Thanks.
