Another critical setting to check is dfs.datanode.max.xcievers, which limits the number of threads a DataNode can use to serve blocks. The default is 256; you should bump that up to 4096 or higher.
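For reference, a minimal sketch of what that entry looks like in hdfs-site.xml on each DataNode (assuming Hadoop 1.x, where the property name is deliberately spelled "xcievers"; restart the DataNodes after changing it, and treat 4096 as just a common starting point):

  <!-- hdfs-site.xml on each DataNode -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>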
-Vijay

On Thu, Mar 1, 2012 at 7:07 PM, Abhishek Parolkar <abhis...@viki.com> wrote:
> Hi There!
> I have been doing an interesting experiment of building a Mac mini cluster
> (http://www.scribd.com/doc/76827185/Mac-Mini-Hadoop-Cluster).
> I am continuously getting "java.io.IOException: java.io.IOException: Could
> not obtain block: blk_-" errors when I run Hive queries on a large set of
> data.
>
> Queries on a small data set (10 GB) work fine, but a query on a large one
> (about 170 GB) gives that error.
> The data is stored as SEQUENCEFILE, partitioned by date & hour, each file
> about 160 MB.
>
> Here is what the JobTracker says about the map/reduce job:
> http://screencast.com/t/hQ9Y7zsaO (more detail: http://screencast.com/t/jHplMXHXuys)
>
> Searching about the issue, I found that many people face this problem
> because of:
> 1.) A block not being available on any of the data nodes: http://bit.ly/wFGgEF
> 2.) Hadoop not being able to open enough file descriptors (ulimit
> issue): http://bit.ly/wi4fg8
>
> I fixed all that and ran the query again, but no luck (my ulimit -n is 65534).
>
> My configuration:
> Hadoop version: Hadoop 1.0.0
> Platform: OS X 10.7.2 (Mac mini)
> Nodes: 3 DataNodes, 1 NameNode, 1 JobTracker, 3 TaskTrackers
> Hive version: 0.8.1
> ulimit -a of all nodes: http://pastie.org/private/ukxeuqcz31qckmn9hiqsba
> Memory per node (sysctl -n hw.memsize): 4.096 GB
> free_mem: 1.89 GB
> Output of allmemory: http://pastie.org/private/drscsrbxf6dg7t9pwoc1g
>
> FSCK of the whole external table location:
> http://pastie.org/private/ki0xxfnuaoi1xkxbkylrlw
> HDFS report: http://pastie.org/private/ahinnwty2v6exrapre65ta
>
> -v_abhi_v