Lost task tracker and Could not obtain block errors
I'm running into a wall with one of my MapReduce jobs (actually it's 7 jobs chained together). I get to the 5th MR job, which takes as input the output of the 3rd MR job, and right off the bat I start getting Lost task tracker and Could not obtain block... errors. Eventually I get enough of these errors that Hadoop kills my tasks and fails the job altogether. I'm running a 5-node Hadoop cluster on EC2. The input to the 5th MR job is ~400 MB (10 part-* files, each ~40 MB), so it's not really that big, and I see the same failures no matter how big an HDFS cluster I create (5-15 nodes). I'm not sure how to proceed in troubleshooting the issue. Any help would be greatly appreciated. -- Thanks, John C
Job fails with Could not obtain block errors
I have a MR job that repeatedly fails during a join operation in the Mapper with java.io.IOException: Could not obtain block errors. I'm running on EC2, on a 12-node cluster provisioned by Whirr. Oddly enough, on a 5-node cluster the MR job runs through without any problems. The repeated exception the tasks report in the web UI for this job is:

java.io.IOException: Could not obtain block: blk_8346145198855916212_1340 file=/user/someuser/output_6_doc_tf_and_u/part-2
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1993)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1800)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.sync(SequenceFile.java:2186)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.init(SequenceFileRecordReader.java:48)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
    at org.apache.hadoop.mapred.lib.DelegatingInputFormat.getRecordReader(DelegatingInputFormat.java:124)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:370)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)

When I look at the task log details for this failed job, it shows that the DFSClient failed to connect to a datanode that held a replicated copy of this block, and added the datanode's IP address to the list of deadNodes (exception shown below).

11:25:19,204 INFO DFSClient:1835 - Failed to connect to /10.114.123.82:50010, add to deadNodes and continue
java.io.IOException: Got error in response to OP_READ_BLOCK self=/10.202.163.95:43022, remote=/10.114.123.82:50010 for file /user/someuser/output_6_doc_tf_and_u/part-2 for block 5843350240062345818_1332
    at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1487)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1465)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1437)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1424)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1419)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.init(SequenceFileRecordReader.java:43)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
    at org.apache.hadoop.mapred.lib.DelegatingInputFormat.getRecordReader(DelegatingInputFormat.java:124)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:370)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)

It then goes on to try the other two datanodes that hold replicas of this block; each throws the same exception and each is added to the list of dead nodes, at which point the task fails.

This cycle of failures happens multiple times during the job, against several different blocks. I then looked in the namenode's log to see what is going on with the datanodes that are getting added to the list of deadNodes, and found them associated with the following error:

2011-07-13 05:33:55,161 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 10.83.109.118:50010

Looking through the rest of the namenode log I count 36 different entries for lost heartbeats. Is this a common error? The odd thing is that after the job fails, HDFS seems to recover by itself, bringing these nodes back online and re-replicating the files across the nodes again. So when I browse HDFS and look for one of the files that was causing the previous failures, it's showing up in the correct directory, with its replication set
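To see whether the lost heartbeats come from one flapping datanode or from the whole cluster under load, a quick pipeline over the namenode log helps. This is a sketch: the log excerpt below is fabricated in the same format as the NameSystem.heartbeatCheck line above, and the real log lives under $HADOOP_LOG_DIR on the namenode.

```shell
# Fabricated namenode log excerpt; in practice feed the pipeline with
#   grep 'lost heartbeat' "$HADOOP_LOG_DIR"/hadoop-*-namenode-*.log
log='2011-07-13 05:33:55,161 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 10.83.109.118:50010
2011-07-13 05:35:02,010 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 10.114.123.82:50010
2011-07-13 05:41:17,644 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost heartbeat from 10.83.109.118:50010'

# Count lost-heartbeat events per datanode, most affected node first.
# The datanode host:port is the last whitespace-separated field.
printf '%s\n' "$log" \
  | grep 'lost heartbeat' \
  | awk '{print $NF}' \
  | sort | uniq -c | sort -rn
```

If every node in the cluster shows up here roughly equally (as the 36 entries across a 12-node cluster suggest), that points at cluster-wide overload rather than one bad machine.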
Re: Could not obtain block
[moving to common-user, since this spans both MR and HDFS - probably easier than cross-posting] Can you check the DN logs for exceeds the limit of concurrent xcievers? You may need to bump the dfs.datanode.max.xcievers parameter in hdfs-site.xml, and possibly the nfiles ulimit as well. -Todd

On Wed, Mar 9, 2011 at 3:27 AM, Evert Lammerts evert.lamme...@sara.nl wrote:
We see a lot of IOExceptions coming from HDFS during a job that does nothing but untar 100 files (1 per Mapper, sizes vary between 5GB and 80GB) that are in HDFS, to HDFS. The DataNodes are also showing exceptions that I think are related (see stack traces below). This job should not be able to overload the system, I think... I realize that much data needs to go over the lines, but HDFS should still be responsive. Any ideas / help is much appreciated! Some details:
* Hadoop 0.20.2 (CDH3b4)
* 5-node cluster plus 1 node for JT/NN (Sun Thumpers)
* 4 cores/node, 4GB RAM/core
* CentOS 5.5

Job output:
java.io.IOException: java.io.IOException: Could not obtain block: blk_-3695352030358969086_130839 file=/user/emeij/icwsm-data-test/01-26-SOCIAL_MEDIA.tar.gz
    at ilps.DownloadICWSM$UntarMapper.map(DownloadICWSM.java:449)
    at ilps.DownloadICWSM$UntarMapper.map(DownloadICWSM.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
Caused by: java.io.IOException: Could not obtain block: blk_-3695352030358969086_130839 file=/user/emeij/icwsm-data-test/01-26-SOCIAL_MEDIA.tar.gz
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1977)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1784)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1932)
    at java.io.DataInputStream.read(DataInputStream.java:83)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:74)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:335)
    at ilps.DownloadICWSM$CopyThread.run(DownloadICWSM.java:149)

Example DataNode exceptions (note that these come from the node at 192.168.28.211):
2011-03-08 19:40:40,297 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-9222067946733189014_3798233 java.io.EOFException: while trying to read 3067064 bytes
2011-03-08 19:40:41,018 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.28.211:50050, dest: /192.168.28.211:49748, bytes: 0, op: HDFS_READ, cliID: DFSClient_attempt_201103071120_0030_m_32_0, offset: 3072, srvID: DS-568746059-145.100.2.180-50050-1291128670510, blockid: blk_3596618013242149887_4060598, duration: 2632000
2011-03-08 19:40:41,049 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-9221028436071074510_2325937 java.io.EOFException: while trying to read 2206400 bytes
2011-03-08 19:40:41,348 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-9221549395563181322_4024529 java.io.EOFException: while trying to read 3037288 bytes
2011-03-08 19:40:41,357 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-9221885906633018147_3895876 java.io.EOFException: while trying to read 1981952 bytes
2011-03-08 19:40:41,434 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block blk_-9221885906633018147_3895876 unfinalized and removed.
2011-03-08 19:40:41,434 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-9221885906633018147_3895876 received exception java.io.EOFException: while trying to read 1981952 bytes
2011-03-08 19:40:41,434 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.28.211:50050, storageID=DS-568746059-145.100.2.180-50050-1291128670510, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 1981952 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:270)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:357)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:378)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:534
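Todd's suggestion above amounts to adding something like the following to hdfs-site.xml on every datanode and restarting the datanodes. This is a sketch: the value 4096 is a commonly used setting rather than one prescribed in the thread, and note that the parameter name really is spelled "xcievers" in this Hadoop generation.

```xml
<!-- hdfs-site.xml on each datanode (datanode restart required).
     The default of 256 concurrent DataXceiver threads is easily
     exhausted by jobs that hold many block streams open at once. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```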
Re: Could not obtain block
Maybe a datanode is down in the cluster; check the datanode logs of the nodes in the cluster.

On Thu, Jan 20, 2011 at 3:43 PM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote:
Hi, I run the wordcount example on my Hadoop cluster and get a Could not obtain block exception. Does anyone know what the problem is? If I run this program locally it works fine.
Could not obtain block
Hi, I run the wordcount example on my Hadoop cluster and get a Could not obtain block exception. Does anyone know what the problem is? If I run this program locally it works fine. I do this:

[root@master bin]# ./hadoop jar ../hadoop-0.20.2-examples.jar wordcount point/start-all.sh s/start-all.sh
11/01/20 11:57:56 INFO input.FileInputFormat: Total input paths to process : 1
11/01/20 11:57:57 INFO mapred.JobClient: Running job: job_201101201036_0002
11/01/20 11:57:58 INFO mapred.JobClient: map 0% reduce 0%
11/01/20 11:58:16 INFO mapred.JobClient: Task Id : attempt_201101201036_0002_m_00_0, Status : FAILED
java.io.IOException: Could not obtain block: blk_7716960257524845873_1708 file=/user/root/point/start-all.sh
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767)
    at java.io.DataInputStream.read(DataInputStream.java:83)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:97)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
11/01/20 11:58:33 INFO mapred.JobClient: Task Id : attempt_201101201036_0002_m_00_1, Status : FAILED
java.io.IOException: Could not obtain block: blk_7716960257524845873_1708 file=/user/root/point/start-all.sh (same stack trace as above)
11/01/20 11:58:48 INFO mapred.JobClient: Task Id : attempt_201101201036_0002_m_00_2, Status : FAILED
java.io.IOException: Could not obtain block: blk_7716960257524845873_1708 file=/user/root/point/start-all.sh (same stack trace as above)
11/01/20 11:59:06 INFO mapred.JobClient: Job complete: job_201101201036_0002
11/01/20 11:59:06 INFO mapred.JobClient: Counters: 2
11/01/20 11:59:06 INFO mapred.JobClient: Job Counters
11/01/20 11:59:06 INFO mapred.JobClient: Launched map tasks=4
11/01/20 11:59:06 INFO mapred.JobClient: Failed map tasks=1

Regards Musa Cavus
Help: Could not obtain block: blk_ Exception
Hi All, I am getting Could not obtain block: blk_2706642997966533027_4482 file=/user/outputwc425729652_0/part-r-0. I checked, and the file is actually there. What should I do? Please help.

Could not obtain block: blk_2706642997966533027_4482 file=/user/outputwc425729652_0/part-r-0
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)
    at speeditup.ClusterByWordCountFSDriver$ClusterBasedOnWordCountMapper.map(ClusterByWordCountFSDriver.java:157)
    at speeditup.ClusterByWordCountFSDriver$ClusterBasedOnWordCountMapper.map(ClusterByWordCountFSDriver.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
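A file showing up in a directory listing only proves the namenode has the metadata; the block replicas can still be missing or unreachable on the datanodes. One way to tell the difference (a sketch; the path is taken from the error above) is fsck with per-block detail:

```
hadoop fsck /user/outputwc425729652_0/part-r-0 -files -blocks -locations
```

If a block is reported with fewer locations than the replication factor, or the report shows MISSING or CORRUPT blocks, the datanodes holding the replicas are down or overloaded rather than the file itself being gone.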
MapFiles error Could not obtain block
Hi, I'm using the MapFileOutputFormat to look up values in MapFiles and keep getting Could not obtain block errors. I'm thinking it might be because ulimit is not set high enough. Has anyone else run into this issue?

attempt_201011180019_0005_m_03_0: Caught exception while getting cached files: java.io.IOException: Could not obtain block: blk_-7027776556206952935_61338 file=/mydata/part-r-0/data
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1976)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1783)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1931)
attempt_201011180019_0005_m_03_0: at java.io.DataInputStream.readFully(DataInputStream.java:178)
attempt_201011180019_0005_m_03_0: at java.io.DataInputStream.readFully(DataInputStream.java:152)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1435)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1424)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1419)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.MapFile$Reader.createDataFileReader(MapFile.java:302)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:284)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.MapFile$Reader.init(MapFile.java:273)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.MapFile$Reader.init(MapFile.java:260)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.io.MapFile$Reader.init(MapFile.java:253)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:639)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
attempt_201011180019_0005_m_03_0: at java.security.AccessController.doPrivileged(Native Method)
attempt_201011180019_0005_m_03_0: at javax.security.auth.Subject.doAs(Subject.java:396)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
attempt_201011180019_0005_m_03_0: at org.apache.hadoop.mapred.Child.main(Child.java:211)

-Kim
Re: MapFiles error Could not obtain block
Hi Kim, I saw this problem once; it turned out the block was getting deleted before it was read. Check the namenode for blk_-7027776556206952935_61338. What's the story there? Jeff

On Thu, Nov 18, 2010 at 12:45 PM, Kim Vogt k...@simplegeo.com wrote:
Hi, I'm using the MapFileOutputFormat to look up values in MapFiles and keep getting Could not obtain block errors. I'm thinking it might be because ulimit is not set high enough. Has anyone else run into this issue?
Re: MapFiles error Could not obtain block
Hey Jeff, I'm not deleting any blocks, and hadoop fsck reports all the blocks as present and healthy :-/ -Kim

On Thu, Nov 18, 2010 at 4:14 PM, Jeff Bean jwfb...@cloudera.com wrote:
Hi Kim, I saw this problem once; it turned out the block was getting deleted before it was read. Check the namenode for blk_-7027776556206952935_61338. What's the story there? Jeff
Re: Could not obtain block
Increased the ulimit to 64000 ... same problem. stop/start-all ... same problem, but on a different block, which of course is present, so it looks like there is nothing wrong with the actual data in HDFS. I use the Nutch default Hadoop 0.19.x; anything related?

2010/1/30 Ken Goodhope kengoodh...@gmail.com:
Could not obtain block errors are often caused by running out of available file handles. You can confirm this by going to the shell and entering ulimit -n. If it says 1024, the default, then you will want to increase it to about 64,000.

-- -MilleBii-
Re: Could not obtain block
Ken, FIXED!!! Thanks so much. A command-prompt ulimit wasn't enough; you need to hard-set it and reboot, as explained here: http://posidev.com/blog/2009/06/04/set-ulimit-parameters-on-ubuntu/

2010/1/30 MilleBii mille...@gmail.com:
> Increased the ulimit to 64000 ... same problem. stop/start-all ... same problem, but on a different block, which of course is present, so it looks like there is nothing wrong with the actual data in HDFS. I use the Nutch default Hadoop 0.19.x; anything related?

2010/1/30 Ken Goodhope kengoodh...@gmail.com:
> "Could not obtain block" errors are often caused by running out of available file handles. You can confirm this by going to the shell and entering ulimit -n. If it says 1024, the default, then you will want to increase it to about 64,000.

--
-MilleBii-
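The reason the shell-level fix above wasn't enough is that `ulimit -n` typed in a terminal only affects processes launched from that session; a daemon started elsewhere can be running with a different limit. A minimal sketch of checking the limit a process actually sees (for a running Hadoop daemon you would instead inspect its own environment, e.g. /proc/&lt;pid&gt;/limits on Linux):

```python
# Print the open-file limit the *current* process sees; this is the value
# that matters, not what `ulimit -n` reports in some unrelated shell.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d hard=%d" % (soft, hard))
if soft <= 1024:
    print("soft limit is at the common 1024 default -- likely too low for HDFS")
```

If the soft limit printed here is still 1024 after editing limits.conf, the new value has not reached the process, which is exactly the situation MilleBii hit.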
Re: Could not obtain block
X-POST with the Nutch mailing list. HELP!!! I'm kind of stuck on this one. I backed up my HDFS data, reformatted HDFS, put the data back, and tried to merge my segments together, and it explodes again:

Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException: Could not obtain block: blk_4670839132945043210_1585 file=/user/nutch/crawl/indexed-segments/20100113003609/part-0/_ym.frq
at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)

If I go into the hdfs/data directory I DO find the faulty block. Could it be a synchronization problem in the segment-merger code?

2010/1/29 MilleBii mille...@gmail.com:
> I'm looking for some help. I'm a Nutch user; everything was working fine, but now I get the following error when indexing. I have a single-node pseudo-distributed setup. Some people on the Nutch list suggested my disk could be full, so I removed many things, and HDFS is far from full. This file directory was perfectly OK the day before. I did a hadoop fsck ... the report says healthy. What can I do? Is it safe to do a Linux fsck just in case?
>
> Caused by: java.io.IOException: Could not obtain block: blk_8851198258748412820_9031 file=/user/nutch/crawl/indexed-segments/20100111233601/part-0/_103.frq

--
-MilleBii-
Re: Could not obtain block
"Could not obtain block" errors are often caused by running out of available file handles. You can confirm this by going to the shell and entering ulimit -n. If it says 1024, the default, then you will want to increase it to about 64,000.

--
Ken Goodhope
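One way to confirm Ken's file-handle hypothesis for a specific process is to count the descriptors it currently holds and compare against the soft limit. A minimal Linux-only sketch using the /proc filesystem (pass a datanode's pid to inspect it, permissions allowing):

```python
# Count open file descriptors for a process via /proc (Linux only).
import os

def open_fd_count(pid="self"):
    """Return the number of file descriptors currently open for `pid`."""
    return len(os.listdir("/proc/%s/fd" % pid))

# A process whose count hovers near its soft limit (1024 by default)
# is a good candidate for "Could not obtain block" failures.
```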
java.io.IOException: Could not obtain block:
Hello everyone, I am getting the error java.io.IOException: Could not obtain block: when running on my new cluster. When I ran the same job on a single node it worked perfectly; I then added a second node and started receiving this error. I was running the grep sample job. I am running Hadoop 0.19.2 because of a dependency on Nutch (even though this was not a Nutch job). I am not running HBase, and the Java version is OpenJDK 1.6.0. Does anybody have any ideas? Thanks in advance, -John
Re: java.io.IOException: Could not obtain block:
I've not encountered an error like this, but here are some suggestions:

1. Make sure your two-node cluster is set up correctly. Querying the web interface, using any of the included dfs utilities (e.g. hadoop dfs -ls), or looking in your log directory may yield more useful stack traces or errors.

2. Open up the source and look at the code around the stack trace. This sucks, but Hadoop is actually pretty easy to browse in Eclipse, and most classes are kept to a reasonable number of lines of code and are fairly readable.

3. Rip out the parts of Nutch you need, drop them into your project, and forget about 0.19. This isn't ideal, but remember that this whole ecosystem is still forming, and sometimes it makes sense to transplant the 2-3 classes you need into your project rather than depending on a project you otherwise don't use.
Re: java.io.IOException: Could not obtain block:
Edmund, thanks for the advice. It turns out it was the firewall running on the second cluster node, so I stopped it and all is working correctly. Now that the second node is working the way it is supposed to, I'll probably bring another couple of nodes online. Wish me luck :) -John
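The firewall failure mode John hit can be checked directly. HDFS metadata operations (like ls) only talk to the namenode, but block reads open a separate connection to each datanode's data-transfer port (50010 by default in Hadoop of this era), so a host firewall blocking that port yields "Could not obtain block" while directory listings still work. A minimal TCP reachability sketch (the example hostname is hypothetical):

```python
# TCP reachability check for a datanode's data-transfer port.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hostname is hypothetical):
# port_open("datanode-1.compute-1.internal", 50010)
```

Running this from the node that reports the error, against each datanode, quickly distinguishes a firewall problem from a dead datanode.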
Question about DFSClient Could not obtain block errors
I am trying to read data stored in HDFS on one EC2 cluster from a different EC2 cluster and am getting the errors below. Both EC2 clusters are running v0.19. When I run 'hadoop fs -get small-page-index small-page-index' on the source cluster, everything works fine and the data is properly retrieved from HDFS. FWIW, hadoop fs -ls works fine across clusters. Any ideas what might be the problem and how to remedy it?

thanks,
Scott

Here are the errors I am getting:

[r...@domu-12-31-38-00-4e-32 ~]# hadoop fs -cp hdfs://domU-12-31-38-00-1C-B1.compute-1.internal:50001/user/root/small-page-index small-page-index
09/09/14 21:48:43 INFO hdfs.DFSClient: Could not obtain block blk_-4157273618194597760_1160 from any node: java.io.IOException: No live nodes contain current block
09/09/14 21:51:46 INFO hdfs.DFSClient: Could not obtain block blk_-4157273618194597760_1160 from any node: java.io.IOException: No live nodes contain current block
09/09/14 21:54:49 INFO hdfs.DFSClient: Could not obtain block blk_-4157273618194597760_1160 from any node: java.io.IOException: No live nodes contain current block
Exception closing file /user/root/small-page-index/aIndex/_0.cfs
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198)
at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)
at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:243)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
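For intuition, the behavior visible in these logs — the client tries each replica location in turn, adds failures to a deadNodes list, and finally gives up with "Could not obtain block ... from any node" — follows a simple retry pattern. The sketch below is a simplified illustration of that pattern, not the actual Hadoop DFSClient code:

```python
# Simplified sketch of the replica-selection/retry loop behind
# "Could not obtain block ... from any node" (illustration only).
class CouldNotObtainBlock(IOError):
    pass

def choose_data_node(locations, dead_nodes):
    """Pick the first replica holder not already marked dead."""
    for node in locations:
        if node not in dead_nodes:
            return node
    return None

def read_block(block_id, locations, connect):
    """Try each replica; mark failures dead; give up when none are left."""
    dead_nodes = set()
    while True:
        node = choose_data_node(locations, dead_nodes)
        if node is None:
            # Every replica was unreachable (dead datanode, firewall,
            # exhausted file handles, unroutable addresses, ...).
            raise CouldNotObtainBlock(
                "Could not obtain block: %s from any node" % block_id)
        try:
            return connect(node)
        except IOError:
            dead_nodes.add(node)  # skip this replica from now on
```

In the cross-cluster case above, one plausible explanation is that every connect attempt fails because the datanodes advertise EC2-internal addresses that are not reachable from the other cluster, so the loop exhausts all replicas and raises, while namenode-only operations like fs -ls still succeed.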