Hi All:
 
I've got 0.17.0 set up on a 7-node grid (6 slaves running datanodes, 1 master running 
the namenode).  I'm trying to process a small (180G) dataset.  I've done this 
successfully and painlessly running 0.15.0.  When I run 0.17.0 with the same 
data and same code (recompiled with the API changes for 0.17.0, of course), I get 
a ton of failures.  I've increased the number of namenode threads trying to 
resolve this, but that doesn't seem to help.  The errors are of the following 
flavor:
 
java.io.IOException: Could not get block locations. Aborting...
java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
Exception in thread "Thread-2" java.util.ConcurrentModificationException
Exception closing file /blah/_temporary/_task_200807052311_0001_r_000004_0/baz/part-xxxxx
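For reference, here's the sort of change I mean by "increased the number of namenode threads" — bumping dfs.namenode.handler.count in hadoop-site.xml (that's the property name as of 0.17; the value below is just an example, not necessarily what you'd want):

```xml
<!-- hadoop-site.xml: give the namenode more RPC handler threads.
     dfs.namenode.handler.count defaults to 10; the value here is
     illustrative only. -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>64</value>
</property>
```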
 
As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1).  I 
am wondering if anybody can shed some light on this, or if others are having 
similar problems.  
 
Any thoughts, insights, etc. would be greatly appreciated.
 
Thanks,
C G
 
Here's an ugly trace:
08/07/06 01:43:29 INFO mapred.JobClient:  map 100% reduce 93%
08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_000003_0, Status : FAILED
java.io.IOException: Could not get block locations. Aborting...
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
task_200807052311_0001_r_000003_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_000003_0/a/b/part-00003
task_200807052311_0001_r_000003_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
task_200807052311_0001_r_000003_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
task_200807052311_0001_r_000003_0:      at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
task_200807052311_0001_r_000003_0:      at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
task_200807052311_0001_r_000003_0:      at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
08/07/06 01:44:32 INFO mapred.JobClient:  map 100% reduce 74%
08/07/06 01:44:32 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_000001_0, Status : FAILED
java.io.IOException: Could not get block locations. Aborting...
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
task_200807052311_0001_r_000001_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
task_200807052311_0001_r_000001_0:      at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
task_200807052311_0001_r_000001_0:      at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
task_200807052311_0001_r_000001_0:      at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
task_200807052311_0001_r_000001_0:      at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
task_200807052311_0001_r_000001_0:      at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
task_200807052311_0001_r_000001_0:      at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
task_200807052311_0001_r_000001_0:      at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
08/07/06 01:44:45 INFO mapred.JobClient:  map 100% reduce 54%