Hi All,

I recently encountered the following exception when attempting to write to our Hadoop grid.
This is from the name-node log:

2008-05-05 16:11:01,966 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310, call create(/2008-05-05/ONE-2008-05-05-14.gz, DFSClient_-164311132, false, 3, 67108864) from 10.2.15.6:42519: error: org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create file /2008-05-05/ONE-2008-05-05-14.gz for DFSClient_-164311132 on client 10.2.15.6 because current leaseholder is trying to recreate file.
org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create file /2008-05-05/ONE-2008-05-05-14.gz for DFSClient_-164311132 on client 10.2.15.6 because current leaseholder is trying to recreate file.
        at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:850)
        at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:806)
        at org.apache.hadoop.dfs.NameNode.create(NameNode.java:276)
        at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

I was wondering if anyone could offer some insight into possible causes. Only one process was using copyFromLocal on that file at the time (although five other machines were concurrently using copyFromLocal to copy differently named files into the same directory). Our grid was under very heavy load at the time; might the load have something to do with this exception? It should be noted that a subsequent copyFromLocal attempt, an hour after this incident, was successful.

Further, is there a way to specify that failed copies not leave artifacts behind? In this particular case, a zero-length file, ONE-2008-05-05-14.gz, was left on the grid.

--
Bo Shi
(207) 469-8264 (M)
25 Kingston St, 5th Fl
Boston, MA 02111 USA
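
P.S. To make the second question concrete, below is roughly the cleanup I'd rather not have to script myself before each retry. It's an untested sketch: CleanupArtifact is just a name I made up, the hardcoded path is the example file from the log above, and I'm assuming the stock org.apache.hadoop.fs.FileSystem API (exact method signatures may differ across releases).

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CleanupArtifact {
        public static void main(String[] args) throws Exception {
            // Connect to the default (HDFS) filesystem from the local config.
            FileSystem fs = FileSystem.get(new Configuration());
            // The example file from the log above.
            Path p = new Path("/2008-05-05/ONE-2008-05-05-14.gz");
            // A copy that died mid-create can leave a zero-length file
            // behind; remove it so the next copyFromLocal starts clean.
            if (fs.exists(p) && fs.getFileStatus(p).getLen() == 0) {
                fs.delete(p, false); // non-recursive delete of the empty file
            }
        }
    }

Ideally the copy itself would do something like this (or never expose the partial file at all) when it fails.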