Hi All,

I am running a process that extracts feature vectors from images and writes
them as SequenceFiles on HDFS. My dataset is large (~46K images).

The writing worked fine for about half of the run, but then the following
problem suddenly occurred:
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
create file /mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817 for
DFSClient_148861898 on client 10.118.177.84 because current leaseholder is
trying to recreate file.

On investigating, I found that the error started appearing after a
LeaseExpiredException:
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817 File is not open
for writing. [Lease.  Holder: DFSClient_148861898, pendingcreates: 1]

The job has already been running for 18-19 hours, so restarting it from
scratch would be very costly.
Is there anything that can be done to fix this at runtime? (Maybe
force-deleting the affected file
'/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817' on HDFS?)

Regards
Lokendra

*Detailed Log:*

2011-05-25 04:03:32,160 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 54310, call
addBlock(/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817,
DFSClient_148861898) from 10.118.177.84:48372: error:
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817 File is not open
for writing. [Lease.  Holder: DFSClient_148861898, pendingcreates: 1]
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817 File is not open
for writing. [Lease.  Holder: DFSClient_148861898, pendingcreates: 1]
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1340)
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-05-25 04:03:32,175 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_-4965605132591592561 is added to invalidSet
of 10.118.177.84:50010
2011-05-25 04:03:32,207 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=johndoe,johndoe    ip=/10.118.177.84    cmd=delete
src=/mnt/tmp/sirs-dataset-k10000/feature-repo/imageList    dst=null
perm=null
2011-05-25 04:03:32,212 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=johndoe,johndoe    ip=/10.118.177.84    cmd=create
src=/mnt/tmp/sirs-dataset-k10000/feature-repo/imageList    dst=null
perm=johndoe:supergroup:rw-r--r--
2011-05-25 04:03:32,215 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.allocateBlock:
/mnt/tmp/sirs-dataset-k10000/feature-repo/imageList.
blk_6557263107434203565_332695
2011-05-25 04:03:32,695 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.registerDatanode: node registration from
10.118.177.84:50010 storage
DS-199406591-10.118.177.84-50010-1306165949296
2011-05-25 04:03:32,696 INFO org.apache.hadoop.net.NetworkTopology: Removing
a node: /default-rack/10.118.177.84:50010
2011-05-25 04:03:32,696 INFO org.apache.hadoop.net.NetworkTopology: Adding a
new node: /default-rack/10.118.177.84:50010
2011-05-25 04:03:33,045 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 10.100.245.5:50010 is added to
blk_6557263107434203565_332695 size 11746349
2011-05-25 04:03:33,045 INFO org.apache.hadoop.hdfs.StateChange: DIR*
NameSystem.completeFile: file
/mnt/tmp/sirs-dataset-k10000/feature-repo/imageList is closed by
DFSClient_148861898
2011-05-25 04:03:33,404 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=johndoe,johndoe    ip=/10.118.177.84    cmd=delete
src=/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817    dst=null
perm=null
2011-05-25 04:03:33,405 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=johndoe,johndoe    ip=/10.118.177.84    cmd=create
src=/mnt/tmp/sirs-dataset-k10000/feature-repo/features/109817    dst=null
perm=johndoe:supergroup:rw-r--r--
2011-05-25 04:03:33,468 WARN org.apache.hadoop.hdfs.StateChange: DIR*
NameSystem.startFile: failed to create file
/mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817 for
DFSClient_148861898 on client 10.118.177.84 because current
leaseholder is trying to recreate file.
2011-05-25 04:03:33,469 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3 on 54310, call
create(/mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817, rwxr-xr-x,
DFSClient_148861898, true, 1, 67108864) from 10.118.177.84:48372: error:
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
create file /mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817 for
DFSClient_148861898 on client 10.118.177.84 because current leaseholder is
trying to recreate file.
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
create file /mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817 for
DFSClient_148861898 on client 10.118.177.84 because current leaseholder is
trying to recreate file.
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1045)
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:981)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:377)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-05-25 04:03:33,709 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage from
10.118.177.84
2011-05-25 04:03:33,709 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions:
10 Total time for transactions(ms): 0Number of transactions batched in
Syncs: 0 Number of syncs: 7 SyncTimes(ms): 220
2011-05-25 04:03:35,165 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
10.118.177.84:50010 to delete  blk_-4965605132591592561_332692
2011-05-25 04:04:33,481 WARN org.apache.hadoop.hdfs.StateChange: DIR*
NameSystem.startFile: failed to create file
/mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817 for
DFSClient_148861898 on client 10.118.177.84 because current
leaseholder is trying to recreate file.
2011-05-25 04:04:33,481 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 8 on 54310, call
create(/mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817, rwxr-xr-x,
DFSClient_148861898, true, 1, 67108864) from 10.118.177.84:48372: error:
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
create file /mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817 for
DFSClient_148861898 on client 10.118.177.84 because current leaseholder is
trying to recreate file.
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
create file /mnt/tmp/sirs-dataset-k10000/feature-repo/metadata/109817 for
DFSClient_148861898 on client 10.118.177.84 because current leaseholder is
trying to recreate file.
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1045)
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:981)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:377)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)




