I removed HDFS-630 and this issue seems to have gone away. I haven't seen it over the last few restarts. St.Ack
On Wed, Nov 11, 2009 at 3:46 PM, stack <[email protected]> wrote:
> Here's the NN's view on the problematic file:
>
> [st...@aa0-000-12 logs]$ grep 1257981905391 hadoop-stack-namenode-aa0-000-12.u.powerset.com.log
> 2009-11-11 23:25:05,449 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 for DFSClient_-1861029014 at 208.76.44.140
> 2009-11-11 23:25:05,449 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: src=/hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391, holder=DFSClient_-1861029014, clientMachine=208.76.44.140, createParent=true, replication=3, overwrite=true, append=false
> 2009-11-11 23:25:05,451 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 is added to the file system
> 2009-11-11 23:25:05,451 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: add /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 to namespace for DFSClient_-1861029014
> 2009-11-11 23:25:05,452 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=stack,powerset,engineering ip=/208.76.44.140 cmd=create src=/hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 dst=null perm=stack:supergroup:rw-r--r--
> 2009-11-11 23:25:12,405 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 for DFSClient_-1861029014
> 2009-11-11 23:25:12,406 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 for DFSClient_-1861029014
> 2009-11-11 23:25:12,406 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 with blk_-3269268615851057446_1010 block is added to the in-memory file system
> 2009-11-11 23:25:12,406 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391. blk_-3269268615851057446_1010{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[208.76.44.140:51010|RBW], ReplicaUnderConstruction[208.76.44.141:51010|RBW], ReplicaUnderConstruction[208.76.44.142:51010|RBW]]}
> 2009-11-11 23:25:12,504 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.fsync: file /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 for DFSClient_-1861029014
> 2009-11-11 23:25:12,505 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.persistBlocks: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 with 1 blocks is persisted to the file system
> 2009-11-11 23:25:34,233 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.complete: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 for DFSClient_-1861029014
> 2009-11-11 23:25:34,233 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 for DFSClient_-1861029014
> 2009-11-11 23:25:34,234 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.closeFile: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 with 1 blocks is persisted to the file system
> 2009-11-11 23:25:34,234 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 is closed by DFSClient_-1861029014
> 2009-11-11 23:25:45,638 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=stack,powerset,engineering ip=/208.76.44.139 cmd=open src=/hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 dst=null perm=null
> 2009-11-11 23:25:47,272 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* Namenode.delete: src=/hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391, recursive=true
> 2009-11-11 23:25:47,272 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.delete: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391
> 2009-11-11 23:25:47,273 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.delete: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391
> 2009-11-11 23:25:47,273 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.unprotectedDelete: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 is removed
> 2009-11-11 23:25:47,273 DEBUG org.apache.hadoop.hdfs.server.namenode.LeaseManager: LeaseManager.findLease: prefix=/hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391
> 2009-11-11 23:25:47,280 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* Namesystem.delete: /hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 is removed
> 2009-11-11 23:25:47,280 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=stack,powerset,engineering ip=/208.76.44.139 cmd=delete src=/hbase/.logs/aa0-000-13.u.powerset.com,60020,1257981904990/hlog.dat.1257981905391 dst=null perm=null
>
> On Wed, Nov 11, 2009 at 3:41 PM, stack <[email protected]> wrote:
>
>> I should have said this is branch-0.21 from about an hour ago. I have
>> hdfs-630 applied but this seems unrelated.
>> Thanks,
>> St.Ack
>>
>> On Wed, Nov 11, 2009 at 3:40 PM, stack <[email protected]> wrote:
>>
>>> Any recommendations for what I should do to cope when I get such a
>>> beastie? (It happened in our WAL so server kills itself).
>>> Thanks,
>>> St.Ack
>>>
>>> java.io.IOException: java.io.IOException: Cannot complete block: block has not been COMMITTED by the client
>>>         at org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction.convertToCompleteBlock(BlockInfoUnderConstruction.java:158)
>>>         at org.apache.hadoop.hdfs.server.namenode.BlockManager.completeBlock(BlockManager.java:288)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.finalizeINodeFileUnderConstruction(FSNamesystem.java:1846)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1367)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1329)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:660)
>>>         at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:516)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:964)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:960)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:958)
>>>
>>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>         at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>>>         at org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
>>>         at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
>>>         at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:98)
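On the "how to cope" question in the quoted thread: one common pattern for a transient failure on close is to retry the close a few times with backoff before giving up and letting the server abort. Below is a minimal, self-contained sketch of that idea; the `Closer` interface and `closeWithRetries` helper are hypothetical illustrations, not HBase's actual LogRoller or HDFS client code.

```java
import java.io.IOException;

// Hypothetical sketch: retry a close() that can fail transiently (e.g. with
// "Cannot complete block: block has not been COMMITTED by the client"),
// then report failure so the caller can abort cleanly.
public class CloseRetry {
    interface Closer {
        void close() throws IOException;  // stands in for closing an output stream
    }

    // Returns true if close() eventually succeeded, false if retries ran out.
    static boolean closeWithRetries(Closer c, int attempts, long backoffMs)
            throws InterruptedException {
        for (int i = 1; i <= attempts; i++) {
            try {
                c.close();
                return true;                          // close succeeded
            } catch (IOException e) {
                if (i == attempts) return false;      // exhausted; caller aborts
                Thread.sleep(backoffMs * i);          // linear backoff, then retry
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a close that fails twice, then succeeds on the third try.
        int[] calls = {0};
        Closer flaky = () -> {
            if (++calls[0] < 3)
                throw new IOException(
                    "Cannot complete block: block has not been COMMITTED by the client");
        };
        boolean ok = closeWithRetries(flaky, 5, 10);
        System.out.println("closed=" + ok + " attempts=" + calls[0]);
        // prints: closed=true attempts=3
    }
}
```

Whether retrying is safe here depends on the NameNode's state; in the case above the file was later opened and deleted successfully, which is why the thread concluded the HDFS-630 patch, not the WAL handling, was at fault.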
