Jonathan Hsieh created HBASE-8349:
-------------------------------------

             Summary: TestLogRolling#TestLogRollOnDatanodeDeath hangs under 
hadoop2 profile
                 Key: HBASE-8349
                 URL: https://issues.apache.org/jira/browse/HBASE-8349
             Project: HBase
          Issue Type: Sub-task
          Components: hadoop2
    Affects Versions: 0.95.0, 0.98.0
            Reporter: Jonathan Hsieh
            Assignee: Jonathan Hsieh
             Fix For: 0.98.0, 0.95.1


TestLogRolling has been hanging -- after a data node is killed the client 
attempts to recover a lease and fails forever.  (This example ran for a while 
and shows recovery attempt 541888).  

{code}
2013-04-15 16:37:49,074 INFO  [SplitLogWorker-localhost,39898,1366065830907] 
util.FSHDFSUtils(72): Attempt 541888 to recoverLease on file 
hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 returned false, trying for 2642865ms
2013-04-15 16:37:49,075 ERROR [SplitLogWorker-localhost,39898,1366065830907] 
util.FSHDFSUtils(86): Can't recoverLease after 541888 attempts and 2642866ms  
for 
hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 - continuing without the lease, but we could have a data loss.
2013-04-15 16:37:49,075 INFO  [IPC Server handler 9 on 41333] 
namenode.FSNamesystem(1957): recoverLease: recover lease [Lease.  Holder: 
DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091, pendingcreates: 
1], 
src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 from client DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091
2013-04-15 16:37:49,075 INFO  [IPC Server handler 9 on 41333] 
namenode.FSNamesystem(2981): Recovering lease=[Lease.  Holder: 
DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091, pendingcreates: 
1], 
src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
2013-04-15 16:37:49,078 WARN  [IPC Server handler 9 on 41333] 
namenode.FSNamesystem(3096): DIR* NameSystem.internalReleaseLease: File 
/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 has not been closed. Lease recovery is in progress. RecoveryId = 543317 for 
block blk_7636447875270454121_1019{blockUCState=UNDER_RECOVERY, 
primaryNodeIndex=1, replicas=[ReplicaUnderConstruction[127.0.0.1:38288|RBW], 
ReplicaUnderConstruction[127.0.0.1:35956|RWR]]}
2013-04-15 16:37:49,078 INFO  [SplitLogWorker-localhost,39898,1366065830907] 
util.FSHDFSUtils(72): Attempt 541889 to recoverLease on file 
hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 returned false, trying for 2642869ms
2013-04-15 16:37:49,079 ERROR [SplitLogWorker-localhost,39898,1366065830907] 
util.FSHDFSUtils(86): Can't recoverLease after 541889 attempts and 2642870ms  
for 
hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 - continuing without the lease, but we could have a data loss.
2013-04-15 16:37:49,079 INFO  [IPC Server handler 4 on 41333] 
namenode.FSNamesystem(1957): recoverLease: recover lease [Lease.  Holder: 
DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091, pendingcreates: 
1], 
src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 from client DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091
2013-04-15 16:37:49,079 INFO  [IPC Server handler 4 on 41333] 
namenode.FSNamesystem(2981): Recovering lease=[Lease.  Holder: 
DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091, pendingcreates: 
1], 
src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
2013-04-15 16:37:49,082 WARN  [IPC Server handler 4 on 41333] 
namenode.FSNamesystem(3096): DIR* NameSystem.internalReleaseLease: File 
/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 has not been closed. Lease recovery is in progress. RecoveryId = 543318 for 
block blk_7636447875270454121_1019{blockUCState=UNDER_RECOVERY, 
primaryNodeIndex=1, replicas=[ReplicaUnderConstruction[127.0.0.1:38288|RBW], 
ReplicaUnderConstruction[127.0.0.1:35956|RWR]]}
2013-04-15 16:37:49,083 INFO  [SplitLogWorker-localhost,39898,1366065830907] 
util.FSHDFSUtils(72): Attempt 541890 to recoverLease on file 
hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 returned false, trying for 2642874ms
2013-04-15 16:37:49,083 ERROR [SplitLogWorker-localhost,39898,1366065830907] 
util.FSHDFSUtils(86): Can't recoverLease after 541890 attempts and 2642874ms  
for 
hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
 - continuing without the lease, but we could have a data loss.
{code}

It initially starts with permissions errors similar to HBASE-7636.  For now we 
disable in this test and will address with a fix in HBASE-8337 with an assist 
from HDFS folks.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to