[ 
https://issues.apache.org/jira/browse/HBASE-10025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832145#comment-13832145
 ] 

stack commented on HBASE-10025:
-------------------------------

[~jxiang] Thanks for taking a look.  Sounds reasonable.  Why did we stop 
assigning?  How does that mechanism work?  Should we write up something for ops 
on how to find and fix this kind of scenario?

> Region stuck in transition after failed RS_ZK_REGION_FAILED_OPEN
> ----------------------------------------------------------------
>
>                 Key: HBASE-10025
>                 URL: https://issues.apache.org/jira/browse/HBASE-10025
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 0.96.0
>            Reporter: stack
>
> Testing, I just ran into a case where I have a region stuck in transition.  
> We kept trying to open the region but were failing and eventually got stuck.  
> Looks like a block was missing so region may never have been able to open but 
> we likely shouldn't have the region stuck in transition.
> Here is some log snippet from master side only showing lines that mention the 
> region that is failing:
> {code}
>  31 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, 
> server=c2023.halxg.cloudera.com,60020,1385405351397, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, 
> ts=1385411440853, server=c2023.halxg.cloudera.com,60020,1385405351397}
>  30 2013-11-25 12:30:56,111 INFO  [AM.ZK.Worker-pool2-t366] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411440853, 
> server=c2023.halxg.cloudera.com,60020,1385405351397} to 
> {d1485bc64a1a9246908e6af38a89b973 state=CLOSED, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397}
>  29 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2023.halxg.cloudera.com,60020,1385405351397 accepted as a dest server = 
> false
>  28 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: No previous transition plan found (or ignoring an 
> existing plan) for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; generated random 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2025.halxg.cloudera.com,60020,1385405351329; 3 (on#
>  27 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> handler.ClosedRegionHandler: Handling CLOSED event for 
> d1485bc64a1a9246908e6af38a89b973
>  26 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2025.halxg.cloudera.com,60020,1385405351329 accepted as a dest server = 
> true
>  25 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Using pre-existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2025.halxg.cloudera.com,60020,1385405351329
>  24 2013-11-25 12:30:56,111 INFO  [AM.ZK.Worker-pool2-t366] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=CLOSED, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OFFLINE, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397}
>  23 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> zookeeper.ZKAssign: master:60000-0x1429097f3410007, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Creating (or updating) 
> unassigned node d1485bc64a1a9246908e6af38a89b973 with OFFLINE state
>  22 2013-11-25 12:30:56,118 INFO  [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Assigning 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. to 
> c2025.halxg.cloudera.com,60020,1385405351329
>  21 2013-11-25 12:30:56,118 INFO  [AM.ZK.Worker-pool2-t366] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OFFLINE, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397} to 
> {d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, ts=1385411456118, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>  20 2013-11-25 12:30:56,127 DEBUG [AM.ZK.Worker-pool2-t367] 
> master.AssignmentManager: Handling M_ZK_REGION_OFFLINE, 
> server=c2025.halxg.cloudera.com,60020,1385405351329, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411456118, server=c2025.halxg.cloudera.com,60020,1385405351329}
>  19 2013-11-25 12:30:56,135 DEBUG [AM.ZK.Worker-pool2-t368] 
> master.AssignmentManager: Handling RS_ZK_REGION_OPENING, 
> server=c2025.halxg.cloudera.com,60020,1385405351329, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411456118, server=c2025.halxg.cloudera.com,60020,1385405351329}
>  18 2013-11-25 12:30:56,135 INFO  [AM.ZK.Worker-pool2-t368] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=PENDING_OPEN, ts=1385411456118, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OPENING, ts=1385411456135, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>  17 2013-11-25 12:31:10,594 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411456135, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}}
>  16 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, 
> server=c2025.halxg.cloudera.com,60020,1385405351329, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, 
> ts=1385411456135, server=c2025.halxg.cloudera.com,60020,1385405351329}
>  15 2013-11-25 12:31:15,368 INFO  [AM.ZK.Worker-pool2-t369] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411456135, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=CLOSED, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>  14 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2025.halxg.cloudera.com,60020,1385405351329 accepted as a dest server = 
> false
>  13 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: No previous transition plan found (or ignoring an 
> existing plan) for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; generated random 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2024.halxg.cloudera.com,60020,1385405351453; 3 (on#
>  12 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> handler.ClosedRegionHandler: Handling CLOSED event for 
> d1485bc64a1a9246908e6af38a89b973
>  11 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2024.halxg.cloudera.com,60020,1385405351453 accepted as a dest server = 
> true
>  10 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Using pre-existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2024.halxg.cloudera.com,60020,1385405351453
>   9 2013-11-25 12:31:15,368 INFO  [AM.ZK.Worker-pool2-t369] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=CLOSED, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OFFLINE, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>   8 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> zookeeper.ZKAssign: master:60000-0x1429097f3410007, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Creating (or updating) 
> unassigned node d1485bc64a1a9246908e6af38a89b973 with OFFLINE state
>   7 2013-11-25 12:31:15,375 INFO  [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Assigning 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. to 
> c2024.halxg.cloudera.com,60020,1385405351453
>   6 2013-11-25 12:31:15,375 INFO  [AM.ZK.Worker-pool2-t369] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OFFLINE, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, ts=1385411475375, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}
>   5 2013-11-25 12:31:15,382 DEBUG [AM.ZK.Worker-pool2-t370] 
> master.AssignmentManager: Handling M_ZK_REGION_OFFLINE, 
> server=c2024.halxg.cloudera.com,60020,1385405351453, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411475375, server=c2024.halxg.cloudera.com,60020,1385405351453}
>   4 2013-11-25 12:31:15,392 DEBUG [AM.ZK.Worker-pool2-t371] 
> master.AssignmentManager: Handling RS_ZK_REGION_OPENING, 
> server=c2024.halxg.cloudera.com,60020,1385405351453, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411475375, server=c2024.halxg.cloudera.com,60020,1385405351453}
>   3 2013-11-25 12:31:15,393 INFO  [AM.ZK.Worker-pool2-t371] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=PENDING_OPEN, ts=1385411475375, 
> server=c2024.halxg.cloudera.com,60020,1385405351453} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OPENING, ts=1385411475393, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}
>   2 2013-11-25 12:31:27,436 DEBUG [AM.ZK.Worker-pool2-t372] 
> master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, 
> server=c2024.halxg.cloudera.com,60020,1385405351453, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, 
> ts=1385411475393, server=c2024.halxg.cloudera.com,60020,1385405351453}
>   1 2013-11-25 12:31:27,436 WARN  [AM.ZK.Worker-pool2-t372] 
> master.RegionStates: Failed to open/close d1485bc64a1a9246908e6af38a89b973 on 
> c2024.halxg.cloudera.com,60020,1385405351453, set to FAILED_OPEN
>   0 2013-11-25 12:31:27,436 INFO  [AM.ZK.Worker-pool2-t372] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411475393, 
> server=c2024.halxg.cloudera.com,60020,1385405351453} to 
> {d1485bc64a1a9246908e6af38a89b973 state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}
>   1 2013-11-25 12:36:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   2 2013-11-25 12:41:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   3 2013-11-25 12:46:10,596 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   4 2013-11-25 12:51:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   5 2013-11-25 12:56:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
> {code}
> Here is log from the regionserver side on last open attempt:
> {code}
> 2013-11-25 12:29:40,648 INFO  [RS_OPEN_REGION-c2024:60020-1] 
> handler.OpenRegionHandler: Opening of region {ENCODED => 
> d1485bc64a1a9246908e6af38a89b973, NAME => 
> 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', 
> ENDKEY => '00000000000000000005329216'} failed, transitioning from OPENING to 
> FAILED_OPEN in ZK, expecting version 10
> 2013-11-25 12:29:40,648 DEBUG [RS_OPEN_REGION-c2024:60020-1] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2013-11-25 12:29:40,663 DEBUG [RS_OPEN_REGION-c2024:60020-1] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2013-11-25 12:31:15,376 INFO  [Priority.RpcServer.handler=0,port=60020] 
> regionserver.HRegionServer: Open 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.
> 2013-11-25 12:31:15,380 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning 
> d1485bc64a1a9246908e6af38a89b973 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2013-11-25 12:31:15,390 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node 
> d1485bc64a1a9246908e6af38a89b973 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2013-11-25 12:31:15,390 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> regionserver.HRegion: Opening region: {ENCODED => 
> d1485bc64a1a9246908e6af38a89b973, NAME => 
> 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', 
> ENDKEY => '00000000000000000005329216'}
> 2013-11-25 12:31:15,391 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl 
> for table TestTable d1485bc64a1a9246908e6af38a89b973
> 2013-11-25 12:31:15,391 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> regionserver.HRegion: Instantiated 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.
> 2013-11-25 12:31:15,396 INFO  
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] 
> compactions.CompactionConfiguration: size [134217728, 9223372036854775807); 
> files [3, 10); ratio 1.200000; off-peak ratio 5.000000; throttle point 
> 2684354560; delete expired; major period 604800000, major jitter 0.500000
> 2013-11-25 12:31:15,410 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/0b11ea6ce17f47269518d8ebdad7317f,
>  isReference=false, isBulkLoadResult=false, seqid=37383, majorCompaction=false
> 2013-11-25 12:31:15,421 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/1f8d0ce7df434557949e4d6abd007763,
>  isReference=false, isBulkLoadResult=false, seqid=34710, majorCompaction=false
> 2013-11-25 12:31:15,427 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/3ccc4c67a02745ecbca04f45faf49843,
>  isReference=false, isBulkLoadResult=false, seqid=37300, majorCompaction=false
> 2013-11-25 12:31:15,434 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/4664bc35374b4a7db7f8920449ac539c,
>  isReference=false, isBulkLoadResult=false, seqid=36624, majorCompaction=false
> 2013-11-25 12:31:15,441 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/646293ad1fc44ad589bdcb27195d19df,
>  isReference=false, isBulkLoadResult=false, seqid=37227, majorCompaction=false
> 2013-11-25 12:31:15,444 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> No node available for 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
> 2013-11-25 12:31:15,444 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:15,444 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 1 IOException, will wait for 2303.229529558494 msec.
> 2013-11-25 12:31:17,749 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> No node available for 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
> 2013-11-25 12:31:17,749 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:17,749 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 2 IOException, will wait for 3539.589126663613 msec.
> 2013-11-25 12:31:21,290 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> No node available for 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
> 2013-11-25 12:31:21,290 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:21,290 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 3 IOException, will wait for 6005.249913877286 msec.
> 2013-11-25 12:31:27,298 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS Read
> 2013-11-25 12:31:21,290 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:21,290 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 3 IOException, will wait for 6005.249913877286 msec.
> 2013-11-25 12:31:27,298 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS Read
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
>         at java.io.DataInputStream.readFully(DataInputStream.java:195)
>         at 
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:415)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:567)
>         at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:607)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1083)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:188)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:351)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:446)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:472)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:109)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:425)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> 2013-11-25 12:31:27,306 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/cbbb02d1fe254ec28b28b692247fe2a3,
>  isReference=false, isBulkLoadResult=false, seqid=36367, majorCompaction=false
> 2013-11-25 12:31:27,330 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/dc4bd223cb6e4e169450aa4dc511b436,
>  isReference=false, isBulkLoadResult=false, seqid=25056, majorCompaction=true
> 2013-11-25 12:31:27,358 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/e5d87d2d22474c20973543b7009862d3,
>  isReference=false, isBulkLoadResult=false, seqid=28805, majorCompaction=false
> 2013-11-25 12:31:27,366 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/f7822308175c4bedbcb8a657ebbebdd1,
>  isReference=false, isBulkLoadResult=false, seqid=36090, majorCompaction=false
> 2013-11-25 12:31:27,393 ERROR [RS_OPEN_REGION-c2024:60020-0] 
> handler.OpenRegionHandler: Failed open of 
> region=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., starting 
> to roll back the global memstore size.
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile 
> Trailer from file 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:700)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:616)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:587)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4115)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4086)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4037)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3988)
>         at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:475)
>         at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:140)
>         at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: 
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile 
> Trailer from file 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:450)
>         at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:239)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3090)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:673)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:670)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         ... 3 more
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem 
> reading HFile Trailer from file 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:581)
>         at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:607)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1083)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:188)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:351)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:446)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:472)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:109)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:425)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422)
>         ... 6 more
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain 
> block: BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
>         at java.io.DataInputStream.readFully(DataInputStream.java:195)
>         at 
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:415)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:567)
>         ... 15 more
> 2013-11-25 12:31:27,395 INFO  [RS_OPEN_REGION-c2024:60020-0] 
> handler.OpenRegionHandler: Opening of region {ENCODED => 
> d1485bc64a1a9246908e6af38a89b973, NAME => 
> 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', 
> ENDKEY => '00000000000000000005329216'} failed, transitioning from OPENING to 
> FAILED_OPEN in ZK, expecting version 28
> 2013-11-25 12:31:27,395 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2013-11-25 12:31:27,433 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to