[ 
https://issues.apache.org/jira/browse/HBASE-10025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832123#comment-13832123
 ] 

Jimmy Xiang commented on HBASE-10025:
-------------------------------------

The reason for the region stuck in transition in this case is to let OPS know 
it. Otherwise, it will be a surprise to them if they find out some region is 
not assigned and not shown in the web UI.  The other choice is to not stuck it 
in transition, but put it somewhere so we can show it in web UI.

> Region stuck in transition after failed RS_ZK_REGION_FAILED_OPEN
> ----------------------------------------------------------------
>
>                 Key: HBASE-10025
>                 URL: https://issues.apache.org/jira/browse/HBASE-10025
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 0.96.0
>            Reporter: stack
>
> Testing, I just ran into a case where I have a region stuck in transition.  
> We kept trying to open the region but were failing and eventually got stuck.  
> Looks like a block was missing so region may never have been able to open but 
> we likely shouldn't have the region stuck in transition.
> Here is some log snippet from master side only showing lines that mention the 
> region that is failing:
> {code}
>  31 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, 
> server=c2023.halxg.cloudera.com,60020,1385405351397, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, 
> ts=1385411440853, server=c2023.halxg.cloudera.com,60020,1385405351397}
>  30 2013-11-25 12:30:56,111 INFO  [AM.ZK.Worker-pool2-t366] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411440853, 
> server=c2023.halxg.cloudera.com,60020,1385405351397} to 
> {d1485bc64a1a9246908e6af38a89b973 state=CLOSED, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397}
>  29 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2023.halxg.cloudera.com,60020,1385405351397 accepted as a dest server = 
> false
>  28 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: No previous transition plan found (or ignoring an 
> existing plan) for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; generated random 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2025.halxg.cloudera.com,60020,1385405351329; 3 (on#
>  27 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> handler.ClosedRegionHandler: Handling CLOSED event for 
> d1485bc64a1a9246908e6af38a89b973
>  26 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2025.halxg.cloudera.com,60020,1385405351329 accepted as a dest server = 
> true
>  25 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Using pre-existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2025.halxg.cloudera.com,60020,1385405351329
>  24 2013-11-25 12:30:56,111 INFO  [AM.ZK.Worker-pool2-t366] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=CLOSED, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OFFLINE, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397}
>  23 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] 
> zookeeper.ZKAssign: master:60000-0x1429097f3410007, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Creating (or updating) 
> unassigned node d1485bc64a1a9246908e6af38a89b973 with OFFLINE state
>  22 2013-11-25 12:30:56,118 INFO  [AM.ZK.Worker-pool2-t366] 
> master.AssignmentManager: Assigning 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. to 
> c2025.halxg.cloudera.com,60020,1385405351329
>  21 2013-11-25 12:30:56,118 INFO  [AM.ZK.Worker-pool2-t366] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OFFLINE, ts=1385411456111, 
> server=c2023.halxg.cloudera.com,60020,1385405351397} to 
> {d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, ts=1385411456118, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>  20 2013-11-25 12:30:56,127 DEBUG [AM.ZK.Worker-pool2-t367] 
> master.AssignmentManager: Handling M_ZK_REGION_OFFLINE, 
> server=c2025.halxg.cloudera.com,60020,1385405351329, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411456118, server=c2025.halxg.cloudera.com,60020,1385405351329}
>  19 2013-11-25 12:30:56,135 DEBUG [AM.ZK.Worker-pool2-t368] 
> master.AssignmentManager: Handling RS_ZK_REGION_OPENING, 
> server=c2025.halxg.cloudera.com,60020,1385405351329, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411456118, server=c2025.halxg.cloudera.com,60020,1385405351329}
>  18 2013-11-25 12:30:56,135 INFO  [AM.ZK.Worker-pool2-t368] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=PENDING_OPEN, ts=1385411456118, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OPENING, ts=1385411456135, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>  17 2013-11-25 12:31:10,594 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411456135, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}}
>  16 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, 
> server=c2025.halxg.cloudera.com,60020,1385405351329, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, 
> ts=1385411456135, server=c2025.halxg.cloudera.com,60020,1385405351329}
>  15 2013-11-25 12:31:15,368 INFO  [AM.ZK.Worker-pool2-t369] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411456135, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=CLOSED, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>  14 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2025.halxg.cloudera.com,60020,1385405351329 accepted as a dest server = 
> false
>  13 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: No previous transition plan found (or ignoring an 
> existing plan) for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; generated random 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2024.halxg.cloudera.com,60020,1385405351453; 3 (on#
>  12 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> handler.ClosedRegionHandler: Handling CLOSED event for 
> d1485bc64a1a9246908e6af38a89b973
>  11 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Found an existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server 
> is c2024.halxg.cloudera.com,60020,1385405351453 accepted as a dest server = 
> true
>  10 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Using pre-existing plan for 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; 
> plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, 
> dest=c2024.halxg.cloudera.com,60020,1385405351453
>   9 2013-11-25 12:31:15,368 INFO  [AM.ZK.Worker-pool2-t369] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=CLOSED, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OFFLINE, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329}
>   8 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] 
> zookeeper.ZKAssign: master:60000-0x1429097f3410007, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Creating (or updating) 
> unassigned node d1485bc64a1a9246908e6af38a89b973 with OFFLINE state
>   7 2013-11-25 12:31:15,375 INFO  [AM.ZK.Worker-pool2-t369] 
> master.AssignmentManager: Assigning 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. to 
> c2024.halxg.cloudera.com,60020,1385405351453
>   6 2013-11-25 12:31:15,375 INFO  [AM.ZK.Worker-pool2-t369] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OFFLINE, ts=1385411475368, 
> server=c2025.halxg.cloudera.com,60020,1385405351329} to 
> {d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, ts=1385411475375, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}
>   5 2013-11-25 12:31:15,382 DEBUG [AM.ZK.Worker-pool2-t370] 
> master.AssignmentManager: Handling M_ZK_REGION_OFFLINE, 
> server=c2024.halxg.cloudera.com,60020,1385405351453, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411475375, server=c2024.halxg.cloudera.com,60020,1385405351453}
>   4 2013-11-25 12:31:15,392 DEBUG [AM.ZK.Worker-pool2-t371] 
> master.AssignmentManager: Handling RS_ZK_REGION_OPENING, 
> server=c2024.halxg.cloudera.com,60020,1385405351453, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, 
> ts=1385411475375, server=c2024.halxg.cloudera.com,60020,1385405351453}
>   3 2013-11-25 12:31:15,393 INFO  [AM.ZK.Worker-pool2-t371] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=PENDING_OPEN, ts=1385411475375, 
> server=c2024.halxg.cloudera.com,60020,1385405351453} to 
> {d1485bc64a1a9246908e6af38a89b973 state=OPENING, ts=1385411475393, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}
>   2 2013-11-25 12:31:27,436 DEBUG [AM.ZK.Worker-pool2-t372] 
> master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, 
> server=c2024.halxg.cloudera.com,60020,1385405351453, 
> region=d1485bc64a1a9246908e6af38a89b973, 
> current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, 
> ts=1385411475393, server=c2024.halxg.cloudera.com,60020,1385405351453}
>   1 2013-11-25 12:31:27,436 WARN  [AM.ZK.Worker-pool2-t372] 
> master.RegionStates: Failed to open/close d1485bc64a1a9246908e6af38a89b973 on 
> c2024.halxg.cloudera.com,60020,1385405351453, set to FAILED_OPEN
>   0 2013-11-25 12:31:27,436 INFO  [AM.ZK.Worker-pool2-t372] 
> master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 
> state=OPENING, ts=1385411475393, 
> server=c2024.halxg.cloudera.com,60020,1385405351453} to 
> {d1485bc64a1a9246908e6af38a89b973 state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}
>   1 2013-11-25 12:36:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   2 2013-11-25 12:41:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   3 2013-11-25 12:46:10,596 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   4 2013-11-25 12:51:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
>   5 2013-11-25 12:56:10,595 DEBUG 
> [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: 
> Not running balancer because 1 region(s) in transition: 
> {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 
> state=FAILED_OPEN, ts=1385411487436, 
> server=c2024.halxg.cloudera.com,60020,1385405351453}}
> {code}
> Here is log from the regionserver side on last open attempt:
> {code}
> 2013-11-25 12:29:40,648 INFO  [RS_OPEN_REGION-c2024:60020-1] 
> handler.OpenRegionHandler: Opening of region {ENCODED => 
> d1485bc64a1a9246908e6af38a89b973, NAME => 
> 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', 
> ENDKEY => '00000000000000000005329216'} failed, transitioning from OPENING to 
> FAILED_OPEN in ZK, expecting version 10
> 2013-11-25 12:29:40,648 DEBUG [RS_OPEN_REGION-c2024:60020-1] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2013-11-25 12:29:40,663 DEBUG [RS_OPEN_REGION-c2024:60020-1] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2013-11-25 12:31:15,376 INFO  [Priority.RpcServer.handler=0,port=60020] 
> regionserver.HRegionServer: Open 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.
> 2013-11-25 12:31:15,380 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning 
> d1485bc64a1a9246908e6af38a89b973 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2013-11-25 12:31:15,390 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node 
> d1485bc64a1a9246908e6af38a89b973 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2013-11-25 12:31:15,390 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> regionserver.HRegion: Opening region: {ENCODED => 
> d1485bc64a1a9246908e6af38a89b973, NAME => 
> 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', 
> ENDKEY => '00000000000000000005329216'}
> 2013-11-25 12:31:15,391 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl 
> for table TestTable d1485bc64a1a9246908e6af38a89b973
> 2013-11-25 12:31:15,391 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> regionserver.HRegion: Instantiated 
> TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.
> 2013-11-25 12:31:15,396 INFO  
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] 
> compactions.CompactionConfiguration: size [134217728, 9223372036854775807); 
> files [3, 10); ratio 1.200000; off-peak ratio 5.000000; throttle point 
> 2684354560; delete expired; major period 604800000, major jitter 0.500000
> 2013-11-25 12:31:15,410 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/0b11ea6ce17f47269518d8ebdad7317f,
>  isReference=false, isBulkLoadResult=false, seqid=37383, majorCompaction=false
> 2013-11-25 12:31:15,421 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/1f8d0ce7df434557949e4d6abd007763,
>  isReference=false, isBulkLoadResult=false, seqid=34710, majorCompaction=false
> 2013-11-25 12:31:15,427 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/3ccc4c67a02745ecbca04f45faf49843,
>  isReference=false, isBulkLoadResult=false, seqid=37300, majorCompaction=false
> 2013-11-25 12:31:15,434 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/4664bc35374b4a7db7f8920449ac539c,
>  isReference=false, isBulkLoadResult=false, seqid=36624, majorCompaction=false
> 2013-11-25 12:31:15,441 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/646293ad1fc44ad589bdcb27195d19df,
>  isReference=false, isBulkLoadResult=false, seqid=37227, majorCompaction=false
> 2013-11-25 12:31:15,444 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> No node available for 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
> 2013-11-25 12:31:15,444 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:15,444 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 1 IOException, will wait for 2303.229529558494 msec.
> 2013-11-25 12:31:17,749 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> No node available for 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
> 2013-11-25 12:31:17,749 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:17,749 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 2 IOException, will wait for 3539.589126663613 msec.
> 2013-11-25 12:31:21,290 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> No node available for 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
> 2013-11-25 12:31:21,290 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:21,290 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 3 IOException, will wait for 6005.249913877286 msec.
> 2013-11-25 12:31:27,298 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS Read
> 2013-11-25 12:31:21,290 INFO  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> from any node: java.io.IOException: No live nodes contain current block. Will 
> get new block locations from namenode and retry...
> 2013-11-25 12:31:21,290 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS chooseDataNode: got # 3 IOException, will wait for 6005.249913877286 msec.
> 2013-11-25 12:31:27,298 WARN  [StoreFileOpenerThread-info-1] hdfs.DFSClient: 
> DFS Read
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: 
> BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
>         at java.io.DataInputStream.readFully(DataInputStream.java:195)
>         at 
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:415)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:567)
>         at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:607)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1083)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:188)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:351)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:446)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:472)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:109)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:425)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> 2013-11-25 12:31:27,306 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/cbbb02d1fe254ec28b28b692247fe2a3,
>  isReference=false, isBulkLoadResult=false, seqid=36367, majorCompaction=false
> 2013-11-25 12:31:27,330 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/dc4bd223cb6e4e169450aa4dc511b436,
>  isReference=false, isBulkLoadResult=false, seqid=25056, majorCompaction=true
> 2013-11-25 12:31:27,358 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/e5d87d2d22474c20973543b7009862d3,
>  isReference=false, isBulkLoadResult=false, seqid=28805, majorCompaction=false
> 2013-11-25 12:31:27,366 DEBUG 
> [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/f7822308175c4bedbcb8a657ebbebdd1,
>  isReference=false, isBulkLoadResult=false, seqid=36090, majorCompaction=false
> 2013-11-25 12:31:27,393 ERROR [RS_OPEN_REGION-c2024:60020-0] 
> handler.OpenRegionHandler: Failed open of 
> region=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., starting 
> to roll back the global memstore size.
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile 
> Trailer from file 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:700)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:616)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:587)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4115)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4086)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4037)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3988)
>         at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:475)
>         at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:140)
>         at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: 
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile 
> Trailer from file 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:450)
>         at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:239)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3090)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:673)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:670)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         ... 3 more
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem 
> reading HFile Trailer from file 
> hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:581)
>         at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:607)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1083)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:188)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:351)
>         at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:446)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:472)
>         at 
> org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:109)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:425)
>         at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422)
>         ... 6 more
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain 
> block: BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 
> file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
>         at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
>         at java.io.DataInputStream.readFully(DataInputStream.java:195)
>         at 
> org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:415)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:567)
>         ... 15 more
> 2013-11-25 12:31:27,395 INFO  [RS_OPEN_REGION-c2024:60020-0] 
> handler.OpenRegionHandler: Opening of region {ENCODED => 
> d1485bc64a1a9246908e6af38a89b973, NAME => 
> 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', 
> ENDKEY => '00000000000000000005329216'} failed, transitioning from OPENING to 
> FAILED_OPEN in ZK, expecting version 28
> 2013-11-25 12:31:27,395 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> 2013-11-25 12:31:27,433 DEBUG [RS_OPEN_REGION-c2024:60020-0] 
> zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, 
> quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node 
> d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to 
> RS_ZK_REGION_FAILED_OPEN
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to