[ https://issues.apache.org/jira/browse/HBASE-10025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832296#comment-13832296 ]
stack commented on HBASE-10025: ------------------------------- Thanks [~jxiang] Is there an issue for adding prob to UI? That'd be coolio. I'd close this then. > Region stuck in transition after failed RS_ZK_REGION_FAILED_OPEN > ---------------------------------------------------------------- > > Key: HBASE-10025 > URL: https://issues.apache.org/jira/browse/HBASE-10025 > Project: HBase > Issue Type: Bug > Components: Region Assignment > Affects Versions: 0.96.0 > Reporter: stack > > Testing, I just ran into a case where I have a region stuck in transition. > We kept trying to open the region but were failing and eventually got stuck. > Looks like a block was missing so region may never have been able to open but > we likely shouldn't have the region stuck in transition. > Here is some log snippet from master side only showing lines that mention the > region that is failing: > {code} > 31 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, > server=c2023.halxg.cloudera.com,60020,1385405351397, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, > ts=1385411440853, server=c2023.halxg.cloudera.com,60020,1385405351397} > 30 2013-11-25 12:30:56,111 INFO [AM.ZK.Worker-pool2-t366] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=OPENING, ts=1385411440853, > server=c2023.halxg.cloudera.com,60020,1385405351397} to > {d1485bc64a1a9246908e6af38a89b973 state=CLOSED, ts=1385411456111, > server=c2023.halxg.cloudera.com,60020,1385405351397} > 29 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > master.AssignmentManager: Found an existing plan for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server > is c2023.halxg.cloudera.com,60020,1385405351397 accepted as a dest server = > false > 28 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > master.AssignmentManager: No previous transition plan found (or ignoring an > existing plan) for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; generated random > plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, > dest=c2025.halxg.cloudera.com,60020,1385405351329; 3 (on# > 27 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > handler.ClosedRegionHandler: Handling CLOSED event for > d1485bc64a1a9246908e6af38a89b973 > 26 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > master.AssignmentManager: Found an existing plan for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server > is c2025.halxg.cloudera.com,60020,1385405351329 accepted as a dest server = > true > 25 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > master.AssignmentManager: Using pre-existing plan for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; > plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, > dest=c2025.halxg.cloudera.com,60020,1385405351329 > 24 2013-11-25 12:30:56,111 INFO [AM.ZK.Worker-pool2-t366] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=CLOSED, ts=1385411456111, > server=c2023.halxg.cloudera.com,60020,1385405351397} to > {d1485bc64a1a9246908e6af38a89b973 state=OFFLINE, ts=1385411456111, > server=c2023.halxg.cloudera.com,60020,1385405351397} > 23 2013-11-25 12:30:56,111 DEBUG [AM.ZK.Worker-pool2-t366] > zookeeper.ZKAssign: master:60000-0x1429097f3410007, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Creating (or updating) > unassigned node d1485bc64a1a9246908e6af38a89b973 with OFFLINE state > 22 2013-11-25 12:30:56,118 INFO [AM.ZK.Worker-pool2-t366] > master.AssignmentManager: Assigning > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. to > c2025.halxg.cloudera.com,60020,1385405351329 > 21 2013-11-25 12:30:56,118 INFO [AM.ZK.Worker-pool2-t366] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=OFFLINE, ts=1385411456111, > server=c2023.halxg.cloudera.com,60020,1385405351397} to > {d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, ts=1385411456118, > server=c2025.halxg.cloudera.com,60020,1385405351329} > 20 2013-11-25 12:30:56,127 DEBUG [AM.ZK.Worker-pool2-t367] > master.AssignmentManager: Handling M_ZK_REGION_OFFLINE, > server=c2025.halxg.cloudera.com,60020,1385405351329, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, > ts=1385411456118, server=c2025.halxg.cloudera.com,60020,1385405351329} > 19 2013-11-25 12:30:56,135 DEBUG [AM.ZK.Worker-pool2-t368] > master.AssignmentManager: Handling RS_ZK_REGION_OPENING, > server=c2025.halxg.cloudera.com,60020,1385405351329, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, > ts=1385411456118, server=c2025.halxg.cloudera.com,60020,1385405351329} > 18 2013-11-25 12:30:56,135 INFO [AM.ZK.Worker-pool2-t368] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=PENDING_OPEN, ts=1385411456118, > server=c2025.halxg.cloudera.com,60020,1385405351329} to > {d1485bc64a1a9246908e6af38a89b973 state=OPENING, ts=1385411456135, > server=c2025.halxg.cloudera.com,60020,1385405351329} > 17 2013-11-25 12:31:10,594 DEBUG > [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: > Not running balancer because 1 region(s) in transition: > {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 > state=OPENING, ts=1385411456135, > server=c2025.halxg.cloudera.com,60020,1385405351329}} > 16 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, > server=c2025.halxg.cloudera.com,60020,1385405351329, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, > ts=1385411456135, server=c2025.halxg.cloudera.com,60020,1385405351329} > 15 2013-11-25 12:31:15,368 INFO [AM.ZK.Worker-pool2-t369] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=OPENING, ts=1385411456135, > server=c2025.halxg.cloudera.com,60020,1385405351329} to > {d1485bc64a1a9246908e6af38a89b973 state=CLOSED, ts=1385411475368, > server=c2025.halxg.cloudera.com,60020,1385405351329} > 14 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > master.AssignmentManager: Found an existing plan for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server > is c2025.halxg.cloudera.com,60020,1385405351329 accepted as a dest server = > false > 13 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > master.AssignmentManager: No previous transition plan found (or ignoring an > existing plan) for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; generated random > plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, > dest=c2024.halxg.cloudera.com,60020,1385405351453; 3 (on# > 12 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > handler.ClosedRegionHandler: Handling CLOSED event for > d1485bc64a1a9246908e6af38a89b973 > 11 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > master.AssignmentManager: Found an existing plan for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. destination server > is c2024.halxg.cloudera.com,60020,1385405351453 accepted as a dest server = > true > 10 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > master.AssignmentManager: Using pre-existing plan for > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.; > plan=hri=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., src=, > dest=c2024.halxg.cloudera.com,60020,1385405351453 > 9 2013-11-25 12:31:15,368 INFO [AM.ZK.Worker-pool2-t369] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=CLOSED, ts=1385411475368, > server=c2025.halxg.cloudera.com,60020,1385405351329} to > {d1485bc64a1a9246908e6af38a89b973 state=OFFLINE, ts=1385411475368, > server=c2025.halxg.cloudera.com,60020,1385405351329} > 8 2013-11-25 12:31:15,368 DEBUG [AM.ZK.Worker-pool2-t369] > zookeeper.ZKAssign: master:60000-0x1429097f3410007, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Creating (or updating) > unassigned node d1485bc64a1a9246908e6af38a89b973 with OFFLINE state > 7 2013-11-25 12:31:15,375 INFO [AM.ZK.Worker-pool2-t369] > master.AssignmentManager: Assigning > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. to > c2024.halxg.cloudera.com,60020,1385405351453 > 6 2013-11-25 12:31:15,375 INFO [AM.ZK.Worker-pool2-t369] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=OFFLINE, ts=1385411475368, > server=c2025.halxg.cloudera.com,60020,1385405351329} to > {d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, ts=1385411475375, > server=c2024.halxg.cloudera.com,60020,1385405351453} > 5 2013-11-25 12:31:15,382 DEBUG [AM.ZK.Worker-pool2-t370] > master.AssignmentManager: Handling M_ZK_REGION_OFFLINE, > server=c2024.halxg.cloudera.com,60020,1385405351453, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, > ts=1385411475375, server=c2024.halxg.cloudera.com,60020,1385405351453} > 4 2013-11-25 12:31:15,392 DEBUG [AM.ZK.Worker-pool2-t371] > master.AssignmentManager: Handling RS_ZK_REGION_OPENING, > server=c2024.halxg.cloudera.com,60020,1385405351453, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=PENDING_OPEN, > ts=1385411475375, server=c2024.halxg.cloudera.com,60020,1385405351453} > 3 2013-11-25 12:31:15,393 INFO [AM.ZK.Worker-pool2-t371] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=PENDING_OPEN, ts=1385411475375, > server=c2024.halxg.cloudera.com,60020,1385405351453} to > {d1485bc64a1a9246908e6af38a89b973 state=OPENING, ts=1385411475393, > server=c2024.halxg.cloudera.com,60020,1385405351453} > 2 2013-11-25 12:31:27,436 DEBUG [AM.ZK.Worker-pool2-t372] > master.AssignmentManager: Handling RS_ZK_REGION_FAILED_OPEN, > server=c2024.halxg.cloudera.com,60020,1385405351453, > region=d1485bc64a1a9246908e6af38a89b973, > current_state={d1485bc64a1a9246908e6af38a89b973 state=OPENING, > ts=1385411475393, server=c2024.halxg.cloudera.com,60020,1385405351453} > 1 2013-11-25 12:31:27,436 WARN [AM.ZK.Worker-pool2-t372] > master.RegionStates: Failed to open/close d1485bc64a1a9246908e6af38a89b973 on > c2024.halxg.cloudera.com,60020,1385405351453, set to FAILED_OPEN > 0 2013-11-25 12:31:27,436 INFO [AM.ZK.Worker-pool2-t372] > master.RegionStates: Transitioned {d1485bc64a1a9246908e6af38a89b973 > state=OPENING, ts=1385411475393, > server=c2024.halxg.cloudera.com,60020,1385405351453} to > {d1485bc64a1a9246908e6af38a89b973 state=FAILED_OPEN, ts=1385411487436, > server=c2024.halxg.cloudera.com,60020,1385405351453} > 1 2013-11-25 12:36:10,595 DEBUG > [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: > Not running balancer because 1 region(s) in transition: > {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 > state=FAILED_OPEN, ts=1385411487436, > server=c2024.halxg.cloudera.com,60020,1385405351453}} > 2 2013-11-25 12:41:10,595 DEBUG > [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: > Not running balancer because 1 region(s) in transition: > {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 > state=FAILED_OPEN, ts=1385411487436, > server=c2024.halxg.cloudera.com,60020,1385405351453}} > 3 2013-11-25 12:46:10,596 DEBUG > [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: > Not running balancer because 1 region(s) in transition: > {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 > state=FAILED_OPEN, ts=1385411487436, > server=c2024.halxg.cloudera.com,60020,1385405351453}} > 4 2013-11-25 12:51:10,595 DEBUG > [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: > Not running balancer because 1 region(s) in transition: > {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 > state=FAILED_OPEN, ts=1385411487436, > server=c2024.halxg.cloudera.com,60020,1385405351453}} > 5 2013-11-25 12:56:10,595 DEBUG > [c2020.halxg.cloudera.com,60000,1385405460201-BalancerChore] master.HMaster: > Not running balancer because 1 region(s) in transition: > {d1485bc64a1a9246908e6af38a89b973={d1485bc64a1a9246908e6af38a89b973 > state=FAILED_OPEN, ts=1385411487436, > server=c2024.halxg.cloudera.com,60020,1385405351453}} > {code} > Here is log from the regionserver side on last open attempt: > {code} > 2013-11-25 12:29:40,648 INFO [RS_OPEN_REGION-c2024:60020-1] > handler.OpenRegionHandler: Opening of region {ENCODED => > d1485bc64a1a9246908e6af38a89b973, NAME => > 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', > ENDKEY => '00000000000000000005329216'} failed, transitioning from OPENING to > FAILED_OPEN in ZK, expecting version 10 > 2013-11-25 12:29:40,648 DEBUG [RS_OPEN_REGION-c2024:60020-1] > zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning > d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to > RS_ZK_REGION_FAILED_OPEN > 2013-11-25 12:29:40,663 DEBUG [RS_OPEN_REGION-c2024:60020-1] > zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node > d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to > RS_ZK_REGION_FAILED_OPEN > 2013-11-25 12:31:15,376 INFO [Priority.RpcServer.handler=0,port=60020] > regionserver.HRegionServer: Open > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. > 2013-11-25 12:31:15,380 DEBUG [RS_OPEN_REGION-c2024:60020-0] > zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning > d1485bc64a1a9246908e6af38a89b973 from M_ZK_REGION_OFFLINE to > RS_ZK_REGION_OPENING > 2013-11-25 12:31:15,390 DEBUG [RS_OPEN_REGION-c2024:60020-0] > zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node > d1485bc64a1a9246908e6af38a89b973 from M_ZK_REGION_OFFLINE to > RS_ZK_REGION_OPENING > 2013-11-25 12:31:15,390 DEBUG [RS_OPEN_REGION-c2024:60020-0] > regionserver.HRegion: Opening region: {ENCODED => > d1485bc64a1a9246908e6af38a89b973, NAME => > 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', > ENDKEY => '00000000000000000005329216'} > 2013-11-25 12:31:15,391 DEBUG [RS_OPEN_REGION-c2024:60020-0] > regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl > for table TestTable d1485bc64a1a9246908e6af38a89b973 > 2013-11-25 12:31:15,391 DEBUG [RS_OPEN_REGION-c2024:60020-0] > regionserver.HRegion: Instantiated > TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973. > 2013-11-25 12:31:15,396 INFO > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] > compactions.CompactionConfiguration: size [134217728, 9223372036854775807); > files [3, 10); ratio 1.200000; off-peak ratio 5.000000; throttle point > 2684354560; delete expired; major period 604800000, major jitter 0.500000 > 2013-11-25 12:31:15,410 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/0b11ea6ce17f47269518d8ebdad7317f, > isReference=false, isBulkLoadResult=false, seqid=37383, majorCompaction=false > 2013-11-25 12:31:15,421 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/1f8d0ce7df434557949e4d6abd007763, > isReference=false, isBulkLoadResult=false, seqid=34710, majorCompaction=false > 2013-11-25 12:31:15,427 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/3ccc4c67a02745ecbca04f45faf49843, > isReference=false, isBulkLoadResult=false, seqid=37300, majorCompaction=false > 2013-11-25 12:31:15,434 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/4664bc35374b4a7db7f8920449ac539c, > isReference=false, isBulkLoadResult=false, seqid=36624, majorCompaction=false > 2013-11-25 12:31:15,441 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/646293ad1fc44ad589bdcb27195d19df, > isReference=false, isBulkLoadResult=false, seqid=37227, majorCompaction=false > 2013-11-25 12:31:15,444 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > No node available for > BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > 2013-11-25 12:31:15,444 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > from any node: java.io.IOException: No live nodes contain current block. Will > get new block locations from namenode and retry... > 2013-11-25 12:31:15,444 WARN [StoreFileOpenerThread-info-1] hdfs.DFSClient: > DFS chooseDataNode: got # 1 IOException, will wait for 2303.229529558494 msec. > 2013-11-25 12:31:17,749 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > No node available for > BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > 2013-11-25 12:31:17,749 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > from any node: java.io.IOException: No live nodes contain current block. Will > get new block locations from namenode and retry... > 2013-11-25 12:31:17,749 WARN [StoreFileOpenerThread-info-1] hdfs.DFSClient: > DFS chooseDataNode: got # 2 IOException, will wait for 3539.589126663613 msec. > 2013-11-25 12:31:21,290 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > No node available for > BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > 2013-11-25 12:31:21,290 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > from any node: java.io.IOException: No live nodes contain current block. Will > get new block locations from namenode and retry... > 2013-11-25 12:31:21,290 WARN [StoreFileOpenerThread-info-1] hdfs.DFSClient: > DFS chooseDataNode: got # 3 IOException, will wait for 6005.249913877286 msec. > 2013-11-25 12:31:27,298 WARN [StoreFileOpenerThread-info-1] hdfs.DFSClient: > DFS Read > 2013-11-25 12:31:21,290 INFO [StoreFileOpenerThread-info-1] hdfs.DFSClient: > Could not obtain BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > from any node: java.io.IOException: No live nodes contain current block. Will > get new block locations from namenode and retry... > 2013-11-25 12:31:21,290 WARN [StoreFileOpenerThread-info-1] hdfs.DFSClient: > DFS chooseDataNode: got # 3 IOException, will wait for 6005.249913877286 msec. > 2013-11-25 12:31:27,298 WARN [StoreFileOpenerThread-info-1] hdfs.DFSClient: > DFS Read > org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: > BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526) > at > org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793) > at java.io.DataInputStream.readFully(DataInputStream.java:195) > at > org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:415) > at > org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:567) > at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:607) > at > org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1083) > at > org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:188) > at > org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:351) > at > org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:446) > at > org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:472) > at > org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:109) > at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:425) > at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > 2013-11-25 12:31:27,306 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/cbbb02d1fe254ec28b28b692247fe2a3, > isReference=false, isBulkLoadResult=false, seqid=36367, majorCompaction=false > 2013-11-25 12:31:27,330 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/dc4bd223cb6e4e169450aa4dc511b436, > isReference=false, isBulkLoadResult=false, seqid=25056, majorCompaction=true > 2013-11-25 12:31:27,358 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/e5d87d2d22474c20973543b7009862d3, > isReference=false, isBulkLoadResult=false, seqid=28805, majorCompaction=false > 2013-11-25 12:31:27,366 DEBUG > [StoreOpener-d1485bc64a1a9246908e6af38a89b973-1] regionserver.HStore: loaded > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/f7822308175c4bedbcb8a657ebbebdd1, > isReference=false, isBulkLoadResult=false, seqid=36090, majorCompaction=false > 2013-11-25 12:31:27,393 ERROR [RS_OPEN_REGION-c2024:60020-0] > handler.OpenRegionHandler: Failed open of > region=TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973., starting > to roll back the global memstore size. > java.io.IOException: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:700) > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:616) > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:587) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4115) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4086) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4037) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3988) > at > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:475) > at > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:140) > at > org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: java.io.IOException: > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile > Trailer from file > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > at > org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:450) > at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:239) > at > org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3090) > at > org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:673) > at > org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:670) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ... 3 more > Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem > reading HFile Trailer from file > hdfs://c2020.halxg.cloudera.com:8020/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > at > org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:581) > at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:607) > at > org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1083) > at > org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:188) > at > org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:351) > at > org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:446) > at > org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:472) > at > org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:109) > at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:425) > at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422) > ... 6 more > Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain > block: BP-1665758641-10.20.84.26-1385096579215:blk_1073772825_32007 > file=/hbase_094/data/default/TestTable/d1485bc64a1a9246908e6af38a89b973/info/881468a5db134daaa621661a1edfbb2b > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526) > at > org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793) > at java.io.DataInputStream.readFully(DataInputStream.java:195) > at > org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:415) > at > org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:567) > ... 15 more > 2013-11-25 12:31:27,395 INFO [RS_OPEN_REGION-c2024:60020-0] > handler.OpenRegionHandler: Opening of region {ENCODED => > d1485bc64a1a9246908e6af38a89b973, NAME => > 'TestTable,,1385406553349.d1485bc64a1a9246908e6af38a89b973.', STARTKEY => '', > ENDKEY => '00000000000000000005329216'} failed, transitioning from OPENING to > FAILED_OPEN in ZK, expecting version 28 > 2013-11-25 12:31:27,395 DEBUG [RS_OPEN_REGION-c2024:60020-0] > zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioning > d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to > RS_ZK_REGION_FAILED_OPEN > 2013-11-25 12:31:27,433 DEBUG [RS_OPEN_REGION-c2024:60020-0] > zookeeper.ZKAssign: regionserver:60020-0x1429097f3410003, > quorum=c2020.halxg.cloudera.com:2181, baseZNode=/hbase Transitioned node > d1485bc64a1a9246908e6af38a89b973 from RS_ZK_REGION_OPENING to > RS_ZK_REGION_FAILED_OPEN > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)