[ 
https://issues.apache.org/jira/browse/HBASE-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025441#comment-13025441
 ] 

stack commented on HBASE-3741:
------------------------------

Change the name of this Set or else add a comment which says how this differs 
from the RIT that is kept up in zk.

Why not declare and define all on the one line?  Why do the assign in the 
constructor?  Just do it when you declare the data member:

{code}
+    this.regionsInTransition =
+        new ConcurrentSkipListSet<byte[]>(Bytes.BYTES_COMPARATOR);
{code}


This new exception should be called RegionAlreadyInTransitionException rather 
than RegionAlreadyInTransition.

Yeah, I'm not sure this a good name for a public method (Does it have to be 
public?  +  public Set<byte[]> getRegionsInTransition();).




> OpenRegionHandler and CloseRegionHandler are possibly racing
> ------------------------------------------------------------
>
>                 Key: HBASE-3741
>                 URL: https://issues.apache.org/jira/browse/HBASE-3741
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.3
>
>         Attachments: HBASE-3741-rsfix-v2.patch, HBASE-3741-rsfix.patch
>
>
> This is a serious issue about a race between regions being opened and closed 
> in region servers. We had this situation where the master tried to unassign a 
> region for balancing, failed, force unassigned it, force assigned it 
> somewhere else, failed to open it on another region server (took too long), 
> and then reassigned it back to the original region server. A few seconds 
> later, the region server processed the first closed and the region was left 
> unassigned.
> This is from the master log:
> {quote}
> 11-04-05 15:11:17,758 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
> Sent CLOSE to serverName=sv4borg42,60020,1300920459477, load=(requests=187, 
> regions=574, usedHeap=3918, maxHeap=6973) for region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:12:10,021 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=PENDING_CLOSE, ts=1302041477758
> 2011-04-05 15:12:10,021 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> ...
> 2011-04-05 15:14:45,783 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=CLOSED, ts=1302041685733
> 2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:60000-0x42ec2cece810b68 Creating (or updating) unassigned node for 
> 1470298961 with OFFLINE state
> ...
> 2011-04-05 15:14:45,885 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
> region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961;
>  
> plan=hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
>  src=sv4borg42,60020,1300920459477, dest=sv4borg40,60020,1302041218196
> 2011-04-05 15:14:45,885 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  to sv4borg40,60020,1302041218196
> 2011-04-05 15:15:39,410 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=PENDING_OPEN, ts=1302041700944
> 2011-04-05 15:15:39,410 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_OPEN for too long, reassigning 
> region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=PENDING_OPEN, ts=1302041700944
> ...
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
> was found (or we are ignoring an existing plan) for 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  so generated a random one; 
> hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
>  src=, dest=sv4borg42,60020,1300920459477; 19 (online=19, exclude=null) 
> available servers
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  to sv4borg42,60020,1300920459477
> 2011-04-05 15:15:40,951 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> master:60000-0x42ec2cece810b68 Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, 
> path=/prodjobs/unassigned/1470298961
> 2011-04-05 15:15:40,952 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:60000-0x42ec2cece810b68 Retrieved 93 byte(s) of data from znode 
> /prodjobs/unassigned/1470298961 and set watcher; 
> region=stumbles_by_userid2,'����穗���6,1266566087256, 
> server=sv4borg42,60020,1300920459477, state=RS_ZK_REGION_OPENED
> 2011-04-05 15:15:40,952 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENED, server=sv4borg42,60020,1300920459477, 
> region=1470298961
> 2011-04-05 15:15:42,222 DEBUG 
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
> event for 1470298961; deleting unassigned node
> ...
> 2011-04-05 15:15:55,812 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:60000-0x42ec2cece810b68 Retrieved 93 byte(s) of data from znode 
> /prodjobs/unassigned/1470298961 and set watcher; 
> region=stumbles_by_userid2,'����穗���6,1266566087256, 
> server=sv4borg42,60020,1300920459477, state=RS_ZK_REGION_CLOSING
> 2011-04-05 15:15:55,812 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
> node: /prodjobs/unassigned/1470298961 
> (region=stumbles_by_userid2,'����穗���6,1266566087256, 
> server=sv4borg42,60020,1300920459477, state=RS_ZK_REGION_CLOSING)
> 2011-04-05 15:15:55,812 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_CLOSING, server=sv4borg42,60020,1300920459477, 
> region=1470298961
> 2011-04-05 15:15:55,812 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSING for region 
> 1470298961 from server sv4borg42,60020,1300920459477 but region was in  the 
> state null and not in expected PENDING_CLOSE or CLOSING states
> {quote}
> And from sv4borg42:
> {quote}
> 2011-04-05 15:09:58,755 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:11:17,757 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:12:10,021 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:14:45,675 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing 
> close of 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:14:45,700 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961:
>  disabling compactions & flushes
> 2011-04-05 15:14:45,701 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Updates disabled for region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:14:45,701 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Closed 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:14:45,758 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed 
> region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:39,410 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 
> region: 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing 
> open of 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:39,486 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Opening region: REGION => {NAME => 
> 'stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256',
>  STARTKEY => '\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6', 
> ENDKEY => '\x00'\x9AU\x7F\xFF\xFE\xEBQ\xB0\xC3\xEF\x00Jr\xF2', ENCODED => 
> 1470298961, TABLE => ...
> 2011-04-05 15:15:39,487 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Instantiated 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:40,399 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Onlined 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961;
>  next sequenceid=37627407247
> 2011-04-05 15:15:40,488 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
> Updated row 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  in region .META.,,1 with server=sv4borg42:60020, startcode=1300920459477
> 2011-04-05 15:15:40,582 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:55,776 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing 
> close of 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:55,809 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Closing 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961:
>  disabling compactions & flushes
> 2011-04-05 15:15:55,809 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
> Updates disabled for region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:55,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Closed 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:55,842 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed 
> region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:55,943 DEBUG 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing 
> close of 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:55,943 WARN 
> org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Received 
> CLOSE for region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  but currently not serving
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to