[ 
https://issues.apache.org/jira/browse/HBASE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6329:
--------------------------------

    Attachment: HBASE-6329v2.patch

bq.if (this.catalogTracker != null) this.catalogTracker.stop();

After a detailed look at CatalogTracker and HConnection, I think 
MetaEditor.addDaughter could still be done after catalogTracker.stop();

Patch v2 adds checkOpen calls in some places in SplitTransaction.
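
To illustrate the idea only (this is not the code in the patch), here is a 
minimal sketch of a checkOpen-style guard placed before the META update. 
The names SplitGuardSketch, Server and addDaughter, and the map standing in 
for .META., are hypothetical and are not the HBase classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: none of these types are the real HBase classes.
class SplitGuardSketch {

    /** Stand-in for a region server that can be asked to stop. */
    static final class Server {
        final String name;
        private final AtomicBoolean stopping = new AtomicBoolean(false);

        Server(String name) { this.name = name; }

        void stop() { stopping.set(true); }

        /** Fails fast once the server has started stopping. */
        void checkOpen() {
            if (stopping.get()) {
                throw new IllegalStateException("Server " + name + " is stopping");
            }
        }
    }

    /**
     * Records the daughter's location in the map that stands in for .META.,
     * but only if the server is still open. Check-then-act is not atomic,
     * so this narrows the race window rather than closing it.
     */
    static void addDaughter(Server server, Map<String, String> meta, String daughter) {
        server.checkOpen();
        meta.put(daughter, server.name);
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        Server a = new Server("Server A");
        a.stop();
        try {
            addDaughter(a, meta, "daughter-1");
        } catch (IllegalStateException e) {
            System.out.println("Skipped META update: " + e.getMessage());
        }
    }
}
```

With the guard in place, a split that is still retrying on a stopped server 
gives up instead of overwriting the location the master has already assigned.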
                
> Stop META regionserver when splitting region could cause daughter region 
> assign twice
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-6329
>                 URL: https://issues.apache.org/jira/browse/HBASE-6329
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6329v1.patch, HBASE-6329v2.patch
>
>
> We found this issue in 0.94. First, let me describe the case:
> Stop the META rs while a split is in progress
> 1. The META rs (Server A) is stopping.
> 2. The main thread of the rs closes ZK and deletes the ephemeral node of the rs.
> 3. SplitTransaction is retrying MetaEditor.addDaughter.
> 4. Master's ServerShutdownHandler processes the above dead META server.
> 5. Master fixes up the daughter and assigns the daughter.
> 6. The daughter is opened on another server (Server B).
> 7. Server A's SplitTransaction successfully adds the daughter to .META. with 
> serverName=Server A.
> 8. Now, in .META., the daughter's region location is Server A but it is 
> online on Server B.
> 9. Restart Master, and master will assign the daughter again.
> Attaching the logs for daughter region 80f999ea84cb259e20e9a228546f6c8a
> Master log:
> 2012-07-04 13:45:56,493 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
> for dw93.kgb.sqa.cm4,60020,1341378224464
> 2012-07-04 13:45:58,983 INFO 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Fixup; missing 
> daughter 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.
>  
> 2012-07-04 13:45:58,985 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
> Added daughter 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.,
>  serverName=null 
> 2012-07-04 13:45:58,988 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.
>  to dw88.kgb.sqa.cm4,60020,1341379188777 
> 2012-07-04 13:46:00,201 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the 
> region 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.
>  that was online on dw88.kgb.sqa.cm4,60020,1341379188777 
> Master log after restart:
> 2012-07-04 14:27:05,824 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:60000-0x136187d60e34644 Creating (or updating) unassigned node for 
> 80f999ea84cb259e20e9a228546f6c8a with OFFLINE state 
> 2012-07-04 14:27:05,851 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Processing region 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.
>  in state M_ZK_REGION_OFFLINE 
> 2012-07-04 14:27:05,854 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.
>  to dw93.kgb.sqa.cm4,60020,1341380812020 
> 2012-07-04 14:27:06,051 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENED, server=dw93.kgb.sqa.cm4,60020,1341380812020, 
> region=80f999ea84cb259e20e9a228546f6c8a 
> Regionserver(META rs) log:
> 2012-07-04 13:45:56,491 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server 
> dw93.kgb.sqa.cm4,60020,1341378224464; zookeeper connection closed.
> 2012-07-04 13:46:11,951 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
> Added daughter 
> writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.,
>  serverName=dw93.kgb.sqa.cm4,60020,1341378224464 
> 2012-07-04 13:46:11,952 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open 
> deploy task for 
> region=writetest,JC\xCA\xC8\xCF<Q\xC49>OH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.,
>  daughter=true 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira