[jira] Commented: (HBASE-1104) Doubly-assigned regions redux

stack (JIRA) Wed, 07 Jan 2009 20:45:09 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661840#action_12661840
 ]


stack commented on HBASE-1104:
------------------------------

Did you mean to include the changes to src/webapps/master/WEB-INF/web.xml?

Do you want to add more javadoc to the @return below? (Not important...)

Index: src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
===================================================================
--- src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java  (revision 732591)
+++ src/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java  (working copy)
@@ -126,6 +126,7 @@
    * @param regionName name of the region to update
    * @param b BatchUpdate
    * @param expectedValues map of column names to expected data values.
+   * @return true if 

Tell me about this change:

         storedInfo = this.master.serverManager.getServerInfo(serverName);
         deadServer = this.master.serverManager.isDead(serverName);
-        deadServerAndLogsSplit =
-          this.master.serverManager.isDeadServerLogsSplit(serverName);


and...


-      if ((deadServerAndLogsSplit ||
-          (!deadServer && (storedInfo == null ||
-            (storedInfo.getStartCode() != startCode)))) &&
-          this.regionManager.assignable(info)) {
+      if ((deadServer ||
+          (storedInfo == null || storedInfo.getStartCode() != startCode))) {
+

It doesn't look right.  The changes I made for 1099 were "allow assigning if 
it's a dead server and its commit logs HAVE been split" or "if NOT a dead 
server" (because if it is a dead server and it didn't pass the first test, then 
its logs are still being split).  We don't want BaseScanner assigning to 
servers on the dead list.  If regions are assigned to a server on the dead 
list, then when the shutdown handler for the dead server runs its scan, we'll 
reassign these regions as though they'd been on the crashed server; that makes 
for double assignment and a mess.
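
To spell out the condition described above, here is a minimal sketch of the 
pre-patch check written as a standalone predicate.  The class and method names 
(AssignCheck, shouldAssign) and the flattened parameters are made up for 
illustration only; the real code reads these values from ServerManager and 
RegionManager as shown in the diff.

```java
// Hypothetical helper, not part of HBase: restates the pre-patch condition
// from the diff above using plain values instead of master state.
final class AssignCheck {
    private AssignCheck() {}

    /**
     * @param deadServer      true if the server is on the master's dead list
     * @param logsSplit       true if that dead server's commit logs have been split
     * @param storedStartCode start code the master holds for the server,
     *                        or null if the master has no info for it
     * @param startCode       start code recorded for the region being scanned
     * @param assignable      stand-in for regionManager.assignable(info)
     * @return true if the scanner may go ahead and assign the region
     */
    static boolean shouldAssign(boolean deadServer, boolean logsSplit,
            Long storedStartCode, long startCode, boolean assignable) {
        // Case 1: dead server whose commit logs HAVE been split.
        boolean deadServerAndLogsSplit = deadServer && logsSplit;
        // Case 2: NOT a dead server, and the master either does not know the
        // server or knows it under a different start code.
        boolean notDeadAndUnknownOrRestarted = !deadServer
            && (storedStartCode == null
                || storedStartCode.longValue() != startCode);
        return (deadServerAndLogsSplit || notDeadAndUnknownOrRestarted)
            && assignable;
    }
}
```

With this shape, a dead server whose logs are still being split fails both 
cases, so the scanner leaves its regions alone.  The patched condition 
(deadServer || ...) would let that case through.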

You also remove the new method assignable.  Don't we want to check if the 
region is 'assignable' before dropping into this assigning code block? (Not 
sure... so asking).

Your patch does this, which, as discussed on IRC, is not what's wanted:

{code}
@@ -1088,12 +1088,8 @@
       byte [] closestKey = store.getRowKeyAtOrBefore(row);
       // If it happens to be an exact match, we can stop looping.
       // Otherwise, we need to check if it's the max and move to the next
-      if (HStoreKey.equalsTwoRowKeys(regionInfo, row, closestKey)) {
+      if (closestKey != null) {
         key = new HStoreKey(closestKey, this.regionInfo);
-      } else if (closestKey != null &&
-          (key == null || HStoreKey.compareTwoRowKeys(
-              regionInfo,closestKey, key.getRow()) > 0) ) {
-        key = new HStoreKey(closestKey, this.regionInfo);
       } else {
         return null;
       }
{code}

Jim, do you think the change below is safe?

{code}
@@ -564,9 +566,10 @@
       //       the messages we've received. In this case, a close could be
       //       processed before an open resulting in the master not agreeing on
       //       the region's state.
+      master.regionManager.setClosed(region.getRegionName());
{code}

Will we have the problem where state changes are processed out of order?  
Thinking on it, it doesn't seem so, but I'm asking just to check.

I'll hold off on testing the patch until I have answers to the above.

> Doubly-assigned regions redux
> -----------------------------
>
>                 Key: HBASE-1104
>                 URL: https://issues.apache.org/jira/browse/HBASE-1104
>             Project: Hadoop HBase
>          Issue Type: Bug
>         Environment: pset cluster with TRUNK.
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: 1104.patch
>
>
> Testing, I see doubly assigned regions.  Below is from master log for 
> TestTable,0000135598,1230761605500.
> {code}
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: 
> TestTable,0000116170,1230761152219: TestTable,0000116170,1230761152219 split; 
> daughters: TestTable,0000116170,1230761605500, 
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO 
> org.apache.hadoop.hbase.master.RegionManager: assigning region 
> TestTable,0000135598,1230761605500 to server XX.XX.XX.142:60020
> 2008-12-31 22:13:38,561 [IPC Server handler 6 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: 
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO 
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: 
> TestTable,0000135598,1230761605500 open on XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO 
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row 
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode 
> 1230759988953 and server XX.XX.XX.142:60020
> 2008-12-31 22:13:44,640 [IPC Server handler 4 on 60000] DEBUG 
> org.apache.hadoop.hbase.master.RegionManager: Going to close region 
> TestTable,0000135598,1230761605500
> 2008-12-31 22:13:50,441 [IPC Server handler 9 on 60000] INFO 
> org.apache.hadoop.hbase.master.RegionManager: assigning region 
> TestTable,0000135598,1230761605500 to server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,457 [IPC Server handler 5 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received 
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from 
> XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [IPC Server handler 5 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: 
> TestTable,0000135598,1230761605500 from XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO 
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: 
> TestTable,0000135598,1230761605500 open on XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO 
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row 
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode 
> 1230759988788 and server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,688 [IPC Server handler 6 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: 
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:53,688 [HMaster] DEBUG 
> org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessRegionClose 
> of TestTable,0000135598,1230761605500, false
> 2008-12-31 22:13:54,263 [IPC Server handler 7 on 60000] INFO 
> org.apache.hadoop.hbase.master.RegionManager: assigning region 
> TestTable,0000135598,1230761605500 to server XX.XX.XX.141:60020
> 2008-12-31 22:13:57,273 [IPC Server handler 9 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received 
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from 
> XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [IPC Server handler 0 on 60000] INFO 
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: 
> TestTable,0000135598,1230761605500 from XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [HMaster] INFO 
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: 
> TestTable,0000135598,1230761605500 open on XX.XX.XX.141:60020
> 2008-12-31 22:14:03,918 [HMaster] INFO 
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row 
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode 
> 1230759989031 and server XX.XX.XX.141:60020
> 2008-12-31 22:14:29,350 [RegionManager.metaScanner] DEBUG 
> org.apache.hadoop.hbase.master.BaseScanner: 
> TestTable,0000135598,1230761605500 no longer has references to 
> TestTable,0000116170,1230761152219
> {code}
> See how we choose to assign before we get the close back from the 
> regionserver.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
