[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14207:
---
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

Pushed to 0.98

> Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
> --
>
> Key: HBASE-14207
> URL: https://issues.apache.org/jira/browse/HBASE-14207
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.98.6
> Reporter: Pankaj Kumar
> Assignee: Pankaj Kumar
> Priority: Critical
> Fix For: 0.98.15
>
> Attachments: HBASE-14207-0.98-V2.patch, HBASE-14207-0.98-V2.patch, HBASE-14207-0.98.patch
>
>
> In a production environment, the following events happened:
> 1. The master tried to assign a region to an RS, but the RS failed to open the region due to KeeperException$SessionExpiredException.
> The RS log showed multiple WARN entries related to KeeperException$SessionExpiredException:
> > KeeperErrorCode = Session expired for /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> > Unable to get data of znode /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> 2. The master retried assigning the region to the same RS, but the RS failed again.
> 3. On the second retry a new plan was formed, this time with a different destination RS, so the master sent the open request to the new RS. The new RS also failed to open the region because the server name recorded in the znode did not match the expected current server name.
> Logs Snippet:
> {noformat}
> HM
> 2015-07-14 03:50:29,759 | INFO | master:T101PC03VM13:21300 | Processing 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE | org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
> 2015-07-14 03:50:29,759 | INFO | master:T101PC03VM13:21300 | Transitioned {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679, server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817029759, server=T101PC03VM13,21302,1436816690692} | org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:29,760 | INFO | master:T101PC03VM13:21300 | Processed region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on server: T101PC03VM13,21302,1436816690692 | org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
> 2015-07-14 03:50:29,800 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:29,801 | WARN | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1 of 10 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:29,802 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to the same failed server. | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
> 2015-07-14 03:50:31,804 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,806 | WARN | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2 of 10 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:31,807 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804, server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} | org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:31,807 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM14,21302,1436816997967 |
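The sequence reported above can be replayed with a small model. The sketch below is only an illustration under stated assumptions: `TransitionZnodeSketch`, `forceOffline`, `tryOpening` and `openOnNewServer` are hypothetical names, not HBase classes or methods, and the rule that a region server may only move the znode out of OFFLINE when the znode names that server is inferred from the "server mismatch" wording of the report. The two server names are taken from the log snippet.

```java
import java.util.Objects;

// Hypothetical, simplified model of a region-in-transition znode.
// None of these types are HBase classes; they only mirror the report.
final class TransitionZnodeSketch {
    enum State { OFFLINE, OPENING }

    private State state;
    private String serverName; // server the master recorded in the znode

    /** Master side: (re)write the znode as OFFLINE for the given destination. */
    void forceOffline(String destination) {
        this.state = State.OFFLINE;
        this.serverName = destination;
    }

    /** Region server side: move OFFLINE -> OPENING only if the znode names this server. */
    boolean tryOpening(String currentServer) {
        if (state != State.OFFLINE || !Objects.equals(serverName, currentServer)) {
            return false; // server mismatch: the region stays in transition
        }
        state = State.OPENING;
        return true;
    }

    /** Replays the reported sequence; rewriteOnPlanChange toggles the znode refresh. */
    static boolean openOnNewServer(boolean rewriteOnPlanChange) {
        String firstServer = "T101PC03VM13,21302,1436816690692";
        String secondServer = "T101PC03VM14,21302,1436816997967";

        TransitionZnodeSketch znode = new TransitionZnodeSketch();
        znode.forceOffline(firstServer); // initial plan

        // The first server fails twice (session expired), so it never calls
        // tryOpening and the znode stays OFFLINE with the first server's name.
        if (rewriteOnPlanChange) {
            znode.forceOffline(secondServer); // refresh the znode for the new plan
        }
        return znode.tryOpening(secondServer);
    }

    public static void main(String[] args) {
        System.out.println(openOnNewServer(false)); // false: stale server name in the znode
        System.out.println(openOnNewServer(true));  // true: znode rewritten for the new plan
    }
}
```

In this model the region can only be opened on the second server once the znode has been rewritten with the new destination, which is the behaviour the report describes as missing.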
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14207:
---
    Attachment: HBASE-14207-0.98-V2.patch
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14207:
---
    Status: Patch Available (was: Open)

Reattach to kick HadoopQA
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14207:
---
    Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-14207:
--
    Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pankaj Kumar updated HBASE-14207:
-
    Attachment: HBASE-14207-0.98-V2.patch

Modified patch with below changes
{code}
if (useZKForAssignment) {
  setOfflineInZK = true;
}
{code}
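One possible reading of the change quoted in this update is that, when the retry picks a different destination and ZooKeeper-based assignment is in use, the master should rewrite the OFFLINE znode for the new destination. The sketch below illustrates only that reading. `RetryPlanSketch` and `znodeServerAfterRetry` are hypothetical names, the surrounding control flow is not taken from the patch, and the starting value `setOfflineInZK == false` is an assumption (the znode is taken to exist already when the first attempt runs). Only the two identifiers `useZKForAssignment` and `setOfflineInZK` come from the quoted snippet.

```java
// Hypothetical sketch of a master-side retry decision; not the HBase AssignmentManager.
final class RetryPlanSketch {
    /**
     * Returns the server name left in the znode after one retry.
     * Assumption: the first attempt ran with setOfflineInZK == false because
     * the znode already existed in OFFLINE state for oldDest.
     */
    static String znodeServerAfterRetry(boolean useZKForAssignment, String oldDest, String newDest) {
        String znodeServer = oldDest;   // what the znode currently records
        boolean setOfflineInZK = false; // assumed starting value, see above
        String plan = oldDest;

        if (!plan.equals(newDest)) {
            // The guarded assignment quoted in the update above.
            if (useZKForAssignment) {
                setOfflineInZK = true;
            }
            plan = newDest;
        }
        if (setOfflineInZK) {
            znodeServer = plan; // stands in for rewriting the OFFLINE znode
        }
        return znodeServer;
    }

    public static void main(String[] args) {
        System.out.println(znodeServerAfterRetry(true, "rs-a", "rs-b"));  // rs-b
        System.out.println(znodeServerAfterRetry(false, "rs-a", "rs-b")); // rs-a
    }
}
```

With the guard in place, the znode is only touched when ZooKeeper-based assignment is enabled, so a ZK-less setup is left unchanged by the retry.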
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14207:
---
    Fix Version/s: (was: 0.98.14)
    Status: Open (was: Patch Available)

bq. org.apache.hadoop.hbase.master.TestZKLessAMOnCluster

This looks like a relevant test failure.
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-14207: - Attachment: HBASE-14207-0.98.patch Attached patch for 0.98, but I think this bug may exist with ZK assignment.
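The failure sequence reported in this issue (the region's znode still naming the first RS after the plan moved to a second RS) can be illustrated with a small toy model. This is only a sketch and not HBase code: the class `ZnodeMismatchSketch`, the methods `recordPlan` and `tryOpen`, and the in-memory map standing in for /hbase/region-in-transition are all hypothetical, and the model simply assumes that the znode is not rewritten when the plan changes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the reported symptom. Hypothetical names; not the HBase API. */
public class ZnodeMismatchSketch {
    /** Stand-in for /hbase/region-in-transition: region name -> server recorded in the znode. */
    static final Map<String, String> znodeServer = new HashMap<>();

    /** Master side: record the destination server of the current plan in the znode. */
    static void recordPlan(String region, String server) {
        znodeServer.put(region, server);
    }

    /** RS side: the open is accepted only if the znode names this server. */
    static boolean tryOpen(String region, String thisServer) {
        return thisServer.equals(znodeServer.get(region));
    }

    public static void main(String[] args) {
        String region = "08f1935d652e5dbdac09b423b8f9401b";
        String firstRs = "T101PC03VM13,21302,1436816690692";
        String secondRs = "T101PC03VM14,21302,1436816997967";

        // First plan: the znode names the first RS.
        recordPlan(region, firstRs);

        // Assumption: the plan changes to the second RS but the znode is left as-is,
        // so the second RS sees a server name that is not its own.
        System.out.println("open on new RS accepted: " + tryOpen(region, secondRs));
    }
}
```

Under that assumption the second RS rejects the open, which matches the "server mismatch in ZNODE" symptom in the description; the sketch says nothing about how the attached patch resolves it.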
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-14207: - Fix Version/s: 0.98.15 0.98.14 Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-14207: - Affects Version/s: 0.98.6
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-14207: - Priority: Critical (was: Major)
[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry
[ https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankaj Kumar updated HBASE-14207: - Description:
In a production environment, the following events happened:
1. The master tried to assign a region to an RS, but the RS failed to open the region due to KeeperException$SessionExpiredException. The RS log showed multiple WARN entries related to KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
Unable to get data of znode /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
2. The master retried assigning the region to the same RS, but the RS failed again.
3. On the second retry a new plan was formed, this time with a different destination RS, so the master sent the open request to the new RS. The new RS also failed to open the region, because the server name in the znode did not match the expected current server name.
Logs Snippet:
{noformat}
HM
2015-07-14 03:50:29,759 | INFO | master:T101PC03VM13:21300 | Processing 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE | org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
2015-07-14 03:50:29,759 | INFO | master:T101PC03VM13:21300 | Transitioned {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679, server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817029759, server=T101PC03VM13,21302,1436816690692} | org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
2015-07-14 03:50:29,760 | INFO | master:T101PC03VM13:21300 | Processed region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on server: T101PC03VM13,21302,1436816690692 | org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
2015-07-14 03:50:29,800 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
2015-07-14 03:50:29,801 | WARN | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1 of 10 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
2015-07-14 03:50:29,802 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to the same failed server. | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
2015-07-14 03:50:31,804 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
2015-07-14 03:50:31,806 | WARN | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2 of 10 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
2015-07-14 03:50:31,807 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804, server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} | org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
2015-07-14 03:50:31,807 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to T101PC03VM14,21302,1436816997967 | org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
2015-07-14 03:50:31,807 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031807, server=T101PC03VM14,21302,1436816997967} | org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
2015-07-14 03:51:09,501 | INFO | MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-4 | Skip assigning region in transition on other server{08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031807, server=T101PC03VM14,21302,1436816997967} |