[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Fix Version/s: 0.94.5

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Fix For: 0.96.0, 0.94.5
    Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch

1. A full GC happens on the region server hosting ROOT.
2. The ZK session times out; the master expires the region server and submits it to ServerShutdownHandler.
3. The region server completes the full GC.
4. While ServerShutdownHandler is processing, verifyRootRegionLocation returns true.
5. ServerShutdownHandler skips assigning the ROOT region.
6. The region server aborts itself because it receives YouAreDeadException after a region server report.
7. ROOT is now offline and won't be assigned again unless we restart the master.

Master log:
{code}
2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted shutdown handler to be executed, root=true, meta=false
2012-10-31 19:51:39,045 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for dw88.kgb.sqa.cm4,60020,1351671478752
2012-10-31 19:51:50,113 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server REPORT rejected; currently processing dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
2012-10-31 19:52:15,945 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log splitting for dw88.kgb.sqa.cm4,60020,1351671478752
{code}
There is no log entry for assigning ROOT.

Regionserver log:
{code}
2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 229128ms instead of 10ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
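
The seven steps in the description can be replayed with a small toy model. This is only a sketch under stated assumptions: `FakeRegionServer`, `FakeMaster` and their methods are hypothetical stand-ins, not HBase classes, and the "guarded" variant is one possible way to close the window, not necessarily what the attached patches do. It assumes the liveness check in step 4 succeeds simply because the expired server answers again once its GC pause ends.

```java
/** Toy model of the race; every name here is hypothetical, not an HBase class. */
public class RootAssignRace {

    /** Stand-in for a region server that may or may not answer requests. */
    static final class FakeRegionServer {
        final String name;
        volatile boolean alive = true;
        FakeRegionServer(String name) { this.name = name; }
    }

    /** Stand-in for the master's view of where ROOT is hosted. */
    static final class FakeMaster {
        FakeRegionServer rootLocation;

        /** Models step 4: true if the recorded ROOT host currently answers. */
        boolean verifyRootLocation() {
            return rootLocation != null && rootLocation.alive;
        }

        /** Naive handler: trusts the liveness check alone (steps 4 and 5). */
        void handleShutdownNaive(FakeRegionServer dead) {
            if (!verifyRootLocation()) {
                assignRootElsewhere();
            }
            // Otherwise assignment is skipped, even if the responder is the dead server.
        }

        /** Guarded handler: also reassigns when ROOT sits on the expired server. */
        void handleShutdownGuarded(FakeRegionServer dead) {
            if (!verifyRootLocation() || rootLocation == dead) {
                assignRootElsewhere();
            }
        }

        void assignRootElsewhere() {
            rootLocation = new FakeRegionServer("replacement-rs");
        }
    }

    /** Replays the scenario and reports whether ROOT is reachable at the end. */
    public static boolean rootOnlineAfterScenario(boolean guarded) {
        FakeRegionServer rs = new FakeRegionServer("root-rs");
        FakeMaster master = new FakeMaster();
        master.rootLocation = rs;

        rs.alive = false;   // steps 1-2: full GC, ZK session expires
        rs.alive = true;    // step 3: GC ends, the server answers again
        if (guarded) {      // steps 4-5: the shutdown handler runs
            master.handleShutdownGuarded(rs);
        } else {
            master.handleShutdownNaive(rs);
        }
        rs.alive = false;   // step 6: the server aborts on YouAreDeadException
        return master.verifyRootLocation();   // step 7: is ROOT still reachable?
    }

    public static void main(String[] args) {
        System.out.println("naive handler, ROOT online: " + rootOnlineAfterScenario(false));
        System.out.println("guarded handler, ROOT online: " + rootOnlineAfterScenario(true));
    }
}
```

In this model the naive handler leaves ROOT pointing at a server that is about to abort, while the guarded handler treats "ROOT answers from the server being expired" the same as "ROOT does not answer".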
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Attachment: 7504-94.patch

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Fix For: 0.96.0, 0.94.5
    Attachments: 7504-94.patch, 7504-trunk v1.patch, 7504-trunk v2.patch
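
The Sleeper warning in the region server log ("We slept 229128ms instead of 10ms") is what ties the GC pause to the ZK session expiry in step 2. The arithmetic can be sketched as below. The class and method names are hypothetical, and the 180000 ms session timeout is an assumption for illustration: the issue does not state the configured value.

```java
/** Hypothetical helper for reasoning about a long pause; not an HBase class. */
public class PauseCheck {

    /** Milliseconds slept beyond what was requested (never negative). */
    public static long overshootMillis(long requestedMs, long sleptMs) {
        return Math.max(0L, sleptMs - requestedMs);
    }

    /** True if a pause of sleptMs is at least as long as the session timeout. */
    public static boolean outlastsSession(long sleptMs, long sessionTimeoutMs) {
        return sleptMs >= sessionTimeoutMs;
    }

    public static void main(String[] args) {
        long requestedMs = 10L;          // from the log line
        long sleptMs = 229128L;          // from the log line
        long sessionTimeoutMs = 180000L; // ASSUMPTION: not stated in the issue

        System.out.println("overshoot ms: " + overshootMillis(requestedMs, sleptMs));
        System.out.println("pause outlasts session: " + outlastsSession(sleptMs, sessionTimeoutMs));
    }
}
```

Under that assumed timeout, a pause of roughly 229 seconds is long enough for the session to expire before the server wakes, which matches the master adding the server to its dead list before the Sleeper warning is logged.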
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Fix For: 0.96.0, 0.94.5
    Attachments: 7504-94.patch, 7504-trunk v1.patch, 7504-trunk v2.patch
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Attachment: 7504-trunk v1.patch

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Attachments: 7504-trunk v1.patch
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Attachment: 7504-trunk v1.patch

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Attachments: 7504-trunk v1.patch
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Attachment: (was: 7504-trunk v1.patch)

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Attachments: 7504-trunk v1.patch
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Description:
1. A full GC happens on the region server hosting ROOT.
2. The ZK session times out; the master expires the region server and submits it to ServerShutdownHandler.
3. The region server completes the full GC.
4. While ServerShutdownHandler is processing, verifyRootRegionLocation returns true.
5. ServerShutdownHandler skips assigning the ROOT region.
6. The region server aborts itself because it receives YouAreDeadException after a region server report.
7. ROOT is now offline and won't be assigned again unless we restart the master.

Master log:
{code}
2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted shutdown handler to be executed, root=true, meta=false
2012-10-31 19:51:39,045 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for dw88.kgb.sqa.cm4,60020,1351671478752
2012-10-31 19:51:50,113 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server REPORT rejected; currently processing dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
2012-10-31 19:52:15,945 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log splitting for dw88.kgb.sqa.cm4,60020,1351671478752
{code}
There is no log entry for assigning ROOT.

Regionserver log:
{code}
2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 229128ms instead of 10ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
{code}

    was:
1. A full GC happens on the region server hosting ROOT.
2. The ZK session times out; the master expires the region server and submits it to ServerShutdownHandler.
3. The region server completes the full GC.
4. While ServerShutdownHandler is processing, verifyRootRegionLocation returns true.
5. ServerShutdownHandler skips assigning the -ROOT- region.
6. The region server aborts itself because it receives YouAreDeadException after a region server report.
7. -ROO- is now offline and won't be assigned again unless we restart the master.

Master log:
{code}
2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted shutdown handler to be executed, root=true, meta=false
2012-10-31 19:51:39,045 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for dw88.kgb.sqa.cm4,60020,1351671478752
2012-10-31 19:51:50,113 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server REPORT rejected; currently processing dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
2012-10-31 19:52:15,945 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log splitting for dw88.kgb.sqa.cm4,60020,1351671478752
{code}
There is no log entry for assigning -ROOT-.

Regionserver log:
{code}
2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 229128ms instead of 10ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
{code}

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Attachments: 7504-trunk v1.patch
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Attachment: 7504-trunk v2.patch

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
[jira] [Updated] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
[ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-7504:
    Fix Version/s: 0.96.0
    Status: Patch Available (was: Open)

-ROOT- may be offline forever after FullGC of RS
    Key: HBASE-7504
    URL: https://issues.apache.org/jira/browse/HBASE-7504
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.3
    Reporter: chunhui shen
    Assignee: chunhui shen
    Fix For: 0.96.0
    Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch