[ https://issues.apache.org/jira/browse/HBASE-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell resolved HBASE-9635.
----------------------------------------
    Resolution: Incomplete

> HBase Table regions are not getting re-assigned to the new region server when
> it comes up (when the existing region server is not able to handle the load)
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-9635
>                 URL: https://issues.apache.org/jira/browse/HBASE-9635
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.94.11
>         Environment: SuSE11
>            Reporter: shankarlingayya
>            Priority: Major
>
> {noformat}
> HBase Table regions are not getting assigned to the new region server for a
> period of 30 minutes (when the existing region server is not able to handle
> the load).
>
> Procedure:
> 1. Set up a non-HA Hadoop cluster with two nodes (Node1-XX.XX.XX.XX,
>    Node2-YY.YY.YY.YY)
> 2. Install ZooKeeper & HRegionServer on Node1
> 3. Install HMaster & HRegionServer on Node2
> 4. From Node2, create an HBase table (table name 't1' with one column
>    family 'cf1')
> 5. Perform addrecord to insert 99649 rows
> 6. Kill the region servers on both nodes and limit the Node1 region
>    server's FD limit to 600
> 7. Start only the Node1 region server ==> so that FD exhaustion occurs on
>    the Node1 region server
> 8. After some 5-10 minutes, start the Node2 region server
> ===> A huge number of regions of table 't1' are stuck in OPENING state and
> are not getting re-assigned to the Node2 region server, which is free.
> ===> When the new region server comes up, the master should detect it and
> assign the open-failed regions to that region server (here they stay in
> OPENING state for 30 minutes, which has a huge impact on user applications
> that use this table).
> 2013-09-23 18:46:12,160 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated t1,row507465,1379937224590.2d9fad2aee78103f928d8c7fe16ba6cd.
> 2013-09-23 18:46:12,160 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=t1,row507465,1379937224590.2d9fad2aee78103f928d8c7fe16ba6cd., starting to roll back the global memstore size.
> 2013-09-23 18:50:55,284 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 309 seconds. Will retry shortly ...
> java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "HOST-XX.XX.XX.XX/XX.XX.XX.XX"; destination host is: "HOST-XX.XX.XX.XX":8020;
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1351)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>     at $Proxy13.renewLease(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:188)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at $Proxy13.renewLease(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:522)
>     at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:679)
>     at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
>     at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
>     at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
>     at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.SocketException: Too many open files
>     at sun.nio.ch.Net.socket0(Native Method)
>     at sun.nio.ch.Net.socket(Net.java:97)
>     at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
>     at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
>     at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
>     at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:523)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
>     at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1318)
>     ... 16 more
> 2013-09-23 18:50:56,285 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 310 seconds. Will retry shortly ...
> java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "HOST-XX.XX.XX.XX/XX.XX.XX.XX"; destination host is: "HOST-XX.XX.XX.XX":8020;
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1351)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1300)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>     at $Proxy13.renewLease(Unknown Source)
>     at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:188)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at $Proxy13.renewLease(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:522)
>     at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:679)
>     at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
>     at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
>     at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
>     at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.SocketException: Too many open files
>     at sun.nio.ch.Net.socket0(Native Method)
>     at sun.nio.ch.Net.socket(Net.java:97)
>     at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
>     at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
>     at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
>     at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:523)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
>     at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1318)
>     ... 16 more
> 2013-09-23 18:50:57,287 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to renew lease for [DFSClient_hb_rs_HOST-XX.XX.XX.XX,61020,1379940823286_-641204614_48] for 311 seconds. Will retry shortly ...
> java.io.IOException: Failed on local exception: java.net.SocketException: Too many open files; Host Details : local host is: "HOST-XX.XX.XX.XX/XX.XX.XX.XX"; destination host is: "HOST-XX.XX.XX.XX":8020;
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
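For anyone reproducing step 6 of the procedure above, the file-descriptor cap can be applied like this minimal sketch. The cap is set in a subshell so only the region server process inherits it; the `/opt/hbase` install path is an assumption, not taken from the report.

```shell
# Sketch of step 6: cap the region server's open-file limit at 600.
# Running inside ( ... ) keeps the lowered limit local to the subshell,
# so the parent shell's FD limit is untouched.
(
  ulimit -n 600    # lower the soft file-descriptor limit to 600
  ulimit -n        # verify the effective limit: prints 600
  # Hypothetical start command; path assumed from a default HBase layout:
  # /opt/hbase/bin/hbase-daemon.sh start regionserver
)
```

With the limit at 600, the region server exhausts its descriptors while opening regions, which matches the "Too many open files" SocketExceptions in the log above.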