ruanhui created HBASE-27967:
-------------------------------

             Summary: introduce a ConnectionLimitHandler to limit the number of concurrent connections to the Server
                 Key: HBASE-27967
                 URL: https://issues.apache.org/jira/browse/HBASE-27967
             Project: HBase
          Issue Type: New Feature
          Components: IPC/RPC
    Affects Versions: 3.0.0-alpha-4
            Reporter: ruanhui
             Fix For: 3.0.0-beta-1
Excessive client retries can exhaust the server's file descriptors, so the HBase server can no longer accept or create new connections and eventually hangs. We can consider introducing a ConnectionLimitHandler, similar to Cassandra's, into our NettyRpcServer to protect the HBase servers.

ERROR [master:store-WAL-Roller] master.HMaster: ***** ABORTING master hmaster,60000,1679921578648: IOE in log roller *****
java.net.SocketException: Call From hmaster/hmaster to namenode:9000 failed on socket exception: java.net.SocketException: Too many open files; For more details see: http://wiki.apache.org/hadoop/SocketException
java.io.IOException: Too many open files
	at java.base/sun.nio.ch.Net.accept(Native Method)
	at java.base/sun.nio.ch.ServerSocketChannelImpl.implAccept(ServerSocketChannelImpl.java:425)
	at java.base/sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:391)
	at org.apt
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:376)
	at jdk.proxy2/jdk.proxy2.$Proxy24.getFileInfo(Unknown Source)
	at jdk.internal.reflect.GeneratedMethodAccessor139.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:376)
	at jdk.proxy2/jdk.proxy2.$Proxy24.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1753)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1617)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1614)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1629)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1713)
	at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.getNewPath(AbstractFSWAL.java:582)
	at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:843)
	at org.apache.hadoop.hbase.wal.AbstractWALRoller$RollController.rollWal(AbstractWALRoller.java:268)
	at org.apache.hadoop.hbase.wal.AbstractWALRoller.run(AbstractWALRoller.java:187)
Caused by: java.net.SocketException: Too many open files
	at java.base/sun.nio.ch.Net.socket0(Native Method)
	at java.base/sun.nio.ch.Net.socket(Net.java:524)
	at java.base/sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:146)
	at java.base/sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:129)
	at java.base/sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:77)
	at java.base/java.nio.channels.SocketChannel.open(SocketChannel.java:192)
	at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:656)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812)
	at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
	at org.apache.hadoop.ipc.Client.call(Client.java:1452)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
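Cassandra's handler works by keeping an atomic count of open channels and closing any new channel once the configured limit is reached (incrementing on channelActive, decrementing on channelInactive). A minimal, dependency-free sketch of the accounting such a handler could delegate to; the class name, API, and limit value below are illustrative, not existing HBase or Cassandra code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical connection-limit accounting. In NettyRpcServer, a
// ChannelInboundHandler could call tryAccept() from channelActive()
// (closing the channel on false) and release() from channelInactive().
public class ConnectionLimiter {
    private final int maxConnections;
    private final AtomicInteger open = new AtomicInteger();

    public ConnectionLimiter(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    /** Returns true if the new connection is admitted; false means reject/close it. */
    public boolean tryAccept() {
        while (true) {
            int cur = open.get();
            if (cur >= maxConnections) {
                return false; // over the limit: caller should close the channel
            }
            if (open.compareAndSet(cur, cur + 1)) {
                return true;
            }
        }
    }

    /** Called when an admitted connection closes. */
    public void release() {
        open.decrementAndGet();
    }

    public int openConnections() {
        return open.get();
    }
}
```

The compare-and-set loop (rather than a bare incrementAndGet) keeps the counter from ever exceeding the limit, so rejected channels never need a compensating decrement. Closing excess channels early bounds file-descriptor usage, which is exactly what the "Too many open files" failure above exhausts.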