[ https://issues.apache.org/jira/browse/HDFS-17311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liuguanghua updated HDFS-17311:
-------------------------------
    Description: 
In the Router, the following log was found:

2023-12-29 15:18:54,799 ERROR org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add more than 2048 connections at the same time

The log indicates that ConnectionManager.creatorQueue was full at that point. However, my cluster does not have enough users to reach 2048 <user,nn> pairs.

This may be due to the following reasons:
# ConnectionManager.creatorQueue is a queue to which a ConnectionPool is offered when the pool does not have enough ConnectionContexts.
# The ConnectionCreator thread consumes from creatorQueue and creates more ConnectionContexts for a ConnectionPool.
# Clients invoke ConnectionManager.getConnection() concurrently for the same user. This may add the same ConnectionPool to ConnectionManager.creatorQueue many times.
# When creatorQueue is full, a new ConnectionPool cannot be added and this error is logged. As a result, a genuinely new ConnectionPool may be unable to get more ConnectionContexts created for a new user.

So this PR tries to ensure that creatorQueue does not hold the same ConnectionPool more than once at a time.

  was:
2023-12-29 15:18:54,799 ERROR org.apache.hadoop.hdfs.server.federation.router.ConnectionManager: Cannot add more than 2048 connections at the same time

In my environment, ConnectionManager creatorQueue is full, but the cluster does not have enough users to reach 2048 <user,nn> pairs in the router. Under a large number of concurrent requests, creatorQueue adds the same pool more than once.

> RBF: ConnectionManager creatorQueue should offer a pool that is not already
> in creatorQueue.
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17311
>                 URL: https://issues.apache.org/jira/browse/HDFS-17311
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: liuguanghua
>            Assignee: liuguanghua
>            Priority: Major
>              Labels: pull-request-available

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
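
For illustration only, the idea in the description (do not offer a ConnectionPool that is already waiting in creatorQueue) could be sketched roughly as below. This is not the patch for HDFS-17311 and not the actual ConnectionManager code. The class DedupCreatorQueue, its Result enum, and the use of a String such as "user1@nn1" as a stand-in for a ConnectionPool are all hypothetical names chosen for this example. The sketch pairs a bounded queue with a concurrent set of pending entries, so that repeated offers for the same pool take only one slot.

```java
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not the Hadoop ConnectionManager implementation. */
class DedupCreatorQueue<T> {
  /** Outcome of an offer (hypothetical; the real code logs an error when full). */
  enum Result { QUEUED, ALREADY_PENDING, FULL }

  // Bounded queue, playing the role of creatorQueue.
  private final BlockingQueue<T> queue;
  // Entries that are currently waiting in the queue.
  private final Set<T> pending = ConcurrentHashMap.newKeySet();

  DedupCreatorQueue(int capacity) {
    this.queue = new ArrayBlockingQueue<>(capacity);
  }

  /** Offer a pool unless an equal pool is already waiting in the queue. */
  Result offer(T pool) {
    if (!pending.add(pool)) {
      return Result.ALREADY_PENDING; // duplicate: do not consume another slot
    }
    if (!queue.offer(pool)) {
      pending.remove(pool);          // queue full: undo the reservation
      return Result.FULL;
    }
    return Result.QUEUED;
  }

  /** Called by the consumer (the creator thread in the description). */
  T poll() {
    T pool = queue.poll();
    if (pool != null) {
      pending.remove(pool);          // the pool may be offered again from now on
    }
    return pool;
  }

  int size() {
    return queue.size();
  }

  public static void main(String[] args) {
    DedupCreatorQueue<String> q = new DedupCreatorQueue<>(2);
    System.out.println(q.offer("user1@nn1")); // QUEUED
    System.out.println(q.offer("user1@nn1")); // ALREADY_PENDING
    System.out.println(q.offer("user2@nn1")); // QUEUED
    System.out.println(q.offer("user3@nn1")); // FULL
    System.out.println(q.poll());             // user1@nn1
    System.out.println(q.offer("user1@nn1")); // QUEUED
  }
}
```

With this shape, many concurrent getConnection() calls for the same user would occupy at most one slot, leaving room for pools that belong to new users. There is a small window between the set update and the queue operation in which a concurrent offer can be reported as ALREADY_PENDING even though the first offer then fails on a full queue. A real change would need to decide whether that is acceptable or whether a lock is required.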