[ https://issues.apache.org/jira/browse/HDFS-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537032#comment-15537032 ]
Chen Liang commented on HDFS-10932:
-----------------------------------

The failed tests do not seem to be related. Local runs do not reproduce the failures in {{TestOzoneRestWithMiniCluster}} and {{TestBlockTokenWithDFSStriped}}, but {{TestBlockPoolManager.testFederationRefresh}} fails consistently in local runs both with and without the patch. It may need to be fixed separately.

> Ozone : fix XceiverClient slow shutdown
> ---------------------------------------
>
>                 Key: HDFS-10932
>                 URL: https://issues.apache.org/jira/browse/HDFS-10932
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-10932-HDFS-7240.002.patch, HDFS-10932.001.patch
>
>
> Currently {{XceiverClient}} is the underlying entity that
> {{DistributedStorageHandler.newKeyWriter()}} and
> {{DistributedStorageHandler.newKeyReader()}} use to make read/write calls
> to a container. When an {{XceiverClient}} is closed,
> {{group.shutdownGracefully()}} is called, which is an asynchronous call.
> The problem is that this asynchronous call has a default quiet period of 2
> seconds before it actually shuts down, so under a burst of read/write
> calls threads are created faster than they are terminated, and the
> process eventually reaches the system limit.
> Ideally, this should be fixed by caching clients instead of creating new
> threads each time. This JIRA only tries to provide a temporary fix for the
> time being.
> Thanks [~anu] for the offline discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
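To make the thread build-up described in the issue concrete, here is a rough back-of-the-envelope simulation. It is only a sketch under stated assumptions, not code from the attached patches: the class {{ShutdownBacklog}}, the method {{peakLingering}}, and the rate of 500 clients per second are all made up for illustration, and each client is assumed to be closed immediately after it is created.

```java
import java.util.ArrayDeque;

// Hypothetical helper, not part of Hadoop or Ozone.
class ShutdownBacklog {

    /**
     * Simulates clients that are created and closed back to back at a fixed
     * rate. Each closed client's thread group stays alive until its quiet
     * period has elapsed. Returns the peak number of groups alive at once.
     */
    static int peakLingering(int clientsPerSecond, long quietPeriodMillis, long durationMillis) {
        // Termination time (ms) of every group that has been closed but not yet released.
        ArrayDeque<Long> expiry = new ArrayDeque<>();
        int peak = 0;
        long totalClients = clientsPerSecond * durationMillis / 1000;
        for (long i = 0; i < totalClients; i++) {
            long now = i * 1000 / clientsPerSecond; // creation and close time of client i
            // Release groups whose quiet period is already over.
            while (!expiry.isEmpty() && expiry.peekFirst() <= now) {
                expiry.pollFirst();
            }
            expiry.addLast(now + quietPeriodMillis);
            peak = Math.max(peak, expiry.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        // Assumed burst: 500 clients per second for 10 seconds.
        System.out.println(peakLingering(500, 2000, 10_000)); // 2 s quiet period: prints 1000
        System.out.println(peakLingering(500, 0, 10_000));    // no quiet period: prints 1
    }
}
```

With the 2 second quiet period, roughly rate times quiet period groups (each holding at least one thread) are alive at once, which is how a burst can run into the system limit. Assuming {{group}} is a Netty event loop group, one possible shape for a temporary fix is the {{shutdownGracefully(long quietPeriod, long timeout, TimeUnit unit)}} overload with a shorter quiet period; whether the attached patch does exactly this is not stated here.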