[ https://issues.apache.org/jira/browse/HDFS-16198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HDFS-16198:
-----------------------------------
    Fix Version/s: 3.4.0

> Short circuit read leaks Slot objects when InvalidToken exception is thrown
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-16198
>                 URL: https://issues.apache.org/jira/browse/HDFS-16198
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eungsop Yoo
>            Assignee: Eungsop Yoo
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>         Attachments: HDFS-16198.patch, screenshot-2.png
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In secure mode, 'dfs.block.access.token.enable' should be set to 'true'.
> With this configuration, a SecretManager.InvalidToken exception may be thrown
> if the access token expires during a short circuit read. The exception itself
> is harmless because the failed read is retried, but it leaks
> ShortCircuitShm.Slot objects.
>
> We found this problem in our secure HBase clusters. The number of open file
> descriptors held by RegionServers kept increasing while short circuit reads
> were in use.
> !screenshot-2.png!
>
> It was caused by leaked shared memory segments used by short circuit reading.
> {code:java}
> [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk '{print $2}') | grep /dev/shm | wc -l
> 3925
> [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk '{print $2}') | grep /dev/shm | head -5
> java 86309 hbase DEL REG 0,19 2308279984 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_743473959
> java 86309 hbase DEL REG 0,19 2306359893 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_1594162967
> java 86309 hbase DEL REG 0,19 2305496758 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_2043027439
> java 86309 hbase DEL REG 0,19 2304784261 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_689571088
> java 86309 hbase DEL REG 0,19 2302621988 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_347008590
> {code}
>
> We eventually traced the root cause to leaked ShortCircuitShm.Slot objects.
>
> The fix is simple: free the slot when an InvalidToken exception is thrown.
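
The fix described in the report (free the slot when the token check fails, before the read is retried) can be illustrated with a minimal, self-contained Java sketch. It does not use the real Hadoop classes: SlotPool, InvalidTokenException, requestBlock, and readWithRetry are hypothetical stand-ins for ShortCircuitShm, SecretManager.InvalidToken, and the short circuit read path, and the actual change in HDFS-16198.patch may be structured differently.

```java
import java.util.BitSet;

// Simplified model of the leak; none of these types are the real Hadoop classes.
final class SlotLeakSketch {

    // Hypothetical stand-in for SecretManager.InvalidToken.
    static final class InvalidTokenException extends Exception {
        InvalidTokenException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for a shared memory segment with a fixed number of slots.
    static final class SlotPool {
        private final BitSet allocated = new BitSet();
        private final int capacity;

        SlotPool(int capacity) {
            this.capacity = capacity;
        }

        // Reserves the lowest free slot and returns its index.
        int allocSlot() {
            int index = allocated.nextClearBit(0);
            if (index >= capacity) {
                throw new IllegalStateException("no free slots");
            }
            allocated.set(index);
            return index;
        }

        void freeSlot(int index) {
            allocated.clear(index);
        }

        int inUse() {
            return allocated.cardinality();
        }
    }

    // Simulated request: fails with an expired token while failuresLeft[0] > 0.
    static void requestBlock(int[] failuresLeft) throws InvalidTokenException {
        if (failuresLeft[0] > 0) {
            failuresLeft[0]--;
            throw new InvalidTokenException("access token expired");
        }
    }

    // Retries until the request succeeds; returns the slot held by the successful read.
    static int readWithRetry(SlotPool pool, int tokenFailures, boolean freeOnInvalidToken) {
        int[] failuresLeft = {tokenFailures};
        while (true) {
            int slot = pool.allocSlot();
            try {
                requestBlock(failuresLeft);
                return slot;
            } catch (InvalidTokenException e) {
                // Without this release, every failed attempt strands one slot.
                if (freeOnInvalidToken) {
                    pool.freeSlot(slot);
                }
                // Fall through and retry with a newly allocated slot.
            }
        }
    }

    public static void main(String[] args) {
        SlotPool leaky = new SlotPool(16);
        readWithRetry(leaky, 3, false);
        System.out.println("slots in use without the fix: " + leaky.inUse());

        SlotPool fixed = new SlotPool(16);
        readWithRetry(fixed, 3, true);
        System.out.println("slots in use with the fix: " + fixed.inUse());
    }
}
```

In this model, three simulated token failures leave four slots in use without the release (three stranded plus the one held by the successful read) and one slot in use with it. In the real client, stranded slots would keep their shared memory segments alive, which is consistent with the growing /dev/shm entries shown in the lsof output.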