[ https://issues.apache.org/jira/browse/HDFS-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhuobin zheng updated HDFS-9803:
--------------------------------
    Description:
My region server logs are flooded with messages like "SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to <hdfs-file-path> ... is expired". These logs correspond with responseTooSlow WARNings from the region server.
{noformat}
2016-01-19 22:10:14,432 INFO [B.defaultRpcServer.handler=4,queue=1,port=16020] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception.
org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to <hfile path> token with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, blockId=1074037633, access modes=[READ]) is expired.
	at org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591)
	at org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
	at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
	at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
	at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
...
{noformat}
A potential solution could be to have a background thread that makes a best effort to proactively refresh tokens in the cache before they expire, so as to minimize latency impact on the critical path.
Thanks to [~cnauroth] for providing an explanation and suggesting a solution over on the [user list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E].

  was:
My region server logs are flooded with messages like "SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to <hdfs-file-path> ... is expired". These logs correspond with responseTooSlow WARNings from the region server.
{noformat}
2016-01-19 22:10:14,432 INFO [B.defaultRpcServer.handler=4,queue=1,port=16020] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception.
org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to <hfile path> token with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, blockId=1074037633, access modes=[READ]) is expired.
	at org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591)
	at org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
	at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
	at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
	at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
...
{noformat}
A potential solution could be to have a background thread that makes a best effort to proactively refresh tokens in the cache before they expire, so as to minimize latency impact on the critical path.
Thanks to [~cnauroth] for providing an explanation and suggesting a solution over on the [user list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E].

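The background-refresh idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the ShortCircuitCache implementation or any existing HDFS API: the class `ProactiveTokenRefresher`, its nested `Token` type, the `fetcher` callback, and the `refreshMarginMillis` parameter are all hypothetical names introduced here. The sketch keeps tokens in a map and has a single daemon thread periodically replace any token that is due to expire within a configurable margin, so the read path would rarely meet an expired token.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch of best-effort proactive token refresh; not an HDFS API. */
class ProactiveTokenRefresher<K> {

  /** Hypothetical cached token carrying an absolute expiry time (epoch millis). */
  static final class Token {
    private final long expiryMillis;
    Token(long expiryMillis) { this.expiryMillis = expiryMillis; }
    long getExpiryMillis() { return expiryMillis; }
  }

  private final Map<K, Token> cache = new ConcurrentHashMap<>();
  private final Function<K, Token> fetcher;   // assumed: obtains a fresh token for a key
  private final long refreshMarginMillis;     // refresh tokens expiring within this margin
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "token-refresher");
        t.setDaemon(true);                    // never keep the JVM alive for refreshes
        return t;
      });

  ProactiveTokenRefresher(Function<K, Token> fetcher, long refreshMarginMillis) {
    this.fetcher = fetcher;
    this.refreshMarginMillis = refreshMarginMillis;
  }

  void put(K key, Token token) { cache.put(key, token); }

  Token get(K key) { return cache.get(key); }

  /** One best-effort pass over the cache; returns how many tokens were replaced. */
  int refreshExpiringSoon(long nowMillis) {
    int refreshed = 0;
    for (Map.Entry<K, Token> e : cache.entrySet()) {
      Token old = e.getValue();
      if (old.getExpiryMillis() - nowMillis <= refreshMarginMillis) {
        try {
          Token fresh = fetcher.apply(e.getKey());
          // Replace only if nobody else swapped the entry in the meantime.
          if (fresh != null && cache.replace(e.getKey(), old, fresh)) {
            refreshed++;
          }
        } catch (RuntimeException ex) {
          // Best effort: keep the old token; the reader falls back to the slow path.
        }
      }
    }
    return refreshed;
  }

  /** Starts the periodic background scan. */
  void start(long periodMillis) {
    scheduler.scheduleWithFixedDelay(
        () -> refreshExpiringSoon(System.currentTimeMillis()),
        periodMillis, periodMillis, TimeUnit.MILLISECONDS);
  }

  void stop() { scheduler.shutdownNow(); }
}
```

In this sketch the scan is exposed as `refreshExpiringSoon(nowMillis)` so that it can be exercised with a fixed clock; a failed fetch leaves the old entry in place, which matches the "best effort" wording of the proposal.
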
> Proactively refresh ShortCircuitCache entries to avoid latency spikes
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9803
>                 URL: https://issues.apache.org/jira/browse/HDFS-9803
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Nick Dimiduk
>            Priority: Major
>
> My region server logs are flooded with messages like "SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to <hdfs-file-path> ... is expired". These logs correspond with responseTooSlow WARNings from the region server.
> {noformat}
> 2016-01-19 22:10:14,432 INFO [B.defaultRpcServer.handler=4,queue=1,port=16020] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x71bdc547): could not load 1074037633_BP-1145309065-XXX-1448053136416 due to InvalidToken exception.
> org.apache.hadoop.security.token.SecretManager$InvalidToken: access control error while attempting to set up short-circuit access to <hfile path> token with block_token_identifier (expiryDate=1453194430724, keyId=1508822027, userId=hbase, blockPoolId=BP-1145309065-XXX-1448053136416, blockId=1074037633, access modes=[READ]) is expired.
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:591)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:490)
> 	at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:782)
> 	at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:716)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)
> 	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:618)
> 	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:844)
> 	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
> 	at java.io.DataInputStream.read(DataInputStream.java:149)
> 	at org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:678)
> 	at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1372)
> 	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1591)
> 	at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1470)
> 	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
> ...
> {noformat}
> A potential solution could be to have a background thread that makes a best effort to proactively refresh tokens in the cache before they expire, so as to minimize latency impact on the critical path.
> Thanks to [~cnauroth] for providing an explanation and suggesting a solution over on the [user list|http://mail-archives.apache.org/mod_mbox/hadoop-user/201601.mbox/%3CCANZa%3DGt%3Dhvuf3fyOJqf-jdpBPL_xDknKBcp7LmaC-YUm0jDUVg%40mail.gmail.com%3E].

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org