[ https://issues.apache.org/jira/browse/HDFS-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HDFS-16136:
-----------------------------------
    Description: 
After HDFS-10609 and HDFS-11741, we still observe InvalidEncryptionKeyException errors that are not retried.
{noformat}
2021-07-12 11:10:58,795 ERROR datanode.DataNode (DataXceiver.java:writeBlock(863)) - DataNode{data=FSDataset{dirpath='[/grid/01/hadoop/hdfs/data, /grid/02/hadoop/hdfs/data, /grid/03/hadoop/hdfs/data, /grid/04/hadoop/hdfs/data, /grid/05/hadoop/hdfs/data, /grid/06/hadoop/hdfs/data, /grid/07/hadoop/hdfs/data, /grid/08/hadoop/hdfs/data, /grid/09/hadoop/hdfs/data, /grid/10/hadoop/hdfs/data, /grid/11/hadoop/hdfs/data, /grid/12/hadoop/hdfs/data, /grid/13/hadoop/hdfs/data, /grid/14/hadoop/hdfs/data, /grid/15/hadoop/hdfs/data, /grid/16/hadoop/hdfs/data, /grid/17/hadoop/hdfs/data, /grid/18/hadoop/hdfs/data, /grid/19/hadoop/hdfs/data, /grid/20/hadoop/hdfs/data, /grid/21/hadoop/hdfs/data, /grid/22/hadoop/hdfs/data]'}, localName='lxdmelcly-lxw01-p01-whw10289.oan:10019', datanodeUuid='70403b64-cb39-4b4a-ac6c-787ce7bdbe2c', xmitsInProgress=0}:Exception transfering block BP-1743446178-172.18.16.38-1537373339905:blk_2196991498_1131235321 to mirror 172.18.16.33:10019
org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: Can't re-compute encryption key for nonce, since the required block key (keyID=-213389155) doesn't exist. Current key: 1804780309
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:419)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:479)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:303)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:245)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:215)
        at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:800)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
        at java.lang.Thread.run(Thread.java:745)
2021-07-12 11:10:58,796 ERROR datanode.DataNode (DataXceiver.java:run(321)) - xxx:10019:DataXceiver error processing WRITE_BLOCK operation src: /172.18.16.8:41992 dst: /172.18.16.20:10019
org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: Can't re-compute encryption key for nonce, since the required block key (keyID=-213389155) doesn't exist. Current key: 1804780309
{noformat}
We should handle this exception wherever SaslDataTransferClient.socketSend() is invoked:
DataXceiver.writeBlock()
BlockDispatcher.moveBlock()
DataNode.run()
DataXceiver.replaceBlock()
StripedBlockWriter.init()
This issue isn't immediately obvious, because the existing HDFS fault-tolerance mechanisms should mask the data encryption key error.
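To make the intended handling concrete, here is a minimal sketch of the catch-and-retry pattern the description is asking for at each of those call sites. All names below (StaleEncryptionKeyException, KeyProvider, EncryptedSender, sendWithKeyRetry) are hypothetical stand-ins for illustration, not the actual Hadoop classes or signatures:
{code:java}
import java.io.IOException;

/**
 * Minimal sketch of retrying an encrypted transfer after a stale-key failure.
 * Every type here is a hypothetical stand-in, not the real Hadoop API.
 */
public class EncryptionKeyRetrySketch {

  /** Stand-in for InvalidEncryptionKeyException (negotiated block key is stale). */
  static class StaleEncryptionKeyException extends IOException {
    StaleEncryptionKeyException(String msg) { super(msg); }
  }

  /** Hypothetical source of data encryption keys. */
  interface KeyProvider {
    void invalidateCachedKey();              // drop the stale cached key
    byte[] currentKey() throws IOException;  // obtain a fresh key
  }

  /** Hypothetical wrapper around the SASL-encrypted send (socketSend-like call). */
  interface EncryptedSender {
    void send(byte[] encryptionKey) throws IOException;
  }

  /**
   * Perform the encrypted send; if the peer reports the key as stale,
   * refresh the key and retry once instead of failing the operation.
   */
  static void sendWithKeyRetry(EncryptedSender sender, KeyProvider keys)
      throws IOException {
    try {
      sender.send(keys.currentKey());
    } catch (StaleEncryptionKeyException e) {
      // Recoverable: the required block key no longer exists on the peer.
      keys.invalidateCachedKey();
      sender.send(keys.currentKey());
    }
  }
}
{code}
Applied at the call sites listed above, a stale-key failure would be retried with a fresh key instead of surfacing as an unhandled InvalidEncryptionKeyException.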
> Handle all occurrences of InvalidEncryptionKeyException
> --------------------------------------------------------
>
>                 Key: HDFS-16136
>                 URL: https://issues.apache.org/jira/browse/HDFS-16136
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>