[ https://issues.apache.org/jira/browse/HDFS-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated HDFS-6107:
---------------------------------------
    Attachment: HDFS-6107.001.patch

I fixed the error-handling case and added a unit test.

I noticed that we were incrementing the DataNode metrics for BlocksCached and BlocksUncached as soon as we received the DNA_CACHE and DNA_UNCACHE commands. This is wrong: if caching takes a while, the NameNode may send those commands more than once, since the commands themselves are idempotent. I fixed it so that FsDatasetCache updates those stats instead. I think this may fix some flaky unit tests we had, since we will no longer double-count a block when the NameNode happens to send a DNA_CACHE for it twice.

> When a block can't be cached due to limited space on the DataNode, that block
> becomes uncacheable
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6107
>                 URL: https://issues.apache.org/jira/browse/HDFS-6107
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.4.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-6107.001.patch
>
>
> When a block can't be cached due to limited space on the DataNode, that block
> becomes uncacheable. This is because the CachingTask fails to reset the
> block state in this error handling case.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
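The metrics change described above can be illustrated with a minimal sketch. This is a hypothetical simplification, not the actual FsDatasetCache code: the class name, method names, and state map are invented for illustration. The point it shows is the same, though: the BlocksCached counter is bumped only when a block actually transitions to the cached state, so a re-sent DNA_CACHE command for the same block is a no-op and cannot double-count.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of idempotent cache-command handling.
public class CacheMetricsSketch {
    enum State { CACHING, CACHED }

    private final ConcurrentHashMap<Long, State> blocks = new ConcurrentHashMap<>();
    private final AtomicLong blocksCached = new AtomicLong();

    /** Handle a (possibly re-sent) DNA_CACHE command for blockId. */
    public void handleCacheCommand(long blockId) {
        // putIfAbsent makes the command idempotent: only the first
        // invocation starts caching and, on success, counts the block.
        if (blocks.putIfAbsent(blockId, State.CACHING) == null) {
            // ... the real code would mmap and mlock the replica here ...
            blocks.put(blockId, State.CACHED);
            blocksCached.incrementAndGet();
        }
    }

    public long getBlocksCached() {
        return blocksCached.get();
    }

    public static void main(String[] args) {
        CacheMetricsSketch m = new CacheMetricsSketch();
        m.handleCacheCommand(42L);
        m.handleCacheCommand(42L); // NameNode re-sends the same command
        System.out.println(m.getBlocksCached()); // prints 1, not 2
    }
}
```

Incrementing the counter at command-receipt time, by contrast, would count the block once per received command, which is exactly the double-counting the patch removes.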