[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936395#comment-13936395 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

I think I have already resolved Konstantin Shvachko's and Tsz Wo Nicholas Sze's comments/concerns. I will wait for your new comments/concerns and update the document accordingly.

The design motivation is:
1) Unify HDFS write/append/truncate.
2) The design is the basis for writable snapshots / snapshot restore (this JIRA is not created to track snapshot items).

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
> In the existing implementation, an HDFS file can be appended and an HDFS block can be
> reopened for append. This design introduces complexity, including lease recovery.
> If we design HDFS blocks as immutable, append & truncate become very simple.
> The idea is that an HDFS block is immutable once the block is committed to the
> namenode. If the block is not committed to the namenode, it is the HDFS client's
> responsibility to re-add it with a new block ID.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936391#comment-13936391 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

Issue: the last block is not available for reading.

Solution 1: if the block is referenced by a client, the block can be moved to the NN's remove list only after the block is unreferenced by the client.
1) GetBlockLocations with a Reference option
2) Client copies the block to a local buffer
3) A new RPC message, UnreferenceBlocks, is sent to the NN

Solution 2: the block is moved to trash and its deletion in the DN is delayed. In the existing implementation, blocks are deleted in the DN after the Heartbeat response reaches the DN (lazy block deletion). If a block is already being read by a client and the block is requested to be deleted, the DN should delete the block only after the read completes. In most cases, the client can read the last block:
1) Client requests block location information.
2) HDFS client copies blocks to a local buffer.
3) Heartbeat response requests block deletion (lazy block deletion).
4) HDFS application slowly reads data from the local buffer.

For the race condition between 2) and 3), we can delay block deletion. Even if the block is deleted, the client can request new block information.

I like solution 2.
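Solution 2 above can be sketched as a small in-memory model. This is a toy illustration under assumed names (DelayedBlockDeleter, beginRead, requestDelete are hypothetical, not actual HDFS identifiers): a block requested for deletion is parked in trash, and physical deletion is deferred until no reader holds it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Toy model of "solution 2": on a delete request the DN moves the block
 * to trash, and removes it physically only once all in-progress reads
 * have completed. All names are illustrative, not real HDFS code.
 */
public class DelayedBlockDeleter {
    private final Map<Long, Integer> activeReaders = new HashMap<>();
    private final Set<Long> trash = new HashSet<>();   // delete requested
    private final Set<Long> deleted = new HashSet<>(); // physically gone

    /** Client starts reading a block; its deletion is now deferred. */
    public synchronized boolean beginRead(long blockId) {
        if (deleted.contains(blockId)) {
            return false; // gone; client must re-request block locations
        }
        activeReaders.merge(blockId, 1, Integer::sum);
        return true;
    }

    /** Heartbeat response asks the DN to delete a block (lazily). */
    public synchronized void requestDelete(long blockId) {
        trash.add(blockId);
        maybeDelete(blockId);
    }

    /** Client finishes reading; a pending delete may now proceed. */
    public synchronized void endRead(long blockId) {
        Integer count = activeReaders.merge(blockId, -1, Integer::sum);
        if (count != null && count <= 0) {
            activeReaders.remove(blockId);
        }
        maybeDelete(blockId);
    }

    private void maybeDelete(long blockId) {
        if (trash.contains(blockId) && !activeReaders.containsKey(blockId)) {
            trash.remove(blockId);
            deleted.add(blockId); // stand-in for removing the block file
        }
    }

    public synchronized boolean isDeleted(long blockId) {
        return deleted.contains(blockId);
    }
}
```

The race between steps 2) and 3) is resolved exactly as the comment describes: a delete that arrives while a read is in flight simply waits in trash until the read ends.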
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936383#comment-13936383 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

Writing that is not on a block boundary will trigger block copying in the DN:
1) It won't lead to a lot of small blocks.
2) Like most file systems, hflush/hsync/truncate may cause some performance degradation. If we can design zero-copy block copying, the performance impact is small:

1) A block is defined as (block data file, block length).
2) The source block is already committed to the NN and immutable.
3) A block file can be created/appended but cannot be overwritten or truncated.
4) The block length may not equal the block data file length.
5) Create a hardlink for the block data file if the copy length equals the file length.
6) Copy the block data file if the copy length is less than the file length.

Example:
1) Block 1: (blockfile1, 32M); blockfile1 (length: 32M)
2) Copy Block 1 to Block 2 with 32M:
   a) Hardlink blockfile1 to blockfile2.
   b) Block 2: (blockfile2, 32M); blockfile2 (length: 32M)
3) Write a 16M buffer to Block 2:
   a) Block 1: (blockfile1, 32M); blockfile1 (length: 48M)
   b) Block 2: (blockfile2, 48M); blockfile2 (length: 48M)
4) Copy Block 2 to Block 3 with 16M:
   a) Copy blockfile2 to blockfile3 with 16M.
   b) Block 1: (blockfile1, 32M); blockfile1 (length: 48M)
   c) Block 2: (blockfile2, 48M); blockfile2 (length: 48M)
   d) Block 3: (blockfile3, 16M); blockfile3 (length: 16M)
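The hardlink-vs-copy rule in steps 5) and 6) above is simple to state in code. A minimal sketch (BlockCopyPlanner and its names are hypothetical, not actual DataNode code): a full-length copy can be satisfied with a hardlink (zero copy), while a shorter copy must physically copy the prefix into a new file.

```java
/**
 * Sketch of the copy-block decision from the example above: when the
 * requested copy length equals the source block data file's length, a
 * hardlink suffices (zero copy, step 5); when it is shorter, the prefix
 * must be copied into a new file (step 6). Names are illustrative.
 */
public class BlockCopyPlanner {
    public enum Strategy { HARDLINK, COPY_PREFIX }

    /**
     * @param copyLength      bytes of the source block to carry over
     * @param blockFileLength current length of the source block data file
     */
    public static Strategy plan(long copyLength, long blockFileLength) {
        if (copyLength > blockFileLength) {
            throw new IllegalArgumentException(
                "cannot copy beyond the block file length");
        }
        return copyLength == blockFileLength
            ? Strategy.HARDLINK      // full-length copy: share the file
            : Strategy.COPY_PREFIX;  // partial copy: new physical file
    }
}
```

In the example, copying Block 1 to Block 2 with 32M hits the hardlink path, while copying Block 2 to Block 3 with 16M (file length 48M) hits the copy path.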
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936379#comment-13936379 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

If the client needs to read the data early, the application should be:
1. open (for create/append)
2. write
3. hflush/hsync
4. write
5. close

Note: writing that is not on a block boundary will trigger a block copy in the DN (we may design zero copy for block copy).

If the client doesn't need to read early, the application can be:
1. open (for create/append)
2. write
3. write
4. close
[jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936378#comment-13936378 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

It supports hflush/hsync:
1) Sync all buffers.
2) Commit the buffer to the NN if it is on a block boundary.
3) If it is not on a block boundary, copy a new block, append the buffer to the new block, and commit it to the NN.
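The boundary check behind steps 2) and 3) can be sketched as follows (FlushPlanner is a hypothetical helper, not actual client code): on hflush/hsync, the current block is committed as-is only when the flushed length lands exactly on a block boundary; otherwise the tail data must move into a freshly copied block (with a new block ID) before committing.

```java
/**
 * Sketch of the hflush/hsync decision described above: commit the
 * current block directly when the flushed length is a multiple of the
 * block size; otherwise copy into a new block and commit that.
 * Names are illustrative, not real HDFS identifiers.
 */
public class FlushPlanner {
    public enum Action { COMMIT_CURRENT_BLOCK, COPY_THEN_COMMIT }

    /**
     * @param flushedLength total flushed length of the file, in bytes
     * @param blockSize     configured block size, in bytes
     */
    public static Action onFlush(long flushedLength, long blockSize) {
        // On a block boundary every written block is full, so it can be
        // committed as an immutable block as-is.
        return flushedLength % blockSize == 0
            ? Action.COMMIT_CURRENT_BLOCK
            : Action.COPY_THEN_COMMIT;
    }
}
```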
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936218#comment-13936218 ]

Hudson commented on HDFS-6106:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1727 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1727/])
HDFS-6106. Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577798)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

> Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
> ---------------------------------------------------------------------
>
>                 Key: HDFS-6106
>                 URL: https://issues.apache.org/jira/browse/HDFS-6106
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 2.4.0
>
>         Attachments: HDFS-6106.001.patch
>
> Reduce the default for {{dfs.namenode.path.based.cache.refresh.interval.ms}}
> to improve the responsiveness of caching.
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936195#comment-13936195 ]

Hudson commented on HDFS-6106:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1702 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1702/])
HDFS-6106. Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577798)
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936146#comment-13936146 ]

Hudson commented on HDFS-6106:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #510 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/510/])
HDFS-6106. Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577798)
[jira] [Commented] (HDFS-6107) When a block can't be cached due to limited space on the DataNode, that block becomes uncacheable
[ https://issues.apache.org/jira/browse/HDFS-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936112#comment-13936112 ]

Hadoop QA commented on HDFS-6107:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12634897/HDFS-6107.001.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6409//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6409//console

This message is automatically generated.

> When a block can't be cached due to limited space on the DataNode, that block
> becomes uncacheable
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6107
>                 URL: https://issues.apache.org/jira/browse/HDFS-6107
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.4.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-6107.001.patch
>
> When a block can't be cached due to limited space on the DataNode, that block
> becomes uncacheable. This is because the CachingTask fails to reset the
> block state in this error handling case.
[jira] [Commented] (HDFS-6093) Expose more caching information for debugging by users
[ https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936087#comment-13936087 ]

Colin Patrick McCabe commented on HDFS-6093:
--------------------------------------------

bq. Arpit said: In addition to reducing the timeout as you suggested, can we add some explanation to the command output, or update CentralizedCacheManagement.html in the docs?

I agree, we could add a short comment to the docs about this. Now that the timeout has been reduced, there should be much less discrepancy between the output of the two commands, of course.

Taking a more detailed look at the patch now.

{code}
+  public FsStatus getCacheStatus() throws IOException {
{code}

I know it seems clever to reuse the same class for getStatus and getCacheStatus, but it could become a problem if someone later adds more fields to getStatus that don't apply to getCacheStatus. I think we need our own type for this, to maintain sanity in the future. It's not that much code.

{code}
   public long getMissingBlocksCount() throws IOException {
+    statistics.incrementReadOps(1);
     return dfs.getMissingBlocksCount();
{code}

Can we put the non-caching-related incrementReadOps changes in their own JIRA? It may seem like a trivial change, but it's kind of distracting from this JIRA. Also, I'm not sure I understand when we're "supposed" to increment this...

{code}
  /**
   * Number of replicas pending caching.
   */
  private long numPendingCaching;
  /**
   * Number of replicas pending uncaching.
   */
  private long numPendingUncaching;
{code}

Could use a linebreak after {{numPendingCaching}} for consistency.

Like I said earlier, I'd prefer to decouple the counter(s) that can be read from the CRM from the counters that the CRM uses internally during the scan. Using the same variable for both just invites bugs like the one Arpit pointed out, where rescan zeroes the counter outside the lock.

{code}
[CacheManager#processCacheReportImpl changes]
{code}

Incrementally updating the pendingUncached list and stats is a nice idea, but it seems too ambitious for 2.4 at this point. Now that the CRM interval is 30 seconds, it shouldn't be too bad to just wait for the CRM to update its stats and the lists. Additionally, we don't even know that monitor is non-null at this point, so there is an NPE here, I think. Let's leave this out and revisit it later.

> Expose more caching information for debugging by users
> ------------------------------------------------------
>
>                 Key: HDFS-6093
>                 URL: https://issues.apache.org/jira/browse/HDFS-6093
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching
>    Affects Versions: 2.4.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-6093-1.patch
>
> When users submit a new cache directive, it's unclear if the NN has
> recognized it and is actively trying to cache it, or if it's hung for some
> other reason. It'd be nice to expose a "pending caching/uncaching" count the
> same way we expose pending replication work.
> It'd also be nice to display the aggregate cache capacity and usage in
> dfsadmin -report, since we already have it as a metric and expose it
> per-DN in report output.
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936083#comment-13936083 ]

Hudson commented on HDFS-6106:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5335 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5335/])
HDFS-6106. Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1577798)
[jira] [Updated] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-6106:
---------------------------------------

      Resolution: Fixed
   Fix Version/s: 2.4.0
          Status: Resolved  (was: Patch Available)
[jira] [Commented] (HDFS-6106) Reduce default for dfs.namenode.path.based.cache.refresh.interval.ms
[ https://issues.apache.org/jira/browse/HDFS-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936073#comment-13936073 ]

Colin Patrick McCabe commented on HDFS-6106:
--------------------------------------------

The TestHASafeMode failure seems to be HDFS-6094, not related to this patch. Committing. Thanks, Andrew.
[jira] [Updated] (HDFS-6107) When a block can't be cached due to limited space on the DataNode, that block becomes uncacheable
[ https://issues.apache.org/jira/browse/HDFS-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-6107:
---------------------------------------

    Attachment: HDFS-6107.001.patch

I fixed the error handling case and added a unit test.

I noticed that we were incrementing the DN metrics for BlocksCached and BlocksUncached as soon as we received the DNA_CACHE and DNA_UNCACHE commands. This is wrong, since if caching takes a while, the NN may send those commands more than once. The command itself is idempotent. I fixed it so that FsDatasetCache changes those stats instead.

I think this might fix some flaky unit tests we had, since we'll no longer double-count a block if the NN happens to send a DNA_CACHE for it twice.
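The double-count problem described above comes from bumping the metric at command receipt, so a re-sent DNA_CACHE counts twice. Counting only at the actual state transition makes the metric idempotent. A toy sketch of this idea (IdempotentCacheMetrics and its methods are hypothetical, not the FsDatasetCache implementation):

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Toy model of idempotent cache metrics: the NN may send DNA_CACHE for
 * the same block more than once, so the BlocksCached-style counter is
 * incremented only when the block actually transitions to cached.
 * Names are illustrative, not real HDFS identifiers.
 */
public class IdempotentCacheMetrics {
    private final Set<Long> cached = new HashSet<>();
    private long blocksCached = 0;

    /** Handle a (possibly repeated) cache command for a block. */
    public synchronized void onCacheCommand(long blockId) {
        if (cached.add(blockId)) {
            blocksCached++; // counted once, at the real state change
        }
        // Already cached: the repeated command changes nothing.
    }

    public synchronized long getBlocksCached() {
        return blocksCached;
    }
}
```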
[jira] [Updated] (HDFS-6107) When a block can't be cached due to limited space on the DataNode, that block becomes uncacheable
[ https://issues.apache.org/jira/browse/HDFS-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-6107:
---------------------------------------

    Status: Patch Available  (was: Open)
[jira] [Created] (HDFS-6107) When a block can't be cached due to limited space on the DataNode, that block becomes uncacheable
Colin Patrick McCabe created HDFS-6107:
------------------------------------------

             Summary: When a block can't be cached due to limited space on the DataNode, that block becomes uncacheable
                 Key: HDFS-6107
                 URL: https://issues.apache.org/jira/browse/HDFS-6107
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 2.4.0
            Reporter: Colin Patrick McCabe
            Assignee: Colin Patrick McCabe

When a block can't be cached due to limited space on the DataNode, that block becomes uncacheable. This is because the CachingTask fails to reset the block state in this error handling case.