Let's assume that HDFS maintains 3 replicas of the 256 MB block. Each of
these 3 datanodes then holds at most one copy of the block in its own
memory cache, so repeated reads are served from memory instead of going to
disk. This is in line with the centralized cache management policy of HDFS,
which lets you choose how many replicas are cached: you can pin only 2 of
the 3 replicas in cache and save the remaining 256 MB of cache space. The
documentation is here:
<https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html>
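
To make the arithmetic concrete, here is a small Python sketch of the cache space used when 3 versus 2 replicas of the 256 MB block are cached. It also builds (but does not run) an `hdfs cacheadmin -addDirective` command line with the `-replication` option. The path `/data/example` and the pool name `example-pool` are made-up placeholders, and the helper names are mine, so treat this as an illustration rather than a recipe for your cluster.

```python
BLOCK_SIZE_MB = 256  # block size from the example above


def cache_footprint_mb(cached_replicas, block_size_mb=BLOCK_SIZE_MB):
    """Cluster-wide memory used to cache one block on `cached_replicas` datanodes."""
    return cached_replicas * block_size_mb


def add_directive_command(path, pool, cached_replicas):
    """Build (not run) an `hdfs cacheadmin -addDirective` command line."""
    return [
        "hdfs", "cacheadmin", "-addDirective",
        "-path", path,        # hypothetical path, replace with your own
        "-pool", pool,        # hypothetical cache pool, must already exist
        "-replication", str(cached_replicas),
    ]


if __name__ == "__main__":
    all_three = cache_footprint_mb(3)
    only_two = cache_footprint_mb(2)
    # Cache used with 3 cached replicas, with 2, and the difference.
    print(all_three, only_two, all_three - only_two)
    print(" ".join(add_directive_command("/data/example", "example-pool", 2)))
```

The cache pool has to exist before a directive can be added to it (`hdfs cacheadmin -addPool`), and the numbers above count a single block only.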

Hope that helps.

Ritesh
