This is an automated email from the ASF dual-hosted git repository.

rakeshr pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/hadoop.git

commit f2563dcca100098a7e121d6d0a75907eaf87fa75
Author: Rakesh Radhakrishnan <rake...@apache.org>
AuthorDate: Mon Jul 15 13:18:23 2019 +0530

    HDFS-14357. Update documentation for HDFS cache on SCM support. Contributed by Feilong He.
    
    (cherry picked from commit 30a8f840f1572129fe7d02f8a784c47ab57ce89a)
---
 .../src/site/markdown/CentralizedCacheManagement.md    | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md
index f2de043..85cc242 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md
@@ -32,6 +32,8 @@ Centralized cache management in HDFS has many significant advantages.
 
 4.  Centralized caching can improve overall cluster memory utilization. When relying on the OS buffer cache at each DataNode, repeated reads of a block will result in all *n* replicas of the block being pulled into buffer cache. With centralized cache management, a user can explicitly pin only *m* of the *n* replicas, saving *n-m* memory.
 
+5.  HDFS supports a non-volatile storage class memory (SCM, also known as persistent memory) cache on the Linux platform. Users can enable either the memory cache or the SCM cache for a DataNode. Memory cache and SCM cache can coexist among DataNodes. In the current implementation, the cache data in SCM will be cleaned up when the DataNode restarts. Persistent HDFS cache support on SCM will be considered in the future.
+
 Use Cases
 ---------
 
@@ -200,11 +202,21 @@ Configuration
 
 In order to lock block files into memory, the DataNode relies on native JNI code found in `libhadoop.so` or `hadoop.dll` on Windows. Be sure to [enable JNI](../hadoop-common/NativeLibraries.html) if you are using HDFS centralized cache management.
 
+Currently, there are two implementations of the persistent memory cache. The default is a pure Java based implementation; the other is a native implementation which leverages the PMDK library to improve the performance of cache writes and cache reads.
+
+To enable the PMDK based implementation, follow the steps below.
+
+1. Install the PMDK library. Refer to the official site http://pmem.io/ for detailed information.
+
+2. Build Hadoop with PMDK support. Refer to the "PMDK library build options" section in `BUILDING.txt` in the source code.
+
+To verify that PMDK is correctly detected by Hadoop, run the `hadoop checknative` command.
+
 ### Configuration Properties
 
 #### Required
 
-Be sure to configure the following:
+Be sure to configure one of the following properties, for either the DRAM cache or the persistent memory cache. Note that the DRAM cache and the persistent memory cache cannot coexist on a DataNode.
 
 *   dfs.datanode.max.locked.memory
 
@@ -212,6 +224,10 @@ Be sure to configure the following:
 
     This setting is shared with the [Lazy Persist Writes feature](./MemoryStorage.html). The Data Node will ensure that the combined memory used by Lazy Persist Writes and Centralized Cache Management does not exceed the amount configured in `dfs.datanode.max.locked.memory`.
 
+*   dfs.datanode.cache.pmem.dirs
+
+    This property specifies the cache volume(s) of persistent memory. Multiple volumes should be separated by ",", e.g. "/mnt/pmem0, /mnt/pmem1". The default value is empty. If this property is configured, the volume capacity will be detected automatically and there is no need to configure `dfs.datanode.max.locked.memory`.
+
 #### Optional
 
 The following properties are not required, but may be specified for tuning:
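
For convenience, here is a minimal end-to-end sketch of the PMDK enablement steps documented in the patch above. The package names assume a RHEL/CentOS style system, and the `-Drequire.pmdk` Maven option is assumed to match the "PMDK library build options" section of `BUILDING.txt`; verify both against your environment and source tree.

```bash
# 1. Install the PMDK library (package names vary by distribution; this assumes RHEL/CentOS).
sudo yum install -y libpmem libpmem-devel

# 2. Build Hadoop with native code and PMDK support enabled.
#    -Drequire.pmdk is an assumption based on the "PMDK library build options" section of BUILDING.txt.
mvn clean package -Pdist,native -DskipTests -Dtar -Drequire.pmdk

# 3. Verify that libhadoop and PMDK were detected; with -a the command
#    exits non-zero if a requested native library is missing.
hadoop checknative -a
```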


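Likewise, a minimal `hdfs-site.xml` sketch of the two mutually exclusive cache configurations this patch documents. The byte value and the pmem mount points are placeholders, not recommendations.

```xml
<!-- Option A: DRAM cache. The value is in bytes and must stay within the
     DataNode's locked-memory ulimit (ulimit -l); 4294967296 (4 GB) is a placeholder. -->
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <value>4294967296</value>
</property>

<!-- Option B: persistent memory (SCM) cache. Configure this instead of Option A;
     the two cannot coexist on a DataNode. Volumes are comma-separated and their
     capacity is detected automatically. -->
<property>
  <name>dfs.datanode.cache.pmem.dirs</name>
  <value>/mnt/pmem0, /mnt/pmem1</value>
</property>
```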