[ 
https://issues.apache.org/jira/browse/HDFS-13762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-13762:
----------------------------
    Release Note: Non-volatile storage class memory (SCM, also known as 
persistent memory) is now supported in the HDFS cache. To enable the SCM cache, 
set the property “dfs.datanode.cache.pmem.dirs” in hdfs-site.xml to the SCM 
volume(s) to use. All HDFS cache directives remain unchanged. There are two 
implementations of the HDFS SCM cache: a pure Java implementation and a native 
PMDK-based implementation. The latter offers better performance for both cache 
writes and cache reads. To enable the PMDK-based implementation, install the 
PMDK library by following the instructions on the official site, 
http://pmem.io/, then build Hadoop with PMDK support as described in the "PMDK 
library build options" section of `BUILDING.txt` in the source code. If 
multiple SCM volumes are configured, a round-robin policy is used to select an 
available volume for caching a block. Consistent with the DRAM cache, the SCM 
cache has no cache eviction mechanism. When a DataNode receives a data read 
request from a client and the corresponding block is cached in SCM, the 
DataNode instantiates an InputStream from the block's location path on SCM 
(pure Java implementation) or from its cache address on SCM (PMDK-based 
implementation). Once the InputStream is created, the DataNode sends the cached 
data to the client. Please refer to the "Centralized Cache Management" guide 
for more details.
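
The round-robin volume selection mentioned in the release note could look 
roughly like the sketch below. This is only an illustration, not the code from 
the patch: the class name `RoundRobinVolumeSelector`, the `select` method, and 
the free-space probe are all hypothetical, and treating "available" as "has 
enough free space for the block" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToLongFunction;

/**
 * Hypothetical sketch of round-robin selection over SCM cache volumes.
 * Names are illustrative and are not taken from the Hadoop source.
 */
final class RoundRobinVolumeSelector {
  private final List<String> volumes;
  // Assumed probe that reports the free bytes on a given volume.
  private final ToLongFunction<String> freeBytes;
  // Monotonic counter; its value modulo the volume count is the next index.
  private final AtomicInteger next = new AtomicInteger(0);

  RoundRobinVolumeSelector(List<String> volumes,
                           ToLongFunction<String> freeBytes) {
    if (volumes.isEmpty()) {
      throw new IllegalArgumentException("no SCM volumes configured");
    }
    this.volumes = new ArrayList<>(volumes);
    this.freeBytes = freeBytes;
  }

  /**
   * Returns the next volume, in round-robin order, that has room for a
   * block of the given length, or null if no volume can hold it.
   */
  String select(long blockLength) {
    int n = volumes.size();
    for (int tried = 0; tried < n; tried++) {
      // floorMod keeps the index non-negative even if the counter wraps.
      int idx = Math.floorMod(next.getAndIncrement(), n);
      String volume = volumes.get(idx);
      if (freeBytes.applyAsLong(volume) >= blockLength) {
        return volume;
      }
    }
    return null; // every volume was tried once and none had enough space
  }
}
```

Each call starts from the volume after the one tried last, so consecutive 
blocks spread across the configured volumes, and a full volume is skipped 
rather than failing the caching request outright.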

> Support non-volatile storage class memory(SCM) in HDFS cache directives
> -----------------------------------------------------------------------
>
>                 Key: HDFS-13762
>                 URL: https://issues.apache.org/jira/browse/HDFS-13762
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching, datanode
>            Reporter: Sammi Chen
>            Assignee: Feilong He
>            Priority: Major
>         Attachments: HDFS-13762.000.patch, HDFS-13762.001.patch, 
> HDFS-13762.002.patch, HDFS-13762.003.patch, HDFS-13762.004.patch, 
> HDFS-13762.005.patch, HDFS-13762.006.patch, HDFS-13762.007.patch, 
> HDFS-13762.008.patch, SCMCacheDesign-2018-11-08.pdf, 
> SCMCacheDesign-2019-07-12.pdf, SCMCacheDesign-2019-07-16.pdf, 
> SCMCacheDesign-2019-3-26.pdf, SCMCacheTestPlan-2019-3-27.pdf, 
> SCMCacheTestPlan.pdf, SCM_Cache_Perf_Results-v1.pdf
>
>
> Non-volatile storage class memory is a type of memory that retains its data 
> after a power failure or across power cycles. Non-volatile storage class 
> memory devices usually offer access speeds close to those of memory DIMMs at 
> a lower cost than memory, so today they are usually used as a supplement to 
> memory to hold long-term persistent data, such as data in a cache. 
> Currently in HDFS, we have an OS page cache backed read-only cache and a 
> RAMDISK based lazy write cache. Non-volatile memory suits both of these 
> functions. This Jira aims to enable storage class memory in the read cache 
> first. Although storage class memory is non-volatile, we do not currently use 
> its persistence, in order to keep the same behavior as the current read-only 
> cache.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

