[ https://issues.apache.org/jira/browse/HDFS-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815989#comment-16815989 ]

Feilong He commented on HDFS-14401:
-----------------------------------

Thanks [~rakeshr] & [~anoop.hbase] so much for your comments.
{quote}Volume management at the datanode uses the Java Files APIs to manage 
the mount paths. Similar to that, this feature also has multiple {{pmem.dirs}} 
supported; to make it simple, {{pmem}} can also follow the same pattern....
{quote}
Good point! It looks feasible. In a multi-threaded situation, the pmem-full 
exception should be handled gracefully, or we need to use a variable to count 
the used bytes for each pmem volume. I will consider your suggestion in our 
new patch.
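To illustrate the per-volume used-bytes counter mentioned above, here is a minimal, hypothetical sketch (the class name and fields are mine, not from the patch) that tracks usage with an AtomicLong so a full volume is detected gracefully rather than surfacing a pmem-full exception:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: track used bytes per pmem volume with an AtomicLong
// so a full volume can be detected without locking the whole manager.
class PmemVolumeUsage {
    private final long capacityBytes;
    private final AtomicLong usedBytes = new AtomicLong(0);

    PmemVolumeUsage(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // Try to reserve; retry the CAS if another thread races us.
    boolean tryReserve(long bytes) {
        while (true) {
            long used = usedBytes.get();
            if (used + bytes > capacityBytes) {
                return false; // volume is full; caller moves to the next one
            }
            if (usedBytes.compareAndSet(used, used + bytes)) {
                return true;
            }
        }
    }

    void release(long bytes) {
        usedBytes.addAndGet(-bytes);
    }

    long getUsedBytes() {
        return usedBytes.get();
    }
}
```

A caller that gets {{false}} back simply tries the next volume instead of handling an exception.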
{quote}PmemVolumeManager#reserve is synchronized whereas release is not!
{quote}
PmemVolumeManager#reserve first chooses an available volume, then updates the 
used bytes for that volume. To avoid the case where the assigned volume 
becomes full due to another thread's reserve operation, I made this method 
synchronized. For #release, which only updates the used bytes of an 
already-assigned volume, that is unnecessary.
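That asymmetry can be sketched roughly as follows (a sketch under assumed names, not the actual patch code): reserve is a compound choose-then-update step and needs the lock, while release is a single atomic decrement and does not:

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch: reserve must pick a volume and update its used bytes
// as one atomic step, so it is synchronized; release only subtracts from one
// volume's counter, so an atomic decrement suffices.
class ReserveReleaseSketch {
    private final long[] capacity;
    private final AtomicLongArray used;

    ReserveReleaseSketch(long[] capacityPerVolume) {
        this.capacity = capacityPerVolume.clone();
        this.used = new AtomicLongArray(capacityPerVolume.length);
    }

    // Choose-then-update is a compound action: without the lock, two threads
    // could both pass the capacity check and oversubscribe a volume.
    synchronized int reserve(long bytes) {
        for (int i = 0; i < capacity.length; i++) {
            if (used.get(i) + bytes <= capacity[i]) {
                used.addAndGet(i, bytes);
                return i;
            }
        }
        return -1; // every volume is full
    }

    // A single atomic decrement; no check-then-act race to protect against.
    void release(int volumeIndex, long bytes) {
        used.addAndGet(volumeIndex, -bytes);
    }
}
```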
{quote}chooseVolume -> synchronized. Any reason why, and not just making the 
counter variable Atomic?
{quote}
Since one or more pmem volumes can be used up, I made this method synchronized 
so that it searches for the next available pmem volume until all pmem volumes 
have been checked one by one. It's a strict round-robin operation; an atomic 
counter alone would not keep the rotating index consistent with the 
full-volume checks.
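The strict round-robin scan can be sketched like this (class and method names are hypothetical; the lock covers both the rotating index and the availability check, which is why an atomic counter alone is not enough):

```java
// Hypothetical sketch of a strict round-robin volume chooser. It is
// synchronized so the rotating index and the full-volume scan stay
// consistent; each volume is checked at most once per call.
class RoundRobinChooser {
    private final boolean[] full; // stand-in for "volume has no free space"
    private int next = 0;

    RoundRobinChooser(int numVolumes) {
        this.full = new boolean[numVolumes];
    }

    void markFull(int index) {
        full[index] = true;
    }

    // Returns the next non-full volume index, or -1 once every volume has
    // been checked one by one.
    synchronized int chooseVolume() {
        for (int checked = 0; checked < full.length; checked++) {
            int candidate = next;
            next = (next + 1) % full.length;
            if (!full[candidate]) {
                return candidate;
            }
        }
        return -1;
    }
}
```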
{quote}blockKeyToVolume.put(key, index);
 What is the heap size requirement for this Map? Per entry? Some math here 
would be useful.
{quote}
The key's type is ExtendedBlockId, which is initialized from a BlockPoolId (a 
36-byte String) and a BlockId (a long). The map's value, i.e., the volume 
index, is a Byte. Allowing for object headers, padding, etc., around 96 bytes 
are required per map entry. Suppose the block size is 128MB; since a pmem 
volume's max size is currently 500GB, one pmem volume can cache at most about 
4k blocks. With 6 pmem volumes, the overall heap usage is 96 * 4k * 6 bytes ≈ 
2.3MB. The estimate is not precise, but the real value should be on the order 
of a few MB, which I think is acceptable for the JVM.
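The arithmetic can be double-checked with a tiny helper (all constants come from the estimate above; the class and method names are hypothetical):

```java
// Back-of-the-envelope check of the blockKeyToVolume heap estimate:
// (volume size / block size) blocks per volume, times the number of
// volumes, times the assumed ~96 bytes per map entry.
class MapHeapEstimate {
    static long estimateBytes(long volumeBytes, long blockBytes,
                              int numVolumes, long bytesPerEntry) {
        long blocksPerVolume = volumeBytes / blockBytes; // max cached blocks
        return blocksPerVolume * numVolumes * bytesPerEntry;
    }
}
```

With a 500GB volume, 128MB blocks, 6 volumes, and 96 bytes per entry this yields 2,304,000 bytes, i.e., roughly 2.3MB.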
{quote}private final Long maxBytes;
 Any reason why Long and not long?
{quote}
I just noticed that it's not reasonable to use Long instead of long here. I 
will fix it in our new patch.
{quote}static PmemVolumeManager getPmemVolumeManager() {
 Can we avoid static getters? Can we have a singleton model?
{quote}
Good suggestion. I will consider a singleton model for PmemVolumeManager.
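For reference, one common way to realize such a singleton model in Java is the initialization-on-demand holder idiom; this is only a generic sketch under an assumed class name, not the planned patch code:

```java
// Hypothetical sketch of a singleton model using the
// initialization-on-demand holder idiom: the JVM guarantees the holder
// class is initialized lazily and exactly once, so no explicit locking
// or static getter over a mutable field is needed.
class PmemVolumeManagerSingleton {
    private PmemVolumeManagerSingleton() {
        // private constructor prevents external instantiation
    }

    private static class Holder {
        static final PmemVolumeManagerSingleton INSTANCE =
                new PmemVolumeManagerSingleton();
    }

    static PmemVolumeManagerSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```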

 

Thanks again!

> Refine the implementation for HDFS cache on SCM
> -----------------------------------------------
>
>                 Key: HDFS-14401
>                 URL: https://issues.apache.org/jira/browse/HDFS-14401
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: caching, datanode
>            Reporter: Feilong He
>            Assignee: Feilong He
>            Priority: Major
>         Attachments: HDFS-14401.000.patch
>
>
> In this Jira, we will refine the implementation for HDFS cache on SCM, such 
> as: 1) Handle full pmem volume in VolumeManager; 2) Refine pmem volume 
> selection impl; 3) Clean up the MappableBlockLoader interface; etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
