[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694879#comment-14694879
 ] 

Yi Liu commented on HDFS-8859:
------------------------------

Thanks [~szetszwo] for the review! I updated the patch to address your comments.
{quote}
How about calling it LightWeightResizableGSet?
{quote}
Agree; I renamed it in the new patch.

{quote}
From your calculation, the patch improve each block replica object size about 
45%. The JIRA summary is misleading. It seems claiming that it improves the 
overall DataNode memory footprint by about 45%. For 10m replicas, the original 
overall map entry object size is ~900 MB and the new size is ~500MB. Is it 
correct?
{quote}
It's correct. Actually, I put {{ReplicaMap}} in the JIRA summary, yes, in 
{{()}}, :). Since {{ReplicaMap}} is the major long-lived in-memory object of 
the DataNode (of course there are other aspects, most of them transient: data 
read/write buffers, RPC buffers, etc.), I just highlighted the improvement.

{quote}
 Subclass can call super.put(..)
{quote}
Updated in the new patch. I had just used a new internal method before.

{quote}
There is a rewrite for LightWeightGSet.remove(..)
{quote}
I reverted it in the new patch and kept the original one. The original 
implementation has duplicate logic; we could share the same logic across all 
the {{if...else}} branches.
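To illustrate the shared-logic point (a hypothetical sketch, not the actual {{LightWeightGSet}} code): removal from a singly linked hash bucket can handle the head case and the interior case in one traversal by tracking the previous node, instead of duplicating the unlink logic across branches.

```java
// Hypothetical sketch, not the actual LightWeightGSet implementation:
// removal from a singly linked bucket chain. Tracking the previous node
// lets the head case and the interior case share one loop instead of
// duplicating the unlink logic in separate if...else branches.
public class BucketChain {
    static final class Node {
        final long id;
        Node next;
        Node(long id) { this.id = id; }
    }

    private Node head;

    public void put(long id) {
        Node n = new Node(id);
        n.next = head;
        head = n;
    }

    public boolean remove(long id) {
        Node prev = null;
        for (Node cur = head; cur != null; prev = cur, cur = cur.next) {
            if (cur.id != id) {
                continue;
            }
            if (prev == null) {
                head = cur.next;      // unlink the bucket head
            } else {
                prev.next = cur.next; // unlink an interior node
            }
            cur.next = null;
            return true;
        }
        return false;
    }

    public boolean contains(long id) {
        for (Node cur = head; cur != null; cur = cur.next) {
            if (cur.id == id) return true;
        }
        return false;
    }
}
```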

{quote}
I think we need some long running tests to make sure the correctness. See 
TestGSet.runMultipleTestGSet()
{quote}
Agree, updated it in the new patch. 


For the test failures of {{003}}: there is one place (BlockPoolSlice) that 
adds a replicaInfo to the replicaMap from a tmp replicaMap while the 
replicaInfo is still in the tmp one; we can remove it from the tmp one before 
adding it (for {{LightWeightGSet}}, an element is not allowed to exist in two 
gsets). In the {{002}} patch the failure doesn't exist because it had a new 
implementation of {{SetIterator}}, very similar to the logic in Java's 
HashMap and a bit different from the original one; both are correct, the 
major difference being when the next element is found. In the new patch, I 
keep the original one and make a small change in BlockPoolSlice. All tests 
pass locally with the new patch.
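A hedged sketch of the invariant behind the BlockPoolSlice change (hypothetical types, not the actual Hadoop classes): the set is intrusive, i.e. the next pointer is a field of the element itself, so an element linked into one set cannot be linked into a second one; it must be detached from the tmp map before being added to the main map.

```java
// Hypothetical sketch of why an element must leave one intrusive set
// before joining another: the single "next" field lives in the element,
// so linking it into a second set would rewrite the first set's chain.
public class IntrusiveDemo {
    static final class Replica {
        final long blockId;
        Replica next;                  // the one intrusive link per element
        Replica(long blockId) { this.blockId = blockId; }
    }

    static final class ReplicaSet {
        private Replica head;

        void add(Replica r) {
            r.next = head;             // overwrites any link r already holds!
            head = r;
        }

        Replica remove(long blockId) {
            Replica prev = null;
            for (Replica cur = head; cur != null; prev = cur, cur = cur.next) {
                if (cur.blockId == blockId) {
                    if (prev == null) head = cur.next;
                    else prev.next = cur.next;
                    cur.next = null;   // detach so it may join another set
                    return cur;
                }
            }
            return null;
        }
    }

    // The fixed pattern: detach from the tmp set, then add to the main set.
    static void move(ReplicaSet tmp, ReplicaSet target, long blockId) {
        Replica r = tmp.remove(blockId);
        if (r != null) {
            target.add(r);
        }
    }
}
```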

> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> ----------------------------------------------------------------
>
>                 Key: HDFS-8859
>                 URL: https://issues.apache.org/jira/browse/HDFS-8859
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>            Priority: Critical
>         Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only covers 
> the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so that memory can be saved.  Also, a HashMap 
> Entry has an object overhead.  We can implement a lightweight set similar 
> to {{LightWeightGSet}}, but not fixed-size ({{LightWeightGSet}} uses a 
> fixed size for its entries array, usually a big value; an example is 
> {{BlocksMap}}. This avoids full GC since there is no need to resize).  We 
> should also be able to get an element by its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described:
> We can save:
> {noformat}
>     SIZE (bytes)    ITEM
>     20              the key: a Long object (12 bytes object overhead + 8 bytes long)
>     12              HashMap Entry object overhead
>     4               reference to the key in the Entry
>     4               reference to the value in the Entry
>     4               hash field in the Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
>     SIZE (bytes)    ITEM
>     4               a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica.
> And currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> So we can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each 
> block replica in the DataNode.
>     
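The arithmetic above can be sanity-checked with a few lines (a sketch using the byte counts from the description; memory alignment is ignored there as well). The exact value is 1 - 50/90 ≈ 44.4%, which the summary rounds to about 45%.

```java
// Quick check of the savings arithmetic in the description, using the
// byte counts stated there (memory alignment ignored, as in the original).
public class SavingsCheck {
    public static double savings() {
        int removed = 20 + 12 + 4 + 4 + 4; // Long key + Entry overheads = 44 bytes
        int added = 4;                     // one next-reference in ReplicaInfo
        int replica = 46;                  // approximate finalized replica size
        return 1.0 - (double) (added + replica) / (removed + replica);
    }
}
```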



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
