[
https://issues.apache.org/jira/browse/HDFS-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306125#comment-17306125
]
Stephen O'Donnell commented on HDFS-15907:
------------------------------------------
Yea, Shiv was concerned about the memory overhead of ConcurrentHashMap, but I
cannot see why it is a problem.
The ConcurrentHashMap implementation is an object which contains a number of
HashMaps under the covers. It simply stores the keys by hashing them across
the maps it is using internally, and it synchronises at the sub-map level,
somewhat like a "striped lock". It will be slightly slower for put and get due
to this extra indirection, but the overhead is tiny.
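The striping idea described above can be sketched roughly as follows. This is
only an illustration of the concept, not the JDK's ConcurrentHashMap code; the
class name StripedMapSketch and the stripe count passed to it are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative striped-lock map; NOT the JDK's ConcurrentHashMap implementation. */
final class StripedMapSketch<K, V> {
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMapSketch(int stripeCount) {
        stripes = new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new HashMap<>();
        }
    }

    // Pick the sub-map for a key from its hash code (the extra indirection).
    private Map<K, V> stripeFor(Object key) {
        return stripes[Math.floorMod(key.hashCode(), stripes.length)];
    }

    V put(K key, V value) {
        Map<K, V> stripe = stripeFor(key);
        synchronized (stripe) { // lock only this sub-map, not the whole structure
            return stripe.put(key, value);
        }
    }

    V get(Object key) {
        Map<K, V> stripe = stripeFor(key);
        synchronized (stripe) {
            return stripe.get(key);
        }
    }

    int size() {
        int total = 0;
        for (Map<K, V> stripe : stripes) {
            synchronized (stripe) {
                total += stripe.size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        StripedMapSketch<String, Integer> map = new StripedMapSketch<>(32);
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(map.get("a") + " " + map.size());
    }
}
```

Threads touching keys in different stripes do not contend, which is the
concurrency benefit over a single lock around one HashMap.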
Does it have a higher memory overhead than a HashMap? Yes, but if you are
storing thousands or millions of keys this will not matter. The extra overhead
is not a "per entry" overhead, it's a static number driven by the Java object
overhead. If the memory overhead of a HashMap is 32 bytes (picking a number out
of the air, I have not checked this) then the overhead of a ConcurrentHashMap
is approx 32 * 32 = 1024 bytes, as I think it creates 32 sub-maps by default.
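To put that arithmetic in perspective, here is a small sketch that amortises
the fixed cost over the number of entries. Both constants are the guesses from
the paragraph above, not measured values, and the class and method names are
hypothetical.

```java
final class FixedOverheadSketch {
    // Both figures are unverified guesses taken from the comment, not measurements.
    static final long ASSUMED_HASHMAP_OVERHEAD_BYTES = 32;
    static final long ASSUMED_SUB_MAPS = 32;

    // Fixed cost of the structure, independent of how many keys it holds.
    static long fixedOverheadBytes() {
        return ASSUMED_HASHMAP_OVERHEAD_BYTES * ASSUMED_SUB_MAPS;
    }

    // The same fixed cost spread across the stored entries.
    static double amortisedBytesPerEntry(long entries) {
        return (double) fixedOverheadBytes() / entries;
    }

    public static void main(String[] args) {
        System.out.println(fixedOverheadBytes() + " bytes fixed");
        System.out.println(amortisedBytesPerEntry(1_000_000L) + " bytes per entry at 1M entries");
    }
}
```

Under these assumptions the fixed cost is about 1 KB, which is well under one
byte per entry once the map holds a few thousand keys.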
I feel that this overhead is worth it as it will provide better concurrency
than synchronising the entire map for all keys.
> Reduce Memory Overhead of AclFeature by avoiding AtomicInteger
> --------------------------------------------------------------
>
> Key: HDFS-15907
> URL: https://issues.apache.org/jira/browse/HDFS-15907
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-15907.001.patch
>
>
> In HDFS-15792 we made some changes to the AclFeature and ReferenceCountedMap
> classes to address a rare bug when loading the FSImage in parallel.
> One change we made was to replace an int inside AclFeature with an
> AtomicInteger to avoid synchronising the methods in AclFeature.
> Discussing this change with [~weichiu], he pointed out that while the
> AclFeature cache is intended to reduce the count of AclFeature objects, on a
> large cluster, it is possible for there to be many millions of AclFeature
> objects.
> Previously, the int would have taken 4 bytes of heap.
> By moving to an AtomicInteger, we probably have an overhead of:
> 4 bytes (or 8 if the heap is over 32GB) for a reference to the
> AtomicInteger object
> 12 byte overhead for the java object
> 4 bytes inside the AtomicInteger to store an int.
>
> So the total heap overhead has gone from 4 bytes to 20 bytes just to use an
> AtomicInteger.
> Therefore I think it makes sense to remove the AtomicInteger and just
> synchronise the methods of AclFeature where the value is incremented /
> decremented / retrieved.
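The two approaches compared in the description can be sketched as below. The
class and method names are hypothetical and are not taken from AclFeature or
the attached patch; the byte figures are the estimates quoted in the
description, not measurements.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountSketch {
    /** Variant with an AtomicInteger: a reference field plus a separate heap object. */
    static final class AtomicCounted {
        private final AtomicInteger refCount = new AtomicInteger();
        int increment() { return refCount.incrementAndGet(); }
        int decrement() { return refCount.decrementAndGet(); }
        int get() { return refCount.get(); }
    }

    /** Variant with a plain int field guarded by synchronized methods. */
    static final class SynchronizedCounted {
        private int refCount;
        synchronized int increment() { return ++refCount; }
        synchronized int decrement() { return --refCount; }
        synchronized int get() { return refCount; }
    }

    static final long PLAIN_INT_BYTES = 4;

    // Estimate from the description: reference + object overhead + the int itself.
    static long atomicBytes(boolean compressedReferences) {
        long reference = compressedReferences ? 4 : 8; // 8 if the heap is over 32GB
        return reference + 12 + 4;
    }

    // Extra heap across many counted objects when using the AtomicInteger variant.
    static long extraBytes(long objectCount, boolean compressedReferences) {
        return (atomicBytes(compressedReferences) - PLAIN_INT_BYTES) * objectCount;
    }

    public static void main(String[] args) {
        SynchronizedCounted counted = new SynchronizedCounted();
        counted.increment();
        counted.increment();
        counted.decrement();
        System.out.println("ref count: " + counted.get());
        System.out.println("extra bytes for 10M objects: " + extraBytes(10_000_000L, true));
    }
}
```

With these estimates, 10 million objects would carry roughly 160 MB of extra
heap for the AtomicInteger variant compared with the plain int.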