Misha Dmitriev created HDFS-12922:
-------------------------------------

             Summary: Arrays of length 1 cause 9.2% memory overhead
                 Key: HDFS-12922
                 URL: https://issues.apache.org/jira/browse/HDFS-12922
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Misha Dmitriev
            Assignee: Misha Dmitriev


I recently obtained a big (over 60GiB) heap dump from a customer and analyzed 
it using jxray (www.jxray.com). One source of memory waste that the tool 
detected is arrays of length 1 that come from {{BlockInfo[] 
org.apache.hadoop.hdfs.server.namenode.INodeFile.blocks}} and {{INode$Feature[] 
org.apache.hadoop.hdfs.server.namenode.INodeFile.features}}. Only a small 
fraction of these arrays (less than 10%) have a length greater than 1. 
Collectively these arrays waste 5.5GiB, or 9.2% of the heap. See the attached 
screenshot for more details.

An array of length 1 is problematic because every array in the JVM has a 
header that takes between 16 and 20 bytes, depending on the JVM 
configuration. For a big enough array this 16-20 byte overhead is not a 
concern, but if the array has only one element (a reference that takes 4-8 
bytes, depending on the JVM configuration), the overhead becomes bigger than 
the array's "workload".
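
To make the arithmetic concrete, here is a rough sketch using the header and 
element sizes quoted above. The 8-byte object alignment and the pairing of 
header size with reference size are assumptions of mine, and 
{{ArrayOverhead}} is a hypothetical helper, not anything in HDFS or the JDK.

```java
/** Hypothetical helper for back-of-the-envelope estimates only. */
final class ArrayOverhead {
    // Assumption: the JVM pads every object to a multiple of 8 bytes.
    static final int ALIGNMENT = 8;

    /** Rounds a size up to the next multiple of ALIGNMENT. */
    static int alignUp(int bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Estimated bytes occupied by a reference array holding one element. */
    static int singleElementArrayBytes(int headerBytes, int refBytes) {
        return alignUp(headerBytes + refBytes);
    }

    /** Bytes of that array that are not the element reference itself. */
    static int overheadBytes(int headerBytes, int refBytes) {
        return singleElementArrayBytes(headerBytes, refBytes) - refBytes;
    }
}
```

Under these assumptions, a 16-byte header with a 4-byte reference gives a 
24-byte array of which 20 bytes are overhead, and a 20-byte header with an 
8-byte reference gives 32 bytes of which 24 are overhead.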

In such a situation it makes sense to replace the array data field {{Foo[] ar}} 
with an {{Object obj}} that contains either a direct reference to the 
array's single workload element, or a reference to the array if there is more 
than one element. This change will require further code changes and type casts. 
For example, code like {{return ar[i];}} becomes {{return (obj instanceof Foo) 
? (Foo) obj : ((Foo[]) obj)[i];}} (where {{i}} must be 0 in the single-element 
case) and so on. This doesn't look very pretty, but as far as I can see, the 
code that deals with e.g. INodeFile.blocks already contains various null 
checks, etc. So we will not make the code much less readable.
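
A minimal sketch of the proposed pattern follows. It is illustrative only: 
{{Foo}} stands in for an element type such as BlockInfo, and {{FooHolder}}, 
its methods, and the use of null for the empty case are hypothetical choices, 
not the actual INodeFile code or a proposed patch.

```java
import java.util.Arrays;

/** Hypothetical element type standing in for e.g. BlockInfo. */
final class Foo {
    final int id;
    Foo(int id) { this.id = id; }
}

/** Hypothetical holder that avoids allocating an array for a single element. */
final class FooHolder {
    // Assumption: null means empty, a Foo means exactly one element,
    // and a Foo[] means two or more elements.
    private Object obj;

    int size() {
        if (obj == null) return 0;
        return (obj instanceof Foo) ? 1 : ((Foo[]) obj).length;
    }

    Foo get(int i) {
        if (i < 0 || i >= size()) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        // Same expression as in the description above, after the bounds check.
        return (obj instanceof Foo) ? (Foo) obj : ((Foo[]) obj)[i];
    }

    void add(Foo f) {
        if (f == null) throw new NullPointerException("null element");
        if (obj == null) {
            obj = f;                              // first element: no array
        } else if (obj instanceof Foo) {
            obj = new Foo[] { (Foo) obj, f };     // second element: switch to array
        } else {
            Foo[] old = (Foo[]) obj;
            Foo[] grown = Arrays.copyOf(old, old.length + 1);
            grown[old.length] = f;
            obj = grown;
        }
    }
}
```

The field itself stays one reference wide, so the single-element case saves 
the whole array object, at the cost of an {{instanceof}} check on each access.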



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
