[ 
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3867:
----------------------------------

    Attachment: LUCENE-3867.patch

With the help of Dawid (inspecting forbidden C code *g*), we checked the actual 
size and how they are calculated. Based on that I changed the defaults 
depending on bitness (Object header is 16 on 64 bit without compact refs, array 
header is 24 on 64 bit).

The attached patch will use the above defaults, but tries to update them using 
sun.misc.Unsafe. The trick to get the object header from usafe is by declaring 
a dummy class extending Object with one single field. We are then using unsafe 
to get the fieldOffset of that field. As Dawid pointed out, the return value is 
identical to his investigations (8 bytes on 32 bit archs, 16 bytes on 64 bit 
archs and 12 bytes on compact ref 64bit archs). So RamUsageEstimator was 
completely wrong in the past for 64 bit architectures.

I also changed the funny switch statement ("the " + adressSize + " inch dick") 
to assume 64 bits architecture, if addressSize >= 8.
                
> RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
> -----------------------------------------------------
>
>                 Key: LUCENE-3867
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3867
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Trivial
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, 
> LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, 
> LUCENE-3867.patch
>
>
> RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: 
> NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The 
> NUM_BYTES_OBJECT_REF part should not be included, at least not according to 
> this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
> {quote}
> A single-dimension array is a single object. As expected, the array has the 
> usual object header. However, this object head is 12 bytes to accommodate a 
> four-byte array length. Then comes the actual array data which, as you might 
> expect, consists of the number of elements multiplied by the number of bytes 
> required for one element, depending on its type. The memory usage for one 
> element is 4 bytes for an object reference ...
> {quote}
> While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel 
> about including such helper methods in RUE, as static, stateless, methods? 
> It's not perfect, there's some room for improvement I'm sure, here it is:
> {code}
>       /**
>        * Computes the approximate size of a String object. Note that if this 
> object
>        * is also referenced by another object, you should add
>        * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
>        * method.
>        */
>       public static int sizeOf(String str) {
>               return 2 * str.length() + 6 // chars + additional safeness for 
> arrays alignment
>                               + 3 * RamUsageEstimator.NUM_BYTES_INT // String 
> maintains 3 integers
>                               + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // 
> char[] array
>                               + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // 
> String object
>       }
> {code}
> If people are not against it, I'd like to also add sizeOf(int[] / byte[] / 
> long[] / double[] ... and String[]).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to