[
https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229440#comment-13229440
]
Dawid Weiss commented on LUCENE-3867:
-------------------------------------
bq. sizeOfWeiss(..)
We're talking some serious dimensions here, beware of buffer overflows!
bq. Or maybe we don't add the new string methods (sizeOf(String),
sizeOf(String[])) and somewhere document that you should do new
RUE().size(String/String[]) instead..
This is something I would go for -- it's consistent with what I would consider
this class's logic. I would even change it to sizeOf(Object) -- this would be a
static shortcut to just measure an object's size, no strings attached?
Kabutz's code also distinguishes interned strings, cached boxed integers, and
enums. This could be a switch, much like the one we have now for interned
Strings. Then the result would really be either an upper (why lower, Mike?)
bound or something that tries to be close to the exact memory consumption.
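
To make the interned/cached/enum distinction concrete, here is a minimal sketch of the kind of check such a switch could use. The class and method names (SharedInstanceCheck, isLikelyShared) are hypothetical and are not taken from Kabutz's code or from RamUsageEstimator. It relies only on two documented behaviours: Integer.valueOf caches at least -128..127, and String.intern() returns the canonical pooled instance.

```java
class SharedInstanceCheck {
    /**
     * Hypothetical helper: returns true when the object looks like a JVM-wide
     * shared instance that an estimator might choose not to charge to the
     * object referencing it.
     */
    static boolean isLikelyShared(Object o) {
        if (o instanceof Enum) {
            return true; // enum constants are singletons
        }
        if (o instanceof String) {
            String s = (String) o;
            // Caveat: intern() has a side effect. If no equal string is pooled
            // yet, s itself may be added to the pool and then reported as shared.
            return s.intern() == s;
        }
        if (o instanceof Integer) {
            int v = (Integer) o;
            // Integer.valueOf is specified to cache at least -128..127.
            return v >= -128 && v <= 127 && Integer.valueOf(v) == o;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isLikelyShared("lucene"));             // string literal
        System.out.println(isLikelyShared(new String("lucene"))); // fresh copy
        System.out.println(isLikelyShared(Integer.valueOf(100))); // cached box
    }
}
```

With such a switch turned on, shared instances would be skipped and the estimate would move toward the exact figure; with it off, they are counted every time and the estimate stays an upper bound.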
A fun way to determine if we're right would be to run a benchmark with -Xmx20m
and test how close we can get to the main memory pool's maximum value before
OOM is thrown. :)
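
A rough sketch of what such a benchmark could look like, using only the standard java.lang.management API. The class name HeapFillProbe and both methods are hypothetical, and the chunk size is arbitrary. The cap parameter exists only so that the probe can be tried safely without exhausting the heap.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapFillProbe {
    /** Largest maximum reported by any heap pool, or -1 if none defines one. */
    static long largestHeapPoolMax() {
        long best = -1;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.HEAP && usage != null) {
                best = Math.max(best, usage.getMax());
            }
        }
        return best;
    }

    /**
     * Allocates byte[] chunks until OutOfMemoryError is thrown or capBytes is
     * reached. Returns the payload bytes allocated (array headers not counted).
     */
    static long fill(int chunkBytes, long capBytes) {
        List<byte[]> hold = new ArrayList<>();
        long allocated = 0;
        try {
            while (allocated + chunkBytes <= capBytes) {
                hold.add(new byte[chunkBytes]);
                allocated += chunkBytes;
            }
        } catch (OutOfMemoryError e) {
            hold.clear(); // release the chunks so the result can be reported
        }
        return allocated;
    }

    public static void main(String[] args) {
        // Intended to be run with a small heap, e.g. java -Xmx20m HeapFillProbe
        long allocated = fill(64 * 1024, Long.MAX_VALUE);
        System.out.println("payload bytes allocated before OOM: " + allocated);
        System.out.println("largest heap pool max: " + largestHeapPoolMax());
    }
}
```

Comparing the estimator's figure for the held arrays against the pool maximum would then show how far off the estimate is; the raw payload count printed here ignores array headers and alignment, so it is only a baseline.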
> RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
> -----------------------------------------------------
>
> Key: LUCENE-3867
> URL: https://issues.apache.org/jira/browse/LUCENE-3867
> Project: Lucene - Java
> Issue Type: Bug
> Components: core/index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Trivial
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch,
> LUCENE-3867.patch
>
>
> RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this:
> NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The
> NUM_BYTES_OBJECT_REF part should not be included, at least not according to
> this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
> {quote}
> A single-dimension array is a single object. As expected, the array has the
> usual object header. However, this object header is 12 bytes to accommodate a
> four-byte array length. Then comes the actual array data which, as you might
> expect, consists of the number of elements multiplied by the number of bytes
> required for one element, depending on its type. The memory usage for one
> element is 4 bytes for an object reference ...
> {quote}
> While at it, I wrote a sizeOf(String) impl, and I wonder how people feel
> about including such helper methods in RUE as static, stateless methods.
> It's not perfect, and I'm sure there's room for improvement, but here it is:
> {code}
> /**
>  * Computes the approximate size of a String object. Note that if this object
>  * is also referenced by another object, you should add
>  * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
>  * method.
>  */
> public static int sizeOf(String str) {
>   return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
>       + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
>       + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
>       + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
> }
> {code}
> If people are not against it, I'd like to also add sizeOf(int[] / byte[] /
> long[] / double[] ... and String[]).
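
For illustration, the corrected header arithmetic and the proposed array helpers could be sketched as follows. This is not the attached patch: the class name ArraySizeSketch is hypothetical, and the constants are assumed values matching the 32-bit layout described on the quoted page (8-byte object header, 4-byte references, 8-byte alignment), not values read from the running JVM.

```java
class ArraySizeSketch {
    // Assumed values for a 32-bit layout; hard-coded for the example.
    static final int NUM_BYTES_OBJECT_HEADER = 8;
    static final int NUM_BYTES_INT = 4;
    static final int NUM_BYTES_OBJECT_REF = 4;
    // Object header plus the int length field, with no extra object reference.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;
    static final int ALIGNMENT = 8; // assumed object alignment

    /** Rounds size up to the next multiple of ALIGNMENT. */
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    static long sizeOf(byte[] a) {
        return alignUp(NUM_BYTES_ARRAY_HEADER + (long) a.length);
    }

    static long sizeOf(int[] a) {
        return alignUp(NUM_BYTES_ARRAY_HEADER + 4L * a.length);
    }

    static long sizeOf(long[] a) {
        return alignUp(NUM_BYTES_ARRAY_HEADER + 8L * a.length);
    }

    /** Shallow size: counts the references only, not the objects they point to. */
    static long shallowSizeOf(Object[] a) {
        return alignUp(NUM_BYTES_ARRAY_HEADER + (long) NUM_BYTES_OBJECT_REF * a.length);
    }
}
```

Under these assumptions an empty byte[] takes 16 bytes (12-byte header rounded up to 8-byte alignment), and a String[] needs the size of each referenced String added on top of the shallow figure.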