Hi Jacques,

At least it is within a factor of 1.5 or so; it could be worse :-).

I believe some of the additional overhead is due to some intricacies in Erlang memory management (or possibly memory management in general, like fragmentation). There was some discussion about a similar issue on the Erlang mailing list a while ago (http://erlang.2086793.n4.nabble.com/discrepancy-of-memory-usage-figures-td3618939.html), but no conclusion.

I just checked one of our Riak nodes and saw a similar discrepancy: 'erlang:memory()' (from the Erlang shell of the node) reported about 10 GB used, but ps reported about 13 GB of resident memory for the beam.smp process. After restarting the Riak node the discrepancy became much smaller: 9 GB reported versus 10 GB resident.
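
A rough sketch of how the two figures could be compared on a Linux host. It assumes the resident size is read from the `VmRSS` line of `/proc/<pid>/status` and that the VM-side number (for example from `erlang:memory(total)`) is copied in by hand; the function names are made up for this illustration.

```python
import re


def rss_bytes(status_text: str) -> int:
    """Extract the resident set size in bytes from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1)) * 1024  # the kernel reports the value in kB


def discrepancy(os_rss_bytes: float, vm_reported_bytes: float) -> float:
    """Ratio of what the OS sees to what the Erlang VM accounts for."""
    return os_rss_bytes / vm_reported_bytes


if __name__ == "__main__":
    # Figures from the node described above, in GB.
    print(f"before restart: {discrepancy(13, 10):.2f}x")
    print(f"after restart:  {discrepancy(10, 9):.2f}x")
```

To use it against a live node, pass the contents of `/proc/<pid>/status` for the beam.smp process to `rss_bytes`.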

Cheers,
Nico


On 19.08.2011 01:40, Jacques wrote:
This is very helpful. Thank you.

We actually have two different key sizes, 9 key + 1 bucket = 10 total
And 11 key + 1 bucket = 12 total.

Sounds like both of those will get stored on a 64 byte boundary.

That puts us at ~80 GB of expected usage. We're currently at 110 GB.

Coming from the Java world, I'm wondering if there are any intermediate objects that need to be collected?

It sounds like you're generally saying that "it is what it is" for now and just capacity plan around it.

thanks again,
Jacques

On Thu, Aug 18, 2011 at 3:31 AM, Nico Meyer wrote:

    Have you taken into account the size of the bucket name? Is it 12
    bytes for key+bucket?

    Also there is some additional overhead that is due to memory
    alignment requirements, depending on your CPU architecture.
    I have done some tests on this for x86_64 a while ago:

    size=55: overhead=9
    size=56: overhead=8
    size=57: overhead=23
    size=58: overhead=22
    size=59: overhead=21
    size=60: overhead=20
    size=61: overhead=19
    size=62: overhead=18
    size=63: overhead=17
    size=64: overhead=16
    size=65: overhead=15
    size=66: overhead=14
    size=67: overhead=13
    size=68: overhead=12
    size=69: overhead=11
    size=70: overhead=10
    size=71: overhead=9
    size=72: overhead=8

    Not much you can do about that, unless you want to use unaligned
    memory, which is super complicated (replace malloc with your own
    allocator) and has a huge performance penalty.
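
The pattern in the table above can be reproduced with a small model. This is only a sketch: the 8-byte header and the 16-byte rounding step are assumptions chosen because they fit the measured numbers, not values taken from any allocator's source.

```python
# Model of per-allocation size that reproduces the measured overheads.
HEADER = 8        # assumed bookkeeping bytes per allocation
GRANULARITY = 16  # assumed rounding step on x86_64


def allocated_size(requested: int) -> int:
    """Bytes assumed to be consumed by malloc(requested)."""
    padded = requested + HEADER
    # Round up to the next multiple of GRANULARITY.
    return -(-padded // GRANULARITY) * GRANULARITY


def overhead(requested: int) -> int:
    """Bytes consumed beyond what was requested."""
    return allocated_size(requested) - requested


if __name__ == "__main__":
    for size in range(55, 73):
        print(f"size={size}: overhead={overhead(size)}")
```

Under these assumptions a request of 55 or 56 bytes lands in a 64-byte block, and anything from 57 to 72 bytes lands in an 80-byte block.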

    So let's assume your 12 bytes include both bucket and key:

    43 bytes overhead + 12 bytes key data = 55 bytes. This is the number
    of bytes requested per entry. In reality malloc(55) allocates 64
    bytes internally, so your total memory consumption is:

    64 bytes * 3 * 450 million = 80 GB

    But if the bucket name is not included in the 12 bytes, and your
    bucket name(s) are at least 2 bytes long, we get:

    (43 + 12 + 2) bytes = 57 bytes, and in this case malloc(57)
    really allocates 80 bytes internally. Therefore:

    80 bytes * 3 * 450 million = 100 GB
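
The two estimates can be checked with a few lines of arithmetic. This is a sketch under the figures quoted in this thread (450 million keys, 3 replicas, 4 nodes); the even spread across nodes is an assumption, and the totals match the round numbers above when a GB is read as 2**30 bytes.

```python
KEYS = 450_000_000  # unique items, from the original question
REPLICAS = 3        # write replication factor, from the original question
NODES = 4           # cluster size, from the original question
GIB = 2**30         # bytes per GB, binary interpretation


def total_gib(bytes_per_entry: int) -> float:
    """Cluster-wide keydir memory for a given allocated size per entry."""
    return bytes_per_entry * REPLICAS * KEYS / GIB


def per_node_gib(bytes_per_entry: int) -> float:
    """Per-node share, assuming keys are spread evenly over all nodes."""
    return total_gib(bytes_per_entry) / NODES


if __name__ == "__main__":
    for allocated in (64, 80):
        print(
            f"{allocated} bytes/entry: "
            f"{total_gib(allocated):.1f} total, "
            f"{per_node_gib(allocated):.1f} per node"
        )
```

The jump from 64 to 80 bytes per entry is a 25% increase in total memory, which is why a difference of two bytes in the key size matters here.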


    Cheers,
    Nico


    On 17.08.2011 19:43, Jacques wrote:

    So if my math is correct, consumption should be around 70-80 GB for
    me. Yet we see usage over 100 GB. I'm also not clear what the step
    changes are. If you have any insights on these, that would be
    great. Thanks!

    On Aug 17, 2011 12:07 AM, "Nico Meyer" wrote:
    > Hi Jacques,
    >
    > please read my reply back in May, which should clear things up:
    >
    > http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-May/004292.html
    >
    > Cheers,
    > Nico
    >
    > On 16.08.2011 21:09, Jacques wrote:
    >> We're utilizing Riak 0.14.2 and we're seeing higher memory
    >> consumption than we expect.
    >>
    >> We're running on a 4 node cluster with each node housing 32 GB of
    >> memory and are utilizing bitcask with a 3x write replication
    >> factor. We're seeing faster growth than we expect and also seeing
    >> weird upward bounces.
    >>
    >> You can see an example chart <http://picturepush.com/public/6331935>.
    >> (Note: there are some times where we've had to stop the job for
    >> short periods of time -- you can see these as flat spots.)
    >>
    >> We are doing a large throttled import that has key sizes of
    >> approximately 12 bytes. We're currently around 450 million unique
    >> items and Riak memory consumption is ~110 GB. The input job is
    >> probably 95% new puts and 5% overwriting puts.
    >>
    >> According to the capacity planner tools, our key space should
    >> probably be about half what our actual memory consumption is.
    >>
    >> As you can see in the chart, we're also seeing jumps in memory
    >> size at random intervals. What might these be? Nothing interesting
    >> in the logs that I can see. Regular merges.
    >>
    >> A close up <http://picturepush.com/public/6332084> of a recent
    >> jump in memory consumption for one of the nodes (they all look the
    >> same). There are no corresponding distinct patterns within the CPU
    >> chart. Things are pretty flat, although we have more wait sometimes
    >> than we like (clearly we need more spindles).
    >>
    >>
    >> Any helpful thoughts?
    >>
    >> Thanks,
    >> Jacques
    >>
    >>
    >>
    >>
    >>
    >> _______________________________________________
    >> riak-users mailing list
    >> [email protected]
    >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



