Re: [Puppet-dev] memory leak tips...

Ben Ford Tue, 14 Oct 2014 08:37:03 -0700

I have to admit that this email made me feel a little bit dumb. Could you
provide a TL;DR summary that at least provides a little context for this?
Is this something that people writing types, functions, hiera backends, or
report processors need to concern themselves with?


On Mon, Oct 13, 2014 at 3:57 PM, Henrik Lindberg <
henrik.lindb...@cloudsmith.com> wrote:

> Hi,
> As you may know a memory leak was found in 3.7 (PUP-3345) and it seems
> like we found the cause of the problem. (YAY !!!)
>
> In order to find the leak, I came up with some (rough) tools to help
> detect leakage. Below are some tips if you want to use them. But first, the
> cause.
>
> Basically, the cause was a "faulty" cache implementation that made several
> assumptions that were not correct. So, here are some tips on what not to do.
>
> Do not hold on to things in class variables (e.g. @@my_cache) unless the
> cache only contains things that are in the loaded ruby code and share the
> same lifecycle as the class. Alternatively, you must have something that
> evicts the cache content on some sort of transaction boundary. In the case
> found, this did not happen, and for each environment, it added a reference
> to a resource type instance (and since they get reloaded for each
> environment, the cache kept on growing).
>
> I would go so far as to say: almost never use the class level for regular
> programming; create instances instead. That forces you to think about the
> lifecycle: when is it created, when do the things it holds on to get
> freed, etc.
>
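
A minimal Ruby sketch of the class-variable problem described above. Every
class and method name here is hypothetical (none of them exist in Puppet);
it only contrasts a class-level cache with a cache whose lifetime matches one
unit of work.

```ruby
# Hypothetical names throughout; none of these classes exist in Puppet.

# Stand-in for something that is re-created for every environment.
class FakeResourceType
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Anti-pattern: a class variable lives as long as the class itself, so every
# reloaded instance stored here stays reachable for the life of the process.
class LeakyRegistry
  @@cache = []

  def self.remember(type)
    @@cache << type
  end

  def self.size
    @@cache.size
  end
end

# Alternative: the cache is an instance that is created and dropped together
# with the unit of work (for example one environment or one compilation).
class ScopedRegistry
  def initialize
    @cache = {}
  end

  def remember(type)
    @cache[type.name] = type
  end

  def size
    @cache.size
  end
end

# Runs the same "reload" several times and returns the size of the scoped
# cache observed in each run.
def simulate(runs)
  sizes = []
  runs.times do
    scoped = ScopedRegistry.new            # created at the transaction boundary
    type = FakeResourceType.new("notify")  # "reloaded" on every run
    LeakyRegistry.remember(type)
    scoped.remember(type)
    sizes << scoped.size
  end
  sizes
end

scoped_sizes = simulate(10)
puts "class-level cache entries: #{LeakyRegistry.size}"
puts "instance cache entries per run: #{scoped_sizes.uniq.inspect}"
```

The class-level cache grows by one entry per run, while each scoped cache
holds a single entry and becomes garbage once its run is over.
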
> When using an object as a hash key, that object typically must have a hash
> method and an equality method (eql? in Ruby), or you will very likely end
> up with an ever-growing set of entries in the hash.
>
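
To illustrate the hash key point, here is a small sketch with two
hypothetical key classes. One relies on the default identity-based hash and
eql?, the other defines both from its content.

```ruby
# Hypothetical key classes, for illustration only.

# No hash/eql? overrides: two keys with the same content are different keys.
class IdentityKey
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# hash and eql? are derived from the content, so equal keys share one entry.
class ValueKey
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def hash
    name.hash
  end

  def eql?(other)
    other.is_a?(ValueKey) && name == other.name
  end
  alias_method :==, :eql?
end

# Inserts `runs` freshly built keys with identical content and reports how
# many entries the hash ends up with.
def fill(key_class, runs)
  cache = {}
  runs.times { cache[key_class.new("production")] = :compiled }
  cache.size
end

puts "IdentityKey entries: #{fill(IdentityKey, 100)}"
puts "ValueKey entries:    #{fill(ValueKey, 100)}"
```

With IdentityKey every insertion adds a new entry; with ValueKey the hash
stays at one entry no matter how many equal keys are built.
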
> If you are tempted to use the support for WeakRef in Ruby, then give up
> immediately, since it is horribly slow on Ruby 1.8 and does not work
> correctly on Ruby 1.9 (it seems to be based on object ids that can get
> recycled). If it worked, however, a WeakRef would be ideal for cache
> implementations, since it only keeps hold of the object while something
> else is also referencing it. (There is still plenty of opportunity to write
> a cache that is incorrect, though.)
>
> Before you implement a cache - measure if the cache is an actual speed
> improvement! The overhead of a cache may eat the performance gain - or it
> may even be worse!
>
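
One rough way to measure before committing to a cache, sketched with a
hypothetical cheap computation (compute) and the monotonic clock. The numbers
depend entirely on the machine and the workload, so treat this as a template
rather than a result.

```ruby
# Hypothetical workload: a cheap computation that a cache may not speed up.
def compute(name)
  name.upcase
end

# Runs the given block `iterations` times and returns the elapsed seconds.
def timed(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Times the same lookup with and without a hash-based cache in front of it.
def compare(iterations)
  cache = {}
  uncached = timed(iterations) { compute("notify") }
  cached   = timed(iterations) { cache["notify"] ||= compute("notify") }
  { uncached: uncached, cached: cached }
end

result = compare(200_000)
puts format("uncached: %.4fs  cached: %.4fs", result[:uncached], result[:cached])
```

If the two timings are close, the cache is only adding bookkeeping and
another place for references to pile up.
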
> Avoid binding lots of objects in the cache. Bind an identifier / name if
> possible. You may think you are keeping track of a Banana, but attached to
> that you may have a Gorilla, and it needs its jungle...
>
> The "Tools"
> ===========
> A new "benchmark" was added to the code base called "catalog_memory" - it
> is the same benchmark as "empty_catalog" (it contains a single "hello
> world" notice in each catalog), but the "catalog_memory" is instrumented to
> dump information about memory usage.
>
> To run this, you must be using Ruby 2.1.0. Then (if running from source)
> do:
>
> bundle exec rake benchmark:catalog_memory
>
> This will print some stats about the first and last run (it does 10 runs).
> It then computes the set of objects in memory that were not bound at the
> start, and it outputs two data files; "heap.json" with information about
> all live objects in memory, and "diff.json" with information about the diff
> between start and end of the run.
>
> It also outputs a list of source locations and methods being called where
> the allocations of the "leaked" objects were made. This list is typically
> not very helpful unless the leak is trivial.
>
> Once at this point, there is a rake task called "memwalk" that reads the
> two files "heap.json" and "diff.json" and produces a graphviz .dot file
> that can be rendered. The result is a graph of all objects in memory and
> how they bind each other. (There is more to say about this...)
>
> You run this task with:
>
> bundle exec rake memwalk
>
> Then you produce the graph with the command:
>
> dot -Tsvg -omemwalk.svg memwalk.dot
>
> You now have a "memwalk.svg" file that you can open in Chrome. Nice
> features are that you can search the graph (like searching on any web
> page), and you can zoom and pan.
>
> The graph has a bubble per object, and it shows its address in hex. Arrows
> point to referenced objects from objects that bind them.
>
> The graph is pruned of all arrays, hashes and leaf data objects. For
> arrays and hashes it skips over them, and instead shows the object that
> ultimately holds on to the structure (without the intervening nested
> structure). This keeps the graph readable (and at a size that is possible
> to process and view).
>
> The memwalk command prints out some information about what it rendered
> (counts). If you see something like tens of thousands of objects then the
> leak is massive and you may not be able to process it (nor be able to read
> and navigate the huge graph).
>
> To find a leak, browse the resulting graph, and find clusters that are not
> supposed to be there. In the current case, there were 10
> Puppet::Node::Environment objects and there was only supposed to be one.
>
> Then copy the address of one of the objects that are not supposed to be
> there in order to do a walk of only it and the objects that keep it alive,
> say 7f9afa20ba38.
>
> Then run memwalk again, now for this object (you need to quote the
> argument now):
>
> bundle exec rake 'memwalk[7f9afa20ba38]'
>
> This creates a file called memwalk-7f9afa20ba38.dot that you can now
> render using the dot command.
>
> View that and look at how it is bound. You may find that it is indirectly
> bound, and you may need to repeat this with what now appears to be a root
> holding on to a cluster of objects.
>
> Once you have got this far, you know the class(es) involved. You may also
> want to figure out where it was allocated, and you can do that by using
> grep on heap.json - say:
>
> grep 7f9afa20ba38 heap.json
>
> which will print out the information about this allocation (among other
> things it shows the file and line where it was allocated, a list of objects
> it references, and the address (in hex) of the class object).
>
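
The same lookup can be scripted. This is a sketch that assumes heap.json
holds one JSON object per line with "address", "references", "file" and
"line" keys; that layout, the sample records and the file names in them are
assumptions made for illustration, not output from a real run.

```ruby
require 'json'

# Hypothetical sample in the assumed line-per-object layout. A real heap.json
# would be read with File.foreach("heap.json") instead.
SAMPLE_HEAP = <<~HEAP
  {"address":"0x7f9afa20ba38","type":"OBJECT","class":"0x7f9afa1000a0","references":["0x7f9afa20c000"],"file":"environment.rb","line":42}
  {"address":"0x7f9afa20c000","type":"STRING","class":"0x7f9afa100200","file":"environment.rb","line":43}
  {"address":"0x7f9afa30d000","type":"OBJECT","class":"0x7f9afa100300","references":["0x7f9afa20ba38"],"file":"cache.rb","line":7}
HEAP

# Parses each line of the dump into a Hash.
def parse_heap(lines)
  lines.map { |line| JSON.parse(line) }
end

# Compares two hex addresses, ignoring a "0x" prefix and leading zeros.
def same_address?(candidate, wanted)
  candidate.to_s.sub(/\A0x0*/, '') == wanted.sub(/\A0x0*/, '')
end

# Returns the record for the object at the given address, or nil.
def find_object(records, address)
  records.find { |r| same_address?(r['address'], address) }
end

# Returns the records of all objects that reference the given address.
def holders_of(records, address)
  records.select do |r|
    Array(r['references']).any? { |ref| same_address?(ref, address) }
  end
end

records = parse_heap(SAMPLE_HEAP.each_line)
target = find_object(records, '7f9afa20ba38')
puts "allocated at #{target['file']}:#{target['line']}"
holders_of(records, '7f9afa20ba38').each do |holder|
  puts "held by #{holder['address']} (#{holder['file']}:#{holder['line']})"
end
```

Unlike a plain grep, holders_of only returns the objects that reference the
address, which is the direction you walk when looking for what keeps an
object alive.
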
> This allows you to manually grep / walk the heap to find more details.
> (Or continue hacking on the memwalk rake script to make it do what you
> want.)
>
> Hope the above is of help to someone having to track down a memory leak in
> the future...
>
> Regards
> - henrik
>
>
> --
>
> Visit my Blog "Puppet on the Edge"
> http://puppet-on-the-edge.blogspot.se/
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to puppet-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/puppet-dev/m1hld0%24mfc%241%40ger.gmane.org.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Ben Ford | Training Solutions Engineer
Puppet Labs, Inc.
926 NW 13th Ave, Suite #210
Portland, OR 97209

509.592.7291
ben.f...@puppetlabs.com

