Hi Nick!

The combination of deriving traits and Rust's ownership model is great for
memory reporting, and this stuff is _so_ much nicer to define than the
equivalent about:memory measurements in Firefox.

But it doesn't play out as well in the presence of a GC: ownership is
muddied. The Arc/Rc issues you mention are the tip of the iceberg, as
ownership is relaxed and heap edges proliferate. In the general case of
arbitrary edges that GCs support, one needs to analyze the heap graph to
get equivalent information about why something is alive and how much heap
space it is responsible for. Things like computing dominator trees and the
shortest path to some object from GC roots. For everything in Servo relying
on SpiderMonkey's GC, I think it makes sense to use the `JS::ubi::Node`
infrastructure[0] from SpiderMonkey, which gives a graph API to the GC
heap, and has various analyses (dominator trees, shortest paths, stack at
allocation time, etc) out of the box. This is what we use for the Firefox
DevTools' memory tool.

Of course, `JS::ubi::Node` is fairly heavily templated and has many inline
functions, so exposing this infrastructure to Rust will take some care and
effort.

Backing up a bit: there is a large benefit in separating the heap graph
traversal from the analyses run on the heap graph.

* With a separation, we can traverse the heap graph at some particular
point in time, serialize it to a core dump file, and then perform as much
post processing and analyzing as we like, without needing to reproduce that
exact heap graph over and over. This is similar to how rr separates
reproducing a bug from debugging it.

* With a separation, the bucketing of various types and their collective
heap size is just *one* analysis of the heap graph. We could have many
others, like the ones I've mentioned above: dominator trees, shortest path
to GC roots (I guess to a stack variable or global in general Rust),
dynamic leak detection[1], etc. The possibilities are open ended.

So maybe instead of using `JS::ubi::Node`, what do you think of creating
`#[derive(HeapGraph)]` and implementing `MallocSizeOf`, etc on top of it?

If `#[derive(HeapGraph)]` implemented `petgraph`'s graph traits, we would
even get shortest paths and dominators[2] for free.

Interested in your thoughts on the subject!

Cheers,

Nick

[0] http://searchfox.org/mozilla-central/source/js/public/UbiNode.h#32-164
[1] http://www.cs.utexas.edu/ftp/techreports/tr06-07.pdf
[2] https://github.com/bluss/petgraph/blob/master/src/algo/dominators.rs

On Thu, Oct 5, 2017 at 6:27 PM, Nicholas Nethercote <n.netherc...@gmail.com>
wrote:

> Hi,
>
> Currently we have two similar but distinct approaches to measuring memory
> usage
> in Servo.
>
> - Servo uses the heapsize and heapsize_derive crates, from crates.io.
>
> - Gecko uses the malloc_size_of and malloc_size_of_derive crates, which are
> in
>   the tree.
>
> Because of this, you see this pattern quite a bit in style code:
>
> > #[cfg_attr(feature = "gecko", derive(MallocSizeOf))]
> > #[cfg_attr(feature = "servo", derive(HeapSizeOf))]
> > struct Foo {
> >     ...
> > }
>
> Why the difference? heapsize is the original approach. It sorta works, but
> has
> some big flaws. malloc_size_of is a redesign that addresses these flaws.
>
> - heapsize assumes you only want a single number for the size of any data
>   structure, when sometimes you want to break it down into different
> buckets.
>
>   malloc_size_of provides both "shallow" and "deep" measurements, which
> give
>   greater flexibilty, which helps with the multi-bucket case.
>
> - heapsize assumes jemalloc is the allocator. This causes build problems in
>   some configurations, e.g. https://github.com/servo/heapsize/issues/80.
>   It also means it doesn't integrate with DMD, which is the tool we use to
>   identify heap-unclassified memory in Firefox.
>
>   malloc_size_of doesn't assume a particular allocator. You pass in
> functions
>   that measure heap allocations. This avoids the build problems and also
> allows
>   integration with DMD.
>
> - heapsize doesn't measure HashMap/HashSet properly -- it computes an
> estimate
>   of the size, instead of getting the true size from the allocator. This
>   estimate can be (and in practice often will be) an underestimate.
>
>   malloc_size_of does measure HashMap/HashSet properly. However, this
> requires
>   that the allocator provide a function that can measure the size of an
>   allocation from an interior pointer. (Unlike Vec, HashMap/HashSet don't
>   provide a function that gives the raw pointer to the storage.) I had to
> add
>   support for this to mozjemalloc, and vanilla jemalloc doesn't support it.
> (I
>   guess we could fall back to computing the size when the allocator doesn't
>   support this, e.g. for Servo, which uses vanilla jemalloc.)
>
> - heapsize doesn't measure Rc/Arc properly -- currently it just defaults to
>   measuring through the Rc/Arc, which can lead to double-counting.
> Especially
>   when you use derive, where it's easy to overlook that Rc/Arc typically
> need
>   special handling. (This is https://github.com/servo/heapsize/issues/37.)
>
>   malloc_size_of does measure Rc/Arc properly. It lets you provide a table
> that
>   tracks which pointers have already been measured, which is used to
> prevent
>   double-counting. malloc_size_of also doesn't implement the standard
>   MallocSizeOf trait for Rc/Arc, which means they can't be unintentionally
>   measured via derive. (You can still use derive if you explicitly choose
> to
>   ignore all Rc/Arc fields, however.)
>
> Basically, malloc_size_of is heapsize done right, and using both is silly.
> I
> went with this dual-track approach while adding memory reporting to Stylo
> because time was tight and the exact design choices required to handle all
> the
> necessary cases weren't clear. But now that things have settled down I'd
> like
> to pay back this technical debt by removing the duplication.
>
> I see two options.
>
> - Overwrite the heapsize crate on crates.io with the malloc_size_of code.
> So
>   the crate name wouldn't change, but the API would change significantly,
> and
>   it would still be on crates.io. Then switch Servo over to using heapsize
>   everywhere.
>
> - Switch Servo over to using malloc_size_of everywhere. (This leaves open
> the
>   question of what should happen to the heapsize crate.)
>
> I personally prefer the second option, mostly because I view all of this
> code
> as basically unstable -- much like the allocator APIs in Rust itself -- and
> publishing it on crates.io makes me uneasy. Also, keeping the code in the
> tree
> makes it easier to modify.
>
> Thoughts?
>
> Nick
> _______________________________________________
> dev-servo mailing list
> dev-servo@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-servo
>
_______________________________________________
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo

Reply via email to