Thom,

On Tue, Jun 26, 2018 at 11:08 PM, Thom Chiovoloni <tchiovol...@mozilla.com> wrote:

> So, it came up in the Sync.Next meeting today that we're a bit concerned
> about the mentat binary size, and don't know where it comes from, etc. My
> completely unfounded suspicion has been that there are some easy wins here.
> So I decided to see if this is true. (I'm sending this to the sync-dev
> mailing list even though it's clearly the wrong place mostly because it
> would get completely lost if I just posted it in the slack channel, and it
> seems plausible that someone might want to refer to it later).
>

This was my suspicion too, so thank you for digging in and producing this
excellent report.

> I decided to test using the mentat-cli binary, built for a 64-bit mac.
> Because that's the platform I use, and because binaries are a lot easier to
> do measurements on than libraries. Especially when we do LTO, which we do,
> also, on binaries you can actually `strip` them without reading a bunch of
> manpages on what flags to pass `strip` [0].
>
> Anyway, the baseline on my machine for mentat_cli is 9539784 bytes
> (9.5MBish), after running `strip` it shrinks to 7855608 bytes.
>
> First step is changing the optimization level to one that is set to
> optimize for size. There are two of these, opt-level "s" and "z", opt-level
> "z" being the more substantial. This shrunk the binary to 9114776 bytes,
> 5352264 after strip.
>
> The next idea I had is that a lot of the libraries we import seem to
> involve networking. The only network code we have is in `tolstoy`, which is
> very experimental and not something we're planning on using in production.
> Moving this to live under a `feature` flag reduces our size to 7396480
> bytes, 4571776 bytes after strip.
>

The WIP branch that Grisha and I are working on includes a "syncable"
feature that does this, so we'll handle being able to avoid Hyper, etc in
the very near future. But thanks for the reminder!

> I also noticed a few libraries that had multiple versions built.
> Specifically `regex` seems like it might be heavy (at the very least, it
> has dependent crates), and we're building both 0.2 and 1.0.1 (the former is
> specified by mentat_query_sql, and the latter by env_logger). Moving both
> of these to be 1.0.1 brings the size down to 6565472 bytes, 4036968 after
> strip.
>
> (Worth noting that regex is a transitive dependency from `env_logger`,
> which I suspect we aren't thrilled with, and the use of it inside
> mentat_query_sql could probably be trivially rewritten
> <https://github.com/mozilla/mentat/blob/master/query-sql/src/lib.rs#L567>
> to avoid the dependency.)
>

I filed a few issues rooted at https://github.com/mozilla/mentat/issues/772.
We should definitely cull `regex`, but `env_logger` is an application choice
more than a Mentat choice.

> There are probably other targets for this (the `memchr` lib seems to be
> included twice, but while I've done exactly no checking, my gut says it
> doesn't have the same heft as `regex`).
>

Is it possible to estimate a size metric for each of our dependencies? Yes,
it's difficult with LTO/inlining/dead code removal, but it would help gauge
where to put effort.

> There are two [1] more things I tried.
>
> It seems likely for various reasons that we will have to build mentat with
> panic="abort" when distributing an FFI binary. This is mainly because it's
> undefined behavior to `panic` across FFI boundaries, which basically means
> arbitrarily bad things can happen (see [2] below on some hedging on this,
> but I really don't know what other options we have here). Doing this
> reduced the size to 5357500 bytes, 3484324 bytes after stripping.
>
> Finally, I tried replacing jemalloc with the system allocator. This shaved
> off less than I was expecting, but not nothing. End result was 5097456
> bytes, 3293640 after stripping.
>
> This is under half the size we started with (for the stripped library,
> it's a bit over half unstripped, but who needs debug symbols?).
> At this point I'm giving up. It's kind of late and I've had a few beers,
> and I think I've hit most of the low-hanging fruit. You can stare at this
> work here <https://github.com/thomcc/mentat/tree/shrink-binary> if you have
> a burning need to, some of it is probably worth PRing too! I'll do that
> tomorrow.
>
> - Thom
>
> [0] These last two comments are probably the cause of some of our
> confusion here, and the first might make this work not terribly
> representative, although I'd be surprised if many of the changes that saved
> size don't do the same for a library -- given that we intend to ship as
> static or dynamic libraries (e.g. we distribute native code libraries that
> might not always get benefits from LTO), it's possible that our strategy of
> 'put everything in separate crates and rely on LTO to sort it out for us'
> might not be so great. That also could be wrong!
>
> [1] Actually, I tried more but most didn't really work. Like hackily
> removing our dependency on `num`, which seems to exist primarily so we can
> use bigint, which we don't fully implement. This only shaved off about 10k,
> but I guess that's not too surprising since we aren't doing any arithmetic
> with the bigints.
>
> [2] While some libraries are able to avoid this using
> `std::panic::catch_unwind`, rusqlite doesn't support this due to use of
> Cell and RefCell. Neither does `sync15_adapter` (although I'll likely fix
> this, as we should actually be unwind-safe). In the long term, it's not
> clear to me that we want `panic = abort` behavior, although from what I can
> tell, most of mentat was written with a pattern like this in mind (I could
> be wrong).
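To make the `catch_unwind` option in [2] concrete, here is a minimal sketch
of guarding an FFI entry point. Everything named in it (`guard`,
`example_div`, the status codes) is hypothetical and is not mentat's or
rusqlite's API. It relies on `std::panic::AssertUnwindSafe`, which only
silences the compile-time `UnwindSafe` check; it does not make state held in
`Cell`/`RefCell` safe to reuse after a panic, so treat it as an illustration
of the mechanism rather than a recommendation.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical status codes, for illustration only.
pub const STATUS_OK: i32 = 0;
pub const STATUS_NULL: i32 = -1;
pub const STATUS_PANIC: i32 = -2;

/// Run `f` and turn a panic into a status code instead of letting it
/// unwind out of the caller.
pub fn guard<F: FnOnce() -> i32>(f: F) -> i32 {
    // AssertUnwindSafe is our own promise that nothing touched by `f` is
    // used in a broken state afterwards; the compiler cannot check this.
    match panic::catch_unwind(AssertUnwindSafe(f)) {
        Ok(code) => code,
        Err(_) => STATUS_PANIC,
    }
}

/// Hypothetical C-ABI function. A real export would also need a
/// no_mangle attribute; it is left out here to keep the sketch short.
pub extern "C" fn example_div(a: i32, b: i32, out: *mut i32) -> i32 {
    guard(|| {
        if out.is_null() {
            return STATUS_NULL;
        }
        let q = a / b; // panics when b == 0
        // Safety: `out` was checked for null; the caller owns the pointee.
        unsafe { *out = q };
        STATUS_OK
    })
}

fn main() {
    let mut q = 0;
    let status = example_div(10, 2, &mut q);
    println!("status = {}, q = {}", status, q);
    let status = example_div(1, 0, &mut q);
    println!("status after divide by zero = {}", status);
}
```

Note that the guard only helps in builds that unwind. With panic = "abort"
the process exits at the panic and `catch_unwind` never gets to return an
error.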
>
> So, compiling with panic = "abort" is probably what we'll want to do in
> the short term, *maybe* what we'll want to do in the long term, and should
> result in a substantial space saving (no libbacktrace, no code bloat from
> landing pads, etc), so even though it's possible that it's not great for
> building something robust, I tried it.
>

Thanks for getting this started!

Nick
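
For reference, the sizes quoted above work out to the per-step savings
computed below. This is only arithmetic on the numbers in Thom's message;
the step labels are my own shorthand, and because the steps were applied
cumulatively in this order, the savings attributed to each one would likely
differ if the order changed.

```rust
/// (step label, unstripped bytes, stripped bytes), copied from the message.
static STEPS: [(&str, u64, u64); 6] = [
    ("baseline", 9_539_784, 7_855_608),
    ("opt-level = \"z\"", 9_114_776, 5_352_264),
    ("tolstoy behind a feature", 7_396_480, 4_571_776),
    ("single regex version", 6_565_472, 4_036_968),
    ("panic = \"abort\"", 5_357_500, 3_484_324),
    ("system allocator", 5_097_456, 3_293_640),
];

/// Stripped bytes saved by each step relative to the step before it.
fn stripped_savings() -> Vec<(&'static str, u64)> {
    STEPS
        .windows(2)
        .map(|w| (w[1].0, w[0].2 - w[1].2))
        .collect()
}

/// `size` as a whole-number percentage of `baseline`, rounded down.
fn percent_of_baseline(size: u64, baseline: u64) -> u64 {
    size * 100 / baseline
}

fn main() {
    for (label, saved) in stripped_savings() {
        println!("{:<26} saves {:>9} bytes (stripped)", label, saved);
    }
    let (_, base_unstripped, base_stripped) = STEPS[0];
    let (_, last_unstripped, last_stripped) = STEPS[STEPS.len() - 1];
    println!(
        "stripped:   {}% of baseline",
        percent_of_baseline(last_stripped, base_stripped)
    );
    println!(
        "unstripped: {}% of baseline",
        percent_of_baseline(last_unstripped, base_unstripped)
    );
}
```

The final stripped binary comes to 41% of the stripped baseline and the
unstripped one to 53%, which matches "under half" and "a bit over half".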
_______________________________________________
Sync-dev mailing list
Sync-dev@mozilla.org
https://mail.mozilla.org/listinfo/sync-dev