On 15/05/2011 2:37 PM, Patrick Walton wrote:
> Hi everyone,
> It occurred to me that our insistence on dynamic linking for the
> standard library is likely to cause performance problems.
It might. Measure twice, cut once.
> Consider uint::range(). This is the preferred way to do a C-style for
> loop in Rust. Up to this point we've been preferring to write C-style
> for loops as while loops, as near as I can tell, due to some early
> flakiness in rustboot, and for performance reasons.
I'm not sure there's a single preferred way. We have some special-casing
in the language for built-in loop types (elements of vectors for
example, avoiding bounds-checking on each iteration) and we could do
more; for example we could add a 3-clause C style for loop, or a "from
.. to .. by" range loop over scalar types. Or other options, see below.
> At the moment, *each iteration* of uint::range() requires not one, but
> two indirect function calls
Two? It should only be one. Range calls block, block runs, range does +=
1, range calls block again. It's also not exactly a surprising indirect
function; it keeps calling the same one, from the same place in the code
+ stack. IOW it'll *probably* be well predicted.
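To make the calling pattern described above concrete, here is a minimal sketch in modern Rust syntax (not 2011 Rust; the `range` function, its `dyn FnMut` parameter, and `main` are illustrative, not our actual library code). Taking the block as a trait object forces exactly one indirect call per iteration:

```rust
// Illustrative only: an iter that invokes its block through a
// trait-object pointer, i.e. one indirect call per iteration.
fn range(lo: usize, hi: usize, block: &mut dyn FnMut(usize)) {
    let mut i = lo;
    while i < hi {
        block(i); // range calls block, block runs
        i += 1;   // range does += 1, then calls block again
    }
}

fn main() {
    let mut sum = 0;
    range(0, 5, &mut |i| sum += i);
    println!("{}", sum); // 0 + 1 + 2 + 3 + 4 = 10
}
```

Note that the indirect call target never changes across iterations, which is why a branch predictor should handle it well.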
> that are opaque to
> LLVM (indeed, each boundary is a DLL boundary). This is likely not going
> to work; we're inhibiting all loop optimizations and forcing new stack
> frames to be created for every trip around the loop.
It is definitely going to inhibit some optimizations, it's a question of
asking which ones and how high the costs are. Measure?
If you find it is really bad, which is surely possible, then I'd note
that for scalar-range loops I think it's legit to ask for some language
help (possibly including pointer-range? maybe we can unify the fast
vec-element for loop, since it's just a pointer scalar-range).
For iters with substantially more structure, I don't think "having a
function call in the middle" is a guaranteed major performance issue.
It's worth measuring a few different "kinds" of iter before jumping to
the conclusion that it can't involve any calling.
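In that spirit, one cheap starting measurement might look like the following (a hypothetical harness in modern Rust; the names and the `std::time::Instant` timing are my own, not anything in our tree): the same summation done as a plain scalar loop and again routed through an out-of-line callback iter.

```rust
use std::time::Instant;

// Out-of-line iter: indirect call per iteration via a trait object,
// and #[inline(never)] to keep the optimizer from seeing through it.
#[inline(never)]
fn iter_range(lo: u64, hi: u64, block: &mut dyn FnMut(u64)) {
    let mut i = lo;
    while i < hi {
        block(i);
        i += 1;
    }
}

fn main() {
    const N: u64 = 10_000_000;

    // Plain scalar-range loop: the compiler sees the whole loop body.
    let t0 = Instant::now();
    let mut direct = 0u64;
    for i in 0..N {
        direct = direct.wrapping_add(i);
    }
    let t_direct = t0.elapsed();

    // Callback version: same work, but through an indirect call.
    let t1 = Instant::now();
    let mut via_iter = 0u64;
    iter_range(0, N, &mut |i| via_iter = via_iter.wrapping_add(i));
    let t_iter = t1.elapsed();

    assert_eq!(direct, via_iter); // same answer either way
    println!("direct: {:?}, via iter: {:?}", t_direct, t_iter);
}
```

Expect the direct loop to be optimized far more aggressively (possibly folded away entirely), which is exactly the loop-optimization effect under discussion; the interesting question is how much that gap matters for realistic iter bodies.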
> So range() can't be dynamically linked anymore. The simplest
> solution I can think of is to allow both static and dynamic library crates.
Certainly possible. You're jumping through a few inferences to a limited
set of conclusions but it's a possible approach.
> There's been a movement lately in some circles to forbid dynamic linking
> entirely; Go for example explicitly does not support it (in keeping with
> the Plan 9 tradition). I tend to disagree; I think that supporting
> dynamic linking is useful to enable memory savings and to make security
> problems in the wild easy to patch.
Yeah. I'm not really a "fan" of dynamic linking. There are also studies
showing the putative memory savings of dynamic linking are .. pretty
hypothetical. But then it really depends on the system load, system
design, what you're measuring, and who you ask.
I mostly wanted dynamic linking so that people who *are* convinced they
want it can have it, for whatever reason. It's a common sentiment and as
I said before, I'm a pluralist. I'm usually fine statically linking the
programs I write. But I want dynamic to work.
> But exclusively relying on it is not going to be tenable;
> some routines are so simple, yet so critical to performance, that static
> linking is probably going to be the only viable option.
I think you mean more than "static linking", since LLVM is not a linker;
you mean static linking where the .a file actually contains LLVM
bitcode, a la the LLVM-LTO approach. That's plausible. It's *different
again* from actual static linking. And it's different from pickled ASTs,
and different from macros. So there are like ... 4 points on the
spectrum here in addition to our existing compilation strategy:
- Pickling ASTs so that we can load in STL-like "declared externally
but mixed in to your compilation unit" functions. So that the
inlining or intertwingling can happen at the front-and-middle ends,
not just the LLVM layer.
- Powerful macros that cover the "needs to be efficient" cases as well
as the various other macro scenarios. Definitely front-end!
- Static linking, just to avoid having runtime dependencies and PIC
code and such. But still "calls a function on each iter".
- Static linking LLVM bitcode, to get LTO across crates and cross-iter
inlining.
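As a sketch of the macro point on that list: in today's `macro_rules!` syntax (which didn't exist in this form at the time; the macro name and shape are made up for illustration), an "efficient range loop" macro expands the loop inline at the call site, so there is no call per iteration at all:

```rust
// Hypothetical sketch: the macro expands to a plain while loop,
// so the block is pasted inline rather than called.
macro_rules! for_range {
    ($i:ident, $lo:expr, $hi:expr, $body:block) => {{
        let mut $i = $lo;
        while $i < $hi {
            $body
            $i += 1;
        }
    }};
}

fn main() {
    let mut sum = 0;
    for_range!(i, 0usize, 5, { sum += i; });
    println!("{}", sum); // 10
}
```

The expansion is indistinguishable from a hand-written while loop, so all the usual loop optimizations apply, at the cost of doing the work in the front end.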
On this list, I'd point out a couple things:
- We want a macro system anyways. So let's build that soon.
- You only have an inlining problem *when* you're dealing with
separate crates. Granted, uint::range is "standard library fodder",
but if you have some custom data structure X with iters on X
implemented in the same crate as the iter-callers, LLVM may well
inline the iter already. And uint::range is one of those cases
that will fall into "supported by a scalar-range special case loop",
so let's not over-generalize the problem too quickly.
- We wanted some kind of "cache mode" for saving and reloading .bc
files transparently inside the compiler, to avoid re-trans'ing
subsets of a crate that the compiler determines as not having
changed run-to-run. So that will probably help set up a framework
for loading and static linking external crates as LLVM bitcode.
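On the crate-internal point above, a sketch in modern terms (the generic-bound syntax and names here are illustrative, not 2011 Rust): when the iter takes its block as a generic parameter and lives in the same compilation unit as its callers, the call target is a concrete closure type, so the optimizer can see and inline it.

```rust
// Crate-internal iter: the closure is a concrete type here, so the
// call is direct and inlinable, unlike a cross-DLL indirect call.
#[inline]
fn each_even<F: FnMut(u32)>(hi: u32, mut block: F) {
    let mut i = 0;
    while i < hi {
        if i % 2 == 0 {
            block(i);
        }
        i += 1;
    }
}

fn main() {
    let mut evens = Vec::new();
    each_even(10, |i| evens.push(i));
    println!("{:?}", evens); // [0, 2, 4, 6, 8]
}
```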
I'd probably say go with the macros and the scalar-range loop *first*
while measuring the cost of DLL vs. crate-internal iter vs. scalar-range
loops. If you have a serious performance problem that definitely can't
be solved with scalar-range optimizations, then start digging into the
LLVM-LTO stuff.
The reason is simply this: if "standard wisdom" is to use LLVM-LTO
against the standard library to get "adequate performance", you get two
big follow-on problems:
  - The idea that "compilation may be fast" is shot; the standard
    approach will involve rebuilding libstd every time. We're back in
    C++ territory.
- You're leaving DLL-based users out in the cold *by design*, rather
than trying to find an approach that gets them decent performance
too.
So I'd like to approach cross-crate LLVM-LTO as a last resort, not a
first one.
-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev