Hello,
I wanted to check something. We are working on the Great Change to a
more flexible vector system and I want to outline the design that's in
my head. This has some implications for How Efficient Rust Code Is
Written, so I wanted to make sure we were all on the same page.

*Implications for writing efficient Rust*

I figured I'd just start with the implications for writing Rust.
Currently, to build up a vector, we rely upon an idiom like:
    let mut v = [];
    for some loop { v += [elt]; }
Now, often, such loops can (and should) be written using a higher-order
function (e.g., map, filter). But sometimes not. In such cases, under
the new regime, the recommended idiom would be:
    let dv = dvec();
    for some loop { dv.push(elt); }
    let v = dvec::unwrap(dv); // if necessary, convert to a vector
Actually the name `dvec()` (dynamic vector) will probably change—perhaps
to vecbuf? mvec? suggestions welcome—but you get the idea. The same
would eventually apply to building up strings.
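To make the shape of the idiom concrete, here is a minimal sketch in current Rust syntax. `DVec`, its methods, and the `squares` helper are hypothetical names for illustration only; the sketch leans on the standard `Vec` for storage and is not the proposed implementation.

```rust
/// Hypothetical builder type standing in for `dvec` (illustration only).
pub struct DVec<T> {
    buf: Vec<T>,
}

impl<T> DVec<T> {
    /// Create an empty builder.
    pub fn new() -> Self {
        DVec { buf: Vec::new() }
    }

    /// Append in place; the buffer is owned by the builder, so nobody
    /// else can observe the intermediate states.
    pub fn push(&mut self, elt: T) {
        self.buf.push(elt);
    }

    /// Finish construction and hand back the plain vector.
    pub fn unwrap(self) -> Vec<T> {
        self.buf
    }
}

/// Example use of the idiom: build, push in a loop, then unwrap.
pub fn squares(n: u32) -> Vec<u32> {
    let mut dv = DVec::new();
    for i in 0..n {
        dv.push(i * i);
    }
    dv.unwrap()
}
```

Because `unwrap` takes the builder by value, the builder cannot be used again once the vector has been handed out.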
Basically, the idea is that we have "builder" types for vectors and
strings. These builder types will overallocate and use dirty tricks to
achieve reasonable performance. Convenience operators like `+` will do
none of this: each use allocates a result of exactly the size needed.
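A rough way to see the cost difference, again in current Rust syntax and under my own assumptions: `build_by_concat` models an append that always allocates an exactly-sized result, and `build_by_push` uses the standard `Vec`, which overallocates as it grows. Both function names are made up for this sketch, and the counters exist only to make the work visible.

```rust
/// Append by allocating a fresh vector each time, as a plain `+` would.
/// Returns the result and the total number of elements copied.
pub fn build_by_concat(n: usize) -> (Vec<usize>, usize) {
    let mut v: Vec<usize> = Vec::new();
    let mut copied = 0;
    for i in 0..n {
        let mut next = Vec::with_capacity(v.len() + 1);
        next.extend_from_slice(&v); // copies every existing element
        copied += v.len();
        next.push(i);
        v = next;
    }
    (v, copied)
}

/// Append in place to an overallocating buffer.
/// Returns the result and the number of times the capacity changed.
pub fn build_by_push(n: usize) -> (Vec<usize>, usize) {
    let mut v: Vec<usize> = Vec::new();
    let mut reallocs = 0;
    for i in 0..n {
        let cap_before = v.capacity();
        v.push(i);
        if v.capacity() != cap_before {
            reallocs += 1;
        }
    }
    (v, reallocs)
}
```

The concatenating version copies 0 + 1 + ... + (n - 1) elements in total, which is quadratic in `n`, while the pushing version only copies when the buffer has to grow.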

*Details*

The actual implementation strategy is that the representation of vectors
will stay mostly the same as it is now. However, when the compiler
allocates vectors, it will always do so for precisely the size they need
to be (fill == alloc, in our vector rep). There will be internal
functions (vec::alloc_empty_with_capacity() or something) that allocate
an empty vector but with a large capacity and unsafe functions that can
be used to set the length. These can be used by dvec-like classes but
also by routines like `vec::map()`. Most of this exists today. The only
real thing that changes is that we take *away* the tricks the compiler
does for `+=`.
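As an illustration of how a routine like `vec::map()` could use such primitives, here is a sketch in current Rust syntax. It assumes the standard library's `Vec::with_capacity`, `spare_capacity_mut`, and the unsafe `set_len` as stand-ins for the internal functions described above; `map_exact` is a hypothetical name.

```rust
/// Map over a slice, allocating the output once up front and setting
/// its length only after the elements have been written.
pub fn map_exact<T, U>(input: &[T], f: impl Fn(&T) -> U) -> Vec<U> {
    // Empty vector with room for at least input.len() elements.
    let mut out: Vec<U> = Vec::with_capacity(input.len());

    // Write each result into the uninitialized spare capacity.
    // `zip` stops at input.len() even if the capacity is larger.
    let spare = out.spare_capacity_mut();
    for (slot, x) in spare.iter_mut().zip(input) {
        slot.write(f(x));
    }

    // SAFETY: the first input.len() slots were initialized above and
    // the capacity is at least input.len().
    unsafe {
        out.set_len(input.len());
    }
    out
}
```

If `f` panics partway through, the elements already written are leaked rather than dropped, since the length is still zero at that point; that is wasteful but not unsafe.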

*Motivation*

Part of the motivation for this change is that, with task-local
vectors, the tricks we play now, where we treat vectors both as values
and as things that can be updated in place, don't work so well (this is
precisely why the move was made to unique vectors in the first place, as
I understand it). However, task-local vectors are good for a number of
reasons (cheaper copies; easier to ensure memory safety), so I expect
we'll wind up using them a fair amount. To obtain reliably good
performance, then, a builder like `dvec` can be used: it encapsulates
the task-local vector pointer until construction is complete, making it
safe to append to it in place.
Another motivation is that it is part of a general trend to push
intelligence out of the compiler and into libraries where possible. We
can build vector append using overloaded operators. This also ensures
that end-users will be able to design efficient libraries and so forth.
Moving things like `vector +` into libraries also simplifies the type
checker, as we can draw on impls to handle all the various cases (@ vs ~
vectors, imm vs mut vectors, and so forth).
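For a flavor of what library-defined append might look like, here is a sketch in current Rust syntax using the standard `std::ops::Add` trait. The `Seq` wrapper is a hypothetical type introduced only so the sketch can define `+` itself; it is not part of the design above.

```rust
use std::ops::Add;

/// Hypothetical wrapper type so that `+` can be defined in library code.
#[derive(Debug, Clone, PartialEq)]
pub struct Seq<T>(pub Vec<T>);

impl<T: Clone> Add<&Seq<T>> for &Seq<T> {
    type Output = Seq<T>;

    /// Concatenate into a new value; both operands are left untouched.
    fn add(self, rhs: &Seq<T>) -> Seq<T> {
        // Request room for exactly the combined length.
        let mut out = Vec::with_capacity(self.0.len() + rhs.0.len());
        out.extend_from_slice(&self.0);
        out.extend_from_slice(&rhs.0);
        Seq(out)
    }
}
```

Other combinations of operand kinds would each get their own impl, which is how the cases are handled outside the type checker.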
Niko
_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev