Hello,

I wanted to check something. We are working on the Great Change to a more flexible vector system and I want to outline the design that's in my head. This has some implications for How Efficient Rust Code Is Written, so I wanted to make sure we were all on the same page.

*Implications for writing efficient Rust*

I figured I'd just start with the implications for writing Rust. Currently, to build up a vector, we rely upon an idiom like:

let mut v = [];
for some loop { v += [elt]; }

Now, often, such loops can (and should) be written using a higher-order function (e.g., map, filter). But sometimes not. In such cases, under the new regime, the recommended idiom would be:

let dv = dvec();
for some loop { v.push(elt); }
let v = dvec::unwrap(dv); // if necessary, convert to a vector

Actually the name `dvec()` (dynamic vector) will probably change—perhaps to vecbuf? mvec? suggestions welcome—but you get the idea. The same would eventually apply to building up strings.

Basically, the idea is that we have "builder" types for vectors and strings. These builder types will overallocate and use dirty tricks to achieve reasonable performance. Using convenience operators like `+` will not do such things.

*Details*

The actual implementation strategy is that the representation of vectors will stay mostly the same as it is now. However, when the compiler allocates vectors, it will always do so for precisely the size they need to be (fill == alloc, in our vector rep). There will be internal functions (vec::alloc_empty_with_capacity() or something) that allocate an empty vector but with a large capacity and unsafe functions that can be used to set the length. These can be used by dvec-like classes but also by routines like `vec::map()`. Most of this exists today. The only real thing that changes is that we take *away* the tricks the compiler does for `+=`.

*Motivation*

Part of the motivation for this change is that when you have task-local vectors, the tricks we play now where we treat vectors both as values and as things that can be updated in place don't work so well (this is precisely why the move was made to unique vectors in the first place, as I understand it). However, task-local vectors are good for a number of reasons (cheaper copies; easier to ensure memory safety), so I expect we'll wind up using them a fair amount: to obtain reliably good performance, then, a builder like `dvec` can be used that encapsulates the task-local vector pointer until construction is complete, making it safe to append to it in place.

Another motivation is that it is part of a general trend to push intelligence out of the compiler and into libraries where possible. We can build vector append using overloaded operators. This also ensures that end-users will be able to design efficient libraries and so forth. Moving things like `vector +` into libraries also simplifies the type checker, as we can draw on impls to handle all the various cases (@ vs ~ vectors, imm vs mut vectors, and so forth).


Niko
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Reply via email to