On Sat, Jan 11, 2014 at 11:54 AM, Owen Shepherd <[email protected]> wrote:
> On 11 January 2014 06:20, Daniel Micay <[email protected]> wrote:
>>
>> The branch on the overflow flag results in a very significant loss in
>> performance. For example, I had to carefully write the vector `push`
>> method for my `Vec<T>` type to only perform one overflow check. With
>> two checks, it's over 5 times slower due to failed branch predictions.
>
>
> What did the generated code look like? I suspect that LLVM wasn't generating
> optimal code, perhaps because Rust wasn't giving it appropriate hints or
> because of optimizer bugs. For reference, on AMD64 the code should look
> something like the following hypothetical code:
>
> vec_allocate:
>     MOV $SIZE, %eax
>     MUL %rsi
>     JC Lerror
>     ADD $HEADER_SIZE, %rax
>     JC Lerror
>     MOV %rax, %rsi
>     JMP malloc
> Lerror:
>     // Code to raise error here
>
> Note that the ordering is EXTREMELY important! x86 doesn't give you any
> separate branch hints (excluding two obsolete ones which only the Pentium IV
> ever cared about) so your only clue to the optimizer is the branch
> direction.
>
> I suspect your generated code had forward branches for the no overflow case.
> That's absolutely no good (codegen inserting "islands" of failure case code);
> it will screw up the branch predictor.
>
> x86 defaults to predicting all (conditional) forward jumps not taken, all
> conditional backwards jumps taken (Loops!). If the optimizer wasn't informed
> correctly, it will probably not have obeyed that.
>
> Being as the overflow case should basically be never hit, there is no reason
> for it to ever be loaded into the optimizer, so that is good.
>
> (P.S. If the rust compiler is really good it'll convince LLVM to put the
> error case branch code in a separate section so it can all be packed
> together far away from useful cache lines and TLB entries)
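For anyone following along, the quoted `vec_allocate` sequence amounts to two checked operations with a shared failure path. Below is a rough sketch of that shape in current stable Rust, not the code from my `Vec<T>`. The names `HEADER_SIZE`, `capacity_overflow`, `checked_alloc_size` and `alloc_size` are made up for illustration, and the header size of 16 bytes is an arbitrary assumption.

```rust
/// Hypothetical header size in bytes; stands in for HEADER_SIZE in the
/// quoted assembly. The value 16 is an assumption for illustration.
const HEADER_SIZE: usize = 16;

/// Shared failure path. It never returns and is marked cold, so the
/// compiler is free to treat the branches that reach it as unlikely.
#[cold]
#[inline(never)]
fn capacity_overflow() -> ! {
    panic!("capacity overflow");
}

/// Bytes needed for `len` elements of `elem_size` bytes plus the header,
/// or `None` if the multiply or the add overflows `usize`.
fn checked_alloc_size(len: usize, elem_size: usize) -> Option<usize> {
    len.checked_mul(elem_size)?.checked_add(HEADER_SIZE)
}

/// Same computation, but diverges through the cold path on overflow.
fn alloc_size(len: usize, elem_size: usize) -> usize {
    match checked_alloc_size(len, elem_size) {
        Some(bytes) => bytes,
        None => capacity_overflow(),
    }
}
```

Calling `alloc_size(4, 8)` gives 48 under the assumed header size, while `checked_alloc_size(usize::MAX, 2)` gives `None`. Whether the two checks end up as well-predicted branches to a far-away error block is up to the code generator, which is the point under discussion.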

Rust directly exposes the checked overflow intrinsics, so these are what
was used. It already considers branches calling a `noreturn` function to
be colder, so adding an explicit branch hint (which is easy enough via
`llvm.expect`) doesn't help. Feel free to implement it yourself if you
think you can do better. The compiler work is already implemented. I
doubt you'll get something performing in the same ballpark as plain
integers.
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
