I think failure may have quite different inlining costs once we move to
libunwind-based backtraces instead of hardcoding file/line number information
into the generated code. The file and line number information tends to pollute
generated code a lot and it's basically unnecessary with proper DWARF info and
a functioning set of libunwind bindings, which we now have thanks to a couple
of awesome contributions from you all. :)
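As a hedged sketch of the code-size difference (modern Rust syntax; fail_located and fail_unlocated are hypothetical stand-ins for the runtime's failure entry points, not actual rustc symbols):

    #[cold]
    #[inline(never)]
    fn fail_located(msg: &str, file: &str, line: u32) -> ! {
        // Today's shape: every failing call site passes pointers to
        // static file-name data plus a line number.
        panic!("{} at {}:{}", msg, file, line)
    }

    #[cold]
    #[inline(never)]
    fn fail_unlocated(msg: &str) -> ! {
        // With DWARF + libunwind, the location is recovered from the
        // backtrace at failure time, so only the message is passed.
        panic!("{}", msg)
    }

    fn call_site_today(x: Option<u32>) -> u32 {
        x.unwrap_or_else(|| fail_located("unwrapped None", file!(), line!()))
    }

    fn call_site_with_unwinding(x: Option<u32>) -> u32 {
        x.unwrap_or_else(|| fail_unlocated("unwrapped None"))
    }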
Patrick
Owen Shepherd <[email protected]> wrote:
>On 11 January 2014 21:42, Daniel Micay <[email protected]> wrote:
>
>> On Sat, Jan 11, 2014 at 4:31 PM, Owen Shepherd <[email protected]>
>> wrote:
>> > So I just did a test. Took the following rust code:
>> > pub fn test_wrap(x : u32, y : u32) -> u32 {
>> > return x.checked_mul(&y).unwrap().checked_add(&16).unwrap();
>> > }
>> >
>> > And got the following blob of assembly out. What we have there, my
>> > friends, is a complete failure of the optimizer (N.B. it works for
>> > the simple case of checked_add alone)
>> >
>> > Preamble:
>> >
>> > __ZN9test_wrap19hc4c136f599917215af4v0.0E:
>> > .cfi_startproc
>> > cmpl %fs:20, %esp
>> > ja LBB0_2
>> > pushl $12
>> > pushl $20
>> > calll ___morestack
>> > ret
>> > LBB0_2:
>> > pushl %ebp
>> > Ltmp2:
>> > .cfi_def_cfa_offset 8
>> > Ltmp3:
>> > .cfi_offset %ebp, -8
>> > movl %esp, %ebp
>> > Ltmp4:
>> > .cfi_def_cfa_register %ebp
>> >
>> > Align stack (for what? We don't do any SSE)
>> >
>> > andl $-8, %esp
>> > subl $16, %esp
>>
>> The compiler aligns the stack for performance.
>>
>>
>
>Oops, I misread and thought there was 16-byte alignment going on
>there, not 8.
>
>
>> > Multiply x * y
>> >
>> > movl 12(%ebp), %eax
>> > mull 16(%ebp)
>> > jno LBB0_4
>> >
>> > If it did overflow, stash a 0 (the None tag) at top of stack
>> >
>> > movb $0, (%esp)
>> > jmp LBB0_5
>> >
>> > If it didn't overflow, stash a 1 and the result (we are building an
>> > Option<u32> here)
>> > LBB0_4:
>> > movb $1, (%esp)
>> > movl %eax, 4(%esp)
>> >
>> > Take pointer to &this for __thiscall:
>> > LBB0_5:
>> > leal (%esp), %ecx
>> > calll __ZN6option6Option6unwrap21h05c5cb6c47a61795Zcat4v0.0E
>> >
>> > Do the addition to the result
>> >
>> > addl $16, %eax
>> >
>> > Repeat the previous circus
>> >
>> > jae LBB0_7
>> > movb $0, 8(%esp)
>> > jmp LBB0_8
>> > LBB0_7:
>> > movb $1, 8(%esp)
>> > movl %eax, 12(%esp)
>> > LBB0_8:
>> > leal 8(%esp), %ecx
>> > calll __ZN6option6Option6unwrap21h05c5cb6c47a61795Zcat4v0.0E
>> > movl %ebp, %esp
>> > popl %ebp
>> > ret
>> > .cfi_endproc
>> >
>> >
>> > Yeah. It's not fast because it's not inlining through
>> > option::unwrap.
>>
>> The code to initiate failure is gigantic and LLVM doesn't do partial
>> inlining by default. It's likely far above the inlining threshold.
>>
>>
>Right, that's why I suggested explicitly moving the failure code out
>of line into a separate function.
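Roughly the shape of that suggestion, as a minimal sketch in present-day Rust syntax (unwrap_failed is a hypothetical name, not an existing library function):

    #[cold]
    #[inline(never)]
    fn unwrap_failed() -> ! {
        // Stand-in for the heavyweight failure/unwinding machinery.
        panic!("called `unwrap()` on a `None` value")
    }

    #[inline]
    fn unwrap_small<T>(opt: Option<T>) -> T {
        // The happy path is one branch and stays cheap to inline;
        // the cold path is a single call to the out-of-line function.
        match opt {
            Some(v) => v,
            None => unwrap_failed(),
        }
    }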
>
>
>> A purely synthetic benchmark only executing the unchecked or checked
>> instruction isn't interesting. You need to include several
>> operations in the loop as real code would use, and you will often
>> see a massive drop in performance from the serialization of the
>> pipeline. Register renaming is not as clever as you'd expect.
>>
>>
>Agreed. The variability within that tiny benchmark tells me that it
>can't really glean any valuable information.
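For illustration, the sort of less synthetic loop being described might look like this (a sketch in present-day Rust, not a benchmark from this thread); each checked result feeds the next operation, so the overflow branches sit on the dependency chain instead of running as independent work:

    fn checked_accumulate(data: &[u32]) -> Option<u32> {
        let mut acc: u32 = 1;
        for &x in data {
            // Each iteration depends on the previous checked results,
            // so the branches on the overflow/carry flags serialize
            // with the arithmetic rather than hiding off the critical
            // path.
            acc = acc.checked_mul(x)?.checked_add(16)?;
        }
        Some(acc)
    }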
>
>
>> The impact of trapping is known, because `clang` and `gcc` expose
>> `-ftrapv`.
>> Integer-heavy workloads like cryptography and video codecs are
>> several times slower with the checks.
>>
>
>What about other workloads?
>
>As I mentioned: what I'd propose is trapping by default, with
>non-trapping math just a single additional character on a type
>declaration away.
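Rust did not have such an annotation at the time; as a rough illustration of the shape of that opt-out, today's std::num::Wrapping makes non-trapping (wrapping) arithmetic an explicit mark on the type rather than the default, though it is a full wrapper type rather than a single character:

    use std::num::Wrapping;

    // Plain u32 arithmetic would be the checked/trapping default in
    // Owen's proposal; wrapping arithmetic is requested explicitly at
    // the type level, as in this FNV-1a-style hash step.
    fn hash_step(state: Wrapping<u32>, byte: u8) -> Wrapping<u32> {
        (state ^ Wrapping(byte as u32)) * Wrapping(16_777_619)
    }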
>
>Also, I did manage to convince Rust + LLVM to optimize things cleanly
>by defining an unwrap which invoked libc's abort() -> !, so there's
>that.
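That abort-based unwrap presumably looked something like the following (a sketch using std::process::abort() in place of a direct libc binding; the name is illustrative):

    fn unwrap_or_abort<T>(opt: Option<T>) -> T {
        match opt {
            Some(v) => v,
            // A diverging abort is tiny compared with the unwinding
            // failure path, so LLVM can inline the whole unwrap and
            // fold the branches away in the caller.
            None => std::process::abort(),
        }
    }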
>
>
--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev