On Mon, Apr 30, 2012 at 10:09 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Fri, Apr 27, 2012 at 3:30 PM, Paolo Bonzini <bonz...@gnu.org> wrote:
>> tzcnt is encoded as "rep;bsf" and unlike lzcnt is a drop-in replacement
>> if we don't care about the flags (it has the same semantics for non-zero
>> values).
>>
>> Since bsf is usually slower, just emit tzcnt unconditionally.  However,
>> write it as rep;bsf unless -mbmi is in use, to cater for old assemblers.
>
> Please emit "rep;bsf" when optimize_insn_for_speed_p () is true.
>
>> Bootstrapped on a non-BMI x86_64-linux host, regtest in progress.
>> Ok for mainline?
>
> OK with the optimize_insn_for_speed_p conditional.

I have committed a similar patch, where we emit bsf when optimizing for
size (saving a whopping one byte) and rep;bsf for !TARGET_BMI.  The same
functionality can be added to *ffs<mode>_1, since we don't care what
ends up in the register for input operand == 0 (this is the key
difference between tzcnt and bsf).

2012-05-06  Uros Bizjak  <ubiz...@gmail.com>
            Paolo Bonzini  <bonz...@gnu.org>

        * config/i386/i386.md (ctz<mode>2): Emit rep;bsf even for !TARGET_BMI
        and bsf when optimizing for size.
        (*ffs<mode>_1): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

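To illustrate why the zero-input result does not matter for ffs, here is
a minimal sketch in plain C.  It is a model, not GCC code: ctz_model,
ffs_model, ffs_model_check and the zero_result parameter are
hypothetical names introduced only for this example.  zero_result stands
in for whatever the instruction leaves in the destination when the input
is 0 (tzcnt is documented to return the operand size, while bsf leaves
the destination undefined).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a trailing-zero count.  For x == 0 it returns
   zero_result, which stands in for whatever the instruction leaves in
   the destination register (an assumption made for illustration).  */
unsigned ctz_model(uint32_t x, unsigned zero_result)
{
    if (x == 0)
        return zero_result;

    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* ffs semantics: 1-based index of the lowest set bit, 0 for x == 0.
   The count is computed unconditionally and then discarded when the
   input is zero, so zero_result never reaches the caller.  */
int ffs_model(uint32_t x, unsigned zero_result)
{
    unsigned n = ctz_model(x, zero_result);
    return x == 0 ? 0 : (int)n + 1;
}

/* Small self-check: the zero-input value does not change the result.  */
void ffs_model_check(void)
{
    assert(ffs_model(0, 32) == ffs_model(0, 12345));
    assert(ffs_model(8, 32) == ffs_model(8, 12345));
}
```

Since the zero case is handled separately, either instruction can feed
the ffs result, which is the property the message relies on.
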
Index: i386.md
===================================================================
--- i386.md	(revision 187217)
+++ i386.md	(working copy)
@@ -12112,9 +12112,22 @@
    (set (match_operand:SWI48 0 "register_operand" "=r")
         (ctz:SWI48 (match_dup 1)))]
   ""
-  "bsf{<imodesuffix>}\t{%1, %0|%0, %1}"
+{
+  if (optimize_function_for_size_p (cfun))
+    return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else if (TARGET_BMI)
+    return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else
+    /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI.  */
+    return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+}
   [(set_attr "type" "alu1")
    (set_attr "prefix_0f" "1")
+   (set (attr "prefix_rep")
+     (if_then_else
+       (match_test "optimize_function_for_size_p (cfun)")
+       (const_string "0")
+       (const_string "1")))
    (set_attr "mode" "<MODE>")])
 
 (define_insn "ctz<mode>2"
@@ -12123,14 +12136,21 @@
    (clobber (reg:CC FLAGS_REG))]
   ""
 {
-  if (TARGET_BMI)
+  if (optimize_function_for_size_p (cfun))
+    return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else if (TARGET_BMI)
     return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
-  else
-    return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else
+    /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI.  */
+    return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
 }
   [(set_attr "type" "alu1")
    (set_attr "prefix_0f" "1")
-   (set (attr "prefix_rep") (symbol_ref "TARGET_BMI"))
+   (set (attr "prefix_rep")
+     (if_then_else
+       (match_test "optimize_function_for_size_p (cfun)")
+       (const_string "0")
+       (const_string "1")))
    (set_attr "mode" "<MODE>")])
 
 (define_expand "clz<mode>2"
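
For readers who do not work with machine descriptions, the three-way
choice made by both patterns can be restated as a standalone C sketch.
This is only a model under stated assumptions: ctz_mnemonic,
ctz_prefix_rep and ctz_select_check are hypothetical names, and the two
bool parameters stand in for optimize_function_for_size_p (cfun) and
TARGET_BMI, which only exist inside GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical restatement of the output-template logic in the patch.
   optimize_for_size models optimize_function_for_size_p (cfun) and
   target_bmi models TARGET_BMI.  */
const char *ctz_mnemonic(bool optimize_for_size, bool target_bmi)
{
    if (optimize_for_size)
        return "bsf";       /* no rep prefix, one byte shorter */
    if (target_bmi)
        return "tzcnt";     /* assembler is expected to know tzcnt */
    return "rep; bsf";      /* same encoding, spelled for old assemblers */
}

/* Models the prefix_rep attribute: 0 when optimizing for size,
   1 otherwise, independent of TARGET_BMI.  */
int ctz_prefix_rep(bool optimize_for_size)
{
    return optimize_for_size ? 0 : 1;
}

/* Small self-check: optimizing for size wins over TARGET_BMI.  */
void ctz_select_check(void)
{
    assert(strcmp(ctz_mnemonic(true, true), "bsf") == 0);
    assert(ctz_prefix_rep(true) == 0);
}
```

Note that the size check comes first, so plain bsf is chosen even when
BMI is available, and prefix_rep tracks exactly the cases in which a
rep-prefixed form is emitted.
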