On Mon, Apr 30, 2012 at 10:09 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Fri, Apr 27, 2012 at 3:30 PM, Paolo Bonzini <bonz...@gnu.org> wrote:
>> tzcnt is encoded as "rep;bsf" and unlike lzcnt is a drop-in replacement
>> if we don't care about the flags (it has the same semantics for non-zero
>> values).
>>
>> Since bsf is usually slower, just emit tzcnt unconditionally.  However,
>> write it as rep;bsf unless -mbmi is in use, to cater for old assemblers.
>
> Please emit "rep;bsf" when optimize_insn_for_speed_p () is true.
>
>> Bootstrapped on a non-BMI x86_64-linux host, regtest in progress.
>> Ok for mainline?
>
> OK with the optimize_insn_for_speed_p conditional.

I have committed a similar patch, where we emit bsf when optimizing for
size (saving a whopping one byte), tzcnt for TARGET_BMI, and rep;bsf for
!TARGET_BMI. The same functionality can be added to *ffs<mode>_1, since
we don't care what ends up in the register for input operand == 0 (this
is the key difference between tzcnt and bsf).
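
To illustrate that difference, here is a minimal software model in C. It is
only a sketch and not part of the patch: the function names are hypothetical,
and the zero-input behaviour of the bsf model (leaving the old destination
value in place) is an assumption made for illustration, since Intel's
documentation leaves the destination undefined in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Model of 32-bit tzcnt: defined for every input.  A zero source
   yields the operand size (32).  */
unsigned model_tzcnt32(uint32_t src)
{
    unsigned n = 0;

    if (src == 0)
        return 32;
    while ((src & 1) == 0) {
        src >>= 1;
        n++;
    }
    return n;
}

/* Model of 32-bit bsf: for a nonzero source the result is the index of
   the lowest set bit, same as tzcnt.  For a zero source the destination
   is not defined; ASSUMPTION: this model keeps the old destination.  */
unsigned model_bsf32(uint32_t src, unsigned old_dest)
{
    if (src == 0)
        return old_dest;
    return model_tzcnt32(src);
}

/* For nonzero inputs the two models must agree, which is why rep;bsf
   can stand in for bsf when the zero case does not matter.  */
void check_nonzero_agreement(uint32_t src)
{
    assert(src != 0);
    assert(model_bsf32(src, 0xdeadbeefu) == model_tzcnt32(src));
}
```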

2012-05-06  Uros Bizjak  <ubiz...@gmail.com>
            Paolo Bonzini  <bonz...@gnu.org>

        * config/i386/i386.md (ctz<mode>2): Emit rep;bsf even for
        !TARGET_BMI and bsf when optimizing for size.
        (*ffs<mode>_1): Ditto.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Index: i386.md
===================================================================
--- i386.md     (revision 187217)
+++ i386.md     (working copy)
@@ -12112,9 +12112,22 @@
    (set (match_operand:SWI48 0 "register_operand" "=r")
        (ctz:SWI48 (match_dup 1)))]
   ""
-  "bsf{<imodesuffix>}\t{%1, %0|%0, %1}"
+{
+  if (optimize_function_for_size_p (cfun))
+    return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else if (TARGET_BMI)
+    return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else 
+    /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI.  */
+    return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+}
   [(set_attr "type" "alu1")
    (set_attr "prefix_0f" "1")
+   (set (attr "prefix_rep")
+     (if_then_else
+       (match_test "optimize_function_for_size_p (cfun)")
+       (const_string "0")
+       (const_string "1")))
    (set_attr "mode" "<MODE>")])
 
 (define_insn "ctz<mode>2"
@@ -12123,14 +12136,21 @@
    (clobber (reg:CC FLAGS_REG))]
   ""
 {
-  if (TARGET_BMI)
+  if (optimize_function_for_size_p (cfun))
+    return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else if (TARGET_BMI)
     return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
-  else
-    return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
+  else 
+    /* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI.  */
+    return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
 }
   [(set_attr "type" "alu1")
    (set_attr "prefix_0f" "1")
-   (set (attr "prefix_rep") (symbol_ref "TARGET_BMI"))
+   (set (attr "prefix_rep")
+     (if_then_else
+       (match_test "optimize_function_for_size_p (cfun)")
+       (const_string "0")
+       (const_string "1")))
    (set_attr "mode" "<MODE>")])
 
 (define_expand "clz<mode>2"
