Re: [EXTERNAL] Re: [PATCH][WIP] PR tree-optimization/101808 Boolean comparison simplification

Jeff Law via Gcc-patches Tue, 23 Nov 2021 12:03:08 -0800



On 11/23/2021 12:42 PM, Navid Rahimi wrote:

In the case of x86_64, this is the code:

src_1(bool, bool):
         cmp     dil, sil
         setb    al
         ret

tgt_1(bool, bool):
         xor     edi, 1
         mov     eax, edi
         and     eax, esi
         ret


Let's look at the latency of src_1:
cmp: latency of 1 (page 663, table C-17).
setb: latency of 2. The Intel manual does not report a latency for setb, 
but the closest instruction, setbe, has a latency of 2.

But for tgt_1:
xor: latency 1.
mov: latency 1. (But x86_64 processors appear to eliminate this instruction, 
so it is effectively latency 0 in this case. The "Zero-Latency MOV 
Instructions" section explains it [1].)
and: latency 1.

So even if you count setb as latency 1, the two are equal (2 cycles each, with 
the mov eliminated). But if setb has latency 2, src_1 takes 3 cycles against 2 
for tgt_1, which should be a one-cycle win.

[1] 
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

But these are target issues you've raised -- those should be handled in the RTL pipeline and are not a significant concern for gimple.

In gimple your primary goal should be to reduce the number of expressions that are evaluated. This patch does the opposite.
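
To make the expression count concrete, here is a minimal C sketch of what the two quoted listings appear to correspond to. The source forms are an assumption inferred from the assembly (cmp/setb is an unsigned "below" compare, xor/and is a negate-then-and); they are not copied from the patch, and check_equivalence is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed source form: one comparison expression. */
bool src_1(bool a, bool b)
{
    return a < b;
}

/* Assumed rewritten form: a negation feeding a bitwise AND,
   i.e. two expressions instead of one. */
bool tgt_1(bool a, bool b)
{
    return !a & b;
}

/* Hypothetical helper: exhaustively confirm that the two forms
   agree on all four boolean input pairs. */
void check_equivalence(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            assert(src_1(a, b) == tgt_1(a, b));
}
```

Both forms compute the same value, so the difference is only in how many operations are evaluated: one in src_1, two in tgt_1.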


jeff
