[Bug rtl-optimization/45434] x86 missed optimization: use high register (ah, bh, ch, dh) when available to make comparisons
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434

--- Comment #7 from Adam Warner adam at consulting dot net.nz 2012-01-18 03:35:24 UTC ---

It does not appear to be possible to generate inline asm that leaves GCC to
choose the {ah, bh, ch, dh} register:

    #include <stdint.h>

    uint64_t u8l(uint64_t in) {
      uint64_t out;
      asm ("movzbl %b1, %k0" : "=q" (out) : "r" (in));
      return out;
    }

    uint64_t u8h(uint64_t in) {
      uint64_t out;
      asm ("movzbl %h1, %k0" : "=Q" (out) : "r" (in)); //line number 11
      return out;
    }

    int main(void) {
      return 0;
    }

    $ gcc -O3 ah_bh_ch_dh.c
    ah_bh_ch_dh.c: Assembler messages:
    ah_bh_ch_dh.c:11: Error: operand type mismatch for `movzbl'

I found the "h" qualifier by reading:
www.cs.virginia.edu/~clc5q/gcc-inline-asm.pdf

    %b0 prints the 8-bit form of an operand. %al, etc.
    %h0 prints the high 8-bit form of a register. %ah, etc.

"Q" is the x86 machine constraint which means "Any register accessible as rh:
a, b, c, and d."
http://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html

Additionally, this inline assembly instruction causes LLVM-based clang to
segfault and request a bug submission report.

I haven't located the primary documentation for these %b, %h, %k, etc.
qualifiers. Is my usage of %h correct/is this actually a bug in GCC?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434

--- Comment #8 from Adam Warner adam at consulting dot net.nz 2012-01-18 07:00:44 UTC ---

Apologies for the noise. I had my machine constraints around the wrong way.
Here is the fixed version of the code:

    #include <stdint.h>

    uint64_t u8l(uint64_t in) {
      uint64_t out;
      asm ("movzbl %b1, %k0" : "=r" (out) : "q" (in));
      return out;
    }

    uint64_t u8h(uint64_t in) {
      uint64_t out;
      asm ("movzbl %h1, %k0" : "=r" (out) : "Q" (in));
      return out;
    }

    int main(void) {
      return 0;
    }

With the X86-64 Linux ABI u8l() generates

    movzbl %dil,%eax

and u8h()

    mov    %rdi,%rdx
    movzbl %dh,%eax

as expected.
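For comparison, the values the two inline-asm helpers are meant to return can be written in portable C. This is only a sketch for checking results: the names u8l_ref and u8h_ref are hypothetical and do not appear in the bug report, and nothing here forces the compiler to use an ah/bh/ch/dh register.

```c
#include <stdint.h>

/* Hypothetical reference for u8l(): bits 0..7 of the input, zero-extended. */
uint64_t u8l_ref(uint64_t in) {
  return in & 0xFF;
}

/* Hypothetical reference for u8h(): bits 8..15 of the input, zero-extended.
   This is the byte that %h1 names once the operand sits in a, b, c or d. */
uint64_t u8h_ref(uint64_t in) {
  return (in >> 8) & 0xFF;
}
```

On an x86-64 build, comparing u8h(x) against u8h_ref(x) for a few inputs is one way to confirm the "Q" constraint version behaves as intended.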
--- Comment #2 from ubizjak at gmail dot com 2010-08-29 16:40 ---

A bit simpler testcase:

    int test(int i)
    {
      return (i & 0xFF) == ((i & 0xFF00) >> 8);
    }

fails two combine attempts:

    Trying 8 -> 9:
    Failed to match this instruction:
    (set (reg:CCZ 17 flags)
        (compare:CCZ (zero_extract:SI (subreg:DI (reg/v:SI 63 [ i ]) 0)
                (const_int 8 [0x8])
                (const_int 8 [0x8]))
            (reg:SI 65)))

    Trying 8, 6 -> 9:
    Failed to match this instruction:
    (set (reg:CCZ 17 flags)
        (compare:CCZ (subreg:QI (lshiftrt:SI (reg/v:SI 63 [ i ])
                    (const_int 8 [0x8])) 0)
            (subreg:QI (reg/v:SI 63 [ i ]) 0)))

However, contrary to the first attempt, 8 -> 9 substitution in the second
attempt is over-optimized from ZERO_EXTRACT to LSHIFTRT. i386.md expects
ZERO_EXTRACT form.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434
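The two rejected patterns compute the same byte; only the RTL shape differs. A rough C analogue is sketched here, assuming x86 bit numbering (position 8 counted from the least significant bit). The helper names extract_field, shift_then_truncate and test_ref are hypothetical and are not part of GCC or of the testcase.

```c
#include <stdint.h>

/* Models (zero_extract:SI x (const_int 8) (const_int 8)):
   an 8-bit wide field starting at bit 8, zero-extended. */
uint32_t extract_field(uint32_t x) {
  return (x >> 8) & 0xFFu;
}

/* Models (subreg:QI (lshiftrt:SI x (const_int 8)) 0):
   shift right by 8, then keep only the low byte. */
uint8_t shift_then_truncate(uint32_t x) {
  return (uint8_t)(x >> 8);
}

/* The simplified testcase, written on an unsigned argument. */
int test_ref(uint32_t i) {
  return (i & 0xFF) == ((i & 0xFF00) >> 8);
}
```

Since both helpers agree on every input, the difference matters only to pattern matching: a pattern written in one form will not match an instruction presented in the other.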
--- Comment #3 from hjl dot tools at gmail dot com 2010-08-29 18:34 ---

Combine seems to prefer LSHIFTRT over ZERO_EXTRACT.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434
--- Comment #4 from ubizjak at gmail dot com 2010-08-29 18:59 ---

(In reply to comment #3)
> Combine seems to prefer LSHIFTRT over ZERO_EXTRACT.

It seems so. Slightly changed testcase:

    char test(int a, int b, int i, int j)
    {
      return (i & 0xFF) == ((j & 0xFF00) >> 8);
    }

and changed *cmpqi_ext_3_insn_rex64 pattern:

    (define_insn "*cmpqi_ext_3_insn_rex64"
      [(set (reg FLAGS_REG)
            (compare
              (subreg:QI
                (lshiftrt:SI
    ;;          (zero_extract:SI
                  (match_operand 0 "ext_register_operand" "Q")
    ;;            (const_int 8)
                  (const_int 8)) 0)
              (match_operand:QI 1 "nonmemory_operand" "Qn")))]
      "TARGET_64BIT && ix86_match_ccmode (insn, CCmode)"
      "cmp{b}\t{%1, %h0|%h0, %1}"
      [(set_attr "type" "icmp")
       (set_attr "modrm" "1")
       (set_attr "mode" "QI")])

result in:

    cmpb    %dl, %ch
    sete    %al
    ret

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434
--- Comment #5 from ebotcazou at gcc dot gnu dot org 2010-08-29 19:12 ---

> Combine seems to prefer LSHIFTRT over ZERO_EXTRACT.

The former is the canonical form of the latter for internal computations, but
the latter should be reinstantiated when matching instructions.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434
--- Comment #6 from ubizjak at gmail dot com 2010-08-29 19:20 ---

Some other examples:

    char test(int a, int b, int i, int j)
    {
      return (i & 0xFF) + ((j & 0xFF00) >> 8);
    }

    movzbl  %ch, %eax
    addl    %edx, %eax
    ret

    char test(int a, int b, int i, int j)
    {
      return (i & 0xFF) - ((j & 0xFF00) >> 8);
    }

    movzbl  %ch, %ecx
    movl    %edx, %eax
    subb    %cl, %al

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434
--- Comment #1 from hjl dot tools at gmail dot com 2010-08-28 04:30 ---

It may be related to PR 44659.

-- 
hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434
-- 
hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl dot tools at gmail dot com
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
Last reconfirmed date|-00-00 00:00:00           |2010-08-28 04:35:30

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45434