[PATCH v3] testsuite: Add a test case for negating FP vectors containing zeros

2024-03-05 Thread Xi Ruoyao
Recently I've fixed two wrong FP vector negate implementations which caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch
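A minimal sketch of what such a test could look like, assuming the GCC/Clang vector_size extension; this is not the test case from the patch, and the names negate and negation_keeps_zero_signs are made up for illustration. It negates a vector holding both zeros and checks that every lane's sign bit was flipped:

```c
#include <math.h>

/* GCC/Clang vector extension: four floats in one 16-byte vector.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* noinline (a GCC/Clang attribute) makes it less likely that the
   negation is folded away at compile time, so the target's vector
   negate pattern is more likely to be exercised.  */
__attribute__ ((noinline)) v4sf
negate (v4sf x)
{
  return -x;
}

/* Returns 1 if negation flipped the sign bit of every lane,
   including the +0.0 and -0.0 lanes, and 0 otherwise.  */
int
negation_keeps_zero_signs (void)
{
  v4sf in = { 0.0f, -0.0f, 1.0f, -1.0f };
  v4sf out = negate (in);

  for (int i = 0; i < 4; i++)
    if (!signbit (in[i]) == !signbit (out[i]))
      return 0;   /* Sign unchanged in lane i: negation was wrong.  */
  return 1;
}
```

A caller would simply check that negation_keeps_zero_signs () returns 1; a real testsuite file would additionally carry the usual dg- directives.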

Re: [PATCH v2] LoongArch: Fix inconsistent description in *sge_

2024-03-05 Thread Xi Ruoyao
int 1)))] >    "" > -  "slti\t%0,%.,%1" > +  "slt\t%0,%.,%1" >    [(set_attr "type" "slt") >     (set_attr "mode" "")]) Hmm, this define_insn seems never really used or it would generate something like "sltu

Re: [PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-04 Thread Xi Ruoyao
So allowing const_imm12_operand here brings no benefit. >    "" > -  "slti\t%0,%.,%1" > +  "slt%i1\t%0,%.,%1" >    [(set_attr "type" "slt") >     (set_attr "mode" "")]) >   -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-29 Thread Xi Ruoyao
On Thu, 2024-02-29 at 15:09 +0800, Xi Ruoyao wrote: > Recently I've fixed two wrong FP vector negate implementations which > caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801).  To > prevent a similar issue from happening again, add a test case. > > Tested on x86_64

[PATCH] LoongArch: Allow s9 as a register alias

2024-02-28 Thread Xi Ruoyao
The psABI allows using s9 as an alias of r22. gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add s9 as an alias of r22. --- Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? gcc/config/loongarch/loongarch.h | 1 + 1 file changed, 1

[PATCH] LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to allow the IE to LE linker relaxation

2024-02-28 Thread Xi Ruoyao
In Binutils we need to make IE to LE relaxation only allowed when there is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12} so an invalid "partial" relaxation won't happen with the extreme code model. So if we are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an R_LARCH_RELAX

[PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-28 Thread Xi Ruoyao
Recently I've fixed two wrong FP vector negate implementations which caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch

[PATCH v2] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-02-28 Thread Xi Ruoyao
The vect_int_mod target selector is evaluated with the options in DEFAULT_VECTCFLAGS in effect, but these options are not automatically passed to tests outside the vect directories. So this test fails on targets where integer vector modulo operation is supported but requires an option to enable,

Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao
On Thu, 2024-02-29 at 14:08 +0800, Xi Ruoyao wrote: > > +  "TARGET_TLS_DESC" > > +  "la.tls.desc\t%0,%1" > > With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead > of la.tls.desc.  As we don't want to add too much code we can just ha

Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao
ELOCS_ALWAS ? ".." : "la.tls.desc\t%0,%1"; } > +  [(set_attr "got" "load") > +   (set_attr "mode" "")]) We need (set_attr "length" "16") in this list as this actually expands into 16 bytes. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH 2/2] LoongArch: Remove unneeded sign extension after crc/crcc instructions

2024-02-25 Thread Xi Ruoyao
The specification of crc/crcc instructions is clear that the output is sign-extended to GRLEN. Add a define_insn to tell the compiler this fact and allow it to remove the unneeded sign extension on crc/crcc output. As crc/crcc instructions are usually used in a tight loop, this should produce a

[PATCH 1/2] LoongArch: NFC: Deduplicate crc instruction defines

2024-02-25 Thread Xi Ruoyao
Introduce an iterator for UNSPEC_CRC and UNSPEC_CRCC to make the next change easier. gcc/ChangeLog: * config/loongarch/loongarch.md (CRC): New define_int_iterator. (crc): New define_int_attr. (loongarch_crc_w__w, loongarch_crcc_w__w): Unify into ...

Pushed: [GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-23 Thread Xi Ruoyao
On Thu, 2024-02-22 at 19:09 +0800, chenglulu wrote: > > On 2024/2/22 at 6:20 PM, Xi Ruoyao wrote: > > To improve Binutils compatibility we've had to backport relaxation > > support.  But if a user just updates to GCC 13.3 and sticks with > > Binutils 2.41, there is no reason to

Pushed: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-23 Thread Xi Ruoyao
On Fri, 2024-02-23 at 11:37 +0800, chenglulu wrote: > > On 2024/2/23 at 11:27 AM, Xi Ruoyao wrote: > > On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote: > > > On 2024/2/22 at 5:17 PM, Xi Ruoyao wrote: > > > > The gold linker has never been ported to LoongArch (and it se

Re: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao
On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote: > > On 2024/2/22 at 5:17 PM, Xi Ruoyao wrote: > > The gold linker has never been ported to LoongArch (and it seems > > unlikely to be ported in the future as the new architectures are > > focusing on lld and/or mold for fast link

[GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-22 Thread Xi Ruoyao
To improve Binutils compatibility we've had to backport relaxation support. But if a user just updates to GCC 13.3 and sticks with Binutils 2.41, there is no reason to use -mno-explicit-relocs as the default because we are turning off relaxation for Binutils 2.41 (it lacks conditional branch

[PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao
The gold linker has never been ported to LoongArch (and it seems unlikely to be ported in the future as the new architectures are focusing on lld and/or mold for fast linkers). ChangeLog: * configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target list. * configure:

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao
On Tue, 2024-02-20 at 19:50 +0800, chenglulu wrote: > > On 2024/2/20 at 7:31 PM, Xi Ruoyao wrote: > > On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: > > > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: > > > > > > > So I think that witho

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao
On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: > > > So I think that without worrying about performance and ensuring that > > there is no problem > > > > with binutils, I think we can ma

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao
test failures due to "excessive errors" if running the GCC test suite with these earlier GAS versions. Maybe we'll have to add some autoconf-based probing for the linker anyway? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-09 Thread Xi Ruoyao
On Fri, 2024-02-09 at 00:02 +0800, chenglulu wrote: > > On 2024/2/7 at 12:23 AM, Xi Ruoyao wrote: > > Hi Lulu, > > > > I'm proposing to backport r14-4674 "LoongArch: Delete macro definition > > ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/gcc-13

Re: [PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao
On Tue, 2024-02-06 at 17:55 +0800, Xi Ruoyao wrote: > Recently I've fixed two wrong FP vector negate implementations which > caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801).  To > prevent a similar issue from happening again, add a test case. > > Tested on x86_64

LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-06 Thread Xi Ruoyao
eases/gcc-12 and releases/gcc-13 then? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao
Recently I've fixed two wrong FP vector negate implementations which caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch

Pushed: [PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-05 Thread Xi Ruoyao
On Mon, 2024-02-05 at 09:56 +0800, YunQiang Su wrote: > On Mon, Feb 5, 2024 at 02:01, Xi Ruoyao wrote: > > > > We expanded (neg x) to (minus const0 x) for MSA FP vectors; this is > > wrong because -0.0 is not 0 - 0.0.  This causes some Python tests to > > fail when Pytho

[PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-04 Thread Xi Ruoyao
We expanded (neg x) to (minus const0 x) for MSA FP vectors; this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with MSA enabled. Use the bnegi.df instructions to simply reverse the sign bit instead. gcc/ChangeLog: *
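The difference between negation and subtraction from zero can be seen in scalar C as well. The following is only an illustrative sketch (the function names are made up and it assumes IEEE 754 arithmetic in the default rounding mode, without -ffast-math): for an input of +0.0, 0.0 - x yields +0.0 while -x yields -0.0.

```c
#include <math.h>

/* Subtraction from zero: what (minus const0 x) computes per element.  */
double
negate_by_subtraction (double x)
{
  return 0.0 - x;
}

/* True negation: flips the sign, including for zeros.  */
double
negate_properly (double x)
{
  return -x;
}

/* 1 if x has its sign bit set, 0 otherwise.  */
int
is_negative (double x)
{
  return signbit (x) != 0;
}
```

With these helpers, is_negative (negate_properly (0.0)) is 1 but is_negative (negate_by_subtraction (0.0)) is 0, which is exactly the kind of sign mismatch described above.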

Pushed: [PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-04 Thread Xi Ruoyao
On Sun, 2024-02-04 at 11:19 +0800, chenglulu wrote: > > On 2024/2/2 at 5:55 PM, Xi Ruoyao wrote: > > We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. > > But in loongarch_symbol_insns: > > > > if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE

Pushed: [PATCH] LoongArch: Fix wrong LSX FP vector negation

2024-02-04 Thread Xi Ruoyao
On Sun, 2024-02-04 at 11:20 +0800, chenglulu wrote: > > On 2024/2/3 at 4:58 PM, Xi Ruoyao wrote: > > We expanded (neg x) to (minus const0 x) for LSX FP vectors; this is > > wrong because -0.0 is not 0 - 0.0.  This causes some Python tests to > > fail when Python is built with L

Pushed: [PATCH] LoongArch: Fix an ODR violation

2024-02-03 Thread Xi Ruoyao
On Fri, 2024-02-02 at 10:42 +0800, chenglulu wrote: > LGTM! > > Thanks! Pushed r14-8773. > On 2024/2/2 at 5:54 AM, Xi Ruoyao wrote: > > When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR > > violation is detected: > > > > ../../gcc/config/loo

[PATCH] LoongArch: Fix wrong LSX FP vector negation

2024-02-03 Thread Xi Ruoyao
We expanded (neg x) to (minus const0 x) for LSX FP vectors; this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with LSX enabled. Use the vbitrevi.{d/w} instructions to simply reverse the sign bit instead. We are already doing this for LASX and
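Reversing the sign bit can be pictured in portable C as an XOR on the value's bit pattern. This is a conceptual sketch only, assuming float uses the IEEE 754 binary32 encoding; the helper names are made up and the code says nothing about how the vbitrevi instructions are encoded or emitted:

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float (assumed to be binary32).  */
uint32_t
float_bits (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* Negate a float by toggling bit 31, the sign bit.  No arithmetic is
   involved, so +0.0 becomes -0.0 and vice versa.  */
float
negate_by_bit_flip (float x)
{
  uint32_t bits = float_bits (x) ^ UINT32_C (0x80000000);
  memcpy (&x, &bits, sizeof x);
  return x;
}
```

For example, negate_by_bit_flip (0.0f) has the bit pattern 0x80000000, which is -0.0f, whereas 0.0f - 0.0f would have produced +0.0f.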

[PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-02 Thread Xi Ruoyao
We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. But in loongarch_symbol_insns: if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode)) return 0; And LSX_SUPPORTED_MODE_P is defined as: #define LSX_SUPPORTED_MODE_P(MODE) \ (ISA_HAS_LSX \

[PATCH] LoongArch: Fix an ODR violation

2024-02-01 Thread Xi Ruoyao
When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR violation is detected: ../../gcc/config/loongarch/loongarch-opts.cc:57: warning: 'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr] 57 | abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES];

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-01 Thread Xi Ruoyao
On Thu, 2024-02-01 at 14:55 +0100, Jakub Jelinek wrote: > On Thu, Feb 01, 2024 at 01:42:03PM +0000, Jonathan Yong wrote: > > On 2/1/24 13:06, Xi Ruoyao wrote: > > > On Thu, 2024-02-01 at 14:01 +0100, Jakub Jelinek wrote: > > > > On Thu, Feb 01, 2024 at 12:45:3

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-01 Thread Xi Ruoyao
")\n", Should use HOST_WIDE_INT_PRINT_UNSIGNED instead of PRIu64. >(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE) >    * allocated_words_num), >(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE) >
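The underlying rule is that the format specifier has to match the type the argument is cast to. HOST_WIDE_INT_PRINT_UNSIGNED is internal to GCC, so the following sketch uses standard C analogues instead (%zu for size_t, PRIu64 for uint64_t); the function names are hypothetical and this is not code from the patch:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* %zu matches size_t directly, so no cast is needed.  */
int
format_size (char *buf, size_t len, size_t n)
{
  return snprintf (buf, len, "%zu bytes", n);
}

/* If the value is cast, the cast type and the format macro must agree:
   here both are uint64_t / PRIu64.  */
int
format_u64 (char *buf, size_t len, size_t n)
{
  return snprintf (buf, len, "%" PRIu64 " bytes", (uint64_t) n);
}
```

Mixing the two, for example casting to one type and printing with the other's specifier, is what the review comment warns about: it may happen to work on one host and break on another where the types differ in width.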

Re: [PATCH] LoongArch: Fix soft-float builds of libffi

2024-01-31 Thread Xi Ruoyao
at. You need to wait until the PR is accepted by the libffi maintainers. Frankly I don't know what the libffi maintainers are busy with and I'm frustrated as well (having a MIPS patch unreviewed there for a month) but this is the procedure :(. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-27 Thread Xi Ruoyao
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote: > On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote: > > > > On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote: > > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: > > >

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-27 Thread Xi Ruoyao
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote: > > On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: > > > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > > >

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread Xi Ruoyao
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote: > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote: > > > v3 -> v4: > > >    1. Add macro support for TLS symbols > > >    2. Added support for loading __get_t

Re: [PATCH v4 2/4] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-26 Thread Xi Ruoyao
\t%0,%2,%1"; > +    case SYMBOL_TLSLDM: > +  return "la.tls.ld\t%0,%2,%1"; > + > +    default: > +  gcc_unreachable (); > +  } > +} > + "&& REG_P (operands[1]) && find_reg_note (insn, REG_UNUSED, operands[2]) != > 0" > + [(set (match_dup 0) (match_dup 1))] > + "" > + [(set_attr "mode" "DI") > +  (set_attr "length" "5")]) Should be 20, in bytes. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v4 1/4] LoongArch: Merge template got_load_tls_{ld/gd/le/ie}.

2024-01-26 Thread Xi Ruoyao
"la.tls.le\t%0,%1"; > +    case SYMBOL_TLS_IE: > +  return "la.tls.ie\t%0,%1"; > +    case SYMBOL_TLSLDM: > +  return "la.tls.ld\t%0,%1"; > +    case SYMBOL_TLSGD: > +  return "la.tls.gd\t%0,%1"; /* snip */ > +    default: > +  g

Re: [PATCH v4 0/4] When cmodel=extreme, add macro support and only support macros.

2024-01-26 Thread Xi Ruoyao
extreme TLS GD/LD with -mexplicit-relocs=auto. I've rebased and attached the patch to fix the bad split in -mexplicit-relocs={always,auto} -mcmodel=extreme on top of this series. I've not tested it seriously though (only tested the added and modified test cases). -- Xi Ruoyao School of Aerospace Scie

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-24 Thread Xi Ruoyao
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote: > > On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote: > > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote: > > > > > The failure of this test case was because the compiler believes that > > > > > two > > >

Re: [PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-24 Thread Xi Ruoyao
On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote: > At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote: > > On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote: > > > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote: > > > > The vect_int_mo

Re: [PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-24 Thread Xi Ruoyao
On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote: > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote: > > The vect_int_mod target selector is evaluated with the options in > > DEFAULT_VECTCFLAGS in effect, but these options are not automatically > > passed to

Re: [PATCH] LoongArch: Fix incorrect return type for frecipe/frsqrte intrinsic functions

2024-01-24 Thread Xi Ruoyao
n __inline float >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) >  __frecipe_s (float _1) >  { > -  __builtin_loongarch_frecipe_s ((float) _1); > +  return (float) __builtin_loongarch_frecipe_s ((float) _1); I don't think the (float) conversion is needed. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
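The bug being fixed is a wrapper that computes a value but never returns it. A small sketch of the corrected shape follows; recip_estimate is a hypothetical stand-in for the real builtin (it uses exact division, whereas the real instruction produces an estimate), and frecipe_wrapper is likewise a made-up name:

```c
/* Hypothetical stand-in for the builtin; it already returns float,
   so the caller needs no extra (float) conversion.  */
static float
recip_estimate (float x)
{
  return 1.0f / x;
}

/* Buggy shape, shown only as a comment because using its result would
   be undefined behaviour:

     float wrapper (float x) { recip_estimate (x); }

   Fixed shape: the result is actually returned.  */
float
frecipe_wrapper (float x)
{
  return recip_estimate (x);
}
```

Since the stand-in already has a float return type, return recip_estimate (x); is enough, which mirrors the remark that the explicit (float) conversion is not needed.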

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-23 Thread Xi Ruoyao
y papers over the same issue that caused the spec2006 failure. I tried a bootstrap with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS commented out, and there are no more spurious "note: non-delegitimized UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things. I feel that this hook is still written in a buggy way, so maybe removing it will solve the spec2017 issue. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-23 Thread Xi Ruoyao
The vect_int_mod target selector is evaluated with the options in DEFAULT_VECTCFLAGS in effect, but these options are not automatically passed to tests outside the vect directories. So this test fails on targets where integer vector modulo operation is supported but requires an option to enable,

[PATCH] LoongArch: testsuite: Disable stack protector for got-load.C

2024-01-23 Thread Xi Ruoyao
When building GCC with --enable-default-ssp, the stack protector is enabled for got-load.C, causing additional GOT loads for __stack_chk_guard. So mem/u will be matched more than 2 times and the test will fail. Disable stack protector to fix this issue. gcc/testsuite: *

Pushed: [PATCH v2] LoongArch: Disable explicit reloc for TLS LD/GD with -mexplicit-relocs=auto

2024-01-23 Thread Xi Ruoyao
On Tue, 2024-01-23 at 10:37 +0800, chenglulu wrote: > LGTM! > > Thanks! Pushed v2 as attached. The only change is in the comment: Qinggang told me TLS LE relaxation actually *requires* explicit relocs. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian Univer

[PATCH] LoongArch: Disable explicit reloc for TLS LD/GD with -mexplicit-relocs=auto

2024-01-22 Thread Xi Ruoyao
Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler macro. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false for SYMBOL_TLS_LDM and SYMBOL_TLS_GD.

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-18 Thread Xi Ruoyao
") (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))] With this the buggy REG_UNUSED notes were gone. But it then prevented the CSE when loading the address of __tls_get_addr (i.e. if we address 10 TLS LD symbols in a function it would emit 10 instances of "la.global __tls_get

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-17 Thread Xi Ruoyao
derstand the purpose of adding > '-fno-tree-vectorize' here. I don't think -fno-tree-vectorize will make a difference here. This test case uses __attribute__((vector_size(...))) explicitly so the vector operation will be used even with -fno-tree-vectorize. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-17 Thread Xi Ruoyao
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote: > > On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote: > > At 15:01 +0800 on Saturday, 2024-01-13, chenglulu wrote: > > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote: > > > > At 09:46 +0800 on Friday, 2024-01-12, chenglulu wrote: > > > > > > >

Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor

2024-01-16 Thread Xi Ruoyao
ibstdc++-v3/testsuite/lib/dg-options.exp > @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } { >    || ([istarget powerpc*-*-*] && [check_effective_target_ilp32]) >    || [istarget riscv*-*-*] >    || ([istarget sparc*-*-linux-gnu] && [check_effective_target_ilp32]) > + || ([istarget i?86-*-*] || [istarget x86_64-*-*]) This seems like overkill as "dg-add-options libatomic" is not intended to handle 16-byte atomics. Maybe we can fork this to a new dg-add-options like "add_options_for_libatomic_16b"? >         } { >   global TOOL_OPTIONS >   > --  > 2.25.1 -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-16 Thread Xi Ruoyao
On Tue, 2024-01-16 at 12:58 +0800, Xi Ruoyao wrote: > On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote: > > At 15:50 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote: > > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote: > > > > At 14:42 +0800 on Monday

Re: Ping: [PATCH] LoongArch: Remove constraint z from movsi_internal

2024-01-15 Thread Xi Ruoyao
On Tue, 2024-01-16 at 14:16 +0800, chenglulu wrote: > > > On 2024/1/16 at 1:34 PM, Xi Ruoyao wrote: > > Ping. > > > > On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote: > > > We don't allow SImode in FCC, so constraint z is never really used >

Ping: [PATCH] LoongArch: Remove constraint z from movsi_internal

2024-01-15 Thread Xi Ruoyao
Ping. On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote: > We don't allow SImode in FCC, so constraint z is never really used > here. > > gcc/ChangeLog: > > * config/loongarch/loongarch.md (movsi_internal): Remove > constraint z. > --- > > Bootstrappe

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-15 Thread Xi Ruoyao
On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote: > At 15:50 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote: > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote: > > > At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote: > > > > On Mon, 2024-01-15 at

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-14 Thread Xi Ruoyao
On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote: > At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote: > > On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote: > > > Xi Ruoyao wrote at 12:11pm on Monday, January > > > 15, 2024: > > >

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-14 Thread Xi Ruoyao
On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote: > On Mon, Jan 15, 2024 at 12:11, Xi Ruoyao wrote: > > > > On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote: > > > At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote: > > > > At 15:28 +0800 on Saturday 2024-01-1

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-14 Thread Xi Ruoyao
On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote: > At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote: > > At 15:28 +0800 on Saturday 2024-01-13, chenxiaolong wrote: > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.dg/pr104992.c: Added addition

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-13 Thread Xi Ruoyao
1 100644 > --- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f > +++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f > @@ -2,6 +2,7 @@ >  ! { dg-require-effective-target vect_double } >  ! { dg-options "-O3 --param vect-max-peeling-for-alignment=0 > -fpredictive

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-13 Thread Xi Ruoyao
At 15:01 +0800 on Saturday, 2024-01-13, chenglulu wrote: > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote: > > At 09:46 +0800 on Friday, 2024-01-12, chenglulu wrote: > > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS: > > > > we n

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-12 Thread Xi Ruoyao
enable-bootstrap > --enable-checking=release >     $ make BOOT_FLAGS="-mcmodel=extreme" > > What did I do wrong?:-( BOOT_CFLAGS, not BOOT_FLAGS :). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 3/3] LoongArch: Redundant sign extension elimination optimization 2.

2024-01-06 Thread Xi Ruoyao
can-assembler-times "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0" > 0 } } */ Use scan-assembler-not instead of scan-assembler-times ... 0. Otherwise LGTM. >  #include >  #define my_min(x, y) ((x) < (y) ? (x) : (y)) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 1/3] LoongArch: Optimized some of the symbolic expansion instructions generated during bitwise operations.

2024-01-06 Thread Xi Ruoyao
uot;")]) >   > +(define_insn "*nsi_internal" > +  [(set (match_operand:SI 0 "register_operand" "=r") > + (neg_bitwise:SI > +     (not:SI (match_operand:SI 1 "register_operand" "r")) > +     (match_operand:SI 2 "register_operand" "r")))] > +  "TARGET_64BIT" > +  "n\t%0,%2,%1" > +  [(set_attr "type" "logical") > +   (set_attr "mode" "SI")]) >   >  ;; >  ;;  > @@ -3167,7 +3210,6 @@ (define_expand "condjump" >     (label_ref (match_operand 1)) >     (pc)))]) >   > - >   >  ;; >  ;;  > @@ -3967,10 +4009,13 @@ (define_insn "bytepick_w_" >  (define_insn "bytepick_w__extend" >    [(set (match_operand:DI 0 "register_operand" "=r") >   (sign_extend:DI > -   (ior:SI (lshiftrt (match_operand:SI 1 "register_operand" "r") > -     (const_int )) > -   (ashift (match_operand:SI 2 "register_operand" "r") > -   (const_int bytepick_w_ashift_amount)] > + (subreg:SI > +   (ior:DI (subreg:DI (lshiftrt > +   (match_operand:SI 1 "register_operand" "r") > +   (const_int )) 0) > +   (subreg:DI (ashift > +   (match_operand:SI 2 "register_operand" "r") > +   (const_int bytepick_w_ashift_amount)) 0)) 0)))] >    "TARGET_64BIT" >    "bytepick.w\t%0,%1,%2," >    [(set_attr "mode" "SI")]) > diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c > b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c > new file mode 100644 > index 000..5753ef69db2 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c > @@ -0,0 +1,21 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mabi=lp64d -O2" } */ > +/* { dg-final { scan-assembler-not "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0" } > } */ > + > +struct pmop > +{ > +  unsigned int op_pmflags; > +  unsigned int op_pmpermflags; > +}; > +unsigned int PL_hints; > + > +struct pmop *pmop; > +void > +Perl_newPMOP (int type, int flags) > +{ > +  if (PL_hints & 0x0010) > +    pmop->op_pmpermflags |= 0x0001; > +  if (PL_hints & 0x0004) > +    pmop->op_pmpermflags |= 0x0800; > +  pmop->op_pmflags = pmop->op_pmpermflags; > +} -- Xi 
Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 2/3] LoongArch: Redundant sign extension elimination optimization.

2024-01-06 Thread Xi Ruoyao
_rtx (DImode); > +   emit_insn (gen_addsi3_extended (t, operands[1], operands[2])); AFAIK if !TARGET_64BIT a DImode value should actually be a pair of hardware registers, but addsi3_extended doesn't output such a pair so this seems invalid... > +   t = gen_lowpart (SImode, t); > +  

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote: > > On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote: > > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: > > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: > > > >

Re: [PATCH 1/4] LoongArch: Handle ISA evolution switches along with other options

2024-01-05 Thread Xi Ruoyao
SA_HAS_DIV32 etc. in the code base? It seems some of them are not replaced. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote: > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: > > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: > > > >   bool > > > >   loongarch_ex

Re: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-05 Thread Xi Ruoyao
ve_target_loongarch_sx] ||" because SIMD requires hard float. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote: > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote: > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote: > > >   bool > > >   loongarch_explicit_relocs_p (enum loongarch_symbol_type type) > > >   { > > >

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-05 Thread Xi Ruoyao
ive me several hours trying to implement this... -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH]middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

2024-01-04 Thread Xi Ruoyao
_effective_target_s390_vx]) > > +|| ([istarget riscv*-*-*] > > + && [check_effective_target_riscv_v]) > > Unless I'm missing something, we have copysign in the scalar > floating-point ISAs as well.  So I think this should be > >   || ([istarget riscv*-*-*] >   && [check_effective_target_hard_float]) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [RFA] [V3] new pass for sign/zero extension elimination

2024-01-04 Thread Xi Ruoyao
as possible.  Assuming the rest is ACK'd for the trunk we'll put it into > the list of optimizations enabled by -O2. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-03 Thread Xi Ruoyao
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote: > > On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote: > > On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote: > > > +(define_insn "movdi_pcrel64" > > > + [(set (match_operand:DI 0 "register_op

Re: [PATCH 1/2] LoongArch: Add the macro implementation of mcmodel=extreme.

2024-01-03 Thread Xi Ruoyao
perand:DI 2 "register_operand "="))] And use gen_movdi_pcrel64 (operands[0], operands[1], gen_reg_rtx(DImode)) in expand. > + "TARGET_64BIT" > + "la.local %0,$r15,%1" > + [(set_attr "mode" "DI") > +  (set_attr "length" "5")]) -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Pushed: [PATCH] LoongArch: Provide fmin/fmax RTL pattern for vectors

2024-01-03 Thread Xi Ruoyao
On Wed, 2024-01-03 at 16:24 +0800, chenglulu wrote: > LGTM! > > Thanks! Pushed r14-6890. FWIW sometimes tree optimizer still fails to emit .reduc_f{max,min} or it emits them sub-optimally. I've commented in PR112457 but maybe I should've created a new ticket... > On 2024/1/1 at 3:1

[PATCH] LoongArch: Provide fmin/fmax RTL pattern for vectors

2023-12-31 Thread Xi Ruoyao
We already had smin/smax RTL pattern using vfmin/vfmax instructions. But for smin/smax, it's unspecified what will happen if either operand contains any NaN operands. So we would not vectorize the loop with -fno-finite-math-only (the default for all optimization levels except -Ofast). But,
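
The NaN distinction mentioned in this snippet can be illustrated in plain C. The sketch below is not taken from the patch: `naive_min` and `ieee_style_min` are hypothetical helper names, and `ieee_style_min` is a hand-written stand-in that only mirrors the NaN rule of C's `fmin` (when exactly one operand is NaN, the other operand is returned). It says nothing about what the vfmin/vfmax instructions actually do.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical "smin-like" minimum: a bare comparison.  With a NaN
   operand the comparison is false, so the result depends on which
   side the NaN is on.  */
static inline double naive_min(double a, double b)
{
    return a < b ? a : b;
}

/* Hypothetical helper mirroring the NaN rule of C's fmin: if exactly
   one operand is NaN, return the other one.  isnan() is the standard
   classification macro from <math.h>.  */
static inline double ieee_style_min(double a, double b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a < b ? a : b;
}

/* Small self-check showing the difference between the two helpers.  */
static inline void min_self_check(void)
{
    /* The naive version is order-dependent when a NaN is involved.  */
    assert(naive_min(NAN, 1.0) == 1.0);
    assert(isnan(naive_min(1.0, NAN)));

    /* The fmin-style version returns the non-NaN operand either way.  */
    assert(ieee_style_min(NAN, 1.0) == 1.0);
    assert(ieee_style_min(1.0, NAN) == 1.0);
}
```

Calling `min_self_check()` from a test program exercises both helpers. The order dependence of `naive_min` is one way to see why a pattern with unspecified NaN behaviour cannot safely stand in for fmin/fmax unless NaNs are assumed absent.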

Re: [PATCH v1] LoongArch: testsuite:Add the "-ffast-math" compilation option for the file vect-fmin-3.c.

2023-12-30 Thread Xi Ruoyao
On Sat, 2023-12-30 at 20:25 +0800, Xi Ruoyao wrote: > On Sat, 2023-12-30 at 12:15 +, Richard Sandiford wrote: > > This shouldn't be necessary.  The test does: > > > >   for (int i = 0; i < n; i += 2) > >     { > >   x0 = __builtin_fmin (x0, ptr[i + 0]

Re: [PATCH v1] LoongArch: testsuite:Add the "-ffast-math" compilation option for the file vect-fmin-3.c.

2023-12-30 Thread Xi Ruoyao
duc_fmin_scal_*? > If so, we probably need a new target selector for fmin/fmax reduction. Let me check whether the [x]vf{min,max} instructions are IEEE-conformant. Volume 2 of the instruction manual still hasn't been released, so I can only test... -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH pushed] LoongArch: Fix the format of bstrins__for_ior_mask condition (NFC)

2023-12-29 Thread Xi Ruoyao
gcc/ChangeLog: * config/loongarch/loongarch.md (bstrins__for_ior_mask): For the condition, remove unneeded trailing "\" and move "&&" to follow GNU coding style. NFC. --- Pushed as obvious. gcc/config/loongarch/loongarch.md | 4 ++-- 1 file changed, 2 insertions(+), 2

Pushed: [PATCH v4] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-29 Thread Xi Ruoyao
Pushed v4 as attached, with the format issues fixed and a minor adjustment in the commit message ("define_insn_and_split" is changed to "define_insn_and_rewrite" to match the actual change). On Fri, 2023-12-29 at 19:55 +0800, Xi Ruoyao wrote: > On Fri, 2023-12-29 at 15:57

Re: [PATCH v3] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-29 Thread Xi Ruoyao
> +  return symbolic_pcrel_operand (op, Pmode) || > > +symbolic_pcrel_offset_operand (op, Pmode); > > +}) > > + > >   > Symbol '||' It shouldn't be at the end of the line. Indeed. > > +  return symbolic_pcrel_operand (op, Pmode) > +    || symbolic_pcrel_offset_operand (op, Pmode); > > Others LGTM. > Thanks! > > /* snip */ > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH v3] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-28 Thread Xi Ruoyao
The problem with peephole2 is it uses a naive sliding-window algorithm and misses many cases. For example: float a[1]; float t() { return a[0] + a[8000]; } is compiled to: la.local $r13,a la.local $r12,a+32768 fld.s $f1,$r13,0 fld.s $f0,$r12,-768

Re: [PATCH v1] LoongArch: Merge constant vector permuatation implementations.

2023-12-28 Thread Xi Ruoyao
   > rperm)); > +   tmp = gen_rtx_SUBREG (E_V4DImode, d->target, 0); Likewise. > +   emit_move_insn (tmp, sel); > +   break; > +     case E_V8SFmode: > +   sel = gen_rtx_CONST_VECTOR (E_V8SImode, gen_rtvec_v (d- > >nelt, > +

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-27 Thread Xi Ruoyao
ymbol_ref:DI ("*.LANCHOR0") [flags 0x182])) [0 S1 > A8]))) "volatile.c":5:11 -1 >  (nil)) > > The volatile property of the mem here is gone, so the test fails. Phew. I guess I couldn't reproduce it because I have Jeff's ext-dce patch in my local repo, which removed the zero_extend... I'll rework this patch. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] LoongArch: Fix infinite secondary reloading of FCCmode [PR113148]

2023-12-26 Thread Xi Ruoyao
The GCC internal doc says: X might be a pseudo-register or a 'subreg' of a pseudo-register, which could either be in a hard register or in memory. Use 'true_regnum' to find out; it will return -1 if the pseudo is in memory and the hard register number if it is in a register.

[PATCH v2] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-25 Thread Xi Ruoyao
The problem with peephole2 is it uses a naive sliding-window algorithm and misses many cases. For example: float a[1]; float t() { return a[0] + a[8000]; } is compiled to: la.local $r13,a la.local $r12,a+32768 fld.s $f1,$r13,0 fld.s $f0,$r12,-768

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-25 Thread Xi Ruoyao
On Mon, 2023-12-25 at 10:08 +0800, chenglulu wrote: > > On 2023/12/24 at 8:59 PM, Xi Ruoyao wrote: > > On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote: > > > On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: > > > > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:

Re: [PATCH v1] LoongArch: Fixed bug in *bstrins__for_ior_mask template.

2023-12-25 Thread Xi Ruoyao
> +  "&& true" >    [(set (match_dup 0) (match_dup 1)) >     (set (zero_extract:GPR (match_dup 0) (match_dup 2) (match_dup 4)) >   (match_dup 3))] -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-24 Thread Xi Ruoyao
On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote: > On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: > > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: > > > > The performance drop has nothing to do with this patch. I found that > > > > the h264 performa

Re: [PATCH] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-24 Thread Xi Ruoyao
On Sun, 2023-12-24 at 01:04 +0800, Xi Ruoyao wrote: > On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote: > > On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote: > > > Hi, > > > > > > This patch will cause the following tests to fail: > > > > >

[PATCH v2] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-24 Thread Xi Ruoyao
gcc/ChangeLog: * config/loongarch/loongarch.md (rotl3): New define_expand. * config/loongarch/simd.md (vrotl3): Likewise. (rotl3): Likewise. gcc/testsuite/ChangeLog: * gcc.target/loongarch/rotl-with-rotr.c: New test. *
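
The identity behind the subject line (a left rotate equals a right rotate by the negated amount) can be checked in portable C. This is only a sketch of the arithmetic, not the RTL from the patch: `rotr32` and `rotl32_via_rotr` are hypothetical names, and the masking with 31 assumes a rotate that uses only the low 5 bits of the amount for a 32-bit operand.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits.  The amount is reduced modulo 32, and the
   second shift count is masked too, so n == 0 never produces an
   undefined shift by 32.  */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Rotate left, expressed as a rotate right by the negated amount.
   Unsigned negation wraps, and (0 - n) mod 32 == (32 - n) mod 32.  */
static inline uint32_t rotl32_via_rotr(uint32_t x, unsigned n)
{
    return rotr32(x, 0u - n);
}

/* Small self-check of the identity on a few values.  */
static inline void rotl_self_check(void)
{
    assert(rotl32_via_rotr(0x80000001u, 1) == 0x00000003u);
    assert(rotl32_via_rotr(0x12345678u, 8) == 0x34567812u);
    assert(rotl32_via_rotr(0xdeadbeefu, 0) == 0xdeadbeefu);
}
```

Calling `rotl_self_check()` confirms the identity for the sample values. The same reasoning applies per element to vector rotates, provided the negated amount is reduced modulo the element width.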

Re: [PATCH] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-23 Thread Xi Ruoyao
On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote: > On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote: > > Hi, > > > > This patch will cause the following tests to fail: > > > > +FAIL: gcc.dg/vect/pr97081-2.c (internal compiler error: in extract_insn, > &

Re: [PATCH] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-23 Thread Xi Ruoyao
ence may be caused by a different binutils version or some other changes in GCC. I'll figure it out... -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-23 Thread Xi Ruoyao
On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: > > > The performance drop has nothing to do with this patch. I found that the > > > h264 performance compiled > > > by r14-6787 compared to r14-6421 dropped

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-23 Thread Xi Ruoyao
here is a problem. My regression test has the following two fail > items.(based on r14-6787) > +FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors) > +FAIL: gcc.dg/pr86617.c scan-rtl-dump-times final "mem/v" 6 Strange. I didn't see them on r14-6650 (with or without the patch). --

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread Xi Ruoyao
e new define_insn_and_split produces a better result instead of solely relying on define_insn_and_split? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread Xi Ruoyao
Ping :). On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote: > The problem with peephole2 is it uses a naive sliding-window algorithm > and misses many cases.  For example: > >     float a[1]; >     float t() { return a[0] + a[8000]; } > > is compiled to: >
