Re: [PATCH v7] Add condition coverage (MC/DC)

2023-12-29 Thread Jørgen Kvalsvik
On 29/12/2023 22:14, Jan Hubicka wrote: gcc/ChangeLog: * builtins.cc (expand_builtin_fork_or_exec): Check condition_coverage_flag. * collect2.cc (main): Add -fno-condition-coverage to OBSTACK. * common.opt: Add new options -fcondition-coverage and

Re: [PATCH 1/2] RTX_COST: Count instructions

2023-12-29 Thread Jeff Law
On 12/29/23 10:46, YunQiang Su wrote: When we try to combine RTLs, the result may be very complex, and `rtx_cost` may assign it a high cost. But in fact, it may match a pattern in the machine description, which may emit only 1 or 2 hardware instructions. This combination may be

Re: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-29 Thread Jeff Law
On 12/28/23 19:07, YunQiang Su wrote: In general, I agree with this change. With gcc12 on RV64, more than one `sext.w` is produced with our test. (Note: use -O1.) There are two things that help here. The first is that the most significant bit never appears in the middle of a field,

Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-29 Thread Jeff Law
On 12/28/23 22:56, Li, Pan2 wrote: Thanks Jeff. I think I have located where aarch64 performs the trick here. 1. In the .final we have rtl like (insn:TI 6 8 29 (set (reg:SF 32 v0) (const_double:SF -0.0 [-0x0.0p+0]))

Re: [PATCH v1 1/8] LoongArch: testsuite:Add detection procedures supported by the target.

2023-12-29 Thread chenxiaolong
At 14:28 +0800 on 2023-12-29, Chenghua Xu wrote: > chenxiaolong writes: > > In order to improve and check the function of vector quantization in LoongArch architecture, tests on vector instruction set are provided in target-support.exp. > > gcc/testsuite/ChangeLog: > >

[PATCH] libstdc++ testsuite/std/ranges/iota/max_size_type.cc: Reduce /10 for simulators

2023-12-29 Thread Hans-Peter Nilsson
I'm not completely sure I got the intent of the "log2_limit", or whether "limit" is sane to decrease like this; it just looked like an obvious and safe reduction. Also, I verified the 10+ minute runtime, on this same host (clocked at 11:43.61 elapsed time) for a r12-2797-g307e0d40367996 build

[PATCH] libstdc++ testsuite/20_util/hash/quality.cc: Increase timeout 3x

2023-12-29 Thread Hans-Peter Nilsson
Tested for mmix and observing the increased timeout in the .log file - and the test passing. Ok to commit? Or better suggestions? -- >8 -- Testing for mmix (a 64-bit target using Knuth's simulator). The test is largely pruned for simulators, but still needs 5m57s on my laptop from 3.5 years

[committed] MAINTAINERS: Update my email address

2023-12-29 Thread Joseph Myers
There will be another update in January. * MAINTAINERS: Update my email address. diff --git a/MAINTAINERS b/MAINTAINERS index 343560c5b84..fe5d95ae970 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -34,7 +34,7 @@ Jeff Law Michael Meissner

Re: skip vector profiles multiple exits

2023-12-29 Thread Jan Hubicka
> Hi Honza, Hi, > > I wasn't sure what to do here so I figured I'd ask. > > In adding support for multiple exits to the vectorizer I didn't know how to > update this bit: > > https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-vect-loop-manip.cc#L3363 > > Essentially, if skip_vector (i.e.

skip vector profiles multiple exits

2023-12-29 Thread Tamar Christina
Hi Honza, I wasn't sure what to do here so I figured I'd ask. In adding support for multiple exits to the vectorizer I didn't know how to update this bit: https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-vect-loop-manip.cc#L3363 Essentially, if skip_vector (i.e. not enough iteration to

Re: [PATCH 3/7] Lockfile.

2023-12-29 Thread Jan Hubicka
Hi, > This patch implements lockfile used for incremental LTO. > > Bootstrapped/regtested on x86_64-pc-linux-gnu > > gcc/ChangeLog: > > * Makefile.in: Add lockfile.o. > * lockfile.cc: New file. > * lockfile.h: New file. I can't approve it, but overall it looks good to me. We

Re: [PATCH 2/7] lto: Remove random_seed from section name.

2023-12-29 Thread Jan Hubicka
> Bootstrapped/regtested on x86_64-pc-linux-gnu > > gcc/ChangeLog: > > * lto-streamer.cc (lto_get_section_name): Remove random_seed in WPA. This is also OK. (since it lacks explanation - the random suffixes are added for ld -r to work. This never happens between WPA and ltrans, so they

Re: [PATCH 1/7] lto: Skip flag OPT_fltrans_output_list_.

2023-12-29 Thread Jan Hubicka
Hi, > Bootstrapped/regtested on x86_64-pc-linux-gnu > > gcc/ChangeLog: > > * lto-opts.cc (lto_write_options): Skip OPT_fltrans_output_list_. OK, thanks, Honza > --- > gcc/lto-opts.cc | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/gcc/lto-opts.cc b/gcc/lto-opts.cc > index

Re: [PATCH v7] Add condition coverage (MC/DC)

2023-12-29 Thread Jan Hubicka
> gcc/ChangeLog: > > * builtins.cc (expand_builtin_fork_or_exec): Check > condition_coverage_flag. > * collect2.cc (main): Add -fno-condition-coverage to OBSTACK. > * common.opt: Add new options -fcondition-coverage and > -Wcoverage-too-many-conditions. > *

[PATCH]middle-end: maintain LCSSA form when peeled vector iterations have virtual operands

2023-12-29 Thread Tamar Christina
Hi All, This patch fixes several interconnected issues. 1. When picking an exit we wanted to check for niter_desc.may_be_zero not true. i.e. we want to pick an exit which we know will iterate at least once. However niter_desc.may_be_zero is not a boolean. It is a tree that encodes a

[PATCH 1/2] RTX_COST: Count instructions

2023-12-29 Thread YunQiang Su
When we try to combine RTLs, the result may be very complex, and `rtx_cost` may assign it a high cost. But in fact, it may match a pattern in the machine description, which may emit only 1 or 2 hardware instructions. This combination may be refused because the cost comparison fails. Since

[PATCH 2/2] MIPS: Implement TARGET_INSN_COSTS

2023-12-29 Thread YunQiang Su
When combining some instructions, the generic `rtx_cost` may overestimate the cost of the resulting RTL, because the RTL may be quite complex and `rtx_cost` has no information that it can be converted to simple hardware instruction(s). In this case, let's use `get_attr_insn_count` to estimate the

[PATCH v2 2/2] MIPS: define_attr perf_ratio in mips.md

2023-12-29 Thread YunQiang Su
The accurate cost of a pattern can be computed as insn_count * perf_ratio. The default value is set to 0 instead of 1, since we need to distinguish the default value from a value that is really set for a pattern. Since it is not set for most patterns yet, to use it, we will need to be sure
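As a rough illustration of the formula above, the following C sketch computes a pattern cost from insn_count and perf_ratio. The function name `pattern_cost` and the fallback to a ratio of 1 when perf_ratio is still 0 are assumptions made for this example; the snippet shown here does not say how the unset case is actually handled.

```c
#include <assert.h>

/* Hypothetical helper, not taken from the patch: estimate the cost of a
   pattern as insn_count * perf_ratio.  A perf_ratio of 0 is treated as
   "not set for this pattern"; falling back to a ratio of 1 in that case
   is an assumption of this sketch.  */
int
pattern_cost (int insn_count, int perf_ratio)
{
  assert (insn_count >= 0 && perf_ratio >= 0);
  if (perf_ratio == 0)
    return insn_count;            /* unset: one cost unit per instruction */
  return insn_count * perf_ratio; /* set: scale by the pattern's ratio */
}
```

Using 0 rather than 1 as the default keeps "never set" distinguishable from "explicitly set to 1", which is the distinction the message describes.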

[PATCH v2 1/2] MIPS: add pattern insqisi_extended and inshisi_extended

2023-12-29 Thread YunQiang Su
This pattern allows combining (zero_extract:DI 8, 24, QI) with a sign-extend into a 32-bit INS instruction on TARGET_64BIT. The problem is that, for SI mode, if the sign bit is modified by bit operations, we need a sign-extend operation. Since the 32-bit INS instruction guarantees that the result is
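The sign-bit issue described above can be sketched in plain C. This is only an illustration of the arithmetic, not of the MIPS pattern itself: the function name `insert_top_byte` is hypothetical, and the code assumes GCC's documented modulo behaviour when converting an out-of-range unsigned value to a signed type.

```c
#include <stdint.h>

/* Hypothetical illustration: replace bits 24..31 of a 32-bit word with
   BYTE and return the result as a 64-bit value, the way a 64-bit target
   that keeps SImode values sign-extended would hold it.  */
int64_t
insert_top_byte (int32_t word, uint8_t byte)
{
  /* Clear the top 8 bits, then insert the new byte at bit 24.  */
  uint32_t u = ((uint32_t) word & 0x00FFFFFFu) | ((uint32_t) byte << 24);

  /* The inserted byte contains bit 31, so the sign of the 32-bit value
     may have changed; the 64-bit result has to be sign-extended again.
     The uint32_t -> int32_t conversion relies on GCC's modulo
     behaviour, which the C standard leaves implementation-defined.  */
  return (int64_t) (int32_t) u;
}
```

Inserting 0x80 into a positive word yields a negative 64-bit result, which is why a separate sign-extend is needed unless the insert instruction already produces a sign-extended value.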

Re: [C PATCH] C: Fix type compatibility for structs with variable sized fields.

2023-12-29 Thread Joseph Myers
On Wed, 27 Dec 2023, Martin Uecker wrote: > This patch hopefully fixes the test failure we see with gnu23-tag-4.c. > It does for me locally with -march=native (which otherwise reproduces > the problem). > > Bootstrapped and regression tested on x86_64 > > > C: Fix type compatibility for structs

Re: [PATCH]AArch64 Update costing for vector conversions [PR110625]

2023-12-29 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > In gimple the operation > > short _8; > double _9; > _9 = (double) _8; > > denotes two operations. First we have to widen from short to long and then > convert this integer to a double. Think it's worth saying "two operations on AArch64". Some targets can

[PATCH]middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

2023-12-29 Thread Tamar Christina
Hi All, Only trying to update certain dominators doesn't seem to work very well because as the loop gets versioned, peeled, or skip_vector then we end up with very complicated control flow. This means that the final merge blocks for the loop exit are not easy to find or update. Instead of

[PATCH]middle-end: rejects loops with nonlinear inductions and early breaks [PR113163]

2023-12-29 Thread Tamar Christina
Hi All, We can't support nonlinear inductions other than neg when vectorizing early breaks and the iteration count is known. For early breaks we currently require a peeled epilog, but in these cases we can't compute the remaining values. Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
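For readers unfamiliar with the terms, the loop shape in question looks roughly like the C sketch below. It is a hypothetical example written for this listing, not the PR113163 testcase; the names `N` and `first_match_weight` are made up.

```c
#include <assert.h>
#include <stddef.h>

#define N 8   /* iteration count known at compile time */

/* Return the weight at the first position where a[i] == key, or 0 if
   the key is absent.  `weight` is a nonlinear (geometric) induction:
   it is multiplied, not incremented, on every iteration.  */
unsigned
first_match_weight (const int *a, int key)
{
  assert (a != NULL);
  unsigned weight = 1;
  for (int i = 0; i < N; i++)
    {
      if (a[i] == key)
        return weight;   /* early break out of the loop */
      weight *= 3;       /* nonlinear induction step */
    }
  return 0;
}
```

When the loop exits early, the value of `weight` at the exit depends on how many multiplications have happened, which is the kind of "remaining value" the message says cannot currently be computed for the peeled epilog.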

[PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2023-12-29 Thread Tamar Christina
Hi All, This adds an implementation for conditional branch optab for AArch32. The previous version only allowed operand 0 but it looks like cbranch expansion does not check with the target and so we have to implement all. I therefore did not commit it. This is a larger version. For e.g. void

[PATCH]AArch64 Update costing for vector conversions [PR110625]

2023-12-29 Thread Tamar Christina
Hi All, In gimple the operation short _8; double _9; _9 = (double) _8; denotes two operations. First we have to widen from short to long and then convert this integer to a double. Currently however we only count the widen/truncate operations: (double) _5 6 times vec_promote_demote costs 12
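The two steps hidden in the single gimple conversion can be written out explicitly in C. This is a minimal sketch of the source-level operation only; the function name `shorts_to_doubles` is an assumption, and the code says nothing about how the vectorizer actually costs the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Convert N shorts to doubles, spelling out the two operations that the
   single cast `(double) in[i]` stands for.  */
void
shorts_to_doubles (const short *in, double *out, size_t n)
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    {
      long widened = in[i];        /* step 1: widen short to long */
      out[i] = (double) widened;   /* step 2: convert integer to double */
    }
}
```

A cost model that only counts the widening step would miss the integer-to-double conversion, which is the gap the message describes.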

[PATCH pushed] LoongArch: Fix the format of bstrins_<mode>_for_ior_mask condition (NFC)

2023-12-29 Thread Xi Ruoyao
gcc/ChangeLog: * config/loongarch/loongarch.md (bstrins_<mode>_for_ior_mask): For the condition, remove unneeded trailing "\" and move "&&" to follow GNU coding style. NFC. --- Pushed as obvious. gcc/config/loongarch/loongarch.md | 4 ++-- 1 file changed, 2 insertions(+), 2

Pushed: [PATCH v4] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-29 Thread Xi Ruoyao
Pushed v4 as attached, with the format issues fixed and a minor adjustment in the commit message ("define_insn_and_split" is changed to "define_insn_and_rewrite" to match the actual change). On Fri, 2023-12-29 at 19:55 +0800, Xi Ruoyao wrote: > On Fri, 2023-12-29 at 15:57 +0800, chenglulu wrote:

Re: [PATCH v3] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-29 Thread Xi Ruoyao
On Fri, 2023-12-29 at 15:57 +0800, chenglulu wrote: /* snip */ > > diff --git a/gcc/config/loongarch/loongarch.md > > b/gcc/config/loongarch/loongarch.md > /* snip */ > > +(define_insn_and_rewrite "simple_load" > > +  [(set (match_operand:LD_AT_LEAST_32_BIT 0 "register_operand" "=r,f") > > +

[PATCH 2/2] MIPS: define_attr perf_ratio in mips.md

2023-12-29 Thread YunQiang Su
The accurate cost of a pattern can be computed as insn_count * perf_ratio. The default value is set to 0 instead of 1, since we need to distinguish the default value from a value that is really set for a pattern. Since it is not set for most patterns yet, to use it, we will need to be sure

[PATCH 1/2] MIPS: add pattern insqisi_extended

2023-12-29 Thread YunQiang Su
This pattern allows combining (zero_extract:DI 8, 24, QI) with a sign-extend into a 32-bit INS instruction on TARGET_64BIT. The problem is that, for SI mode, if the sign bit is modified by bit operations, we need a sign-extend operation. Since the 32-bit INS instruction guarantees that the result is

[PATCH] Do not count unused scalar use when marking STMT_VINFO_LIVE_P [PR113091]

2023-12-29 Thread Feng Xue OS
This patch is meant to fix overestimation of the SLP vector-to-scalar cost for STMT_VINFO_LIVE_P statements. When pattern recognition is involved, a statement whose definition is consumed in some pattern may not be included in the final replacement pattern statements, and would be skipped when

Re: [PATCH] aarch64: fortran: Adjust vect-8.f90 for libmvec

2023-12-29 Thread Richard Sandiford
Szabolcs Nagy writes: > With new glibc one more loop can be vectorized via simd exp in libmvec. > > Found by the Linaro TCWG CI. > > gcc/testsuite/ChangeLog: > > * gfortran/vect/vect-8.f90: Accept more vectorized loops. OK. At first I thought it would be good to "defend" the increase when

Re: [PATCH] aarch64: add 'AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA'

2023-12-29 Thread Richard Sandiford
Di Zhao OS writes: > This patch adds a new tuning option 'AARCH64_EXTRA_TUNE_FULLY_PIPELINED_FMA', > to consider fully pipelined FMAs in reassociation. Also, set this option > by default for Ampere CPUs. > > Tested on aarch64-unknown-linux-gnu. Is this OK for trunk? > > Thanks, > Di Zhao > >

[committed] i386: Fix TARGET_USE_VECTOR_FP_CONVERTS SF->DF float_extend splitter [PR113133]

2023-12-29 Thread Uros Bizjak
The post-reload splitter currently allows xmm16+ registers with TARGET_EVEX512. The splitter changes SFmode of the output operand to V4SFmode, but the vector mode is currently unsupported in xmm16+ without TARGET_AVX512VL. lowpart_subreg returns NULL_RTX in this case and the compilation fails with