[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #19 from Alexander Monakov --- (In reply to rguent...@suse.de from comment #18) > True - but does that catch the cases people are interested and are > allowed by the FP contraction rules? I'm thinking of > > x = a*b + c*d + e + f;

[Bug tree-optimization/107099] New: uncprop a bit

2022-09-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- For the following testcase #include __attribute__((target("avx"))) int f(__m128i a[], long n) { for (long i = 0; i < n; i++) if (!_mm_testz_si128(a[i], a[i]))

[Bug tree-optimization/107107] [10/11/12/13 Regression] Wrong codegen from TBAA when stores to distinct same-mode types are collapsed?

2022-10-01 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107107 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug middle-end/107115] Wrong codegen from TBAA under stores that change effective type?

2022-10-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug middle-end/107115] Wrong codegen from TBAA under stores that change effective type?

2022-10-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 --- Comment #8 from Alexander Monakov --- Just optimizing out the redundant store seems difficult because on some targets scheduling is invoked from reorg (and it relies on alias sets). We need a solution that works for combine too — is it poss

[Bug middle-end/107115] Wrong codegen from TBAA under stores that change effective type?

2022-10-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 --- Comment #12 from Alexander Monakov --- For reference, the previous whacked mole appears to be PR 106187 (where mems_same_for_tbaa_p comes from).

[Bug tree-optimization/107250] Load unnecessarily happens before malloc

2022-10-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/107250] Load unnecessarily happens before malloc

2022-10-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250 --- Comment #3 from Alexander Monakov --- Well, obviously because in one function both 'f' and 'tmp' are live across the call, and in the other function only 'f' is live across the call. The difference is literally pushing one register vs. two r

[Bug middle-end/102380] [meta-bug] visibility (fvisibility=* and attributes) issues

2022-10-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102380 Bug 102380 depends on bug 99619, which changed state. Bug 99619 Summary: fails to infer local-dynamic TLS model from hidden visibility https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99619 What|Removed |Added

[Bug middle-end/99619] fails to infer local-dynamic TLS model from hidden visibility

2022-10-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99619 Alexander Monakov changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-10-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #1 from Alexander Monakov --- Suggested partial fix for the integer-pipe side of the blowup: https://inbox.sourceware.org/gcc-patches/4549f27b-238a-7d77-f72b-cc77df8ae...@ispras.ru/

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #8 from Alexander Monakov --- (In reply to Arseny Solokha from comment #7) > I have it on x86_64-pc-linux-gnu… Thanks for the info (I assume you don't have any special configure arguments), but that's surprising, I ran bootstrap+reg

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #9 from Alexander Monakov --- Actually, latest results from H.J. Lu's periodic x86_64 tester don't exhibit such issues either: https://inbox.sourceware.org/gcc-testresults/20221025065901.6dc0062...@gnu-34.sc.intel.com/T/#u

[Bug c++/107393] New: Wrong TLS model for specialized template

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org CC: amonakov at gcc dot gnu.org, asolokha at gmx dot com, bergner at gcc dot gnu.org, iains at gcc dot gnu.org, law

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #11 from Alexander Monakov --- I've broken out the C++ issue from comment #10 as PR 107393, thanks for the testcase. It's a separate issue from emutls and Fortran ICEs on other targets.

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #12 from Alexander Monakov --- ICE on the emutls-3.c testcase isn't related to emutls. Rather, the frontend invokes decl_default_tls_model before attributes are processed, so the first time around we miss the 'common' attribute when

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #13 from Alexander Monakov --- As for the Fortran testcases, the issue is again caused by the front-end invoking decl_default_tls_model before assigning DECL_COMMON, this time in fortran/trans-common.cc:build_common_decl. So I guess

[Bug c/107419] New: attributes are ignored when selecting TLS model

2022-10-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org CC: amonakov at gcc dot gnu.org, asolokha at gmx dot com, bergner at gcc dot gnu.org, iains at gcc dot gnu.org

[Bug fortran/107421] New: problematic interaction of 'common' and 'threadprivate'

2022-10-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Keywords: openmp Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org CC: amonakov at gcc dot gnu.org, asolokha at gmx dot com, bergner at gcc dot gnu.o

[Bug other/107353] frontends sometimes select wrong (too strong) TLS access model

2022-10-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 Alexander Monakov changed: What|Removed |Added Summary|[13 regression] Numerous|frontends sometimes select

[Bug tree-optimization/107505] [13 Regression] ICE: verify_flow_info failed (error: returns_twice call is not first in basic block 2)

2022-11-02 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107505 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #3 from Alexander Monakov --- Followup patches have been posted at https://inbox.sourceware.org/gcc-patches/20221101162637.14238-1-amona...@ispras.ru/

[Bug tree-optimization/107505] [13 Regression] ICE: verify_flow_info failed (error: returns_twice call is not first in basic block 2)

2022-11-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107505 Alexander Monakov changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug other/107621] spinx generated documents has too much white space on the top

2022-11-10 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107621 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/107647] [12/13 Regression] GCC 12.2.0 may produce FMAs even with -ffp-contract=off

2022-11-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107647 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/107647] [12/13 Regression] GCC 12.2.0 may produce FMAs even with -ffp-contract=off

2022-11-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107647 --- Comment #6 from Alexander Monakov --- Sure, but I was talking specifically about the pattern matching introduced by that commit.

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2022-11-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2022-11-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #15 from Alexander Monakov --- Ah, there will be an mfence after the vmovdqa when necessary for an atomic store, thanks (I missed that because the testcase doesn't scan for mfence).

[Bug target/107676] Nonsensical docs for -mrelax-cmpxchg-loop

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
||amonakov at gcc dot gnu.org Resolution|--- |FIXED --- Comment #8 from Alexander Monakov --- Fixed.

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #6 from Alexander Monakov --- With these patches on trunk, current situation is:
nm -CS -t d --defined-only gcc/insn-automata.o | sed 's/^[0-9]* 0*//' | sort -n | tail -40
2496 r slm_base
2527 r bdver3_load_min_issue_delay
2746 r glm

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #8 from Alexander Monakov --- (In reply to Jan Hubicka from comment #7) > > 53730 r btver2_fp_min_issue_delay > > 53760 r znver1_fp_transitions > > 93960 r bdver3_fp_transitions > > 106102 r lujiazui_core_check > > 106102 r lujiazui_c

[Bug tree-optimization/107715] TSVC s161 for double runs at zen4 30 times slower when vectorization is enabled

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 --- Comment #3 from Alexander Monakov --- There's a forward dependency over 'c' (read of c[i] vs. write of c[i+1] with 'i' iterating forward), and the vectorized variant takes the hit on each iteration. How is a slowdown even surprising? For th

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #10 from Alexander Monakov --- (In reply to Jan Hubicka from comment #9) > Actually for older cores I think the manufacturers do not care much. I > still have a working Bulldozer machine and I can do some testing. > I think in Buldoz

[Bug middle-end/107719] 14% regression on TSVC s3113 on znve4 compared to GCC 7.5

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107719 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/107647] [12/13 Regression] GCC 12.2.0 may produce FMAs even with -ffp-contract=off

2022-11-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107647 --- Comment #15 from Alexander Monakov --- I'm confused about the first hunk in the attached patch: --- a/gcc/tree-vect-slp-patterns.cc +++ b/gcc/tree-vect-slp-patterns.cc @@ -1035,8 +1035,10 @@ complex_mul_pattern::matches (complex_operation_t

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug middle-end/107879] [13 Regression] ffmpeg-4 test suite fails on FPU arithmetics

2022-11-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107879 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832 --- Comment #21 from Alexander Monakov --- (In reply to Michael_S from comment #19) > > Also note that 'vfnmadd231pd 32(%rdx,%rax), %ymm3, %ymm0' would be > > 'unlaminated' (turned to 2 uops before renaming), so selecting independent > > IVs for

[Bug rtl-optimization/107772] function prologue generated even though it's only needed in an unlikely path

2022-11-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107772 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2022-11-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #24 from Alexander Monakov --- (In reply to Peter Cordes from comment #23) > But at least on Linux, I don't think there's a way for user-space to even > ask for a page of WT or WP memory (or UC or WC). Only WB memory is easily > ava

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2022-11-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #26 from Alexander Monakov --- Sure, the right course of action seems to be to simply document that atomic types and built-ins are meant to be used on "common" (writeback) memory, and no guarantees can be given otherwise, because it

[Bug middle-end/107905] 2x slowdown versus CLANG and ICL

2022-11-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs
||amonakov at gcc dot gnu.org --- Comment #3 from Alexander Monakov --- LLVM does a better job at code layout, and massively wins on the amount of executed branches (in particular unconditional jumps). With -fdisable-rtl-bbro gcc achieves a similar performance.

[Bug driver/107787] -Werror=array-bounds=X does not work as expected

2022-11-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
||amonakov at gcc dot gnu.org Resolution|--- |FIXED --- Comment #3 from Alexander Monakov --- Fixed for gcc-13.

[Bug middle-end/107905] 2x slowdown versus CLANG and ICL

2022-11-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905 --- Comment #5 from Alexander Monakov --- Not sure what you don't like about the inputs, they appear quite reasonable. Perhaps GCC's estimation of bb frequencies is off (with profile feedback we achieve good performance). Georgi: you'll likely

[Bug middle-end/107905] 2x slowdown versus CLANG and ICL

2022-11-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905 --- Comment #6 from Alexander Monakov --- Let me add that Clang supports GCC's -fprofile-{generate,use} flags for compatibility as well.

[Bug tree-optimization/107879] [13 Regression] ffmpeg-4 test suite fails on FPU arithmetics

2022-12-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107879 --- Comment #10 from Alexander Monakov --- If anyone is confused like I was, the commit actually includes a testcase, but the addition is not mentioned in the Changelog. I was sure the server-side receive hook was supposed to reject such incompl

[Bug c/107971] linking an assembler object creates an executable stack

2022-12-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107971 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c++/108008] Compiler mis-optimization with posix_memalign

2022-12-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-12-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #11 from Alexander Monakov --- Factoring out Lujiazui divider shrinks its tables by almost 20x:
3 r lujiazui_decoder_min_issue_delay
20 r lujiazui_decoder_transitions
32 r lujiazui_agu_min_issue_delay
126 r lujiazui_agu_transitions
3

[Bug tree-optimization/108008] [12 Regression] wrong code with -O3 and posix_memalign

2022-12-08 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008 --- Comment #9 from Alexander Monakov --- I think this is tree-ldist placing memset(sameZ, 0, zPlaneCount) after the loop, overwriting conditional 'sameZ[i] = true' assignments that happen in the loop. For the smaller testcase from comment #6,

[Bug tree-optimization/108008] [12 Regression] wrong code with -O3 and posix_memalign

2022-12-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008 --- Comment #10 from Alexander Monakov --- Looks similar to PR 107323, but needs explicit -ftree-loop-distribution to trigger.

[Bug tree-optimization/108076] [10/11/12/13 Regression] GCC with -O3 produces code which fails to link

2022-12-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108076 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVA

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 --- Comment #9 from Alexander Monakov --- (In reply to Feng Xue from comment #8) > In another angle, because gcc already model control flow and SSA web for > setjmp/longjmp, explicit volatile specification is not really needed. That covers GIM

[Bug tree-optimization/108129] New: nop_atomic_bit_test_and_p is too bloated

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- match.pd has multi-pattern matcher 'nop_atomic_bit_test_and_p'. It expands to ~38 KLOC in gimple-match.cc and ~350 KB in the compiled binary. There h

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 --- Comment #12 from Alexander Monakov --- Shouldn't there be another bug for the sched1 issue specifically? In absence of abnormal control flow, extending lifetimes of pseudos across calls is still likely to be a pessimization.

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added Resolution|DUPLICATE |FIXED --- Comment #14 from Alexande

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added Resolution|FIXED |DUPLICATE --- Comment #15 from Alex

[Bug rtl-optimization/57067] Missing control flow edges for setjmp/longjmp

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57067 --- Comment #9 from Alexander Monakov --- *** Bug 108117 has been marked as a duplicate of this bug. ***

[Bug middle-end/108140] ICE expanding __rbit

2022-12-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108140 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291

2021-04-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
||amonakov at gcc dot gnu.org, ||zhroma at gcc dot gnu.org --- Comment #1 from Alexander Monakov --- Hi Martin, this is a modulo-scheduling bug; I think you added "Blocks: sel-sched" by mistake — removing, and Cc'ing Roma

[Bug tree-optimization/100363] gcc generating wider load/store than warranted at -O3

2021-05-01 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/93031] Wish: When the underlying ISA does not force pointer alignment, option to make GCC not assume it

2021-05-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93031 --- Comment #7 from Alexander Monakov --- In comment #2 I touched upon a potentially more practical way to offer -fno-strict-alignment: Run early work with ABI alignments: compute __alignof correctly, lay out composite types as required by ABI,

[Bug other/99903] 32-bit x86 frontends randomly crash while reporting timing on Windows

2021-05-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs
|UNCONFIRMED CC||amonakov at gcc dot gnu.org --- Comment #4 from Alexander Monakov --- 32-bit Linux should also be affected (perhaps with less probability if clock() is more precise). It is surprising we track time in a 'double'

[Bug c/100618] Add a -fno-semantic-interposition variant which allows variable interposition

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100618 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/100483] Extend -fno-semantic-interposition to global variables

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/100618] Add a -fno-semantic-interposition variant which allows variable interposition

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100618 --- Comment #3 from Alexander Monakov --- Furthermore, as discussed in bug 100483, this request appears to be based on a misunderstanding of what the 'semantic-' part of the option is about. It does not affect assembly/linker-level binding mechanism, so th

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #3 from Alexander Monakov --- I understand what you're saying, but it seems we're talking past each other. I agree that if a library is linked with any -Bsymbolic* flag, the main executable is at risk of broken address uniqueness un

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #5 from Alexander Monakov --- Hm, I still don't think I'm misunderstanding what you're saying. I'm familiar with the ELF standard (and FWIW I have read your blog posts on related matters). I am responding to this sentiment from the o

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-18 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #7 from Alexander Monakov --- Thanks. I agree that inferring address significance on the linker side is problematic. Thinking about your original request, I was about to say that it would be very reasonable to do under -fno-plt flag

[Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c

2021-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573 --- Comment #14 from Alexander Monakov --- I would break in gdb on cuModuleGetFunction and x/s $rdx to print the failing symbol (it's the third argument to the function). It seems the "inner" entrypoint (which your patch attempted to nullif

[Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c

2021-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573 --- Comment #17 from Alexander Monakov --- Yes, I'd agree normally it's present in the offload table, but ideally if you're trying to stub out the call, it should not be present in the offload table. I think Tobias is saying that on GIMPLE this

[Bug libgomp/100573] [OpenMP] 'omp target teams' fails with nvptx and GCN offloading: FAIL libgomp.c-c++-common/for-3.c + for-9.c

2021-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100573 --- Comment #19 from Alexander Monakov --- Ah, does the issue arise because foo._omp_fn.0 is (before the patch) callable in two contexts, in one it's called from host and should be 'omp target entrypoint', and in the other it's called from offlo

[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function

2021-05-27 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #10 from Alexander Monakov --- Is there something wrong or undesirable with making this under -fno-plt (or the noplt attribute as in your example)? (after all, it is a kind of PLT-avoidance transformation, just for addressing rather

[Bug target/105700] GCC miscompiles? wine when using -march=pentium-m

2022-05-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105700 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/105700] GCC miscompiles? wine when using -march=pentium-m

2022-05-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105700 --- Comment #5 from Alexander Monakov --- (In reply to Artem S. Tashkinov from comment #4) > > There should be a note in dmesg when a process segfaults outside of a > > debugger. If you run wine without gdb, and winedevice.exe crashes, is there

[Bug bootstrap/105688] Cannot build GCC 11.3 on Fedora 36

2022-05-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105688 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/105863] RFE: __attribute__((incbin("file"))) or __builtin_incbin("file")

2022-06-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105863 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/106019] New: Surprising SLP failure on trivial code

2022-06-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- In the following code, 'f' is not SLP-vectorized, but 'g' is. From a brief look at slp2 dump, looks like

[Bug target/106277] missed-optimization: redundant movzx

2022-07-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106277 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/101347] [11/12/13 Regression] ICE in cfg_layout_initialize with __builtin_setjmp and -fprofile-generate -fprofile-use

2022-07-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101347 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug lto/91299] LTO inlines a weak definition in presence of a non-weak definition from an ELF file

2022-07-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91299 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/101347] [11/12 Regression] ICE in cfg_layout_initialize with __builtin_setjmp and -fprofile-generate -fprofile-use

2022-07-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101347 Alexander Monakov changed: What|Removed |Added Summary|[11/12/13 Regression] ICE |[11/12 Regression] ICE in

[Bug middle-end/106421] New: ICE with computed goto from a nested functon

2022-07-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- int main(int argc, char **argv) { __label__ loop, end; void jmp(int c) { goto *(c ? &&loop : &&

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 --- Comment #4 from Alexander Monakov --- Regarding point 1 above, I should mention that Glibc headers mark both 'vfork' and 'raise' as leaf.

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 --- Comment #7 from Alexander Monakov --- I think item 2 from comment #3 (jump threading) still needs to be solved independently of what is decided about item 1 (leaf functions resuming earlier returns_twice call). --- The problem with 'leaf'

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 --- Comment #8 from Alexander Monakov --- I mean the minimized testcase, the original attachment does execve/_exit after vfork.

[Bug ipa/106437] New: Glibc marks functions that resume a returns_twice call as leaf

2022-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: ipa Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org CC: amonakov at gcc dot gnu.org, asolokha at gmx dot com, dcb314 at hotmail dot com, hubicka at

[Bug ipa/106437] Glibc marks functions that resume a returns_twice call as leaf

2022-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106437 --- Comment #1 from Alexander Monakov --- With the exception of '_exit', exit family of functions (exit, _Exit, quick_exit) are also marked leaf despite exit and quick_exit invoking atexit/on_exit/at_quick_exit handlers. Only _Exit is specified

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 --- Comment #10 from Alexander Monakov --- The leaf issue is now PR 106437.

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 --- Comment #11 from Alexander Monakov --- A cleaner testcase for jump threading (still ICEs despite presence of ABNORMAL_DISPATCHER): void vfork() __attribute__((__leaf__)); void semanage_reload_policy(char *arg, void cb(void)) { if (!arg) {

[Bug lto/91299] LTO inlines a weak definition in presence of a non-weak definition from an ELF file

2022-07-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91299 --- Comment #11 from Alexander Monakov --- Marxin, you've marked this as WAITING, can you please re-evaluate? The nice testcase from comment #2 is reproducible on trunk as well.

[Bug target/105135] [11/12/13 Regression] Optimization regression for handrolled branchless assignment since r11-4717-g3e190757fa332d32

2022-07-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105135 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug target/106453] New: Redundant zero extension after crc32q

2022-07-27 Thread amonakov at gcc dot gnu.org via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: --- On 64-bit x86, straightforward use of SSE 4.2 crc instruction looks like #include #include uint32_t f(uint32_t c, uint64_t *p, size_t n) { for (size_t i = 0; i < n

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 Alexander Monakov changed: What|Removed |Added CC||aldyh at gcc dot gnu.org --- Commen

[Bug target/106453] Redundant zero extension after crc32q

2022-07-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453 --- Comment #1 from Alexander Monakov --- Any idea if the following is reasonable? It compiles and achieves the desired result. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index bdde577dd..d82656678 100644 --- a/gcc/config/i3
