[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

liuhongt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from liuhongt at gcc dot gnu.org ---
.
[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #6 from CVS Commits ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:c3f1768b21e9d994c4f090405e863feb06a54002

commit r14-2596-gc3f1768b21e9d994c4f090405e863feb06a54002
Author: liuhongt
Date:   Mon Jul 17 12:50:17 2023 +0800

    Remove # from one_cmpl2 assemble output.

    optimize_insn_for_speed () in assemble output is not aligned with
    splitter condition, and it causes an ICE when building SPEC2017
    blender_r.

    libpng/pngread.c: In function ‘png_read_image’:
    libpng/pngread.c:786:1: internal compiler error: in final_scan_insn_1, at final.cc:2813
      786 | }
          | ^
    0x73ac3d final_scan_insn_1
            ../../gcc/final.cc:2813
    0xb3420b final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
            ../../gcc/final.cc:2887
    0xb344c4 final_1
            ../../gcc/final.cc:1979
    0xb34f64 rest_of_handle_final
            ../../gcc/final.cc:4240
    0xb34f64 execute
            ../../gcc/final.cc:4318

    gcc/ChangeLog:

            PR target/110438
            * config/i386/sse.md (one_cmpl2): Remove # from assemble output.
[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #5 from Hongtao.liu ---
Should be fixed in GCC14.
[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #4 from CVS Commits ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:13c556d6ae84be3ee2bc245a56eafa58221de86a

commit r14-2447-g13c556d6ae84be3ee2bc245a56eafa58221de86a
Author: liuhongt
Date:   Thu Jun 29 14:25:28 2023 +0800

    Break false dependence for vpternlog by inserting vpxor or setting
    constraint of input operand to '0'

    False dependency happens when destination is only updated by
    pternlog. There is no false dependency when destination is also used
    in source. So either a pxor should be inserted, or input operand
    should be set with constraint '0'.

    gcc/ChangeLog:

            PR target/110438
            PR target/110202
            * config/i386/predicates.md
            (int_float_vector_all_ones_operand): New predicate.
            * config/i386/sse.md (*vmov_constm1_pternlog_false_dep): New
            define_insn.
            (*_cvtmask2_pternlog_false_dep): Ditto.
            (*_cvtmask2_pternlog_false_dep): Ditto.
            (*_cvtmask2): Adjust to define_insn_and_split to avoid false
            dependence.
            (*_cvtmask2): Ditto.
            (one_cmpl2): Adjust constraint of operands 1 to '0' to avoid
            false dependence.
            (*andnot3): Ditto.
            (iornot3): Ditto.
            (*3): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr110438.c: New test.
            * gcc.target/i386/pr100711-6.c: Adjust testcase.
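
To illustrate why the dependence described in the commit message is "false", here is a minimal sketch in plain C that models one 32-bit lane of the ternary-logic operation. The function names (ternlog32, ternlog_self_check) are hypothetical and are not part of GCC or its testsuite; the model assumes the usual truth-table definition, in which the old destination bit is the high bit of the table index. With immediate 0xFF the result is all-ones whatever the destination held before, so the hardware's read of the destination carries no information. With immediate 0x55 the result is the complement of the last source only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit lane of a three-input ternary-logic
   operation.  For every bit position the three input bits form an index
   (dst is the high bit, src2 the low bit) that selects one bit of the
   8-bit immediate truth table. */
uint32_t ternlog32(uint32_t dst, uint32_t src1, uint32_t src2, uint8_t imm8)
{
    uint32_t result = 0;
    for (int bit = 0; bit < 32; bit++) {
        unsigned idx = (((dst  >> bit) & 1u) << 2)
                     | (((src1 >> bit) & 1u) << 1)
                     |  ((src2 >> bit) & 1u);
        result |= (uint32_t)((imm8 >> idx) & 1u) << bit;
    }
    return result;
}

/* Sanity checks for the two immediates discussed above. */
void ternlog_self_check(void)
{
    /* 0xFF: every truth-table entry is 1, so the old destination value
       (first argument) cannot influence the result. */
    assert(ternlog32(0x00000000u, 0u, 0u, 0xFF) == 0xFFFFFFFFu);
    assert(ternlog32(0xDEADBEEFu, 0x12345678u, 0x0F0F0F0Fu, 0xFF) == 0xFFFFFFFFu);

    /* 0x55: entries with src2 == 0 are 1, i.e. result = ~src2. */
    assert(ternlog32(0xDEADBEEFu, 0x12345678u, 0x0F0F0F0Fu, 0x55) == 0xF0F0F0F0u);
}
```

Under this model, the two remedies named in the commit message correspond to either giving the destination a known value first (the pxor) or making the destination the same register as a real input (constraint '0'), so that the read of the destination is a true dependence rather than a false one.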
[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #3 from Alexander Monakov ---
Patch available:
https://inbox.sourceware.org/gcc-patches/8f73371d732237ed54ede44b7bd88...@ispras.ru/T/#u
[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #2 from Hongtao.liu ---
(In reply to Alexander Monakov from comment #1)
> We might want to omit PXOR when optimizing for size.

indeed.
[Bug target/110438] generating all-ones zmm needs dep-breaking pxor before ternlog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438

--- Comment #1 from Alexander Monakov ---
We might want to omit PXOR when optimizing for size.