[PATCH] [i386] Fix typo in vec_setv8hi_0.

2022-03-27 Thread liuhongt via Gcc-patches
pinsrw is available for both reg and mem operands under sse2; pextrw requires sse4.1 for mem operands. The patch changes attr "isa" for the pinsrw mem alternative from sse4_noavx to noavx, which enables the optimization below. -movzwl (%rdi), %eax pxor %xmm1, %xmm1 -pinsrw $0,

Disable gathers on zen3 for vectors with few elements

2022-03-27 Thread Jan Hubicka via Gcc-patches
Hi, as seen on TSVC and Spec2017, the Zen3 gather instruction is a win only for vectors with 8 elements. At the time I was implementing the tuning, the vectorizer did not know how to open-code gather, and thus it was still a win to enable it for shorter vectors, but this has changed. The following are
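
For readers unfamiliar with the term, "open-coding" a gather means emitting one scalar load per lane instead of a single hardware gather instruction. The C++ sketch below shows the indexed-load loop in question next to a hand-unrolled 4-lane version. The function names `gather_indexed` and `gather4_open_coded` are hypothetical and chosen for illustration; this is not the code GCC generates, only the shape of the two alternatives.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration: an indexed-load loop. A vectorizer may implement
// it with a hardware gather instruction or with per-lane scalar loads.
void gather_indexed(const float* base, const int* idx, float* out,
                    std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        assert(idx[i] >= 0);      // indices are assumed to be non-negative
        out[i] = base[idx[i]];    // one indexed load per element
    }
}

// Hypothetical open-coded 4-lane variant: four independent scalar loads,
// followed by four stores that stand in for inserting each lane into a vector.
void gather4_open_coded(const float* base, const int idx[4], float out[4]) {
    const float lane0 = base[idx[0]];
    const float lane1 = base[idx[1]];
    const float lane2 = base[idx[2]];
    const float lane3 = base[idx[3]];
    out[0] = lane0;
    out[1] = lane1;
    out[2] = lane2;
    out[3] = lane3;
}
```

Both functions compute the same result; which form is faster for short vectors is exactly the tuning question the message raises.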

try multi dest registers in default_zero_call_used_regs

2022-03-27 Thread Alexandre Oliva via Gcc-patches
When the mode of regno_reg_rtx is not hard_regno_mode_ok for the target, try grouping the register with subsequent ones. This enables s16 to s31 and their hidden pairs to be zeroed with the default logic on some arm variants. Regstrapped on x86_64-linux-gnu, also tested on an affected arm

Re: [PATCH v3] configure: Implement --enable-host-pie

2022-03-27 Thread Alexandre Oliva via Gcc-patches
Hello, Marek, The patch looks good to me, and I'd have no trouble approving it if we were in stage1. Since we aren't, I'd prefer if we waited for another build system maintainer to give it a look, if it's to go in gcc-12. If release managers feel I'm being overcautious, I don't mind if they

[PATCH] PR fortran/50549 - should detect different type parameters in structure constructors

2022-03-27 Thread Harald Anlauf via Gcc-patches
Dear all, when assigning character pointers, we have a check for same length, which however does not trigger for character pointers within a structure constructor. The attached patch extends the character checks slightly to fix this loophole. I've verified that NAG and Crayftn behave similarly,

Re: [PATCH] x86: Use Yw constraint on *ssse3_pshufbv8qi3

2022-03-27 Thread Uros Bizjak via Gcc-patches
On Sun, Mar 27, 2022 at 8:14 PM H.J. Lu wrote: > > Since AVX512VL and AVX512BW are required for AVX512 VPSHUFB, replace the > "Yv" register constraint with the "Yw" register constraint. This is an obvious fix, as said in https://gcc.gnu.org/gitwrite.html : Obvious fixes can be committed without

[PATCH] x86: Use Yw constraint on *ssse3_pshufbv8qi3

2022-03-27 Thread H.J. Lu via Gcc-patches
Since AVX512VL and AVX512BW are required for AVX512 VPSHUFB, replace the "Yv" register constraint with the "Yw" register constraint. gcc/ PR target/105068 * config/i386/sse.md (*ssse3_pshufbv8qi3): Replace "Yv" with "Yw". gcc/testsuite/ PR target/105068

[committed] libstdc++: Define std::expected for C++23 (P0323R12)

2022-03-27 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk. It's late in stage 4 to be adding new features, but this is C++23-only so entirely experimental, and not defined for the std-gnu++17 default (or even -std=gnu++20). It does mean that the old (very old) std::unexpected function is no longer available in

Re: [PATCH] rs6000: Support UN[GL][ET] in rs6000_maybe_emit_maxc_minc [PR105002]

2022-03-27 Thread Segher Boessenkool
Hi! On Thu, Mar 24, 2022 at 10:00:43AM +0800, Kewen.Lin wrote: > Commit r12-7687 exposed one missed optimization chance in function > rs6000_maybe_emit_maxc_minc, for now it only considers comparison > codes GE/GT/LE/LT, but it can support more variants with codes > UNLT/UNLE/UNGT/UNGE by reversing
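
The quoted text is cut off after "by reversing". As a sketch of the general identity, and assuming that "reversing" means inverting the comparison code and swapping the two arms of the select, the C++ below shows that a select on UNGE equals a select on LT with its arms exchanged, including when an operand is NaN. The helper names `unge`, `select_unge`, `select_lt_swapped`, and `check_equivalence` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cmath>

// UNGE: true when a >= b, or when a and b are unordered (either is NaN).
bool unge(double a, double b) {
    return std::isunordered(a, b) || a >= b;
}

// Select written directly with the unordered comparison code.
double select_unge(double a, double b, double x, double y) {
    return unge(a, b) ? x : y;
}

// The same select written with the reversed ordered code LT and swapped arms.
// (a < b) is false for NaN operands, so the NaN case still yields x.
double select_lt_swapped(double a, double b, double x, double y) {
    return (a < b) ? y : x;
}

// Assumed-equivalence check for one pair of operands.
void check_equivalence(double a, double b) {
    assert(select_unge(a, b, 1.0, 2.0) == select_lt_swapped(a, b, 1.0, 2.0));
}
```

The same argument pairs UNGT with LE, UNLE with GT, and UNLT with GE, since each unordered code is the exact logical negation of the paired ordered one.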

Re: [PATCH v3] c++: warn on undefined casts from Base to Derived ref/ptr [PR96765]

2022-03-27 Thread Zhao Wei Liew via Gcc-patches
On Fri, 25 Mar 2022 at 05:58, Jason Merrill wrote: > > > >>> + if (current_function_decl > >>> + && (DECL_CONSTRUCTOR_P (current_function_decl) > >>> + || DECL_DESTRUCTOR_P (current_function_decl)) > >>> + && TREE_CODE (expr) == NOP_EXPR > >>> + &&

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-27 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 25, 2022 at 3:08 AM MayShao wrote: > > Hi Uros, > > This patch fixes the Zhaoxin CPU Vendor ID detection problem > and adds Zhaoxin "lujiazui" processor support and tuning. > > Currently gcc can't recognize Zhaoxin CPU (Vendor ID "CentaurHauls" and > "Shanghai") > and wrongly identify