[PATCH 2/2]AArch64 aarch64: Add implementation for pow2 bitmask division.

2022-06-08 Thread Tamar Christina via Gcc-patches
Hi All, This adds an implementation for the new optab for unsigned pow2 bitmask for AArch64. The implementation rewrites: x = y / (2 ^ (sizeof (y)/2) - 1) into e.g. (for bytes) (x + ((x + 257) >> 8)) >> 8 where it's required that the additions be done in double the precision of x such
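
The rewrite quoted above can be checked with a small scalar sketch. This is not the AArch64 implementation from the patch; it only assumes a 16-bit input divided by 255, with the additions widened to 32 bits so they cannot wrap. The names div255_u16 and div255_self_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Divide a 16-bit value by 255 using shifts and adds only.
   The sums are formed in 32 bits (double the input precision),
   which is the requirement the snippet mentions. */
static uint16_t div255_u16(uint16_t x)
{
    uint32_t wide = x;
    return (uint16_t)((wide + ((wide + 257u) >> 8)) >> 8);
}

/* Exhaustively compare the rewrite against plain division
   for every 16-bit input. */
static void div255_self_check(void)
{
    for (uint32_t x = 0; x <= UINT16_MAX; x++)
        assert(div255_u16((uint16_t)x) == x / 255u);
}
```

Calling div255_self_check() walks all 65536 inputs; if the additions were done in 16 bits instead, inputs near the top of the range would wrap and the check would fail.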

[PATCH 1/2]middle-end Support optimized division by pow2 bitmask

2022-06-08 Thread Tamar Christina via Gcc-patches
Hi All, In plenty of image and video processing code it's common to modify pixel values by a widening operation and then scale them back into range by dividing by 255. This patch adds an optab to allow us to emit an optimized sequence when doing an unsigned division that is equivalent to: x

Re: [PATCH v4, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-06-08 Thread HAO CHEN GUI via Gcc-patches
Hi, On 8/6/2022 9:24 PM, Segher Boessenkool wrote: > But it regresses the code quality generated with -ffast-math (because > the new unspecs aren't optimised like standard rtl is). This can be > follow-up work of course -- and the best direction is to make fmin/fmax > generic, even! :-)

Re: [PATCH] RISC-V: Compute default ABI from -mcpu or -march

2022-06-08 Thread Andrew Pinski via Gcc-patches
On Mon, Jun 6, 2022 at 7:53 PM wangpc via Gcc-patches wrote: > > If -mcpu or -march is specified and there is no -mabi, we will calculate > default ABI from arch string provided by -march or defined in CPU info. This is 100% wrong and goes against what all other targets do. All other targets

[pushed] c++: non-templated friends [PR105852]

2022-06-08 Thread Jason Merrill via Gcc-patches
The previous patch for 105852 avoids copying DECL_TEMPLATE_INFO from a non-templated friend, but it really shouldn't have it in the first place. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/105852 gcc/cp/ChangeLog: * decl.cc (duplicate_decls): Change non-templated

[pushed] c++: redeclared hidden friend take 2 [PR105852]

2022-06-08 Thread Jason Merrill via Gcc-patches
My previous patch for 105761 avoided copying DECL_TEMPLATE_INFO from a friend to a later definition, but in this testcase we have first a non-friend declaration and then a definition, and we need to avoid copying in that case as well. But we do still want to set new_template_info to avoid GC

[PATCH] c++: optimize specialization of nested class templates

2022-06-08 Thread Patrick Palka via Gcc-patches
When substituting a class template specialization, tsubst_aggr_type substitutes the TYPE_CONTEXT before passing it to lookup_template_class. This appears to be unnecessary, however, because the initial value of lookup_template_class's context parameter is unused outside of the IDENTIFIER_NODE

[COMMITTED] gcc: xtensa: fix PR target/105879

2022-06-08 Thread Max Filippov via Gcc-patches
split_double operates with the 'word that comes first in memory in the target' terminology, while gen_lowpart operates with the 'value representing some low-order bits of X' terminology. They are not equivalent and must be dealt with differently on little- and big-endian targets. gcc/ PR
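
The distinction drawn above can be illustrated on the host with plain C. This is only an analogy for the two notions of "word", not GCC's RTL code: first_word_in_memory mimics "the word that comes first in memory" and low_order_word mimics "the low-order bits of X". All function names here are hypothetical, and mixed-endian hosts are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* The 32-bit word stored at the lowest address of a 64-bit value. */
static uint32_t first_word_in_memory(uint64_t value)
{
    uint32_t word;
    memcpy(&word, &value, sizeof word);
    return word;
}

/* The low-order 32 bits, independent of memory layout. */
static uint32_t low_order_word(uint64_t value)
{
    return (uint32_t)(value & 0xffffffffu);
}

/* The two views agree only on little-endian hosts. */
static void check_word_views(uint64_t value)
{
    int same = first_word_in_memory(value) == low_order_word(value);
    assert(same == host_is_little_endian());
}
```

The check holds for any value whose two halves differ, for example 0x1122334455667788.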

c++: Reimplement static init/fini generation

2022-06-08 Thread Nathan Sidwell
Currently we generate static init/fini code by generating a set of functions taking an 'initp' bool and an unsigned priority. (There can be more than one, as we repeat the end-of-compile loop.) We then generate a set of real init or fini functions for each needed priority, calling the previous

Re: [PATCH v2 01/11] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)

2022-06-08 Thread Julian Brown
Hi Jakub, Thanks for review! On Tue, 24 May 2022 15:03:07 +0200 Jakub Jelinek via Fortran wrote: > On Fri, Mar 18, 2022 at 09:24:51AM -0700, Julian Brown wrote: > > 2021-11-23 Julian Brown > > > > gcc/ > > * gimplify.c (is_or_contains_p, > > omp_target_reorder_clauses): Delete

Re: [PATCH 1/3] Disable generating store vector pair.

2022-06-08 Thread Peter Bergner via Gcc-patches
On 6/7/22 10:16 PM, Michael Meissner wrote: > Otherwise it is like the mess with -mpower8-fusion, where going from power8 to > power9 we have to clear the fusion flag. If store vector pair is a positive > flag, then it isn't set in power10 flags, but it might be set in next cpu > flags. But if it

[PATCH 2/2][AArch32] Fix 128-bit sequential consistency atomic operations.

2022-06-08 Thread Tamar Christina via Gcc-patches
Hi All, Similar to AArch64 the Arm implementation of 128-bit atomics is broken. For 128-bit atomics we rely on pthread barriers to correctly guard the address in the pointer to get correct memory ordering. However for 128-bit atomics the address under the lock is different from the original

[PATCH 1/2]AArch64 Fix 128-bit sequential consistency atomic operations.

2022-06-08 Thread Tamar Christina via Gcc-patches
Hi All, The AArch64 implementation of 128-bit atomics is broken. For 128-bit atomics we rely on pthread barriers to correctly guard the address in the pointer to get correct memory ordering. However for 128-bit atomics the address under the lock is different from the original pointer. This means

Re: [PATCH 1/3] Disable generating store vector pair.

2022-06-08 Thread will schmidt via Gcc-patches
On Tue, 2022-06-07 at 23:16 -0400, Michael Meissner wrote: > On Tue, Jun 07, 2022 at 07:59:34PM -0500, Peter Bergner wrote: > > On 6/7/22 4:24 PM, Segher Boessenkool wrote: > > > On Tue, Jun 07, 2022 at 04:17:04PM -0500, Peter Bergner wrote: > > > > I think I mentioned this offline, but I'd prefer

Re: [PATCH]AArch64 relax predicate on load structure load instructions

2022-06-08 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Wednesday, June 8, 2022 11:31 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH]AArch64 relax predicate on

Re: [PATCH] PR middle-end/105874: Use EXPAND_MEMORY to fix ada bootstrap.

2022-06-08 Thread Eric Botcazou via Gcc-patches
> The fix is to ensure that we call expand_expr with EXPAND_MEMORY > when processing the VAR_DECL's returned by get_inner_reference. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check (with no new failures), but also with > --enable-languages="ada" where

RE: [PATCH]AArch64 relax predicate on load structure load instructions

2022-06-08 Thread Tamar Christina via Gcc-patches
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, June 8, 2022 11:31 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: Re: [PATCH]AArch64 relax predicate on load structure load > instructions

Re: [PATCH v4, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-06-08 Thread Segher Boessenkool
On Wed, Jun 08, 2022 at 11:28:11AM +0800, HAO CHEN GUI wrote: > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. But it regresses the code quality generated with -ffast-math

Re: aarch64: Fix bitfield alignment in param passing [PR105549]

2022-06-08 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On 6/7/22 19:44, Richard Sandiford wrote: >> Christophe Lyon via Gcc-patches writes: >>> While working on enabling DFP for AArch64, I noticed new failures in >>> gcc.dg/compat/struct-layout-1.exp (t028) which were not actually >>> caused by DFP types handling. These

[PATCH] PR middle-end/105874: Use EXPAND_MEMORY to fix ada bootstrap.

2022-06-08 Thread Roger Sayle
Many thanks to Tamar Christina for filing PR middle-end/105874 indicating that SPECcpu 2017's Leela is failing on x86_64 due to a miscompilation of FastBoard::is_eye. This function is much smaller and easier to work with than my previous hunt for the cause of the Ada bootstrap failures due to

Re: GCC Rust git branch

2022-06-08 Thread Thomas Schwinge
Hi! This is about GCC/Rust, now also having a presence in GCC upstream Git sources; see also "GCC Git Branch". On 2021-05-24T16:24:38+, Joseph Myers wrote: > On Mon, 24 May 2021, Philip Herron wrote: > >> remote:

Re: aarch64: Fix bitfield alignment in param passing [PR105549]

2022-06-08 Thread Christophe Lyon via Gcc-patches
On 6/7/22 19:44, Richard Sandiford wrote: Christophe Lyon via Gcc-patches writes: While working on enabling DFP for AArch64, I noticed new failures in gcc.dg/compat/struct-layout-1.exp (t028) which were not actually caused by DFP types handling. These tests are generated during 'make check'

[PATCH] Fix PR target/104871 (macosx-version-min wrong for macOS >= Big Sur (darwin20))

2022-06-08 Thread Simon Wright
(resent with commit message format update) This is the same sort of problem as in PR80204: at present, GCC 11 & 12 assume that if the OS version is >= 20, the compiler should see -mmacosx-version-min={major - 9}.{minor - 1}.0, e.g. for OS version 21.3.0 that would be 12.2.0 (the linker sees
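
The mapping described above, i.e. the behaviour the post reports as current and not the proposed fix, can be sketched as follows. The function name current_version_min is hypothetical and this is not GCC's driver code; clamping a minor version of 0 to 0 is an assumption made here, since the snippet does not say what happens in that case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the version-min string the snippet describes for
   Darwin OS versions >= 20: {major - 9}.{minor - 1}.0.
   Returns 0 on success, -1 if major < 20 or the buffer is too small. */
static int current_version_min(int major, int minor,
                               char *out, size_t out_size)
{
    if (major < 20)
        return -1;
    /* Assumption: a minor of 0 is clamped rather than going negative. */
    int mapped_minor = minor > 0 ? minor - 1 : 0;
    int n = snprintf(out, out_size, "%d.%d.0", major - 9, mapped_minor);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For OS version 21.3.0 this yields the string 12.2.0, matching the example in the post.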

Document mailing list (was: GCC Rust git branch)

2022-06-08 Thread Thomas Schwinge
Hi! On 2021-05-28T11:19:16+0100, Philip Herron wrote: > On 28/05/2021 04:22, Jason Merrill wrote: >> On Mon, May 24, 2021 at 9:25 AM Philip Herron > > wrote: >>> As some of you might know, I have been working on GCC Rust over on >>> GitHub

Re: [PATCH]AArch64 relax predicate on load structure load instructions

2022-06-08 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > At some point in time we started lowering the ld1r instructions in gimple. > > That is: > > uint8x8_t f1(const uint8_t *in) { > return vld1_dup_u8(&in[1]); > } > > generates at gimple: > > _3 = MEM[(const uint8_t *)in_1(D) + 1B]; > _4 = {_3, _3, _3, _3,

Re: [PATCH] RISC-V: Compute default ABI from -mcpu or -march

2022-06-08 Thread Kito Cheng via Gcc-patches
I also prefer adding a -mabi=auto option rather than changing existing behavior. On Wed, Jun 8, 2022 at 5:06 PM pc.wang via Gcc-patches wrote: > > Thanks for your opinion! I did these just because LLVM has already done the > same thing and I wanted to make GCC behave the same as LLVM. The

[PATCH]AArch64 relax predicate on load structure load instructions

2022-06-08 Thread Tamar Christina via Gcc-patches
Hi All, At some point in time we started lowering the ld1r instructions in gimple. That is: uint8x8_t f1(const uint8_t *in) { return vld1_dup_u8(&in[1]); } generates at gimple: _3 = MEM[(const uint8_t *)in_1(D) + 1B]; _4 = {_3, _3, _3, _3, _3, _3, _3, _3}; Which is good, but we then
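
The gimple shown above (one scalar load, then an 8-lane vector built from that scalar) can be modelled in portable C without arm_neon.h. This is a scalar sketch for illustration only; u8x8_model, dup_u8_model and f1_model are hypothetical stand-ins for uint8x8_t, vld1_dup_u8 and f1.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for an 8-lane vector of bytes. */
typedef struct { uint8_t lane[8]; } u8x8_model;

/* Model of a load-and-duplicate: read one byte, copy it to all lanes. */
static u8x8_model dup_u8_model(const uint8_t *ptr)
{
    u8x8_model v;
    const uint8_t scalar = *ptr;      /* _3 = MEM[ptr] */
    for (int i = 0; i < 8; i++)
        v.lane[i] = scalar;           /* _4 = {_3, _3, ..., _3} */
    return v;
}

/* Model of f1 from the post: duplicate the byte at in[1]. */
static u8x8_model f1_model(const uint8_t *in)
{
    return dup_u8_model(&in[1]);
}

/* Every lane must equal the byte at offset 1. */
static void check_f1_model(const uint8_t *in)
{
    u8x8_model v = f1_model(in);
    for (int i = 0; i < 8; i++)
        assert(v.lane[i] == in[1]);
}
```

With an input array of {10, 20, 30}, every lane of the result holds 20.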

[PATCH] RISC-V/testsuite: Fix pr105666.c under rv32

2022-06-08 Thread jiawei
From: Jia-wei Chen In the rv32 regression test, this case reports an error: "cc1: error: ABI requires '-march=rv32'" Adding the '-mabi' option fixes this. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr105666.c: New options. --- gcc/testsuite/gcc.target/riscv/pr105666.c | 2 +- 1 file

[Committed] Add -mno-avx2 to recent gcc.target/i386/xop-vpcmov3.c

2022-06-08 Thread Roger Sayle
Adding -march=cascadelake to the command line options of the recently added xop-vpcmov3.c test case causes problems as GCC then prefers to use AVX512's vpternlogd instruction, instead of the XOP vpcmov that the test is checking for. This is easily solved by adding an explicit -mno-avx512vl to

Re: [PATCH] RISC-V: Compute default ABI from -mcpu or -march

2022-06-08 Thread pc.wang via Gcc-patches
Thanks for your opinion! I did these just because LLVM has already done the same thing and I wanted to make GCC behave the same as LLVM. The only difference is that LLVM has no handling for ilp32f and lp64f and I have sent a patch to do it (see https://reviews.llvm.org/D125947). As for

Re: [PATCH 1/3][ARM] STAR-MC1 CPU Support - arm: Add star-mc1 core

2022-06-08 Thread Chung-Ju Wu via Gcc-patches
Hi Kyrylo, On 2022/06/06 22:10 UTC+8, Kyrylo Tkachov wrote: Successfully bootstrapped and tested on arm-none-eabi. Is it OK for trunk? This is okay (together with the documentation additions in 3/3) Thanks for the patch, Thanks for the approval. The patches 1/3 and 3/3 have been merged

Re: [PATCH 2/3][ARM] STAR-MC1 CPU Support - arm: Add individual star-mc1 cost tables and cost functions

2022-06-08 Thread Chung-Ju Wu via Gcc-patches
Hi Kyrylo, On 2022/06/06 22:18 UTC+8, Kyrylo Tkachov wrote: I'd rather not duplicate those structures and functions in the master branch, as they provide a maintenance burden to the community. If some tuning parameters need to be modified in the future for better performance we can create

Re: [Patch] OpenMP: Fortran - fix ancestor's requires reverse_offload check

2022-06-08 Thread Jakub Jelinek via Gcc-patches
On Wed, Jun 08, 2022 at 09:54:07AM +0200, Tobias Burnus wrote: > The OpenMP requires directive may only be placed in the specification part of > a program unit (except it happens via the USE of a module). > > But the target directive ancestor-requires-'reverse_offload' only checked > the current

Re: [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453]

2022-06-08 Thread HAO CHEN GUI via Gcc-patches
Hi, On 7/6/2022 11:59 PM, Segher Boessenkool wrote: >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c >> @@ -0,0 +1,14 @@ >> +/* { dg-do compile { target lp64 } } */ >> +/* { dg-options "-mdejagnu-cpu=power6 -O2" } */ > It doesn't require -m64, only -mpowerpc64. You can use

[Patch] OpenMP: Fortran - fix ancestor's requires reverse_offload check

2022-06-08 Thread Tobias Burnus
The OpenMP requires directive may only be placed in the specification part of a program unit (except it happens via the USE of a module). But the target directive ancestor-requires-'reverse_offload' only checked the current namespace. OK for mainline? Tobias - Siemens

Re: [PATCH v4, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-06-08 Thread Kewen.Lin via Gcc-patches
on 2022/6/8 11:28, HAO CHEN GUI wrote: > Hi, > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. > > This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead > of

Re: [PATCH v2] libgccjit: allow common objects in $(EXTRA_GCC_OBJS) and $(EXTRA_OBJS)

2022-06-08 Thread Xi Ruoyao via Gcc-patches
On Mon, 2022-06-06 at 18:33 -0400, David Malcolm wrote: > > On Thu, 2022-05-19 at 16:10 +0800, Yang Yujie wrote: > > > This patch does not affect any other target architecture than > > > loongarch, > > > and has been bootstrapped and regression-tested on loongarch64- > > > linux- > > > gnuf64 > >

[PATCH] c++: Fix up ICE on __builtin_shufflevector constexpr evaluation [PR105871]

2022-06-08 Thread Jakub Jelinek via Gcc-patches
Hi! As the following testcase shows, BIT_FIELD_REF result doesn't have to have just integral type, it can also have vector type. And in that case cxx_eval_bit_field_ref just ICEs on it because it is unprepared for that case, creates the initial value with build_int_cst (sure, that one could be