Re: [PATCH] i386: Avoid mutual recursion between two peephole2s [PR99600]

2021-03-16 Thread Uros Bizjak via Gcc-patches
On Tue, Mar 16, 2021 at 11:10 AM Jakub Jelinek wrote: > > Hi! > > As the testcase shows, the compiler hangs and eats all memory when compiling > it. This is because in r11-7274-gdecd8fb0128870d0d768ba53dae626913d6d9c54 > I have changed the ix86_avoid_lea_for_addr splitting from a splitter > into

Re: [PATCH] i386: Fix up _mm256_vzeroupper() handling [PR99563]

2021-03-16 Thread Uros Bizjak via Gcc-patches
On Tue, Mar 16, 2021 at 10:51 AM Jakub Jelinek wrote: > > Hi! > > My r10-6451-gb7b3378f91c0641f2ef4d88db22af62a571c9359 fix for > vzeroupper vs. ms ABI apparently broke the explicit vzeroupper handling > when the implicit vzeroupper handling is disabled. > The epilogue_completed splitter for vzero

Re: [PATCH v3] x86: Update 'P' operand modifier for -fno-plt

2021-03-14 Thread Uros Bizjak via Gcc-patches
On Sun, Mar 14, 2021 at 8:14 PM H.J. Lu wrote: > > > Done. Here is the updated patch. Tested on Linux/x86-64. OK for master? > > > > I don't understand the purpose of the current_output_insn check and I > > don't know if the usage of current_output_insn is correct. The > > comments are not help

Re: [PATCH v3] x86: Update 'P' operand modifier for -fno-plt

2021-03-14 Thread Uros Bizjak via Gcc-patches
On Sat, Mar 13, 2021 at 8:39 PM H.J. Lu wrote: > > On Fri, Mar 12, 2021 at 8:37 AM Uros Bizjak wrote: > > > > On Fri, Mar 12, 2021 at 2:20 PM H.J. Lu wrote: > > > > > > On Thu, Mar 11, 2021 at 11:21 PM Uros Bizjak wrote: > > > > > >

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-14 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 11:56 PM Jakub Jelinek wrote: > > On Fri, Mar 12, 2021 at 07:52:16PM +0100, Uros Bizjak via Gcc-patches wrote: > > > I can test it on avx512{bw,vl,dq} hw tonight if you want. > > > > I'm testing the patch on avx2 hw, which is not represen

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 19:19, Jakub Jelinek wrote: > > On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote: > > It is hidden in *vec_extractv4si pattern: > > > > (define_insn "*vec_extractv4si" > > - [(set (match_operand:SI 0

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 6:32 PM Jakub Jelinek wrote: > > On Fri, Mar 12, 2021 at 06:05:34PM +0100, Uros Bizjak wrote: > > Untested patch that introduces YW to some remaining pextr > > instructions, fixes one case of 128bit vpsrldq and 128bit vpalignr w/o > > AVX512VL. >

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 5:11 PM Uros Bizjak wrote: > > On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek wrote: > > > > On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote: > > > > (define_insn "*avx2_pmaddwd" > > > > -

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek wrote: > > On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote: > > > (define_insn "*avx2_pmaddwd" > > > - [(set (match_operand:V8SI 0 "register_operand" "=x,v") > > > + [(set

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 2:38 PM Jakub Jelinek wrote: > > On Fri, Mar 12, 2021 at 09:35:00AM +0100, Uros Bizjak via Gcc-patches wrote: > > Perhaps we can introduce another Y... constraint for AVX512BW and use > > it here. I think they will be used in other places, too. > >

Re: [PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

2021-03-12 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 12, 2021 at 8:59 AM Jakub Jelinek wrote: > > Hi! > > This is the final patch of the series started with > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566139.html > and continued with > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566356.html > This time, I went through

Re: [PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-11 Thread Uros Bizjak via Gcc-patches
On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu wrote: > > Update 'P' operand modifier for -fno-plt to support inline assembly > statements. In 64-bit, we can always load function address with > @GOTPCREL. In 32-bit, we load function address with @GOT only for > non-PIC since PIC register may not be av

Re: [PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-11 Thread Uros Bizjak via Gcc-patches
On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu wrote: > > Update 'P' operand modifier for -fno-plt to support inline assembly > statements. In 64-bit, we can always load function address with > @GOTPCREL. In 32-bit, we load function address with @GOT only for > non-PIC since PIC register may not be av

Re: [PATCH] i386: Properly set ix86_isa_flags

2021-03-08 Thread Uros Bizjak via Gcc-patches
On Mon, Mar 8, 2021 at 2:52 PM Martin Liška wrote: > > Hi. > > The bug is about usage of ix86_isa_flags instead of opts->x_ix86_isa_flags. > I'm planning a bigger refactoring regarding the flags & PTA_PCLMUL conditions. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > >

Re: [PATCH] i386: Enable UINTR and HRESET for -march that supports it

2021-03-08 Thread Uros Bizjak via Gcc-patches
On Mon, Mar 8, 2021 at 1:45 PM Martin Liška wrote: > > Hello. > > The patch fixes missing features for -march targets that support > PTA_UINTR and PTA_HRESET. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog:

Re: [PATCH] i386, v2: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]

2021-03-07 Thread Uros Bizjak via Gcc-patches
On Sun, Mar 7, 2021 at 9:41 AM Jakub Jelinek wrote: > > On Sat, Mar 06, 2021 at 01:39:23PM +0100, Uros Bizjak via Gcc-patches wrote: > > > One possibility would be to change the meaning of Yw, because it > > > is an internal undocumented constraint and all uses in GCC

Re: [PATCH] i386: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]

2021-03-06 Thread Uros Bizjak via Gcc-patches
On Sat, Mar 6, 2021 at 11:34 AM Jakub Jelinek wrote: > > On Sat, Mar 06, 2021 at 11:19:15AM +0100, Uros Bizjak wrote: > > > We already have Yw constraint which is equivalent to v for > > > -mavx512bw -mavx512vl and to nothing otherwise, so for > > > the instructi

Re: [PATCH] i386: Fix some -mavx512vl -mno-avx512bw bugs [PR99321]

2021-03-06 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 5, 2021 at 9:51 PM Jakub Jelinek wrote: > > Hi! > > As I wrote in the mail with the previous PR99321 fix, we have various > bugs where we emit instructions that need avx512bw and avx512vl > ISAs when compiling with -mavx512vl -mno-avx512bw. > > Without the following patch, > /* PR targ

Re: [PATCH] i386: Improve [QH]Imode rotates with masked shift count [PR99405]

2021-03-06 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 5, 2021 at 9:40 PM Jakub Jelinek wrote: > > Hi! > > The following testcase shows that while we nicely optimize away the > useless and? of shift count before rotation for [SD]Imode rotates, > we don't do that for [QH]Imode. > > The following patch optimizes that by using the right itera
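For illustration only, a minimal C sketch of the kind of source this patch is about: an 8-bit rotate whose count is masked before use. The function name `rotl8` is hypothetical and is not taken from the PR testcase.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by n.  The count is masked to 0..7 first;
   this is the "and" of the shift count that the patch description calls
   useless, because the result of an 8-bit rotate depends only on the
   low three bits of the count. */
static inline uint8_t rotl8(uint8_t x, unsigned n)
{
    n &= 7;
    /* (-n & 7) is the complementary right-shift amount, also in 0..7. */
    return (uint8_t)((x << n) | (x >> (-n & 7)));
}
```

Whether a given compiler version turns this into a single rotate instruction without a separate mask depends on the target and optimization level.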

Re: [PATCH] i386: Fix a peephole2 for -mavx512vl -mno-avx512bw [PR99321]

2021-03-02 Thread Uros Bizjak via Gcc-patches
On Wed, Mar 3, 2021 at 12:29 AM Jakub Jelinek wrote: > > Hi! > > As the testcase shows, the > (define_peephole2 > [(set (match_operand 0 "sse_reg_operand") > (match_operand 1 "sse_reg_operand")) >(set (match_dup 0) > (match_operator 3 "commutative_operator" > [(matc

Re: [x86] Fix PR target/99264

2021-02-28 Thread Uros Bizjak via Gcc-patches
On Fri, Feb 26, 2021 at 5:14 PM Eric Botcazou wrote: > > Hi, > > this wrong-code PR for the C++ compiler on x86-64/Windows is a regression > in GCC 9 and later, but the underlying issue has probably been there since > SEH was implemented and is exposed by this comment in config/i386/winnt.c: > >

Re: [gcc-12 PATCH] ira: Correct HONOR_REG_ALLOC_ORDER usage

2021-02-23 Thread Uros Bizjak via Gcc-patches
On Tue, Feb 23, 2021 at 8:48 AM Richard Biener wrote: > > On Mon, 22 Feb 2021, Uros Bizjak wrote: > > > The intention of HONOR_REG_ALLOC_ORDER is to ensure that IRA allocates > > registers in the order given by REG_ALLOC_ORDER. However in > > ira_better_spill_reloa

[gcc-12 PATCH] ira: Correct HONOR_REG_ALLOC_ORDER usage

2021-02-22 Thread Uros Bizjak via Gcc-patches
The intention of HONOR_REG_ALLOC_ORDER is to ensure that IRA allocates registers in the order given by REG_ALLOC_ORDER. However in ira_better_spill_reload_regno_p, there is still a place where the calculation depends on the presence of REG_ALLOC_ORDER, ignoring HONOR_REG_ALLOC_ORDER macro altogeth

Re: [PATCH] split, i386: Fix up df uses in i386 splitters [PR99104]

2021-02-16 Thread Uros Bizjak via Gcc-patches
On Tue, Feb 16, 2021 at 1:53 PM Jakub Jelinek wrote: > > On Tue, Feb 16, 2021 at 12:39:39PM +0100, Uros Bizjak wrote: > > > @@ -5197,7 +5209,7 @@ (define_insn_and_split "*lea" > > > > > >/* ix86_avoid_lea_for_addr re-recognizes insn and may > >

Re: [PATCH] split, i386: Fix up df uses in i386 splitters [PR99104]

2021-02-16 Thread Uros Bizjak via Gcc-patches
On Tue, Feb 16, 2021 at 12:05 PM Jakub Jelinek wrote: > > On Tue, Feb 16, 2021 at 11:47:51AM +0100, Uros Bizjak wrote: > > > In this case the match_scratch wouldn't work, since CC_REGNUM is fixed. > > > But as you said on irc, there's peep2_regno_dead_p instead

Re: [PATCH] split, i386: Fix up df uses in i386 splitters [PR99104]

2021-02-16 Thread Uros Bizjak via Gcc-patches
On Tue, Feb 16, 2021 at 11:28 AM Richard Sandiford wrote: > > Jakub Jelinek via Gcc-patches writes: > > On Tue, Feb 16, 2021 at 09:55:40AM +, Richard Sandiford wrote: > >> I assume that's because pass_df_initialize_no_opt is slightly after > >> the first pass_split_all_insns? Seems like it s

Re: [PATCH] i386: Add combiner splitter to optimize V2SImode memory rotation [PR96166]

2021-02-13 Thread Uros Bizjak via Gcc-patches
On Sat, Feb 13, 2021 at 9:47 AM Jakub Jelinek wrote: > > Hi! > > Since the x86 backend enabled V2SImode vectorization (with > TARGET_MMX_WITH_SSE), slp vectorization can kick in and emit > movq (%rdi), %xmm1 > pshufd $225, %xmm1, %xmm0 > movq %xmm0, (%rdi) > instead

[PATCH] libgomp/i386: Revert the type of syscall wrappers output back to long.

2021-02-11 Thread Uros Bizjak via Gcc-patches
Linux man-pages 5.07 wrongly declares syscall output type as int. This error was fixed in release 5.10, so this patch reverts my recent change. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Committed to mainline as partial revert. 2021-02-11 Uroš Bizjak libgomp/ * conf
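As a rough illustration of why the return type matters, here is a minimal sketch of a static inline wrapper around the C library's `syscall()` function, which returns `long`. It assumes a Linux target; the wrapper name `raw_getpid` is hypothetical and this is not the libgomp futex code.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: issue a raw getpid system call and keep the
   result in the same type that syscall() itself returns (long). */
static inline long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Keeping the wrapper's return type as `long` avoids a silent narrowing of the value that `syscall()` hands back.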

[PATCH] libgomp/i386: Move syscall asms to static inline wrapper.

2021-02-11 Thread Uros Bizjak via Gcc-patches
Move syscall asms to static inline wrapper functions to improve #ifdeffery. Also correct output type to int and timeout type to void *. No functional changes. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. OK for mainline? 2021-02-11 Uroš Bizjak libgomp/ * config/linux/

Re: [PATCH] i386: Fix ICEs due to simplify_gen_subreg returning NULL [PR99025]

2021-02-10 Thread Uros Bizjak via Gcc-patches
On Wed, Feb 10, 2021 at 4:12 PM Jakub Jelinek wrote: > > Hi! > > In these patterns, we call simplify_gen_subreg on the input operand > to create paradoxical subregs that have 2x, 4x or 8x elements as the input > operand. That works fine if the input operand is a REG, but when it is a > SUBREG, RT

Re: [PATCH] i386: Fix up TARGET_QIMODE_MATH for many AMD CPU tunings [PR98957]

2021-02-05 Thread Uros Bizjak via Gcc-patches
On Fri, Feb 5, 2021 at 10:31 AM Uros Bizjak wrote: > > On Fri, Feb 5, 2021 at 10:29 AM Jakub Jelinek wrote: > > > > Hi! > > > > As written in the PR, TARGET_QIMODE_MATH was meant to be set for all > > tunings and it was the case for GCC <= 7, but as the num

Re: [PATCH] i386: Fix up TARGET_QIMODE_MATH for many AMD CPU tunings [PR98957]

2021-02-05 Thread Uros Bizjak via Gcc-patches
On Fri, Feb 5, 2021 at 10:29 AM Jakub Jelinek wrote: > > Hi! > > As written in the PR, TARGET_QIMODE_MATH was meant to be set for all > tunings and it was the case for GCC <= 7, but as the number of > PROCESSOR_* enumerators grew, some AMD tunings (which are at the end > of the list) over time got

Re: [PATCH][X86] Enable X86_TUNE_AVX256_UNALIGNED_{LOAD, STORE}_OPTIMAL for generic tune [PR target/98172]

2021-02-03 Thread Uros Bizjak via Gcc-patches
On Thu, Feb 4, 2021 at 5:28 AM Hongtao Liu wrote: > > > > GCC11 will be the system GCC 2 years from now, and for the > > > > processors then, they shouldn't even need to split a 256-bit vector > > > > into two 128-bit vectors. > > > > I.e., test SPEC2017 with the two options below on Zen3/ICL

Re: [PATCH] gimple-match, gimple-fold: After PROP_gimple_lvec is set, punt for vector stmts that veclower would need to lower [PR98287]

2021-02-02 Thread Uros Bizjak via Gcc-patches
On Tue, Feb 2, 2021 at 2:56 PM Jakub Jelinek wrote: > > On Tue, Feb 02, 2021 at 02:23:55PM +0100, Richard Biener wrote: > > Btw, I just can find V1DI mentioned in mmx.md but I can't find > > rotate or shift patterns that would match? > > The backend has several V1?Imode shifts, but optab only for

Re: [PATCH] i386: Add peephole2 for __atomic_sub_fetch (x, y, z) == 0 [PR98737]

2021-01-27 Thread Uros Bizjak via Gcc-patches
On Wed, Jan 27, 2021 at 10:20 AM Jakub Jelinek wrote: > > Hi! > > This patch adds a peephole2 for the optimization requested in the PR, > namely that we emit awful code for __atomic_sub_fetch (x, y, z) == 0 > or __atomic_sub_fetch (x, y, z) != 0 when y is not constant. > This can't be done in the
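A minimal sketch of the source-level idiom named in this message, using the GCC `__atomic_sub_fetch` builtin with a non-constant subtrahend. The function name `sub_and_test_zero` is hypothetical and is not the PR testcase.

```c
#include <stdbool.h>

/* Atomically subtract a run-time amount from *counter and report
   whether the new value is zero. */
static inline bool sub_and_test_zero(long *counter, long amount)
{
    return __atomic_sub_fetch(counter, amount, __ATOMIC_SEQ_CST) == 0;
}
```

This is the typical shape of a reference-count release; the code generated for it depends on the compiler version and target.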

Re: [PATCH] libgcc, i386: Add .note.GNU-stack sections to the ms sse/avx sav/res

2021-01-27 Thread Uros Bizjak via Gcc-patches
On Wed, Jan 27, 2021 at 10:26 AM Jakub Jelinek wrote: > > Hi! > > On Linux, GCC emits .note.GNU-stack sections when compiling code to mark > the code as not needing or needing executable stack, missing section means > unknown. But assembly files need to be marked manually. We already > mark vari

[PATCH] i386: Use cpp_define_formatted for __SIZEOF_FLOAT80__ definition

2021-01-15 Thread Uros Bizjak via Gcc-patches
2021-01-15 Uroš Bizjak gcc/ * config/i386/i386-c.c (ix86_target_macros): Use cpp_define_formatted for __SIZEOF_FLOAT80__ definition. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to mainline. Uros. diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-

[PATCH] i386: Resolve variable shadowing in i386-options.c [PR98671]

2021-01-14 Thread Uros Bizjak via Gcc-patches
Also change global variable pta_size to unsigned. 2021-01-14 Uroš Bizjak gcc/ PR target/98671 * config/i386/i386-options.c (ix86_valid_target_attribute_inner_p): Remove declaration and initialization of shadow variable "ret". (ix86_option_override_internal): Remove declaration

Re: [PATCH] i386: Update PR target/95021 tests

2021-01-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 14, 2021 at 4:00 PM H.J. Lu wrote: > > Also pass -mpreferred-stack-boundary=4 -mno-stackrealign to avoid > disabling STV by: > > /* Disable STV if -mpreferred-stack-boundary={2,3} or > -mincoming-stack-boundary={2,3} or -mstackrealign - the needed > stack realignment will b

Re: [PATCH] x86: Error on -fcf-protection with incompatible target

2021-01-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 14, 2021 at 3:05 PM H.J. Lu wrote: > > -fcf-protection with CF_BRANCH inserts ENDBR32 at function entries. > ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_CMOVE > processors. Issue an error for -fcf-protection with CF_BRANCH when > compiling for 32-bit non-TARGET_CMOVE ta

[PATCH] i386: Remove redundant assignment in i386-options.c [PR98671]

2021-01-14 Thread Uros Bizjak via Gcc-patches
Also rename x86_prefetch_sse to ix86_prefetch_sse. 2021-01-14 Uroš Bizjak gcc/ PR target/98671 * config/i386/i386-options.c (ix86_function_specific_save): Remove redundant assignment to opts->x_ix86_branch_cost. * config/i386/i386.c (ix86_prefetch_sse): Rename from x86_pref

Re: [PATCH] i386: Fix the pmovzx SSE4.1 define_insn_and_split patterns [PR98670]

2021-01-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 14, 2021 at 11:44 AM Jakub Jelinek wrote: > > Hi! > > I've made two mistakes in the *sse4_1_zero_extend* define_insn_and_split > patterns. One is that when it uses vector_operand, it should use Bm rather > than m constraint, and the other one is that because it is a post-reload > spli

Re: [PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-13 Thread Uros Bizjak via Gcc-patches
On Wed, Jan 13, 2021 at 8:13 AM Jakub Jelinek wrote: > > Hi! > > The following patch implements what I've talked about, i.e. to no longer > force operands of vec_perm_const into registers in the generic code, but let > each of the (currently 8) targets force it into registers individually, > givin

Re: [PATCH] i386: Add define_insn_and_split patterns for btrl [PR96938]

2021-01-13 Thread Uros Bizjak via Gcc-patches
On Wed, Jan 13, 2021 at 8:18 AM Jakub Jelinek wrote: > > Hi! > > In the following testcase we only optimize f2 and f7 to btrl, although we > should optimize that way all of the functions. The problem is the type > demotion/narrowing (which is performed solely during the generic folding and > not
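For readers unfamiliar with the operation, a minimal C sketch of clearing one bit at a run-time index, which is the operation a bit-test-and-reset instruction performs. The function name `clear_bit` is hypothetical; it is not one of the f1..f7 functions from the testcase.

```c
#include <stdint.h>

/* Clear bit (n mod 32) of x and return the result. */
static inline uint32_t clear_bit(uint32_t x, unsigned n)
{
    return x & ~(UINT32_C(1) << (n & 31));
}
```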

Re: [PATCH] i386, v2: Optimize _mm_unpacklo_epi8 of 0 vector as second argument or similar VEC_PERM_EXPRs into pmovzx [PR95905]

2021-01-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 12, 2021 at 2:40 PM Jakub Jelinek wrote: > > On Tue, Jan 12, 2021 at 11:42:44AM +0100, Uros Bizjak via Gcc-patches wrote: > > You can use post-reload define_insn_and_split here. This way, > > gen_lowpart on all arguments, including output, can be used. So, > >

Re: [PATCH] i386: Optimize _mm_unpacklo_epi8 of 0 vector as second argument or similar VEC_PERM_EXPRs into pmovzx [PR95905]

2021-01-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 12, 2021 at 10:33 AM Jakub Jelinek wrote: > > Hi! > > The following patch adds patterns (in the end I went with define_insn rather > than combiner define_split + define_insn_and_split I initially hoped or > define_insn_and_split) to represent (so far 128-bit only) permutations > like {

Re: [PATCH] [x86]Delete dead code in ix86_expand_sse_comi.[PR98612]

2021-01-11 Thread Uros Bizjak via Gcc-patches
On Mon, Jan 11, 2021 at 11:28 AM Hongtao Liu wrote: > > Hi: > d->flag is always 0 for builtins located in > BDESC_FIRST (comi,COMI,...) > ... > BDESC_END (COMI, PCMPESTR) > So the condition can be deleted, and so can BUILTIN_DESC_SWAP_OPERANDS. > > Regtested and bootstrapped on x86_64-linux-gnu{

Re: [PATCH] x86-64: Use R10 and R11 for profiling large model with PIC

2021-01-08 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 8, 2021 at 2:28 PM H.J. Lu wrote: > > On Fri, Jan 8, 2021 at 4:50 AM H.J. Lu wrote: > > > > On Fri, Jan 8, 2021 at 1:24 AM Uros Bizjak wrote: > > > > > > > Since R10 is preserved when calling mcount, R10 can be used as a scratch > >

Re: [PATCH] x86-64: Use R10 for profiling large model

2021-01-08 Thread Uros Bizjak via Gcc-patches
> Since R10 is preserved when calling mcount, R10 can be used as a scratch > register to call mcount in large model. Please mention that R10 can be used as a static chain register and is preserved when calling mcount for nested functions. > gcc/ > > PR target/98482 > * config/i386/i386.c (x86_funct

Re: [PATCH] i386: Fix -mcmodel= vs. target attribute [PR98585]

2021-01-08 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 8, 2021 at 9:24 AM Jakub Jelinek wrote: > > Hi! > > My patch to save/restore opts_set rather than essentially treating > global_options_set as a logical or whether some option has ever been > explicitly set somewhere apparently broke -mcmodel= vs. target attribute > (and as the patch s

[PATCH] i386: Merge various insn name mapping code attributes

2021-01-07 Thread Uros Bizjak via Gcc-patches
2021-01-07 Uroš Bizjak No functional changes. gcc/ * config/i386/i386.md (insn): Merge from plusminus_insn, shift_insn, rotate_insn and optab code attributes. Update all uses to merged code attribute. * config/i386/sse.md: Update all uses to merged code attribute. * config/

Re: [PATCH] i386, v2: Optimize blsi followed by comparison [PR98567]

2021-01-07 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 7, 2021 at 10:46 AM Jakub Jelinek wrote: > > On Thu, Jan 07, 2021 at 10:14:33AM +0100, Uros Bizjak wrote: > > I wonder if we should also add _cc variant where scratch is used: > > So like this then if it passes bootstrap/regtest? > > I think both variants ar

Re: [PATCH] i386: Optimize blsi followed by comparison [PR98567]

2021-01-07 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 7, 2021 at 10:14 AM Uros Bizjak wrote: > > On Thu, Jan 7, 2021 at 9:56 AM Jakub Jelinek wrote: > > > > Hi! > > > > The BLSI instruction sets SF and ZF based on the result and clears OF. > > CF is set to something unrelated. > > > >

Re: [PATCH] i386: Optimize blsi followed by comparison [PR98567]

2021-01-07 Thread Uros Bizjak via Gcc-patches
On Thu, Jan 7, 2021 at 9:56 AM Jakub Jelinek wrote: > > Hi! > > The BLSI instruction sets SF and ZF based on the result and clears OF. > CF is set to something unrelated. > > The following patch optimizes BLSI followed by comparison, so we don't need > to emit a TEST insn in between. > > Bootstrap
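A minimal C sketch of the pattern described here: isolating the lowest set bit (the value BLSI computes) and then comparing the result against zero. Both function names are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Isolate the lowest set bit of x; yields 0 when x is 0. */
static inline uint32_t lowest_set_bit(uint32_t x)
{
    return x & -x;
}

/* Compare the isolated bit with zero.  This is the comparison that,
   per the message above, need not be emitted as a separate TEST. */
static inline bool has_no_set_bit(uint32_t x)
{
    return lowest_set_bit(x) == 0;
}
```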

[PATCH] i386: Prevent spurious FP exceptions with _mm_cvt{, t}ps_pi32 [PR98522]

2021-01-05 Thread Uros Bizjak via Gcc-patches
Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 for TARGET_MMX_WITH_SSE by clearing the top 64 bits of the input XMM register. 2021-01-05 Uroš Bizjak gcc/ PR target/98522 * config/i386/sse.md (sse_cvtps2pi): Redefine as define_insn_and_split. Clear the top 64 bits of the i

[PATCH] i386: Add _mm256_cmov_si256 [PR98521]

2021-01-05 Thread Uros Bizjak via Gcc-patches
Add missing _mm256_cmov_si256 intrinsic to xopintrin.h. 2021-01-05 Uroš Bizjak gcc/ PR target/98521 * config/i386/xopintrin.h (_mm256_cmov_si256): New. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to mainline, will be backported to gcc-10. Uros. diff --git

git commit hook does not record my patches to PRs

2021-01-05 Thread Uros Bizjak via Gcc-patches
Hello! For some reason git commit hook does not record my patches to PRs, mentioned in the commit message. Some recent examples: PR 98521: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=951bdbde6ade56eb63af1dfa18777348a8a0d89e and PR98522: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1ff0ddcd8b4

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-05 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 5, 2021 at 11:25 AM Hongtao Liu wrote: > > On Tue, Jan 5, 2021 at 3:20 PM Uros Bizjak wrote: > > > > On Tue, Jan 5, 2021 at 8:04 AM Uros Bizjak wrote: > > > > > > > > +(define_split > > > > + [(set (match_operand:SI

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-04 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 5, 2021 at 8:04 AM Uros Bizjak wrote: > > > > +(define_split > > + [(set (match_operand:SI 0 "register_operand") > > +(zero_extend:SI > > + (not:HI > > +(subreg:HI > > + (un

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-04 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 5, 2021 at 7:30 AM Hongtao Liu wrote: > > On Mon, Jan 4, 2021 at 4:59 PM Hongtao Liu wrote: > > > > On Mon, Jan 4, 2021 at 4:49 PM Jakub Jelinek wrote: > > > > > > On Mon, Jan 04, 2021 at 01:56:44PM +0800, Hongtao Liu via Gcc-patches > > > wrote: > > > > +(define_insn_and_split "*ss

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-04 Thread Uros Bizjak via Gcc-patches
On Mon, Jan 4, 2021 at 8:52 AM Hongtao Liu wrote: > > On Mon, Jan 4, 2021 at 3:40 PM Uros Bizjak wrote: > > > > On Mon, Jan 4, 2021 at 6:54 AM Hongtao Liu wrote: > > > > > > Hi: > > > The following patch adds define_insn_and_split to optimize

Re: [PATCH]i386: Optimize pmovskb on zero_extend of subreg HI of the result [PR98461]

2021-01-03 Thread Uros Bizjak via Gcc-patches
On Mon, Jan 4, 2021 at 6:54 AM Hongtao Liu wrote: > > Hi: > The following patch adds define_insn_and_split to optimize > > vpmovmskb %xmm0, %eax > - movzwl %ax, %eax > notl %eax > > Bootstrapped/regtested on x86_64-linux-gnu {,-m32}. > Ok for trunk? > > gcc/Cha

Re: [PATCH] i386: Remove unnecessary clobbers from combine splitters.

2020-12-31 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 31, 2020 at 1:29 PM Segher Boessenkool wrote: > > Hi! > > On Thu, Dec 31, 2020 at 09:54:01AM +0100, Uros Bizjak wrote: > > On Thu, Dec 31, 2020 at 9:40 AM Segher Boessenkool > > wrote: > > > Nice. Just one thing: in principle the splitters can be u

Re: [PATCH] i386: Remove unnecessary clobbers from combine splitters.

2020-12-31 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 31, 2020 at 9:40 AM Segher Boessenkool wrote: > > Hi Uros, > > On Wed, Dec 30, 2020 at 05:44:50PM +0100, Uros Bizjak via Gcc-patches wrote: > > There is no need for combine splitters to emit insn patterns with clobbers, > > the pass is smart enough to add

[PATCH] i386: Remove unnecessary clobbers from combine splitters.

2020-12-30 Thread Uros Bizjak via Gcc-patches
There is no need for combine splitters to emit insn patterns with clobbers, the pass is smart enough to add clobbers to patterns as necessary. 2020-12-30 Uroš Bizjak gcc/ * config/i386/i386.md: Remove unnecessary clobbers from combine splitters. Bootstrapped and regression tested on x

Re: [PATCH] i386: Optimize pmovmskb on inverted vector to inversion of pmovmskb result [PR98461]

2020-12-30 Thread Uros Bizjak via Gcc-patches
On Wed, Dec 30, 2020 at 10:23 AM Jakub Jelinek wrote: > > Hi! > > The following patch adds combine splitters to optimize: > - vpcmpeqd %ymm1, %ymm1, %ymm1 > - vpandn %ymm1, %ymm0, %ymm0 > vpmovmskb %ymm0, %eax > + notl %eax > etc. (for vectors with less t

[PATCH] i386: Rounding functions TLC

2020-12-28 Thread Uros Bizjak via Gcc-patches
Use copy_to_reg where appropriate, use int_mode_for_mode and fix comment indentation. No functional changes. 2020-12-29 Uroš Bizjak gcc/ * config/i386/i386-expand.c (ix86_gen_TWO52): Use REAL_MODE_FORMAT to determine number of mantissa bits. Use real_2expN instead of real_ldexp.

[PATCH] i386: Fix __builtin_rint with FE_DOWNWARD rounding direction [PR96793]

2020-12-28 Thread Uros Bizjak via Gcc-patches
[The same cure is needed for __builtin_rint.] x86_expand_rint expander uses x86_sse_copysign_to_positive, which is unable to change the sign from - to +. When FE_DOWNWARD rounding direction is in effect, the expanded sequence that involves subtraction can trigger x - x = -0.0 special rule. x86_s

[PATCH] i386: Use existing temporary register in rounding functions

2020-12-28 Thread Uros Bizjak via Gcc-patches
It is possible to avoid the call to force_reg and use existing temporary register in ix86_expand_trunc, ix86_expand_round and ix86_expand_rounddf_32 expanders. 2020-12-28 Uroš Bizjak gcc/ * config/i386/i386-expand.c (ix86_expand_trunc): Use existing temporary register to avoid a call t

[PATCH] i386: Fix __builtin_trunc with FE_DOWNWARD rounding direction [PR96793]

2020-12-23 Thread Uros Bizjak via Gcc-patches
[The same cure is needed for __builtin_trunc on 32bit SSE targets.] x86_expand_truncdf_32 expander uses x86_sse_copysign_to_positive, which is unable to change the sign from - to +. When FE_DOWNWARD rounding direction is in effect, the expanded sequence that involves subtraction can trigger x - x

[PATCH] i386: Fix __builtin_floor with FE_DOWNWARD rounding direction [PR96793]

2020-12-22 Thread Uros Bizjak via Gcc-patches
x86_expand_floorceil expander uses x86_sse_copysign_to_positive, which is unable to change the sign from - to +. When FE_DOWNWARD rounding direction is in effect, the expanded sequence that involves subtraction can trigger x - x = -0.0 special rule. x86_sse_copysign_to_positive fails to change th

Re: [PATCH][X86] Fix Typo

2020-12-21 Thread Uros Bizjak via Gcc-patches
On Mon, Dec 21, 2020 at 12:01 PM Hongtao Liu wrote: > > While working on PR98348, I noticed there's a typo in define_insn > "*one_cmpl2_1". There are 2 alternatives, so the index couldn't > be 2. > Bootstrap and regression test are OK on x86_64-unknown-linux. > > gcc/ChangeLog > > * config

Re: [x86] recompute opt flags after opt level change

2020-12-16 Thread Uros Bizjak via Gcc-patches
On Wed, Dec 16, 2020 at 2:18 PM Martin Liška wrote: > > On 10/1/19 10:58 AM, Alexandre Oliva wrote: > > + if (opts->x_flag_asynchronous_unwind_tables == 2) > > + opts->x_flag_unwind_tables > > + = opts->x_flag_asynchronous_unwind_tables = 1; > > Hello. > > Spotted that when I tried

Re: [PATCH] i386: Fix up -march=x86-64-v[234] vs. target attribute [PR98274]

2020-12-15 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 15, 2020 at 10:03 AM Jakub Jelinek wrote: > > Hi! > > The following testcase fails to compile. The problem is that > when ix86_option_override_internal is called the first time for command > line, it sees -mtune= wasn't present on the command line and so as fallback > sets ix86_tune_s

Re: [PATCH] i386: Make -march=x86-64-v[234] behave more like other -march= options

2020-12-14 Thread Uros Bizjak via Gcc-patches
On Mon, Dec 14, 2020 at 2:13 PM Jakub Jelinek wrote: > > Hi! > > If somebody has -march=x86-64-v2 (or -v3 or -v4) in $CFLAGS, $CXXFLAGS etc., > then -m32 or -mabi=ms stops working. > What is worse, if one configures gcc --with-arch-32=x86-64-v2 (or -v3 or -v4), > then -mabi=ms stops working. > > I

Re: [PATCH] x86: Update user interrupt handler stack frame

2020-12-10 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 10, 2020 at 10:20 PM H.J. Lu wrote: > > User interrupt handler stack frame is similar to exception interrupt > handler stack frame. Instead of error code, the second argument is > user interrupt request register vector. > > gcc/ > > PR target/98219 > * config/i386/uint

Re: [PATCH] X86: Fix feature check for HRESET/AVX_VNNI/UINTR

2020-12-09 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 10, 2020 at 4:42 AM Hongyu Wang wrote: > > Hi, > > This patch is a simple fix: HRESET/AVX_VNNI/UINTR feature detection was > wrongly placed under avx_512usable. > > Bootstrapped and tested on x86-64-linux, OK for trunk? > > gcc/ChangeLog: > * common/config/i386/cpuinfo.h (get_available_fe

i386: Remove REG_ALLOC_ORDER definition

2020-12-09 Thread Uros Bizjak via Gcc-patches
REG_ALLOC_ORDER just defines what the default is set to. 2020-12-09 Uroš Bizjak gcc/ * config/i386/i386.h (REG_ALLOC_ORDER): Remove. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index

Re: [PATCH] i386: Fix up X87_ENABLE_{FLOAT, ARITH} in conditions [PR94440]

2020-12-08 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 8, 2020 at 10:36 AM Jakub Jelinek wrote: > > Hi! > > The documentation says > For a named pattern, the condition may not depend on the data in > the insn being matched, but only the target-machine-type flags. > The i386 backend violates that by using flag_excess_precision and

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-07 Thread Uros Bizjak via Gcc-patches
On Mon, Dec 7, 2020 at 4:30 PM Segher Boessenkool wrote: > > Hi! > > On Mon, Dec 07, 2020 at 03:27:14PM +0100, Uros Bizjak wrote: > > On Fri, Dec 4, 2020 at 7:26 PM Segher Boessenkool > > wrote: > > > A splitter can *already* split to only one insn. > > >

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-07 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 7:26 PM Segher Boessenkool wrote: > > Hi! > > On Fri, Dec 04, 2020 at 07:06:45PM +0100, Uros Bizjak wrote: > > On Fri, Dec 4, 2020 at 6:57 PM Jakub Jelinek wrote: > > > > > > On Fri, Dec 04, 2020 at 06:53:49PM +0100, Uros Bizjak wrote:

Re: V2 [PATCH] x86: Check mode of pseudo register push

2020-12-06 Thread Uros Bizjak via Gcc-patches
On Sun, Dec 6, 2020 at 8:11 PM H.J. Lu wrote: > > On Sun, Dec 6, 2020 at 10:59 AM Uros Bizjak wrote: > > > > On Sun, Dec 6, 2020 at 7:51 PM H.J. Lu wrote: > > > > > > commit 266f44a91c0c9705d3d18e82d7c5bab32927a18f > > > Author: H.J. Lu

Re: [PATCH] x86: Check mode of pseudo register push

2020-12-06 Thread Uros Bizjak via Gcc-patches
On Sun, Dec 6, 2020 at 7:51 PM H.J. Lu wrote: > > commit 266f44a91c0c9705d3d18e82d7c5bab32927a18f > Author: H.J. Lu > Date: Sun May 17 10:10:34 2020 -0700 > > x86: Allow V1TI vector register pushes > > Add V1TI vector register push and split it after reload to a sequence > of: > >

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 7:26 PM Segher Boessenkool wrote: > > Hi! > > On Fri, Dec 04, 2020 at 07:06:45PM +0100, Uros Bizjak wrote: > > On Fri, Dec 4, 2020 at 6:57 PM Jakub Jelinek wrote: > > > > > > On Fri, Dec 04, 2020 at 06:53:49PM +0100, Uros Bizjak wrote:

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 7:09 PM Jakub Jelinek wrote: > > On Fri, Dec 04, 2020 at 07:06:45PM +0100, Uros Bizjak wrote: > > No, I didn't want to burden you with the additional task - the patch > > is OK as it is. I was just thinking out loud, as I remembered that > > ch

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 6:57 PM Jakub Jelinek wrote: > > On Fri, Dec 04, 2020 at 06:53:49PM +0100, Uros Bizjak wrote: > > > > I was trying that first, but it didn't work. Without the > > > > clobber it actually works right, we don't have the rotate insn wit

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 6:50 PM Kumar, Venkataramanan wrote: > > [AMD Public Use] > > Hi Uros > > > -Original Message- > > From: Uros Bizjak > > Sent: Friday, December 4, 2020 2:30 PM > > To: Kumar, Venkataramanan > > Cc: gcc-pat

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 6:42 PM Uros Bizjak wrote: > > On Fri, Dec 4, 2020 at 6:41 PM Jakub Jelinek wrote: > > > > On Fri, Dec 04, 2020 at 06:37:02PM +0100, Uros Bizjak wrote: > > > > + "(INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode) - 1)) >

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 6:41 PM Jakub Jelinek wrote: > > On Fri, Dec 04, 2020 at 06:37:02PM +0100, Uros Bizjak wrote: > > > + "(INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode) - 1)) > > > + == GET_MODE_BITSIZE (mode) - 1" > > > + [(set (match_d

Re: [PATCH] i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226]

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Fri, Dec 4, 2020 at 6:32 PM Jakub Jelinek wrote: > > Hi! > > As mentioned in the PR, we can combine ~(1 << x) into -2 r<< x, but we give > up in the ~(1 << (x & 31)) cases, as *3_mask* don't allow > immediate operand 1 and find_split_point prefers to split (x & 31) instead > of the constant. >
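
The identity mentioned in the snippet above, ~(1 << x) being the same value as -2 rotated left by x, can be checked with a small standalone C sketch. This is only an illustration for 32-bit unsigned values; the helper names `rotl32` and `check_identity` are hypothetical and do not come from the patch or from GCC itself.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits.  The count is masked so the
   shifts never reach 32, which would be undefined behaviour in C.  */
static uint32_t
rotl32 (uint32_t v, unsigned int n)
{
  n &= 31;
  return (v << n) | (v >> ((32 - n) & 31));
}

/* Check ~(1 << x) == (-2 r<< x) for every count 0..31, and also the
   masked form ~(1 << (x & 31)) for counts beyond 31.
   Returns the number of unmasked counts checked.  */
static int
check_identity (void)
{
  int checked = 0;
  for (unsigned int x = 0; x < 32; x++)
    {
      /* -2 as a 32-bit pattern is 0xFFFFFFFE: all ones except bit 0.
         Rotating left by x moves the single zero bit to position x.  */
      assert (rotl32 (0xFFFFFFFEu, x) == ~(UINT32_C (1) << x));
      checked++;
    }
  for (unsigned int x = 32; x < 96; x++)
    assert (rotl32 (0xFFFFFFFEu, x & 31) == ~(UINT32_C (1) << (x & 31)));
  return checked;
}
```

Calling `check_identity ()` from a test driver exercises all 32 shift counts. The masked loop mirrors the `x & 31` form that the snippet says combine previously gave up on.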

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-04 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 3, 2020 at 4:29 PM Kumar, Venkataramanan wrote: > > [AMD Public Use] > > > > > Hi Maintainers, > > > > PFA, the patch that enables support for the next generation AMD Zen3 CPU via > -march=znver3. > > This is a very basic enablement patch. As of now the cost, tuning and > scheduler c

[PATCH v2]: i386: Fix up ix86_md_asm_adjust for TImode [PR98086]

2020-12-03 Thread Uros Bizjak via Gcc-patches
ix86_md_asm_adjust assumes that dest_mode can be only [QHSD]Imode and nothing else. The patch rewrites zero-extension part to use convert_to_mode to handle TImode and hypothetically even wider modes. 2020-12-03 Uroš Bizjak Jakub Jelinek gcc/ PR target/98086 * config/i386/i38

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-03 Thread Uros Bizjak via Gcc-patches
The message says that it is for internal distribution. Please repost. Thanks, Uros. On Thu, Dec 3, 2020 at 4:11 PM Kumar, Venkataramanan wrote: > > [AMD Official Use Only - Internal Distribution Only] > > > Hi Maintainers, > > > > PFA, the patch that enables support for the next generation AMD Z

Re: [PATCH] i386: Fix up ix86_md_asm_adjust for TImode [PR98086]

2020-12-03 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 3, 2020 at 11:39 AM Uros Bizjak wrote: > > > ix86_md_asm_adjust right above this code uses: > > machine_mode dest_mode = GET_MODE (dest); > > if (!SCALAR_INT_MODE_P (dest_mode)) > > { > > error ("invalid type for %

Re: [PATCH] i386: Fix up ix86_md_asm_adjust for TImode [PR98086]

2020-12-03 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 3, 2020 at 9:50 AM Jakub Jelinek wrote: > > Hi! > > ix86_md_asm_adjust right above this code uses: > machine_mode dest_mode = GET_MODE (dest); > if (!SCALAR_INT_MODE_P (dest_mode)) > { > error ("invalid type for % flag output"); > continue; >

Re: [PATCH] x86_64: Fix up -fpic -mcmodel=large -fno-plt [PR98063]

2020-12-01 Thread Uros Bizjak via Gcc-patches
On Tue, Dec 1, 2020 at 9:53 AM Jakub Jelinek wrote: > > Hi! > > On the following testcase with -fpic -mcmodel=large -fno-plt we emit > call puts@GOTPCREL(%rip) > but that is not really appropriate for CM_LARGE_PIC, the .text can be larger > than 2GB in that case and the .got slot further away from

[PATCH] testsuite/i386: Fix XOP and FMA4 checking functions [PR98036].

2020-11-27 Thread Uros Bizjak via Gcc-patches
Add missing returns and remove unnecessary postfix increments. 2020-11-27 Uroš Bizjak PR testsuite/98036 gcc/testsuite/ * gcc.target/i386/fma4-256-maccXX.c (check_maccps): Remove unnecessary postfix increment on a returned variable. (check_maccpd): Ditto. * gcc.target/i386

Re: [PATCH] testsuite: i386: Extend effective-target checks for AVX512BW, AVX512DQ

2020-11-27 Thread Uros Bizjak via Gcc-patches
On Fri, Nov 27, 2020 at 11:08 AM Rainer Orth wrote: > > When using the Solaris/x86 assembler with gcc, a couple of testcases > currently FAIL. Those failures follow two patterns: > > FAIL: gcc.target/i386/avx512bw-vpmovb2m-2.c (test for excess errors) > Excess errors: > Assembler: avx512bw-vpmovb

i386: Cleanup argument handling in ix86_expand_*_builtin functions.

2020-11-26 Thread Uros Bizjak via Gcc-patches
There is no need for struct with rtx and mode members since mode is never used. 2020-11-26 Uroš Bizjak gcc/ * config/i386/i386-expand.c (ix86_expand_multi_arg_builtin): Remove args array of structs, declare rtx xops array instead. Update all uses. (ix86_expand_args_builtin): Di

i386: Use SDWIM mode iterator for abs and maxmin [PR97873]

2020-11-26 Thread Uros Bizjak via Gcc-patches
Generate special double mode sequence also for QImode and HImode. Without QImode and HImode patterns, middle-end extends operands to SImode and emits SImode conditional move instruction with paradoxical SImode output register. This form is not recognized by STV pass, but a single abs with input o
