Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-01 Thread Kewen.Lin
Hi! on 2024/3/24 02:37, Ajit Agarwal wrote: > > > On 23/03/24 9:33 pm, Peter Bergner wrote: >> On 3/23/24 4:33 AM, Ajit Agarwal wrote: > - else if (align_words < GP_ARG_NUM_REG) > + else if (align_words < GP_ARG_NUM_REG > +|| (cum->hidden_string_length > +

Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-02 Thread Kewen.Lin
Hi Jakub, on 2024/4/2 16:03, Jakub Jelinek wrote: > On Tue, Apr 02, 2024 at 02:12:04PM +0800, Kewen.Lin wrote: >>>>>> The old code for the unused hidden parameter (which was the 9th param) >>>>>> would >>>>>> fall thru to the "retu

Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-03 Thread Kewen.Lin
Hi Jakub, on 2024/4/3 16:35, Jakub Jelinek wrote: > On Wed, Apr 03, 2024 at 01:18:54PM +0800, Kewen.Lin wrote: >>> I'd prefer not to remove DECL_ARGUMENTS chains, they are valid arguments >>> that just some >>> invalid code doesn't pass. By remo

Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-03 Thread Kewen.Lin
Hi! on 2024/4/3 17:23, Jakub Jelinek wrote: > On Wed, Apr 03, 2024 at 05:02:40PM +0800, Kewen.Lin wrote: >> on 2024/4/3 16:35, Jakub Jelinek wrote: >>> On Wed, Apr 03, 2024 at 01:18:54PM +0800, Kewen.Lin wrote: >>>>> I'd prefer not to remove DECL_ARGU

Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-03 Thread Kewen.Lin
on 2024/4/3 19:18, Jakub Jelinek wrote: > On Wed, Apr 03, 2024 at 07:01:50PM +0800, Kewen.Lin wrote: >> Thanks for the details on debugging support, but IIUC with this workaround >> being adopted, the debuggability on hidden args are already broken, aren't? > > No. >

Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Kewen.Lin
Hi Peter, on 2024/4/6 06:28, Peter Bergner wrote: > This is a cleanup patch in preparation to fixing the real bug in PR101865. > TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that. > Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR > and delete

Re: [PATCH 3/3] Add -mcpu=power11 tests

2024-04-08 Thread Kewen.Lin
Hi Mike, on 2024/3/20 12:16, Michael Meissner wrote: > This patch adds some simple tests for -mcpu=power11 support. In order to run > these tests, you need an assembler that supports the appropriate option for > supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX). > > I

[PATCH] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Kewen.Lin
Hi, As PR114614 shows, the newly added test case gcov-20.c by commit r14-9789-g08a52331803f66 failed on targets which do not support atomic profile update, there would be a message like: warning: target does not support atomic profile update, single mode is selected Since the test c

[PATCH] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Kewen.Lin
Hi, As the comments in PR88309 show, there are two oversights in rs6000_gimple_fold_builtin that pass align in bytes to build_aligned_type but which actually requires align in bits, it causes unexpected ICE or hanging in function is_miss_rate_acceptable due to zero align_unit value. This patch is
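
The bytes-versus-bits mix-up described in this preview can be illustrated with a minimal sketch. It is not the GCC code: the helper names and the 8-bits-per-byte constant are assumptions made for illustration, and how the real align_unit is derived is not visible in the truncated preview.

```c
/* Assumption: 8 bits per byte, as on common targets. */
#define SKETCH_BITS_PER_UNIT 8u

/* Hypothetical consumer that expects an alignment given in bits and
   derives a byte-based unit from it. */
unsigned sketch_align_unit_bytes(unsigned align_in_bits)
{
  return align_in_bits / SKETCH_BITS_PER_UNIT;
}

/* Hypothetical conversion a caller needs when it only has bytes. */
unsigned sketch_bytes_to_bits(unsigned align_in_bytes)
{
  return align_in_bytes * SKETCH_BITS_PER_UNIT;
}
```

With these helpers, passing a 4-byte alignment directly yields a unit of zero, while converting it to bits first yields the intended 4.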

Re: [PATCH] rs6000: Fix wrong align passed to build_aligned_type [PR88309]

2024-04-08 Thread Kewen.Lin
on 2024/4/8 18:47, Richard Biener wrote: > On Mon, Apr 8, 2024 at 11:22 AM Kewen.Lin wrote: >> >> Hi, >> >> As the comments in PR88309 show, there are two oversights >> in rs6000_gimple_fold_builtin that pass align in bytes to >> build_aligned_type but which

Re: [PATCH] testsuite: Add profile_update_atomic check to gcov-20.c [PR114614]

2024-04-08 Thread Kewen.Lin
on 2024/4/8 18:47, Richard Biener wrote: > On Mon, Apr 8, 2024 at 11:23 AM Kewen.Lin wrote: >> >> Hi, >> >> As PR114614 shows, the newly added test case gcov-20.c by >> commit r14-9789-g08a52331803f66 failed on targets which do >> not support atomic p

Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Kewen.Lin
Hi Peter, on 2024/4/8 21:21, Peter Bergner wrote: > On 4/8/24 3:55 AM, Kewen.Lin wrote: >> on 2024/4/6 06:28, Peter Bergner wrote: >>> +mno-direct-move >>> +Target Undocumented WarnRemoved >>> + >>> mdirect-move >>> -Target Undocument

Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Kewen.Lin
on 2024/4/9 11:20, Peter Bergner wrote: > On 4/8/24 9:37 PM, Kewen.Lin wrote: >> on 2024/4/8 21:21, Peter Bergner wrote: >> I prefer to remove it completely, that is: >> >>> -mdirect-move >>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) War

[PATCH] testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

2024-04-09 Thread Kewen.Lin
Hi, pr113359-2_*.c define a struct having unsigned long type members ay and az which have 4 bytes size at -m32, while the related constants CL1 and CL2 used for equality check are always 8 bytes, it makes compiler consider the below 69 if (a.ay != CL1) 70 __builtin_abort (); always to
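
A small sketch of the width mismatch described in this preview, under stated assumptions: uint32_t stands in for a 4-byte unsigned long at -m32, and CL1_SKETCH is a made-up 8-byte constant, not the value used in the test case.

```c
#include <stdint.h>

/* Hypothetical 8-byte constant; the real CL1 value is not shown above. */
#define CL1_SKETCH 0x0123456789abcdefULL

/* Models a member that is only 4 bytes wide, like unsigned long at -m32. */
struct sketch_narrow { uint32_t ay; };

/* Models the member after switching to unsigned long long. */
struct sketch_wide { unsigned long long ay; };

/* Returns 1 if the stored value still compares equal to the constant. */
int sketch_narrow_matches(void)
{
  struct sketch_narrow a;
  a.ay = (uint32_t) CL1_SKETCH;  /* the upper 32 bits are dropped here */
  return a.ay == CL1_SKETCH;     /* a.ay is zero-extended, so this is unequal */
}

int sketch_wide_matches(void)
{
  struct sketch_wide a;
  a.ay = CL1_SKETCH;             /* all 64 bits are kept */
  return a.ay == CL1_SKETCH;
}
```

The narrow variant can never compare equal to a constant with non-zero upper bits, which is the kind of always-true inequality the preview refers to; the wide variant keeps the comparison meaningful.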

Re: [PATCH] testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

2024-04-10 Thread Kewen.Lin
on 2024/4/10 15:11, Richard Biener wrote: > On Wed, Apr 10, 2024 at 8:24 AM Kewen.Lin wrote: >> >> Hi, >> >> pr113359-2_*.c define a struct having unsigned long type >> members ay and az which have 4 bytes size at -m32, while >> the related constants CL1

Re: [PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-11 Thread Kewen.Lin
Hi, on 2024/4/12 06:15, Peter Bergner wrote: > FYI: This patch is an update to Will Schmidt's patches to fix PR101865: > > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601823.html > > ...taking into considerat

Re: [PATCH, rs6000] Fix test case bcd4.c

2024-04-17 Thread Kewen.Lin
Hi, on 2024/4/17 13:12, HAO CHEN GUI wrote: > Hi, > This patch fixes loss of return statement in maxbcd of bcd-4.c. Without > return statement, it returns an invalid bcd number and make the test > noneffective. The patch also enables test to run on Power9 and Big Endian, > as all bcd instruction

Re: [PATCH V3] rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di built-in [PR113950]

2024-04-17 Thread Kewen.Lin
Hi, on 2024/4/17 17:05, jeevitha wrote: > Hi, > > On 18/03/24 7:00 am, Kewen.Lin wrote: > >>> The bogus vsx_splat_ code goes all the way back to GCC 8, so we >>> should backport this fix. Segher and Ke Wen, can we get an approval >>> to backport this t

[PATCH] testsuite, rs6000: Fix builtins-6-p9-runnable.c for BE [PR114744]

2024-04-17 Thread Kewen.Lin
Hi, Test case builtins-6-p9-runnable.c doesn't work well on BE due to two problems: - When applying vec_xl_len onto data_128 and data_u128 with length 8, it expects to load 128[01] from the memory, but unfortunately assigning 128[01] to a {vector} {u,}int128 type variable, th

Re: [PATCH, rs6000] Use bcdsub. instead of bcdadd. for bcd invalid number checking

2024-04-17 Thread Kewen.Lin
Hi, on 2024/4/18 10:01, HAO CHEN GUI wrote: > Hi, > This patch replace bcdadd. with bcdsub. for bcd invalid number checking. > bcdadd on two same numbers might cause overflow which also set > overflow/invalid bit so that we can't distinguish it's invalid or overflow. > The bcdsub doesn't have th

Re: [PATCH] [testsuite] [ppc64] expect error on vxworks too

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:23, Alexandre Oliva wrote: > > These ppc lp64 tests check for errors or warnings on -mno-powerpc64. > On powerpc64-*-vxworks* we get the same errors as on most other > covered platforms, but the tests did not mark them as expected for > this target. On powerpc-*-vxworks*, the test

Re: [PATCH] disable ldist for test, to restore vectorizing-candidate loop

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:27, Alexandre Oliva wrote: > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566524.html > > The loop we're supposed to try to vectorize in > gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c is turned into a memset > before the vectorizer runs. > > Various other tests i

Re: [PATCH] Request check for hw support in ppc run tests with -maltivec/-mvsx

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:31, Alexandre Oliva wrote: > > From: Olivier Hainque > > Regstrapped on x86_64-linux-gnu and ppc64el-linux-gnu. Also tested with > gcc-13 on ppc64-vx7r2 and ppc-vx7r2. Ok to install? OK, thanks! BR, Kewen > > for gcc/testsuite/ChangeLog > > * gcc.target/powerpc/swap

Re: [PATCH] ppc: testsuite: vec-mul requires vsx runtime

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:35, Alexandre Oliva wrote: > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593947.html > > > vec-mul is an execution test, but it only requires a powerpc_vsx_ok > effective target, which is enough only for compile tests. In order to > To check for runtime and executi

Re: [PATCH v2] xfail fetestexcept test - ppc always uses fcmpu

2024-04-23 Thread Kewen.Lin
Hi, on 2024/4/22 18:00, Alexandre Oliva wrote: > On Mar 10, 2021, Joseph Myers wrote: > >> On Wed, 10 Mar 2021, Alexandre Oliva wrote: >>> operand exception for quiet NaN. I couldn't find any evidence that >>> the rs6000 backend ever outputs fcmpo. Therefore, I'm adding the same >>> execution

Re: [PATCH v2] [testsuite] require sqrt_insn effective target where needed

2024-04-23 Thread Kewen.Lin
Hi, on 2024/4/22 17:56, Alexandre Oliva wrote: > This patch takes feedback received for 3 earlier patches, and adopts a > simpler approach to skip the still-failing tests, that I believe to be > in line with ppc maintainers' expressed preferences. > https://gcc.gnu.org/pipermail/gcc-patches/2021-F

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-24 Thread Kewen.Lin
Hi, on 2024/4/22 17:28, Alexandre Oliva wrote: > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html > > > This test expects vectorization at power8+ because strict alignment is > not required for vectors. For power7, vectorization is not to take > place because it's not de

[PATCH] sel-sched: Verify change before replacing dest in EXPR_INSN_RTX [PR112995]

2023-12-15 Thread Kewen.Lin
Hi, PR112995 exposed one issue in current try_replace_dest_reg that the result rtx insn after replace_dest_with_reg_in_expr is probably unable to match any constraints. Although there are some checks on the changes onto dest or src of orig_insn, none is performed on the EXPR_INSN_RTX. This patch

Re: [Patchv2, rs6000] Correct definition of macro of fixed point efficient unaligned

2023-12-18 Thread Kewen.Lin
Hi Haochen, on 2023/12/18 10:43, HAO CHEN GUI wrote: > Hi, > The patch corrects the definition of > TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and replace it with the call of > slow_unaligned_access. > > Compared with last version, > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640076.

Re: [Patchv2, rs6000] Clean up pre-checkings of expand_block_compare

2023-12-18 Thread Kewen.Lin
Hi Haochen, on 2023/12/18 10:44, HAO CHEN GUI wrote: > Hi, > This patch cleans up pre-checkings of expand_block_compare. It does > 1. Assert only P7 above can enter this function as it's already guard > by the expand. > 2. Return false when optimizing for size. > 3. Remove P7 processor test as o

[PATCH] sched: Don't skip empty block by removing no_real_insns_p [PR108273]

2023-12-20 Thread Kewen.Lin
Hi, This patch follows Richi's suggestion "scheduling shouldn't special case empty blocks as they usually do not appear" in [1], it removes function no_real_insns_p and its uses completely. There is some case that one block previously has only one INSN_P, but while scheduling some other blocks th

Re: PING^1 [PATCH] sched: Remove debug counter sched_block

2023-12-20 Thread Kewen.Lin
Hi Jeff, on 2023/12/21 04:43, Jeff Law wrote: > > > On 12/11/23 23:17, Kewen.Lin wrote: >> Hi, >> >> Gentle ping this: >> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636597.html >> >> BR, >> Kewen >> >> on 2

Re: [PATCH] sel-sched: Verify change before replacing dest in EXPR_INSN_RTX [PR112995]

2023-12-20 Thread Kewen.Lin
Hi Jeff, on 2023/12/21 04:30, Jeff Law wrote: > > > On 12/15/23 01:52, Kewen.Lin wrote: >> Hi, >> >> PR112995 exposed one issue in current try_replace_dest_reg >> that the result rtx insn after replace_dest_with_reg_in_expr >> is probably unable to match

Re: [Patchv3, rs6000] Correct definition of macro of fixed point efficient unaligned

2023-12-20 Thread Kewen.Lin
Hi Haochen, on 2023/12/20 16:51, HAO CHEN GUI wrote: > Hi, > The patch corrects the definition of > TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and replace it with the call of > slow_unaligned_access. > > Compared with last version, > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640832.

Re: [Patch, rs6000] Call library for block memory compare when optimizing for size

2023-12-20 Thread Kewen.Lin
Hi Haochen, on 2023/12/20 16:56, HAO CHEN GUI wrote: > Hi, > This patch call library function for block memory compare when it's > optimized for size. > > Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no > regressions. Is this OK for trunk? > > Thanks > Gui Haochen > > C

Re: [Patchv3, rs6000] Clean up pre-checkings of expand_block_compare

2023-12-20 Thread Kewen.Lin
Hi, on 2023/12/21 09:37, HAO CHEN GUI wrote: > Hi, > This patch cleans up pre-checkings of expand_block_compare. It does > 1. Assert only P7 above can enter this function as it's already guard > by the expand. > 2. Remove P7 processor test as only P7 above can enter this function and > P7 LE is

[PATCH] strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

2024-01-07 Thread Kewen.Lin
Hi, As PR113100 shows, the unbiasing introduced by r14-6737 can cause the scrubbing to overrun and screw some critical data on stack like saved toc base consequently cause segfault on Power. By checking PR112917, IMHO we should keep this unbiasing guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_A

[PATCH] testsuite, rs6000: Adjust pcrel-sibcall-1.c with noipa [PR112751]

2024-01-07 Thread Kewen.Lin
Hi, As PR112751 shows, commit r14-5628 caused pcrel-sibcall-1.c to fail as it enables ipa-vrp which makes return values of functions {x,y,xx} as known and propagated. This patch is to adjust it with noipa to make it not fragile. Tested well on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu

[PATCH] rs6000: Eliminate zext fed by vclzlsbb [PR111480]

2024-01-07 Thread Kewen.Lin
Hi, As PR111480 shows, commit r14-4079 only optimizes the case of vctzlsbb but not for the similar vclzlsbb. This patch is to consider vclzlsbb as well and avoid the failure on the reported test case. It also simplifies the patterns with iterator and attribute. Bootstrapped and regtested on pow

[PATCH] rs6000: Make copysign (x, -1) back to -abs (x) for IEEE128 float [PR112606]

2024-01-07 Thread Kewen.Lin
Hi, I noticed that commit r14-6192 can't help PR112606 #c3 as it only takes care of SF/DF but TF/KF can still suffer the issue. Similar to commit r14-6192, this patch is to take care of copysign3 with IEEE128 as well. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-

Re: [PATCH] strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

2024-01-09 Thread Kewen.Lin
on 2024/1/8 19:44, Richard Biener wrote: > On Mon, Jan 8, 2024 at 3:35 AM Kewen.Lin wrote: >> >> Hi, >> >> As PR113100 shows, the unbiasing introduced by r14-6737 can >> cause the scrubbing to overrun and screw some critical data >> on stack like saved

Re: [PATCH] PR target/112886, Add %S to print_operand for vector pair support

2024-01-10 Thread Kewen.Lin
Hi Mike, on 2024/1/6 06:18, Michael Meissner wrote: > In looking at support for load vector pair and store vector pair for the > PowerPC in GCC, I noticed that we were missing a print_operand output modifier > if you are dealing with vector pairs to print the 2nd register in the vector > pair. >

Re: [PATCH] strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

2024-01-11 Thread Kewen.Lin
Hi Alexandre, on 2024/1/11 17:05, Alexandre Oliva wrote: > On Jan 7, 2024, "Kewen.Lin" wrote: > >> As PR113100 shows, the unbiasing introduced by r14-6737 can >> cause the scrubbing to overrun and screw some critical data >> on stack like saved toc base conse

Re: [PATCH] strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

2024-01-14 Thread Kewen.Lin
on 2024/1/12 19:03, Alexandre Oliva wrote: > On Jan 12, 2024, "Kewen.Lin" wrote: > >>>> By checking PR112917, IMHO we should keep this unbiasing >>>> guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_ARCH64 && >>>> TARGET_STACK_BIAS), sim

Re: [PATCH, rs6000] Refactor expand_compare_loop and split it to two functions

2024-01-14 Thread Kewen.Lin
Hi Haochen, on 2024/1/10 09:35, HAO CHEN GUI wrote: > Hi, > This patch refactors function expand_compare_loop and split it to two > functions. One is for fixed length and another is for variable length. > These two functions share some low level common help functions. I'm expecting refactoring

Re: [PATCH, rs6000] Enable block compare expand on P9 with m32 and mpowerpc64

2024-01-14 Thread Kewen.Lin
Hi Haochen, on 2024/1/12 14:48, HAO CHEN GUI wrote: > Hi, > On P9 "setb" is used to set the result of block compare. So it works > with m32 and mpowerpc64. On P8, carry bit is used. So it can't work > with m32 and mpowerpc64. This patch enables block compare expand for > m32 and mpowerpc64 on P9

Re: [Patch, rs6000] Eliminate unnecessary byte swaps for block clear on P8 LE [PR113325]

2024-01-14 Thread Kewen.Lin
Hi Haochen, on 2024/1/11 16:28, HAO CHEN GUI wrote: > Hi, > This patch eliminates unnecessary byte swaps for block clear on P8 > LE. For block clear, all the bytes are set to zero. The byte order > doesn't make sense. So the alignment of destination could be set to > the store mode size instead

[PATCH] testsuite: Fix vect_long_mult on Power [PR109705]

2024-01-15 Thread Kewen.Lin
Hi, As pointed out by the discussion in PR109705, the current vect_long_mult effective target check on Power is broken. This patch is to fix it accordingly. With additional change by adding a guard vect_long_mult in gcc.dg/vect/pr25413a.c , it's tested well on Power{8,9} LE & BE (also on Power10

Re: [PATCH V1] rs6000: New pass for replacement of adjacent (load) lxv with lxvp

2024-01-16 Thread Kewen.Lin
on 2024/1/16 06:22, Ajit Agarwal wrote: > Hello Richard: > > On 15/01/24 6:25 pm, Ajit Agarwal wrote: >> >> >> On 15/01/24 6:14 pm, Ajit Agarwal wrote: >>> Hello Richard: >>> >>> On 15/01/24 3:03 pm, Richard Biener wrote: On Sun, Jan 14, 2024 at 4:29 PM Ajit Agarwal wrote: > >

[committed] testsuite, rs6000: Adjust fold-vec-extract-char.p7.c [PR111850]

2024-01-17 Thread Kewen.Lin
Hi, As PR101169 comment #c4 shows, previously the addi count update on fold-vec-extract-char.p7.c covered a sub-optimal code gen issue. On trunk, pass fold-mem-offsets helps to recover the best code sequence, so this patch is to revert the count back to the original which matches the optimal addi

Re: [PATCH] strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

2024-01-17 Thread Kewen.Lin
aix is: make check-gcc RUNTESTFLAGS="--target_board=unix'{-m64,-m32}' dg.exp=strub-unsupported*.c" BR, Kewen > Thanks, David > > > On Wed, Jan 17, 2024 at 8:06 PM Alexandre Oliva <mailto:ol...@adacore.com>> wrote: > > David, > >

Re: Repost [PATCH 1/6] Add -mcpu=future

2024-01-23 Thread Kewen.Lin
Hi Mike, on 2024/1/6 07:35, Michael Meissner wrote: > This patch implements support for a potential future PowerPC cpu. Features > added with -mcpu=future, may or may not be added to new PowerPC processors. > > This patch adds support for the -mcpu=future option. If you use -mcpu=future, > the

Re: Repost [PATCH 2/6] PowerPC: Make -mcpu=future enable -mblock-ops-vector-pair.

2024-01-23 Thread Kewen.Lin
on 2024/1/6 07:37, Michael Meissner wrote: > This patch re-enables generating load and store vector pair instructions when > doing certain memory copy operations when -mcpu=future is used. > > During power10 development, it was determined that using store vector pair > instructions were problemati

Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-23 Thread Kewen.Lin
Hi Mike, on 2024/1/12 01:29, Michael Meissner wrote: > This is version 2 of the patch. The only difference is I made the test case > simpler to read. > > In looking at support for load vector pair and store vector pair for the > PowerPC in GCC, I noticed that we were missing a print_operand outp

Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-23 Thread Kewen.Lin
on 2024/1/24 11:11, Peter Bergner wrote: > On 1/23/24 8:30 PM, Kewen.Lin wrote: >>> - output_operand_lossage ("invalid %%x value"); >>> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : >>> 'x'

Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-24 Thread Kewen.Lin
on 2024/1/24 23:51, Peter Bergner wrote: > On 1/24/24 12:04 AM, Kewen.Lin wrote: >> on 2024/1/24 11:11, Peter Bergner wrote: >>> But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx. >>> If the user explicitly forces -mno-vsx via RUNTESTFLAGS, the

Re: [PATCH] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-01-24 Thread Kewen.Lin
Hi, Thanks for adjusting this. on 2024/1/24 19:42, Xi Ruoyao wrote: > On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote: >> At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote: >>> On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote: On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoya

Re: Repost [PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2024-01-25 Thread Kewen.Lin
Hi Mike, on 2024/1/6 07:38, Michael Meissner wrote: > The MMA subsystem added the notion of accumulator registers as an optional > feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with > the traditional floating point registers 0..31, but logically the accumulator > registe

Re: [PATCH] testsuite: Fix vect_long_mult on Power [PR109705]

2024-01-28 Thread Kewen.Lin
on 2024/1/27 06:42, Andrew Pinski wrote: > On Mon, Jan 15, 2024 at 6:43 PM Kewen.Lin wrote: >> >> Hi, >> >> As pointed out by the discussion in PR109705, the current >> vect_long_mult effective target check on Power is broken. >> This patch is to fix it ac

Re: Repost [PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

2024-02-03 Thread Kewen.Lin
Hi Mike, on 2024/1/6 07:39, Michael Meissner wrote: > This patch changes the MMA instructions to use either FPR registers > (-mcpu=power10) or DMRs (-mcpu=future). In this patch, the existing MMA > instruction names are used. > > A macro (__PPC_DMR__) is defined if the MMA instructions use the D

Re: Repost [PATCH 5/6] PowerPC: Switch to dense math names for all MMA operations.

2024-02-03 Thread Kewen.Lin
Hi Mike, on 2024/1/6 07:40, Michael Meissner wrote: > This patch changes the assembler instruction names for MMA instructions from > the original name used in power10 to the new name when used with the dense > math > system. I.e. xvf64gerpp becomes dmxvf64gerpp. The assembler will emit the > sa

Re: Repost [PATCH 6/6] PowerPC: Add support for 1,024 bit DMR registers.

2024-02-04 Thread Kewen.Lin
Hi Mike, on 2024/1/6 07:42, Michael Meissner wrote: > This patch is a preliminary patch to add the full 1,024 bit dense math > register> (DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the > top of the > DMR register. > > This patch only adds the new 1,024 bit register support.

Re: [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2024-02-05 Thread Kewen.Lin
Hi Sebastian, on 2024/2/5 18:38, Sebastian Huber wrote: > Hello, > > On 27.12.22 11:16, Kewen.Lin via Gcc-patches wrote: >> Hi Segher, >> >> on 2022/12/24 04:26, Segher Boessenkool wrote: >>> Hi! >>> >>> On Wed, Oct 12, 2022 at 04:12:21PM

Re: Repost [PATCH 1/6] Add -mcpu=future

2024-02-07 Thread Kewen.Lin
on 2024/2/6 14:01, Michael Meissner wrote: > On Tue, Jan 23, 2024 at 04:44:32PM +0800, Kewen.Lin wrote: ... >>> diff --git a/gcc/config/rs6000/rs6000-opts.h >>> b/gcc/config/rs6000/rs6000-opts.h >>> index 33fd0efc936..25890ae3034 100644 >>> --- a/gcc/co

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-28 Thread Kewen.Lin
Hi, on 2024/4/28 16:14, Alexandre Oliva wrote: > On Apr 24, 2024, "Kewen.Lin" wrote: > >> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one >> line above) >> shows the original intention of this case is to expect not profitable for >

Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-28 Thread Kewen.Lin
Hi, on 2024/4/28 16:20, Alexandre Oliva wrote: > On Apr 23, 2024, "Kewen.Lin" wrote: > >> This patch seemed to miss to CC gcc-patches list. :) > > Oops, sorry, thanks for catching that. > > Here it is. FTR, you've already responded suggesting an appare

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-29 Thread Kewen.Lin
on 2024/4/29 14:28, Alexandre Oliva wrote: > On Apr 28, 2024, "Kewen.Lin" wrote: > >> Nit: Maybe add a prefix "testsuite: ". > > ACK > >>> >>> From: Kewen Lin > >> Thanks, you can just drop this. :) > > I've t

Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-29 Thread Kewen.Lin
on 2024/4/29 15:20, Alexandre Oliva wrote: > On Apr 28, 2024, "Kewen.Lin" wrote: > >> OK, from this perspective IMHO it seems more clear to adopt xfail >> with effective target long_double_64bit? > > That effective target is quite broken, alas. I

[PATCH 1/4] rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, On rs6000, there are three 128 bit scalar floating point modes TFmode, IFmode and KFmode. For some historical reasons, we define them with different mode precisions, that is KFmode 126, TFmode 127 and IFmode 128. But in fact all of them should have the same mode precision 128, this special

[PATCH 2/4] fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, Previously effective target fortran_real_c_float128 never passes on Power regardless of the default 128 long double is ibmlongdouble or ieeelongdouble. It's due to that TF mode is always used for kind 16 real, which has precision 127, while the node float128_type_node for c_float128 has 128 t

[PATCH 3/4] ranger: Revert the workaround introduced in PR112788 [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, This reverts commit r14-6478-gfda8e2f8292a90 "range: Workaround different type precision between _Float128 and long double [PR112788]" as the fixes for PR112993 make all 128 bits scalar floating point have the same 128 bit precision, this workaround isn't needed any more. Bootstrapped and reg

[PATCH 4/4] tree: Remove KFmode workaround [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, The fix for PR112993 makes KFmode have 128 bit mode precision, we don't need this workaround to fix up the type precision any more, and just go with mode precision. So this patch is to remove KFmode workaround. Bootstrapped and regress-tested on: - powerpc64-linux-gnu P8/P9 (with ibm128 by

[PATCH] rs6000: Adjust -fpatchable-function-entry* support for dual entry [PR112980]

2024-05-07 Thread Kewen.Lin
Hi, As the discussion in PR112980, although the current implementation for -fpatchable-function-entry* conforms with the documentation (making N NOPs be consecutive), it's inefficient for both kernel and userspace livepatching (see comments in PR for the details). So this patch is to change the c

[PATCH] rs6000: Fix ICE on IEEE128 long double without vsx [PR114402]

2024-05-07 Thread Kewen.Lin
Hi, As PR114402 shows, we support IEEE128 format long double even if there is no vsx support, but there is an ICE about cbranch as the test case shows. For now, we only support compare:CCFP pattern for IEEE128 fp if TARGET_FLOAT128_HW, so in function rs6000_generate_compare we have a check with

[PATCH] rs6000: Clean up TF and TD check with FLOAT128_2REG_P

2024-05-07 Thread Kewen.Lin
Hi, Commit r6-2116-g2c83faf86827bf did some clean up on TFmode and TDmode check with FLOAT128_2REG_P, but it missed to update an assertion, this patch is to make it align. btw, it's noticed when I'm making a patch to get rid of TFmode. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and

[PATCH] rs6000: Remove useless operands[3]

2024-05-07 Thread Kewen.Lin
Hi, As shown, three uses of operands[3] are totally useless, so this patch is to remove them to avoid any confusion. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen - gcc/ChangeLog:

[PATCH] rs6000: Remove useless entries in rreg

2024-05-07 Thread Kewen.Lin
Hi, When I was working on a trial patch to get rid of TFmode, I noticed that mode attribute rreg only gets used for mode iterator SFDF, it means that only SF and DF key-value pairs are useful, the other are useless, so this patch is to clean up them. Bootstrapped and regtested on powerpc64-linux-

[PATCH] rs6000: Drop useless vector_{load,store}_ defines

2024-05-07 Thread Kewen.Lin
Hi, When I was working on a patch to get rid of TFmode, I noticed that define_expands vector_load_ and vector_store_ are useless. This patch is to clean up both. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objec

[PATCH] testsuite: Fix typo in torture/vector-{1,2}.c

2024-05-07 Thread Kewen.Lin
Hi, When making some clean up patches, I happened to find test cases vector-{1,2}.c are having typo "powerpc64--*-*" in target selector, which should be powerpc64-*-*. The reason why we didn't catch before is that all our testing machines support VMX insns, so it passes always. But it would brea

[PATCH] testsuite, rs6000: Remove some checks with aix[456]

2024-05-07 Thread Kewen.Lin
Hi, Since r12-75-g0745b6fa66c69c aix6 support had been dropped, so we don't need to check for aix[456].* when testing, this patch is to remove such checks. Regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen -

[PATCH] testsuite, rs6000: Remove all linux*paired* checks and cases

2024-05-07 Thread Kewen.Lin
Hi, Since r9-115-g559289370f76bf the support of paired single had been dropped, but we still have some test checks and cases for that, this patch is to get rid of them. Regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR,

[PATCH] rs6000: Add assert !TARGET_VSX if !TARGET_ALTIVEC and strip a useless check

2024-05-07 Thread Kewen.Lin
Hi, In function rs6000_option_override_internal, we have the checks and adjustments like: if (TARGET_P8_VECTOR && !TARGET_ALTIVEC) rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR; if (TARGET_P8_VECTOR && !TARGET_VSX) rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR; But in fact some previous c
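
As a rough sketch of why one of the two quoted checks can become redundant: the preview is truncated, so the code below simply assumes the invariant named in the subject line (VSX is never enabled without ALTIVEC) and assumes the ALTIVEC check is the redundant one. The SK_MASK_* bit values and function names are hypothetical, not the real OPTION_MASK_* definitions.

```c
/* Hypothetical bit assignments; the real OPTION_MASK_* values differ. */
#define SK_MASK_ALTIVEC   (1u << 0)
#define SK_MASK_VSX       (1u << 1)
#define SK_MASK_P8_VECTOR (1u << 2)

/* Both adjustments, in the shape quoted in the preview. */
unsigned sk_both_checks(unsigned flags)
{
  if ((flags & SK_MASK_P8_VECTOR) && !(flags & SK_MASK_ALTIVEC))
    flags &= ~SK_MASK_P8_VECTOR;
  if ((flags & SK_MASK_P8_VECTOR) && !(flags & SK_MASK_VSX))
    flags &= ~SK_MASK_P8_VECTOR;
  return flags;
}

/* Only the VSX-based adjustment. */
unsigned sk_vsx_check_only(unsigned flags)
{
  if ((flags & SK_MASK_P8_VECTOR) && !(flags & SK_MASK_VSX))
    flags &= ~SK_MASK_P8_VECTOR;
  return flags;
}

/* Returns 1 when both variants agree on every flag combination in
   which VSX is never set without ALTIVEC. */
int sk_variants_agree(void)
{
  for (unsigned flags = 0; flags < 8u; flags++)
    {
      if ((flags & SK_MASK_VSX) && !(flags & SK_MASK_ALTIVEC))
        continue;  /* excluded by the assumed invariant */
      if (sk_both_checks(flags) != sk_vsx_check_only(flags))
        return 0;
    }
  return 1;
}
```

Under that assumed invariant, a missing ALTIVEC always implies a missing VSX, so the VSX check alone already clears the P8 vector bit in every case the ALTIVEC check would.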

[PATCH] libgcc, rs6000: Remove powerpcspe related code

2024-05-07 Thread Kewen.Lin
Hi, Since r9-4728 the powerpcspe support had been removed, this follow-up patch is to remove the remaining pieces in libgcc. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen - libgcc/Change

[PATCH] testsuite, rs6000: Remove effective target powerpc_405_nocache

2024-05-07 Thread Kewen.Lin
Hi, With the introduction of -mdejagnu-cpu=, when the test case is specifying -mdejagnu-cpu=405, it would override the other possibly given -mcpu=, so it would compile for PowerPC 405 for sure. This patch is to remove the effective target powerpc_405_nocache and update all its uses. Regtested on

[PATCH 1/2] testsuite, rs6000: Make powerpc_vsx consider current_compiler_flags [PR114842]

2024-05-07 Thread Kewen.Lin
Hi, As noted in PR114842, most of the test cases which require effective target check powerpc_vsx_ok actually care about if VSX feature is enabled, and they should adopt effective target powerpc_vsx instead. By considering we already have a number of test cases having explicit -mvsx in dg-options

[PATCH] testsuite, rs6000: Remove powerpc_popcntb_ok

2024-05-07 Thread Kewen.Lin
Hi, There are three uses of effective target powerpc_popcntb_ok, they are all for compiling, but powerpc_popcntb_ok checks for executable generation, which is too heavy. This patch is to remove powerpc_popcntb_ok and adjust its three uses accordingly. Regtested on powerpc64-linux-gnu P8/P9 and p

[PATCH] testsuite, rs6000: Remove powerpcspe test cases and checks

2024-05-08 Thread Kewen.Lin
Hi, Since powerpcspe support was removed in r9-4728, this follow-up patch removes the remaining pieces in the testsuite. Regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if there are no objections. BR, Kewen - gcc/testsuite/ChangeLog:

Re: [PATCH] rs6000: Adjust -fpatchable-function-entry* support for dual entry [PR112980]

2024-05-08 Thread Kewen.Lin
Hi Richi, >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index c584664e168..58e48f7dc55 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -18363,11 +18363,11 @@ If @code{N=0}, no pad location is recorded. >> The NOP instructions are inserted at---and maybe befor

Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-05-08 Thread Kewen.Lin
on 2024/4/30 07:11, Alexandre Oliva wrote: > On Apr 29, 2024, "Kewen.Lin" wrote: > >> Thanks for catching this and sorry >> that I didn't check it before suggesting it, I think we can aggressively >> drop this effective target instead to avoid any po

Re: [PATCH 2/4] fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993]

2024-05-08 Thread Kewen.Lin
Hi, on 2024/5/9 06:01, Steve Kargl wrote: > On Wed, May 08, 2024 at 01:27:53PM +0800, Kewen.Lin wrote: >> >> Previously effective target fortran_real_c_float128 never >> passes on Power regardless of the default 128 long double >> is ibmlongdouble or ieeelongdouble.

Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-08 Thread Kewen.Lin
Hi, on 2024/5/8 14:47, HAO CHEN GUI wrote: > Hi, > This patch enables overlapped by-piece operations. On rs6000, default > move/set/clear ratio is 2. So the overlap is only enabled with compare > by-pieces. Thanks for enabling this, did you evaluate if it can help some benchmark? > > Bootst
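As background for "overlapped by-pieces" comparison, here is a minimal sketch in plain C. It is an illustration of the general idea only, not what GCC emits on rs6000; the helper name `equal7_overlapped` and the 7-byte size are hypothetical choices. Instead of comparing 7 bytes as 4 + 2 + 1, it uses two 4-byte pieces at offsets 0 and 3, so byte 3 is covered twice.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the 7 bytes at a and b are equal.
   Uses two overlapping 4-byte loads (offsets 0 and 3) rather than
   three non-overlapping pieces of 4, 2 and 1 bytes.  */
static int equal7_overlapped(const void *a, const void *b)
{
  uint32_t a_lo, b_lo, a_hi, b_hi;

  /* memcpy keeps the loads well-defined for unaligned addresses. */
  memcpy(&a_lo, a, 4);
  memcpy(&b_lo, b, 4);
  memcpy(&a_hi, (const unsigned char *)a + 3, 4);
  memcpy(&b_hi, (const unsigned char *)b + 3, 4);

  /* Any differing byte makes at least one XOR result nonzero. */
  return ((a_lo ^ b_lo) | (a_hi ^ b_hi)) == 0;
}
```

The overlap is harmless for an equality test because comparing a byte twice cannot change the answer; it only reduces the number of pieces.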

Re: [PATCHv2] rs6000: Enable overlapped by-pieces operations

2024-05-12 Thread Kewen.Lin
on 2024/5/10 17:29, HAO CHEN GUI wrote: > Hi, > This patch enables overlapped by-piece operations. On rs6000, default > move/set/clear ratio is 2. So the overlap is only enabled with compare > by-pieces. > > Compared to previous version, the change is to remove power8 > requirement from test c

Re: [PATCH 1/13] rs6000, Remove __builtin_vsx_cmple* builtins

2024-05-12 Thread Kewen.Lin
Hi, on 2024/4/20 05:16, Carl Love wrote: > > rs6000, Remove __builtin_vsx_cmple* builtins > > The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di, > __builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take > unsigned arguments and return an unsigned result. The current de

Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-13 Thread Kewen.Lin
Hi, on 2024/5/13 10:57, Jiufu Guo wrote: > Hi, > > For PR96866, when gcc prints asm code for modifier "%a" which requires > an address operand, while the operand has the constraint "X" which > allows non-address form. An error message would be reported to indicate > the invalid asm operands. >

Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-13 Thread Kewen.Lin
Hi, on 2024/5/9 15:35, HAO CHEN GUI wrote: > Hi Kewen, > Thanks for your comments. > > On 2024/5/9 13:44, Kewen.Lin wrote: >> Hi, >> >> on 2024/5/8 14:47, HAO CHEN GUI wrote: >>> Hi, >>> This patch enables overlapped by-piece operations. On rs600

Re: [PATCH 5/13] rs6000, remove duplicated built-ins of vecmergl and vec_mergeh

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:17, Carl Love wrote: > rs6000, remove duplicated built-ins of vecmergl and vec_mergeh > > The following undocumented built-ins are same as existing documented > overloaded builtins. > > const vf __builtin_vsx_xxmrghw (vf, vf); > same as vf __builtin_vec_mergeh (vf, vf);

Re: [PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:17, Carl Love wrote: > rs6000, add overloaded vec_sel with int128 arguments > > Extend the vec_sel built-in to take three signed/unsigned int128 arguments > and return a signed/unsigned int128 result. > > Extending the vec_sel built-in makes the existing built-ins > __builtin_

Re: [PATCH 7/13] rs6000, remove the vec_xxsel built-ins, they are duplicates

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove the vec_xxsel built-ins, they are duplicates > > The following undocumented built-ins are covered by the existing overloaded > vec_sel built-in definitions. > > const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc); > same as vsc __builtin

Re: [PATCH 8/13] rs6000, remove __builtin_vsx_vperm_* built-ins

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove __builtin_vsx_vperm_* built-ins > > The undocumented built-ins: > __builtin_vsx_vperm_16qi_uns, > __builtin_vsx_vperm_1ti, > __builtin_vsx_vperm_1ti_uns, > __builtin_vsx_vperm_2df, > __builtin_vsx_vperm_2di, > __builtin_vsx_vpe

Re: [PATCH 9/13] rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins > > The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are > redundant. The overloaded vec_neg built-in provides the same > functionality. The two built-ins are not d
