from:"HAO CHEN GUI via Gcc-patches"

Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-14 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen, On 2023/9/12 17:33, Kewen.Lin wrote: > Ok, at least regression testing doesn't expose any needs to do disparaging > for this. Could you also test this patch with SPEC2017 for P7 and P8 > separately at options like -O2 or -O3, to see if there is any assembly > change, and if yes filtering ou

[PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-03 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch enables SImode in FP registers on P7. Instruction "fctiw" stores its integer output in an FP register. So SImode in FP registers needs to be enabled on P7 if we want to support "fctiw" on P7. The test case is in the second patch which implements 32bit inline lrint. Compared to the l

[PATCH-2v2, rs6000] Implement 32bit inline lrint [PR88558]

2023-09-03 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch implements 32bit inline lrint by "fctiw". It depends on the patch1 to do SImode move from FP registers on P7. Compared to last version, the main change is to add tests for "lrintf" and adjust the count of corresponding instructions. https://gcc.gnu.org/pipermail/gcc-patches/2023

Re: [PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-30 Thread HAO CHEN GUI via Gcc-patches

Kewen, I refined the patch according to your comments and it passed bootstrap and regression test. I committed it as https://gcc.gnu.org/g:946b8967b905257ac9f140225db744c9a6ab91be Thanks Gui Haochen On 2023/8/29 16:55, Kewen.Lin wrote: > Hi Haochen, > > on 2023/8/29 10:50, HAO CHEN GUI wrote: >

[PATCH, rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

2023-08-28 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds a "TARGET_64BIT" check when calling vector load/store with length expand in expand_block_move. It matches the expand condition of "lxvl" and "stxvl" defined in vsx.md. This patch fixes the ICE that occurred with the test case on 32-bit Power10. Bootstrapped and tested on powerp

[PATCH-2, rs6000] Implement 32bit inline lrint [PR88558]

2023-08-24 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch implements 32bit inline lrint by "fctiw". It depends on the patch1 to do SImode move from FP register on P7. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Thanks Gui Haochen ChangeLog rs6000: support 32bit inline lrint gcc/ PR target/88558

[PATCH-1, rs6000] Enable SImode in FP register on P7 [PR88558]

2023-08-24 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch enables SImode in FP registers on P7. Instruction "fctiw" stores its integer output in an FP register. So SImode in FP registers needs to be enabled on P7 if we want to support "fctiw" on P7. The test case is in the second patch which implements 32bit inline lrint. Bootstrapped and t

[PATCHv2, rs6000] Extract the element in dword0 by mfvsrd and shift/mask [PR110331]

2023-08-21 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch implements the vector element extraction by mfvsrd and shift/mask when the element is in dword0 of the vector. Originally, it generates vsplat/mfvsrd on P8 and li/vextract on P9. Since mfvsrd has lower latency than vextract and rldicl has lower latency than vsplat, the new sequence
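
For illustration only, the shift/mask half of that sequence can be modeled in portable C. This is a sketch, not code from the patch: it assumes the 64-bit value of dword0 is already in a general register (the effect of mfvsrd) and that elements are numbered in big-endian order within the doubleword. The function name `extract_from_dword0` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "shift right, then mask" on the 64-bit value that mfvsrd
   would have copied out of dword0.  `bits` is the element width and
   `index` is the element number, counted from the most significant
   end of the doubleword (big-endian element order, an assumption of
   this sketch).  */
uint64_t extract_from_dword0(uint64_t dword0, unsigned bits, unsigned index)
{
    assert(bits == 8 || bits == 16 || bits == 32);
    assert(index < 64 / bits);

    /* Distance from the element's least significant bit to bit 0.  */
    unsigned shift = 64 - bits * (index + 1);
    uint64_t mask = (UINT64_C(1) << bits) - 1;

    return (dword0 >> shift) & mask;
}
```

With dword0 equal to 0x0011223344556677, halfword 1 comes out as 0x2233 and word 1 as 0x44556677. On the real hardware a single rotate-and-mask instruction such as rldicl can do the shift and the mask together.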

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-08-20 Thread HAO CHEN GUI via Gcc-patches

Jeff, Thanks a lot for your comments. The shift mode is widened on i1/i2 before they're combined with i3 to newpat. The newpat matches the rotate/mask pattern. The i1/i2 themselves don't match the rotate/mask pattern. I did an experiment to disable widen shift mode for lshiftrt. I tested it on powerpc/x8

Re: [PATCHv4, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-08-15 Thread HAO CHEN GUI via Gcc-patches

Committed after fixing the comments. https://gcc.gnu.org/g:a79cf858b39e01c80537bc5d47a5e9004418c267 Thanks Gui Haochen On 2023/8/14 15:47, Kewen.Lin wrote: > Hi Haochen, > > on 2023/8/14 10:18, HAO CHEN GUI wrote: >> Hi, >> This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx >> f

Re: [PATCH, rs6000] Skip redundant vector extract if the element is first element of dword0 [PR110429]

2023-08-15 Thread HAO CHEN GUI via Gcc-patches

Committed after tweaking and testing. https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=d471bdb0453de7b738f49148b66d57cb5871937d Thanks Gui Haochen On 2023/7/28 17:32, Kewen.Lin wrote: > Hi Haochen, > > on 2023/7/5 11:22, HAO CHEN GUI wrote: >> Hi, >> This patch skips redundant vector extract insn to

[PATCHv4, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-08-13 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx for all subtargets when the mode is V4SI and the extracted element is word 1 from BE order. Also this patch adds an insn pattern for mfvsrwz which helps eliminate redundant zero extend. Compared to last version, the main

[PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-24 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx for all subtargets when the mode is V4SI and the index of extracted element is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz which helps eliminate redundant zero extend. Compared to last version,

[PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-20 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx for all subtargets when the mode is V4SI and the index of extracted element is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz which can help eliminate redundant zero extend. Compared to last versio

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-20 Thread HAO CHEN GUI via Gcc-patches

Sorry for the typo s/change/chance On 2023/7/21 8:59, HAO CHEN GUI wrote: > Hi Jeff, > > On 2023/7/21 5:27, Jeff Law wrote: >> Wouldn't it make more sense to just try rotate/mask in the original mode >> before trying a shift in a widened mode? I'm not sure why we need a target >> hook here. > > There

Re: [PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-20 Thread HAO CHEN GUI via Gcc-patches

Hi Jeff, On 2023/7/21 5:27, Jeff Law wrote: > Wouldn't it make more sense to just try rotate/mask in the original mode > before trying a shift in a widened mode? I'm not sure why we need a target > hook here. There is no change to try rotate/mask with the original mode when expensive_optimizations

Ping [PATCH v7, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2023-07-18 Thread HAO CHEN GUI via Gcc-patches

Hi, As the ticket (PR107013, adding fmin/max to RTL code) is suspended, I ping this patch. The unspec of fmin/max can be replaced with corresponding RTL code after that ticket is fixed. https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602181.html Thanks Gui Haochen On 2022/9/26 11:35, H

[PATCH-1, combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-18 Thread HAO CHEN GUI via Gcc-patches

Hi, The shift mode will be widened in combine pass if the operand has a normal subreg. But when the target already has rotate/mask/insert instructions on the narrow mode, it's unnecessary to widen the mode for lshiftrt. As the lshiftrt is commonly converted to rotate/mask insn, the widen mode block

[PATCH-2, rs6000] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

2023-07-18 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch relies on the first patch. The reason for the change is also described in the first patch. This patch implements the target hook have_rotate_and_mask. It also modifies some test cases. The regression of rlwimi-2.c is fixed. For rlwinm-0.c and rlwinm-2.c, one more 32bit rotate/mask ins

[PATCH, rs6000] Skip redundant vector extract if the element is first element of dword0 [PR110429]

2023-07-04 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch skips redundant vector extract insn to be generated when the extracted element is the first element of dword0 and the destination is a memory operand. Only one 'stxsi[hb]x' instruction is enough. The V4SImode is fixed in a previous patch. https://gcc.gnu.org/pipermail/gcc-patche

[PATCH, rs6000] Extract the element in dword0 by mfvsrd and shift/mask [PR110331]

2023-07-02 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch implements the vector element extraction by mfvsrd and shift/mask when the element is in dword0 of the vector. Originally, it generates vsplat/mfvsrd on P8 and li/vextract on P9. Since mfvsrd has lower latency than vextract and rldicl has lower latency than vsplat, the new sequence

[PATCHv4, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-06-24 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds a new insn for vector splat with small V2DI constants on P8. If the value of constant is in RANGE (-16, 15) and not 0 or -1, it can be loaded with vspltisw and vupkhsw on P8. It should be more efficient than loading vector from memory. Compared to last version, the main change i
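
As a rough illustration of why the two-instruction sequence works, here is a portable C model of the data flow. It is a sketch rather than anything from the patch: the function names are hypothetical, and the models only capture the behaviour this example relies on (vspltisw splats a 5-bit signed immediate into four words, vupkhsw sign-extends the two high-order words into two doublewords).

```c
#include <assert.h>
#include <stdint.h>

/* Model of vspltisw: copy a 5-bit signed immediate into all four
   32-bit words of the vector.  */
void model_vspltisw(int32_t vd[4], int simm)
{
    assert(simm >= -16 && simm <= 15);
    for (int i = 0; i < 4; i++)
        vd[i] = simm;
}

/* Model of vupkhsw: sign-extend the two high-order words (words 0 and
   1 in big-endian element order) into two 64-bit doublewords.  */
void model_vupkhsw(int64_t vd[2], const int32_t vb[4])
{
    vd[0] = (int64_t)vb[0];
    vd[1] = (int64_t)vb[1];
}

/* Build a V2DI vector whose two elements both equal `value`.  */
void splat_v2di_small(int64_t vd[2], int value)
{
    int32_t words[4];
    model_vspltisw(words, value);
    model_vupkhsw(vd, words);
}
```

Because every word holds the same value after the splat, it does not matter which pair of words the unpack reads, so the result is the same on BE and LE in this model.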

[PATCHv4, rs6000] Add two peephole2 patterns for mr. insn

2023-06-20 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds two peephole2 patterns which help convert certain insn sequences to "mr." instruction. These insn sequences can't be combined in combine pass. Compared to last version, the empty constraint is removed and test cases run only on powerpc Linux as AIX doesn't support "-mregnam

Re: [PATCH, rs6000] Add two peephole2 patterns for mr. insn

2023-06-19 Thread HAO CHEN GUI via Gcc-patches

HP, It makes sense. I will update the patch. Thanks Gui Haochen On 2023/6/20 8:07, Hans-Peter Nilsson wrote: > On Tue, 30 May 2023, HAO CHEN GUI via Gcc-patches wrote: > >> +++ b/gcc/config/rs6000/rs6000.md >> @@ -7891,6 +7891,36 @@ (define_insn "*mov_internal2"

[PATCH, rs6000] Generate mfvsrwz for all platforms and remove redundant zero extend [PR106769]

2023-06-18 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch modifies vsx extract expander and generates mfvsrwz/stxsiwx for all platforms when the mode is V4SI and the index of extracted element is 1 for BE and 2 for LE. Also this patch adds an insn pattern for mfvsrwz which can help eliminate redundant zero extend. Bootstrapped and teste

[PATCHv3, rs6000] Add two peephole2 patterns for mr. insn

2023-06-13 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds two peephole2 patterns which help convert certain insn sequences to "mr." instruction. These insn sequences can't be combined in combine pass. Compared to last version, it changes the new mode iterator name from "Q" to "WORD". Bootstrapped and tested on powerpc64-linux B

[PATCHv2, rs6000] Add two peephole2 patterns for mr. insn

2023-06-11 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds two peephole2 patterns which help convert certain insn sequences to "mr." instruction. These insn sequences can't be combined in combine pass. Compared to last version, it adds a new mode iterator "Q" which should be used for dot instruction. With "-m32/-mpowerpc64" set, th

[PATCH, rs6000] Add two peephole2 patterns for mr. insn

2023-05-29 Thread HAO CHEN GUI via Gcc-patches

Hi, By checking the object files of SPECint, I found that two kinds of compare/move can't be combined to "mr." pattern as there is no register link between them. The patch adds two peephole2 patterns for them. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Thanks Gu

[PATCHv3, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-05-25 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds a new insn for vector splat with small V2DI constants on P8. If the value of constant is in RANGE (-16, 15) and not 0 or -1, it can be loaded with vspltisw and vupkhsw on P8. It should be more efficient than loading vector from memory. Compared to last version, the main change i

[PATCHv2, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-05-04 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds a new insn for vector splat with small V2DI constants on P8. If the value of constant is in RANGE (-16, 15) and not 0 or -1, it can be loaded with vspltisw and vupkhsw on P8. It should be more efficient than loading vector from TOC. Compared to last version, the main change is t

Ping [PATCHv2, rs6000] Merge two vector shift when their sources are the same

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this. https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612944.html Thanks Gui Haochen On 2023/2/28 10:31, HAO CHEN GUI wrote: > Hi, > This patch merges two "vsldoi" insns when their sources are the > same. Particularly, it is simplified to be one move if the total > shift is

Ping^2 [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html Thanks Gui Haochen On 2023/2/20 10:10, HAO CHEN GUI wrote: > Hi, > Gently ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html > > Gui Haochen > Thanks > > On 2023/2/8 13:08, H

Re: [PATCH-4, rs6000] Change ilp32 target check for some scalar-extract-sig and scalar-insert-exp test cases

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this. https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609372.html Thanks Gui Haochen On 2023/1/4 14:17, HAO CHEN GUI wrote: > Hi, > "ilp32" is used in these test cases to make sure test cases only run on a > 32-bit environment. Unfortunately, these cases also run with > "-m

Re: [PATCH-3, rs6000] Change mode and insn condition for scalar insert exp instruction

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this. https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609371.html Thanks Gui Haochen On 2023/1/4 14:17, HAO CHEN GUI wrote: > Hi, > This patch changes the mode of exponent to GPR in scalar insert exp > pattern, as the exponent can be put into a 32-bit register. Also the > c

Ping [PATCH-2, rs6000] Change mode and insn condition for scalar extract sig instruction

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this. https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609370.html Thanks Gui Haochen On 2023/1/4 14:16, HAO CHEN GUI wrote: > Hi, > This patch changes the return type of __builtin_vsx_scalar_extract_sig > from const signed long to const signed long long, so that it can be c

Ping [PATCH-1, rs6000] Change mode and insn condition for scalar extract exp instruction

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this. https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609369.html Thanks Gui Haochen On 2023/1/4 14:16, HAO CHEN GUI wrote: > Hi, > This patch changes the return type of __builtin_vsx_scalar_extract_exp > from const signed long to const signed int, as the exponent can be pu

PING^2 [PATCH, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-04-23 Thread HAO CHEN GUI via Gcc-patches

Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601909.html Thanks Gui Haochen On 2022/12/14 13:30, HAO CHEN GUI wrote: > Hi, >Gentle ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601909.html > > Thanks > Gui Haochen > > On 2022/9/21 13:1

[PATCH 2/2, rs6000] xfail float128 comparison test case that fails on powerpc64 [PR108728]

2023-04-19 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch xfails a float128 comparison test case on powerpc64 that fails due to a longstanding issue with floating-point compares. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684 for more information. The patch passed regression test on Power Linux platforms. Thanks Gui Haochen

[PATCH 2/1, rs6000] make ppc_cpu_supports_hw as effective target keyword [PR108728]

2023-04-19 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds ppc_cpu_supports_hw into explicit name checking in proc is-effective-target-keyword. So ppc_cpu_supports_hw can be used as a target selector in test directives. It's required by patch2 of this issue. Thanks Gui Haochen ChangeLog testsuite: make ppc_cpu_supports_hw as effecti

[PATCH-1, rs6000] xfail float128 comparison test case that fails on powerpc64 [PR108728]

2023-04-17 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch xfails a float128 comparison test case on powerpc64 that fails due to a longstanding issue with floating-point compares. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684 for more information. The case is xfailed when instructions of float128 hardware are generated. When

[PATCH-2, rs6000] Add ppc_cpu_supports_hw into proc is-effective-target-keyword [PR108728]

2023-04-17 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds ppc_cpu_supports_hw into explicit name checking in proc is-effective-target-keyword. So ppc_cpu_supports_hw can be used as a target selector in test directives. The patch passed regression test on Power Linux platforms. Thanks Gui Haochen ChangeLog rs6000: Add ppc_cpu_sup

Re: [PATCH, rs6000] xfail float128 comparison test case that fails on powerpc64 [PR108728]

2023-04-13 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen, On 2023/4/13 16:32, Kewen.Lin wrote: > xfail all powerpc*-*-* can have some XPASSes on those ENVs with > software emulation. Since the related hw insn xscmpuqp is guarded > with TARGET_FLOAT128_HW, could we use the effective target > ppc_float128_hw instead? Thanks for your review comments

[PATCH, rs6000] xfail float128 comparison test case that fails on powerpc64 [PR108728]

2023-04-11 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch xfails a float128 comparison test case on powerpc64 that fails due to a longstanding issue with floating-point compares. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684 for more information. The patch passed regression test on Power Linux platforms. Thanks Gui Haochen

[PATCHv3, rs6000] rs6000: correct vector sign extend built-ins on Big Endian [PR108812]

2023-04-05 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch removes the byte reverse operation before vector integer sign extension on big endian. These built-ins are required to sign extend the element of the input vector that would fall in the least significant portion of the result element. So both BE and LE should do the same operation and the b

[PATCHv2, rs6000] rs6000: correct vector sign extend built-ins on Big Endian [PR108812]

2023-03-28 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch removes the byte reverse operation before vector integer sign extension on big endian. These built-ins are required to sign extend the element of the input vector that would fall in the least significant portion of the result element. So both BE and LE should do the same operation and the b

Re: [PATCH] [rs6000] Correct match pattern in pr56605.c

2023-03-27 Thread HAO CHEN GUI via Gcc-patches

Kewen, The case still fails with trunk. FAIL: gcc.target/powerpc/pr56605.c scan-rtl-dump-times combine "\\(compare:CC \\((?:and|zero_extend):(?:[SD]I) \\((?:sub)?reg:[SD]I" 1 === gcc Summary === # of expected passes 1 # of unexpected failures 1 With the tr

[PATCH, rs6000] rs6000: correct vector sign extend built-ins on Big Endian [PR108812]

2023-03-26 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch removes the byte reverse operation before vector integer sign extension on Big Endian. These built-ins are required to sign extend the rightmost element. So both BE and LE should do the same operation and the byte reversal is not needed. This patch fixes it. Now these built-ins have the same

Re: [PATCH-1, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106]

2023-03-21 Thread HAO CHEN GUI via Gcc-patches

Hi Richard, On 2023/3/16 15:57, Richard Biener wrote: > I'm not sure if careful constraints massaging like adding magic letters to > alternatives with constants to pessimize them for LRA, making them > more expensive than spilling the constant to a register but avoid > secondary reloads with spilling

[PATCHv4, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-20 Thread HAO CHEN GUI via Gcc-patches

Hi, I refined the patch according to reviewer's advice. The main change is to check if buffer_p is set and buffered error exists. Also two regtests are fixed by catching the new error. I sent out the revised one for review due to my limited knowledge on Fortran front end. The patch escalate

Ping [PATCHv3, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-19 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613497.html Thanks Gui Haochen On 2023/3/7 16:55, HAO CHEN GUI wrote: > Hi, > The patch escalates the failure when Hollerith constant to real conversion > fails in native_interpret_expr. It finally reports a "Cannot sim

Re: [PATCH-1, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106]

2023-03-16 Thread HAO CHEN GUI via Gcc-patches

Hi Richard, On 2023/3/16 18:36, Richard Biener wrote: > On Thu, Mar 16, 2023 at 10:04 AM HAO CHEN GUI wrote: >> >> Hi Richard, >> >> On 2023/3/16 15:57, Richard Biener wrote: >>> So this is one way around the lack of CSE/PRE of constant operands. I'd >>> argue that a better spot for this _might_ be LRA

Re: [PATCH-1, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106]

2023-03-16 Thread HAO CHEN GUI via Gcc-patches

Hi Richard, On 2023/3/16 15:57, Richard Biener wrote: > So this is one way around the lack of CSE/PRE of constant operands. I'd > argue that a better spot for this _might_ be LRA (split the constant out if > there's a free register available), postreload-[g]cse (CSE the constants) and > then maybe cp

[PATCH-2, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106]

2023-03-15 Thread HAO CHEN GUI via Gcc-patches

Hi, The background and motivation of the patch are listed in the note of PATCH-1. This patch changes the expander of ior/xor and forces the constant into a pseudo when it needs 2 insns. Also a combine and split pattern for ior/xor is defined. rtx_cost of ior insn is adjusted as now it may have 2 insns

[PATCH-1, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106]

2023-03-15 Thread HAO CHEN GUI via Gcc-patches

Hi, Currently, rs6000 directly expands to 2 insns if an integer constant is the second operand and it needs two insns. For example, addi/addis and ori/oris. It may not benefit when the constant is used for more than 2 times in an extended basic block, just like the case in PR shows. One possib
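
To make "needs two insns" concrete for the ori/oris case, here is a small sketch in portable C. It is not taken from the patch; the helper names are hypothetical, and it only models a 32-bit constant, where ori supplies the low 16 bits and oris supplies the high 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 32-bit constant into the immediate that oris would carry
   (high 16 bits) and the immediate that ori would carry (low 16 bits).  */
void split_ior_constant(uint32_t c, uint16_t *hi, uint16_t *lo)
{
    *hi = (uint16_t)(c >> 16);
    *lo = (uint16_t)(c & 0xffffu);
    /* The two halves must reassemble into the original constant.  */
    assert((((uint32_t)*hi << 16) | *lo) == c);
}

/* Number of OR-immediate instructions needed for `x | c` when the
   constant is used directly as an immediate operand.  */
unsigned ior_insn_count(uint32_t c)
{
    uint16_t hi, lo;
    split_ior_constant(c, &hi, &lo);
    return (hi != 0) + (lo != 0);
}
```

A constant such as 0x12345678 needs both instructions at every use, which is the situation where loading it into a register once may be cheaper if it is used several times.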

Re: [PATCH] testsuite, rs6000: Adjust ppc-fortran.exp to support dg-{warning,error}

2023-03-10 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen, I tested it with my fortran test case. It works. Thanks a lot. Gui Haochen On 2023/3/6 17:27, Kewen.Lin wrote: > Hi, > > According to Haochen's finding in [1], currently ppc-fortran.exp > doesn't support Fortran specific warning or error messages well. > By looking into it, it's due to t

[PATCHv3, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-07 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch escalates the failure when Hollerith constant to real conversion fails in native_interpret_expr. It finally reports a "Cannot simplify expression" error in do_simplify method. The patch of pr95450 added a verification for decoding/encoding checking in native_interpret_expr. nati

Re: [PATCHv2, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-03 Thread HAO CHEN GUI via Gcc-patches

Hi Tobias, On 2023/3/3 17:29, Tobias Burnus wrote: > But could you also include the 'gcc/fortran/intrinsic.cc' change > proposed in > https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613030.html (and > acknowledged by Steve)? Sure, I will merge it into the patch and do the regression test. Addi

Re: [PATCHv2, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-03 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch passed regression test on Power Linux platforms. Sorry for missing the information. Gui Haochen On 2023/3/3 17:12, HAO CHEN GUI via Gcc-patches wrote: > Hi, > The patch escalates the failure when Hollerith constant to real conversion > fails in native_interpret_expr. I

[PATCHv2, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-03 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch escalates the failure when Hollerith constant to real conversion fails in native_interpret_expr. It finally reports an "Unclassifiable statement" error. The patch of pr95450 added a verification for decoding/encoding checking in native_interpret_expr. native_interpret_expr may fa

[PATCH, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-02-28 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch escalates the failure when Hollerith constant to real conversion fails in native_interpret_expr. It finally reports an "Unclassifiable statement" error. The patch of pr95450 added a verification for decoding/encoding checking in native_interpret_expr. native_interpret_expr may fa

[PATCHv2, rs6000] Merge two vector shift when their sources are the same

2023-02-27 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch merges two "vsldoi" insns when their sources are the same. Particularly, it is simplified to be one move if the total shift is a multiple of 16 bytes. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Thanks Gui Haochen ChangeLog 2023-02-28 Haochen Gui
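
The reason the merge is valid can be shown with a small software model. The sketch below is illustrative and not from the patch: the function names are hypothetical, and it assumes "sources are the same" means both inputs of each vsldoi are the same register, in which case the instruction is a byte rotation and two rotations add modulo 16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of "vsldoi vd, va, vb, sh": concatenate va and vb (va first)
   into 32 bytes and take the 16 bytes starting at byte offset sh.  */
void model_vsldoi(uint8_t vd[16], const uint8_t va[16],
                  const uint8_t vb[16], unsigned sh)
{
    uint8_t cat[32];
    assert(sh < 16);
    memcpy(cat, va, 16);
    memcpy(cat + 16, vb, 16);
    memcpy(vd, cat + sh, 16);
}

/* Two back-to-back shifts, each with identical sources.  */
void two_shifts(uint8_t vd[16], const uint8_t v[16],
                unsigned sh1, unsigned sh2)
{
    uint8_t tmp[16];
    model_vsldoi(tmp, v, v, sh1);
    model_vsldoi(vd, tmp, tmp, sh2);
}

/* The merged form: one shift by the sum, or a plain move when the
   total is a multiple of 16 bytes.  */
void merged_shift(uint8_t vd[16], const uint8_t v[16],
                  unsigned sh1, unsigned sh2)
{
    unsigned total = (sh1 + sh2) % 16;
    if (total == 0)
        memcpy(vd, v, 16);
    else
        model_vsldoi(vd, v, v, total);
}
```

In this model `two_shifts` and `merged_shift` give identical results for every pair of shift amounts from 0 to 15.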

Ping [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2023-02-20 Thread HAO CHEN GUI via Gcc-patches

Hi, Gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html Gui Haochen Thanks On 2023/2/8 13:08, HAO CHEN GUI wrote: > Hi, > The logical operations for TImode are split after reload pass right now. Some > potential optimizations are missed as the split is too late. This

[PATCH, rs6000] Merge two vector shift when their sources are the same

2023-02-20 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch merges two "vsldoi" insns when their sources are the same. Particularly, it is simplified to be one move if the total shift is a multiple of 16 bytes. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Thanks Gui Haochen ChangeLog 2023-02-20 Haochen Gui

[PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2023-02-07 Thread HAO CHEN GUI via Gcc-patches

Hi, The logical operations for TImode are split after reload pass right now. Some potential optimizations are missed as the split is too late. This patch removes TImode from "AND", "IOR", "XOR" and "NOT" expander so that these logical operations can be split at expand pass. The new test case illustrates

[PATCH, rs6000] Convert TI AND with a special constant to DI AND [PR93123]

2023-01-18 Thread HAO CHEN GUI via Gcc-patches

Hi, A TI AND with a special constant (the high part or low part is all ones) may be converted to a DI AND with a 64-bit constant and a simple DI move. When the DI AND can be implemented by rotate and mask or "andi.", it eliminates the 128-bit constant loading to save the cost. The patch c
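
The equivalence behind this conversion can be sketched in portable C by modeling a 128-bit value as two 64-bit halves. This is an illustration under that assumption, not code from the patch; the type `ti_t` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit (TImode-like) value modeled as two 64-bit halves.  */
typedef struct { uint64_t hi, lo; } ti_t;

/* Reference: AND both halves with the full 128-bit constant.  */
ti_t ti_and_full(ti_t x, ti_t c)
{
    return (ti_t){ x.hi & c.hi, x.lo & c.lo };
}

/* Special case: one half of the constant is all ones, so that half of
   the result is a plain 64-bit move and only the other half needs a
   64-bit AND.  */
ti_t ti_and_special(ti_t x, ti_t c)
{
    ti_t r;
    assert(c.hi == UINT64_MAX || c.lo == UINT64_MAX);
    if (c.hi == UINT64_MAX) {
        r.hi = x.hi;           /* DI move */
        r.lo = x.lo & c.lo;    /* DI AND  */
    } else {
        r.lo = x.lo;           /* DI move */
        r.hi = x.hi & c.hi;    /* DI AND  */
    }
    return r;
}
```

Both functions return the same value whenever the precondition holds, which is why the 128-bit constant never has to be materialized in that case.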

[PATCH-4, rs6000] Change ilp32 target check for some scalar-extract-sig and scalar-insert-exp test cases

2023-01-03 Thread HAO CHEN GUI via Gcc-patches

Hi, "ilp32" is used in these test cases to make sure test cases only run on a 32-bit environment. Unfortunately, these cases also run with "-m32/-mpowerpc64" which causes unexpected errors. This patch changes the target check to skip if "has_arch_ppc64" is set. So the test cases won't run when ar

[PATCH-3, rs6000] Change mode and insn condition for scalar insert exp instruction

2023-01-03 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch changes the mode of exponent to GPR in scalar insert exp pattern, as the exponent can be put into a 32-bit register. Also the condition check is changed from TARGET_64BIT to TARGET_POWERPC64. The test cases are modified according to the changes of expand pattern. Bootstrapped

[PATCH-2, rs6000] Change mode and insn condition for scalar extract sig instruction

2023-01-03 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch changes the return type of __builtin_vsx_scalar_extract_sig from const signed long to const signed long long, so that it can be called with "-m32/-mpowerpc64" option. The bif needs TARGET_POWERPC64 instead of TARGET_64BIT. So the condition check in the expander is changed. The t
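
A portable C model shows why a 32-bit `long` is too narrow for the result. This is a sketch under stated assumptions, not the built-in itself: it assumes IEEE 754 binary64 doubles and models the significand as the 52 fraction bits plus the implicit leading one for normal numbers. The name `model_extract_sig` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of extracting the significand of a double: the 52 fraction
   bits, with the implicit bit set when the value is a normal number
   (biased exponent neither 0 nor 0x7ff).  */
uint64_t model_extract_sig(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);

    uint64_t exponent = (bits >> 52) & 0x7ff;
    uint64_t fraction = bits & ((UINT64_C(1) << 52) - 1);

    if (exponent != 0 && exponent != 0x7ff)
        fraction |= UINT64_C(1) << 52;   /* implicit leading one */
    return fraction;
}
```

For 1.0 the model returns 0x0010000000000000, a 53-bit quantity, so the result type must be 64 bits wide even when `long` is only 32 bits.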

[PATCH-1, rs6000] Change mode and insn condition for scalar extract exp instruction

2023-01-03 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch changes the return type of __builtin_vsx_scalar_extract_exp from const signed long to const signed int, as the exponent can be put in a signed int. It is also in line with the external interface definition of the bif. The mode of exponent operand in "xsxexpdp" is changed to GPR mode
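
The claim that the exponent fits in an int is easy to check with a portable model. The sketch below is not the built-in: it assumes IEEE 754 binary64 doubles and returns the biased 11-bit exponent field, and the name `model_extract_exp` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of extracting the biased exponent field of a double.  The
   field is 11 bits wide, so the result is always in 0..2047.  */
int model_extract_exp(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return (int)((bits >> 52) & 0x7ff);
}
```

For example, the model gives 1023 for 1.0 and 1024 for 2.0, both far below the range of a 32-bit int.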

[PATCH v6, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-12-19 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch fixes several problems: 1. The exponent of double-precision can be put into a SImode register. So "xsxexpdp" doesn't require 64-bit environment. Also "xsxsigdp", "xsiexpdp" and "xsiexpdpf" can put exponent into a GPR register. 2. "TARGET_64BIT" check in insn cond

PING [PATCH, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2022-12-13 Thread HAO CHEN GUI via Gcc-patches

Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601909.html Thanks Gui Haochen On 2022/9/21 13:13, HAO CHEN GUI wrote: > Hi, > This patch adds a new insn for vector splat with small V2DI constants on P8. > If the value of constant is in RANGE (-16, 15) and not 0 or

Re: [PATCH v5, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-12-11 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen, On 2022/12/8 16:47, Kewen.Lin wrote: > This documentation update reminds me of that the current prototype of > __ieee128 > variant can be: > > unsigned int scalar_extract_exp (__ieee128 source); > > type unsigned int is enough for the exponent. It means xsxexpqp_ can > also > use SImo

[PATCH v4, rs6000] Enable have_cbranchcc4 on rs6000

2022-12-07 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch enables "have_cbranchcc4" on rs6000 by defining a "cbranchcc4" expander. "have_cbranchcc4" is a flag in ifcvt.cc to indicate if branch by CC bits is invalid or not. With this flag enabled, some branches can be optimized to conditional moves. Compared to last version, the main ch

Re: [PATCH v3, rs6000] Enable have_cbranchcc4 on rs6000

2022-12-06 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen, Thanks so much for your review comments. I will fix them. On 2022/12/7 11:06, Kewen.Lin wrote: > Does this issue which relies on the fix for generic part make bootstrapping > fail? > If no, how many failures it can cause? I'm thinking if we can commit this > firstly, > then in the commi

[PATCH v2] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-12-06 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch adds a new conversion to convert a certain branch to conditional ternary set in ifcvt. The branch commonly has following insns. cond_jump ? pc : label setcc (neg/subreg) label: set a constant cond_jump and setcc use the same CC reg and neg/subreg is optional. The br

[PATCH v3, rs6000] Enable have_cbranchcc4 on rs6000

2022-12-05 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch enables "have_cbranchcc4" on rs6000 by defining a "cbranchcc4" expander. "have_cbranchcc4" is a flag in ifcvt.cc to indicate whether a branch by CC bits is valid. With this flag enabled, some branches can be optimized to conditional moves. The patch relies on the former patch

Re: [PATCH v2] Return a NULL rtx when targets don't support cbranchcc4 or predicate check fails in prepare_cmp_insn

2022-12-05 Thread HAO CHEN GUI via Gcc-patches

Hi Richard, On 2022/12/5 15:31, Richard Biener wrote: > I wonder if you have a testcase you can add showing this change is > worthwhile and > fixes a bug? I want to enable cbranchcc4 on rs6000. But not all sub CCmode is supported on rs6000. So the predicate check (assert) fails and it hits ICE. I draf

[PATCH v2] Return a NULL rtx when targets don't support cbranchcc4 or predicate check fails in prepare_cmp_insn

2022-12-04 Thread HAO CHEN GUI via Gcc-patches

Hi, It gets an assertion failure when targets don't support cbranchcc4 or predicate check fails in prepare_cmp_insn. prepare_cmp_insn is a helper function to generate compare rtx, so it should not assume that cbranchcc4 exists or all sub-CC modes are supported on one target. I think it should

[PATCH v5, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-12-01 Thread HAO CHEN GUI via Gcc-patches

Hi, For scalar extract/insert instructions, exponent field can be stored in a 32-bit register. So this patch changes the mode of exponent field from DI to SI so that these instructions can be generated in a 32-bit environment. Also it removes TARGET_64BIT check for these instructions. The inst
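The claim that the exponent field fits in a 32-bit register can be illustrated in portable C. This is a minimal sketch, not the rs6000 builtin or the patch itself: the function name `extract_exp_binary64` is hypothetical, and it assumes `double` is IEEE 754 binary64, whose biased exponent is 11 bits wide (the binary128 exponent is 15 bits, which also fits easily in 32 bits).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a GCC builtin): return the biased exponent
   field of an IEEE 754 binary64 value.  The field is 11 bits wide, so a
   32-bit result is more than sufficient. */
uint32_t extract_exp_binary64(double x)
{
    uint64_t bits;

    /* Copy the object representation to avoid strict-aliasing problems. */
    memcpy(&bits, &x, sizeof bits);

    /* Bits 52..62 hold the biased exponent. */
    return (uint32_t)((bits >> 52) & 0x7ff);
}
```

With this sketch, `extract_exp_binary64(1.0)` yields the bias itself, 1023, and doubling the argument increments the result by one.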

Re: [PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-12-01 Thread HAO CHEN GUI via Gcc-patches

Hi Nilsson, On 2022/12/2 10:49, Hans-Peter Nilsson wrote: > On Wed, 23 Nov 2022, HAO CHEN GUI via Gcc-patches wrote: > >> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi >> index 92bda1a7e14..9823eccbe68 100644 >> --- a/gcc/doc/tm.texi >> +++ b/gcc/doc/tm.texi >&

Re: Ping [PATCH] Change the behavior of predicate check failure on cbranchcc4 operand0 in prepare_cmp_insn

2022-11-28 Thread HAO CHEN GUI via Gcc-patches

Hi Richard, On 2022/11/29 2:46, Richard Biener wrote: > Anyhow - my question still stands - what's the fallback for the callers > that do not check for failure? How are we sure we're not running into > these when relaxing the requirement that a MODE_CC prepare_cmp_insn > must not fail? I examined the

Ping [PATCH] Change the behavior of predicate check failure on cbranchcc4 operand0 in prepare_cmp_insn

2022-11-27 Thread HAO CHEN GUI via Gcc-patches

Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607083.html Thanks Gui Haochen On 2022/11/23 10:54, HAO CHEN GUI wrote: > Hi, > I want to enable "have_cbranchcc4" on rs6000. But not all combinations of > comparison codes and sub CC modes are benefited to generate cbr

Re: [PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-11-23 Thread HAO CHEN GUI via Gcc-patches

Hi Richard, On 2022/11/24 4:06, Richard Biener wrote: > Wouldn't we usually either add an optab or try to recog a canonical > RTL form instead of adding a new target hook for things like this? Thanks so much for your comments. Please let me make it clear. Do you mean we should create an optab for "

[PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-11-22 Thread HAO CHEN GUI via Gcc-patches

Hi, There is a new insn on my target, which has a nested if_then_else and sets -1, 0 or 1 according to a comparison. [(set (match_operand:SI 0 "gpc_reg_operand" "=r") (if_then_else:SI (lt (match_operand:CC 1 "cc_reg_operand" "y") (const_int 0))
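At the source level, a nested if_then_else that produces -1, 0 or 1 corresponds to a three-way comparison. The following is only a sketch: the function name is hypothetical, and because the RTL above is truncated, the mapping assumed here (less-than gives -1, greater-than gives 1, otherwise 0) is an assumption rather than something the snippet confirms.

```c
/* Hypothetical source-level shape of a three-way comparison.
   Assumed mapping: a < b -> -1, a > b -> 1, otherwise 0. */
int three_way_compare(long a, long b)
{
    return (a < b) ? -1 : ((a > b) ? 1 : 0);
}
```

Code of this shape is the kind of branchy sequence that a conditional ternary set could replace with a single instruction on targets that provide one.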

[PATCH] Change the behavior of predicate check failure on cbranchcc4 operand0 in prepare_cmp_insn

2022-11-22 Thread HAO CHEN GUI via Gcc-patches

Hi, I want to enable "have_cbranchcc4" on rs6000. But not all combinations of comparison codes and sub CC modes benefit from generating cbranchcc4 insns on rs6000. There is a predicate for operand0 of cbranchcc4 to bypass some combinations. It triggers an assertion failure in prepare_cmp_insn. I thin

Re: [PATCHv2, rs6000] Enable have_cbranchcc4 on rs6000

2022-11-21 Thread HAO CHEN GUI via Gcc-patches

Hi Segher, Thanks for your comments. On 2022/11/22 7:49, Segher Boessenkool wrote: > *cbranch_2insn is not a machine insn. It generates a cror and a branch > insn. This makes no sense to have in a cbranchcc: those do a branch > based on an existing cr field, so based on the *output* of that cror. >

Re: [PATCHv2, rs6000] Enable have_cbranchcc4 on rs6000

2022-11-21 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen, On 2022/11/22 11:11, Kewen.Lin wrote: > Maybe we can adjust prepare_cmp_insn to fail if the constructed cbranchcc4 > pattern doesn't satisfy the predicate of operand 0 rather than to assert. > It's something like: > > if (!insn_operand_matches (icode, 0, test)) > goto fail; > > or only a

Re: [PATCHv2, rs6000] Enable have_cbranchcc4 on rs6000

2022-11-20 Thread HAO CHEN GUI via Gcc-patches

Hi Segher, On 2022/11/18 20:18, Segher Boessenkool wrote: > I don't think we should pretend we have any conditional jumps the > machine does not actually have, in cbranchcc4. When would this ever be > useful? cror;beq can be quite expensive, compared to the code it would > replace anyway. > > If so

Re: [PATCHv2, rs6000] Enable have_cbranchcc4 on rs6000

2022-11-17 Thread HAO CHEN GUI via Gcc-patches

Hi David, On 2022/11/17 21:24, David Edelsohn wrote: > This is better, but the pattern should be near and after the existing > cbranch4 patterns earlier in the file, not the *cbranch pattern. It > doesn't match the comment. Sure, I will put it after existing "cbranch4" patterns. > > Why are you u

[PATCHv2, rs6000] Enable have_cbranchcc4 on rs6000

2022-11-16 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch enables have_cbranchcc4, which is a flag in ifcvt.cc to indicate whether a branch by CC bits is valid. The new expand pattern "cbranchcc4" is created, which intends to match the pattern defined in "*cbranch", "*cbranch_2insn" and "*creturn". The operand sequence in "cbranchcc4" is in

Re: [rs6000, patch] Enable have_cbranchcc4 on rs6000

2022-11-15 Thread HAO CHEN GUI via Gcc-patches

Hi David, I found the definition of the operands in 'cbranch'. The arguments matter. I will create a new expand pattern for cbranchcc4. Thanks a lot for your comments. 'cbranchmode4' Conditional branch instruction combined with a compare instruction. Operand 0 is a comparison operator. Operand 1 an

[rs6000, patch] Enable have_cbranchcc4 on rs6000

2022-11-15 Thread HAO CHEN GUI via Gcc-patches

Hi, The patch enables have_cbranchcc4, which is a flag in ifcvt.cc to indicate whether a branch by CC bits is valid. As rs6000 already has the "*cbranch" insn, which branches according to CC bits, the flag should be enabled and relevant branches can be optimized out. The test case illustrates t

[PATCH v4, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-11-06 Thread HAO CHEN GUI via Gcc-patches

Hi, For scalar extract/insert instructions, exponent field can be stored in a 32-bit register. So this patch changes the mode of exponent field from DI to SI. So these instructions can be generated in a 32-bit environment. The patch removes TARGET_64BIT check for these instructions. The instr

[PATCH-2, rs6000] Reverse V8HI on Power8 by vector rotation [PR100866]

2022-10-23 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch implements V8HI byte reverse on Power8 by vector rotation. It should be more efficient than the original vector permute. The patch comes from Xionghu's comments in the PR. I just added a test case for it. Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. Is this okay for
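The idea behind reversing bytes by rotation is that rotating a 16-bit element by 8 bits swaps its two bytes. A scalar sketch in plain C, applied to eight halfwords to mirror a V8HI vector, might look as follows. The function names are hypothetical and this is not the patch's implementation, which operates on vector registers.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits, which swaps its two bytes. */
uint16_t rotl16_by_8(uint16_t h)
{
    return (uint16_t)((h << 8) | (h >> 8));
}

/* Hypothetical scalar model of a V8HI byte reverse: apply the same
   8-bit rotation to each of the eight halfword elements. */
void bswap_v8hi(uint16_t v[8])
{
    for (int i = 0; i < 8; i++)
        v[i] = rotl16_by_8(v[i]);
}
```

Because every element receives the same rotation, no permute control vector is needed, which is the source of the expected saving.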

[PATCH-1, rs6000] Generate permute index directly for little endian target [PR100866]

2022-10-11 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch modifies the helper function which generates the permute index for vector byte reversal and generates the permute index directly for little endian targets. It saves one "xxlnor" instruction on P8 little endian targets as the original process needs an "xxlnor" to calculate complement for th

[PATCH v7, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-09-25 Thread HAO CHEN GUI via Gcc-patches

Hi, This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. Tests show that outputs of xs[min/max]dp are consistent with the standard of C99 fmin/max. This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead of smin/max when fast-math is not set. While fast-math i

Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-09-22 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen & Segher, Thanks so much for your review comments. On 22/9/2022 10:28 AM, Kewen.Lin wrote: > on 2022/9/22 05:56, Segher Boessenkool wrote: >> Hi! >> >> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote: >>> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instea

Ping [PATCH v3, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601196.html Thanks. On 7/9/2022 3:44 PM, HAO CHEN GUI wrote: > Hi, > > For scalar extract/insert instructions, exponent field can be stored in a > 32-bit register. So this patch changes the mode of exponent fiel

Ping^3 [PATCH v2, rs6000] Use CC for BCD operations [PR100736]

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html Thanks. On 1/8/2022 10:02 AM, HAO CHEN GUI wrote: > Hi, > Gentle ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597020.html > Thanks. > > On 4/7/2022 2:33 PM, HAO CHEN GUI wrote: >> H

Ping^3 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-09-20 Thread HAO CHEN GUI via Gcc-patches

Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html Thanks. On 1/8/2022 10:03 AM, HAO CHEN GUI wrote: > Hi, > Gentle ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html > Thanks. > > > On 4/7/2022 2:32 PM, HAO CHEN GUI wrote: >> H
