Re: [Patch-2, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]

2024-06-05 Thread HAO CHEN GUI
Hi Kewen, On 2024/6/5 17:00, Kewen.Lin wrote: > This predicate can be moved to its only use (define_insn part condition). > The const_vector match_code check is redundant as const_vec_duplicate_p > already checks that, I wonder if we really need easy_altivec_constant? > Even if one vector constant

Ping [Patch-2, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]

2024-06-04 Thread HAO CHEN GUI
Hi, Gently ping the patch. https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643995.html Thanks Gui Haochen On 2024/1/26 9:17, HAO CHEN GUI wrote: > Hi, > This patch creates an insn_and_split pattern which helps the duplicated > constant vector replace the source pseudo of s

Re: [PATCH-1] fwprop: Replace rtx_cost with insn_cost in try_fwprop_subst_pattern [PR113325]

2024-06-04 Thread HAO CHEN GUI
Hi Jeff, On 2024/6/4 22:14, Jeff Law wrote: > > > On 1/25/24 6:16 PM, HAO CHEN GUI wrote: >> Hi, >>    This patch replaces rtx_cost with insn_cost in forward propagation. >> In the PR, one constant vector should be propagated and replace a >> pseudo in a store

Ping [PATCH-1v3, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-06-02 Thread HAO CHEN GUI
[PATCH-3v3, rs6000] Implement optab_isnormal for SFDF and IEEE128 https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652595.html Thanks Gui Haochen On 2024/5/24 14:02, HAO CHEN GUI wrote: > Hi, > This patch implemented optab_isinf for SFDF and IEEE128 by test > data class instructions. >

Ping [PATCHv5] Optab: add isnormal_optab for __builtin_isnormal

2024-06-02 Thread HAO CHEN GUI
Hi, All issues were addressed. Gently ping it. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653001.html Thanks Gui Haochen On 2024/5/29 14:36, HAO CHEN GUI wrote: > Hi, > This patch adds an optab for __builtin_isnormal. The normal check can be > implemented on rs6000 by

Ping [PATCHv5] Optab: add isfinite_optab for __builtin_isfinite

2024-06-02 Thread HAO CHEN GUI
Hi, All issues were addressed. Gently ping it. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652991.html Thanks Gui Haochen On 2024/5/29 14:36, HAO CHEN GUI wrote: > Hi, > This patch adds an optab for __builtin_isfinite. The finite check can be > implemented on rs6000 by

[PATCHv2, rs6000] Optimize vector construction with two vector doubleword loads [PR103568]

2024-05-30 Thread HAO CHEN GUI
Hi, This patch optimizes vector construction with two vector doubleword loads. It generates an optimal insn sequence as "xxlor" has lower latency than "mtvsrdd" on Power10. Compared with previous version, the main change is to use "isa" attribute to guard "lxsd" and "lxsdx".

[PATCH, rs6000] Optimize vector construction with two vector doubleword loads [PR103568]

2024-05-29 Thread HAO CHEN GUI
Hi, This patch optimizes vector construction with two vector doubleword loads. It generates an optimal insn sequence as "xxlor" has lower latency than "mtvsrdd" on Power10. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. OK for the trunk? Thanks Gui Haochen

Re: [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]

2024-05-29 Thread HAO CHEN GUI
Hi Kewen, On 2024/5/29 13:26, Kewen.Lin wrote: > I can understand re-using "unordered" and "eq" will save some efforts than > doing with unspecs, but they are actually RTL codes instead of bits on the > specific hardware CR, a downside is that people who aren't aware of this > design point can have

[PATCH-1v3] Value Range: Add range op for builtin isinf

2024-05-29 Thread HAO CHEN GUI
Hi, The builtin isinf is not folded at front end if the corresponding optab exists. This causes range evaluation to fail on targets which have optab_isinf. For instance, range-sincos.c will fail on targets which have optab_isinf as it calls builtin_isinf. This patch fixed the problem

[PATCH-3v2] Value Range: Add range op for builtin isnormal

2024-05-29 Thread HAO CHEN GUI
Hi, This patch adds the range op for builtin isnormal. It also adds two helper functions in frange to detect the range of normal floating-point and the range of subnormal or zero. Compared to the previous version, the main change is to set the range to 1 if it's a normal number, otherwise to 0.

[PATCH-2v4] Value Range: Add range op for builtin isfinite

2024-05-29 Thread HAO CHEN GUI
Hi, This patch adds the range op for builtin isfinite. Compared to the previous version, the main change is to set the range to 1 if it's a finite number, otherwise to 0. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652220.html Bootstrapped and tested on x86 and powerpc64-linux BE and LE
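A minimal sketch of the folding rule described in this entry, assuming a toy interval type rather than GCC's real frange class. The names toy_frange, fold_result and fold_isfinite are hypothetical and do not come from the patch; the sketch only illustrates "1 if known finite, 0 if known not finite, varying when nothing is known".

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical result of folding isfinite over a range.  */
typedef enum { RESULT_ZERO, RESULT_ONE, RESULT_VARYING } fold_result;

/* Hypothetical toy range: every value lies in [lo, hi], and the value
   may additionally be a NaN when maybe_nan is set.  Assumes lo <= hi.  */
typedef struct {
  double lo;
  double hi;
  bool maybe_nan;
} toy_frange;

static fold_result fold_isfinite(toy_frange r)
{
  /* Both bounds finite and no NaN possible: isfinite is always 1.  */
  if (!r.maybe_nan && isfinite(r.lo) && isfinite(r.hi))
    return RESULT_ONE;

  /* The range is a single infinity and no NaN is possible: always 0.  */
  if (!r.maybe_nan && isinf(r.lo) && r.lo == r.hi)
    return RESULT_ZERO;

  /* Nothing definite is known: report varying instead of failing.  */
  return RESULT_VARYING;
}
```

For example, a range of [0.0, 1.0] without NaN folds to 1, while [0.0, +Inf] stays varying because it contains both finite and infinite values.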

[PATCHv5] Optab: add isnormal_optab for __builtin_isnormal

2024-05-29 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isnormal. The normal check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

[PATCHv5] Optab: add isfinite_optab for __builtin_isfinite

2024-05-29 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isfinite. The finite check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

[PATCHv4] Optab: add isnormal_optab for __builtin_isnormal

2024-05-28 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isnormal. The normal check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

[PATCHv4] Optab: add isfinite_optab for __builtin_isfinite

2024-05-28 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isfinite. The finite check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

[PATCHv3] Optab: add isnormal_optab for __builtin_isnormal

2024-05-27 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isnormal. The normal check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

[PATCHv3] Optab: add isfinite_optab for __builtin_isfinite

2024-05-27 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isfinite. The finite check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

Re: [PATCHv2] Optab: add isfinite_optab for __builtin_isfinite

2024-05-27 Thread HAO CHEN GUI
Hi Kewen, Thanks for your comments. On 2024/5/27 11:18, Kewen.Lin wrote: > Does this require "This pattern is not allowed to FAIL."? > > I guess yes? Since if it's decided to go with this pattern > expanding, there is no fall back? The builtin is inline folded if the optab doesn't exist on the

Ping^2 [Patch, rs6000] Enable overlap memory store for block memory clear

2024-05-26 Thread HAO CHEN GUI
Hi, Gently ping it. https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646478.html Thanks Gui Haochen On 2024/5/8 9:55, HAO CHEN GUI wrote: > Hi, > As now it's stage 1, gently ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646478.html > > Thanks >

Ping [PATCH-1v2] Value Range: Add range op for builtin isinf

2024-05-26 Thread HAO CHEN GUI
.html [PATCH-3] Value Range: Add range op for builtin isnormal https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652221.html Thanks Gui Haochen On 2024/5/21 10:52, HAO CHEN GUI wrote: > Hi, > The builtin isinf is not folded at front end if the corresponding optab > exists. It causes

Ping [PATCHv2] Optab: add isnormal_optab for __builtin_isnormal

2024-05-26 Thread HAO CHEN GUI
Hi, Gently ping it. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652172.html Thanks Gui Haochen On 2024/5/20 16:15, HAO CHEN GUI wrote: > Hi, > This patch adds an optab for __builtin_isnormal. The normal check can be > implemented on rs6000 by a single instruction. It needs

Ping [PATCHv2] Optab: add isfinite_optab for __builtin_isfinite

2024-05-26 Thread HAO CHEN GUI
Hi, Gently ping it. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652170.html Thanks Gui Haochen On 2024/5/20 16:15, HAO CHEN GUI wrote: > Hi, > This patch adds an optab for __builtin_isfinite. The finite check can be > implemented on rs6000 by a single instruction. It needs

Ping^2 [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]

2024-05-26 Thread HAO CHEN GUI
Hi, Gently ping them. Thanks Gui Haochen On 2024/5/13 9:56, HAO CHEN GUI wrote: > Hi, > Gently ping the series of patches. > [PATCH-1, rs6000]Add a new type of CC mode - CCBCD for bcd insns [PR100736, > PR114732] > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650217.

[PATCH-3v3, rs6000] Implement optab_isnormal for SFDF and IEEE128

2024-05-24 Thread HAO CHEN GUI
Hi, This patch implemented optab_isnormal for SFDF and IEEE128 by test data class instructions. Compared with the previous version, the main change is to narrow down the predicate for the float operand according to the reviewer's advice. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652130.html

[PATCH-2v3, rs6000] Implement optab_isfinite for SFDF and IEEE128

2024-05-24 Thread HAO CHEN GUI
Hi, This patch implemented optab_isfinite for SFDF and IEEE128 by test data class instructions. Compared with the previous version, the main change is to narrow down the predicate for the float operand according to the reviewer's advice. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652129.html

[PATCH-1v3, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-05-24 Thread HAO CHEN GUI
Hi, This patch implemented optab_isinf for SFDF and IEEE128 by test data class instructions. Compared with the previous version, the main change is to narrow down the predicate for the float operand according to the reviewer's advice. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652128.html

Re: [PATCH-1v2, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-05-23 Thread HAO CHEN GUI
Hi Peter, Thanks for your comments. On 2024/5/23 5:58, Peter Bergner wrote: > Is there a reason not to use the vsx_register_operand predicate for op1 > which matches the predicate for the operand of the xststdcp pattern > we're passing op1 to? No, I will fix them. Thanks Gui Haochen

[PATCH-3] Value Range: Add range op for builtin isnormal

2024-05-20 Thread HAO CHEN GUI
Hi, This patch adds the range op for builtin isnormal. It also adds two helper functions in frange to detect the range of normal floating-point and the range of subnormal or zero. Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no regressions. Is it OK for the trunk? Thanks Gui

[PATCH-2v3] Value Range: Add range op for builtin isfinite

2024-05-20 Thread HAO CHEN GUI
Hi, This patch adds the range op for builtin isfinite. Compared to previous version, the main change is to set varying if nothing is known about the range. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650857.html Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no

[PATCH-1v2] Value Range: Add range op for builtin isinf

2024-05-20 Thread HAO CHEN GUI
Hi, The builtin isinf is not folded at front end if the corresponding optab exists. This causes range evaluation to fail on targets which have optab_isinf. For instance, range-sincos.c will fail on targets which have optab_isinf as it calls builtin_isinf. This patch fixed the problem

[PATCHv2] Optab: add isnormal_optab for __builtin_isnormal

2024-05-20 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isnormal. The normal check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

[PATCHv2] Optab: add isfinite_optab for __builtin_isfinite

2024-05-20 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isfinite. The finite check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Compared to previous version,

Re: [PATCH] Optab: add isfinite_optab for __builtin_isfinite

2024-05-19 Thread HAO CHEN GUI
Hi Andrew, On 2024/5/19 3:42, Andrew Pinski wrote: > This is missing adding documentation for the new optab. > It should be documented in md.texi under `Standard Pattern Names For > Generation` section. Thanks for your reminder. I will add ones for all patches. Thanks Gui Haochen

[PATCH-3v2, rs6000] Implement optab_isnormal for SFDF and IEEE128

2024-05-19 Thread HAO CHEN GUI
Hi, This patch implemented optab_isnormal for SFDF and IEEE128 by test data class instructions. Compared with previous version, the main change is not to test if pseudo can be created in expand and modify dg-options and dg-finals of test cases according to reviewer's advice.

[PATCH-2v2, rs6000] Implement optab_isfinite for SFDF and IEEE128

2024-05-19 Thread HAO CHEN GUI
Hi, This patch implemented optab_isfinite for SFDF and IEEE128 by test data class instructions. Compared with previous version, the main change is not to test if pseudo can be created in expand and modify dg-options and dg-finals of test cases according to reviewer's advice.

[PATCH-1v2, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-05-19 Thread HAO CHEN GUI
Hi, This patch implemented optab_isinf for SFDF and IEEE128 by test data class instructions. Compared with previous version, the main change is to modify the dg-options and dg-finals of test cases according to reviewer's advice. https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648304.html

Re: [PATCH-4, rs6000] Implement optab_isnormal for SFmode, DFmode and TFmode [PR97786]

2024-05-16 Thread HAO CHEN GUI
Hi Segher, Thanks for your review comments. I will modify it and resend. Just one question on the insn condition. On 2024/5/17 1:25, Segher Boessenkool wrote: >> +(define_expand "isnormal2" >> + [(use (match_operand:SI 0 "gpc_reg_operand")) >> +(use (match_operand:SFDF 1 "gpc_reg_operand"))]

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-15 Thread HAO CHEN GUI
Hi Andrew, Thanks so much for your explanation. I got it. I will address the issue. Thanks Gui Haochen On 2024/5/15 2:45, Andrew MacLeod wrote: > > On 5/9/24 04:47, HAO CHEN GUI wrote: >> Hi Mikael, >> >>    Thanks for your comments. >> >> On 2024/5/9

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread HAO CHEN GUI
Hi Jakub, Thanks for your review comments. On 2024/5/14 23:57, Jakub Jelinek wrote: > BUILT_IN_ISFINITE is just one of many BUILT_IN_IS... builtins, > would be nice to handle the others as well. > > E.g. isnormal/isnan/isinf, fpclassify etc. > Yes, I already sent the patches which add range op

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread HAO CHEN GUI
Hi Mikael, Thanks for your comments. On 2024/5/9 16:03, Mikael Morin wrote: > I think the canonical API behaviour sets R to varying and returns true > instead of just returning false if nothing is known about the range. > > I'm not sure whether it makes any difference; Aldy can probably tell.

Re: [PATCH] rtlanal: Correct cost regularization in pattern_cost

2024-05-14 Thread HAO CHEN GUI
Hi, On 2024/5/10 20:50, Richard Biener wrote: > IMO given we're dispatching to the rtx_cost hook eventually it needs > documenting there or alternatively catching zero and adjusting its > result there. Of course cost == 0 ? 1 : cost is wrong as it makes > zero vs. one the same cost - using cost + 1

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-13 Thread HAO CHEN GUI
Hi Aldy, Thanks for your review comments. On 2024/5/13 19:18, Aldy Hernandez wrote: > On Thu, May 9, 2024 at 10:05 AM Mikael Morin wrote: >> >> Hello, >> >> On 07/05/2024 at 04:37, HAO CHEN GUI wrote: >>> Hi, >>>The former patch adds isf

Ping [PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]

2024-05-12 Thread HAO CHEN GUI
] Replace explicit CC bit reverse with common format https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650766.html [PATCH-6, rs6000] Split setcc to two insns after reload https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650856.html Thanks Gui Haochen On 2024/4/30 15:18, HAO CHEN GUI wrote: > Hi, >

[PATCH] rtlanal: Correct cost regularization in pattern_cost

2024-05-12 Thread HAO CHEN GUI
Hi, The cost returned from set_src_cost might be zero. Zero for pattern_cost means unknown cost. So the regularization converts the zero to COSTS_N_INSNS (1). // pattern_cost cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed); return cost > 0 ? cost : COSTS_N_INSNS

Re: [PATCH] rtlanal: Correct cost regularization in pattern_cost

2024-05-10 Thread HAO CHEN GUI
Hi Richard, Thanks for your comments. On 2024/5/10 15:16, Richard Biener wrote: > But if targets return sth < COSTS_N_INSNS (1) but > 0 this is now no > longer meaningful. So shouldn't it instead be > > return cost > 0 ? cost : 1; Yes, it's better. > > ? Alternatively returning fractions of
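The ordering concern raised in this exchange can be sketched as follows. This is only an illustration: regularize_current and regularize_suggested are hypothetical names for the two expressions quoted in the thread, and the macro mirrors GCC's COSTS_N_INSNS scale, where one instruction counts as 4 units.

```c
/* Same scale as GCC's COSTS_N_INSNS: one insn is 4 cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical wrapper around the expression quoted from pattern_cost:
   a zero (unknown) cost is bumped to one full instruction.  */
static int regularize_current(int cost)
{
  return cost > 0 ? cost : COSTS_N_INSNS(1);
}

/* Hypothetical wrapper around the alternative suggested in the review:
   a zero cost becomes the smallest positive cost instead.  */
static int regularize_suggested(int cost)
{
  return cost > 0 ? cost : 1;
}
```

With the current expression, a raw cost of 0 becomes 4 and therefore compares as more expensive than a raw cost of 2, which inverts the ordering for targets that return values between 0 and COSTS_N_INSNS (1). With the suggested expression, 0 becomes 1, which is never larger than any other positive cost.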

[PATCHv2] rs6000: Enable overlapped by-pieces operations

2024-05-10 Thread HAO CHEN GUI
Hi, This patch enables overlapped by-piece operations. On rs6000, default move/set/clear ratio is 2. So the overlap is only enabled with compare by-pieces. Compared to previous version, the change is to remove power8 requirement from test case.

[PATCH-1v2] fwprop: Replace rtx_cost with insn_cost in try_fwprop_subst_pattern [PR113325]

2024-05-09 Thread HAO CHEN GUI
Hi, This patch replaces rtx_cost with insn_cost in forward propagation. In the PR, one constant vector should be propagated and replace a pseudo in a store insn if we know it's a duplicated constant vector. It reduces the insn cost but not rtx cost. In this case, the cost is determined by

Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-09 Thread HAO CHEN GUI
Hi Kewen, On 2024/5/9 13:44, Kewen.Lin wrote: > Why does it need power8 forced here? I thought it over. It's not needed. For the sub-targets on which the library is called, l[hb]z won't be generated either. Thanks Gui Haochen

Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-09 Thread HAO CHEN GUI
Hi Kewen, Thanks for your comments. On 2024/5/9 13:44, Kewen.Lin wrote: > Hi, > > on 2024/5/8 14:47, HAO CHEN GUI wrote: >> Hi, >> This patch enables overlapped by-piece operations. On rs6000, default >> move/set/clear ratio is 2. So the overlap is only enable

[PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-08 Thread HAO CHEN GUI
Hi, This patch enables overlapped by-piece operations. On rs6000, default move/set/clear ratio is 2. So the overlap is only enabled with compare by-pieces. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is it OK for the trunk? Thanks Gui Haochen ChangeLog rs6000:

Ping^3 [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2024-05-07 Thread HAO CHEN GUI
Hi, As now it's stage-1, gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html Gui Haochen Thanks On 2023/4/24 13:35, HAO CHEN GUI wrote: > Hi, > Gently ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html > > Thank

Ping [PATCH, RFC] combine: Don't truncate const operand of AND if it's no benefits

2024-05-07 Thread HAO CHEN GUI
Hi, Gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647533.html Thanks Gui Haochen On 2024/3/18 17:10, HAO CHEN GUI wrote: > Hi, > Gently ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647533.html > > Thanks > Gui Haochen > >

Re: [Patch, rs6000] Enable overlap memory store for block memory clear

2024-05-07 Thread HAO CHEN GUI
Hi, As now it's stage 1, gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646478.html Thanks Gui Haochen On 2024/2/26 10:25, HAO CHEN GUI wrote: > Hi, > This patch enables overlap memory store for block memory clear which > saves the number of store ins

[PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-06 Thread HAO CHEN GUI
Hi, The former patch adds isfinite optab for __builtin_isfinite. https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html Thus the builtin might not be folded at front end. The range op for isfinite is needed for value range analysis. This patch adds them. Compared to last version,

[PATCH-6, rs6000] Split setcc to two insns after reload

2024-05-06 Thread HAO CHEN GUI
Hi, It's the sixth patch of a series of patches optimizing CC modes on rs6000. This patch splits setcc to two separate insns after reload so that other insns can be inserted between them. It should increase the parallelism. The rotate_cr pattern still needs the info of the number of cr

[PATCH-5, rs6000] Replace explicit CC bit reverse with common format

2024-05-06 Thread HAO CHEN GUI
Hi, It's the fifth patch of a series of patches optimizing CC modes on rs6000. There are some explicit CR6 bit reverse (mfcr/xor) expands in vector.md. As the fourth patch optimized the CC bit reverse implementation, the patch changes the explicit format to the common format (testing if the bit is not

[PATCH-4, rs6000] Optimize single cc bit reverse implementation

2024-04-30 Thread HAO CHEN GUI
Hi, It's the fourth patch of a series of patches optimizing CC modes on rs6000. The single CC bit reverse can be implemented by setbcr on Power10 or isel on Power9 or mfcr on Power8 and below. Originally CCFP is not supported for isel and setbcr as bcd insns use CCFP and its bit reverse is not

[PATCH-3, rs6000] Set CC mode of vector string isolate insns to CCEQ

2024-04-30 Thread HAO CHEN GUI
Hi, It's the third patch of a series of patches optimizing CC modes on rs6000. This patch sets CC mode of vector string isolate insns to CCEQ instead of CCFP as these insns only set/check CR bit 2. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is it OK for the

[PATCH-2, rs6000] Add a new type of CC mode - CCLTEQ

2024-04-30 Thread HAO CHEN GUI
Hi, It's the second patch of a series of patches optimizing CC modes on rs6000. This patch adds a new type of CC mode - CCLTEQ, used for cases which only set CR bits 0 and 2. Bits 1 and 3 are not used. The vector compare and test data class instructions are such cases. Bootstrapped and

[PATCH-1, rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732]

2024-04-30 Thread HAO CHEN GUI
Hi, It's the first patch of a series of patches optimizing CC modes on rs6000. bcd insns set all four bits of a CR field. But they have different single bit reverse behavior than CCFP's. The fourth bit of bcd cr fields is used to indicate overflow or invalid number. It's not a bit for unordered

Re: [PATCH] Value range: Add range op for __builtin_isfinite

2024-04-23 Thread HAO CHEN GUI
Yes, it's my typo. Thanks. Gui Haochen On 2024/4/23 17:10, rep.dot@gmail.com wrote: > On 12 April 2024 07:30:10 CEST, HAO CHEN GUI wrote: > > >> >> >> patch.diff >> diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc >> index 9de130b4022..99c51

[PATCH, rs6000] Use bcdsub. instead of bcdadd. for bcd invalid number checking

2024-04-17 Thread HAO CHEN GUI
Hi, This patch replaces bcdadd. with bcdsub. for bcd invalid number checking. bcdadd on two identical numbers might cause overflow, which also sets the overflow/invalid bit, so we can't distinguish whether it's invalid or overflow. bcdsub doesn't have the problem as subtracting two identical numbers never

[PATCH, rs6000] Fix test case bcd4.c

2024-04-16 Thread HAO CHEN GUI
Hi, This patch fixes the missing return statement in maxbcd of bcd-4.c. Without the return statement, it returns an invalid bcd number and makes the test ineffective. The patch also enables the test to run on Power9 and Big Endian, as all bcd instructions are supported from Power9. Bootstrapped and

[PATCH-4, rs6000] Implement optab_isnormal for SFmode, DFmode and TFmode [PR97786]

2024-04-12 Thread HAO CHEN GUI
Hi, This patch implemented optab_isnormal for SF/DF/TFmode by rs6000 test data class instructions. This patch relies on former patch which adds optab_isnormal. https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649366.html Bootstrapped and tested on powerpc64-linux BE and LE with no

[PATCH] Optab: add isnormal_optab for __builtin_isnormal

2024-04-12 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isnormal. The normal check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Bootstrapped and tested on x86
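For readers unfamiliar with the builtin that this optab expands, here is a small sketch of what __builtin_isnormal computes at the source level. It assumes a GCC-compatible compiler and default floating-point options; is_normal is a hypothetical wrapper name, and nothing here reflects how the rs6000 expander is written.

```c
#include <float.h>
#include <math.h>

/* Hypothetical wrapper: returns 1 when x is a normal number, that is,
   neither zero, subnormal, infinite nor NaN.  __builtin_isnormal is
   the GCC builtin the optab in this patch is meant to expand.  */
static int is_normal(double x)
{
  return __builtin_isnormal(x) != 0;
}
```

Under these assumptions, is_normal(1.0) yields 1, while is_normal(0.0), is_normal(DBL_MIN / 2) (a subnormal), is_normal(INFINITY) and is_normal(NAN) all yield 0.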

[PATCH-3] Builtin: Fold builtin_isfinite on IBM long double to builtin_isfinite on double [PR97786]

2024-04-12 Thread HAO CHEN GUI
Hi, This patch folds builtin_isfinite on IBM long double to builtin_isfinite on double type. The former patch https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649346.html implemented the DFmode isfinite_optab. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is it

[PATCH-2, rs6000] Implement optab_isfinite for SFmode, DFmode and TFmode [PR97786]

2024-04-12 Thread HAO CHEN GUI
Hi, This patch implemented optab_finite for SF/DF/TFmode by rs6000 test data class instructions. This patch relies on former patch which adds optab_finite. https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html Bootstrapped and tested on powerpc64-linux BE and LE with no

[PATCH] Value range: Add range op for __builtin_isfinite

2024-04-11 Thread HAO CHEN GUI
Hi, The former patch adds isfinite optab for __builtin_isfinite. https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html Thus the builtin might not be folded at front end. The range op for isfinite is needed for value range analysis. This patch adds them. Bootstrapped and tested

[PATCH] Optab: add isfinite_optab for __builtin_isfinite

2024-04-11 Thread HAO CHEN GUI
Hi, This patch adds an optab for __builtin_isfinite. The finite check can be implemented on rs6000 by a single instruction. It needs an optab to be expanded to the certain sequence of instructions. The subsequent patches will implement the expand on rs6000. Bootstrapped and tested on x86

[Patch] Builtin: Fold builtin_isinf on IBM long double to builtin_isinf on double [PR97786]

2024-03-27 Thread HAO CHEN GUI
Hi, This patch folds builtin_isinf on IBM long double to builtin_isinf on double type. The former patch https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648304.html implemented the DFmode isinf_optab. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is it OK for

[patch, rs6000] Implement optab_isinf for SFmode, DFmode and TFmode [PR97786]

2024-03-24 Thread HAO CHEN GUI
Hi, This patch implemented optab_isinf for SF/DF/TFmode by rs6000 test data class instructions. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is it OK for next stage 1? Thanks Gui Haochen ChangeLog rs6000: Implement optab_isinf for SFmode, DFmode and TFmode gcc/

[PATCH] Value Range: Add range op for builtin isinf

2024-03-24 Thread HAO CHEN GUI
Hi, The builtin isinf is not folded at front end if the corresponding optab exists. This causes range evaluation to fail on targets which have optab_isinf. For instance, range-sincos.c will fail on targets which have optab_isinf as it calls builtin_isinf. This patch fixed the problem

Re: [PATCH, RFC] combine: Don't truncate const operand of AND if it's no benefits

2024-03-18 Thread HAO CHEN GUI
Hi, Gently ping this: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647533.html Thanks Gui Haochen On 2024/3/11 13:41, HAO CHEN GUI wrote: > Hi, > This patch tries to fix the problem when a canonical form doesn't benefit > on a specific target. The cons

[PATCH, RFC] combine: Don't truncate const operand of AND if it's no benefits

2024-03-10 Thread HAO CHEN GUI
Hi, This patch tries to fix the problem when a canonical form doesn't benefit a specific target. In the combine pass, the const operand of AND is ANDed with the nonzero bits of the other operand. It's a canonical form, but it brings no benefit for a target which has rotate and mask insns. As the mask is

[PATCHv2, rs6000] Add subreg patterns for SImode rotate and mask insert

2024-03-08 Thread HAO CHEN GUI
Hi, This patch fixes regression cases in gcc.target/powerpc/rlwimi-2.c. In combine pass, SImode (subreg from DImode) lshiftrt is converted to DImode lshiftrt with an out AND. It matches a DImode rotate and mask insert on rs6000. Trying 2 -> 7: 2: r122:DI=r129:DI REG_DEAD r129:DI

[PATCHv2] fwprop: Avoid volatile defines to be propagated

2024-03-04 Thread HAO CHEN GUI
Hi, This patch tries to fix a potential problem which is raised by the patch for PR111267. With that patch, a volatile asm operand may be propagated into a single set insn. The volatile asm operand might then be executed multiple times if the define insn isn't eliminated after

Re: [PATCH] fwprop: Avoid volatile defines to be propagated

2024-03-04 Thread HAO CHEN GUI
Hi Jeff, On 2024/3/4 11:37, Jeff Law wrote: > Can the same thing happen with a volatile memory load?  I don't think that  > will be caught by the volatile_insn_p check. Yes, I think so. If the define rtx contains volatile memory references, it may hit the same problem. We may use volatile_refs_p

Re: [PATCH] fwprop: Avoid volatile defines to be propagated

2024-03-03 Thread HAO CHEN GUI
Hi Jeff, Thanks for your comments. On 2024/3/4 6:02, Jeff Law wrote: > Why specifically are you worried here?  Propagation of a volatile shouldn't > in and of itself cause a problem.  We're not changing the number of volatile > accesses or anything like that -- we're just moving them around a 

[PATCH, rs6000] Add subreg patterns for SImode rotate and mask insert

2024-02-29 Thread HAO CHEN GUI
Hi, This patch fixes regression cases in gcc.target/powerpc/rlwimi-2.c. In combine pass, SImode (subreg from DImode) lshiftrt is converted to DImode lshiftrt with an out AND. It matches a DImode rotate and mask insert on rs6000. Trying 2 -> 7: 2: r122:DI=r129:DI REG_DEAD r129:DI

[PATCH] fwprop: Avoid volatile defines to be propagated

2024-02-25 Thread HAO CHEN GUI
Hi, This patch tries to fix a potential problem which is raised by the patch for PR111267. With that patch, a volatile asm operand may be propagated into a single set insn. This is risky because the resulting behavior is wrong. Currently the set_src_cost comparison can reject such

[Patch, rs6000] Enable overlap memory store for block memory clear

2024-02-25 Thread HAO CHEN GUI
Hi, This patch enables overlap memory store for block memory clear which saves the number of store instructions. The expander calls widest_fixed_size_mode_for_block_clear to get the mode for looped block clear and calls widest_fixed_size_mode_for_block_clear to get the mode for last overlapped
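The idea of an overlapped final store can be sketched in portable C as follows. This is an illustration under stated assumptions, not the rs6000 expander: CHUNK stands in for the widest store mode, memset of CHUNK bytes stands in for a single store instruction, and clear_overlapped is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical width of the widest available store, in bytes.  */
enum { CHUNK = 8 };

/* Clears len bytes at dst using only CHUNK-wide stores and returns how
   many stores were issued.  The tail is handled by one extra store that
   ends exactly at dst + len and so overlaps the previous store, instead
   of a sequence of progressively narrower stores.  Requires
   len >= CHUNK.  */
static int clear_overlapped(unsigned char *dst, size_t len)
{
  int stores = 0;
  size_t off = 0;

  assert(len >= CHUNK);

  /* Full-width stores over the part that divides evenly.  */
  for (; off + CHUNK <= len; off += CHUNK) {
    memset(dst + off, 0, CHUNK);
    stores++;
  }

  /* One overlapping store covers the remaining 1..CHUNK-1 bytes.  */
  if (off < len) {
    memset(dst + len - CHUNK, 0, CHUNK);
    stores++;
  }
  return stores;
}
```

Clearing 15 bytes this way takes two 8-byte stores (at offsets 0 and 7), whereas a non-overlapping sequence of 8, 4, 2 and 1 bytes would take four.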

[Patch-2, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]

2024-01-25 Thread HAO CHEN GUI
Hi, This patch creates an insn_and_split pattern which helps the duplicated constant vector replace the source pseudo of store insn in fwprop pass. Thus the store can be implemented by a single stxvd2x and it eliminates the unnecessary byte swap insn on P8 LE. The test case shows the

[PATCH-1] fwprop: Replace rtx_cost with insn_cost in try_fwprop_subst_pattern [PR113325]

2024-01-25 Thread HAO CHEN GUI
Hi, This patch replaces rtx_cost with insn_cost in forward propagation. In the PR, one constant vector should be propagated and replace a pseudo in a store insn if we know it's a duplicated constant vector. It reduces the insn cost but not rtx cost. In this case, the kind of destination operand

[PATCH, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-01-15 Thread HAO CHEN GUI
Hi, This patch adds const0 move checking for CLEAR_BY_PIECES. The original vec_duplicate handles duplicates of non-constant inputs. But 0 is a constant. So even if a platform doesn't support vec_duplicate, it can still do clear by pieces if it supports a const0 move in that mode. The test cases

Re: [PATCH, rs6000] Refactor expand_compare_loop and split it to two functions

2024-01-15 Thread HAO CHEN GUI
Hi Kewen, On 2024/1/15 14:16, Kewen.Lin wrote: > Considering it's stage 4 now and the impact of this patch, let's defer > this to next stage 1, if possible could you organize the above changes > into patches: > > 1) Refactor expand_compare_loop by splitting into two functions without > any

[PATCH, rs6000] Enable block compare expand on P9 with m32 and mpowerpc64

2024-01-11 Thread HAO CHEN GUI
Hi, On P9, "setb" is used to set the result of block compare, so it works with m32 and mpowerpc64. On P8, the carry bit is used, so it can't work with m32 and mpowerpc64. This patch enables block compare expand for m32 and mpowerpc64 on P9. Bootstrapped and tested on x86 and powerpc64-linux BE and
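The result that a block compare has to produce can be modeled in C. This is only a sketch of the value being computed, assuming memcmp-style semantics; it does not reproduce the instruction sequence the patch generates, and both helper names are hypothetical. The three-way value (-1, 0 or 1) is what setb derives from a compare result.

```c
#include <stdint.h>

/* Hypothetical helper: load 8 bytes as a big-endian integer so that
   unsigned integer order matches memcmp byte order.  */
static uint64_t
load_be64 (const unsigned char *p)
{
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v = (v << 8) | p[i];
  return v;
}

/* Hypothetical model of the three-way result: -1 if a < b, 1 if a > b,
   and 0 if they are equal, using an unsigned comparison.  */
static int
three_way_result (uint64_t a, uint64_t b)
{
  return (a > b) - (a < b);
}
```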

Re: [Patch, rs6000] Eliminate unnecessary byte swaps for block clear on P8 LE [PR113325]

2024-01-11 Thread HAO CHEN GUI
Hi Richard, Thanks so much for your comments. >> patch.diff >> diff --git a/gcc/config/rs6000/rs6000-string.cc >> b/gcc/config/rs6000/rs6000-string.cc >> index 7f777666ba9..4c9b2cbeefc 100644 >> --- a/gcc/config/rs6000/rs6000-string.cc >> +++ b/gcc/config/rs6000/rs6000-string.cc >> @@ -140,7

[Patch, rs6000] Eliminate unnecessary byte swaps for block clear on P8 LE [PR113325]

2024-01-11 Thread HAO CHEN GUI
Hi, This patch eliminates unnecessary byte swaps for block clear on P8 LE. For block clear, all the bytes are set to zero, so the byte order doesn't matter. The alignment of the destination can therefore be set to the store mode size instead of 1 byte in order to eliminate the unnecessary byte swap

[PATCH, rs6000] Refactor expand_compare_loop and split it to two functions

2024-01-09 Thread HAO CHEN GUI
Hi, This patch refactors function expand_compare_loop and splits it into two functions. One is for fixed length and the other is for variable length. These two functions share some low-level common helper functions. Besides the above changes, the patch also does: 1. Don't generate load and compare loop

[Patchv3, rs6000] Clean up pre-checkings of expand_block_compare

2023-12-20 Thread HAO CHEN GUI
Hi, This patch cleans up the pre-checks of expand_block_compare. It does: 1. Assert that only P7 and above can enter this function, as it's already guarded by the expander. 2. Remove the P7 processor test, as only P7 and above can enter this function and P7 LE is excluded by targetm.slow_unaligned_access. On P7 BE, the

[Patch, rs6000] Call library for block memory compare when optimizing for size

2023-12-20 Thread HAO CHEN GUI
Hi, This patch calls the library function for block memory compare when optimizing for size. Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no regressions. Is this OK for trunk? Thanks Gui Haochen ChangeLog rs6000: Call library for block memory compare when optimizing for

[Patchv3, rs6000] Correct definition of macro of fixed point efficient unaligned

2023-12-20 Thread HAO CHEN GUI
Hi, The patch corrects the definition of TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and replaces it with the call of slow_unaligned_access. Compared with the last version, https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640832.html the main change is to pass alignment measured by bits to

[Patchv2, rs6000] Clean up pre-checkings of expand_block_compare

2023-12-17 Thread HAO CHEN GUI
Hi, This patch cleans up the pre-checks of expand_block_compare. It does: 1. Assert that only P7 and above can enter this function, as it's already guarded by the expander. 2. Return false when optimizing for size. 3. Remove the P7 processor test, as only P7 and above can enter this function and P7 LE is excluded by

[Patchv2, rs6000] Correct definition of macro of fixed point efficient unaligned

2023-12-17 Thread HAO CHEN GUI
Hi, The patch corrects the definition of TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and replaces it with the call of slow_unaligned_access. Compared with the last version, https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640076.html the main change is to replace the macro with

[Patch, rs6000] Clean up pre-checking of expand_block_compare

2023-12-10 Thread HAO CHEN GUI
Hi, This patch cleans up the pre-checks of expand_block_compare. It does: 1. Assert that only P7 and above can enter this function, as it's already guarded by the expander. 2. Return false when optimizing for size. 3. Remove the P7 CPU test, as only P7 and above can enter this function and P7 LE is excluded by

[Patch, rs6000] Correct definition of macro of fixed point efficient unaligned

2023-12-10 Thread HAO CHEN GUI
Hi, The patch corrects the definition of TARGET_EFFICIENT_OVERLAPPING_UNALIGNED and changes its name to a more comprehensible one. Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no regressions. Is this OK for trunk? Thanks Gui Haochen ChangeLog rs6000: Correct definition of

Re: [patch-2v3, rs6000] Guard fctid on PowerPC64 and PowerPC476 [PR112707]

2023-12-07 Thread HAO CHEN GUI
Hi, The "fctid" instruction is supported on 64-bit Power processors and PowerPC476, so it needs a guard to check for that. The patch fixes the issue. Compared with the last version, https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639536.html the main change is to change the target requirement in pr88558*.c.

[patch-2v2, rs6000] guard fctid on PPC64 and powerpc 476 [PR112707]

2023-12-06 Thread HAO CHEN GUI
Hi, The "fctid" instruction is supported on 64-bit Power processors and powerpc 476, so it needs a guard to check for that. The patch fixes the issue. Compared with the last version, https://gcc.gnu.org/pipermail/gcc-patches/2023-December/638859.html the main change is to define TARGET_FCTID to POWERPC64 or PPC476.

[patch-1v2, rs6000] enable fctiw on old archs [PR112707]

2023-12-06 Thread HAO CHEN GUI
Hi, SImode in a float register is supported on P7 and above. As a result, "fctiw" can't be generated on old 32-bit processors, because the output operand of the fctiw insn is an SImode in a float/double register. This patch fixes the problem by adding one expand and one insn pattern for fctiw. The output of new
