[PATCH 0/6] PowerPC Dense Math preliminary support (-mcpu=future)

2022-11-09 Thread Michael Meissner via Gcc-patches
This patch is very preliminary support for a potential new feature to the PowerPC that extends the current power10 MMA architecture. This feature may or may not be present in any specific future PowerPC processor. In the current MMA subsystem for Power10, there are 8 512-bit accumulator

Ping: [PATCH 3/3] Update float 128-bit conversions, PR target/107299.

2022-11-07 Thread Michael Meissner via Gcc-patches
Ping patch: | Date: Tue, 1 Nov 2022 22:44:01 -0400 | Subject: [PATCH 3/3] Update float 128-bit conversions, PR target/107299. | Message-ID: This patch fixes some issues with IEEE 128-bit long doubles once the previous 2 patches have been applied. -- Michael Meissner, IBM PO Box 98, Ayer,

Ping: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-11-07 Thread Michael Meissner via Gcc-patches
Ping patch: | Date: Tue, 1 Nov 2022 22:42:30 -0400 | Subject: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299 | Message-ID: This patch is needed to build GCC on Fedora 36 which has switched the long double default to be IEEE 128-bit. -- Michael Meissner, IBM PO Box 98,

Ping: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299

2022-11-07 Thread Michael Meissner via Gcc-patches
Ping patch: | Date: Tue, 1 Nov 2022 22:40:43 -0400 | Subject: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299 | Message-ID: This patch is needed to build GCC on Fedora 36 where the default for long double is now IEEE 128-bit. -- Michael Meissner, IBM PO Box 98, Ayer,

[PATCH 3/3] Update float 128-bit conversions, PR target/107299.

2022-11-01 Thread Michael Meissner via Gcc-patches
This patch fixes two tests that are still failing when long double is IEEE 128-bit after the previous 2 patches for PR target/107299 have been applied. The tests are: gcc.target/powerpc/convert-fp-128.c gcc.target/powerpc/pr85657-3.c This patch is a rewrite of the patch submitted

[PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-11-01 Thread Michael Meissner via Gcc-patches
This patch fixes the issue that GCC cannot build when the default long double is IEEE 128-bit. It fails in building libgcc, specifically when it is trying to build the __mulkc3 function in libgcc. It is failing in gimple-range-fold.cc during the evrp pass. Ultimately it is failing because the

[PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299

2022-11-01 Thread Michael Meissner via Gcc-patches
This function reworks how the complex multiply and divide built-in functions are done. Previously we created built-in declarations for doing long double complex multiply and divide when long double is IEEE 128-bit. The old code also did not support __ibm128 complex multiply and divide if long
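As a rough illustration of the source-level operations this patch concerns, the sketch below multiplies and divides long double complex values in plain C. The function names cmul_ld and cdiv_ld are hypothetical and do not come from the patch. Which libgcc helper the compiler ends up calling (the thread mentions __mulkc3 for the IEEE 128-bit case) depends on the target and on how long double is configured, so this is a sketch under those assumptions, not a description of the patch itself.

```c
#include <complex.h>

/* Hypothetical helpers, for illustration only.  With default options GCC
   typically lowers these operators to calls into libgcc's complex
   multiply/divide routines; the exact routine depends on the target's
   long double format. */
long double complex cmul_ld(long double complex a, long double complex b)
{
    return a * b;
}

long double complex cdiv_ld(long double complex a, long double complex b)
{
    return a / b;
}
```

Compiling this with -S on a given target is one way to see which helper, if any, is called there.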

Patch [0/3] for PR target/107299 (GCC does not build on PowerPC when long double is IEEE 128-bit)

2022-11-01 Thread Michael Meissner via Gcc-patches
These 3 patches fix the problems with building GCC on PowerPC systems when long double is configured to use the IEEE 128-bit format. There are 3 patches in this patch set. The first two patches are required to fix the basic problem. The third patch fixes some issues that were noticed along the

Re: [PATCH] Improve converting between 128-bit modes that use the same format

2022-09-12 Thread Michael Meissner via Gcc-patches
I submitted a new patch that rewrites what this patch was trying to do. I didn't see the original version I submitted on September 8th, so I just reposted it. https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601504.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA,

[PATCH] Update float 128-bit conversions

2022-09-12 Thread Michael Meissner via Gcc-patches
I had sent this out on Thursday, but it doesn't seem to have gone out. This patch is a rewrite of the patch submitted on August 18th: | https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html This patch reworks the conversions between 128-bit binary floating point types. Previously,

[PATCH] Update float 128-bit conversions

2022-09-08 Thread Michael Meissner via Gcc-patches
This patch is a rewrite of the patch submitted on August 18th: | https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html This patch reworks the conversions between 128-bit binary floating point types. Previously, we would call rs6000_expand_float128_convert to do all conversions. Now,

Re: [PATCH] Improve converting between 128-bit modes that use the same format

2022-09-07 Thread Michael Meissner via Gcc-patches
On Tue, Sep 06, 2022 at 05:22:11PM -0500, Segher Boessenkool wrote: > Please do this. It is the biggest problem I have with most of your > patches: you seem to save up development of a week, and then send it out > as big omnibus patch an hour or two before my weekend. This is not > ideal. This

Re: [PATCH] Improve converting between 128-bit modes that use the same format

2022-09-02 Thread Michael Meissner via Gcc-patches
On Tue, Aug 23, 2022 at 04:13:45PM -0500, Segher Boessenkool wrote: > Please do not send new patches as replies to other patches. This was sent as a new patch. > On Thu, Aug 18, 2022 at 05:48:29PM -0400, Michael Meissner wrote: > > Improve converting between 128-bit modes that use the same

Ping: [PATCH] Rework 128-bit complex multiply and divide.

2022-09-02 Thread Michael Meissner via Gcc-patches
Ping patch: | Date: Thu, 18 Aug 2022 17:46:51 -0400 | Subject: [PATCH] Rework 128-bit complex multiply and divide. | Message-ID: -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: [PATCH] Implement __builtin_issignaling

2022-08-26 Thread Michael Meissner via Gcc-patches
On Thu, Aug 25, 2022 at 09:56:18PM +0200, Jakub Jelinek wrote: > On Thu, Aug 25, 2022 at 03:23:12PM -0400, Michael Meissner wrote: > > On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches > > wrote: > > > Hi! > > > > > > The following patch implements a new builtin,

Re: [PATCH] Implement __builtin_issignaling

2022-08-25 Thread Michael Meissner via Gcc-patches
On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches wrote: > Hi! > > The following patch implements a new builtin, __builtin_issignaling, > which can be used to implement the ISO/IEC TS 18661-1 issignaling > macro. I haven't looked in detail at the patch, but from the

[PATCH] Improve converting between 128-bit modes that use the same format

2022-08-18 Thread Michael Meissner via Gcc-patches
Improve converting between 128-bit modes that use the same format. This patch improves the insns used for converting between two modes using the 128-bit floating point format (i.e. converting between KFmode and TFmode if -mabi=ieeelongdouble is used, and converting between IFmode and TFmode if

[PATCH] Rework 128-bit complex multiply and divide.

2022-08-18 Thread Michael Meissner via Gcc-patches
Rework 128-bit complex multiply and divide. This function reworks how the complex multiply and divide built-in functions are done. Previously we created built-in declarations for doing long double complex multiply and divide when long double is IEEE 128-bit. The old code also did not support

[PATCH 3/3] Add 'w' suffix for __ibm128 constants

2022-08-18 Thread Michael Meissner via Gcc-patches
Add 'w' suffix for __ibm128 constants. In the documentation, we mention that 'w' or 'W' can be used as a suffix for __ibm128 constants. We never implemented this. This patch fixes that. In addition, the 'q' and 'Q' suffix were changed to use the mode used for the __float128 type, instead of

[PATCH 2/3] Allow __ibm128 with -msoft-float (PR target/105334)

2022-08-18 Thread Michael Meissner via Gcc-patches
Allow __ibm128 with -msoft-float (PR target/105334) This patch allows __ibm128 to be used on systems with software floating point enabled. Previously, we required hardware floating point to be enabled to use __ibm128 keyword and the __ibm128 built-in functions. This patch fixes PR

[PATCH 1/3] Allow __ibm128 even if IEEE 128-bit floating point is not supported.

2022-08-18 Thread Michael Meissner via Gcc-patches
Allow __ibm128 even if IEEE 128-bit floating point is not supported. This patch allows the use of the __ibm128 keyword on non-VSX systems. Originally, the __ibm128 keyword was only enabled when the IEEE 128-bit floating point is enabled. Sometime back in the GCC 12 development period, Segher

[PATCH 0/3] Improvements to __ibm128 on PowerPC

2022-08-18 Thread Michael Meissner via Gcc-patches
The following 3 patches improve __ibm128 on the PowerPC GCC compiler: The first patch allows the use of the __ibm128 keyword on non-VSX systems. Originally, the __ibm128 keyword was only enabled when the IEEE 128-bit floating point is enabled. Sometime back in the GCC 12 development period,

Re: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-11 Thread Michael Meissner via Gcc-patches
On Wed, Aug 10, 2022 at 12:03:16PM -0500, Segher Boessenkool wrote: > On Wed, Aug 10, 2022 at 02:23:27AM -0400, Michael Meissner wrote: > > On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote: > > > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote: > > > > These

Re: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-10 Thread Michael Meissner via Gcc-patches
On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote: > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote: > > These patches lay the foundation for a set of follow-on patches that will > > change the internal handling of 128-bit floating point types in GCC. In the > >

Ping: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-03 Thread Michael Meissner via Gcc-patches
Ping patches. Patch #1 of 5. | Date: Thu, 28 Jul 2022 00:47:13 -0400 | Subject: [PATCH 1/5] IEEE 128-bit built-in overload support. | Message-ID: Patch #2 of 5. | Date: Thu, 28 Jul 2022 00:48:51 -0400 | Subject: [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions. |

Ping: [PATCH, V2] Do not enable -mblock-ops-vector-pair.

2022-08-03 Thread Michael Meissner via Gcc-patches
Ping patch. | Date: Mon, 25 Jul 2022 16:15:05 -0400 | Subject: [PATCH, V2] Do not enable -mblock-ops-vector-pair. | Message-ID: -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH 5/5] Support IEEE 128-bit overload test data built-in functions.

2022-07-27 Thread Michael Meissner via Gcc-patches
[PATCH 5/5] Support IEEE 128-bit overload test data built-in functions. This patch adds support for overloading the IEEE 128-bit test data and test data negate built-in functions between KFmode and TFmode arguments. I have tested these patches on a power10 that is running Fedora 36, which

[PATCH 4/5] Support IEEE 128-bit overload extract and insert built-in functions.

2022-07-27 Thread Michael Meissner via Gcc-patches
[PATCH 4/5] Support IEEE 128-bit overload extract and insert built-in functions. This patch adds support for overloading the IEEE 128-bit extract and insert built-in functions between KFmode and TFmode arguments. I have tested these patches on a power10 that is running Fedora 36, which defaults

[PATCH 3/5] Support IEEE 128-bit overload comparison built-in functions.

2022-07-27 Thread Michael Meissner via Gcc-patches
[PATCH 3/5] Support IEEE 128-bit overload comparison built-in functions. This patch adds support for overloading the IEEE 128-bit comparison built-in functions between KFmode and TFmode arguments. I have tested these patches on a power10 that is running Fedora 36, which defaults to using long

[PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions.

2022-07-27 Thread Michael Meissner via Gcc-patches
[PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions. This patch adds support for overloading the IEEE 128-bit round to odd built-in functions between KFmode and TFmode arguments. I have tested these patches on a power10 that is running Fedora 36, which defaults to using

[PATCH 1/5] IEEE 128-bit built-in overload support.

2022-07-27 Thread Michael Meissner via Gcc-patches
[PATCH 1/5] IEEE 128-bit built-in overload support. This patch lays the ground work that future patches will use to add builtin support (both normal and overloaded) for the case where long double uses the IEEE 128-bit encoding. This adds a new stanza (ieee128-hw-ld) for when we have IEEE 128-bit

[PATCH 0/5] IEEE 128-bit built-in overload support.

2022-07-27 Thread Michael Meissner via Gcc-patches
The following patches add support for doing built-in function overloading between the two 128-bit IEEE types (i.e. _Float128/__float128 using KFmode and when long double uses the IEEE 128-bit encoding with TFmode). These patches lay the foundation for a set of follow-on patches that will change

[PATCH, V2] Do not enable -mblock-ops-vector-pair.

2022-07-25 Thread Michael Meissner via Gcc-patches
Do not enable -mblock-ops-vector-pair. Testing has shown that using the load vector pair and store vector pair instructions for block moves has some performance issues on power10. A patch on June 11th modified the code so that GCC would not set -mblock-ops-vector-pair by default if we are tuning

[PATCH] Remove setting -mblock-ops-vector-pair on power10.

2022-07-21 Thread Michael Meissner via Gcc-patches
Remove setting -mblock-ops-vector-pair on power10. Testing has shown that using the load vector pair and store vector pair instructions for block moves has some performance issues on power10. This patch does not set this option by default. If it is a win in other machines in the future, this

Re: [GCC 12 backport] Disable generating load/store vector pairs for block copies.

2022-07-14 Thread Michael Meissner via Gcc-patches
On Thu, Jul 14, 2022 at 04:12:14PM -0500, Segher Boessenkool wrote: > On Thu, Jul 14, 2022 at 11:20:56AM -0400, Michael Meissner wrote: > > I have applied the patch to GCC 12. > > > > | From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001 > > | From: Michael Meissner > > |

Re: [GCC 11 backport] Disable generating load/store vector pairs for block copies.

2022-07-14 Thread Michael Meissner via Gcc-patches
Back port patch (changing .cc to .c) from trunk to GCC 11 committed. | From 3118d0856b030fe491a170354fed2df570df199f Mon Sep 17 00:00:00 2001 | From: Michael Meissner | Date: Thu, 14 Jul 2022 14:03:37 -0400 | Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for block

Re: [GCC 12 backport] Disable generating load/store vector pairs for block copies.

2022-07-14 Thread Michael Meissner via Gcc-patches
I have applied the patch to GCC 12. | From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001 | From: Michael Meissner | Date: Thu, 14 Jul 2022 11:16:08 -0400 | Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for block copies. Testing has found that using

[PATCH V2] Disable generating load/store vector pairs for block copies.

2022-06-10 Thread Michael Meissner via Gcc-patches
[PATCH, V2] Disable generating load/store vector pairs for block copies. Testing has found that using store vector pair for block copies can result in a slow down on power10. This patch disables using the vector pair instructions for block copies if we are tuning for power10. This is version 2

Re: [PATCH 1/3] Disable generating store vector pair.

2022-06-07 Thread Michael Meissner via Gcc-patches
On Tue, Jun 07, 2022 at 07:59:34PM -0500, Peter Bergner wrote: > On 6/7/22 4:24 PM, Segher Boessenkool wrote: > > On Tue, Jun 07, 2022 at 04:17:04PM -0500, Peter Bergner wrote: > >> I think I mentioned this offline, but I'd prefer a negative target flag, > >> something like

Re: [PATCH 1/3] Disable generating store vector pair.

2022-06-07 Thread Michael Meissner via Gcc-patches
On Tue, Jun 07, 2022 at 04:17:04PM -0500, Peter Bergner wrote: > On 6/6/22 7:55 PM, Michael Meissner wrote: > > gcc/ > [snip] > > * config/rs6000/rs6000.opt (-mstore-vector-pair): New option. > [snip] > > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt > > index

[PATCH 3/3] Adjust MMA tests to account for no store vector pair.

2022-06-06 Thread Michael Meissner via Gcc-patches
[PATCH 3/3] Adjust MMA tests to account for no store vector pair. In changing the default for generating the store vector pair instructions, I had to adjust several of the MMA tests to remove checking for these instructions. Mostly I just deleted the scan-assembler lines checking for stxvp. In

[PATCH 2/3] Disable generating load/store vector pairs for block copies.

2022-06-06 Thread Michael Meissner via Gcc-patches
[PATCH 2/3] Disable generating load/store vector pairs for block copies. If the store vector pair instruction is disabled, do not generate block copies that use load and store vector pair instructions. I have built bootstrap compilers and run the regression tests on three different systems:

[PATCH 1/3] Disable generating store vector pair.

2022-06-06 Thread Michael Meissner via Gcc-patches
[PATCH 1/3] Disable generating store vector pair. Testing has revealed that the power10 has some slowdowns if the store vector pair instruction is generated in some cases. This patch disables generating the store vector pair instructions (stxvp, pstxvp, and stxvpx) unless an undocumented switch

[PATCH, 0/3] Disable generating store vector pair.

2022-06-06 Thread Michael Meissner via Gcc-patches
[PATCH 0/3] Disable generating store vector pair. Testing has revealed that the power10 has some slowdowns if the store vector pair instruction is generated in some cases. This patch disables generating the store vector pair instructions (stxvp, pstxvp, and stxvpx) unless an undocumented switch

[PATCH, V3] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293

2022-06-06 Thread Michael Meissner via Gcc-patches
Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. This is version 3 of the patch. The original patch was: | Date: Mon, 28 Mar 2022 12:26:02 -0400 | Subject: [PATCH 1/4] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. | Message-ID: |
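To make the subject concrete, here is a minimal sketch of the source pattern involved: splatting one constant-indexed element of a vector with two 64-bit lanes. It uses GCC's generic vector extension instead of the vec_splats and vec_extract intrinsics from altivec.h, so that it also compiles on non-PowerPC hosts. That substitution, the v2di typedef, and the function names are assumptions made for illustration and are not taken from the patch.

```c
/* Two 64-bit integer lanes, analogous to V2DImode; the typedef name is
   ours, not the patch's. */
typedef long long v2di __attribute__((vector_size(16)));

/* Broadcast lane 0 of src into both lanes of the result. */
v2di splat_lane0(v2di src)
{
    return (v2di){ src[0], src[0] };
}

/* Broadcast lane 1 of src into both lanes of the result. */
v2di splat_lane1(v2di src)
{
    return (v2di){ src[1], src[1] };
}
```

Whether a particular compiler version turns either function into a single permute-style instruction is exactly the kind of question the patch addresses, and this sketch makes no claim about the generated code.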

Re: Ping: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293

2022-06-03 Thread Michael Meissner via Gcc-patches
On Thu, Jun 02, 2022 at 04:30:19PM -0500, Segher Boessenkool wrote: > On Thu, Jun 02, 2022 at 03:06:52PM -0400, Michael Meissner wrote: > > Ping patch posted on May 13th: > > Are you not going to apply any of Will's suggestions? They looked solid > to me. Sure, I will clean up the comments.

Ping: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293

2022-06-02 Thread Michael Meissner via Gcc-patches
Ping patch posted on May 13th: | Date: Fri, 13 May 2022 10:49:26 -0400 | From: Michael Meissner | Subject: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293 | Message-ID: -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email:

Re: [PATCH] Delay splitting addti3/subti3 until first split pass.

2022-05-18 Thread Michael Meissner via Gcc-patches
On Fri, May 13, 2022 at 12:32:22PM -0500, Segher Boessenkool wrote: > On Fri, May 13, 2022 at 11:08:48AM -0400, Michael Meissner wrote: > > Add zero_extendditi2. Improve lxvr*x code generation. > > > > Nothing in this pass has anything to do with the subject. Which is a > good thing, because

Re: [PATCH] Optimize multiply/add of DImode extended to TImode, PR target/103109.

2022-05-17 Thread Michael Meissner via Gcc-patches
On Fri, May 13, 2022 at 01:20:30PM -0500, will schmidt wrote: > On Fri, 2022-05-13 at 12:17 -0400, Michael Meissner wrote: > > Optimize multiply/add of DImode extended to TImode, PR target/103109. > > > > On power9 and power10 systems, we have instructions that support doing > > 64-bit integers

[PATCH] Generate vadduqm and vsubuqm for TImode add/subtract

2022-05-13 Thread Michael Meissner via Gcc-patches
Generate vadduqm and vsubuqm for TImode add/subtract If the TImode variable is in an Altivec register instead of a GPR register, then generate vadduqm and vsubuqm instead of having to move the value to the GPR registers and doing the add and subtract with carry instructions. To do this, we have
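For readers unfamiliar with TImode, the sketch below shows the kind of C source that produces 128-bit integer adds and subtracts. The u128 typedef and the function names are hypothetical. Whether a given compiler emits vadduqm/vsubuqm or a carry chain in GPRs depends on the target, the options, and where the values live, so nothing here should be read as the patch's expected output.

```c
/* 128-bit unsigned integer; __int128 is a GCC/Clang extension available on
   64-bit targets.  The typedef name is ours. */
typedef unsigned __int128 u128;

/* 128-bit addition (wraps modulo 2^128). */
u128 add_u128(u128 a, u128 b)
{
    return a + b;
}

/* 128-bit subtraction (wraps modulo 2^128). */
u128 sub_u128(u128 a, u128 b)
{
    return a - b;
}
```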

[PATCH] Optimize multiply/add of DImode extended to TImode, PR target/103109.

2022-05-13 Thread Michael Meissner via Gcc-patches
Optimize multiply/add of DImode extended to TImode, PR target/103109. On power9 and power10 systems, we have instructions that support doing 64-bit integers converted to 128-bit integers and producing 128-bit results. This patch adds support to generate these instructions. Previously GCC had
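A minimal sketch of the source pattern described above, assuming a 64-bit target with the __int128 extension: two 64-bit operands are widened to 128 bits, multiplied, and a widened 64-bit addend is added. The function names are hypothetical and the sketch does not claim which instructions any particular GCC version selects.

```c
#include <stdint.h>

/* Unsigned: widen a and b to 128 bits, multiply, then add widened c. */
unsigned __int128 umul_add_128(uint64_t a, uint64_t b, uint64_t c)
{
    return (unsigned __int128)a * b + c;
}

/* Signed variant of the same pattern. */
__int128 smul_add_128(int64_t a, int64_t b, int64_t c)
{
    return (__int128)a * b + c;
}
```

The largest unsigned inputs give (2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64, which still fits in 128 bits, so the unsigned form cannot overflow.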

[PATCH] Add zero_extendditi2. Improve lxvr*x code generation.

2022-05-13 Thread Michael Meissner via Gcc-patches
Add zero_extendditi2. Improve lxvr*x code generation. This pattern adds zero_extendditi2 so that if we are extending DImode that is in a GPR register to TImode in a vector register, the compiler can generate MTVSRDD. In addition the patterns for generating lxvr{b,h,w,d}x were tuned to allow
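At the source level, a DImode to TImode zero extension corresponds to something like the sketch below. The function name is hypothetical, and __int128 is a GCC/Clang extension on 64-bit targets. Where the result ends up (GPR pair or vector register) is decided by the compiler, so the sketch only shows the operation, not the code generation discussed in the patch.

```c
#include <stdint.h>

/* Zero-extend a 64-bit value to 128 bits: the low 64 bits are x and the
   upper 64 bits are zero. */
unsigned __int128 zext_di_to_ti(uint64_t x)
{
    return (unsigned __int128)x;
}
```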

[PATCH] Delay splitting addti3/subti3 until first split pass.

2022-05-13 Thread Michael Meissner via Gcc-patches
Add zero_extendditi2. Improve lxvr*x code generation. This pattern adds zero_extendditi2 so that if we are extending DImode that is in a GPR register to TImode in a vector register, the compiler can generate MTVSRDD. In addition the patterns for generating lxvr{b,h,w,d}x were tuned to allow

[PATCH] Replace UNSPEC with RTL code for extendditi2.

2022-05-13 Thread Michael Meissner via Gcc-patches
Replace UNSPEC with RTL code for extendditi2. When I submitted my patch on March 12th for extendditi2, Segher wished I had removed the use of the UNSPEC for the vextsd2q instruction. This patch rewrites extendditi2_vector to use VEC_SELECT rather than UNSPEC. 2022-05-13 Michael Meissner

[PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293

2022-05-13 Thread Michael Meissner via Gcc-patches
Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293. This patch has been previously posted, but it seemed to get lost: | Date: Tue, 29 Mar 2022 23:25:31 -0400 | Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. | Message-ID: |

[PATCH] Remove -mpower8-fusion options

2022-05-11 Thread Michael Meissner via Gcc-patches
Eliminate power8-fusion and power8-fusion-sign options. As part of PR target/102059, one of the things that came up is that we should eliminate the power8 fusion options altogether. This patch eliminates the -mpower8-fusion option. It does enable power8 fusion if the code is being tuned for power8.

Re: [PATCH, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-05-10 Thread Michael Meissner via Gcc-patches
On Tue, May 10, 2022 at 07:27:30AM -0500, Segher Boessenkool wrote: > > IMHO, it's something we want to fix as well, based on the reasons: > > 1) bif names have the corresponding mnemonics, users would expect 1-1 > > mapping here. > > 2) clang emits xs{min,max}dp all the time, with cpu type

Re: Ping #5: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-05-05 Thread Michael Meissner via Gcc-patches
On Thu, May 05, 2022 at 02:35:34PM -0500, Segher Boessenkool wrote: > On Thu, May 05, 2022 at 01:59:05PM -0500, Peter Bergner wrote: > > If we cannot get this in soonish, maybe we can at least get approval for > > applying Mike's simpler patch to the release branches, specifically GCC 10? > > > >

Re: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-05-05 Thread Michael Meissner via Gcc-patches
On Thu, May 05, 2022 at 02:12:43PM -0500, Segher Boessenkool wrote: > On Tue, Apr 12, 2022 at 09:14:55PM -0400, Michael Meissner wrote: > > This is V4 of the patch. Compared to V3 of the patch, GCC will just > > ignore -m{,no-}power8-fusion and -m{,no-}power8-fusion-sign. > > But incorrectly :-(

Ping #5: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-05-02 Thread Michael Meissner via Gcc-patches
Ping #5: | Date: Tue, 12 Apr 2022 21:14:55 -0400 | From: Michael Meissner | Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059 | Message-ID: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593153.html We really need closure on this so I can do the

Ping #4: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-04-28 Thread Michael Meissner via Gcc-patches
Ping #4: | Date: Tue, 12 Apr 2022 21:14:55 -0400 | From: Michael Meissner | Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059 | Message-ID: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593153.html -- Michael Meissner, IBM PO Box 98, Ayer,

Re: [PATCH] doc: Remove misleading text about multilibs for IEEE long double

2022-04-28 Thread Michael Meissner via Gcc-patches
On Thu, Apr 28, 2022 at 10:46:17AM +0100, Jonathan Wakely wrote: > IIUC this text is not true (maybe it was back in 2018?) Initially I thought the way to transition to IEEE 128-bit long double would be through multilibs, but we never installed multilibs for the different long double types. So

Ping: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-04-26 Thread Michael Meissner via Gcc-patches
Ping patch. The customer really needs this patch. We need to apply it to the trunk, and then I will have to refactor it for GCC 10 that the customer is using. | Date: Tue, 12 Apr 2022 21:14:55 -0400 | From: Michael Meissner | Subject: [PATCH, V4] Eliminate power8 fusion options, use power8

Ping: [PATCH] Add zero_extendditi2. Improve lxvr*x code generation.

2022-04-20 Thread Michael Meissner via Gcc-patches
Ping patch. | Date: Wed, 6 Apr 2022 14:21:26 -0400 | From: Michael Meissner | Subject: [PATCH] Add zero_extendditi2. Improve lxvr*x code generation. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping: [PATCH] Replace UNSPEC with RTL code for extendditi2.

2022-04-20 Thread Michael Meissner via Gcc-patches
Ping patch. While this could be held for GCC 13, it would be nice to know whether to keep this patch (which was asked for in one of the previous patches) or discard it. | Date: Fri, 1 Apr 2022 12:59:28 -0400 | From: Michael Meissner | Subject: [PATCH] Replace UNSPEC with RTL code for

Ping #2: [PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.

2022-04-20 Thread Michael Meissner via Gcc-patches
Ping #2 on this patch. | Date: Tue, 29 Mar 2022 23:25:31 -0400 | From: Michael Meissner | Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. | Message-ID: -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email:

Ping: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-04-20 Thread Michael Meissner via Gcc-patches
Ping patch. | Date: Tue, 12 Apr 2022 21:14:55 -0400 | From: Michael Meissner | Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059 | Message-ID: I feel this is an important patch. Please look at it and approve the patch or give me feedback on how to

[PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-04-12 Thread Michael Meissner via Gcc-patches
Eliminate power8 fusion options, use power8 tuning, PR target/102059 This is V4 of the patch. Compared to V3 of the patch, GCC will just ignore -m{,no-}power8-fusion and -m{,no-}power8-fusion-sign. The splitting of signed halfword and word loads into unsigned load and sign extension is now
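The snippet above is cut off, so the following only illustrates the equivalence it refers to: a signed halfword load yields the same value as an unsigned (zero-extending) load followed by an explicit sign extension. The function names are hypothetical, and the sketch says nothing about when GCC chooses one form over the other.

```c
#include <stdint.h>
#include <string.h>

/* A signed halfword load written directly. */
int64_t load_s16(const int16_t *p)
{
    return *p;
}

/* The same value computed as an unsigned load of the 16 bits followed by
   an explicit sign extension of bit 15. */
int64_t load_u16_then_sext(const int16_t *p)
{
    uint16_t raw;
    memcpy(&raw, p, sizeof raw);               /* unsigned halfword load */
    return (int64_t)(raw ^ 0x8000) - 0x8000;   /* sign-extend bit 15 */
}
```

The xor-and-subtract form avoids relying on implementation-defined narrowing conversions, which keeps the sketch portable.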

Re: [PATCH] Add zero_extendditi2. Improve lxvr*x code generation.

2022-04-08 Thread Michael Meissner via Gcc-patches
On Wed, Apr 06, 2022 at 03:01:33PM -0500, will schmidt wrote: > In this context it's not clear what is the "old code" ? > The mtvsrdd > instruction is referenced in this code path. I see no direct reference > to lxvrdx here, though I suppose it's assumed somewhere behind the > emit_ calls. The

[PATCH, V3] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-04-07 Thread Michael Meissner via Gcc-patches
Eliminate power8 fusion options, use power8 tuning, PR target/102059 This is V3 of the patch. Compared to V2 of the patch, it changed some of the comments based on the feedback. Since -mpower8-fusion-sign was an undocumented option, I removed some of the wording about its removal. I removed an

Committed: [PATCH] Disable float128 tests on VxWorks, PR target/104253.

2022-04-07 Thread Michael Meissner via Gcc-patches
This is the patch that I committed. I will do the backport in a few days to GCC 11 and 10. Disable float128 tests on VxWorks, PR target/104253. In PR target/104253, it was pointed out that the test case added as part of fixing the PR does not work on VxWorks because float128 is not supported on

Re: [PATCH] Disable float128 tests on VxWorks, PR target/104253.

2022-04-07 Thread Michael Meissner via Gcc-patches
On Thu, Apr 07, 2022 at 12:47:27PM +0200, Eric Botcazou wrote: > > I have run the tests on my usual Linux systems (little endian power10, > > little endian power9, big endian power8), but I don't have access to a > > VxWorks system. Eric does this fix the failure for you? > > Yes, if you add '*'

Re: [PATCH] Disable float128 tests on VxWorks, PR target/104253.

2022-04-07 Thread Michael Meissner via Gcc-patches
On Thu, Apr 07, 2022 at 06:00:51AM -0500, Segher Boessenkool wrote: > On Thu, Apr 07, 2022 at 12:29:45AM -0400, Michael Meissner wrote: > > In PR target/104253, it was pointed out that the test case added as part > > of fixing the PR does not work on VxWorks because float128 is not > > supported

[PATCH] Disable float128 tests on VxWorks, PR target/104253.

2022-04-06 Thread Michael Meissner via Gcc-patches
Disable float128 tests on VxWorks, PR target/104253. In PR target/104253, it was pointed out that the test case added as part of fixing the PR does not work on VxWorks because float128 is not supported on that system. I have modified the three tests for float128 so that they are manually

[PATCH] Add zero_extendditi2. Improve lxvr*x code generation.

2022-04-06 Thread Michael Meissner via Gcc-patches
From bf51c49f1481001c7b3223474d261dcbf9365eda Mon Sep 17 00:00:00 2001 From: Michael Meissner Date: Fri, 1 Apr 2022 22:27:13 -0400 Subject: [PATCH] Add zero_extendditi2. Improve lxvr*x code generation. This pattern adds zero_extendditi2 so that if we are extending DImode to TImode, and we want

Ping: [PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.

2022-04-04 Thread Michael Meissner via Gcc-patches
Ping patch. | Date: Tue, 29 Mar 2022 23:25:31 -0400 | From: Michael Meissner | Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. | Message-ID: -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH] Replace UNSPEC with RTL code for extendditi2.

2022-04-01 Thread Michael Meissner via Gcc-patches
Replace UNSPEC with RTL code for extendditi2. When I submitted my patch on March 12th for extendditi2, Segher wished I had removed the use of the UNSPEC for the vextsd2q instruction. This patch rewrites extendditi2_vector to use VEC_SELECT rather than UNSPEC. I have built a power10 little endian

Re: [PATCH 3/4] Make vsx_extract_ use correct insn attributes, PR target 99293.

2022-03-30 Thread Michael Meissner via Gcc-patches
On Mon, Mar 28, 2022 at 05:06:00PM -0500, Segher Boessenkool wrote: > Hi! > > On Mon, Mar 28, 2022 at 12:28:04PM -0400, Michael Meissner wrote: > > In looking at PR target/99293, I noticed that the insn "type" attribute is > > incorrect for the vsx_extract_ insns. In particular: > > > > 1)

Re: [PATCH 2/4] Make vsx_splat__reg use correct insn attributes, PR target/99293

2022-03-30 Thread Michael Meissner via Gcc-patches
On Mon, Mar 28, 2022 at 03:28:39PM -0500, Segher Boessenkool wrote: > Hi! > > On Mon, Mar 28, 2022 at 12:27:05PM -0400, Michael Meissner wrote: > > In looking at PR target/99293, I noticed that the code in the insn > > vsx_splat__reg used "vecmove" as the "type" insn attribute when the > >

[PATCH, V2] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.

2022-03-29 Thread Michael Meissner via Gcc-patches
Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. This is version 2 of the patch. The original patch was: | Date: Mon, 28 Mar 2022 12:26:02 -0400 | Subject: [PATCH 1/4] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. | Message-ID: |

Re: [PATCH 4/4] Allow vsx_extract_ to use Altivec registers, PR target/99293

2022-03-29 Thread Michael Meissner via Gcc-patches
On Mon, Mar 28, 2022 at 06:59:14PM -0500, Segher Boessenkool wrote: > On Mon, Mar 28, 2022 at 12:28:55PM -0400, Michael Meissner wrote: > > In looking at PR target/99293, I noticed that the vsx_extract_ > > pattern for V2DImode and V2DFmode only allowed traditional floating point > > registers,

Re: [PATCH 1/4] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.

2022-03-28 Thread Michael Meissner via Gcc-patches
On Mon, Mar 28, 2022 at 12:14:09PM -0500, Segher Boessenkool wrote: > On Mon, Mar 28, 2022 at 12:26:02PM -0400, Michael Meissner wrote: > > However on power9 and power10 it generates: > > > > ;; vec_splats (vec_extract (src, 0)) > > mfvsrld 3,34 > > mtvsrdd 34,9,9 > > > > ;;

[PATCH 4/4] Allow vsx_extract_ to use Altivec registers, PR target/99293

2022-03-28 Thread Michael Meissner via Gcc-patches
Allow vsx_extract_ to use Altivec registers, PR target/99293 In looking at PR target/99293, I noticed that the vsx_extract_ pattern for V2DImode and V2DFmode only allowed traditional floating point registers, and it did not allow Altivec registers. The original code was written a few years ago

[PATCH 3/4] Make vsx_extract_ use correct insn attributes, PR target 99293.

2022-03-28 Thread Michael Meissner via Gcc-patches
Make vsx_extract_ use correct insn attributes, PR target 99293. In looking at PR target/99293, I noticed that the insn "type" attribute is incorrect for the vsx_extract_ insns. In particular: 1) Simple vector register move should be vecmove (alternative 1); 2) Xxpermdi should be

[PATCH 2/4] Make vsx_splat__reg use correct insn attributes, PR target/99293

2022-03-28 Thread Michael Meissner via Gcc-patches
Make vsx_splat__reg use correct insn attributes, PR target/99293 In looking at PR target/99293, I noticed that the code in the insn vsx_splat__reg used "vecmove" as the "type" insn attribute when the "mtvsrdd" is generated. It should use "mfvsr". I also added a "p9v" isa attribute for that

[PATCH 1/4] Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.

2022-03-28 Thread Michael Meissner via Gcc-patches
Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293. In PR target/99293, it was pointed out that doing: vector long long dest0, dest1, src; /* ... */ dest0 = vec_splats (vec_extract (src, 0)); dest1 = vec_splats (vec_extract (src, 1));

[PATCH 0/4] Optimize vec_splats of vec_extract, PR target/99293

2022-03-28 Thread Michael Meissner via Gcc-patches
The following 4 patches fix PR target/99293. This bug complains that on power9 and power10: vector long long v, v0, v1; // ... v0 = __builtin_vec_splats (__builtin_vec_extract (v, 0)); v1 = __builtin_vec_splats (__builtin_vec_extract (v, 1)); generates move from
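
The operation under discussion (extract one element of a two-element vector, then broadcast it to both lanes) can be sketched portably. The sketch below uses the GCC/Clang generic vector extension instead of the <altivec.h> built-ins so that it compiles on any target; the type name v2di and the helper splat_element are hypothetical names chosen for illustration, and nothing here reflects the code in the patches themselves.

```c
/* Two 64-bit lanes, using the GCC/Clang generic vector extension
   (an assumption of this sketch; the thread uses the Altivec/VSX
   built-ins vec_splats and vec_extract).  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Hypothetical helper: copy element N (0 or 1) of SRC into both lanes.  */
static inline v2di
splat_element (v2di src, int n)
{
  long long elt = src[n & 1];	/* the "extract" step */
  v2di result = { elt, elt };	/* the "splat" step */
  return result;
}
```

Because both the input and the result are vectors, the whole operation can in principle stay in vector registers without passing through a general purpose register.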

[BACKPORT] Backport PR fortran/96983 fix to GCC 11

2022-03-16 Thread Michael Meissner via Gcc-patches
Backport PR fortran/96983 patch to GCC 11. I applied a patch on the trunk on April 22nd, 2021 that fixes an issue (PR fortran/96983) where we could fail for 128-bit floating point types because we don't have a built-in function that is equivalent to llround for 128-bit integer types. Instead,

Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Michael Meissner via Gcc-patches
On Fri, Mar 11, 2022 at 03:07:39PM -0600, Segher Boessenkool wrote: > On Fri, Mar 11, 2022 at 09:57:50PM +0100, Jakub Jelinek wrote: > > On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote: > > > On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote: > > > > The version of

Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-03-11 Thread Michael Meissner via Gcc-patches
On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote: > On Fri, Mar 11, 2022 at 08:42:27PM +, Joseph Myers wrote: > > The version of this patch applied to GCC 10 branch (commit > > 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for > >

Re: [PATCH] Fix DImode to TImode sign extend issue, PR target/104868

2022-03-11 Thread Michael Meissner via Gcc-patches
On Fri, Mar 11, 2022 at 02:41:05PM -0600, Segher Boessenkool wrote: > On Fri, Mar 11, 2022 at 01:07:29AM -0500, Michael Meissner wrote: > > Fix DImode to TImode sign extend issue, PR target/104868 > > > When I wrote the extendditi2 pattern, I forgot that mtvsrdd had that > > behavior so I used a

Re: [PATCH] Fix DImode to TImode sign extend issue, PR target/104868

2022-03-11 Thread Michael Meissner via Gcc-patches
Matheus Castanho reports that the patch I posted fixes the problem in the 104868 bug report. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH] Fix DImode to TImode sign extend issue, PR target/104868

2022-03-10 Thread Michael Meissner via Gcc-patches
Fix DImode to TImode sign extend issue, PR target/104868 PR target/104868 had an issue where my code that updated the DImode to TImode sign extension for power10 failed. In looking at the failure message, the reason is when extendditi2 tries to split the insn, it generates an insn that does

Re: [PATCH, V2] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-03-10 Thread Michael Meissner via Gcc-patches
On Thu, Mar 10, 2022 at 01:49:36PM -0600, Segher Boessenkool wrote: > On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote: > > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote: > > > --- a/gcc/config/rs6000/rs6000-cpus.def > > > +++ b/gcc/config/rs6000/rs6000-cpus.def > > > @@

Re: [PATCH] rs6000: Fix up __SIZEOF_{FLOAT,IBM}128__ defines [PR99708]

2022-03-10 Thread Michael Meissner via Gcc-patches
On Mon, Mar 07, 2022 at 03:37:18PM -0600, Segher Boessenkool wrote: > Hi! > > On Sat, Mar 05, 2022 at 09:21:51AM +0100, Jakub Jelinek wrote: > > As mentioned in the PR, right now on powerpc* __SIZEOF_{FLOAT,IBM}128__ > > macros are predefined unconditionally, because {ieee,ibm}128_float_type_node

Re: [PATCH] rs6000, v3: Fix up __SIZEOF_{FLOAT, IBM}128__ defines [PR99708]

2022-03-10 Thread Michael Meissner via Gcc-patches
On Wed, Mar 09, 2022 at 04:57:01PM -0600, Segher Boessenkool wrote: > On Wed, Mar 09, 2022 at 10:10:07PM +0100, Jakub Jelinek wrote: > > On Wed, Mar 09, 2022 at 02:57:20PM -0600, Segher Boessenkool wrote: > > > But __ibm128 should *always* be supported, so this is a hypothetical > > > problem. > >

Re: [PATCH, V2] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-03-10 Thread Michael Meissner via Gcc-patches
On Thu, Mar 10, 2022 at 01:49:36PM -0600, Segher Boessenkool wrote: > On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote: > > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote: > > > --- a/gcc/config/rs6000/rs6000-cpus.def > > > +++ b/gcc/config/rs6000/rs6000-cpus.def > > > @@

[PATCH, V2] Eliminate power8 fusion options, use power8 tuning, PR target/102059

2022-03-09 Thread Michael Meissner via Gcc-patches
Eliminate power8 fusion options, use power8 tuning, PR target/102059 The power8 fusion support used to be set automatically when -mcpu=power8 or -mtune=power8 was used, and it was cleared for other CPUs. However, if you used the target attribute or target #pragma to change the default cpu type

Re: Ping: [PATCH] PR target/102059 Fix inline of target specific functions

2022-03-08 Thread Michael Meissner via Gcc-patches
On Tue, Mar 08, 2022 at 11:28:03AM -0600, Segher Boessenkool wrote: > On Fri, Feb 11, 2022 at 12:53:07PM -0500, Michael Meissner wrote: > > Ping patch for PR target/102059 to ignore implicit -mpower8-fusion that > > prevents a function targeting power9 or power10 from inlining a function > > that

[COMMITTED] Optimize signed DImode -> TImode on power10.

2022-03-04 Thread Michael Meissner via Gcc-patches
Here is the patch that I committed to the trunk: Optimize signed DImode -> TImode on power10. On power10, GCC tries to optimize the signed conversion from DImode to TImode by using the vextsd2q instruction. However, to generate this instruction, it would have to generate 3 direct moves (1 from
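
As a reference point for the conversion being optimized, the source-level semantics of a signed DImode to TImode extension can be sketched in C. This is an illustration only, assuming a 64-bit target with __int128 support; the helper names sign_extend_di_to_ti and high_half are hypothetical and are not taken from the committed patch.

```c
#include <stdint.h>

/* Hypothetical helper: widen a signed 64-bit value (DImode) to a
   signed 128-bit value (TImode).  The upper 64 bits are copies of
   the sign bit of X.  */
static inline __int128
sign_extend_di_to_ti (int64_t x)
{
  return (__int128) x;
}

/* Hypothetical helper: return the upper 64 bits of a 128-bit value,
   useful for checking what the extension produced.  */
static inline uint64_t
high_half (__int128 v)
{
  return (uint64_t) ((unsigned __int128) v >> 64);
}
```

For a non-negative input the upper half is all zeros, and for a negative input it is all ones; how cheaply that upper half can be produced is what differs between instruction sequences.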
