This patch is very preliminary support for a potential new feature to the
PowerPC that extends the current power10 MMA architecture. This feature may or
may not be present in any specific future PowerPC processor.
In the current MMA subsystem for Power10, there are 8 512-bit accumulator
Ping patch:
| Date: Tue, 1 Nov 2022 22:44:01 -0400
| Subject: [PATCH 3/3] Update float 128-bit conversions, PR target/107299.
| Message-ID:
This patch fixes some issues with IEEE 128-bit long doubles once the previous 2
patches have been applied.
--
Michael Meissner, IBM
PO Box 98, Ayer,
Ping patch:
| Date: Tue, 1 Nov 2022 22:42:30 -0400
| Subject: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299
| Message-ID:
This patch is needed to build GCC on Fedora 36 which has switched the long
double default to be IEEE 128-bit.
--
Michael Meissner, IBM
PO Box 98,
Ping patch:
| Date: Tue, 1 Nov 2022 22:40:43 -0400
| Subject: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR
target/107299
| Message-ID:
This patch is needed to build GCC on Fedora 36 where the default for long
double is now IEEE 128-bit.
--
Michael Meissner, IBM
PO Box 98, Ayer,
This patch fixes two tests that are still failing when long double is IEEE
128-bit after the previous 2 patches for PR target/107299 have been applied.
The tests are:
gcc.target/powerpc/convert-fp-128.c
gcc.target/powerpc/pr85657-3.c
This patch is a rewrite of the patch submitted
This patch fixes the issue that GCC cannot build when the default long double
is IEEE 128-bit. It fails in building libgcc, specifically when it is trying
to build the __mulkc3 function in libgcc. It is failing in gimple-range-fold.cc
during the evrp pass. Ultimately it is failing because the
This patch reworks how the complex multiply and divide built-in functions are
done. Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit. The old code also did not
support __ibm128 complex multiply and divide if long
These 3 patches fix the problems with building GCC on PowerPC systems when long
double is configured to use the IEEE 128-bit format.
There are 3 patches in this patch set. The first two patches are required to
fix the basic problem. The third patch fixes some issues that were noticed
along the
I submitted a new patch that rewrites what this patch was trying to do. I
didn't see the original version I submitted on September 8th, so I just
reposted it.
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601504.html
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA,
I had sent this out on Thursday, but it doesn't seem to have gone out.
This patch is a rewrite of the patch submitted on August 18th:
| https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html
This patch reworks the conversions between 128-bit binary floating point types.
Previously,
This patch is a rewrite of the patch submitted on August 18th:
| https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html
This patch reworks the conversions between 128-bit binary floating point types.
Previously, we would call rs6000_expand_float128_convert to do all conversions.
Now,
On Tue, Sep 06, 2022 at 05:22:11PM -0500, Segher Boessenkool wrote:
> Please do this. It is the biggest problem I have with most of your
> patches: you seem to save up development of a week, and then send it out
> as big omnibus patch an hour or two before my weekend. This is not
> ideal.
This
On Tue, Aug 23, 2022 at 04:13:45PM -0500, Segher Boessenkool wrote:
> Please do not send new patches as replies to other patches.
This was sent as a new patch.
> On Thu, Aug 18, 2022 at 05:48:29PM -0400, Michael Meissner wrote:
> > Improve converting between 128-bit modes that use the same
Ping patch:
| Date: Thu, 18 Aug 2022 17:46:51 -0400
| Subject: [PATCH] Rework 128-bit complex multiply and divide.
| Message-ID:
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
On Thu, Aug 25, 2022 at 09:56:18PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 25, 2022 at 03:23:12PM -0400, Michael Meissner wrote:
> > On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches
> > wrote:
> > > Hi!
> > >
> > > The following patch implements a new builtin,
On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> The following patch implements a new builtin, __builtin_issignaling,
> which can be used to implement the ISO/IEC TS 18661-1 issignaling
> macro.
I haven't looked in detail at the patch, but from the
Improve converting between 128-bit modes that use the same format.
This patch improves the insns used for converting between two modes using
the 128-bit floating point format (i.e. converting between KFmode and TFmode if
-mabi=ieeelongdouble is used, and converting between IFmode and TFmode if
Rework 128-bit complex multiply and divide.
This patch reworks how the complex multiply and divide built-in functions are
done. Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit. The old code also did not
support
Add 'w' suffix for __ibm128 constants.
In the documentation, we mention that 'w' or 'W' can be used as a suffix for
__ibm128 constants. We never implemented this. This patch fixes that.
In addition, the 'q' and 'Q' suffix were changed to use the mode used for the
__float128 type, instead of
Allow __ibm128 with -msoft-float (PR target/105334)
This patch allows __ibm128 to be used on systems with software floating point
enabled. Previously, we required hardware floating point to be enabled to use
the __ibm128 keyword and the __ibm128 built-in functions. This patch fixes PR
Allow __ibm128 even if IEEE 128-bit floating point is not supported.
This patch allows the use of the __ibm128 keyword on non-VSX systems.
Originally, the __ibm128 keyword was only enabled when the IEEE 128-bit
floating point is enabled. Sometime back in the GCC 12 development period,
Segher
The following 3 patches improve __ibm128 on the PowerPC GCC compiler:
The first patch allows the use of the __ibm128 keyword on non-VSX systems.
Originally, the __ibm128 keyword was only enabled when the IEEE 128-bit
floating point is enabled. Sometime back in the GCC 12 development period,
On Wed, Aug 10, 2022 at 12:03:16PM -0500, Segher Boessenkool wrote:
> On Wed, Aug 10, 2022 at 02:23:27AM -0400, Michael Meissner wrote:
> > On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote:
> > > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote:
> > > > These
On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote:
> On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote:
> > These patches lay the foundation for a set of follow-on patches that will
> > change the internal handling of 128-bit floating point types in GCC. In the
> >
Ping patches.
Patch #1 of 5.
| Date: Thu, 28 Jul 2022 00:47:13 -0400
| Subject: [PATCH 1/5] IEEE 128-bit built-in overload support.
| Message-ID:
Patch #2 of 5.
| Date: Thu, 28 Jul 2022 00:48:51 -0400
| Subject: [PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in
functions.
|
Ping patch.
| Date: Mon, 25 Jul 2022 16:15:05 -0400
| Subject: [PATCH, V2] Do not enable -mblock-ops-vector-pair.
| Message-ID:
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
[PATCH 5/5] Support IEEE 128-bit overload test data built-in functions.
This patch adds support for overloading the IEEE 128-bit test data and
test data negate built-in functions between KFmode and TFmode arguments.
I have tested these patches on a power10 that is running Fedora 36, which
[PATCH 4/5] Support IEEE 128-bit overload extract and insert built-in functions.
This patch adds support for overloading the IEEE 128-bit extract and
insert built-in functions between KFmode and TFmode arguments.
I have tested these patches on a power10 that is running Fedora 36, which
defaults
[PATCH 3/5] Support IEEE 128-bit overload comparison built-in functions.
This patch adds support for overloading the IEEE 128-bit comparison
built-in functions between KFmode and TFmode arguments.
I have tested these patches on a power10 that is running Fedora 36, which
defaults to using long
[PATCH 2/5] Support IEEE 128-bit overload round_to_odd built-in functions.
This patch adds support for overloading the IEEE 128-bit round to odd
built-in functions between KFmode and TFmode arguments.
I have tested these patches on a power10 that is running Fedora 36, which
defaults to using
[PATCH 1/5] IEEE 128-bit built-in overload support.
This patch lays the ground work that future patches will use to add
builtin support (both normal and overloaded) for the case where long
double uses the IEEE 128-bit encoding.
This adds a new stanza (ieee128-hw-ld) for when we have IEEE 128-bit
The following patches add support for doing built-in function overloading
between the two 128-bit IEEE types (i.e. _Float128/__float128 using KFmode and
when long double uses the IEEE 128-bit encoding with TFmode).
These patches lay the foundation for a set of follow-on patches that will
change
Do not enable -mblock-ops-vector-pair.
Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10.
A patch on June 11th modified the code so that GCC would not set
-mblock-ops-vector-pair by default if we are tuning
Remove setting -mblock-ops-vector-pair on power10.
Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10. This
patch does not set this option by default. If it is a win in other
machines in the future, this
On Thu, Jul 14, 2022 at 04:12:14PM -0500, Segher Boessenkool wrote:
> On Thu, Jul 14, 2022 at 11:20:56AM -0400, Michael Meissner wrote:
> > I have applied the patch to GCC 12.
> >
> > | From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001
> > | From: Michael Meissner
> > |
Back port patch (changing .cc to .c) from trunk to GCC 11 committed.
| From 3118d0856b030fe491a170354fed2df570df199f Mon Sep 17 00:00:00 2001
| From: Michael Meissner
| Date: Thu, 14 Jul 2022 14:03:37 -0400
| Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for
block
I have applied the patch to GCC 12.
| From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001
| From: Michael Meissner
| Date: Thu, 14 Jul 2022 11:16:08 -0400
| Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for
block copies.
Testing has found that using
[PATCH, V2] Disable generating load/store vector pairs for block copies.
Testing has found that using store vector pair for block copies can result
in a slow down on power10. This patch disables using the vector pair
instructions for block copies if we are tuning for power10.
This is version 2
On Tue, Jun 07, 2022 at 07:59:34PM -0500, Peter Bergner wrote:
> On 6/7/22 4:24 PM, Segher Boessenkool wrote:
> > On Tue, Jun 07, 2022 at 04:17:04PM -0500, Peter Bergner wrote:
> >> I think I mentioned this offline, but I'd prefer a negative target flag,
> >> something like
On Tue, Jun 07, 2022 at 04:17:04PM -0500, Peter Bergner wrote:
> On 6/6/22 7:55 PM, Michael Meissner wrote:
> > gcc/
> [snip]
> > * config/rs6000/rs6000.opt (-mstore-vector-pair): New option.
> [snip]
> > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> > index
[PATCH 3/3] Adjust MMA tests to account for no store vector pair.
In changing the default for generating the store vector pair instructions,
I had to adjust several of the MMA tests to remove checking for these
instructions. Mostly I just deleted the scan-assembler lines checking for
stxvp. In
[PATCH 2/3] Disable generating load/store vector pairs for block copies.
If the store vector pair instruction is disabled, do not generate block
copies that use load and store vector pair instructions.
I have built bootstrap compilers and run the regression tests on three
different systems:
[PATCH 1/3] Disable generating store vector pair.
Testing has revealed that the power10 has some slowdowns if the store
vector pair instruction is generated in some cases. This patch disables
generating the store vector pair instructions (stxvp, pstxvp, and stxvpx)
unless an undocumented switch
[PATCH 0/3] Disable generating store vector pair.
Testing has revealed that the power10 has some slowdowns if the store vector
pair instruction is generated in some cases. This patch disables generating
the store vector pair instructions (stxvp, pstxvp, and stxvpx) unless an
undocumented switch
Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.
This is version 3 of the patch. The original patch was:
| Date: Mon, 28 Mar 2022 12:26:02 -0400
| Subject: [PATCH 1/4] Optimize vec_splats of constant vec_extract for
V2DI/V2DF, PR target 99293.
| Message-ID:
|
On Thu, Jun 02, 2022 at 04:30:19PM -0500, Segher Boessenkool wrote:
> On Thu, Jun 02, 2022 at 03:06:52PM -0400, Michael Meissner wrote:
> > Ping patch posted on May 13th:
>
> Are you not going to apply any of Will's suggestions? They looked solid
> to me.
Sure, I will clean up the comments.
Ping patch posted on May 13th:
| Date: Fri, 13 May 2022 10:49:26 -0400
| From: Michael Meissner
| Subject: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR
target/99293
| Message-ID:
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email:
On Fri, May 13, 2022 at 12:32:22PM -0500, Segher Boessenkool wrote:
> On Fri, May 13, 2022 at 11:08:48AM -0400, Michael Meissner wrote:
> > Add zero_extendditi2. Improve lxvr*x code generation.
>
>
>
> Nothing in this pass has anything to do with the subject. Which is a
> good thing, because
On Fri, May 13, 2022 at 01:20:30PM -0500, will schmidt wrote:
> On Fri, 2022-05-13 at 12:17 -0400, Michael Meissner wrote:
> > Optimize multiply/add of DImode extended to TImode, PR target/103109.
> >
> > On power9 and power10 systems, we have instructions that support doing
> > 64-bit integers
Generate vadduqm and vsubuqm for TImode add/subtract
If the TImode variable is in an Altivec register instead of a GPR
register, then generate vadduqm and vsubuqm instead of having to move the
value to the GPR registers and doing the add and subtract with carry
instructions. To do this, we have
Optimize multiply/add of DImode extended to TImode, PR target/103109.
On power9 and power10 systems, we have instructions that support doing
64-bit integers converted to 128-bit integers and producing 128-bit
results. This patch adds support to generate these instructions.
Previously GCC had
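As a hedged illustration of the source shape this description refers to, the sketch below widens 64-bit operands to 128 bits before a multiply-add. It assumes GCC's __int128 extension on a 64-bit target; the function names are hypothetical, and which instructions the compiler actually selects is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned widening multiply-add: A is extended to 128 bits first, so
   the multiply and the add are both done in 128-bit arithmetic.  */
unsigned __int128
umul_add (uint64_t a, uint64_t b, uint64_t c)
{
  return (unsigned __int128) a * b + c;
}

/* Signed variant: the operands are sign-extended to 128 bits.  */
__int128
smul_add (int64_t a, int64_t b, int64_t c)
{
  return (__int128) a * b + c;
}

/* Small self-check (hypothetical helper, not part of any patch).  */
void
check_widening_mul_add (void)
{
  /* (2^64 - 1)^2 + 1 = 2^128 - 2^65 + 2, so the low half is 2.  */
  assert ((uint64_t) umul_add (UINT64_MAX, UINT64_MAX, 1) == 2);
  assert (smul_add (-3, 4, 5) == -7);
}
```

Compiling functions like these with -O2 for -mcpu=power9 or -mcpu=power10 and inspecting the assembly is one way to see what code is generated for this pattern.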
Add zero_extendditi2. Improve lxvr*x code generation.
This pattern adds zero_extendditi2 so that if we are extending DImode that
is in a GPR register to TImode in a vector register, the compiler can
generate MTVSRDD.
In addition the patterns for generating lxvr{b,h,w,d}x were tuned to allow
Add zero_extendditi2. Improve lxvr*x code generation.
This pattern adds zero_extendditi2 so that if we are extending DImode that
is in a GPR register to TImode in a vector register, the compiler can
generate MTVSRDD.
In addition the patterns for generating lxvr{b,h,w,d}x were tuned to allow
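For readers unfamiliar with the mode names, a minimal sketch of a DImode to TImode zero extension at the C level follows. It assumes GCC's __int128 extension on a 64-bit target; the helper names are hypothetical and the sketch says nothing about which registers or instructions are chosen.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend a 64-bit (DImode) value to 128 bits (TImode): the
   upper 64 bits of the result are always 0.  */
unsigned __int128
zext_di_to_ti (uint64_t x)
{
  return (unsigned __int128) x;
}

/* Return the upper 64 bits of a 128-bit value.  */
uint64_t
upper64 (unsigned __int128 v)
{
  return (uint64_t) (v >> 64);
}

/* Small self-check (hypothetical helper).  */
void
check_zext (void)
{
  /* Even with the top bit of X set, the upper half stays 0.  */
  assert (upper64 (zext_di_to_ti (0x8000000000000000ULL)) == 0);
}
```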
Replace UNSPEC with RTL code for extendditi2.
When I submitted my patch on March 12th for extendditi2, Segher wished I
had removed the use of the UNSPEC for the vextsd2q instruction. This
patch rewrites extendditi2_vector to use VEC_SELECT rather than UNSPEC.
2022-05-13 Michael Meissner
Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293.
This patch has been previously posted, but it seemed to get lost:
| Date: Tue, 29 Mar 2022 23:25:31 -0400
| Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for
V2DI/V2DF, PR target 99293.
| Message-ID:
|
Eliminate power8-fusion and power8-fusion-sign options.
As part of PR target/102059, one of the things that came up is that we should
eliminate the power8 fusion options altogether. This patch eliminates the
-mpower8-fusion option. It does enable power8 fusion if the code is being
tuned for power8.
On Tue, May 10, 2022 at 07:27:30AM -0500, Segher Boessenkool wrote:
> > IMHO, it's something we want to fix as well, based on the reasons:
> > 1) bif names have the corresponding mnemonics, users would expect 1-1
> > mapping here.
> > 2) clang emits xs{min,max}dp all the time, with cpu type
On Thu, May 05, 2022 at 02:35:34PM -0500, Segher Boessenkool wrote:
> On Thu, May 05, 2022 at 01:59:05PM -0500, Peter Bergner wrote:
> > If we cannot get this in soonish, maybe we can at least get approval for
> > applying Mike's simpler patch to the release branches, specifically GCC 10?
> >
> >
On Thu, May 05, 2022 at 02:12:43PM -0500, Segher Boessenkool wrote:
> On Tue, Apr 12, 2022 at 09:14:55PM -0400, Michael Meissner wrote:
> > This is V4 of the patch. Compared to V3 of the patch, GCC will just
> > ignore -m{,no-}power8-fusion and -m{,no-}power8-fusion-sign.
>
> But incorrectly :-(
Ping #5:
| Date: Tue, 12 Apr 2022 21:14:55 -0400
| From: Michael Meissner
| Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR
target/102059
| Message-ID:
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593153.html
We really need closure on this so I can do the
Ping #4:
| Date: Tue, 12 Apr 2022 21:14:55 -0400
| From: Michael Meissner
| Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR
target/102059
| Message-ID:
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593153.html
--
Michael Meissner, IBM
PO Box 98, Ayer,
On Thu, Apr 28, 2022 at 10:46:17AM +0100, Jonathan Wakely wrote:
> IIUC this text is not true (maybe it was back in 2018?)
Initially I thought the way to transition to IEEE 128-bit long double would be
through multilibs, but we never installed multilibs for the different long
double types. So
Ping patch. The customer really needs this patch. We need to apply it to the
trunk, and then I will have to refactor it for GCC 10 that the customer is
using.
| Date: Tue, 12 Apr 2022 21:14:55 -0400
| From: Michael Meissner
| Subject: [PATCH, V4] Eliminate power8 fusion options, use power8
Ping patch.
| Date: Wed, 6 Apr 2022 14:21:26 -0400
| From: Michael Meissner
| Subject: [PATCH] Add zero_extendditi2. Improve lxvr*x code generation.
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
Ping patch. While this could be held for GCC 13, it would be nice to know
whether to keep this patch (which was asked for in one of the previous patches)
or discard it.
| Date: Fri, 1 Apr 2022 12:59:28 -0400
| From: Michael Meissner
| Subject: [PATCH] Replace UNSPEC with RTL code for
Ping #2 on this patch.
| Date: Tue, 29 Mar 2022 23:25:31 -0400
| From: Michael Meissner
| Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for
V2DI/V2DF, PR target 99293.
| Message-ID:
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email:
Ping patch.
| Date: Tue, 12 Apr 2022 21:14:55 -0400
| From: Michael Meissner
| Subject: [PATCH, V4] Eliminate power8 fusion options, use power8 tuning, PR
target/102059
| Message-ID:
I feel this is an important patch. Please look at it and approve the patch or
give me feedback on how to
Eliminate power8 fusion options, use power8 tuning, PR target/102059
This is V4 of the patch. Compared to V3 of the patch, GCC will just
ignore -m{,no-}power8-fusion and -m{,no-}power8-fusion-sign.
The splitting of signed halfword and word loads into unsigned load and
sign extension is now
On Wed, Apr 06, 2022 at 03:01:33PM -0500, will schmidt wrote:
> In this context it's not clear what is the "old code" ?
> The mtvsrdd
> instruction is referenced in this code path. I see no direct reference
> to lxvrdx here, though I suppose it's assumed somewhere behind the
> emit_ calls.
The
Eliminate power8 fusion options, use power8 tuning, PR target/102059
This is V3 of the patch. Compared to V2 of the patch, it changed some of
the comments based on the feedback. Since -mpower8-fusion-sign was an
undocumented option, I removed some of the wording about its removal.
I removed an
This is the patch that I committed. I will do the backport in a few days to
GCC 11 and 10.
Disable float128 tests on VxWorks, PR target/104253.
In PR target/104253, it was pointed out that the test case added as part
of fixing the PR does not work on VxWorks because float128 is not
supported on
On Thu, Apr 07, 2022 at 12:47:27PM +0200, Eric Botcazou wrote:
> > I have run the tests on my usual Linux systems (little endian power10,
> > little endian power9, big endian power8), but I don't have access to a
> > VxWorks system. Eric does this fix the failure for you?
>
> Yes, if you add '*'
On Thu, Apr 07, 2022 at 06:00:51AM -0500, Segher Boessenkool wrote:
> On Thu, Apr 07, 2022 at 12:29:45AM -0400, Michael Meissner wrote:
> > In PR target/104253, it was pointed out that the test case added as part
> > of fixing the PR does not work on VxWorks because float128 is not
> > supported
Disable float128 tests on VxWorks, PR target/104253.
In PR target/104253, it was pointed out that the test case added as part
of fixing the PR does not work on VxWorks because float128 is not
supported on that system. I have modified the three tests for float128 so
that they are manually
From bf51c49f1481001c7b3223474d261dcbf9365eda Mon Sep 17 00:00:00 2001
From: Michael Meissner
Date: Fri, 1 Apr 2022 22:27:13 -0400
Subject: [PATCH] Add zero_extendditi2. Improve lxvr*x code generation.
This pattern adds zero_extendditi2 so that if we are extending DImode to
TImode, and we want
Ping patch.
| Date: Tue, 29 Mar 2022 23:25:31 -0400
| From: Michael Meissner
| Subject: [PATCH, V2] Optimize vec_splats of constant vec_extract for
V2DI/V2DF, PR target 99293.
| Message-ID:
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
Replace UNSPEC with RTL code for extendditi2.
When I submitted my patch on March 12th for extendditi2, Segher wished I
had removed the use of the UNSPEC for the vextsd2q instruction. This
patch rewrites extendditi2_vector to use VEC_SELECT rather than UNSPEC.
I have built a power10 little endian
On Mon, Mar 28, 2022 at 05:06:00PM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Mar 28, 2022 at 12:28:04PM -0400, Michael Meissner wrote:
> > In looking at PR target/99293, I noticed that the insn "type" attribute is
> > incorrect for the vsx_extract_<mode> insns. In particular:
> >
> > 1)
On Mon, Mar 28, 2022 at 03:28:39PM -0500, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Mar 28, 2022 at 12:27:05PM -0400, Michael Meissner wrote:
> > In looking at PR target/99293, I noticed that the code in the insn
> > vsx_splat_<mode>_reg used "vecmove" as the "type" insn attribute when the
> >
Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.
This is version 2 of the patch. The original patch was:
| Date: Mon, 28 Mar 2022 12:26:02 -0400
| Subject: [PATCH 1/4] Optimize vec_splats of constant vec_extract for
V2DI/V2DF, PR target 99293.
| Message-ID:
|
On Mon, Mar 28, 2022 at 06:59:14PM -0500, Segher Boessenkool wrote:
> On Mon, Mar 28, 2022 at 12:28:55PM -0400, Michael Meissner wrote:
> > In looking at PR target/99293, I noticed that the vsx_extract_<mode>
> > pattern for V2DImode and V2DFmode only allowed traditional floating point
> > registers,
On Mon, Mar 28, 2022 at 12:14:09PM -0500, Segher Boessenkool wrote:
> On Mon, Mar 28, 2022 at 12:26:02PM -0400, Michael Meissner wrote:
> > However on power9 and power10 it generates:
> >
> > ;; vec_splats (vec_extract (src, 0))
> > mfvsrld 3,34
> > mtvsrdd 34,9,9
> >
> > ;;
Allow vsx_extract_<mode> to use Altivec registers, PR target/99293
In looking at PR target/99293, I noticed that the vsx_extract_<mode>
pattern for V2DImode and V2DFmode only allowed traditional floating point
registers, and it did not allow Altivec registers. The original code was
written a few years ago
Make vsx_extract_<mode> use correct insn attributes, PR target 99293.
In looking at PR target/99293, I noticed that the insn "type" attribute is
incorrect for the vsx_extract_<mode> insns. In particular:
1) Simple vector register move should be vecmove (alternative 1);
2) Xxpermdi should be
Make vsx_splat_<mode>_reg use correct insn attributes, PR target/99293
In looking at PR target/99293, I noticed that the code in the insn
vsx_splat_<mode>_reg used "vecmove" as the "type" insn attribute when the
"mtvsrdd" is generated. It should use "mfvsr". I also added a "p9v" isa
attribute for that
Optimize vec_splats of constant vec_extract for V2DI/V2DF, PR target 99293.
In PR target/99293, it was pointed out that doing:
vector long long dest0, dest1, src;
/* ... */
dest0 = vec_splats (vec_extract (src, 0));
dest1 = vec_splats (vec_extract (src, 1));
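The fragment above needs a PowerPC target and altivec.h. As a rough, portable approximation of the same extract-then-splat shape, the sketch below uses the GCC/Clang generic vector extension instead of the Altivec built-ins. The type and function names are hypothetical, and lane numbering here follows memory order, which need not match vec_extract element numbering on every PowerPC configuration.

```c
#include <assert.h>

/* Two 64-bit lanes in one 128-bit vector (generic vector extension).  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Broadcast lane 0 of SRC into both lanes of the result.  */
v2di
splat_lane0 (v2di src)
{
  long long x = src[0];
  return (v2di) { x, x };
}

/* Broadcast lane 1 of SRC into both lanes of the result.  */
v2di
splat_lane1 (v2di src)
{
  long long x = src[1];
  return (v2di) { x, x };
}

/* Small self-check (hypothetical helper).  */
void
check_splat (void)
{
  v2di src = { 10, 20 };
  v2di d0 = splat_lane0 (src);
  v2di d1 = splat_lane1 (src);
  assert (d0[0] == 10 && d0[1] == 10);
  assert (d1[0] == 20 && d1[1] == 20);
}
```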
The following 4 patches fix PR target/99293. This bug complains that on power9
and power10:
vector long long v, v0, v1;
// ...
v0 = __builtin_vec_splats (__builtin_vec_extract (v, 0));
v1 = __builtin_vec_splats (__builtin_vec_extract (v, 1));
generates move from
Backport PR fortran/96983 patch to GCC 11.
I applied a patch on the trunk on April 22nd, 2021 that fixes an issue (PR
fortran/96983) where we could fail for 128-bit floating point types
because we don't have a built-in function that is equivalent to llround
for 128-bit integer types. Instead,
On Fri, Mar 11, 2022 at 03:07:39PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 09:57:50PM +0100, Jakub Jelinek wrote:
> > On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote:
> > > On Fri, Mar 11, 2022 at 08:42:27PM +0000, Joseph Myers wrote:
> > > > The version of
On Fri, Mar 11, 2022 at 02:51:23PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 08:42:27PM +0000, Joseph Myers wrote:
> > The version of this patch applied to GCC 10 branch (commit
> > 641b407763ecfee5d4ac86d8ffe9eb1eeea5fd10) has broken the glibc build for
> >
On Fri, Mar 11, 2022 at 02:41:05PM -0600, Segher Boessenkool wrote:
> On Fri, Mar 11, 2022 at 01:07:29AM -0500, Michael Meissner wrote:
> > Fix DImode to TImode sign extend issue, PR target/104898
>
> > When I wrote the extendditi2 pattern, I forgot that mtvsrdd had that
> > behavior so I used a
Matheus Castanho reports that the patch I posted fixes the problem in the
104868 bug report.
--
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
Fix DImode to TImode sign extend issue, PR target/104898
PR target/104868 had an issue where my code that updated the DImode to
TImode sign extension for power10 failed. In looking at the failure
message, the reason is when extendditi2 tries to split the insn, it
generates an insn that does
On Thu, Mar 10, 2022 at 01:49:36PM -0600, Segher Boessenkool wrote:
> On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote:
> > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> > > --- a/gcc/config/rs6000/rs6000-cpus.def
> > > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > > @@
On Mon, Mar 07, 2022 at 03:37:18PM -0600, Segher Boessenkool wrote:
> Hi!
>
> On Sat, Mar 05, 2022 at 09:21:51AM +0100, Jakub Jelinek wrote:
> > As mentioned in the PR, right now on powerpc* __SIZEOF_{FLOAT,IBM}128__
> > macros are predefined unconditionally, because {ieee,ibm}128_float_type_node
On Wed, Mar 09, 2022 at 04:57:01PM -0600, Segher Boessenkool wrote:
> On Wed, Mar 09, 2022 at 10:10:07PM +0100, Jakub Jelinek wrote:
> > On Wed, Mar 09, 2022 at 02:57:20PM -0600, Segher Boessenkool wrote:
> > > But __ibm128 should *always* be supported, so this is a hypothetical
> > > problem.
> >
On Thu, Mar 10, 2022 at 01:49:36PM -0600, Segher Boessenkool wrote:
> On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote:
> > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> > > --- a/gcc/config/rs6000/rs6000-cpus.def
> > > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > > @@
Eliminate power8 fusion options, use power8 tuning, PR target/102059
The power8 fusion support used to be set automatically when -mcpu=power8 or
-mtune=power8 was used, and it was cleared for other CPUs. However, if you
used the target attribute or target #pragma to change the default cpu type
On Tue, Mar 08, 2022 at 11:28:03AM -0600, Segher Boessenkool wrote:
> On Fri, Feb 11, 2022 at 12:53:07PM -0500, Michael Meissner wrote:
> > Ping patch for PR target/102059 to ignore implicit -mpower8-fusion that
> > prevents a function targeting power9 or power10 from inlining a function
> > that
Here is the patch that I committed to the trunk:
Optimize signed DImode -> TImode on power10.
On power10, GCC tries to optimize the signed conversion from DImode to
TImode by using the vextsd2q instruction. However to generate this
instruction, it would have to generate 3 direct moves (1 from
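A minimal sketch of the conversion being discussed, written at the C level: a signed 64-bit (DImode) value is sign-extended to 128 bits (TImode). It assumes GCC's __int128 extension on a 64-bit target; the helper names are hypothetical, and the sketch does not show whether vextsd2q or some other sequence is generated.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 64-bit value to 128 bits: the upper 64 bits are
   copies of the sign bit of X.  */
__int128
sext_di_to_ti (int64_t x)
{
  return (__int128) x;
}

/* Return the upper 64 bits of a 128-bit value as an unsigned number.  */
uint64_t
ti_high_half (__int128 v)
{
  return (uint64_t) ((unsigned __int128) v >> 64);
}

/* Small self-check (hypothetical helper).  */
void
check_sext (void)
{
  assert (ti_high_half (sext_di_to_ti (-1)) == UINT64_MAX);
  assert (ti_high_half (sext_di_to_ti (5)) == 0);
}
```

Checking the upper half for both a negative and a non-negative input is a simple way to catch a conversion that zero-extends by mistake.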