Add ops_num to targetm.sched.reassociation_width hook

2021-08-03 Thread Aaron Sawdey via Gcc
uses of that target hook)? Thanks! Aaron Aaron Sawdey, Ph.D. saw...@linux.ibm.com IBM Linux on POWER Toolchain

Re: Avoiding truncate/sign-extend of POImode on ppc target

2020-09-02 Thread Aaron Sawdey via Gcc
Meant to CC a few people, oops. Aaron Sawdey, Ph.D. saw...@linux.ibm.com IBM Linux on POWER Toolchain > On Sep 2, 2020, at 9:22 AM, Aaron Sawdey via Gcc wrote: > > > PR96791 is happening because DSE is trying to truncate a > POImode reg down to DImode. The POImode

Avoiding truncate/sign-extend of POImode on ppc target

2020-09-02 Thread Aaron Sawdey via Gcc
community: Is there an existing way to do this? Or, do we need another target hook of some kind to check this sort of thing? Thanks, Aaron Aaron Sawdey, Ph.D. saw...@linux.ibm.com IBM Linux on POWER Toolchain

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 1:01 PM, Jakub Jelinek wrote: > On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote: >> 1) rename optab movmem and the underlying patterns to cpymem. >> 2) add a new optab movmem that is really memmove() and add support for >> having __builtin_memmove() u

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 11:31 AM, Jakub Jelinek wrote: > On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote: >> My goals for this are: >> * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem] >> * memmove() call becomes __builtin_memmove (or __builtin_memcpy base

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 9:02 AM, Michael Matz wrote: > On Wed, 15 May 2019, Aaron Sawdey wrote: >> Next question would be how do we move from the existing movmem pattern >> (which Michael Matz tells us should be renamed cpymem anyway) to this >> new thing. Are you proposing that we still

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 7:22 AM, Richard Biener wrote: > On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey wrote: >> I'd be interested in any comments about pieces of this machinery that need to >> work a certain way, or other related issues that should be addressed in >> between e

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 8:10 AM, Michael Matz wrote: > On Tue, 14 May 2019, Aaron Sawdey wrote: > >> memcpy -> expand with movmem pattern >> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern >> memmove (overlap) -> remains memmove -> gl
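
The dispatch quoted in the snippet above (memmove with no overlap handled like memcpy, overlapping memmove left as memmove) can be illustrated at the library level. This is only a sketch in plain C, not GCC's expansion code: `ranges_overlap` and `copy_bytes` are hypothetical names introduced here, and the runtime address comparison stands in for whatever overlap knowledge the compiler has when it expands the builtin.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when [dst, dst+n) and [src, src+n)
   share at least one byte, 0 otherwise. */
static int ranges_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0;
    /* The ranges overlap when the distance between the two start
       addresses is smaller than the length. */
    return d < s ? (s - d) < n : (d - s) < n;
}

/* Hypothetical dispatcher: non-overlapping copies take the memcpy
   path, overlapping ones stay on the memmove path. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (!ranges_overlap(dst, src, n))
        return memcpy(dst, src, n);
    return memmove(dst, src, n);
}
```

Calling `copy_bytes(buf + 2, buf, 4)` on a single buffer takes the memmove path, while a copy between two distinct arrays takes the memcpy path.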

Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-14 Thread Aaron Sawdey
comments about pieces of this machinery that need to work a certain way, or other related issues that should be addressed in between expand_builtin_memcpy() and emit_block_move_via_movmem(). Thanks! Aaron -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/26

Re: [RFC][GCC][rs6000] Remaining work for inline expansion of strncmp/strcmp/memcmp for powerpc

2018-10-18 Thread Aaron Sawdey
On 10/17/18 4:03 PM, Florian Weimer wrote: > * Aaron Sawdey: > >> I've previously posted a patch to add vector/vsx inline expansion of >> strcmp/strncmp for the power8/power9 processors. Here are some of the >> other items I have in the pipeline that I hope to get

[RFC][GCC][rs6000] Remaining work for inline expansion of strncmp/strcmp/memcmp for powerpc

2018-10-17 Thread Aaron Sawdey
first 512 bytes inline before dumping to the library function. If anyone has any other input on the inline expansion work I've been doing for the rs6000 target, please let me know. Thanks! Aaron

Re: help with PR78809 - inline strcmp for small constant strings

2017-08-05 Thread Aaron Sawdey
= 0) > > > > > > to be implemented with cmp+ccmp+ccmp and one branch. > > > > Even better would be wider loads if you either know the alignment > > of s or its max size > > (although given the overhead of creating the return value that

Re: k-byte memset/memcpy/strlen builtins

2017-01-11 Thread Aaron Sawdey
situation might of > course be more complicated than memset because of encodings etc. My > snippet > in question used a fixed-length encoding of 2 bytes, however. > > Another simple idea to tackle this would be a peephole optimization > but > I'm not sure if this is really feasib

cmpstrnsi pattern should check for zero byte?

2016-11-01 Thread Aaron Sawdey
"scmpu" instruction to do the comparison. The RX manual I found showed pseudocode for scmpu that shows it both checks for zero byte as well as comparing the strings. If this isn't correct, please let me know here or on the patch itself. Thanks, Aaron -- Aaron Sawde

determining reassociation width

2016-05-02 Thread Aaron Sawdey
tions whose terms are fp multiplies because now we have fused multiply-adds to consider. See PR 70912 for more on this. Suggestions? Thanks, Aaron

guessed profile counts leading to incorrect static branch hints on ppc64

2015-12-09 Thread Aaron Sawdey
e estimation. */ esucc->probability = REG_BR_PROB_BASE; } } } It would appear that the guessed counts are getting changed inconsistently before this during the tree-ssa-dom pass. Any trail of breadcrumbs to follow through the forest would be helpful here ... Thanks! Aaron -- A

Re: Live range Analysis based on tree representations

2015-09-15 Thread Aaron Sawdey
On Sat, 2015-09-12 at 18:45 +, Ajit Kumar Agarwal wrote: > > > -Original Message- > From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com] > Sent: Friday, September 04, 2015 11:51 PM > To: Ajit Kumar Agarwal > Cc: Jeff Law; vmaka...@redhat.com; Richard Bi

Re: Live range Analysis based on tree representations

2015-09-04 Thread Aaron Sawdey
On Thu, 2015-09-03 at 15:22 +, Ajit Kumar Agarwal wrote: > > > -Original Message- > From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com] > Sent: Wednesday, September 02, 2015 8:23 PM > To: Ajit Kumar Agarwal > Cc: Jeff Law; vmaka...@redhat.com; Richard Bi

Re: Live range Analysis based on tree representations

2015-09-02 Thread Aaron Sawdey
register allocation so it doesn't fall down when register pressure gets high. The code is in a branch called lto-pressure. Aaron > > Thanks & Regards > Ajit

Re: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Aaron Sawdey
On Mon, 2014-06-16 at 14:42 -0400, Vladimir Makarov wrote: > On 2014-06-16, 2:25 PM, Aaron Sawdey wrote: > > On Mon, 2014-06-16 at 14:14 +, Ajit Kumar Agarwal wrote: > >> Hello All: > >> > >> I have worked on the Open64 compiler where the Register Pressur

Re: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Aaron Sawdey
milar goal of avoiding situations that cause a lot of spill code. I have been working in a branch if you want to take a look: gcc/branches/lto-pressure Aaron > > Thanks & Regards > Ajit