Re: [PATCH 21/25] GCN Back-end (part 2/2).

2018-11-13 Thread Andrew Stubbs
On 12/11/2018 18:54, Jeff Law wrote: On 11/12/18 10:52 AM, Andrew Stubbs wrote: On 12/11/2018 17:20, Segher Boessenkool wrote: If you don't want useless USEs deleted, use UNSPEC_VOLATILE instead? Or actually use the register, i.e. as input to an actually needed instruction. They

Re: [PATCH 21/25] GCN Back-end (part 2/2).

2018-11-15 Thread Andrew Stubbs
On 14/11/2018 22:30, Jeff Law wrote: There's a particular case that has historically been problematical. If you have this kind of sequence in the epilogue restore register using FP move fp->sp (deallocates frame) return Under certain circumstances the scheduler can swa

Re: [PATCH 01/25] Handle vectors that don't fit in an integer.

2018-11-15 Thread Andrew Stubbs
integer modes. This breaks a number of assumptions throughout the compiler, but I don't really want to create modes just for this purpose. Instead, this patch fixes up the cases that I've found, so far, such that the compiler tries something else, or fails to optimize, rather than just

Re: [PATCH 21/25] GCN Back-end (part 2/2).

2018-11-16 Thread Andrew Stubbs
The guideline I would give to determine if you're vulnerable...  Do you have speculation, including the ability to speculate past a memory operation, branch prediction, memory caches and high resolution timer (ie, like a cycle timer).  If you've got those, then the processor is likely vulnerable t

[PATCH 00/10] AMD GCN Port v2

2018-11-16 Thread Andrew Stubbs
This is a reworked version of the remaining parts of the patch series I posted on September 5th. As before, the series contains the non-OpenACC/OpenMP portions of a port to AMD GCN3 and GCN5 GPU processors. It's sufficient to build single-threaded programs, with vectorization in the usual way. C

[PATCH 03/10] GCN libgcc.

2018-11-16 Thread Andrew Stubbs
This patch contains the GCN port of libgcc. Since the previous posting, I've removed gomp_print.c and reduction.c, as well as addressing some other feedback. 2018-11-16 Andrew Stubbs Kwok Cheung Yeung Julian Brown Tom de Vries l

[PATCH 01/10] Fix IRA ICE.

2018-11-16 Thread Andrew Stubbs
it's not ideal that these registers have not been processed by IRA, but it does not appear to do any real harm. 2018-11-16 Andrew Stubbs gcc/ * ira.c (setup_preferred_alternate_classes_for_new_pseudos): Skip pseudos not created by this pass. (move_unallocate

[PATCH 02/10] GCN libgfortran.

2018-11-16 Thread Andrew Stubbs
[Already approved by Janne Blomqvist and Jeff Law. Included here for completeness.] This patch contains the GCN port of libgfortran. We use the minimal configuration created for NVPTX. That's all that's required, besides the target-independent bug fixes posted already. 2018-11-

[PATCH 06/10] GCN back-end config

2018-11-16 Thread Andrew Stubbs
disabled if libdl is not available. 2018-11-16 Andrew Stubbs Kwok Cheung Yeung Julian Brown Tom de Vries Jan Hubicka Martin Jambor * config.sub: Recognize amdgcn*-*-amdhsa. * configure.ac: Likewise

[PATCH 08/10] Testsuite: GCN is always PIE.

2018-11-16 Thread Andrew Stubbs
[Already approved by Jeff Law. Included here for completeness.] The GCN/HSA loader ignores the load address and uses a random location, so we build all GCN binaries as PIE, by default. This patch makes the necessary testsuite adjustments to make this work correctly. 2018-11-16 Andrew Stubbs

[PATCH 09/10] Ignore LLVM's blank lines.

2018-11-16 Thread Andrew Stubbs
(and very noisy). The LLVM tools also have different command line options, so it's not possible to autodetect object formats in the same way. This patch addresses both issues. 2018-11-16 Andrew Stubbs gcc/testsuite/ * lib/file-format.exp (gcc_target_object_for

[PATCH 07/10] Add dg-require-effective-target exceptions

2018-11-16 Thread Andrew Stubbs
cted by the change. 2018-11-16 Andrew Stubbs Kwok Cheung Yeung Julian Brown Tom de Vries gcc/testsuite/ * c-c++-common/ubsan/pr71512-1.c: Require exceptions. * c-c++-common/ubsan/pr71512-2.c: Require exceptions. * gcc.c-

[PATCH 10/10] Port testsuite to GCN

2018-11-16 Thread Andrew Stubbs
This collection of miscellaneous patches configures the testsuite to run on AMD GCN in a standalone (i.e. not offloading) configuration. It assumes you have your Dejagnu set up to run binaries via the gcn-run tool. 2018-11-16 Andrew Stubbs Kwok Cheung Yeung Julian

Re: [PATCH 06/10] GCN back-end config

2018-11-20 Thread Andrew Stubbs
On 16/11/2018 17:44, Joseph Myers wrote: On Fri, 16 Nov 2018, Andrew Stubbs wrote: * config.sub: Recognize amdgcn*-*-amdhsa. config.sub should be copied from upstream config.git (along with config.guess at the same time), once the support has been added there; it shouldn't be pa

Re: [PATCH 01/10] Fix IRA ICE.

2018-11-21 Thread Andrew Stubbs
On 21/11/2018 00:47, Jeff Law wrote: This seems like a really gross hack and sets an expectation that generating registers in the target after IRA has started is OK. It is not OK. The fact that this works is, IMHO, likely an accident. What's the proper test for this? Neither lra_in_progress n

Re: [PR86438] avoid too-long shift in test

2019-04-15 Thread Andrew Stubbs
On 12/04/2019 02:42, Alexandre Oliva wrote: The test fell back to long long and long when __int128 is not available, but it assumed sizeof(long) < sizeof(long long) because of a shift count that would be out of range for a long long if their widths are the same. Fixed by splitting it up into two
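
A minimal sketch of the general technique, not the actual test from the patch: in C, a shift count equal to or greater than the width of the promoted left operand is undefined, so shifting a long long by the width of long is out of range whenever the two types have the same width. Splitting the shift into two halves keeps each count in range. The helper name shift_by_long_width is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: shift X left by the full bit-width of long.
   A single shift by sizeof (long) * CHAR_BIT would be undefined when
   long and long long have the same width, so shift twice by half the
   width instead.  Each half-width count is always in range.  */
unsigned long long
shift_by_long_width (unsigned long long x)
{
  const unsigned half = sizeof (long) * CHAR_BIT / 2;
  return (x << half) << half;
}
```

When long is as wide as long long, every bit is shifted out and the result is zero. When long is narrower, the result is the same as a single in-range shift.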

[patch] Fix Fortran size_t parameter passing

2019-05-22 Thread Andrew Stubbs
e_zero_node". I presume this works on other architectures because the types are the same size, or else because parameters are always 64-bit wide. OK to commit? Andrew Fix fortran size_t parameter passing. 2019-05-22 Andrew Stubbs gcc/fortran/ * trans-stmt.c (gfc_trans_critical): Us

[patch] Fix coarray_lock_7.f90 test failure

2019-05-22 Thread Andrew Stubbs
able-width matches. OK to commit? Andrew Fix new coarray failures. 2019-05-22 Andrew Stubbs gcc/testsuite/ * gfortran.dg/coarray_lock_7.f90: Fix output patterns. diff --git a/gcc/testsuite/gfortran.dg/coarray_lock_7.f90 b/gcc/testsuite/gfortran.dg/coarray_lock_7.f90 index aedb2267413..4f4bdde8

Re: [patch] Fix Fortran size_t parameter passing

2019-05-22 Thread Andrew Stubbs
compatible with integer_type_node, but size_zero_node is not (necessarily) compatible with size_type_node? Well, that's just asking for trouble. :-( Just to confirm, is the attached what you mean? Thanks Andrew Fix fortran size_type_node parameter passing. 2019-05-22 Andrew Stubbs gcc/fo

Re: [patch] Fix Fortran size_t parameter passing

2019-05-22 Thread Andrew Stubbs
On 22/05/2019 13:28, Janne Blomqvist wrote: Just to confirm, is the attached what you mean? Yes, looks good. Thanks, now committed. Andrew

Re: [patch] Fix coarray_lock_7.f90 test failure

2019-05-22 Thread Andrew Stubbs
On 22/05/2019 12:38, Janne Blomqvist wrote: Ok. Thanks, committed. Andrew

Re: [PATCH 02/25] Propagate address spaces to builtins.

2019-09-03 Thread Andrew Stubbs
On 03/09/2019 15:01, Kyrill Tkachov wrote: Sorry for responding to this so late. I'm testing a rebased version of Richard's OOL atomic patches [1] and am hitting an ICE building the -mabi=ilp32 libgfortran multilib for aarch64-none-elf: I thought Andreas already fixed ILP32. https://gcc.gnu.o

Re: [committed, amdgcn] Remove expcnt waits.

2019-09-05 Thread Andrew Stubbs
On 31/07/2019 13:02, Andrew Stubbs wrote: However, in a couple of cases there is an exposed-pipeline issue that needs to be resolved with an actual "nop", which we no longer have. The patch also takes care of adding these, where appropriate. (As it happens, the cmpswap instruction wi

[OG9, committed] Backport GCN expcnt patches

2019-09-05 Thread Andrew Stubbs
I just committed the attached patch to the openacc-gcc-9-branch. The patch removes the redundant s_waitcnt instruction from store instructions. The s_waitcnt with expcnt was a misunderstanding of the documentation. Andrew Backport expcnt patches. 2019-09-05 Andrew Stubbs Backport from

Re: [OG9, committed] Backport GCN expcnt patches

2019-09-06 Thread Andrew Stubbs
On 06/09/2019 08:42, Bernhard Reutner-Fischer wrote: On 5 September 2019 18:07:25 CEST, Andrew Stubbs wrote: I just committed the attached patch to the openacc-gcc-9-branch. + /* Store that requires input registers are not overwritten by +following instruction

[OG9, committed] Backport error message for mapped parameters

2019-09-06 Thread Andrew Stubbs
I've backported this patch from mainline. Andrew Tweak error message for mapped parameters. 2019-07-05 Andrew Stubbs gcc/fortran/ * openmp.c (resolve_omp_clauses): Add custom error messages for parameters in map clauses. diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c

[OG9, committed] Backport gfx906 patches

2019-09-06 Thread Andrew Stubbs
I've just backported these from mainline. They add the Vega 20 gfx906 architecture and multilib. Andrew Document -march=gfx906 option. 2019-06-07 Andrew Stubbs gcc/ * doc/invoke.texi (AMD GCN Options): Add gfx906. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 3103f8

[OG9, amdgcn, committed] Move offload data to graphics memory

2019-09-09 Thread Andrew Stubbs
ointers to device data, and is typically read only once. It also contains the print output data, but this is not performance critical (as in, don't use it if you care about performance). This may need to be reviewed if we want to use it for profiling. Andrew Move offload data into G

[OG9, amdgcn,committed] Fix relocations with multiple devices

2019-09-10 Thread Andrew Stubbs
alternative would be to copy the entire image before modifying it, each time it is loaded. Andrew Fix relocations with multiple devices. 2019-09-10 Andrew Stubbs libgomp/ * plugin/plugin-gcn.c (obstack_chunk_alloc): Delete. (obstack_chunk_free): Delete. (obstack.h): Remove include

[OG9, amdgcn, committed] Use GFX9 granulated sgprs count correctly

2019-09-10 Thread Andrew Stubbs
would hurt performance. Andrew Use GFX9 granulated sgprs count correctly. 2019-09-10 Andrew Stubbs gcc/ * config/gcn/gcn.c (gcn_hsa_declare_function_name): Calculate granulated_sgprs according to architecture. diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c index 66854b6f9c5..f8434e4a

[OG9, amdgcn, committed] Detect the actual number of hardware CUs

2019-09-10 Thread Andrew Stubbs
te Runtime (ROCr), but there are license issues with that. We could extract them from the documentation, but this is still on my TODO list. Andrew Detect number of GPU compute units. 2019-09-10 Andrew Stubbs libgomp/ * plugin/plugin-gcn.c (HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUN

[OG9, amdgcn, committed] Fix memory leak in libgomp

2019-09-10 Thread Andrew Stubbs
Committed to OG9 on behalf of Kwok ... The list of struct gomp_threads allocated in gomp_gcn_enter_kernel was not being freed in gomp_gcn_exit_kernel, leading to a small memory leak every time a kernel is run. Runs with a lot of teams or many kernels were running out of heap space. Andrew Fi

Re: [PATCH v3 00/10] AMD GCN Port v3

2019-01-07 Thread Andrew Stubbs
d in time for GCC 9, given that disruption to other targets should no longer be an issue? Andrew On 12/12/2018 11:52, Andrew Stubbs wrote: This is the third rework of the patchset previously posted on September 5th and November 16th. As before, the series contains the non-OpenACC/OpenMP portions

Re: [PATCH v3 00/10] AMD GCN Port v3

2019-01-14 Thread Andrew Stubbs
On 11/01/2019 23:19, Jeff Law wrote: And I think the V3 patch is reasonable enough to go in now. There's some concerns that have been raised with the implementation, but I'm comfortable with Andrew faulting in fixes if those concerns turn into real issues. Andrew, you're green-lighted for the t

Re: [PATCH v3 00/10] AMD GCN Port v3

2019-01-17 Thread Andrew Stubbs
On 14/01/2019 13:55, Andrew Stubbs wrote: I will now rebase, retest, change all the dates to 2019, and get it committed. This is now done! :-) The Newlib port is also committed, so all the pieces needed for testing GCN should be available to everybody now. To be clear, the libgomp port

[wwwdocs] Mention AMD GCN on the website

2019-01-17 Thread Andrew Stubbs
AMD GCN has now been committed to the trunk. Is the attached OK for the website? Most of the wording has been modelled on the existing C-SKY announcements. Thanks Andrew diff --git a/htdocs/backends.html b/htdocs/backends.html index bb70aa6..eecd09a 100644 --- a/htdocs/backends.html +++ b/htd

Re: [wwwdocs] Mention AMD GCN on the website

2019-01-17 Thread Andrew Stubbs
On 17/01/2019 17:39, Andi Kleen wrote: Can you add a few words on the current limitations? How's this? Andrew diff --git a/htdocs/backends.html b/htdocs/backends.html index bb70aa6..eecd09a 100644 --- a/htdocs/backends.html +++ b/htdocs/backends.html @@ -81,6 +81,7 @@ csky |

[patch] Document AMD GCN features.

2019-01-18 Thread Andrew Stubbs
Hi, This patch adds the documentation needed for the newly-added AMD GCN back end. OK to commit? Andrew Document AMD GCN. 2019-01-18 Andrew Stubbs gcc/ * doc/extend.tex (AMD GCN Function Attributes): New section. * doc/install.texi (amdgcn-unknown-amdhsa): New instructions. * doc

Re: [patch] Document AMD GCN features.

2019-01-22 Thread Andrew Stubbs
On 21/01/2019 18:03, Jeff Law wrote: 2019-01-18 Andrew Stubbs gcc/ * doc/extend.tex (AMD GCN Function Attributes): New section. * doc/install.texi (amdgcn-unknown-amdhsa): New instructions. * doc/invoke.texi (AMD GCN Options): New section. * doc

[patch][pr88920] Fix noisy check_effective_target_offload_gcn

2019-01-29 Thread Andrew Stubbs
the message does not show up at all, unless the verbosity level is raised? Thanks Andrew Cache effective-target llvm_binutils result. 2019-01-21 Andrew Stubbs gcc/testsuite/ * lib/target-supports.exp: Cache result. diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib

Re: [patch][pr88920] Fix noisy check_effective_target_offload_gcn

2019-01-30 Thread Andrew Stubbs
On 29/01/2019 11:31, Richard Biener wrote: OK. Thanks. Patch committed. Andrew

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-10-15 Thread Andrew Stubbs
On 15/10/14 17:34, Jiong Wang wrote: On 23/09/14 16:22, Stubbs, Andrew wrote: Maybe the original patch is better? Or maybe it should reconfigure the FPU instead of erroring out? But reconfigure it to what? Andrew, are you still working on this? a bunch of tests on my local environment

Re: [patch] Warn on undefined loop exit

2014-11-13 Thread Andrew Stubbs
On 12/11/14 11:15, Richard Biener wrote: Please find a better way to communicate possibly_undefined_stmt than enlarging struct loop. Like associating it with the niter bound we record (so you can also have more than one). Unfortunately, the bounds get regenerated frequently, but the upper bou

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-14 Thread Andrew Stubbs
On 07/11/14 10:35, Andrew Stubbs wrote: if armv6 never co-exist with NEON, personally I think your original patch is better because TARGET_NEON generally will be used when all options are processed. any way, this needs gate keeper's approval. Ping, Richard. Ping.

Re: [patch] Warn on undefined loop exit

2014-11-19 Thread Andrew Stubbs
On 13/11/14 21:35, Andrew Stubbs wrote: On 12/11/14 11:15, Richard Biener wrote: Please find a better way to communicate possibly_undefined_stmt than enlarging struct loop. Like associating it with the niter bound we record (so you can also have more than one). Unfortunately, the bounds get

Re: [patch] Warn on undefined loop exit

2014-11-19 Thread Andrew Stubbs
On 19/11/14 16:39, Marek Polacek wrote: On Wed, Nov 19, 2014 at 04:32:43PM +, Andrew Stubbs wrote: +if (warning_at (gimple_location (elt->stmt), +OPT_Waggressive_loop_optimizations, +"Loop

Re: [patch] Warn on undefined loop exit

2014-11-20 Thread Andrew Stubbs
xit_warned && problem_stmts != vNULL) +{ !problem_stmts.empty () Otherwise it looks ok. I've committed the attached. I'll work up a patch to dedup the condition shortly. Andrew 2014-11-20 Andrew Stubbs gcc/ * tree-ssa-loop-niter.c (maybe_lower_iteration_bound): War

Re: [patch] Warn on undefined loop exit

2014-11-21 Thread Andrew Stubbs
On 20/11/14 20:55, Marek Polacek wrote: On Thu, Nov 20, 2014 at 05:27:35PM +0100, Richard Biener wrote: + if (exit_warned && problem_stmts != vNULL) +{ !problem_stmts.empty () /home/marek/src/gcc/gcc/tree-ssa-loop-niter.c: In function ‘void maybe_lower_iteration_bound(loop*)’: /

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-26 Thread Andrew Stubbs
On 14/11/14 11:12, Andrew Stubbs wrote: On 07/11/14 10:35, Andrew Stubbs wrote: if armv6 never co-exist with NEON, personally I think your original patch is better because TARGET_NEON generally will be used when all options are processed. any way, this needs gate keeper's app

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-27 Thread Andrew Stubbs
On 27/11/14 17:05, Mike Stump wrote: Could you include a link or the patch. If the test suite, I'll review it if no one else steps up. The original patch is here: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01119.html Thanks Andrew

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-12-03 Thread Andrew Stubbs
On 02/12/14 21:45, Ramana Radhakrishnan wrote: I've spent some time this evening pondering over your patch. Firstly it appears that the current behaviour is going to cause more breakage than originally expected. If this is to go in we'd have a number of users having to add -mfloat-abi=soft to the

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-12-23 Thread Andrew Stubbs
On 03/12/14 15:03, Andrew Stubbs wrote: The tools have always allowed us to drop down the arch to march=armv5te along with using -mfpu=neon. We are now changing command line behaviour, so an inform in terms of diagnostics to the user would be useful as it states that we don't really have

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2015-01-12 Thread Andrew Stubbs
Ping. On 23/12/14 16:46, Andrew Stubbs wrote: On 03/12/14 15:03, Andrew Stubbs wrote: The tools have always allowed us to drop down the arch to march=armv5te along with using -mfpu=neon. We are now changing command line behaviour, so an inform in terms of diagnostics to the user would be

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2015-01-13 Thread Andrew Stubbs
s. Andrew 2015-01-13 Andrew Stubbs gcc/testsuite/ * lib/target-supports.exp (check_effective_target_arm_neon_ok_nocache): Don't try to test Neon on ARM architectures before v7. Index: gcc/testsuite/lib/target-supports.exp

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2015-01-14 Thread Andrew Stubbs
On 14/01/15 08:21, Ramana Radhakrishnan wrote: Ok, that should be enough. Please watch out for any testing fallout this week. Committed, thanks. Andrew

[arm][patch] fix arm_neon_ok check on !arm_arch7

2014-09-13 Thread Andrew Stubbs
s. Otherwise it just takes -mfpu=neon at face value, regardless of -march or -mcpu. This patch limits NEON to armv7 or higher. OK? Andrew 2014-09-13 Andrew Stubbs gcc/ * config/arm/arm.h (TARGET_NEON): Ensure target is v7 or higher. Index:

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-09-15 Thread Andrew Stubbs
On 15/09/14 10:46, Richard Earnshaw wrote: Hmm, I wonder if arm_override_options should reject neon + (arch < 7). Is this more to your taste? Andrew P.S. arm_override_options was renamed in 2010. 2014-09-15 Andrew Stubbs * gcc/config/arm/arm.c (arm_option_override): Reject -mfpu=n

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-09-17 Thread Andrew Stubbs
On 15/09/14 14:29, Richard Earnshaw wrote: Yep, that's fine. Committed, thanks. Andrew

[PATCH][PPC] Skip gcc.target tests with conflicting -mcpu

2014-10-30 Thread Andrew Stubbs
ent situation where a test uses dg-require-effective-target to test the default target, and then adds options that would change the result of that test; it can cause a test to get skipped when actually it would work fine. Anyway, that's a problem for a different day. OK to commit? Andrew 201

Re: [PATCH][PPC] Skip gcc.target tests with conflicting -mcpu

2014-10-31 Thread Andrew Stubbs
On 30/10/14 18:37, Mike Stump wrote: On Oct 30, 2014, at 10:25 AM, Andrew Stubbs wrote: Many of the tests in gcc.target/powerpc specify an explicit -mcpu option with dg-options. So, I think this isn’t the strategy people like for this sort of thing. The problem is default flags. You can

Re: [PATCH][PPC] Skip gcc.target tests with conflicting -mcpu

2014-11-04 Thread Andrew Stubbs
On 03/11/14 18:36, Mike Stump wrote: On Oct 30, 2014, at 10:25 AM, Andrew Stubbs wrote: Many of the tests in gcc.target/powerpc specify an explicit -mcpu option with dg-options. This is a problem for multilib configurations that use -mcpu in their definition OK to commit? Given the

[patch] Warn on undefined loop exit

2014-11-05 Thread Andrew Stubbs
re the compiler could be adjusted to avoid the surprising optimization. Would it be appropriate to do so? OK to commit? Andrew 2014-11-05 Andrew Stubbs gcc/ * tree-ssa-loop-niter.c (maybe_lower_iteration_bound): Set loop->possibly_undefined_stmt appropriately. * tree-ssa-loop-ivcanon.c (r

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-07 Thread Andrew Stubbs
if armv6 never co-exist with NEON, personally I think your original patch is better because TARGET_NEON generally will be used when all options are processed. any way, this needs gate keeper's approval. Ping, Richard. Andrew

Re: [patch] Warn on undefined loop exit

2014-11-12 Thread Andrew Stubbs
Ping. On 05/11/14 21:45, Andrew Stubbs wrote: This patch adds the following warning message: undefined.c:9:20: warning: statement may be undefined in the final loop iteration. [-Waggressive-loop-optimizations] for (i = 0; array[i] && i < 5; i++) ^ (Whe
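
A sketch of the kind of loop the warning targets, assuming the array has five elements (the declaration is not shown in the snippet). In the loop quoted above, array[i] is evaluated before i < 5, so the final iteration would read array[5]. Testing the bound first avoids the out-of-range read. The function name count_leading_nonzero is hypothetical.

```c
/* Hypothetical example: count the leading nonzero elements of a
   five-element array.  The bound is checked before the element is
   read, so array[5] is never accessed.  */
int
count_leading_nonzero (const int array[5])
{
  int i;
  for (i = 0; i < 5 && array[i]; i++)
    ;
  return i;
}
```

Because && short-circuits, swapping the two operands is enough to make the last iteration well defined.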

[PATCH][ARM] add support for some missing 16-bit multiplication insns

2011-05-27 Thread Andrew Stubbs
ch until my other patch for canonical mult patterns is applied. OK? Andrew 2011-05-27 Andrew Stubbs gcc/ * config/arm/arm.md (*maddhidi4tb, *maddhidi4tt): New define_insns. (*maddhisi4tb, *maddhisi4tt): New define_insns. gcc/testsuite/ * gcc.target/arm/smlatb-1.c: New file. * gcc.ta

Re: [PATCH, ARM] Thumb-2 12-bit immediates in ADD and SUB instructions

2011-06-01 Thread Andrew Stubbs
On 31/05/11 16:27, Dmitry Plotnikov wrote: Would you include this in your patch? Or should we submit it as a separate patch? I'm not sure I *can* commit your patches, legally speaking, although this one is small enough that probably it's ok ... probably. Perhaps you should submit it yourself

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-02 Thread Andrew Stubbs
Ping 2. On 20/04/11 16:27, Andrew Stubbs wrote: This patch adds basic support for the Thumb ADDW and SUBW instructions. The patch permits the compiler to use the new instructions for constants that can be loaded with a single instruction (i.e. 16-bit unshifted), but does not support use of

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-02 Thread Andrew Stubbs
On 02/06/11 09:23, Ramana Radhakrishnan wrote: Please remove the alternatives in the subsi3 pattern since that is just unnecessary. Please make the constraints internal only. Is this better? Andrew 2011-06-02 Andrew Stubbs gcc/ * config/arm/arm-protos.h (const_ok_for_op): Add prototype

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-06 Thread Andrew Stubbs
On 06/06/11 13:15, Dmitry Plotnikov wrote: + && (const_ok_for_op (INTVAL (x), outer) + || const_ok_for_op (~INTVAL (x), outer The second call is redundant. const_ok_for_op should already do that. Andrew

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-06 Thread Andrew Stubbs
On 06/06/11 14:26, Dmitry Plotnikov wrote: if (const_ok_for_arm (INTVAL (x)) - || const_ok_for_arm (~INTVAL (x))) + || const_ok_for_arm (~INTVAL (x)) + || (TARGET_THUMB2&& outer == PLUS + && (const_ok_for_op (INTVAL (x), outer Sorry, I should have not

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-06-06 Thread Andrew Stubbs
On 06/06/11 15:26, Dmitry Plotnikov wrote: On 06/06/2011 05:33 PM, Andrew Stubbs wrote: On 06/06/11 14:26, Dmitry Plotnikov wrote: if (const_ok_for_arm (INTVAL (x)) - || const_ok_for_arm (~INTVAL (x))) + || const_ok_for_arm (~INTVAL (x)) + || (TARGET_THUMB2&& outer

Re: [patch][simplify-rtx] Fix 16-bit -> 64-bit multiply and accumulate

2011-06-07 Thread Andrew Stubbs
On 02/06/11 10:46, Richard Earnshaw wrote: OK. Committed, thanks. Andrew

Re: [PATCH][ARM] add support for some missing 16-bit multiplication insns

2011-06-07 Thread Andrew Stubbs
On 02/06/11 16:47, Richard Earnshaw wrote: OK. Committed, thanks. Andrew

[PATCH (0/7)] Improve use of Widening Multiplies

2011-06-23 Thread Andrew Stubbs
Hi all, This patch series is intended to improve use of widening multiply, and widening multiply-and-accumulate instructions. This is primarily for the benefit of ARM targets, but should give some improvements to other targets also. The patches provide a number of improvements: * Support f

[PATCH (1/7)] New optab framework for widening multiplies

2011-06-23 Thread Andrew Stubbs
have a widening add, shift, or whatever. Is this patch OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * expr.c (expand_expr_real_2): Use widening_optab_handler. * genopinit.c (optabs): Use set_widening_optab_handler for $N. (gen_insn): $N now means $a must be wider than $b, n

[PATCH (2/7)] Widening multiplies by more than one mode

2011-06-23 Thread Andrew Stubbs
"type2" were implicitly identical because they were required to be one mode smaller than "type". I regard the ARM portion of this patch as obvious, so I don't think I need an ARM maintainer to read this. Is the patch OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * conf

[PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-23 Thread Andrew Stubbs
-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Look for multiply statement beyond NOP_EXPR statements. gcc/testsuite/ * gcc.target/arm/umlal-1.c: New file. --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/umlal-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile

[PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-06-23 Thread Andrew Stubbs
guarantee by zero-extending the inputs to a wider mode (which must still be narrower than the output mode). OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency. * optabs.c (find_widening_optab_handler): Rename to
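
The idea of zero-extending unsigned inputs so that a signed widening multiply gives the right answer can be sketched in plain C. This is only an illustration under assumed types (16-bit inputs, a 32-bit intermediate, a 64-bit result), not the compiler's implementation; the function name umul16_via_signed is hypothetical.

```c
#include <stdint.h>

/* Hypothetical illustration: multiply two unsigned 16-bit values with
   a signed multiply.  Zero-extending to a wider signed type (32 bits,
   still narrower than the 64-bit result) guarantees both operands are
   non-negative, so the signed product equals the unsigned one.  */
uint64_t
umul16_via_signed (uint16_t a, uint16_t b)
{
  int32_t wide_a = a;   /* zero-extended, always >= 0 */
  int32_t wide_b = b;   /* zero-extended, always >= 0 */
  return (uint64_t) ((int64_t) wide_a * (int64_t) wide_b);
}
```

The intermediate type must be wide enough that the top bit of each extended input is clear; otherwise the signed multiply would treat large unsigned values as negative.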

[PATCH (5/7)] Widening multiplies for mis-matched mode inputs

2011-06-23 Thread Andrew Stubbs
being the input type of the operation, and the gimple verification code is still valid. OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME. Ensure the larger type is the first operand. (convert_mult_to_widen): Insert cast if type2 is

[PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-06-23 Thread Andrew Stubbs
re not being widened), so the pattern match failed. The patch fixes these issues by making the output type explicit, and by permitting unconverted inputs (the types are still checked, so this is safe). OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (is_widening_mult_rhs_

[PATCH (7/7)] Mixed-sign multiplies using narrowest mode

2011-06-23 Thread Andrew Stubbs
inputs may still have to be extended to fit the nearest available instruction, so it doesn't make a difference every time. OK? Andrew 2011-06-23 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (convert_mult_to_widen): Better handle unsigned inputs of different

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-24 Thread Andrew Stubbs
On 23/06/11 17:26, Richard Guenther wrote: On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs wrote: There are many cases where the widening_mult pass does not recognise widening multiply-and-accumulate cases simply because there is a type conversion step between the multiply and add statements

Re: [PATCH (0/7)] Improve use of Widening Multiplies

2011-06-27 Thread Andrew Stubbs
On 25/06/11 15:12, Bernd Schmidt wrote: That all sounds good, but missing from this list is something that occurs on many CPUs - widening from the high part of a register. The current machinery only recognizes lowxlow widening multiplication, but hardware often exists for highxlow and highxhigh.

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-28 Thread Andrew Stubbs
at any point. I've also changed the test cases to address Janis' comments. Andrew 2011-06-28 Andrew Stubbs gcc/ * gimple.h (tree_ssa_harmless_type_conversion): New prototype. (tree_ssa_strip_harmless_type_conversions): New prototype. (harmless_type_conversion_p): New prototype.

Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-06-28 Thread Andrew Stubbs
On 23/06/11 15:41, Andrew Stubbs wrote: If one or both of the inputs to a widening multiply are of unsigned type then the compiler will attempt to use usmul_widen_optab or umul_widen_optab, respectively. That works fine, but only if the target supports those operations directly. Otherwise, it

Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-06-28 Thread Andrew Stubbs
On 28/06/11 13:33, Andrew Stubbs wrote: On 23/06/11 15:41, Andrew Stubbs wrote: If one or both of the inputs to a widening multiply are of unsigned type then the compiler will attempt to use usmul_widen_optab or umul_widen_optab, respectively. That works fine, but only if the target supports

Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs

2011-06-28 Thread Andrew Stubbs
On 23/06/11 15:41, Andrew Stubbs wrote: This patch removes the restriction that the inputs to a widening multiply must be of the same mode. It does this by extending the smaller of the two inputs to match the larger; therefore, it remains the case that subsequent code (in the expand pass, for

Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-06-28 Thread Andrew Stubbs
On 23/06/11 15:42, Andrew Stubbs wrote: This patch fixes the case where widening multiply-and-accumulate were not recognised because the multiplication itself is not actually widening. This can happen when you have "DI + SI * SI" - the multiplication will be done in SImode as a no

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-06-28 Thread Andrew Stubbs
On 28/06/11 16:53, Michael Matz wrote: On Tue, 28 Jun 2011, Richard Guenther wrote: I'd name the predicate value_preserving_conversion_p which I think is what you mean. harmless isn't really descriptive. Note that you include non-value-preserving conversions, namely int -> unsigned int. It s

Re: [PATCH (7/7)] Mixed-sign multiplies using narrowest mode

2011-06-28 Thread Andrew Stubbs
On 23/06/11 15:43, Andrew Stubbs wrote: Patch 4 introduced support for using signed multiplies to code unsigned multiplies in a narrower mode. Patch 5 then introduced support for mis-matched input modes. These two combined mean that there is a case where only the smaller of two inputs is unsigned

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-04 Thread Andrew Stubbs
re are about what sort of cast sequences can exist? Is this necessary? I haven't managed to coax it to generate any examples of extends followed by truncates myself, but in any case, it's hardly any code and it'll make sure it's future proofed. OK? Andrew 2011-06-

Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-04 Thread Andrew Stubbs
On 28/06/11 15:14, Andrew Stubbs wrote: On 28/06/11 13:33, Andrew Stubbs wrote: On 23/06/11 15:41, Andrew Stubbs wrote: If one or both of the inputs to a widening multiply are of unsigned type then the compiler will attempt to use usmul_widen_optab or umul_widen_optab, respectively. That

Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs

2011-07-04 Thread Andrew Stubbs
On 28/06/11 16:08, Andrew Stubbs wrote: On 23/06/11 15:41, Andrew Stubbs wrote: This patch removes the restriction that the inputs to a widening multiply must be of the same mode. It does this by extending the smaller of the two inputs to match the larger; therefore, it remains the case that

Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-07-04 Thread Andrew Stubbs
On 28/06/11 16:30, Andrew Stubbs wrote: On 23/06/11 15:42, Andrew Stubbs wrote: This patch fixes the case where widening multiply-and-accumulate were not recognised because the multiplication itself is not actually widening. This can happen when you have "DI + SI * SI" - the mult

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-07 Thread Andrew Stubbs
On 07/07/11 10:58, Richard Guenther wrote: I think you should assume that series of widenings, (int)(short)char_variable are already combined. Thus I believe you only need to consider a single conversion in valid_types_for_madd_p. Hmm, I'm not so sure. I'll look into it a bit further. +/* Ch

Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-07 Thread Andrew Stubbs
On 07/07/11 11:04, Richard Guenther wrote: Both types are equal, so please share the temporary variable you create + rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), rhs1, type1); + rhs2 = build_and_i

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-07 Thread Andrew Stubbs
On 07/07/11 11:26, Andrew Stubbs wrote: On 07/07/11 10:58, Richard Guenther wrote: I think you should assume that series of widenings, (int)(short)char_variable are already combined. Thus I believe you only need to consider a single conversion in valid_types_for_madd_p. Hmm, I'm not so

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-08 Thread Andrew Stubbs
On 07/07/11 13:37, Richard Guenther wrote: I'll cook up a quick patch for VRP. Like the attached. I'll finish and properly test it. Your patch appears to do the wrong thing for this test case: int foo (int a, short b, short c) { int bc = b * c; return a + (short)bc; } With your patch,
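
The snippet is cut off before it says what goes wrong, so the following is only a guess at why the cast in this test case matters: presumably the truncation to short must not be dropped when the multiply and add are combined. The sketch compares the test case with a hypothetical variant, foo_no_truncate, that omits the cast. It assumes GCC's documented behaviour of reducing out-of-range values modulo 2^N when converting to a narrower signed type.

```c
/* The test case from the message, unchanged in meaning.  */
int
foo (int a, short b, short c)
{
  int bc = b * c;
  return a + (short) bc;
}

/* Hypothetical variant without the truncating cast: this is what a
   plain multiply-and-accumulate would compute.  */
int
foo_no_truncate (int a, short b, short c)
{
  return a + b * c;
}
```

With b = c = 256 the product is 65536, which truncates to 0 as a short, so the two functions return different values for the same inputs.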

Re: [PATCH (1/7)] New optab framework for widening multiplies

2011-07-09 Thread Andrew Stubbs
On 23/06/11 15:37, Andrew Stubbs wrote: This patch should have no effect on the compiler output. It merely replaces one way to represent widening operations with another, and refactors the other parts of the compiler to match. The rest of the patch set uses this new framework to implement the

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-11 Thread Andrew Stubbs
ontinue to have to be careful about casting unsigned variables whenever they expect purely unsigned math. :( Is this one ok? Andrew 2011-07-11 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single conversion statement separating multiply-and-accumulate.
