[committed, amdgcn] Zero-initialise masked load destinations

2020-01-31 Thread Andrew Stubbs
an.dg/assumed_rank_1.f90. 2020-01-30 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (gather_exec): Move contents ... (mask_gather_load): ... here, and zero-initialize the destination. (maskloaddi): Zero-initialize the destination. * config/gcn/gcn.c: diff --git a/gcc/config/gcn/gcn-valu.md

Re: [PR93488] [OpenACC] ICE in type-cast 'async', 'wait' clauses

2020-01-30 Thread Andrew Stubbs
. This should then be backported to all GCC release branches; I can easily test the backports for you, if you're not already set up to do such testing. How's this? Andrew Normalize GOACC_parallel_keyed async and wait parameters 2020-01-30 Andrew Stubbs Thomas Schwinge PR middle-end/9

[committed, amdgcn] Add LTGT support

2020-01-30 Thread Andrew Stubbs
in testcase gcc.dg/pr81228.c. Andrew Add LTGT operator support for amdgcn Fixes ICE in testcase gcc.dg/pr81228.c 2020-01-30 Andrew Stubbs gcc/ * config/gcn/gcn.c (print_operand): Handle LTGT. * config/gcn/predicates.md (gcn_fp_compare_operator): Allow ltgt. diff --git a/gcc/config/gcn/gcn.c b

Re: [PATCH] Add OpenACC acc_get_property support for AMD GCN

2020-01-30 Thread Andrew Stubbs
On 30/01/2020 16:08, Thomas Schwinge wrote: Hi! Andrew and Frederik, thanks for your emails reminding/educating me about 'snprintf' as well as this HSA fixed-size buffer API. There doesn't happen to be something available in the HSA API available so that we could use 'sizeof [something]'

Re: [PATCH, ivopts] Fix fast-math-pr55281.c ICE

2020-01-30 Thread Andrew Stubbs
On 30/01/2020 13:49, Richard Biener wrote: On Thu, Jan 30, 2020 at 2:04 PM Bin.Cheng wrote: On Thu, Jan 30, 2020 at 8:53 PM Andrew Stubbs wrote: On 29/01/2020 08:24, Richard Biener wrote: On Tue, Jan 28, 2020 at 5:53 PM Andrew Stubbs wrote: This patch fixes an ICE compiling fast-math

Re: [PATCH, ivopts] Fix fast-math-pr55281.c ICE

2020-01-30 Thread Andrew Stubbs
On 29/01/2020 08:24, Richard Biener wrote: On Tue, Jan 28, 2020 at 5:53 PM Andrew Stubbs wrote: This patch fixes an ICE compiling fast-math-pr55281.c for amdgcn. The problem is that an "iv" is created in which both base and step are pointer types, How did you get a POINTER

Re: [Patch] [libgomp, build] Skip plugin-{gcn,hsa} for (-m)x32 (PR bootstrap/93409)

2020-01-30 Thread Andrew Stubbs
On 30/01/2020 09:20, Jakub Jelinek wrote: On Fri, Jan 24, 2020 at 03:59:28PM +0100, Tobias Burnus wrote: As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails with a bunch of error messages when building with --with-multilib-list=m32,m64,mx32 The reason is that the GCN plugin

Re: [PATCH] Add OpenACC acc_get_property support for AMD GCN

2020-01-29 Thread Andrew Stubbs
On 29/01/2020 17:44, Thomas Schwinge wrote: @@ -1513,6 +1518,23 @@ init_hsa_context (void) + size_t len = sizeof hsa_context.driver_version_s; + int printed = snprintf (hsa_context.driver_version_s, len, + "HSA Runtime %hu.%hu", (unsigned short int)major, +

Re: [Patch] GCN – call assembler with -mattr=-code-object-v3 (PR93409)

2020-01-29 Thread Andrew Stubbs
On 29/01/2020 15:40, Tobias Burnus wrote: Hi Andrew, On 1/29/20 2:01 PM, Andrew Stubbs wrote: On 29/01/2020 12:53, Tobias Burnus wrote: With LLVM 9, the old variant is only accepted when also passing "-mattr=-code-object-v3" to the compiler; that's a"-" after th

Re: [Patch] GCN – call assembler with -mattr=-code-object-v3 (PR93409)

2020-01-29 Thread Andrew Stubbs
On 29/01/2020 12:53, Tobias Burnus wrote: Cf. PR93409 comments 4 and later. The comments 1–3 of the PR are covered by patch https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01663.html (skip building libgomp's HSA/GCN plugin with -mx32). For AMDGCN, the LLVM assembler is used. While for LLVM 7+8,

Re: [PR93488] [OpenACC] ICE in type-cast 'async', 'wait' clauses

2020-01-29 Thread Andrew Stubbs
On 29/01/2020 12:30, Thomas Schwinge wrote: Hi Andrew! On 2019-11-22T11:06:14+, Andrew Stubbs wrote: This test case causes an ICE (reformatted for email): void test(int k) { unsigned int x = 1; #pragma acc parallel loop async(x) for (int i = 0; i < k

Re: [PATCH] Add OpenACC acc_get_property support for AMD GCN

2020-01-29 Thread Andrew Stubbs
On 29/01/2020 09:52, Harwath, Frederik wrote: @@ -1513,6 +1518,23 @@ init_hsa_context (void) GOMP_PLUGIN_error ("Failed to list all HSA runtime agents"); } + uint16_t minor, major; + status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR, ); + if (status !=

[PATCH, ivopts] Fix fast-math-pr55281.c ICE

2020-01-28 Thread Andrew Stubbs
why I only see this issue on amdgcn, but it might be because the pointer in question is in a MASK_LOAD which is perhaps not that commonly used? I've tested this on amdgcn, and done a full bootstrap and test on x86_64 also. OK to commit? Thanks Andrew Fix fast-math-pr55281.c ICE. 2020-01-

Re: [committed, amdgcn] Fix ICE on unsupported FP comparison

2020-01-28 Thread Andrew Stubbs
On 24/01/2020 14:58, Andrew Stubbs wrote: I've committed this patch to fix an ICE building the gcc.dg/vect/fast-math-pr55281.c testcase. Oops, I got that crossed. This was the fix for gcc.dg/pr50310-2.c. The fast-math-pr55281.c fix will be posted shortly. The problem was that the combine

Re: [PATCH] Add OpenACC acc_get_property support for AMD GCN

2020-01-28 Thread Andrew Stubbs
On 28/01/2020 14:55, Harwath, Frederik wrote: Hi, this patch adds full support for the OpenACC 2.6 acc_get_property and acc_get_property_string functions to the libgomp GCN plugin. This replaces the existing stub in libgomp/plugin-gcn.c. Andrew: The value returned for acc_property_memory ("size

Re: [Patch] [libgomp, build] Skip plugin-{gcn,hsa} for (-m)x32 (PR bootstrap/93409)

2020-01-27 Thread Andrew Stubbs
On 24/01/2020 14:59, Tobias Burnus wrote: As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails with a bunch of error messages when building with --with-multilib-list=m32,m64,mx32 The reason is that the GCN plugin assumes 64bit pointers. As with HSA, the build is only

[committed, amdgcn] Fix ICE on unsupported FP comparison

2020-01-24 Thread Andrew Stubbs
d have been rejected, but the predicates were too loose. Andrew Fix ICE on unsupported FP comparison 2020-01-24 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (vec_cmpdi): Use gcn_fp_compare_operator. (vec_cmpudi): Use gcn_compare_operator. (vec_cmpv64qidi): Use gcn_compare_operator. (

[committed, libgomp,amdgcn] Fix plugin-gcn.c bug

2020-01-23 Thread Andrew Stubbs
it could read attempt to read any unhandled argument as the thread limit. Andrew Fix libgomp plugin-gcn bug 2020-01-23 Andrew Stubbs libgomp/ * plugin/plugin-gcn.c (parse_target_attributes): Use correct mask for the device id. diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin

Re: [PATCH][amdgcn] Add runtime ISA check for amdgcn offloading

2020-01-20 Thread Andrew Stubbs
On 20/01/2020 16:42, Harwath, Frederik wrote: Hi Andrew, Thanks for the review! I have attached a revised patch containing the changes that you suggested. On 20.01.20 11:00, Andrew Stubbs wrote: On 20/01/2020 06:57, Harwath, Frederik wrote: Is it ok to commit this patch to the master branch

[committed, amdgcn] Update OpenACC testcases for amdgcn

2020-01-20 Thread Andrew Stubbs
-20 Andrew Stubbs libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Skip test on gcn. * testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (main): Adjust test dimensions for amdgcn. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c (main): Adjust gang/worker/vector

Re: [PATCH][amdgcn] Add runtime ISA check for amdgcn offloading

2020-01-20 Thread Andrew Stubbs
On 20/01/2020 11:07, Jakub Jelinek wrote: On Mon, Jan 20, 2020 at 11:00:58AM +, Andrew Stubbs wrote: Indeed, fat binaries would be a good solution. Presumably it's possible, but I'm not sure how we'd go about getting the offload mechanism to launch the backend multiple times? Having got

Re: [PATCH][amdgcn] Add runtime ISA check for amdgcn offloading

2020-01-20 Thread Andrew Stubbs
On 20/01/2020 10:42, Jakub Jelinek wrote: :( Another option would be to build offloading code by GCN multiple times, once for each incompatible ISA the user is asking for, so that one can have then binaries that will work on different hw. Because e.g. with the distro vendor hat, it is hard to

Re: [PATCH][amdgcn] Add runtime ISA check for amdgcn offloading

2020-01-20 Thread Andrew Stubbs
On 20/01/2020 10:08, Jakub Jelinek wrote: On Mon, Jan 20, 2020 at 10:00:09AM +, Andrew Stubbs wrote: @@ -396,6 +396,88 @@ struct gcn_image_desc struct global_var_info *global_variables; }; +/* This enum mirrors the corresponding LLVM enum's values for all ISAs that we + support

Re: [PATCH][amdgcn] Add runtime ISA check for amdgcn offloading

2020-01-20 Thread Andrew Stubbs
Hi Frederik, On 20/01/2020 06:57, Harwath, Frederik wrote: Hi, this patch implements a runtime ISA check for amdgcn offloading. The check verifies that the ISA of the GPU to which we try to offload matches the ISA for which the code to be offloaded has been compiled. If it detects a mismatch,

[committed, amdgcn/openacc] Rename acc_device_gcn to acc_device_radeon

2020-01-17 Thread Andrew Stubbs
code will use, if anything, so we ought to be compatible. There's no official release using the "wrong" name, so I don't believe we need to retain that name for any reason. I've tested that there are no regressions. Andrew Rename acc_device_gcn to acc_device_radeon 2020-01-17 And

Re: [committed, amdgcn] Allow constants in vector extends and truncates

2020-01-16 Thread Andrew Stubbs
On 19/12/2019 17:39, Richard Sandiford wrote: Andrew Stubbs writes: This patch changes the operand predicates such that vector constants are permitted during compilation. This prevents ICEs caused by the compiler trying to emit such instructions without checking. That sounds like a target

Re: [patch, openacc] Fix ICE verifying gimple

2020-01-16 Thread Andrew Stubbs
Ping. On 22/11/2019 11:06, Andrew Stubbs wrote: This test case causes an ICE (reformatted for email):   void test(int k)   {     unsigned int x = 1;   #pragma acc parallel loop async(x)     for (int i = 0; i < k; i++) { }   }   t.c: In function 'test':   t.c:4:9: error: inva

Re: [PATCH] [amdgcn] Remove dependency on stdint.h in libgcc

2020-01-10 Thread Andrew Stubbs
On 10/01/2020 14:21, Kwok Cheung Yeung wrote: The patch for sub-word atomics support added an include of stdint.h for the definition of uintptr_h, but this can result in GCC compilation failing if the stdint.h header has not been installed (from newlib in the case of AMD GCN). I have fixed

Re: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-09 Thread Andrew Stubbs
On 08/01/2020 18:18, Kwok Cheung Yeung wrote: Is this version okay for trunk? OK, thanks. Andrew

Re: [PATCH] [amdgcn] Add support for sub-word sync_compare_and_swap operations

2020-01-08 Thread Andrew Stubbs
On 08/01/2020 11:07, Kwok Cheung Yeung wrote: +#define __sync_subword_compare_and_swap(type, size)    \ Macro parameters are conventionally upper case. +    \ +type    \ +__sync_val_compare_and_swap_##size

[committed, amdgcn] Add more modes for vector comparisons

2020-01-07 Thread Andrew Stubbs
d 3 loops" 1 FAIL: gcc.dg/vect/vect-cond-reduc-4.c scan-tree-dump-times vect "LOOP VECTORIZED" 2 FAIL: gcc.dg/vect/vect-cselim-1.c scan-tree-dump-times vect "vectorized 2 loops" 1 FAIL: gcc.dg/vect/vect-version-1.c scan-tree-dump vect "applying loop versioning to oute

[committed, amdgcn] Disallow 'B' constraints on addc/subb

2020-01-07 Thread Andrew Stubbs
' constraints on amdgcn addc/subb 2020-01-07 Andrew Stubbs gcc/ * config/gcn/constraints.md (DA): Update description and match. (DB): Likewise. (Db): New constraint. * config/gcn/gcn-protos.h (gcn_inline_constant64_p): Add second parameter. * config/gcn/gcn.c (gcn_inline_constant64_p

[committed, amdgcn] Fix issue with '0' constraints

2020-01-06 Thread Andrew Stubbs
bad code?) Adding an alternatives for each permutation fixes the problem. This has already been done for many other patterns. Andrew Fix amdgcn issue with '0' constraints 2020-01-06 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (subv64di3): Use separate alternatives for '0' matching inputs

[committed, amdgcn] Fix early-clobber in vec_extract

2020-01-06 Thread Andrew Stubbs
register pairs. Other patterns use '0' to allow exact matches, but the input and outputs here are different size, and I'm not sure what happens there. Anyway, this is safe. Andrew Fix early-clobber in amdgcn vec_extract 2020-01-06 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (vec_extract

[committed, amdgcn] Fix inline immediate range

2020-01-06 Thread Andrew Stubbs
Inline immediates for AMD GCN instructions are supposed to be in the range -16..64 inclusive, but the implementation had the upper bound exclusive. This patch fixes the error. Andrew Fix amdgcn inline immediate range 2020-01-06 Andrew Stubbs gcc/ * config/gcn/gcn.c

[committed, amdgcn] Allow constants in vector extends and truncates

2019-12-19 Thread Andrew Stubbs
constants in amdgcn extends and truncates 2019-12-19 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (2): Change input predcate to gcn_alu_operand. (extend2): Likewise. (truncv64di2): Likewise. (truncv64di2_exec): Likewise. (v64di2): Likewise. (v64di2_exec): Likewise. diff --git a/gcc

[committed, amdgcn] Use V64SI for all remaining add-with-carry insns

2019-12-19 Thread Andrew Stubbs
are not interesting for those modes (being mostly used to implement DImode splitters), so we can dispense with the notional iterator. Andrew Use V64SI for all amdgcn add-with-carry insns 2019-12-19 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (*plus_carry_dpp_shr_): Rename

[committed, amdgcn] Add sub-dword add/sub patterns

2019-12-19 Thread Andrew Stubbs
elsewhere. This results in 80 new test passes. There are a few regressions from vectorization tests that took a different code path and encountered another missing instruction. Andrew Implement sub-dword add/sub on amdgcn 2019-12-19 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (addv64si3

[committed, amdgcn] Fix vect/pr65947-8.c testcase for amdgcn

2019-12-18 Thread Andrew Stubbs
expect that it will not. I fixed it by special-casing GCN. There's might be a more general way, but apparently this does happen for other architectures (?) Andrew Fix vect/pr65947-8.c testcase for amdgcn. 2019-12-18 Andrew Stubbs gcc/testsuite/ * gcc.dg/vect/pr65947-8.c: Change pass

[committed, pr92772] Mention bug in comment

2019-12-17 Thread Andrew Stubbs
hopefully the pointer will save future readers some confusion. Andrew Add pointer to PR92772 2019-12-17 Andrew Stubbs * tree-vect-loop.c (vect_create_epilog_for_reduction): Mention pr92772 in the comments. diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 353a5ff06e1..68699f2d

[committed, amdgcn] Implement extract_last and fold_extract_last

2019-12-17 Thread Andrew Stubbs
n vect.exp to name them all individually, but includes vect-cond_reduc-* and pr65947-10.c. Andrew Stubbs Mentor Graphics / CodeSourcery Add extract_last for amdgcn 2019-12-17 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (extract_last_): New expander. (fold_extract_last_): New expander. gcc

[committed, amdgcn] Implement clz and ctz

2019-12-17 Thread Andrew Stubbs
This patch implements the count leading and trailing zeros instruction patterns in the AMD GCN backend. This is prerequisite for implementing the extract_last patterns. Andrew Stubbs Mentor Graphics / CodeSourcery Add clz and ctz for amdgcn 2019-12-17 Andrew Stubbs gcc/ * config/gcn

Re: [PATCH] Add OpenACC 2.6 `acc_get_property' support

2019-12-17 Thread Andrew Stubbs
On 16/12/2019 23:00, Thomas Schwinge wrote: There is no AMD GCN support yet. This will be added later on. ACK, just to note that there now is a 'libgomp/plugin/plugin-gcn.c' that at least needs to get a stub implementation (can mostly copy from 'libgomp/plugin/plugin-hsa.c'?) as otherwise the

Re: [patch, openacc] Adjust tests for amdgcn offloading

2019-12-13 Thread Andrew Stubbs
On 19/11/2019 12:21, Andrew Stubbs wrote: This patch adds GCN special casing for most of the OpenACC libgomp tests that require it. It also disables one testcase that explicitly uses CUDA. The patches aren't all that controversial, should only change the results on amdgcn, and Tobias already

[committed, amdgcn] Add sub-dword vector multiply

2019-12-13 Thread Andrew Stubbs
I've committed this patch to add v64qi and v64hi multiply patterns. This is slowly working toward full char and short vectorization. Andrew Sub-dword vector multiply for amdgcn 2019-12-13 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (mulv64si3): Rename to ... (mul3

[committed, amdgcn] Add sub-dword vector extend and truncate insns

2019-12-13 Thread Andrew Stubbs
n 2019-12-13 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (sdwa): New mode attribute. (VCVT_FROM_MODE): Rename to ... (VCVT_MODE): ... this. (VCVT_TO_MODE): Delete mode iterator. (VCVT_FMODE): New mode iterator. (VCVT_IMODE): Likewise. (2): Change ... (2): ... to this. (2): New. (ze

Re: [RFC, vectorizer] Fix ICE with masked vectors

2019-12-10 Thread Andrew Stubbs
On 09/12/2019 15:59, Richard Sandiford wrote: No, the assumption's correct even there. The assert usually triggers because something elsewhere is getting confused about the vector types. The attached patch fixes the ICE in the testcase, but I suspect does not go far enough. Can it happen that

[RFC, vectorizer] Fix ICE with masked vectors

2019-12-09 Thread Andrew Stubbs
Hi, This patch fixes an ICE in testcase gcc.dg/vect/vect-ctor-1.c: during GIMPLE pass: vect dump file: vect-ctor-1.c.159t.vect .../gcc.dg/vect/vect-ctor-1.c: In function 'intrapred_luma_16x16': .../gcc.dg/vect/vect-ctor-1.c:9:6: internal compiler error: in exact_div, at poly-int.h:2162

Re: [committed, amdgcn] Fix unrecognised instruction

2019-12-09 Thread Andrew Stubbs
On 06/12/2019 17:57, Andrew Stubbs wrote: Hi all, I've committed the attached to fix a failure-to-assemble bug that can occur in some vectorized code.  This has been hidden for a long time because sub-word vectors were disabled on GCN, but this is no longer the case. The gather load

Re: [committed, amdgcn] Enable QI/HImode vector moves

2019-12-09 Thread Andrew Stubbs
Oops, please consider this patch as submitted from my @codesourcery.com address, for copyright assignment purposes. Andrew On 06/12/2019 17:31, Andrew Stubbs wrote: Hi all, This patch re-enables the V64QImode and V64HImode for GCN. GCC does not make these easy to work with because

Re: [committed, amdgcn] Enable QI/HImode vector moves

2019-12-09 Thread Andrew Stubbs
On 06/12/2019 18:21, Richard Sandiford wrote: Andrew Stubbs writes: Hi all, This patch re-enables the V64QImode and V64HImode for GCN. GCC does not make these easy to work with because there is (was?) an assumption that vector registers do not have excess bits in vector registers

[committed, amdgcn] Fix unrecognised instruction

2019-12-06 Thread Andrew Stubbs
didn't assemble well. E.g. it had 'flat_load_short', instead of 'flat_load_ustore'. This fixes about 39 tests in vect.exp. -- Andrew Stubbs Mentor Graphics / CodeSourcery Fix unrecognised GCN instruction. 2019-12-06 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md (gather_insn_1offset): Use %o

[committed, amdgcn] Enable QI/HImode vector moves

2019-12-06 Thread Andrew Stubbs
new passes in the vect.exp (there's also 41 new fails, but those are exposed bugs I'll fix shortly). Some of these were internal compiler errors that did not exist in older compilers. -- Andrew Stubbs Mentor Graphics / CodeSourcery Enable QI/HImode vector moves 2019-12-06 Andrew Stubbs gcc/ * conf

Re: [RFC] Characters per line: from punch card (80) to line printer (132)

2019-12-06 Thread Andrew Stubbs
On 05/12/2019 18:21, Robin Curtis wrote: My IBM Selectric golfball electronic printer only does 90 characters on A4 in portrait mode………(at 10 cps) (as for my all electric TELEX Teleprinter machine !) Is this debate for real ?! - or is this a Christmas spoof ? I can't speak for the debate,

Re: [RFC] Characters per line: from punch card (80) to line printer (132)

2019-12-05 Thread Andrew Stubbs
On 05/12/2019 16:17, Joseph Myers wrote: Longer lines mean less space for multiple terminal / editor windows side-by-side to look at different pieces of code. I don't think that's an improvement. Here's a data-point My 1920 pixel-wide screen, in the default font, allows 239 columns; not

Re: [PATCH][AMDGCN] Skip test gcc/testsuite/gcc.dg/asm-4.c

2019-12-05 Thread Andrew Stubbs
On 05/12/2019 07:05, Harwath, Frederik wrote: Hi, the inline assembly "p" modifier ("An operand that is a valid memory address is allowed", cf. https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints) is not supported on AMD GCN. This causes an ICE during the compilation

[amdgcn] Add missing vcondu patterns

2019-12-03 Thread Andrew Stubbs
now compiles, although not quite correctly, but that's another issue (pr92772). Andrew Add missing amdgcn vcondu patterns 2019-12-03 Andrew Stubbs gcc/ * config/gcn/gcn-valu.md: Change "vcondu" patterns to use VEC_1REG_MODE for the data mode. diff --git a/gcc/config/gcn/gcn-val

Re: [PATCH 4/7 libgomp,amdgcn] GCN libgomp port

2019-12-03 Thread Andrew Stubbs
On 02/12/2019 14:43, Thomas Schwinge wrote: Hi! On 2019-11-12T13:29:13+, Andrew Stubbs wrote: --- a/include/gomp-constants.h +++ b/include/gomp-constants.h @@ -174,6 +174,7 @@ enum gomp_map_kind #define GOMP_DEVICE_NVIDIA_PTX5 #define GOMP_DEVICE_INTEL_MIC 6

Re: [patch, libgomp] Enable OpenACC GCN testing

2019-12-03 Thread Andrew Stubbs
rew Enable OpenACC GCN testing. 2019-12-03 Andrew Stubbs libgomp/ * testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type): Recognize amdgcn. (check_effective_target_openacc_amdgcn_accel_present): New proc. (check_effective_target_openacc_amdgcn_accel_selected): New proc. *

Re: Host/device shared memory

2019-12-02 Thread Andrew Stubbs
On 02/12/2019 14:23, Thomas Schwinge wrote: Hi! On 2019-11-15T13:43:04+0100, Jakub Jelinek wrote: On Fri, Nov 15, 2019 at 12:38:06PM +, Andrew Stubbs wrote: On 15/11/2019 12:21, Jakub Jelinek wrote: I'm surprised by the set acc_mem_shared 0, I thought gcn is a shared memory offloading

Re: [Patch] config/gcn/mkoffload.c – remove unused static vars

2019-11-25 Thread Andrew Stubbs
On 25/11/2019 14:17, Tobias Burnus wrote: The compiler warns that funcs_tail and vars_tails are unused – they, funcs_ids/var_ids and struct id_map seem to be a copy-n-paste leftovers from gcc/config/nvptx/mkoffload.c. Additionally, COMMENT_PREFIX does not seem to be used anywhere. (In the

Re: [Patch][amdgcn] Silence warnings + add gcc_unreachable()

2019-11-25 Thread Andrew Stubbs
On 25/11/2019 11:14, Tobias Burnus wrote: This patch adds "gcc_unreachable ();" as suggested by me (cf. below). It also silences the -Wunused-variable + 'no return statement' warnings. OK for the trunk? OK. Thanks, Tobias. Andrew

[committed, amdgcn] Limit LDS usage

2019-11-22 Thread Andrew Stubbs
unchanged for non-offload compiles (this is only really used for running the testsuite). -- Andrew Stubbs CodeSourcery / Mentor Graphics Limit LDS usage. 2019-11-22 Andrew Stubbs gcc/ * config/gcn/gcn.c (OMP_LDS_SIZE): Define. (ACC_LDS_SIZE): Define. (OTHER_LDS_SIZE): Define. (LDS_SIZE

[committed, amdgcn] Use GFX9 granulated sgprs count correctly

2019-11-22 Thread Andrew Stubbs
I've committed the attached. The patch adjusts the GCN kernel metadata so that it is correct for GFX9 devices. The existing implementation was correct for GFX8, and seems to work on GFX9, but wasn't technically correct. -- Andrew Stubbs CodeSourcery / Mentor Graphics Use GFX9 granulated

[patch, openacc] Fix ICE verifying gimple

2019-11-22 Thread Andrew Stubbs
ed patch assigns the "(int) x" to a temporary and passes that to the function instead. OK to commit? -- Andrew Stubbs CodeSourcery / Mentor Graphics Normalize GOACC_parallel_keyed async parameter. 2019-11-22 Andrew Stubbs gcc/ * omp-expand.c (expand_omp_target): Pass sync parameter t

[committed] Update loop-1.c test for amdgcn

2019-11-19 Thread Andrew Stubbs
. The code is still correct for the purpose of the testcase either way, however, so I'm removing the over-fussy match. Andrew Update loop-1.c test for amdgcn 2019-11-19 Andrew Stubbs gcc/testsuite/ * gcc.dg/tree-ssa/loop-1.c: Change amdgcn assembler scan. diff --git a/gcc/testsuite/gcc.dg

[patch, openacc] Adjust tests for amdgcn offloading

2019-11-19 Thread Andrew Stubbs
This patch adds GCN special casing for most of the OpenACC libgomp tests that require it. It also disables one testcase that explicitly uses CUDA. OK to commit? Andrew Update OpenACC tests for amdgcn 2019-11-19 Andrew Stubbs libgomp/ * testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1

Re: [PATCH 13/13] Enable worker partitioning for AMD GCN

2019-11-18 Thread Andrew Stubbs
On 15/11/2019 21:44, Julian Brown wrote: This patch flips the switch to enable worker partitioning on AMD GCN. OK? This is OK, although I think we could just remove that flag now. Andrew

Re: [PATCH 11/13] AMD GCN symbol output with null cfun

2019-11-18 Thread Andrew Stubbs
On 15/11/2019 21:44, Julian Brown wrote: This patch checks that cfun is valid in the gcn_asm_output_symbol_ref function. This prevents a crash when that function is called with NULL cfun, i.e. when outputting debug symbols. OK? OK, although that FIXME still baffles me. Andrew

Re: [PATCH 09/13] AMD GCN libgomp plugin queue-full condition locking fix

2019-11-18 Thread Andrew Stubbs
On 15/11/2019 21:44, Julian Brown wrote: @@ -2732,13 +2732,9 @@ wait_for_queue_nonfull (struct goacc_asyncqueue *aq) { if (aq->queue_n == ASYNC_QUEUE_SIZE) { - pthread_mutex_lock (>mutex); - /* Queue is full. Wait for it to not be full. */ while (aq->queue_n ==

Re: [PATCH 08/13] Fix host-to-device copies from rodata for AMD GCN

2019-11-18 Thread Andrew Stubbs
On 15/11/2019 21:44, Julian Brown wrote: +static void +hsa_memory_copy_wrapper (void *dst, const void *src, size_t len) +{ + hsa_status_t status = hsa_fns.hsa_memory_copy_fn (dst, src, len); + + if (status == HSA_STATUS_SUCCESS) +return; + + /* It appears that the copy fails if the source

Re: [PATCH 4/5] [amdgcn] Update lower limits requested by non-leaf kernels

2019-11-15 Thread Andrew Stubbs
On 15/11/2019 15:51, Kwok Cheung Yeung wrote: On 15/11/2019 11:32 am, Andrew Stubbs wrote: On 14/11/2019 15:33, Kwok Cheung Yeung wrote: The kernel attributes are changed to request at least 64 SGPRs and 24 VGPRs (i.e. the non-kernel maximum, otherwise the callees may not have enough

Re: [patch, libgomp] Enable OpenACC GCN testing

2019-11-15 Thread Andrew Stubbs
On 15/11/2019 12:43, Jakub Jelinek wrote: APUs, such as Carizzo are shared memory. DGPUs, such as Fiji and Vega, have their own memory. A DGPU can access host memory, provided that it has been set up just so, but that is very slow, and I don't know of a way to do that without still having to

Re: [patch, libgomp] Enable OpenACC GCN testing

2019-11-15 Thread Andrew Stubbs
On 15/11/2019 12:21, Jakub Jelinek wrote: On Thu, Nov 14, 2019 at 04:36:38PM +, Andrew Stubbs wrote: This patch adds some necessary bits to enable OpenACC testings for amdgcn offloading. The two "check_effective" procedures are not actually needed yet, but later patches to

Re: [PATCH 5/5] [amdgcn] Unfix frame pointer

2019-11-15 Thread Andrew Stubbs
On 14/11/2019 15:34, Kwok Cheung Yeung wrote: This patch unfixes the registers for the hard frame pointer so that they can be used for other purposes if the frame pointer is not in use. This patch is dependent on the commit 'Support using multiple registers to hold the frame pointer'

Re: [PATCH 4/5] [amdgcn] Update lower limits requested by non-leaf kernels

2019-11-15 Thread Andrew Stubbs
On 14/11/2019 15:33, Kwok Cheung Yeung wrote: The kernel attributes are changed to request at least 64 SGPRs and 24 VGPRs (i.e. the non-kernel maximum, otherwise the callees may not have enough registers to run in) for non-leaf kernels to take advantage of the reduced number of registers used

Re: [PATCH 3/5] [amdgcn] Restrict register usage in non-kernel functions

2019-11-15 Thread Andrew Stubbs
On 14/11/2019 15:32, Kwok Cheung Yeung wrote: This patch restricts non-kernel functions to using a maximum of 64 SGPRs and 24 VGPRs. Kernels can request various pieces of information from the HSA runtime, and these will be loaded into the registers consecutively before the kernel executes.

Re: [PATCH 2/5] [amdgcn] Reinitialize registers for every function

2019-11-15 Thread Andrew Stubbs
On 14/11/2019 15:30, Kwok Cheung Yeung wrote: The set of fixed registers is adjusted by the TARGET_CONDITIONAL_REGISTER_USAGE hook, but this needs to be done on a per-function basis, whereas the hook is normally called once during GCC initialization before any functions have been processed

Re: [PATCH 1/5] [amdgcn] Use first lane of v1 for zero constant

2019-11-15 Thread Andrew Stubbs
On 14/11/2019 15:30, Kwok Cheung Yeung wrote: GCN 5 has commonly-used global memory instructions that specify the address as [SGPR address] + [VGPR offset] + [constant offset], and we often want the VGPR offset to be zero, so v0 is currently reserved for that purpose. However, v1 contains

Re: [patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Andrew Stubbs
On 14/11/2019 17:05, Jakub Jelinek wrote: On Thu, Nov 14, 2019 at 04:47:49PM +, Andrew Stubbs wrote: This patch adds new libgomp tests to ensure that C "printf" and Fortran "write" work correctly within offload kernels. Both should work for amdgcn, but nvptx uses the

[patch, libgomp] Add tests for print from offload target

2019-11-14 Thread Andrew Stubbs
from offload kernels is not recommended in production, but can be useful in development. OK to commit? Thanks Andrew Add tests for print from offload target. 2019-11-14 Andrew Stubbs libgomp/ * testsuite/libgomp.c/target-print-1.c: New file. * testsuite/libgomp.fortran/target-print-1.f9

[patch, libgomp] Enable OpenACC GCN testing

2019-11-14 Thread Andrew Stubbs
Hi, This patch adds some necessary bits to enable OpenACC testings for amdgcn offloading. The two "check_effective" procedures are not actually needed yet, but later patches to test cases will use them. OK to commit? Thanks Andrew Enable OpenACC GCN testing. 2019-11-14 And

Re: [PATCH] [GCN] Fix handling of VCC_CONDITIONAL_REG

2019-11-14 Thread Andrew Stubbs
On 14/11/2019 12:43, Kwok Cheung Yeung wrote: Hello This patch fixes an issue seen in the following test cases on AMD GCN: libgomp.oacc-fortran/gemm.f90 libgomp.oacc-fortran/gemm-2.f90 libgomp.c/for-5-test_ttdpfs_ds128_auto.c libgomp.c/for-5-test_ttdpfs_ds128_guided32.c

Re: [PATCH 0/7 libgomp,amdgcn] AMD GCN Offloading Support

2019-11-13 Thread Andrew Stubbs
These patches are now all committed. I've adjusted the changelogs to list all the proper authors (apologies if I missed anyone). Thank you for the quick reviews, Jakub. :-) Andrew On 12/11/2019 13:29, Andrew Stubbs wrote: Hi all, This patch series contributes initial OpenMP and OpenACC

[committed, amdgcn] Move gcn-run heap into GPU memory

2019-11-13 Thread Andrew Stubbs
kernels will experience, and therefore make standalone testing more meaningful. Andrew Move gcn-run heap into GPU memory. 2019-11-13 Andrew Stubbs gcc/ * config/gcn/gcn-run.c (heap_region): New global variable. (struct hsa_runtime_fn_info): Add hsa_memory_assign_agent_fn

Re: [PATCH 5/7 libgomp,amdgcn] Optimize GCN OpenMP malloc performance

2019-11-12 Thread Andrew Stubbs
ine, but I don't think it was doing so anyway. I need to look at that, but how is this, for now? Andrew Optimize GCN OpenMP malloc performance 2019-11-12 Andrew Stubbs libgomp/ * config/gcn/team.c (gomp_gcn_enter_kernel): Set up the team arena and use team_malloc varia

Re: [PATCH 7/7 libgomp,amdgcn] GCN Libgomp Plugin

2019-11-12 Thread Andrew Stubbs
On 12/11/2019 14:01, Jakub Jelinek wrote: On Tue, Nov 12, 2019 at 01:29:16PM +, Andrew Stubbs wrote: 2019-11-12 Andrew Stubbs libgomp/ * plugin/Makefrag.am: Add amdgcn plugin support. * plugin/configfrag.ac: Likewise. * plugin/plugin-gcn.c: New file

Re: [PATCH 4/7 libgomp,amdgcn] GCN libgomp port

2019-11-12 Thread Andrew Stubbs
On 12/11/2019 13:46, Jakub Jelinek wrote: On Tue, Nov 12, 2019 at 01:29:13PM +, Andrew Stubbs wrote: 2019-11-12 Andrew Stubbs include/ * gomp-constants.h (GOMP_DEVICE_GCN): Define. (GOMP_VERSION_GCN): Define. Perhaps this could be 0, but not a big deal. OG9

[PATCH 6/7 amdgcn] Use a single worker for OpenACC on AMD GCN

2019-11-12 Thread Andrew Stubbs
This patch prevents the compiler using multiple workers in a gang. This should be reverted when worker support is committed. I will commit this with the reset of the series. Andrew 2019-11-12 Andrew Stubbs Julian Brown gcc/ * config/gcn/gcn.c

[PATCH 5/7 libgomp,amdgcn] Optimize GCN OpenMP malloc performance

2019-11-12 Thread Andrew Stubbs
l search and replace. Dummy pass-through definitions are provided for other targets. OK to commit? Thanks Andrew 2019-11-12 Andrew Stubbs libgomp/ * config/gcn/team.c (gomp_gcn_enter_kernel): Set up the team arena and use team_malloc variants. (gomp_gcn_exit_ke

[PATCH 7/7 libgomp,amdgcn] GCN Libgomp Plugin

2019-11-12 Thread Andrew Stubbs
This patch contributes the GCN libgomp plugin, with the various configure and make bits to go with it. This implementation is a much-cleaned-up version of the one present on the openacc-gcc-9-branch. OK to commit? Thanks Andrew 2019-11-12 Andrew Stubbs libgomp/ * plugin

[PATCH 4/7 libgomp,amdgcn] GCN libgomp port

2019-11-12 Thread Andrew Stubbs
new target-specific symbols to be added to libgomp. I couldn't find an existing way to do this without adding a new top-level file also, to there's an empty placeholder also. (The OG9 branch has this symbol in libgcc, but that seems wrong.) OK to commit? Thanks Andrew 2019-11-12 Andrew Stu

[PATCH 3/7 libgomp,nvptx] Add device number to GOMP_OFFLOAD_openacc_async_construct

2019-11-12 Thread Andrew Stubbs
the queue is intended, so this simply provides that information to the queue constructor. OK to commit? Thanks Andrew 2019-11-12 Andrew Stubbs libgomp/ * libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_construct): Add int parameter. * oacc-async.c

[PATCH 2/7 amdgcn] GCN mkoffload

2019-11-12 Thread Andrew Stubbs
This patch adds the mkoffload tool to the amdgcn backend. It's similar, but not quite the same as that on the openacc-gcc-9-branch. I will commit this patch when the others in this series are approved. Andrew 2019-11-12 Andrew Stubbs gcc/ * config/gcn/mkoffload.c: New file

[PATCH 1/7 libgomp,nvptx] Move generic libgomp files from nvptx to accel

2019-11-12 Thread Andrew Stubbs
h the GCN port, thus preventing much of the duplication. OK to commit? Thanks Andrew 2019-11-12 Andrew Stubbs libgomp/ * configure.tgt (nvptx*-*-*): Add "accel" directory. * config/nvptx/libgomp-plugin.c: Move ... * config/accel/libgomp-plugin.c: ... to here.

[PATCH 0/7 libgomp,amdgcn] AMD GCN Offloading Support

2019-11-12 Thread Andrew Stubbs
. Otherwise that will have to wait for GCC 11. Andrew Andrew Stubbs (7): Move generic libgomp files from nvptx to accel GCN mkoffload Add device number to GOMP_OFFLOAD_openacc_async_construct GCN libgomp port Optimize GCN OpenMP malloc performance Use a single worker for OpenACC on AMD GCN

Re: [Patch][AMD GCN][OpenMP] Add gcc/config/gcn/t-omp-device for OpenMP declare variant kind/arch/isa

2019-11-04 Thread Andrew Stubbs
On 04/11/2019 15:37, Jakub Jelinek wrote: My preference would be that arch on amdgcn is something like amdgcn or gcn. I hope the general distinction between arch and isa will be something that will be discussed next Tuesday on the language committee, so hopefully we'll know more afterwards and

[OG9, amdgcn, committed] Fix memory leak in libgomp

2019-09-10 Thread Andrew Stubbs
Committed to OG9 on behalf of Kwok ... The list of struct gomp_threads allocated in gomp_gcn_enter_kernel was not being freed in gomp_gcn_exit_kernel, leading to a small memory leak every time a kernel is run. Runs with a lot of teams or many kernels were running out of heap space. Andrew

[OG9, amdgcn, committed] Detect the actual number of hardware CUs

2019-09-10 Thread Andrew Stubbs
(ROCr), but there are license issues with that. We could extract them from the documentation, but this is still on my TODO list. Andrew Detect number of GPU compute units. 2019-09-10 Andrew Stubbs libgomp/ * plugin/plugin-gcn.c (HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT): Define

[OG9, amdgcn, committed] Use GFX9 granulated sgprs count correctly

2019-09-10 Thread Andrew Stubbs
would hurt performance. Andrew Use GFX9 granulated sgprs count correctly. 2019-09-10 Andrew Stubbs gcc/ * config/gcn/gcn.c (gcn_hsa_declare_function_name): Calculate granulated_sgprs according to architecture. diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c index 66854b6f9c5..f8434e4a

<    1   2   3   4   5   6   7   8   9   10   >