an.dg/assumed_rank_1.f90.
2020-01-30 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (gather_exec): Move contents ...
(mask_gather_load): ... here, and zero-initialize the
destination.
(maskloaddi): Zero-initialize the destination.
* config/gcn/gcn.c:
diff --git a/gcc/config/gcn/gcn-valu.md
. This should then be backported to all GCC release branches; I
can easily test the backports for you, if you're not already set up to do
such testing.
How's this?
Andrew
Normalize GOACC_parallel_keyed async and wait parameters
2020-01-30 Andrew Stubbs
Thomas Schwinge
PR middle-end/9
in testcase
gcc.dg/pr81228.c.
Andrew
Add LTGT operator support for amdgcn
Fixes ICE in testcase gcc.dg/pr81228.c
2020-01-30 Andrew Stubbs
gcc/
* config/gcn/gcn.c (print_operand): Handle LTGT.
* config/gcn/predicates.md (gcn_fp_compare_operator): Allow ltgt.
diff --git a/gcc/config/gcn/gcn.c b
On 30/01/2020 16:08, Thomas Schwinge wrote:
Hi!
Andrew and Frederik, thanks for your emails reminding/educating me about
'snprintf' as well as this HSA fixed-size buffer API. There doesn't
happen to be something available in the HSA API so that we
could use 'sizeof [something]'
On 30/01/2020 13:49, Richard Biener wrote:
On Thu, Jan 30, 2020 at 2:04 PM Bin.Cheng wrote:
On Thu, Jan 30, 2020 at 8:53 PM Andrew Stubbs wrote:
On 29/01/2020 08:24, Richard Biener wrote:
On Tue, Jan 28, 2020 at 5:53 PM Andrew Stubbs wrote:
This patch fixes an ICE compiling fast-math
On 29/01/2020 08:24, Richard Biener wrote:
On Tue, Jan 28, 2020 at 5:53 PM Andrew Stubbs wrote:
This patch fixes an ICE compiling fast-math-pr55281.c for amdgcn.
The problem is that an "iv" is created in which both base and step are
pointer types,
How did you get a POINTER
On 30/01/2020 09:20, Jakub Jelinek wrote:
On Fri, Jan 24, 2020 at 03:59:28PM +0100, Tobias Burnus wrote:
As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails with
a bunch of error messages when building with
--with-multilib-list=m32,m64,mx32
The reason is that the GCN plugin
On 29/01/2020 17:44, Thomas Schwinge wrote:
@@ -1513,6 +1518,23 @@ init_hsa_context (void)
+ size_t len = sizeof hsa_context.driver_version_s;
+ int printed = snprintf (hsa_context.driver_version_s, len,
+ "HSA Runtime %hu.%hu", (unsigned short int)major,
+
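The fragment above uses the return value of 'snprintf' to learn whether the version string fitted into the fixed-size buffer. A minimal sketch of that idiom follows; the helper name `format_version` is hypothetical and is not taken from plugin-gcn.c, and only standard C library calls are used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from plugin-gcn.c): format a version string
   into a fixed-size buffer and report whether it fitted.  */
static bool
format_version (char *buf, size_t len, unsigned short major,
                unsigned short minor)
{
  assert (buf != NULL && len > 0);
  int printed = snprintf (buf, len, "HSA Runtime %hu.%hu", major, minor);
  /* snprintf returns the length the complete string would have had, so
     a value >= len means the output was truncated.  The buffer is still
     NUL-terminated in that case.  */
  return printed >= 0 && (size_t) printed < len;
}
```

Passing `sizeof buffer` as the length, as the quoted patch does with `sizeof hsa_context.driver_version_s`, keeps the bound tied to the declaration rather than to a separately maintained constant.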
On 29/01/2020 15:40, Tobias Burnus wrote:
Hi Andrew,
On 1/29/20 2:01 PM, Andrew Stubbs wrote:
On 29/01/2020 12:53, Tobias Burnus wrote:
With LLVM 9, the old variant is only accepted when also passing
"-mattr=-code-object-v3" to the compiler; that's a "-" after th
On 29/01/2020 12:53, Tobias Burnus wrote:
Cf. PR93409 comments 4 and later. The comments 1–3 of the PR are covered
by patch https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01663.html (skip
building libgomp's HSA/GCN plugin with -mx32).
For AMDGCN, the LLVM assembler is used. While for LLVM 7+8,
On 29/01/2020 12:30, Thomas Schwinge wrote:
Hi Andrew!
On 2019-11-22T11:06:14+0000, Andrew Stubbs wrote:
This test case causes an ICE (reformatted for email):
void test(int k)
{
unsigned int x = 1;
#pragma acc parallel loop async(x)
for (int i = 0; i < k
On 29/01/2020 09:52, Harwath, Frederik wrote:
@@ -1513,6 +1518,23 @@ init_hsa_context (void)
GOMP_PLUGIN_error ("Failed to list all HSA runtime agents");
}
+ uint16_t minor, major;
+ status = hsa_fns.hsa_system_get_info_fn (HSA_SYSTEM_INFO_VERSION_MINOR,
+ &minor);
+ if (status !=
why I only see this issue on amdgcn, but it
might be because the pointer in question is in a MASK_LOAD which is
perhaps not that commonly used?
I've tested this on amdgcn, and done a full bootstrap and test on x86_64
also.
OK to commit?
Thanks
Andrew
Fix fast-math-pr55281.c ICE.
2020-01-
On 24/01/2020 14:58, Andrew Stubbs wrote:
I've committed this patch to fix an ICE building the
gcc.dg/vect/fast-math-pr55281.c testcase.
Oops, I got that crossed. This was the fix for gcc.dg/pr50310-2.c. The
fast-math-pr55281.c fix will be posted shortly.
The problem was that the combine
On 28/01/2020 14:55, Harwath, Frederik wrote:
Hi,
this patch adds full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin. This replaces
the existing stub in libgomp/plugin-gcn.c.
Andrew: The value returned for acc_property_memory ("size
On 24/01/2020 14:59, Tobias Burnus wrote:
As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails
with a bunch of error messages when building with
--with-multilib-list=m32,m64,mx32
The reason is that the GCN plugin assumes 64bit pointers. As with HSA,
the build is only
d have been rejected, but the predicates were too
loose.
Andrew
Fix ICE on unsupported FP comparison
2020-01-24 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (vec_cmpdi): Use
gcn_fp_compare_operator.
(vec_cmpudi): Use gcn_compare_operator.
(vec_cmpv64qidi): Use gcn_compare_operator.
(
it could attempt to read any unhandled argument as the
thread limit.
Andrew
Fix libgomp plugin-gcn bug
2020-01-23 Andrew Stubbs
libgomp/
* plugin/plugin-gcn.c (parse_target_attributes): Use correct mask for
the device id.
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin
On 20/01/2020 16:42, Harwath, Frederik wrote:
Hi Andrew,
Thanks for the review! I have attached a revised patch containing the changes
that you suggested.
On 20.01.20 11:00, Andrew Stubbs wrote:
On 20/01/2020 06:57, Harwath, Frederik wrote:
Is it ok to commit this patch to the master branch
-20 Andrew Stubbs
libgomp/
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Skip test on gcn.
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (main):
Adjust test dimensions for amdgcn.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c (main): Adjust
gang/worker/vector
On 20/01/2020 11:07, Jakub Jelinek wrote:
On Mon, Jan 20, 2020 at 11:00:58AM +0000, Andrew Stubbs wrote:
Indeed, fat binaries would be a good solution.
Presumably it's possible, but I'm not sure how we'd go about getting the
offload mechanism to launch the backend multiple times? Having got
On 20/01/2020 10:42, Jakub Jelinek wrote:
:(
Another option would be to build offloading code by GCN multiple times, once
for each incompatible ISA the user is asking for, so that one can have then
binaries that will work on different hw.
Because e.g. with the distro vendor hat, it is hard to
On 20/01/2020 10:08, Jakub Jelinek wrote:
On Mon, Jan 20, 2020 at 10:00:09AM +0000, Andrew Stubbs wrote:
@@ -396,6 +396,88 @@ struct gcn_image_desc
struct global_var_info *global_variables;
};
+/* This enum mirrors the corresponding LLVM enum's values for all ISAs that we
+ support
Hi Frederik,
On 20/01/2020 06:57, Harwath, Frederik wrote:
Hi,
this patch implements a runtime ISA check for amdgcn offloading.
The check verifies that the ISA of the GPU to which we try to offload matches
the ISA for which the code to be offloaded has been compiled. If it detects
a mismatch,
code will use, if anything, so we ought to be compatible.
There's no official release using the "wrong" name, so I don't believe
we need to retain that name for any reason.
I've tested that there are no regressions.
Andrew
Rename acc_device_gcn to acc_device_radeon
2020-01-17 And
On 19/12/2019 17:39, Richard Sandiford wrote:
Andrew Stubbs writes:
This patch changes the operand predicates such that vector constants are
permitted during compilation. This prevents ICEs caused by the compiler
trying to emit such instructions without checking.
That sounds like a target
Ping.
On 22/11/2019 11:06, Andrew Stubbs wrote:
This test case causes an ICE (reformatted for email):
void test(int k)
{
unsigned int x = 1;
#pragma acc parallel loop async(x)
for (int i = 0; i < k; i++) { }
}
t.c: In function 'test':
t.c:4:9: error: inva
On 10/01/2020 14:21, Kwok Cheung Yeung wrote:
The patch for sub-word atomics support added an include of stdint.h for
the definition of uintptr_t, but this can result in GCC compilation
failing if the stdint.h header has not been installed (from newlib in
the case of AMD GCN).
I have fixed
On 08/01/2020 18:18, Kwok Cheung Yeung wrote:
Is this version okay for trunk?
OK, thanks.
Andrew
On 08/01/2020 11:07, Kwok Cheung Yeung wrote:
+#define __sync_subword_compare_and_swap(type, size) \
Macro parameters are conventionally upper case.
+ \
+type \
+__sync_val_compare_and_swap_##size
d 3 loops" 1
FAIL: gcc.dg/vect/vect-cond-reduc-4.c scan-tree-dump-times vect "LOOP
VECTORIZED" 2
FAIL: gcc.dg/vect/vect-cselim-1.c scan-tree-dump-times vect "vectorized
2 loops" 1
FAIL: gcc.dg/vect/vect-version-1.c scan-tree-dump vect "applying loop
versioning to oute
' constraints on amdgcn addc/subb
2020-01-07 Andrew Stubbs
gcc/
* config/gcn/constraints.md (DA): Update description and match.
(DB): Likewise.
(Db): New constraint.
* config/gcn/gcn-protos.h (gcn_inline_constant64_p): Add second
parameter.
* config/gcn/gcn.c (gcn_inline_constant64_p
bad code?)
Adding an alternative for each permutation fixes the problem. This has
already been done for many other patterns.
Andrew
Fix amdgcn issue with '0' constraints
2020-01-06 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (subv64di3): Use separate alternatives for
'0' matching inputs
register pairs. Other patterns use '0' to allow exact
matches, but the input and outputs here are different size, and I'm not
sure what happens there. Anyway, this is safe.
Andrew
Fix early-clobber in amdgcn vec_extract
2020-01-06 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (vec_extract
Inline immediates for AMD GCN instructions are supposed to be in the
range -16..64 inclusive, but the implementation had the upper bound
exclusive.
This patch fixes the error.
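
The off-by-one described above can be sketched as follows. This is only an illustration of an inclusive versus exclusive upper bound; the function names are hypothetical and the real check lives in gcc/config/gcn/gcn.c, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: accepts the documented range -16..64 inclusive.  */
static bool
inline_immediate_p (int64_t value)
{
  return value >= -16 && value <= 64;
}

/* The faulty form described above, with the upper bound exclusive.  */
static bool
inline_immediate_exclusive_p (int64_t value)
{
  return value >= -16 && value < 64;
}

/* Boundary checks: only the value 64 distinguishes the two forms.  */
static void
check_boundaries (void)
{
  assert (inline_immediate_p (-16) && inline_immediate_exclusive_p (-16));
  assert (inline_immediate_p (64) && !inline_immediate_exclusive_p (64));
  assert (!inline_immediate_p (65) && !inline_immediate_p (-17));
}
```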
Andrew
Fix amdgcn inline immediate range
2020-01-06 Andrew Stubbs
gcc/
* config/gcn/gcn.c
constants in amdgcn extends and truncates
2019-12-19 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md
(2):
Change input predicate to gcn_alu_operand.
(extend2):
Likewise.
(truncv64di2): Likewise.
(truncv64di2_exec): Likewise.
(v64di2): Likewise.
(v64di2_exec): Likewise.
diff --git a/gcc
are not interesting for those modes (being mostly used to
implement DImode splitters), so we can dispense with the notional iterator.
Andrew
Use V64SI for all amdgcn add-with-carry insns
2019-12-19 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (*plus_carry_dpp_shr_): Rename
elsewhere.
This results in 80 new test passes. There are a few regressions from
vectorization tests that took a different code path and encountered
another missing instruction.
Andrew
Implement sub-dword add/sub on amdgcn
2019-12-19 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (addv64si3
expect that it will not.
I fixed it by special-casing GCN. There might be a more general way,
but apparently this does happen for other architectures (?)
Andrew
Fix vect/pr65947-8.c testcase for amdgcn.
2019-12-18 Andrew Stubbs
gcc/testsuite/
* gcc.dg/vect/pr65947-8.c: Change pass
hopefully the pointer will save future readers some confusion.
Andrew
Add pointer to PR92772
2019-12-17 Andrew Stubbs
* tree-vect-loop.c (vect_create_epilog_for_reduction): Mention pr92772
in the comments.
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 353a5ff06e1..68699f2d
n vect.exp to name them all
individually, but includes vect-cond_reduc-* and pr65947-10.c.
Andrew Stubbs
Mentor Graphics / CodeSourcery
Add extract_last for amdgcn
2019-12-17 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (extract_last_): New expander.
(fold_extract_last_): New expander.
gcc
This patch implements the count leading and trailing zeros instruction
patterns in the AMD GCN backend.
This is a prerequisite for implementing the extract_last patterns.
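
To illustrate why a count-leading-zeros operation helps here, the sketch below picks the last active element of a 64-lane vector from a lane mask. It is an assumption about how such an operation can be built, written in portable C with the GCC builtin `__builtin_clzll`; it is not the machine-description pattern from the patch, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 64

/* Index of the highest set bit in MASK, i.e. the last active lane.  */
static int
last_active_lane (uint64_t mask)
{
  assert (mask != 0);	/* __builtin_clzll is undefined for zero.  */
  return (LANES - 1) - __builtin_clzll (mask);
}

/* Return the element of VEC in the last lane enabled by MASK.  */
static int
extract_last (const int vec[LANES], uint64_t mask)
{
  return vec[last_active_lane (mask)];
}
```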
Andrew Stubbs
Mentor Graphics / CodeSourcery
Add clz and ctz for amdgcn
2019-12-17 Andrew Stubbs
gcc/
* config/gcn
On 16/12/2019 23:00, Thomas Schwinge wrote:
There is no AMD GCN support yet. This will be added later on.
ACK, just to note that there now is a 'libgomp/plugin/plugin-gcn.c' that
at least needs to get a stub implementation (can mostly copy from
'libgomp/plugin/plugin-hsa.c'?) as otherwise the
On 19/11/2019 12:21, Andrew Stubbs wrote:
This patch adds GCN special casing for most of the OpenACC libgomp tests
that require it. It also disables one testcase that explicitly uses CUDA.
The patches aren't all that controversial, should only change the
results on amdgcn, and Tobias already
I've committed this patch to add v64qi and v64hi multiply patterns.
This is slowly working toward full char and short vectorization.
Andrew
Sub-dword vector multiply for amdgcn
2019-12-13 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (mulv64si3): Rename to ...
(mul3
n
2019-12-13 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (sdwa): New mode attribute.
(VCVT_FROM_MODE): Rename to ...
(VCVT_MODE): ... this.
(VCVT_TO_MODE): Delete mode iterator.
(VCVT_FMODE): New mode iterator.
(VCVT_IMODE): Likewise.
(2): Change ...
(2): ... to this.
(2): New.
(ze
On 09/12/2019 15:59, Richard Sandiford wrote:
No, the assumption's correct even there. The assert usually triggers
because something elsewhere is getting confused about the vector types.
The attached patch fixes the ICE in the testcase, but I suspect does not
go far enough. Can it happen that
Hi,
This patch fixes an ICE in testcase gcc.dg/vect/vect-ctor-1.c:
during GIMPLE pass: vect
dump file: vect-ctor-1.c.159t.vect
.../gcc.dg/vect/vect-ctor-1.c: In function 'intrapred_luma_16x16':
.../gcc.dg/vect/vect-ctor-1.c:9:6: internal compiler error: in
exact_div, at poly-int.h:2162
On 06/12/2019 17:57, Andrew Stubbs wrote:
Hi all,
I've committed the attached to fix a failure-to-assemble bug that can
occur in some vectorized code. This has been hidden for a long time
because sub-word vectors were disabled on GCN, but this is no longer the
case.
The gather load
Oops, please consider this patch as submitted from my @codesourcery.com
address, for copyright assignment purposes.
Andrew
On 06/12/2019 17:31, Andrew Stubbs wrote:
Hi all,
This patch re-enables the V64QImode and V64HImode for GCN.
GCC does not make these easy to work with because
On 06/12/2019 18:21, Richard Sandiford wrote:
Andrew Stubbs writes:
Hi all,
This patch re-enables the V64QImode and V64HImode for GCN.
GCC does not make these easy to work with because there is (was?) an
assumption that vector registers do not have excess bits in vector
registers
didn't
assemble well. E.g. it had 'flat_load_short', instead of
'flat_load_ushort'.
This fixes about 39 tests in vect.exp.
--
Andrew Stubbs
Mentor Graphics / CodeSourcery
Fix unrecognised GCN instruction.
2019-12-06 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md (gather_insn_1offset): Use %o
new passes in the vect.exp (there's also 41
new fails, but those are exposed bugs I'll fix shortly). Some of these
were internal compiler errors that did not exist in older compilers.
--
Andrew Stubbs
Mentor Graphics / CodeSourcery
Enable QI/HImode vector moves
2019-12-06 Andrew Stubbs
gcc/
* conf
On 05/12/2019 18:21, Robin Curtis wrote:
My IBM Selectric golfball electronic printer only does 90 characters on A4 in
portrait mode………(at 10 cps)
(as for my all electric TELEX Teleprinter machine !)
Is this debate for real ?! - or is this a Christmas spoof ?
I can't speak for the debate,
On 05/12/2019 16:17, Joseph Myers wrote:
Longer lines mean less space for multiple terminal / editor windows
side-by-side to look at different pieces of code. I don't think that's an
improvement.
Here's a data-point
My 1920 pixel-wide screen, in the default font, allows 239 columns; not
On 05/12/2019 07:05, Harwath, Frederik wrote:
Hi,
the inline assembly "p" modifier ("An operand that is a valid memory address is
allowed",
cf.
https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints)
is not supported on AMD GCN. This causes an ICE during the compilation
now compiles, although not quite correctly, but that's another
issue (pr92772).
Andrew
Add missing amdgcn vcondu patterns
2019-12-03 Andrew Stubbs
gcc/
* config/gcn/gcn-valu.md: Change "vcondu" patterns to use VEC_1REG_MODE
for the data mode.
diff --git a/gcc/config/gcn/gcn-val
On 02/12/2019 14:43, Thomas Schwinge wrote:
Hi!
On 2019-11-12T13:29:13+0000, Andrew Stubbs wrote:
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -174,6 +174,7 @@ enum gomp_map_kind
#define GOMP_DEVICE_NVIDIA_PTX 5
#define GOMP_DEVICE_INTEL_MIC 6
rew
Enable OpenACC GCN testing.
2019-12-03 Andrew Stubbs
libgomp/
* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
Recognize amdgcn.
(check_effective_target_openacc_amdgcn_accel_present): New proc.
(check_effective_target_openacc_amdgcn_accel_selected): New proc.
*
On 02/12/2019 14:23, Thomas Schwinge wrote:
Hi!
On 2019-11-15T13:43:04+0100, Jakub Jelinek wrote:
On Fri, Nov 15, 2019 at 12:38:06PM +0000, Andrew Stubbs wrote:
On 15/11/2019 12:21, Jakub Jelinek wrote:
I'm surprised by the set acc_mem_shared 0, I thought gcn is a shared memory
offloading
On 25/11/2019 14:17, Tobias Burnus wrote:
The compiler warns that funcs_tail and vars_tails are unused – they,
funcs_ids/var_ids and struct id_map seem to be copy-n-paste leftovers
from gcc/config/nvptx/mkoffload.c.
Additionally, COMMENT_PREFIX does not seem to be used anywhere. (In the
On 25/11/2019 11:14, Tobias Burnus wrote:
This patch adds "gcc_unreachable ();" as suggested by me (cf. below).
It also silences the -Wunused-variable + 'no return statement' warnings.
OK for the trunk?
OK.
Thanks, Tobias.
Andrew
unchanged for non-offload compiles (this is only
really used for running the testsuite).
--
Andrew Stubbs
CodeSourcery / Mentor Graphics
Limit LDS usage.
2019-11-22 Andrew Stubbs
gcc/
* config/gcn/gcn.c (OMP_LDS_SIZE): Define.
(ACC_LDS_SIZE): Define.
(OTHER_LDS_SIZE): Define.
(LDS_SIZE
I've committed the attached. The patch adjusts the GCN kernel metadata
so that it is correct for GFX9 devices.
The existing implementation was correct for GFX8, and seems to work on
GFX9, but wasn't technically correct.
--
Andrew Stubbs
CodeSourcery / Mentor Graphics
Use GFX9 granulated
ed patch assigns the "(int) x" to a temporary and passes that
to the function instead.
OK to commit?
--
Andrew Stubbs
CodeSourcery / Mentor Graphics
Normalize GOACC_parallel_keyed async parameter.
2019-11-22 Andrew Stubbs
gcc/
* omp-expand.c (expand_omp_target): Pass sync parameter t
. The code is still correct for the purpose of the testcase
either way, however, so I'm removing the over-fussy match.
Andrew
Update loop-1.c test for amdgcn
2019-11-19 Andrew Stubbs
gcc/testsuite/
* gcc.dg/tree-ssa/loop-1.c: Change amdgcn assembler scan.
diff --git a/gcc/testsuite/gcc.dg
This patch adds GCN special casing for most of the OpenACC libgomp tests
that require it. It also disables one testcase that explicitly uses CUDA.
OK to commit?
Andrew
Update OpenACC tests for amdgcn
2019-11-19 Andrew Stubbs
libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1
On 15/11/2019 21:44, Julian Brown wrote:
This patch flips the switch to enable worker partitioning on AMD GCN.
OK?
This is OK, although I think we could just remove that flag now.
Andrew
On 15/11/2019 21:44, Julian Brown wrote:
This patch checks that cfun is valid in the gcn_asm_output_symbol_ref
function. This prevents a crash when that function is called with NULL
cfun, i.e. when outputting debug symbols.
OK?
OK, although that FIXME still baffles me.
Andrew
On 15/11/2019 21:44, Julian Brown wrote:
@@ -2732,13 +2732,9 @@ wait_for_queue_nonfull (struct goacc_asyncqueue *aq)
{
if (aq->queue_n == ASYNC_QUEUE_SIZE)
{
- pthread_mutex_lock (&aq->mutex);
-
/* Queue is full. Wait for it to not be full. */
while (aq->queue_n ==
On 15/11/2019 21:44, Julian Brown wrote:
+static void
+hsa_memory_copy_wrapper (void *dst, const void *src, size_t len)
+{
+ hsa_status_t status = hsa_fns.hsa_memory_copy_fn (dst, src, len);
+
+ if (status == HSA_STATUS_SUCCESS)
+return;
+
+ /* It appears that the copy fails if the source
On 15/11/2019 15:51, Kwok Cheung Yeung wrote:
On 15/11/2019 11:32 am, Andrew Stubbs wrote:
On 14/11/2019 15:33, Kwok Cheung Yeung wrote:
The kernel attributes are changed to request at least 64 SGPRs and 24
VGPRs (i.e. the non-kernel maximum, otherwise the callees may not
have enough
On 15/11/2019 12:43, Jakub Jelinek wrote:
APUs, such as Carrizo, are shared memory. DGPUs, such as Fiji and Vega, have
their own memory. A DGPU can access host memory, provided that it has been
set up just so, but that is very slow, and I don't know of a way to do that
without still having to
On 15/11/2019 12:21, Jakub Jelinek wrote:
On Thu, Nov 14, 2019 at 04:36:38PM +0000, Andrew Stubbs wrote:
This patch adds some necessary bits to enable OpenACC testing for amdgcn
offloading.
The two "check_effective" procedures are not actually needed yet, but later
patches to
On 14/11/2019 15:34, Kwok Cheung Yeung wrote:
This patch unfixes the registers for the hard frame pointer so that they
can be used for other purposes if the frame pointer is not in use.
This patch is dependent on the commit 'Support using multiple registers
to hold the frame pointer'
On 14/11/2019 15:33, Kwok Cheung Yeung wrote:
The kernel attributes are changed to request at least 64 SGPRs and 24
VGPRs (i.e. the non-kernel maximum, otherwise the callees may not have
enough registers to run in) for non-leaf kernels to take advantage of
the reduced number of registers used
On 14/11/2019 15:32, Kwok Cheung Yeung wrote:
This patch restricts non-kernel functions to using a maximum of 64 SGPRs
and 24 VGPRs.
Kernels can request various pieces of information from the HSA runtime,
and these will be loaded into the registers consecutively before the
kernel executes.
On 14/11/2019 15:30, Kwok Cheung Yeung wrote:
The set of fixed registers is adjusted by the
TARGET_CONDITIONAL_REGISTER_USAGE hook, but this needs to be done on a
per-function basis, whereas the hook is normally called once during GCC
initialization before any functions have been processed
On 14/11/2019 15:30, Kwok Cheung Yeung wrote:
GCN 5 has commonly-used global memory instructions that specify the
address as [SGPR address] + [VGPR offset] + [constant offset], and we
often want the VGPR offset to be zero, so v0 is currently reserved for
that purpose.
However, v1 contains
On 14/11/2019 17:05, Jakub Jelinek wrote:
On Thu, Nov 14, 2019 at 04:47:49PM +0000, Andrew Stubbs wrote:
This patch adds new libgomp tests to ensure that C "printf" and Fortran
"write" work correctly within offload kernels. Both should work for amdgcn,
but nvptx uses the
from offload kernels is not recommended in
production, but can be useful in development.
OK to commit?
Thanks
Andrew
Add tests for print from offload target.
2019-11-14 Andrew Stubbs
libgomp/
* testsuite/libgomp.c/target-print-1.c: New file.
* testsuite/libgomp.fortran/target-print-1.f9
Hi,
This patch adds some necessary bits to enable OpenACC testing for
amdgcn offloading.
The two "check_effective" procedures are not actually needed yet, but
later patches to test cases will use them.
OK to commit?
Thanks
Andrew
Enable OpenACC GCN testing.
2019-11-14 And
On 14/11/2019 12:43, Kwok Cheung Yeung wrote:
Hello
This patch fixes an issue seen in the following test cases on AMD GCN:
libgomp.oacc-fortran/gemm.f90
libgomp.oacc-fortran/gemm-2.f90
libgomp.c/for-5-test_ttdpfs_ds128_auto.c
libgomp.c/for-5-test_ttdpfs_ds128_guided32.c
These patches are now all committed. I've adjusted the changelogs to
list all the proper authors (apologies if I missed anyone).
Thank you for the quick reviews, Jakub. :-)
Andrew
On 12/11/2019 13:29, Andrew Stubbs wrote:
Hi all,
This patch series contributes initial OpenMP and OpenACC
kernels will experience, and therefore make
standalone testing more meaningful.
Andrew
Move gcn-run heap into GPU memory.
2019-11-13 Andrew Stubbs
gcc/
* config/gcn/gcn-run.c (heap_region): New global variable.
(struct hsa_runtime_fn_info): Add hsa_memory_assign_agent_fn
ine, but I don't
think it was doing so anyway. I need to look at that, but how is this,
for now?
Andrew
Optimize GCN OpenMP malloc performance
2019-11-12 Andrew Stubbs
libgomp/
* config/gcn/team.c (gomp_gcn_enter_kernel): Set up the team arena
and use team_malloc varia
On 12/11/2019 14:01, Jakub Jelinek wrote:
On Tue, Nov 12, 2019 at 01:29:16PM +0000, Andrew Stubbs wrote:
2019-11-12 Andrew Stubbs
libgomp/
* plugin/Makefrag.am: Add amdgcn plugin support.
* plugin/configfrag.ac: Likewise.
* plugin/plugin-gcn.c: New file
On 12/11/2019 13:46, Jakub Jelinek wrote:
On Tue, Nov 12, 2019 at 01:29:13PM +0000, Andrew Stubbs wrote:
2019-11-12 Andrew Stubbs
include/
* gomp-constants.h (GOMP_DEVICE_GCN): Define.
(GOMP_VERSION_GCN): Define.
Perhaps this could be 0, but not a big deal.
OG9
This patch prevents the compiler using multiple workers in a gang. This
should be reverted when worker support is committed.
I will commit this with the rest of the series.
Andrew
2019-11-12 Andrew Stubbs
Julian Brown
gcc/
* config/gcn/gcn.c
l
search and replace.
Dummy pass-through definitions are provided for other targets.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* config/gcn/team.c (gomp_gcn_enter_kernel): Set up the team arena
and use team_malloc variants.
(gomp_gcn_exit_ke
This patch contributes the GCN libgomp plugin, with the various
configure and make bits to go with it.
This implementation is a much-cleaned-up version of the one present on the
openacc-gcc-9-branch.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* plugin
new target-specific
symbols to be added to libgomp. I couldn't find an existing way to do
this without adding a new top-level file also, so there's an empty
placeholder also. (The OG9 branch has this symbol in libgcc, but that
seems wrong.)
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stu
the queue is intended, so this simply provides that
information to the queue constructor.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_construct): Add int
parameter.
* oacc-async.c
This patch adds the mkoffload tool to the amdgcn backend. It's similar,
but not quite the same as that on the openacc-gcc-9-branch.
I will commit this patch when the others in this series are approved.
Andrew
2019-11-12 Andrew Stubbs
gcc/
* config/gcn/mkoffload.c: New file
h the GCN port, thus preventing much of
the duplication.
OK to commit?
Thanks
Andrew
2019-11-12 Andrew Stubbs
libgomp/
* configure.tgt (nvptx*-*-*): Add "accel" directory.
* config/nvptx/libgomp-plugin.c: Move ...
* config/accel/libgomp-plugin.c: ... to here.
. Otherwise that will have to wait for GCC 11.
Andrew
Andrew Stubbs (7):
Move generic libgomp files from nvptx to accel
GCN mkoffload
Add device number to GOMP_OFFLOAD_openacc_async_construct
GCN libgomp port
Optimize GCN OpenMP malloc performance
Use a single worker for OpenACC on AMD GCN
On 04/11/2019 15:37, Jakub Jelinek wrote:
My preference would be that arch on amdgcn is something like amdgcn or gcn.
I hope the general distinction between arch and isa will be something that
will be discussed next Tuesday on the language committee, so hopefully we'll
know more afterwards and
Committed to OG9 on behalf of Kwok ...
The list of struct gomp_threads allocated in gomp_gcn_enter_kernel was
not being freed in gomp_gcn_exit_kernel, leading to a small memory leak
every time a kernel is run. Runs with a lot of teams or many kernels
were running out of heap space.
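
The shape of the fix can be sketched as a matched allocate/free pair. This is a simplified stand-in under stated assumptions: `struct thread_state`, `enter_kernel` and `exit_kernel` are hypothetical names used for illustration, not the libgomp definitions of struct gomp_thread, gomp_gcn_enter_kernel or gomp_gcn_exit_kernel.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-thread data.  */
struct thread_state
{
  int id;
};

/* Allocate one entry per thread when the kernel starts.  */
static struct thread_state *
enter_kernel (size_t num_threads)
{
  struct thread_state *threads = calloc (num_threads, sizeof *threads);
  if (threads == NULL)
    return NULL;
  for (size_t i = 0; i < num_threads; i++)
    threads[i].id = (int) i;
  return threads;
}

/* Release the list on exit; omitting this call leaks the list once per
   kernel launch, which is the failure mode described above.  */
static void
exit_kernel (struct thread_state *threads)
{
  free (threads);
}
```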
Andrew
(ROCr), but
there are license issues with that. We could extract them from the
documentation, but this is still on my TODO list.
Andrew
Detect number of GPU compute units.
2019-09-10 Andrew Stubbs
libgomp/
* plugin/plugin-gcn.c (HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT): Define
would hurt performance.
Andrew
Use GFX9 granulated sgprs count correctly.
2019-09-10 Andrew Stubbs
gcc/
* config/gcn/gcn.c (gcn_hsa_declare_function_name): Calculate
granulated_sgprs according to architecture.
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 66854b6f9c5..f8434e4a