Re: [OG12][committed] amdgcn: OpenMP low-latency allocator

2023-03-24 Thread Thomas Schwinge
Hi!

On 2023-02-16T18:06:41+, Andrew Stubbs  wrote:
> 2. 230216-amd-low-lat.patch
>
> Allocate the memory, adjust the default address space, and hook up the
> allocator.

Like done for nvptx in og12 commit 23f52e49368d7b26a1b1a72d6bb903d31666e961
"Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 
'ompx_host_mem_space'",
I've now pushed the corresponding GCN 'ompx_host_mem_space' thing to
devel/omp/gcc-12 branch in commit b39e4bbab59f5e4b551c44dbce0ce3acf4afc22a
"Miscellaneous clean-up re OpenMP 'ompx_host_mem_space'", see attached.
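
For readers skimming the thread, the shape of the change can be sketched in isolation. This is only an illustration: the enum, the `sketch_*` names, and the counting stub are hypothetical stand-ins rather than libgomp's real types, and it assumes (as the patch comment suggests) that allocation from the host memory space always yields NULL on the device, so there is never anything to release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the libgomp memspace handles; the values are
   illustrative only.  */
typedef enum
{
  sketch_default_mem_space,
  sketch_low_lat_mem_space,
  sketch_host_mem_space
} sketch_memspace_t;

/* Counts calls to the hypothetical low-latency release path.  */
static int sketch_lowlat_frees;

static void
sketch_lowlat_free (void *addr, size_t size)
{
  (void) addr;
  (void) size;
  sketch_lowlat_frees++;
}

static void
sketch_memspace_free (sketch_memspace_t memspace, void *addr, size_t size)
{
  if (memspace == sketch_low_lat_mem_space)
    sketch_lowlat_free (addr, size);
  else if (memspace == sketch_host_mem_space)
    /* Assumption: nothing is ever allocated in this space here, so the
       only pointer that can arrive is NULL.  */
    assert (addr == NULL);
  else
    free (addr);
}
```

Making the host case explicit means a non-NULL pointer trips the assertion instead of silently reaching `free`.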


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From b39e4bbab59f5e4b551c44dbce0ce3acf4afc22a Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 17 Feb 2023 14:13:15 +0100
Subject: [PATCH] Miscellaneous clean-up re OpenMP 'ompx_host_mem_space'

Like done for nvptx in og12 commit 23f52e49368d7b26a1b1a72d6bb903d31666e961
"Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space'".

Clean-up for og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd
"amdgcn, libgomp: low-latency allocator".  No functional change.

	libgomp/
	* config/gcn/allocator.c (gcn_memspace_free): Explicitly handle
	'memspace == ompx_host_mem_space'.
---
 libgomp/ChangeLog.omp          | 3 +++
 libgomp/config/gcn/allocator.c | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 63d1f563d5d..ef957e3d2d8 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,8 @@
 2023-03-24  Thomas Schwinge  
 
+	* config/gcn/allocator.c (gcn_memspace_free): Explicitly handle
+	'memspace == ompx_host_mem_space'.
+
 	Backported from master:
 	2023-03-24  Thomas Schwinge  
 
diff --git a/libgomp/config/gcn/allocator.c b/libgomp/config/gcn/allocator.c
index 001de89ffe0..e9980f6f98e 100644
--- a/libgomp/config/gcn/allocator.c
+++ b/libgomp/config/gcn/allocator.c
@@ -36,6 +36,7 @@
    when the memspace access trait is set accordingly.  */
 
 #include "libgomp.h"
+#include <assert.h>
 #include 
 
 #define BASIC_ALLOC_PREFIX __gcn_lowlat
@@ -86,6 +87,9 @@ gcn_memspace_free (omp_memspace_handle_t memspace, void *addr, size_t size)
 
       __gcn_lowlat_free (shared_pool, addr, size);
     }
+  else if (memspace == ompx_host_mem_space)
+    /* Just verify what all allocator functions return.  */
+    assert (addr == NULL);
   else
     free (addr);
 }
-- 
2.25.1



Re: [og12] Un-break nvptx libgomp build (was: [OG12][committed] amdgcn: OpenMP low-latency allocator)

2023-02-20 Thread Andrew Stubbs

On 16/02/2023 21:11, Thomas Schwinge wrote:

>> --- /dev/null
>> +++ b/libgomp/basic-allocator.c
>
>> +#ifndef BASIC_ALLOC_YIELD
>> +#deine BASIC_ALLOC_YIELD
>> +#endif
>
> In file included from [...]/libgomp/config/nvptx/allocator.c:49:
> [...]/libgomp/config/nvptx/../../basic-allocator.c:52:2: error: invalid
> preprocessing directive #deine; did you mean #define?
>    52 | #deine BASIC_ALLOC_YIELD
>       |  ^
>       |  define
>
> Yes, indeed.
>
> I've pushed to devel/omp/gcc-12 branch
> commit 6cc0e7bebf1b3ad6aacf75419e7f06942409f90c
> "Un-break nvptx libgomp build", see attached.


Oops, thanks Thomas.

Andrew


[og12] Un-break nvptx libgomp build (was: [OG12][committed] amdgcn: OpenMP low-latency allocator)

2023-02-16 Thread Thomas Schwinge
Hi!

On 2023-02-16T18:06:41+, Andrew Stubbs  wrote:
> 1. 230216-basic-allocator.patch
>
> Separate the allocator from NVPTX so the code can be shared.

Yay!

> nvptx, libgomp: Move the low-latency allocator code
>
> There shouldn't be a functionality change; this is just so AMD can share
> the code.

I've quickly observed one "functionality" change:

> --- /dev/null
> +++ b/libgomp/basic-allocator.c

> +#ifndef BASIC_ALLOC_YIELD
> +#deine BASIC_ALLOC_YIELD
> +#endif

In file included from [...]/libgomp/config/nvptx/allocator.c:49:
[...]/libgomp/config/nvptx/../../basic-allocator.c:52:2: error: invalid 
preprocessing directive #deine; did you mean #define?
   52 | #deine BASIC_ALLOC_YIELD
      |  ^
      |  define

Yes, indeed.

I've pushed to devel/omp/gcc-12 branch
commit 6cc0e7bebf1b3ad6aacf75419e7f06942409f90c
"Un-break nvptx libgomp build", see attached.
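
As a side note on why an empty definition is a sensible default here: an optional hook macro that expands to nothing can still be written as a statement, leaving just `;`. The following is a minimal sketch, assuming the hook is used inside a retry loop around an atomic exchange; `SKETCH_YIELD`, `SKETCH_LOCKED`, and the function names are hypothetical and not taken from libgomp.

```c
#include <assert.h>
#include <stdint.h>

/* Optional hook: a target may define SKETCH_YIELD before this point
   (for example to a sleep or yield instruction).  The default is empty.  */
#ifndef SKETCH_YIELD
#define SKETCH_YIELD
#endif

/* Hypothetical sentinel meaning "somebody holds the lock".  */
#define SKETCH_LOCKED 0xffffffffu

/* Swap the sentinel in until the previous value was not the sentinel,
   then return that previous value to the caller.  */
static uint32_t
sketch_acquire (uint32_t *word)
{
  for (;;)
    {
      uint32_t old = __atomic_exchange_n (word, SKETCH_LOCKED,
                                          __ATOMIC_ACQUIRE);
      if (old != SKETCH_LOCKED)
        return old;
      SKETCH_YIELD;  /* Expands to an empty statement by default.  */
    }
}

/* Publish a new value, which also releases the lock.  */
static void
sketch_release (uint32_t *word, uint32_t value)
{
  __atomic_store_n (word, value, __ATOMIC_RELEASE);
}

/* Single-threaded walk-through of one acquire/release pair.  */
static void
sketch_demo (void)
{
  uint32_t word = 8;
  uint32_t seen = sketch_acquire (&word);
  assert (seen == 8 && word == SKETCH_LOCKED);
  sketch_release (&word, 24);
  assert (word == 24);
}
```

The `__atomic_*` builtins are GCC extensions (also accepted by Clang), so the sketch is not strictly portable C.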


Regards
 Thomas


From 6cc0e7bebf1b3ad6aacf75419e7f06942409f90c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 16 Feb 2023 21:59:55 +0100
Subject: [PATCH] Un-break nvptx libgomp build

In file included from [...]/libgomp/config/nvptx/allocator.c:49:
[...]/libgomp/config/nvptx/../../basic-allocator.c:52:2: error: invalid preprocessing directive #deine; did you mean #define?
   52 | #deine BASIC_ALLOC_YIELD
      |  ^
      |  define

Yes, indeed.

Fix-up for og12 commit 9583738a62a33a276b2aad980a27e77097f95924
"nvptx, libgomp: Move the low-latency allocator code".

	libgomp/
	* basic-allocator.c (BASIC_ALLOC_YIELD): instead of '#deine',
	'#define' it.
---
 libgomp/ChangeLog.omp     | 3 +++
 libgomp/basic-allocator.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index ecc14b4f537..b667c72b8ca 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,8 @@
 2023-02-16  Thomas Schwinge  
 
+	* basic-allocator.c (BASIC_ALLOC_YIELD): instead of '#deine',
+	'#define' it.
+
 	* testsuite/libgomp.c/usm-1.c: Re-enable non-GCN offloading
 	compilation.
 	* testsuite/libgomp.c/usm-2.c: Likewise.
diff --git a/libgomp/basic-allocator.c b/libgomp/basic-allocator.c
index 94b99a89e0b..b4b9e4ba13a 100644
--- a/libgomp/basic-allocator.c
+++ b/libgomp/basic-allocator.c
@@ -49,7 +49,7 @@
 #endif
 
 #ifndef BASIC_ALLOC_YIELD
-#deine BASIC_ALLOC_YIELD
+#define BASIC_ALLOC_YIELD
 #endif
 
 #define ALIGN(VAR) (((VAR) + 7) & ~7)  /* 8-byte granularity.  */
-- 
2.25.1



[OG12][committed] amdgcn: OpenMP low-latency allocator

2023-02-16 Thread Andrew Stubbs

These patches implement an LDS memory allocator for OpenMP on AMD.

1. 230216-basic-allocator.patch

Separate the allocator from NVPTX so the code can be shared.

2. 230216-amd-low-lat.patch

Allocate the memory, adjust the default address space, and hook up the 
allocator.
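
For orientation, the header comment of basic-allocator.c in the first patch describes a first-fit free chain built from (offset, size) descriptors, with the root descriptor stored at the start of the heap. Below is a rough, single-threaded sketch of that idea only. It omits locking, free, and coalescing, all `sketch_*` names are hypothetical, and it should not be read as the committed implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALIGN(v) (((v) + 7u) & ~(size_t) 7u)  /* 8-byte granularity.  */

/* Describes one free chunk: where it starts (relative to the heap base)
   and how long it is.  A size of 0 terminates the chain.  */
typedef struct
{
  uint32_t offset;
  uint32_t size;
} sketch_desc;

static sketch_desc *
sketch_at (char *heap, uint32_t offset)
{
  return (sketch_desc *) (heap + offset);
}

/* The root descriptor lives in the first 8 bytes; everything after it is
   one big free chunk whose own descriptor terminates the chain.  */
static void
sketch_init (char *heap, uint32_t limit)
{
  assert (limit >= 16 && limit % 8 == 0);
  sketch_desc *root = sketch_at (heap, 0);
  root->offset = 8;
  root->size = limit - 8;
  sketch_desc *end = sketch_at (heap, root->offset);
  end->offset = 0;
  end->size = 0;
}

/* First fit: walk the chain until a chunk is large enough.  */
static void *
sketch_alloc (char *heap, size_t size)
{
  size = SKETCH_ALIGN (size);
  if (size == 0)
    return NULL;

  /* LINK is the slot (root, or inside a free chunk) that describes the
     chunk currently under consideration.  */
  sketch_desc *link = sketch_at (heap, 0);
  while (link->size != 0 && link->size < size)
    link = sketch_at (heap, link->offset);
  if (link->size == 0)
    return NULL;  /* Nothing fits.  */

  sketch_desc chunk = *link;
  sketch_desc onward = *sketch_at (heap, chunk.offset);
  if (chunk.size == size)
    *link = onward;  /* Whole chunk consumed: unlink it.  */
  else
    {
      /* Split: the tail stays free and inherits the onward chain.  */
      sketch_desc rest = { chunk.offset + (uint32_t) size,
                           chunk.size - (uint32_t) size };
      *sketch_at (heap, rest.offset) = onward;
      *link = rest;
    }
  return heap + chunk.offset;
}
```

The sketch expects the heap to come from `malloc` or similarly aligned storage, since the descriptors are written directly into it.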


They will need to be integrated with the rest of the memory management 
patch-stack when I repost that for mainline.


Andrew

nvptx, libgomp: Move the low-latency allocator code

There shouldn't be a functionality change; this is just so AMD can share
the code.

The new basic-allocator.c is designed to be included so it can be used as a
template multiple times and inlined.
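
The naming side of this template approach can be sketched on its own. The example below reuses the two-level `fn`/`fn1` pasting from the patch, but stands in for the `#include` step with a hypothetical macro, `SKETCH_DEFINE_COUNTER`, so that it fits in one file; the counter functions and the `sketch_gcn`/`sketch_nvptx` prefixes are invented purely for illustration.

```c
#include <assert.h>

/* Two-level expansion: fn expands its arguments first, then fn1 pastes.
   Calling fn1 directly with a macro as the prefix would paste the macro's
   name rather than its value.  */
#define fn1(prefix, name) prefix ## _ ## name
#define fn(prefix, name) fn1 (prefix, name)

/* Hypothetical stand-in for "define the prefix, then include the
   template": each use emits an independent, prefixed copy.  */
#define SKETCH_DEFINE_COUNTER(PREFIX) \
  static int fn (PREFIX, count); \
  static int fn (PREFIX, bump) (void) { return ++fn (PREFIX, count); }

SKETCH_DEFINE_COUNTER (sketch_gcn)
SKETCH_DEFINE_COUNTER (sketch_nvptx)

/* Each instance keeps its own state: two bumps on one, one on the other.  */
static int
sketch_demo (void)
{
  sketch_gcn_bump ();
  sketch_gcn_bump ();
  sketch_nvptx_bump ();
  return sketch_gcn_count * 10 + sketch_nvptx_count;
}
```

Because every generated function is `static` and lives in the including translation unit, the compiler is free to inline it, which appears to be the point of including the source rather than linking it.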

libgomp/ChangeLog:

	* config/nvptx/allocator.c (BASIC_ALLOC_PREFIX): New define, and
	include basic-allocator.c.
	(__nvptx_lowlat_heap_root): Remove.
	(heapdesc): Remove.
	(nvptx_memspace_alloc): Move implementation to basic-allocator.c.
	(nvptx_memspace_calloc): Likewise.
	(nvptx_memspace_free): Likewise.
	(nvptx_memspace_realloc): Likewise.
	* config/nvptx/team.c (__nvptx_lowlat_heap_root): Remove.
	(gomp_nvptx_main): Call __nvptx_lowlat_init.
	* basic-allocator.c: New file.

diff --git a/libgomp/basic-allocator.c b/libgomp/basic-allocator.c
new file mode 100644
index 00000000000..94b99a89e0b
--- /dev/null
+++ b/libgomp/basic-allocator.c
@@ -0,0 +1,380 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is a basic "malloc" implementation intended for use with small,
+   low-latency memories.
+
+   To use this template, define BASIC_ALLOC_PREFIX, and then #include the
+   source file.  The other configuration macros are optional.
+
+   The root heap descriptor is stored in the first bytes of the heap, and each
+   free chunk contains a similar descriptor for the next free chunk in the
+   chain.
+
+   The descriptor is two values: offset and size, which describe the
+   location of a chunk of memory available for allocation. The offset is
+   relative to the base of the heap.  The special offset value 0xffffffff
+   indicates that the heap (free chain) is locked.  The offset and size are
+   32-bit values so the base alignment can be 8-bytes.
+
+   Memory is allocated to the first free chunk that fits.  The free chain
+   is always stored in order of the offset to assist coalescing adjacent
+   chunks.  */
+
+#include "libgomp.h"
+
+#ifndef BASIC_ALLOC_PREFIX
+#error "BASIC_ALLOC_PREFIX not defined."
+#endif
+
+#ifndef BASIC_ALLOC_YIELD
+#deine BASIC_ALLOC_YIELD
+#endif
+
+#define ALIGN(VAR) (((VAR) + 7) & ~7)  /* 8-byte granularity.  */
+
+#define fn1(prefix, name) prefix ## _ ## name
+#define fn(prefix, name) fn1 (prefix, name)
+#define basic_alloc_init fn(BASIC_ALLOC_PREFIX,init)
+#define basic_alloc_alloc fn(BASIC_ALLOC_PREFIX,alloc)
+#define basic_alloc_calloc fn(BASIC_ALLOC_PREFIX,calloc)
+#define basic_alloc_free fn(BASIC_ALLOC_PREFIX,free)
+#define basic_alloc_realloc fn(BASIC_ALLOC_PREFIX,realloc)
+
+typedef struct {
+  uint32_t offset;
+  uint32_t size;
+} heapdesc;
+
+void
+basic_alloc_init (char *heap, size_t limit)
+{
+  if (heap == NULL)
+    return;
+
+  /* Initialize the head of the free chain.  */
+  heapdesc *root = (heapdesc*)heap;
+  root->offset = ALIGN(1);
+  root->size = limit - root->offset;
+
+  /* And terminate the chain.  */
+  heapdesc *next = (heapdesc*)(heap + root->offset);
+  next->offset = 0;
+  next->size = 0;
+}
+
+static void *
+basic_alloc_alloc (char *heap, size_t size)
+{
+  if (heap == NULL)
+    return NULL;
+
+  /* Memory is allocated in N-byte granularity.  */
+  size = ALIGN (size);
+
+  /* Acquire a lock on the low-latency heap.  */
+  heapdesc root, *root_ptr = (heapdesc*)heap;
+  do
+    {
+      root.offset = __atomic_exchange_n (&root_ptr->offset, 0xffffffff,
+                                         MEMMODEL_ACQUIRE);
+      if (root.offset != 0xffffffff)
+        {
+          root.size = root_ptr->size;
+