Re: [PATCH, v2, libgomp, OpenMP 5.0] Implement omp_get_device_num

2021-08-05 Thread Chung-Lin Tang




On 2021/8/3 8:22 PM, Thomas Schwinge wrote:

Hi Chung-Lin!

On 2021-08-02T21:10:57+0800, Chung-Lin Tang  wrote:

--- a/libgomp/fortran.c
+++ b/libgomp/fortran.c



+int32_t
+omp_get_device_num_ (void)
+{
+  return omp_get_device_num ();
+}


Missing 'ialias_redirect (omp_get_device_num)'?


Grüße
  Thomas



Thanks, will fix before committing.

Chung-Lin


Re: [PATCH, v2, libgomp, OpenMP 5.0] Implement omp_get_device_num

2021-08-03 Thread Thomas Schwinge
Hi Chung-Lin!

On 2021-08-02T21:10:57+0800, Chung-Lin Tang  wrote:
> --- a/libgomp/fortran.c
> +++ b/libgomp/fortran.c

> +int32_t
> +omp_get_device_num_ (void)
> +{
> +  return omp_get_device_num ();
> +}

Missing 'ialias_redirect (omp_get_device_num)'?


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH, v2, libgomp, OpenMP 5.0] Implement omp_get_device_num

2021-08-03 Thread Thomas Schwinge
Hi Chung-Lin!

Just a few quick comments:

On 2021-08-02T21:10:57+0800, Chung-Lin Tang  wrote:
> On 2021/7/23 6:39 PM, Jakub Jelinek wrote:
>> On Fri, Jul 23, 2021 at 06:21:41PM +0800, Chung-Lin Tang wrote:
>>> --- a/libgomp/icv-device.c
>>> +++ b/libgomp/icv-device.c
>>> @@ -61,8 +61,17 @@ omp_is_initial_device (void)
>>> return 1;
>>>   }
>>>
>>> +int
>>> +omp_get_device_num (void)
>>> +{
>>> +  /* By specification, this is equivalent to omp_get_initial_device
>>> + on the host.  */
>>> +  return omp_get_initial_device ();
>>> +}
>>> +
>>
>> I think this won't work properly with the intel micoffload, where the host
>> libgomp is used in the offloaded code.
>> For omp_is_initial_device, the plugin solves it by:
>> liboffloadmic/plugin/offload_target_main.cpp
>> overriding it:
>> /* Override the corresponding functions from libgomp.  */
>> extern "C" int
>> omp_is_initial_device (void) __GOMP_NOTHROW
>> {
>>return 0;
>> }
>>
>> extern "C" int32_t
>> omp_is_initial_device_ (void)
>> {
>>return omp_is_initial_device ();
>> }
>> but guess it will need slightly more work because we need to copy the value
>> to the offloading device too.
>> It can be done incrementally though.
>
> I guess this part of intelmic functionality will just have to wait later.
> There seem to be other parts of liboffloadmic that seems to need re-work,
> e.g. omp_get_num_devices() return mic_engines_total, where it should actually
> return the number of all devices (not just intelmic). omp_get_initial_device()
> returning -1 (which I don't quite understand), etc.

(I'm confirming there are such pre-existing problems with Intel MIC; I've
never looked up any details.)

> Really suggest to have intelmic support be re-worked as an offload plugin 
> inside
> libgomp, rather than floating outside by itself.

Well, it is a regular libgomp plugin, just its sources are not in
'libgomp/plugin/' and it's not built during libgomp build.  Are you
suggesting just to move it into 'libgomp/plugin/'?  This may need some
more complicated setup because of its 'liboffloadmic' dependency?


>>> --- a/libgomp/libgomp-plugin.h
>>> +++ b/libgomp/libgomp-plugin.h
>>> @@ -102,6 +102,12 @@ struct addr_pair
>>> uintptr_t end;
>>>   };
>>>
>>> +/* This symbol is to name a target side variable that holds the designated
>>> +   'device number' of the target device. The symbol needs to be available 
>>> to
>>> +   libgomp code and the  offload plugin (which in the latter case must be
>>> +   stringified).  */
>>> +#define GOMP_DEVICE_NUM_VAR __gomp_device_num
>>
>> For a single var it is acceptable (though, please avoid the double space
>> before offload plugin in the comment), but once we have more than one
>> variable, I think we should simply have a struct which will contain all the
>> parameters that need to be copied from the host to the offloading device at
>> image load time (and have eventually another struct that holds parameters
>> that we'll need to copy to the device on each kernel launch, I bet some ICVs
>> will be one category, other ICVs another one).

ACK.  Also other program state, like 'fenv' or the gfortran "state blob".
This is  "Missing data/state
sharing/propagation between host and offloading devices".

> Actually, if you look at the 5.[01] specifications, omp_get_device_num() is 
> not
> defined in terms of an ICV. Maybe it conceptually ought to be, but the current
> description of "the device number of the device on which the calling thread is
> executing" is not one if the defined ICVs.
>
> It looks like there will eventually be some kind of ICV block handled in a 
> similar
> way, but I think that the modifications will be straightforward then. For now,
> I think it's okay for GOMP_DEVICE_NUM_VAR to just be a normal global variable.

There is, by the way, precedent for that:
'libgomp/config/nvptx/time.c:double __nvptx_clocktick', set up in
'libgomp/plugin/plugin-nvptx.c:nvptx_set_clocktick' ('cuModuleGetGlobal'
to get the device address, followed by 'cuMemcpyHtoD'), invoked from
'libgomp/plugin/plugin-nvptx.c:GOMP_OFFLOAD_load_image', quite simple.

For the case discussed here, we're now adding more complex
'other_count'/'other_entries'/'num_others' bookkeeping.  (Great that all
of the plugins plus 'libgomp/target.c' invented their own terminology...)
;-)

> --- a/libgomp/plugin/plugin-gcn.c
> +++ b/libgomp/plugin/plugin-gcn.c

> @@ -3305,6 +3306,7 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, 
> const void *target_data,
>struct kernel_info *kernel;
>int kernel_count = image_desc->kernel_count;
>unsigned var_count = image_desc->global_variable_count;
> +  int other_count = 1;
>
>agent = get_agent_info (ord);
>if (!agent)
> @@ -3321,7 +3323,8 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, 
> const void *target_data,
>
>GCN_DEBUG ("Encountered %d kernels in an image\n", kernel_count);
>GCN_DEBUG ("Encountered %u global variables in an image\n", 

Re: [PATCH, v2, libgomp, OpenMP 5.0] Implement omp_get_device_num

2021-08-03 Thread Jakub Jelinek via Gcc-patches
On Mon, Aug 02, 2021 at 09:10:57PM +0800, Chung-Lin Tang wrote:
> > I think this won't work properly with the intel micoffload, where the host
> > libgomp is used in the offloaded code.
> > For omp_is_initial_device, the plugin solves it by:
> > liboffloadmic/plugin/offload_target_main.cpp
> > overriding it:
> > /* Override the corresponding functions from libgomp.  */
> > extern "C" int
> > omp_is_initial_device (void) __GOMP_NOTHROW
> > {
> >return 0;
> > }
> > extern "C" int32_t
> > omp_is_initial_device_ (void)
> > {
> >return omp_is_initial_device ();
> > }
> > but guess it will need slightly more work because we need to copy the value
> > to the offloading device too.
> > It can be done incrementally though.
> 
> I guess this part of intelmic functionality will just have to wait later.
> There seem to be other parts of liboffloadmic that seems to need re-work,
> e.g. omp_get_num_devices() return mic_engines_total, where it should actually
> return the number of all devices (not just intelmic). omp_get_initial_device()
> returning -1 (which I don't quite understand), etc.

For omp_get_num_devices() the standard says:
When called from within a target region the effect of this routine is 
unspecified.
Ditto for omp_get_initial_device and various other routines.
So it is UB if those functions are called in offloaded regions.

> > For a single var it is acceptable (though, please avoid the double space
> > before offload plugin in the comment), but once we have more than one
> > variable, I think we should simply have a struct which will contain all the
> > parameters that need to be copied from the host to the offloading device at
> > image load time (and have eventually another struct that holds parameters
> > that we'll need to copy to the device on each kernel launch, I bet some ICVs
> > will be one category, other ICVs another one).
> 
> Actually, if you look at the 5.[01] specifications, omp_get_device_num() is 
> not
> defined in terms of an ICV. Maybe it conceptually ought to be, but the current
> description of "the device number of the device on which the calling thread is
> executing" is not one if the defined ICVs.
> 
> It looks like there will eventually be some kind of ICV block handled in a 
> similar
> way, but I think that the modifications will be straightforward then. For now,
> I think it's okay for GOMP_DEVICE_NUM_VAR to just be a normal global variable.

Yeah, it is ok for now, but even for the below mentioned omp_display_env
we'll need to replace it...

> There is a new function check_effective_target_offload_target_intelmic() in
> testsuite/lib/libgomp.exp, used to test for non-intelmic offloading 
> situations.
> 
> Re-tested with no regressions, seeking approval for trunk.
> 
> Thanks,
> Chung-Lin
> 
> 2021-08-02  Chung-Lin Tang  
> 
> libgomp/ChangeLog
> 
>   * icv-device.c (omp_get_device_num): New API function, host side.
>   * fortran.c (omp_get_device_num_): New interface function.
>   * libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol.
>   * libgomp.map (OMP_5.0.2): New version space with omp_get_device_num,
>   omp_get_device_num_.
>   * libgomp.texi (omp_get_device_num): Add documentation for new API
>   function.
>   * omp.h.in (omp_get_device_num): Add declaration.
>   * omp_lib.f90.in (omp_get_device_num): Likewise.
>   * omp_lib.h.in (omp_get_device_num): Likewise.
>   * target.c (gomp_load_image_to_device): If additional entry for device
>   number exists at end of returned entries from 'load_image_func' hook,
>   copy the assigned device number over to the device variable.
> 
>   * config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
>   (omp_get_device_num): New API function, device side.
>   * config/plugin/plugin-gcn.c ("symcat.h"): Add include.
>   (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
>   at end of returned 'target_table' entries.
> 
>   * config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
>   (omp_get_device_num): New API function, device side.
>   * config/plugin/plugin-nvptx.c ("symcat.h"): Add include.
>   (GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
>   at end of returned 'target_table' entries.
> 
>   * testsuite/lib/libgomp.exp
>   (check_effective_target_offload_target_intelmic): New function for
>   testing for intelmic offloading.
>   * testsuite/libgomp.c-c++-common/target-45.c: New test.
>   * testsuite/libgomp.fortran/target10.f90: New test.

Ok, thanks.

Jakub



[PATCH, v2, libgomp, OpenMP 5.0] Implement omp_get_device_num

2021-08-02 Thread Chung-Lin Tang

On 2021/7/23 6:39 PM, Jakub Jelinek wrote:

On Fri, Jul 23, 2021 at 06:21:41PM +0800, Chung-Lin Tang wrote:

--- a/libgomp/icv-device.c
+++ b/libgomp/icv-device.c
@@ -61,8 +61,17 @@ omp_is_initial_device (void)
return 1;
  }
  
+int

+omp_get_device_num (void)
+{
+  /* By specification, this is equivalent to omp_get_initial_device
+ on the host.  */
+  return omp_get_initial_device ();
+}
+


I think this won't work properly with the intel micoffload, where the host
libgomp is used in the offloaded code.
For omp_is_initial_device, the plugin solves it by:
liboffloadmic/plugin/offload_target_main.cpp
overriding it:
/* Override the corresponding functions from libgomp.  */
extern "C" int
omp_is_initial_device (void) __GOMP_NOTHROW
{
   return 0;
}

extern "C" int32_t

omp_is_initial_device_ (void)
{
   return omp_is_initial_device ();
}
but guess it will need slightly more work because we need to copy the value
to the offloading device too.
It can be done incrementally though.


I guess this part of intelmic functionality will just have to wait later.
There seem to be other parts of liboffloadmic that seems to need re-work,
e.g. omp_get_num_devices() return mic_engines_total, where it should actually
return the number of all devices (not just intelmic). omp_get_initial_device()
returning -1 (which I don't quite understand), etc.

Really suggest to have intelmic support be re-worked as an offload plugin inside
libgomp, rather than floating outside by itself.


--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -102,6 +102,12 @@ struct addr_pair
uintptr_t end;
  };
  
+/* This symbol is to name a target side variable that holds the designated

+   'device number' of the target device. The symbol needs to be available to
+   libgomp code and the  offload plugin (which in the latter case must be
+   stringified).  */
+#define GOMP_DEVICE_NUM_VAR __gomp_device_num


For a single var it is acceptable (though, please avoid the double space
before offload plugin in the comment), but once we have more than one
variable, I think we should simply have a struct which will contain all the
parameters that need to be copied from the host to the offloading device at
image load time (and have eventually another struct that holds parameters
that we'll need to copy to the device on each kernel launch, I bet some ICVs
will be one category, other ICVs another one).


Actually, if you look at the 5.[01] specifications, omp_get_device_num() is not
defined in terms of an ICV. Maybe it conceptually ought to be, but the current
description of "the device number of the device on which the calling thread is
executing" is not one if the defined ICVs.

It looks like there will eventually be some kind of ICV block handled in a 
similar
way, but I think that the modifications will be straightforward then. For now,
I think it's okay for GOMP_DEVICE_NUM_VAR to just be a normal global variable.


diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 8ea27b5565f..ffcb98ae99e 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -197,6 +197,8 @@ OMP_5.0.1 {
omp_get_supported_active_levels_;
omp_fulfill_event;
omp_fulfill_event_;
+   omp_get_device_num;
+   omp_get_device_num_;
  } OMP_5.0;


This is wrong.  We've already released GCC 11.1 with the OMP_5.0.1
symbol version, so we must not add any further symbols into that symbol
version.  OpenMP 5.0 routines added in GCC 12 should be OMP_5.0.2 symbol
version.


I've adjusted this into 5.0.2, in between 5.0.1 and the new 5.1 added by the 
recent
omp_display_env[_] routines. omp_get_device_num is a OpenMP 5.0 introduced
API function, so I think this is the correct handling (instead of stashing into 
5.1).

There is a new function check_effective_target_offload_target_intelmic() in
testsuite/lib/libgomp.exp, used to test for non-intelmic offloading situations.

Re-tested with no regressions, seeking approval for trunk.

Thanks,
Chung-Lin

2021-08-02  Chung-Lin Tang  

libgomp/ChangeLog

* icv-device.c (omp_get_device_num): New API function, host side.
* fortran.c (omp_get_device_num_): New interface function.
* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol.
* libgomp.map (OMP_5.0.2): New version space with omp_get_device_num,
omp_get_device_num_.
* libgomp.texi (omp_get_device_num): Add documentation for new API
function.
* omp.h.in (omp_get_device_num): Add declaration.
* omp_lib.f90.in (omp_get_device_num): Likewise.
* omp_lib.h.in (omp_get_device_num): Likewise.
* target.c (gomp_load_image_to_device): If additional entry for device
number exists at end of returned entries from 'load_image_func' hook,
copy the assigned device number over to the device variable.

* config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
(omp_get_device_num): New API function, device side.