On 07/28/2015 09:52 AM, Nathan Sidwell wrote:
> I've committed this patch to the gomp4 branch to redo the launch API.
> I'll post a version for trunk once the versioning patch gets approved &
> committed.
>
> This changes the API in a number of ways, allowing device-specific
> knowledge to be moved into the device compiler and out of the host
> compiler.
>
> Firstly, we attach a tuple of launch dimensions as an attribute to the
> offloaded function's 'oacc function' attribute. These are the constant
> launch dimensions. Dynamic dimensions get a zero for their slot in this
> list. Further, this list can be extended in the future to an alist keyed
> by device_type.
>
> Dynamic dimensions are computed on the host. However, they are passed
> via variadic args to the GOACC_parallel function (which is renamed). The
> variadic args are passed using key/value representation, and 3 keys are
> currently defined:
> END - end of the variadic list
> DIM - set of runtime-computed dimensions. Only the dynamic ones are
> passed.
> ASYNC_WAIT - an async and a set of waits (possibly zero).
>
> I have arranged for the key to have a slot that can later be filled by
> device_type, and hence support multiple device types.
>
> The constant dimensions can be used in expansion of the GOACC_nid
> function in the device compiler. The device compiler could also process
> that list to select the device_type slot that is appropriate.
>
> For PTX the backend is augmented to emit the launch dimensions into the
> target data, from whence the ptx plugin can pick them up and overwrite
> with any dynamic ones passed in from the launch function.
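A minimal sketch of the key/value variadic scheme described above, to make the discussion concrete. Everything in it is an assumption for illustration, not the code in the patch: the names decode_launch_args, KEY_END, KEY_DIM, KEY_ASYNC_WAIT and NUM_DIMS are hypothetical, and so is the encoding (DIM followed by a bitmask of the dynamic slots and one value per set bit; ASYNC_WAIT followed by the async id, a wait count, and that many wait ids). It only shows how constant dimensions could be overwritten by the dynamic ones passed at launch.

```c
#include <stdarg.h>

/* Hypothetical key values; the real encoding in the patch may differ.  */
enum launch_key { KEY_END = 0, KEY_DIM = 1, KEY_ASYNC_WAIT = 2 };

#define NUM_DIMS 3  /* Assumed slot order: gang, worker, vector.  */

/* Walk a key/value variadic list terminated by KEY_END.

   DIMS holds the constant launch dimensions on entry, with 0 in each
   dynamic slot.  A KEY_DIM entry (assumed layout: bitmask of dynamic
   slots, then one int per set bit) overwrites those slots.  A
   KEY_ASYNC_WAIT entry (assumed layout: async id, wait count, then that
   many wait ids) is recorded in *ASYNC and *NUM_WAITS.  */
static void
decode_launch_args (int dims[NUM_DIMS], int *async, int *num_waits, ...)
{
  va_list ap;
  int done = 0;

  va_start (ap, num_waits);
  while (!done)
    {
      int key = va_arg (ap, int);

      switch (key)
        {
        case KEY_DIM:
          {
            unsigned mask = va_arg (ap, unsigned);

            for (int i = 0; i < NUM_DIMS; i++)
              if (mask & (1u << i))
                dims[i] = va_arg (ap, int);
          }
          break;

        case KEY_ASYNC_WAIT:
          *async = va_arg (ap, int);
          *num_waits = va_arg (ap, int);
          /* This sketch only consumes the wait ids.  */
          for (int i = 0; i < *num_waits; i++)
            (void) va_arg (ap, int);
          break;

        case KEY_END:
        default:
          /* An unknown key cannot be skipped safely, so stop there too.  */
          done = 1;
          break;
        }
    }
  va_end (ap);
}
```

With dims initialised to {0, 8, 0}, a call such as decode_launch_args (dims, &async, &num_waits, KEY_DIM, 1u << 2, 32, KEY_END) would fill only the vector slot with 32 and leave the constant worker dimension untouched.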
Looking at set_oacc_fn_attrib, it appears that const values are also
considered dynamic. See the attached test case for more info. Is that
the expected behavior? If not, I could take a look at this after I
finish my reduction patch.
Cesar
#include <stdio.h>

/* const-qualified, but it appears to be treated as a dynamic dimension.  */
const int vl = 32;

int
main ()
{
  unsigned int red = 0;

#pragma acc parallel loop vector_length (vl) vector reduction (+:red) copy (red)
  for (int i = 0; i < 100; i++)
    red++;

  printf ("red = %u\n", red);

  return 0;
}