On 07/28/2015 09:52 AM, Nathan Sidwell wrote:
> I've committed this patch to the gomp4 branch to redo the launch API.
> I'll post a version for trunk once the versioning patch gets approved &
> committed.
>
> This changes the API in a number of ways, allowing device-specific
> knowledge to be moved into the device compiler and out of the host
> compiler.
>
> Firstly, we attach a tuple of launch dimensions as an attribute to the
> offloaded function's 'oacc function' attribute. These are the constant
> launch dimensions. Dynamic dimensions get a zero for their slot in this
> list. Further, this list can be extended in the future to an alist keyed
> by device_type.
>
> Dynamic dimensions are computed on the host. However, they are passed
> via variadic args to the GOACC_parallel function (which is renamed). The
> variadic args are passed using key/value representation, and 3 keys are
> currently defined:
> END - end of the variadic list
> DIM - set of runtime-computed dimensions. Only the dynamic ones are
> passed.
> ASYNC_WAIT - an async and a set of waits (possibly zero).
>
> I have arranged for the key to have a slot that can later be filled by
> device_type, and hence support multiple device types.
>
> The constant dimensions can be used in expansion of the GOACC_nid
> function in the device compiler. The device compiler could also process
> that list to select the device_type slot that is appropriate.
>
> For PTX the backend is augmented to emit the launch dimensions into the
> target data, from whence the ptx plugin can pick them up and overwrite
> with any dynamic ones passed in from the launch function.
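For reference, here is a minimal sketch of how the key/value scheme described above could be decoded on the runtime side. It is an illustration only, not the gomp4 code: the names decode_launch, launch_info and the KEY_* and DIM_* constants are hypothetical, and the layout of each key's payload (a bitmask after DIM naming which dimensions follow, and an async value plus a wait count after ASYNC_WAIT) is an assumption made for the example. The constant dimensions come in as an array in which 0 marks a dynamic slot, and any DIM entries overwrite those slots.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical key codes; the real encoding may differ.  */
enum launch_key { KEY_END, KEY_DIM, KEY_ASYNC_WAIT };
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, N_DIMS };

struct launch_info
{
  int dims[N_DIMS];	/* Final launch dimensions.  */
  int async;		/* -1 when no ASYNC_WAIT key was seen.  */
  int num_waits;
};

/* Start from the constant dimensions (0 means "dynamic"), then walk
   the key/value list until KEY_END.  Assumed payloads:
     KEY_DIM:        unsigned mask, then one int per set bit
     KEY_ASYNC_WAIT: int async, int count, then COUNT int waits  */
static struct launch_info
decode_launch (const int *const_dims, ...)
{
  struct launch_info info = { { 0 }, -1, 0 };
  va_list ap;
  int key;

  for (int i = 0; i < N_DIMS; i++)
    info.dims[i] = const_dims[i];

  va_start (ap, const_dims);
  while ((key = va_arg (ap, int)) != KEY_END)
    {
      if (key == KEY_DIM)
	{
	  unsigned mask = va_arg (ap, unsigned);
	  for (int i = 0; i < N_DIMS; i++)
	    if (mask & (1u << i))
	      info.dims[i] = va_arg (ap, int);	/* Dynamic value wins.  */
	}
      else
	{
	  assert (key == KEY_ASYNC_WAIT);
	  info.async = va_arg (ap, int);
	  info.num_waits = va_arg (ap, int);
	  for (int i = 0; i < info.num_waits; i++)
	    (void) va_arg (ap, int);	/* A real runtime would queue these.  */
	}
    }
  va_end (ap);
  return info;
}
```

Under these assumptions, a vector length that the host compiler treats as dynamic would show up as a 0 in the constant array and as a value under the DIM key, for example decode_launch (cdims, KEY_DIM, 1u << DIM_VECTOR, 32, KEY_END).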
Looking at set_oacc_fn_attrib, it appears that const values are also
considered dynamic. See the attached test case for more info. Is that
the expected behavior? If not, I could take a look at this after I
finish my reduction patch.

Cesar
#include <stdio.h>

/* A const-qualified variable, not a literal constant.  */
const int vl = 32;

int
main ()
{
  unsigned int red = 0;

#pragma acc parallel loop vector_length (vl) vector reduction (+:red) copy (red)
  for (int i = 0; i < 100; i++)
    red++;

  /* red is unsigned, so print it with %u.  */
  printf ("red = %u\n", red);

  return 0;
}