Hello,

On Wed, 3 Feb 2016, Nathan Sidwell wrote:
> 1) extend the -fopenacc-dim=X:Y:Z syntax to allow '-' indicating a runtime
> choice. (0 also indicates that, but I thought best to have an explicit syntax
> as well).

Does this work when the user specifies one of the dimensions at compile
time, so that references to it are subject to constant folding and VRP,
but leaves some other dimension unspecified?  When GOMP_OPENACC_DIM is
eventually parsed at runtime, the runtime-specified value of the first
dimension may differ from what the compiler saw, invalidating all of
that folding and propagation.

Here:

+  /* Do some sanity checking.  The CUDA API doesn't appear to
+     provide queries to determine these limits.  */
+  if (default_dims[GOMP_DIM_GANG] < 1)
+    default_dims[GOMP_DIM_GANG] = 32;
+  if (default_dims[GOMP_DIM_WORKER] < 1
+      || default_dims[GOMP_DIM_WORKER] > 32)
+    default_dims[GOMP_DIM_WORKER] = 32;
+  default_dims[GOMP_DIM_VECTOR] = 32;

I don't see why you say that: cuDeviceGetAttribute provides
CU_DEVICE_ATTRIBUTE_WARP_SIZE, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK
and CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X (which is not too useful for this
case), and cuFuncGetAttribute allows getting a per-function thread limit.
There's a patch on the gomp-nvptx branch that adds querying some of those
to the plugin.

Alexander
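
For illustration, the sanity checks quoted above could clamp against queried limits rather than a fixed 32. This is only a rough sketch and not the code from the gomp-nvptx branch. The struct `device_limits` and the function `sanitize_default_dims` are hypothetical names, and the limits are passed in by the caller so that the sketch builds without the CUDA headers; in the plugin they would presumably come from cuDeviceGetAttribute. It also assumes that workers times vector length must not exceed the per-block thread limit.

```c
/* Dimension indices, mirroring the GOMP_DIM_* names used in the patch.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Hypothetical container for limits that would be obtained through
   cuDeviceGetAttribute.  Supplied by the caller in this sketch.  */
struct device_limits
{
  int warp_size;             /* CU_DEVICE_ATTRIBUTE_WARP_SIZE */
  int max_threads_per_block; /* CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK */
};

/* Clamp the default dimensions against queried limits instead of a
   hard-coded 32.  Assumption: workers * vector length must fit in one
   thread block.  */
static void
sanitize_default_dims (int dims[DIM_MAX], const struct device_limits *lim)
{
  /* Fall back to 32 if the queried warp size is unusable.  */
  int warp = lim->warp_size > 0 ? lim->warp_size : 32;
  int max_workers = lim->max_threads_per_block / warp;

  if (max_workers < 1)
    max_workers = 1;

  /* Vector length is tied to the warp size.  */
  dims[DIM_VECTOR] = warp;

  if (dims[DIM_WORKER] < 1 || dims[DIM_WORKER] > max_workers)
    dims[DIM_WORKER] = max_workers;

  /* Gang fallback kept as in the quoted patch.  */
  if (dims[DIM_GANG] < 1)
    dims[DIM_GANG] = 32;
}
```

With a warp size of 32 and a 1024-thread block limit this gives the same numbers as the quoted hunk, but the worker bound would follow the device on hardware with a different limit.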