On Wed, Dec 02, 2015 at 06:44:11PM +0300, Alexander Monakov wrote:
> > But you never know if people actually use #pragma omp simd regions or not,
> > sometimes they will, sometimes they won't, and if the uniform SIMT increases
> > power consumption, it might not be desirable.
>
> It's easy to address: just terminate threads 1-31 if the linked image has
> no SIMD regions, like my pre-simd libgomp was doing.

Well, can't, say, the linked image in one shared library call a function in
another linked image in another shared library? Or is that just not
supported for PTX? I believe XeonPhi supports that.

If each linked image is self-contained, then that is probably a good idea,
but you could still have a single simd region somewhere and lots of other
target regions that don't use simd, or cases where only a small amount of
time is spent in a simd region, and this wouldn't help in those cases.

If the addressables are handled through a soft stack, then the rest is
mostly just SSA_NAMEs you can see on the edges of the SIMT region, and those
really shouldn't be that expensive to broadcast or reduce back.

	Jakub