On Tue, 1 Dec 2015, Bernd Schmidt wrote:
>
> Didn't we also conclude that address-taking (let's say for stack addresses) is
> also an operation that does not result in the same state?

This is intended to be used with soft-stacks in OpenMP offloading, and soft-stacks are per-warp outside of SIMD regions, not private to hwthread. So no such problem arises.

(also, I wouldn't phrase it that way -- I wouldn't say that taking address of a classic .local stack slot desyncs state)

> Have you tried to use the mechanism used for OpenACC? IMO that would be a good
> first step - get things working with fewer changes, and then look into
> optimizing them (ideally for OpenMP and OpenACC both).

I don't think I would have as much success trying to apply the OpenACC mechanism with the overall direction I'm taking, that is, running with a slightly modified libgomp port. The way parallel regions are activated in the guts of libgomp via GOMP_parallel/gomp_team_start makes things different, for example.

Alexander