On 11/14/14 04:39, Jakub Jelinek wrote:

:(.  So what other option does one have to implement something like TLS, even
using inline asm or similar?  There is %tid, so perhaps indexing some array
with %tid?  The trouble with that is that some thread can do
#pragma omp parallel again, and I bet the %tid afterwards would be
again 0-(n-1), and if it is an index into a global array, it wouldn't work
well then.  Maybe without anything like TLS we can't really support nested
parallelism, only one level of #pragma omp parallel inside of nvptx regions.
But, if we add support for #pragma omp teams, we'd either need the array
in gang-local memory, or some other special register to give us gang id.
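
To make the concern above concrete, here is a minimal host-side sketch in plain
C, not nvptx code.  It assumes, as the message guesses, that a nested parallel
region numbers its threads from 0 again.  The names tls_slot, run_region and
count_clobbered are hypothetical and exist only for this illustration.

```c
#define NTHREADS 4

/* Hypothetical stand-in for TLS: one global slot per %tid value. */
static int tls_slot[NTHREADS];

/* Simulate one parallel region of n threads.  Each "thread" writes to the
   slot selected by its tid, and tids always run from 0 to n-1 (the
   assumption taken from the message above). */
static void run_region(int n, int base)
{
    for (int tid = 0; tid < n; tid++)
        tls_slot[tid] = base + tid;
}

/* Run an outer region, then a nested region of two threads, and return
   how many of the outer region's slots the nested region overwrote. */
int count_clobbered(void)
{
    int saved[NTHREADS];
    int clobbered = 0;

    run_region(NTHREADS, 100);          /* outer region fills every slot */
    for (int i = 0; i < NTHREADS; i++)
        saved[i] = tls_slot[i];

    run_region(2, 200);                 /* nested region reuses tids 0 and 1 */

    for (int i = 0; i < NTHREADS; i++)
        if (tls_slot[i] != saved[i])
            clobbered++;
    return clobbered;
}
```

In this sketch the two nested threads overwrite the slots that belong to outer
threads 0 and 1, which is the collision that would rule out a %tid-indexed
global array for nested parallelism.
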

Does the interface to the hardware even allow a model where we can launch
another offload task while one is in progress?

Jeff
