On 10/20/2015 11:13 PM, Alexander Monakov wrote:
On Tue, 20 Oct 2015, Bernd Schmidt wrote:

On 10/20/2015 08:34 PM, Alexander Monakov wrote:
2.  Make gomp_nvptx_main a device (.func) function.  To have that work, we'd
need to additionally emit a "trampoline" of sorts in the NVPTX backend.  For
each OpenMP target entrypoint foo$_omp_fn$0, we'd have to additionally emit

__global__ void foo$_omp_fn$0$entry(void *args)
{
     gomp_nvptx_main(foo$_omp_fn$0, args);
}

Wouldn't it be simpler to generate a .kernel for every target region function
(as OpenACC does)? That could be a small stub in each case which just calls
gomp_nvptx_main with the right function pointer. We already have the machinery
to look up the right kernel corresponding to a host address and invoke it, so
I think we should just reuse that functionality.

As I see it, we are describing the same thing in different words.

In what you describe, and in my quoted paragraph, both gomp_nvptx_main and the
function originally outlined for a target region are device-only (.func)
functions.  The .kernel function that the plugin looks up and launches is a
small piece of code that calls gomp_nvptx_main, passing it a pointer to the
target region function.
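
To make the call structure above concrete, here is a minimal host-side sketch in plain C. It is only an analogy: the .kernel/.func distinction exists in PTX, not in C, and the names gomp_main_sketch, foo_omp_fn_0 and foo_omp_fn_0_entry are hypothetical stand-ins for gomp_nvptx_main, foo$_omp_fn$0 and its launchable stub. The actual signature and setup work of gomp_nvptx_main are not shown in this thread and are assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Type of an outlined target-region body: takes an opaque argument block. */
typedef void (*target_fn)(void *args);

/* Hypothetical stand-in for gomp_nvptx_main (a device-only .func).
   The real function would presumably set up the runtime before calling fn;
   this sketch only forwards the call. */
static void gomp_main_sketch(target_fn fn, void *args)
{
    assert(fn != NULL);
    fn(args);
}

/* Stand-in for an outlined target region such as foo$_omp_fn$0
   (also a device-only .func).  Here it just increments a counter. */
static void foo_omp_fn_0(void *args)
{
    int *counter = (int *)args;
    *counter += 1;
}

/* Stand-in for the per-region stub: the only function the plugin would
   look up and launch.  It passes the region's address to the common main. */
void foo_omp_fn_0_entry(void *args)
{
    gomp_main_sketch(foo_omp_fn_0, args);
}
```

Calling foo_omp_fn_0_entry(&n) with an int n runs the region body through the common entry point, so each target region needs only its own small stub.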

Unless I didn't fully catch what you say?  Like I said in the email, I do like
this approach more.

Could be that we're talking about the same thing. I think I was confused by a reference to .func vs .kernel and sm_30 vs sm_35 in patch 2/14. So let's go for this approach.


Bernd
