Hi,

On 10/09/2012 03:33 PM, Sam Parker wrote:
> I am creating a custom target which will run just the kernels, I want
> the kernels to be compiled and linked statically with everything needed
> to run the kernel included. So I believe the standalone method of
> compilation is right, is this correct?
>
> I am using a configurable VLIW so I will want to combine work items to
> expose ILP but the device is also multi-core. I am compiling the simple
> standalone example and just would like to know how get_global_id is
> calculated in the produced bytecode? And how I should use the
> _*kernel*_workgroup and _*kernel*_workgroup_fast functions? I'm trying
> to go through the code, but without comments, I am not making very fast
> progress.

It sounds like you want to do exactly what we have done at TUT, which
was the original use case for the pocl kernel compiler passes
before pocl was published a year ago. The main difference, I suppose,
is that we use TTA as the processor template instead of a traditional
"OTA" VLIW.

The problematic part of the "standalone mode" is the host API, unless
you create a custom launcher for your kernel, which is then not
"official OpenCL". Standalone compilation of the host API together
with the kernel binary is not supported in pocl, but I implemented
the APIs I need in the TCE libraries in the TCE source tree.

Basically the host API stubs assume that the kernels are linked with the
program, and thus clBuildProgram etc. are dummy no-operations. The
pocl-standalone script generates the work group function, assuming you
have the required work group attribute in place. That function is then
called through a "trampoline function" glued in by the compiler driver
script.
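
As a rough illustration of the idea (this is not the actual pocl or TCE
code; launch_ctx, global_id, scale_workgroup and launch_scale are names
made up for this sketch), a work group function can be thought of as a
loop over the work-items of one group, with a trampoline-style launcher
calling it once per group. The global id follows the OpenCL definition
for one dimension: group id * local size + local id + global offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical launch context; the field names are illustrative only. */
typedef struct {
    size_t group_id;      /* index of the work-group being executed */
    size_t local_size;    /* number of work-items per work-group */
    size_t global_offset; /* offset given at enqueue time, usually 0 */
} launch_ctx;

/* One-dimensional global id as defined by OpenCL for uniform groups. */
static size_t global_id(const launch_ctx *ctx, size_t local_id)
{
    return ctx->group_id * ctx->local_size + local_id + ctx->global_offset;
}

/* Work group function for a trivial "scale" kernel: all work-items of
 * one group run in a plain loop, which a VLIW compiler can unroll and
 * schedule to expose ILP. */
static void scale_workgroup(float *data, float factor, const launch_ctx *ctx)
{
    for (size_t lid = 0; lid < ctx->local_size; ++lid) {
        size_t gid = global_id(ctx, lid);
        data[gid] *= factor; /* body of the original kernel */
    }
}

/* Trampoline-style launcher: calls the work group function once per
 * group. On a multi-core device the groups could be split across cores. */
static void launch_scale(float *data, float factor,
                         size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    for (size_t g = 0; g < global_size / local_size; ++g) {
        launch_ctx ctx = { g, local_size, 0 };
        scale_workgroup(data, factor, &ctx);
    }
}
```

With a custom launcher like this, the host side only has to call
launch_scale(buffer, factor, global_size, local_size); no OpenCL host
API is involved, which is why it is not "official OpenCL".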

Thus, check the TCE sources and the tcecc compiler driver, which is a
Python script (http://tce.cs.tut.fi). The standalone mode of TCE is an
incomplete proof-of-concept, and I think the best way to make it more
robust is to reuse the pocl implementations for the standalone mode as
well. It should be possible to make it almost transparent to the OpenCL
app whether it's compiled in the standalone mode offline or with an
online compiler.

There's a quick tutorial in the TCE user manual:
http://tce.cs.tut.fi/user_manual/TCE/node21.html

Some papers we have written on this subject are available on the
http://tce.cs.tut.fi/publications.html page.

"OpenCL-based Design Methodology for Application-Specific Processors" and
"TCEMC: A Co-Design Flow for Application-Specific Multicores" are the
most relevant ones. Do you have any publications on your work, BTW?

BR,
-- 
--Pekka

