On 2013-09-08, at 14:02 , Kalle Raiskila <[email protected]> wrote:

> On Sun, 08 Sep 2013 12:20:18 +0300,
> Re: [pocl-devel] Triples, targets, cpus, and features... and HOST
> vs. TARGET wrote:
> 
>> If you think of binary distributions of pocl with GPU device support
>> enabled, one cannot rely on configure-time detection of the device
>> features, not even on how many and which type of devices (GPUs) one
>> has in the machine.
> 
> Actually, we don't need to even think of GPUs to hit the problem.
> Already with x86_64 CPUs there is the issue of which SSE/AVX extension
> to enable. We probably want to enable the latest one the compiler &
> processor allow us. 
> 
> So - HOST flags should *only* be used to compile libpocl (libOpenCL).
> And each target must take care of the TARGET flags, probably detecting
> them at runtime?
> 
> How about the runtime kernel library, then? There is some "#ifdef
> SSE"-like stuff there already, and we compile this into .bc before we
> know the exact processor that we are going to run on.


These #ifdefs are set by the compiler, depending on the options that one passes 
to the compiler. They do not depend on the host, but on the target. In other 
words, we're good here.
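
To illustrate the point about compiler-set macros, here is a minimal sketch in C. It assumes GCC or Clang, which predefine macros such as __SSE2__ and __AVX__ according to the target options on the command line. The function names are made up for this example and are not taken from the pocl kernel library.

```c
#include <stdio.h>

/* Returns a label for the widest x86 SIMD extension that the compiler
   was told to generate code for. GCC and Clang predefine these macros
   from target options such as -msse2, -mavx, or -march=...; they do not
   describe the CPU on which the compiler itself happens to run. */
static const char *compiled_simd_level(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__SSE__)
    return "SSE";
#else
    return "none";
#endif
}

/* Hypothetical helper, for illustration only. */
static void print_compiled_simd_level(void)
{
    printf("compiled for: %s\n", compiled_simd_level());
}
```

Compiling the same file once with -mavx and once without should change the reported level, regardless of which machine runs the compiler.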

We are currently using configure.ac to determine much information about the 
host. There is no equivalent for determining information about the target 
(which may not be possible). Much of the logic that ensures that code for 
x86-64, i386, or ARM is generated correctly currently queries the host CPU. If 
we make this target-dependent, then the respective information needs to be 
passed by the end user, which would make pocl quite inconvenient to use.

I would prefer a model where we continue to query the host for all information 
possible. When pocl runs, the user can override this for particular targets 
(e.g. Nvidia, TCE) that require cross-compiling, in which case the user may 
have to provide a large amount of information, depending on how complex the target 
is. However, if the target is the host, then we simply re-use the information 
we gathered. This allows e.g. detecting AVX, SSE, hardware floating-point etc. 
automatically, without having to detect this at run time.
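
A rough sketch of this model in C, under stated assumptions: EXAMPLE_HOST_CPU stands in for a value that configure would have detected on the host, and EXAMPLE_TARGET_CPU is a made-up environment variable for the user override. Neither name exists in pocl; they only show the fallback logic.

```c
#include <stdlib.h>

/* Hypothetical: the CPU name that a configure script detected on the
   host and baked into the build. */
#ifndef EXAMPLE_HOST_CPU
#define EXAMPLE_HOST_CPU "host-detected-cpu"
#endif

/* Returns the CPU to generate code for: the user's override if one is
   given and non-empty, otherwise the value detected on the host. */
static const char *example_target_cpu(const char *user_override)
{
    if (user_override != NULL && user_override[0] != '\0')
        return user_override;
    return EXAMPLE_HOST_CPU;
}

/* Reads the override from a (hypothetical) environment variable, so a
   user who cross-compiles can set it without rebuilding. */
static const char *example_target_cpu_from_env(void)
{
    return example_target_cpu(getenv("EXAMPLE_TARGET_CPU"));
}
```

When the target is the host, the user sets nothing and the configure-time value is used as is.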

This needs to be detected before the kernel library is built, i.e. at the time 
pocl is configured and built. Bytecode is not generic enough to handle 
different architectures; e.g. type widths, calling conventions, and name 
mangling depend on the target, and these are present in bytecode already.
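
As a small illustration of why type widths end up in the compiled output, consider the C sketch below. The function names are invented for this example. The compiler folds sizeof(void *) into a constant for the chosen target, so code compiled for a 32-bit target carries a 4 where code for a 64-bit target carries an 8, and the result cannot simply be moved between the two.

```c
#include <stddef.h>
#include <stdio.h>

/* sizeof is evaluated at compile time for the selected target, so the
   multiplier below is a fixed constant in the generated code. */
static size_t bytes_for_pointers(size_t count)
{
    return count * sizeof(void *);
}

/* Hypothetical helper that prints the widths this translation unit was
   compiled with. */
static void describe_compiled_layout(void)
{
    printf("pointer: %zu bytes, size_t: %zu bytes\n",
           sizeof(void *), sizeof(size_t));
}
```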

We could, in the future, support multiple targets, just as LLVM supports 
multiple targets with the same "clang" executable. This would require us to 
build a set of run-time libraries, one per target. Apart from the respective 
configury magic, this should not be difficult.
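
One possible way to pick among several prebuilt run-time libraries is sketched below in C. The file naming scheme ("kernel-<triple>.bc") and the function name are assumptions made for this example only and do not describe how pocl lays out its files.

```c
#include <stdio.h>

/* Builds the path of the kernel library for a given target triple,
   assuming a hypothetical layout of one bitcode file per triple.
   Returns 0 on success, or -1 if the buffer is too small. */
static int example_kernel_library_path(char *buf, size_t buflen,
                                       const char *libdir,
                                       const char *triple)
{
    int n = snprintf(buf, buflen, "%s/kernel-%s.bc", libdir, triple);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

The build would then have to produce one such file for each enabled target, which is the configury part mentioned above.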

-erik

-- 
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/

My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu/.


_______________________________________________
pocl-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pocl-devel
