This is a patch kit that adds the nvptx port to gcc. It contains preliminary patches to add needed functionality, the target files, and one somewhat optional patch with additional target tools. There'll be two more patch series: one for the testsuite, and one to make the offload functionality work with this port. Also required are the previous four rtl patches, two of which haven't been entirely approved yet.

For the moment, I've stripped out all the address space support, which got bogged down in review because of brokenness in our representation of address spaces. The ptx address spaces are of course still defined and used inside the backend.

Ptx really isn't a usual target - it is a virtual target which is then translated by another compiler (ptxas) to the final code that runs on the GPU. There are many restrictions, some imposed by the GPU hardware, and some by the fact that not everything you'd want can be represented in ptx. Here are some of the highlights:
 * Everything is typed - variables, functions, registers. This can
   cause problems with K&R style C or anything else that doesn't
   have a proper type internally.
 * Declarations are needed, even for undefined variables.
 * Can't emit initializers referring to their variable's address since
   you can't write forward declarations for variables.
 * Variables can be declared only as scalars or arrays, not
   structures. Initializers must be in the variable's declared type,
   which requires some code in the backend, and it means that packed
   pointer values are not representable.
 * Since it's a virtual target, we skip register allocation - no good
   can probably come from doing that twice. This means asm statements
   aren't fixed up and will fail if they use matching constraints.
 * No support for indirect jumps, label values, nonlocal gotos.
 * No alloca - ptx defines it, but it's not implemented.
 * No trampolines.
 * No debugging (at all, for now - we may add line number directives).
 * Limited C library support - I have a hacked up copy of newlib
   that provides a reasonable subset.
 * malloc and free are defined by ptx (these appear to be
   undocumented), but there isn't a realloc. I have one patch for
   Fortran to use a malloc/memcpy helper function in cases where we
   know the old size.
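
As a rough sketch of the malloc/memcpy idea in the last item: the function name and signature below are made up for illustration and are not taken from the Fortran patch. It only assumes that malloc, memcpy and free are available and that the caller knows the size of the old allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (name and signature are illustrative only):
   emulate realloc using just malloc, memcpy and free, given that the
   caller knows the size of the old allocation.  */
void *
realloc_known_size (void *old_ptr, size_t old_size, size_t new_size)
{
  /* A null pointer must come with a zero old size.  */
  assert (old_ptr != NULL || old_size == 0);

  if (new_size == 0)
    {
      free (old_ptr);
      return NULL;
    }

  void *new_ptr = malloc (new_size);
  if (new_ptr == NULL)
    return NULL;  /* Like realloc, leave the old block untouched.  */

  /* Copy only the bytes present in both the old and the new block.  */
  size_t n = old_size < new_size ? old_size : new_size;
  if (n > 0)
    memcpy (new_ptr, old_ptr, n);

  free (old_ptr);
  return new_ptr;
}
```

A caller that tracks its own sizes would use realloc_known_size (p, old_size, new_size) where it would otherwise call realloc (p, new_size). Where the old size is not known, this approach doesn't apply.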

All in all, this is not intended to be used as a C (or any other source language) compiler. I've gone through a lot of effort to make it work reasonably well, but only in order to get sufficient test coverage from the testsuites. The intended use for this is only to build it as an offload compiler, and use it through OpenACC by way of lto1. That leaves the question of how we should document it - does it need the usual constraint and option documentation, given that users aren't expected to use any of it?

A slightly earlier version of the entire patch kit was bootstrapped and tested on x86_64-linux. Ok for trunk?


Bernd
