Document 'pass_postreload' vs. 'pass_late_compilation' (was: The nvptx port [4/11+] Post-RA pipeline)

2024-06-28 Thread Thomas Schwinge
Hi! Before we start looking into enabling certain 'pass_postreload' passes for nvptx, as we've been discussing in "nvptx vs. [PATCH] Add a late-combine pass [PR106594]", let's first document the (not quite obvious)

Flip the nvptx port to LRA (was: [PATCH] Turn on LRA on all targets)

2023-06-30 Thread Thomas Schwinge
>> can confirm there are no new regressions. Confirmed. Also, no change in nvptx target libraries built. As expected. >> Nvptx is unique in that it >> doesn't >> use register allocation, i.e. GCC's only TARGET_NO_REGISTER_ALLOCATION >> target, >>

Re: nvptx-tools and nvptx-newlib (was: The nvptx port [10/11+] Target files)

2015-02-18 Thread Jakub Jelinek
On Wed, Feb 18, 2015 at 09:50:15AM +0100, Thomas Schwinge wrote: > > What about multilibs, is newlib built for both -m32 and -m64, or just the > > default option? > > So far, we have concentrated only on the 64-bit x86_64 configuration; > 32-bit has several known issues to be resolved. >

Re: nvptx-tools and nvptx-newlib (was: The nvptx port [10/11+] Target files)

2015-02-18 Thread Thomas Schwinge
Hi! On Wed, 4 Feb 2015 10:43:14 +0100, Jakub Jelinek wrote: > On Mon, Feb 02, 2015 at 04:32:34PM +0100, Thomas Schwinge wrote: > > Hi! > > > > On Tue, 23 Dec 2014 19:49:35 +0100, I wrote: > > > On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt > > > wrote: > > > > The scripts (11/11) I've put

Re: The nvptx port [0/11+]

2015-02-18 Thread Thomas Schwinge
Hi! On Mon, 20 Oct 2014 16:17:56 +0200, Bernd Schmidt wrote: > This is a patch kit that adds the nvptx port to gcc. Committed to trunk in r220781: commit 0f7695734890f93fe58179e36ac2f41bf4147d78 Author: tschwinge Date: Wed Feb 18 08:01:03 2015 + nvptx-none: Disable the lto-plu

nvptx-none: Define empty GOMP_SELF_SPECS (was: The nvptx port [0/11+])

2015-02-17 Thread Thomas Schwinge
Hi! On Mon, 20 Oct 2014 16:17:56 +0200, Bernd Schmidt wrote: > This is a patch kit that adds the nvptx port to gcc. I wonder why we haven't been seeing this in our internal development branch -- maybe because on that branch we're still discarding more compiler options in the of

Re: nvptx-tools and nvptx-newlib (was: The nvptx port [10/11+] Target files)

2015-02-04 Thread Jakub Jelinek
On Mon, Feb 02, 2015 at 04:32:34PM +0100, Thomas Schwinge wrote: > Hi! > > On Tue, 23 Dec 2014 19:49:35 +0100, I wrote: > > On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt > > wrote: > > > The scripts (11/11) I've put up on github, along with a hacked up > > > newlib. These are at [...] > >

Re: nvptx-tools and nvptx-newlib (was: The nvptx port [10/11+] Target files)

2015-02-02 Thread Thomas Schwinge
Hi! On Tue, 23 Dec 2014 19:49:35 +0100, I wrote: > On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt > wrote: > > The scripts (11/11) I've put up on github, along with a hacked up > > newlib. These are at [...] > > They are likely to migrate to MentorEmbedded from bernds, but that had > > som

nvptx-tools and nvptx-newlib (was: The nvptx port [10/11+] Target files)

2014-12-23 Thread Thomas Schwinge
Hi! On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt wrote: > The scripts (11/11) I've put up on github, along with a hacked up > newlib. These are at > > https://github.com/bernds/nvptx-tools > https://github.com/bernds/nvptx-newlib > > They are likely to migrate to MentorEmbedded from bern

Re: The nvptx port [10/11+] Target files

2014-12-12 Thread Thomas Schwinge
Hi! On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt wrote: > I've now committed it, in the following form. > --- /dev/null > +++ b/gcc/config/nvptx/nvptx.h > @@ -0,0 +1,356 @@ > +#define ASM_OUTPUT_ALIGN(FILE, POWER) Committed to trunk in r218689: commit 61f8a1bd770ded96fcff88f3cbc426a23c4

Re: The nvptx port

2014-11-17 Thread Nathan Sidwell
On 11/14/14 10:43, Jeff Law wrote: On 11/14/14 04:09, Bernd Schmidt wrote: Hi Jakub, I have some questions about nvptx: 1) you've said that alloca isn't supported, but it seems Yes, it's unimplemented. There's an internal declaration for it but that seems to be as far as it goes, and that d

Re: The nvptx port

2014-11-17 Thread Nathan Sidwell
On 11/14/14 11:04, Jeff Law wrote: On 11/14/14 05:36, Jakub Jelinek wrote: So, for a warp, if some threads perform one branch of an if and other threads another one, all threads perform the first one first (with some maybe not doing anything), then all the threads the others (again, other threa

Re: The nvptx port

2014-11-14 Thread Jeff Law
On 11/14/14 05:36, Jakub Jelinek wrote: So, for a warp, if some threads perform one branch of an if and other threads another one, all threads perform the first one first (with some maybe not doing anything), then all the threads the others (again, other threads not doing anything)? Nobody ever

Re: The nvptx port

2014-11-14 Thread Jeff Law
On 11/14/14 04:39, Jakub Jelinek wrote: :(. So what other option one has to implement something like TLS, even using inline asm or similar? There is %tid, so perhaps indexing some array with %tid? The trouble with that is that some thread can do #pragma omp parallel again, and I bet the %tid

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 08:37:52AM -0800, Cesar Philippidis wrote: > On 11/14/2014 08:18 AM, Jakub Jelinek wrote: > > >> Also, keep in mind that PTX doesn't have a global TID. The user needs to > >> calculate it using ctaid/tid and friends. > > > > Ok. Is %gridid needed for that combo too? > >

Re: The nvptx port

2014-11-14 Thread Jeff Law
On 11/14/14 04:39, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 12:09:03PM +0100, Bernd Schmidt wrote: I have some questions about nvptx: 1) you've said that alloca isn't supported, but it seems to be wired up and uses the %alloca documented in the PTX manual, what is the issue with that

Re: The nvptx port

2014-11-14 Thread Jeff Law
On 11/14/14 04:09, Bernd Schmidt wrote: Hi Jakub, I have some questions about nvptx: 1) you've said that alloca isn't supported, but it seems to be wired up and uses the %alloca documented in the PTX manual, what is the issue with that? %alloca not being actually implemented by the

Re: The nvptx port

2014-11-14 Thread Cesar Philippidis
On 11/14/2014 08:18 AM, Jakub Jelinek wrote: >> Also, keep in mind that PTX doesn't have a global TID. The user needs to >> calculate it using ctaid/tid and friends. > > Ok. Is %gridid needed for that combo too? Eventually, probably. Currently, we're launching all of our kernels with cuLaunchKe

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 07:37:49AM -0800, Cesar Philippidis wrote: > > Hmm. It's worthwhile to keep in mind that GPU threads really behave > > somewhat differently from CPUs (they don't really execute > > independently); the OMP model may just be a poor match for the > > architecture in general. >

Re: The nvptx port

2014-11-14 Thread Cesar Philippidis
On 11/14/2014 04:12 AM, Bernd Schmidt wrote: - we'll need some synchronization primitives, I see atomic support is there, we need mutexes and semaphores I think, is that implementable using bar instruction? >>> >>> It's probably membar you need. >> >> That

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
On 11/14/2014 01:36 PM, Jakub Jelinek wrote: Any way to query those limits? Size of .shared memory, number of threads in warp, number of warps, etc.? I'd have to google most of that. There seems to be a WARP_SZ constant available in ptx to get the size of the warp. In OpenACC, are all work

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 01:12:40PM +0100, Bernd Schmidt wrote: > >:(. So what other option one has to implement something like TLS, even > >using inline asm or similar? There is %tid, so perhaps indexing some array > >with %tid? > > That ought to work. For performance you'd want that array in .s

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
I'm adding Thomas and Cesar to the Cc list, they may have more insight into CUDA library questions as I haven't really looked into that part all that much. On 11/14/2014 12:39 PM, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 12:09:03PM +0100, Bernd Schmidt wrote: I have some questions about n

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 12:09:03PM +0100, Bernd Schmidt wrote: > >I have some questions about nvptx: > >1) you've said that alloca isn't supported, but it seems > >to be wired up and uses the %alloca documented in the PTX > >manual, what is the issue with that? %alloca not being actually >

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
On 11/14/2014 11:01 AM, Jakub Jelinek wrote: On Fri, Nov 14, 2014 at 09:29:48AM +0100, Jakub Jelinek wrote: I have some questions about nvptx: Oh, and 5) I have noticed gcc doesn't generate the .uni suffixes anywhere, while llvm generates them; are those appropriate only when a function

Re: The nvptx port

2014-11-14 Thread Bernd Schmidt
uot;legacy" ptx code and I get the impression it's discouraged. (As an aside, there's a question of how to represent a different concept, gang-local memory, in gcc. That would be .shared memory. We're currently going with just using an internal attribute) 3) in assembly

Re: The nvptx port

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 09:29:48AM +0100, Jakub Jelinek wrote: > I have some questions about nvptx: Oh, and 5) I have noticed gcc doesn't generate the .uni suffixes anywhere, while llvm generates them; are those appropriate only when a function is guaranteed to be run unconditionally from th

The nvptx port

2014-11-14 Thread Jakub Jelinek
#x27;t the port just emit all DECL_THREAD_LOCAL_P variables into .local instead of .global address space? Would one need to convert those pointers to generic any way? I'm asking because e.g. libgomp uses __thread heavily and it would be nice to be able to use that. 3) in assembly

Re: The nvptx port [0/11+]

2014-11-12 Thread Jeff Law
On 11/12/14 05:34, Richard Biener wrote: Now that this has been committed - I notice that there is no entry in MAINTAINERS for the port. I propose Bernd. Well, ahead of you there. I proposed Bernd to the steering committee as the maintainer a little while ago. I need to go back and count v

Re: The nvptx port [0/11+]

2014-11-12 Thread Richard Biener
On Mon, Oct 20, 2014 at 4:17 PM, Bernd Schmidt wrote: > This is a patch kit that adds the nvptx port to gcc. It contains preliminary > patches to add needed functionality, the target files, and one somewhat > optional patch with additional target tools. There'll be more patch ser

Re: The nvptx port [10/11+] Target files

2014-11-10 Thread Mike Stump
On Nov 10, 2014, at 12:37 PM, H.J. Lu wrote: > I also checked in this patch to add missing braces in > gcc.dg/pr44194-1.c. Thanks.

Re: The nvptx port [10/11+] Target files

2014-11-10 Thread H.J. Lu
On Mon, Nov 10, 2014 at 12:04 PM, Jakub Jelinek wrote: > On Mon, Nov 10, 2014 at 05:19:57PM +0100, Bernd Schmidt wrote: >> commit 659744a99d815b168716b4460e32f6a21593e494 >> Author: Bernd Schmidt >> Date: Thu Nov 6 19:03:57 2014 +0100 > > Note, in r217301 you've committed a change to pr35468.c,

Re: The nvptx port [10/11+] Target files

2014-11-10 Thread H.J. Lu
On Mon, Nov 10, 2014 at 12:04 PM, Jakub Jelinek wrote: > On Mon, Nov 10, 2014 at 05:19:57PM +0100, Bernd Schmidt wrote: >> commit 659744a99d815b168716b4460e32f6a21593e494 >> Author: Bernd Schmidt >> Date: Thu Nov 6 19:03:57 2014 +0100 > > Note, in r217301 you've committed a change to pr35468.c,

Re: The nvptx port [10/11+] Target files

2014-11-10 Thread Jakub Jelinek
On Mon, Nov 10, 2014 at 05:19:57PM +0100, Bernd Schmidt wrote: > commit 659744a99d815b168716b4460e32f6a21593e494 > Author: Bernd Schmidt > Date: Thu Nov 6 19:03:57 2014 +0100 Note, in r217301 you've committed a change to pr35468.c, not mentioned in the ChangeLog, that uses no_const_addr_space e

Re: The nvptx port [10/11+] Target files

2014-11-10 Thread Bernd Schmidt
On 10/30/2014 12:35 AM, Jeff Law wrote: A "nit" -- Richard S. recently removed the need to include the "enum" for "enum machine_mode". I believe he had a script to handle the mundane parts of that change. Please make sure to update the nvptx port to conform to

the nvptx port

2014-11-07 Thread VandeVondele Joost
Hi Bernd, reading the patches, it seems like there is no mention of sm_35, only sm_30. So, I'm wondering what 'sub'targets will initially be supported, and if/how/when various processors will be selected. Thanks, Joost

Re: The nvptx port [8/11+] Write undefined decls.

2014-11-05 Thread Jeff Law
On 11/05/14 05:01, Bernd Schmidt wrote: On 10/22/2014 08:11 PM, Jeff Law wrote: I'm not going to insist you do this in the same way as the PA. That was a different era -- we had significant motivation to make things work in such a way that everything could be buried in the pa specific files. Th

Re: The nvptx port [10/11+] Target files

2014-11-05 Thread Bernd Schmidt
On 11/04/2014 05:51 PM, Bernd Schmidt wrote: On 11/04/2014 05:48 PM, Richard Henderson wrote: On 10/28/2014 03:56 PM, Bernd Schmidt wrote: +nvptx_ptx_type_from_mode (enum machine_mode mode, bool promote) +{ + switch (mode) +{ +case BLKmode: + return ".b8"; +case BImode: +

Re: The nvptx port [8/11+] Write undefined decls.

2014-11-05 Thread Bernd Schmidt
On 10/22/2014 08:11 PM, Jeff Law wrote: I'm not going to insist you do this in the same way as the PA. That was a different era -- we had significant motivation to make things work in such a way that everything could be buried in the pa specific files. That sometimes led to less than optimal app

Re: The nvptx port [10/11+] Target files

2014-11-04 Thread Bernd Schmidt
On 11/04/2014 05:48 PM, Richard Henderson wrote: On 10/28/2014 03:56 PM, Bernd Schmidt wrote: +nvptx_ptx_type_from_mode (enum machine_mode mode, bool promote) +{ + switch (mode) +{ +case BLKmode: + return ".b8"; +case BImode: + return ".pred"; +case QImode: + if (

Re: The nvptx port [10/11+] Target files

2014-11-04 Thread Richard Henderson
On 10/28/2014 03:56 PM, Bernd Schmidt wrote: > +nvptx_ptx_type_from_mode (enum machine_mode mode, bool promote) > +{ > + switch (mode) > +{ > +case BLKmode: > + return ".b8"; > +case BImode: > + return ".pred"; > +case QImode: > + if (promote) > + return ".u32";

Re: The nvptx port [1/11+] indirect jumps

2014-11-04 Thread Richard Henderson
On 11/04/2014 04:32 PM, Bernd Schmidt wrote: > On 10/20/2014 04:19 PM, Bernd Schmidt wrote: >> ptx doesn't have indirect jumps, so CODE_FOR_indirect_jump may not be >> defined. Add a sorry. > > Looking back through all the mails it turns out this one wasn't approved yet. > Ping? Ok. r~

Re: The nvptx port [1/11+] indirect jumps

2014-11-04 Thread Bernd Schmidt
On 10/20/2014 04:19 PM, Bernd Schmidt wrote: ptx doesn't have indirect jumps, so CODE_FOR_indirect_jump may not be defined. Add a sorry. Looking back through all the mails it turns out this one wasn't approved yet. Ping? Bernd

Re: The nvptx port [11/11] More tools.

2014-11-03 Thread Jeff Law
On 10/31/14 17:50, Bernd Schmidt wrote: On 10/31/2014 09:56 PM, Jeff Law wrote: Pondering this a bit more, I think this is fine in concept. As you note, removing the GNU extensions or at least making them conditional would be good since these are going to be built with the host tools. I'm not

Re: The nvptx port [11/11] More tools.

2014-10-31 Thread Bernd Schmidt
On 10/31/2014 09:56 PM, Jeff Law wrote: Pondering this a bit more, I think this is fine in concept. As you note, removing the GNU extensions or at least making them conditional would be good since these are going to be built with the host tools. I'm not going to dig into the implementations...

Re: The nvptx port [11/11] More tools.

2014-10-31 Thread Jeff Law
On 10/20/14 08:48, Bernd Schmidt wrote: This is a "bonus" optional patch which adds ar, ranlib, as and ld to the ptx port. This is not proper binutils; ar and ranlib are just linked to the host versions, and the other two tools have the following functions: * nvptx-as is required to convert the

Re: The nvptx port [10/11+] Target files

2014-10-29 Thread Jeff Law
On 10/29/14 17:55, Bernd Schmidt wrote: Thanks! I've pinged some of the preliminary patches that went unapproved up to this point. Thanks. One leftover issue, discussed in the [0/11] mail - what amount of documentation is appropriate for this, given that we don't want to support using this a

Re: The nvptx port [10/11+] Target files

2014-10-29 Thread Bernd Schmidt
On 10/30/2014 12:35 AM, Jeff Law wrote: A "nit" -- Richard S. recently removed the need to include the "enum" for "enum machine_mode". I believe he had a script to handle the mundane parts of that change. Please make sure to update the nvptx port to conform to

Re: The nvptx port [10/11+] Target files

2014-10-29 Thread Jeff Law
le. * config/nvptx/free.asm: New file. * config/nvptx/malloc.asm: New file. * config/nvptx/realloc.c: New file. A "nit" -- Richard S. recently removed the need to include the "enum" for "enum machine_mode". I believe he had a script to handle the mundane p

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-29 Thread Jeff Law
On 10/28/14 08:49, Bernd Schmidt wrote: On 10/22/2014 08:12 PM, Jeff Law wrote: Yea, let's keep your approach. Just wanted to explore a bit since the PA seems to have a variety of similar characteristics. Here's an updated version of the patch. I experimented a little with ptx calling convent

Re: The nvptx port [10/11+] Target files

2014-10-28 Thread Bernd Schmidt
On 10/22/2014 08:01 PM, Jeff Law wrote: Please make sure all the functions in nvptx.c have function comments. Done, and replaced regno 4 with NVPTX_RETURN_REGNUM. +const char * +nvptx_output_call_insn (rtx insn, rtx result, rtx callee) If possible, promote first argument to rtx_insn *. Als

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-28 Thread Bernd Schmidt
On 10/22/2014 08:12 PM, Jeff Law wrote: Yea, let's keep your approach. Just wanted to explore a bit since the PA seems to have a variety of similar characteristics. Here's an updated version of the patch. I experimented a little with ptx calling conventions and ran into an arg that had to be

Re: The nvptx port [11/11] More tools.

2014-10-24 Thread Jeff Law
On 10/22/14 15:11, Bernd Schmidt wrote: On 10/22/2014 10:31 PM, Jeff Law wrote: These tools currently require GNU extensions - something I probably ought to fix if we decide to add them to the gcc build itself. Would these be more appropriate in binutils? I don't think so, given that we don't

Re: The nvptx port [11/11] More tools.

2014-10-22 Thread Bernd Schmidt
On 10/22/2014 10:31 PM, Jeff Law wrote: These tools currently require GNU extensions - something I probably ought to fix if we decide to add them to the gcc build itself. Would these be more appropriate in binutils? I don't think so, given that we don't need any piece of regular binutils. The

Re: The nvptx port [11/11] More tools.

2014-10-22 Thread Jeff Law
On 10/20/14 08:48, Bernd Schmidt wrote: This is a "bonus" optional patch which adds ar, ranlib, as and ld to the ptx port. This is not proper binutils; ar and ranlib are just linked to the host versions, and the other two tools have the following functions: * nvptx-as is required to convert the

Re: The nvptx port [8/11+] Write undefined decls.

2014-10-22 Thread Jeff Law
On 10/21/14 16:15, Bernd Schmidt wrote: On 10/22/2014 12:05 AM, Jeff Law wrote: On 10/20/14 14:30, Bernd Schmidt wrote: ptx assembly requires that declarations are written for undefined variables. This adds that functionality. Does this need to happen at the use site, or can it be deferred?

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-22 Thread Jeff Law
On 10/21/14 16:06, Bernd Schmidt wrote: On 10/21/2014 11:53 PM, Jeff Law wrote: So, in the end I'm torn. I don't like adding new hooks when they're not needed, but I have some reservations about relying on the order of stuff in CALL_INSN_FUNCTION_USAGE and I worry a bit that you might end up w

Re: The nvptx port [10/11+] Target files

2014-10-22 Thread Jeff Law
On 10/20/14 08:33, Bernd Schmidt wrote: These are the main target files for the ptx port. t-nvptx is empty for now but will grow some content with follow up patches. Bernd 010-target.diff * configure.ac: Allow configuring lto for nvptx. * configure: Regenerate. gcc/

Re: The nvptx port [1/11+] indirect jumps

2014-10-22 Thread Jakub Jelinek
On Wed, Oct 22, 2014 at 12:02:16PM +0200, Richard Biener wrote: > > I'm not sure that's what you're suggesting, but at least on non-shared > > memory offloading devices, you can't switch arbitrarily between > > offloading device(s) and host-fallback, for you have to do data > > management between t

Re: The nvptx port [1/11+] indirect jumps

2014-10-22 Thread Richard Biener
On Wed, Oct 22, 2014 at 10:34 AM, Thomas Schwinge wrote: > Hi! > > On Wed, 22 Oct 2014 10:18:49 +0200, Richard Biener > wrote: >> On Tue, Oct 21, 2014 at 11:32 PM, Bernd Schmidt >> wrote: >> > On 10/21/2014 11:30 PM, Jakub Jelinek wrote: >> >> >> >> At least for OpenMP, the best would be if th

Re: The nvptx port [1/11+] indirect jumps

2014-10-22 Thread Thomas Schwinge
Hi! On Wed, 22 Oct 2014 10:18:49 +0200, Richard Biener wrote: > On Tue, Oct 21, 2014 at 11:32 PM, Bernd Schmidt > wrote: > > On 10/21/2014 11:30 PM, Jakub Jelinek wrote: > >> > >> At least for OpenMP, the best would be if the #pragma omp target regions > >> and/or #pragma omp declare target fu

Re: The nvptx port [1/11+] indirect jumps

2014-10-22 Thread Jakub Jelinek
On Wed, Oct 22, 2014 at 10:18:49AM +0200, Richard Biener wrote: > On Tue, Oct 21, 2014 at 11:32 PM, Bernd Schmidt > wrote: > > On 10/21/2014 11:30 PM, Jakub Jelinek wrote: > >> > >> At least for OpenMP, the best would be if the #pragma omp target regions > >> and/or #pragma omp declare target fun

Re: The nvptx port [1/11+] indirect jumps

2014-10-22 Thread Richard Biener
On Tue, Oct 21, 2014 at 11:32 PM, Bernd Schmidt wrote: > On 10/21/2014 11:30 PM, Jakub Jelinek wrote: >> >> At least for OpenMP, the best would be if the #pragma omp target regions >> and/or #pragma omp declare target functions contain anything a particular >> offloading accelerator can't handle,

Re: The nvptx port [8/11+] Write undefined decls.

2014-10-21 Thread Bernd Schmidt
On 10/22/2014 12:05 AM, Jeff Law wrote: On 10/20/14 14:30, Bernd Schmidt wrote: ptx assembly requires that declarations are written for undefined variables. This adds that functionality. Does this need to happen at the use site, or can it be deferred? This is independent of use sites. The pat

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 11:53 PM, Jeff Law wrote: So, in the end I'm torn. I don't like adding new hooks when they're not needed, but I have some reservations about relying on the order of stuff in CALL_INSN_FUNCTION_USAGE and I worry a bit that you might end up with stuff other than arguments on that li

Re: The nvptx port [9/11+] Epilogues

2014-10-21 Thread Jeff Law
On 10/20/14 14:32, Bernd Schmidt wrote: We skip the late compilation passes on ptx, but there's one piece we do need - fixing up the function so that we get return insns in the right places. This patch just makes thread_prologue_and_epilogue_insns callable from the reorg pass. Bernd 009-proep.

Re: The nvptx port [8/11+] Write undefined decls.

2014-10-21 Thread Jeff Law
On 10/20/14 14:30, Bernd Schmidt wrote: ptx assembly requires that declarations are written for undefined variables. This adds that functionality. Bernd 008-undefdecl.diff gcc/ * target.def (assemble_undefined_decl): New hooks. * hooks.c (hook_void_FILEptr_constcharp

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-21 Thread Jeff Law
On 10/21/14 21:29, Bernd Schmidt wrote: A normal call looks like { .param.u32 %retval_in; .param.u64 %out_arg0; st.param.u64 [%out_arg0], %r1400; call (%retval_in), PopCnt, (%out_arg0); ld.param.u32%r1403, [%retval_in]; } which declares local variables for the args and retur

Re: The nvptx port [1/11+] indirect jumps

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 11:30 PM, Jakub Jelinek wrote: At least for OpenMP, the best would be if the #pragma omp target regions and/or #pragma omp declare target functions contain anything a particular offloading accelerator can't handle, instead of failing the whole compilation perhaps just emit some at l

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 11:11 PM, Jeff Law wrote: On 10/20/14 14:29, Bernd Schmidt wrote: In ptx assembly we need to decorate call insns with the arguments that are being passed. We also need to know the exact function type. This is kind of hard to do with the existing infrastructure since things like fun

Re: The nvptx port [1/11+] indirect jumps

2014-10-21 Thread Jakub Jelinek
On Tue, Oct 21, 2014 at 11:00:35PM +0200, Bernd Schmidt wrote: > On 10/21/2014 08:26 PM, Jeff Law wrote: > >>* optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a > >>sorry if necessary. > >So doesn't this imply no hot-cold partitioning since we use indirect > >jumps to get ac

Re: The nvptx port [7/11+] Inform the port about call arguments

2014-10-21 Thread Jeff Law
On 10/20/14 14:29, Bernd Schmidt wrote: In ptx assembly we need to decorate call insns with the arguments that are being passed. We also need to know the exact function type. This is kind of hard to do with the existing infrastructure since things like function_arg are called at other times rathe

Re: The nvptx port [1/11+] indirect jumps

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 08:26 PM, Jeff Law wrote: * optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a sorry if necessary. So doesn't this imply no hot-cold partitioning since we use indirect jumps to get across the partition? Similarly doesn't this imply other missing features (se

Re: The nvptx port [6/11+] Pseudo call args

2014-10-21 Thread Jeff Law
On 10/20/14 14:26, Bernd Schmidt wrote: On ptx, we'll be using pseudos to pass function args as well, and there's one assert that needs to be toned town to make that work. Bernd 006-usereg.diff gcc/ * expr.c (use_reg_mode): Just return for pseudo registers. OK. I pondered

Re: The nvptx port [5/11+] Variable declarations

2014-10-21 Thread Jeff Law
On 10/20/14 14:25, Bernd Schmidt wrote: ptx assembly follows rather different rules than what's typical elsewhere. We need a new hook to add a " };" string when we are finished outputting a variable with an initializer. Bernd 005-declend.diff gcc/ * target.def (decl_end): Ne

Re: The nvptx port [4/11+] Post-RA pipeline

2014-10-21 Thread Jeff Law
On 10/20/14 14:24, Bernd Schmidt wrote: This stops most of the post-regalloc passes to be run if the target doesn't want register allocation. I'd previously moved them all out of postreload to the toplevel, but Jakub (I think) pointed out that the idea is not to run them to avoid crashes if reloa

Re: The nvptx port [3/11+] Struct returns

2014-10-21 Thread Jeff Law
On 10/20/14 14:22, Bernd Schmidt wrote: Even when returning a structure by passing an invisible reference, gcc still likes to set the return register to the address of the struct. This is undesirable on ptx where things like the return register have to be declared, and the function really returns

Re: The nvptx port [2/11+] No register allocation

2014-10-21 Thread Jeff Law
On 10/20/14 14:20, Bernd Schmidt wrote: Since it's a virtual target, I've chosen not to run register allocation. This is one of the patches necessary to make that work, it primarily adds a target hook to disable it and fixes some of the fallout. Bernd 002-noregalloc.diff gcc/

Re: The nvptx port [1/11+] indirect jumps

2014-10-21 Thread Jeff Law
On 10/20/14 14:19, Bernd Schmidt wrote: ptx doesn't have indirect jumps, so CODE_FOR_indirect_jump may not be defined. Add a sorry. Bernd 001-indjumps.diff gcc/ * optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a sorry if necessary. So doesn't this im

Re: The nvptx port [0/11+]

2014-10-21 Thread Richard Biener
On Tue, Oct 21, 2014 at 12:53 PM, Bernd Schmidt wrote: > On 10/21/2014 10:18 AM, Richard Biener wrote: >> >> So with this restriction I wonder why it didn't make sense to go the >> HSA "backend" route emitting PTX from a GIMPLE SSA pass. This >> would have avoided the LTO dance as well ... > > >

Re: The nvptx port [0/11+]

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 10:42 AM, Jakub Jelinek wrote: On Mon, Oct 20, 2014 at 04:17:56PM +0200, Bernd Schmidt wrote: * Can't emit initializers referring to their variable's address since you can't write forward declarations for variables. Can't that be handled by emitting the initializer without

Re: The nvptx port [0/11+]

2014-10-21 Thread Bernd Schmidt
On 10/21/2014 10:18 AM, Richard Biener wrote: So with this restriction I wonder why it didn't make sense to go the HSA "backend" route emitting PTX from a GIMPLE SSA pass. This would have avoided the LTO dance as well ... Quite simple - there isn't an established way to do this. If I'd known

Re: The nvptx port [0/11+]

2014-10-21 Thread Jakub Jelinek
On Mon, Oct 20, 2014 at 04:17:56PM +0200, Bernd Schmidt wrote: > * Can't emit initializers referring to their variable's address since >you can't write forward declarations for variables. Can't that be handled by emitting the initializer without the address and some constructor that fixes up

Re: The nvptx port [0/11+]

2014-10-21 Thread Richard Biener
On Mon, Oct 20, 2014 at 4:17 PM, Bernd Schmidt wrote: > This is a patch kit that adds the nvptx port to gcc. It contains preliminary > patches to add needed functionality, the target files, and one somewhat > optional patch with additional target tools. There'll be more patch ser

Re: The nvptx port [11/11] More tools.

2014-10-20 Thread Joseph S. Myers
On Mon, 20 Oct 2014, Bernd Schmidt wrote: > These tools currently require GNU extensions - something I probably ought to > fix if we decide to add them to the gcc build itself. And as regards library use, I'd expect the sources to start with #includes of config.h and system.h (and so not include

The nvptx port [11/11] More tools.

2014-10-20 Thread Bernd Schmidt
This is a "bonus" optional patch which adds ar, ranlib, as and ld to the ptx port. This is not proper binutils; ar and ranlib are just linked to the host versions, and the other two tools have the following functions: * nvptx-as is required to convert the compiler output to actual valid ptx a

The nvptx port [10/11+] Target files

2014-10-20 Thread Bernd Schmidt
These are the main target files for the ptx port. t-nvptx is empty for now but will grow some content with follow up patches. Bernd * configure.ac: Allow configuring lto for nvptx. * configure: Regenerate. gcc/ * config/nvptx/nvptx.c: New file. * config/nvptx/nvptx.h: New file. * confi

The nvptx port [9/11+] Epilogues

2014-10-20 Thread Bernd Schmidt
We skip the late compilation passes on ptx, but there's one piece we do need - fixing up the function so that we get return insns in the right places. This patch just makes thread_prologue_and_epilogue_insns callable from the reorg pass. Bernd gcc/ * function.c (thread_prologue_and_epilogue

The nvptx port [8/11+] Write undefined decls.

2014-10-20 Thread Bernd Schmidt
ptx assembly requires that declarations are written for undefined variables. This adds that functionality. Bernd gcc/ * target.def (assemble_undefined_decl): New hooks. * hooks.c (hook_void_FILEptr_constcharptr_const_tree): New function. * hooks.h (hook_void_FILEptr_constcharptr_const_tree

The nvptx port [7/11+] Inform the port about call arguments

2014-10-20 Thread Bernd Schmidt
In ptx assembly we need to decorate call insns with the arguments that are being passed. We also need to know the exact function type. This is kind of hard to do with the existing infrastructure since things like function_arg are called at other times rather than just when emitting a call, so t

The nvptx port [6/11+] Pseudo call args

2014-10-20 Thread Bernd Schmidt
On ptx, we'll be using pseudos to pass function args as well, and there's one assert that needs to be toned town to make that work. Bernd gcc/ * expr.c (use_reg_mode): Just return for pseudo registers. Index: gcc/expr.

The nvptx port [5/11+] Variable declarations

2014-10-20 Thread Bernd Schmidt
ptx assembly follows rather different rules than what's typical elsewhere. We need a new hook to add a " };" string when we are finished outputting a variable with an initializer. Bernd gcc/ * target.def (decl_end): New hook. * varasm.c (assemble_variable_contents, assemble_constant_conten

The nvptx port [4/11+] Post-RA pipeline

2014-10-20 Thread Bernd Schmidt
This stops most of the post-regalloc passes to be run if the target doesn't want register allocation. I'd previously moved them all out of postreload to the toplevel, but Jakub (I think) pointed out that the idea is not to run them to avoid crashes if reload fails e.g. for an invalid asm. So I'

Re: The nvptx port [3/11+] Struct returns

2014-10-20 Thread Bernd Schmidt
Even when returning a structure by passing an invisible reference, gcc still likes to set the return register to the address of the struct. This is undesirable on ptx where things like the return register have to be declared, and the function really returns void at ptx level. I've added a targe

The nvptx port [2/11+] No register allocation

2014-10-20 Thread Bernd Schmidt
Since it's a virtual target, I've chosen not to run register allocation. This is one of the patches necessary to make that work, it primarily adds a target hook to disable it and fixes some of the fallout. Bernd gcc/ * target.def (no_register_allocation): New data hook. * doc/tm.texi.in: A

The nvptx port [1/11+] indirect jumps

2014-10-20 Thread Bernd Schmidt
ptx doesn't have indirect jumps, so CODE_FOR_indirect_jump may not be defined. Add a sorry. Bernd gcc/ * optabs.c (emit_indirect_jump): Test HAVE_indirect_jump and emit a sorry if necessary. Index: gcc/optabs.c ===

The nvptx port [2/11+] No register allocation

2014-10-20 Thread Bernd Schmidt
Since it's a virtual target, I've chosen not to run register allocation. This is one of the patches necessary to make that work, it primarily adds a target hook to disable it and fixes some of the fallout. Bernd

The nvptx port [0/11+]

2014-10-20 Thread Bernd Schmidt
This is a patch kit that adds the nvptx port to gcc. It contains preliminary patches to add needed functionality, the target files, and one somewhat optional patch with additional target tools. There'll be more patch series, one for the testsuite, and one to make the offload functionality