[gomp4] Re: [2/3] OpenACC reductions

2015-11-06 Thread Thomas Schwinge
Hi Nathan!
On Wed, 4 Nov 2015 11:59:28 -0500, Nathan Sidwell wrote:
> [PTX backend pieces of OpenACC reduction handling]
Merged your trunk r229768 into gomp-4_0-branch in r229836:
commit 089a0224af68e30b55f42734de48adc645eb7370
Merge: 2b76127 78a78aa
Author: tschwinge

Re: [2/3] OpenACC reductions

2015-11-04 Thread Nathan Sidwell
On 11/04/15 08:27, Bernd Schmidt wrote:
> On 11/02/2015 05:35 PM, Nathan Sidwell wrote:
> There are two such switch statements, and it's possible to write this more compactly:
>   if (!INTEGRAL_MODE_P (...)) code = VIEW_CONVERT_EXPR;
>   if (GET_MODE_SIZE (...) == 8) fn = CMP_SWAPLL;
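The compaction suggested in the review can be illustrated outside GCC. The following is only a sketch under stated assumptions: the enums, `struct swap_choice`, and `choose_swap` are hypothetical stand-ins, and the defaults `NOP_EXPR` and `CMP_SWAP` are assumed for illustration; only `VIEW_CONVERT_EXPR` and `CMP_SWAPLL` are named in the thread. It shows two independent tests overriding defaults instead of a switch over every mode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's tree codes and the PTX builtin
   selectors.  Only VIEW_CONVERT_EXPR and CMP_SWAPLL appear in the
   thread; the other enumerators are assumed defaults.  */
enum conv_code { NOP_EXPR, VIEW_CONVERT_EXPR };
enum swap_fn { CMP_SWAP, CMP_SWAPLL };

struct swap_choice
{
  enum conv_code code;
  enum swap_fn fn;
};

/* Start from the assumed defaults (integral type, narrower than 8 bytes)
   and override each property with its own test, rather than
   enumerating every mode in a switch.  */
static struct swap_choice
choose_swap (bool integral_p, size_t mode_size)
{
  struct swap_choice c = { NOP_EXPR, CMP_SWAP };

  if (!integral_p)
    c.code = VIEW_CONVERT_EXPR;
  if (mode_size == 8)
    c.fn = CMP_SWAPLL;
  return c;
}
```

Because the two properties are independent, an 8-byte floating-point value would get both overrides, while a 4-byte integer would keep both defaults.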

Re: [2/3] OpenACC reductions

2015-11-04 Thread Bernd Schmidt
On 11/02/2015 05:35 PM, Nathan Sidwell wrote:
> +/* Size of buffer needed for worker reductions. This has to be
Maybe "description" rather than "Size" since there are really four variables we're covering with the comment.
> + worker_red_size = (worker_red_size + worker_red_align - 1) +
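The quoted assignment is cut off, but its visible part matches the common idiom for rounding a size up to an alignment boundary. As a minimal standalone sketch, assuming the alignment is a power of two and that the rest of the expression masks off the low bits (neither is confirmed by the snippet), with `round_up` being a hypothetical helper that is not part of the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the patch: round SIZE up to the next
   multiple of ALIGN.  ALIGN must be a nonzero power of two, which is
   what makes the mask below valid.  */
static size_t
round_up (size_t size, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  /* Adding align - 1 carries past the next boundary unless SIZE is
     already aligned; clearing the low bits then lands on the boundary.  */
  return (size + align - 1) & ~(align - 1);
}
```

For instance, `round_up (13, 8)` yields 16, and an already aligned size such as `round_up (16, 8)` is returned unchanged.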

Re: [2/3] OpenACC reductions

2015-11-04 Thread Nathan Sidwell
On 11/04/15 05:01, Jakub Jelinek wrote:
> On Mon, Nov 02, 2015 at 11:35:34AM -0500, Nathan Sidwell wrote:
> > 2015-11-02 Nathan Sidwell
> > Cesar Philippidis
> >
> > * config/nvptx/nvptx.c: Include gimple headers.

Re: [2/3] OpenACC reductions

2015-11-04 Thread Nathan Sidwell
On 11/04/15 08:27, Bernd Schmidt wrote:
Adjusted and applied, thanks!
nathan
2015-11-04 Nathan Sidwell
Cesar Philippidis
* config/nvptx/nvptx.c: Include gimple headers.
(worker_red_size, worker_red_align, worker_red_name,

Re: [2/3] OpenACC reductions

2015-11-04 Thread Jakub Jelinek
On Mon, Nov 02, 2015 at 11:35:34AM -0500, Nathan Sidwell wrote:
> 2015-11-02 Nathan Sidwell
> Cesar Philippidis
>
> * config/nvptx/nvptx.c: Include gimple headers.
> (worker_red_size, worker_red_align, worker_red_name,

Re: [2/3] OpenACC reductions

2015-11-02 Thread Nathan Sidwell
This patch contains the PTX backend pieces of OpenACC reduction handling. These functions are lowered to gimple, using a couple of PTX-specific builtins for some functionality. Expansion to RTL introduced no new patterns. We need 3 different schemes for the 3 different partitioning axes, but