This is the third version of the patchset previously posted on September
5th and November 16th.  As before, the series contains the
non-OpenACC/OpenMP portions of a port to AMD GCN3 and GCN5 GPU
processors.  It's sufficient to build single-threaded programs, with
vectorization in the usual way.  C and Fortran are supported; C++ is
not, and the other front-ends have not been tested.  The
OpenACC/OpenMP/libgomp portion will follow once this series has been
committed.

Compared to the v2 patchset, patch 1, "Fix IRA ICE", has been dropped,
and a new, unrelated, patch 1 has been added: "Fix LRA bug".

The IRA issue has now been solved by reworking the move instructions in
the back-end so that they no longer require explicit mention of the EXEC
register (this is now managed mostly by the md_reorg pass).  I also took
the opportunity to rework the EXEC use throughout the machine
description (something I've been wanting to get to for ages); the
primary instruction patterns no longer use vec_merge, and there are
"_exec" variants defined (mostly via define_subst) for the use of
specific expanders and so that combine can optimize conditional vector
moves.
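
As a rough illustration only (this is a C model of the semantics, not
code from the patches), a conditional vector move under EXEC behaves
like a per-lane merge: lanes whose EXEC bit is set take the new value,
and the other lanes keep what they had.  The names masked_move and LANES
are made up for this sketch; the width of 64 assumes one 64-lane
wavefront with a 64-bit EXEC mask.

```c
#include <stdint.h>

#define LANES 64  /* assumption: one 64-lane wavefront, 64-bit EXEC */

/* Hypothetical model of a masked ("_exec") vector move.  Lanes whose
   EXEC bit is set receive the source element; all other lanes retain
   their previous destination value.  */
static void
masked_move (int32_t dest[LANES], const int32_t src[LANES], uint64_t exec)
{
  for (int lane = 0; lane < LANES; lane++)
    if ((exec >> lane) & 1)
      dest[lane] = src[lane];
}
```

With exec set to all ones this degenerates to a plain full-vector move,
which is the case the primary (non-"_exec") patterns describe.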

Additionally, the patterns that choose which unit to use for
scalar operations now only clobber the relevant condition register (via
a match_scratch), not both of them.

The new LRA issue was exposed by the above changes, but would affect any
target where patterns referring to an eliminable register might also
include a "scratch" register.

I've also addressed the feedback I received from the reviewers of the
previous postings.

-- 
Andrew Stubbs
Mentor Graphics / CodeSourcery
