gcc-6-20170316 is now available
Snapshot gcc-6-20170316 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20170316/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch revision 246212

You'll find:

 gcc-6-20170316.tar.bz2               Complete GCC

  SHA256=64a7e07bb163df01713c19526b34696d55966bfff3d6f6362edae2f17e4937cf
  SHA1=e5c5391596e97ccad9bc8ee6241f95662041539c

Diffs from 6-20170309 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
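A downloaded snapshot can be verified against the published SHA256 value with `sha256sum -c`. The sketch below uses a throwaway stand-in file so it is self-contained; for the real check, substitute gcc-6-20170316.tar.bz2 and the SHA256 value from the announcement above.

```shell
# Checksum-verification sketch. /tmp/snapshot-demo.tar.bz2 is a local
# stand-in; replace the file name and hash with the tarball and SHA256
# from the announcement for a real verification.
printf 'stand-in tarball contents\n' > /tmp/snapshot-demo.tar.bz2
expected=$(sha256sum /tmp/snapshot-demo.tar.bz2 | cut -d' ' -f1)
# sha256sum -c reads "HASH  FILE" lines and prints "FILE: OK" on a match.
echo "$expected  /tmp/snapshot-demo.tar.bz2" | sha256sum -c -
```

A mismatch makes `sha256sum -c` print `FAILED` and exit non-zero, so the same pattern works in scripts.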
Re: Obsolete powerpc*-*-*spe*
On 16/03/2017 21:11, Segher Boessenkool wrote:
> >> The e200z3 upwards have SPE units. None of them have classic FP. So it
> >> would make most sense for the e200/VLE support to be part of the SPE
> >> backend rather than the classic PowerPC backend.
>
> Great to hear! And all e300 are purely "classic"?

That's one I'm less familiar with (as we don't deliver a multilib for
it), but yes - my understanding is that this is a classic core.

Andrew
Re: Obsolete powerpc*-*-*spe*
On Thu, Mar 16, 2017 at 08:38:37PM +, Andrew Jenner wrote:
> >> Are you proposing to take on the task of actually splitting it yourself?
> >> If so, that would make me a lot happier about it.
> >
> > Yes, I can do the mechanics. But I cannot do most of the testing.
>
> That's fine (and what I expected).
>
> > And this does not include any of the huge simplifications that can be
> > done after the split: both ports will be very close to what we have
> > now, immediately after the split.
>
> I'd have thought that the simplifications would be the bulk of the
> work...

The simplifications are not necessary to make things work. They can all
be done piecemeal, and later (we should do the split during early stage1
if possible).

I cannot promise you much of IBM's time (or my own abundant spare time),
but we will of course be available for advice and questions etc. It is
not like removing 20k or 30k lines is as much work as writing them, of
course ;-)

> The simplification of the classic PowerPC port would be the
> removal of the SPE code. What would be removed from the SPE port -
> anything other than Altivec and 64-bit?

Don't forget the other vector stuff, VSX. It is not small or simple.

> >> All the e200 cores apart from e200z0 can execute 32-bit instructions
> >> as well as VLE, though we'll always generate VLE code when targeting
> >> them (otherwise they're fairly standard).
> >
> > Do any e200 support SPE, or classic FP?
>
> The e200z3 upwards have SPE units. None of them have classic FP. So it
> would make most sense for the e200/VLE support to be part of the SPE
> backend rather than the classic PowerPC backend.

Great to hear! And all e300 are purely "classic"?

Segher
Re: Obsolete powerpc*-*-*spe*
Hi Segher,

On 16/03/2017 19:24, Segher Boessenkool wrote:
> e500mc (like e5500, e6500) are just PowerPC (and they use the usual
> ABIs), so those should stay on the "rs6000 side".

Agreed.

> > Are you proposing to take on the task of actually splitting it yourself?
> > If so, that would make me a lot happier about it.
>
> Yes, I can do the mechanics. But I cannot do most of the testing.

That's fine (and what I expected).

> And this does not include any of the huge simplifications that can be
> done after the split: both ports will be very close to what we have now,
> immediately after the split.

I'd have thought that the simplifications would be the bulk of the
work... The simplification of the classic PowerPC port would be the
removal of the SPE code. What would be removed from the SPE port -
anything other than Altivec and 64-bit?

> > All the e200 cores apart from e200z0 can execute 32-bit instructions
> > as well as VLE, though we'll always generate VLE code when targeting
> > them (otherwise they're fairly standard).
>
> Do any e200 support SPE, or classic FP?

The e200z3 upwards have SPE units. None of them have classic FP. So it
would make most sense for the e200/VLE support to be part of the SPE
backend rather than the classic PowerPC backend.

Andrew
Re: Obsolete powerpc*-*-*spe*
Hi Andrew,

On Wed, Mar 15, 2017 at 09:43:20PM +, Andrew Jenner wrote:
> On 15/03/2017 14:26, Segher Boessenkool wrote:
> > I do not think VLE can get in, not in its current shape at least.
>
> That's unfortunate. Disregarding the SPE splitting plan for a moment,
> what do you think would need to be done to get it into shape? I had
> thought we were almost there with the patches that I sent to you and
> David off-list last year.
>
> > VLE is very unlike PowerPC in many ways so it comes at a very big cost
> > to the port (maintenance and otherwise -- maintenance is what I care
> > about most).
>
> I completely understand.

That answers your previous question, too.

> > Since SPE and VLE only share the part of the rs6000 port that doesn't
> > change at all (except for a bug fix once or twice a year), and
> > everything else needs special cases all over the place, it seems to me
> > it would be best for everyone if we split the rs6000 port in two, one
> > for SPE and VLE and one for the rest. Both ports could then be very
> > significantly simplified.
> >
> > I am assuming SPE and VLE do not support AltiVec or 64-bit PowerPC,
> > please correct me if that is incorrect. Also, is "normal" floating
> > point supported at all?
>
> My understanding is that SPE is only present in the e500v1, e500v2 and
> e200z[3-7] cores, all of which are 32-bit only and do not have classic
> floating-point units. SPE and Altivec cannot coexist as they have some
> overlapping instruction encodings. The successor to e500v2 (e500mc)
> reinstated classic floating-point and got rid of SPE.

e500mc (like e5500, e6500) are just PowerPC (and they use the usual
ABIs), so those should stay on the "rs6000 side".

> > Do you (AdaCore and Mentor) think splitting the port is a good idea?
>
> It wouldn't have been my preference, but I can understand the appeal of
> that plan for you. I'm surprised that the amount of shared code between
> SPE and PowerPC is as little as you say, but you have much more
> experience with the PowerPC port than I do, so I'll defer to your
> expertise on that matter.
>
> Are you proposing to take on the task of actually splitting it yourself?
> If so, that would make me a lot happier about it.

Yes, I can do the mechanics. But I cannot do most of the testing.

And this does not include any of the huge simplifications that can be
done after the split: both ports will be very close to what we have now,
immediately after the split.

> >> -te200z0
> >> -te200z3
> >> -te200z4
> >
> > These are VLE?
>
> Yes.
>
> > Do some of those also support PowerPC?
>
> All the e200 cores apart from e200z0 can execute 32-bit instructions as
> well as VLE, though we'll always generate VLE code when targeting them
> (otherwise they're fairly standard).

Do any e200 support SPE, or classic FP?

Segher
GCN back-end branch
Hello,

after working on the GCN back-end in a private branch, we would like to
make it public and invite the community to have a look, comment, review
or even contribute. Therefore we have just pushed the current state of
the back-end to the git branch gcn (see
https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/gcn or fetch
it as any other git branch).

We have decided not to have ChangeLog.gcn files, but if you wish to
contribute, please make standard changelog entries part of commit
messages. Additionally, the basic git collaboration rules should apply;
most notably, make sure you do not do non-fast-forward pushes to the
branch, start your commit messages with one-line brief summaries, and so
forth. Any patches against the branch should be sent to gcc-patches,
and while I think that full-blown reviews are not necessary at this
stage, please coordinate with me and Honza before you commit anything.
I will be making regular merges from trunk.

At this point, the back-end can compile small kernels open-coded in C
with target-specific attributes, built-ins and address spaces to make
use of the various special characteristics of the architecture.
Eventually, it should of course provide for high-level programming
models, most notably OpenMP, but the list of steps we need to take
before we get there is very long.

The changelog of the branch's initial commit is below. Apart from a new
machine description, it also contains a few modifications to the
compiler proper, most of which are needed to increase the limit on the
size of scalar types and on the number of arguments of an instruction
(these are not strictly necessary now, but we have bumped into them
during development). We plan to commit generally useful generic changes
early in stage1.

So far we have tested the output of the branch only on AMD APUs; we
have not tested on discrete GPUs yet. To run the kernels, you need
quite a few more pieces in your software stack in addition to our
branch and the hardware. Most notably, you currently need:

1) an AMDGPU-LLVM-based assembler,
2) the amdphdrs utility from
   https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra, and
3-5) the ROCK kernel, the ROCT thunk interface library and the ROCR
   run-time library, which you can get from
   https://github.com/RadeonOpenCompute (or currently from
   http://download.opensuse.org/repositories/home:/jamborm:/roc-1.3/openSUSE_Tumbleweed
   if you use openSUSE Tumbleweed; I have packaged only version 1.3, but
   so far it has been sufficient).

The work-flow is that you configure the branch with
--target=amdgcn-unknown-amdhsa and use it to compile the kernel into
assembly, which you then feed to the llvm-mc amdgcn assembler; the
amdphdrs tool then converts the resultant object file into an AMD HSA
"code object" which the ROCR run time can load and execute.

Honza and I hope to come up with an article demonstrating what can
already be done with the branch soon, but that is clearly out of scope
of this already too long announcement. We plan to write a wiki page
with some examples and more detailed descriptions of some basic
problems with modeling GCN in GCC.

Thus, let me conclude by saying that I'm looking forward to taking on
the many challenges this architecture will present for GCC, and I would
like to invite everyone interested to help tackle them,

Martin


2017-03-10  Jan Hubicka
            Martin Jambor

	* config.sub: Added amdgcn cases.

gcc/
	* common/config/gcn/gcn-common.c: New file.
	* config/gcn/constraints.md: Likewise.
	* config/gcn/gcn-builtins.def: Likewise.
	* config/gcn/gcn-c.c: Likewise.
	* config/gcn/gcn-hsa.h: Likewise.
	* config/gcn/gcn-modes.def: Likewise.
	* config/gcn/gcn-protos.h: Likewise.
	* config/gcn/gcn-valu.md: Likewise.
	* config/gcn/gcn.c: Likewise.
	* config/gcn/gcn.h: Likewise.
	* config/gcn/gcn.md: Likewise.
	* config/gcn/gcn.opt: Likewise.
	* config/gcn/predicates.md: Likewise.
	* config/gcn/t-gcn-elf: Likewise.
	* ira.c (ira_init_register_move_cost): Also check
	contains_reg_of_mode.
	* combine.c (gen_lowpart_or_truncate): Return clobber if there is
	not an integer mode of the same size as x.
	(gen_lowpart_for_combine): Fail if there is no integer mode of the
	same size.
	* config.gcc: Added amdgcn cases.
	* emit-rtl.c (get_mem_align_offset): Return zero for overaligned
	memory.
	* explow.c (memory_address_addr_space): Call
	memory_address_addr_space if a representation by a single register
	is invalid.
	* expr.c (expand_expr_real_1): Disable converting operand to
	fields or BLK mode.
	* ira-costs.c (setup_allocno_class_and_costs): Do not assert that
	cost_classes_ptr->hard_regno_index is non-negative.
	* lra-constraints.c (process_alt_operands): Do not penalize
	constants.
	(curr_insn_transfor
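The work-flow described in the announcement (configure for amdgcn-unknown-amdhsa, compile to assembly, assemble with llvm-mc, convert with amdphdrs) can be sketched as a shell sequence. This is only an illustration: the source-tree path, kernel file name, and the exact llvm-mc and amdphdrs flags are assumptions and will vary with your LLVM and ROC versions; it requires the full toolchain from the announcement and is not runnable stand-alone.

```shell
# Hedged sketch of the GCN work-flow; all paths, file names, and
# assembler flags below are placeholders/assumptions.

# 1) Configure and build the compiler from the gcn branch.
../gcc-gcn/configure --target=amdgcn-unknown-amdhsa
make

# 2) Compile a kernel (kernel.c, a placeholder name) into GCN assembly.
amdgcn-unknown-amdhsa-gcc -S -o kernel.s kernel.c

# 3) Assemble with the LLVM amdgcn assembler (flags vary by LLVM version).
llvm-mc -triple=amdgcn--amdhsa -filetype=obj -o kernel.o kernel.s

# 4) Convert the object file into an AMD HSA "code object" that the
#    ROCR run time can load and execute.
amdphdrs kernel.o kernel.co
```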
Re: [RFC] Support register groups in inline asm
2017-03-16 9:50 GMT+01:00 Richard Biener :
> On Wed, 15 Mar 2017, Andrew Senkevich wrote:
>
>> 2016-12-05 16:31 GMT+01:00 Andrew Senkevich :
>> > 2016-11-16 8:02 GMT+03:00 Andrew Pinski :
>> >> On Tue, Nov 15, 2016 at 9:36 AM, Andrew Senkevich wrote:
>> >>> Hi,
>> >>>
>> >>> the new Intel instructions AVX512_4FMAPS and AVX512_4VNNIW introduce
>> >>> the use of register groups.
>> >>>
>> >>> To support the register-group feature in inline asm, some extension
>> >>> with new constraints is needed.
>> >>>
>> >>> The current proposal is the following syntax:
>> >>>
>> >>> __asm__ ("SMTH %[group], %[single]" :
>> >>>          [single] "+x"(v0) :
>> >>>          [group] "Yg4"(v1), "1+1"(v2), "1+2"(v3), "1+3"(v4));
>> >>>
>> >>> where the "YgN" constraint specifies a group of N consecutive
>> >>> registers (starting from a register whose number is 0 mod
>> >>> 2^ceil(log2(N))), and "1+K" specifies the next registers in the
>> >>> group.
>> >>>
>> >>> Is this syntax ok? How to implement it?
>> >>
>> >> Have you looked into how the AArch64 back-end handles this via OI,
>> >> etc.? Like:
>> >>
>> >> /* Oct Int: 256-bit integer mode needed for 32-byte vector arguments.  */
>> >> INT_MODE (OI, 32);
>> >>
>> >> /* Opaque integer modes for 3 or 4 Neon q-registers / 6 or 8 Neon
>> >>    d-registers (2 d-regs = 1 q-reg = TImode).  */
>> >> INT_MODE (CI, 48);
>> >> INT_MODE (XI, 64);
>> >>
>> >> And then it implements the TARGET_ARRAY_MODE_SUPPORTED_P target hook.
>> >> And the x2 types are defined as a struct of an array like:
>> >>
>> >> typedef struct int8x8x2_t
>> >> {
>> >>   int8x8_t val[2];
>> >> } int8x8x2_t;
>> >
>> > Thanks!
>> >
>> > We have to update the proposal, changing the "+" symbol to "#" for
>> > specifying the offset in a group (to avoid overloading the other
>> > meaning of "+", which specifies that an operand is both input and
>> > output).
>> >
>> > So the current proposal for the syntax is:
>> >
>> > __asm__ ("INSTR %[group], %[single]" :
>> >          [single] "+x"(v0) :
>> >          [group] "Yg4"(v1), "1#1"(v2), "1#2"(v3), "1#3"(v4));
>> >
>> > where the "YgN" constraint specifies a group of N consecutive
>> > registers (starting from a register whose number is 0 mod
>> > 2^ceil(log2(N))), and "1#K" specifies the next registers in the group.
>> >
>> > Some other questions or comments?
>> >
>> > What about consensus on this syntax?
>>
>> Hi Richard!
>>
>> Can we have agreement on this syntax, what do you think?
>
> I have no expertise / opinion here.

Hi Jeff, are you the proper person to ask?


--
WBR,
Andrew
Re: [RFC] Support register groups in inline asm
On Wed, 15 Mar 2017, Andrew Senkevich wrote:

> 2016-12-05 16:31 GMT+01:00 Andrew Senkevich :
> > 2016-11-16 8:02 GMT+03:00 Andrew Pinski :
> >> On Tue, Nov 15, 2016 at 9:36 AM, Andrew Senkevich wrote:
> >>> Hi,
> >>>
> >>> the new Intel instructions AVX512_4FMAPS and AVX512_4VNNIW introduce
> >>> the use of register groups.
> >>>
> >>> To support the register-group feature in inline asm, some extension
> >>> with new constraints is needed.
> >>>
> >>> The current proposal is the following syntax:
> >>>
> >>> __asm__ ("SMTH %[group], %[single]" :
> >>>          [single] "+x"(v0) :
> >>>          [group] "Yg4"(v1), "1+1"(v2), "1+2"(v3), "1+3"(v4));
> >>>
> >>> where the "YgN" constraint specifies a group of N consecutive
> >>> registers (starting from a register whose number is 0 mod
> >>> 2^ceil(log2(N))), and "1+K" specifies the next registers in the
> >>> group.
> >>>
> >>> Is this syntax ok? How to implement it?
> >>
> >> Have you looked into how the AArch64 back-end handles this via OI,
> >> etc.? Like:
> >>
> >> /* Oct Int: 256-bit integer mode needed for 32-byte vector arguments.  */
> >> INT_MODE (OI, 32);
> >>
> >> /* Opaque integer modes for 3 or 4 Neon q-registers / 6 or 8 Neon
> >>    d-registers (2 d-regs = 1 q-reg = TImode).  */
> >> INT_MODE (CI, 48);
> >> INT_MODE (XI, 64);
> >>
> >> And then it implements the TARGET_ARRAY_MODE_SUPPORTED_P target hook.
> >> And the x2 types are defined as a struct of an array like:
> >>
> >> typedef struct int8x8x2_t
> >> {
> >>   int8x8_t val[2];
> >> } int8x8x2_t;
> >
> > Thanks!
> >
> > We have to update the proposal, changing the "+" symbol to "#" for
> > specifying the offset in a group (to avoid overloading the other
> > meaning of "+", which specifies that an operand is both input and
> > output).
> >
> > So the current proposal for the syntax is:
> >
> > __asm__ ("INSTR %[group], %[single]" :
> >          [single] "+x"(v0) :
> >          [group] "Yg4"(v1), "1#1"(v2), "1#2"(v3), "1#3"(v4));
> >
> > where the "YgN" constraint specifies a group of N consecutive
> > registers (starting from a register whose number is 0 mod
> > 2^ceil(log2(N))), and "1#K" specifies the next registers in the group.
> >
> > Some other questions or comments?
> >
> > What about consensus on this syntax?
>
> Hi Richard!
>
> Can we have agreement on this syntax, what do you think?

I have no expertise / opinion here.

Richard.

>
> --
> WBR,
> Andrew
>
>

--
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nuernberg)