Announcing James Bowman as FT32 port maintainer

2015-06-03 Thread Jeff Law



I'm pleased to announce James Bowman has been appointed as the 
maintainer for the FT32 port.


James, can you please add yourself to the MAINTAINERS file?

Thanks,
Jeff


Re: [i386] Scalar DImode instructions on XMM registers

2015-06-03 Thread Jeff Law

On 05/27/2015 07:20 AM, Ilya Enkovich wrote:


> I looked into the assign_stack_local_1 call for this spill. LRA correctly
> requests 16 bytes size with 16 bytes alignment. But
> assign_stack_local_1 reduces alignment to 8 because estimated
> stack alignment before RA is 8 and requested mode's (DI) alignment
> fits it. Probably LRA should pass biggest_mode of the reg when
> requesting a stack slot?

It's hard to say for sure.  Within the lra_reg structure, biggest_mode
refers to the largest mode in which a pseudo is referenced.  So for a
pseudo it might make sense.  Presumably the biggest_mode for the pseudo
in question is larger than DImode, right?




> I handled it by increasing stack_alignment_estimated when transforming
> some instructions to vector mode.

I haven't looked deeply, but if your pass runs after
stack_alignment_estimated is initially computed, then this seems like a
desirable way to fix the problem.


jeff


Re: Question about find modifiable mems

2015-06-03 Thread Jim Wilson
On 06/02/2015 11:39 PM, shmeel gutl wrote:
> find_modifiable_mems was introduced to gcc 4.8 in September 2012. Is
> there any documentation as to how it is supposed to help the haifa
> scheduler?

The patch was submitted here
  https://gcc.gnu.org/ml/gcc-patches/2012-08/msg00155.html
and this message contains a brief explanation of what it is supposed to
do.  The explanation looks like a useful optimization, but perhaps it is
triggering in cases when it shouldn't.

Jim



Static Chain Register on iOS AArch64

2015-06-03 Thread Stephen Cross
Hello,

I noticed the following comment in the GCC source (
https://github.com/gcc-mirror/gcc/blob/7c62dfbbcd3699efcbbadc9fb3aa14f23a123add/libffi/src/aarch64/ffitarget.h#L66
):

/* iOS reserves x18 for the system. Disable Go closures until a new
static chain is chosen. */

Based on this comment, it sounds as if GCC hasn't yet decided which register
to use for the static chain pointer on iOS AArch64. Is this correct?

As I understand it, x18 (the platform register) is not used on Linux and hence
can be used by GCC. I couldn't find anything saying this, so could you confirm
this (that x18 is not used by Linux and hence used by GCC)?

In terms of the register to choose for iOS AArch64, it seems like either x16 or
x17 (the intra-procedure-call scratch registers) would be a good choice, in
the same way that r12 is used for ARM 32-bit. Does this seem sensible, or
is there some reason for rejecting these registers?

I'd appreciate anything anyone can tell me about the above.

In case you're interested, the context for this is:
http://comments.gmane.org/gmane.comp.compilers.llvm.devel/86370

Thanks,
Stephen Cross


gcc-4.9-20150603 is now available

2015-06-03 Thread gccadmin
Snapshot gcc-4.9-20150603 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20150603/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 224106

You'll find:

 gcc-4.9-20150603.tar.bz2 Complete GCC

  MD5=15a5364ce3de48e8708f366b13803168
  SHA1=72f655df6f472a38966ab556b3c3060d6a2dad2e

Diffs from 4.9-20150527 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: i386: does gcc work with CS ≠ DS?

2015-06-03 Thread Richard Henderson
On 06/02/2015 10:44 PM, H. Peter Anvin wrote:
> Hi guys, another low level question:
> 
> Obviously gcc for i386 requires DS = ES = SS (with FS and GS don't
> care), but does gcc also require CS = DS?

I don't believe so.  In these modern times we don't place switch statement
tables, or other constant data, in the .text section.  Just map the correct
sections to the correct segments and you should be fine.

What advantage are you looking for?


r~


Re: parameters to _mm_mwait intrinsic

2015-06-03 Thread Uros Bizjak
On Wed, Jun 3, 2015 at 2:47 PM, Kumar, Venkataramanan
 wrote:
> Hi,
>
> I was going through the "monitor" and "mwait" builtin implementation.
> I need clarification on the parameters passed to _mm_mwait intrinsic.
>

> Should the constraint be swapped for the operands in the pattern?

Please swap the constraints in the pattern.

Patch is pre-approved for mainline and release branches.

Thanks,
Uros.


parameters to _mm_mwait intrinsic

2015-06-03 Thread Kumar, Venkataramanan
Hi, 

I was going through the "monitor" and "mwait" builtin implementation.
I need clarification on the parameters passed to _mm_mwait intrinsic.

We have the following defined in "pmmintrin.h"

extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
{
  __builtin_ia32_monitor (__P, __E, __H);
}

extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_mwait (unsigned int __E, unsigned int __H)
{
  __builtin_ia32_mwait (__E, __H);
}

I assume the parameter names indicate:
P -> Address
E -> Extensions
H -> Hints

MWAIT as per the AMD ISA manual.
Ref:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2008/10/24594_APM_v3.pdf
(---Snip---)
EAX specifies optional hints for the MWAIT instruction. There are currently no
hints defined and all bits should be 0. Setting a reserved bit in EAX is
ignored by the processor.
ECX specifies optional extensions for the MWAIT instruction. The only extension
currently defined is ECX bit 0, which allows interrupts to wake MWAIT, even
when eFLAGS.IF = 0. Support for this extension is indicated by a feature flag
returned by the CPUID instruction. Setting any unsupported bit in ECX results
in a #GP exception.
(---Snip---)

MWAIT as defined in the Intel ISA manual.
Ref:
http://www.intel.in/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
(---Snip---)
This instruction's operation is the same in non-64-bit modes and 64-bit mode.
ECX specifies optional extensions for the MWAIT instruction. EAX may contain
hints such as the preferred optimized state the processor should enter. The
first processors to implement MWAIT supported only the zero value for EAX and
ECX. Later processors allowed setting ECX[0] to enable masked interrupts as
break events for MWAIT (see below). Software can use the CPUID instruction to
determine the extensions and hints supported by the processor.
(---Snip---)


So if a user calls _mm_mwait (__E, __H), __E should go into ECX and __H
should go into EAX.

However, I see the following implementation in GCC:

(---snip---)
    case IX86_BUILTIN_MWAIT:
      arg0 = CALL_EXPR_ARG (exp, 0);
      arg1 = CALL_EXPR_ARG (exp, 1);
      op0 = expand_normal (arg0);
      op1 = expand_normal (arg1);
      if (!REG_P (op0))
        op0 = copy_to_mode_reg (SImode, op0);
      if (!REG_P (op1))
        op1 = copy_to_mode_reg (SImode, op1);
      emit_insn (gen_sse3_mwait (op0, op1));
      return 0;


(define_insn "sse3_mwait"
  [(unspec_volatile [(match_operand:SI 0 "register_operand" "a")
                     (match_operand:SI 1 "register_operand" "c")]
                    UNSPECV_MWAIT)]
  "TARGET_SSE3"
;; 64bit version is "mwait %rax,%rcx". But only lower 32bits are used.
;; Since 32bit register operands are implicitly zero extended to 64bit,
;; we only need to set up 32bit registers.
  "mwait"
  [(set_attr "length" "3")])
(---snip---)

Here the first argument __E is moved to "EAX" and __H is moved to "ECX".
Should the constraints be swapped for the operands in the pattern?
Or is my understanding wrong?

Regards,
Venkat.



RE: Question about find modifiable mems

2015-06-03 Thread Ajit Kumar Agarwal


-----Original Message-----
From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of shmeel gutl
Sent: Wednesday, June 03, 2015 12:10 PM
To: GCC Development
Subject: Question about find modifiable mems

>> find_modifiable_mems was introduced to gcc 4.8 in September 2012. Is there
>> any documentation as to how it is supposed to help the haifa scheduler?
>>
>> In my private port of gcc it makes the following type of transformation
>>
>> from
>>   a = *(b+20)
>>   b += 30
>>
>> to
>>   b += 30
>>   a = *(b-10)
>>
>> Although this is functionally correct, it has changed an ANTI_DEP into a
>> TRUE_DEP and thus introduced stalls. If it went the other way, that would be
>> good. Any pointers?

Breaking anti-dependencies is an important optimization for transformations
such as vectorization.

Thanks & Regards
Ajit

Thanks,
Shmeel



Re: s390: SImode pointers vs LR

2015-06-03 Thread Andreas Krebbel
On 06/03/2015 12:53 AM, Richard Henderson wrote:
> On 06/02/2015 08:32 AM, Andreas Krebbel wrote:
>> -(define_insn "*3"
>> +(define_insn "*3_reg"
>> [(set (match_operand:GPR 0 "register_operand" "=d")
>>   (SHIFT:GPR (match_operand:GPR 1 "register_operand" "")
>> -   (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))]
>> +   (match_operand:SI 2 "register_operand" "a")))]
>> ""
>> -  "sl\t%0,<1>%Y2"
>> +  "sl\t%0,<1>%2"
>> +  [(set_attr "op_type"  "RS")
>> +   (set_attr "atype"    "reg")])
>> +
>> +(define_insn "*3_imm"
>> +  [(set (match_operand:GPR 0 "register_operand" "=d")
>> +(SHIFT:GPR (match_operand:GPR 1 "register_operand" "")
>> +   (match_operand 2 "immediate_operand" "J")))]
>> +  ""
>> +  "sl\t%0,<1>%2"
>> +  [(set_attr "op_type"  "RS")
>> +   (set_attr "atype"    "reg")])
> 
> These two ought not be split apart.  They're simple alternatives.  
Right. That was just a quick copy and paste hack to check if it works.

> And why SImode?
Other modes would work as well since the instruction only uses the lower 6 bits
anyway. But what's wrong with SImode?

Bye,

-Andreas-



Re: s390: SImode pointers vs LR

2015-06-03 Thread Andreas Krebbel
On 06/02/2015 07:13 PM, Jeff Law wrote:
> But isn't that 3 registers used in the address computation if the 
> (const_int 1) gets reloaded?  one of the value shifted, two for the 
> shift count?  I'm not familiar with the s390, so if you can handle that 
> kind of insn, then, umm, cool.
The address-style operand is only the shift count. Our instructions support
base + displacement here. E.g. sll %r2,%r3(3) is r2 << (r3 + 3)

> The only other thing that comes immediately to mind would be secondary 
> reloads.  But I always hate suggesting them.
I don't see how this would help here. It is not really that reload needs help
moving something to/from a register. In fact the INSN is good as is and we are
trying to prevent reload from doing anything.

Bye,

-Andreas-