esday, November 12, 2013 3:57 PM
To: Jan Hubicka
Cc: H.J. Lu; Vladimir Makarov; GCC Patches; Uros Bizjak; Richard Henderson;
Gopalasubramanian, Ganesh
Subject: Re: Honnor ix86_accumulate_outgoing_args again
On Tue, Nov 12, 2013 at 11:05:45AM +0100, Jan Hubicka wrote:
> >
> On Tue, Nov 12, 2013 at 10:39:28AM -0500, Vladimir Makarov wrote:
> > >> Shall we also disable argument accumulation for cores? It seems we won't
> > >> solve the IRA issues, right?
> > > You mean LRA issues here, right? If you are starting to use
> > > no-accumulate-outgoing-args much more ofte
On Tue, Nov 12, 2013 at 10:39:28AM -0500, Vladimir Makarov wrote:
> >> Shall we also disable argument accumulation for cores? It seems we won't
> >> solve the IRA issues, right?
> > You mean LRA issues here, right? If you are starting to use
> > no-accumulate-outgoing-args much more often than in
On 11/12/2013 05:26 AM, Jakub Jelinek wrote:
> On Tue, Nov 12, 2013 at 11:05:45AM +0100, Jan Hubicka wrote:
>>> @@ -16576,7 +16576,7 @@ ix86_avx256_split_vector_move_misalign (rtx
>>> op0, rtx op1)
>>>
>>> if (MEM_P (op1))
>>> {
>>> - if (TARGET_AVX256_SPLIT_UNALIGNED_LOAD)
>>> +
On Tue, Nov 12, 2013 at 2:26 AM, Jakub Jelinek wrote:
> On Tue, Nov 12, 2013 at 11:05:45AM +0100, Jan Hubicka wrote:
>> > @@ -16576,7 +16576,7 @@ ix86_avx256_split_vector_move_misalign (rtx
>> > op0, rtx op1)
>> >
>> > if (MEM_P (op1))
>> > {
>> > - if (TARGET_AVX256_SPLIT_UNALIGNED_L
On Tue, Nov 12, 2013 at 11:05:45AM +0100, Jan Hubicka wrote:
> > @@ -16576,7 +16576,7 @@ ix86_avx256_split_vector_move_misalign (rtx
> > op0, rtx op1)
> >
> > if (MEM_P (op1))
> > {
> > - if (TARGET_AVX256_SPLIT_UNALIGNED_LOAD)
> > + if (!TARGET_AVX2 && TARGET_AVX256_SPLIT_UNALIG
This is OK, thanks for catching the pasto!
Only...
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 430d562..b8cb871 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3974,10 +3974,10 @@ ix86_option_override_internal (bool main_args_p,
>if (fla
On Mon, Nov 11, 2013 at 4:18 PM, Jakub Jelinek wrote:
> On Thu, Oct 10, 2013 at 08:40:05PM +0200, Jan Hubicka wrote:
>> --- config/i386/x86-tune.def (revision 203387)
>> +++ config/i386/x86-tune.def (working copy)
>
>> +/* X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL: if true, unaligned loads are
>> +
On Thu, Oct 10, 2013 at 08:40:05PM +0200, Jan Hubicka wrote:
> --- config/i386/x86-tune.def (revision 203387)
> +++ config/i386/x86-tune.def (working copy)
> +/* X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL: if true, unaligned loads are
> + split. */
> +DEF_TUNE (X86_TUNE_AVX256_UNALIGNED_LOAD_OPTI
atin, Igor; gcc-patches@gcc.gnu.org
> Subject: Re: Honnor ix86_accumulate_outgoing_args again
>
> On 13-10-19 4:30 PM, Jan Hubicka wrote:
> >> Jan,
> >>
> >> Does this seem reasonable to you?
> > Oops, sorry, I missed your email. (I was travelling an
On 10/10/2013 08:40 PM, Jan Hubicka wrote:
+ In 32bit mode enabling argument accumulation results in about 5% code size
+ growth because move instructions are less compact than push. In 64bit
+ mode the difference is less drastic but visible.
+
+ FIXME: Unlike earlier implementations, th
Jan,
Please see my answers below
> -----Original Message-----
> From: Jan Hubicka [mailto:hubi...@ucw.cz]
> Sent: Sunday, October 20, 2013 12:30 AM
> To: Zamyatin, Igor; gcc-patches@gcc.gnu.org; vmaka...@redhat.com
> Cc: 'Jan Hubicka'
> Subject: Re: Honnor ix86_ac
: RE: Honnor ix86_accumulate_outgoing_args again
Jan,
Now we have the following prologue in, say, the phi0 routine in equake:
0x804aa90 1 push %ebp
0x804aa91 2 mov %esp,%ebp
0x804aa93 3 sub $0x18,%esp
0x804aa96 4 vmovsd 0x80ef7a8,%xmm0
0x804aa9e 5 vmovsd 0x8(%ebp),%xmm1
0x804aaa3 6 vcomisd
tgoing_args. As for other machines - seems now
> > (after your change) they don't get that
> > MASK_ACCUMULATE_OUTGOING_ARGS and it leads to using ebp in the
> > prologue.
> >
> > Thanks,
> > Igor
> >
> > -----Original Message-----
> &
Hi,
this patch makes ACCUMULATE_OUTGOING_ARGS disable itself when the function is
cold. I did some extra testing and, to my amusement, we now seem to output
more compact unwind info when ACCUMULATE_OUTGOING_ARGS is disabled, so this
seems to be a quite consistent code size win.
We actually can do better and
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57018
>
> and because LRA still misses some reload functionality for
> elimination. I am a bit embarrassed: I have had this on my list for 4
> months and I still have not started working on it. There are too
> many things on my plate.
>
> As we are go
> > Unfortunately there is 40% regression on mgrid with -flto (and also
> > noticeable
> > regression without LTO). First thing I noticed is that we stop omitting
> > frame
> > pointer in the hottest function. This is because we see:
>
> Does it happen with both 32-bit and 64-bit?
No, 32bit on
On 13-10-02 6:45 PM, Jan Hubicka wrote:
So I think we ought to honor accumulate-outgoing-args again and in fact
consider disabling it for generic - it is disabled for core (that may need
re-benchmarking). For all AMD targets it is currently on. I tested disabling
it on Bulldozer 32bit and it see
On Wed, Oct 2, 2013 at 3:45 PM, Jan Hubicka wrote:
>> So I think we ought to honor accumulate-outgoing-args again and in fact
>> consider disabling it for generic - it is disabled for core (that may need
>> re-benchmarking). For all AMD targets it is currently on. I tested disabling
>> it on bul
> So I think we ought to honor accumulate-outgoing-args again and in fact
> consider disabling it for generic - it is disabled for core (that may need
> re-benchmarking). For all AMD targets it is currently on. I tested disabling
> it on Bulldozer 32bit and it seems mostly SPEC neutral for specint