[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-28 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #14 from Jakub Jelinek  2011-02-28 
17:11:29 UTC ---
Fixed.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-28 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #13 from Jakub Jelinek  2011-02-28 
17:05:10 UTC ---
Author: jakub
Date: Mon Feb 28 17:05:07 2011
New Revision: 170568

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170568
Log:
PR middle-end/47893
* rtl.h (ASLK_REDUCE_ALIGN, ASLK_RECORD_PAD): Define.
(assign_stack_local_1): Change last argument type to int.
* function.c (assign_stack_local_1): Replace reduce_alignment_ok
argument with kind.  If bit ASLK_RECORD_PAD is not set in it,
don't record padding space into frame_space_list nor
use those areas.
(assign_stack_local): Adjust caller.
(assign_stack_temp_for_type): Call assign_stack_local_1 instead
of assign_stack_local, pass 0 as last argument.
* caller-save.c (setup_save_areas): Adjust assign_stack_local_1
callers.

* gcc.dg/pr47893.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr47893.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/caller-save.c
trunk/gcc/function.c
trunk/gcc/rtl.h
trunk/gcc/testsuite/ChangeLog


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-26 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
URL||http://gcc.gnu.org/ml/gcc-p
   ||atches/2011-02/msg01647.htm
   ||l
 AssignedTo|unassigned at gcc dot   |jakub at gcc dot gnu.org
   |gnu.org |


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #12 from Jakub Jelinek  2011-02-25 
19:09:54 UTC ---
Created attachment 23471
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23471
gcc46-pr47893.patch

Updated patch, so far just lightly tested that it fixes this bug.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #11 from Jakub Jelinek  2011-02-25 
18:43:20 UTC ---
Created attachment 23469
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23469
statistics gathering patch

With this patch I got fncnt array summaries from i686-linux bootstrap/regtest:
56916 0 24380 27 96234 201 125938 699 1065459 5040
and from x86_64-linux bootstrap/regtest:
65358 0 11038 11 55139 84 91377 3184 701567 25806
This shows that assign_stack_temp_for_type is really never called during
bootstrap/regtest after virtual reg instantiation and at least for i386
add_frame_space is called just in ~ .5% of cases (and similarly for the size).
On x86_64 it happens more often, in ~ 3.5% of cases (haven't unfortunately
tracked which of those successes to save something were from
assign_stack_temp_for_type frame_space areas though.

Thus, I think the proposed patch is the right way to go.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #10 from Jakub Jelinek  2011-02-25 
17:39:36 UTC ---
I guess that would break pa, because then
  else if (!STACK_ALIGNMENT_NEEDED)
{
  ...
}
will be executed whenever record_alignment_slots is false, even for non-zero
sizes or non-BLKmode.
Other than that, I think it would be better to change the bool argument into an
enum, after all we need just 3 variants, reduce_alignment_ok &&
record_alignment_slots (for caller-save), !reduce_alignment_ok &&
record_alignment_slots (for assign_stack_local) and !reduce_alignment_ok &&
!record_alignment_slots (for assign_stack_temp_for_type).

I'm currently running x86_64-linux and i686-linux bootstraps/regtests gathering
statistics, so far from the partial numbers I have a patch like that isn't
going to pessimize stuff too much, add_frame_space from within
assign_stack_temp_for_type accounts for like .5% of all add_frame_space calls
(and similarly in the number of bytes thus recorded).


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread bernds at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #9 from Bernd Schmidt  2011-02-25 
17:25:13 UTC ---
Created attachment 23468
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23468
Test patch

Does this fix it?


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread bernds at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #8 from Bernd Schmidt  2011-02-25 
17:24:15 UTC ---
(In reply to comment #7)
> My temptation would be to revert until someone can get in there and design
> things so that we either have a single list or there's clear rules for
> manipulating objects in each list and reflecting the necessary information 
> back
> and forth.
> 
> Even with Jakub's suggestions my worry is other similar, subtle bugs are in
> there and will bite us later.

We're not actually calling assign_stack_temp after the assign phase, are we?
IMO we can set a flag once we expect to only call assign_stack_local, and start
using the frame_space list after that point. There'll be no worries about
inconsistent lists.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at redhat dot com

--- Comment #7 from Jeffrey A. Law  2011-02-25 16:40:56 
UTC ---
My temptation would be to revert until someone can get in there and design
things so that we either have a single list or there's clear rules for
manipulating objects in each list and reflecting the necessary information back
and forth.

Even with Jakub's suggestions my worry is other similar, subtle bugs are in
there and will bite us later.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread bernds at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #6 from Bernd Schmidt  2011-02-25 
16:10:55 UTC ---
(In reply to comment #5)
> Similarly not queuing anything into frame_space lists
> when assign_stack_local is called from within assign_stack_temp_for_type would
> kill most of the savings Bernds' original patch had.

I think that's worth trying though. I can't think of any other good solution at
the moment short of reverting the whole thing.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #5 from Jakub Jelinek  2011-02-25 
16:06:14 UTC ---
I guess not including the space added in add_frame_space by the
assign_stack_local_1 call in temp_slot's size/full_size, while it would be easy
to do (just walk the beginning of the frame_space list looking for slots that
overlap with what we'd like to include in size/full_size), would result in too
big stack wastage (e.g. in the given testcase suddenly the fn7 and fn2 return
slot couldn't be shared, even when it isn't clear if something would actually
like to use it or not).  Similarly not queuing anything into frame_space lists
when assign_stack_local is called from within assign_stack_temp_for_type would
kill most of the savings Bernds' original patch had.

Perhaps we could add pointers from temp_slots to frame_space list and back,
when deciding to reuse an earlier temp_slot in assign_stack_temp_for_type we'd
walk the referenced list and remove frame_space entries that overlap it and
similarly when assign_stack_local_1 decides it wants to use a frame_space we'd
decrease size/full_size of the temp_slot that overlaps it.
We could of course try to prefer frame_space slots that don't overlap any
temp_slots.

Or perhaps don't try to use frame_space slots until virtuals_instantiated,
assuming assign_stack_temp_for_type/assign_stack_temp/assign_temp aren't called
after virtuals_instantiated is set, then we could just have links from
temp_slots to frame_space and not vice versa.

OT, on the other side, perhaps add_frame_space could be called e.g. for the
padding inserted in expand_used_vars by:

  /* If the target requires that FRAME_OFFSET be aligned, do it.  */ 
  if (STACK_ALIGNMENT_NEEDED)
{
  HOST_WIDE_INT align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
  if (!FRAME_GROWS_DOWNWARD)
frame_offset += align - 1;
  frame_offset &= -align;
}

and perhaps also without -fstack-protector for inter-variable padding
(alloc_stack_frame_space - with -fstack-protector it would be a security risk
to place spills above any arrays).  But this OT isn't stage 4 matter.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

Jakub Jelinek  changed:

   What|Removed |Added

 CC||bernds at gcc dot gnu.org,
   ||law at gcc dot gnu.org
 AssignedTo|jakub at gcc dot gnu.org|unassigned at gcc dot
   ||gnu.org

--- Comment #4 from Jakub Jelinek  2011-02-25 
15:40:22 UTC ---
Ugh, this is ugly.
The problem is that Bernd's changes conflict with the way
assign_stack_temp_for_type/combine_temp_slots works.
Both are now parallel lists of stack slot information, but they aren't aware of
each other.
So, when assign_stack_temp_for_type is called to allocate slot for 12 byte long
16 byte aligned slot (for fn7 return value), this internally calls
assign_stack_local (why we CEIL_ROUND the size to 16 in that case is something
I don't understand).  assign_stack_local_1 sees the current frame offset is not
16 byte aligned, so it eats 8 bytes for alignment and 16 bytes for the actual
stack slot.  assign_stack_local_1 then calls add_frame_space for the extra 8
bytes, so
that it can be used by something else later (by assign_stack_local_1) - this is
Bernd's new code.
But in assign_stack_temp_for_type it tracks the allocation in struct temp_slot,
and there it records the whole stack block including whatever padding had to be
made again (i.e. 24 bytes in this case).

If assign_stack_local called from here will actually return something from the
queued frame space slots, then frame_offset_old == frame_offset and thus it
will do really weird things, that is something that needs to be fixed too, but
is not something that happens in this case (I'd say if frame_offset_old ==
frame_offset and it wasn't BLKmode with size 0, then it should record as
size/full_size just the requested size and nothing else).

In this testcase afterwards assign_stack_temp_for_type is called for 20 byte
BLKmode requesting 16 byte alignment, and as this is -fno-strict-aliasing, it
chooses that it could share the temp slot with the above allocated one and as
its size/full_size was 24, it just uses it.  Nothing informs the frame_space
list about this though and thus it will happily give that slot again during
reload.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
  Known to work||4.6.0
 AssignedTo|unassigned at gcc dot   |jakub at gcc dot gnu.org
   |gnu.org |
  Known to fail||4.5.2

--- Comment #3 from Jakub Jelinek  2011-02-25 
12:56:15 UTC ---
Looking into it...


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

Richard Guenther  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2011.02.25 12:51:14
   Target Milestone|--- |4.6.0
 Ever Confirmed|0   |1

--- Comment #2 from Richard Guenther  2011-02-25 
12:51:14 UTC ---
Confirmed.


[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686

2011-02-25 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893

--- Comment #1 from Jakub Jelinek  2011-02-25 
12:48:38 UTC ---
Seems to be caused by http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=159480