[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED --- Comment #14 from Jakub Jelinek 2011-02-28 17:11:29 UTC --- Fixed.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #13 from Jakub Jelinek 2011-02-28 17:05:10 UTC --- Author: jakub Date: Mon Feb 28 17:05:07 2011 New Revision: 170568 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170568 Log: PR middle-end/47893 * rtl.h (ASLK_REDUCE_ALIGN, ASLK_RECORD_PAD): Define. (assign_stack_local_1): Change last argument type to int. * function.c (assign_stack_local_1): Replace reduce_alignment_ok argument with kind. If bit ASLK_RECORD_PAD is not set in it, don't record padding space into frame_space_list nor use those areas. (assign_stack_local): Adjust caller. (assign_stack_temp_for_type): Call assign_stack_local_1 instead of assign_stack_local, pass 0 as last argument. * caller-save.c (setup_save_areas): Adjust assign_stack_local_1 callers. * gcc.dg/pr47893.c: New test. Added: trunk/gcc/testsuite/gcc.dg/pr47893.c Modified: trunk/gcc/ChangeLog trunk/gcc/caller-save.c trunk/gcc/function.c trunk/gcc/rtl.h trunk/gcc/testsuite/ChangeLog
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED URL||http://gcc.gnu.org/ml/gcc-p ||atches/2011-02/msg01647.htm ||l AssignedTo|unassigned at gcc dot |jakub at gcc dot gnu.org |gnu.org |
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #12 from Jakub Jelinek 2011-02-25 19:09:54 UTC --- Created attachment 23471 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23471 gcc46-pr47893.patch Updated patch, so far just lightly tested that it fixes this bug.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #11 from Jakub Jelinek 2011-02-25 18:43:20 UTC --- Created attachment 23469 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23469 statistics gathering patch With this patch I got fncnt array summaries from i686-linux bootstrap/regtest: 56916 0 24380 27 96234 201 125938 699 1065459 5040 and from x86_64-linux bootstrap/regtest: 65358 0 11038 11 55139 84 91377 3184 701567 25806 This shows that assign_stack_temp_for_type is really never called during bootstrap/regtest after virtual reg instantiation and at least for i386 add_frame_space is called just in ~ .5% of cases (and similarly for the size). On x86_64 it happens more often, in ~ 3.5% of cases (haven't unfortunately tracked which of those successes to save something were from assign_stack_temp_for_type frame_space areas though. Thus, I think the proposed patch is the right way to go.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #10 from Jakub Jelinek 2011-02-25 17:39:36 UTC --- I guess that would break pa, because then else if (!STACK_ALIGNMENT_NEEDED) { ... } will be executed whenever record_alignment_slots is false, even for non-zero sizes or non-BLKmode. Other than that, I think it would be better to change the bool argument into an enum, after all we need just 3 variants, reduce_alignment_ok && record_alignment_slots (for caller-save), !reduce_alignment_ok && record_alignment_slots (for assign_stack_local) and !reduce_alignment_ok && !record_alignment_slots (for assign_stack_temp_for_type). I'm currently running x86_64-linux and i686-linux bootstraps/regtests gathering statistics, so far from the partial numbers I have a patch like that isn't going to pessimize stuff too much, add_frame_space from within assign_stack_temp_for_type accounts for like .5% of all add_frame_space calls (and similarly in the number of bytes thus recorded).
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #9 from Bernd Schmidt 2011-02-25 17:25:13 UTC --- Created attachment 23468 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23468 Test patch Does this fix it?
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #8 from Bernd Schmidt 2011-02-25 17:24:15 UTC --- (In reply to comment #7) > My temptation would be to revert until someone can get in there and design > things so that we either have a single list or there's clear rules for > manipulating objects in each list and reflecting the necessary information > back > and forth. > > Even with Jakub's suggestions my worry is other similar, subtle bugs are in > there and will bite us later. We're not actually calling assign_stack_temp after the assign phase, are we? IMO we can set a flag once we expect to only call assign_stack_local, and start using the frame_space list after that point. There'll be no worries about inconsistent lists.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 Jeffrey A. Law changed: What|Removed |Added CC||law at redhat dot com --- Comment #7 from Jeffrey A. Law 2011-02-25 16:40:56 UTC --- My temptation would be to revert until someone can get in there and design things so that we either have a single list or there's clear rules for manipulating objects in each list and reflecting the necessary information back and forth. Even with Jakub's suggestions my worry is other similar, subtle bugs are in there and will bite us later.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #6 from Bernd Schmidt 2011-02-25 16:10:55 UTC --- (In reply to comment #5) > Similarly not queuing anything into frame_space lists > when assign_stack_local is called from within assign_stack_temp_for_type would > kill most of the savings Bernds' original patch had. I think that's worth trying though. I can't think of any other good solution at the moment short of reverting the whole thing.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #5 from Jakub Jelinek 2011-02-25 16:06:14 UTC --- I guess not including the space added in add_frame_space by the assign_stack_local_1 call in temp_slot's size/full_size, while it would be easy to do (just walk the beginning of the frame_space list looking for slots that overlap with what we'd like to include in size/full_size), would result in too big stack wastage (e.g. in the given testcase suddenly the fn7 and fn2 return slot couldn't be shared, even when it isn't clear if something would actually like to use it or not). Similarly not queuing anything into frame_space lists when assign_stack_local is called from within assign_stack_temp_for_type would kill most of the savings Bernds' original patch had. Perhaps we could add pointers from temp_slots to frame_space list and back, when deciding to reuse an earlier temp_slot in assign_stack_temp_for_type we'd walk the referenced list and remove frame_space entries that overlap it and similarly when assign_stack_local_1 decides it wants to use a frame_space we'd decrease size/full_size of the temp_slot that overlaps it. We could of course try to prefer frame_space slots that don't overlap any temp_slots. Or perhaps don't try to use frame_space slots until virtuals_instantiated, assuming assign_stack_temp_for_type/assign_stack_temp/assign_temp aren't called after virtuals_instantiated is set, then we could just have links from temp_slots to frame_space and not vice versa. OT, on the other side, perhaps add_frame_space could be called e.g. for the padding inserted in expand_used_vars by: /* If the target requires that FRAME_OFFSET be aligned, do it. */ if (STACK_ALIGNMENT_NEEDED) { HOST_WIDE_INT align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT; if (!FRAME_GROWS_DOWNWARD) frame_offset += align - 1; frame_offset &= -align; } and perhaps also without -fstack-protector for inter-variable padding (alloc_stack_frame_space - with -fstack-protector it would be a security risk to place spills above any arrays). But this OT isn't stage 4 matter.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 Jakub Jelinek changed: What|Removed |Added CC||bernds at gcc dot gnu.org, ||law at gcc dot gnu.org AssignedTo|jakub at gcc dot gnu.org|unassigned at gcc dot ||gnu.org --- Comment #4 from Jakub Jelinek 2011-02-25 15:40:22 UTC --- Ugh, this is ugly. The problem is that Bernd's changes conflict with the way assign_stack_temp_for_type/combine_temp_slots works. Both are now parallel lists of stack slot information, but they aren't aware of each other. So, when assign_stack_temp_for_type is called to allocate slot for 12 byte long 16 byte aligned slot (for fn7 return value), this internally calls assign_stack_local (why we CEIL_ROUND the size to 16 in that case is something I don't understand). assign_stack_local_1 sees the current frame offset is not 16 byte aligned, so it eats 8 bytes for alignment and 16 bytes for the actual stack slot. assign_stack_local_1 then calls add_frame_space for the extra 8 bytes, so that it can be used by something else later (by assign_stack_local_1) - this is Bernd's new code. But in assign_stack_temp_for_type it tracks the allocation in struct temp_slot, and there it records the whole stack block including whatever padding had to be made again (i.e. 24 bytes in this case). If assign_stack_local called from here will actually return something from the queued frame space slots, then frame_offset_old == frame_offset and thus it will do really weird things, that is something that needs to be fixed too, but is not something that happens in this case (I'd say if frame_offset_old == frame_offset and it wasn't BLKmode with size 0, then it should record as size/full_size just the requested size and nothing else). In this testcase afterwards assign_stack_temp_for_type is called for 20 byte BLKmode requesting 16 byte alignment, and as this is -fno-strict-aliasing, it chooses that it could share the temp slot with the above allocated one and as its size/full_size was 24, it just uses it. Nothing informs the frame_space list about this though and thus it will happily give that slot again during reload.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P1 Known to work||4.6.0 AssignedTo|unassigned at gcc dot |jakub at gcc dot gnu.org |gnu.org | Known to fail||4.5.2 --- Comment #3 from Jakub Jelinek 2011-02-25 12:56:15 UTC --- Looking into it...
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 Richard Guenther changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2011.02.25 12:51:14 Target Milestone|--- |4.6.0 Ever Confirmed|0 |1 --- Comment #2 from Richard Guenther 2011-02-25 12:51:14 UTC --- Confirmed.
[Bug middle-end/47893] [4.6 Regression] 4.6 miscompiles mesa on i686
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47893 --- Comment #1 from Jakub Jelinek 2011-02-25 12:48:38 UTC --- Seems to be caused by http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=159480