[Bug middle-end/40060] [4.5 Regression] casts loose alignment info
--- Comment #5 from matz at suse dot de 2009-05-07 15:13 --- Subject: Re: [4.5 Regression] casts loose alignment info On Thu, 7 May 2009, rguenth at gcc dot gnu dot org wrote: > And if something should look through conversions it is get_pointer_alignment Yes, this is actually used in the ppc testcase to get hold of the pointer alignment of the mem buffer. The conservatively typed cast is confusing it then, and as explained we aren't allowed to look through it (and if we were, we would have to use the _lowest_, not the largest, alignment in that conversion sequence). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40060
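The remark about using the lowest alignment can be illustrated with a small sketch. This is a toy model and not GCC code: the function name `conservative_alignment` and the idea of passing the conversion chain as an array of byte alignments are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Toy model, not GCC code: aligns[i] is the alignment in bytes promised
   by the pointed-to type at step i of a chain of pointer conversions.  */
static unsigned
conservative_alignment (const unsigned *aligns, size_t n)
{
  unsigned lowest = 1;          /* byte alignment if the chain is empty */

  if (n > 0)
    lowest = aligns[0];
  for (size_t i = 1; i < n; i++)
    if (aligns[i] < lowest)
      lowest = aligns[i];       /* only the weakest promise is safe */
  return lowest;
}
```

For a chain such as 8, then 1, then 4 bytes, this yields 1: once any step in the sequence has dropped to byte alignment, a later cast to a more aligned type proves nothing about the actual address.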
[Bug middle-end/39954] [4.5 Regression] Revision 146817 caused unaligned access in gcc.dg/torture/pr26565.c
--- Comment #21 from matz at suse dot de 2009-05-06 21:39 --- Subject: Re: [4.5 Regression] Revision 146817 caused unaligned access in gcc.dg/torture/pr26565.c On Wed, 6 May 2009, rguenther at suse dot de wrote: > > + tree ret; > > + if (TYPE_PACKED (t)) > > +return t; > > Walking the type variants and searching for what we now build would > fix the inefficiency. And of course this function needs a comment ;) I am unsure about using variants of types for this anyway. E.g. some other lookup functions, when searching for a qualified type, simply walk the list and take the first one with matching qualifiers. Now if there suddenly are multiple variants with the same qualifiers but different alignment in there, such a lookup might choose an overly aligned type accidentally. Or maybe I'm confused, hmm... (but yes, that's the obvious fix for the inefficiency). Can qualified types themselves be a new base (TYPE_MAIN_VARIANT) for a new chain? In that case it would work just fine. > > + ret = build_variant_type_copy (t); > > + TYPE_ALIGN (ret) = align * BITS_PER_UNIT; > > + TYPE_USER_ALIGN (ret) = 1; > > It seems that only ever place_field looks at this flag. TYPE_USER_ALIGN? It is looked at by place_field and update_alignment_for_field, and it's copied into DECL_USER_ALIGN (which is used in more places). TYPE_USER_ALIGN only ever seems to guard calls to ADJUST_FIELD_ALIGN when PCC_BITFIELD_TYPE_MATTERS or BITFIELD_NBYTES_LIMITED. But I have no idea why a user-specified alignment should not also be affected by ADJUST_FIELD_ALIGN. It all seems to have to do with the general theme of not overriding user-specified alignment by any of the ways the compiler normally uses to derive alignments. In any case it seems better to leave this alone: stor-layout.c is filled with sometimes quite arcane conditions and special cases, and probably nobody in the world can test all combinations of strange ABIs and funny requirements or backward compatibilities. > How is the effect of setting it here? 
Well, the user explicitly put "__attribute__((packed))" there, so it seems reasonable to deal with this as if he had also specified an explicit alignment. > Overall I like this patch. Much to my surprise it even seems to work up to now: bootstrap was okay, the testsuite is still running. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
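The worry about variant lookups can be sketched as follows. This is a toy model under stated assumptions, not GCC's tree representation: `struct type_variant`, `find_by_quals`, `find_by_quals_and_align` and `demo_chain` are hypothetical names, and the chain holds just a main variant plus one copy with reduced alignment.

```c
#include <stddef.h>

/* Toy stand-in for a chain of type variants (not GCC's tree type).  */
struct type_variant
{
  unsigned quals;               /* qualifier bits, 0 = unqualified */
  unsigned align;               /* alignment in bytes */
  struct type_variant *next;    /* next variant in the chain */
};

/* Lookup as described in the comment: first match on qualifiers wins.  */
static struct type_variant *
find_by_quals (struct type_variant *main_variant, unsigned quals)
{
  for (struct type_variant *v = main_variant; v; v = v->next)
    if (v->quals == quals)
      return v;
  return NULL;
}

/* Stricter lookup that also compares the alignment.  */
static struct type_variant *
find_by_quals_and_align (struct type_variant *main_variant,
                         unsigned quals, unsigned align)
{
  for (struct type_variant *v = main_variant; v; v = v->next)
    if (v->quals == quals && v->align == align)
      return v;
  return NULL;
}

/* Demo chain: the main variant (align 4) followed by an otherwise
   identical copy with alignment 1, as a packed field would need.  */
static struct type_variant *
demo_chain (void)
{
  static struct type_variant v[2];
  v[0] = (struct type_variant) { 0, 4, &v[1] };
  v[1] = (struct type_variant) { 0, 1, NULL };
  return &v[0];
}
```

With this chain, `find_by_quals (demo_chain (), 0)` returns the variant with alignment 4 even if the caller needed the byte-aligned copy, which is the "overly aligned type" accident mentioned above; the second lookup avoids it at the cost of one more comparison.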
[Bug tree-optimization/34176] [4.3 Regression] SCCVN breaks gettext
--- Comment #9 from matz at suse dot de 2007-11-22 14:03 --- Subject: Re: [4.3 Regression] SCCVN breaks gettext [sorry for the breakage in last response] It does not. The RPO algorithm (the one proven) uses hash table deletes per iteration. About the SCC algorithm they have to say this: "Since we cannot remove the entries from the hash table after each pass as the RPO algorithm does, we will use two hash tables. The iterative phase uses an optimistic hash table. Once the value numbers in the SCC stabilize, entries are added to the valid table." Without proof that this actually has the same properties as the RPO algorithm. Had they gone through the hassle of trying to prove this they would have noticed that it doesn't work. > > Maybe we aren't traversing uses in function arguments during DFS walk? No, that's not the problem. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34176
[Bug tree-optimization/34176] [4.3 Regression] SCCVN breaks gettext
--- Comment #7 from matz at suse dot de 2007-11-22 13:58 --- Subject: Re: [4.3 Regression] SCCVN breaks gettext Hi, On Thu, 22 Nov 2007, dberlin at dberlin dot org wrote: > Right, but this is the optimistic set of hash tables, so that is okay. I initially thought so too, but it really isn't. > At the end of SCC iteration, it is okay to keep optimistic > assumptions in the optimistic table, even if they turned out to be > wrong. It's okay only as long as you haven't proven them wrong. I.e. it's okay to have unproven but _consistent_ entries in the hash table. It is not okay to have inconsistent data in there, because it ripples through the whole SCC then. As in this case. > > ergo it enters (int)dest_9 into the hashtable, as having destptr.2_15 as > > value. I.e. (int)dest_9 == destptr.2_15. From there on everything breaks > > apart, because nobody is ever removing this association from the hash-table. > > In particular we still (wrongly) think that nitems_19 is zero. > > I don't see where above it has set nitems_19 to zero. I'll attach the complete dump. 
nitems_19 is zero because it is computed like so:
Value numbering dest.3_16 stmt = dest.3_16 = (int) dest_9;
Setting value number of dest.3_16 to destptr.2_15
Value numbering D.1214_17 stmt = D.1214_17 = destptr.2_15 - dest.3_16;
RHS destptr.2_15 - dest.3_16 simplified to 0 has constants 0
Setting value number of D.1214_17 to 0
Value numbering D.1215_18 stmt = D.1215_18 = D.1214_17 /[ex] 8;
RHS D.1214_17 /[ex] 8 simplified to 0 has constants 0
Setting value number of D.1215_18 to 0
Value numbering nitems_19 stmt = nitems_19 = (size_t) D.1215_18;
RHS (size_t) D.1215_18 simplified to 0 has constants 0
Setting value number of nitems_19 to 0
So, because of the (in later passes invalid (!)) association of
(int)dest_9 == destptr.2_15
--> D.1214_17 = destptr.2_15 - dest.3_16 == destptr.2_15 - destptr.2_15 == 0
--> D.1215_18 == 0
--> nitems_19 == 0
As the initial problem is the association of (int)dest_9 and destptr.2_15, and that stays in the hash table forever, we never notice that nitems_19 (and the other values in between) are _not_ zero. Not during the optimistic iterations at least. When we switch to valid_info (equivalent to deleting the hash table) we do notice that nitems_19 isn't zero anymore (we no longer "discover" the invalid (int)dest_9 == destptr.2_15). But nitems_19 happens to be traversed later than nitems_1 (later than nitems_20), so the now correct value of nitems_19 isn't used anymore. > There should be no need, as the fixpoint iteration of the optimistic > table should eventually end up with the values you want to insert into > the valid table. > That's in fact, the whole point. But as shown above it doesn't work that way. > > > so there has to be a way to either cleanup the hashtable after iterations > > (this also doesn't seem to be designed in this way), > > Again, it's okay for the optimistic assumptions to remain in the > table, and in fact, is designed for it to happen. > The paper goes into why this is so. 
The paper conveniently proves the algorithm which _deletes_ the hash table between iterations. And then handwaves over why it's also okay to not delete it. It's wrong. > No, this is also okay. > Again, it is fine for the optimistic hashtable to have invalid info. No, it's not. Unproven but consistent is okay. Provably false is not. > > version in it), but this canonicalization needs to happen when looking up > > the hash table, not when _inserting_ into it, as canonicalization is > > transient > > and changes from iteration to iteration. > > > Again, this isn't right. The paper goes into detail as to why it is > okay for the optimistic table to behave this way, It does not. The RPO algorithm (the one proven) uses hash table deletes per iteration. About the SCC algorithm they have to say this: "Since we cannot remove the entries from the hash table after each pass as the RPO algorithm does, we will use two hash tables. The iterative phase uses an optimistic hash table. Once the value numbers in the SCC stabilize, entries are added to the valid table." > and why it is okay > to do algebraic simplification/etc on insert. > > The real problem seems to me, at least unless you guys haven't pasted > that part of the trace, that nitems_19 isn't part of the SCC but > should be. By the time iteration of the SCC finishes, we should have > discovered that nitems_19 does not have the value 0. > > The one in a real compiler I have of SCCVN both do canonicalization on > insert, as does the original code from Rice's massively scalar > compiler (which is where the algorithm comes from).
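The effect argued about in this comment can be reduced to a very small toy, sketched here under loud assumptions: this is not the SCCVN implementation, the "hash table" is a single slot for the expression (int) dest_9, and the names `toy_table`, `lookup_or_insert_cast` and `nitems_folds_to_zero` are invented for illustration. Iteration 0 stands for the optimistic pass in which dest_9 is still assumed equal to destptr.2_15; iteration 1 stands for the pass in which that assumption no longer holds.

```c
#include <stdbool.h>

enum { VN_DESTPTR = 1, VN_DEST = 2 };

/* One-slot stand-in for the optimistic hash table: it only remembers
   the value number recorded for the expression "(int) dest_9".  */
struct toy_table
{
  bool valid;
  int cast_vn;
};

static int
lookup_or_insert_cast (struct toy_table *t, int operand_vn)
{
  if (!t->valid)
    {
      t->valid = true;
      t->cast_vn = operand_vn;  /* canonicalized at insert time */
    }
  return t->cast_vn;            /* later lookups reuse the old entry */
}

/* Returns true if "destptr.2_15 - dest.3_16" would still be folded
   to 0 after the second iteration.  */
static bool
nitems_folds_to_zero (bool clear_between_iterations)
{
  struct toy_table table = { false, 0 };
  /* Value number of dest_9 in iteration 0 (optimistic) and 1.  */
  const int dest_vn[2] = { VN_DESTPTR, VN_DEST };
  int cast_vn = 0;

  for (int iter = 0; iter < 2; iter++)
    {
      if (clear_between_iterations)
        table.valid = false;    /* what the RPO algorithm does */
      cast_vn = lookup_or_insert_cast (&table, dest_vn[iter]);
    }
  /* The subtraction simplifies to 0 only when both operands share
     a value number.  */
  return cast_vn == VN_DESTPTR;
}
```

Without clearing, the entry inserted under the disproven assumption survives and the difference keeps folding to zero; with clearing between iterations the second pass records the corrected value number instead.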
[Bug middle-end/27409] [4.1/4.2 Regression] ICE in get_constraint_for_component_ref
--- Comment #5 from matz at suse dot de 2006-05-03 17:54 --- Created an attachment (id=11368) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11368&action=view) patch relative to 4.1 This is the same patch adjusted for 4.1. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27409
[Bug middle-end/27409] [4.1/4.2 Regression] ICE in get_constraint_for_component_ref
--- Comment #4 from matz at suse dot de 2006-05-03 17:53 --- Yes. I'm testing it for trunk and 4.1 on a couple platforms. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27409
[Bug ada/26678] "GNAT BUG DETECTED" when compiling AWS.
--- Comment #12 from matz at suse dot de 2006-05-03 15:48 --- It's bug 27409 now. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26678
[Bug middle-end/27409] New: ICE in get_constraint_for_component_ref
The testcase below ICEs in get_constraint_for_component_ref when compiled with -O1 or beyond on x86_64. Richard mentions that it also fails with trunk.
---
/* compile with gcc -c -Os -o foo.o foo.c */
typedef struct { struct { } z; } thang_t;
struct widget { struct widget *p, *q; };
typedef struct thing { struct widget x; } thing_t;
struct {
  int a; int b; int c; int d; int e;
  thang_t f;
  thing_t g;
} my_struct;
static void foo(thang_t *r) { splat(r);}
void function(int blaz) { foo(&my_struct.f);}
---
--
Summary: ICE in get_constraint_for_component_ref
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
GCC host triplet: x86_64-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27409
[Bug ada/26678] "GNAT BUG DETECTED" when compiling AWS.
--- Comment #10 from matz at suse dot de 2006-05-03 15:40 --- We also got a bug report about an ICE in get_constraint_for_component_ref, but have a C testcase. In the hope that it's the same reason I paste it here:
-
/* compile with gcc -c -O2 -o foo.o foo.c */
typedef struct { struct { } z; } thang_t;
struct widget { struct widget *p, *q; };
typedef struct thing { struct widget x; } thing_t;
struct {
  int a; int b; int c; int d; int e;
  thang_t f;
  thing_t g;
} my_struct;
static void foo(thang_t *r) { splat(r);}
void function(int blaz) { foo(&my_struct.f);}
--
This fails with 4.1.x on x86_64 with -O1 and beyond.
--
matz at suse dot de changed:
What |Removed |Added
CC   |        |matz at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26678
[Bug target/26826] [4.1/4.2 Regression] ICE in reg_or_subregno, at jump.c:2011
--- Comment #6 from matz at suse dot de 2006-03-25 21:10 --- The sequence of what happens is a bit involved, and breaks a very old invariant in reload.c, namely that subregs of MEM only happen for paradoxical subregs, except on WORD_REGISTER_OPERATIONS machines. That invariant doesn't seem to have held for a long time anyway, as there is already much code handling the cases where it doesn't hold. OTOH there is also other code in reload*.c which still seems to rely on this invariant (like the asserting code here), or tries to make sure it's true (e.g. in eliminate_regs_1). So, what happens is this:
* pseudo 64 doesn't get a hardreg, and we are faced with (subreg:QI (reg:SI 64)) in some operand
* the first time through the reload loop, reg_equiv_memory_loc[64] is still zero, hence no elimination is run on it, and no stack slot is created for it yet.
* the first time in calculate_needs_all_insns() it does an eliminate_regs call on the insn in question. As reg_equiv_memory_loc[64] is not yet filled in, it goes down until eliminate_regs_1((reg:SI 64)), which then allocates the stack slot for pseudo 64 in alter_reg. [this already is strange design, that stack slots are created sort of by accident in random order by trying to eliminate other regs]
* now reg_equiv_memory_loc[64] _is_ set up to that new stack slot. But we are still deep down in the calculate_needs_all_insns() activation, not in the outer reload loop. Hence reg_equiv_mem[64] and reg_equiv_address[64] are not yet filled (they are normally set up from reg_equiv_memory_loc[64] just before the whole insn scanning).
* now the insn in question is scanned further, and eventually goes into find_reloads(), which, before scanning constraints, tries to reload operand addresses. When it sees a SUBREG in an operand (as here), it uses find_reloads_toplev on that one.
* find_reloads_toplev tries to handle SUBREGs sensibly, i.e. tries to avoid creating (subreg (mem ...)), but can only do that if either reg_equiv_address[regno] or reg_equiv_mem[regno] is set up. See above for why this normally is the case, but not here.
* So it happily creates the problematic (subreg:QI (mem:SI stackslot)), which is stored into recog_data.operand[i] in find_reloads, so that further on we see that subreg-mem as operand (for this run in find_reloads).
* Further down the road it checks constraints, which all are fine, but then come the optional reloads.
* find_reloads tries to be nice to reload inheritance and creates an optional reload for each MEM operand (or subreg thereof), i.e. also for this one, so push_reload() is called on it.
* push_reload doesn't expect a SUBREG of MEM which isn't paradoxical, and even has some gcc_assert to that effect in some of its conditional blocks.
* OTOH it also has code to explicitly handle SUBREGs where the inner reg is not REG_P, but perhaps that is supposed to only handle subregs of constants, not (subreg(mem)). And it has to expect some non-paradoxical (subreg(mem)) on WORD_REGISTER_OPERATIONS machines anyway.
If either the stack slots had been set up already by the time calculate_needs_all_insns runs, or find_reloads_toplev also dealt with only reg_equiv_memory_loc being set (which it can't), this problem wouldn't have occurred. Sooo, the easiest solution for this I believe is the patch which Richard already mentioned (perhaps attach it here?), which simply also tests REG_P (SUBREG_REG (in/out)) in both places. The other solution would be to reinstate the invariant of subreg(mem) never occurring except on some machines, but that would be much harder to prove correct. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26826
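The shape of the proposed guard can be sketched on a toy expression type. Everything here is an assumption for illustration: `struct toy_rtx`, the `TOY_*` codes and `subreg_of_reg_p` are not GCC's rtx API, and the real patch is not shown in this thread; the sketch only mirrors the idea of additionally testing REG_P (SUBREG_REG (x)).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression nodes, loosely modelled on REG, MEM and SUBREG.  */
enum toy_code { TOY_REG, TOY_MEM, TOY_SUBREG };

struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *inner;  /* operand of a SUBREG, else NULL */
};

/* True only for a SUBREG whose inner expression is a register, so
   that a (subreg (mem ...)) is not treated like a (subreg (reg ...)).  */
static bool
subreg_of_reg_p (const struct toy_rtx *x)
{
  return x != NULL
         && x->code == TOY_SUBREG
         && x->inner != NULL
         && x->inner->code == TOY_REG;
}

/* Sample operands for experimenting with the predicate.  */
static const struct toy_rtx toy_reg = { TOY_REG, NULL };
static const struct toy_rtx toy_mem = { TOY_MEM, NULL };
static const struct toy_rtx toy_subreg_reg = { TOY_SUBREG, &toy_reg };
static const struct toy_rtx toy_subreg_mem = { TOY_SUBREG, &toy_mem };
```

A code path that asserts properties of the inner register would be entered only when `subreg_of_reg_p` holds, so the accidental (subreg:QI (mem:SI stackslot)) from the walkthrough above would skip it instead of tripping the assertion.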
[Bug libgcj/13212] JNI/CNI AttachCurrentThread does not register thread with garbage collector
--- Comment #29 from matz at suse dot de 2006-03-25 01:07 --- There is a minor glitch in the patch from Richard, which went in when cleaning it up. This line: + __asm__ (".symver pthread_create, pthread_create@@" GC_PTHREAD_SYM_VERSION); which creates the right version of the overriding symbol actually needs to read + __asm__ (".symver GC_pthread_create,pthread_create@@"GC_PTHREAD_SYM_VERSION); as that is how the function now is called. regtesting of libjava with that change in the patch on x86_64 looks good now (it doesn't with the patch as posted). -- matz at suse dot de changed: What|Removed |Added CC| |matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13212
[Bug c++/22063] link failure involving symbol visibility
--- Comment #5 from matz at suse dot de 2006-03-21 13:59 --- There is no such thing as a hidden reference. A symbol can be hidden, then it's not exported and all references from inside DSO are directly bound to it. That's not the situation we have here. We have a global exported symbol ('vtable of foo') in libfoo.so, which somehow is not found by the reference from inside liblinkfoo.so. This might also be a linker error, I don't know. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22063
[Bug middle-end/26643] Linux matroxfb_probe miscompiled
--- Comment #9 from matz at suse dot de 2006-03-13 08:57 --- -fno-ivopts fixes it. The problem, it seems, is how bitfield refs are dealt with. With -fno-ivopts the final_cleanup pass looks like so at the interesting point:
D.2180 = BIT_FIELD_REF <*pdev, 32, 544> & 4294967295;
...
if ((BIT_FIELD_REF <*b, 32, 0> & 4294967295) != D.2180) goto ; else goto ;
ivopts leads to this code at that point:
D.2180 = BIT_FIELD_REF <*pdev, 32, 544> & 4294967295;
...
if ((MEM[base: (long unsigned int *) b] & 4294967295) != D.2180) goto ; else goto ;
Now BIT_FIELD_REF<*b,32,0> extracts exactly the 32 bits at address 'b'. But MEM[base: (long unsigned int *) b] extracts the 64 bits at that address. The masking afterwards selects the lower 32 bits from that, but ppc being a big endian target, this extracts the wrong half. Let's CC Zdenek for this.
--
matz at suse dot de changed:
What |Removed |Added
CC   |        |rakdver at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26643
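Why the masked 64-bit load picks the wrong half on a big-endian target can be shown independently of the host by assembling the values from bytes. This is only a sketch: `load_be`, `narrow_read`, `wide_read_then_mask` and `demo_bytes` are names invented here, and the byte buffer stands in for the memory at 'b'.

```c
#include <stdint.h>

/* Eight bytes at "address b": first 32-bit word is 1, second is 2,
   both stored big-endian.  */
static const unsigned char demo_bytes[8] = { 0, 0, 0, 1, 0, 0, 0, 2 };

/* Interpret NBYTES bytes at P as a big-endian unsigned value.  */
static uint64_t
load_be (const unsigned char *p, int nbytes)
{
  uint64_t v = 0;
  for (int i = 0; i < nbytes; i++)
    v = (v << 8) | p[i];
  return v;
}

/* Like BIT_FIELD_REF <*b, 32, 0>: exactly the 32 bits at b.  */
static uint32_t
narrow_read (const unsigned char *b)
{
  return (uint32_t) load_be (b, 4);
}

/* Like MEM[(long unsigned int *) b] & 4294967295 on a big-endian
   64-bit target: load 64 bits, keep the low 32.  */
static uint32_t
wide_read_then_mask (const unsigned char *b)
{
  return (uint32_t) (load_be (b, 8) & UINT64_C (0xffffffff));
}
```

For `demo_bytes` the narrow read gives 1 while the wide read with mask gives 2, i.e. the word at offset 4 rather than the one at offset 0. On a little-endian target the two forms would agree, which is presumably why the problem only shows up on ppc.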
[Bug middle-end/22275] [3.4/4.0/4.2 Regression] bitfield layout change
--- Comment #53 from matz at suse dot de 2006-02-15 12:24 --- So, it's fixed. I'm not able to actually change the state to FIXED, so someone has to do this for me. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
[Bug tree-optimization/26260] PTA is slow with large structs (hits clisp)
--- Comment #1 from matz at suse dot de 2006-02-13 16:53 --- Created an attachment (id=10836) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10836&action=view) Testcase -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26260
[Bug tree-optimization/26260] New: PTA is slow with large structs (hits clisp)
clisp currently can't be compiled with optimization very well, because PTA takes ages when presented with source code of the form clisp uses. The attachment demonstrates this:
% /usr/lib/gcc/powerpc64-suse-linux/4.1.0/cc1 -O1 slow-pta.i
tree PTA : 18.08 (100%) usr 0.03 (75%) sys 18.11 (100%) wall
The code is trivial:
symbol_ *bla;
void slow (void) { bla = &symbol_tab_data.S_nil; }
The crux is the form of symbol_tab_data, which contains a large number of members, each of them being a struct containing seven pointers. Making that latter struct contain fewer members hugely decreases compile time.
--
Summary: PTA is slow with large structs (hits clisp)
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
GCC host triplet: ppc-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26260
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change
--- Comment #47 from matz at suse dot de 2006-02-12 03:59 --- What do you mean with 6 (as making more sense)? The size of the struct? Anyway, even ignoring that we are talking about structs which are packed in various ways (as you rightly noticed), even the old (IMHO more sensible) behaviour fulfills the C standard you quoted. By aligning, it automatically ensures that a following bitfield does not come to lie in the same unit (usually a byte), though that's obviously not the strictest interpretation of this rule. So it's not that the old behaviour would violate C99. What actually strengthens the case for going back are programs relying on that behaviour, _and_ that it's more expressive. With the new behaviour there's no difference between char :0; short :0; int :0; If the user really only wants to close the current unit he can write 'char :0'. But if he wants more alignment in an otherwise packed struct he currently has to play games, whereas with the pre-3.4 semantics he could have written 'int:0' (if "int" was his desired alignment for the next field). So, I still stand by my opinion that we should want to go back. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
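The distinction between 'char :0' and 'int :0' can be probed with a short sketch. The struct and function names are made up for illustration, these structs are deliberately not packed (so this shows only the ordinary, non-packed case), bit-fields of type char are an implementation-defined extension that GCC accepts, and the resulting offsets depend on the target ABI and compiler version, so no particular numbers are claimed here.

```c
#include <stddef.h>

/* "char :0" only closes the current storage unit.  */
struct close_unit_only
{
  char a : 1;
  char : 0;
  char b;
};

/* "int :0" additionally asks for int alignment of the next member.  */
struct align_to_int
{
  char a : 1;
  int : 0;
  char b;
};

static size_t
offset_after_char_zero (void)
{
  return offsetof (struct close_unit_only, b);
}

static size_t
offset_after_int_zero (void)
{
  return offsetof (struct align_to_int, b);
}
```

Comparing the two offsets on a given compiler shows whether the declared type of the zero-width bit-field makes a difference there; doing the same inside a packed struct is what this bug is about.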
[Bug c++/24996] [4.0/4.1/4.2 Regression] ICE on throw code
--- Comment #20 from matz at suse dot de 2006-02-02 16:56 --- I've put the patch to testing. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24996
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #42 from matz at suse dot de 2006-01-23 11:28 --- Created an attachment (id=10711) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10711&action=view) Testprogram
This program generates the following output for 3.3-hammer-branch on x86-64:
S_normal_i          size 8  align 4 ofs 4
S_pragma_i          size 8  align 1 ofs 4
S_packed_i          size 12 align 1 ofs 8
S_pragma_packed_i   size 12 align 1 ofs 8
S_normal_ll         size 16 align 8 ofs 8
S_pragma_ll         size 16 align 1 ofs 8
S_packed_ll         size 16 align 1 ofs 8
S_pragma_packed_ll  size 16 align 1 ofs 8
With -m32 it's:
S_normal_i          size 8  align 4 ofs 4
S_pragma_i          size 8  align 1 ofs 4
S_packed_i          size 8  align 1 ofs 4
S_pragma_packed_i   size 8  align 1 ofs 4
S_normal_ll         size 16 align 4 ofs 8
S_pragma_ll         size 16 align 1 ofs 8
S_packed_ll         size 12 align 1 ofs 4
S_pragma_packed_ll  size 12 align 1 ofs 4
Note how strangely 3.3 handled packed structs (as opposed to those under #pragma).
With 4.1 plus patch on x86-64:
S_normal_i          size 8  align 4 ofs 4
S_pragma_i          size 8  align 1 ofs 4
S_packed_i          size 8  align 1 ofs 4
S_pragma_packed_i   size 8  align 1 ofs 4
S_normal_ll         size 16 align 8 ofs 8
S_pragma_ll         size 16 align 1 ofs 8
S_packed_ll         size 16 align 1 ofs 8
S_pragma_packed_ll  size 16 align 1 ofs 8
With -m32:
S_normal_i          size 8  align 4 ofs 4
S_pragma_i          size 8  align 1 ofs 4
S_packed_i          size 8  align 1 ofs 4
S_pragma_packed_i   size 8  align 1 ofs 4
S_normal_ll         size 12 align 4 ofs 4
S_pragma_ll         size 12 align 1 ofs 4
S_packed_ll         size 12 align 1 ofs 4
S_pragma_packed_ll  size 12 align 1 ofs 4
So, it's at least consistent, and maybe even sensible. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
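The attached test program is not reproduced in this digest. A program producing output of this shape might look roughly like the sketch below; the struct bodies are guesses modelled on the CABINETSTATE example from this bug (only the names S_normal_i, S_pragma_i and S_packed_i are taken from the output above), the `REPORT` macro and `print_layouts` are hypothetical, and the `_ll` variants are omitted.

```c
#include <stddef.h>
#include <stdio.h>

/* Guessed layout: 16 bits of bit-fields, a zero-width bit-field,
   then an ordinary member whose offset is reported as "ofs".  */
struct S_normal_i
{
  int flags : 9;
  unsigned unused : 7;
  unsigned : 0;
  unsigned filter;
};

#pragma pack(push, 1)
struct S_pragma_i
{
  int flags : 9;
  unsigned unused : 7;
  unsigned : 0;
  unsigned filter;
};
#pragma pack(pop)

struct S_packed_i
{
  int flags : 9;
  unsigned unused : 7;
  unsigned : 0;
  unsigned filter;
} __attribute__ ((packed));

/* Print one line in the same style as the output quoted above.  */
#define REPORT(type) \
  printf ("%-12s size %zu align %zu ofs %zu\n", #type, \
          sizeof (struct type), _Alignof (struct type), \
          offsetof (struct type, filter))

static void
print_layouts (void)
{
  REPORT (S_normal_i);
  REPORT (S_pragma_i);
  REPORT (S_packed_i);
}
```

Calling `print_layouts ()` from `main` on different GCC versions and with or without -m32 would give a table comparable to the ones in this comment; the actual numbers are exactly what is under discussion in this bug, so none are asserted here.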
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #41 from matz at suse dot de 2006-01-23 11:23 --- Created an attachment (id=10710) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10710&action=view) candidate patch This patch contains some commented out test code I had in for playing around. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #40 from matz at suse dot de 2006-01-23 11:21 --- Mark, re your comment #38: (my comment #39 actually came before, but I forgot to press "Commit" :-/ ) the #pragma pack(1) does not influence DECL_PACKED. It is only set by attribute(packed). That's why the difference in behaviour between a struct under #pragma pack(1) and a struct with an attribute packed occurs. I agree that it conceptually makes sense to implicitly have a zero-width bit-field never be DECL_PACKED (though this would deviate from pre-3.4). The ugly thing is that current code in GCC seems to handle exactly this case differently. For instance in place_field(), we have this code:
if (PCC_BITFIELD_TYPE_MATTERS
    && ! targetm.ms_bitfield_layout_p (rli->t)
    && TREE_CODE (field) == FIELD_DECL
    && type != error_mark_node
    && DECL_BIT_FIELD (field)
    && ! DECL_PACKED (field)
    && maximum_field_alignment == 0
    && ! integer_zerop (DECL_SIZE (field))
The body then contains a call to ADJUST_FIELD_ALIGN. So if this is a zero-sized bitfield, then this ADJUST won't be done no matter what DECL_PACKED is, and it seems that this is wanted here (unlike the other place in layout_decl, where zero-sized bitfields simply weren't handled). The comment above this code says that its purpose is compatibility with PCC, so perhaps structs with zero-sized bitfields weren't handled at all by PCC, and this is a non-issue. I don't know. Otherwise it might be that both places need to be handled the same way to not risk inconsistencies. It currently looks as if it also isn't handled consistently right now (another call to ADJUST_... fails to test DECL_PACKED at all, for instance). Double-sigh. Anyway, I'll attach my current patch which implements the suggested behaviour, including zero-bitfield == !DECL_PACKED in layout_decl. And also a small test program showing information about different structs under different settings. 
It shows inconsistencies in 3.3, and with the patch 4.1 is more consistent (plus the case in wine, namely of using #pragma pack(1) still does the same as pre-3.4). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #39 from matz at suse dot de 2006-01-23 10:32 --- Gnah! It's even worse. I spoke too soon, and actually pre-3.4 (3.3 in fact) has the behaviour _swapped_. I.e. using the example struct I have these sizes (on i686):
3.3:       normal 16, pragma 16, packed 12
4.1+patch: normal 12, pragma 12, packed 16
The 3.3 case is nice in the sense that the packed struct actually is smaller than the unpacked struct, but in a way it also means that 3.3 did somehow adjust the field offsets for the packed struct. What do we do? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #37 from matz at suse dot de 2006-01-20 16:36 --- Hmpf. One more difficulty. x86 uses the ADJUST_FIELD_ALIGN macro to further fiddle with alignments of fields. On x86 this is used to adjust the alignment of long long to 4 (instead of the natural 8). This is used only when the field is not DECL_PACKED (makes sense). This has the funny side-effect that a struct containing a long long zero-width bitfield aligns to 4 for unpacked and to 8 for packed structs, i.e. the packed struct actually is _larger_ than the unpacked struct. E.g. the running example with UINT being long long: typedef int BOOL; typedef unsigned long long UINT; #pragma pack(1) typedef struct { BOOL fFullPathTitle:1; BOOL fSaveLocalView:1; BOOL fNotShell:1; BOOL fSimpleDefault:1; BOOL fDontShowDescBar:1; BOOL fNewWindowMode:1; BOOL fShowCompColor:1; BOOL fDontPrettyNames:1; BOOL fAdminsCreateCommonGroups:1; UINT fUnusedFlags:7; UINT :0; UINT fMenuEnumFilter; } CABINETSTATE; This struct being unpacked (only influenced by #pragma) has size 12 after my patch on i686, and size 16 (!) when being packed via attribute. What's even more ugly is that pre-3.4 GCC had a size of 16 for both cases (pragma or attribute packed) on i686 :-( So, how would we like to handle this? Doing as pre 3.4 did is probably possible but not trivially done. Basically the code doing this is: if (! DECL_USER_ALIGN (decl) && ! DECL_PACKED (decl)) { #ifdef ADJUST_FIELD_ALIGN DECL_ALIGN (decl) = ADJUST_FIELD_ALIGN (decl, DECL_ALIGN (decl)); #endif } Now, I could just ignore DECL_PACKED for zero-width bitfields, then the adjustment would be done for both cases and we had a size of 12 with attribute or pragma, i.e. not the same as pre 3.4 in both. I also could never adjust zero-width bitfields, so that they would get their natural alignment even when the target wanted something else. Then both cases would have size 16, being the same as pre 3.4. 
I'm leaning towards not doing field adjustments for zero-width bitfields at all, which has the effect that a zero-width bitfield behaves as if it had a user alignment (that of its base type) set explicitly. I think if one understands zero-width bitfields purely as alignment constraints then this implicit DECL_USER_ALIGN behaviour seems sensible. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
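The two sizes mentioned in this comment (12 and 16) follow from simple offset arithmetic, sketched here. The helper names are invented, and the model assumes what the comment describes: the bit-fields before the zero-width one occupy 16 bits (2 bytes), the final member is 8 bytes wide and is itself placed without padding, and only the alignment applied by `UINT :0` varies.

```c
/* Round OFFSET up to the next multiple of ALIGN (both in bytes).  */
static unsigned
round_up (unsigned offset, unsigned align)
{
  return (offset + align - 1) / align * align;
}

/* Size of the running CABINETSTATE example with UINT = long long,
   depending on the alignment the zero-width bit-field enforces.  */
static unsigned
cabinetstate_size (unsigned zero_bitfield_align)
{
  unsigned bitfield_bytes = 2;  /* 9 one-bit flags + 7 unused bits */
  unsigned member_offset = round_up (bitfield_bytes, zero_bitfield_align);
  return member_offset + 8;     /* 8-byte fMenuEnumFilter follows */
}
```

With the ADJUST_FIELD_ALIGN value of 4 the member lands at offset 4 and the struct is 12 bytes; with the natural long long alignment of 8 it lands at offset 8 and the struct is 16 bytes, which matches the two outcomes weighed above.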
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #36 from matz at suse dot de 2006-01-20 14:01 --- Yes. Should be done shortly. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #32 from matz at suse dot de 2006-01-19 14:44 --- Mark, I agree that it's saner when both structures (with #pragma pack and attribute packed) have the same length of 8 on i686 and x86_64 (because the bitfield was declared 'int' as opposed to 'long', for instance). Then I have a question to clarify whether I understood correctly: by remembering the original maximum_field_alignment and using that for zero-sized bitfields, do you want to use the absolute first, default, m_f_a, or the one last set before the innermost #pragma pack? Consider an example like so, and let's assume the initial max field alignment was 4:
                        mfa == 4
#pragma pack(2)         // 1
                        mfa == 2
#pragma pack(1)         // 2
                        mfa == 1
#pragma pack()          // 3
                        mfa == 4
#pragma pack (push,2)   // 4
                        mfa == 2
#pragma pack (push,1)   // 5
                        mfa == 1
#pragma pop             // 6
                        mfa == 2
#pragma pop             // 7
                        mfa == 4
With what would you constrain the alignment of a zero-sized bitfield at each of the seven points? What if the initial mfa is 0 (i.e. not set)? Should -fpack-struct=... (which influences the initial mfa) influence that constraint too, or not? My opinion is that at each of the seven points above we should constrain with the initial mfa (i.e. 4 in the example above), as adjusted by the -fpack-struct command line option. That would have the effect of aligning a zero-sized bitfield at most to the architecture default (possibly adjusted globally by the cmdline option), while effectively ignoring all #pragma packs in effect. I think that is what we want the semantics of a zero-sized bitfield to be. Agreed? Another point: if we make the structure with attribute packed be eight bytes long on both x86 and x86-64 (to agree with the behaviour of using pragma), then we unfortunately do add another variant. In pre-3.4 that structure was 12 on x86-64 (which I think was an actual error). Wine itself uses only #pragma pack AFAIK, so it wouldn't be affected by this change. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
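The seven points of the example can be replayed with a small simulation. This is a sketch and not GCC's pragma handling: `struct pack_state` and the `pack_*` helpers are invented names, the stack depth limit is arbitrary, and `zero_bitfield_cap` encodes only the rule proposed in this comment (always use the initial mfa).

```c
#include <stddef.h>

enum { PACK_STACK_MAX = 16 };

struct pack_state
{
  unsigned initial;                /* initial mfa, e.g. from -fpack-struct */
  unsigned current;                /* mfa currently in effect */
  unsigned stack[PACK_STACK_MAX];  /* values saved by push */
  size_t depth;
};

static void
pack_init (struct pack_state *s, unsigned initial)
{
  s->initial = initial;
  s->current = initial;
  s->depth = 0;
}

/* #pragma pack(n) */
static void
pack_set (struct pack_state *s, unsigned n)
{
  s->current = n;
}

/* #pragma pack() */
static void
pack_reset (struct pack_state *s)
{
  s->current = s->initial;
}

/* #pragma pack(push, n) */
static void
pack_push (struct pack_state *s, unsigned n)
{
  if (s->depth < PACK_STACK_MAX)
    s->stack[s->depth++] = s->current;
  s->current = n;
}

/* The pop at points 6 and 7 of the example.  */
static void
pack_pop (struct pack_state *s)
{
  if (s->depth > 0)
    s->current = s->stack[--s->depth];
}

/* Proposed rule: a zero-sized bit-field is constrained only by the
   initial mfa, whatever pragmas are in effect.  */
static unsigned
zero_bitfield_cap (const struct pack_state *s)
{
  return s->initial;
}

/* Replay the example up to POINT (1..7) and return the mfa there.  */
static unsigned
mfa_at_point (int point)
{
  struct pack_state s;
  pack_init (&s, 4);
  for (int step = 1; step <= point && step <= 7; step++)
    switch (step)
      {
      case 1: pack_set (&s, 2); break;
      case 2: pack_set (&s, 1); break;
      case 3: pack_reset (&s); break;
      case 4: pack_push (&s, 2); break;
      case 5: pack_push (&s, 1); break;
      case 6: pack_pop (&s); break;
      case 7: pack_pop (&s); break;
      }
  return s.current;
}
```

`mfa_at_point` reproduces the sequence 2, 1, 4, 2, 1, 2, 4 from the example, while `zero_bitfield_cap` would return 4 at every one of the seven points under the proposed rule.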
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #28 from matz at suse dot de 2006-01-17 22:31 --- And indeed with this testcase:
typedef int BOOL;
typedef unsigned int UINT;
typedef struct {
  BOOL fFullPathTitle:1;
  BOOL fSaveLocalView:1;
  BOOL fNotShell:1;
  BOOL fSimpleDefault:1;
  BOOL fDontShowDescBar:1;
  BOOL fNewWindowMode:1;
  BOOL fShowCompColor:1;
  BOOL fDontPrettyNames:1;
  BOOL fAdminsCreateCommonGroups:1;
  UINT fUnusedFlags:7;
  UINT :0;
  UINT fMenuEnumFilter;
} __attribute__((packed)) CABINETSTATE;
int f[sizeof(CABINETSTATE) == 8? 1 : -1];
---
it's still broken with the patch. gcc 3.3 has size 8 on i686 and size 12 on x86-64. With some other fix to my patch (below) I get 8 and 8 (without that fix it's 6 and 6). This is consistent with the old #pragma pack(1) behaviour, so arguably this was an inconsistency in 3.3 that is worth fixing, but it's still a change in behaviour. It would be interesting to know what earlier compilers had. The mentioned fix is ignoring zero-sized bitfields also when DECL_PACKED is set, like so:
Index: stor-layout.c
===
--- stor-layout.c (revision 107699)
+++ stor-layout.c (working copy)
@@ -337,6 +337,7 @@
 /* For fields, it's a bit more complicated... */
 {
 bool old_user_align = DECL_USER_ALIGN (decl);
+ bool zero_bitfield = false;
 if (DECL_BIT_FIELD (decl))
 {
@@ -345,9 +346,9 @@
 /* A zero-length bit-field affects the alignment of the next field. */
 if (integer_zerop (DECL_SIZE (decl))
- && ! DECL_PACKED (decl)
 && ! targetm.ms_bitfield_layout_p (DECL_FIELD_CONTEXT (decl)))
 {
+ zero_bitfield = true;
 #ifdef PCC_BITFIELD_TYPE_MATTERS
 if (PCC_BITFIELD_TYPE_MATTERS)
 do_type_align (type, decl);
@@ -408,6 +409,7 @@
 check old_user_align instead. */
 if (DECL_PACKED (decl)
 && !old_user_align
+ && !zero_bitfield
 && (DECL_NONADDRESSABLE_P (decl)
 || DECL_SIZE_UNIT (decl) == 0
 || TREE_CODE (DECL_SIZE_UNIT (decl)) == INTEGER_CST))
@@ -428,7 +430,7 @@
 }
 /* Should this be controlled by DECL_USER_ALIGN, too? */
- if (maximum_field_alignment != 0)
+ if (maximum_field_alignment != 0 && ! zero_bitfield)
 DECL_ALIGN (decl) = MIN (DECL_ALIGN (decl), maximum_field_alignment);
 }
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
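The last line of the testcase, `int f[sizeof(CABINETSTATE) == 8? 1 : -1];`, is a compile-time size check. The idiom can be sketched on its own as follows; the macro name `COMPILE_TIME_CHECK` and the checked condition are made up for illustration, and `_Static_assert` assumes a C11 compiler.

```c
#include <limits.h>

/* Pre-C11 idiom used by the testcase: an array type of size -1 is a
   hard error, so the declaration only compiles when COND holds.  */
#define COMPILE_TIME_CHECK(name, cond) \
  typedef char name[(cond) ? 1 : -1]

COMPILE_TIME_CHECK (char_has_at_least_8_bits, CHAR_BIT >= 8);

/* C11 equivalent with a readable diagnostic.  */
_Static_assert (CHAR_BIT >= 8, "char must have at least 8 bits");
```

In the testcase the condition is the struct size, so a layout change between compiler versions turns into a compile error instead of silently different code, which makes it convenient for regression testing.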
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #27 from matz at suse dot de 2006-01-17 22:12 ---
Funnily I've also looked at stor-layout.c a bit, and basically came to a similar conclusion and patch as Steven. I agree that as per documentation PCC_BITFIELD_TYPE_MATTERS overrides EMPTY_FIELD_BOUNDARY. But that was also a change by Jason's patch. Formerly it just "influenced" how empty fields are handled. Clearly it influenced it in a different way than simple overriding.

The problem, as Steven already analyzed, is twofold:

1) maximum_field_alignment now also affects empty bitfields, which it didn't before, because before Jason's patch maximum_field_alignment was evaluated before other things were taken into account, and now it's the last thing done. That's why earlier something larger than alignment 8 bit was possible with #pragma pack(1) at all.

2) A bug or feature in pre-3.4 led to the ignoring of EMPTY_FIELD_ALIGNMENT when larger than 32 in our case. Namely because the initial alignment of 64 (as required by EMPTY_FIELD_ALIGNMENT) was overridden by the alignment of the type of the bitfield (int here, i.e. 32 bit).

I've come up with this simple patch for the problem, which fixes the testcase for i386 and x86-64 (in the sense of being compatible with <= 3.3). The idea is to simply ignore the max field alignment for empty bitfields (hence falling back to either PCC_BITFIELD_TYPE_MATTERS or EMPTY_FIELD_ALIGNMENT as the target requested). This needs to be tested also with structs where the packed property is not due to a #pragma pack(1) but rather a packed attribute, or similar.

Index: stor-layout.c
===
--- stor-layout.c (revision 107699)
+++ stor-layout.c (working copy)
@@ -337,6 +337,7 @@
   /* For fields, it's a bit more complicated... */
   {
     bool old_user_align = DECL_USER_ALIGN (decl);
+    bool zero_bitfield = false;
     if (DECL_BIT_FIELD (decl))
       {
@@ -348,6 +349,7 @@
             && ! DECL_PACKED (decl)
             && ! targetm.ms_bitfield_layout_p (DECL_FIELD_CONTEXT (decl)))
           {
+            zero_bitfield = true;
 #ifdef PCC_BITFIELD_TYPE_MATTERS
             if (PCC_BITFIELD_TYPE_MATTERS)
               do_type_align (type, decl);
@@ -428,7 +430,7 @@
     }
   /* Should this be controlled by DECL_USER_ALIGN, too? */
-  if (maximum_field_alignment != 0)
+  if (maximum_field_alignment != 0 && ! zero_bitfield)
     DECL_ALIGN (decl) = MIN (DECL_ALIGN (decl), maximum_field_alignment);
 }

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
[Bug middle-end/22275] [3.4/4.0/4.1/4.2 Regression] bitfield layout change (regression?)
--- Comment #23 from matz at suse dot de 2006-01-16 15:14 --- The x86-64 ABI itself doesn't talk about zero-sized bitfields. So both behaviours are correct regarding the ABI. It talks about unnamed bitfields (which zero-sized ones must be) not influencing the overall alignment of structures or unions, but the problem here is different. Having said that, I agree with Mark's mail on gcc@ that the pre-3.4 behaviour made more sense. Unfortunately I'm also no stor-layout.c expert, so I can't really comment on the best approach to implementing it. I assume Jason would be the best to comment here, as he changed that behaviour. Steven's latest patch changes the evaluation of EMPTY_FIELD_BOUNDARY vs. PCC_BITFIELD_TYPE_MATTERS, so someone needs to make sure that this is okay. Additionally I don't know how stor-layout tracks alignment, i.e. whether desired_align contains the alignment for the _current_ field, or for the _next_ field. A zero-sized bitfield should influence alignment of the next field (although due to the size of zero this shouldn't make a difference). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22275
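The statement that a zero-sized bitfield should influence the placement of the next field can be observed with a small C sketch. The struct and function names here are hypothetical, and the concrete sizes are ABI-dependent, so the sketch only compares the two layouts against each other instead of assuming particular numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Two small bitfields that normally share one storage unit. */
struct without_zero {
    int a:3;
    int b:5;
};

/* The same fields, separated by an unnamed zero-width bitfield. */
struct with_zero {
    int a:3;
    int :0;
    int b:5;
};

size_t size_without_zero(void)
{
    return sizeof(struct without_zero);
}

size_t size_with_zero(void)
{
    return sizeof(struct with_zero);
}

/* Returns nonzero if the ":0" visibly pushed b into a new storage unit
   on the ABI this was compiled for. */
int zero_width_moves_next_field(void)
{
    return size_with_zero() > size_without_zero();
}

/* The separator may cost space or be a no-op, but should never shrink
   the struct. */
void check_zero_width_layout(void)
{
    assert(size_with_zero() >= size_without_zero());
}
```

Whether zero_width_moves_next_field() returns nonzero on a given target is exactly the kind of behaviour the discussion above is about, so no value is claimed here.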
[Bug c++/25417] New: internal compiler error in check_initializer; hits clisp
This small testcase is extracted from clisp, and throws an ICE:
---
struct object {
  int a;
  int b;
};

void f (int c, int d)
{
  object o = ((object){ a : c, b : d});
}

% g++ -c bla.ii
bla.ii: In function ‘void f(int, int)’:
bla.ii:8: internal compiler error: in check_initializer, at cp/decl.c:4613

Removing the explicit cast and the surrounding parentheses, i.e. to read like so:
  object o = { a : c, b : d};
makes this compile.

--
Summary: internal compiler error in check_initializer; hits clisp
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
GCC host triplet: ia64-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25417
[Bug tree-optimization/25329] New: Miscompilation in tcl, -INT_MIN test misoptimized
This testcase:

int bla (long l)
{
  long lR;
  if (l < 0) {
    lR = - l;
    if (lR < 0) { // 2
      return 1;
    }
  }
  return 0;
}
extern void abort ();
int main()
{
  if (!bla (-2147483648))
    abort ();
  return 0;
}
--
will abort when compiled with -O2, because the inner test (marked 2) is optimized away. The variable lR is negative at that point, but the test won't trigger as it was removed. This same code is used in tcl 8.4.12 to detect if the passed number is the most negative one (like in the above example), and special case that one. So this code relies on the fact that -INT_MIN is INT_MIN again, and hence both <0 tests will trigger.

Now, with -fwrapv this testcase is not miscompiled, but I'm not sure if this behaviour of GCC is really justified by the standard and our intention. Certainly the -INT_MIN stored into an int is implementation defined, and older GCCs (and in fact also the current one) store INT_MIN into the target. Just deleting the test seems wrong.

--
Summary: Miscompilation in tcl, -INT_MIN test misoptimized
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
GCC host triplet: i386-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25329
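Independent of what the compiler ought to do with the testcase, a check for "the most negative value" can be written so that it does not depend on the result of negating that value. The following is only a sketch under that assumption, not the code used in tcl; the names is_most_negative, magnitude and check_most_negative are made up for illustration. It compares against LONG_MIN directly and, where the absolute value is needed, negates in unsigned arithmetic, where wraparound is well defined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: detect the most negative long without negating it. */
int is_most_negative(long l)
{
    return l == LONG_MIN;
}

/* Hypothetical helper: absolute value as unsigned long.  The negation is
   done on the unsigned value, so it wraps modulo 2^N instead of
   overflowing a signed type. */
unsigned long magnitude(long l)
{
    unsigned long u = (unsigned long) l;
    return l < 0 ? 0UL - u : u;
}

/* Small self-check of both helpers. */
void check_most_negative(void)
{
    assert(is_most_negative(LONG_MIN));
    assert(!is_most_negative(-1L));
    assert(magnitude(-5L) == 5UL);
    assert(magnitude(LONG_MIN) == (unsigned long) LONG_MAX + 1UL);
}
```

Written this way the special case survives optimization with or without -fwrapv, because no signed negation of LONG_MIN is ever evaluated.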
[Bug middle-end/24969] tmpdir-gcc.dg-struct-layout-1/t026 fails execution
--- Comment #6 from matz at suse dot de 2005-11-21 14:25 --- Something is fishy. Iff registers are used for passing then it would have to be %rdi and %rsi (not %rax)! So the high part of this struct (where the bitfield lies) is not passed at all here. Per ABI this whole struct should be passed in registers (it's not larger than two eightbytes, and both eightbytes have class INTEGER (they contain no unaligned fields or other fancy stuff)). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24969
[Bug target/24661] unable to find a register to spill in class NO_REGS on ia64
--- Comment #11 from matz at suse dot de 2005-11-09 15:32 --- You mean ABI change, because the input register seems to be f8, instead of in0 (as would be need for this union)? I'm not sure, but it looks fishy at least. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24661
[Bug target/24661] unable to find a register to spill in class NO_REGS on ia64
--- Comment #9 from matz at suse dot de 2005-11-09 14:49 ---
A shorter testcase (which at least breaks on SuSE's 3.3-hammer compiler) is:
---
typedef union value { long double d; } Value;
double ld2d(Value v) { return v.d; }
---
The problem seems to be that the register containing the union is TImode, which gets converted to TFmode via subreg, which in turn is truncated to DFmode via float_truncate, which somehow confuses gcc.

--
matz at suse dot de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24661
[Bug libstdc++/21072] base allocator change shared object issues
--- Comment #7 from matz at suse dot de 2005-11-07 19:59 --- Of course not. But unaware people currently trying trunk on distros which provided 3.4 or 4.0 will get non-obvious problems, and I'm not sure if that's a good idea (ignoring this problem, 4.0 and trunk are nearly compatible, and 4.0 compiled programs work with the trunk libstdc++, which has the same SOname as the 4.0 one). I think the only way to switch to the 'mt' allocator by default for the future without API issues would be to rename it to 'new', and not via some configure arguments. Another reason is so that switching the default allocator back is not forgotten when 4.1 branches, which I think is necessary anyway, so that 4.1 libs are compatible with 4.0 programs. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21072
[Bug libstdc++/21072] base allocator change shared object issues
--- Comment #5 from matz at suse dot de 2005-11-04 14:45 --- While 4.0 had this fixed, trunk still uses the 'mt' allocator by default on linux, and hence is incompatible with 3.4 and 4.0 by default. Is that really intended, or shouldn't also trunk default back to the 'new' allocator? -- matz at suse dot de changed: What|Removed |Added CC| |matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21072
[Bug c++/23691] [4.0 Regression] `mpl_::bool_::value' is not a valid template argument for type `bool' because it is a non-constant expression
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23691
[Bug c++/23699] [4.0/4.1 Regression] patch for #23099 breaks glibmm
--- Additional Comments From matz at suse dot de 2005-09-02 16:14 --- Yes, I also got the boost error. And I got that with a 4.0 CVS version from today. Reverting Marks patch also solves the boost problem described in PR23691. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23699
[Bug c++/23699] New: patch for #23099 breaks glibmm
Mark's patch for fixing PR23099 from 2005-08-29 makes glibmm not compile:

% cat glib-test.cc
#include <string>
struct A{
  static const long npos = std::string::npos;
};
% g++ -c glib-test.cc
glib-test.cc:3: error: field initializer is not constant

Reverting it makes this compile again. To fail it seems to need that the type of std::string::npos comes from a template argument member (the allocator of std::string in this case). Easier tests do compile. Also doing the initialization of A::npos outside the class definition makes this compile.

--
Summary: patch for #23099 breaks glibmm
Product: gcc
Version: 4.0.2
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: c++
AssignedTo: mark at codesourcery dot com
ReportedBy: matz at suse dot de
CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23699
[Bug tree-optimization/23326] [4.0 Regression] Wrong code from forwprop
--- Additional Comments From matz at suse dot de 2005-09-01 15:11 --- This still isn't in the 4.0 branch. Perhaps ping it? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23326
[Bug c++/23472] __attribute__((constructor)) called twice with -funit-at-a-time
--- Additional Comments From matz at suse dot de 2005-08-19 01:36 --- Still a problem in the current hammer branch. CCing Honza. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23472
[Bug rtl-optimization/15265] delete_output_reload deletes necessary insn
--- Additional Comments From matz at suse dot de 2005-08-11 16:13 --- I don't think this is actually fixed in reload1.c. Perhaps it is hidden by other changes, so that the particular miscompilation doesn't happen anymore, but even HEAD reload1.c contains the questionable double counting of inherited operands. Might be with enough work (perhaps by disabling some tree-ssa passes, but retaining -O1) we can still make it fail with mainline. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15265
[Bug c++/22592] -fvisibility-inlines-hidden broken differently
--- Additional Comments From matz at suse dot de 2005-07-27 13:46 ---
Because these symbols indeed are not defined anywhere. On linux this happens to work, but on darwin you need to link against something which provides them. So you would need to create a library which implements both operators out-of-line (and hence also the vtable), and use that to link _this_ library against. But that's not the issue at hand.

This is the bigger picture of how this error can be seen in the real world: First, there is a base library (let's call it libbase) implementing the whole class, all methods, the vtable, everything. Then there's another library (libtwo), using that class in implementing some of its functionality (breakme in our case). It does so by including the header for that class (defining the inline operator !=, besides declaring the class), and linking against libbase. Hence no unresolved symbols will occur. libtwo exports only those symbols and classes it wants exported, hence it switches the default visibility to hidden (including inlines), because all these are already defined in libbase, so there is no need to export them too.

This is all perfectly valid usage of shared libs. But it doesn't work, because libtwo can't be created due to the invalid call emitted to a method not defined in the same DSO. Perhaps I should have made the bigger picture clearer, so as not to sidetrack others with the undefinedness of operator==. In the real world it _will_ be defined, in a different shared lib.

So just for reference, a little bit reformatted:
--- libbase.ii
struct A {
  virtual bool operator== (const A &a) const;
  virtual bool operator!= (const A &a) const;
};
inline bool A::operator!= ( const A &a) const
{
  return !(*this == a);
}
bool A::operator== (const A&a) const
{
  return true;
}
---
Compile this with just "g++ -fPIC -shared -o libbase.so libbase.ii", and you have a shared lib you can use to link against when creating the second shared lib, from the source of the initial report here.
Note that the first few lines (including the definition of operator !=) reflect a header file which declares class A, and which is included in libbase.ii and testcase.ii.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22592
[Bug c++/22592] -fvisibility-inlines-hidden broken differently
--- Additional Comments From matz at suse dot de 2005-07-22 12:46 --- I don't understand. The code itself is perfectly valid C++, I don't think you mean that it's invalid, right? Yes, operator== is also hidden, but there is no definition for it in this unit, hence GCC generates the correct call type (over PLT). (It should also be noted that because of the other bugs GCC can't emit the .hidden directives for undefined symbols, except when using HJs patches, but that's tangential and wouldn't make a difference, the crucial point is, that the correct call is emitted). And irrespective of that the error also happens without -fvisibility=hidden, i.e. when _only_ the inlines are hidden. I still think this is a bug, which should be corrected by making GCC just emit an out-of-line copy of the inline function (in a linkonce section). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22592
[Bug middle-end/22592] New: -fvisibility-inlines-hidden broken differently
This is another instance of the visibility stuff being broken. This time because a function isn't emitted, but called in a way as if it was. This code:
-
struct A {
  virtual bool operator== (const A &a) const;
  virtual bool operator!= (const A &a) const;
};
inline bool A::operator!= ( const A &a) const
{
  return !(*this == a);
}
bool breakme(const A& newPath, A &b)
{
  return A(newPath) != b;
}
-
Compile with (4.0.2 pre):
% g++ -fPIC -DPIC -I/usr/lib/qt3/include -fvisibility=hidden \
  -fvisibility-inlines-hidden -Wall -O2 -o libbla.so -shared testcase.ii
ld: /tmp/ccWAn1go.o: relocation R_X86_64_PC32 against `A::operator!=(A const&) const' can not be used when making a shared object; recompile with -fPIC

The chain of events goes like so:
1) the operator is virtual, hence initially the call is through the vtable
2) inlining happens before that is optimized, hence the operator call is not inlined
3) the virtual call is optimized to a direct call of the operator!=, because of the explicit copy operation GCC knows the final type of the object on which it is called (namely "A"), and hence can transform the indirect to a direct call
4) the call is emitted such that it expects a local (hidden) definition of the function in the DSO (i.e. not over PLT but direct call)
5) But no copy of that function is emitted in this DSO, because the out-of-line copies of such inline functions are generated only in the .o file which contains the key method of the class (the first virtual func in this case). So nothing can be made hidden, and it wouldn't work anyway even if it could, as there simply is nothing to call in this DSO.

I believe this is a different error from all the other visibility problems like bug 19664 or bug 20218.

I don't know what GCC should do here. It either needs to emit an out-of-line copy of this operator, or generate the call over a PLT. The first solution would be better, but could mean that we need to emit such out-of-line copies in every .o file where they are referenced.
The second solution might not even work at all, as the main library (containing the key method and hence the out-of-line versions) is probably also compiled with -fvisibility-inlines-hidden, and hence wouldn't even export those functions which a call over PLT could resolve to.

So, it seems that cgraph needs to be changed somehow to emit this function if referenced. Hence CCing Honza.

--
Summary: -fvisibility-inlines-hidden broken differently
Product: gcc
Version: 4.0.2
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
CC: gcc-bugs at gcc dot gnu dot org,jh at suse dot cz
GCC host triplet: x86_64-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22592
[Bug c++/22568] Should use cmov in some stituations
--- Additional Comments From matz at suse dot de 2005-07-20 14:20 --- This still happens with 4.1. I also can't make it use two cmovs, by changing the source a bit, e.g. like: typedef unsigned long ulong; extern ulong use (ulong, ulong); ulong f(ulong a, ulong b) { ulong tmp = a; if (a < b) { a = b; b = tmp; } return use (a, b); } -- What|Removed |Added CC| |matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22568
[Bug target/19653] x87 reg allocated for constants for -mfpmath=sse
--- Additional Comments From matz at suse dot de 2005-07-13 13:55 --- I was going to add this text to PR22453, when I noticed that it was closed as duplicate to this one. So putting it here for reference, although everything seems to be analyzed already: The reload happens, because reg 58 gets no hardreg, because it's live over a call, and it's not worthwhile to put it into a call clobbered reg (which SSE regs are). So reg 58 is placed onto stack (at ebp+16). Now this mem must be initialized with 1.0. If that is done via x87 (fld1 , fst ebp+16), via GENERAL_REGS (mov 1.0 -> (reg:DF ax) , mov (reg:DF ax) -> (ebp+16)), or via SSE_REGS (movsd (mem 1.0) -> xmm0 , mov xmm0 -> (ebp+16)) is actually not that important. You won't get rid of this reload. Except that _if_ you force it to use SSE_REGS, then the next reload from (ebp+16) for the next insn can be inherited (as it's then the same mode), hence the initial store to ebp+16 is useless and will be removed. This can be tested with this hack: --- i386.md 12 Jul 2005 09:20:12 - 1.645 +++ i386.md 13 Jul 2005 13:47:06 - @@ -2417,9 +2417,9 @@ (define_insn "*movdf_nointeger" [(set (match_operand:DF 0 "nonimmediate_operand" - "=f#Y,m ,f#Y,*r ,o ,Y*x#f,Y*x#f,Y*x#f ,m") + "=?f#Y,m ,f#Y,*?r ,o ,Y*x#f,Y*x#f,Y*x#f ,m") (match_operand:DF 1 "general_operand" - "fm#Y,f#Y,G ,*roF,F*r,C,Y*x#f,HmY*x#f,Y*x#f"))] + "?fm#Y,f#Y,G ,*?roF,F*r,C,Y*x#f,HmY*x#f,Y*x#f"))] "(GET_CODE (operands[0]) != MEM || GET_CODE (operands[1]) != MEM) && ((optimize_size || !TARGET_INTEGER_DFMODE_MOVES) && !TARGET_64BIT) && (reload_in_progress || reload_completed But I don't see immediately how reload could be convinced to do so automatically, as the choice of the reload class for one insn is independend from the choices of reloads for the same reg but in other insns. -- What|Removed |Added ------------ CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19653
[Bug gcov/profile/21388] gcov-io.h compilation warning
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21388
[Bug c++/22252] [4.0/4.1 Regression] pragma interface/implementation still break synthesized methods
--- Additional Comments From matz at suse dot de 2005-06-30 15:23 --- Ah, I see. Note that you must compile the reduced testcase (thanks for that) with -O0, or with -fno-inline, otherwise the A::A ctor will be inlined in use.cc (making the warning about the non-availability of it even more funny ;-) ), and not lead to the link error. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22252
[Bug c++/22252] [4.0/4.1 Regression] pragma interface/implementation still break synthesized methods
--- Additional Comments From matz at suse dot de 2005-06-30 15:01 --- Andrew: that's not a diagnostic issue. While the diagnostic (the warning) indeed is wrong and misleading (and probably points to the underlying cause of this issue), the actual error I'm complaining about is the link error, due to not emitting an out-of-line copy of A::A() in a.cc as it would be required. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22252
[Bug c++/22252] New: pragma interface/implementation still break synthesized methods
This is similar to bug 21280, but it is _not_ fixed by the patches therein. In fact it still happens with current 4.0 branch as of 2005-06-30. Compile these files:

% cat a.h
#include
#pragma interface
struct A { std::vector vc; };
% cat use.cc
#include "a.h"
A a;
int main() {}
% cat a.cc
#include
#pragma implementation
#include "a.h"
% g++ -W -Wall -O2 a.cc use.cc
a.h:3: warning: inline function ‘A::A()’ used but never defined
a.h:3: warning: inline function ‘A::~A()’ used but never defined
/tmp/ccuIwMBN.o: In function `__static_initialization_and_destruction_0(int, int)':
use.cc:(.text+0x31): undefined reference to `A::A()'
collect2: ld returned 1 exit status

This is because A::A() is not synthesized, although it should happen in file a.cc (which contains the pragma implementation). So, something still is wrong (this actually breaks building lyx btw.) after 21280 was fixed.

--
Summary: pragma interface/implementation still break synthesized methods
Product: gcc
Version: 4.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
CC: gcc-bugs at gcc dot gnu dot org,schwab at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22252
[Bug middle-end/22197] invalid "is" used uninitialized, should be "may be"
--- Additional Comments From matz at suse dot de 2005-06-27 13:50 ---
Hmm, sort of. The call of g(i) also warns with "is used", although I think it might deserve only a "may be used". But anyway I think that this nevertheless has different causes. It's not the call creating the problem, but the copy itself. One could for instance delete the call and instead make 'testarray' volatile (so that the copy is not optimized away). This would still warn, IMHO incorrectly. What's even stranger is that if I add a call "forget(testvar)" then the warning vanishes. I.e. like so:
-
int main()
{
  struct testme volatile testarray[1];
  struct testme testvar;
  testvar.testval = 0;
  testarray[0] = testvar;
  forget (testvar);
  return 0;
}
-
So this shows that the problem is not the call on a partly initialized struct (because that is still the case with the above), but something in SRA dealing with the copy. If one removes the call to forget above, the warning will return (note the added volatile) on the copy.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22197
[Bug middle-end/22197] New: invalid "is" used uninitialized, should be "may be"
Compile this code with -O2 -Wall on 4.0.x or mainline:
-
struct testme {
  int testval;
  int unusedval;
};
extern void forget (struct testme forgotten);
int main ()
{
  struct testme testarray[1];
  struct testme testvar;
  testvar.testval = 0;
  testarray[0] = testvar;
  forget (testarray[0]);
  return 0;
}
--
This will give this warning:
unused.c:13: warning: ‘testvar.unusedval’ is used uninitialized in this function

The problem is the copy of some uninitialized part. Yes, it does copy something uninitialized, but that's okay, as long as it is not really accessed. At the very least it should only be a "may be used uninit" warning. This is only noticed by tree-sra. With -fno-tree-sra there's no warning.

So, in effect, accesses to uninitialized parts for purpose of copying should not lead to such warning.

--
Summary: invalid "is" used uninitialized, should be "may be"
Product: gcc
Version: 4.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22197
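For user code that merely wants to avoid this warning, and not as a statement about how the compiler should be fixed, one possible workaround is to initialize every member before the struct is copied. The sketch below assumes that is acceptable for the code in question; the function names make_testme and check_make_testme are hypothetical.

```c
#include <assert.h>

struct testme {
    int testval;
    int unusedval;
};

/* Hypothetical helper: build a fully initialized struct, so that a later
   whole-struct copy never reads an uninitialized member. */
struct testme make_testme(int testval)
{
    struct testme t = { 0 };   /* zero-initializes all members */
    t.testval = testval;
    return t;
}

/* Same copy pattern as in the report, but from a fully initialized source. */
void check_make_testme(void)
{
    struct testme testarray[1];

    testarray[0] = make_testme(7);
    assert(testarray[0].testval == 7);
    assert(testarray[0].unusedval == 0);
}
```

This sidesteps the question raised in the report, namely whether copying an uninitialized part should be diagnosed at all.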
[Bug middle-end/19985] [3.4/4.0/4.1 Regression] executables created with -fprofile-arcs -ftest-coverage segfault in gcov_exit ()
--- Additional Comments From matz at suse dot de 2005-06-21 20:31 ---
This patch seems to be the reason for warnings like:

In file included from ../../gcc/gcov-io.h:239,
                 from ../../gcc/libgcov.c:51:
./auto-host.h:23:1: warning: "DEFAULT_USE_CXA_ATEXIT" redefined
In file included from ./tm.h:12,
                 from ../../gcc/libgcov.c:39:
../../gcc/defaults.h:712:1: warning: this is the location of the previous definition

There are now many warnings of this type while building gcc. This is because auto-host.h is now included, but _after_ all the other headers, which do something like

#ifndef BLA
#define BLA ...
#endif

Because auto-host.h is not yet included there, BLA is not defined, so the default will be defined, and then auto-host.h is included, leading to double definitions.

libgcov.c talks about not able to include because that's for the host, not for the target. So I don't know if auto-host.h (also for the host) should be included at all. But if it is, then it has to be included earlier. Perhaps in libgcov.c directly as the first file.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19985
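The ordering problem can be illustrated in a single file. In this sketch the two headers are simulated inline, and the macro name BLA and its values are purely illustrative: when the configured definition is seen first, the #ifndef guard in the defaults is skipped and there is exactly one definition. With the opposite order the guard would install the default first, and the later configured definition would trigger the "redefined" warning quoted above.

```c
#include <assert.h>

/* Simulates auto-host.h: the configured value, seen first. */
#define BLA 1

/* Simulates defaults.h: supplies a fallback only if nothing was
   configured before this point. */
#ifndef BLA
#define BLA 0
#endif

/* Hypothetical accessor so the chosen value can be inspected. */
int bla_value(void)
{
    return BLA;
}

/* With the configured header first, the configured value wins and no
   redefinition occurs. */
void check_bla(void)
{
    assert(bla_value() == 1);
}
```

This matches the suggestion that the configuration header, if it is included at all, has to come before the headers providing defaults.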
[Bug c++/22063] link failure involving symbol visibility
--- Additional Comments From matz at suse dot de 2005-06-14 16:13 --- No. The vtable itself (as all methods of class foo) is implemented in libfoo.so with default visibility, i.e. exported just fine: 25: 17d812 OBJECT WEAK DEFAULT 20 vtable for foo Then there is liblinkfoo, which just refers to the vtable. It is compiled with the pragma visibility in effect in the declaration of class foo (i.e. simulating a header declaring a class of a library, where the pragma was in effect). That lib is linked against the above libfoo.so. And this results in the mentioned link error. The reference to the vtable from linkfoo.o also looks just fine: 14: 0 NOTYPE GLOBAL DEFAULT UND vtable for foo i.e., UNDEF (and of course global, but that's irrelevant for a undef). This should not happen. I could theorize, that this has something to do with the two definitions of the foo::foo ctor (in linkfoo.o it's hidden of course). The "unresolvable" relocation is from that hidden implementation of foo::foo to the (global, exported in the other lib) vtable. That implementation is also placed in a linkonce section, so that might be the reason too. I changed the testcase a bit to implement the ctor out-of class, and removed the breakme method, i.e. it looks like so: - #pragma GCC visibility push( hidden ) class foo { public: foo(); virtual void bar(); }; foo::foo() {} - (this is linkfoo.cc) together with the other virtualclass.cc this still reproduces the same error. Here no linkonce sections are involved. The only thing is that the foo ctor is defined twice (but in different shared libs, so no problem), in the second lib hidden. It still has a reference to the vtable defined in the first lib, which is exported. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22063
[Bug c++/22063] link failure involving symbol visibility
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22063
[Bug target/21721] [4.0 regression] fails to assemble, Use of p0 is not valid in this context
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21721
[Bug tree-optimization/19768] [4.0 Regression] ICE: SSA_NAME_OCCURS_IN_ABNORMAL_PHI should be set
--- Additional Comments From matz at suse dot de 2005-06-10 11:50 --- Thanks. Sorry, I couldn't revert as you suggested, as I became ill the day after noticing the problem :-( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19768
[Bug middle-end/20297] #pragma GCC visibility isn't properly handled for builtin functions
--- Additional Comments From matz at suse dot de 2005-06-06 07:48 --- Created an attachment (id=9035) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9035&action=view) New patch This fixes a problem in HJs patch (which doesn't look at the fndecl of the builtin, but at the call_expr). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20297
[Bug tree-optimization/19768] [4.0 Regression] ICE: SSA_NAME_OCCURS_IN_ABNORMAL_PHI should be set
--- Additional Comments From matz at suse dot de 2005-06-06 06:35 --- I know. Andrew: when you backported this patch: PR tree-optimization/21085 * fold-const (fold): Don't change X % -C to X % C if C has overflowed. you accidentally also checked in a change to tree-ssa-dse.c (rev 2.17.4.1) which backs out your fix for this problem. This happened between -D "2005-05-11 14:00" and -D "2005-05-11 15:00" . (Of course the ChangeLog didn't tell this accident, so I was puzzled first ;) ) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19768
[Bug tree-optimization/19768] [4.0 Regression] ICE: SSA_NAME_OCCURS_IN_ABNORMAL_PHI should be set
--- Additional Comments From matz at suse dot de 2005-06-06 06:19 --- 20050512 was the working one, I meant. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19768
[Bug tree-optimization/19768] [4.0 Regression] ICE: SSA_NAME_OCCURS_IN_ABNORMAL_PHI should be set
--- Additional Comments From matz at suse dot de 2005-06-06 06:18 --- This happens again. I've seen it in a 20050603 4.0 compiler (compiled with --enable-checking). It was not happening with the 20040512 compiler from the 4.0 branch. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19768
[Bug target/21041] [4.0 Regression] ICE: output_operand: Cannot decompose address
--- Additional Comments From matz at suse dot de 2005-06-03 14:56 --- There are some maybe-uninitialized vars warnings. But if I fix them by zeroing the vars it still ICEs. I can't find other uninitialized vars. It might be, that there are uninitialized struct members, but they shouldn't affect the problem. Just for reference I've added the full unreduced testcase (with no uninited vars anymore). On s390x it ICEs with -O2 -fPIC . -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21041
[Bug java/21722] gcj miscompiles accesses to static final vars with indirect dispatch
--- Additional Comments From matz at suse dot de 2005-06-01 20:59 --- Yes. I think this is because the compiler needs to see the definition and the use site to exhibit this bug. Without the def it will correctly emit the code walking the table to get to the member. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21722
[Bug java/21722] gcj miscompiles accesses to static final vars with indirect dispatch
--- Additional Comments From matz at suse dot de 2005-05-23 16:52 --- Created an attachment (id=8953) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8953&action=view) tarball showing the problem -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21722
[Bug java/21722] New: gcj miscompiles accesses to static final vars with indirect dispatch
Below I attached a tarball which contains two packages with one class each. B.java defines a static final String initialized to "foo", and A.java tries to call the 'equals' method on that object (and another string). This actually is reduced from trang. The problem happens when this is compiled like the doit.sh script does, i.e. first creating the .class files and then compiling both .class files at once into one object file with -findirect-dispatch. The generated program will segfault.

The segfault happens because the generated code for A.main() accesses the ->vtable member of the global object '_ZN1b1B3FOOE' (== b::B::FOO) directly (if I read the .t03.generic dump correctly). But it is defined like so in the assembler:

_ZN1b1B3FOOE:
        .long _Utf1
        .section.rodata.jutf8.10

I.e. the first (and only) member of that symbol actually is the UTF-8 string itself, not a pointer to the vtable. But the code trying to resolve the address of the 'equals' method assumes that it is, and hence calls some random address.

Note that this is not the same as the usual "-findirect-dispatch only supports compiling from .class" problem. Compiling from .class is what is done here.

--
Summary: gcj miscompiles accesses to static final vars with indirect dispatch
Product: gcc
Version: 4.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: java
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: matz at suse dot de
CC: gcc-bugs at gcc dot gnu dot org,java-prs at gcc dot gnu dot org
GCC target triplet: i686-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21722
[Bug middle-end/20297] #pragma GCC visibility isn't properly handled for builtin functions
-- What|Removed |Added CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20297
[Bug rtl-optimization/21144] [4.0/4.1 regression] Apparent infinite loop in reload
--- Additional Comments From matz at suse dot de 2005-04-29 19:03 --- This is now fixed, but it seems, even though I'm logged in, I can't change the state of this report. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21144
[Bug c++/21089] [4.0/4.1 Regression] C++ front-end does not "inline" the static const double
--- Additional Comments From matz at suse dot de 2005-04-28 09:24 --- Yes, I determined that already in the initial report; to cite myself: > It's invalid for two reasons I think, first the missing definition, instead > of the declaration. [the second reason being the use of the GNU extension]. But it can be trivially made valid (just provide a definition), and I assumed this to be done for sake of this bugreport. Using the GNU extension this would then be valid, and _then_ the value is still not propagated to the method body. _That_'s what I'm complaining about, the missed optimization. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21089
[Bug c++/20912] C++ FE emitting assignments to read-only global symbols
-- Bug 20912 depends on bug 21089, which changed state. Bug 21089 Summary: [4.0/4.1 Regression] C++ front-end does not "inline" the static const double http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21089 What|Old Value |New Value Status|RESOLVED|REOPENED Resolution|DUPLICATE | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20912
[Bug c++/21089] [4.0/4.1 Regression] C++ front-end does not "inline" the static const double
--- Additional Comments From matz at suse dot de 2005-04-28 02:46 --- Uhm, wait. Perhaps the optimization would be invalid for your changed example from comment #5, but see below. But it will not be invalid for my initial testcase, where it failed to propagate 20.0 into setPosition. Why I think the transformation is valid _also_ for comment #5: See 3.6.2 #2: --- An implementation is permitted to perform the initialization of an object of namespace scope with static storage duration as a static initialization even if such initialization is not required to be done statically, provided that * the dynamic version of the initialization does not change the value of any other object of namespace scope with static storage duration prior to its initialization, and * the static version of the initialization produces the same value in the initialized object as would be produced by the dynamic initialization if all objects not required to be initialized statically were initialized dynamically. - It then goes on to provide an example which uses an inline function to dynamically initialize something, where comment #5 uses an arithmetic expression. So I think we are permitted to do this optimization in both my initial example and Andrew's example from comment #5. Reopening. -- What|Removed |Added Status|RESOLVED|REOPENED Resolution|DUPLICATE | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21089
[Bug c/21239] Illegal elimination of SSE2 load/store using xmm intrinsics
-- What|Removed |Added CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21239
[Bug rtl-optimization/21144] [4.0/4.1 regression] Apparent infinite loop in reload
--- Additional Comments From matz at suse dot de 2005-04-25 13:26 --- Created an attachment (id=8734) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8734&action=view) Patch for above problem -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21144
[Bug rtl-optimization/21144] [4.0/4.1 regression] Apparent infinite loop in reload
--- Additional Comments From matz at suse dot de 2005-04-25 13:20 --- The problem is in reload_cse_move2add. It has such a loop: for (narrow_mode = GET_CLASS_NARROWEST_MODE (MODE_INT); narrow_mode != GET_MODE (reg); narrow_mode = GET_MODE_WIDER_MODE (narrow_mode)) { where 'reg' comes from a simple SET insn. In this testcase the insn is: (set (reg:BI r15) (const_int 1)) note the mode of reg being BImode. Now, BImode is in fact a FRACTIONAL_INT_MODE, not an INT_MODE (although GET_MODE_CLASS would return MODE_INT, so using this instead of hard-coded MODE_INT wouldn't help). And GET_CLASS_NARROWEST_MODE (MODE_INT) is QImode, as it will ignore modes with precision 1 bit in genmodes.c (I think because the rest of the compiler is not prepared to really see a BImode here, but I may be wrong; there are not that many instances of GET_CLASS_NARROWEST_MODE and most look safe, but would iterate one more time uselessly if started from BImode). So, this loop starts with QImode, widens the mode each time, and waits for it to become equal to the mode of 'reg', i.e. BImode. This of course never happens, so at some point it is VOIDmode, and that's the fixed point of GET_MODE_WIDER_MODE. So we are endlessly looping. I've attached the obvious change, which I'm going to regtest now. It fixes this testcase. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21144
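The runaway walk can be reproduced outside of GCC. What follows is only a toy model under stated assumptions, not GCC code: the enum `Mode`, `wider_mode` and `steps_to_reach` are hypothetical stand-ins for the machine modes, GET_MODE_WIDER_MODE and the loop header quoted above, and the iteration cap exists only so that the non-terminating case becomes observable.

```cpp
#include <cassert>

// Toy model of the integer machine modes; the names mirror GCC's but this
// is not GCC code. BImode is not part of the chain that starts at QImode.
enum Mode { VOIDmode, BImode, QImode, HImode, SImode, DImode };

// Stand-in for GET_MODE_WIDER_MODE: the chain ends in VOIDmode, which maps
// to itself (a fixed point).
Mode wider_mode(Mode m) {
    switch (m) {
    case BImode: return QImode;
    case QImode: return HImode;
    case HImode: return SImode;
    case SImode: return DImode;
    default:     return VOIDmode;
    }
}

// Walk from QImode (the narrowest mode of the class) towards `target`,
// like the loop header in the comment does. Returns the number of widening
// steps, or -1 if `target` was not reached within `cap` steps.
int steps_to_reach(Mode target, int cap = 16) {
    int steps = 0;
    for (Mode m = QImode; m != target; m = wider_mode(m)) {
        if (++steps > cap)
            return -1;  // the unguarded loop would spin forever here
    }
    return steps;
}
```

In this model `steps_to_reach(SImode)` terminates after two steps, while `steps_to_reach(BImode)` only stops because of the cap. One conceivable guard is to also leave the loop once the mode becomes VOIDmode; whether the attached patch does exactly that is not stated in this comment.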
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-19 16:51 --- I agree with most of what Jim said. Except for the part that we maybe don't have to fix the reload issue, when we fix usage of the uninitialized register for piecewise struct initialization. The latter will fix this particular instance of the problem, but reload still would be confused by uninitialized registers. I think they can happen also for other reasons, like the user having some uninitialized variables, which perhaps never are used at runtime, but still could result in reload miscompiling the program the same way as seen here. As reload bugs are particularly hard to track down, I really think we should fix this one for good, so that it doesn't come back to bite us in the future. If agreed I will clean up the patch to comment both locations (I don't think it would deserve its own subfunction). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug c++/21089] [4.0/4.1 Regression] C++ front-end does not "inline" the static const double
--- Additional Comments From matz at suse dot de 2005-04-18 17:59 --- With -O0 we also don't inline 'a'. I thought in the past this already was done in the frontend, so the -O option didn't matter? If yes, this has changed (if not, well, I'm wrong ;-) ). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21089
[Bug c++/21089] c++ accepts invalid static const double members with initializer
--- Additional Comments From matz at suse dot de 2005-04-18 17:40 --- Indeed. Okay, but then this really is an optimization regression compared to gcc 3.3.x which compiled this just fine. As it's only rejected with -pedantic (and I think it's a sensible extension), shouldn't we make sure that we can compile this comparatively simple source, i.e. propagate the constant correctly everywhere? I'm not sure what to do, reopening with a new subject, or creating a new bug? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21089
[Bug c++/21089] New: c++ accepts invalid static const double members with initializer
See this testcase: - struct Ball { static const double diameter = 20; void setPosition(double ,double ); double vect_Pos; }; void move (double, double); void Ball::setPosition(double xval,double yval) { vect_Pos=xval; move(xval-(diameter/2),yval-(diameter/2)); } - This is from kbilliard, and I only noticed the problem because a reference to Ball::diameter is left in the generated code, which can't be resolved, as it's defined nowhere. Initially I was tricked by the Java-like syntax, but then saw PR20098, according to which this is invalid code (for two reasons). So it seems there are two problems: 1) missed optimization, as for instance if I remove the 'vect_Pos=xval' line I get no linker error, and in fact in the dumps the references to diameter are substituted by the defined value (double)20 2) the invalidity of the code is not diagnosed. It's invalid for two reasons I think, first the missing definition (there is only the declaration). That can't be diagnosed except by the linker, which indeed does so. But when reading PR20098 I learned that static const members are only allowed to have an initializer when they are of integral type. This is not the case here. Had the compiler diagnosed it, the root cause of this problem would have been more visible. -- Summary: c++ accepts invalid static const double members with initializer Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: matz at suse dot de CC: gcc-bugs at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21089
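For comparison, here is a minimal sketch of a conforming variant of the testcase: the member is only declared inside the class and gets its single definition, with the initializer, at namespace scope. The body of `move` and the globals `last_x` and `last_y` are assumptions added purely so the effect can be observed; they are not part of the report.

```cpp
#include <cassert>

// Illustration-only sink so that the arguments passed to move() are visible.
double last_x = 0, last_y = 0;
void move(double x, double y) { last_x = x; last_y = y; }

struct Ball {
    static const double diameter;      // declaration only, no in-class initializer
    void setPosition(double, double);
    double vect_Pos;
};

// The one definition. With it present, a reference to Ball::diameter that
// the compiler does not fold away can still be resolved by the linker.
const double Ball::diameter = 20;

void Ball::setPosition(double xval, double yval) {
    vect_Pos = xval;
    move(xval - (diameter / 2), yval - (diameter / 2));
}
```

With this form the missed propagation of the constant would only cost a load at runtime instead of producing an unresolved symbol.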
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-18 14:51 --- From http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01508.html where I submitted the patch: the problem in khtml. I've bootstrapped it with gcc 4.0 on i686,x86_64,ppc,ppc64,ia64,s390 (s390x was breaking for different reasons), all languages (with Ada ;) ). There were no regressions (in fact some fixed Ada testcases, but I'm not sure if they were real). Note the last sentence. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-18 14:22 --- This patch fixes the regressions in khtml for us. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug libstdc++/19664] libstdc++ headers should have pop/push of the visibility around the declarations
--- Additional Comments From matz at suse dot de 2005-04-18 12:50 --- Oh, and just annotating the testcase with the visibility push/pop #pragma is not enough, probably because of the problem in the c++ frontend, HJ noted. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19664
[Bug libstdc++/19664] libstdc++ headers should have pop/push of the visibility around the declarations
--- Additional Comments From matz at suse dot de 2005-04-18 12:49 --- So, any progress on this whole issue? I see neither the pragmas in the C++ headers nor HJ's changes to the c++ frontend (despite testcase) in CVS. Just for the record, I see these problems (link problem with the test from comment #11) also on ppc and ppc64, so it's not just a target-dependent problem. On ppc64 it's an unresolvable R_PPC64_REL24 relocation, and on ppc it's simply an undefined reference to the std::string ctor. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19664
[Bug middle-end/20218] Can't use __attribute__ ((visibility ("hidden"))) to hide a symbol
-- What|Removed |Added CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20218
[Bug target/21041] ICE: output_operand: Cannot decompose address
--- Additional Comments From matz at suse dot de 2005-04-15 08:20 --- Forgot to say, the preprocessed file is for s390x. On s390 the same happens, though. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21041
[Bug target/21041] ICE: output_operand: Cannot decompose address
--- Additional Comments From matz at suse dot de 2005-04-15 08:19 --- Created an attachment (id=8641) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8641&action=view) Preprocessed source for the ICE -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21041
[Bug target/21041] New: ICE: output_operand: Cannot decompose address
The below testcase is extracted from smpeg. It's C++. Compile it like so: % ./gcc/cc1plus -O2 -fPIC video.ii video.cpp: In function 'int ParseMacroBlock(VidStream*)': video.cpp:2205: internal compiler error: output_operand: Cannot decompose address. I wasn't able to make the functions much smaller than this (didn't try reducing the headers, though). -- Summary: ICE: output_operand: Cannot decompose address Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: target AssignedTo: uweigand at de dot ibm dot com ReportedBy: matz at suse dot de CC: gcc-bugs at gcc dot gnu dot org GCC host triplet: {s390,s390x}-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21041
[Bug middle-end/20991] [4.0 Regression] ICE in cgraph_mark_reachable_node
--- Additional Comments From matz at suse dot de 2005-04-15 06:51 --- Perhaps due to the IPA infrastructure? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20991
[Bug preprocessor/21039] libcpp incorrectly handles different headers with same name
--- Additional Comments From matz at suse dot de 2005-04-15 06:21 --- Created an attachment (id=8640) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8640&action=view) Tarball with the testcase -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21039
[Bug preprocessor/21039] New: libcpp incorrectly handles different headers with same name
This hits when compiling rrdtools, which creates a perl .xs module. It's an autoconf package, and hence has a config.h. perl also has a config.h used from its headers (by doing a '#include "config.h"', so the one from the perl include dir is used correctly), which has a multiple include guard. Once the perl config.h is included the source goes on, and tries to include its own config.h. Due to the bug there is no longer any way to do that. It tries to do this with '#include ', i.e. just searching the -I paths (which are set up correctly, so it could find it in ../config.h). But this doesn't work anymore (it did with 3.3.x), as it seems that the location where config.h was first found (the perl one) is cached, and then the ../config.h one isn't even tried. To demonstrate this I've put a tarball below. After unpacking, please do: % cd a % find -type f ./b/header.h ./b/inc-b.h ./c/inc.c ./header.h % cd c % gcc-4 -E -I .. -I ../b inc.c # 1 "inc.c" # 1 "" # 1 "" # 1 "inc.c" # 1 "../b/inc-b.h" 1 # 1 "../b/header.h" 1 B # 1 "../b/inc-b.h" 2 # 2 "inc.c" 2 Note how the two include directives in inc.c have no effect, _although_ -I .. is before -I ../b in the cmdline. gcc 3 does it correctly: % gcc-3 -E -I .. -I ../b inc.c | grep header.h # 1 "inc.c" # 1 "" # 1 "" # 1 "inc.c" # 1 "../b/inc-b.h" 1 # 1 "../b/header.h" 1 B # 2 "../b/inc-b.h" 2 # 2 "inc.c" 2 # 1 "../header.h" 1 A # 3 "inc.c" 2 # 1 "../header.h" 1 A # 4 "inc.c" 2 -- Summary: libcpp incorrectly handles different headers with same name Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: preprocessor AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: matz at suse dot de CC: gcc-bugs at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21039
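The lookup order the report expects can be written down as a small model. This is a sketch under assumptions and not libcpp code: `files` and `resolve` are hypothetical names, the file list is the one printed by `find` above (rewritten relative to directory c/), and the second kind of include is assumed to be the form that searches only the -I paths, as the report describes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy file system: the files from the tarball, relative to directory c/.
const std::set<std::string> files = {
    "../b/header.h", "../b/inc-b.h", "inc.c", "../header.h"
};

// Resolve `name`. For a quoted include the directory of the including file
// (`from_dir`, empty for the current directory) is tried first. After that,
// and immediately for the other form, the -I directories are tried in
// command-line order. Returns an empty string if nothing is found.
std::string resolve(const std::string& name, bool quoted,
                    const std::string& from_dir,
                    const std::vector<std::string>& include_dirs) {
    if (quoted) {
        std::string candidate = from_dir.empty() ? name : from_dir + "/" + name;
        if (files.count(candidate))
            return candidate;
    }
    for (const std::string& dir : include_dirs) {
        std::string candidate = dir + "/" + name;
        if (files.count(candidate))
            return candidate;
    }
    return "";
}
```

With `-I .. -I ../b`, the quoted include inside ../b/inc-b.h resolves to ../b/header.h, while the -I-only lookup from inc.c resolves to ../header.h, which is what the gcc 3 output shows. A lookup that remembers only the file name from the first hit would wrongly return ../b/header.h for both.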
[Bug middle-end/20991] [4.0/4.1 Regression] ICE in cgraph_mark_reachable_node
--- Additional Comments From matz at suse dot de 2005-04-15 03:16 --- One strange thing is, that the call to getWidth() in B is already in .generic: if (retval.1) { getWidth (&i_bnds); } while the call to getWidth() in isEmpty() (which is inlined later into B()) is D.1595 = this->_vptr.IMG_Rect; D.1596 = *D.1595; D.1597 = OBJ_TYPE_REF(D.1596;this->0) (this); D.1594 = D.1597 == 0; Both are actually virtual calls of course, so I wonder why they are represented differently. Note that this doesn't change if I make B() a method of IMG_Rect. I thought initially that might be a difference, as isEmpty is one. Another fact is, that if I comment out the totally unrelated useless decl for getVisibility(), it works. The .i01.cgraph dump in that case shows these Initial entry points: void B() void A() Unit entry points: void B() void A() virtual bool IMG_Rect::isEmpty() const virtual int IMG_Rect::getWidth() const while when it breaks (i.e. when getVisibility is there) it looks like: Initial entry points: void B() void A() Unit entry points: void B() void A() See how the two methods in question are now missing. As they are declared inline this actually should be okay, and that getVisibility makes a difference might be another bug, which hides this one sometimes. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20991
[Bug middle-end/20991] [4.0/4.1 Regression] ICE in cgraph_mark_reachable_node
--- Additional Comments From matz at suse dot de 2005-04-15 02:40 --- We see this error in blender. I was able to reduce it quite a bit to this: struct IMG_Rect { virtual inline int getWidth() const; virtual inline bool isEmpty() const; virtual int getVisibility(int) const; }; inline int IMG_Rect::getWidth() const { return 1; } inline bool IMG_Rect::isEmpty() const { return (getWidth() == 0); } void A() { IMG_Rect i_bnds; if (i_bnds.isEmpty()) i_bnds.getWidth(); } void B() { IMG_Rect i_bnds; if (i_bnds.isEmpty()) i_bnds.getWidth(); } -- What|Removed |Added CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20991
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-12 19:47 --- I have a patch for reload, which fixes the bug, when looking at the dumps. At least now find_reg is used for the insn in question, which also evicts pseudos using the same reg as the chosen final reg_rtx. I have only tested this by copying around the generated .s file, not by building a new compiler. (Hence it's also obviously not regression or bootstrap tested, but I'll do this now.) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-12 19:05 --- The problem is in reload.c:find_dummy_reload. It tries to use the input reg as reload register for an in-out reload and has certain conditions when it can't do so: /* Consider using IN if OUT was not acceptable or if OUT dies in this insn (like the quotient in a divmod insn). We can't use IN unless it is dies in this insn, which means we must know accurately which hard regs are live. Also, the result can't go in IN if IN is used within OUT, or if OUT is an earlyclobber and IN appears elsewhere in the insn. */ if (hard_regs_live_known && REG_P (in) && REGNO (in) < FIRST_PSEUDO_REGISTER && (value == 0 || find_reg_note (this_insn, REG_UNUSED, real_out)) && find_reg_note (this_insn, REG_DEAD, real_in) && !fixed_regs[REGNO (in)] && HARD_REGNO_MODE_OK (REGNO (in), But this doesn't check if IN is used uninitialized. In that case it also can't be used. It's not immediately clear to me how to check for this, as nowhere is it noted that this or that pseudo is actually uninitialized and only therefore got a register by global.c. One could look at the global_live_at_start of the first block, if it mentions the original pseudo number of the IN operand. The problem of course being that at this point the hardregs already are substituted into the REG expressions. So one would have to trust the original regnos noted there. And it's not clear that this is the only place in reload which is confused by hardregs corresponding to uninitialized pseudos. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-12 17:37 --- Another mail: I think the dust settles a bit. We have this situation in .t69.final_cleanup (excerpt from applyRule): struct Length D.83927; :; D.83927.D.21947.l.value = (int) primitiveValue->m_value.num; D.83927.D.21947.l.type = 1; D.83927.D.21947.l.quirk = 0; l = D.83927; apply = 1; :; printf (&"apply %d id %d\n"[0], (int) apply, id); if (apply == 0) goto ; else goto ; The above is the only use of D.83927, so it first is initialized memberwise, and then copied as a whole. This corresponds to these insns from .22.lreg (not the full above sequence, but only the interesting part). Reg 4101 is 'l' and is constructed piecewise. Reg 4197 is 'id'. ;; Start of basic block 1449, registers live: 3 [bx] 6 [bp] 7 [sp] 16 [argp] 18 [fpsr] 20 [frame] 3669 3843 4196 4197 (insn:HI 12371 12370 12372 1449 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel [ (set (reg/v:SI 4101 [ l ]) (and:SI (reg:SI 3843 [ D.83927 ]) (const_int -268435456 [0xf000]))) (clobber (reg:CC 17 flags)) ]) 200 {*andsi_1} (nil) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_DEAD (reg:SI 3843 [ D.83927 ]) (expr_list:REG_UNUSED (reg:CC 17 flags) (nil) Note how reg 3843 dies here. Additionally it should be mentioned that it's never initialized (it's live up through the first insn), as it probably represents the yet-uninitialized part of a bitfield in construction. After the above follows some more bitmanipulation on reg 4101, setting finally all fields, then the block ends: ;; End of basic block 1449, registers live: 3 [bx] 6 [bp] 7 [sp] 16 [argp] 20 [frame] 3668 4101 4196 4197 ;; Start of basic block 1450, registers live: 3 [bx] 6 [bp] 7 [sp] 16 [argp] 20 [frame] 3668 4101 4196 4197 ... 
(insn:HI 12382 12381 12383 1450 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/css/cssstyleselector.cpp:2834 (set (mem:SI (plus:SI (reg/f:SI 7 sp) (const_int 8 [0x8])) [0 S4 A32]) (reg/v:SI 4197 [ id ])) 35 {*movsi_1} (nil) (nil)) So this is the setup of the arguments to the printf. So, we know that 4197 is live throughout block 1449, and correctly conflicts with reg 4101 (see below), which is the only register set here, by the bitmasking insn 12371. Now we look at the .23.greg dump, first the interesting conflicts: ;; 3843 conflicts: ;; 4101 conflicts: ... 4197 ... ;; 4197 conflicts: ... 4101 ... So, all is well. But then reload goes and breaks it: ... Spilling for insn 12371. Spilling for insn 12371. Reloads for insn # 12371 Reload 0: reload_in (SI) = (reg:SI 4 si [orig:3843 D.83927 ] [3843]) reload_out (SI) = (reg/v:SI 4101 [ l ]) GENERAL_REGS, RELOAD_OTHER (opnum = 0) reload_in_reg: (reg:SI 4 si [orig:3843 D.83927 ] [3843]) reload_out_reg: (reg/v:SI 4101 [ l ]) reload_reg_rtx: (reg:SI 4 si [orig:3843 D.83927 ] [3843]) ... Yes, right, reload needs to fix up the insn to make both registers the same and it wants to use $esi as reg for reg3843. Problem is, it also wants to allocate reg4197 to $esi: ... 3826 in 0 3843 in 4 3844 in 4 3846 in 4 3847 in 1 3912 in 0 4173 in 5 4180 in 4 4197 in 4 4198 in 5 4199 in 0 4202 in 0 ... (4101 is not allocated by the way because reload is able to optimize it out, and instead uses 3843 in its place). So this then results in this insn: (insn:HI 12371 12370 43006 1448 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/mis (set (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (const_int -268435456 [0xf000]))) (clobber (reg:CC 17 flags)) ]) 200 {*andsi_1} (nil) at which point %esi is lost. I think the problem is, that global-alloc uses %esi for 4197 and 3843, which is okay, as 3843 is uninitialized and hence doesn't conflict with 4197. 
But reload then goes on and actually uses this uninitialized register for _setting_. Basically it transforms this insn: rA <- op (rB) where rA and rB need to match into rB <- op (rB) [rA <- rB] where the latter move is not emitted because rA could be optimized away (rA is our 4101). This introduces a new setting of rB, which is a problem if rB was uninitialized. Normally reload should have generated rA <- rB rA <- op (rA) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
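The difference between the two instruction sequences can be played through on a toy register file. This is a sketch under assumptions, not reload code: `Machine`, `op`, the choice of EAX as a separate home for rA and the two helper functions are all hypothetical. The only values taken from the dumps are the hard register number 4 for %esi and the AND mask -268435456.

```cpp
#include <cassert>
#include <cstdint>

// Toy machine with a handful of hard registers; purely illustrative.
struct Machine { std::uint32_t hard[8] = {}; };

const int ESI = 4;  // shared by 'id' (live) and rB (uninitialized pseudo 3843)
const int EAX = 0;  // hypothetical separate register for rA

// The masking operation from insn 12371.
std::uint32_t op(std::uint32_t v) {
    return v & static_cast<std::uint32_t>(-268435456);
}

// What reload produced: rB <- op (rB), with rB living in ESI.
// Returns the value of 'id' as seen afterwards.
std::uint32_t id_after_in_place(std::uint32_t id) {
    Machine m;
    m.hard[ESI] = id;               // 'id' is live in ESI
    m.hard[ESI] = op(m.hard[ESI]);  // the new write to rB's register clobbers it
    return m.hard[ESI];
}

// The expected sequence: rA <- rB; rA <- op (rA), with rA in its own register.
std::uint32_t id_after_copy_first(std::uint32_t id) {
    Machine m;
    m.hard[ESI] = id;
    m.hard[EAX] = m.hard[ESI];      // rA <- rB (whatever happens to be there)
    m.hard[EAX] = op(m.hard[EAX]);  // rA <- op (rA)
    return m.hard[ESI];             // ESI is never written
}
```

For a small `id` the in-place variant yields 0, which matches the printf output described in the earlier mail, while the copy-first variant leaves `id` intact.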
[Bug regression/20973] [4.0/4.1 Regression] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-12 17:34 --- Here are some mails we exchanged: Adding something complex seems to change how things are allocated or REG_DEAD notes distributed or something like this. If we add 'kdDebug( 6080 ) << "applying property " << id << endl;' at the very start, it seems to work. If we comment it out, and add a + printf ("apply %d id %d\n", apply, id); if(!apply) return; switch(id) { case CSS_PROP_MAX_WIDTH: at line 2834, one can see where it goes wrong. The printf will show id being 0 every time. In the debugger printing 'id' (which refers to the stack position) shows the correct value. At that point the compiler has placed id into %esi, which indeed contains 0 here. This $esi is loaded at the beginning of the function with the argument, and the first time it changes (when setting a watchpoint on $esi in gdb) is at line 2823: l = Length(primitiveValue->computeLength(style, paintDeviceMetrics), Fixed, primitiveValue->isQuirkValue()); This corresponds to this insn: 0x4028a449 : and $0xf000,%esi After that, until the printf, $esi is not correctly loaded back from the parameter stack slot, and remains 0. The above insn corresponds to this RTL: (insn 12371 55063 12368 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel [ (set (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (const_int -268435456 [0xf000]))) (clobber (reg:CC 17 flags)) ]) 200 {*andsi_1} (nil) (I think, there are multiple insns masking the high 4 bits out of %esi, and I've not yet traced through all paths). So somehow I think GCC put something into %esi while it still was live. Wasn't there some error in placing REG_DEAD notes just a few days ago? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug regression/20973] kdelibs (khtml) miscompiled by reload
--- Additional Comments From matz at suse dot de 2005-04-12 17:30 --- Created an attachment (id=8610) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8610&action=view) the preprocessed source showing the problem -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug regression/20973] New: kdelibs (khtml) miscompiled by reload
gcc 4 (about RC1) miscompiles khtml, in fact something in CSS, which basically leads to all websites being misrendered. I can't easily reduce the testcase, but have attached the whole preprocessed source of css/cssstyleselector.ii. It is to be compiled with g++ -O2 -fPIC -march=i586 -mtune=i686 -fno-exceptions. A more detailed analysis will follow, as we've found out some things already. -- Summary: kdelibs (khtml) miscompiled by reload Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: regression AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: matz at suse dot de CC: gcc-bugs at gcc dot gnu dot org GCC host triplet: i386-linux http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20973
[Bug c++/20949] [4.0/4.1 Regression] misscompilation of konqueror, artsd, STLport, libstdc++, ...
-- What|Removed |Added CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20949
[Bug target/20020] x86_64 - 128 bit structs not targeted to TImode
--- Additional Comments From matz at suse dot de 2005-02-17 22:06 --- I think that #19566 is a real bug. The ABI specifies that 16-byte structs are passed in registers. Anyway, MAX_FIXED_MODE_SIZE doesn't influence the calling convention, only how such a struct is handled by transforming code. I.e. changing MAX_FIXED_MODE_SIZE shouldn't fix any ABI bug (in fact it shouldn't change how parameters are passed at all). At least from my understanding and if there aren't other bugs making this false ;) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020
[Bug target/20020] x86_64 - 128 bit structs not targeted to TImode
-- What|Removed |Added CC||matz at suse dot de http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020
[Bug java/19823] New: java fails with non-executable memory
Newer linux kernels (2.6.11 in this case) check the executable for a PT_GNU_STACK program header, and if it exists default to providing non-executable memory (for stack _and_ malloced memory) on CPUs which support this (all x86-64 CPUs and newer x86 ones). It seems that gij is not prepared to handle this. This can be seen in some testresults, e.g. here: http://gcc.gnu.org/ml/gcc-testresults/2005-02/msg00223.html This is the autovect branch, but it's the same on all GCC versions (3.3 through 4.0). This is with such a new kernel; with an older kernel which didn't do this yet, all those testcases work. Currently Andi Kleen proposed again to switch this off on the kernel side as a hot fix, because too much software currently breaks. But at some point it will be activated for sure, and then GCC should be able to cope with this. My guess is that only some mprotect calls are missing at the right places. -- Summary: java fails with non-executable memory Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: java AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: matz at suse dot de CC: gcc-bugs at gcc dot gnu dot org,java-prs at gcc dot gnu dot org GCC host triplet: i686-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19823
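To illustrate what "some mprotect calls" could look like, here is a minimal sketch of making a buffer of generated code executable on Linux. It is not taken from gij or libgcj; `make_executable_copy` is a hypothetical helper, and where the runtime would actually need such a call is not known here. It uses only the POSIX/Linux calls `sysconf`, `mmap`, `mprotect` and `munmap`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Copy `len` bytes of generated code into a fresh page-aligned mapping and
// then switch that mapping from writable to executable.
// Returns nullptr on failure. On success the caller owns the mapping of
// `*mapped_len` bytes and releases it with munmap.
void* make_executable_copy(const void* code, std::size_t len,
                           std::size_t* mapped_len) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t size = ((len + page - 1) / page) * page;  // round up
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return nullptr;
    std::memcpy(mem, code, len);
    // Without this step the page stays non-executable on kernels that
    // honour PT_GNU_STACK, and jumping into it would fault.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return nullptr;
    }
    *mapped_len = size;
    return mem;
}
```

The sketch never keeps a mapping writable and executable at the same time; that is a design choice of this example, not something the report asks for.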