Re: [ARM] Cirrus EP93xx Maverick Crunch Support - bge pattern
On Wed, 27 Jun 2007 08:17:47 +0200, Paolo Bonzini [EMAIL PROTECTED] said:

>>     if (get_attr_cirrus (prev_active_insn(insn)) == CIRRUS_COMPARE)
>>       return \"beq\\t%l0\;bvs\\t%l0\";
>>     else
>>       return \"bge\\t%l0\;nop\";
>>     [(set_attr "conds" "jump_clob")
>>      (set_attr "length" "8")]
>>   )
>>
>> As you can see, I need to replace all bge with a maverick crunch
>> equivalent. However, bge is still also used with integer comparisons,
>> e.g:
>
> I think you should generate the compare using a different mode for the
> CC register (like cc:CCMAV) and then use two patterns:
>
> ; Special pattern to match GE for MAVERICK. Most restrictive
> ; pattern goes first.
> (define_insn "*arm_cirrus_bge"
>   [(set (pc)
>         (if_then_else (ge (match_operand:CCMAV 1 "cc_register" "")
>                           (const_int 0))
>                       (label_ref (match_operand 0 "" ""))
>                       (pc)))]
>   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
>   "beq\\t%l0\;bvs\\t%l0"
>   [(set_attr "conds" "jump_clob")
>    (set_attr "length" "8")]
> )
>
> ; Special pattern to match GE for ARM.
> (define_insn "*arm_bge"
>   [(set (pc)
>         (if_then_else (ge (match_operand 1 "cc_register" "")
>                           (const_int 0))
>                       (label_ref (match_operand 0 "" ""))
>                       (pc)))]
>   "TARGET_ARM && TARGET_HARD_FLOAT"
>   "bge\\t%l0"
>   [(set_attr "conds" "jump_clob")
>    (set_attr "length" "4")]
> )

Yep, this will work. Floating point comparisons are already done in CCFP mode, so I have used that.

NB, I already tried this earlier, but I think most of my problem comes from conditional execution. I tried changing:

(define_cond_exec
  [(match_operator 0 "arm_comparison_operator"
    [(match_operand 1 "cc_register" "")
     (const_int 0)])]
  "TARGET_ARM"
  ""
)

to:

(define_cond_exec
  [(match_operator 0 "maverick_comparison_operator"
    [(match_operand:CCFP 1 "cc_register" "")
     (const_int 0)])]
  "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
  ""
)

(define_cond_exec
  [(match_operator 0 "arm_comparison_operator"
    [(match_operand 1 "cc_register" "")
     (const_int 0)])]
  "TARGET_ARM"
  ""
)

But I think I also need to modify or add to all the other scc and / ior etc lines, since I think combining scc's / condexecs doesn't work correctly.

I think that if the above define_cond_exec is still there, gcc thinks it can optimize all ge execution, and so it optimises the above output from arm_bge and deletes the label.

I rebuilt gcc with all conditional execution disabled to see if it would work. I did this by commenting out any line referencing arm_comparison_operator or define_cond_exec. However, when I compile a C++ program, the compiler still can't generate the label, and it fails with:

  internal compiler error: output_operand: '%l' operand isn't a label

then of course the assembler fails with:

  undefined local label

NB, I shouldn't need the second arm_bge, as it should be handled by the code in arm_condition_code for non-Maverick targets and for Maverick non-floating-point comparisons.

I've also disabled DImode on Maverick, since it is only signed or unsigned, and not both at the same time. I think it will cause similar comparison-based problems too.

Incidentally, is it possible to do something like:

  (if_then_else (ge (match_operand:CCFP,CCDI 1 "cc_register" "") (const_int 0))

And can someone explain the difference between these two lines:

  if_then_else (ge (match_operand:CCFP 1 "cc_register" "") (const_int 0))
  if_then_else (ge:CCFP (match_operand 1 "cc_register" "") (const_int 0))

Is the second line still valid syntax?
Re: Type-punning
Herman Geza [EMAIL PROTECTED] writes:

> Yes, it's clear now, thanks. However, it brings a new question: the standard defines layout-compatible types. For example, if I'm correct, my Vector and Point are layout compatible. What can I do with layout compatible objects?

You can put them in a union and be able to use the common initial sequence of either alternative, see 6.5.2.3[#5]. Once you have defined such a union, objects of either type can alias each other.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
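
The union approach described above could look roughly like the following C++ sketch. The field names of `Vector` and `Point` and the helper `x_seen_as_point` are assumptions made for illustration, since the thread does not show the actual definitions. Both structs are standard-layout with identical leading members, which is the situation the common-initial-sequence rule covers.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the Vector and Point types mentioned in the thread.
struct Vector { float x; float y; };
struct Point  { float x; float y; };

static_assert(std::is_standard_layout<Vector>::value, "Vector must be standard-layout");
static_assert(std::is_standard_layout<Point>::value, "Point must be standard-layout");

// Both alternatives share the common initial sequence { float, float }.
union VectorOrPoint {
    Vector v;
    Point  p;
};

// Store through one alternative, then inspect the shared leading
// member through the other alternative.
float x_seen_as_point(float x, float y) {
    VectorOrPoint u;
    u.v = Vector{x, y};   // v becomes the active member
    return u.p.x;         // read within the common initial sequence
}
```

Calling `x_seen_as_point(1.5f, 2.5f)` would return `1.5f`. The rule only covers the shared leading members; reading past the common initial sequence through the inactive member is not covered by it.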
Re: [ARM] Cirrus EP93xx Maverick Crunch Support - bge pattern
On Wed, 27 Jun 2007 18:15:12 +1000, Hasjim Williams [EMAIL PROTECTED] said:

>   if_then_else (ge (match_operand:CCFP 1 "cc_register" "") (const_int 0))
>   if_then_else (ge:CCFP (match_operand 1 "cc_register" "") (const_int 0))
>
> Is the second line still valid syntax?

The second line doesn't work. The first one does. It also fixes the "internal compiler error: output_operand: '%l' operand isn't a label" error.

Incidentally, does anyone know if you can do something like:

  if_then_else (ge (match_operand:!CCFP 1 "cc_register" "") (const_int 0))
Re: m68k compound instructions
Maxim Kuvyrkov [EMAIL PROTECTED] writes:

> Option '2' looks more favorable to me, though it implies a greater initial effort of simplifying a complex machine description without losing precious bits of fine tuning. Following this path, define_split will be my best friend and helper.

I agree with that. The m68k machine description really would need a major overhaul to bring it up to modern standards.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: How to use GCC testsuite..?
The testsuite can be run with a simulator too (in the absence of real hardware). You can refer to the *-sim.exp files in the dejagnu baseboards directory for sample specifications.

HTH

regards
saurabh

On Wed, 2007-06-27 at 00:18 +0300, Tehila Meyzels wrote:
> AFAIK, if you don't have such a machine, you won't be able to run all the tests that need to be executed. Only the tests that are not supposed to run will be tested (like compilation-only tests).
Re: [ARM] Cirrus EP93xx Maverick Crunch Support - bge pattern
On Wed, Jun 27, 2007 at 06:45:26PM +1000, Hasjim Williams wrote:
> On Wed, 27 Jun 2007 18:15:12 +1000, Hasjim Williams [EMAIL PROTECTED] said:
>
>>   if_then_else (ge (match_operand:CCFP 1 "cc_register" "") (const_int 0))
>>   if_then_else (ge:CCFP (match_operand 1 "cc_register" "") (const_int 0))
>>
>> Is the second line still valid syntax?
>
> The second line doesn't work. The first one does.

The second line is valid syntax, but e.g. combine expects that the operator doesn't have a mode, so the insn won't match. I think that's just an undocumented feature of the first argument to if_then_else. Also, the mode would be the mode of the result, not the mode of the comparison result in the cc register. Example:

  (set (reg:SI 0) (ge:SI (match_operand:CCFP 1 "cc_register" "") (const_int 0)))

which means "set register 0, which has SImode, to STORE_FLAG_VALUE if the last comparison had op0 >= op1, otherwise set register 0 to zero".

> It also fixes up the internal compiler error: output_operand: '%l' operand isn't a label error...
>
> Incidentally, does anyone know if can you do something like:
>
>   if_then_else (ge (match_operand:!CCFP 1 "cc_register" "") (const_int 0))

You can't (but mode macros help). As Paolo says, you will have to define one or more new comparison modes, and you will have to define branch insns which use the new mode(s), comparison insns which set the cc register in the new mode(s), new sCC style insns, and so on. Additionally, look at SELECT_CC_MODE and TARGET_CC_MODE_COMPATIBLE. If you have some sort of arm_output_compare_insn() function, modify that as well.

The significance of defining a CCmode is that it says that comparisons done in that mode set the flags in a specific way.

-- 
Rask Ingemann Lambertsen
Re: How to use GCC testsuite..?
[EMAIL PROTECTED] wrote on 27/06/2007 13:27:39:

> The testsuite can be run with a simulator too (in absence of real hardware). You can refer to the *-sim.exp files in the dejagnu baseboards directory for sample specifications.

That's correct, I'd forgotten that option. (We used systesim in the past, in cases where we didn't have the target (arch) machine.)

Tehila.

> HTH
>
> regards
> saurabh
>
> On Wed, 2007-06-27 at 00:18 +0300, Tehila Meyzels wrote:
>> AFAIK, if you don't have such machine, you won't be able to run all the need-to-be-executed tests. Only the tests that not suppose to run will be tested (like compilation-only tests).
Re: [tuples/LTO] RFC: houghts on auto-generating GS_* data structures
Ian Lance Taylor wrote:
> Kenneth Zadeck [EMAIL PROTECTED] writes:
>
>> In the lto world we will be reading in a function and then hacking on it. Many (most) of those hacks are not in-place changes, but adding, deleting and rearranging instructions in the stream. Doing in-place mapping puts severe restrictions on the kinds of storage managers that are going to be available to the rest of the compiler. They are going to have to be aware of the instructions (and other structures) that have been mapped in vs the instructions that are newly created and thus can be recovered.
>
> I'm not sure I completely agree, or perhaps I don't completely understand. You can rearrange something that you mmap in, assuming you copy it in. Or just think of the read system call, it doesn't really matter. I just think that if we have to take some action on every instruction we read in (i.e., parse it from bytecode into the internal representation) the I/O itself will be significant for LTO on a large program.
>
> Yes, memory management is more complex, but not that much more complex. Note that our GC system already understands what type of page an object is allocated in.

The issue is not the io. The current organization, with each function arranged in its own section, is designed so that the section can be memory mapped in. The question is how much work it is going to be to transform what is mapped in into the working representation. The unswizzling of the pointers will be a certain amount of work and will most likely touch a fairly dense amount of the program representation, vs parsing it, which will read everything and create a new representation of everything.

All that I was saying was that parsing is not obviously the wrong way to go if it means that the subsequent data structures are better suited for the kinds of manipulations that we are going to make.

>> There is a strong argument for developing the tools to generate the middle end data structures, even if we do not use them for lto:
>>
>> 1) It will force us into a discipline where we cannot do the braindead overloading that makes the trees so difficult to manipulate. This is only doable if we start now before the rot sets in.
>> 2) It will allow us to do lto serialization if we decide to.
>> 3) If we decide we want tools to be able to write out, edit, modify and re-inject intermediate code into the compiler, the in and out part are easily derived from such a high level description.
>
> It's fine with me if you or somebody wants to tackle it. I agree that it brings benefits. I'm just not sure it's the most productive thing to work on.
>
> Ian
Re: [tuples/LTO] RFC: houghts on auto-generating GS_* data structures
On 6/26/07 4:08 PM, Diego Novillo wrote:

> But, first, I'd like to know what folks think about this. Would it be generally useful for us to have the IL data structures auto-generated this way? I can see the benefits in the reader/writer. But also, we are going to have to re-implement the reader/writer when we move GIMPLE out of the tree data structures. OTOH, we will probably change them, add new codes and having them autogenerated may have other advantages.

One thing I forgot to add is that auto-generating .h files is not really necessary. We can always parse the .h files directly. As long as we have either easy to spot markers (a-la GTY) or just write the structs in a clear manner, it should be straightforward.
Re: [tuples/LTO] RFC: houghts on auto-generating GS_* data structures
On Wed, Jun 27, 2007 at 11:09:26AM -0400, Diego Novillo wrote:
> On 6/26/07 4:08 PM, Diego Novillo wrote:
>
>> But, first, I'd like to know what folks think about this. Would it be generally useful for us to have the IL data structures auto-generated this way? I can see the benefits in the reader/writer. But also, we are going to have to re-implement the reader/writer when we move GIMPLE out of the tree data structures. OTOH, we will probably change them, add new codes and having them autogenerated may have other advantages.
>
> One thing I forgot to add is that auto-generating .h files is not really necessary. We can always parse the .h files directly. As long as we have either easy to spot markers (a-la GTY) or just write the structs in a clear manner, it should be straightforward.

For the record, I'm not a big fan of auto-generating the .h files. Too much work, for too little gain.
Re: I'm sorry, but this is unacceptable (union members and ctors)
> For instance, say you need to implement a GUI, so you have yourself a rectangle struct which consists of four floating point values (the origin and difference between the opposite corner) ...Now you want those four values, but you also have a 2D vector struct.

Here is a portable alternative to achieve this:

struct Rectangle
{
private:
    Vector2D m_Position;
    Vector2D m_Size;

public:
    // Non-const accessors return references so callers can assign through them.
    Vector2D& position() { return m_Position; }
    const Vector2D& position() const { return m_Position; }

    Vector2D& size() { return m_Size; }
    const Vector2D& size() const { return m_Size; }

    float& left() { return m_Position.x; }
    float left() const { return m_Position.x; }

    float& top() { return m_Position.y; }
    float top() const { return m_Position.y; }

    float& width() { return m_Size.x; }
    float width() const { return m_Size.x; }

    float& height() { return m_Size.y; }
    float height() const { return m_Size.y; }
};

Then you can access the members like this:

Rectangle somerectangle;
somerectangle.position().x = 45;
somerectangle.left() = 45;
Re: [tuples/LTO] RFC: houghts on auto-generating GS_* data structures
Kenneth Zadeck [EMAIL PROTECTED] writes:

> The issue is not the io. The current organization, with each function arranged in its own section is designed to so that that section can be memory mapped in. The question how much work is it going to be to transform what is mapped in into the working representation. The unswizzling of the pointers will be a certain amount of work and will most likely touch a fairly dense amount of the program representation vs parsing it which will read everything and create a new representation of everything.

I believe we should head toward an IR which does not require unswizzling of pointers. That is, the IR should itself be the working representation. It's fine to do some work when writing out the IR to make this happen, but I think we should strive to avoid doing work when reading it in.

I know we aren't close to this at present.

> All that I was saying was that that parsing is not obviously the wrong way to go if it means that the subsequent data structures are better suited for the kinds of manipulations that we are going to make.

Sure.

Ian
combine corrupts insns + dumps with insn cost problems
I'm seeing this on my 16-bit ix86 port. Something isn't right:

insn_cost 5: 12
insn_cost 6: 8
insn_cost 7: 4
...
rejecting combination of insns 5 and 6
original costs 12 + 8 = 24
replacement cost 28

Now, 12 + 8 = 20, not 24. The cost obviously includes insn 7 also. What's happening is that combine is trying to combine insns 5 and 6 but needs a CCmode change in insn 7, because we have plain CCmode but SELECT_CC_MODE chooses CCZ_Cmode for the combined insn 5+6.

The original insns:

(insn 5 2 6 2 .../gcc.c-torture/execute/2801-4.c:12 (set (reg/f:HI 22)
        (const:HI (plus:HI (symbol_ref/f:HI ("*.LC0") [flags 0x2] <string_cst 0xb7ea5ee0>)
                (const_int 1 [0x1])))) 9 {*movhi} (nil))

(insn 6 5 7 2 .../gcc/testsuite/gcc.c-torture/execute/2801-4.c:12 (set (reg:CC 13 cc)
        (compare:CC (mem/s:QI (reg/f:HI 22) [0 S1 A8])
            (const_int 0 [0x0]))) 436 {cmpqi_cc} (expr_list:REG_DEAD (reg/f:HI 22)
        (nil)))

(insn 7 6 10 2 .../gcc/testsuite/gcc.c-torture/execute/2801-4.c:12 (parallel [
            (set (reg:HI 24)
                (eq:HI (reg:CC 13 cc)
                    (const_int 0 [0x0])))
            (clobber (scratch:QI))
            (clobber (reg:CC 13 cc))
        ]) 454 {*seqhi_cc} (expr_list:REG_DEAD (reg:CC 13 cc)
        (expr_list:REG_UNUSED (reg:CC 13 cc)
            (nil))))

The replacements:

(set (reg:CCZ_C 13 cc)
    (compare:CCZ_C (mem/s:QI (const:HI (plus:HI (symbol_ref/f:HI ("*.LC0") [flags 0x2] <string_cst 0xb7ea5ee0>)
                    (const_int 1 [0x1]))) [0 S1 A8])
        (const_int 0 [0x0])))

(set (reg:HI 24)
    (eq:HI (reg:CCZ_C 13 cc)
        (const_int 0 [0x0])))

As noted, combine rejects the replacement. But the structure of insn 7 has now been corrupted:

(insn 5 2 6 2 .../gcc.c-torture/execute/2801-4.c:12 (set (reg/f:HI 22)
        (const:HI (plus:HI (symbol_ref/f:HI ("*.LC0") [flags 0x2] <string_cst 0xb7ea5ee0>)
                (const_int 1 [0x1])))) 9 {*movhi} (nil))

(insn 6 5 7 2 .../gcc.c-torture/execute/2801-4.c:12 (set (reg:CC 13 cc)
        (compare:CC (mem/s:QI (reg/f:HI 22) [0 S1 A8])
            (const_int 0 [0x0]))) 436 {cmpqi_cc} (expr_list:REG_DEAD (reg/f:HI 22)
        (nil)))

(insn 7 6 10 2 .../gcc.c-torture/execute/2801-4.c:12 (parallel [
            (set (reg:HI 24)
                (eq:HI (reg:CC 13 cc)
                    (const_int 0 [0x0])))
            (clobber (reg:CC 13 cc))
        ]) 454 {*seqhi_cc} (expr_list:REG_DEAD (reg:CC 13 cc)
        (expr_list:REG_UNUSED (reg:CC 13 cc)
            (nil))))

The clobber of the scratch register has disappeared! A possible clue as to what sets up the failure is that the second replacement insn (to replace insn 7)

(set (reg:HI 24)
    (eq:HI (reg:CCZ_C 13 cc)
        (const_int 0 [0x0])))

needs to have a clobber added. It really looks like this:

(set (reg:HI 24)
    (eq:HI (reg:CCZ_C 13 cc)
        (const_int 0 [0x0])))
(clobber (reg:CC 13 cc))

Combine knows how to add clobbers to make insns recognizable. I'm guessing it accidentally clobbers the original insn in doing so. Where would I look?

-- 
Rask Ingemann Lambertsen
Re: combine corrupts insns + dumps with insn cost problems
Hello!

> The clobber of the scratch register has disappeared! A possible clue as to what sets up the failure is that the second replacement insn (to replace insn 7)
>
> (set (reg:HI 24)
>     (eq:HI (reg:CCZ_C 13 cc)
>         (const_int 0 [0x0])))
>
> needs to have a clobber added. It really looks like this:
>
> (set (reg:HI 24)
>     (eq:HI (reg:CCZ_C 13 cc)
>         (const_int 0 [0x0])))
> (clobber (reg:CC 13 cc))
>
> Combine knows how to add clobbers to make insns recognizable. I'm guessing it accidentally clobbers the original insn in doing so. Where would I look?

Try tracing through recog_for_combine() in combine.c.

Uros.
Re: combine corrupts insns + dumps with insn cost problems
> Combine knows how to add clobbers to make insns recognizable. I'm guessing it accidentally clobbers the original insn in doing so. Where would I look?

Anywhere in combine. :-) This is by design, see the SUBST macro and the undo buffer machinery. You need to put a watchpoint on your insn.

-- 
Eric Botcazou
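
The "SUBST macro and undo buffer" idea mentioned above can be illustrated with a deliberately tiny journal. This is a hypothetical miniature, not GCC's actual data structures: `UndoBuffer`, `subst` and `undo_all` here are illustrative names operating on plain `int` slots rather than RTL. It shows one plausible way an insn can end up changed after a rejected combination: only writes that went through the journal are rolled back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of an undo journal (not GCC's real code).
struct UndoEntry {
    int* where;      // location that was overwritten
    int  old_value;  // contents before the overwrite
};

class UndoBuffer {
public:
    // Overwrite *where, remembering the previous contents.
    void subst(int* where, int new_value) {
        entries_.push_back(UndoEntry{where, *where});
        *where = new_value;
    }

    // Roll back every recorded change, newest first.
    void undo_all() {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it)
            *it->where = it->old_value;
        entries_.clear();
    }

    std::size_t pending() const { return entries_.size(); }

private:
    std::vector<UndoEntry> entries_;
};

// A write that bypasses the journal is not restored when the
// tentative changes are rejected.
bool untracked_write_survives_undo() {
    int tracked = 1;
    int untracked = 2;
    UndoBuffer undo;
    undo.subst(&tracked, 10);  // journaled change
    untracked = 20;            // direct change, invisible to the journal
    undo.undo_all();           // "combination rejected"
    return tracked == 1 && untracked == 20;
}
```

In this toy model the lost `(clobber (scratch:QI))` would correspond to the `untracked` slot. A watchpoint on the insn, as suggested above, is the practical way to find which write is responsible.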
Proposal: adding two zeros to the integer cost to calibrate better.
Rask Ingemann Lambertsen [EMAIL PROTECTED] wrote:

> I'm seeing this on my 16-bit ix86 port. Something isn't right:
>
> insn_cost 5: 12
> insn_cost 6: 8
> insn_cost 7: 4
> ...
> rejecting combination of insns 5 and 6
> original costs 12 + 8 = 24
> replacement cost 28
>
> Now, 12 + 8 = 20, not 24. The cost obviously includes insn 7 also. What's happening is that combine is trying to combine insns 5, 6 but needs a CCmode change in insn 7 because we have plain CCmode but SELECT_CC_MODE chooses CCZ_Cmode for the combined insn 5+6.

I recommend appending two zeros to the integer costs, treating them as two decimal places. For example:

insn_cost 5: 1200   // 12.00
insn_cost 6: 800    // 8.00
insn_cost 7: 400    // 4.00
insn_cost 8: 433    // 4.33, slightly costlier than insn 7
insn_cost 9: 466    // 4.66, slightly costlier than insn 8
insn_cost 10: 500   // 5.00

Instructions 8 and 9 cost more than instruction 7, which has the value 4, and less than instruction 10, which has the value 5. There is no integer cost between 4 and 5, so the solution is to multiply by 100: 400 and 500 then mean 4.00 and 5.00, and every value from 401 to 499 becomes available for finer calibration.

Sincerely, J.C.
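
The scaling proposal can be written down as a small fixed-point helper. This is only a sketch of the arithmetic in the message; `kCostScale`, `scaled_cost` and `replacement_is_acceptable` are hypothetical names, and the acceptance rule (reject when the replacement is strictly more expensive) is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical scale factor: two extra decimal digits, as proposed above.
constexpr int kCostScale = 100;

// Build a scaled cost from a whole part and hundredths, e.g. (4, 33) -> 433.
constexpr int scaled_cost(int whole, int hundredths = 0) {
    return whole * kCostScale + hundredths;
}

// Assumed acceptance rule for the sketch: keep the replacement unless it
// costs strictly more than the insns it replaces.
constexpr bool replacement_is_acceptable(int original_total, int replacement) {
    return replacement <= original_total;
}
```

With these helpers, `scaled_cost(4, 33)` sits strictly between `scaled_cost(4)` and `scaled_cost(5)`, which is exactly the ordering that plain integer costs cannot express. The figures from the quoted dump carry over unchanged: an original total of `scaled_cost(24)` against a replacement of `scaled_cost(28)` is still rejected.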
gcc-4.2-20070627 is now available
Snapshot gcc-4.2-20070627 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20070627/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 126067

You'll find:

gcc-4.2-20070627.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.2-20070627.tar.bz2       C front end and core compiler
gcc-ada-4.2-20070627.tar.bz2        Ada front end and runtime
gcc-fortran-4.2-20070627.tar.bz2    Fortran front end and runtime
gcc-g++-4.2-20070627.tar.bz2        C++ front end and runtime
gcc-java-4.2-20070627.tar.bz2       Java front end and runtime
gcc-objc-4.2-20070627.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.2-20070627.tar.bz2  The GCC testsuite

Diffs from 4.2-20070620 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: [ARM] Cirrus EP93xx Maverick Crunch Support - condexec / bugfixing / co-processor offset out of range
On Wed, 27 Jun 2007 12:31:42 +0200, Rask Ingemann Lambertsen [EMAIL PROTECTED] said:
> On Wed, Jun 27, 2007 at 06:45:26PM +1000, Hasjim Williams wrote:
>
>> It also fixes up the internal compiler error: output_operand: '%l' operand isn't a label error...
>>
>> Incidentally, does anyone know if can you do something like:
>>
>>   if_then_else (ge (match_operand:!CCFP 1 "cc_register" "") (const_int 0))
>
> You can't (but mode macros help). As Paolo says, you will have to define one or more new comparison modes and you will have to define branch insns which use the new mode(s), comparison insns which set the cc register in the new mode(s), new sCC style insns, and so on. Additionally, look at SELECT_CC_MODE and TARGET_CC_MODE_COMPATIBLE. If you have some sort of arm_output_compare_insn() function, modify that as well.
>
> The significance of defining a CCmode is that is says that comparisons done in that mode set the flags in a specific way.

Thanks. This really clears things up for me.

For the moment, I will leave conditional execution disabled for EVERYTHING when compiling for MaverickCrunch. I think the arm.md code only conditionally executes operands if the compare was in SImode anyway (see the scc insns in gcc/config/arm/arm.md). Can anyone confirm this?

This just leaves me with one other major bug for MaverickCrunch. It is related to the bugs in the Cirrus silicon. Mainstream gcc and older versions of gcc have a parameter -mcirrus-fix-invalid-insns. The patch from Nucleus Systems (http://www.nucleusys.com/projects/crunch.php) removes this parameter and replaces it with two, -mfix-crunch-d0 and -mfix-crunch-d1. I've modified it and attached it to this post. At the moment, I hard code both to 0 to disable the bug fixes, since I think enabling them is the cause of a "co-processor offset out of range" error in the assembler.

Essentially, the attached code works around two major bugs. First, two nops are needed after a branch. Second, a register written to in one instruction cannot be read from in the next instruction without a non-MaverickCrunch operation, i.e. a nop, in between.

This extra code is run in arm_reorg, which is always run on ARM, since an address can only be loaded within a limited distance around the pc. Likewise for the MaverickCrunch coprocessor, we only have an 8-bit word offset, which means a max 1024 byte offset, minus the 8 byte minimum jump, etc.

Now, it seems that whether this patch is applied (and turned on) or not, I get "co-processor offset out of range" errors, because of the extra NOPs inserted between the jump and the original label point. I think this in turn shifts the offset. I can't see any way to easily recalculate or fix the coprocessor offset instructions, since this happens AFTER the instruction has been generated.

I tried to hack around this by putting 2 NOPs before all cirrus instructions and modifying the length of each instruction. I think this means that the coprocessor offset will be correct, since the NOP has been generated BEFORE the instruction was generated. However, it will mean that all cirrus instructions are slower, since some will have additional unnecessary NOPs placed before them.

I don't think that alone will work, though. I think the "co-processor offset out of range" error is generated because of the cfldrs and cfldrd instructions. These are used to load floating point values from memory into MaverickCrunch registers directly. I commented out the cirrus_movsf / cirrus_movdf insn patterns (in asm generation), and it removed the error.

Does this mean that someone calculated the pool_range / neg_pool_range attrs incorrectly, or are the constraints I talk about below missing?

Is there something else in arm.c/h that I should be looking at?

arm_legitimate_index_p ???
arm_coproc_mem_operand ???
EXTRA_CONSTRAINT_STR_ARM ?

'Uv' is an address valid for VFP load/store insns. - i.e. doesn't support writeback
'Uy' is an address valid for iwmmxt load/store insns. - i.e. supports writeback

Is there supposed to be something similar for FPA / Maverick load/store insns? Or should it use the Uy mode? Only the VFP supports the writeback modes? Autoincrement / decrement modes? Only VFP does this, I think...

I think m mode is only used for r-mE and m-r. w-UvE Uv - w for VFP. y - yrUy yrUy - y for iwMMXt. However, this isn't done for FPA. Is this a bug for FPA? Or hasn't it been picked up since no-one really uses FPA?

Also, once I get the code doing what it's supposed to do and generate a patch against svn HEAD, do I need to do anything else special besides posting it to gcc-patches and letting it go through the review process? http://gcc.gnu.org/contribute.html mentions some forms for Legal Prerequisites...

diff -ruN /home/hwilliams/openembedded/build/tmp/work/ep9312-angstrom-linux-gnueabi/gcc-cross-4.1.2-r0/gcc-4.1.2/gcc/config/arm/arm.c gcc-4.1.2/gcc/config/arm/arm.c ---
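
The offset limit discussed in the message above can be sketched as a simple range check. This assumes the usual ARM coprocessor load/store encoding, where the 8-bit immediate counts 4-byte words and a separate bit selects the direction, which gives a reach of 255 words (1020 bytes) either way rather than a full 1024. The names `coproc_offset_in_range` and `offset_after_padding` are hypothetical and are not functions from arm.c.

```cpp
#include <cassert>

// Assumed encoding: 8-bit unsigned word count plus an up/down bit.
constexpr int kOffsetBits = 8;
constexpr int kBytesPerWord = 4;
constexpr long kMaxOffsetBytes = ((1L << kOffsetBits) - 1) * kBytesPerWord;  // 1020

// True if a pc-relative byte offset fits a coprocessor load/store.
bool coproc_offset_in_range(long offset_bytes) {
    if (offset_bytes % kBytesPerWord != 0)
        return false;  // offsets must be word-aligned
    return offset_bytes >= -kMaxOffsetBytes && offset_bytes <= kMaxOffsetBytes;
}

// Model of the effect described above: each NOP inserted between the load
// and its literal moves the literal 4 bytes further away.
long offset_after_padding(long offset_bytes, int nops_inserted) {
    return offset_bytes + 4L * nops_inserted;
}
```

Under these assumptions a literal that sat 1016 bytes away is still reachable, but the same literal after two inserted NOPs (1024 bytes) is not, which matches the symptom of NOP insertion after the pool ranges were computed.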
Re: I'm sorry, but this is unacceptable (union members and ctors)
Antoine Chavasse wrote:

>> For instance, say you need to implement a GUI, so you have yourself a rectangle struct which consists of four floating point values (the origin and difference between the opposite corner) ...Now you want those four values, but you also have a 2D vector struct.
>
> Here is a portable alternative to achieve this:
>
> struct Rectangle
> {
> private:
>     Vector2D m_Position;
>     Vector2D m_Size;
>
> public:
>     Vector2D& position() { return m_Position; }
>     const Vector2D& position() const { return m_Position; }
>
>     Vector2D& size() { return m_Size; }
>     const Vector2D& size() const { return m_Size; }
>
>     float& left() { return m_Position.x; }
>     float left() const { return m_Position.x; }
>
>     float& top() { return m_Position.y; }
>     float top() const { return m_Position.y; }
>
>     float& width() { return m_Size.x; }
>     float width() const { return m_Size.x; }
>
>     float& height() { return m_Size.y; }
>     float height() const { return m_Size.y; }
> };
>
> Then you can access the members like this:
>
> Rectangle somerectangle;
> somerectangle.position().x = 45;
> somerectangle.left() = 45;

I pointed this out as the obvious portable solution somewhere in the thread. I just firmly believe this is an unnecessarily back-breaking way of going about it (and physically back-breaking for whoever would have to change all of the code).

It would be a blessing were intelligible code somewhat higher up on the rungs of C++ priorities (being the ever ubiquitous mainstay systems programming language it has become and will likely remain).
Re: I'm sorry, but this is unacceptable (union members and ctors)
On Wed, Jun 27, 2007 at 10:14:18PM -0700, michael.a wrote:
>>> For instance, say you need to implement a GUI, so you have yourself a rectangle struct which consists of four floating point values (the origin and difference between the opposite corner) ...Now you want those four values, but you also have a 2D vector struct.
> ...
> I pointed this out as the obvious portable solution somewhere in the thread. I just firmly believe this is an unnecessarily back breaking way of going about it (and physically backbreaking for whoever would have to change all of the code)
>
> It would be a blessing were intelligible code somewhat higher up on the rungs of c++ priorities (being the ever ubiquitous mainstay systems programming language it has become and will likely remain)

Mind reading has always been considered a blessing when it comes to programming languages. Also, an impossibility.

I don't understand what is being requested. Have one structure with four fields, and another with two, and allow them to be used automatically interchangeably? How is this a good thing? How will this prevent the implementor from making a stupid mistake?

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Neighbourhood Coder
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them...

http://mark.mielke.cc/
[Bug c/32520] C/C++ programs segfault at runtime if arrays larger than 8MB are declared.
--- Comment #1 from pluto at agmk dot net 2007-06-27 07:28 ---
The 8MB array overflows the stack, and gcc has nothing to do here because the stack size is controlled by the operating system. Use `ulimit -s [stack size in kB]` to work around this problem.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32520
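
Besides raising the stack limit, a program can avoid the problem by not placing the large array in automatic storage at all. The following is a minimal sketch, assuming the array is only needed inside one function; `kElements` and `fill_large_buffer` are hypothetical names, and the 16 MiB size is chosen only to exceed the 8 MB figure from the report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Larger than the 8 MB stack limit mentioned in the bug report.
constexpr std::size_t kElements = 16u * 1024u * 1024u;

// std::vector keeps its elements on the heap, so only a small control
// block lives in the function's stack frame.
std::size_t fill_large_buffer() {
    std::vector<unsigned char> buffer(kElements, 0);
    buffer.front() = 1;
    buffer.back() = 2;
    return buffer.size();
}
```

Declaring the same array as a local `unsigned char buffer[kElements];` is what would exhaust a default-sized stack. A `static` array or a global would also work, at the cost of the storage living for the whole program run.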
[Bug fortran/32439] hard-coded memory limit ? f951: out of memory with '-O1 -fbounds-check'
--- Comment #5 from jv244 at cam dot ac dot uk 2007-06-27 07:31 --- could be similar to PR32514 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32439
[Bug middle-end/32514] out of memory using -fprofile-generate
--- Comment #1 from jv244 at cam dot ac dot uk 2007-06-27 07:35 ---
this is for:

Target: x86_64-unknown-linux-gnu
Configured with: /data03/vondele/gcc_trunk/gcc/configure --prefix=/data03/vondele/gcc_trunk/build --with-gmp=/data03/vondele/ --with-mpfr=/data03/vondele/ --enable-languages=c,fortran
Thread model: posix
gcc version 4.3.0 20070626 (experimental)

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32514
[Bug target/28904] operand out of range on Linux/PowerPC
--- Comment #7 from srm at schokokeks dot org 2007-06-27 08:06 ---
I have checked with 4.2.0 and it produces the same error. Maybe I should rebuild python with 4.2 too?

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28904
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #119 from jv244 at cam dot ac dot uk 2007-06-27 08:24 ---
Testing gcc 4.2.0 I unfortunately found that it miscompiles CP2K. The following testcase:

tests/DFTB/regtest-scc/h2o-1.inp

yields incorrect results. Should be similar to:

Total energy: -130.561836

whereas one gets

Total energy: -127.642599

This is a very large difference beyond numerics. The miscompilation is triggered by:

# BUG
FCFLAGS = -O3 -ffast-math -ftree-vectorize -march=native

but not

# OK
FCFLAGS = -O3 -ffast-math -march=native
# OK
FCFLAGS = -O3 -funroll-loops -ftree-vectorize -march=native

I might try to find out which module gets miscompiled, but this could be a bit of a slow process.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug fortran/32439] hard-coded memory limit ? f951: out of memory with '-O1 -fbounds-check'
--- Comment #6 from fxcoudert at gcc dot gnu dot org 2007-06-27 08:51 --- (In reply to comment #3) So, it looks like something inside gcc is hard-coded to just 1Gb of memory, instead of the available memory. That's probably a stupid thing to ask, but you don't have any shell limits (as reported per ulimit -a) that would match this number, do you? -- fxcoudert at gcc dot gnu dot org changed: What|Removed |Added CC||fxcoudert at gcc dot gnu dot ||org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32439
[Bug target/32450] -pg seemingly causes miscompilation
--- Comment #4 from fxcoudert at gcc dot gnu dot org 2007-06-27 08:58 --- (In reply to comment #3) basically, you need -O2 and -march=native to trigger the bug I can't reproduce that, what is your processor exactly? (ie what is native for you) -- fxcoudert at gcc dot gnu dot org changed: What|Removed |Added CC||fxcoudert at gcc dot gnu dot ||org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32450
[Bug c/32520] C/C++ programs segfault at runtime if arrays larger than 8MB are declared.
--- Comment #2 from rguenth at gcc dot gnu dot org 2007-06-27 08:59 --- Adjust your available stack size. -- rguenth at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32520
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #120 from pinskia at gmail dot com 2007-06-27 09:37 --- Subject: Re: [meta-bugs] ICEs with CP2K On 27 Jun 2007 08:24:46 -, jv244 at cam dot ac dot uk [EMAIL PROTECTED] wrote: # BUG FCFLAGS = -O3 -ffast-math -ftree-vectorize -march=native So -ffast-math with vectorizer changes the results. I bet this is due to reduction which is done for -ffast-math with -ftree-vectorize. In which case it might not be a bug. Yes 3 out of 130 is actually huge but if the values are huge to begin with, it might be the case this is just a precision issue. -- Pinski -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
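To make the reduction argument concrete: a vectorized reduction keeps several partial sums and adds them together at the end, which reorders the floating-point additions. The sketch below imitates that by hand with two accumulators, as a 2-lane vector would. The function names and the input values are invented for illustration and have nothing to do with the CP2K sources.

```c
#include <stddef.h>

/* Sum in source order, as the scalar loop would. */
float sum_sequential(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Sum with two partial accumulators that are combined at the end.
   This is the reassociation a 2-lane vector reduction performs; it is
   only valid if the compiler may reorder FP additions (-ffast-math). */
float sum_two_lanes(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i];
        s1 += x[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        s0 += x[i];
    return s0 + s1;
}
```

On a target that evaluates float arithmetic in IEEE single precision, the input { 1e8f, 1.0f, -1e8f, 1.0f } should give 1 from the first function and 2 from the second, because 1e8f + 1.0f rounds back to 1e8f. Whether an effect of this kind can explain a difference of 3 in 130 depends entirely on the magnitudes involved in the real computation.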
[Bug target/32450] -pg seemingly causes miscompilation
--- Comment #5 from jv244 at cam dot ac dot uk 2007-06-27 10:34 --- (In reply to comment #4) (In reply to comment #3) basically, you need -O2 and -march=native to trigger the bug I can't reproduce that, what is your processor exactly? (ie what is native for you) ... here is a suggestion for the gcc crew ... what about having gfortran -v also print the value of -march=native. Honestly, I often don't know the precise CPU I'm running on (that's why I find -march=native useful in the first place). In this case, it is either on an Intel Core 2 Duo or on an Opteron. I'll see if I can figure out where I did these runs -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32450
[Bug rtl-optimization/32084] gfortran 4.3 13%-18% slower for induct.f90 than gcc 4.0-based competitor
--- Comment #4 from ubizjak at gmail dot com 2007-06-27 11:24 --- (In reply to comment #3) The problem is in -ftree-vectorize The difference is that without -ftree-vectorize the inner loop (do k = 1, 9) is completely unrolled, but with vectorization, the loop is vectorized, but _not_ unrolled. Since the vectorization factor is only 2 for V2DF mode vectors, we lose big time at this point. My best guess for unroller problems would be rtl-optimization. -- ubizjak at gmail dot com changed: What|Removed |Added CC|dorit at gcc dot gnu dot org| Component|tree-optimization |rtl-optimization http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32084
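As a rough picture of the shapes being compared, here is a hand-written C imitation. It is not the induct.f90 code: the dot-product body and both function names are assumptions chosen only to show a trip count of 9 against a vectorization factor of 2.

```c
/* Inner loop as it might be written in the source: the trip count is a
   known constant (9), so a complete unroller can remove the loop. */
double dot9_loop(const double *a, const double *b)
{
    double s = 0.0;
    for (int k = 0; k < 9; k++)
        s += a[k] * b[k];
    return s;
}

/* Imitation of vectorization with factor 2 and no unrolling: the loop
   survives with four iterations over pairs of elements, followed by
   one scalar iteration for the ninth element. */
double dot9_vf2(const double *a, const double *b)
{
    double lane0 = 0.0, lane1 = 0.0;
    for (int k = 0; k + 1 < 9; k += 2) {
        lane0 += a[k] * b[k];
        lane1 += a[k + 1] * b[k + 1];
    }
    return lane0 + lane1 + a[8] * b[8];
}
```

With only two lanes, the second form still pays for loop control on four iterations plus the leftover element, which is one way to read the "lose big time" remark when the alternative is straight-line code.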
[Bug target/32450] -pg seemingly causes miscompilation
--- Comment #6 from pinskia at gcc dot gnu dot org 2007-06-27 11:25 --- ... here is a suggestion for the gcc crew ... what about having gfortran -v When you invoke gfortran with -v -march=native and with a source file, it will show the values. This is the recommended way of showing how you invoked gcc/gfortran anyways. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32450
[Bug target/32450] -pg seemingly causes miscompilation
--- Comment #7 from fxcoudert at gcc dot gnu dot org 2007-06-27 11:38 --- (In reply to comment #6) When you invoke gfortran with -v -march=native and with a source file, it will show the values. This is the recommended way of showing how you invoked gcc/gfortran anyways. Yes, that works fine: $ gcc -c -v -march=native a.c [...] /path/to/cc1 -quiet -v a.c -march=k8 -msahf --param l1-cache-size=1024 --param l1-cache-line-size=64 -mtune=k8 -quiet -dumpbase a.c -auxbase a -version -o /tmp/ccYMbEr2.s -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32450
[Bug rtl-optimization/32084] gfortran 4.3 13%-18% slower for induct.f90 than gcc 4.0-based competitor
--- Comment #5 from dorit at il dot ibm dot com 2007-06-27 11:57 --- (In reply to comment #4) (In reply to comment #3) The problem is in -ftree-vectorize The difference is that without -ftree-vectorize the inner loop (do k = 1, 9) is completely unrolled, but with vectorization, the loop is vectorized, but _not_ unrolled. Since the vectorization factor is only 2 for V2DF mode vectors, we lose big time at this point. My best guess for unroller problems would be rtl-optimization. Could it be the tree-level complete unroller? (Does the vectorizer peel the loop to handle a misaligned store by any chance? If so, and if the misalignment amount is unknown, then the number of iterations of the vectorized loop is unknown, in which case the complete unroller wouldn't work.) In autovect-branch the tree-level complete unroller is before the vectorizer - wonder what happens there. Another thing to consider is using -fvect-cost-model (it's very preliminary and hasn't been tuned much, but this could be a good data point for whoever wants to tune the vectorizer cost-model for x86_64). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32084
[Bug target/32450] -pg seemingly causes miscompilation
--- Comment #8 from jv244 at cam dot ac dot uk 2007-06-27 12:15 --- (In reply to comment #7) (In reply to comment #6) When you invoke gfortran with -v march=native and with a source file, it will right.. that shows: gfortran --verbose -O2 -march=native -pg all.f90 Driving: gfortran -v -O2 -march=native -pg all.f90 -lgfortranbegin -lgfortran -lm -shared-libgcc Using built-in specs. Target: x86_64-unknown-linux-gnu Configured with: /data/vondele/gcc_trunk/gcc/configure --prefix=/data/vondele/gcc_trunk/build --enable-languages=c,fortran --with-mpfr=/data/programs/mpfr/ Thread model: posix gcc version 4.3.0 20070626 (experimental) /data/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/f951 all.f90 -march=core2 -mcx16 -msahf --param l1-cache-size=512 --param l1-cache-line-size=64 -mtune=core2 -quiet -dumpbase all.f90 -auxbase all -O2 -version -p -fintrinsic-modules-path /data/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/finclude -o /tmp/ccLxqkOP.s GNU F95 version 4.3.0 20070626 (experimental) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.3.0 20070626 (experimental), GMP version 4.1.4, MPFR version 2.2.1. GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 and that also leads to the executable which fails as: bench02 ../src/cp2k.sopt JAC_gen.inp CP2K| Stopped by processor number 0 CP2K| cp_log_handling:cp_add_default_loggertoo many default loggers, increase max_stack_pointer in cp_log_handling CP2K| Error number was 100 STOP mp_stop -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32450
[Bug fortran/32439] hard-coded memory limit ? f951: out of memory with '-O1 -fbounds-check'
--- Comment #7 from jv244 at cam dot ac dot uk 2007-06-27 12:18 --- (In reply to comment #6) (In reply to comment #3) So, it looks like something inside gcc is hard-coded to just 1Gb of memory, instead of the available memory. That's probably a stupid thing to ask, but you don't have any shell limits (as reported per ulimit -a) that would match this number, do you? I don't think so. We run large memory jobs on that machine (that's why we have it in the first place). I get the following output: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited file size (blocks, -f) unlimited pending signals (-i) 529920 max locked memory (kbytes, -l) 32 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 529920 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32439
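The same limits can also be read from inside a process, which helps when the compiler is started from a build system whose environment differs from the interactive shell. A minimal sketch using the POSIX getrlimit call; describe_limit is a made-up helper name, and the kB formatting is only meant to resemble the ulimit -a output.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Write the soft limit for RESOURCE (e.g. RLIMIT_STACK, RLIMIT_DATA,
   RLIMIT_AS) into BUF, either as "unlimited" or as a size in kB.
   Returns 0 on success and -1 if getrlimit fails. */
int describe_limit(int resource, char *buf, size_t buflen)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY)
        snprintf(buf, buflen, "unlimited");
    else
        snprintf(buf, buflen, "%llu kB",
                 (unsigned long long)rl.rlim_cur / 1024ULL);
    return 0;
}
```

Calling it for RLIMIT_AS and RLIMIT_DATA from a small wrapper run in the same environment as f951 would show whether any limit near 1 GB is in effect there.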
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #121 from jv244 at cam dot ac dot uk 2007-06-27 12:47 --- (In reply to comment #119) I might try to find out which module gets miscompiled, but this could be a bit of a slow process. miscompilation happens with the module qs_neighbor_lists. It is a module with lots of dependencies, so I don't think I will get a reduced testcase for this. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #122 from jv244 at cam dot ac dot uk 2007-06-27 12:51 --- (In reply to comment #120) I bet this is due to reduction which is done for -ffast-math with -ftree-vectorize. In which case it might not be a bug. Yes 3 out of 130 is actually huge but if the values are huge to begin with, it might be the case this is just a precision issue. I don't think there is code in the module mentioned before that would be sensitive to changes in the way reductions are done. It is likely something else. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #123 from jv244 at cam dot ac dot uk 2007-06-27 13:54 --- (In reply to comment #121) (In reply to comment #119) I might try to find out which module gets miscompiled, but this could be a bit of a slow process. miscompilation happens with the module qs_neighbor_lists. It is a module with lots of dependencies, so I don't think I will get a reduced testcase for this. Compiling that module under valgrind gives an error:

valgrind --tool=memcheck /data03/vondele/gcc_4_2_0/build/libexec/gcc/x86_64-unknown-linux-gnu/4.2.0/f951 qs_neighbor_lists.f90 -march=k8 -mtune=k8 -quiet -dumpbase qs_neighbor_lists.f90 -auxbase qs_neighbor_lists -O3 -version -ffast-math -ftree-vectorize -ftree-vectorizer-verbose=1 -I /data03/vondele/gcc_4_2_0/build/lib/gcc/x86_64-unknown-linux-gnu/4.2.0/finclude -o /tmp/ccoFFIrV.s

==30523== Conditional jump or move depends on uninitialised value(s)
==30523==    at 0x706E08: vrp_evaluate_conditional_warnv (tree-vrp.c:4186)
==30523==    by 0x706F9C: vrp_evaluate_conditional (tree-vrp.c:4318)
==30523==    by 0x4B6E9F: substitute_and_fold (tree-ssa-propagate.c:1053)
==30523==    by 0x700F04: execute_vrp (tree-vrp.c:5318)
==30523==    by 0x6F9F27: execute_one_pass (passes.c:881)
==30523==    by 0x6FA08B: execute_pass_list (passes.c:932)
==30523==    by 0x6FA09D: execute_pass_list (passes.c:933)
==30523==    by 0x48CCCD: tree_rest_of_compilation (tree-optimize.c:463)
==30523==    by 0x742363: cgraph_expand_function (cgraphunit.c:1244)
==30523==    by 0x742C8D: cgraph_optimize (cgraphunit.c:1309)
==30523==    by 0x4633DC: gfc_be_parse_file (f95-lang.c:307)
==30523==    by 0x6DBF92: toplev_main (toplev.c:1033)

also, I checked all vectorized loops in the code path that gets executed for the testcase, and there is only one trivial one (zeroing a freshly allocated array). Rewriting that bit so that it doesn't get vectorized still somehow triggers the bug. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug middle-end/32492] [4.3 Regression] attribute always_inline - sorry, unimplemented: recursive inlining
--- Comment #14 from rguenth at gcc dot gnu dot org 2007-06-27 14:01 --- Subject: Bug 32492 Author: rguenth Date: Wed Jun 27 14:01:27 2007 New Revision: 126054 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=126054 Log: 2007-06-27 Richard Guenther [EMAIL PROTECTED] PR middle-end/32492 * tree.h (fold_convertible_p): Declare. * fold-const.c (fold_convertible_p): New function. * gimplify.c (gimplify_call_expr): Use fold_convertible_p instead of lang_hooks.types_compatible_p. * gcc.dg/inline-22.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/inline-22.c Modified: trunk/gcc/ChangeLog trunk/gcc/fold-const.c trunk/gcc/gimplify.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree.h -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32492
[Bug c/32493] gcc-20070624 fails linux-kernel due to changed gcc-inlining
--- Comment #2 from rguenth at gcc dot gnu dot org 2007-06-27 14:17 --- Reduced testcase:

static inline __attribute__((always_inline))
void __check_printsym_format(const char *fmt, ...)
{
}

static inline __attribute__((always_inline))
void print_symbol(const char *fmt, unsigned long addr)
{
  __check_printsym_format(fmt, "");
}

void do_initcalls(void **call)
{
  print_symbol(": %s()", (unsigned long) *call);
}

Now, if we make use of the passed variable arguments we would have hit

t2.i: In function '__check_printsym_format':
t2.i:2: sorry, unimplemented: function '__check_printsym_format' can never be inlined because it uses variable argument lists

That we now hit this even if the varargs are unused is ... unfortunate. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32493
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #124 from jv244 at cam dot ac dot uk 2007-06-27 14:21 --- (In reply to comment #123) and there is no valgrind error if I remove -ftree-vectorize from the options. Which, I guess, explains why things get compiled correctly in that case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug target/32437] [4.3 Regression] MIPS: FAIL in gcc.dg/cleanup-[8|9|10|11].c
--- Comment #19 from richard at codesourcery dot com 2007-06-27 14:37 --- Subject: Re: [4.3 Regression] MIPS: FAIL in gcc.dg/cleanup-[8|9|10|11].c

Kenneth Zadeck [EMAIL PROTECTED] writes:
2007-06-23 Kenneth Zadeck [EMAIL PROTECTED]
PR middle-end/32437
*dce.c (deletable_insn_p): Add extra parameter and recurse if insn is a PARALLEL.
(prescan_insns_for_dce): Add extra parameter.

Kenny found that this patch introduced problems on x86 (I think it was) because it applied the special handling for bare CLOBBERs to those inside PARALLELs as well. We don't want that; bare USEs and CLOBBERs are special DF markers, but USEs and CLOBBERs inside PARALLELs are parts of asms or define_insns.

Kenny pre-approved the patch below. Bootstrapped & regression-tested on x86_64-linux-gnu. Applied to mainline.

Richard

gcc/
* dce.c (deletable_insn_p_1): New function, split out from...
(deletable_insn_p): ...here. Only treat bare USEs and CLOBBERs specially, not those inside PARALLELs. Remove BODY argument and adjust recursive call accordingly.
(prescan_insns_for_dce): Update call to delete_insn_p.

Index: gcc/dce.c
===================================================================
--- gcc/dce.c (revision 126053)
+++ gcc/dce.c (working copy)
@@ -58,16 +58,15 @@ static VEC(rtx,heap) *worklist;
 static sbitmap marked = NULL;
 
-/* Return true if INSN with BODY is a normal instruction that can be
-   deleted by the DCE pass.  */
+/* A subroutine for which BODY is part of the instruction being tested;
+   either the top-level pattern, or an element of a PARALLEL.  The
+   instruction is known not to be a bare USE or CLOBBER.  */
 
 static bool
-deletable_insn_p (rtx insn, rtx body, bool fast)
+deletable_insn_p_1 (rtx body)
 {
-  rtx x;
   switch (GET_CODE (body))
     {
-    case USE:
     case PREFETCH:
     case TRAP_IF:
       /* The UNSPEC case was added here because the ia-64 claims that
@@ -79,6 +78,35 @@ deletable_insn_p (rtx insn, rtx body, bo
     case UNSPEC:
       return false;
 
+    default:
+      if (volatile_insn_p (body))
+        return false;
+
+      if (flag_non_call_exceptions && may_trap_p (body))
+        return false;
+
+      return true;
+    }
+}
+
+/* Return true if INSN is a normal instruction that can be deleted by
+   the DCE pass.  */
+
+static bool
+deletable_insn_p (rtx insn, bool fast)
+{
+  rtx body, x;
+  int i;
+
+  if (!NONJUMP_INSN_P (insn))
+    return false;
+
+  body = PATTERN (insn);
+  switch (GET_CODE (body))
+    {
+    case USE:
+      return false;
+
     case CLOBBER:
       if (fast)
         {
@@ -88,32 +116,20 @@ deletable_insn_p (rtx insn, rtx body, bo
           x = XEXP (body, 0);
           return REG_P (x) && (!HARD_REGISTER_P (x) || reload_completed);
         }
-      else 
+      else
         /* Because of the way that use-def chains are built, it is not
            possible to tell if the clobber is dead because it can
            never be the target of a use-def chain.  */
         return false;
 
     case PARALLEL:
-      {
-        int i;
-        for (i = XVECLEN (body, 0) - 1; i >= 0; i--)
-          if (!deletable_insn_p (insn, XVECEXP (body, 0, i), fast))
-            return false;
-        return true;
-      }
+      for (i = XVECLEN (body, 0) - 1; i >= 0; i--)
+        if (!deletable_insn_p_1 (XVECEXP (body, 0, i)))
+          return false;
+      return true;
 
     default:
-      if (!NONJUMP_INSN_P (insn))
-        return false;
-
-      if (volatile_insn_p (body))
-        return false;
-
-      if (flag_non_call_exceptions && may_trap_p (body))
-        return false;
-
-      return true;
+      return deletable_insn_p_1 (body);
     }
 }
 
@@ -369,7 +385,7 @@ prescan_insns_for_dce (bool fast)
         rtx note = find_reg_note (insn, REG_LIBCALL_ID, NULL_RTX);
         if (note)
           mark_libcall (insn, fast);
-        else if (deletable_insn_p (insn, PATTERN (insn), fast))
+        else if (deletable_insn_p (insn, fast))
           mark_nonreg_stores (PATTERN (insn), insn, fast);
         else
           mark_insn (insn, fast);

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32437
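For readers not familiar with dce.c, the shape of the fix can be shown with a toy model. Everything below is invented for illustration (the toy_ names, the struct layout, the is_volatile flag standing in for volatile_insn_p); it is not GCC code, and it leaves out the "fast" mode in which a bare CLOBBER of a suitable register can be deleted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; hypothetical, not the GCC definitions. */
enum toy_code { TOY_SET, TOY_USE, TOY_CLOBBER, TOY_UNSPEC, TOY_PARALLEL };

struct toy_pat {
    enum toy_code code;
    bool is_volatile;             /* stands in for volatile_insn_p */
    const struct toy_pat *elts;   /* elements when code is TOY_PARALLEL */
    size_t n_elts;
};

/* Helper for a top-level pattern or an element of a PARALLEL.  It has
   no special case for USE or CLOBBER: inside a PARALLEL they are just
   part of the instruction. */
static bool toy_deletable_1(const struct toy_pat *p)
{
    switch (p->code) {
    case TOY_UNSPEC:
        return false;
    default:
        return !p->is_volatile;
    }
}

/* Top-level test.  Only a bare USE or CLOBBER is treated as a special
   marker; elements of a PARALLEL go through the helper instead of
   recursing into this function. */
bool toy_deletable(const struct toy_pat *p)
{
    switch (p->code) {
    case TOY_USE:
    case TOY_CLOBBER:
        return false;   /* simplified: the real code has a fast mode */
    case TOY_PARALLEL:
        for (size_t i = 0; i < p->n_elts; i++)
            if (!toy_deletable_1(&p->elts[i]))
                return false;
        return true;
    default:
        return toy_deletable_1(p);
    }
}
```

In this model a bare TOY_CLOBBER is never deletable, while a TOY_PARALLEL containing a TOY_SET and a TOY_CLOBBER is, which is the distinction the follow-up patch restores.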
[Bug target/32437] [4.3 Regression] MIPS: FAIL in gcc.dg/cleanup-[8|9|10|11].c
--- Comment #20 from zadeck at naturalbridge dot com 2007-06-27 14:39 --- Subject: Re: [4.3 Regression] MIPS: FAIL in gcc.dg/cleanup-[8|9|10|11].c richard at codesourcery dot com wrote: --- Comment #19 from richard at codesourcery dot com 2007-06-27 14:37 --- [...] thanks -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32437
[Bug middle-end/32521] New: [4.2] vrp_evaluate_conditional_warnv (tree-vrp.c:4186) at -O3 -ffast-math -ftree-vectorize -march=native
as discussed in PR 29975 CP2K gets miscompiled by gfortran 4.2.0 (see comments 119 to 125). sources (src/all.f90) can be obtained from: http://www.pci.unizh.ch/vandevondele/tmp/CP2K_gcc_2007_06.tgz and are miscompiled with -O3 -ffast-math -ftree-vectorize -march=native the module that gets miscompiled is qs_neighbor_lists. under valgrind, compilation of that module shows:

valgrind --tool=memcheck /data03/vondele/gcc_4_2_0/build/libexec/gcc/x86_64-unknown-linux-gnu/4.2.0/f951 qs_neighbor_lists.f90 -march=k8 -mtune=k8 -quiet -dumpbase qs_neighbor_lists.f90 -auxbase qs_neighbor_lists -O3 -version -ffast-math -ftree-vectorize -ftree-vectorizer-verbose=1 -I /data03/vondele/gcc_4_2_0/build/lib/gcc/x86_64-unknown-linux-gnu/4.2.0/finclude -o /tmp/ccoFFIrV.s

==30523== Conditional jump or move depends on uninitialised value(s)
==30523==    at 0x706E08: vrp_evaluate_conditional_warnv (tree-vrp.c:4186)
==30523==    by 0x706F9C: vrp_evaluate_conditional (tree-vrp.c:4318)
==30523==    by 0x4B6E9F: substitute_and_fold (tree-ssa-propagate.c:1053)
==30523==    by 0x700F04: execute_vrp (tree-vrp.c:5318)
==30523==    by 0x6F9F27: execute_one_pass (passes.c:881)
==30523==    by 0x6FA08B: execute_pass_list (passes.c:932)
==30523==    by 0x6FA09D: execute_pass_list (passes.c:933)
==30523==    by 0x48CCCD: tree_rest_of_compilation (tree-optimize.c:463)
==30523==    by 0x742363: cgraph_expand_function (cgraphunit.c:1244)
==30523==    by 0x742C8D: cgraph_optimize (cgraphunit.c:1309)
==30523==    by 0x4633DC: gfc_be_parse_file (f95-lang.c:307)
==30523==    by 0x6DBF92: toplev_main (toplev.c:1033)

as discussed in PR29975, the combination of options above is needed to trigger the bug.

-- Summary: [4.2] vrp_evaluate_conditional_warnv (tree-vrp.c:4186) at -O3 -ffast-math -ftree-vectorize -march=native Product: gcc Version: 4.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: jv244 at cam dot ac dot uk http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32521
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #125 from jv244 at cam dot ac dot uk 2007-06-27 14:45 --- (In reply to comment #119) Testing gcc 4.2.0 I unfortunately found that it miscompiles CP2K. filed as PR 32521 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #19 from jb at gcc dot gnu dot org 2007-06-27 14:49 --- gfortran does inline most array intrinsics, but only if the result is a scalar. For most array intrinsics this isn't that much of a problem since usually one uses the variant that returns a scalar, but MINLOC is different in that usually one wants to use the version that returns an array. If one implements this I guess it would be straightforward to replicate the solution to many other array intrinsics as well. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
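For reference, this is roughly what an inlined MINLOC over a rank-1 array boils down to, written by hand in C. It is a sketch of the idea, not of what gfortran actually emits; minloc1 is a made-up name, and masks, the DIM argument and NaN handling are left out.

```c
#include <stddef.h>

/* Return the 1-based position of the first minimum of A, or 0 for an
   empty array, following the Fortran MINLOC convention.  Inlined, this
   is a single pass with no library call and no temporary result array. */
size_t minloc1(const double *a, size_t n)
{
    if (n == 0)
        return 0;

    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (a[i] < a[best])   /* strict "<" keeps the first occurrence */
            best = i;
    return best + 1;
}
```

The array-valued form in Fortran returns a rank-1 array with one position per dimension, so for a rank-1 argument the inlined code would only need to store this one index into a one-element result.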
[Bug c++/31309] g++ 4.2.0 amd64 codegen issue with -O0. 6 byte assignment at end of structure reads/writes past end of structure causing SEGV when that memory is not accessable.
--- Comment #4 from peeterj at ca dot ibm dot com 2007-06-27 14:49 --- removing Taavi from the CC list. Any update on getting this resolved? -- peeterj at ca dot ibm dot com changed: What|Removed |Added CC|taavib at ca dot ibm dot com| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31309
[Bug middle-end/32521] [4.2] vrp_evaluate_conditional_warnv (tree-vrp.c:4186) at -O3 -ffast-math -ftree-vectorize -march=native
--- Comment #1 from jv244 at cam dot ac dot uk 2007-06-27 14:59 --- this could be similar to PR 32006 as it had a similar valgrind trace. That bug was marked as a duplicate of another cp2k bug PR 32018 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32521
Re: [Bug target/32450] -pg seemingly causes miscompilation
When you invoke gfortran with -v march=native and with a source file, it will show the values. This is the recommended way of showing how you involved gcc/gfortran anyways. I get: f951: error: unrecognized command line option -march=native with [karma] bug/timing% gfc -v -march=native -O3 time_trans.f90 Driving: gfc -mmacosx-version-min=10.3 -v -march=native -O3 time_trans.f90 -lgfortranbegin -lgfortran -shared-libgcc Using built-in specs. Target: powerpc-apple-darwin7 Configured with: ../gcc-4.3-20070623/configure --prefix=/sw --prefix=/sw/lib/gcc4 --disable-multilib --enable-languages=c,c++,fortran,objc,java --infodir='/sw/lib/gcc4/share/info' --with-gmp=/sw --with-included-gettext --build=powerpc-apple-darwin7 --host=powerpc-apple-darwin7 --with-as=/sw/lib/odcctools/bin/as --with-ld=/sw/lib/odcctools/bin/ld --with-nm=/sw/lib/odcctools/bin/nm --with-ar=/sw/lib/odcctools/bin/ar --with-strip=/sw/lib/odcctools/bin/strip --with-ranlib=/sw/lib/odcctools/bin/ranlib Thread model: posix gcc version 4.3.0 20070622 (experimental) /sw/lib/gcc4/libexec/gcc/powerpc-apple-darwin7/4.3.0/f951 time_trans.f90 -fPIC -quiet -dumpbase time_trans.f90 -mmacosx-version-min=10.3 -march=native -auxbase time_trans -O3 -version -fintrinsic-modules-path /sw/lib/gcc4/lib/gcc/powerpc-apple-darwin7/4.3.0/finclude -o /var/tmp//ccEbNJW4.s f951: error: unrecognized command line option -march=native GNU F95 version 4.3.0 20070622 (experimental) (powerpc-apple-darwin7) compiled by GNU C version 4.3.0 20070622 (experimental), GMP version 4.2.1, MPFR version 2.2.1. GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 without -march=native, I get karma] bug/timing% gfc -v -O3 time_trans.f90 Driving: gfc -mmacosx-version-min=10.3 -v -O3 time_trans.f90 -lgfortranbegin -lgfortran -shared-libgcc Using built-in specs. 
Target: powerpc-apple-darwin7 Configured with: ../gcc-4.3-20070623/configure --prefix=/sw --prefix=/sw/lib/gcc4 --disable-multilib --enable-languages=c,c++,fortran,objc,java --infodir='/sw/lib/gcc4/share/info' --with-gmp=/sw --with-included-gettext --build=powerpc-apple-darwin7 --host=powerpc-apple-darwin7 --with-as=/sw/lib/odcctools/bin/as --with-ld=/sw/lib/odcctools/bin/ld --with-nm=/sw/lib/odcctools/bin/nm --with-ar=/sw/lib/odcctools/bin/ar --with-strip=/sw/lib/odcctools/bin/strip --with-ranlib=/sw/lib/odcctools/bin/ranlib Thread model: posix gcc version 4.3.0 20070622 (experimental) /sw/lib/gcc4/libexec/gcc/powerpc-apple-darwin7/4.3.0/f951 time_trans.f90 -fPIC -quiet -dumpbase time_trans.f90 -mmacosx-version-min=10.3 -auxbase time_trans -O3 -version -fintrinsic-modules-path /sw/lib/gcc4/lib/gcc/powerpc-apple-darwin7/4.3.0/finclude -o /var/tmp//ccOSY3Yn.s GNU F95 version 4.3.0 20070622 (experimental) (powerpc-apple-darwin7) compiled by GNU C version 4.3.0 20070622 (experimental), GMP version 4.2.1, MPFR version 2.2.1. GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 /sw/lib/odcctools/bin/as -arch ppc -o /var/tmp//ccFAndJ8.o /var/tmp//ccOSY3Yn.s /sw/lib/gcc4/libexec/gcc/powerpc-apple-darwin7/4.3.0/collect2 -dynamic -arch ppc -macosx_version_min 10.3 -multiply_defined suppress -weak_reference_mismatches non-weak -o a.out -lcrt1.o /sw/lib/gcc4/lib/gcc/powerpc-apple-darwin7/4.3.0/crt2.o /sw/lib/gcc4/lib/gcc/powerpc-apple-darwin7/4.3.0/crt3.o -L/sw/lib/gcc4/lib/gcc/powerpc-apple-darwin7/4.3.0 -L/sw/lib/gcc4/lib/gcc/powerpc-apple-darwin7/4.3.0/../../.. /var/tmp//ccFAndJ8.o -lgfortranbegin -lgfortran -lgcc_s.10.4 -lgcc -lSystem -lmx Dominique
[Bug middle-end/32492] [4.3 Regression] attribute always_inline - sorry, unimplemented: recursive inlining
--- Comment #15 from rguenth at gcc dot gnu dot org 2007-06-27 15:05 --- Fixed. Sort of. -- rguenth at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32492
Re: [Bug target/32450] -pg seemingly causes miscompilation
On 6/27/07, Dominique Dhumieres [EMAIL PROTECTED] wrote: When you invoke gfortran with -v -march=native and with a source file, it will show the values. This is the recommended way of showing how you invoked gcc/gfortran anyways. I get: f951: error: unrecognized command line option -march=native with [karma] bug/timing% gfc -v -march=native -O3 time_trans.f90 Driving: gfc -mmacosx-version-min=10.3 -v -march=native -O3 time_trans.f90 -lgfortranbegin -lgfortran -shared-libgcc Using built-in specs. Target: powerpc-apple-darwin7 That is because -m options are target specific and -march=native (really -mcpu=native) has not been implemented for PowerPC yet. Though it could be, by reading the processor description bit. -- Pinski
[Bug middle-end/32399] [4.3 Regression] ICE in build2_stat, at tree.c:3074
--- Comment #7 from falk at debian dot org 2007-06-27 15:37 --- This makes bootstrap fail on alphaev68-linux: /src/gcc-2007.06.27/build/./gcc/xgcc -B/src/gcc-2007.06.27/build/./gcc/ -B/usr/local/alphaev68-unknown-linux-gnu/bin/ -B/usr/local/alphaev68-unknown-linux-gnu/lib/ -isystem /usr/local/alphaev68-unknown-linux-gnu/include -isystem /usr/local/alphaev68-unknown-linux-gnu/sys-include -g -fkeep-inline-functions -O2 -O2 -g -O2 -mieee -DIN_GCC-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -mieee -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../.././gcc -I../../../libgcc -I../../../libgcc/. -I../../../libgcc/../gcc -I../../../libgcc/../include -o _gcov_execl.o -MT _gcov_execl.o -MD -MP -MF _gcov_execl.dep -DL_gcov_execl -c ../../../libgcc/../gcc/libgcov.c ../../../libgcc/../gcc/libgcov.c: In function '__gcov_execl': ../../../libgcc/../gcc/libgcov.c:838: internal compiler error: in build2_stat, at tree.c:3074 Please submit a full bug report, with preprocessed source if appropriate. See URL:http://gcc.gnu.org/bugs.html for instructions. make[3]: *** [_gcov_execl.o] Error 1 make[3]: Leaving directory `/src/gcc-2007.06.27/build/alphaev68-unknown-linux-gnu/libgcc' This happens as soon as varargs are used: $ cat test.c void f(int x, ...) { __builtin_va_list ap; __builtin_va_start(ap, x); } $ /src/gcc-2007.06.27/build/gcc/xgcc -B/src/gcc-2007.06.27/build/gcc/ -c test.c test.c: In function 'f': test.c:3: internal compiler error: in build2_stat, at tree.c:3074 Please submit a full bug report, with preprocessed source if appropriate. See URL:http://gcc.gnu.org/bugs.html for instructions. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32399
[Bug middle-end/32399] [4.3 Regression] ICE in build2_stat, at tree.c:3074
--- Comment #8 from pinskia at gmail dot com 2007-06-27 15:41 --- Subject: Re: [4.3 Regression] ICE in build2_stat, at tree.c:3074 On 27 Jun 2007 15:37:26 -, falk at debian dot org [EMAIL PROTECTED] wrote: --- Comment #7 from falk at debian dot org 2007-06-27 15:37 --- This makes bootstrap fail on alphaev68-linux: This is a different bug: the Alpha backend was not fixed up for pointer plus. Please file it separately. Thanks, Andrew Pinski -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32399
[Bug bootstrap/32522] New: Bootstrap failure on Alpha due to pointer-plus changes
/src/gcc-2007.06.27/build/./gcc/xgcc -B/src/gcc-2007.06.27/build/./gcc/ -B/usr/local/alphaev68-unknown-linux-gnu/bin/ -B/usr/local/alphaev68-unknown-linux-gnu/lib/ -isystem /usr/local/alphaev68-unknown-linux-gnu/include -isystem /usr/local/alphaev68-unknown-linux-gnu/sys-include -g -fkeep-inline-functions -O2 -O2 -g -O2 -mieee -DIN_GCC-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -mieee -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../.././gcc -I../../../libgcc -I../../../libgcc/. -I../../../libgcc/../gcc -I../../../libgcc/../include -o _gcov_execl.o -MT _gcov_execl.o -MD -MP -MF _gcov_execl.dep -DL_gcov_execl -c ../../../libgcc/../gcc/libgcov.c
../../../libgcc/../gcc/libgcov.c: In function '__gcov_execl':
../../../libgcc/../gcc/libgcov.c:838: internal compiler error: in build2_stat, at tree.c:3074
Please submit a full bug report, with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make[3]: *** [_gcov_execl.o] Error 1
make[3]: Leaving directory `/src/gcc-2007.06.27/build/alphaev68-unknown-linux-gnu/libgcc'

This happens as soon as varargs are used:

$ cat test.c
void f(int x, ...) {
  __builtin_va_list ap;
  __builtin_va_start(ap, x);
}
$ /src/gcc-2007.06.27/build/gcc/xgcc -B/src/gcc-2007.06.27/build/gcc/ -c test.c
test.c: In function 'f':
test.c:3: internal compiler error: in build2_stat, at tree.c:3074
Please submit a full bug report, with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

Apparently, the Alpha backend needs to adapt to the pointer-plus changes.
-- Summary: Bootstrap failure on Alpha due to pointer-plus changes Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: falk at debian dot org GCC build triplet: alphaev68-unknown-linux-gnu GCC host triplet: alphaev68-unknown-linux-gnu GCC target triplet: alphaev68-unknown-linux-gnu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32522
[Bug fortran/31205] aliased operator assignment produces wrong result
--- Comment #4 from pault at gcc dot gnu dot org 2007-06-27 15:56 --- (In reply to comment #2) This is related to PR 14771, most likely the parentheses are being ignored. The parentheses are being ignored - in fact they disappear completely; I presume that gfc_simplify_expr is the culprit. In addition, a temporary needs to be made for intent(out), derived types with a default initializer and the initialization applied to that, when the variable is aliased. I note that other compilers apply the initialization in the callee, whereas gfortran leaves that duty to the caller. I think that the former is cleaner in some sense and that we should make the change. Thus, this little beauty comprises at least two bugs and should probably be three PRs:-) I propose that, for the sake of tractability, it should be left as it is. Paul -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31205
[Bug fortran/32131] knowing that stride==1 when using allocated arrays and escaping allocatable arrays
--- Comment #7 from jb at gcc dot gnu dot org 2007-06-27 15:56 --- (In reply to comment #5) I can see two ways to address this issue (both of them worth pursuing): a) For allocatable arrays, we can always assume stride=1. But this helps only locally in the procedure where the array is declared. If you call another procedure with an explicit interface, that procedure cannot assume that stride==1. I wonder, would it make sense to generate code like if (stride ==1) then some array operation, simplified for the case stride==1 else general case end if Then at least the stride==1 case could be vectorized, and presumably that is also the overwhelmingly common case. Of course it would imply some code bloat. Or is this something the middle-end could do for us? Of course, with IPA this problem could be solved by looking at all the callers.. :) b) We can tell the middle-end that our random number generator doesn't modify the array descriptor (similar to PR 20165). Once we've fixed PR 20165, this should be easy, but I don't see anybody working on it. Another question, do we at the moment tell the middle-end anything about Fortran aliasing rules? E.g. that after the call to random_number (or any other procedure) the a-data is not reachable via some other variable? Or is this another manifestation of the pointer escaping thing from PR 20165? But I would assume some support exists for C99 restrict, which is similar? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32131
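The run-time stride check suggested in this comment can be sketched roughly as follows. This is only an illustration written in C++, not what gfortran emits or would emit: the `Descriptor` struct and the `scale` function are hypothetical stand-ins for an array descriptor and an array operation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an array descriptor: base pointer,
// element count, and stride measured in elements.
struct Descriptor {
    double*        data;
    std::size_t    n;
    std::ptrdiff_t stride;
};

// Multiply every element by `factor`, versioning the loop on the stride.
void scale(const Descriptor& a, double factor) {
    assert(a.data != nullptr || a.n == 0);
    if (a.stride == 1) {
        // Common case: contiguous data, a plain unit-stride loop that a
        // vectorizer has a good chance of handling.
        double* p = a.data;
        for (std::size_t i = 0; i < a.n; ++i)
            p[i] *= factor;
    } else {
        // General case: arbitrary stride.
        for (std::size_t i = 0; i < a.n; ++i)
            a.data[static_cast<std::ptrdiff_t>(i) * a.stride] *= factor;
    }
}
```

The cost is the code bloat mentioned above: the loop body exists twice, once per branch.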
[Bug c/32523] New: disastrous scheduling for POWER5
Hi, On the POWER5, gcc 4.2 gets roughly half the performance of gcc 3.3.3 on the best ATLAS DGEMM kernel. By throwing the flags -fno-schedule-insns -fno-rerun-loop-opt I'm able to get most of that performance back. The most important flag is the no-schedule-insns, so I suspect the scheduler was rewritten between these releases. I will append a tarfile that will build a simplified kernel so you can see the effects yourself. This kernel is simplified, so it doesn't have quite the performance of the best one, but the general trend is the same (the best kernel is way too complicated to use). One thing that you might scope out is a feature we have found on the PowerPC970FX (the direct descendant of the POWER5): I went from 69% of peak to 85% by scheduling like instructions in sets of 4 (i.e. do 4 loads, then 4 fmacs, etc, even when this hurts advancing loads). Instruction alignment is also important on this architecture, despite it being putatively RISC. I think both these features are results of its complicated front-end, which does something similar to RISC-to-VLIW translation on the fly. I suspect the sets-of-4 rule helps in tracking the groups, but I don't know for sure . . . This scheduling seems to hurt the POWER4 only slightly. I have been trying to install gcc 4.2 on PowerPC970FX, but so far no luck (it doesn't seem to like MacOSX). I will let you know if I get results for the PowerPC970FX. Let me know if there is something else you need. Cheers, Clint -- Summary: disastrous scheduling for POWER5 Product: gcc Version: 4.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: whaley at cs dot utsa dot edu http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
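As a rough illustration of the "sets of 4" grouping described in this report, here is a C++ sketch of a dot product whose loop body is written as four loads from each operand followed by four multiply-adds. It is not the ATLAS kernel, and the name `dot_grouped` is hypothetical. The compiler's own scheduler is free to reorder these statements, so the sketch shows only the source-level shape, not a guaranteed instruction order.

```cpp
#include <cassert>
#include <cstddef>

// Dot product with like operations grouped in sets of four.
// Assumption for this sketch: n is a multiple of 4.
double dot_grouped(const double* x, const double* y, std::size_t n) {
    assert(n % 4 == 0);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (std::size_t i = 0; i < n; i += 4) {
        // Set of four loads from x.
        const double x0 = x[i], x1 = x[i + 1], x2 = x[i + 2], x3 = x[i + 3];
        // Set of four loads from y.
        const double y0 = y[i], y1 = y[i + 1], y2 = y[i + 2], y3 = y[i + 3];
        // Set of four multiply-adds into independent accumulators.
        s0 += x0 * y0;
        s1 += x1 * y1;
        s2 += x2 * y2;
        s3 += x3 * y3;
    }
    return (s0 + s1) + (s2 + s3);
}
```

Using four independent accumulators keeps the multiply-adds free of dependences on one another, which is what makes grouping them possible in the first place.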
[Bug c/32523] disastrous scheduling for POWER5
--- Comment #1 from whaley at cs dot utsa dot edu 2007-06-27 16:21 --- Created an attachment (id=13794) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13794&action=view) Makefile and source demonstrating problem Creates directory MMBENCH_PPC. Edit the Makefile and set the GCC3 and GCC4 macros, and then do make all to see performance. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
[Bug target/32523] disastrous scheduling for POWER5
--- Comment #2 from pinskia at gcc dot gnu dot org 2007-06-27 16:25 --- PowerPC970FX is not a direct descendent of Power5. It is a descendent of the 970 which is a heavily modified Power4. Power5 is the direct descendent of the Power4 though, at least in terms of scheduling (I don't know if in terms of the hardware itself). So at best they are siblings rather than descendents of one another. The main thing is that you turned off the first scheduling pass which is before the register allocator so I think the case is the register allocator is messing up (which is already known). The other thing is what options are you using to invoke GCC with? Power5 support inside GCC was not added until at least 3.4 (maybe it was 4.0). -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Component|c |target http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
[Bug target/32523] disastrous scheduling for POWER5
--- Comment #3 from pinskia at gcc dot gnu dot org 2007-06-27 16:27 --- I have been trying to install gcc 4.2 on PowerPC970FX, but so far no luck (it doesn't seem to like MacOSX). I have no problems installing GCC on Mac OS X 10.4.8/9/10. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
[Bug c/32493] gcc-20070624 fails linux-kernel due to changed gcc-inlining
--- Comment #3 from malitzke at metronets dot com 2007-06-27 16:44 --- I read the last word, ___unfortunate___, as meaning: to hell with the users; we hold fast to our principles. So now we have two cases that make gcc-4.3.x pretty irrelevant. The first is that inane transformation of subtraction into a division (the udivdi3 case), and now this one. Well, there exist three paths open to the user community: 1) The blas-atlas path, which just termed the whole gcc-3.x.y unusable for their purposes. 2) The xfree86-xorg fork. It might be instructive to check the top 25 distributions as ranked by Distrowatch as to who still uses xfree86 in the table of packages used by each distribution (just click on the name of the distribution in the right hand column). They all use xorg; but check for yourself. Personally I might be persuaded to join in forming such a group. 3) Somebody with greater perspective might read the introduction to the Rationale for International Standard Programming Languages C, revision 5.10, April 2003 (google C99 std rationale) and draw the appropriate conclusion. BTW, doesn't the reduced case make this confirmed? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32493
[Bug c/32524] New: unable to build 4.2 on OS X G5
Hi, I successfully installed gcc 4.1.1 on the same machine, but can't seem to get gcc 4.2 to go. My install always dies with the following error message during the compile phase: /usr/bin/libtool: unknown option character `m' in: -macosx_version_min Here's my configure command (using mpfr and gmp I installed for 4.1.1): ../configure --prefix=/home/whaley/local/gcc-4.2 --with-gmp=/home/whaley/local/ --with-mpfr=/home/whaley/local --enable-languages=c Can anyone give me a pointer on what I'm doing wrong, assuming the problem is on my end? I include a more substantial snip of my bad install below. Here's the output of uname -a: Darwin etl-g52.cs.utsa.edu 8.9.0 Darwin Kernel Version 8.9.0: Thu Feb 22 20:54:07 PST 2007; root:xnu-792.17.14~1/RELEASE_PPC Power Macintosh powerpc Thanks, Clint | /Users/whaley/TEST/gcc-4.2.0/MyObj/./gcc/xgcc -B/Users/whaley/TEST/gcc-4.2.0/MyObj/./gcc/ -B/home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/bin/ -B/home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/lib/ -isystem /home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/include -isystem /home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/sys-include -O2 -O2 -g -O2 -DIN_GCC-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -Wa,-force_cpusubtype_ALL -pipe -mmacosx-version-min=10.4 -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../../gcc -I../../gcc/. 
-I../../gcc/../include -I./../intl -I../../gcc/../libcpp/include -I/home/whaley/local//include -I/home/whaley/local/include -I../../gcc/../libdecnumber -I../libdecnumber -E -xassembler-with-cpp -; \ } | gawk -f ../../gcc/mkmap-flat.awk -v leading_underscore=1 libgcc/./tmp-libgcc.map mv 'libgcc/./tmp-libgcc.map' libgcc/./libgcc.map /Users/whaley/TEST/gcc-4.2.0/MyObj/./gcc/xgcc -B/Users/whaley/TEST/gcc-4.2.0/MyObj/./gcc/ -B/home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/bin/ -B/home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/lib/ -isystem /home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/include -isystem /home/whaley/local/gcc-4.2/powerpc-apple-darwin8.9.0/sys-include -O2 -O2 -g -O2 -DIN_GCC-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -Wa,-force_cpusubtype_ALL -pipe -mmacosx-version-min=10.4 -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -dynamiclib -nodefaultlibs -Wl,-install_name,/home/whaley/local/gcc-4.2/lib/libgcc_s`if test . = ppc64 ; then echo _. 
; fi`.1.dylib -single_module -o ./libgcc_s.1.dylib.tmp -Wl,-exported_symbols_list,libgcc/./libgcc.map -compatibility_version 1 -current_version 1.0 libgcc/./_muldi3_s.o libgcc/./_negdi2_s.o libgcc/./_lshrdi3_s.o libgcc/./_ashldi3_s.o libgcc/./_ashrdi3_s.o libgcc/./_cmpdi2_s.o libgcc/./_ucmpdi2_s.o libgcc/./_clear_cache_s.o libgcc/./_enable_execute_stack_s.o libgcc/./_trampoline_s.o libgcc/./__main_s.o libgcc/./_absvsi2_s.o libgcc/./_absvdi2_s.o libgcc/./_addvsi3_s.o libgcc/./_addvdi3_s.o libgcc/./_subvsi3_s.o libgcc/./_subvdi3_s.o libgcc/./_mulvsi3_s.o libgcc/./_mulvdi3_s.o libgcc/./_negvsi2_s.o libgcc/./_negvdi2_s.o libgcc/./_ctors_s.o libgcc/./_ffssi2_s.o libgcc/./_ffsdi2_s.o libgcc/./_clz_s.o libgcc/./_clzsi2_s.o libgcc/./_clzdi2_s.o libgcc/./_ctzsi2_s.o libgcc/./_ctzdi2_s.o libgcc/./_popcount_tab_s.o libgcc/./_popcountsi2_s.o libgcc/./_popcountdi2_s.o libgcc/./_paritysi2_s.o libgcc/./_paritydi2_s.o libgcc/./_powisf2_s.o libgcc/./_powidf2_s.o libgcc/./_powixf2_s.o libgcc/./_powitf2_s.o libgcc/./_mulsc3_s.o libgcc/./_muldc3_s.o libgcc/./_mulxc3_s.o libgcc/./_multc3_s.o libgcc/./_divsc3_s.o libgcc/./_divdc3_s.o libgcc/./_divxc3_s.o libgcc/./_divtc3_s.o libgcc/./_fixunssfsi_s.o libgcc/./_fixunsdfsi_s.o libgcc/./_fixunsxfsi_s.o libgcc/./_fixsfdi_s.o libgcc/./_fixunssfdi_s.o libgcc/./_floatdisf_s.o libgcc/./_floatundisf_s.o libgcc/./_fixdfdi_s.o libgcc/./_fixunsdfdi_s.o libgcc/./_floatdidf_s.o libgcc/./_floatundidf_s.o libgcc/./_fixxfdi_s.o libgcc/./_fixunsxfdi_s.o libgcc/./_floatdixf_s.o libgcc/./_floatundixf_s.o libgcc/./_fixtfdi_s.o libgcc/./_fixunstfdi_s.o libgcc/./_floatditf_s.o libgcc/./_floatunditf_s.o libgcc/./_divdi3_s.o libgcc/./_moddi3_s.o libgcc/./_udivdi3_s.o libgcc/./_umoddi3_s.o libgcc/./_udiv_w_sdiv_s.o libgcc/./_udivmoddi4_s.o libgcc/./darwin-tramp_s.o libgcc/./ppc64-fp_s.o libgcc/./darwin-64_s.o libgcc/./darwin-ldouble_s.o libgcc/./darwin-world_s.o libgcc/./unwind-dw2_s.o libgcc/./unwind-dw2-fde-darwin_s.o libgcc/./unwind-sjlj_s.o 
libgcc/./unwind-c_s.o libgcc/./darwin-fallback_s.o -lc /usr/bin/libtool: unknown option character `m' in: -macosx_version_min Usage: /usr/bin/libtool -static [-] file [...] [-filelist listfile[,dirname]] [-arch_only arch] [-sacLT] Usage: /usr/bin/libtool -dynamic [-] file [...] [-filelist listfile[,dirname]] [-arch_only arch] [-o output] [-install_name name] [-compatibility_version #] [-current_version #] [-seg1addr 0x#] [-segs_read_only_addr 0x#] [-segs_read_write_addr 0x#]
[Bug target/32524] unable to build 4.2 on OS X G5
--- Comment #1 from pinskia at gcc dot gnu dot org 2007-06-27 16:48 --- /usr/bin/libtool: unknown option character `m' in: -macosx_version_min You need to make sure you have the latest version of Xcode installed. -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Component|c |target Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32524
[Bug c/32523] disastrous scheduling for POWER5
--- Comment #4 from whaley at cs dot utsa dot edu 2007-06-27 17:00 --- Andrew, PowerPC970FX is not a direct descendent of Power5 Sorry, completely misremembered this. Since Power4 didn't suffer as bad as Power5 (I think it lost maybe 10% rather than 50), maybe the 970 will also not die. so I think the case is the register allocator is messing up (which is already known) OK, can you point me to the bug report? Is there some way to confirm this is the problem, rather than the scheduling pass itself? The other thing is what options are you using to invoke GCC with? My Makefile shows them. The gcc3-derived flags are: -mcpu=power5 -mtune=power5 -O3 -m64 for gcc4, I get most of my performance back if I add: -fno-schedule-insns -fno-rerun-loop-opt I include below example output and arch info on the machine I created the benchmark on (forgot to include it before, sorry). Thanks, Clint r78n04 noibm122/TEST uname -a Linux r78n04 2.6.5-7.244-pseries64 #1 SMP Mon Dec 12 18:32:25 UTC 2005 ppc64 ppc64 ppc64 GNU/Linux r78n04 noibm122/TEST /usr/bin/gcc -v Reading specs from /usr/lib/gcc-lib/powerpc-suse-linux/3.3.3/specs Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --enable-languages=c,c++,f77,objc,java,ada --disable-checking --libdir=/usr/lib --enable-libgcj --with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --host=powerpc-suse-linux --build=powerpc-suse-linux --target=powerpc-suse-linux --enable-targets=powerpc64-suse-linux --enable-biarch Thread model: posix gcc version 3.3.3 (SuSE Linux) r78n04 noibm122/TEST gcc -v Using built-in specs. 
Target: powerpc64-unknown-linux-gnu
Configured with: ../configure --prefix=/home/whaley/local/linux --enable-languages=c --with-gmp=/u/noibm122/local/linux --with-mpfr-lib=/u/noibm122/local/linux/lib --with-mpfr-include=/u/noibm122/local/linux/include
Thread model: posix
gcc version 4.2.0

r78n04 TEST/MMBENCH_PPC make all
/usr/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c mmbench.c
/usr/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c dgemm_atlas.c
/usr/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -o xdmm_gcc3 mmbench.o dgemm_atlas.o
rm -f *.o
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c mmbench.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c dgemm_atlas.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -o xdmm_gcc4 mmbench.o dgemm_atlas.o
rm -f *.o
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -c mmbench.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -fno-schedule-insns -fno-rerun-loop-opt -c \
    dgemm_atlas.c
/u/noibm122/local/linux/home/whaley/local/linux/bin/gcc -DREPS=1000 -DWALL -mcpu=power5 -mtune=power5 -O3 -m64 -o xdmm_gcc4_nosched mmbench.o dgemm_atlas.o
rm -f *.o

echo GCC 3.x performance:
GCC 3.x performance:
./xdmm_gcc3
ALGORITHM  NB  REPS   TIME   MFLOPS
=========  ==  ====  =====  =======
  atlasmm  40  1000  0.026  4998.24

echo GCC 4.2 performance:
GCC 4.2 performance:
./xdmm_gcc4
ALGORITHM  NB  REPS   TIME   MFLOPS
=========  ==  ====  =====  =======
  atlasmm  40  1000  0.034  3806.35

echo GCC 4.2 w/o scheduling performance:
GCC 4.2 w/o scheduling performance:
./xdmm_gcc4_nosched
ALGORITHM  NB  REPS   TIME   MFLOPS
=========  ==  ====  =====  =======
  atlasmm  40  1000  0.025  5044.53

-- whaley at cs dot utsa dot edu changed: What|Removed |Added Component|target |c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
[Bug target/32523] disastrous scheduling for POWER5
--- Comment #5 from pinskia at gcc dot gnu dot org 2007-06-27 17:05 --- Well, the 3.3.3 you are using is a heavily modified 3.3.3 which has the power5 support backported, along with much other stuff. -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Component|c |target http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
[Bug middle-end/32493] [4.3 Regression] Fails to inline varargs function with unused arguments
--- Comment #4 from rguenth at gcc dot gnu dot org 2007-06-27 17:22 --- Sure it does. -- rguenth at gcc dot gnu dot org changed:
What|Removed|Added
Status|UNCONFIRMED|NEW
Component|c|middle-end
Ever Confirmed|0|1
GCC build triplet|i686-pc-linux-gnu|
GCC host triplet|i686-pc-linux-gnu|
GCC target triplet|i686-pc-linux-gnu|
Known to work||4.2.0
Last reconfirmed date|-00-00 00:00:00|2007-06-27 17:22:59
Summary|gcc-20070624 fails linux-kernel due to changed gcc-inlining|[4.3 Regression] Fails to inline varargs function with unused arguments
Target Milestone|---|4.3.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32493
[Bug target/32418] [4.3 Regression] ICE in global_alloc, at global.c:514
--- Comment #15 from zadeck at naturalbridge dot com 2007-06-27 18:04 --- it does not look like you ever dealt with the issue of EH_RETURN_STACKADJ_RTX that i pointed out. That code is clearly wrong. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32418
[Bug middle-end/32493] [4.3 Regression] Fails to inline varargs function with unused arguments
--- Comment #5 from pinskia at gcc dot gnu dot org 2007-06-27 18:07 --- I thought we never inlined varargs, even in 4.2.0. -- pinskia at gcc dot gnu dot org changed: What|Removed |Added CC||pinskia at gcc dot gnu dot org Keywords||missed-optimization, rejects-valid http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32493
[Bug c++/32525] New: Request for new warning: useless dynamic_casts
It might be useful for GCC to warn about dynamic_casts which are not necessary. For instance a dynamic_cast<T*>(T*), or a dynamic_cast from a derived class to a base class (there might be some corner cases here with multiple inheritance?). I do see that such dynamic_casts are no-op'ed away (even without any optimization flags! (at least in my toy test program)), which is certainly positive. However it would be nice if the programmer was notified about them, since even if there is no run-time cost, there is a source-level increase in complexity which can easily be avoided (and there may well be run-time costs involved with other compilers). A quick example of the sort of thing I have in mind:

class base { public: virtual int f() = 0; virtual ~base() {} };
class derived : public base { public: int f() { return 1; } };
#include <stdio.h>
int main()
{
  derived* obj = new derived();
  base* baseptr = dynamic_cast<base*>(obj); // warn: to a base class
  derived* sametype = dynamic_cast<derived*>(obj); // warn: same type
  derived* from_base = dynamic_cast<derived*>(baseptr); // ok
  printf("%d %d %d %d\n", obj->f(), baseptr->f(), sametype->f(), from_base->f());
}

(Compiling this on x86-64 shows GCC 4.1.0 is no-op'ing the first two dynamic_casts, with or without optimization). -- Summary: Request for new warning: useless dynamic_casts Product: gcc Version: 4.1.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c++ AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: lloyd at randombit dot net http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32525
[Bug c++/32525] Request for new warning: useless dynamic_casts
--- Comment #1 from pinskia at gcc dot gnu dot org 2007-06-27 18:14 --- However it would be nice if the programmer was notified about them, since even if there is no run-time cost, there is a source-level increase in complexity which can easily be avoided (and there may well be run-time costs involved with other compilers). If this warning comes into GCC, we should disable it for templates. The main reason why I say that is because if you do: template<typename A, typename B> A* f(B *b) { return dynamic_cast<A*>(b); } and then instantiate it where typename A == typename B, it is hard to avoid the warning in this case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32525
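To make the template concern concrete, here is a minimal sketch of what a template author would have to write to avoid a redundant dynamic_cast when A and B coincide or when A is a base of B. Everything here (`checked_cast`, `cast_impl`, `Base`, `Derived`, `demo`) is hypothetical and not part of GCC or of any proposal in this PR. The sketch assumes A is not void, since dynamic_cast<void*> has its own meaning.

```cpp
#include <cassert>
#include <type_traits>

struct Base    { virtual ~Base() {} virtual int id() const { return 0; } };
struct Derived : Base { int id() const { return 1; } };

// B* converts implicitly to A* (same type, or A is an accessible base of B):
// no run-time check is needed, so no dynamic_cast is written at all.
template <typename A, typename B>
A* cast_impl(B* b, std::true_type) { return b; }

// Otherwise a real run-time check is required.
template <typename A, typename B>
A* cast_impl(B* b, std::false_type) { return dynamic_cast<A*>(b); }

// Dispatch at compile time on whether the conversion is implicit.
template <typename A, typename B>
A* checked_cast(B* b) {
    return cast_impl<A>(b, std::is_convertible<B*, A*>());
}

// Small self-check exercising the three cases from the original report.
void demo() {
    Derived d;
    Base* b = &d;
    assert(checked_cast<Base>(&d) == b);      // to a base class
    assert(checked_cast<Derived>(&d) == &d);  // same type
    assert(checked_cast<Derived>(b) == &d);   // genuine downcast
}
```

That this much machinery is needed just to silence a warning is arguably the point being made above: inside a template, the plain dynamic_cast is the natural thing to write.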
[Bug target/32481] ICE in df_refs_verify, at df-scan.c:4058
--- Comment #10 from spark at gcc dot gnu dot org 2007-06-27 18:17 --- Subject: Bug 32481 Author: spark Date: Wed Jun 27 18:17:15 2007 New Revision: 126058 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=126058 Log: 2007-06-27 Seongbae Park [EMAIL PROTECTED] PR rtl-optimization/32481 * combine.c (adjust_for_new_dest): Rescan the changed insn. Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32481
[Bug middle-end/32521] [4.2] vrp_evaluate_conditional_warnv (tree-vrp.c:4186) at -O3 -ffast-math -ftree-vectorize -march=native
--- Comment #2 from jv244 at cam dot ac dot uk 2007-06-27 18:33 --- Reduced testcase:

[EMAIL PROTECTED]:/scratch/vondele/clean/cp2k/obj/Linux-x86-64-gfortran/sopt cat test.f90
SUBROUTINE build_qs_neighbor_lists()
  INTEGER, PARAMETER :: dp=KIND(0.0D0)
  REAL(dp), ALLOCATABLE, DIMENSION(:) :: c_radius
  INTEGER :: nkind,istat
  REAL(dp) :: alpha
  ALLOCATE (c_radius(nkind),STAT=istat)
  IF (istat /= 0) CALL stop_memory()
  c_radius = 0.5_dp*SQRT(-LOG(3.5_dp*alpha**3*1.e-12_dp))/alpha
END SUBROUTINE build_qs_neighbor_lists

valgrind --tool=memcheck /data03/vondele/gcc_4_2_0/build/libexec/gcc/x86_64-unknown-linux-gnu/4.2.0/f951 test.f90 -march=k8 -mtune=k8 -quiet -dumpbase test.f90 -auxbase qs_neighbor_lists -O3 -version -ffast-math -ftree-vectorize -I /data03/vondele/gcc_4_2_0/build/lib/gcc/x86_64-unknown-linux-gnu/4.2.0/finclude
==3990== Memcheck, a memory error detector.
==3990== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==3990== Using LibVEX rev 1732, a library for dynamic binary translation.
==3990== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==3990== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==3990== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==3990== For more details, rerun with: -v
==3990==
GNU F95 version 4.2.0 (x86_64-unknown-linux-gnu) compiled by GNU C version 4.2.0.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
==3990== Conditional jump or move depends on uninitialised value(s)
==3990==    at 0x706E08: vrp_evaluate_conditional_warnv (tree-vrp.c:4186)
==3990==    by 0x706F9C: vrp_evaluate_conditional (tree-vrp.c:4318)
==3990==    by 0x4B6E9F: substitute_and_fold (tree-ssa-propagate.c:1053)
==3990==    by 0x700F04: execute_vrp (tree-vrp.c:5318)
==3990==    by 0x6F9F27: execute_one_pass (passes.c:881)
==3990==    by 0x6FA08B: execute_pass_list (passes.c:932)
==3990==    by 0x6FA09D: execute_pass_list (passes.c:933)
==3990==    by 0x48CCCD: tree_rest_of_compilation (tree-optimize.c:463)
==3990==    by 0x742363: cgraph_expand_function (cgraphunit.c:1244)
==3990==    by 0x742C8D: cgraph_optimize (cgraphunit.c:1309)
==3990==    by 0x4633DC: gfc_be_parse_file (f95-lang.c:307)
==3990==    by 0x6DBF92: toplev_main (toplev.c:1033)
==3990==
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32521
[Bug middle-end/32096] [4.3 Regression] ICE (segfault) in vrp_evaluate_conditional_warnv
-- pinskia at gcc dot gnu dot org changed: What|Removed |Added Target Milestone|4.3.0 |4.2.1 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32096
[Bug c++/32525] Request for new warning: useless dynamic_casts
--- Comment #2 from bangerth at dealii dot org 2007-06-27 18:53 --- This strikes me as one of the things that hardly anybody would ever find useful. I mean, yes it happens, but no, it doesn't hurt, and I haven't seen such code written in the first place ever. Warnings are for cases where either code may not do what you expect, or where a certain way of coding has a significant cost that can be avoided. I doubt anyone will ever implement this. W. -- bangerth at dealii dot org changed: What|Removed |Added CC||bangerth at dealii dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32525
[Bug middle-end/32521] [4.2] vrp_evaluate_conditional_warnv (tree-vrp.c:4186) at -O3 -ffast-math -ftree-vectorize -march=native
--- Comment #3 from pinskia at gcc dot gnu dot org 2007-06-27 18:54 --- This was fixed in 4.2.1 already by: 2007-05-30 Ralf Wildenhues [EMAIL PROTECTED] * tree-vrp.c (compare_names): Initialize sop. Which was applied for PR 32096. *** This bug has been marked as a duplicate of 32096 *** -- pinskia at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32521
[Bug middle-end/32096] [4.3 Regression] ICE (segfault) in vrp_evaluate_conditional_warnv
--- Comment #7 from pinskia at gcc dot gnu dot org 2007-06-27 18:54 --- *** Bug 32521 has been marked as a duplicate of this bug. *** -- pinskia at gcc dot gnu dot org changed: What|Removed |Added CC||jv244 at cam dot ac dot uk http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32096
[Bug c++/32519] [4.1/4.2/4.3 regression] g++ allows access to protected template member functions of base class
--- Comment #1 from bangerth at dealii dot org 2007-06-27 19:01 --- Confirmed, a regression apparently introduced in 3.4.x. -- bangerth at dealii dot org changed:
What|Removed|Added
CC||bangerth at dealii dot org
Status|UNCONFIRMED|NEW
Ever Confirmed|0|1
Keywords||accepts-invalid
Known to fail||3.4.3 4.1.0
Known to work||2.95.3 3.2.3 3.3.6
Last reconfirmed date|-00-00 00:00:00|2007-06-27 19:01:01
Summary|g++ allows access to protected template member functions of base class|[4.1/4.2/4.3 regression] g++ allows access to protected template member functions of base class
Target Milestone|---|4.2.1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32519
[Bug c++/32525] Request for new warning: useless dynamic_casts
--- Comment #3 from lloyd at randombit dot net 2007-06-27 19:06 --- I haven't seen such code written in the first place ever. Neither had I, until I found out it is endemic in a large project at work. I'd just as soon write a script to find these cases, but figuring out what the type of the casted-from pointer/reference is can be somewhat nontrivial. Warnings are for cases where either code may not do what you expect, or where a certain way of coding has a significant cost that can be avoided. I think that's a good definition. My impression is that dynamic_cast is fairly expensive, and while it is great that GCC noops out this case I suspect not all compilers will do the same; at this point I'm not even sure that GCC does it consistently. So I'd figure it a reasonable case for a warning as per your second condition. I doubt anyone will ever implement this. I've gotten used to that. :) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32525
[Bug target/32523] disastrous scheduling for POWER5
--- Comment #6 from whaley at cs dot utsa dot edu 2007-06-27 19:09 --- Andrew, OK, I installed stock gnu gcc 3.4.6: 78n04 TEST/MMBENCH_PPC ~/local/gcc-3.4.6/bin/gcc -v Reading specs from /u/noibm122/local/gcc-3.4.6/lib/gcc/powerpc64-unknown-linux-gnu/3.4.6/specs Configured with: ../configure --prefix=/u/noibm122/local/gcc-3.4.6 --enable-languages=c Thread model: posix gcc version 3.4.6 and I get the exact same behavior as with the modified gcc 3 (it accepts the power5 flags and everything). So, it would seem something that used to work in the stock gcc is now broken . . . Thanks, Clint -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32523
[Bug fortran/32526] New: Spurious error: Name 'x' at (1) is an ambiguous reference to 'x' from module 'y'
When I compile the modules listed below I get the following error message:

e.f90:51.30:
  use poly_Personnel_class
     1
Error: Name 'new' at (1) is an ambiguous reference to 'new' from module 'student_class'

Compaq Fortran, Lahey Fortran, and g95 do not produce an error.

module Personnel_class
  implicit none
  private :: init_Personnel
  interface new
    module procedure init_Personnel
  end interface
contains
  subroutine init_Personnel(this)
    integer, intent (in) :: this
  end subroutine init_Personnel
end module Personnel_class

module Student_class
  use Personnel_class
  implicit none
  private :: init_Student
  type Student
    private
    integer :: personnel
  end type Student
  interface new
    module procedure init_Student
  end interface
contains
  subroutine init_Student(this)
    type (Student), intent (in) :: this
    call new(this%personnel)
  end subroutine init_Student
end module Student_class

module Teacher_class
  use Personnel_class
  implicit none
  private :: init_Teacher
  type Teacher
    private
    integer :: personnel
  end type Teacher
  interface new
    module procedure init_Teacher
  end interface
contains
  subroutine init_Teacher(this)
    type (Teacher), intent (in) :: this
    call new(this%personnel)
  end subroutine init_Teacher
end module Teacher_class

module poly_Personnel_class
  use Student_class
  use Teacher_class
end module poly_Personnel_class

module Database_class
  use poly_Personnel_class
end module Database_class

-- Summary: Spurious error: Name 'x' at (1) is an ambiguous reference to 'x' from module 'y' Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: michael dot a dot richmond at nasa dot gov http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32526
[Bug target/32418] [4.3 Regression] ICE in global_alloc, at global.c:514
--- Comment #16 from rask at sygehus dot dk 2007-06-27 19:15 ---
What's wrong with the patch at the top of comment 9? FWIW, I can now build the m32c after this configure command:

$ /n/12/rask/src/all/configure --target=m32c-unknown-elf --enable-languages=c,c++ --enable-cxx-flags=-O2 --with-newlib --enable-sim --disable-{multilib,nls,gdb}

where --disable-multilib is due to bug 32441 and --enable-cxx-flags=-O2 hides the unrelated reload failure in comment 9.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32418
[Bug c++/32525] Request for new warning: useless dynamic_casts
--- Comment #4 from bangerth at dealii dot org 2007-06-27 19:17 ---
(In reply to comment #3)
> I think that's a good definition. My impression is that dynamic_cast is fairly
> expensive,

But only if the compiler can't know the actual type of an object (which is exactly the case that you want to treat). If the actual type of an object is known or if you are casting to a base class, dynamic_cast is as cheap as static_cast.

> > I doubt anyone will ever implement this.
>
> I've gotten used to that. :)

Well, people implement what they consider important to them. PRs about uninteresting things will lie dormant until there are no interesting things left to implement.

I think everyone's time would be better used if you tried to find cases where gcc doesn't produce a no-op for the constructs you want to warn about. That would be a missed-optimization, rather than a more or less uninteresting warning, and would receive more interest.

W.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32525
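For readers following the thread, the kinds of cast being discussed can be sketched roughly as below. This is only an illustration, not code from the PR: the Base/Derived hierarchy and all function names are made up, and nothing here says what any particular GCC version actually emits. The first two dynamic_casts (to a base class, and to the type the operand already has) are fully determined by the static types, which is why they are candidates both for the requested warning and for being compiled to a no-op; the last one is the genuine run-time check.

```cpp
#include <cassert>

// Hypothetical two-class hierarchy, used only for illustration.
struct Base {
    virtual ~Base() = default;
};
struct Derived : Base {};

// Upcast: the target is a base class of the operand's static type, so the
// language defines the result to be the same as the implicit conversion.
Base* upcast_dynamic(Derived* d) { return dynamic_cast<Base*>(d); }
Base* upcast_static(Derived* d) { return static_cast<Base*>(d); }

// Same-type cast: the operand already has the target type.
Derived* same_type(Derived* d) { return dynamic_cast<Derived*>(d); }

// Genuine downcast: the dynamic type must be inspected at run time, and the
// result is a null pointer when the object is not a Derived.
Derived* downcast(Base* b) { return dynamic_cast<Derived*>(b); }

// Checks the behaviour described in the comments above.
void check_casts() {
    Derived d;
    Base b;
    assert(upcast_dynamic(&d) == upcast_static(&d));
    assert(same_type(&d) == &d);
    assert(downcast(&d) == &d);
    assert(downcast(&b) == nullptr);
}
```

Comparing the assembly generated for upcast_dynamic and upcast_static (for instance with -O2 -S) would be one way to look for the missed-optimization cases suggested above.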
[Bug rtl-optimization/32466] illegal loop store motion of bitfield
--- Comment #2 from mrs at apple dot com 2007-06-27 19:18 ---
Radar 5276895

--
mrs at apple dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mrs at apple dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32466
[Bug target/32418] [4.3 Regression] ICE in global_alloc, at global.c:514
--- Comment #17 from zadeck at naturalbridge dot com 2007-06-27 19:30 ---
Subject: Re: [4.3 Regression] ICE in global_alloc, at global.c:514

rask at sygehus dot dk wrote:
> --- Comment #16 from rask at sygehus dot dk 2007-06-27 19:15 ---
> What's wrong with the patch at the top of comment 9? FWIW, I can now build the
> m32c after this configure command:
>
> $ /n/12/rask/src/all/configure --target=m32c-unknown-elf --enable-languages=c,c++ --enable-cxx-flags=-O2 --with-newlib --enable-sim --disable-{multilib,nls,gdb}
>
> where --disable-multilib is due to bug 32441 and --enable-cxx-flags=-O2 hides
> the unrelated reload failure in comment 9.

i am sorry for being dense. but i have completely lost track of this bug.

if i pull the latest code down from the trunk, this patch is not there. Is there still a bug related to dataflow on this platform if everything that needs to be done has been done?

Kenny

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32418
[Bug c++/32525] Request for new warning: useless dynamic_casts
--- Comment #5 from lloyd at randombit dot net 2007-06-27 19:33 ---
I filed the bug because it seems like this would be at least marginally useful, and this way people can find it / read the discussion / whatever. Even if the end result is WONTFIX, that at least lets anyone in the future who searches the bug database know what the situation is.

I'm sorry if I made it sound like I was expecting this to be implemented immediately or anything like that. That is not the case at all; even serious problems like code miscompilations can go a good while without being analyzed or fixed due to time and resource constraints, and something like this naturally falls much (much) deeper into the queue of things to work on. Thus my comment about being used to it: I know there are many more interesting/important things to work on in GCC.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32525
[Bug target/32418] [4.3 Regression] ICE in global_alloc, at global.c:514
--- Comment #18 from rask at sygehus dot dk 2007-06-27 19:48 ---
It has not been committed yet because I feared that it was causing the reload failure. I have now verified that the reload failure still happens with a pre-dataflow checkout, so I'll submit the patch. I think that concludes the dataflow related changes.

--
rask at sygehus dot dk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|32441                       |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32418
[Bug c++/27492] [4.0/4.1/4.2 regression] ICE on invalid covariant return type
--- Comment #7 from simartin at gcc dot gnu dot org 2007-06-27 19:53 ---
Subject: Bug 27492

Author: simartin
Date: Wed Jun 27 19:53:45 2007
New Revision: 126061

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126061
Log:
gcc/cp/

2007-06-27  Simon Martin  [EMAIL PROTECTED]

        PR c++/27492
        * decl.c (duplicate_decls): Don't reset DECL_INVALID_OVERRIDER_P for
        function decls.

gcc/testsuite/

2007-06-27  Simon Martin  [EMAIL PROTECTED]

        PR c++/27492
        * g++.dg/inherit/covariant15.C: New test.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/g++.dg/inherit/covariant15.C
Modified:
    branches/gcc-4_2-branch/gcc/cp/ChangeLog
    branches/gcc-4_2-branch/gcc/cp/decl.c
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27492
[Bug fortran/29975] [meta-bugs] ICEs with CP2K
--- Comment #126 from jv244 at cam dot ac dot uk 2007-06-27 19:55 ---
As Andrew pointed out in PR 32521 the valgrind warning was fixed in 4.2.1 (prerelease). I've now built the 4.2_branch, and the warning is indeed gone, but unfortunately the same qs_neighbor_lists module is still miscompiled (i.e. same wrong answers obtained from 4.2_branch). The fact that the miscompilation is now completely silent makes it a bit harder to find I'm afraid.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29975
[Bug target/32418] [4.3 Regression] ICE in global_alloc, at global.c:514
--- Comment #19 from zadeck at naturalbridge dot com 2007-06-27 19:56 ---
Subject: Re: [4.3 Regression] ICE in global_alloc, at global.c:514

rask at sygehus dot dk wrote:
> --- Comment #18 from rask at sygehus dot dk 2007-06-27 19:48 ---
> It has not been committed yet because I feared that it was causing the reload
> failure. I have now verified that the reload failure still happens with a
> pre-dataflow checkout, so I'll submit the patch. I think that concludes the
> dataflow related changes.

thanks, this was what i suspected.

kenny

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32418
[Bug c++/27492] [4.0/4.1 regression] ICE on invalid covariant return type
--- Comment #8 from simartin at gcc dot gnu dot org 2007-06-27 19:59 ---
Fixed in 4.2 as well.

--
simartin at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.0/4.1/4.2 regression] ICE|[4.0/4.1 regression] ICE on
                   |on invalid covariant return |invalid covariant return
                   |type                        |type

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27492
[Bug fortran/32467] structure containing allocatable array is accepted in COPYIN clause
--- Comment #8 from dfranke at gcc dot gnu dot org 2007-06-27 20:03 ---
Subject: Bug 32467

Author: dfranke
Date: Wed Jun 27 20:02:31 2007
New Revision: 126063

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126063
Log:
gcc/fortran:

2007-06-24  Daniel Franke  [EMAIL PROTECTED]

        PR fortran/32467
        * openmp.c (resolve_omp_clauses): Emit error on allocatable
        components in COPYIN, COPYPRIVATE, FIRSTPRIVATE and LASTPRIVATE
        clauses.

gcc/testsuite:

2007-06-24  Daniel Franke  [EMAIL PROTECTED]

        PR fortran/32467
        * gfortran.dg/gomp/allocatable_components_1.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/gomp/allocatable_components_1.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/openmp.c
    trunk/gcc/testsuite/ChangeLog

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32467
[Bug fortran/32467] structure containing allocatable array is accepted in COPYIN clause
--- Comment #9 from dfranke at gcc dot gnu dot org 2007-06-27 20:04 ---
Fixed in trunk. Not a regression, thus no backport to 4.2. Closing.

--
dfranke at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
           Keywords|wrong-code                  |accepts-invalid
      Known to fail|4.2.1 4.3.0                 |4.2.1
      Known to work|                            |4.3.0
         Resolution|                            |FIXED
   Target Milestone|---                         |4.3.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32467
[Bug fortran/32526] Spurious error: Name 'x' at (1) is an ambiguous reference to 'x' from module 'y'
--- Comment #1 from dfranke at gcc dot gnu dot org 2007-06-27 20:10 ---
Which version of gfortran and which flags are you using? I tried 4.1.2, 4.2 and a recent svn (20070622), neither gave the error on the code you quote. All I get is:

$ gfortran-svn -g -Wall -std=f95 -c pr32526.f90
pr32526.f90:8.39:

  subroutine init_Personnel(this)
  1
Warning: Unused variable this declared at (1)

--
dfranke at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dfranke at gcc dot gnu dot
                   |                            |org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32526
[Bug target/32418] [4.3 Regression] ICE in global_alloc, at global.c:514
--- Comment #20 from zadeck at naturalbridge dot com 2007-06-27 20:15 ---
I believe that rask is going to submit the patch at the end of comment #9 to close this bug.

--
zadeck at naturalbridge dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32418
[Bug fortran/32526] Spurious error: Name 'x' at (1) is an ambiguous reference to 'x' from module 'y'
--- Comment #2 from mrichmond at mscmail dot gsfc dot nasa dot gov 2007-06-27 20:21 ---
Subject: RE: Spurious error: Name 'x' at (1) is an ambiguous reference to 'x' from module 'y'

I downloaded the latest snapshot. The bug does not occur with older versions, so I believe it is a regression from a recent fix.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32526
[Bug fortran/32527] New: [Optimization] ICE in build2_stat, at tree.c:3074
Hi,

the attached file crashes gfortran when compiling with -O and -Os, but passes with -O0, -O2 and -O3.

% gfortran -O -c gfcbug65.f90
gfcbug65.f90: In function 'nf90_put_var_7d_fourbyteint':
gfcbug65.f90:1: internal compiler error: in build2_stat, at tree.c:3074
...

--
           Summary: [Optimization] ICE in build2_stat, at tree.c:3074
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: anlauf at gmx dot de
  GCC host triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32527
[Bug middle-end/32493] [4.3 Regression] Fails to inline varargs function with unused arguments
--- Comment #6 from malitzke at metronets dot com 2007-06-27 20:43 ---
This appears to be the essence of what I wanted to note per 3) in comment 3.

In going from pdf to html via pdftohtml I was forced to do some realigning and erase special symbols by hand. If you do not trust it, go to the actual document. If this is what bound the standardization committee, it is certainly binding on me; GCC apparently feels differently. The Xfree86-xorg inspires me to believe that reason will prevail one way or another.

The original X3J11 charter clearly mandated codifying common existing practice, and the C89 Committee held fast to precedent wherever that was clear and unambiguous. The vast majority of the language defined by C89 was precisely the same as defined in Appendix A of the first edition of The C Programming Language by Brian Kernighan and Dennis Ritchie, and as was implemented in almost all C translators of the time. (That document is hereinafter referred to as K&R.)

K&R was not the only source of existing practice. Much work had been done over the years to improve the C language by addressing its weaknesses, and the C89 Committee formalized enhancements of proven value which had become part of the various dialects of C. This practice has continued in the present Committee.

Existing practice, however, has not always been consistent. Various dialects of C have approached problems in different and sometimes diametrically opposed ways. This divergence has happened for several reasons. First, K&R, which once served as the language specification for almost all C translators, is imprecise in some areas (thereby allowing divergent interpretations), and it does not address some issues (such as a complete specification of a library) important for code portability.
Second, as the language has matured over the years, various extensions have been added in different dialects to address limitations and weaknesses of the language; but these extensions have not been consistent across dialects.

One of the C89 Committee's goals was to consider such areas of divergence and to establish a set of clear, unambiguous rules consistent with the rest of the language. This effort included the consideration of extensions made in various C dialects, the specification of a complete set of required library functions, and the development of a complete, correct syntax for C.

Much of the Committee's work has always been in large part a balancing act. The C89 Committee tried to improve portability while retaining the definition of certain features of C as machine-dependent, it attempted to incorporate valuable new ideas without disrupting the basic structure and fabric of the language, and it tried to develop a clear and consistent language without invalidating existing programs. All of the goals were important and each decision was weighed in the light of sometimes contradictory requirements in an attempt to reach a workable compromise.

In specifying a standard language, the C89 Committee used several principles which continue to guide our deliberations today. The most important of these are:

Existing code is important, existing implementations are not. A large body of C code exists of considerable commercial value. Every attempt has been made to ensure that the bulk of this code will be acceptable to any implementation conforming to the Standard. The C89 Committee did not want to force most programmers to modify their C programs just to have them accepted by a conforming translator. On the other hand, no one implementation was held up as the exemplar by which to define C. It was assumed that all existing implementations must change somewhat to conform to the Standard.

C code can be portable. Although the C language was originally born with the UNIX operating system on the PDP-11, it has since been implemented on a wide variety of computers and operating systems. It has also seen considerable use in cross-compilation of code for embedded systems to be executed in a free-standing environment. The C89 Committee attempted to specify the language and the library to be as widely implementable as possible, while recognizing that a system must meet certain minimum criteria to be considered a viable host or target for the language.

C code can be non-portable. Although it strove to give programmers the opportunity to write truly portable programs, the C89 Committee did not want to force programmers into writing portably, to preclude the use of C as a high-level assembler: the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program.

Avoid quiet changes. Any change to widespread practice altering the meaning of existing code causes problems. Changes that cause code to be so ill-formed as to require diagnostic messages are at least easy to detect. As much as seemed possible consistent with
[Bug fortran/32527] [Optimization] ICE in build2_stat, at tree.c:3074
--- Comment #1 from anlauf at gmx dot de 2007-06-27 20:44 ---
Created an attachment (id=13795)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13795&action=view)
Demo code, extracted from netcdf-3.6.1

Compile with -O or -Os

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32527