[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||Joost.VandeVondele at mat dot ethz.ch
Summary|[4.8 Regression] miscompilation at -O2|[4.8 Regression] miscompilation at -O2 (tree-pre?)
--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-12-01 15:53:17 UTC ---
Using -O2 -fno-tree-pre fixes the testcase. Using -O1 -ftree-pre leads to an infinite loop at runtime.
[Bug fortran/55469] memory leak on read with istat.ne.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55469
--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-29 10:23:13 UTC ---
Is that for the more complete patch posted here: http://gcc.gnu.org/ml/fortran/2012-11/msg00083.html ?
BTW, the PR number in that message is wrong.
[Bug tree-optimization/55213] vectorizer ignores __restrict__
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55213
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||Joost.VandeVondele at mat dot ethz.ch
--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-30 07:45:08 UTC ---
Something similar was reported in PR47341, which adds some analysis.
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #33 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-29 07:30:58 UTC ---
(In reply to comment #31)
> As for the backport, I think the patch is absolutely risk-free, and it should have been approved for 4.7 even though it doesn't fulfill the formal requirements. Please ping the patch in a few weeks so it's not forgotten.

Ping
[Bug fortran/55469] memory leak on read with istat.ne.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55469
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||Joost.VandeVondele at mat dot ethz.ch
--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-26 08:01:27 UTC ---
BTW, would there be a simple workaround?
[Bug fortran/55469] New: memory leak on read with istat.ne.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55469
Bug #: 55469
Summary: memory leak on read with istat.ne.0
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

The following testcase leads to memory leaks with gfortran 4.5/4.6/4.7/4.8 (as found by valgrind). 4.1 seems not to leak (but has a couple of warnings).

REAL :: z
INTEGER :: istat
CHARACTER(LEN=3) :: t
t="NVE"
READ (UNIT=t,FMT=*,IOSTAT=istat) z
END

Note that istat.NE.0 in this case.

==37422== 300 bytes in 1 blocks are definitely lost in loss record 1 of 1
==37422==    at 0x4A057F4: calloc (vg_replace_malloc.c:593)
==37422==    by 0x4C298FF: _gfortrani_xcalloc (memory.c:56)
==37422==    by 0x4CE2A82: l_push_char.isra.2 (list_read.c:641)
==37422==    by 0x4CE3223: read_real (list_read.c:1634)
==37422==    by 0x4CE504E: _gfortrani_list_formatted_read (list_read.c:1895)
[Bug fortran/55341] New: address-sanitizer and Fortran
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55341
Bug #: 55341
Summary: address-sanitizer and Fortran
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

Hardly a bug, rather a feature... it seems '-faddress-sanitizer' works with Fortran out-of-the-box. Great! Could it be documented as being for C/C++/Fortran?

Both these testcases work ('fail') as expected:

PROGRAM TEST_ASAN_01
  INTEGER :: A(10)
  i=-1
  A(i)=0
END PROGRAM

PROGRAM TEST_ASAN_02
  INTEGER, POINTER :: x1,x2,x3
  ALLOCATE(X1)
  X2=>X1
  DEALLOCATE(X1)
  X2=0
END PROGRAM
[Bug fortran/55341] address-sanitizer and Fortran
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55341
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||Joost.VandeVondele at mat dot ethz.ch
--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-15 14:02:47 UTC ---
Trying -faddress-sanitizer on CP2K leads to the following failure:

cat bug.f90
MODULE qs_environment_types
  TYPE rt_prop_type
    INTEGER,DIMENSION(:,:),ALLOCATABLE:: orders
  END TYPE rt_prop_type
  TYPE qs_environment_type
    TYPE(rt_prop_type),POINTER:: rtp
  END TYPE qs_environment_type
CONTAINS
  SUBROUTINE set_qs_env(qs_env,rtp)
    TYPE(qs_environment_type), POINTER:: qs_env
    TYPE(rt_prop_type), OPTIONAL, POINTER :: rtp
    IF (PRESENT(rtp)) qs_env%rtp=>rtp
  END SUBROUTINE set_qs_env
END MODULE qs_environment_types

gfortran -O0 -faddress-sanitizer bug.f90
bug.f90: In function ‘set_qs_env’:
bug.f90:9:0: error: type mismatch in binary expression
 SUBROUTINE set_qs_env(qs_env,rtp)
 ^
unsigned long
integer(kind=8)
unsigned long
_182 = _181 - 1;
bug.f90:9:0: error: type mismatch in binary expression
unsigned long
integer(kind=8)
unsigned long
_206 = _205 - 1;
bug.f90:9:0: internal compiler error: verify_gimple failed
0x9a47ac verify_gimple_in_cfg(function*)
    ../../gcc/gcc/tree-cfg.c:4728
0x8dad97 execute_function_todo
    ../../gcc/gcc/passes.c:1979
0x8db75d execute_todo
    ../../gcc/gcc/passes.c:2008
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727 --- Comment #32 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-09 10:05:18 UTC --- If you can use the additional free time to walk over to my brother's office, then please say 'Hi' to him. Otherwise the faculty meeting will have to do :-) Let's call it a small world... I will meet him next week.
[Bug tree-optimization/55238] ICE in find_aggregate_values_for_callers_subset, at ipa-cp.c:2908 building zlib
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55238
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||Joost.VandeVondele at mat dot ethz.ch
--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-08 19:27:22 UTC ---
gfortran -O3 also aborts on this testcase at the same location:

MODULE dbcsr_dist_operations
  TYPE dbcsr_type
    LOGICAL :: symmetry
  END TYPE
CONTAINS
  SUBROUTINE get_stored_coordinates_type(matrix, transpose, processor)
    TYPE(dbcsr_type), INTENT(IN) :: matrix
    LOGICAL, INTENT(INOUT) :: transpose
    INTEGER, INTENT(OUT), OPTIONAL :: processor
    LOGICAL :: checker_tr
    IF (PRESENT (processor)) THEN
      IF (matrix%symmetry .AND. checker_tr()) THEN
        processor = dbcsr_distribution_processor ()
      ENDIF
    ENDIF
  END SUBROUTINE get_stored_coordinates_type
  SUBROUTINE get_block_index_type(matrix, transpose)
    TYPE(dbcsr_type), INTENT(IN) :: matrix
    LOGICAL, INTENT(OUT) :: transpose
    transpose = .FALSE.
    CALL get_stored_coordinates_type (matrix, transpose)
  END SUBROUTINE get_block_index_type
END MODULE dbcsr_dist_operations
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #30 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-11-09 07:31:28 UTC ---
(In reply to comment #29)
> I committed the C-only version of the patch as the issues mentioned in comment #27 couldn't be addressed before stage3.

Thanks Tobi! I have been using your C-only patch for a couple of weeks now for the 4.7 branch, and it is greatly improving our edit/compile-cycles. For one of my students, it yields an effective 10x speedup in building CP2K after a typical code change, greatly facilitating the programming project he is on. I would suggest that after a couple of weeks on trunk, this should be reconsidered again for backporting to 4.7.
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
URL||http://gcc.gnu.org/ml/fortran/2012-10/msg00061.html
CC||Joost.VandeVondele at mat dot ethz.ch
--- Comment #26 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-28 11:11:19 UTC ---
The patch was posted some time ago, with an OK for trunk: http://gcc.gnu.org/ml/fortran/2012-10/msg00067.html
Maybe it is a good time to commit before the next stage starts?
[Bug fortran/55099] New: Surprising 'PROCEDURE attribute conflicts with INTENT attribute' error
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55099
Bug #: 55099
Summary: Surprising 'PROCEDURE attribute conflicts with INTENT attribute' error
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

In the following, a surprising (but correct) error message is issued. Maybe it is possible to improve the wording to point at the other option... looking at ifort's error message is cheating ;-) but certainly that one helps the non-expert.

SUBROUTINE S(num_proc_2d)
  INTEGER, INTENT(IN) :: num_proc_2d
  INTEGER :: proc_x,proc_y
  proc_x=num_proc_2d(1) ; proc_x=num_proc_2d(2)
END SUBROUTINE

Error message:

SUBROUTINE S(num_proc_2d)
1
Error: PROCEDURE attribute conflicts with INTENT attribute in 'num_proc_2d' at (1)
[Bug fortran/55099] Surprising but valid 'PROCEDURE attribute conflicts with INTENT attribute' error
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55099
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||Joost.VandeVondele at mat dot ethz.ch
--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-27 17:38:34 UTC ---
(In reply to comment #1)
> How about the following (which of course implies that the users didn't intend to use an array - if they did, Intel's becomes more helpful.)

Indeed, I was coding this with the intent of declaring it as an array. No doubt passing an array is much more common than passing a procedure. Note that Intel's location is also more useful in that case. I think the usefulness of Intel's message lies in the fact that it suggests the common cause of the error. Even with the experience I have, I first started to look for a procedure with the same name as the variable.
[Bug rtl-optimization/54991] [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54991
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
Status|UNCONFIRMED|RESOLVED
CC||Joost.VandeVondele at mat dot ethz.ch
Resolution||FIXED
--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-22 08:43:07 UTC ---
Verified as fixed with several full CP2K builds.
[Bug fortran/31119] -fbounds-check: Check for presence of optional arguments before bound checking
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31119
--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-20 14:59:08 UTC ---
(In reply to comment #7)
> Hi, can someone fortran aware please double-check that the tests
> * gfortran.dg/bounds_check_9.f90: New test.
> * gfortran.dg/bounds_check_fail_2.f90: New test.
> do not contain out of bounds access? I am working on path to bound number of loop iterations better based on array accesses and what I see is array A.9 containing values {1,2} that is accessed in the loop header. We bound number of iterations of that loop to 1 (that is one loopback edge iteration to walk both parts of the array) and then the testcases start failing. I do not understand the testcase. Perhaps the bounds-check instrumentation happens too late or we need to disable this logic with -fbounds-check?
> Honza

As far as I can tell, the first testcase (bounds_check_9.f90) should contain no out-of-bounds access (at least from the Fortran point of view, and also according to valgrind), while the second testcase (bounds_check_fail_2.f90) does contain an out-of-bounds access (by design). Of course, -fbounds-check is designed to catch out-of-bounds accesses at runtime (which is what the second testcase tests).

Of course, Fortran programs with out-of-bounds accesses are not standard conforming. Actually, the situation is a bit bizarre. There are no conforming programs for which bounds checking can trigger... all these bounds-checking statements can just be optimized away :-). That's not quite what the users want...

I run -fbounds-check -O2 quite often. I don't think one should switch off optimization in the presence of -fbounds-check. Maybe the docs should be enhanced to mention that bounds checking is most effective at -O0?
[Bug rtl-optimization/54991] New: [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54991
Bug #: 54991
Summary: [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

I have tested LRA (20121014 (experimental) [lra revision 192621]) on CP2K and find the following ICE:

/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/dbcsr_lib/dbcsr_operations.F:4471:0: internal compiler error: in lra_assign, at lra-assigns.c:1361
 END SUBROUTINE dbcsr_lanczos_extremal_eig
 ^
0x8621b2 lra_assign()
    ../../gcc/gcc/lra-assigns.c:1361
0x85e2f2 lra(_IO_FILE*)
    ../../gcc/gcc/lra.c:2309
0x826696 do_reload
    ../../gcc/gcc/ira.c:4613
0x826696 rest_of_handle_reload
    ../../gcc/gcc/ira.c:4719
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
make[1]: *** [dbcsr_operations.o] Error 1
make[1]: Leaving directory `/data/vjoost/gnu/cp2k/cp2k/obj/gfortran-test28/sopt'
make: *** [build] Error 2

Unfortunately, I don't know how to produce a small testcase, as this is happening only for a compilation with '-fprofile-use'. I could tar the needed .mod and .gcda files if that is an option.

This is happening on x86_64:
-march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=24576 -mtune=corei7
[Bug rtl-optimization/54991] [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54991 --- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-19 18:58:31 UTC --- Created attachment 28495 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28495 testcase, including source, .mod and .gcda files needed. README gives compilation command needed
[Bug tree-optimization/54967] New: [4.8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:55
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54967
Bug #: 54967
Summary: [4.8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:55
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

This started failing very recently on trunk:

gfortran -c -O2 -funroll-loops bug.f90
bug.f90: In function ‘calc_s_derivs’:
bug.f90:1:0: internal compiler error: in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:557
 SUBROUTINE calc_S_derivs()
 ^
0xa0cf93 check_loop_closed_ssa_use
    ../../gcc/gcc/tree-ssa-loop-manip.c:557
0xa0d591 check_loop_closed_ssa_stmt
    ../../gcc/gcc/tree-ssa-loop-manip.c:572
0xa0d591 verify_loop_closed_ssa(bool)
    ../../gcc/gcc/tree-ssa-loop-manip.c:606
0xa0d880 gimple_duplicate_loop_to_header_edge(loop*, edge_def*, unsigned int, simple_bitmap_def*, edge_def*, vec_t<edge_def*>**, int)
    ../../gcc/gcc/tree-ssa-loop-manip.c:762
0xdbd99a try_unroll_loop_completely
    ../../gcc/gcc/tree-ssa-loop-ivcanon.c:519
0xdbd99a canonicalize_loop_induction_variables
    ../../gcc/gcc/tree-ssa-loop-ivcanon.c:666
0xdbea10 tree_unroll_loops_completely(bool, bool)
    ../../gcc/gcc/tree-ssa-loop-ivcanon.c:815

cat bug.f90
SUBROUTINE calc_S_derivs()
  INTEGER, DIMENSION(6, 2) :: c_map_mat
  INTEGER, DIMENSION(:), POINTER:: C_mat
  DO j=1,3
    DO m=j,3
      n=n+1
      c_map_mat(n,1)=j
      IF(m==j)CYCLE
      c_map_mat(n,2)=m
    END DO
  END DO
  DO m=1,6
    DO j=1,2
      IF(c_map_mat(m,j)==0)CYCLE
      CALL foo(C_mat(c_map_mat(m,j)))
    END DO
  END DO
END SUBROUTINE calc_S_derivs
[Bug tree-optimization/54967] [4.8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:55
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54967
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
Status|NEW|UNCONFIRMED
CC||rguenth at gcc dot gnu.org
Ever Confirmed|1|0
--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-18 08:11:14 UTC ---
I assume it is:
r190978 | rguenth | 2012-09-05 15:29:13 +0200 (Wed, 05 Sep 2012) | 11 lines
2012-09-05 Richard Guenther rguent...@suse.de
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #25 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-15 14:14:12 UTC ---
Just to provide some additional numbers on how important this patch is for practical development (and of course to +1 on backports): for a 'typical code change' on a CP2K tree (adding an unused local variable to a subroutine), the speedup due to avoided recompilation (on a 32 core server) can be obtained from the following compile timings (repeatable over various tries):

4.6 (unpatched)  real 1m45.064s
4.7 (patched)    real 0m14.958s

I really think this is a pretty substantial bug fix of an existing feature.
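For reference, the ratio between the two timings in this comment can be worked out with a small shell sketch. The helper names `to_seconds` and `speedup` are made up for this illustration, and it assumes both values are wall-clock `real` times in the `NmS.SSSs` form printed by `time`. Note that the two builds also use different compiler branches, so the ratio is only a rough indication.

```shell
# Convert a "time"-style duration such as 1m45.064s into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two durations (first divided by second), to one decimal place.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" 'BEGIN { printf "%.1f\n", a / b }'
}

# Timings quoted in the comment: unpatched build vs. patched build.
speedup 1m45.064s 0m14.958s
```

With the numbers above this prints 7.0, i.e. roughly a 7x faster rebuild for this particular change.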
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #18 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-13 08:13:14 UTC ---
(In reply to comment #14)
> Created attachment 28425 [details]
> Patch for testing

Thanks... now repeated CP2K compiles give identical '.mod's, and of course omp_lib.mod is fixed as well. This will very much improve the user experience for those working on large code bases.

Since you're using C++, I guess a backport to older branches is out of the question.
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #23 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-13 12:28:12 UTC ---
(In reply to comment #22)
> Created attachment 28440 [details]
> patch that doesn't use c++

I've tested the patch with (an older version of) the 4.7 branch, and it works fine for CP2K.
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #24 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-13 12:45:11 UTC ---
(In reply to comment #23)
> I've tested the patch with (an older version of) the 4.7 branch, and it works fine for CP2K.

It doesn't apply cleanly to 4.6, so no testing there, unfortunately.
[Bug middle-end/37150] vectorizer misses some loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
Last reconfirmed|2009-08-06 07:54:57|2012-10-06 7:54:57
--- Comment #13 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-06 10:38:57 UTC ---
Reconfirming this with current trunk:

ifort:        1.02s
gfortran 4.8: 2.01s

gfortran -ffast-math -march=native -O3 -v PR37150.f90
-march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-06 12:42:13 UTC ---
(In reply to comment #3)
> 2012-10-06 Tobias Schlüter t...@gcc.gnu.org
> PR fortran/51727
> * module.c (write_generic): Traverse tree in left-to-right order.

I've tested that this patch fixes the problem for omp_lib.mod, so it would likely also fix the recompilation cascades. However, testing it on CP2K, I'm finding that compilation fails with this patch (and passes without), so something seems wrong. The difference between the generated modules is rather large.
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727 --- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-06 12:46:36 UTC --- Created attachment 28373 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28373 bad module
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727 --- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-06 12:47:19 UTC --- Created attachment 28374 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28374 good module
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #7 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-06 12:48:39 UTC ---
The main difference between 'good' and 'bad' seems to be the 'header' lines.

bad:
() (('arch_topology' 'machine_architecture_types' 2)) ()

good:
() (('arch_topology' 'machine_architecture_types' 2) ('ma_mp_type' 'machine_architecture_types' 3) ('ma_process' 'machine_architecture_types' 4) ('machine_output' 'machine_architecture_types' 5) ('thread_inf' 'machine_architecture_types' 6)) ()
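One quick way to list which entries the 'bad' header has lost is to split both header lines into their ('name' 'module' number) triples and compare the two sets. The following is a rough sketch rather than anything from the patch: `entries` and `missing_entries` are hypothetical helper names, and it assumes a grep that supports `-o` (GNU and BSD grep both do).

```shell
# Print one 'name' 'module' number triple per line, sorted.
entries() {
    printf '%s\n' "$1" | grep -o "'[^']*' '[^']*' [0-9][0-9]*" | LC_ALL=C sort
}

# Print the triples found in the first header line but not in the second.
missing_entries() {
    first_list=$(mktemp) || return 1
    second_list=$(mktemp) || { rm -f "$first_list"; return 1; }
    entries "$1" > "$first_list"
    entries "$2" > "$second_list"
    LC_ALL=C comm -23 "$first_list" "$second_list"
    rm -f "$first_list" "$second_list"
}

# Header lines as quoted in the comment.
good_header="() (('arch_topology' 'machine_architecture_types' 2) ('ma_mp_type' 'machine_architecture_types' 3) ('ma_process' 'machine_architecture_types' 4) ('machine_output' 'machine_architecture_types' 5) ('thread_inf' 'machine_architecture_types' 6)) ()"
bad_header="() (('arch_topology' 'machine_architecture_types' 2)) ()"

missing_entries "$good_header" "$bad_header"
```

For the two lines quoted here this lists the four triples numbered 3 to 6, which are the ones the patched compiler no longer writes.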
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-06 12:52:09 UTC ---
(In reply to comment #3)
> Created attachment 28372 [details]
> Candidate patch

Actually... looking at the patch, don't you need to deal with the if statements that return?
[Bug fortran/51727] Changing module files
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727
Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:
What|Removed|Added
CC||simonb at google dot com
--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-05 18:16:41 UTC ---
Also reported here: http://gcc.gnu.org/ml/gcc/2012-10/msg00075.html
This is the source of recompilation cascades sometimes seen in CP2K as well. I'm wondering if a very naive hack like sorting .mod content (like in cat old.mod 1 | sort -s new.mod) could not paper over this problem sufficiently well to make it irrelevant in reality.
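The sorting idea can be tried outside the compiler: sort both module files and compare the results, so that two files differing only in line order count as unchanged. This is a minimal sketch under the assumption that the .mod files are plain text; `same_lines_any_order` is a hypothetical helper name, and sorting whole lines only detects reordering at line granularity, not reordering within a line.

```shell
# Succeeds (exit status 0) when both files contain the same lines, in any order.
same_lines_any_order() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 2; }
    LC_ALL=C sort "$1" > "$a_sorted"
    LC_ALL=C sort "$2" > "$b_sorted"
    cmp -s "$a_sorted" "$b_sorted"
    status=$?
    rm -f "$a_sorted" "$b_sorted"
    return $status
}

# Example use in a build script (file names are placeholders):
#   if same_lines_any_order old.mod new.mod; then
#       echo "only the order changed, dependents need no rebuild"
#   fi
```

A build system could use such a check to keep the old module file (and its timestamp) whenever only the ordering differs, which is what stops the cascade.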
[Bug rtl-optimization/54751] [4.8 Regression] slow compile time with rtl loop unroller
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54751
--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-10-02 10:39:41 UTC ---
More reasonable with --enable-checking=release:

4.8 (checking=yes):     ~10min
4.8 (checking=release): 1min28s
4.7:                    0min58s

Maybe some of the checking is a bit excessive in this case.
[Bug fortran/54758] New: accessing gcc builtins from fortran
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54758
Bug #: 54758
Summary: accessing gcc builtins from fortran
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

I would like to experiment with prefetching in Fortran code (beyond -fprefetch-loop-arrays). A convenient way to do this would be to access the GCC builtins. I tried this the following way:

INTERFACE
  SUBROUTINE builtin_prefetch(a) BIND(C,name="__builtin_prefetch")
    USE ISO_C_BINDING, ONLY: C_FLOAT
    REAL(KIND=C_FLOAT), dimension(*) :: a
  END SUBROUTINE
END INTERFACE
real*4 :: data(100)
DO i=1,100
  CALL builtin_prefetch(data(i))
  data(i)=0
ENDDO
END

but it didn't work...

test.f90:(.text+0x36): undefined reference to `__builtin_prefetch'

No surprise, I guess, but it would be cool if it did.
[Bug fortran/45586] [4.8 Regression] ICE non-trivial conversion at assignment
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45586
--- Comment #86 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-30 12:30:43 UTC ---
(In reply to comment #84)
LTO might work for many codes, as using allocatables in derived types was not standard Fortran90 (IIRC) and appears to be needed to trigger the bug. Anyway, since most people will use released versions of gcc, this checking error will be hidden behind --enable-checking=release. Only very few people will be able to locate, and in particular reduce, wrong code generation that only happens with LTO, so I wouldn't expect bug reports for actual wrong code generation very quickly.

Meanwhile, a shorter testcase for 4.8, using gfortran -flto -O0:

TYPE t
  REAL, DIMENSION(:), ALLOCATABLE :: r
END TYPE t
TYPE t_p
  TYPE(t), POINTER :: d_t
END TYPE t_p
REAL, DIMENSION(:), POINTER:: d
TYPE(t_p) :: x
d=>x%d_t%r
END
[Bug go/54749] New: libbacktrace
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54749 Bug #: 54749 Summary: libbacktrace Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: go AssignedTo: i...@airs.com ReportedBy: joost.vandevond...@mat.ethz.ch On a testcase that makes the compiler run out-of-memory (by setting ulimit to ulimit -m 8388608 ulimit -v 8388608 ulimit -d 8388608 ulimit -t 600 and running the full testcase of PR53852) I get the following stacktrace, which is a bit ugly: GNU MP: Cannot allocate memory (size=8) In function 'build_d_tensor_gks': H�D$A��H�H�D$H�H�D$ H�CH�D$(H�kH�Cv,H��H�l$HH�\$@L�d$PL�l$XL�t$`H��h�Aborted mmap: Cannot allocate memory failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to 
read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information failed to read executable information Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. make[2]: *** [semi_empirical_int_gks.o] Error 1 make[2]: Target `_progr' not remade because of errors. make[2]: Leaving directory `/data/vjoost/gnu/cp2k/cp2k/obj/gfortran-test12/sopt' make[1]: *** [build] Error 2 make[1]: Leaving directory `/data/vjoost/gnu/cp2k/cp2k/makefiles'
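To reproduce this kind of out-of-memory run without changing the limits of the interactive shell, the ulimit settings from the report can be applied inside a subshell. This is only a sketch: `run_limited` is a hypothetical wrapper name, the values are the ones quoted in the report (in bash, -m, -v and -d are given in KiB and -t in seconds), and setting them may fail on systems whose hard limits are lower.

```shell
# Hypothetical wrapper: run a command under the resource limits quoted in the report.
# The parentheses start a subshell, so the calling shell keeps its own limits.
run_limited() {
    (
        ulimit -m 8388608 &&   # resident set size, KiB
        ulimit -v 8388608 &&   # virtual memory, KiB
        ulimit -d 8388608 &&   # data segment, KiB
        ulimit -t 600 &&       # CPU time, seconds
        "$@"
    )
}

# Example: show the CPU-time limit seen by a child process.
run_limited sh -c 'ulimit -t'
```

The compiler invocation from the failing build would be passed in place of the `sh -c` example, e.g. as `run_limited gfortran ...` with the flags of the original testcase.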
[Bug rtl-optimization/54751] New: [4.8 Regression] slow compile time with rtl loop unroller
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54751
Bug #: 54751
Summary: [4.8 Regression] slow compile time with rtl loop unroller
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

Created attachment 28299 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28299
gzipped testcase.

Compiling the attached testcase needs ~10x more time with current 4.8 trunk than with 4.7. I believe this is a recent regression. A typical stack trace looks like:

#0 0x006a7a02 in df_ref_equal_p(df_ref_d*, df_ref_d*) ()
#1 0x006a7af5 in df_refs_verify(vec_t<df_ref_d*>*, df_ref_d**, bool) ()
#2 0x006abf3f in df_insn_refs_verify(df_collection_rec*, basic_block_def*, rtx_def*, bool) ()
#3 0x006aea2a in df_bb_verify(basic_block_def*) ()
#4 0x006aed40 in df_scan_verify() ()
#5 0x0069e155 in df_analyze() ()
#6 0x0083b1dd in iv_analysis_loop_init(loop*) ()
#7 0x0083e685 in get_simple_loop_desc(loop*) ()
#8 0x00841265 in unroll_and_peel_loops(int) ()
#9 0x00835cd7 in rtl_unroll_and_peel_loops() ()
#10 0x00881107 in execute_one_pass(opt_pass*) ()

Compile flags:
gfortran -c -cpp -O2 -ftree-vectorize -funroll-loops -ffast-math test.f90

This needs about 10min (gcc 4.8) or 1min (gcc 4.7) on my machine; removing -funroll-loops reduces that to 1m20s (4.8) or 28s (4.7).
[Bug middle-end/54749] libbacktrace
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54749
--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-29 17:34:04 UTC ---
(In reply to comment #1)
> You filed this against the go component, but it seems that Go is not involved. Is that right? This is just about a backtrace printed after a run of the Fortran compiler?

Yes, it was unclear what the proper component was for libbacktrace... I didn't consider this middle end either (and I was under the impression that go and libbacktrace had something in common). The problem is not the fact that this particular run crashes, but that the trace should deal with the mmap out-of-memory condition more nicely (i.e. with one line of error).
[Bug fortran/45586] [4.8 Regression] ICE non-trivial conversion at assignment
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45586 --- Comment #83 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-26 06:42:59 UTC --- Mikael, any progress on this one (BTW, the PR is not yet assigned)? It would be great to have LTO work with Fortran in 4.8 (especially with all the inlining improvements). However, I would guess that this is stage 1 material, and I'm assuming stage 1 is nearing its end.
[Bug tree-optimization/54634] New: [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54634
Bug #: 54634
Summary: [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

Created attachment 28227 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28227
testcase sources

The attached sources are miscompiled with current trunk ([trunk revision 191430]) at -O3 -ftree-loop-distribution. To reproduce:

gfortran -O3 -ftree-loop-distribution -ffree-form other.F mathconstants.F orbital_pointers.F orbital_symbols.F orbital_transformation_matrices.F main.F ; ./a.out

which outputs wrong values (as compared to -O0) and shows a valgrind warning (not present at -O0). The miscompiled file is orbital_transformation_matrices.F, most likely the routine create_spherical_harmonics (which seems to be inlined). If I cat all files into a single .F file, the error also disappears, which might hint at some IPA thing? The 4.7 branch ([gcc-4_7-branch revision 190437]) is doing fine.
[Bug tree-optimization/54634] [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54634
--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-20 10:15:57 UTC ---
(In reply to comment #1)
> Retry with PR54629 fix?

After applying the patch mentioned above, the testcase still fails. The failure is also older than the commit mentioned in PR54629.
[Bug tree-optimization/54634] [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54634 --- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-20 13:06:50 UTC --- (In reply to comment #4) Ah, binomial () is pure. In this case, it was presumably triggered by Tobias' changes for PR54389. binomial() has not been declared pure in the source, but was most likely correctly classified as 'implicitly pure' by the Fortran frontend.
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #11 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-13 12:31:03 UTC --- (In reply to comment #10) Draft patch (replaces the one in comment 9):
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -13567,6 +13572,5 @@ gfc_impure_variable (gfc_symbol *sym)
   proc = sym->ns->proc_name;
-  if (sym->attr.dummy && gfc_pure (proc)
-      && ((proc->attr.subroutine && sym->attr.intent == INTENT_IN)
-        ||
-        proc->attr.function))
+  if (sym->attr.dummy
+      && ((proc->attr.subroutine && sym->attr.intent == INTENT_IN)
+        || proc->attr.function))
     return 1;
this one fixes the error seen with CP2K.
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #16 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-14 05:57:51 UTC --- (In reply to comment #15) FIXED on the trunk - and on the 4.6/4.7 branch. Sorry for the breakage! Thank you and other gcc experts for regularly fixing issues quickly and professionally, while steadily improving the quality of the compiler!
[Bug fortran/54389] [F2003/F2008 difference] PURE functions and pointer dummy arguments / DECL_PURE_P issue
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54389 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added CC||Joost.VandeVondele at mat ||dot ethz.ch --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 10:00:46 UTC --- This revision causes CP2K to produce wrong results at -O1 and above. I don't have a reduced testcase, other than compiling and building CP2K, but found this by bisection.
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 11:41:12 UTC --- the two revisions lead to a lot of changes, all these files differ in their disassembled form:
  1 admm_methods.o Files f1 and f2 differ
  2 atom_fit.o Files f1 and f2 differ
  3 atom_pseudo.o Files f1 and f2 differ
  9 cp_ddapc_methods.o Files f1 and f2 differ
  10 cp_fm_basic_linalg.o Files f1 and f2 differ
  11 cp_ma_interface.o Files f1 and f2 differ
  12 cp_parser_inpp_methods.o Files f1 and f2 differ
  13 cp_parser_methods.o Files f1 and f2 differ
  14 dbcsr_dist_operations.o Files f1 and f2 differ
  15 dbcsr_example_3.o Files f1 and f2 differ
  16 dbcsr_index_operations.o Files f1 and f2 differ
  17 dbcsr_internal_operations.o Files f1 and f2 differ
  18 dbcsr_iterator_operations.o Files f1 and f2 differ
  19 dbcsr_operations.o Files f1 and f2 differ
  20 dbcsr_performance_multiply.o Files f1 and f2 differ
  21 dbcsr_test_add.o Files f1 and f2 differ
  22 dbcsr_test_methods.o Files f1 and f2 differ
  23 dbcsr_test_multiply.o Files f1 and f2 differ
  24 dbcsr_transformations.o Files f1 and f2 differ
  25 dbcsr_work_operations.o Files f1 and f2 differ
  26 efield_utils.o Files f1 and f2 differ
  27 et_coupling.o Files f1 and f2 differ
  28 f77_interface.o Files f1 and f2 differ
  29 fp_methods.o Files f1 and f2 differ
  30 helium_io.o Files f1 and f2 differ
  31 hfx_types.o Files f1 and f2 differ
  32 input_cp2k.o Files f1 and f2 differ
  33 lgrid_types.o Files f1 and f2 differ
  34 ma_affinity.o Files f1 and f2 differ
  35 mltfftsg.o Files f1 and f2 differ
  36 molsym.o Files f1 and f2 differ
  37 orbital_transformation_matrices.o Files f1 and f2 differ
  38 pair_potential.o Files f1 and f2 differ
  39 parallel_rng_types.o Files f1 and f2 differ
  40 paw_proj_set_types.o Files f1 and f2 differ
  41 preconditioner.o Files f1 and f2 differ
  42 pw_methods.o Files f1 and f2 differ
  43 pw_poisson_methods.o Files f1 and f2 differ
  44 pw_poisson_types.o Files f1 and f2 differ
  45 pw_pool_types.o Files f1 and f2 differ
  46 qs_gspace_mixing.o Files f1 and f2 differ
  47 qs_integrate_potential.o Files f1 and f2 differ
  48 qs_ks_methods.o Files f1 and f2 differ
  49 qs_neighbor_lists.o Files f1 and f2 differ
  50 qs_neighbor_list_types.o Files f1 and f2 differ
  51 qs_rho0_methods.o Files f1 and f2 differ
  52 qs_rho_methods.o Files f1 and f2 differ
  53 qs_scf_block_davidson.o Files f1 and f2 differ
  54 qs_scf_diagonalization.o Files f1 and f2 differ
  55 qs_scf.o Files f1 and f2 differ
  56 qs_vxc.o Files f1 and f2 differ
  57 restraint.o Files f1 and f2 differ
  58 rtp_admm_methods.o Files f1 and f2 differ
  59 rt_propagation_methods.o Files f1 and f2 differ
  60 sap_kind_types.o Files f1 and f2 differ
  61 scp_hartree_1center.o Files f1 and f2 differ
  62 se_core_matrix.o Files f1 and f2 differ
  63 se_fock_matrix_coulomb_ga.o Files f1 and f2 differ
  64 se_fock_matrix_coulomb_mpi.o Files f1 and f2 differ
  65 semi_empirical_expns3_methods.o Files f1 and f2 differ
  66 semi_empirical_par_utils.o Files f1 and f2 differ
  67 task_list_methods.o Files f1 and f2 differ
  68 thermostat_mapping.o Files f1 and f2 differ
  69 xc.o Files f1 and f2 differ
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 20:11:24 UTC --- some progress.. the object file that leads to wrong results is parallel_rng_types.o. I'll see if I can get some further insight.
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 20:26:49 UTC --- (In reply to comment #3) (In reply to comment #2) some progress.. the object file that leads to wrong results is parallel_rng_types.o. I'll see if I can get some further insight. It seems that - for some reason - IMPLICIT_PURE is only set for functions. (Or at least that's here the case for a simple test case.) If you produce a module, have a look at the .mod file and search for IMPLICIT_PURE. In my example I have something like: 3 's' 'm' '' 1 ((PROCEDURE [...] FUNCTION IMPLICIT_PURE) [...] where s is the name of my function and m is the name of the module. Then, check whether that procedure could be PURE or has to be IMPURE. yes, I think from looking at the optimized dumps, I can see that a function that is called twice in the correct version is called only once in the wrong version. I think I might be able to reduce it to a testcase. (If you care, the function is rn53 which calls rn32 only once, so I guess that's the issue).
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 20:46:05 UTC --- Created attachment 28179 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28179 testcase
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 20:50:40 UTC --- The testcase illustrates the issue, compiling as gfortran -c -O1 test.f90 -fdump-tree-optimized shows that rn32 is only called once from rn53, whereas the proper number would be 2 or 3. So I guess rn32 is incorrectly marked as pure.
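The effect described in this comment can be modelled outside the compiler. The following is a minimal Python sketch, not the CP2K code: the names rn32 and rn53 are taken from the comment, but the generator body (a small linear congruential generator) and the way the two draws are combined are assumptions made purely for illustration. `rn53_two_calls` behaves like the correct build, while `rn53_merged` mimics what an optimizer is allowed to do once the callee is treated as pure, namely evaluate it once and reuse the result.

```python
class Rng:
    """Hypothetical stand-in for the generator state behind rn32."""

    def __init__(self, seed):
        self.state = seed
        self.calls = 0

    def rn32(self):
        # Advancing the hidden state is a side effect, so this function
        # is not pure even though it "only returns a number".
        self.calls += 1
        self.state = (1103515245 * self.state + 12345) % 2**31
        return self.state / 2**31


def rn53_two_calls(rng):
    # Faithful version: two separate draws are combined.
    hi = rng.rn32()
    lo = rng.rn32()
    return hi + lo * 2.0**-24


def rn53_merged(rng):
    # Models common-subexpression elimination under the (wrong)
    # assumption that rn32 is pure: the second call reuses the first result.
    hi = rng.rn32()
    lo = hi
    return hi + lo * 2.0**-24


if __name__ == "__main__":
    a, b = Rng(1), Rng(1)
    print(rn53_two_calls(a), a.calls)
    print(rn53_merged(b), b.calls)
```

Both the returned value and the generator state after the call differ between the two variants, which is consistent with the wrong results reported for parallel_rng_types.o.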
[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556 --- Comment #7 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-12 20:58:23 UTC --- (In reply to comment #6) So I guess rn32 is incorrectly marked as pure. which indeed is also visible in the .mod file: 'rn32' 'parallel_rng_types' '' 1 ((PROCEDURE UNKNOWN-INTENT MODULE-PROC DECL UNKNOWN 0 0 FUNCTION IMPLICIT_PURE ALWAYS_EXPLICIT)
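Searching a module file for IMPLICIT_PURE, as suggested in the thread, can be scripted. The sketch below is an assumption-laden helper, not part of gfortran: it assumes the plain-text .mod layout quoted in these comments (other gfortran releases may store module files differently), and the function name `implicit_pure_procedures` and the regular expression are hypothetical.

```python
import re

# Matches symbol entries of the form quoted in the comment:
#   'name' 'module' '' 1 ((PROCEDURE ... attributes ...)
SYMBOL_RE = re.compile(
    r"'(?P<name>\w+)'\s+'(?P<module>\w*)'\s+'[^']*'\s+\d+\s+"
    r"\(\(PROCEDURE(?P<attrs>[^()]*)\)"
)


def implicit_pure_procedures(mod_text):
    """Return (module, name) pairs whose attribute list has IMPLICIT_PURE."""
    found = []
    for match in SYMBOL_RE.finditer(mod_text):
        if "IMPLICIT_PURE" in match.group("attrs").split():
            found.append((match.group("module"), match.group("name")))
    return found


if __name__ == "__main__":
    sample = (
        "'rn32' 'parallel_rng_types' '' 1 ((PROCEDURE UNKNOWN-INTENT "
        "MODULE-PROC DECL UNKNOWN 0 0 FUNCTION IMPLICIT_PURE ALWAYS_EXPLICIT)"
    )
    print(implicit_pure_procedures(sample))
```

Each procedure reported this way still has to be checked by hand to decide whether it could really be PURE or has to be IMPURE.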
[Bug fortran/45586] [4.8 Regression] ICE non-trivial conversion at assignment
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45586 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added URL||http://gcc.gnu.org/ml/fortran/2012-08/msg00150.html --- Comment #82 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-09-04 12:22:12 UTC --- URL for the current version of the patch added.
[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474 --- Comment #70 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-28 11:28:06 UTC --- (In reply to comment #69) Is there still a problem here? for current trunk and the original testcase, timings are reasonable at -O0 -O1 -O2, but very long at -O3 (more than 60min):
  report.O0.txt: TOTAL : 38.78 0.89 39.67 691166 kB
  report.O1.txt: TOTAL : 70.04 1.13 71.22 634523 kB
  report.O2.txt: TOTAL : 204.51 1.16 205.71 691522 kB
the biggest consumers are
  -O0: integrated RA : 10.36 reload : 5.16
  -O1: tree PTA : 7.77 integrated RA : 13.36
  -O2: expand vars : 83.15 tree PTA : 35.04
  -O3: (also needs about 4Gb of memory) ??? not yet finished (60min)
[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474 --- Comment #71 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-28 14:54:54 UTC --- Three hours later, the -O3 compile is still running and needs 20Gb of RAM. The issue now seems to be variable_tracking_main:
  #0 0x00b7b8ce in dataflow_set_preserve_mem_locs(void**, void*) ()
  #1 0x00e76168 in htab_traverse_noresize ()
  #2 0x00b770e0 in dataflow_set_clear_at_call(dataflow_set_def*) ()
  #3 0x00b7c613 in vt_emit_notes() ()
  #4 0x00b847ea in variable_tracking_main() ()
  #5 0x008e8acf in execute_one_pass(opt_pass*) ()
[Bug fortran/25708] Module loading is not good at all
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25708 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Depends on||40958 --- Comment #21 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-24 14:00:40 UTC --- I did another timing experiment on compiling CP2K. I found that on my server, compiling with -fsyntax-only is as fast as just compiling at -O0. I believe the reason for this is that module reading is dominating the compile time. In CP2K each module is included only once per file, so I think it is the efficiency of reading the module that matters most. My guess would be that the human readable format of the .mod file is the source of most inefficiency. Is it still important to the development of gfortran that the .mod file is in this form ? If I count the number of times a module is used, and multiply that with the size, I have about 1Gb of .mod files being parsed per CP2K compile (for about 35Mb of Fortran).
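The estimate in this comment (number of times each module is used, multiplied by the size of its .mod file) can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the regular expression only recognises plain `USE name` statements at the start of a line, and the sizes are passed in as a dictionary rather than read from disk.

```python
import re
from collections import Counter

# Plain "USE module" at the start of a line; forms such as
# "USE, INTRINSIC :: name" are deliberately not handled in this sketch.
USE_RE = re.compile(r"^[ \t]*use[ \t]+(\w+)", re.IGNORECASE | re.MULTILINE)


def module_use_counts(sources):
    """Count USE statements per module over a {filename: text} mapping."""
    counts = Counter()
    for text in sources.values():
        counts.update(name.lower() for name in USE_RE.findall(text))
    return counts


def parsed_mod_bytes(sources, mod_sizes):
    """Sum of (use count) x (.mod size in bytes) over all modules."""
    counts = module_use_counts(sources)
    return sum(n * mod_sizes.get(mod, 0) for mod, n in counts.items())


if __name__ == "__main__":
    sources = {
        "a.F": "MODULE a\n  USE kinds\n  USE util, ONLY: f\nEND MODULE a\n",
        "b.F": "module b\n use kinds\nend module b\n",
    }
    print(parsed_mod_bytes(sources, {"kinds": 1000, "util": 5000}))
```

With the figures quoted in the comment, about 1Gb of .mod text for about 35Mb of Fortran source, the compiler reads roughly 30 times more module text than source text.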
[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED --- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-22 07:40:26 UTC --- Fixed for current trunk, maybe a dup of PR54332
[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269 --- Comment #7 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-22 07:43:30 UTC --- Fixed for current trunk, maybe a dup of PR54332
[Bug tree-optimization/53852] [4.8 Regression] -ftree-loop-linear: large compile time / memory usage
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852 --- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-22 11:58:00 UTC --- simplified testcase and some analysis:
  SUBROUTINE build_d_tensor_gks(d5f,v,d5)
    INTEGER, PARAMETER :: dp=8
    REAL(KIND=dp), DIMENSION(3, 3, 3, 3, 3), INTENT(OUT) :: d5f
    REAL(KIND=dp), DIMENSION(3), INTENT(IN) :: v
    REAL(KIND=dp), INTENT(IN) :: d5
    INTEGER :: k1, k2, k3, k4, k5
    REAL(KIND=dp) :: w
    d5f = 0.0_dp
    DO k1=1,3
      DO k2=1,3
        DO k3=1,3
          DO k4=1,3
            DO k5=1,3
              d5f(k5,k4,k3,k2,k1)=d5f(k5,k4,k3,k2,k1)+ v(k1)*v(k2)*v(k3)*v(k4)*v(k5)*d5
            ENDDO
            w=v(k1)*v(k2)*v(k3)*d4
            d5f(k1,k2,k3,k4,k4)=d5f(k1,k2,k3,k4,k4)+w
            d5f(k1,k2,k4,k3,k4)=d5f(k1,k2,k4,k3,k4)+w
            d5f(k1,k4,k2,k3,k4)=d5f(k1,k4,k2,k3,k4)+w
            d5f(k4,k1,k2,k3,k4)=d5f(k4,k1,k2,k3,k4)+w
            d5f(k1,k2,k4,k4,k3)=d5f(k1,k2,k4,k4,k3)+w
            ! d5f(k1,k4,k2,k4,k3)=d5f(k1,k4,k2,k4,k3)+w
            ! d5f(k4,k1,k2,k4,k3)=d5f(k4,k1,k2,k4,k3)+w
            ! d5f(k1,k4,k4,k2,k3)=d5f(k1,k4,k4,k2,k3)+w
            ! d5f(k4,k1,k4,k2,k3)=d5f(k4,k1,k4,k2,k3)+w
            ! d5f(k4,k4,k1,k2,k3)=d5f(k4,k4,k1,k2,k3)+w
          ENDDO
        ENDDO
      ENDDO
    ENDDO
  END SUBROUTINE build_d_tensor_gks
the issue is that the compile time grows exponentially in the number of uncommented lines of the d5f=d5f+w type:
  1 0m1.112s
  2 0m4.448s
  3 0m11.513s
  4 0m21.514s
  5 0m35.529s
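The growth rate behind these five timings can be checked with a little arithmetic. The sketch below is only an illustration; the helper names are hypothetical and the data are the five compile times quoted in the comment. A constant ratio between successive entries would point to exponential growth, while a roughly constant slope on a log-log scale would point to a power law.

```python
import math

# Compile times in seconds, keyed by the number of uncommented lines.
TIMINGS = {1: 1.112, 2: 4.448, 3: 11.513, 4: 21.514, 5: 35.529}


def successive_ratios(timings):
    """Ratio t(n+1)/t(n) for consecutive line counts."""
    keys = sorted(timings)
    return [timings[b] / timings[a] for a, b in zip(keys, keys[1:])]


def loglog_slope(timings):
    """Least-squares slope of log(time) against log(line count)."""
    keys = sorted(timings)
    xs = [math.log(n) for n in keys]
    ys = [math.log(timings[n]) for n in keys]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    print([round(r, 2) for r in successive_ratios(TIMINGS)])
    print(round(loglog_slope(TIMINGS), 2))
```

On this small sample the ratio falls from 4.0 to about 1.65 and the log-log fit gives an exponent a little above 2, so the measured growth looks strongly superlinear and, over this range, closer to a power law than to a fixed doubling per line. Five data points are too few to settle the asymptotic behaviour.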
[Bug rtl-optimization/54269] New: [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269
Bug #: 54269
Summary: [4.8 Regression] memory usage too large when optimizing
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch
Created attachment 28019 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28019 gzipped testcase.
The attached testcase requires roughly 10Gb of resident memory to compile with:
  gfortran -c -O3 -funroll-all-loops -march=native -ffree-form -D__LIBINT hfx_contraction_methods.F
using current trunk. I believe this is a recent regression in trunk. 4.7 needs 500Mb. From a very quick gdb session, I guess this is some rtl thing.
[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269 --- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-15 09:57:13 UTC --- seems like it is triggered by unrolling, using
  gfortran -O2 -funroll-loops -ffree-form -D__LIBINT hfx_contraction_methods.F
is enough. A bt at the first point where memory seems to go up is:
  #1 0x007176de in df_scan_verify () at ../../gcc/gcc/df-scan.c:4540
  #2 0x00706245 in df_verify () at ../../gcc/gcc/df-core.c:1645
  #3 df_analyze () at ../../gcc/gcc/df-core.c:1206
  #4 0x008a211b in iv_analysis_loop_init (loop=0x7f4b0ece63b8) at ../../gcc/gcc/loop-iv.c:299
  #5 0x008a56ba in get_simple_loop_desc (loop=0x7f4b0ece63b8) at ../../gcc/gcc/loop-iv.c:2973
  #6 0x008a8c70 in decide_peel_once_rolling (flags=2) at ../../gcc/gcc/loop-unroll.c:337
  #7 peel_loops_completely (flags=2) at ../../gcc/gcc/loop-unroll.c:248
  #8 unroll_and_peel_loops (flags=2) at ../../gcc/gcc/loop-unroll.c:164
  #9 0x0089cc98 in rtl_unroll_and_peel_loops () at ../../gcc/gcc/loop-init.c:370
[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269 --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-15 10:59:38 UTC --- (In reply to comment #2) Well, that's ENABLE_CHECKING code. Are you sure 4.7 built with --enable-checking=yes does not exhibit this behavior? I'm pretty sure this was not observed 3 weeks ago on trunk. Just to make sure, I'm doing a new trunk build with --enable-checking=no.
[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269 --- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-15 11:37:51 UTC --- (In reply to comment #2) Well, that's ENABLE_CHECKING code. Are you sure 4.7 built with --enable-checking=yes does not exhibit this behavior? it looks like --enable-checking is key. --enable-checking=no leads to about 1Gb, while --enable-checking=yes leads to about 10Gb mem usage.
[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269 --- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-08-16 05:29:46 UTC --- 4.7 configured with --enable-checking=yes also needs 1.0Gb. For a checking-enabled compiler, compile time went from 25s with 4.7 to 1m27s with 4.8.
[Bug middle-end/53852] New: -ftree-loop-linear: large compile time / memory usage
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852
Bug #: 53852
Summary: -ftree-loop-linear: large compile time / memory usage
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch
Current trunk (189233) has X Gb of memory usage (before I have to kill the compilation) on the following testcase with:
  gfortran -O2 -ftree-loop-linear test.f90
  SUBROUTINE build_d_tensor_gks(d1f, d2f, d3f, d4f, d5f, v, d1, d2, d3, d4, d5)
    INTEGER, PARAMETER :: dp=8
    REAL(KIND=dp), DIMENSION(3), INTENT(OUT) :: d1f
    REAL(KIND=dp), DIMENSION(3, 3), INTENT(OUT):: d2f
    REAL(KIND=dp), DIMENSION(3, 3, 3), INTENT(OUT):: d3f
    REAL(KIND=dp), DIMENSION(3, 3, 3, 3), INTENT(OUT):: d4f
    REAL(KIND=dp), DIMENSION(3, 3, 3, 3, 3), INTENT(OUT), OPTIONAL :: d5f
    REAL(KIND=dp), DIMENSION(3), INTENT(IN) :: v
    REAL(KIND=dp), INTENT(IN):: d1, d2, d3, d4
    REAL(KIND=dp), INTENT(IN), OPTIONAL :: d5
    INTEGER :: k1, k2, k3, k4, k5
    REAL(KIND=dp):: w
    d1f = 0.0_dp
    d2f = 0.0_dp
    d3f = 0.0_dp
    d4f = 0.0_dp
    DO k1=1,3
      d1f(k1)=d1f(k1)+v(k1)*d1
    ENDDO
    DO k1=1,3
      DO k2=1,3
        d2f(k2,k1)=d2f(k2,k1)+v(k1)*v(k2)*d2
      ENDDO
      d2f(k1,k1)=d2f(k1,k1)+ d1
    ENDDO
    DO k1=1,3
      DO k2=1,3
        DO k3=1,3
          d3f(k3,k2,k1)=d3f(k3,k2,k1)+v(k1)*v(k2)*v(k3)*d3
        ENDDO
        w=v(k1)*d2
        d3f(k1,k2,k2)=d3f(k1,k2,k2)+w
        d3f(k2,k1,k2)=d3f(k2,k1,k2)+w
        d3f(k2,k2,k1)=d3f(k2,k2,k1)+w
      ENDDO
    ENDDO
    DO k1=1,3
      DO k2=1,3
        DO k3=1,3
          DO k4=1,3
            d4f(k4,k3,k2,k1)=d4f(k4,k3,k2,k1)+ v(k1)*v(k2)*v(k3)*v(k4)*d4
          ENDDO
          w=v(k1)*v(k2)*d3
          d4f(k1,k2,k3,k3)=d4f(k1,k2,k3,k3)+w
          d4f(k1,k3,k2,k3)=d4f(k1,k3,k2,k3)+w
          d4f(k3,k1,k2,k3)=d4f(k3,k1,k2,k3)+w
          d4f(k1,k3,k3,k2)=d4f(k1,k3,k3,k2)+w
          d4f(k3,k1,k3,k2)=d4f(k3,k1,k3,k2)+w
          d4f(k3,k3,k1,k2)=d4f(k3,k3,k1,k2)+w
        ENDDO
        d4f(k1,k1,k2,k2)=d4f(k1,k1,k2,k2)+d2
        d4f(k1,k2,k1,k2)=d4f(k1,k2,k1,k2)+d2
        d4f(k1,k2,k2,k1)=d4f(k1,k2,k2,k1)+d2
      ENDDO
    ENDDO
    IF (PRESENT(d5f).AND.PRESENT(d5)) THEN
      d5f = 0.0_dp
      DO k1=1,3
        DO k2=1,3
          DO k3=1,3
            DO k4=1,3
              DO k5=1,3
                d5f(k5,k4,k3,k2,k1)=d5f(k5,k4,k3,k2,k1)+ v(k1)*v(k2)*v(k3)*v(k4)*v(k5)*d5
              ENDDO
              w=v(k1)*v(k2)*v(k3)*d4
              d5f(k1,k2,k3,k4,k4)=d5f(k1,k2,k3,k4,k4)+w
              d5f(k1,k2,k4,k3,k4)=d5f(k1,k2,k4,k3,k4)+w
              d5f(k1,k4,k2,k3,k4)=d5f(k1,k4,k2,k3,k4)+w
              d5f(k4,k1,k2,k3,k4)=d5f(k4,k1,k2,k3,k4)+w
              d5f(k1,k2,k4,k4,k3)=d5f(k1,k2,k4,k4,k3)+w
              d5f(k1,k4,k2,k4,k3)=d5f(k1,k4,k2,k4,k3)+w
              d5f(k4,k1,k2,k4,k3)=d5f(k4,k1,k2,k4,k3)+w
              d5f(k1,k4,k4,k2,k3)=d5f(k1,k4,k4,k2,k3)+w
              d5f(k4,k1,k4,k2,k3)=d5f(k4,k1,k4,k2,k3)+w
              d5f(k4,k4,k1,k2,k3)=d5f(k4,k4,k1,k2,k3)+w
            ENDDO
            w=v(k1)*d3
            d5f(k1,k2,k2,k3,k3)=d5f(k1,k2,k2,k3,k3)+w
            d5f(k1,k2,k3,k2,k3)=d5f(k1,k2,k3,k2,k3)+w
            d5f(k1,k2,k3,k3,k2)=d5f(k1,k2,k3,k3,k2)+w
            d5f(k2,k1,k2,k3,k3)=d5f(k2,k1,k2,k3,k3)+w
            d5f(k2,k1,k3,k2,k3)=d5f(k2,k1,k3,k2,k3)+w
            d5f(k2,k1,k3,k3,k2)=d5f(k2,k1,k3,k3,k2)+w
            d5f(k2,k2,k1,k3,k3)=d5f(k2,k2,k1,k3,k3)+w
            d5f(k2,k3,k1,k2,k3)=d5f(k2,k3,k1,k2,k3)+w
            d5f(k2,k3,k1,k3,k2)=d5f(k2,k3,k1,k3,k2)+w
            d5f(k2,k2,k3,k1,k3)=d5f(k2,k2,k3,k1,k3)+w
            d5f(k2,k3,k2,k1,k3)=d5f(k2,k3,k2,k1,k3)+w
            d5f(k2,k3,k3,k1,k2)=d5f(k2,k3,k3,k1,k2)+w
            d5f(k2,k2,k3,k3,k1)=d5f(k2,k2,k3,k3,k1)+w
            d5f(k2,k3,k2,k3,k1)=d5f(k2,k3,k2,k3,k1)+w
            d5f(k2,k3,k3,k2,k1)=d5f(k2,k3,k3,k2,k1)+w
          ENDDO
        ENDDO
      ENDDO
    END IF
  END SUBROUTINE build_d_tensor_gks
[Bug middle-end/53852] -ftree-loop-linear: large compile time / memory usage
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852 --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-07-04 12:17:47 UTC --- To fill in the X, 130 Gb is not sufficient for this testcase.
[Bug bootstrap/53835] New: in tree isl / cloog build fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53835
Bug #: 53835
Summary: in tree isl / cloog build fails
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: bootstrap
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch
After downloading cloog and isl from gcc/infrastructure and putting them in-tree, a bootstrap fails with the errors below. Executing make in obj/cloog goes fine.
  make[3]: Entering directory `/data/vjoost/gnu/gcc_trunk/obj/cloog'
  Making all in .
  checking for ANSI C header files...
  make[4]: Entering directory `/data/vjoost/gnu/gcc_trunk/obj/cloog'
  CC libcloog_isl_la-block.lo
  CC libcloog_isl_la-clast.lo
  CC libcloog_isl_la-matrix.lo
  CC libcloog_isl_la-state.lo
  CC libcloog_isl_la-input.lo
  CC libcloog_isl_la-int.lo
  CC libcloog_isl_la-loop.lo
  CC libcloog_isl_la-names.lo
  CC libcloog_isl_la-options.lo
  CC libcloog_isl_la-pprint.lo
  CC libcloog_isl_la-program.lo
  CC libcloog_isl_la-union_domain.lo
  CC libcloog_isl_la-statement.lo
  CC libcloog_isl_la-stride.lo
  CC libcloog_isl_la-domain.lo
  CC libcloog_isl_la-backend.lo
  CC libcloog_isl_la-version.lo
  CC libcloog_isl_la-constraints.lo
  CC cloog.o
  In file included from ../../gcc/cloog/include/cloog/isl/constraintset.h:4:0,
  from ../../gcc/cloog/include/cloog/isl/cloog.h:9,
  from ../../gcc/cloog/source/isl/backend.c:1:
  ../../gcc/cloog/include/cloog/isl/backend.h:4:28: fatal error: isl/constraint.h: No such file or directory
  compilation terminated.
  make[4]: *** [libcloog_isl_la-backend.lo] Error 1
  make[4]: *** Waiting for unfinished jobs
  In file included from ../../gcc/cloog/include/cloog/isl/constraintset.h:4:0,
  from ../../gcc/cloog/include/cloog/isl/cloog.h:9,
  from ../../gcc/cloog/source/isl/constraints.c:4:
  ../../gcc/cloog/include/cloog/isl/backend.h:4:28: fatal error: isl/constraint.h: No such file or directory
[Bug tree-optimization/51179] poor vectorization on interlagos.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179 --- Comment #11 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-30 11:26:59 UTC --- It looks like this problem is solved in the current 4.7 and 4.8 branches. At least on an avx machine, the best performance found by the code in comment #4 jumps from 5.3Gflops in 4.6 to 13.9Gflops in 4.7/4.8. Great work. I can't test this right now on interlagos, but I guess this could be OK as well.
[Bug tree-optimization/47657] missed vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47657 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-30 13:34:24 UTC --- performance seems good on 4.8
[Bug middle-end/47341] unnecessary versioning in the vectorizer.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47341 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Last reconfirmed|2011-01-18 11:21:06 |2012-06-30 11:21:06 --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-30 13:39:57 UTC --- versioning still happens with 4.8
[Bug libfortran/51119] MATMUL slow for large matrices
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added CC||Joost.VandeVondele at mat ||dot ethz.ch --- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 07:19:03 UTC --- (In reply to comment #7) (In reply to comment #6) Janne, have you had a chance to look at this ? For larger matrices MATMUL is really slow. Anything that includes even the most basic blocking scheme should be faster. I think this would be a valuable improvement. I implemented a block-panel multiplication algorithm similar to GOTO BLAS and Eigen, but I got side-tracked by other things and never found the time to fix the corner-case bugs and tune performance. IIRC I reached about 30-40 % of peak flops which was a bit disappointing. I think 30% of peak is a good improvement over the current version (which reaches 7% of peak (92% for MKL) for a double precision 8000x8000 matrix multiplication) on a sandy bridge. In addition to blocking, is the Fortran runtime being compiled with a set of compile options that enables vectorization ? In the ideal world, gcc would recognize the loop pattern in the runtime library code, and do blocking, vectorization etc. automagically.
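As an illustration of what "the most basic blocking scheme" could look like, here is a minimal Python sketch of loop tiling for matrix multiplication. It is not the libgfortran implementation and not the block-panel algorithm mentioned in the comment; the function names and the block size are arbitrary assumptions, and pure Python will not show the cache benefit. It only shows the loop structure and that the tiled version computes the same product.

```python
def matmul_naive(a, b):
    """Reference triple loop: c[i][j] = sum over p of a[i][p] * b[p][j]."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += a[i][p] * b[p][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, bs=2):
    """Same product, iterating over bs x bs tiles so that each tile of
    a, b and c is reused while it is (ideally) still in cache."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for ii in range(0, n, bs):
        for pp in range(0, k, bs):
            for jj in range(0, m, bs):
                for i in range(ii, min(ii + bs, n)):
                    for p in range(pp, min(pp + bs, k)):
                        aip = a[i][p]
                        for j in range(jj, min(jj + bs, m)):
                            c[i][j] += aip * b[p][j]
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    b = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    print(matmul_blocked(a, b))
```

In a compiled language the block size would be tuned to the cache sizes of the target machine, which is the tuning step the comment describes as unfinished.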
[Bug middle-end/40194] fortran rules for optimizing
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40194 --- Comment #10 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 14:14:16 UTC --- this testcase now looks optimized (at least the optimized dump contains return 1; as expected). I guess this can be closed ?
[Bug middle-end/40282] ICE with -fipa-type-escape
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40282 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||WONTFIX --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 14:22:34 UTC --- ipa-type-escape has long been removed.
[Bug middle-end/41453] use INTENT(out) for optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41453 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Last reconfirmed||2012-06-29 --- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 14:25:46 UTC --- still happens on 4.8 trunk
[Bug libgomp/41737] [omp] missing error for undeclared variable in a parallel region with default(none)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41737 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Last reconfirmed||2012-06-29 --- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 14:34:00 UTC --- simplified testcase, for 4.8:
  INTEGER :: ip,np
  !$omp parallel do default(none)
  DO ip=0,np
  ENDDO
  !$omp end parallel do
  END
while it is OK for ip to have no explicit attribute, I believe the standard requires one for np. Intel ifort gives:
  est.f90(3): error #6752: Since the OpenMP* DEFAULT(NONE) clause applies, the PRIVATE, SHARED, REDUCTION, FIRSTPRIVATE, or LASTPRIVATE attribute must be explicitly specified for every variable. [NP]
[Bug middle-end/47298] -O3 destroys beautifully vectorized code obtained at -O2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47298 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Last reconfirmed||2012-06-29 --- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 14:44:05 UTC --- on 4.8 this still is not handled optimally. I get 4.3s for: gfortran -O2 -funroll-loops -ftree-vectorize -ffast-math -march=native 6.7s for: gfortran -O3 -funroll-loops -ftree-vectorize -ffast-math -march=native so more than 50% slowdown going from -O2 to -O3 on -march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=32 --param l1-cache-line-size=64
[Bug tree-optimization/34940] contained subroutines called only once are not inlined
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34940 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Last reconfirmed|2008-01-23 11:27:01 |2012-06-29 11:27:01 --- Comment #15 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 14:52:44 UTC --- no inlining with 4.8 either
[Bug libgomp/41737] [omp] missing error for undeclared variable in a parallel region with default(none)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41737 Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE --- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 18:46:13 UTC --- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46532 *** This bug has been marked as a duplicate of bug 46532 ***
[Bug fortran/46532] [OMP] missing error for loop bounds missing an attribute
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46532 --- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-29 18:46:13 UTC --- *** Bug 41737 has been marked as a duplicate of this bug. ***
[Bug libfortran/51119] MATMUL slow for large matrices
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119 --- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-28 11:58:20 UTC --- Janne, have you had a chance to look at this ? For larger matrices MATMUL is really slow. Anything that includes even the most basic blocking scheme should be faster. I think this would be a valuable improvement.
[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474 --- Comment #60 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-15 15:26:20 UTC --- (In reply to comment #59) There should be no compile performance problems in expand anymore. The alias stmt walker as used from IPA remains a problem, though. Thanks... expand is now indeed essentially gone from the timing report.
  gfortran -ftime-report -ffree-line-length-512 -g -c testcase.f90
Execution times (seconds)
  phase setup : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 243 kB ( 0%) ggc
  phase parsing : 3.57 ( 9%) usr 0.06 ( 7%) sys 3.63 ( 9%) wall 47592 kB ( 7%) ggc
  phase cgraph : 36.49 (91%) usr 0.86 (93%) sys 37.34 (91%) wall 647436 kB (93%) ggc
  phase generate : 36.50 (91%) usr 0.86 (93%) sys 37.36 (91%) wall 647838 kB (93%) ggc
  garbage collection : 1.04 ( 3%) usr 0.00 ( 0%) sys 1.04 ( 3%) wall 0 kB ( 0%) ggc
  callgraph construction : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall 15909 kB ( 2%) ggc
  callgraph optimization : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 201 kB ( 0%) ggc
  cfg construction : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 7 kB ( 0%) ggc
  cfg cleanup : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
  CFG verifier : 1.26 ( 3%) usr 0.00 ( 0%) sys 1.25 ( 3%) wall 0 kB ( 0%) ggc
  trivially dead code : 0.43 ( 1%) usr 0.00 ( 0%) sys 0.41 ( 1%) wall 0 kB ( 0%) ggc
  df scan insns : 0.98 ( 2%) usr 0.24 (26%) sys 1.24 ( 3%) wall 11 kB ( 0%) ggc
  df live regs : 0.58 ( 1%) usr 0.01 ( 1%) sys 0.57 ( 1%) wall 0 kB ( 0%) ggc
  df reg dead/unused notes : 0.43 ( 1%) usr 0.01 ( 1%) sys 0.45 ( 1%) wall 19416 kB ( 3%) ggc
  register information : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.18 ( 0%) wall 0 kB ( 0%) ggc
  alias analysis : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall 8337 kB ( 1%) ggc
  rebuild jump labels : 0.22 ( 1%) usr 0.00 ( 0%) sys 0.21 ( 1%) wall 0 kB ( 0%) ggc
  parser (global) : 3.57 ( 9%) usr 0.06 ( 7%) sys 3.63 ( 9%) wall 47587 kB ( 7%) ggc
  inline heuristics : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 54 kB ( 0%) ggc
  tree gimplify : 0.51 ( 1%) usr 0.01 ( 1%) sys 0.51 ( 1%) wall 26304 kB ( 4%) ggc
  tree eh : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 39 kB ( 0%) ggc
  tree CFG construction : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 190 kB ( 0%) ggc
  tree CFG cleanup : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
  tree find ref. vars : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 3263 kB ( 0%) ggc
  tree PHI insertion : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
  tree SSA other : 0.01 ( 0%) usr 0.01 ( 1%) sys 0.02 ( 0%) wall 18 kB ( 0%) ggc
  tree operand scan : 0.03 ( 0%) usr 0.03 ( 3%) sys 0.05 ( 0%) wall 118 kB ( 0%) ggc
  tree SSA verifier : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
  tree STMT verifier : 0.56 ( 1%) usr 0.05 ( 5%) sys 0.63 ( 2%) wall 0 kB ( 0%) ggc
  callgraph verifier : 0.25 ( 1%) usr 0.00 ( 0%) sys 0.27 ( 1%) wall 0 kB ( 0%) ggc
  out of ssa : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
  expand vars : 1.02 ( 3%) usr 0.02 ( 2%) sys 1.03 ( 3%) wall 10086 kB ( 1%) ggc
  expand : 2.03 ( 5%) usr 0.12 (13%) sys 2.18 ( 5%) wall 249774 kB (36%) ggc
  post expand cleanups : 0.14 ( 0%) usr 0.01 ( 1%) sys 0.14 ( 0%) wall 1744 kB ( 0%) ggc
  integrated RA : 10.75 (27%) usr 0.15 (16%) sys 10.93 (27%) wall 128826 kB (19%) ggc
  reload : 5.56 (14%) usr 0.16 (17%) sys 5.77 (14%) wall 123587 kB (18%) ggc
  thread pro- & epilogue : 2.65 ( 7%) usr 0.00 ( 0%) sys 2.64 ( 6%) wall 198 kB ( 0%) ggc
  machine dep reorg : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 0 kB ( 0%) ggc
  final : 3.11 ( 8%) usr 0.04 ( 4%) sys 3.15 ( 8%) wall 7227 kB ( 1%) ggc
  symout : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 4914 kB ( 1%) ggc
  rest of compilation : 2.46 ( 6%) usr 0.00 ( 0%) sys 2.39 ( 6%) wall 47578 kB ( 7%) ggc
  unaccounted todo : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
  verify RTL sharing : 1.49 ( 4%) usr 0.00 ( 0%) sys 1.48 ( 4%) wall 0 kB ( 0%) ggc
  TOTAL : 40.09 0.92 41.02 695674 kB
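Picking the biggest consumers out of such a report by eye is tedious, so a small script can help. The sketch below is a hypothetical helper, not a GCC tool: it assumes the report is available with one pass per line, as gfortran prints it, in the row layout quoted in this comment (name, then usr, sys, wall and ggc columns). Rows that do not fit that layout, such as the TOTAL line, are skipped.

```python
import re

ROW_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall\s+"
    r"(?P<kb>\d+)\s*kB\s*\(\s*\d+%\)\s*ggc\s*$"
)


def parse_time_report(text):
    """Return one dict per pass row found in a -ftime-report listing."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:
            rows.append({
                "name": match.group("name"),
                "usr": float(match.group("usr")),
                "sys": float(match.group("sys")),
                "wall": float(match.group("wall")),
                "kb": int(match.group("kb")),
            })
    return rows


def top_consumers(text, count=3):
    """Names of the passes with the largest user time, skipping the
    aggregate 'phase ...' rows."""
    rows = [r for r in parse_time_report(text)
            if not r["name"].startswith("phase ")]
    rows.sort(key=lambda r: r["usr"], reverse=True)
    return [r["name"] for r in rows[:count]]


if __name__ == "__main__":
    sample = """\
 phase parsing : 3.57 ( 9%) usr 0.06 ( 7%) sys 3.63 ( 9%) wall 47592 kB ( 7%) ggc
 expand : 2.03 ( 5%) usr 0.12 (13%) sys 2.18 ( 5%) wall 249774 kB (36%) ggc
 integrated RA : 10.75 (27%) usr 0.15 (16%) sys 10.93 (27%) wall 128826 kB (19%) ggc
 reload : 5.56 (14%) usr 0.16 (17%) sys 5.77 (14%) wall 123587 kB (18%) ggc
 TOTAL : 40.09 0.92 41.02 695674 kB
"""
    print(top_consumers(sample, 2))
```

Applied to the rows above, the register allocator and reload come out on top, while expand accounts for only a few percent of the user time.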
[Bug tree-optimization/53081] memcpy/memset loop recognition
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53081

--- Comment #12 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-06 11:32:08 UTC ---
It doesn't quite seem to work for this simple Fortran testcase yet

SUBROUTINE S(a,N)
  INTEGER :: N,a(N)
  a=1
END SUBROUTINE S

(works for memset to 0)
[Bug tree-optimization/53081] memcpy/memset loop recognition
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53081

--- Comment #14 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-06 11:54:22 UTC ---
(In reply to comment #13)
> Well, you can't transform this to a memset ;)

blush things work as advertised for correct testcases... thanks!
[Bug fortran/53521] [4.5/4.6/4.7 Regression] Memory leak with zero-sized array constructor
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.5/4.6/4.7/4.8            |[4.5/4.6/4.7 Regression]
                   |Regression] Memory leak     |Memory leak with zero-sized
                   |with zero-sized array       |array constructor
                   |constructor                 |
      Known to fail|4.8.0                       |

--- Comment #9 from Tobias Burnus burnus at gcc dot gnu.org 2012-05-31 14:28:46 UTC ---
Author: burnus
Date: Thu May 31 14:28:41 2012
New Revision: 188062

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188062
Log:
2012-05-31 Tobias Burnus bur...@net-b.de

        PR fortran/53521
        * trans.c (gfc_deallocate_scalar_with_status): Properly
        handle the case size == 0.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/trans.c

--- Comment #10 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-06-01 07:42:41 UTC ---
Thanks Tobias... this fixes the issue I saw for CP2K. From some further tests I did, I couldn't see any negative side effects. I had a look at other gcc branches; the patch applies flawlessly to 4.7 and 4.6 (I did not have a 4.5 branch around). I would be very happy to see it integrated in 4.7.1 and 4.6.4, as it is nearly impossible to fully code around this in CP2K. Array constructors are used heavily, and it is hard to guess which ones could be zero-sized.
[Bug fortran/53521] [4.5/4.6/4.7/4.8 Regression] Zero-byte memory leak with zero-sized array constructor (valgrind warning)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-05-30 12:31:18 UTC ---
(In reply to comment #2)
> Well, I think this is a valgrind issue and not a real leak. Whether you want
> to optimize code for the non-NULL case by omitting the NULL check is another
> question of course. It's definitely not wrong-code IMHO.

No, definitely a real bug... not a valgrind issue. If you put a loop around 'CALL T2', the process memory usage reaches 1 GB within a few seconds. This is a real issue which causes our simulation code to crash after 24h of running.
[Bug fortran/53521] [4.5/4.6/4.7/4.8 Regression] Zero-byte memory leak with zero-sized array constructor (valgrind warning)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-05-30 14:37:09 UTC ---
(In reply to comment #4)
> You say not doing free (0) leaks memory? What OS is this on?

I'm observing on a Linux box that:

MODULE TEST
  IMPLICIT NONE
CONTAINS
  SUBROUTINE T(n1)
    INTEGER :: n1(:)
  END SUBROUTINE T
  SUBROUTINE T2(n)
    INTEGER :: n
    INTEGER :: k
    CALL T((/(k,k=1,n-1)/))
  END SUBROUTINE
END MODULE

USE TEST
DO
  CALL T2(1)
ENDDO
END

needs 25 GB of memory after a while (notice the endless loop around CALL T2).
[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

--- Comment #53 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-05-29 07:45:36 UTC ---
For the original testcase, trunk (gcc version 4.8.0 20120516 (experimental) [trunk revision 187595] (GCC)) gives very reasonable times (1min) at -O0, but is pretty slow (20min) at -O2. At -O2, all the time goes to 'alias stmt walking : 826.02'. Time reports below:

gfortran -ftime-report -ffree-line-length-512 -g -c testcase.f90

Execution times (seconds)
 phase setup : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 243 kB ( 0%) ggc
 phase parsing : 3.59 ( 6%) usr 0.05 ( 5%) sys 3.64 ( 6%) wall 47592 kB ( 7%) ggc
 phase cgraph: 60.02 (94%) usr 0.90 (95%) sys 60.94 (94%) wall 649547 kB (93%) ggc
 phase generate : 60.03 (94%) usr 0.90 (95%) sys 60.95 (94%) wall 649948 kB (93%) ggc
 garbage collection : 1.04 ( 2%) usr 0.00 ( 0%) sys 1.04 ( 2%) wall 0 kB ( 0%) ggc
 callgraph construction : 0.18 ( 0%) usr 0.01 ( 1%) sys 0.20 ( 0%) wall 15909 kB ( 2%) ggc
 callgraph optimization : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 201 kB ( 0%) ggc
 cfg construction: 0.08 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 7 kB ( 0%) ggc
 cfg cleanup : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
 CFG verifier: 1.16 ( 2%) usr 0.00 ( 0%) sys 1.18 ( 2%) wall 0 kB ( 0%) ggc
 trivially dead code : 0.34 ( 1%) usr 0.00 ( 0%) sys 0.35 ( 1%) wall 0 kB ( 0%) ggc
 df scan insns : 1.00 ( 2%) usr 0.25 (26%) sys 1.23 ( 2%) wall 11 kB ( 0%) ggc
 df live regs: 0.46 ( 1%) usr 0.00 ( 0%) sys 0.49 ( 1%) wall 0 kB ( 0%) ggc
 df reg dead/unused notes: 0.45 ( 1%) usr 0.01 ( 1%) sys 0.47 ( 1%) wall 19416 kB ( 3%) ggc
 register information: 0.20 ( 0%) usr 0.01 ( 1%) sys 0.19 ( 0%) wall 0 kB ( 0%) ggc
 alias analysis : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 8336 kB ( 1%) ggc
 rebuild jump labels : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 0 kB ( 0%) ggc
 parser (global) : 3.59 ( 6%) usr 0.05 ( 5%) sys 3.64 ( 6%) wall 47587 kB ( 7%) ggc
 inline heuristics : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 54 kB ( 0%) ggc
 tree gimplify : 0.48 ( 1%) usr 0.01 ( 1%) sys 0.49 ( 1%) wall 26304 kB ( 4%) ggc
 tree eh : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 39 kB ( 0%) ggc
 tree CFG construction : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 190 kB ( 0%) ggc
 tree find ref. vars : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 3263 kB ( 0%) ggc
 tree PHI insertion : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
 tree SSA rewrite: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 43 kB ( 0%) ggc
 tree SSA other : 0.04 ( 0%) usr 0.02 ( 2%) sys 0.01 ( 0%) wall 18 kB ( 0%) ggc
 tree operand scan : 0.01 ( 0%) usr 0.01 ( 1%) sys 0.06 ( 0%) wall 118 kB ( 0%) ggc
 tree SSA verifier : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
 tree STMT verifier : 0.58 ( 1%) usr 0.06 ( 6%) sys 0.62 ( 1%) wall 0 kB ( 0%) ggc
 callgraph verifier : 0.28 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall 0 kB ( 0%) ggc
 out of ssa : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
 expand vars : 21.72 (34%) usr 0.02 ( 2%) sys 21.74 (34%) wall 10086 kB ( 1%) ggc
 expand : 6.18 (10%) usr 0.15 (16%) sys 6.31 (10%) wall 251886 kB (36%) ggc
 post expand cleanups: 0.14 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 1744 kB ( 0%) ggc
 integrated RA : 10.75 (17%) usr 0.16 (17%) sys 10.87 (17%) wall 128826 kB (18%) ggc
 reload : 5.72 ( 9%) usr 0.15 (16%) sys 5.92 ( 9%) wall 123587 kB (18%) ggc
 thread pro- & epilogue : 2.51 ( 4%) usr 0.00 ( 0%) sys 2.50 ( 4%) wall 198 kB ( 0%) ggc
 machine dep reorg : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
 final : 2.61 ( 4%) usr 0.04 ( 4%) sys 2.65 ( 4%) wall 7227 kB ( 1%) ggc
 symout : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 4914 kB ( 1%) ggc
 rest of compilation : 2.36 ( 4%) usr 0.00 ( 0%) sys 2.35 ( 4%) wall 47578 kB ( 7%) ggc
 verify RTL sharing : 1.02 ( 2%) usr 0.00 ( 0%) sys 1.04 ( 2%) wall 0 kB ( 0%) ggc
 TOTAL : 63.65 0.95 64.62 697784 kB

gfortran -ftime-report -ffree-line-length-512 -O2 -g -c testcase.f90

Execution times (seconds)
 phase setup : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0
[Bug fortran/53521] New: Memory leak with zero sized array constructor
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

Bug #: 53521
Summary: Memory leak with zero sized array constructor
Classification: Unclassified
Product: gcc
Version: 4.6.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

The following testcase (as reduced from CP2K) leaks memory when compiled with gfortran 4.6 - 4.8:

MODULE TEST
  IMPLICIT NONE
CONTAINS
  SUBROUTINE T(n1)
    INTEGER :: n1(:)
  END SUBROUTINE T
  SUBROUTINE T2(n)
    INTEGER :: n
    INTEGER :: k
    CALL T((/(k,k=1,n-1)/))
  END SUBROUTINE
END MODULE

USE TEST
CALL T2(1)
END

as can be verified with valgrind or by putting a loop around call t2. The issue seems to be the zero-sized array constructor.
[Bug lto/49700] LTO compile time hog
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-05-08 18:52:12 UTC ---
Trying 4.7.X instead, it actually looks very reasonable now. Using

-flto=jobserver -fuse-linker-plugin -ftime-report -O3 -march=native -ffast-math -g -ffree-form

I get CP2K to build in 4min on a 32-core server. The time report also looks OK. I'll close this PR as fixed (the issue with 4.8 is tracked in PR 45586).
[Bug lto/49700] LTO compile time hog
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-05-07 19:04:29 UTC ---
(In reply to comment #7)
> Has the situation improved?

current trunk LTO seems to fail on CP2K with:

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F: In function ‘propagate_cn_or_em’:
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79093_629 = D.79094_628->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79092_630 = D.79094_628->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_805>
D.79090_632 = D.79094_628->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79093_652 = D.79094_651->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79092_653 = D.79094_651->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_816>
D.79090_655 = D.79094_651->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79093_675 = D.79094_674->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79092_676 = D.79094_674->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_827>
D.79090_678 = D.79094_674->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79093_700 = D.79094_699->orders.data;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79092_701 = D.79094_699->orders.offset;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error: type mismatch in component reference
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)
struct array2_integer(kind=4)
# VUSE <.MEM_838>
D.79090_703 = D.79094_699->orders.dim[1].stride;
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: internal compiler error: verify_gimple failed
 SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
lto-wrapper: gfortran returned 1 exit status
/data/vjoost/gnu/binutils-2.22/install/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status
[Bug middle-end/53217] New: [4.8 Regression] internal compiler error: verify_ssa failed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53217

Bug #: 53217
Summary: [4.8 Regression] internal compiler error: verify_ssa failed
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch
[Bug middle-end/53217] [4.8 Regression] internal compiler error: verify_ssa failed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53217

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-05-03 18:38:27 UTC ---
The following testcase causes an ICE with current trunk (4.8)

MODULE xc_cs1
  INTEGER, PARAMETER :: dp=KIND(0.0D0)
  REAL(KIND=dp), PARAMETER :: a = 0.04918_dp, c = 0.2533_dp, d = 0.349_dp
CONTAINS
  SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho, &
                       npoints, error)
    REAL(KIND=dp), DIMENSION(*), INTENT(INOUT) :: e_rho_rho, e_rho_ndrho, e_ndrho_ndrho
    DO ip = 1, npoints
      IF ( rho(ip) > eps_rho ) THEN
         oc = 1.0_dp/(r*r*r3*r3 + c*g*g)
         d2rF4 = c4p*f13*f23*g**4*r3/r * (193*d*r**5*r3*r3+90*d*d*r**5*r3 &
                 -88*g*g*c*r**3*r3-100*d*d*c*g*g*r*r*r3*r3 &
                 +104*r**6)*od**3*oc**4
         e_rho_rho(ip) = e_rho_rho(ip) + d2F1 + d2rF2 + d2F3 + d2rF4
      END IF
    END DO
  END SUBROUTINE cs1_u_2
END MODULE xc_cs1

gfortran -O1 -ffast-math bug.f90
bug.f90: In function ‘cs1_u_2’:
bug.f90:7:0: error: definition in block 4 follows the use
 SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho, &
 ^
for SSA_NAME: reassocpow.5_24 in statement:
reassocpow.5_99 = __builtin_powi (reassocpow.5_24, 2);
bug.f90:7:0: internal compiler error: verify_ssa failed
 SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho, &
[Bug fortran/52325] unclear error: Unclassifiable statement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52325

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-02-22 06:49:41 UTC ---
(In reply to comment #2)
> Submitted patch (pending review):
> http://gcc.gnu.org/ml/fortran/2012-02/msg00089.html

OK ;-) this would be a significant improvement. I think it is independent, but a better choice for the error message could be 'Symbol %s at %C has an undefined type'. The type could be implicitly or explicitly defined, that doesn't matter so much. For consistency, I believe your proposed message is fine.
[Bug fortran/52325] unclear error: Unclassifiable statement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52325

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2012-02-22 06:53:09 UTC ---
(In reply to comment #2)
> Submitted patch (pending review):
> http://gcc.gnu.org/ml/fortran/2012-02/msg00089.html

and a nitpick... it should be 'non-derived type' instead of 'nonderived type' (unless I got this with the hyphens wrong again).
[Bug libgomp/51298] libgomp team_barrier locking failures
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51298 --- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-12-15 09:44:46 UTC --- similarly, does this only affect power7, or potentially also other targets such as x86_64 (interlagos?)
[Bug lto/51355] [4.7 Regression] cgraph_add_edge_to_call_site_hash, at cgraph.c:765
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51355

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-12-01 15:10:24 UTC ---
yes this is very likely a dup. I'll check again as soon as 51346 is resolved.

*** This bug has been marked as a duplicate of bug 51346 ***
[Bug bootstrap/51346] [4.7 Regression] LTO bootstrap failed with bootstrap-profiled
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51346

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch

--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-12-01 15:10:25 UTC ---
*** Bug 51355 has been marked as a duplicate of this bug. ***
[Bug middle-end/51089] [4.7 Regression] internal compiler error: verify_flow_info failed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51089

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2011-11-30
   Target Milestone|---                         |4.7.0

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-11-30 12:49:32 UTC ---
still fails with current trunk
[Bug lto/51355] New: [4.7 Regression] cgraph_add_edge_to_call_site_hash, at cgraph.c:765
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51355

Bug #: 51355
Summary: [4.7 Regression] cgraph_add_edge_to_call_site_hash, at cgraph.c:765
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch

Building CP2K with LTO has started to fail somewhere in the last 3 weeks.

In function ‘current_build_current’:
lto1: internal compiler error: in cgraph_add_edge_to_call_site_hash, at cgraph.c:765
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make[3]: *** [/dev/shm/vondele/ccPFwJ9D.ltrans2.ltrans.o] Error 1
In function ‘nmr_shift_print’:
lto1: internal compiler error: in cgraph_add_edge_to_call_site_hash, at cgraph.c:765
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make[3]: *** [/dev/shm/vondele/ccPFwJ9D.ltrans10.ltrans.o] Error 1
In function ‘atom_int_setup’:
lto1: internal compiler error: in cgraph_add_edge_to_call_site_hash, at cgraph.c:765
[...]

No other testcase than building full CP2K with the following arch file:

# CC = cc
CPP =
FC = gfortran
LD = gfortran
AR = ar -r
CPPFLAGS =
DFLAGS = -D__GFORTRAN -D__FFTSG -D__FFTW3 -D__LIBINT
FCFLAGS = -flto -ffree-form -cpp $(DFLAGS) -I$(GFORTRAN_INC)
LDFLAGS = $(FCFLAGS) -O3 -march=native -fuse-linker-plugin -flto=jobserver -L/users/vondele/LAPACK/ -L$(GFORTRAN_LIB)
LIBS = -llapack_gfortran_x86 -lblas_gfortran_x86 -lfftw3 -lderiv -lint -lstdc++
OBJECTS_ARCHITECTURE = machine_gfortran.o
[Bug fortran/25708] Module loading is not good at all
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25708

--- Comment #17 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-11-30 19:50:37 UTC ---
Janne's lseek patch:

http://gcc.gnu.org/ml/fortran/2011-11/msg00251.html

has further nice results on CP2K (CP2K_2009-05-01.f90)

Thomas (trunk):

% time     seconds  usecs/call     calls    errors syscall
 92.08    4.963429           0  19557182           lseek
  5.91    0.318514           1    523064           read
  0.61    0.032888           3     11208           munmap
  0.37    0.020212           2     11969       757 open
  0.24    0.012753           1     11212           close
  0.21    0.011314           1     10533        21 stat
  0.17    0.009117           0     25154           mmap
  0.16    0.008425           0     56715           write
  0.15    0.008353           1     12138           brk
  0.05    0.002811           0     11211           fstat
  0.02    0.001068           2       684           rename

Janne (trunk+patch):

% time     seconds  usecs/call     calls    errors syscall
 77.60    1.316715           0   5265206           lseek
  9.12    0.154767           0    466059           read
  4.07    0.069073           0    242658           madvise
  2.77    0.046965           4     11969        74 open
  1.82    0.030845           3     11891           munmap
  1.47    0.024943          36       684           unlink
  0.72    0.012244           1     11895           close
  0.63    0.010689           1     10533        21 stat
  0.56    0.009533           0     56715           write
  0.51    0.008707           0     25837           mmap
  0.40    0.006794           1     12117           brk
  0.15    0.002542           0     11894           fstat
[Bug fortran/25708] Module loading is not good at all
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25708

--- Comment #18 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-12-01 07:29:25 UTC ---
Janne's latest patch now effectively 'removes' lseek:

 26.84    0.108906           0    242658           madvise
 20.12    0.081608 045 read
 19.27    0.078198           0    512288           lseek
 12.33    0.050038          73       684           unlink
  5.99    0.024315           2     11969        74 open
  4.57    0.018544           2     11891           munmap

(512288 down from 19800 a few days ago).
[Bug fortran/40958] module files too large
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40958 --- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 2011-11-28 14:24:02 UTC --- Just for reference, compiling CP2K_2009-05-01.f90 results in 684 modules, stracing yields something like 12000 calls to open, and 148'847'399 calls to lseek. Clearly anything reducing the number of seeks is likely to have a good impact on compile time. For this particular case, caching modules would help a lot as well. However, our usual pattern is to have a single module per file, and all use statements at the top of the module. Caching would be of little help for this style. An efficient encoding of the information in the module would help. The idea of writing the module compressed, and decompressing it as a big string to memory for reading and parsing, seems appealing to me. Concerning a change of format, it would be important to keep one of gfortran's nice features, that is, the ability to use the modification time of the .mod files to avoid recompilation cascades. If .mod files would contain a reference to other .mod files (instead of containing the info directly), this property might be at risk.