[Bug middle-end/55555] [4.8 Regression] miscompilation at -O2 (tree-pre?)

2012-12-01 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch
            Summary|[4.8 Regression]            |[4.8 Regression]
                   |miscompilation at -O2       |miscompilation at -O2
                   |                            |(tree-pre?)



--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-12-01 15:53:17 UTC ---

Using -O2 -fno-tree-pre fixes the testcase.

Using -O1 -ftree-pre leads to an infinite loop at runtime.


[Bug fortran/55469] memory leak on read with istat.ne.0

2012-11-29 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55469



--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-29 10:23:13 UTC ---

Is that for the more complete patch posted here:



http://gcc.gnu.org/ml/fortran/2012-11/msg00083.html



BTW, wrong PR number in that message.


[Bug tree-optimization/55213] vectorizer ignores __restrict__

2012-11-29 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55213



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch



--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-30 07:45:08 UTC ---

Something similar was reported in PR47341 which adds some analysis.


[Bug fortran/51727] Changing module files

2012-11-28 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #33 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-29 07:30:58 UTC ---

(In reply to comment #31)

> As for the backport, I think the patch is absolutely risk-free, and it should
> have been approved for 4.7 even though it doesn't fulfill the formal
> requirements. Please ping the patch in a few weeks so it's not forgotten.



Ping


[Bug fortran/55469] memory leak on read with istat.ne.0

2012-11-26 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55469



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch



--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-26 08:01:27 UTC ---

BTW, would there be a simple workaround ?


[Bug fortran/55469] New: memory leak on read with istat.ne.0

2012-11-25 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55469



 Bug #: 55469

   Summary: memory leak on read with istat.ne.0

Classification: Unclassified

   Product: gcc

   Version: 4.8.0

Status: UNCONFIRMED

  Severity: normal

  Priority: P3

 Component: fortran

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: joost.vandevond...@mat.ethz.ch





The following testcase leads to memory leaks with gfortran 4.5/4.6/4.7/4.8 (as
found by valgrind). 4.1 seems not to leak (but has a couple of warnings).



REAL :: z
INTEGER :: istat
CHARACTER(LEN=3) :: t
t="NVE"
READ (UNIT=t,FMT=*,IOSTAT=istat) z
END



note that istat.NE.0 in this case.



==37422== 300 bytes in 1 blocks are definitely lost in loss record 1 of 1
==37422==    at 0x4A057F4: calloc (vg_replace_malloc.c:593)
==37422==    by 0x4C298FF: _gfortrani_xcalloc (memory.c:56)
==37422==    by 0x4CE2A82: l_push_char.isra.2 (list_read.c:641)
==37422==    by 0x4CE3223: read_real (list_read.c:1634)
==37422==    by 0x4CE504E: _gfortrani_list_formatted_read (list_read.c:1895)


[Bug fortran/55341] New: address-sanitizer and Fortran

2012-11-15 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55341



 Bug #: 55341

   Summary: address-sanitizer and Fortran

Classification: Unclassified

   Product: gcc

   Version: 4.8.0

Status: UNCONFIRMED

  Severity: normal

  Priority: P3

 Component: fortran

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: joost.vandevond...@mat.ethz.ch





Hardly a bug, rather a feature... it seems '-faddress-sanitizer' works with

Fortran seemingly out-of-the-box. Great!



Could it be documented as being for C/C++/Fortran?



Both these testcases work ('fail') as expected:



PROGRAM TEST_ASAN_01
  INTEGER :: A(10)
  i=-1
  A(i)=0
END PROGRAM

PROGRAM TEST_ASAN_02
  INTEGER, POINTER :: x1,x2,x3
  ALLOCATE(X1)
  X2=>X1
  DEALLOCATE(X1)
  X2=0
END PROGRAM


[Bug fortran/55341] address-sanitizer and Fortran

2012-11-15 Thread Joost.VandeVondele at mat dot ethz.ch

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55341

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-15 14:02:47 UTC ---
Trying -faddress-sanitizer on CP2K leads to the following failure:

 cat bug.f90 
MODULE qs_environment_types
  TYPE rt_prop_type
    INTEGER,DIMENSION(:,:),ALLOCATABLE:: orders
  END TYPE rt_prop_type
  TYPE qs_environment_type
    TYPE(rt_prop_type),POINTER:: rtp
  END TYPE qs_environment_type
CONTAINS
  SUBROUTINE set_qs_env(qs_env,rtp)
    TYPE(qs_environment_type), POINTER:: qs_env
    TYPE(rt_prop_type), OPTIONAL, POINTER :: rtp
    IF (PRESENT(rtp)) qs_env%rtp=rtp
  END SUBROUTINE set_qs_env
END MODULE qs_environment_types


 gfortran -O0 -faddress-sanitizer bug.f90 
bug.f90: In function ‘set_qs_env’:
bug.f90:9:0: error: type mismatch in binary expression
   SUBROUTINE set_qs_env(qs_env,rtp)
 ^
unsigned long

integer(kind=8)

unsigned long

_182 = _181 - 1;

bug.f90:9:0: error: type mismatch in binary expression
unsigned long

integer(kind=8)

unsigned long

_206 = _205 - 1;

bug.f90:9:0: internal compiler error: verify_gimple failed
0x9a47ac verify_gimple_in_cfg(function*)
../../gcc/gcc/tree-cfg.c:4728
0x8dad97 execute_function_todo
../../gcc/gcc/passes.c:1979
0x8db75d execute_todo
../../gcc/gcc/passes.c:2008
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.


[Bug fortran/51727] Changing module files

2012-11-09 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #32 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-09 10:05:18 UTC ---

> If you can use the additional free time to walk over to my
> brother's office, then please say 'Hi' to him.  Otherwise the faculty meeting
> will have to do :-)



Let's call it a small world... I will meet him next week.


[Bug tree-optimization/55238] ICE in find_aggregate_values_for_callers_subset, at ipa-cp.c:2908 building zlib

2012-11-08 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55238



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch



--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-08 19:27:22 UTC ---

gfortran -O3 also aborts on this testcase at the same location



MODULE dbcsr_dist_operations
  TYPE dbcsr_type
    LOGICAL :: symmetry
  END TYPE
CONTAINS
  SUBROUTINE get_stored_coordinates_type(matrix, &
      transpose, processor)
    TYPE(dbcsr_type), INTENT(IN)     :: matrix
    LOGICAL, INTENT(INOUT)           :: transpose
    INTEGER, INTENT(OUT), OPTIONAL   :: processor
    LOGICAL :: checker_tr
    IF (PRESENT (processor)) THEN
       IF (matrix%symmetry .AND. checker_tr()) THEN
          processor = dbcsr_distribution_processor ()
       ENDIF
    ENDIF
  END SUBROUTINE get_stored_coordinates_type
  SUBROUTINE get_block_index_type(matrix, transpose)
    TYPE(dbcsr_type), INTENT(IN) :: matrix
    LOGICAL, INTENT(OUT) :: transpose
    transpose = .FALSE.
    CALL get_stored_coordinates_type (matrix, transpose)
  END SUBROUTINE get_block_index_type
END MODULE dbcsr_dist_operations


[Bug fortran/51727] Changing module files

2012-11-08 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #30 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-11-09 07:31:28 UTC ---

(In reply to comment #29)

> I committed the C-only version of the patch as the issues mentioned in comment
> #27 couldn't be addressed before stage3.



Thanks Tobi!



I have been using your C-only patch for a couple of weeks now for the 4.7

branch, and it is greatly improving our edit/compile-cycles. For one of my

students, it yields an effective 10x speedup in building CP2K after a typical

code change, greatly facilitating the programming project he is on. I would

suggest that after a couple of weeks on trunk, this should be reconsidered

again for backporting to 4.7.


[Bug fortran/51727] Changing module files

2012-10-28 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |http://gcc.gnu.org/ml/fortr
                   |                            |an/2012-10/msg00061.html
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch



--- Comment #26 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-28 11:11:19 UTC ---

The patch was posted some time ago, with an OK for trunk:



http://gcc.gnu.org/ml/fortran/2012-10/msg00067.html



Maybe it is a good time to commit before the next stage starts ?


[Bug fortran/55099] New: Surprising 'PROCEDURE attribute conflicts with INTENT attribute' error

2012-10-27 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55099



 Bug #: 55099

   Summary: Surprising 'PROCEDURE attribute conflicts with INTENT

attribute' error

Classification: Unclassified

   Product: gcc

   Version: 4.8.0

Status: UNCONFIRMED

  Severity: normal

  Priority: P3

 Component: fortran

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: joost.vandevond...@mat.ethz.ch





In the following, a surprising (but correct) error message is issued. Maybe it

is possible to improve the wording to point at the other option... looking at

ifort's error message is cheating ;-) but certainly that one helps the

non-expert.



SUBROUTINE S(num_proc_2d)

  INTEGER, INTENT(IN) :: num_proc_2d

  INTEGER :: proc_x,proc_y

  proc_x=num_proc_2d(1) ; proc_x=num_proc_2d(2)

END SUBROUTINE



Error message:



SUBROUTINE S(num_proc_2d)

1

Error: PROCEDURE attribute conflicts with INTENT attribute in 'num_proc_2d' at

(1)


[Bug fortran/55099] Surprising but valid 'PROCEDURE attribute conflicts with INTENT attribute' error

2012-10-27 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55099



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch



--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-27 17:38:34 UTC ---

(In reply to comment #1)



> How about the following (which of course implies that the users didn't intent
> to use an array - if they did, Intel's becomes more helpful.)



Indeed, I was coding this with the intent of declaring it as an array. No
doubt passing an array is much more common than passing a procedure. Note that
Intel's location is also more useful in that case. I think the usefulness of
Intel's message lies in the fact that it suggests the common cause of the
error. Even with the experience I have, I first started to look for a procedure
with the same name as the variable.
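
For reference, a minimal sketch of what the testcase was apparently meant to be, assuming num_proc_2d was intended as a two-element integer array. The DIMENSION(2) attribute, the IMPLICIT NONE and the assignment to proc_y are assumptions for illustration and not part of the original report:

```fortran
! Sketch only: with the dummy argument declared as an array,
! num_proc_2d(1) is an array element reference rather than a
! function reference, so no PROCEDURE attribute is implied.
SUBROUTINE S(num_proc_2d)
  IMPLICIT NONE
  INTEGER, DIMENSION(2), INTENT(IN) :: num_proc_2d  ! assumed extent of 2
  INTEGER :: proc_x, proc_y
  proc_x = num_proc_2d(1)
  proc_y = num_proc_2d(2)
END SUBROUTINE S
```

Without the DIMENSION attribute, the only way the compiler can make sense of num_proc_2d(1) is as a call to a dummy procedure, which is where the conflict with INTENT comes from.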


[Bug rtl-optimization/54991] [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361

2012-10-22 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54991



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch
         Resolution|                            |FIXED



--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-22 08:43:07 UTC ---

verified with several full CP2K builds as fixed.


[Bug fortran/31119] -fbounds-check: Check for presence of optional arguments before bound checking

2012-10-20 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31119



--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-20 14:59:08 UTC ---

(In reply to comment #7)

> Hi,
> can someone fortran aware please double-check that the tests
>
> * gfortran.dg/bounds_check_9.f90: New test.
> * gfortran.dg/bounds_check_fail_2.f90: New test.
>
> do not contain out of bounds access?  I am working on path to bound number of
> loop iterations better based on array accesses and what I see is array A.9
> containing values {1,2} that is accessed in the loop header.
> We bound number of iterations of that loop to 1 (that is one loopback edge
> iteration to walk both parts of the array) and then the testcases start
> failing.
>
> I do not understand the testcase.
> Perhaps the bounds-check instrumentation happens too late or we need to disable
> this logic with -fbounds-check?
>
> Honza



According to me, the first testcase (bounds_check_9.f90) should contain no

out-of-bounds access (at least from the fortran point of view, and also

according to valgrind), while the second testcase (bounds_check_fail_2.f90)

does contain out-of-bounds access (by design). Of course, -fbounds-check is

designed to catch out-of-bounds at runtime (which the second testcase tests).

Of course, fortran programs with out-of-bounds access are not standard

conforming. 



Actually, the situation is a bit bizarre. There are no conforming programs for

which bounds-checking can trigger... all these bounds-checking statements can

be just optimized away :-). That's not quite what the users want... I run

-fbounds-check -O2 quite often. I don't think one should switch off

optimization in the presence of -fbounds-check. Maybe the docs should be

enhanced and mention that bounds checking is most effective at -O0 ?
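
To make the distinction concrete, here is a minimal sketch of a non-conforming program of the kind -fbounds-check is meant to catch at runtime. It is not one of the testsuite files; the program name, the array size and the variable n are made up for illustration:

```fortran
! Sketch only: the loop runs one iteration past the end of a(1:2).
PROGRAM bounds_demo
  IMPLICIT NONE
  INTEGER :: a(2), i, n
  a = (/ 1, 2 /)
  n = 3                ! hypothetical trip count, one larger than SIZE(a)
  DO i = 1, n
    a(i) = 0           ! i = 3 is an out-of-bounds store
  END DO
  PRINT *, a
END PROGRAM bounds_demo
```

Built with gfortran -fbounds-check, one would expect this to stop with a runtime error when i reaches 3. An optimizer that derives the trip count from the array extent may instead assume the loop runs at most twice, which is the interaction discussed above.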


[Bug rtl-optimization/54991] New: [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361

2012-10-19 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54991



 Bug #: 54991

   Summary: [LRA] internal compiler error: in lra_assign, at

lra-assigns.c:1361

Classification: Unclassified

   Product: gcc

   Version: 4.8.0

Status: UNCONFIRMED

  Severity: normal

  Priority: P3

 Component: rtl-optimization

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: joost.vandevond...@mat.ethz.ch





I have tested LRA (20121014 (experimental) [lra revision 192621]) on CP2K and

find the following ICE:



/data/vjoost/gnu/cp2k/cp2k/makefiles/../src/dbcsr_lib/dbcsr_operations.F:4471:0:

internal compiler error: in lra_assign, at lra-assigns.c:1361

   END SUBROUTINE dbcsr_lanczos_extremal_eig

 ^

0x8621b2 lra_assign()

../../gcc/gcc/lra-assigns.c:1361

0x85e2f2 lra(_IO_FILE*)

../../gcc/gcc/lra.c:2309

0x826696 do_reload

../../gcc/gcc/ira.c:4613

0x826696 rest_of_handle_reload

../../gcc/gcc/ira.c:4719

Please submit a full bug report,

with preprocessed source if appropriate.

Please include the complete backtrace with any bug report.

See http://gcc.gnu.org/bugs.html for instructions.

make[1]: *** [dbcsr_operations.o] Error 1

make[1]: Leaving directory

`/data/vjoost/gnu/cp2k/cp2k/obj/gfortran-test28/sopt'

make: *** [build] Error 2



Unfortunately, I don't know how to produce a small testcase, as this is

happening only for a compilation with '-fprofile-use'. I could tar the needed

mod files and .gdca if that is an option.



This is happening on x86_64:

-march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm

-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx

-mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c

-mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx --param l1-cache-size=32 --param

l1-cache-line-size=64 --param l2-cache-size=24576 -mtune=corei7


[Bug rtl-optimization/54991] [LRA] internal compiler error: in lra_assign, at lra-assigns.c:1361

2012-10-19 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54991



--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-19 18:58:31 UTC ---

Created attachment 28495

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28495

testcase, including source, .mod and .gcda files needed. README gives

compilation command needed


[Bug tree-optimization/54967] New: [4.8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:55

2012-10-18 Thread Joost.VandeVondele at mat dot ethz.ch

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54967

 Bug #: 54967
   Summary: [4.8 Regression] ICE in check_loop_closed_ssa_use, at
tree-ssa-loop-manip.c:55
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch


This started failing very recently on trunk :

 gfortran  -c -O2 -funroll-loops bug.f90 
bug.f90: In function ‘calc_s_derivs’:
bug.f90:1:0: internal compiler error: in check_loop_closed_ssa_use, at
tree-ssa-loop-manip.c:557
   SUBROUTINE calc_S_derivs()
 ^
0xa0cf93 check_loop_closed_ssa_use
../../gcc/gcc/tree-ssa-loop-manip.c:557
0xa0d591 check_loop_closed_ssa_stmt
../../gcc/gcc/tree-ssa-loop-manip.c:572
0xa0d591 verify_loop_closed_ssa(bool)
../../gcc/gcc/tree-ssa-loop-manip.c:606
0xa0d880 gimple_duplicate_loop_to_header_edge(loop*, edge_def*, unsigned int,
simple_bitmap_def*, edge_def*, vec_tedge_def***, int)
../../gcc/gcc/tree-ssa-loop-manip.c:762
0xdbd99a try_unroll_loop_completely
../../gcc/gcc/tree-ssa-loop-ivcanon.c:519
0xdbd99a canonicalize_loop_induction_variables
../../gcc/gcc/tree-ssa-loop-ivcanon.c:666
0xdbea10 tree_unroll_loops_completely(bool, bool)
../../gcc/gcc/tree-ssa-loop-ivcanon.c:815

 cat bug.f90 
  SUBROUTINE calc_S_derivs()
    INTEGER, DIMENSION(6, 2)  :: c_map_mat
    INTEGER, DIMENSION(:), POINTER:: C_mat
    DO j=1,3
       DO m=j,3
          n=n+1
          c_map_mat(n,1)=j
          IF(m==j)CYCLE
          c_map_mat(n,2)=m
       END DO
    END DO
    DO m=1,6
       DO j=1,2
          IF(c_map_mat(m,j)==0)CYCLE
          CALL foo(C_mat(c_map_mat(m,j)))
       END DO
    END DO
  END SUBROUTINE calc_S_derivs


[Bug tree-optimization/54967] [4.8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:55

2012-10-18 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54967



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |UNCONFIRMED
                 CC|                            |rguenth at gcc dot gnu.org
     Ever Confirmed|1                           |0



--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-18 08:11:14 UTC ---

I assume it is:



r190978 | rguenth | 2012-09-05 15:29:13 +0200 (Wed, 05 Sep 2012) | 11 lines



2012-09-05  Richard Guenther  rguent...@suse.de


[Bug fortran/51727] Changing module files

2012-10-15 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #25 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-15 14:14:12 UTC ---

Just to provide some additional numbers on how important this patch is for

practical development (and of course to +1 on backports) for a 'typical

code change' on a CP2K tree (add an unused local variable to a subroutine) the

speedup due to avoided recompilation (on a 32 core server) can be obtained from

the following compile timings (repeatable for various tries):



4.6 (unpatched)
real    1m45.064s

4.7 (patched)
real    0m14.958s



I really think this is a pretty substantial bug fix of an existing feature.


[Bug fortran/51727] Changing module files

2012-10-13 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #18 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-13 08:13:14 UTC ---

(In reply to comment #14)

> Created attachment 28425 [details]
> Patch for testing





thanks... now repeated CP2K compiles give identical '.mod's, and of course also

omp_lib.mod is fixed. This will very much improve the user experience for those

working on large code bases.



since you're using C++, I guess a backport to older branches is out of the
question


[Bug fortran/51727] Changing module files

2012-10-13 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #23 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-13 12:28:12 UTC ---

(In reply to comment #22)

> Created attachment 28440 [details]
> patch that doesn't use c++



I've tested the patch with (an older version of) the 4.7 branch, and it works

fine for CP2K.


[Bug fortran/51727] Changing module files

2012-10-13 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #24 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-13 12:45:11 UTC ---

(In reply to comment #23)



> I've tested the patch with (an older version of) the 4.7 branch, and it works
> fine for CP2K.



it doesn't apply cleanly to 4.6, so no testing there unfortunately.


[Bug middle-end/37150] vectorizer misses some loops

2012-10-06 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2009-08-06 07:54:57         |2012-10-06 7:54:57



--- Comment #13 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-06 10:38:57 UTC ---

reconfirming this with current trunk 



ifort:1.02s

gfortran 4.8: 2.01s



gfortran -ffast-math -march=native -O3 -v PR37150.f90



-march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm

-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx

-mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c

-mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx


[Bug fortran/51727] Changing module files

2012-10-06 Thread Joost.VandeVondele at mat dot ethz.ch

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-06 12:42:13 UTC ---
(In reply to comment #3)

 
> 2012-10-06  Tobias Schlüter  t...@gcc.gnu.org
>
> PR fortran/51727
> * module.c (write_generic): Traverse tree in left-to-right order.

I've tested that this patch fixes the problem for omp_lib.mod, so would likely
also fix recompilation cascades..

However, testing it on CP2K I'm finding that compilation fails with this patch
(and passes without), so something seems wrong. The difference between the
generated modules is rather large.


[Bug fortran/51727] Changing module files

2012-10-06 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-06 12:46:36 UTC ---

Created attachment 28373

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28373

bad module


[Bug fortran/51727] Changing module files

2012-10-06 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-06 12:47:19 UTC ---

Created attachment 28374

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28374

good module


[Bug fortran/51727] Changing module files

2012-10-06 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #7 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-06 12:48:39 UTC ---

The main difference between 'good' and 'bad' seems to be the 'header' lines 



bad:

()



(('arch_topology' 'machine_architecture_types' 2))



()



good:

()



(('arch_topology' 'machine_architecture_types' 2) ('ma_mp_type'

'machine_architecture_types' 3) ('ma_process' 'machine_architecture_types'

4) ('machine_output' 'machine_architecture_types' 5) ('thread_inf'

'machine_architecture_types' 6))



()


[Bug fortran/51727] Changing module files

2012-10-06 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-06 12:52:09 UTC ---

(In reply to comment #3)

> Created attachment 28372 [details]
> Candidate patch



actually... looking at the patch, don't you need to deal with the if statements

that return ?


[Bug fortran/51727] Changing module files

2012-10-05 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51727



Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:



           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |simonb at google dot com



--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-05 18:16:41 UTC ---

Also reported here:



http://gcc.gnu.org/ml/gcc/2012-10/msg00075.html



this is the source of recompilation cascades sometimes seen in CP2K as well.



I'm wondering if a very naive hack like sorting .mod content (like in cat
old.mod 1 | sort -s > new.mod) could not paper over this problem sufficiently
well to make it irrelevant in reality.


[Bug rtl-optimization/54751] [4.8 Regression] slow compile time with rtl loop unroller

2012-10-02 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54751



--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-10-02 10:39:41 UTC ---

More reasonable with -enable-checking=release



4.8(checking=yes):~10min

4.8(checking=release):  1min28s.

4.7  :  0min58s  



maybe some of the checking is a bit excessive in this case.


[Bug fortran/54758] New: accessing gcc builtins from fortran

2012-09-30 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54758



 Bug #: 54758

   Summary: accessing gcc builtins from fortran

Classification: Unclassified

   Product: gcc

   Version: 4.8.0

Status: UNCONFIRMED

  Severity: enhancement

  Priority: P3

 Component: fortran

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: joost.vandevond...@mat.ethz.ch





I would like to experiment with prefetching in Fortran code (beyond

-fprefetch-loop-arrays). A convenient way to do this would be access the

gcc_builtins. I tried this the following way:



  INTERFACE
    SUBROUTINE builtin_prefetch(a) BIND(C,name="__builtin_prefetch")
      USE ISO_C_BINDING, ONLY: C_FLOAT
      REAL(KIND=C_FLOAT), dimension(*) :: a
    END SUBROUTINE
  END INTERFACE
  real*4 :: data(100)
  DO i=1,100
     CALL builtin_prefetch(data(i))
     data(i)=0
  ENDDO
END



but it didn't work... 



test.f90:(.text+0x36): undefined reference to `__builtin_prefetch'



no surprise, I guess, but it would be cool if it did.


[Bug fortran/45586] [4.8 Regression] ICE non-trivial conversion at assignment

2012-09-30 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45586



--- Comment #86 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-30 12:30:43 UTC ---

(In reply to comment #84)



LTO might work for many codes, as using allocatables in derived types was not

standard Fortran90 (IIRC) and appears needed to trigger the bug. Anyway, since

most people will use released versions of gcc, this checking error will be

hidden behind --enable-checking=release. Only very few people will be able to

locate and in particular reduce wrong code generation that only happens with

LTO, so I wouldn't expect bug reports for actual wrong code generation very

quickly.



Meanwhile a shorter testcase for 4.8, using gfortran -flto -O0.



  TYPE t

 REAL, DIMENSION(:), ALLOCATABLE :: r

  END TYPE t

  TYPE t_p

 TYPE(t), POINTER :: d_t

  END TYPE t_p



  REAL, DIMENSION(:), POINTER:: d

  TYPE(t_p) ::  x

  d=x%d_t%r

END


[Bug go/54749] New: libbacktrace

2012-09-29 Thread Joost.VandeVondele at mat dot ethz.ch

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54749

 Bug #: 54749
   Summary: libbacktrace
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
AssignedTo: i...@airs.com
ReportedBy: joost.vandevond...@mat.ethz.ch


On a testcase that makes the compiler run out-of-memory (by setting ulimit to 
ulimit -m 8388608
ulimit -v 8388608
ulimit -d 8388608
ulimit -t 600
and running the full testcase of PR53852) I get the following stacktrace, which
is a bit ugly:

GNU MP: Cannot allocate memory (size=8)
In function 'build_d_tensor_gks':
H�D$A��H�H�D$H�H�D$
H�CH�D$(H�kH�Cv,H��H�l$HH�\$@L�d$PL�l$XL�t$`H��h�Aborted
mmap: Cannot allocate memory
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
failed to read executable information
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make[2]: *** [semi_empirical_int_gks.o] Error 1
make[2]: Target `_progr' not remade because of errors.
make[2]: Leaving directory
`/data/vjoost/gnu/cp2k/cp2k/obj/gfortran-test12/sopt'
make[1]: *** [build] Error 2
make[1]: Leaving directory `/data/vjoost/gnu/cp2k/cp2k/makefiles'


[Bug rtl-optimization/54751] New: [4.8 Regression] slow compile time with rtl loop unroller

2012-09-29 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54751



 Bug #: 54751

   Summary: [4.8 Regression] slow compile time with rtl loop

unroller

Classification: Unclassified

   Product: gcc

   Version: 4.8.0

Status: UNCONFIRMED

  Severity: normal

  Priority: P3

 Component: rtl-optimization

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: joost.vandevond...@mat.ethz.ch





Created attachment 28299

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28299

gzipped testcase.



compiling the attached testcase needs ~10x more time with current 4.8 trunk than

with 4.7. I believe this is a recent regression.



A typical stack trace looks like

#0  0x006a7a02 in df_ref_equal_p(df_ref_d*, df_ref_d*) ()
#1  0x006a7af5 in df_refs_verify(vec_t<df_ref_d*>*, df_ref_d**, bool) ()
#2  0x006abf3f in df_insn_refs_verify(df_collection_rec*, basic_block_def*, rtx_def*, bool) ()
#3  0x006aea2a in df_bb_verify(basic_block_def*) ()
#4  0x006aed40 in df_scan_verify() ()
#5  0x0069e155 in df_analyze() ()
#6  0x0083b1dd in iv_analysis_loop_init(loop*) ()
#7  0x0083e685 in get_simple_loop_desc(loop*) ()
#8  0x00841265 in unroll_and_peel_loops(int) ()
#9  0x00835cd7 in rtl_unroll_and_peel_loops() ()
#10 0x00881107 in execute_one_pass(opt_pass*) ()



compile flags:



gfortran -c -cpp -O2 -ftree-vectorize -funroll-loops -ffast-math test.f90

(needs about 10min (gcc 4.8) or 1min (gcc 4.7) on my machine, removing

-funroll-loops reduces that to 1m20s (4.8) or 28s (4.7))


[Bug middle-end/54749] libbacktrace

2012-09-29 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54749



--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-29 17:34:04 UTC ---

(In reply to comment #1)

 You filed this against the go component, but it seems that Go is not

 involved.  Is that right?  This is just about a backtrace printed after a run

 of the Fortran compiler?



yes, unclear what the proper component was for libbacktrace... I didn't

consider this middle end either (and I was under the impression that go and

libbacktrace had something in common).



The problem is not the fact that this particular run crashes, but the fact that

the trace should deal with the mmap out-of-mem more nicely (i.e. one line of

error).


[Bug fortran/45586] [4.8 Regression] ICE non-trivial conversion at assignment

2012-09-26 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45586



--- Comment #83 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-26 06:42:59 UTC ---

Mikael, any progress on this one (BTW, the PR is not yet assigned)? It would be

great to have LTO work with Fortran in 4.8 (especially with all the inlining

improvements). However, I would guess that this is stage 1 material, and I'm

assuming stage 1 is nearing its end.


[Bug tree-optimization/54634] New: [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution

2012-09-20 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54634



             Bug #: 54634
           Summary: [4.8 Regression] miscompilation with -O3
                    -ftree-loop-distribution
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: joost.vandevond...@mat.ethz.ch





Created attachment 28227

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28227

testcase sources



The attached sources are miscompiled with current trunk ([trunk revision

191430]) at -O3  -ftree-loop-distribution. To reproduce 



gfortran -O3  -ftree-loop-distribution  -ffree-form other.F mathconstants.F

orbital_pointers.F orbital_symbols.F orbital_transformation_matrices.F main.F ;

./a.out



which outputs wrong values (as compared to -O0) and shows a valgrind warning

(not present at -O0).



The miscompiled file is orbital_transformation_matrices.F, most likely the

routine create_spherical_harmonics (which seems inlined). If I cat all files into

a single .F file, the error also disappears, which might hint at some ipa thing

?



4.7 branch ([gcc-4_7-branch revision 190437]) is doing fine.


[Bug tree-optimization/54634] [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution

2012-09-20 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54634



--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-20 10:15:57 UTC ---

(In reply to comment #1)

 Retry with PR54629 fix?



after applying the patch mentioned above, the testcase still fails. The failure

is also older than the commit mentioned in PR54629


[Bug tree-optimization/54634] [4.8 Regression] miscompilation with -O3 -ftree-loop-distribution

2012-09-20 Thread Joost.VandeVondele at mat dot ethz.ch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54634



--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-20 13:06:50 UTC ---

(In reply to comment #4)

 Ah, binomial () is pure.



In this case, it was presumably triggered by Tobias' changes for PR54389.

binomial() has not been declared pure in the source, but most likely correctly

declared 'implicitly pure' by the Fortran frontend.


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-13 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #11 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-13 12:31:03 UTC ---
(In reply to comment #10)
 Draft patch (replaces the one in comment 9):
 
 --- a/gcc/fortran/resolve.c
 +++ b/gcc/fortran/resolve.c
 @@ -13567,6 +13572,5 @@ gfc_impure_variable (gfc_symbol *sym)
    proc = sym->ns->proc_name;
 -  if (sym->attr.dummy && gfc_pure (proc)
 -      && ((proc->attr.subroutine && sym->attr.intent == INTENT_IN)
 -          ||
 -          proc->attr.function))
 +  if (sym->attr.dummy
 +      && ((proc->attr.subroutine && sym->attr.intent == INTENT_IN)
 +          || proc->attr.function))
      return 1;

this one fixes the error seen with CP2K.


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-13 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #16 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-14 05:57:51 UTC ---
(In reply to comment #15)
 FIXED on the trunk - and on the 4.6/4.7 branch. Sorry for the breakage!

Thank you and other gcc experts for regularly fixing issues quickly and
professionally, while steadily improving the quality of the compiler!


[Bug fortran/54389] [F2003/F2008 difference] PURE functions and pointer dummy arguments / DECL_PURE_P issue

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54389

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat
   ||dot ethz.ch

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 10:00:46 UTC ---
This revision causes CP2K to produce wrong results at -O1 and above. I don't
have a reduced testcase, other than compiling and building CP2K, but found this
by bisection.


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 11:41:12 UTC ---
the two revisions lead to a lot of changes, all these files differ in their
disassembled form:

  1 admm_methods.o Files f1 and f2 differ
  2 atom_fit.o Files f1 and f2 differ
  3 atom_pseudo.o Files f1 and f2 differ
  9 cp_ddapc_methods.o Files f1 and f2 differ
 10 cp_fm_basic_linalg.o Files f1 and f2 differ
 11 cp_ma_interface.o Files f1 and f2 differ
 12 cp_parser_inpp_methods.o Files f1 and f2 differ
 13 cp_parser_methods.o Files f1 and f2 differ
 14 dbcsr_dist_operations.o Files f1 and f2 differ
 15 dbcsr_example_3.o Files f1 and f2 differ
 16 dbcsr_index_operations.o Files f1 and f2 differ
 17 dbcsr_internal_operations.o Files f1 and f2 differ
 18 dbcsr_iterator_operations.o Files f1 and f2 differ
 19 dbcsr_operations.o Files f1 and f2 differ
 20 dbcsr_performance_multiply.o Files f1 and f2 differ
 21 dbcsr_test_add.o Files f1 and f2 differ
 22 dbcsr_test_methods.o Files f1 and f2 differ
 23 dbcsr_test_multiply.o Files f1 and f2 differ
 24 dbcsr_transformations.o Files f1 and f2 differ
 25 dbcsr_work_operations.o Files f1 and f2 differ
 26 efield_utils.o Files f1 and f2 differ
 27 et_coupling.o Files f1 and f2 differ
 28 f77_interface.o Files f1 and f2 differ
 29 fp_methods.o Files f1 and f2 differ
 30 helium_io.o Files f1 and f2 differ
 31 hfx_types.o Files f1 and f2 differ
 32 input_cp2k.o Files f1 and f2 differ
 33 lgrid_types.o Files f1 and f2 differ
 34 ma_affinity.o Files f1 and f2 differ
 35 mltfftsg.o Files f1 and f2 differ
 36 molsym.o Files f1 and f2 differ
 37 orbital_transformation_matrices.o Files f1 and f2 differ
 38 pair_potential.o Files f1 and f2 differ
 39 parallel_rng_types.o Files f1 and f2 differ
 40 paw_proj_set_types.o Files f1 and f2 differ
 41 preconditioner.o Files f1 and f2 differ
 42 pw_methods.o Files f1 and f2 differ
 43 pw_poisson_methods.o Files f1 and f2 differ
 44 pw_poisson_types.o Files f1 and f2 differ
 45 pw_pool_types.o Files f1 and f2 differ
 46 qs_gspace_mixing.o Files f1 and f2 differ
 47 qs_integrate_potential.o Files f1 and f2 differ
 48 qs_ks_methods.o Files f1 and f2 differ
 49 qs_neighbor_lists.o Files f1 and f2 differ
 50 qs_neighbor_list_types.o Files f1 and f2 differ
 51 qs_rho0_methods.o Files f1 and f2 differ
 52 qs_rho_methods.o Files f1 and f2 differ
 53 qs_scf_block_davidson.o Files f1 and f2 differ
 54 qs_scf_diagonalization.o Files f1 and f2 differ
 55 qs_scf.o Files f1 and f2 differ
 56 qs_vxc.o Files f1 and f2 differ
 57 restraint.o Files f1 and f2 differ
 58 rtp_admm_methods.o Files f1 and f2 differ
 59 rt_propagation_methods.o Files f1 and f2 differ
 60 sap_kind_types.o Files f1 and f2 differ
 61 scp_hartree_1center.o Files f1 and f2 differ
 62 se_core_matrix.o Files f1 and f2 differ
 63 se_fock_matrix_coulomb_ga.o Files f1 and f2 differ
 64 se_fock_matrix_coulomb_mpi.o Files f1 and f2 differ
 65 semi_empirical_expns3_methods.o Files f1 and f2 differ
 66 semi_empirical_par_utils.o Files f1 and f2 differ
 67 task_list_methods.o Files f1 and f2 differ
 68 thermostat_mapping.o Files f1 and f2 differ
 69 xc.o Files f1 and f2 differ
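
A list like the one above could be produced by disassembling each object file
from the two builds and comparing the text. The following is only a sketch of
that idea, not the script actually used for this comment: the helper names
(strip_header, disassemble, differing_objects) and the two-directory layout are
assumptions, and it relies on binutils' objdump being on the PATH.

```python
import os
import subprocess


def strip_header(disassembly):
    """Drop the objdump banner line, which embeds the file path."""
    return "\n".join(
        line for line in disassembly.splitlines() if "file format" not in line
    )


def disassemble(path):
    """Disassemble one object file with binutils' objdump (assumed installed)."""
    result = subprocess.run(
        ["objdump", "-d", path], check=True, capture_output=True, text=True
    )
    return strip_header(result.stdout)


def differing_objects(names, dir_a, dir_b, disasm=disassemble):
    """Return the object names whose disassembly differs between two build trees.

    dir_a and dir_b are hypothetical directories holding the objects built
    with the two compiler revisions; disasm can be swapped out for testing.
    """
    return [
        name
        for name in sorted(names)
        if disasm(os.path.join(dir_a, name)) != disasm(os.path.join(dir_b, name))
    ]
```

Calling differing_objects(os.listdir(dir_a), dir_a, dir_b) on two object
directories would then give the candidates to bisect further, as is done in the
following comments.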


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 20:11:24 UTC ---
some progress.. the object file that leads to wrong results is
parallel_rng_types.o. I'll see if I can get some further insight.


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 20:26:49 UTC ---
(In reply to comment #3)
 (In reply to comment #2)
  some progress.. the object file that leads to wrong results is
  parallel_rng_types.o. I'll see if I can get some further insight.
 
 It seems that - for some reason - IMPLICIT_PURE is only set for functions. (Or
 at least that's the case here for a simple test case.) If you produce a module,
 have a look at the .mod file and search for IMPLICIT_PURE. In my example I have
 something like:
   3 's' 'm' '' 1 ((PROCEDURE [...] FUNCTION IMPLICIT_PURE) [...]
 
 where s is the name of my function and m is the name of the module. Then,
 check whether that procedure could be PURE or has to be IMPURE.

yes, I think from looking at the optimized dumps, I can see that a function
that is called twice in the correct version is called only once in the wrong
version. I think I might be able to reduce it to a testcase. (If you care, the
function is rn53 which calls rn32 only once, so I guess that's the issue).


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 20:46:05 UTC ---
Created attachment 28179
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28179
testcase


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 20:50:40 UTC ---
The testcase illustrates the issue, compiling as

gfortran -c -O1 test.f90 -fdump-tree-optimized

shows that rn32 is only called once from rn53, whereas the proper number would
be 2 or 3. So I guess rn32 is incorrectly marked as pure.


[Bug fortran/54556] [4.8 Regression] Marking implicitly pure variables as DECL_PURE_P leads to wrong code

2012-09-12 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54556

--- Comment #7 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-12 20:58:23 UTC ---
(In reply to comment #6)
 So I guess rn32 is incorrectly marked as pure.

which indeed is also visible in the .mod file:

'rn32' 'parallel_rng_types' '' 1 ((PROCEDURE UNKNOWN-INTENT
MODULE-PROC DECL UNKNOWN 0 0 FUNCTION IMPLICIT_PURE ALWAYS_EXPLICIT)


[Bug fortran/45586] [4.8 Regression] ICE non-trivial conversion at assignment

2012-09-04 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45586

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

URL||http://gcc.gnu.org/ml/fortr
   ||an/2012-08/msg00150.html

--- Comment #82 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-09-04 12:22:12 UTC ---
URL for the current version of the patch added.


[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo

2012-08-28 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

--- Comment #70 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-28 11:28:06 UTC ---
(In reply to comment #69)
 Is there still a problem here?

for current trunk and the original testcase, timings are reasonable at -O0 -O1
-O2, but very long at -O3 (60min):

report.O0.txt: TOTAL :  38.78   0.89   39.67   691166 kB
report.O1.txt: TOTAL :  70.04   1.13   71.22   634523 kB
report.O2.txt: TOTAL : 204.51   1.16  205.71   691522 kB

the biggest consumers are

-O0:  

integrated RA   :  10.36 
reload  :   5.16;

-O1:

tree PTA:   7.77
integrated RA   :  13.36

-O2:

expand vars :  83.15
tree PTA:  35.04

-O3: (also needs about 4Gb of memory)

??? not yet finished (60min)


[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo

2012-08-28 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

--- Comment #71 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-28 14:54:54 UTC ---
Three hours later the -O3 compile is still running and needs 20Gb of RAM. The
issue now seems to be variable_tracking_main:

#0  0x00b7b8ce in dataflow_set_preserve_mem_locs(void**, void*) ()
#1  0x00e76168 in htab_traverse_noresize ()
#2  0x00b770e0 in dataflow_set_clear_at_call(dataflow_set_def*) ()
#3  0x00b7c613 in vt_emit_notes() ()
#4  0x00b847ea in variable_tracking_main() ()
#5  0x008e8acf in execute_one_pass(opt_pass*) ()


[Bug fortran/25708] Module loading is not good at all

2012-08-24 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25708

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Depends on||40958

--- Comment #21 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-24 14:00:40 UTC ---
I did another timing experiment on compiling CP2K. I found that on my server,
compiling with -fsyntax-only is as fast as just compiling at -O0. I believe the
reason for this is that module reading is dominating the compile time. In CP2K
each module is included only once per file, so I think it is the efficiency of
reading the module that matters most. My guess would be that the human readable
format of the .mod file is the source of most inefficiency. Is it still
important to the development of gfortran that the .mod file is in this form ?
If I count the number of times a module is used, and multiply that with the
size, I have about 1Gb of .mod files being parsed per CP2K compile (for about
35Mb of Fortran).
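
The estimate described above (number of USE statements per module times the
size of its .mod file) could be reproduced along these lines. This is a rough
sketch under assumptions, not the procedure actually used: it assumes one USE
per source line, that module foo is stored as foo.mod in a single directory,
and the function names are made up for illustration.

```python
import os
import re
from collections import Counter

# Matches "USE name", "USE :: name" and "USE, INTRINSIC :: name" at line start.
USE_RE = re.compile(r"^\s*use\b(?:\s*,\s*\w+\s*::|\s*::)?\s*(\w+)", re.IGNORECASE)


def count_module_uses(source_texts):
    """Count USE statements per (lower-cased) module name over all sources."""
    counts = Counter()
    for text in source_texts:
        for line in text.splitlines():
            match = USE_RE.match(line)
            if match:
                counts[match.group(1).lower()] += 1
    return counts


def parsed_bytes(use_counts, mod_sizes):
    """Sum of (number of USE statements) x (size of that module's .mod file).

    Modules without a .mod file in mod_sizes (e.g. intrinsic ones) count as 0.
    """
    return sum(n * mod_sizes.get(name, 0) for name, n in use_counts.items())


def estimate(source_dir, mod_dir):
    """Walk source_dir for *.F / *.f90 files and mod_dir for *.mod files."""
    texts = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if name.endswith((".F", ".f90")):
                with open(os.path.join(root, name), errors="replace") as handle:
                    texts.append(handle.read())
    sizes = {
        os.path.splitext(name)[0].lower(): os.path.getsize(os.path.join(mod_dir, name))
        for name in os.listdir(mod_dir)
        if name.endswith(".mod")
    }
    return parsed_bytes(count_module_uses(texts), sizes)
```

Dividing the result of estimate() by the total size of the sources gives the
ratio of module text parsed to Fortran compiled.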


[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing

2012-08-22 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-22 07:40:26 UTC ---
Fixed for current trunk, maybe a dup of PR54332


[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing

2012-08-22 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

--- Comment #7 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-22 07:43:30 UTC ---
Fixed for current trunk, maybe a dup of PR54332


[Bug tree-optimization/53852] [4.8 Regression] -ftree-loop-linear: large compile time / memory usage

2012-08-22 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852

--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-22 11:58:00 UTC ---
simplified testcase and some analysis:

SUBROUTINE  build_d_tensor_gks(d5f,v,d5)
  INTEGER, PARAMETER :: dp=8
  REAL(KIND=dp), DIMENSION(3, 3, 3, 3, 3), &
    INTENT(OUT) :: d5f
  REAL(KIND=dp), DIMENSION(3), INTENT(IN)  :: v
  REAL(KIND=dp), INTENT(IN) :: d5
  INTEGER   :: k1, k2, k3, k4, k5
  REAL(KIND=dp) :: w

  d5f = 0.0_dp
  DO k1=1,3
    DO k2=1,3
      DO k3=1,3
        DO k4=1,3
          DO k5=1,3
            d5f(k5,k4,k3,k2,k1)=d5f(k5,k4,k3,k2,k1)+ &
              v(k1)*v(k2)*v(k3)*v(k4)*v(k5)*d5
          ENDDO
          w=v(k1)*v(k2)*v(k3)*d4
          d5f(k1,k2,k3,k4,k4)=d5f(k1,k2,k3,k4,k4)+w
          d5f(k1,k2,k4,k3,k4)=d5f(k1,k2,k4,k3,k4)+w
          d5f(k1,k4,k2,k3,k4)=d5f(k1,k4,k2,k3,k4)+w
          d5f(k4,k1,k2,k3,k4)=d5f(k4,k1,k2,k3,k4)+w
          d5f(k1,k2,k4,k4,k3)=d5f(k1,k2,k4,k4,k3)+w
          !  d5f(k1,k4,k2,k4,k3)=d5f(k1,k4,k2,k4,k3)+w
          !  d5f(k4,k1,k2,k4,k3)=d5f(k4,k1,k2,k4,k3)+w
          !  d5f(k1,k4,k4,k2,k3)=d5f(k1,k4,k4,k2,k3)+w
          !  d5f(k4,k1,k4,k2,k3)=d5f(k4,k1,k4,k2,k3)+w
          !  d5f(k4,k4,k1,k2,k3)=d5f(k4,k4,k1,k2,k3)+w
        ENDDO
      ENDDO
    ENDDO
  ENDDO
END SUBROUTINE build_d_tensor_gks


the issue is that the compile time grows steeply (faster than linearly, roughly
quadratically in the timings below) with the number of uncommented lines of the
d5f=d5f+w type:

1 0m1.112s
2 0m4.448s
3 0m11.513s
4 0m21.514s
5 0m35.529s
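
As a rough check on how these timings scale, one can look at the ratio of
successive timings (constant for exponential growth) and at the slope of
log(time) against log(lines) (the exponent of a power law). The snippet below
is only an illustrative sketch with made-up helper names; the numbers are
copied from the table above.

```python
import math

# Timings copied from the table above: uncommented lines -> seconds.
timings = {1: 1.112, 2: 4.448, 3: 11.513, 4: 21.514, 5: 35.529}


def successive_ratios(series):
    """Return t(n+1)/t(n) for consecutive n; a constant ratio means exponential growth."""
    ns = sorted(series)
    return [series[b] / series[a] for a, b in zip(ns, ns[1:])]


def loglog_slope(series):
    """Least-squares slope of log(t) against log(n), i.e. p in a power law t ~ n**p."""
    ns = sorted(series)
    xs = [math.log(n) for n in ns]
    ys = [math.log(series[n]) for n in ns]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


print("ratios:", [round(r, 2) for r in successive_ratios(timings)])
print("log-log slope:", round(loglog_slope(timings), 2))
```

For these five data points the ratios shrink with each added line and the
fitted exponent is a little above 2, so over this range the growth looks
polynomial rather than strictly exponential.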


[Bug rtl-optimization/54269] New: [4.8 Regression] memory usage too large when optimizing

2012-08-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

             Bug #: 54269
           Summary: [4.8 Regression] memory usage too large when
                    optimizing
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: joost.vandevond...@mat.ethz.ch


Created attachment 28019
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28019
gzipped testcase.

The attached testcase requires +- 10Gb resident memory to compile with:

gfortran -c -O3 -funroll-all-loops -march=native -ffree-form -D__LIBINT
hfx_contraction_methods.F

using current trunk. I believe this is a recent regression in trunk. 4.7 needs
500Mb. From a very quick gdb session, I guess this is some rtl thing.


[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing

2012-08-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-15 09:57:13 UTC ---
seems like it is triggered by unrolling, using

gfortran -O2 -funroll-loops -ffree-form -D__LIBINT hfx_contraction_methods.F

is enough. A bt at the first point where memory seems to go up is:

#1  0x007176de in df_scan_verify () at ../../gcc/gcc/df-scan.c:4540
#2  0x00706245 in df_verify () at ../../gcc/gcc/df-core.c:1645
#3  df_analyze () at ../../gcc/gcc/df-core.c:1206
#4  0x008a211b in iv_analysis_loop_init (loop=0x7f4b0ece63b8)
at ../../gcc/gcc/loop-iv.c:299
#5  0x008a56ba in get_simple_loop_desc (loop=0x7f4b0ece63b8)
at ../../gcc/gcc/loop-iv.c:2973
#6  0x008a8c70 in decide_peel_once_rolling (flags=2)
at ../../gcc/gcc/loop-unroll.c:337
#7  peel_loops_completely (flags=2) at ../../gcc/gcc/loop-unroll.c:248
#8  unroll_and_peel_loops (flags=2) at ../../gcc/gcc/loop-unroll.c:164
#9  0x0089cc98 in rtl_unroll_and_peel_loops ()
at ../../gcc/gcc/loop-init.c:370


[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing

2012-08-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-15 10:59:38 UTC ---
(In reply to comment #2)
 Well, that's ENABLE_CHECKING code.  Are you sure 4.7 built with
 --enable-checking=yes does not exhibit this behavior?

I'm pretty sure this was not observed 3 weeks ago on trunk. Just to make sure,
I'm doing a new trunk build with --enable-checking=no.


[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing

2012-08-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-15 11:37:51 UTC ---
(In reply to comment #2)
 Well, that's ENABLE_CHECKING code.  Are you sure 4.7 built with
 --enable-checking=yes does not exhibit this behavior?

it looks like --enable-checking is key. --enable-checking=no leads to about
1Gb, while --enable-checking=yes leads to about 10Gb mem usage.


[Bug rtl-optimization/54269] [4.8 Regression] memory usage too large when optimizing

2012-08-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54269

--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-08-16 05:29:46 UTC ---
4.7 configured with --enable-checking=yes also needs  1.0Gb.

for a checking-enabled compiler, time went from 25s with 4.7 to 1m27s with 4.8


[Bug middle-end/53852] New: -ftree-loop-linear: large compile time / memory usage

2012-07-04 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852

             Bug #: 53852
           Summary: -ftree-loop-linear: large compile time / memory usage
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: joost.vandevond...@mat.ethz.ch


Current trunk (189233) has X Gb of memory usage (before I have to kill the
compilation) on the following testcase with:

gfortran -O2 -ftree-loop-linear test.f90

  SUBROUTINE  build_d_tensor_gks(d1f, d2f, d3f, d4f, d5f, v, d1, d2, d3, d4, &
d5)
INTEGER, PARAMETER :: dp=8
REAL(KIND=dp), DIMENSION(3), INTENT(OUT) :: d1f
REAL(KIND=dp), DIMENSION(3, 3), &
  INTENT(OUT):: d2f
REAL(KIND=dp), DIMENSION(3, 3, 3), &
  INTENT(OUT):: d3f
REAL(KIND=dp), DIMENSION(3, 3, 3, 3), &
  INTENT(OUT):: d4f
REAL(KIND=dp), &
  DIMENSION(3, 3, 3, 3, 3), &
  INTENT(OUT), OPTIONAL  :: d5f
REAL(KIND=dp), DIMENSION(3), INTENT(IN)  :: v
REAL(KIND=dp), INTENT(IN):: d1, d2, d3, d4
REAL(KIND=dp), INTENT(IN), OPTIONAL  :: d5

INTEGER  :: k1, k2, k3, k4, k5
REAL(KIND=dp):: w

d1f = 0.0_dp
d2f = 0.0_dp
d3f = 0.0_dp
d4f = 0.0_dp
DO k1=1,3
   d1f(k1)=d1f(k1)+v(k1)*d1
ENDDO
DO k1=1,3
   DO k2=1,3
  d2f(k2,k1)=d2f(k2,k1)+v(k1)*v(k2)*d2
   ENDDO
   d2f(k1,k1)=d2f(k1,k1)+ d1
ENDDO
DO k1=1,3
   DO k2=1,3
  DO k3=1,3
 d3f(k3,k2,k1)=d3f(k3,k2,k1)+v(k1)*v(k2)*v(k3)*d3
  ENDDO
  w=v(k1)*d2
  d3f(k1,k2,k2)=d3f(k1,k2,k2)+w
  d3f(k2,k1,k2)=d3f(k2,k1,k2)+w
  d3f(k2,k2,k1)=d3f(k2,k2,k1)+w
   ENDDO
ENDDO
DO k1=1,3
   DO k2=1,3
  DO k3=1,3
 DO k4=1,3
d4f(k4,k3,k2,k1)=d4f(k4,k3,k2,k1)+ &
v(k1)*v(k2)*v(k3)*v(k4)*d4
 ENDDO
 w=v(k1)*v(k2)*d3
 d4f(k1,k2,k3,k3)=d4f(k1,k2,k3,k3)+w
 d4f(k1,k3,k2,k3)=d4f(k1,k3,k2,k3)+w
 d4f(k3,k1,k2,k3)=d4f(k3,k1,k2,k3)+w
 d4f(k1,k3,k3,k2)=d4f(k1,k3,k3,k2)+w
 d4f(k3,k1,k3,k2)=d4f(k3,k1,k3,k2)+w
 d4f(k3,k3,k1,k2)=d4f(k3,k3,k1,k2)+w
  ENDDO
  d4f(k1,k1,k2,k2)=d4f(k1,k1,k2,k2)+d2
  d4f(k1,k2,k1,k2)=d4f(k1,k2,k1,k2)+d2
  d4f(k1,k2,k2,k1)=d4f(k1,k2,k2,k1)+d2
   ENDDO
ENDDO
IF (PRESENT(d5f).AND.PRESENT(d5)) THEN
   d5f = 0.0_dp

   DO k1=1,3
  DO k2=1,3
 DO k3=1,3
DO k4=1,3
   DO k5=1,3
  d5f(k5,k4,k3,k2,k1)=d5f(k5,k4,k3,k2,k1)+ &
 v(k1)*v(k2)*v(k3)*v(k4)*v(k5)*d5
   ENDDO
   w=v(k1)*v(k2)*v(k3)*d4
   d5f(k1,k2,k3,k4,k4)=d5f(k1,k2,k3,k4,k4)+w
   d5f(k1,k2,k4,k3,k4)=d5f(k1,k2,k4,k3,k4)+w
   d5f(k1,k4,k2,k3,k4)=d5f(k1,k4,k2,k3,k4)+w
   d5f(k4,k1,k2,k3,k4)=d5f(k4,k1,k2,k3,k4)+w
   d5f(k1,k2,k4,k4,k3)=d5f(k1,k2,k4,k4,k3)+w
   d5f(k1,k4,k2,k4,k3)=d5f(k1,k4,k2,k4,k3)+w
   d5f(k4,k1,k2,k4,k3)=d5f(k4,k1,k2,k4,k3)+w
   d5f(k1,k4,k4,k2,k3)=d5f(k1,k4,k4,k2,k3)+w
   d5f(k4,k1,k4,k2,k3)=d5f(k4,k1,k4,k2,k3)+w
   d5f(k4,k4,k1,k2,k3)=d5f(k4,k4,k1,k2,k3)+w
ENDDO
w=v(k1)*d3
d5f(k1,k2,k2,k3,k3)=d5f(k1,k2,k2,k3,k3)+w
d5f(k1,k2,k3,k2,k3)=d5f(k1,k2,k3,k2,k3)+w
d5f(k1,k2,k3,k3,k2)=d5f(k1,k2,k3,k3,k2)+w
d5f(k2,k1,k2,k3,k3)=d5f(k2,k1,k2,k3,k3)+w
d5f(k2,k1,k3,k2,k3)=d5f(k2,k1,k3,k2,k3)+w
d5f(k2,k1,k3,k3,k2)=d5f(k2,k1,k3,k3,k2)+w
d5f(k2,k2,k1,k3,k3)=d5f(k2,k2,k1,k3,k3)+w
d5f(k2,k3,k1,k2,k3)=d5f(k2,k3,k1,k2,k3)+w
d5f(k2,k3,k1,k3,k2)=d5f(k2,k3,k1,k3,k2)+w
d5f(k2,k2,k3,k1,k3)=d5f(k2,k2,k3,k1,k3)+w
d5f(k2,k3,k2,k1,k3)=d5f(k2,k3,k2,k1,k3)+w
d5f(k2,k3,k3,k1,k2)=d5f(k2,k3,k3,k1,k2)+w
d5f(k2,k2,k3,k3,k1)=d5f(k2,k2,k3,k3,k1)+w
d5f(k2,k3,k2,k3,k1)=d5f(k2,k3,k2,k3,k1)+w
d5f(k2,k3,k3,k2,k1)=d5f(k2,k3,k3,k2,k1)+w
 ENDDO
  ENDDO
   ENDDO
END IF
  END SUBROUTINE build_d_tensor_gks


[Bug middle-end/53852] -ftree-loop-linear: large compile time / memory usage

2012-07-04 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53852

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-07-04 12:17:47 UTC ---
To fill in the X, 130 Gb is not sufficient for this testcase.


[Bug bootstrap/53835] New: in tree isl / cloog build fails

2012-07-03 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53835

             Bug #: 53835
           Summary: in tree isl / cloog build fails
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: bootstrap
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: joost.vandevond...@mat.ethz.ch


After downloading cloog and isl from gcc/infrastructure and putting them in-tree,
a bootstrap fails with the errors below. Executing make in obj/cloog goes fine.

make[3]: Entering directory `/data/vjoost/gnu/gcc_trunk/obj/cloog'
Making all in .
checking for ANSI C header files... make[4]: Entering directory
`/data/vjoost/gnu/gcc_trunk/obj/cloog'
  CC libcloog_isl_la-block.lo
  CC libcloog_isl_la-clast.lo
  CC libcloog_isl_la-matrix.lo
  CC libcloog_isl_la-state.lo
  CC libcloog_isl_la-input.lo
  CC libcloog_isl_la-int.lo
  CC libcloog_isl_la-loop.lo
  CC libcloog_isl_la-names.lo
  CC libcloog_isl_la-options.lo
  CC libcloog_isl_la-pprint.lo
  CC libcloog_isl_la-program.lo
  CC libcloog_isl_la-union_domain.lo
  CC libcloog_isl_la-statement.lo
  CC libcloog_isl_la-stride.lo
  CC libcloog_isl_la-domain.lo
  CC libcloog_isl_la-backend.lo
  CC libcloog_isl_la-version.lo
  CC libcloog_isl_la-constraints.lo
  CC cloog.o
In file included from ../../gcc/cloog/include/cloog/isl/constraintset.h:4:0,
 from ../../gcc/cloog/include/cloog/isl/cloog.h:9,
 from ../../gcc/cloog/source/isl/backend.c:1:
../../gcc/cloog/include/cloog/isl/backend.h:4:28: fatal error:
isl/constraint.h: No such file or directory
compilation terminated.
make[4]: *** [libcloog_isl_la-backend.lo] Error 1
make[4]: *** Waiting for unfinished jobs
In file included from ../../gcc/cloog/include/cloog/isl/constraintset.h:4:0,
 from ../../gcc/cloog/include/cloog/isl/cloog.h:9,
 from ../../gcc/cloog/source/isl/constraints.c:4:
../../gcc/cloog/include/cloog/isl/backend.h:4:28: fatal error:
isl/constraint.h: No such file or directory


[Bug tree-optimization/51179] poor vectorization on interlagos.

2012-06-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51179

--- Comment #11 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-30 11:26:59 UTC ---
It looks like this problem is solved in the current 4.7 and 4.8 branches. At
least on an avx machine, the best performance found by the code in comment #4
jumps from 5.3Gflops in 4.6 to 13.9Gflops in 4.7/4.8. Great work.

I can't test this right now on interlagos, but I guess this could be OK as
well.


[Bug tree-optimization/47657] missed vectorization

2012-06-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47657

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-30 13:34:24 UTC ---
performance seems good on 4.8


[Bug middle-end/47341] unnecessary versioning in the vectorizer.

2012-06-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47341

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

   Last reconfirmed|2011-01-18 11:21:06 |2012-06-30 11:21:06

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-30 13:39:57 UTC ---
versioning still happens with 4.8


[Bug libfortran/51119] MATMUL slow for large matrices

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat
   ||dot ethz.ch

--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 07:19:03 UTC ---
(In reply to comment #7)
 (In reply to comment #6)
  Janne, have you had a chance to look at this ? For larger matrices MATMUL is
  really slow. Anything that includes even the most basic blocking scheme
  should be faster. I think this would be a valuable improvement.
 
 I implemented a block-panel multiplication algorithm similar to GOTO BLAS and
 Eigen, but I got side-tracked by other things and never found the time to fix
 the corner-case bugs and tune performance. IIRC I reached about 30-40 % of
 peak flops which was a bit disappointing.

I think 30% of peak is a good improvement over the current version (which
reaches 7% of peak (92% for MKL) for a double precision 8000x8000 matrix
multiplication) on a sandy bridge.

In addition to blocking, is the Fortran runtime being compiled with a set of
compile options that enables vectorization ? In the ideal world, gcc would
recognize the loop pattern in the runtime library code, and do blocking,
vectorization etc. automagically.


[Bug middle-end/40194] fortran rules for optimizing

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40194

--- Comment #10 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 14:14:16 UTC ---
this testcase now looks optimized (at least the optimized dump contains return
1; as expected). I guess this can be closed ?


[Bug middle-end/40282] ICE with -fipa-type-escape

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40282

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||WONTFIX

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 14:22:34 UTC ---
ipa-type-escape has long been removed.


[Bug middle-end/41453] use INTENT(out) for optimization

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41453

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

   Last reconfirmed||2012-06-29

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 14:25:46 UTC ---
still happens on 4.8 trunk


[Bug libgomp/41737] [omp] missing error for undeclared variable in a parallel region with default(none)

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41737

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

   Last reconfirmed||2012-06-29

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 14:34:00 UTC ---
simplified testcase, for 4.8:

INTEGER :: ip,np
!$omp parallel do default(none)
  DO ip=0,np
  ENDDO
!$omp end parallel do
END 

while it is OK for ip to have no explicit attribute, I believe the standard
requires one for np. Intel ifort gives:

est.f90(3): error #6752: Since the OpenMP* DEFAULT(NONE) clause applies, the
PRIVATE, SHARED, REDUCTION, FIRSTPRIVATE, or LASTPRIVATE attribute must be
explicitly specified for every variable.   [NP]


[Bug middle-end/47298] -O3 destroys beautifully vectorized code obtained at -O2

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47298

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

   Last reconfirmed||2012-06-29

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 14:44:05 UTC ---
on 4.8 this still is not handled optimally. I get

4.3s for: gfortran -O2 -funroll-loops -ftree-vectorize -ffast-math
-march=native 
6.7s for: gfortran -O3 -funroll-loops -ftree-vectorize -ffast-math
-march=native

so more than 50% slowdown going from -O2 to -O3

on

-march=corei7 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mno-abm
-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx
-mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c
-mno-fsgsbase --param l1-cache-size=32 --param l1-cache-line-size=64
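
The slowdown figure follows directly from the two timings quoted above; a short worked calculation (variable names are only for illustration) gives roughly 56%:

```python
# Run times in seconds, copied from the comment above.
t_o2 = 4.3  # -O2 -funroll-loops -ftree-vectorize -ffast-math -march=native
t_o3 = 6.7  # -O3 with the same remaining flags

slowdown = (t_o3 - t_o2) / t_o2  # relative increase in run time
ratio = t_o3 / t_o2              # how many times longer -O3 takes

print(f"-O3 is {slowdown:.0%} slower than -O2 (ratio {ratio:.2f}x)")
```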


[Bug tree-optimization/34940] contained subroutines called only once are not inlined

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34940

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

   Last reconfirmed|2008-01-23 11:27:01 |2012-06-29 11:27:01

--- Comment #15 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 14:52:44 UTC ---
no inlining with 4.8 either


[Bug libgomp/41737] [omp] missing error for undeclared variable in a parallel region with default(none)

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41737

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||DUPLICATE

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 18:46:13 UTC ---
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46532

*** This bug has been marked as a duplicate of bug 46532 ***


[Bug fortran/46532] [OMP] missing error for loop bounds missing an attribute

2012-06-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46532

--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-29 18:46:13 UTC ---
*** Bug 41737 has been marked as a duplicate of this bug. ***


[Bug libfortran/51119] MATMUL slow for large matrices

2012-06-28 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51119

--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-28 11:58:20 UTC ---
Janne, have you had a chance to look at this ? For larger matrices MATMUL is
really slow. Anything that includes even the most basic blocking scheme should
be faster. I think this would be a valuable improvement.


[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo

2012-06-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

--- Comment #60 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-15 15:26:20 UTC ---
(In reply to comment #59)
 There should be no compile performance problems in expand anymore.
 The alias stmt walker as used from IPA remains a problem, though.

Thanks... expand is now indeed essentially gone from the timing report.

 gfortran -ftime-report -ffree-line-length-512 -g -c testcase.f90

Execution times (seconds)
 phase setup             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     243 kB ( 0%) ggc
 phase parsing           :   3.57 ( 9%) usr   0.06 ( 7%) sys   3.63 ( 9%) wall   47592 kB ( 7%) ggc
 phase cgraph            :  36.49 (91%) usr   0.86 (93%) sys  37.34 (91%) wall  647436 kB (93%) ggc
 phase generate          :  36.50 (91%) usr   0.86 (93%) sys  37.36 (91%) wall  647838 kB (93%) ggc
 garbage collection      :   1.04 ( 3%) usr   0.00 ( 0%) sys   1.04 ( 3%) wall       0 kB ( 0%) ggc
 callgraph construction  :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.19 ( 0%) wall   15909 kB ( 2%) ggc
 callgraph optimization  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     201 kB ( 0%) ggc
 cfg construction        :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       7 kB ( 0%) ggc
 cfg cleanup             :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier            :   1.26 ( 3%) usr   0.00 ( 0%) sys   1.25 ( 3%) wall       0 kB ( 0%) ggc
 trivially dead code     :   0.43 ( 1%) usr   0.00 ( 0%) sys   0.41 ( 1%) wall       0 kB ( 0%) ggc
 df scan insns           :   0.98 ( 2%) usr   0.24 (26%) sys   1.24 ( 3%) wall      11 kB ( 0%) ggc
 df live regs            :   0.58 ( 1%) usr   0.01 ( 1%) sys   0.57 ( 1%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.43 ( 1%) usr   0.01 ( 1%) sys   0.45 ( 1%) wall   19416 kB ( 3%) ggc
 register information    :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis          :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall    8337 kB ( 1%) ggc
 rebuild jump labels     :   0.22 ( 1%) usr   0.00 ( 0%) sys   0.21 ( 1%) wall       0 kB ( 0%) ggc
 parser (global)         :   3.57 ( 9%) usr   0.06 ( 7%) sys   3.63 ( 9%) wall   47587 kB ( 7%) ggc
 inline heuristics       :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      54 kB ( 0%) ggc
 tree gimplify           :   0.51 ( 1%) usr   0.01 ( 1%) sys   0.51 ( 1%) wall   26304 kB ( 4%) ggc
 tree eh                 :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      39 kB ( 0%) ggc
 tree CFG construction   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     190 kB ( 0%) ggc
 tree CFG cleanup        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 tree find ref. vars     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    3263 kB ( 0%) ggc
 tree PHI insertion      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA other          :   0.01 ( 0%) usr   0.01 ( 1%) sys   0.02 ( 0%) wall      18 kB ( 0%) ggc
 tree operand scan       :   0.03 ( 0%) usr   0.03 ( 3%) sys   0.05 ( 0%) wall     118 kB ( 0%) ggc
 tree SSA verifier       :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier      :   0.56 ( 1%) usr   0.05 ( 5%) sys   0.63 ( 2%) wall       0 kB ( 0%) ggc
 callgraph verifier      :   0.25 ( 1%) usr   0.00 ( 0%) sys   0.27 ( 1%) wall       0 kB ( 0%) ggc
 out of ssa              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 expand vars             :   1.02 ( 3%) usr   0.02 ( 2%) sys   1.03 ( 3%) wall   10086 kB ( 1%) ggc
 expand                  :   2.03 ( 5%) usr   0.12 (13%) sys   2.18 ( 5%) wall  249774 kB (36%) ggc
 post expand cleanups    :   0.14 ( 0%) usr   0.01 ( 1%) sys   0.14 ( 0%) wall    1744 kB ( 0%) ggc
 integrated RA           :  10.75 (27%) usr   0.15 (16%) sys  10.93 (27%) wall  128826 kB (19%) ggc
 reload                  :   5.56 (14%) usr   0.16 (17%) sys   5.77 (14%) wall  123587 kB (18%) ggc
 thread pro- & epilogue  :   2.65 ( 7%) usr   0.00 ( 0%) sys   2.64 ( 6%) wall     198 kB ( 0%) ggc
 machine dep reorg       :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 final                   :   3.11 ( 8%) usr   0.04 ( 4%) sys   3.15 ( 8%) wall    7227 kB ( 1%) ggc
 symout                  :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    4914 kB ( 1%) ggc
 rest of compilation     :   2.46 ( 6%) usr   0.00 ( 0%) sys   2.39 ( 6%) wall   47578 kB ( 7%) ggc
 unaccounted todo        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 verify RTL sharing      :   1.49 ( 4%) usr   0.00 ( 0%) sys   1.48 ( 4%) wall       0 kB ( 0%) ggc
 TOTAL                   :  40.09             0.92            41.02          695674 kB
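
To see at a glance which passes dominate such a report, the rows can be ranked by wall time. The following is a minimal sketch, assuming each row has been joined onto a single line; `parse_time_report` and `top_passes` are hypothetical helper names, not GCC tools, and the sample rows are copied from the -O0 report above.

```python
import re

# One joined -ftime-report row: name, then usr/sys/wall seconds with
# percentages, then GGC memory in kB. The TOTAL row carries no
# percentages and is deliberately not matched.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr\s+"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*sys\s+"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)\s*wall\s+"
    r"(?P<kb>\d+)\s*kB"
)


def parse_time_report(text):
    """Return {pass name: (usr, sys, wall, kB)} for every matching row."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m["name"]] = (float(m["usr"]), float(m["sys"]),
                               float(m["wall"]), int(m["kb"]))
    return rows


def top_passes(rows, n=3):
    """Names of the n passes with the largest wall time, skipping the
    'phase ...' summary rows that aggregate the individual passes."""
    passes = {k: v for k, v in rows.items() if not k.startswith("phase ")}
    return sorted(passes, key=lambda k: passes[k][2], reverse=True)[:n]


sample = """\
 phase setup             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     243 kB ( 0%) ggc
 expand vars             :   1.02 ( 3%) usr   0.02 ( 2%) sys   1.03 ( 3%) wall   10086 kB ( 1%) ggc
 integrated RA           :  10.75 (27%) usr   0.15 (16%) sys  10.93 (27%) wall  128826 kB (19%) ggc
 reload                  :   5.56 (14%) usr   0.16 (17%) sys   5.77 (14%) wall  123587 kB (18%) ggc
 final                   :   3.11 ( 8%) usr   0.04 ( 4%) sys   3.15 ( 8%) wall    7227 kB ( 1%) ggc
"""
rows = parse_time_report(sample)
print(top_passes(rows, 2))
```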


[Bug tree-optimization/53081] memcpy/memset loop recognition

2012-06-06 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53081

--- Comment #12 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-06 11:32:08 UTC ---
It doesn't quite seem to work for this simple Fortran testcase yet

SUBROUTINE S(a,N)
  INTEGER :: N,a(N)
  a=1
END SUBROUTINE S

(works for memset to 0)


[Bug tree-optimization/53081] memcpy/memset loop recognition

2012-06-06 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53081

--- Comment #14 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-06 11:54:22 UTC ---
(In reply to comment #13)
 Well, you can't transform this to a memset ;)

blush

things work as advertised for correct testcases... thanks!


[Bug fortran/53521] [4.5/4.6/4.7 Regression] Memory leak with zero-sized array constructor

2012-06-01 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

Summary|[4.5/4.6/4.7/4.8|[4.5/4.6/4.7 Regression]
   |Regression] Memory leak |Memory leak with zero-sized
   |with zero-sized array   |array constructor
   |constructor |
  Known to fail|4.8.0   |

--- Comment #9 from Tobias Burnus burnus at gcc dot gnu.org 2012-05-31 
14:28:46 UTC ---
Author: burnus
Date: Thu May 31 14:28:41 2012
New Revision: 188062

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188062
Log:
2012-05-31  Tobias Burnus  bur...@net-b.de

PR fortran/53521
* trans.c (gfc_deallocate_scalar_with_status): Properly
handle the case size == 0.


Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/trans.c

--- Comment #10 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-06-01 07:42:41 UTC ---
Thanks Tobias... this fixes the issue I saw for CP2K. From some further tests I
did, I couldn't see any negative side effects.

I had a look at other gcc branches, the patch applies flawlessly to 4.7 and 4.6
(I did not have a 4.5 branch around). I would be very happy to see it
integrated in 4.7.1 and 4.6.4, as it is nearly impossible to fully code around
this in CP2K. Array constructors are used much, and it is hard to guess which
ones could be zero-sized.


[Bug fortran/53521] [4.5/4.6/4.7/4.8 Regression] Zero-byte memory leak with zero-sized array constructor (valgrind warning)

2012-05-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-05-30 12:31:18 UTC ---
(In reply to comment #2)
 Well, I think this is a valgrind issue and not a real leak.  Whether you
 want to optimize code for the non-NULL case by omitting the NULL check is
 another question of course.  It's definitely not wrong-code IMHO.

No, definitely a real bug... not a valgrind issue. If you put a loop around
'CALL T2' the process memory usage reaches 1Gb in a few seconds. This is a real
issue which causes our simulation code to crash after 24h of running.


[Bug fortran/53521] [4.5/4.6/4.7/4.8 Regression] Zero-byte memory leak with zero-sized array constructor (valgrind warning)

2012-05-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

--- Comment #6 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-05-30 14:37:09 UTC ---
(In reply to comment #4)
 You say not doing free (0) leaks memory?  What OS is this on?  

I'm observing on a Linux box that :

MODULE TEST
 IMPLICIT NONE
CONTAINS
 SUBROUTINE T(n1)
  INTEGER :: n1(:)
 END SUBROUTINE T
 SUBROUTINE T2(n)
   INTEGER :: n
   INTEGER :: k
   CALL T((/(k,k=1,n-1)/))
 END SUBROUTINE
END MODULE
USE TEST
  DO
CALL T2(1)
  ENDDO
END 

needs 25Gb of memory after a while (notice the endless loop around CALL T2).


[Bug middle-end/38474] slow compilation at -O0 due to expand's temp slot goo

2012-05-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38474

--- Comment #53 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-05-29 07:45:36 UTC ---
For the original testcase I have for trunk (gcc version 4.8.0 20120516
(experimental) [trunk revision 187595] (GCC)) very reasonable times (1min) at
-O0, but pretty slow (20min) at -O2. In the latter case, most of the time goes
to 'alias stmt walking  : 826.02'. Time reports below:

gfortran -ftime-report -ffree-line-length-512 -g -c testcase.f90

Execution times (seconds)
 phase setup             :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     243 kB ( 0%) ggc
 phase parsing           :   3.59 ( 6%) usr   0.05 ( 5%) sys   3.64 ( 6%) wall   47592 kB ( 7%) ggc
 phase cgraph            :  60.02 (94%) usr   0.90 (95%) sys  60.94 (94%) wall  649547 kB (93%) ggc
 phase generate          :  60.03 (94%) usr   0.90 (95%) sys  60.95 (94%) wall  649948 kB (93%) ggc
 garbage collection      :   1.04 ( 2%) usr   0.00 ( 0%) sys   1.04 ( 2%) wall       0 kB ( 0%) ggc
 callgraph construction  :   0.18 ( 0%) usr   0.01 ( 1%) sys   0.20 ( 0%) wall   15909 kB ( 2%) ggc
 callgraph optimization  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall     201 kB ( 0%) ggc
 cfg construction        :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       7 kB ( 0%) ggc
 cfg cleanup             :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier            :   1.16 ( 2%) usr   0.00 ( 0%) sys   1.18 ( 2%) wall       0 kB ( 0%) ggc
 trivially dead code     :   0.34 ( 1%) usr   0.00 ( 0%) sys   0.35 ( 1%) wall       0 kB ( 0%) ggc
 df scan insns           :   1.00 ( 2%) usr   0.25 (26%) sys   1.23 ( 2%) wall      11 kB ( 0%) ggc
 df live regs            :   0.46 ( 1%) usr   0.00 ( 0%) sys   0.49 ( 1%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.45 ( 1%) usr   0.01 ( 1%) sys   0.47 ( 1%) wall   19416 kB ( 3%) ggc
 register information    :   0.20 ( 0%) usr   0.01 ( 1%) sys   0.19 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis          :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall    8336 kB ( 1%) ggc
 rebuild jump labels     :   0.22 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall       0 kB ( 0%) ggc
 parser (global)         :   3.59 ( 6%) usr   0.05 ( 5%) sys   3.64 ( 6%) wall   47587 kB ( 7%) ggc
 inline heuristics       :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall      54 kB ( 0%) ggc
 tree gimplify           :   0.48 ( 1%) usr   0.01 ( 1%) sys   0.49 ( 1%) wall   26304 kB ( 4%) ggc
 tree eh                 :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      39 kB ( 0%) ggc
 tree CFG construction   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     190 kB ( 0%) ggc
 tree find ref. vars     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    3263 kB ( 0%) ggc
 tree PHI insertion      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      43 kB ( 0%) ggc
 tree SSA other          :   0.04 ( 0%) usr   0.02 ( 2%) sys   0.01 ( 0%) wall      18 kB ( 0%) ggc
 tree operand scan       :   0.01 ( 0%) usr   0.01 ( 1%) sys   0.06 ( 0%) wall     118 kB ( 0%) ggc
 tree SSA verifier       :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier      :   0.58 ( 1%) usr   0.06 ( 6%) sys   0.62 ( 1%) wall       0 kB ( 0%) ggc
 callgraph verifier      :   0.28 ( 0%) usr   0.00 ( 0%) sys   0.29 ( 0%) wall       0 kB ( 0%) ggc
 out of ssa              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 expand vars             :  21.72 (34%) usr   0.02 ( 2%) sys  21.74 (34%) wall   10086 kB ( 1%) ggc
 expand                  :   6.18 (10%) usr   0.15 (16%) sys   6.31 (10%) wall  251886 kB (36%) ggc
 post expand cleanups    :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall    1744 kB ( 0%) ggc
 integrated RA           :  10.75 (17%) usr   0.16 (17%) sys  10.87 (17%) wall  128826 kB (18%) ggc
 reload                  :   5.72 ( 9%) usr   0.15 (16%) sys   5.92 ( 9%) wall  123587 kB (18%) ggc
 thread pro- & epilogue  :   2.51 ( 4%) usr   0.00 ( 0%) sys   2.50 ( 4%) wall     198 kB ( 0%) ggc
 machine dep reorg       :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 final                   :   2.61 ( 4%) usr   0.04 ( 4%) sys   2.65 ( 4%) wall    7227 kB ( 1%) ggc
 symout                  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    4914 kB ( 1%) ggc
 rest of compilation     :   2.36 ( 4%) usr   0.00 ( 0%) sys   2.35 ( 4%) wall   47578 kB ( 7%) ggc
 verify RTL sharing      :   1.02 ( 2%) usr   0.00 ( 0%) sys   1.04 ( 2%) wall       0 kB ( 0%) ggc
 TOTAL                   :  63.65             0.95            64.62          697784 kB


gfortran -ftime-report -ffree-line-length-512 -O2 -g -c testcase.f90

Execution times (seconds)
 phase setup :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0

[Bug fortran/53521] New: Memory leak with zero sized array constructor

2012-05-29 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53521

 Bug #: 53521
   Summary: Memory leak with zero sized array constructor
Classification: Unclassified
   Product: gcc
   Version: 4.6.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch


The following testcase (as reduced from CP2K) leaks memory when compiled with
gfortran 4.6 - 4.8 :

MODULE TEST
 IMPLICIT NONE
CONTAINS
 SUBROUTINE T(n1)
  INTEGER :: n1(:)
 END SUBROUTINE T
 SUBROUTINE T2(n)
   INTEGER :: n
   INTEGER :: k
   CALL T((/(k,k=1,n-1)/))
 END SUBROUTINE
END MODULE
USE TEST
  CALL T2(1)
END 

as can be verified with valgrind or putting a loop around call t2. The issue
seems to be the zero-sized array constructor.


[Bug lto/49700] LTO compile time hog

2012-05-08 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution||FIXED

--- Comment #9 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-05-08 18:52:12 UTC ---
trying 4.7.X instead it actually looks very reasonable now.

Using -flto=jobserver -fuse-linker-plugin -ftime-report -O3 -march=native
-ffast-math -g -ffree-form

I get CP2K to build in 4min on a 32-core server. The time report also looks
OK. I'll close this PR as fixed (the issue with 4.8 is tracked in PR 45586).


[Bug lto/49700] LTO compile time hog

2012-05-07 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49700

--- Comment #8 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-05-07 19:04:29 UTC ---
(In reply to comment #7)
 Has the situation improved?

current trunk LTO seems to fail on CP2K with:

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F: In function
‘propagate_cn_or_em’:
/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_805>
D.79093_629 = D.79094_628->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_805>
D.79092_630 = D.79094_628->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_805>
D.79090_632 = D.79094_628->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_816>
D.79093_652 = D.79094_651->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_816>
D.79092_653 = D.79094_651->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_816>
D.79090_655 = D.79094_651->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_827>
D.79093_675 = D.79094_674->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_827>
D.79092_676 = D.79094_674->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_827>
D.79090_678 = D.79094_674->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_838>
D.79093_700 = D.79094_699->orders.data;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_838>
D.79092_701 = D.79094_699->orders.offset;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0: error:
type mismatch in component reference
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
struct array2_integer(kind=4)

struct array2_integer(kind=4)

# VUSE <.MEM_838>
D.79090_703 = D.79094_699->orders.dim[1].stride;

/data/vjoost/clean/cp2k/cp2k/src/../src/rt_propagation_methods.F:217:0:
internal compiler error: verify_gimple failed
   SUBROUTINE propagate_cn_or_em(qs_env, error)
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
lto-wrapper: gfortran returned 1 exit status
/data/vjoost/gnu/binutils-2.22/install/bin/ld: lto-wrapper failed
collect2: error: ld returned 1 exit status


[Bug middle-end/53217] New: [4.8 Regression] internal compiler error: verify_ssa failed

2012-05-03 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53217

 Bug #: 53217
   Summary: [4.8 Regression] internal compiler error: verify_ssa
failed
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch


[Bug middle-end/53217] [4.8 Regression] internal compiler error: verify_ssa failed

2012-05-03 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53217

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-05-03 18:38:27 UTC ---
The following testcase causes an ICE with current trunk (4.8)

MODULE xc_cs1
  INTEGER, PARAMETER :: dp=KIND(0.0D0)
  REAL(KIND=dp), PARAMETER :: a = 0.04918_dp, &
  c = 0.2533_dp, &
  d = 0.349_dp
CONTAINS
  SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho,&
   npoints, error)
REAL(KIND=dp), DIMENSION(*), &
  INTENT(INOUT)  :: e_rho_rho, e_rho_ndrho, &
e_ndrho_ndrho
DO ip = 1, npoints
  IF ( rho(ip) > eps_rho ) THEN
 oc = 1.0_dp/(r*r*r3*r3 + c*g*g)
 d2rF4 = c4p*f13*f23*g**4*r3/r * (193*d*r**5*r3*r3+90*d*d*r**5*r3 &
 -88*g*g*c*r**3*r3-100*d*d*c*g*g*r*r*r3*r3 &
 +104*r**6)*od**3*oc**4
 e_rho_rho(ip) = e_rho_rho(ip) + d2F1 + d2rF2 + d2F3 + d2rF4
  END IF
END DO
  END SUBROUTINE cs1_u_2
END MODULE xc_cs1


gfortran -O1  -ffast-math  bug.f90 
bug.f90: In function ‘cs1_u_2’:
bug.f90:7:0: error: definition in block 4 follows the use
   SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho,
 ^
for SSA_NAME: reassocpow.5_24 in statement:
reassocpow.5_99 = __builtin_powi (reassocpow.5_24, 2);
bug.f90:7:0: internal compiler error: verify_ssa failed
   SUBROUTINE cs1_u_2 ( rho, grho, r13, e_rho_rho, e_rho_ndrho, e_ndrho_ndrho,


[Bug fortran/52325] unclear error: Unclassifiable statement

2012-02-21 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52325

--- Comment #3 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-02-22 06:49:41 UTC ---
(In reply to comment #2)
 Submitted patch (pending review):
   http://gcc.gnu.org/ml/fortran/2012-02/msg00089.html

OK ;-)

this would be a significant improvement. 

I think it is independent, but a better choice for the error message could be
'Symbol %s at %C has an undefined type'. The type could be implicitly or
explicitly defined, that doesn't matter so much. For consistency, I believe
your proposed message is fine.


[Bug fortran/52325] unclear error: Unclassifiable statement

2012-02-21 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52325

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2012-02-22 06:53:09 UTC ---
(In reply to comment #2)
 Submitted patch (pending review):
   http://gcc.gnu.org/ml/fortran/2012-02/msg00089.html

and a nitpick... it should be 'non-derived type' instead of 'nonderived type'
(unless I got this with the hyphens wrong again).


[Bug libgomp/51298] libgomp team_barrier locking failures

2011-12-15 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51298

--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-12-15 09:44:46 UTC ---
similarly, does this only affect power7, or potentially also other targets such
as x86_64 (interlagos?)


[Bug lto/51355] [4.7 Regression] cgraph_add_edge_to_call_site_hash, at cgraph.c:765

2011-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51355

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||DUPLICATE

--- Comment #2 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-12-01 15:10:24 UTC ---
yes this is very likely a dup. I'll check again as soon as 51346 is resolved.

*** This bug has been marked as a duplicate of bug 51346 ***


[Bug bootstrap/51346] [4.7 Regression] LTO bootstrap failed with bootstrap-profiled

2011-12-01 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51346

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

 CC||Joost.VandeVondele at mat
   ||dot ethz.ch

--- Comment #5 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-12-01 15:10:25 UTC ---
*** Bug 51355 has been marked as a duplicate of this bug. ***


[Bug middle-end/51089] [4.7 Regression] internal compiler error: verify_flow_info failed

2011-11-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51089

Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch changed:

   What|Removed |Added

   Last reconfirmed||2011-11-30
   Target Milestone|--- |4.7.0

--- Comment #1 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-11-30 12:49:32 UTC ---
still fails with current trunk


[Bug lto/51355] New: [4.7 Regression] cgraph_add_edge_to_call_site_hash, at cgraph.c:765

2011-11-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51355

 Bug #: 51355
   Summary: [4.7 Regression] cgraph_add_edge_to_call_site_hash, at
cgraph.c:765
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: joost.vandevond...@mat.ethz.ch


Building CP2K with LTO has started to fail somewhere in the last 3 weeks.

In function ‘current_build_current’:
lto1: internal compiler error: in cgraph_add_edge_to_call_site_hash, at
cgraph.c:765
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make[3]: *** [/dev/shm/vondele/ccPFwJ9D.ltrans2.ltrans.o] Error 1
In function ‘nmr_shift_print’:
lto1: internal compiler error: in cgraph_add_edge_to_call_site_hash, at
cgraph.c:765
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make[3]: *** [/dev/shm/vondele/ccPFwJ9D.ltrans10.ltrans.o] Error 1
In function ‘atom_int_setup’:
lto1: internal compiler error: in cgraph_add_edge_to_call_site_hash, at
cgraph.c:765
[...]

No other testcase than building full CP2K with the following arch file:

#
CC   = cc
CPP  = 

FC   = gfortran 
LD   = gfortran

AR   = ar -r

CPPFLAGS = 
DFLAGS   = -D__GFORTRAN -D__FFTSG -D__FFTW3 -D__LIBINT
FCFLAGS  = -flto -ffree-form -cpp $(DFLAGS) -I$(GFORTRAN_INC)
LDFLAGS  = $(FCFLAGS) -O3 -march=native -fuse-linker-plugin -flto=jobserver
-L/users/vondele/LAPACK/ -L$(GFORTRAN_LIB)
LIBS = -llapack_gfortran_x86 -lblas_gfortran_x86 -lfftw3 -lderiv -lint
-lstdc++

OBJECTS_ARCHITECTURE = machine_gfortran.o


[Bug fortran/25708] Module loading is not good at all

2011-11-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25708

--- Comment #17 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-11-30 19:50:37 UTC ---
Janne's lseek patch:
http://gcc.gnu.org/ml/fortran/2011-11/msg00251.html
has further nice results on CP2K (CP2K_2009-05-01.f90)

Thomas (trunk):
% time     seconds  usecs/call     calls    errors syscall
 92.08    4.963429           0  19557182           lseek
  5.91    0.318514           1    523064           read
  0.61    0.032888           3     11208           munmap
  0.37    0.020212           2     11969       757 open
  0.24    0.012753           1     11212           close
  0.21    0.011314           1     10533        21 stat
  0.17    0.009117           0     25154           mmap
  0.16    0.008425           0     56715           write
  0.15    0.008353           1     12138           brk
  0.05    0.002811           0     11211           fstat
  0.02    0.001068           2       684           rename
Janne (trunk+patch):
% time     seconds  usecs/call     calls    errors syscall
 77.60    1.316715           0   5265206           lseek
  9.12    0.154767           0    466059           read
  4.07    0.069073           0    242658           madvise
  2.77    0.046965           4     11969        74 open
  1.82    0.030845           3     11891           munmap
  1.47    0.024943          36       684           unlink
  0.72    0.012244           1     11895           close
  0.63    0.010689           1     10533        21 stat
  0.56    0.009533           0     56715           write
  0.51    0.008707           0     25837           mmap
  0.40    0.006794           1     12117           brk
  0.15    0.002542           0     11894           fstat
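
As a quick worked check of what the patch buys for lseek, the call counts and times from the two summaries above can be compared directly (the dictionary layout is only for illustration); both shrink by a factor of roughly 3.7:

```python
# lseek rows copied from the two strace summaries above.
before = {"calls": 19557182, "seconds": 4.963429}  # Thomas (trunk)
after = {"calls": 5265206, "seconds": 1.316715}    # Janne (trunk+patch)

call_ratio = before["calls"] / after["calls"]
time_ratio = before["seconds"] / after["seconds"]

print(f"lseek calls reduced {call_ratio:.1f}x, lseek time reduced {time_ratio:.1f}x")
```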


[Bug fortran/25708] Module loading is not good at all

2011-11-30 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25708

--- Comment #18 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-12-01 07:29:25 UTC ---
Janne's latest patch now effectively 'removes' lseek:

% time     seconds  usecs/call     calls    errors syscall
 26.84    0.108906           0    242658           madvise
 20.12    0.081608   045   read
 19.27    0.078198           0    512288           lseek
 12.33    0.050038          73       684           unlink
  5.99    0.024315           2     11969        74 open
  4.57    0.018544           2     11891           munmap

(512288 down from 19800 a few days ago).


[Bug fortran/40958] module files too large

2011-11-28 Thread Joost.VandeVondele at mat dot ethz.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40958

--- Comment #4 from Joost VandeVondele Joost.VandeVondele at mat dot ethz.ch 
2011-11-28 14:24:02 UTC ---
Just for reference, compiling CP2K_2009-05-01.f90 results in 684 modules,
stracing yields something like 12000 calls to open, and 148'847'399 calls to
lseek.

Clearly anything reducing the number of seeks is likely to have a good impact
on compile time.

For this particular case, caching modules would help a lot as well. However,
our usual pattern is to have a single module per file, and all use statements
at the top of the module. Caching would be of little help for this style. An
efficient encoding of the information in the module would help.

The idea of writing the module compressed, and decompressing it as a big string
to memory for reading and parsing, seems appealing to me.
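
A minimal sketch of that idea, assuming gzip as the compression format and using hypothetical helper names (`write_module`, `read_module`); it illustrates the access pattern only and says nothing about how gfortran stores modules:

```python
import gzip
import io


def write_module(path, text):
    """Store the module text compressed on disk."""
    with gzip.open(path, "wb") as f:
        f.write(text.encode("utf-8"))


def read_module(path):
    """Decompress the whole file into memory and return it as a text
    buffer; all later seeking happens in that buffer, not on disk."""
    with gzip.open(path, "rb") as f:
        data = f.read()
    return io.StringIO(data.decode("utf-8"))
```

A parser can then call `readline()` or `seek()` on the returned buffer as often as it likes without issuing further system calls on the file.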

Concerning a change of format, it would be important to keep one of gfortran's
nice features, that is, the ability to use the modification time of the .mod
files to avoid recompilation cascades. If .mod files would contain a reference
to other .mod files (instead of containing the info directly), this property
might be at risk.
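
One way such a modification-time property can be kept is to rewrite the file only when its content actually changes. The sketch below assumes a compare-then-rename scheme and a hypothetical helper name (`write_if_changed`); it is not taken from the gfortran sources.

```python
import os


def write_if_changed(path, new_bytes):
    """Write new_bytes to path unless the file already holds that content.

    Returns True if the file was (re)written, i.e. its mtime changed.
    """
    try:
        with open(path, "rb") as f:
            if f.read() == new_bytes:
                return False  # identical: leave the file and its mtime alone
    except FileNotFoundError:
        pass
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(new_bytes)
    os.replace(tmp, path)  # swap the new file into place
    return True
```

A build system that compares timestamps then rebuilds dependents only when the module's interface really changed.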


<    2   3   4   5   6   7   8   >