[Wien] WIEN2k 12 fft_modules & symmetry

2012-07-29 Thread Gavin Abo
Thanks, Prof. Marks.  Your explanation is better than mine.  Yes, almost 
certainly the default -r4 is used for -O2, but by luck it is not 
truncating the variable.

By the way, do you think it is also by luck that the ifort compiler 
produces an "x symmetry" executable that does not crash with a memory 
access violation outside the lm array for certain structures?  If you 
check SRC_symmetry/class.f on line 8, the array is declared as 
lm(2,49).  However, the array is only declared as lm(2,48) in kurki.f 
on line 3.  Since class.f and kurki.f in SRC_symmetso both have 
lm(2,49), it suggests lm(2,48) should be replaced by lm(2,49).  This 
would affect at least Wien2k 11 and 12.

How I caught the potential issue on my system:

1. Add capitalized -C in SRC_symmetry/Makefile
2. make
3. cp symmetry ..
4. Use in2o3.struct in the Wien2k folder example_struct_files
5. x symmetry
6. The first line in the error message:

forrtl: severe (408): fort: (2): Subscript #2 of the array LM has value 
49 which is greater than the upper bound of 48

When "-C" is not used, the executable runs without error and seems to 
produce the correct output for in2o3.  The error cannot be caught with TiC.
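
The kind of mismatch described above can be sketched in a few lines. This is only an illustration, not the WIEN2k source: the names lm_mismatch and fill_lm are made up, and only the two declared shapes, lm(2,49) and lm(2,48), come from the files mentioned. Built with bounds checking (for example ifort -C), I would expect a report like the one in step 6; built without it, the program may well run through, since the caller's storage is large enough.

```fortran
! Hypothetical reproduction: caller and callee disagree on the last dimension.
program lm_mismatch
  implicit none
  integer :: lm(2,49)          ! shape as declared in class.f
  lm = 0
  call fill_lm(lm, 49)
  print *, 'lm(1,49) =', lm(1,49)
end program lm_mismatch

subroutine fill_lm(lm, n)
  implicit none
  integer, intent(in) :: n
  integer :: lm(2,48)          ! one column short, as in kurki.f
  integer :: j
  do j = 1, n
     lm(1,j) = j               ! j = 49 is outside the declared upper bound
     lm(2,j) = -j
  end do
end subroutine fill_lm
```

Declaring the dummy argument with the same second dimension as the caller removes the bounds-check report.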

On 7/29/2012 2:54 PM, Laurence Marks wrote:
> Almost certainly it is trickier than this. I expect that -O1 is
> truncating relevant variables to real*4 which is leading to problems.
> With -O2 the compiler may well be not bothering to truncate and, at
> the end of the space allocated for the variable, by luck the correct
> values are present. This is luck; the same type of bug can in other
> cases lead to segmentation violations when code gets overwritten.
>
> N.B., I think there are only two places where real*4 variables are
> used, in parts of aim and for storage of the Hamiltonian in lapw1.
> Everything else should be real*8.
>
> On Sun, Jul 29, 2012 at 3:05 PM, Gavin Abo  wrote:
>> I didn't use -r8.  However, you are right.  The scf cycle works
>> correctly if I use "-O1 -r8".
>>
>> So the higher optimizations -O2 and -O3 must be invoking the use of -r8,
>> whereas -O0 and -O1 should be using the default -r4.
>>
>> On 7/29/2012 1:40 PM, Laurence Marks wrote:
>>> I have not tested, but it looks like you are probably right. There may
>>> be other cases where variables are not explicitly defined to be 8
>>> bytes which are normally hidden by the use of "-r8". Did you use -r8?
>>>
>>> On Sun, Jul 29, 2012 at 1:57 PM, Gavin Abo  wrote:
>>>> Dear Prof. Blaha,
>>>>
>>>> Thanks, the scf cycle runs correctly using -O2 or -O3 with the new
>>>> files for the "fftpack" routines.  However, the scf cycle of the TiC
>>>> example does not converge with -O1 (in the lapw0 makefile) with wrong
>>>> values in TiC.output0 such as the plane wave contribution.  I don't
>>>> know whether the problem is reproducible on another system.
>>>>
>>>> It seems to be due to "IMPLICIT REAL*8 (A-H,O-Z)" not being in the
>>>> PIMACH function at the end of the fortran file fftpack_helpers.f.
>>>> This line is in the function in the old file zfft3d.F.
>>>>
>>>> Kind Regards,
>>>>
>>>> Gavin
>>>>
>>>> On Thu, Jul 26, 2012 at 1:42 AM, Peter Blaha wrote:
>>>>> Thanks for the report.
>>>>>
>>>>> The problem concerns lapw0, when compiled in sequential mode WITHOUT
>>>>> -DFFTW2 or -DFFTW3 in the Makefile (i.e. using the old "fftpack"
>>>>> routines instead of the new and faster fftw library).
>>>>>
>>>>> The fix suggested in the mail below does not work. Instead, you have to
>>>>> replace the 3 attached subroutines and recompile (eramps.f,
>>>>> fft_modules.F, fftpack_helpers.f).
>>>>>
>>>>> A corrected version is on the web.
>>>>>
>>>>> PB
>>>>>
>>>>>
>>>>> On 25.07.2012 23:21, Gavin Abo wrote:
>>>>>> Dear Prof. Blaha,
>>>>>>
>>>>>> When I run the TiC example with WIEN2k 12 "without" k-point or mpi
>>>>>> parallelization, the program stops in lapw2 with the error shown below.
>>>>>> Here lapw2 cannot read the TiC.energy file, because it is missing data
>>>>>> in it as lapw0 gives bad output such as a Density Integral with the
>>>>>> value NaN in TiC.output0.
>>>>>>
>>>>>> The problem seems to be related to the new fft module.
>>>>>>
>>>>>> If lines 536-538 and 612-614 in SRC_lapw0/fft_modules.F:
>>>>>>
>>>>>> N2 = N+N
>>>>>> DO 117 I=1,N2
>>>>>>   C(I) = CH(I)
>>>>>>
>>>>>> are both changed to:
>>>>>>
>>>>>> DO 117 I=1,N
>>>>>>   C(I) = CH(I)
>>>>>>
>>>>>> Then, the error goes away.  On my system, N was the number 64.  The
>>>>>> array C had a size of 64, such that the loop is indexing outside the
>>>>>> array (N2 = 128).
>>>>>>
>>>>>> In Wien2k 11, TiC.output0 had:
>>>>>>
>>>>>> PLANE WAVE CONTRIBUTION -0.235589
>>>>>> :DEN  : DENSITY INTEGRAL  =  -754.35311720   (Ry)
>>>>>>
>>>>>> In Wien2k 12 with both changes made in fft_modules.F, TiC.output0 had:
>>>>>>
>>>>>> PLANE WAVE CONTRIBUTION -0.049778
>>>>>> :DEN  : DENSITY INTEGRAL  =  -753.97972930   (Ry)
>>>>>>
>>>>>> The density integral value is about the 

[Wien] WIEN2k 12 fft_modules & symmetry

2012-07-29 Thread Laurence Marks
If the first declaration is lm(2,49) then in later ones it does not
matter (in standard fortran) if it is declared lm(2,1), lm(2,*) or
lm(2,48) -- although lm(2,50) could be problematic. The reason is that
the size of the array is 2*49 and so long as this is not exceeded
everything is fine -- the locations go first over the first index,
then in order (the opposite to C). The "-C" option in fact checks some
things which are allowed in Fortran and slightly incorrectly calls them
errors.

I think it should be 48 everywhere -- I will leave this to Peter who
will probably correct it.
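
The storage-order argument can be checked with a small sketch. This is only an illustration under my own assumptions: lm_layout and show_last are made-up names, and the dummy argument uses the assumed-size form lm(2,*), the variant of those listed above for which a bounds checker has no declared upper limit on the last index. Each element is filled with its position in storage order, i + 2*(j-1), so the first index runs fastest.

```fortran
! Hypothetical demonstration of column-major storage for lm(2,49).
program lm_layout
  implicit none
  integer :: lm(2,49)
  integer :: i, j

  ! Store in every element its own position in storage order.
  do j = 1, 49
     do i = 1, 2
        lm(i,j) = i + 2*(j-1)
     end do
  end do

  call show_last(lm, 49)

contains

  subroutine show_last(lm, n)
    integer, intent(in) :: n
    integer, intent(in) :: lm(2,*)   ! assumed size: last extent taken from the caller
    ! The last column occupies positions 2*n-1 and 2*n of the storage.
    print *, 'lm(1,n), lm(2,n) =', lm(1,n), lm(2,n)
  end subroutine show_last

end program lm_layout
```

For n = 49 the two printed values are 97 and 98, i.e. the end of the 2*49 elements owned by the caller.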

[Wien] WIEN2k 12 fft_modules & symmetry

2012-07-30 Thread Peter Blaha
Yes, I can confirm that the missing real*8 in pimach is a "bug", even when it 
does not show up in optimized compilations. One should insert

IMPLICIT REAL*8 (A-H,O-Z)

in function pimach (bottom of SRC_lapw0/fftpack_helpers.f file).

As L. Marks already pointed out, the "kurki.f" problem is not really a bug (and 
should never cause problems, as this kind of declaration is allowed in Fortran). 
However, it is certainly not "clean" and I'll change it.
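
A minimal sketch of why the IMPLICIT line matters, written in free form for brevity. It is not the actual content of fftpack_helpers.f: the function body is the usual FFTPACK-style definition of pi, assumed here for illustration, and pimach_demo is a made-up driver. The caller types pimach as real*8 through its own IMPLICIT statement, so the function has to declare the same; without the line, pimach and dum would default to 4-byte REAL unless a flag such as -r8 hides the mismatch.

```fortran
! Hypothetical driver: the caller expects an 8-byte result from pimach.
program pimach_demo
  implicit real*8 (a-h,o-z)
  dum = 0.0d0
  pi = pimach(dum)
  print *, 'pi =', pi
end program pimach_demo

function pimach(dum)
  implicit real*8 (a-h,o-z)    ! the line missing in the reported version
  pimach = 4.0d0*atan(1.0d0)   ! assumed FFTPACK-style body, in double precision
end function pimach
```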

