[Pw_forum] strange population analysis result, in PW v 5.4

2016-09-02 Thread Yi Wang
Hi, dear developers,

I'm using PW v5.4 for some calculations, and I found that the population
analysis printed during the scf calculation behaves strangely depending on
how the cell parameters are specified.

Below is the input where I found this strange behavior.
As you may see, here I use celldm(1) and alat to control the cell
parameters; in this case I get m=1.36.
If I instead give the cell parameters in the form "CELL_PARAMETERS bohr", I
get m=1.96.
In both cases the "total magnetization" is 1.96. (The cutoff is not
converged for this potential, but potentials and cutoffs are irrelevant
here; I have checked both.)
The geometry is essentially the same, yet the printed analysis is very
different; the energy and the "total" and "absolute" magnetizations are, of
course, not affected.

May I also ask whether the analysis printed during the scf run is a
Mulliken analysis?


&control
calculation = 'scf' ,
prefix = 'pwscf' ,
outdir = './tempfft/' ,
pseudo_dir = './' ,
restart_mode = 'from_scratch' ,
disk_io= 'none'
/
&system
ibrav = 0 ,
ecutwfc = 80 ,
occupations = 'tetrahedra' ,
nspin=2,
nbnd= 18 ,
nat = 1 ,
ntyp = 1 ,
use_all_frac=.TRUE.,
celldm(1)=2.7120201830,
starting_magnetization( 1 )= 0.26,
/
&electrons
conv_thr = 1d-12 ,
diagonalization = 'david' ,
mixing_mode = 'plain' ,
startingpot = 'atomic' ,
startingwfc = 'atomic+random' ,
mixing_beta = 0.12 ,
mixing_ndim =  10,
/
CELL_PARAMETERS alat
-1 1 1
1 -1 1
1 1 -1
ATOMIC_SPECIES
   Fe 55.845 Fe.gth.upf
ATOMIC_POSITIONS bohr
Fe   0 0 0
K_POINTS automatic
20 20 20 0 0 0
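For reference, the "CELL_PARAMETERS bohr" block equivalent to the alat one above is obtained by scaling each vector by celldm(1); a small sketch (plain Python, with the values copied from the input) that prints the bohr form:

```python
# Convert the CELL_PARAMETERS alat block above to explicit bohr units:
# both forms should describe exactly the same bcc cell.
alat = 2.7120201830  # celldm(1) from the input, in bohr

cell_alat = [
    [-1.0,  1.0,  1.0],
    [ 1.0, -1.0,  1.0],
    [ 1.0,  1.0, -1.0],
]

cell_bohr = [[alat * x for x in row] for row in cell_alat]

print("CELL_PARAMETERS bohr")
for row in cell_bohr:
    print(" ".join(f"{x:15.10f}" for x in row))
```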


Thanks for your attention.

Yi Wang

-- 
Yi Wang
Ph.D candidate at Nanjing University of Science and Technology
___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum


[Pw_forum] errors testing espresso 5.4.0

2016-09-02 Thread Fabricio Cannini
Hello there

I'm facing errors in a few tests of espresso 5.4.0.
I'm compiling it on a CentOS 6.x machine in the following manner:
=
FC = intel 15.0
MPI = impi 5.0
BLAS/LAPACK = mkl 11.2
FFT = fftw 3.3.5

BLAS_LIBS="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core"
LAPACK_LIBS="-lmkl_core"
FFT_LIBS="-lfftw3"
FFLAGS="-O2 -assume byterecl -g -traceback -fpe0 -CB -openmp"
MPIF90=mpiifort

./configure --enable-openmp --enable-parallel --without-scalapack

make pw cp ph neb epw
=


When running the pw tests, some of those fail no matter how many mpi 
processes I use.

'pw_b3lyp/b3lyp-h2o.in' and 'pw_b3lyp/b3lyp-O.in' fail with the error 
message:
---
forrtl: severe (408): fort: (2): Subscript #1 of the array CORR has 
value 12 which is greater than the upper bound of 10

Image  PCRoutineLine 
Source
pw.x   016F0EF0  Unknown   Unknown  Unknown
pw.x   00D7B085  funct_mp_set_dft_ 597 
funct.f90
pw.x   00D79837  funct_mp_enforce_ 723 
funct.f90
pw.x   00E2E054  read_pseudo_mod_m 101 
read_pseudo.f90
pw.x   006EA301  iosys_   1444 
input.f90
pw.x   004080B9  run_pwscf_ 63 
run_pwscf.f90
pw.x   00407FBD  MAIN__ 30 
pwscf.f90
pw.x   00407F1E  Unknown   Unknown  Unknown
libc.so.6  0034F221ED1D  Unknown   Unknown  Unknown
pw.x   00407E29  Unknown   Unknown  Unknown
---


'pw_uspp/uspp-hyb-g.in' fails with the error message:
---
forrtl: severe (408): fort: (2): Subscript #1 of the array DSPHER has 
value 1 which is greater than the upper bound of 0

Image  PCRoutineLine 
Source
pw.x   016F0EF0  Unknown   Unknown  Unknown
pw.x   00517B8C  realus_mp_real_sp 602 
realus.f90
pw.x   0050D056  realus_mp_addusfo1284 
realus.f90
pw.x   00AFB1F6  force_us_.L   113 
force_us.f90
pw.x   006A3415  forces_90 
forces.f90
pw.x   004081B8  run_pwscf_129 
run_pwscf.f90
pw.x   00407FBD  MAIN__ 30 
pwscf.f90
pw.x   00407F1E  Unknown   Unknown  Unknown
libc.so.6  0034F221ED1D  Unknown   Unknown  Unknown
pw.x   00407E29  Unknown   Unknown  Unknown
---


'pw_vdw/vdw-ts.in' fails with the error message:
---
forrtl: severe (408): fort: (2): Subscript #1 of the array UTSVDW has 
value 5201 which is greater than the upper bound of 5200

Image  PCRoutineLine 
Source
pw.x   016F0EF0  Unknown   Unknown  Unknown
pw.x   00470436  v_of_rho_  92 
v_of_rho.f90
pw.x   0080AE0B  potinit_  227 
potinit.f90
pw.x   006D98F8  init_run_  99 
init_run.f90
pw.x   00408111  run_pwscf_ 78 
run_pwscf.f90
pw.x   00407FBD  MAIN__ 30 
pwscf.f90
pw.x   00407F1E  Unknown   Unknown  Unknown
libc.so.6  0034F221ED1D  Unknown   Unknown  Unknown
pw.x   00407E29  Unknown   Unknown  Unknown
---



All the messages are similar, so they may have a common cause, but I'm 
unable to tell exactly why. Any ideas?
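For what it's worth, forrtl severe (408) is the Intel runtime's array-bounds trap, which only fires because the build uses the -CB (check-bounds) flag. One quick, hypothetical cross-check is to rebuild with the same flags minus -CB (and minus -fpe0, which can likewise turn benign floating-point exceptions into aborts) and rerun the failing tests:

```
# same build as above, without the runtime-check flags
export FFLAGS="-O2 -assume byterecl -g -traceback -openmp"
./configure --enable-openmp --enable-parallel --without-scalapack
make pw
```

If the tests then pass, the aborts were bounds checks tripping (whether on genuine out-of-bounds accesses or on accesses the checker misjudges) rather than crashes in the optimized code itself.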


TIA,
Fabricio


[Pw_forum] Order of optimization parameters

2016-09-02 Thread mohammadreza hosseini
Dear all

I am starting calculations on a novel structure. What is the order and 
priority of parameter optimization (k-points, ecut, lattice parameters, 
...)? Should these parameters be converged using scf or relax calculations?

Best
mohammadreza

Re: [Pw_forum] Restarting phonon calculation with images, possibility of changing the number of images

2016-09-02 Thread Ye Luo
Hi Thomas,

It could work, but it would not really solve your problem in the best way.
ph.x first decides how to distribute the q+irr calculations among the
images, without knowing whether an individual q+irr is already done.
Only then does it start the calculations, checking whether each one has
already been completed.
So it is quite probable that you will still end up with one image finishing
much earlier than the others.

I have noticed that the speed for different q points can differ a lot; you
can check the number of k points used for each q.
The speed of the different irr belonging to the same q is similar.
To use my resources more efficiently, I prefer the GRID way of computing:
break the whole calculation up by q, and then distribute the irr among the
images.
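The GRID way can be sketched as a loop that writes one ph.x input per q point, so each q can be submitted (and restarted) as an independent job. All names, the 2x2x2 q grid, and NQ=4 below are illustrative only, not taken from any real input:

```shell
# One ph.x input file per q point; submit each as a separate job.
NQ=4
for q in $(seq 1 "$NQ"); do
  cat > "ph.q${q}.in" <<EOF
phonons, q point ${q} only
 &inputph
    prefix  = 'pwscf',
    fildyn  = 'dyn.q${q}',
    ldisp   = .true.,
    nq1 = 2, nq2 = 2, nq3 = 2,
    start_q = ${q},
    last_q  = ${q},
    recover = .true.,
 /
EOF
done
ls ph.q*.in    # four input files, one per q
```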

Ye

===
Ye Luo, Ph.D.
Leadership Computing Facility
Argonne National Laboratory

2016-09-02 10:40 GMT-05:00 Thomas Brumme :

> OK, I think I found a possibility, which does not involve
> writing input files like in the GRID example (i.e. finding out
> which q points and representations finished and which
> didn't which can be quite tedious) but maybe someone
> can confirm...
>
> I now have, e.g., four _ph folders. For two of the images the
> calculations are nearly finished, while the other two haven't
> finished a single calculation. Restarting with 4 images would
> result in 2 of them just waiting... However, if I could restart
> with more or less images the work should be more evenly
> distributed.
>
> Let's say the original images 1 and 3 are finished and 0 and 2 are
> not, so "not / finished / not / finished"... In that case I could
> restart with only two images and the work would be evenly
> distributed.
>
> On the other hand, if I had something like:
> "finished / finished / not / not"
> reducing the number of images to 2 would not solve the
> problem, but in the more general case with many more
> images, doubling the number of images could at least
> reduce the total number of CPUs which don't do anything.
>
> So, I need to create _ph folders for the number of images
> I want to use... Then I need to copy the directory
> _ph0/$prefix.phsave/
> of the original calculation into them in order to have the
> patterns. Then I also copy all the dynmat files of all the
> original images into those directories. If I now restart
> the phonon code should always recognize if a calculation
> has already been done...
>
> Does this sound reasonable?
>
> Kind regards
>
> Thomas
>
>
> On 09/02/2016 12:01 PM, Thomas Brumme wrote:
> > Dear all,
> >
> > I have a question concerning the restart possibilities with image
> > parallelization in a phonon calculation.
> > I have the problem that for some of the images the calculation did not
> > converge. I know that I can achieve
> > convergence by reducing the mixing since I encountered the problem
> > before for exactly the same system.
> > Yet, now, as some of the images are finished with their task (or close
> > to), I have only the possibility of either
> > using only one image copying the dynmat.$iq.$ir.xml files to the
> > _ph0/*.phsave/ directory, or to restart using
> > the same number of images and live with the fact that some images will
> > do nothing...
> > Or is there a third possibility I don't know? Wouldn't it be better to
> > first check what has already been done
> > and then distributing the work among the images? Or is this too hard to
> > code? (I haven't looked at this part
> > of the code yet)
> >
> > OK, I think I could also use some kind of GRID parallelization and
> > create some input files by hand, setting
> > the start_irr, start_q, and so on, but this is rather tedious since I
> > have a big system and a q-point grid...
> > So, again the (maybe stupid) question: Is there another possibility?
> >
> > Regards
> >
> > Thomas
> >
> >
>
> --
> Dr. rer. nat. Thomas Brumme
> Max Planck Institute for the Structure and Dynamics of Matter
> Luruper Chaussee 149
> 22761 Hamburg
>
> Tel:  +49 (0)40 8998 6557
>
> email: thomas.bru...@mpsd.mpg.de
>

[Pw_forum] lda+U abinitio U

2016-09-02 Thread Lorenzo Paulatto
Hello,
I have a question about how to compute the ab initio U. I'm following the 
online Santa Barbara 2009 tutorial (and trying to use the script therein) and 
PRB 71, 035105 (2005).

I have a system that is not very large but quite heavy to compute, with 
several atomic species (3 at first, then 4); only one of them is 
problematic, as it could be cerium or another lanthanoid. 

It would be quite annoying and very CPU-time consuming for me to properly* 
compute U by finite response for all the species, and I'm not sure that
1. it is possible with the available toolchain
2. it makes sense
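As a bookkeeping aid only: in the linear-response scheme of that PRB, the effective U is the difference of the inverses of the bare and self-consistent responses chi = dn/dalpha. A minimal single-site sketch, with made-up response values (the real calculation uses a response matrix over all Hubbard sites in the supercell):

```python
# Single-site linear-response U, following Cococcioni & de Gironcoli,
# PRB 71, 035105 (2005): U = chi0^-1 - chi^-1.
# The response values below are invented for illustration, not computed.
chi  = -0.35  # self-consistent response d n / d alpha (1/eV), hypothetical
chi0 = -0.70  # bare (non-self-consistent) response (1/eV), hypothetical

U = 1.0 / chi0 - 1.0 / chi
print(f"U = {U:.3f} eV")
```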

Also, if somebody (e.g. Matteo) has some more recent and/or more 
user-friendly code to compute dalpha/dn, I would gladly take it.

thank you for your help!

*) by properly I mean take care that it is converged with the supercell size

-- 
Dr. Lorenzo Paulatto
IdR @ IMPMC -- CNRS & Université Paris 6
+33 (0)1 44 275 084 / skype: paulatz
http://www.impmc.upmc.fr/~paulatto/
23-24/4é16 Boîte courrier 115, 
4 place Jussieu 75252 Paris Cédex 05



Re: [Pw_forum] Restarting phonon calculation with images, possibility of changing the number of images

2016-09-02 Thread Thomas Brumme
OK, I think I found a possibility, which does not involve
writing input files like in the GRID example (i.e. finding out
which q points and representations finished and which
didn't which can be quite tedious) but maybe someone
can confirm...

I now have, e.g., four _ph folders. For two of the images the
calculations are nearly finished, while the other two haven't
finished a single calculation. Restarting with 4 images would
result in 2 of them just waiting... However, if I could restart
with more or less images the work should be more evenly
distributed.

Let's say the original images 1 and 3 are finished and 0 and 2 are
not, so "not / finished / not / finished"... In that case I could
restart with only two images and the work would be evenly
distributed.

On the other hand, if I had something like:
"finished / finished / not / not"
reducing the number of images to 2 would not solve the
problem, but in the more general case with many more
images, doubling the number of images could at least
reduce the total number of CPUs which don't do anything.

So, I need to create _ph folders for the number of images
I want to use... Then I need to copy the directory
_ph0/$prefix.phsave/
of the original calculation into them in order to have the
patterns. Then I also copy all the dynmat files of all the
original images into those directories. If I now restart
the phonon code should always recognize if a calculation
has already been done...

Does this sound reasonable?
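In shell terms, the procedure above would look roughly like the sketch below. Directory names follow the _ph0/$prefix.phsave convention mentioned above; the mkdir/touch lines only mock an existing calculation so the sketch is self-contained, and in a real restart _ph0 already holds the pattern and dynmat files:

```shell
PREFIX=pwscf    # hypothetical prefix
NEW_IMAGES=2    # number of images for the restart

# mock of the original run's save directory (remove in real use)
mkdir -p "_ph0/${PREFIX}.phsave"
touch "_ph0/${PREFIX}.phsave/patterns.1.xml" \
      "_ph0/${PREFIX}.phsave/dynmat.1.1.xml"

# one _ph directory per additional image; image 0 keeps using _ph0
for i in $(seq 1 $((NEW_IMAGES - 1))); do
  mkdir -p "_ph${i}/${PREFIX}.phsave"
  # give every image the patterns plus all dynmat files already computed
  cp "_ph0/${PREFIX}.phsave/"*.xml "_ph${i}/${PREFIX}.phsave/"
done
ls "_ph1/${PREFIX}.phsave"
```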

Kind regards

Thomas


On 09/02/2016 12:01 PM, Thomas Brumme wrote:
> Dear all,
>
> I have a question concerning the restart possibilities with image
> parallelization in a phonon calculation.
> I have the problem that for some of the images the calculation did not
> converge. I know that I can achieve
> convergence by reducing the mixing since I encountered the problem
> before for exactly the same system.
> Yet, now, as some of the images are finished with their task (or close
> to), I have only the possibility of either
> using only one image copying the dynmat.$iq.$ir.xml files to the
> _ph0/*.phsave/ directory, or to restart using
> the same number of images and live with the fact that some images will
> do nothing...
> Or is there a third possibility I don't know? Wouldn't it be better to
> first check what has already been done
> and then distributing the work among the images? Or is this too hard to
> code? (I haven't looked at this part
> of the code yet)
>
> OK, I think I could also use some kind of GRID parallelization and
> create some input files by hand, setting
> the start_irr, start_q, and so on, but this is rather tedious since I
> have a big system and a q-point grid...
> So, again the (maybe stupid) question: Is there another possibility?
>
> Regards
>
> Thomas
>
>

-- 
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg

Tel:  +49 (0)40 8998 6557

email: thomas.bru...@mpsd.mpg.de



Re: [Pw_forum] Problem with MPI parallelization: Error in routine zsqmred

2016-09-02 Thread Filippo SPIGA
Dear Jan,

Paolo is right: you are giving us very little information to work with. 
Please create a tar.gz containing:
- your make.sys
- the file "install/config.log"
- the submission script you used to run the job
- the input file
- the pseudo-potentials required to run the example
- some technical details about your workstation / server / cluster


On Sep 2, 2016, at 8:43 AM, Jan Oliver Oelerich 
 wrote:
> 
> Hi QE users,
> 
> I am trying to run QE 5.4.0 with MPI parallelization on a mid-size 
> cluster. I successfully tested the installation using 8 processes 
> distributed on 2 nodes, so communication across nodes is not a problem. 
> When I, however, run the same calculation on 64 cores, I am getting the 
> following error messages in the stdout:

--
Filippo SPIGA ~ Quantum ESPRESSO Foundation ~ http://www.quantum-espresso.org




Re: [Pw_forum] Band Despression

2016-09-02 Thread Stefano de Gironcoli
The file names given to bands.x and plotband.x must be consistent: either 
bands.dat or siliconbands.dat in both. If you change your mind midway, the 
poor code would need ESP abilities to do what you want.

stefano 
(sent from my phone)

> On 02 Sep 2016, at 13:43, Santosh Chiniwar  wrote:
> 
> May get any reply from anybody.
> Thank you
> 
> 
>> On 02-Sep-2016 9:40 am, "Santosh Chiniwar"  wrote:
>> May I accept any reply please
>> 
>> thanks and kind regards
>> iquantware
>> 
>> 
>>> On 31-Aug-2016 10:12 pm, "Santosh Chiniwar"  wrote:
>>> Dear Pw_forum, 
>>>  I am trying to simulate silicon from tutorial. But I couldn't  line 
>>> plot get on Bands.ps. But instead I got point spread.
>>> 
>>> 
>>> Bands.incode is following 
>>> &bands
>>> prefix  = 'Si_exc2'
>>>  outdir='./'
>>> filband = 'siliconbands.dat'
>>> /
>>> 
>>> 
>>> and 
>>> plotbands.in  code is following 
>>> 
>>> bands.dat
>>> -6.00 10.00
>>> bands.xmgr
>>> bands.ps
>>> 6.337
>>> 1.00 6.337
>>>  
>>> Any help or suggestion is appreciated. 
>>> 
>>> I am looking for band structure plot as following. 
>>> 
>>> 
>>> I have used macbook and used preview to open bands.ps in Mac OsX.
>>> 
>>> Thank you 

Re: [Pw_forum] Band Despression

2016-09-02 Thread Francesco Pelizza

Your files are not visible, to me at least.

Are you using a personal script to do the plot? If so, you are quite 
likely doing a scatter plot; change it to a normal line plot.


But it is much better to use plotband.x and see what comes out of it.


Francesco




On 02/09/16 12:43, Santosh Chiniwar wrote:


May get any reply from anybody.
Thank you


On 02-Sep-2016 9:40 am, "Santosh Chiniwar" wrote:


May I accept any reply please

thanks and kind regards
iquantware


On 31-Aug-2016 10:12 pm, "Santosh Chiniwar" <santosh.ch...@gmail.com> wrote:

Dear Pw_forum,
I am trying to simulate silicon from the tutorial. But I couldn't get a
line plot in bands.ps; instead I got scattered points.

[image: Inline images 1]
The bands.in code is the following:
&bands
prefix  = 'Si_exc2'
 outdir='./'
filband = 'siliconbands.dat'
/


and
the plotbands.in code is the following:

bands.dat
-6.00 10.00
bands.xmgr
bands.ps
6.337
1.00 6.337
Any help or suggestion is appreciated.

I am looking for band structure plot as following.
[image: Inline images 2]

I have used a MacBook and used Preview to open bands.ps in Mac OS X.

Thank you









Re: [Pw_forum] Band Despression

2016-09-02 Thread Santosh Chiniwar
May I get a reply from anybody?
Thank you

On 02-Sep-2016 9:40 am, "Santosh Chiniwar"  wrote:

> May I expect a reply, please?
>
> thanks and kind regards
> iquantware
>
> On 31-Aug-2016 10:12 pm, "Santosh Chiniwar" 
> wrote:
>
>> Dear Pw_forum,
>>  I am trying to simulate silicon from the tutorial. But I couldn't get a
>> line plot in bands.ps; instead I got scattered points.
>>
>> [image: Inline images 1]
>> The bands.in code is the following:
>> &bands
>> prefix  = 'Si_exc2'
>>  outdir='./'
>> filband = 'siliconbands.dat'
>> /
>>
>>
>> and
>> the plotbands.in code is the following:
>>
>> bands.dat
>> -6.00 10.00
>> bands.xmgr
>> bands.ps
>> 6.337
>> 1.00 6.337
>>
>> Any help or suggestion is appreciated.
>>
>> I am looking for band structure plot as following.
>> [image: Inline images 2]
>>
>> I have used a MacBook and Preview to open bands.ps in Mac OS X.
>>
>> Thank you
>>
>>
>>
>>
>>
>>
>>
>>
>>
>

[Pw_forum] Restarting phonon calculation with images, possibility of changing the number of images

2016-09-02 Thread Thomas Brumme
Dear all,

I have a question concerning the restart possibilities with image 
parallelization in a phonon calculation.
I have the problem that for some of the images the calculation did not 
converge. I know that I can achieve
convergence by reducing the mixing since I encountered the problem 
before for exactly the same system.
Yet, now, as some of the images are finished with their task (or close 
to), I have only the possibility of either
using only one image copying the dynmat.$iq.$ir.xml files to the 
_ph0/*.phsave/ directory, or to restart using
the same number of images and live with the fact that some images will 
do nothing...
Or is there a third possibility I don't know? Wouldn't it be better to 
first check what has already been done
and then distributing the work among the images? Or is this too hard to 
code? (I haven't looked at this part
of the code yet)

OK, I think I could also use some kind of GRID parallelization and 
create some input files by hand, setting
the start_irr, start_q, and so on, but this is rather tedious since I 
have a big system and a q-point grid...
So, again the (maybe stupid) question: Is there another possibility?

Regards

Thomas


-- 
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg

Tel:  +49 (0)40 8998 6557

email: thomas.bru...@mpsd.mpg.de

___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum


Re: [Pw_forum] Problem with MPI parallelization: Error in routine zsqmred

2016-09-02 Thread Paolo Giannozzi
First of all, try to figure out whether the problem is reproducible on
another machine, or with another software configuration (compilers,
libraries, etc.). Nobody has ever reported such an error.

Paolo

On Fri, Sep 2, 2016 at 9:43 AM, Jan Oliver Oelerich <
jan.oliver.oeler...@physik.uni-marburg.de> wrote:

> Hi QE users,
>
> I am trying to run QE 5.4.0 with MPI parallelization on a mid-size
> cluster. I successfully tested the installation using 8 processes
> distributed on 2 nodes, so communication across nodes is not a problem.
> When I, however, run the same calculation on 64 cores, I am getting the
> following error messages in the stdout:
>
>
>iteration #  1 ecut=30.00 Ry beta=0.70
>Davidson diagonalization with overlap
>
>
> 
> %%
>Error in routine  zsqmred (8):
>
> somthing wrong with row 3
>
> 
> %%
>
>stopping ...
>
> 
> %%
>
>Error in routine  zsqmred (4):
>
> 
> %%
> somthing wrong with row 3
>Error in routine  zsqmred (12):
>
> 
> %%
> somthing wrong with row 3
>
>
> 
> %%
>stopping ...
>
>stopping ...
>
>
> The cluster queues stderr shows that some MPI processes exited:
>
>
> PSIlogger: Child with rank 28 exited with status 12.
> PSIlogger: Child with rank 8 exited with status 4.
> application called MPI_Abort(MPI_COMM_WORLD, 12) - process 28application
> called MPI_Abort(MPI_COMM_WORLD, 4) - process 8application called
> MPI_Abort(MPI_COMM_WORLD, 8) - process 18kvsprovider[12375]: sighandler:
> Terminating the job.
> PSIlogger: Child with rank 18 exited with status 8.
> PSIlogger: Child with rank 4 exited with status 1.
> PSIlogger: Child with rank 15 exited with status 1.
> PSIlogger: Child with rank 53 exited with status 1.
> PSIlogger: Child with rank 30 exited with status 1.
>
>
> The cluster is running some sort of Sun Grid Engine and I used intel
> MPI. I see no other error messages. Could you give me a hint how to
> debug this further? Verbosity is already 'high'.
>
> Thank you very much and best regards,
> Jan Oliver Oelerich
>
>
>
>
> --
> Dr. Jan Oliver Oelerich
> Faculty of Physics and Material Sciences Center
> Philipps-Universität Marburg
>
> Addr.: Room 02D35, Hans-Meerwein-Straße 6, 35032 Marburg, Germany
> Phone: +49 6421 2822260
> Mail : jan.oliver.oeler...@physik.uni-marburg.de
> Web  : http://academics.oelerich.org




-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

[Pw_forum] Problem with MPI parallelization: Error in routine zsqmred

2016-09-02 Thread Jan Oliver Oelerich
Hi QE users,

I am trying to run QE 5.4.0 with MPI parallelization on a mid-size 
cluster. I successfully tested the installation using 8 processes 
distributed on 2 nodes, so communication across nodes is not a problem. 
When I, however, run the same calculation on 64 cores, I am getting the 
following error messages in the stdout:


   iteration #  1 ecut=30.00 Ry beta=0.70
   Davidson diagonalization with overlap

 
%%
   Error in routine  zsqmred (8):

somthing wrong with row 3
 
%%

   stopping ...
 
%%

   Error in routine  zsqmred (4):
 
%%
somthing wrong with row 3
   Error in routine  zsqmred (12):
 
%%
somthing wrong with row 3

 
%%
   stopping ...

   stopping ...


The cluster queues stderr shows that some MPI processes exited:


PSIlogger: Child with rank 28 exited with status 12.
PSIlogger: Child with rank 8 exited with status 4.
application called MPI_Abort(MPI_COMM_WORLD, 12) - process 28application 
called MPI_Abort(MPI_COMM_WORLD, 4) - process 8application called 
MPI_Abort(MPI_COMM_WORLD, 8) - process 18kvsprovider[12375]: sighandler: 
Terminating the job.
PSIlogger: Child with rank 18 exited with status 8.
PSIlogger: Child with rank 4 exited with status 1.
PSIlogger: Child with rank 15 exited with status 1.
PSIlogger: Child with rank 53 exited with status 1.
PSIlogger: Child with rank 30 exited with status 1.


The cluster is running some sort of Sun Grid Engine, and I used Intel 
MPI. I see no other error messages. Could you give me a hint on how to 
debug this further? Verbosity is already 'high'.
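(A possibly useful narrowing step, since zsqmred belongs to the distributed-diagonalization layer of pw.x, would be to rerun the same 64-core job but force a single diagonalization group from the command line; the file names are placeholders:)

```
# same job, but without distributed diagonalization
mpirun -np 64 pw.x -ndiag 1 -in pw.in > pw.out
```

If the run then completes, the problem is confined to the parallel diagonalization setup rather than to the MPI installation as a whole.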

Thank you very much and best regards,
Jan Oliver Oelerich




-- 
Dr. Jan Oliver Oelerich
Faculty of Physics and Material Sciences Center
Philipps-Universität Marburg

Addr.: Room 02D35, Hans-Meerwein-Straße 6, 35032 Marburg, Germany
Phone: +49 6421 2822260
Mail : jan.oliver.oeler...@physik.uni-marburg.de
Web  : http://academics.oelerich.org