[Pw_forum] forrtl: severe (174): SIGSEGV, segmentation fault occurred

2009-09-25 Thread James J Ramsey
2009/9/25 Q.J.Wang :
> Dear all
> Following people's advice, I checked it, and memory has been ruled out
> as the cause. The task I run is a small system. With k-point division the
> task finishes, while with R & G division it does not, and this turns up:
> node1
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC            Routine  Line     Source
> mca_pml_ob1.so     7F34DE43BB56  Unknown  Unknown  Unknown
> mca_pml_ob1.so     7F34DE43C268  Unknown  Unknown  Unknown
> mca_btl_sm.so      7F34DDC2102D  Unknown  Unknown  Unknown
> libopen-pal.so.0   7F34E25ECFC9  Unknown  Unknown  Unknown
> mca_pml_ob1.so     7F34DE43803C  Unknown  Unknown  Unknown
> libmpi.so.0        7F34E2B306E0  Unknown  Unknown  Unknown
> libmpi_f77.so.0    7F34E2DCFB94  Unknown  Unknown  Unknown
> pw.x               0051F1BA      Unknown  Unknown  Unknown
> pw.x               0067389F      Unknown  Unknown  Unknown
> pw.x               00663521      Unknown  Unknown  Unknown
> pw.x               00664647      Unknown  Unknown  Unknown
> pw.x               00595B60      Unknown  Unknown  Unknown
> pw.x               004629DB      Unknown  Unknown  Unknown
> pw.x               004627CC      Unknown  Unknown  Unknown
> libc.so.6          7F34E180C586  Unknown  Unknown  Unknown
> pw.x               004626C9      Unknown  Unknown  Unknown
> --
> mpirun has exited due to process rank 4 with PID 6302 on
> node node1 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> Any advice will be appreciated !

What implementation of MPI are you using, and what version? For
example, are you using MPICH 1.2.7 or OpenMPI 1.3?

[Pw_forum] Running on cluster often turn up

2009-09-25 Thread Q.J.Wang
Dear all
 When I run on a cluster, bizarre errors often turn up that do not
appear on a single computer. I don't know why. Is it because of the
computing environment settings? I don't know what to do. Please help me.


Best regards
XiangTan University 

[Pw_forum] about OMP_NUM_THREADS

2009-09-25 Thread Paolo Giannozzi
Lorenzo Paulatto wrote:

> Welcome Jason, and thank you for providing all the necessary  
> information from the beginning.

... with the exception of an appropriate subject :-)

> You are using  
> norm-conserving pseudopotential, I think 50-60Ry should be enough

for Ge and Te, I think it is more than enough.

One may spare time by making first a quick-and-dirty structural
optimization, then refining it with higher cutoff and more k-points.
I would start with a much lower cutoff (25Ry for instance) and the
Gamma point only (maybe with a few more bands and small smearing:
if the system is close to a metallic state, it is easy to run into
trouble otherwise). For damped MD, a good choice of the time step
is important. I personally prefer BFGS to damped MD.

Paolo Giannozzi, Democritos and University of Udine, Italy

[Pw_forum] about OMP_NUM_THREADS

2009-09-25 Thread Lorenzo Paulatto
On 25 September 2009 at 16:06:16, Jason Larkin wrote:
> First, thanks very much for the great software package.

Welcome Jason, and thank you for providing all the necessary information  
 from the beginning. There are indeed a couple of issues with your input,  
let's have a look.

> forc_conv_thr = 1.0d-5,
> etot_conv_thr = 1.0d-5,

Force and energy are not in the same unit of measure, so I don't know if
this is a sensible choice, but it looks like the force threshold is too
strict compared to the energy one: you are requesting the total energy to
change by less than 1mRy if you move one atom by one hundredth of a bohr...

> ecutwfc = 100, ecutrho = 800,

These cutoffs are very high: in QE cutoffs are specified in Rydberg
units (some less-free DFT-based software uses eV). You are using
norm-conserving pseudopotentials, so I think 50-60Ry should be enough (you
should test convergence on a small system). For the same reason you can
use the default value of ecutrho (=4*ecutwfc).
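
A minimal sketch of the cutoff arithmetic above, assuming the standard conversion 1 Ry = 13.605693 eV (rounded). The helper names are hypothetical and not part of Quantum-ESPRESSO; the factor of 4 is the default ratio quoted above.

```python
# Hypothetical helpers (not part of Quantum-ESPRESSO) illustrating the
# cutoff arithmetic discussed above.

RY_TO_EV = 13.605693  # 1 Rydberg in electronvolts (rounded)


def ry_to_ev(energy_ry: float) -> float:
    """Convert an energy from Rydberg to electronvolts."""
    return energy_ry * RY_TO_EV


def default_ecutrho(ecutwfc_ry: float) -> float:
    """Default charge-density cutoff: 4 * ecutwfc, as stated above."""
    return 4.0 * ecutwfc_ry


if __name__ == "__main__":
    # Compare the suggested 50-60 Ry range with the 100 Ry from the input.
    for ecutwfc in (50.0, 60.0, 100.0):
        print(f"ecutwfc = {ecutwfc:6.1f} Ry = {ry_to_ev(ecutwfc):8.1f} eV, "
              f"default ecutrho = {default_ecutrho(ecutwfc):6.1f} Ry")
```

With ecutwfc = 100 the default ecutrho would be 400 Ry, so the ecutrho = 800 in the input is twice the default on top of an already high wavefunction cutoff.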

> My problem (issue) is this; the simulation continues to time out before
> completing the relaxation.

You can always restart a simulation from the point where you left it (as
long as the new run has access to outdir): just set restart_mode='restart'
in the control namelist.
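
A small sketch of how the input could be switched to restart mode with a script, assuming the input text already contains a single-quoted restart_mode entry. The function name is hypothetical, and this is plain text substitution rather than a real namelist parser.

```python
import re

# Matches e.g. restart_mode='from_scratch', with optional spaces around '='.
RESTART_MODE_RE = re.compile(r"restart_mode\s*=\s*'[^']*'")


def set_restart_mode(input_text: str) -> str:
    """Return the input text with restart_mode set to 'restart'.

    Raises ValueError if no restart_mode entry is present, so a missing
    entry is not silently ignored.
    """
    new_text, count = RESTART_MODE_RE.subn("restart_mode='restart'", input_text)
    if count == 0:
        raise ValueError("no restart_mode entry found in the input text")
    return new_text


if __name__ == "__main__":
    # Illustrative fragment only, not a complete pw.x input.
    fragment = "restart_mode='from_scratch',\nprefix='Pt',\n"
    print(set_restart_mode(fragment))
```

The rest of the input, and the contents of outdir, must be left untouched for the restart to work.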

> As such, I've been coninuously increasing
> the amount of time and processors that I am requesting to perform this
> calculation.  However, it has reached a point (32 processors and 3:30
> hours) where I'm almost certain that I am doing something wrong.

You have about 100 electrons in your system; with such a high cutoff it
won't be able to do many iterations. Furthermore, the very strict threshold
for force convergence could cause problems (all criteria have to be
satisfied).

> It appears that my force(s) are not consistent with the force threshold
> that I set. I'm guessing this is because my initial guess at the
> structure is quite far from the relaxed structure.  So, would it be
> reasonable to take the unit cell coordinates from the last completed
> self-consistent calcualtion and use those as starting coordinates for a
> brand new vc-relax run?

Yes, or you can just use the restart feature, which will also recover the  
partially-completed scf cycle.
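
A rough sketch for checking how far a run is from the force threshold, assuming the output contains lines of the form "Total force = ..." as in the output quoted in this thread. The function names are hypothetical. The comparison also assumes the reported total force is the norm over all force components, so it is only a conservative proxy for the program's own convergence test.

```python
import re
from typing import Optional

# Matches the number following "Total force =" in pw.x output.
TOTAL_FORCE_RE = re.compile(r"Total force\s*=\s*([0-9]*\.?[0-9]+)")


def last_total_force(output_text: str) -> Optional[float]:
    """Return the last 'Total force' value in the output, or None."""
    matches = TOTAL_FORCE_RE.findall(output_text)
    if not matches:
        return None
    return float(matches[-1])


def is_below_threshold(total_force: float, forc_conv_thr: float) -> bool:
    """True if the total force is already below the force threshold."""
    return total_force < forc_conv_thr


if __name__ == "__main__":
    sample = " Total force = 0.004763 Total SCF correction = 0.002516"
    force = last_total_force(sample)
    print(force, is_below_threshold(force, 1.0e-5))
```

For the values quoted in this thread, the total force of 0.004763 is far above forc_conv_thr = 1.0d-5, which is consistent with the relaxation not stopping.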

> This would be an effort to continue the
> calculation, but not lose the 3:30 of run time I've already used.

If you want to be sure that your job is not killed while it is saving the
restart data (which requires a boring manual restart), you can use the
max_seconds variable. Be careful, because setting it too close to the time
limit will likely cause the effect it's meant to prevent!
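
One way to pick a max_seconds value is to subtract a safety margin from the queue time limit. The sketch below assumes a margin of 10% of the wall time with a floor of 300 seconds; both numbers are illustrative assumptions, and the helper names are hypothetical. The right margin depends on how long your system needs to write the restart data.

```python
# Hypothetical helpers: choose max_seconds from a queue wall-time limit.
# The 10% / 300 s safety margin is an assumption, not a recommendation
# from the Quantum-ESPRESSO documentation.


def walltime_to_seconds(walltime: str) -> int:
    """Parse an 'HH:MM:SS' wall-time string into seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def safe_max_seconds(walltime_s: int, margin_percent: int = 10,
                     min_margin_s: int = 300) -> int:
    """Return a max_seconds value that leaves a margin before the limit."""
    margin = max(min_margin_s, walltime_s * margin_percent // 100)
    if margin >= walltime_s:
        raise ValueError("wall time is too short for the requested margin")
    return walltime_s - margin


if __name__ == "__main__":
    limit = walltime_to_seconds("03:30:00")  # the 3:30 hours quoted above
    print(f"max_seconds = {safe_max_seconds(limit)}")
```

Integer arithmetic is used throughout so the result can be pasted directly into the input file.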

Finally, you cannot use the restart feature if you change the cutoff;
you'll have to do a manual restart by copying the last atomic positions.


Lorenzo Paulatto
phone: +39 040 3787 511
skype: paulatz
www:   http://people.sissa.it/~paulatto/

 *** save italian brains ***

[Pw_forum] About example 22

2009-09-25 Thread Dal Corso Andrea
On Fri, 2009-09-25 at 14:25 +, xirainbow wrote:
> Dear developers:
> There are three trivial slip of example22.
> In the "README" file of example22:
> ONE: "2) make a band structure calculation for Pt (input=pt.band.in,
> output=pt.band.out)." 
> should be "(input=pt.nscf.in,   output=pt.nscf.out)."
> TWO:"9) make a self-consistent calculation for Pt in a tetragonal cell
> with 4 atoms  (input=pt.tet4.in, output=pt.tet4.out)."
> should be " (input=pt4.in, output=pt4.out)."
Thank you. I will correct in the cvs version.

> THREE: When I try to open "pt.nscf_ph.in" with PWgui, I get the follow
> error"syntax error in the input file:expecting keyword ATOMIC SPECIES,
> but read PHONON instead"
> I use espresso4.0.4. But I notice that in espresso4.1, there is
> not "pt.nscf_ph.in" file any more.
> The "pt.nscf_ph.in" is given below:
> "Pt
> Pt
> calculation = 'phonon'
> restart_mode='from_scratch',
> prefix='Pt',
> pseudo_dir = '/home/raman/espresso-4.0.4/pseudo/',
> outdir='/home/raman/tmp/'
>  /
> ibrav=  2, celldm(1) =7.42, nat=  1, ntyp= 1,
> lspinorb=.true.,
> noncolin=.true.,
> starting_magnetization=0.0,
> occupations='smearing',
> degauss=0.02,
> smearing='mp',
> ecutwfc =30.0,
> ecutrho =250.0,
>  /
> mixing_beta = 0.7,
> conv_thr =  1.0d-8
>  /
>  xqq(1)=1.d0,
>  xqq(2)=0.d0,
>  xqq(3)=0.d0,
>  /
> Pt  79.90Ptrel.RRKJ3.UPF
> Pt  0.000   0.   0.0
> 2 2 2 1 1 1"

The option 'phonon' is no longer available in the pw.x input.
The band structure calculation is done by ph.x if needed.


> -- 
> Hui Wang
> School of physics, Nankai University, Tianjin, China
> ___
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://www.democritos.it/mailman/listinfo/pw_forum
Andrea Dal Corso        Tel. 0039-040-3787428
SISSA, Via Beirut 2/4   Fax. 0039-040-3787528
34151 Trieste (Italy)   e-mail: dalcorso at sissa.it

[Pw_forum] about OMP_NUM_THREADS

2009-09-25 Thread Gabriele Sclauzero
Q.J.Wang wrote:
> Dear all
>  Following the advice of the forum, I added OMP_NUM_THREADS=1 to
> my PBS script. But I can only run one task on the cluster. When one is
> running and I start another task, one of them gives an error and stops.

Have you specified different outdir and/or prefix for the two tasks?
Have you checked that you have enough memory on a node to run two pw.x runs?


> --
> Best regards
> Q.J.Wang
> XiangTan University
> ___
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://www.democritos.it/mailman/listinfo/pw_forum


o  o
| Gabriele Sclauzero, PhD Student  |
| c/o:   SISSA & CNR-INFM Democritos,  |
|via Beirut 2-4, 34014 Trieste (Italy) |
| email: sclauzer at sissa.it |
| phone: +39 040 3787 511  |
| skype: gurlonotturno |
o  o

[Pw_forum] about OMP_NUM_THREADS

2009-09-25 Thread Lorenzo Paulatto
On 25 September 2009 at 15:41:03, Q.J.Wang wrote:
>  Following the advice of the forum, I added OMP_NUM_THREADS=1 to
> my PBS script. But I can only run one task on the cluster. When one is
> running and I start another task, one of them gives an error and stops.

Dear Q.J. Wang,
as usual, you MUST provide the error message and as many details as
possible on how you run the job; there is very little we can say about your
problem without any information.

To begin with, you should check that the jobs do not have the same prefix
and outdir; if that is the case, they will crash after a while.
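
A minimal sketch of such a check, assuming you can list the outdir and prefix of each job you plan to run at the same time. The job names and paths below are made up for illustration, and the function is a hypothetical helper, not a Quantum-ESPRESSO tool.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_scratch_collisions(jobs: Dict[str, Tuple[str, str]]) -> List[List[str]]:
    """Group job names that share the same (outdir, prefix) pair.

    `jobs` maps a job name to its (outdir, prefix) settings.
    """
    by_scratch = defaultdict(list)
    for name, (outdir, prefix) in jobs.items():
        # Normalise a trailing slash so '/tmp/a' and '/tmp/a/' compare equal.
        key = (outdir.rstrip("/"), prefix)
        by_scratch[key].append(name)
    return [sorted(names) for names in by_scratch.values() if len(names) > 1]


if __name__ == "__main__":
    # Hypothetical jobs: job_a and job_b would overwrite each other's files.
    jobs = {
        "job_a": ("/scratch/user/tmp/", "pwscf"),
        "job_b": ("/scratch/user/tmp", "pwscf"),
        "job_c": ("/scratch/user/tmp", "other"),
    }
    for group in find_scratch_collisions(jobs):
        print("same outdir and prefix:", ", ".join(group))
```

Jobs that share outdir but use different prefixes, like job_c here, are not flagged, since the advice above concerns jobs where both settings coincide.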

best regards

Lorenzo Paulatto
phone: +39 040 3787 511
skype: paulatz
www:   http://people.sissa.it/~paulatto/

 *** save italian brains ***

[Pw_forum] Running on cluster often turn up

2009-09-25 Thread Giovanni Cantele
Q.J.Wang wrote:
> Dear all
> When I run on a cluster, bizarre errors often turn up that do not
> appear on a single computer. I don't know why. Is it because of the
> computing environment settings? I don't know what to do.
> Please help me.
Well, I think that it will be rather difficult for anybody to answer
unless you provide much more extensive information than you did.

Which kind of errors? Parallel or serial runs?

Some issues might be related to your cluster configuration / hardware /
software rather than to Quantum-ESPRESSO itself.

You can also try searching the forum to find out whether some of these
errors have ever been discussed before.



Dr. Giovanni Cantele
Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
Universita' di Napoli "Federico II"
Complesso Universitario di Monte S. Angelo - Ed. 6
Via Cintia, I-80126, Napoli, Italy
Phone: +39 081 676910
Fax:   +39 081 676346
E-mail: giovanni.cantele at cnr.it
giovanni.cantele at na.infn.it
Web: http://people.na.infn.it/~cantele
Research Group: http://www.nanomat.unina.it
Skype contact: giocan74

[Pw_forum] About example 22

2009-09-25 Thread xirainbow
Dear developers:
There are three trivial slips in example22.

In the "README" file of example22:
ONE: "2) make a band structure calculation for Pt (input=pt.band.in,
output=pt.band.out)."
should be "(input=pt.nscf.in,   output=pt.nscf.out)."
TWO:"9) make a self-consistent calculation for Pt in a tetragonal cell with
4 atoms  (input=pt.tet4.in, output=pt.tet4.out)."
should be " (input=pt4.in, output=pt4.out)."

THREE: When I try to open "pt.nscf_ph.in" with PWgui, I get the following
error: "syntax error in the input file: expecting keyword ATOMIC SPECIES,
but read PHONON instead"
I use espresso4.0.4. But I notice that in espresso4.1 there is no
"pt.nscf_ph.in" file any more.
The "pt.nscf_ph.in" is given below:
calculation = 'phonon'
pseudo_dir = '/home/raman/espresso-4.0.4/pseudo/',
ibrav=  2, celldm(1) =7.42, nat=  1, ntyp= 1,
ecutwfc =30.0,
ecutrho =250.0,
mixing_beta = 0.7,
conv_thr =  1.0d-8
Pt  79.90Ptrel.RRKJ3.UPF
Pt  0.000   0.   0.0
2 2 2 1 1 1"


Hui Wang
School of physics, Nankai University, Tianjin, China

[Pw_forum] problem in phonon in prallel

2009-09-25 Thread dev sharma

 When I run the same file with the following inputs, it runs:

phonons at gamma
  amass(1)= 44.995,
  amass(2)= 50.9415,
  outdir = '/home/devsharma/work/newscvo/temp',
0.0 0.0 0.0

This means there is some problem in the inputs. Any comment on the extra
parameters I gave in the previous file, i.e. fpol and elop?

Dev Sharma,
University  of Delhi,

On Fri, Sep 25, 2009 at 11:03 AM, dev sharma  wrote:

> hi 2 all,
>  i have done scf , phonon  and the running ph.x in parallel for some
> optical properties. my programme stops without giving any error in the
> ph.out file and without any CRASH. My input file is listed below. Please
> help.
> Thanks in advance.
> phonons at gamma
>   tr2_ph=1.0e-10,
>   prefix='yvo',
>   epsil=.true.,
>   elop=.true.,
>   amass(1)= 44.995,
>   amass(2)= 50.9415,
>   amass(3)=15.9994,
>   outdir = '/home/devsharma/work/newscvo/temp',
>   fildyn='yvo.dyn1',
>  /
> 0.0 0.0 0.0
> 65
> 7.889e+14
> /
> /
> 4.389e+14
> and mssg in terminal is coming
> [devsharma at headnode newscvo]$ [headnode.du.ac.in:6309] *** An error
> occurred in MPI_Allreduce
> [headnode.du.ac.in:6309] *** on communicator MPI COMMUNICATOR 28 SPLIT
> FROM 4
> [headnode.du.ac.in:6309] *** MPI_ERR_TRUNCATE: message truncated
> [headnode.du.ac.in:6309] *** MPI_ERRORS_ARE_FATAL (your MPI job will now
> abort)
> --
> mpirun has exited due to process rank 1 with PID 6309 on
> node headnode.du.ac.in exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --
> and the output file runs upto
>  Representation52  1 modes - To be done
>  PHONON   :  0m41.95s CPU time, 2m31.11s wall time
>  Alpha used in Ewald sum =   1.
>  Frequency Dependent Polarizability Calculation
> and after here it stops.
> Please help,
> sincerly,
> Dev Sharma,
> Univeristy of Delhi

[Pw_forum] Pw_forum Digest, Vol 27, Issue 82

2009-09-25 Thread Duy Le
> [headnode.du.ac.in:6309] *** on communicator MPI COMMUNICATOR 28 SPLIT
> >> FROM 4
> >> [headnode.du.ac.in:6309] *** MPI_ERR_TRUNCATE: message truncated
> >> [headnode.du.ac.in:6309] *** MPI_ERRORS_ARE_FATAL (your MPI job will now
> >> abort)
> >> --
> >> mpirun has exited due to process rank 1 with PID 6309 on
> >> node headnode.du.ac.in exiting without calling "finalize". This may
> >> have caused other processes in the application to be
> >> terminated by signals sent by mpirun (as reported here).
> >>
> >> --
> >> and the output file runs upto
> >>
> >>
> >>  Representation52  1 modes - To be done
> >>  PHONON   :  0m41.95s CPU time, 2m31.11s wall time
> >>
> >>
> >>  Alpha used in Ewald sum =   1.
> >>
> >>  Frequency Dependent Polarizability Calculation
> >>
> >> and after here it stops.
> >>
> >>
> >> Please help,
> >>
> >> sincerly,
> >> Dev Sharma,
> >> Univeristy of Delhi
> >>
> >>
> >>
> >-- next part --
> >An HTML attachment was scrubbed...
> >URL: 
> >http://www.democritos.it/pipermail/pw_forum/attachments/20090925/270ab22b/attachment-0001.htm
> >
> >--
> >
> >Message: 2
> >Date: Fri, 25 Sep 2009 10:18:39 +0200
> >From: Paolo Giannozzi 
> >Subject: Re: [Pw_forum] problem in phonon in prallel
> >To: PWSCF Forum 
> >Message-ID: <4ABC7CDF.2050209 at democritos.it>
> >Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >
> >dev sharma wrote:
> >
> >> Any comment on the extra
> >> parameters i gave in previous file i.e. fpol and elop.
> >
> >don't give any extra parameters unless you know what they do,
> >and unless you need them
> >
> >P.
> >--
> >Paolo Giannozzi, Democritos and University of Udine, Italy
> >
> >
> >--
> >
> >Message: 3
> >Date: Fri, 25 Sep 2009 21:02:50 +0800 (CST)
> >From: "Q.J.Wang" 
> >Subject: [Pw_forum] Running on cluster often turn up
> >To: pw_forum 
> >Message-ID:
> > <10083410.509721253883770555.JavaMail.coremail at bj126app104.126.com>
> >Content-Type: text/plain; charset="gbk"
> >
> >Dear all
> > When I running on cluster ,it often turn up some bizarre errors ,which 
> > not turn up ong single computer .I don't know why .Whether is it because 
> > the computing environment settings ? I don' know how to do .Plese help me .
> >
> >--
> >
> >Best regards
> >
> >Q.J.Wang
> >
> >XiangTan University
> >-- next part --
> >An HTML attachment was scrubbed...
> >URL: 
> >http://www.democritos.it/pipermail/pw_forum/attachments/20090925/9b5d0c9e/attachment-0001.htm
> >
> >--
> >
> >Message: 4
> >Date: Fri, 25 Sep 2009 15:08:14 +0200
> >From: Giovanni Cantele 
> >Subject: Re: [Pw_forum] Running on cluster often turn up
> >To: PWSCF Forum 
> >Message-ID: <4ABCC0BE.40302 at na.infn.it>
> >Content-Type: text/plain; charset=x-gbk; format=flowed
> >
> >Q.J.Wang wrote:
> >> Dear all
> >> When I running on cluster ,it often turn up some bizarre errors ,which
> >> not turn up ong single computer .I don't know why .Whether is it
> >> because the computing environment settings ? I don' know how to do
> >> .Plese help me .
> >>
> >Well, I think that it will be rather difficult for anybody to answer
> >unless you provide much more
> >extensive information than you did.
> >
> >Which kind of errors? Parallel or serial runs?
> >
> >Some issues might be related to your cluster configuration / hardware /
> >software rather than to the Quantum-ESPRESSO.
> >
> >You can also try to search the forum to find if some of these errors
> >have been ever discussed before.
> >
> >Giovanni
> >
> >--
> >
> >
> >
> >Dr. Giovanni Cantele
> >Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
> >Universita' di Napoli "Federico II"
> >Complesso Universitario di Monte S. Angelo - Ed. 6
> >Via Cintia, I-80126, Napoli, Italy
> >Phone: +39 081 676910

[Pw_forum] problem in phonon in prallel

2009-09-25 Thread dev sharma
Hi to all,
 I have done the scf and phonon calculations and am now running ph.x in
parallel for some optical properties. My program stops without giving any
error in the ph.out file and without any CRASH file. My input file is
listed below. Please help.
Thanks in advance.
phonons at gamma
  amass(1)= 44.995,
  amass(2)= 50.9415,
  outdir = '/home/devsharma/work/newscvo/temp',
0.0 0.0 0.0


and the message appearing in the terminal is

[devsharma at headnode newscvo]$ [headnode.du.ac.in:6309] *** An error occurred
in MPI_Allreduce
[headnode.du.ac.in:6309] *** on communicator MPI COMMUNICATOR 28 SPLIT FROM 4
[headnode.du.ac.in:6309] *** MPI_ERR_TRUNCATE: message truncated
[headnode.du.ac.in:6309] *** MPI_ERRORS_ARE_FATAL (your MPI job will now
abort)
mpirun has exited due to process rank 1 with PID 6309 on
node headnode.du.ac.in exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

and the output file runs up to

 Representation52  1 modes - To be done
 PHONON   :  0m41.95s CPU time, 2m31.11s wall time

 Alpha used in Ewald sum =   1.

 Frequency Dependent Polarizability Calculation

and after here it stops.

Please help,

Dev Sharma,
University of Delhi

[Pw_forum] problem in phonon in prallel

2009-09-25 Thread Paolo Giannozzi
dev sharma wrote:

> Any comment on the extra 
> parameters i gave in previous file i.e. fpol and elop.

don't give any extra parameters unless you know what they do,
and unless you need them

Paolo Giannozzi, Democritos and University of Udine, Italy

[Pw_forum] about OMP_NUM_THREADS

2009-09-25 Thread Jason Larkin
Hello Forum,

First, thanks very much for the great software package. Here is my 
issue, one which is probably due to my own ignorance about QE/quantum 
physics in general:

I'm trying to run a variable-cell relaxation on the compound GeTe.
Below is my input file:

calculation = 'vc-relax'
pseudo_dir = '$HOME/pseudo/',
forc_conv_thr = 1.0d-5,
etot_conv_thr = 1.0d-5,

ibrav = 4, celldm(1)=7.993540933, celldm(3)=2.581560284
nat = 18, ntyp= 2,
ecutwfc = 100, ecutrho = 800,
mixing_beta = 0.7
conv_thr = 1.0d-6,
  upscale = 100.d0,
 Te  1.0 Te.pz-bhs.UPF
 Ge  1.0 Ge.pz-bhs.UPF
K_POINTS {automatic}
 12 12 2 0 0 0 

My problem (issue) is this: the simulation continues to time out before
completing the relaxation.  As such, I've been continuously increasing
the amount of time and processors that I am requesting to perform this
calculation.  However, it has reached a point (32 processors and 3:30
hours) where I'm almost certain that I am doing something wrong.  I was
able to find one forum posting where a user was running a relaxation and
the calculation was not stopping even though the energy and force
error thresholds were being satisfied. It appears that my
energy does indeed converge.   Let me attach part of my output:

!total energy  =  -203.60367909 Ry
 Harris-Foulkes estimate   =  -203.60367961 Ry
 estimated scf accuracy< 0.0054 Ry

 convergence has been achieved in   9 iterations

 Forces acting on atoms (Ry/au):

 atom   1 type  1   force = 0.0.0.
 atom   2 type  2   force =
 atom   3 type  1   force = 0.0.   -0.00108418
 atom   4 type  2   force = 0.0.   -0.00111021
 atom   5 type  1   force = 0.0.   -0.00066898
 atom   6 type  2   force =
 atom   7 type  1   force = 0.0.   -0.00150570
 atom   8 type  2   force = 0.0.   -0.00163780
 atom   9 type  1   force =
 atom  10 type  2   force = 0.0.0.
 atom  11 type  1   force = 0.0.   -0.00177984
 atom  12 type  2   force =
 atom  13 type  1   force =
 atom  14 type  2   force = 0.0.   -0.00030722
 atom  15 type  1   force =
 atom  16 type  2   force =
 atom  17 type  1   force =
 atom  18 type  2   force = 0.0.   -0.00052582

 Total force = 0.004763 Total SCF correction = 0.002516

 entering subroutine stress ...

  total   stress  (Ry/bohr**3)   (kbar) P= 
   0.02322957   0.   0.   3417.19  0.00  0.00
   0.   0.02322957   0.  0.00   3417.19  0.00
   0.   0.   0.06877108  0.00  0.00  10116.57

 Wentzcovitch Damped Cell-Dynamics Minimization
 convergence thresholds: EPSE = 0.10E-04  EPSF = 0.10E-04  EPSP = 

 Entering Dynamics;  it = 1   time =  0.0 pico-seconds

The simulation goes through 23 complete Self-consistent Calculations. 
Here is the last self-consistent run (before the 24th run timed out):

!total energy  =  -216.83428802 Ry
 Harris-Foulkes estimate   =  -216.83428838 Ry
 estimated scf accuracy< 0.0036 Ry

 convergence has been achieved in  10 iterations

 Forces acting on atoms (Ry/au):

 atom   1 type  1   force = 0.