[Pw_forum] wfc files: heavy I/O, handling for restarts

2011-09-04 Thread Michael Sternberg
Dear fellow users and developers,

What's the current wisdom regarding wfc updates hammering a networked file 
system?


Details:

I have trouble with what the author of the user guide knowingly calls
"excessive I/O": I see users running some 20 to 40 pw.x processes which
concurrently write large wfc files. Those writes choke my Lustre file system.
So, count me in as an "angry system manager". I will be throwing more hardware
at this problem shortly, but I feel there is room for improvement in other ways.

The problem arises because pw.x is being run with the somewhat lazy setting of 
ESPRESSO_TMPDIR=".", which means all scratch files are being dumped into 
$PBS_O_WORKDIR, typically somewhere in $HOME or a parallel scratch file system. 
I wonder to what extent "a modern Parallel File System" as prescribed by the 
documentation is actually needed, other than the requirement that it provide 
lots of R/W bandwidth. If one MPI rank writes a file, must this file indeed be 
seen or even readable by another MPI rank? It appears not - one can run pw.x 
just fine with $ESPRESSO_TMPDIR pointing to local scratch directories on the 
nodes.

The tricky bits are

- stageout, i.e., gathering wfc (and bfgs) files from the nodes upon 
job termination, regular or otherwise, so as to preserve intermediate results, 
and

- stagein for restarts, i.e., provide to each MPI rank exactly the 
required .wfc in its local $ESPRESSO_TMPDIR.

I did a proof-of-concept using the following code in a PBS/Torque job file:

---
# $TMPDIR is provided as pointing to a job-specific node-local scratch
# directory that is created on all nodes under the same name.
export ESPRESSO_TMPDIR=$TMPDIR

basename="pwscf"

# Scatter wfc restart files
awk '{ files_for[$1] = files_for[$1] " '$basename'.wfc" NR }
END { for (host in files_for) print host, files_for[host] }' $PBS_NODEFILE \
| while read host files
do
ssh -n $host "cd $PBS_O_WORKDIR; mv $files $ESPRESSO_TMPDIR/"
done
# on master host, copy .save directory as well
rsync -a $basename.save $ESPRESSO_TMPDIR


mpirun  -x ESPRESSO_TMPDIR \
-np $(wc -l < $PBS_NODEFILE) \
-machinefile  $PBS_NODEFILE  \
pw.x -inp input.txt > output.txt

# Gather remote files
uniq $PBS_NODEFILE \
| while read host
do
ssh -n $host "rsync -a $TMPDIR/ $PBS_O_WORKDIR/"
done
---

E.g. for a job with nodes=3:ppn=4 the scatter part would distribute the 
existing files pwscf.wfc{1..12} as follows:

ssh -n n340 'cd /home/stern/test/quantum-espresso/restart_test/run8; mv pwscf.wfc5 pwscf.wfc6 pwscf.wfc7 pwscf.wfc8 /tmp/191405.mds01.carboncluster/'
ssh -n n342 'cd /home/stern/test/quantum-espresso/restart_test/run8; mv pwscf.wfc9 pwscf.wfc10 pwscf.wfc11 pwscf.wfc12 /tmp/191405.mds01.carboncluster/'
ssh -n n339 'cd /home/stern/test/quantum-espresso/restart_test/run8; mv pwscf.wfc1 pwscf.wfc2 pwscf.wfc3 pwscf.wfc4 /tmp/191405.mds01.carboncluster/'
rsync -a pwscf.save /tmp/191405.mds01.carboncluster

(I chose "mv" rather than "cp" for the proof of concept to be sure there's only 
one instance per wfc file available.)

Now, this is of course cumbersome code to repeat in production job scripts, but 
the scatter and gather bits could be isolated into utility scripts callable by 
a single line. Torque provides for Prologue & Epilogue Scripts,
but those have rather restrictive runtime environments.
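
As a rough illustration of such a utility, the scatter step could be split into a function that only computes the per-host file lists and a thin wrapper that acts on them. This is a minimal sketch, not a tested production script: the names wfc_plan and wfc_scatter are made up here, and it assumes, as the proof of concept does, that line N of the node file owns $basename.wfcN.

```shell
#!/bin/sh
# Hypothetical helpers; the names wfc_plan and wfc_scatter are illustrative only.

# wfc_plan NODEFILE BASENAME
# Print one line per host: "host BASENAME.wfcI BASENAME.wfcJ ...",
# assuming line N of NODEFILE corresponds to wfc file number N.
wfc_plan() {
    awk -v base="$2" '
        !($1 in files) { order[++n] = $1 }            # remember first-seen host order
        { files[$1] = files[$1] " " base ".wfc" NR }  # append the file for this rank
        END { for (i = 1; i <= n; i++) print order[i] files[order[i]] }
    ' "$1"
}

# wfc_scatter NODEFILE BASENAME SRCDIR DESTDIR
# Move the wfc files of each host from SRCDIR (shared) to DESTDIR (node-local).
wfc_scatter() {
    wfc_plan "$1" "$2" | while read -r host files
    do
        ssh -n "$host" "cd '$3' && mv $files '$4'/"
    done
}
```

In a job file the stage-in would then reduce to one line, e.g. wfc_scatter "$PBS_NODEFILE" pwscf "$PBS_O_WORKDIR" "$ESPRESSO_TMPDIR". Keeping the planning step separate makes it possible to inspect the file distribution without touching any node.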

Is this something to pursue further?

To avoid the stagein file name+number juggling, could the fopen() functions for
wfc files (and others) perhaps be wrapped such that a file not found in
$ESPRESSO_TMPDIR is read from "." instead, but always written to
$ESPRESSO_TMPDIR? The stageout is somewhat simpler and in fact not specific to
pw.x at all.
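
The lookup order asked about here can be mimicked at the shell level to make the idea concrete. The sketch below only illustrates the fallback rule; resolve_scratch is a made-up name, and nothing in this message says that pw.x offers such a function.

```shell
#!/bin/sh
# Hypothetical illustration of the proposed fallback; not part of pw.x.

# resolve_scratch FILE
# Print the path FILE would be read from: $ESPRESSO_TMPDIR first, then ".".
# Returns non-zero if the file is in neither place.
resolve_scratch() {
    tmpdir=${ESPRESSO_TMPDIR:-.}
    if [ -e "$tmpdir/$1" ]; then
        printf '%s\n' "$tmpdir/$1"
    elif [ -e "./$1" ]; then
        printf '%s\n' "./$1"
    else
        return 1
    fi
}
```

Writes would always go to $ESPRESSO_TMPDIR, so after the first update the node-local copy takes precedence over the one in the working directory.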


With best regards,
Michael


[Pw_forum] xspectra calculation error

2011-09-04 Thread bamidele ibrahim
Dear all,

I am trying to run an xspectra calculation on MgSe. Since there is no
GIPAW pseudopotential for either element on the pwscf-pseudo page, I generated
one each for Mg and Se. After running the scf calculation with these
pseudopotentials, I ran xspectra.x, but to my surprise it keeps giving this
error:

     from allocate_fft : error # 1
     the nr"s are too small!

Please, can anybody help me out with this?

Adetunji Bamidele Ibrahim
Department of physics,University of Agriculture,
Abeokuta, Ogun State,Nigeria.


[Pw_forum] about PWscf PP generation

2011-09-04 Thread GAO Zhe
Of course: since the [Xe] core already contains 5s2 and 5p6, you can change the
parameter "config" to something like: '[Xe] 5d1 6s1.5 6p0.5'

--
GAO Zhe
CMC Lab, MSE, SNU, Seoul, S.Korea




[Pw_forum] about PWscf PP generation

2011-09-04 Thread Robin H
Hello everyone, I used revised PBE to generate a PWscf PP for the La atom. The
first time I ran it, I got an error like this:
  Program LD1 v.4.3.1 starts on  4Sep2011 at 19:53:36
 This program is part of the open-source Quantum ESPRESSO suite
 for quantum simulation of materials; please cite
 "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
  URL http://www.quantum-espresso.org;,
 in publications or presentations arising from this work. More details
at
 http://www.quantum-espresso.org/wiki/index.php/Citing_Quantum-ESPRESSO
 Parallel version (MPI), running on 1 processors
 %%
 from el_config : error #12
 wavefunction 5S found too many times
 %%
 stopping ...
[unset]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
The input is like this:
 &input
title='La'
zed=57.
rel=1,
config='[Xe] 5s2.00 5p6.00 6s2.00 5d1.00',
iswitch=3,
dft='revPBE'
 /
 &inputp
   lloc=1,
   pseudotype=3,
   nlcc=.true.,
   tm=.true.,
   file_pseudopw='La.revPBE.UPF',
 /
5
5D  3  2  1.00   0.00  2.20  2.20
5D  3  2  0.00   0.05  2.20  2.20
6S  1  0  2.00   0.00  2.20  2.20
6S  1  0  0.00   0.05  2.20  2.20
5P  2  1  6.00   0.00  2.20  2.00
Is there anything wrong in my electronic configuration of La?
I then changed it like this:
 &input
title='La'
zed=57.
rel=1,
config='[Xe] 5s2.00 5p6.00 6s1.50 5d1.00 6p0.50',
iswitch=3,
dft='revPBE'
 /
 &inputp
   lloc=1,
   pseudotype=3,
   nlcc=.true.,
   tm=.true.,
   file_pseudopw='La.revPBE.UPF',
 /
5
5S  1  0  2.00   0.00  2.20  2.20
5P  2  1  6.00   0.00  2.20  2.20
5D  3  2  1.00   0.00  2.20  2.20
6S  1  0  1.50   0.00  2.20  2.20
6P  2  1  0.50   0.00  2.20  2.00
The error is still there, the same as before. What does the error mean? I am
unsure how to write the list of states following the &inputp namelist. I
tried to find some information in the PWscf reference to make this clear,
but this part seemed hard to understand. I hope someone experienced in PP
generation can give me some tips; I would appreciate it.


[Pw_forum] Pw_forum Digest, Vol 51, Issue 7

2011-09-04 Thread Sanjay D. Gupta
> Message: 2
> Date: Sun, 4 Sep 2011 11:19:24 +0200
> From: Paolo Giannozzi 
> Subject: Re: [Pw_forum] Problem while Reading celldm of triclinic
>structure
> To: PWSCF Forum 
> Message-ID: <65A2705B-9FDC-47DE-B6D9-E1BC630CA25A at democritos.it>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
>
> On Sep 4, 2011, at 11:00 , Sanjay D. Gupta wrote:
>
> > celldm(4) = ?0.029264993,
> > celldm(5) = ?0.043078844,
>
> the "minus" sign in front of the numbers is not a minus,
> it is a dash. Compare with the correct character:

Dear Sir,
Thank you very much.
This was my silly mistake; now it is working fine.
Thanks once again for the quick reply.
With kind regards,
Sanjay D Gupta



> > celldm(4) = -0.029264993,
> > celldm(5) = -0.043078844,
>
> P.
[Pw_forum] problems with the new Martins-Troullier O pseudo

2011-09-04 Thread Paolo Giannozzi

On Sep 4, 2011, at 13:37 , giuseppe.mattioli at mlib.ism.cnr.it wrote:

> My 4.3.2 QE version (but I tried also with older ones) crashes when used
> with the new O.pbe-mt.UPF pseudopotential. No problems with other ones

funny problem: the reason for the crash is a "comment" field pointing to a
string longer than 80 characters. I have updated the file on the web site.
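
A quick way to spot such over-long attribute strings in a pseudopotential file is to grep for quoted values above the limit. This is a rough sketch: long_attrs is a made-up helper, the 80-character limit is taken from the explanation above, and the pattern simply assumes attributes are written as name="value" on a single line.

```shell
#!/bin/sh
# Hypothetical helper; assumes attributes of the form name="value" on one line.

# long_attrs FILE
# Print "line:content" for every line with a double-quoted attribute value
# longer than 80 characters; returns non-zero if there is none.
long_attrs() {
    grep -nE '="[^"]{81,}"' "$1"
}
```

For example, long_attrs O.pbe-mt.UPF would list the offending line of the old file, if the cause is as described.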

P.
---
Paolo Giannozzi, Dept of Chemistry,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222






[Pw_forum] Problem while Reading celldm of triclinic structure

2011-09-04 Thread Sanjay D. Gupta
Dear QE professionals,
I am running an scf calculation for a triclinic structure, but while reading
the input file it shows
"Bad data for namelist object celldm
Bad data for namelist object celldm"

The program nevertheless runs without any error and gives output, but without
taking celldm(4) and celldm(5) from the input file into account.

Herewith I am pasting the portion of the output where the namelist is read,
and the input file.

**
Part of the program output


 Program PWSCF v.4.3 starts on  4Sep2011 at 14:17:47

 This program is part of the open-source Quantum ESPRESSO suite
 for quantum simulation of materials; please cite
 "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
  URL http://www.quantum-espresso.org;,
 in publications or presentations arising from this work. More details
at
 http://www.quantum-espresso.org/wiki/index.php/Citing_Quantum-ESPRESSO

 Parallel version (MPI), running on 4 processors
 R & G space division:  proc/pool =4

 Current dimensions of program PWSCF are:
 Max number of different atomic species (ntypx) = 10
 Max number of k-points (npk) =  4
 Max angular momentum in pseudopotentials (lmaxx) =  3
 Waiting for input...

 Subspace diagonalization in iterative solution of the eigenvalue
problem:
 a serial algorithm will be used


   Stick Mesh
   ----------
   nst =  3101,  nstw =   449, nsts =  1537
             n.st   n.stw   n.sts     n.g    n.gw    n.gs
   min        775     112     384   30335    1673   10698
   max        776     113     385   30338    1674   10713
             3101     449    1537  121343    6695   42829



 bravais-lattice index =   14
 lattice parameter (a_0)   =   8.8896  a.u.
 unit-cell volume  = 897.7308 (a.u.)^3
 number of atoms/cell  =6
 number of atomic types=3
 number of electrons   =49.00
 number of Kohn-Sham states=   30
 kinetic-energy cutoff =  50.  Ry
 charge density cutoff = 400.  Ry
 convergence threshold =  1.0E-08
 mixing beta   =   0.7000
 number of iterations used =8  local-TF  mixing
 Exchange-correlation  =  SLA  PW   PBE  PBE (1434)
 EXX-fraction  =0.00

 celldm(1)=   8.889603  celldm(2)=   1.241632  celldm(3)=   1.037384
 celldm(4)=   0.00  celldm(5)=   0.00  celldm(6)=   0.125247

 crystal axes: (cart. coord. in units of a_0)
   a(1) = (   1.00   0.00   0.00 )
   a(2) = (   0.155510   1.231855   0.00 )
   a(3) = (   0.00   0.00   1.037384 )

 reciprocal axes: (cart. coord. in units 2 pi/a_0)
   b(1) = (  1.00 -0.126241  0.00 )
   b(2) = (  0.00  0.811784  0.00 )
   b(3) = (  0.00  0.00  0.963964 )
*


Herewith I am also pasting the input file for further details.

 &control
calculation = 'scf',
prefix='CuWO4',
restart_mode='from_scratch',
outdir='./'
pseudo_dir = '/PWSCF/pseudo/',
tstress = .true.
tprnfor = .true.
wf_collect = .true.
etot_conv_thr = 1.0d-5,
forc_conv_thr = 1.0d-4,
 /
 &system
ibrav= 14,
celldm(1) = 8.889603025,
celldm(2) = 1.241632288,
celldm(3) = 1.037383575,
celldm(4) = -0.029264993,
celldm(5) = -0.043078844,
celldm(6) =  0.1252466553,
nosym = .true.,
nat= 6,
ntyp= 3,
ecutwfc = 50,
ecutrho= 400,
occupations= 'smearing',
smearing= 'm-p',
degauss= 0.05,
 /
 &electrons
 mixing_mode = "local-TF",
 mixing_beta =  0.70,
 conv_thr=  1.0d-08,
/
ATOMIC_SPECIES
Cu  63.546  Cu.pbe-n-van_ak.UPF
W   183.84  W.pbe-nsp-van.UPF
O   15.9994 O.pbe-van_ak.UPF
ATOMIC_POSITIONS (crystal)
Cu   0.495330E+00   0.659760E+00   0.244810E+00
W    0.210600E-01   0.173480E+00   0.254290E+00
O    0.249100E+00   0.353500E+00   0.424500E+00
O    0.214500E+00   0.881200E+00   0.430900E+00
O    0.735300E+00   0.380300E+00   0.981000E-01
O    0.782600E+00   0.907900E+00   0.533000E-01
K_POINTS automatic
6 6 6 0 0 0

Any further suggestions are welcome.

Waiting for a positive reply.




~Best Regards
...
Sanjay D. Gupta
Research Fellow
Department of Physics,
Bhavnagar University, Bhavnagar-364 022
Gujarat, Mobile-987943
email:guptasanjay_56 at yahoo.co.in
...


[Pw_forum] problems with the new Martins-Troullier O pseudo

2011-09-04 Thread giuseppe.matti...@mlib.ism.cnr.it

Dear all

My 4.3.2 QE version (but I tried also with older ones) crashes when  
used with the new O.pbe-mt.UPF pseudopotential. No problems with other  
ones, so the error  should depend on the new PP file.
The output stops at

  Program PWSCF v.4.3.2  starts on  4Sep2011 at 13:25:53

  This program is part of the open-source Quantum ESPRESSO suite
  for quantum simulation of materials; please cite
  "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
   URL http://www.quantum-espresso.org;,
  in publications or presentations arising from this work. More details at
  http://www.quantum-espresso.org/wiki/index.php/Citing_Quantum-ESPRESSO

  Parallel version (MPI), running on 4 processors
  R & G space division:  proc/pool =4

  Current dimensions of program PWSCF are:
  Max number of different atomic species (ntypx) = 10
  Max number of k-points (npk) =  4
  Max angular momentum in pseudopotentials (lmaxx) =  3
  Waiting for input...
  Reading input from stdin
rank 0 in job 17  debian_53881   caused collective abort of all ranks
   exit status of rank 0: killed by signal 9

and my nohup.out file contains

##
# FROM IOTK LIBRARY, VERSION 1.2.0
# UNRECOVERABLE ERROR (ierr=1)
# ERROR IN: iotk_scan_attr (iotk_attr+CHARACTER1_0.f90:207)
# CVS Revision: 1.21
#
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

What's wrong?

Yours

Giuseppe


Giuseppe Mattioli
ISM-CNR
Italy






[Pw_forum] Problem while Reading celldm of triclinic structure

2011-09-04 Thread Paolo Giannozzi

On Sep 4, 2011, at 11:00 , Sanjay D. Gupta wrote:

> celldm(4) = ?0.029264993,
> celldm(5) = ?0.043078844,

the "minus" sign in front of the numbers is not a minus,
it is a dash. Compare with the correct character:

> celldm(4) = -0.029264993,
> celldm(5) = -0.043078844,
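
Look-alike characters of this kind usually come from a word processor or a PDF and are hard to see by eye. A minimal sketch for finding them, assuming a grep that honours LC_ALL=C (find_non_ascii is a made-up name, not a QE tool):

```shell
#!/bin/sh
# Hypothetical helper; relies on the C locale treating bytes >= 0x80 as non-printable.

# find_non_ascii FILE
# Print "line:content" for every line holding a byte that is neither
# printable ASCII nor whitespace; returns non-zero if the file is clean.
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}
```

Run on an input file before submitting, it prints nothing when every character is plain ASCII.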

P.
---
Paolo Giannozzi, Dept of Chemistry,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222