[Wien] errors in lapw

2012-02-03 Thread Bin Shao
Dear all,

I am running WIEN2k 11.1 on a cluster with CentOS 6 under a PBS queuing
system. The job is submitted in k-point parallel mode, and the 36 k-points
in total are distributed over 16 CPUs. However, errors occur in lapw2 and
the dnlapw2_18/19/20.error files are not empty. At the same time, the job
in the PBS system seems dead and cannot be killed with the PBS commands. The
administrator checked the compute node: top shows that the node is under a
very heavy load, above 40, and ps aux shows 16 lapw2 processes that are not
running (apparently suspended). The job caused a heavy load and triggered
the self-protection mechanism of the OS, which automatically suspends every
running process, including ssh logins, except those of the root account.

Any comments will be appreciated; thanks in advance.

The following are the error files and case.dayfile.
dnlapw2_18/19/20.error--
Error in LAPW2


-case.output2dn_19
...
   KVEC( 73563) =   -19   -599.10461
   KVEC( 73564) =   -19   24   -99.10461
   KVEC( 73565) =   -19   2499.10461
   KVEC( 73566) =19  -24   -99.10461
   KVEC( 73567) =19  -2499.10461
   KVEC( 73568) =195   -99.10461
   KVEC( 73569) =19599.10461
   KVE


case.dayfile---
...
[14]   Done  ( ( $remote $machine[$p] cd $PWD;$t
$exe ${def}_${loop}.def $loop;fixerror_lapw ${def}_$loop; rm -f
.lock_$lockfile[$p] )  .stdout2_$loop; if ( -f .stdout2_$loop )
bashtime2csh.pl_lapw .stdout2_$loop  .temp2_$loop; grep \% .temp2_$loop 
.time2_$loop; grep -v \% .temp2_$loop | perl -e print stderr STDIN )
[9]Done  ( ( $remote $machine[$p] cd $PWD;$t
$exe ${def}_${loop}.def $loop;fixerror_lapw ${def}_$loop; rm -f
.lock_$lockfile[$p] )  .stdout2_$loop; if ( -f .stdout2_$loop )
bashtime2csh.pl_lapw .stdout2_$loop  .temp2_$loop; grep \% .temp2_$loop 
.time2_$loop; grep -v \% .temp2_$loop | perl -e print stderr STDIN )
[4]Done  ( ( $remote $machine[$p] cd $PWD;$t
$exe ${def}_${loop}.def $loop;fixerror_lapw ${def}_$loop; rm -f
.lock_$lockfile[$p] )  .stdout2_$loop; if ( -f .stdout2_$loop )
bashtime2csh.pl_lapw .stdout2_$loop  .temp2_$loop; grep \% .temp2_$loop 
.time2_$loop; grep -v \% .temp2_$loop | perl -e print stderr STDIN )
[4] 18809
-

-:log
...
Thu Feb  2 17:58:03 CST 2012 (x) lapw1 -c -dn -p -orb
Thu Feb  2 19:46:53 CST 2012 (x) lapw2 -c -up -p
Thu Feb  2 19:51:36 CST 2012 (x) sumpara -up -d
Thu Feb  2 19:52:07 CST 2012 (x) lapw2 -c -dn -p


(If more information is needed, I will provide.)

Best,

-- 
Bin Shao, Ph.D. Candidate
College of Information Technical Science, Nankai University
94 Weijin Rd. Nankai Dist. Tianjin 300071, China
Email: bshao at mail.nankai.edu.cn


[Wien] errors in lapw

2012-02-03 Thread Peter Blaha


Clearly you should write your job script such that it divides the 36 k-points
in a meaningful way.
In principle you can use 36, 18, 9, 6, 4, or 3 parallel jobs, but 16 is not
meaningful.
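As a rough illustration of why 16 jobs is an awkward choice for 36 k-points, the following Python sketch lists the job counts that divide 36 evenly and shows how the k-points would fall on 16 jobs. It assumes that every k-point costs about the same and that the k-points are spread as evenly as possible. The actual distribution made by the WIEN2k parallel scripts depends on the .machines file, so the function names and the even-split rule here are assumptions for illustration only.

```python
def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def kpoints_per_job(n_kpoints, n_jobs):
    """Spread n_kpoints over n_jobs as evenly as possible (assumed rule)."""
    base, extra = divmod(n_kpoints, n_jobs)
    # 'extra' jobs get one k-point more than the rest.
    return [base + 1] * extra + [base] * (n_jobs - extra)


# Job counts that split 36 k-points without a remainder.
print(divisors(36))

# With 16 jobs some jobs get 3 k-points and the others only 2.
print(kpoints_per_job(36, 16))

# With 12 jobs every job gets 3 k-points.
print(kpoints_per_job(36, 12))
```

Under these assumptions, 16 jobs leave four jobs with 3 k-points and twelve jobs with 2, so the slowest jobs take as long as they would with 12 jobs of 3 k-points each.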

Furthermore, it seems that your cluster has problems with heavy I/O (NFS), and
this is most likely the reason for the observed high load and the crash. Thus
I would
i) not use too many cores. Does one node of your cluster really have 16 cores,
   or is this just due to multithreading, and in fact it has only 8? Do you
   have enough memory per node?
ii) try to use a (local) $SCRATCH directory, which reduces the NFS load. But
   this works only if your k-list and .machines file are compatible, as
   mentioned above.
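A minimal sketch of how a job script could write a .machines file with a job count that matches the k-list is given below. It is written in Python and is only an assumption about a typical setup: the host name node01 and the function name are hypothetical, and the "1:host" lines together with the granularity and extrafine keywords follow the .machines layout as I understand it from the WIEN2k user's guide, so please check them against your installed version.

```python
def machines_lines(hosts, n_jobs):
    """Build .machines lines for n_jobs k-point-parallel jobs.

    Jobs are assigned to the given hosts in turn, each with weight 1.
    """
    if n_jobs < 1 or not hosts:
        raise ValueError("need at least one host and one job")
    lines = [f"1:{hosts[i % len(hosts)]}" for i in range(n_jobs)]
    # Assumed settings: distribute the k-points in one go, one at a time.
    lines += ["granularity:1", "extrafine:1"]
    return lines


# Example: 9 jobs on a single node, so each job handles 4 of the 36 k-points.
# "node01" is a placeholder host name.
print("\n".join(machines_lines(["node01"], 9)))
```

Choosing 9 jobs instead of 16 also leaves some cores and memory free on the node, which is in line with point i) above.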

It also seems to be a rather big calculation (lapw1 took nearly 2 h), so you
may either need MPI, or you should not use all cores of one node of your
cluster because of memory restrictions.


On 03.02.2012 13:56, Bin Shao wrote:
 ___
 Wien mailing list
 Wien at zeus.theochem.tuwien.ac.at
 http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien

-- 

   P.Blaha
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--


[Wien] SPHBES - Error

2012-02-03 Thread Bouabdellah AZOUZA
The program worked with RMT = 2 for Mg, 1.8 for Fe, and 1 for H.
I would like to know how the precision of the calculation (E0, a0, B) and
the computing time depend on RKMAX (RMT*Kmax) and on the number of k-points,
so that I can choose the calculation parameters for this kind of material.
Sincerely yours


2012/2/1, Laurence Marks L-marks at northwestern.edu:
 Why are you using RKMAX=8 or 9.5? These are way too big. Since the
 smallest RMT is 1.0 (H) a value of 5 should be fine, maybe 6 at the
 most.
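To see why RKMAX values of 8 or 9.5 are large when the smallest sphere is hydrogen with RMT = 1.0, recall that RKMAX is the product of the smallest RMT and the plane-wave cutoff Kmax. The Python sketch below uses the Mg and Fe radii from the struct file quoted in this thread and the H radius of 1.0 bohr mentioned above. The helper names are made up for illustration, and the "effective" R*Kmax per atom is simply RMT times Kmax.

```python
# Sphere radii in bohr: Mg and Fe from the quoted struct file, H as stated above.
RMT = {"Mg": 2.5, "Fe": 1.9, "H": 1.0}


def kmax(rkmax, rmt_min):
    """Plane-wave cutoff Kmax (1/bohr) implied by RKMAX = RMT_min * Kmax."""
    return rkmax / rmt_min


def effective_rk(rkmax, rmt):
    """RMT * Kmax for every atom, given RKMAX defined on the smallest sphere."""
    k = kmax(rkmax, min(rmt.values()))
    return {atom: r * k for atom, r in rmt.items()}


for rkmax in (5.0, 6.0, 9.5):
    print(rkmax, effective_rk(rkmax, RMT))
```

With RKMAX = 9.5 and the H sphere setting the scale, the Fe sphere ends up with an effective R*Kmax of about 18, far more than is normally needed, while RKMAX = 5 still gives about 9.5 for Fe.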

 On Tue, Jan 31, 2012 at 2:17 PM, Bouabdellah AZOUZA b.azouza at gmail.com
 wrote:
 After several attempts I confused the numbers.
 Here is my file; the interatomic distances are in bohr, but the problem
 persists.

 MgFeH3
 P   LATTICE,NONEQUIV.ATOMS:  3221_Pm-3m
 MODE OF CALC=RELA unit=bohr
  6.292787  6.292787  6.292787 90.00 90.00 90.00
 ATOM   1: X=0. Y=0. Z=0.
  MULT= 1  ISPLIT= 2
 Mg1NPT=  781  R0=0.0001 RMT=2.5000   Z: 12.0
 LOCAL ROT MATRIX:1.000 0.000 0.000
 0.000 1.000 0.000
 0.000 0.000 1.000
 ATOM   2: X=0.5000 Y=0.5000 Z=0.5000
  MULT= 1  ISPLIT= 2
 Fe2NPT=  781  R0=0.0001 RMT=1.9000   Z: 26.0
 LOCAL ROT MATRIX:1.000 0.000 0.000
 0.000 1.000 0.000
 0.000 0.000 1.000
 ATOM  -3: X=0. Y=0.5000 Z=0.5000
  MULT= 3  ISPLIT=-2
  -3: X=0.5000 Y=0. Z=0.5000
  -3: X=0.5000 Y=0.5000 Z=0.
 H 3NPT=  781  R0=0.0001 RMT=1.   Z:  1.0
 LOCAL ROT MATRIX:0.000 0.000 1.000
 0.000 1.000 0.000
-1.000 0.000 0.000
  48  NUMBER OF SYMMETRY OPERATIONS



 2012/1/30, Laurence Marks L-marks at northwestern.edu:
 You have confused Angstroms and atomic units when generating your
 structure -- the distances are way too close. Please go back to the web
 interface and input your structure properly, or change the units of
 a,b,c to what they should be. This, rather than anything else, is
 99.999% certain to be the source of your problems.
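The unit mix-up can be checked with a few lines of Python. The sketch below converts the lattice parameter 3.33 from Angstrom to bohr and computes the Fe-H distance for the cubic perovskite cell, where Fe sits at (0.5, 0.5, 0.5) and H at (0, 0.5, 0.5), so the distance is a/2. The function names are illustrative assumptions; the conversion factor is the standard 1 Angstrom = 1.8897261246 bohr.

```python
BOHR_PER_ANGSTROM = 1.8897261246  # 1 Angstrom expressed in bohr


def angstrom_to_bohr(length_angstrom):
    """Convert a length from Angstrom to bohr (atomic units)."""
    return length_angstrom * BOHR_PER_ANGSTROM


def fe_h_distance(a):
    """Fe-H distance in a cubic cell with Fe at (1/2,1/2,1/2), H at (0,1/2,1/2)."""
    return a / 2.0


a_wrong = 3.33                     # value entered with unit=bohr in the struct file
a_bohr = angstrom_to_bohr(3.33)    # the same number read as Angstrom

print(f"a = {a_bohr:.6f} bohr")
print(f"Fe-H with a = 3.33 bohr:  {fe_h_distance(a_wrong):.3f} bohr")
print(f"Fe-H with a = {a_bohr:.3f} bohr: {fe_h_distance(a_bohr):.3f} bohr")
```

Reading 3.33 as Angstrom gives about 6.2928 bohr, which agrees with the lattice parameter in the later struct file in this thread, and roughly doubles the Fe-H distance compared with the first file.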

 2012/1/30 Bouabdellah AZOUZA b.azouza at gmail.com:
 Dear Dr. Blaha,
 for small RKMAX (6, 6.5, 7, 7.5) it works. Here is my struct file.

 MgFeH3
 P   LATTICE,NONEQUIV.ATOMS:  3221_Pm-3m
 MODE OF CALC=RELA unit=bohr
  3.33  3.33  3.33 90.00 90.00 90.00
 ATOM   1: X=0. Y=0. Z=0.
  MULT= 1  ISPLIT= 2
 Mg1NPT=  781  R0=0.0001 RMT=1.3000   Z: 12.0
 LOCAL ROT MATRIX:1.000 0.000 0.000
 0.000 1.000 0.000
 0.000 0.000 1.000
 ATOM   2: X=0.5000 Y=0.5000 Z=0.5000
  MULT= 1  ISPLIT= 2
 Fe2NPT=  781  R0=0.0001 RMT=1.   Z: 26.0
 LOCAL ROT MATRIX:1.000 0.000 0.000
 0.000 1.000 0.000
 0.000 0.000 1.000
 ATOM  -3: X=0.5000 Y=0.5000 Z=0.
  MULT= 3  ISPLIT= 2
 ATOM  -3:X= 0. Y=0.5000 Z=0.5000
 ATOM  -3:X= 0.5000 Y=0. Z=0.5000
 H 3NPT=  781  R0=0.0001 RMT=0.5500   Z:  1.0
 LOCAL ROT MATRIX:1.000 0.000 0.000
 0.000 1.000 0.000
 0.000 0.000 1.000
  48  NUMBER OF SYMMETRY OPERATIONS
 thank you in advance for your help
 Best regards


 2012/1/30, Peter Blaha pblaha at theochem.tuwien.ac.at:
 Does it occur with small RKMAX too?

 What does your struct file look like?

 On 28.01.2012 14:57, Bouabdellah AZOUZA wrote:
 Respected Sir,
   I am running WIEN2k version 11 on an i3 machine with the
 Linux 11.3 operating system and the ifort Fortran compiler.
 I am running this case (MgFeH3.struct, perovskite structure). After
 defining and initializing the structure, for RMT*Kmax=9 the SCF run
 reports this error:

   Error in LAPW1
 SPHBES - Error

 For RMT*Kmax=8 and 9.5, the SCF run reports this error:
 error in lapw2
 L2main - QTL-B.GT.15., Ghostbands, check scf files
 Kindly help me to remove these errors.

 Best regards

 Bouabdellah azouza

 Department of Physics, USTHB Algiers Algeria


 

[Wien] band structure

2012-02-03 Thread Saba Sabeti



Dear all users, 

I'm calculating the band structure of some topological half-Heuslers. All of
the calculations run without any error, but the resulting band structure is
wrong. I would be very thankful if somebody could help me correct it.
First of all, although I see the essential band inversion, Gamma7 is not
drawn!
In addition, when I include the SO interaction (without spin polarization),
there is no error but I don't get correct results; and when I include SO
(with spin polarization), there are errors in lapw2!

I want to know whether I should make any changes or supply further
information in the band structure step. Or could it be because of some wrong
input in the initso_lapw step?

Thank you so much in advance.


[Wien] band structure

2012-02-03 Thread Gerhard Fecher
What do you mean by "I don't get correct results"?

Ciao
Gerhard

DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
I think the problem, to be quite honest with you,
is that you have never actually known what the question is.


Dr. Gerhard H. Fecher
Institut of Inorganic and Analytical Chemistry
Johannes Gutenberg - University
55099 Mainz

From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at
zeus.theochem.tuwien.ac.at] on behalf of Saba Sabeti
[raskolnikof6028 at yahoo.com]
Sent: Friday, 3 February 2012 22:37
To: wien at zeus.theochem.tuwien.ac.at
Subject: [Wien] band structure
