Thank you very much for the hint. I found 256 *storeHinv* files of 221 MB each in my $SCRATCH directory. My mistake was to use an NFS-mounted directory on a head node as $SCRATCH. I have now changed it to a local directory.
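For scale, here is a rough sketch of the arithmetic behind these numbers. The function names are made up for illustration; only the figures (256 files of 221 MB, and the 3600 MB / 100 processors example quoted below) come from this thread.

```python
# Back-of-the-envelope I/O volume for the case.storeHinv files.
# Function names are hypothetical; the numbers are those quoted in this thread.

def total_size_mb(n_files: int, mb_per_file: float) -> float:
    """Combined size in MB when every process writes one file of the same size."""
    return n_files * mb_per_file


def per_process_mb(total_mb: float, n_processes: int) -> float:
    """Share in MB that each process reads/writes when the total is split evenly."""
    return total_mb / n_processes


if __name__ == "__main__":
    # 256 cores, one 221 MB file each: with $SCRATCH on NFS, all of this
    # traffic goes through the single machine exporting the directory.
    print("all files together:", total_size_mb(256, 221), "MB")

    # Example from the quoted reply: 3600 MB spread over 100 processors.
    print("per process:", per_process_mb(3600, 100), "MB")
```

With a local scratch directory each node only handles the files of its own processes, instead of the whole volume crossing the network to one server.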
Thank you once again,
Oleg

>>> Peter Blaha <pblaha at theochem.tuwien.ac.at> 12/23/09 2:17 AM >>>
The new iterative diagonalization creates files called case.storeHinv.., where the inverse of H is stored (one triangle of the matrix in single precision). These files can be quite large (e.g. for matrix size 30000 the size of all Hinv files, one per processor, is 3600 MB or 7200 MB for real/complex), but on a balanced cluster they should be written/read in 100-200 seconds. The file is created only once (in the second scf cycle), but read in all subsequent iterative scf cycles. Please note that the method is usually so efficient that one can even run a minimization with -it0:

min -j "run_lapw -it0"

i.e. one does not need to create it again!

As with the vector files, you can use the SCRATCH variable to direct these files to a local scratch directory (e.g. with 100 processors, each processor reads/writes only 36 MB!).

> I observe the cluster network dying for about 10 minutes when performing a
> calculation for a relatively large case that involves 256 cores and
> InfiniBand. I use WIEN2k_09.2 (Release 29/9/2009) + ifort 11.0.074 + Intel
> MKL 10.1.0.015 + MVAPICH2 and iterative diagonalization. The network always
> dies at the end of the second scf iteration (most likely at the end of
> lapw1). This did not occur in WIEN2k_08.3 (Release 18/9/2008) for the same
> case and compiler settings. I know that the iterative diagonalization has
> undergone some major changes between these two versions.
>
> This does not actually interrupt the calculations and there is no sign of
> any error, but it causes the SGE daemon to die on the compute nodes, with
> all the consequences.
>
> Did anyone experience a similar problem? What is different in the behaviour
> of lapw1 in the 2nd iteration that may cause the problem?
>
> Thank you in advance and Happy Holidays.
>
> Oleg Rubel
>
> --
> Thunder Bay Regional Research Institute
> 290 Munro St, Thunder Bay, ON, P7A 7T1, Canada
> Homepage: http://www.tbrri.com/~orubel/
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien

--
-----------------------------------------
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671
Fax: +43-1-5880115698
email: pblaha at theochem.tuwien.ac.at
-----------------------------------------