Dear Lyudmila,

This is almost certainly an OS problem, and there is little that you can do except find a better supercomputer!
It could be an NFS problem, in which case setting SCRATCH to a local directory on each compute node might help. Alternatively, while you are supposed to have all of any given node, the jobs might not be running that way -- a lot depends upon how srun is configured. One thing to test is intra-node MPI (same node) versus cross-node MPI. The first should always be fast.

And... buy a sys admin a beer (vodka) and have him/her explain in more detail how they have things configured.

On Tue, Oct 6, 2020 at 10:14 AM Lyudmila Dobysheva <lyuk...@mail.ru> wrote:

> Dear all,
>
> I have started working at supercomputer and sometimes I see some delays
> during execution. They occur randomly, more frequently during lapw0, but
> in other programs also (extra 7-20 min). Administrators say that there
> can be sometimes problems with the net's speed.
> But I cannot understand: now I take only one node with 16 processors.
> I'd say that if I send the task to one node the problems of the net
> between computers should not affect till the whole task ends.
> Maybe I have wrongly set scratch variable?
> In .bashrc:
> export SCRATCH=./
>
> During execution I see how the cycle is fulfilled, that is, after lapw0
> I see its output files. This means that after lapw0 the calculating node
> sends to the governing computer the files, and, maybe, here it waits? Is
> this behavior correct? I expected that I should not see the intermediate
> stages, till the work ends.
> And the very programs lapw0, lapw1, lapw2, lcore, mixer - maybe they are
> reloaded to the calculating computer every cycle anew?
>
> Best regards
> Lyudmila Dobysheva
>
> some details WIEN2k_19.2
> ifort 64 19.1.0.166
> ---------------
> parallel_options:
> setenv TASKSET "srun "
> if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
> if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
> setenv WIEN_GRANULARITY 1
> setenv DELAY 0.1
> setenv SLEEPY 1
> if ( ! $?WIEN_MPIRUN) setenv WIEN_MPIRUN "srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_"
> if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE 16
> --------------
> WIEN2k_OPTIONS:
> current:FOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
> current:FPOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
> current:OMP_SWITCH:-qopenmp
> current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -lpthread -lm -ldl -liomp5
> current:DPARALLEL:'-DParallel'
> current:R_LIBS:-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
> current:FFTWROOT:/home/uffff/.local/
> current:FFTW_VERSION:FFTW3
> current:FFTW_LIB:lib
> current:FFTW_LIBNAME:fftw3
> current:LIBXCROOT:
> current:LIBXC_FORTRAN:
> current:LIBXC_LIBNAME:
> current:LIBXC_LIBDNAME:
> current:SCALAPACKROOT:$(MKLROOT)/lib/
> current:SCALAPACK_LIBNAME:mkl_scalapack_lp64
> current:BLACSROOT:$(MKLROOT)/lib/
> current:BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
> current:ELPAROOT:
> current:ELPA_VERSION:
> current:ELPA_LIB:
> current:ELPA_LIBNAME:
> current:MPIRUN:srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_
> current:CORES_PER_NODE:16
> current:MKL_TARGET_ARCH:intel64
>
> ------------------
> http://ftiudm.ru/content/view/25/103/lang,english/
> Physics-Techn.Institute,
> Udmurt Federal Research Center, Ural Br. of Rus.Ac.Sci.
> 426000 Izhevsk Kirov str. 132
> Russia
> ---
> Tel. +7 (34I2)43-24-59 (office), +7 (9I2)OI9-795O (home)
> Skype: lyuka18 (office), lyuka17 (home)
> E-mail: lyuk...@mail.ru (office), lyuk...@gmail.com (home)

--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
Corrosion in 4D: www.numis.northwestern.edu/MURI
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody else has thought"
Albert Szent-Gyorgi
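
For reference, a minimal sketch of the SCRATCH suggestion above, for a bash job script or .bashrc as in the quoted message. It assumes that the node has a writable local disk reachable through $TMPDIR or /tmp; that location and the directory name wien2k_scratch_... are illustrative guesses, not something confirmed for this cluster, so check with the administrators where node-local storage actually lives.

```shell
# Assumption: $TMPDIR (or /tmp as a fallback) is on the node's local disk, not on NFS.
scratch_base="${TMPDIR:-/tmp}"

# Hypothetical per-job directory name; SLURM_JOB_ID is set inside a Slurm job,
# otherwise the shell's PID is used so that concurrent runs do not collide.
export SCRATCH="${scratch_base}/wien2k_scratch_${USER:-wien}_${SLURM_JOB_ID:-$$}"

# Create the directory and report where it ended up.
mkdir -p "$SCRATCH" || echo "cannot create $SCRATCH" >&2
echo "SCRATCH=$SCRATCH"

# Show which filesystem holds the directory; an NFS mount typically appears
# as host:/path in the first column.
df -P "$SCRATCH"
```

With k-point parallel runs over several nodes the directory has to exist on every node involved, and it is worth removing it at the end of the job.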
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html