Here are some of my compiler options for WIEN2k:

FOPT:-mkl -O -FR -mp1 -w -prec_div -pc80 -pad -ip -g -DINTEL_VML -DMKL_LP64 -traceback -assume buffered_io -I$(TACC_MKL_INC)
FPOPT:-mkl -O -FR -mp1 -w -prec_div -pc80 -pad -ip -g -DINTEL_VML -DMKL_LP64 -traceback -assume buffered_io -I$(TACC_MKL_INC)
LDFLAGS:$(FOPT) -Wl,-rpath,/scratch/tacc/apps/intel19/impi19_0/fftw3/3.3.10/lib,-rpath,/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl/lib/intel64,-rpath,/opt/intel/compilers_and_libraries_2020.1.217/linux/compiler/lib/intel64,-rpath,/usr/lib64 -L/usr/lib64 -lm -ldl -lpthread -L/opt/intel/compilers_and_libraries_2020.1.217/linux/compiler/lib/intel64 -liomp5
R_LIBS:-L/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
RP_LIBS:$(R_LIBS)
FFTWROOT:/scratch/tacc/apps/intel19/impi19_0/fftw3/3.3.10/
FFTW_LIBNAME:fftw3
SCALAPACKROOT:/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl/lib/
SCALAPACK_LIBNAME:mkl_scalapack_lp64
BLACSROOT:/opt/intel/compilers_and_libraries_2020.1.217/linux/mkl/lib/
BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
MPIRUN:srun -N _nodes_ -n _NP_ _PINNING_ _EXEC_
CORES_PER_NODE:64

On Sun, Mar 26, 2023 at 8:21 PM Brian Lee <brianh...@utexas.edu> wrote:

> Hi, thank you for the responses.
>
> Yes, sorry, the dayfile was from a different test run. The run using
> "./wien2k_tasks_v4.sh 2 4" shows it as:
>
>     lapw0 -p (12:51:21) starting parallel lapw0 at Thu Mar 23 12:51:21 CD$
>     -------- .machine0 : 2 processors
>     ** lapw0 crashed!
>
> The .machines file was generated using:
>
> # create hostfile_tacc from a batch
> mpiexec.hydra hostname | cut -d \. -f 1 | sort -n > hostlist_wien2k
> # head of machines_kpoint
> #
> rm .machines
> echo '#' > .machines
> echo 'granularity:1' >> .machines
> # list the hosts in rows for k-point parallelism
> awk -v div=$1 '{_=int(NR/(div+1.0e-10))} {a[_]=((a[_])?a[_]FS:x)$1;l=(_>l)?_:l}END{for(i=0;i<=0;++i)print "lapw0:"a[i]":1"}' hostlist_wien2k >> .machines
> awk -v div=$2 '{_=int(NR/(div+1.0e-10))} {a[_]=((a[_])?a[_]FS:x)$1;l=(_>l)?_:l}END{for(i=0;i<=l;++i)print "1:"a[i]":1"}' hostlist_wien2k >> .machines
> #
> # tail of machines_kpoint: allocate remaining k points one by one over all tasks
> #
> echo 'extrafine:1' >> .machines
> # machines_kpoint is end
> # cleanup
> rm hostlist_wien2k
>
> I believe both fftw and WIEN2k were compiled with the same Intel
> compilers; I've attached my WIEN2k options in the second email. I've
> tried different "CORES_PER_NODE" settings (16, 64) to match either the
> number of cores per node I request or the total number of cores per
> node, but the error is still the same, and running x lapw0 followed by
> x lapw1 -p in my job script leads to:
>
> LAPW0 END
> forrtl: No such file or directory
> forrtl: severe (28): CLOSE error, unit 200, file "Unknown"
> Image              PC                Routine            Line     Source
> lapw1_mpi          00000000004DCBAB  Unknown            Unknown  Unknown
> lapw1_mpi          00000000004CED9F  Unknown            Unknown  Unknown
> lapw1_mpi          000000000045DEE3  inilpw_            264      inilpw.f
> lapw1_mpi          0000000000462050  MAIN__             48       lapw1_tmp_.F
> lapw1_mpi          0000000000408362  Unknown            Unknown  Unknown
> libc-2.28.so       0000147E06BC9CF3  __libc_start_main  Unknown  Unknown
> lapw1_mpi          000000000040826E  Unknown            Unknown  Unknown
> srun: error: c306-005: task 0: Exited with exit code 28
> forrtl: No such file or directory
> forrtl: severe (28): CLOSE error, unit 200, file "Unknown"
>
> Any additional help/information would be greatly appreciated.
>
> Regards,
>
> Brian Lee | Graduate Student
> The University of Texas at Austin | Texas
Materials Institute
> (he/him/his)
>
> On Thu, Mar 23, 2023 at 3:51 PM Peter Blaha <peter.bl...@tuwien.ac.at> wrote:
>
>> My guess would be that you link with a fftw which was compiled with
>> gfortran, while wien2k was compiled with ifort (or the opposite, or with
>> different compiler versions...).
>>
>> Or it was compiled with the proper compilers, but the MPIs were mixed
>> (openmpi vs intelmpi, ...).
>>
>> You can also try to run only:
>>
>> x lapw0 (serial, so that you get proper vsp and vns files for lapw1)
>> x lapw1 -p in mpi-mode. lapw1 does not link fftw (but scalapack and
>> hopefully elpa).
>>
>> Otherwise your report cannot be fully correct: you claim that you
>> requested 2 cores for lapw0, and part of your email supports this.
>> However, I do not understand why the dayfile claims to have 4 cores in
>> .machine0 ???
>>
>> About the way wien2k launches mpi jobs: you can "see" how it does it in
>> the error logs:
>>
>> srun -K -N1 -n2 -r0 /home1/08844/leebrian/wien2k/lapw0_mpi lapw0.def >> .time00
>>
>> Your sysadmins can check this command, and you can put this line in your
>> submit script and test it.
>>
>> PS: In any case, you request 4 nodes and 64 cores in total, but with
>> this .machines file you use only 2 cores in lapw0 and 16 in lapw1/2.
>> This wastes your cpu-hours. Check the part of your script
>> (wien2k_tasks... ????) that generates the .machines file.
>>
>> PS: What is your CORES_PER_NODE setting?
>>
>> PPS: The message from L. Marks that you need a ":number" in the
>> .machines file is not true. It is perfectly ok, and equivalent, to use
>> node:1 or only node.
>>
>> On 23.03.2023 at 19:14, Brian Lee wrote:
>>
>> Hello WIEN2k users/developers,
>>
>> I am a graduate student at UT Austin in the MS&E program and would like
>> to test
When I try to run
>> "run_lapw -p" with the default MPI run command suggested during siteconfig,
>> along with a .machines file/job script that requests 2 processors per
>> lapw0 and/or 2 processors per k-point, I receive the following error:
>>
>> --
>> -----------------------------------------------------------------------
>> Peter Blaha, Inst. f. Materials Chemistry, TU Vienna, A-1060 Vienna
>> Phone: +43-158801165300
>> Email: peter.bl...@tuwien.ac.at
>> WWW: http://www.imc.tuwien.ac.at    WIEN2k: http://www.wien2k.at
>> -------------------------------------------------------------------------
>>
>> _______________________________________________
>> Wien mailing list
>> Wien@zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> SEARCH the MAILING-LIST at:
>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
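For reference, the grouping logic in the .machines generator quoted above can be exercised offline, without a batch allocation. The sketch below reproduces it on a mock hostlist (the hostnames and the division factors 2/2 are made-up stand-ins; on the real cluster the hostlist comes from mpiexec.hydra hostname):

```shell
# Offline sketch of the .machines generation from the quoted script.
cd "$(mktemp -d)"
# mock hostlist: 4 MPI tasks spread over 2 nodes (stand-in hostnames)
printf '%s\n' c306-005 c306-005 c306-006 c306-006 > hostlist_wien2k

lapw0_div=2   # plays the role of $1: tasks per lapw0 row
kpt_div=2     # plays the role of $2: tasks per k-point row

rm -f .machines
echo '#' > .machines
echo 'granularity:1' >> .machines
# group hosts into rows of $lapw0_div; keep only the first row for lapw0
awk -v div="$lapw0_div" '{_=int(NR/(div+1.0e-10))}
  {a[_]=((a[_])?a[_]FS:x)$1;l=(_>l)?_:l}END{for(i=0;i<=0;++i)print "lapw0:"a[i]":1"}' \
  hostlist_wien2k >> .machines
# group hosts into rows of $kpt_div; one "1:hostA hostB:1" row per k-point group
awk -v div="$kpt_div" '{_=int(NR/(div+1.0e-10))}
  {a[_]=((a[_])?a[_]FS:x)$1;l=(_>l)?_:l}END{for(i=0;i<=l;++i)print "1:"a[i]":1"}' \
  hostlist_wien2k >> .machines
echo 'extrafine:1' >> .machines
cat .machines
# Expected .machines contents:
#   #
#   granularity:1
#   lapw0:c306-005 c306-005:1
#   1:c306-005 c306-005:1
#   1:c306-006 c306-006:1
#   extrafine:1
```

Running it with div factors matching your ./wien2k_tasks_v4.sh arguments lets you verify, before submitting, that the generated lapw0 row really has the number of processors you intended.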
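Peter's serial-lapw0-then-parallel-lapw1 isolation test can be wired into a SLURM job script roughly as in the sketch below. The node/core counts and job name are placeholders (partition, account, and module lines are site-specific and omitted); the x lapw0, x lapw1 -p, and srun commands themselves are the ones quoted in the thread:

```shell
#!/bin/bash
#SBATCH -J wien2k_fftw_check
#SBATCH -N 1            # one node is enough for this isolation test
#SBATCH -n 16
#SBATCH -t 00:30:00

cd $SLURM_SUBMIT_DIR    # run from inside the WIEN2k case directory

# serial lapw0: produces the vsp/vns files lapw1 needs, without mpi/fftw
x lapw0

# mpi lapw1 (reads .machines): links scalapack/elpa but not fftw,
# so a crash here points away from the fftw link
x lapw1 -p

# the explicit launch line wien2k itself uses can also be tested directly:
# srun -K -N1 -n2 -r0 /home1/08844/leebrian/wien2k/lapw0_mpi lapw0.def >> .time00
```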