When you use the srun setup of WIEN2k, you are tightly integrated into
your batch system and have to follow all of your system's default
settings.
For instance, you configured CORES_PER_NODE to 1, but I very much doubt
that your cluster has only one core per node, and srun will probably make
certain assumptions based on that.
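As a first step, set CORES_PER_NODE to the real number of cores of one of
your nodes, either by re-running siteconfig or by editing the corresponding
setenv line in WIEN2k's parallel_options file (the file your setenv lines
below come from). A sketch, assuming 16-core nodes (adjust the number to
your hardware):

   setenv CORES_PER_NODE 16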
Two suggestions for tests:
a) Run it on only ONE node, but on all cores of this node. The
corresponding .machines file should contain
   1:machine1:YY
where YY is the number of cores (16 or 24, ...); a complete example
follows after suggestion b).
b) If your queuing-system setup allows you to use mpirun, reconfigure
WIEN2k (siteconfig) with the default intel+mkl option (not the srun
option). It will then suggest mpirun ... for starting jobs.
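For a), the whole .machines file for an MPI run on a single node could
look like this (machine1 and the core count 16 are only placeholders for
your actual hostname and core number):

granularity:1
1:machine1:16
extrafine:1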
Make sure that in your batch job (I assume you are using one) the proper
modules are loaded (intel, mkl, intel-mpi).
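As a rough sketch of such a SLURM batch job (the module names, the core
count and the way the .machines file is generated are only examples and
must be adapted to your cluster):

#!/bin/bash
#SBATCH --job-name=wien2k
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16

# load the same compiler / MKL / MPI environment used to compile WIEN2k
module load intel mkl intel-mpi

# build the .machines file from the nodes SLURM assigned to this job
rm -f .machines
echo "granularity:1" > .machines
for host in $(scontrol show hostnames $SLURM_JOB_NODELIST); do
  echo "1:${host}:${SLURM_NTASKS_PER_NODE}" >> .machines
done
echo "extrafine:1" >> .machines

run_lapw -p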
On 11/26/19 7:07 PM, Hanning Chen wrote:
Dear WIEN2K community,
I am a new user of WIEN2K, and just compiled it using the following
options:
current:FOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:FPOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:OMP_SWITCH:-qopenmp
current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -lpthread -lm -ldl -liomp5
current:DPARALLEL:'-DParallel'
current:R_LIBS:-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
current:FFTWROOT:/home/ec2-user/FFTW338/
current:FFTW_VERSION:FFTW3
current:FFTW_LIB:lib
current:FFTW_LIBNAME:fftw3
current:LIBXCROOT:
current:LIBXC_FORTRAN:
current:LIBXC_LIBNAME:
current:LIBXC_LIBDNAME:
current:SCALAPACKROOT:$(MKLROOT)/lib/
current:SCALAPACK_LIBNAME:mkl_scalapack_lp64
current:BLACSROOT:$(MKLROOT)/lib/
current:BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
current:ELPAROOT:
current:ELPA_VERSION:
current:ELPA_LIB:
current:ELPA_LIBNAME:
current:MPIRUN:srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_
current:CORES_PER_NODE:1
current:MKL_TARGET_ARCH:intel64
setenv TASKSET "no"
if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
setenv DELAY 0.1
setenv SLEEPY 1
setenv WIEN_MPIRUN "srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_"
if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE 1
# if ( ! $?PINNING_COMMAND) setenv PINNING_COMMAND "--cpu_bind=map_cpu:"
# if ( ! $?PINNING_LIST ) setenv PINNING_LIST "0,8,1,9,2,10,3,11,4,12,5,13,6,14,7,15"
Then, I ran a k-point parallelization with the .machines file below,
and it worked perfectly:
granularity:1
1:machine1
2:machine2
extrafine:1
But, when I tried to parallelize it over MPI with the new .machines file:
granularity:1
1:machine1 machine2
extrafine:1
lapw1 crashed with the following error message:
** Error in Parallel LAPW1
** LAPW1 STOPPED
** check ERROR FILES!
SEP INFO = -21
'SECLR4' - SYEVX (Scalapack/LAPACK) failed
Although I understand that the 21st parameter passed to the SYEVX
subroutine is invalid, I am not sure how to fix the problem. I have also
tried linking WIEN2K with NETLIB's SCALAPACK/LAPACK/BLAS instead of MKL,
but the same error appeared again.
Please help me out. Thanks.
Hanning Chen, Ph.D.
Department of Chemistry
American University
Washington, DC 20016
--
P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/TC_Blaha
--------------------------------------------------------------------------