Hi Rolly:
Thank you for replying. I was just looking for how to set the MPI Fortran compiler to mpiifort. "./configure -h" shows how to set the Fortran compiler to ifort and the C compiler to icc, but not the MPI Fortran compiler. Maybe the developers can add a hint about this.

Progress:
1. With Intel Parallel Studio 2016 update 3, the errors in my original post persist.
2. With Intel Parallel Studio 2015, everything works well!

Cheers!
Chong Wang

________________________________
From: pw_forum-boun...@pwscf.org <pw_forum-boun...@pwscf.org> on behalf of Rolly Ng <roll...@gmail.com>
Sent: Monday, May 16, 2016 9:43 PM
To: PWSCF Forum
Subject: Re: [Pw_forum] mpi error using pw.x

Hi Chong Wang,

Perhaps it would be better to run ./configure with

./configure CC=icc CXX=icpc F90=ifort F77=ifort MPIF90=mpiifort --with-scalapack=intel

so that QE knows which compiler to use. Verified with QE v5.3.0.

Rolly

On 05/16/2016 05:52 PM, Paolo Giannozzi wrote:

On Mon, May 16, 2016 at 4:11 AM, Chong Wang <ch-w...@outlook.com> wrote:

> I have checked: my mpif90 calls gfortran, so there's no mix-up.

I am not sure it is possible to use gfortran together with Intel MPI. If you have Intel MPI and MKL, presumably you have the Intel compiler as well.

> Can you kindly share with me your make.sys?

It doesn't make sense to share a make.sys file unless the software configuration is the same.

Paolo

> Thanks in advance!
>
> Best!
> Chong Wang

________________________________
From: pw_forum-boun...@pwscf.org <pw_forum-boun...@pwscf.org> on behalf of Paolo Giannozzi <p.gianno...@gmail.com>
Sent: Monday, May 16, 2016 3:10 AM
To: PWSCF Forum
Subject: Re: [Pw_forum] mpi error using pw.x

Your make.sys shows clear signs of a mixup between ifort and gfortran. Please verify that mpif90 calls ifort and not gfortran (or vice versa). Configure issues a warning if this happens.

I have successfully run your test on a machine with a recent Intel compiler and Intel MPI. The second output (run as "mpirun -np 18 pw.x -nk 18 ...") is an example of what I mean by "type of parallelization": there are many different parallelization levels in QE. This one is over k-points (and in this case it runs faster on fewer processors than parallelization over plane waves).

Paolo

On Sun, May 15, 2016 at 6:01 PM, Chong Wang <ch-w...@outlook.com> wrote:

Hi,

I have done more tests:
1. Intel MPI 2015 yields a segmentation fault.
2. Intel MPI 2013 yields the same error as reported here.

Did I do something wrong with the compilation? Here's my make.sys:

# make.sys.  Generated from make.sys.in by configure.

# compilation rules

.SUFFIXES :
.SUFFIXES : .o .c .f .f90

# most fortran compilers can directly preprocess c-like directives: use
#       $(MPIF90) $(F90FLAGS) -c $<
# if explicit preprocessing by the C preprocessor is needed, use:
#       $(CPP) $(CPPFLAGS) $< -o $*.F90
#       $(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o
# remember the tabulator in the first column !!!
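# (Note added for clarity, not part of the configure-generated file: which
#  compiler a wrapper such as $(MPIF90) actually invokes can be checked with
#  "mpif90 -show" or "mpiifort -show" for Intel MPI / MPICH-style wrappers;
#  with Intel MPI, mpif90 wraps gfortran while mpiifort wraps ifort.)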
.f90.o:
	$(MPIF90) $(F90FLAGS) -c $<

# .f.o and .c.o: do not modify

.f.o:
	$(F77) $(FFLAGS) -c $<

.c.o:
	$(CC) $(CFLAGS) -c $<

# Top QE directory, not used in QE but useful for linking QE libs with plugins
# The following syntax should always point to TOPDIR:
#    $(dir $(abspath $(filter %make.sys,$(MAKEFILE_LIST))))
TOPDIR = /home/wangc/temp/espresso-5.4.0

# DFLAGS  = precompilation options (possible arguments to -D and -U)
#           used by the C compiler and preprocessor
# FDFLAGS = as DFLAGS, for the f90 compiler
# See include/defs.h.README for a list of options and their meaning
# With the exception of IBM xlf, FDFLAGS = $(DFLAGS)
# For IBM xlf, FDFLAGS is the same as DFLAGS with separating commas

# MANUAL_DFLAGS = additional precompilation option(s), if desired
#                 BEWARE: it does not work for IBM xlf! Manually edit FDFLAGS
MANUAL_DFLAGS  =
DFLAGS         = -D__GFORTRAN -D__STD_F95 -D__DFTI -D__MPI -D__PARA -D__SCALAPACK
FDFLAGS        = $(DFLAGS) $(MANUAL_DFLAGS)

# IFLAGS = how to locate directories with *.h or *.f90 file to be included
#          typically -I../include -I/some/other/directory/
#          the latter contains e.g. files needed by FFT libraries
IFLAGS         = -I../include -I/opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/include

# MOD_FLAG = flag used by f90 compiler to locate modules
# Each Makefile defines the list of needed modules in MODFLAGS
MOD_FLAG       = -I

# Compilers: fortran-90, fortran-77, C
# If a parallel compilation is desired, MPIF90 should be a fortran-90
# compiler that produces executables for parallel execution using MPI
# (such as for instance mpif90, mpf90, mpxlf90,...);
# otherwise, an ordinary fortran-90 compiler (f90, g95, xlf90, ifort,...)
# If you have a parallel machine but no suitable candidate for MPIF90,
# try to specify the directory containing "mpif.h" in IFLAGS
# and to specify the location of MPI libraries in MPI_LIBS
MPIF90         = mpif90
#F90           = gfortran
CC             = cc
F77            = gfortran

# C preprocessor and preprocessing flags - for explicit preprocessing,
# if needed (see the compilation rules above)
# preprocessing flags must include DFLAGS and IFLAGS
CPP            = cpp
CPPFLAGS       = -P -C -traditional $(DFLAGS) $(IFLAGS)

# compiler flags: C, F90, F77
# C flags must include DFLAGS and IFLAGS
# F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate syntax
CFLAGS         = -O3 $(DFLAGS) $(IFLAGS)
F90FLAGS       = $(FFLAGS) -x f95-cpp-input $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
FFLAGS         = -O3 -g

# compiler flags without optimization for fortran-77
# the latter is NEEDED to properly compile dlamch.f, used by lapack
FFLAGS_NOOPT   = -O0 -g

# compiler flag needed by some compilers when the main program is not fortran
# Currently used for Yambo
FFLAGS_NOMAIN  =

# Linker, linker-specific flags (if any)
# Typically LD coincides with F90 or MPIF90, LD_LIBS is empty
LD             = mpif90
LDFLAGS        = -g -pthread
LD_LIBS        =

# External Libraries (if any) : blas, lapack, fft, MPI

# If you have nothing better, use the local copy :
#    BLAS_LIBS = /your/path/to/espresso/BLAS/blas.a
#    BLAS_LIBS_SWITCH = internal
BLAS_LIBS      = -lmkl_gf_lp64 -lmkl_sequential -lmkl_core
BLAS_LIBS_SWITCH = external

# If you have nothing better, use the local copy :
#    LAPACK_LIBS = /your/path/to/espresso/lapack-3.2/lapack.a
#    LAPACK_LIBS_SWITCH = internal
# For IBM machines with essl (-D__ESSL): load essl BEFORE lapack !
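# (Note added for clarity, not part of the configure-generated file:
#  -lmkl_gf_lp64 above is MKL's Fortran interface layer for gfortran builds;
#  an ifort/mpiifort build would normally link -lmkl_intel_lp64 instead,
#  which is one visible sign of the ifort/gfortran mixup mentioned earlier
#  in this thread.)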
# remember that LAPACK_LIBS precedes BLAS_LIBS in loading order
LAPACK_LIBS    =
LAPACK_LIBS_SWITCH = external

ELPA_LIBS_SWITCH = disabled
SCALAPACK_LIBS = -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64

# nothing needed here if the internal copy of FFTW is compiled
# (needs -D__FFTW in DFLAGS)
FFT_LIBS       =

# For parallel execution, the correct path to MPI libraries must
# be specified in MPI_LIBS (except for IBM if you use mpxlf)
MPI_LIBS       =

# IBM-specific: MASS libraries, if available and if -D__MASS is defined in FDFLAGS
MASS_LIBS      =

# ar command and flags - for most architectures: AR = ar, ARFLAGS = ruv
AR             = ar
ARFLAGS        = ruv

# ranlib command. If ranlib is not needed (it isn't in most cases) use
# RANLIB = echo
RANLIB         = ranlib

# all internal and external libraries - do not modify
FLIB_TARGETS   = all
LIBOBJS        = ../clib/clib.a ../iotk/src/libiotk.a
LIBS           = $(SCALAPACK_LIBS) $(LAPACK_LIBS) $(FFT_LIBS) $(BLAS_LIBS) $(MPI_LIBS) $(MASS_LIBS) $(LD_LIBS)

# wget or curl - useful to download from network
WGET = wget -O

# Install directory - not currently used
PREFIX = /usr/local

Cheers!
Chong Wang

________________________________
From: pw_forum-boun...@pwscf.org <pw_forum-boun...@pwscf.org> on behalf of Paolo Giannozzi <p.gianno...@gmail.com>
Sent: Sunday, May 15, 2016 8:28:26 PM
To: PWSCF Forum
Subject: Re: [Pw_forum] mpi error using pw.x

It looks like a compiler/MPI bug, since there is nothing special in your input or in your execution, unless you find evidence that the problem is reproducible with other compiler/MPI versions.

Paolo

On Sun, May 15, 2016 at 10:11 AM, Chong Wang <ch-w...@outlook.com> wrote:

Hi,

Thank you for replying. More details:

1. Input data:

&control
   calculation='scf'
   restart_mode='from_scratch',
   pseudo_dir = '../pot/',
   outdir='./out/'
   prefix='BaTiO3'
/
&system
   nbnd = 48
   ibrav = 0, nat = 5, ntyp = 3
   ecutwfc = 50
   occupations='smearing', smearing='gaussian', degauss=0.02
/
&electrons
   conv_thr = 1.0e-8
/
ATOMIC_SPECIES
Ba 137.327 Ba.pbe-mt_fhi.UPF
Ti 204.380 Ti.pbe-mt_fhi.UPF
O   15.999  O.pbe-mt_fhi.UPF
ATOMIC_POSITIONS
Ba 0.0000000000000000  0.0000000000000000  0.0000000000000000
Ti 0.5000000000000000  0.5000000000000000  0.4819999933242795
O  0.5000000000000000  0.5000000000000000  0.0160000007599592
O  0.5000000000000000 -0.0000000000000000  0.5149999856948849
O  0.0000000000000000  0.5000000000000000  0.5149999856948849
K_POINTS (automatic)
11 11 11 0 0 0
CELL_PARAMETERS {angstrom}
3.999800000000001 0.000000000000000 0.000000000000000
0.000000000000000 3.999800000000001 0.000000000000000
0.000000000000000 0.000000000000000 4.018000000000000

2. Number of processors: I tested 24 cores and 8 cores, and both yield the same result.

3. Type of parallelization: I am not sure what you mean. I execute pw.x with:

mpirun -np 24 pw.x < BTO.scf.in >> output

'which mpirun' outputs:
/opt/intel/compilers_and_libraries_2016.3.210/linux/mpi/intel64/bin/mpirun

4. When the error occurs: in the middle of the run.
The last few lines of the output are:

     total cpu time spent up to now is       32.9 secs

     total energy              =    -105.97885119 Ry
     Harris-Foulkes estimate   =    -105.99394457 Ry
     estimated scf accuracy    <       0.03479229 Ry

     iteration #  7     ecut=    50.00 Ry     beta=0.70
     Davidson diagonalization with overlap
     ethr =  1.45E-04,  avg # of iterations =  2.7

     total cpu time spent up to now is       37.3 secs

     total energy              =    -105.99039982 Ry
     Harris-Foulkes estimate   =    -105.99025175 Ry
     estimated scf accuracy    <       0.00927902 Ry

     iteration #  8     ecut=    50.00 Ry     beta=0.70
     Davidson diagonalization with overlap

5. Error message, something like:

Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffc03ae5f38, comm_new=0x7ffc03ae5e90) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffd10080408, comm_new=0x7ffd10080360) failed
PMPI_Cart_sub(178)...................:

Cheers!
Chong

________________________________
From: pw_forum-boun...@pwscf.org <pw_forum-boun...@pwscf.org> on behalf of Paolo Giannozzi <p.gianno...@gmail.com>
Sent: Sunday, May 15, 2016 3:43 PM
To: PWSCF Forum
Subject: Re: [Pw_forum] mpi error using pw.x

Please tell us what is wrong and we will fix it.

Seriously: nobody can answer your question unless you specify, as a strict minimum, the input data, the number of processors, the type of parallelization that triggers the error, and where the error occurs (at startup, later, in the middle of the run, ...).

Paolo

On Sun, May 15, 2016 at 7:50 AM, Chong Wang <ch-w...@outlook.com> wrote:

I compiled Quantum ESPRESSO 5.4 with Intel MPI and MKL 2016 update 3. However, when I ran pw.x the following errors were reported:

...
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffde1391dd8, comm_new=0x7ffde1391d30) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffc02ad7eb8, comm_new=0x7ffc02ad7e10) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7fffb24e60f8, comm_new=0x7fffb24e6050) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)

I googled and found that this might be caused by hitting the OS limit on the number of open files.
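For reference, a typical way to check and raise this limit from a bash shell (a sketch only; the value is the one tried below, and a permanent change would normally go through /etc/security/limits.conf):

    ulimit -n          # print the current soft limit on open files
    ulimit -n 40960    # raise the soft limit for processes started from this shell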
However, after I increased the number of open files per process from 1024 to 40960, the error persists.

What's wrong here?

Chong Wang
Ph.D. candidate
Institute for Advanced Study, Tsinghua University, Beijing, 100084

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

--
PhD. Research Fellow,
Dept. of Physics & Materials Science,
City University of Hong Kong
Tel: +852 3442 4000
Fax: +852 3442 0538
_______________________________________________
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum