Your make.sys shows clear signs of a mix-up between ifort and gfortran. Please verify that mpif90 actually calls ifort and not gfortran (or vice versa); configure issues a warning when this happens.
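A quick way to check which backend compiler the mpif90 wrapper really invokes (the exact option depends on the MPI flavour; these are the usual ones):

    mpif90 -show       # Intel MPI and MPICH-style wrappers: print the underlying compile command
    mpif90 --showme    # Open MPI equivalent
    mpif90 --version   # the version banner also distinguishes ifort from GNU Fortran

Note that with Intel MPI the ifort wrapper is mpiifort, while mpif90 typically defaults to gfortran; that may be how the gfortran settings (-D__GFORTRAN, F77 = gfortran, -lmkl_gf_lp64) ended up in a build meant to use ifort.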
I have successfully run your test on a machine with a recent Intel compiler and Intel MPI. The second output (run as mpirun -np 18 pw.x -nk 18 ...) is an example of what I mean by "type of parallelization": there are many different parallelization levels in QE. This one is over k-points, and in this case it runs faster on fewer processors than parallelization over plane waves. The two outputs are attached as prova2.24p and prova2.18k; sample command lines are sketched after the attachments below.

Paolo

On Sun, May 15, 2016 at 6:01 PM, Chong Wang <ch-w...@outlook.com> wrote:

> Hi,
>
> I have done more tests:
>
> 1. Intel MPI 2015 yields a segmentation fault
>
> 2. Intel MPI 2013 yields the same error as here
>
> Did I do something wrong with compiling? Here's my make.sys:
>
> # make.sys. Generated from make.sys.in by configure.
>
> # compilation rules
>
> .SUFFIXES :
> .SUFFIXES : .o .c .f .f90
>
> # most fortran compilers can directly preprocess c-like directives: use
> #    $(MPIF90) $(F90FLAGS) -c $<
> # if explicit preprocessing by the C preprocessor is needed, use:
> #    $(CPP) $(CPPFLAGS) $< -o $*.F90
> #    $(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o
> # remember the tabulator in the first column !!!
>
> .f90.o:
>     $(MPIF90) $(F90FLAGS) -c $<
>
> # .f.o and .c.o: do not modify
>
> .f.o:
>     $(F77) $(FFLAGS) -c $<
>
> .c.o:
>     $(CC) $(CFLAGS) -c $<
>
> # Top QE directory, not used in QE but useful for linking QE libs with plugins
> # The following syntax should always point to TOPDIR:
> #   $(dir $(abspath $(filter %make.sys,$(MAKEFILE_LIST))))
>
> TOPDIR = /home/wangc/temp/espresso-5.4.0
>
> # DFLAGS  = precompilation options (possible arguments to -D and -U)
> #           used by the C compiler and preprocessor
> # FDFLAGS = as DFLAGS, for the f90 compiler
> #           See include/defs.h.README for a list of options and their meaning
> #           With the exception of IBM xlf, FDFLAGS = $(DFLAGS)
> #           For IBM xlf, FDFLAGS is the same as DFLAGS with separating commas
>
> # MANUAL_DFLAGS = additional precompilation option(s), if desired
> #                 BEWARE: it does not work for IBM xlf! Manually edit FDFLAGS
> MANUAL_DFLAGS =
> DFLAGS = -D__GFORTRAN -D__STD_F95 -D__DFTI -D__MPI -D__PARA -D__SCALAPACK
> FDFLAGS = $(DFLAGS) $(MANUAL_DFLAGS)
>
> # IFLAGS = how to locate directories with *.h or *.f90 file to be included
> #          typically -I../include -I/some/other/directory/
> #          the latter contains e.g. files needed by FFT libraries
>
> IFLAGS = -I../include -I/opt/intel/compilers_and_libraries_2016.3.210/linux/mkl/include
>
> # MOD_FLAGS = flag used by f90 compiler to locate modules
> # Each Makefile defines the list of needed modules in MODFLAGS
>
> MOD_FLAG = -I
>
> # Compilers: fortran-90, fortran-77, C
> # If a parallel compilation is desired, MPIF90 should be a fortran-90
> # compiler that produces executables for parallel execution using MPI
> # (such as for instance mpif90, mpf90, mpxlf90,...);
> # otherwise, an ordinary fortran-90 compiler (f90, g95, xlf90, ifort,...)
> # If you have a parallel machine but no suitable candidate for MPIF90,
> # try to specify the directory containing "mpif.h" in IFLAGS
> # and to specify the location of MPI libraries in MPI_LIBS
>
> MPIF90 = mpif90
> #F90 = gfortran
> CC = cc
> F77 = gfortran
>
> # C preprocessor and preprocessing flags - for explicit preprocessing,
> # if needed (see the compilation rules above)
> # preprocessing flags must include DFLAGS and IFLAGS
>
> CPP = cpp
> CPPFLAGS = -P -C -traditional $(DFLAGS) $(IFLAGS)
>
> # compiler flags: C, F90, F77
> # C flags must include DFLAGS and IFLAGS
> # F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate syntax
>
> CFLAGS = -O3 $(DFLAGS) $(IFLAGS)
> F90FLAGS = $(FFLAGS) -x f95-cpp-input $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
> FFLAGS = -O3 -g
>
> # compiler flags without optimization for fortran-77
> # the latter is NEEDED to properly compile dlamch.f, used by lapack
>
> FFLAGS_NOOPT = -O0 -g
>
> # compiler flag needed by some compilers when the main program is not fortran
> # Currently used for Yambo
>
> FFLAGS_NOMAIN =
>
> # Linker, linker-specific flags (if any)
> # Typically LD coincides with F90 or MPIF90, LD_LIBS is empty
>
> LD = mpif90
> LDFLAGS = -g -pthread
> LD_LIBS =
>
> # External Libraries (if any) : blas, lapack, fft, MPI
>
> # If you have nothing better, use the local copy :
> # BLAS_LIBS = /your/path/to/espresso/BLAS/blas.a
> # BLAS_LIBS_SWITCH = internal
>
> BLAS_LIBS = -lmkl_gf_lp64 -lmkl_sequential -lmkl_core
> BLAS_LIBS_SWITCH = external
>
> # If you have nothing better, use the local copy :
> # LAPACK_LIBS = /your/path/to/espresso/lapack-3.2/lapack.a
> # LAPACK_LIBS_SWITCH = internal
> # For IBM machines with essl (-D__ESSL): load essl BEFORE lapack !
> # remember that LAPACK_LIBS precedes BLAS_LIBS in loading order
>
> LAPACK_LIBS =
> LAPACK_LIBS_SWITCH = external
>
> ELPA_LIBS_SWITCH = disabled
> SCALAPACK_LIBS = -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64
>
> # nothing needed here if the internal copy of FFTW is compiled
> # (needs -D__FFTW in DFLAGS)
>
> FFT_LIBS =
>
> # For parallel execution, the correct path to MPI libraries must
> # be specified in MPI_LIBS (except for IBM if you use mpxlf)
>
> MPI_LIBS =
>
> # IBM-specific: MASS libraries, if available and if -D__MASS is defined in FDFLAGS
>
> MASS_LIBS =
>
> # ar command and flags - for most architectures: AR = ar, ARFLAGS = ruv
>
> AR = ar
> ARFLAGS = ruv
>
> # ranlib command. If ranlib is not needed (it isn't in most cases) use
> # RANLIB = echo
>
> RANLIB = ranlib
>
> # all internal and external libraries - do not modify
>
> FLIB_TARGETS = all
>
> LIBOBJS = ../clib/clib.a ../iotk/src/libiotk.a
> LIBS = $(SCALAPACK_LIBS) $(LAPACK_LIBS) $(FFT_LIBS) $(BLAS_LIBS) $(MPI_LIBS) $(MASS_LIBS) $(LD_LIBS)
>
> # wget or curl - useful to download from network
> WGET = wget -O
>
> # Install directory - not currently used
> PREFIX = /usr/local
>
> Cheers!
>
> Chong Wang
> ------------------------------
> *From:* pw_forum-boun...@pwscf.org <pw_forum-boun...@pwscf.org> on behalf of Paolo Giannozzi <p.gianno...@gmail.com>
> *Sent:* Sunday, May 15, 2016 8:28:26 PM
> *To:* PWSCF Forum
> *Subject:* Re: [Pw_forum] mpi error using pw.x
>
> It looks like a compiler/MPI bug, since there is nothing special in your
> input and in your execution, unless you find evidence that the problem is
> reproducible on other compiler/MPI versions.
>
> Paolo
>
> On Sun, May 15, 2016 at 10:11 AM, Chong Wang <ch-w...@outlook.com> wrote:
>
>> Hi,
>>
>> Thank you for replying.
>>
>> More details:
>>
>> 1. input data:
>>
>> &control
>> calculation='scf'
>> restart_mode='from_scratch',
>> pseudo_dir = '../pot/',
>> outdir='./out/'
>> prefix='BaTiO3'
>> /
>> &system
>> nbnd = 48
>> ibrav = 0, nat = 5, ntyp = 3
>> ecutwfc = 50
>> occupations='smearing', smearing='gaussian', degauss=0.02
>> /
>> &electrons
>> conv_thr = 1.0e-8
>> /
>> ATOMIC_SPECIES
>> Ba 137.327 Ba.pbe-mt_fhi.UPF
>> Ti 204.380 Ti.pbe-mt_fhi.UPF
>> O 15.999 O.pbe-mt_fhi.UPF
>> ATOMIC_POSITIONS
>> Ba 0.0000000000000000 0.0000000000000000 0.0000000000000000
>> Ti 0.5000000000000000 0.5000000000000000 0.4819999933242795
>> O 0.5000000000000000 0.5000000000000000 0.0160000007599592
>> O 0.5000000000000000 -0.0000000000000000 0.5149999856948849
>> O 0.0000000000000000 0.5000000000000000 0.5149999856948849
>> K_POINTS (automatic)
>> 11 11 11 0 0 0
>> CELL_PARAMETERS {angstrom}
>> 3.999800000000001 0.000000000000000 0.000000000000000
>> 0.000000000000000 3.999800000000001 0.000000000000000
>> 0.000000000000000 0.000000000000000 4.018000000000000
>>
>> 2. number of processors:
>> I tested 24 cores and 8 cores, and both yield the same result.
>>
>> 3. type of parallelization:
>> I am not sure what you mean. I execute pw.x by:
>> mpirun -np 24 pw.x < BTO.scf.in >> output
>>
>> 'which mpirun' output:
>> /opt/intel/compilers_and_libraries_2016.3.210/linux/mpi/intel64/bin/mpirun
>>
>> 4. when the error occurs:
>> in the middle of the run. The last few lines of the output are:
>>
>> total cpu time spent up to now is 32.9 secs
>>
>> total energy = -105.97885119 Ry
>> Harris-Foulkes estimate = -105.99394457 Ry
>> estimated scf accuracy < 0.03479229 Ry
>>
>> iteration # 7 ecut= 50.00 Ry beta=0.70
>> Davidson diagonalization with overlap
>> ethr = 1.45E-04, avg # of iterations = 2.7
>>
>> total cpu time spent up to now is 37.3 secs
>>
>> total energy = -105.99039982 Ry
>> Harris-Foulkes estimate = -105.99025175 Ry
>> estimated scf accuracy < 0.00927902 Ry
>>
>> iteration # 8 ecut= 50.00 Ry beta=0.70
>> Davidson diagonalization with overlap
>>
>> 5. Error message:
>> Something like:
>> Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
>> PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffc03ae5f38, comm_new=0x7ffc03ae5e90) failed
>> PMPI_Cart_sub(178)...................:
>> MPIR_Comm_split_impl(270)............:
>> MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
>> Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
>> PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffd10080408, comm_new=0x7ffd10080360) failed
>> PMPI_Cart_sub(178)...................:
>>
>> Cheers!
>>
>> Chong
>> ------------------------------
>> *From:* pw_forum-boun...@pwscf.org <pw_forum-boun...@pwscf.org> on behalf of Paolo Giannozzi <p.gianno...@gmail.com>
>> *Sent:* Sunday, May 15, 2016 3:43 PM
>> *To:* PWSCF Forum
>> *Subject:* Re: [Pw_forum] mpi error using pw.x
>>
>> Please tell us what is wrong and we will fix it.
>>
>> Seriously: nobody can answer your question unless you specify, as a
>> strict minimum, input data, number of processors and type of
>> parallelization that trigger the error, and where the error occurs (at
>> startup, later, in the middle of the run, ...).
>>
>> Paolo
>>
>> On Sun, May 15, 2016 at 7:50 AM, Chong Wang <ch-w...@outlook.com> wrote:
>>
>>> I compiled Quantum ESPRESSO 5.4 with Intel MPI and MKL 2016 update 3.
>>>
>>> However, when I ran pw.x the following errors were reported:
>>>
>>> ...
>>> MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
>>> Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
>>> PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffde1391dd8, comm_new=0x7ffde1391d30) failed
>>> PMPI_Cart_sub(178)...................:
>>> MPIR_Comm_split_impl(270)............:
>>> MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
>>> Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
>>> PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffc02ad7eb8, comm_new=0x7ffc02ad7e10) failed
>>> PMPI_Cart_sub(178)...................:
>>> MPIR_Comm_split_impl(270)............:
>>> MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
>>> Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
>>> PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7fffb24e60f8, comm_new=0x7fffb24e6050) failed
>>> PMPI_Cart_sub(178)...................:
>>> MPIR_Comm_split_impl(270)............:
>>> MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
>>>
>>> I googled and found that this might be caused by hitting the OS limit on the number of open files. However, after I increased the number of open files per process from 1024 to 40960, the error persists.
>>>
>>> What's wrong here?
>>>
>>> Chong Wang
>>> Ph.D. candidate
>>> Institute for Advanced Study, Tsinghua University, Beijing, 100084
>>>
>>> _______________________________________________
>>> Pw_forum mailing list
>>> Pw_forum@pwscf.org
>>> http://pwscf.org/mailman/listinfo/pw_forum
>>
>> --
>> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>> Phone +39-0432-558216, fax +39-0432-558222
>>
>> _______________________________________________
>> Pw_forum mailing list
>> Pw_forum@pwscf.org
>> http://pwscf.org/mailman/listinfo/pw_forum
>
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum@pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
Attachment: prova2.24p (binary data)
Attachment: prova2.18k (binary data)
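For reference, the attached outputs presumably come from command lines along these lines (the input file name is illustrative, taken from the quoted input; -nk sets the number of k-point pools, and the input can equally be fed via stdin redirection as in the quoted command):

    # default parallelization (over plane waves / G-vectors) on 24 MPI tasks
    mpirun -np 24 pw.x -inp BTO.scf.in > prova2.24p

    # k-point (pool) parallelization: 18 MPI tasks split into 18 pools
    mpirun -np 18 pw.x -nk 18 -inp BTO.scf.in > prova2.18k

With an 11 11 11 Monkhorst-Pack grid there are far more irreducible k-points than pools, so each pool stays busy, which is presumably why the 18-task pool run beats the 24-task plane-wave run here.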
_______________________________________________
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum