Dear Expert Users and Developers of QE,

Could you please have a look at this thread?
regards
Bhamu

On Wed, Jul 31, 2019 at 6:08 PM Dr. K. C. Bhamu <kcbham...@gmail.com> wrote:
> Dear QE users and developers,
>
> Greetings!!
>
> I am looking for help with effective parallelization of a gamma-centered
> calculation using QE 6.4.1 with Intel MKL 2015 and either the external or
> the internal FFTW3, on a cluster with 32 processors per node.
>
> The first case is a binary system with 128 atoms (1664.00 electrons); the
> second case has 250 atoms (3250.00 electrons).
> The SCF job for the 128-atom case runs well on 32 processors, but for the
> 250-atom case (other parameters unchanged) we get the error appended at
> the bottom of this email after the first iteration.
> If we use two nodes for the second case, the CPU time is far too long
> (about five times that of the first case).
> Could someone please help me run these jobs with effective parallelization
> for gamma-point calculations on 1/2/3/4... nodes (32 procs per node)?
>
> Other information that may help diagnose the problem:
>
>      Parallel version (MPI), running on    32 processors
>
>      MPI processes distributed on     1 nodes
>      R & G space division:  proc/nbgrp/npool/nimage =      32
>      Waiting for input...
>      Reading input from standard input
>
>      Current dimensions of program PWSCF are:
>      Max number of different atomic species (ntypx) = 10
>      Max number of k-points (npk) =  40000
>      Max angular momentum in pseudopotentials (lmaxx) =  3
>
>      gamma-point specific algorithms are used
>
>      Subspace diagonalization in iterative solution of the eigenvalue problem:
>        one sub-group per band group will be used
>        scalapack distributed-memory algorithm (size of sub-group:  4*  4 procs)
>
>      Parallelization info
>      --------------------
>      sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
>      Min         936     936    233            107112   107112   13388
>      Max         937     937    236            107120   107120   13396
>      Sum       29953   29953   7495           3427749  3427749  428575
>
>      total cpu time spent up to now is      143.9 secs
>
> and
>
>      number of k points=     1
>                        cart. coord. in units 2pi/alat
>         k(    1) = (   0.0000000   0.0000000   0.0000000), wk =   2.0000000
>
>      Dense  grid:  1713875 G-vectors     FFT dimensions: ( 216, 225, 216)
>
>      Estimated max dynamical RAM per process >       1.01 GB
>
>      Estimated total dynamical RAM >      64.62 GB
>
>      Initial potential from superposition of free atoms
>
>      starting charge 3249.86289, renormalised to 3250.00000
>      Starting wfcs are 2125 randomized atomic wfcs
>
> ========== Below is the error for the case with 250 atoms run over 32 procs ==========
>
>      Self-consistent Calculation
>
>      iteration #  1     ecut=    80.00 Ry     beta= 0.70
>      Davidson diagonalization with overlap
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 154663 RUNNING AT node:1
> =   EXIT CODE: 9
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
>
> On the other cluster (68 procs per node) I do not observe any error.
>
> Please let me know if I need to provide some additional information.
>
> Looking forward to hearing from the experts.
>
> Regards
>
> K.C. Bhamu, Ph.D.
> Postdoctoral Fellow
> CSIR-NCL, Pune
> India
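[Annotation] An exit with "Killed (signal 9)" right after the first Davidson step, on a run whose output estimates 64.62 GB of total dynamical RAM, is consistent with the kernel's OOM killer terminating a rank. A minimal back-of-the-envelope check of whether the job can fit on one node — assuming 64 GB of RAM per node, which is a guess (the email does not state the node memory; check with `free -g`):

```python
import math

# Figures quoted from the pw.x output above
total_ram_gb = 64.62   # "Estimated total dynamical RAM >  64.62 GB" (a lower bound)
per_proc_gb = 1.01     # "Estimated max dynamical RAM per process >  1.01 GB"

# Assumption (NOT in the original email): 64 GB of RAM per node
node_ram_gb = 64.0

# With all 32 MPI ranks on one node, the whole estimate must fit in that node.
nodes_needed = math.ceil(total_ram_gb / node_ram_gb)
print(nodes_needed)  # -> 2: the estimate alone already exceeds a single 64 GB node
```

Since pw.x prints the estimate with a ">", actual usage is higher still, so failing on one node while succeeding on the larger 68-proc cluster would be expected under this assumption.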
_______________________________________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu/quantum-espresso)
users mailing list
users@lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users
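[Annotation] For a gamma-only calculation there is a single k-point, so pool parallelization (pw.x option -nk) cannot distribute work; the levels left to tune are the R&G (plane-wave) distribution, band groups (-nb), task groups (-nt), and the ScaLAPACK diagonalization group (-nd). A hypothetical job-script fragment for a two-node run — the rank count follows from 2 nodes x 32 procs as described in the email, but the specific -nb/-nt/-nd values are illustrative starting points to benchmark, not a verified recipe:

```shell
#!/bin/bash
# Sketch: 250-atom gamma-only case on 2 nodes (2 x 32 = 64 MPI ranks),
# spreading the ~64.62 GB estimated RAM over two nodes instead of one.
# -nb/-nt/-nd below are guesses to tune, not known-good values.
mpirun -np 64 pw.x -nb 4 -nt 2 -nd 16 -i scf.in > scf.out
```

Comparing the "total cpu time" lines across a few (-nb, -nt, -nd) combinations on the same input is the usual way to find the efficient setting; -nd must name a square number of processes (here 16 = 4x4, matching the 4*4 sub-group in the output above).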