On Tue, 2013-06-11 at 19:57 +0000, vijaya subramanian wrote:
> Hi Paolo

you know, there are 1605 subscribed users on the pw_forum mailing list.
Even if some of them are actually disabled, it is a lot of people. Why
do you address this to me?

Your unit cells are quite large, your cutoff is not small, and you use
spin-orbit, a feature that increases the memory footprint and is less
optimized than "plain-vanilla" calculations. In order to run such large
jobs, one needs to know quite a bit about the inner workings of
parallelization: which arrays are distributed, which are not ...

The following arrays, for instance:

> Each <psi_i|beta_j> matrix 350.63 Mb ( 5440, 2, 2112)

are not distributed. This is the kind of array that causes bottlenecks.
If you have N MPI processes per node, you have N such arrays filling
the same physical memory. Reducing the number of MPI processes per node
and using OpenMP instead might be a good strategy.

P.
-- 
Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
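
The memory argument in the message above can be checked with a few lines
of arithmetic. The sketch below is not part of Quantum ESPRESSO. It
assumes that each element of the <psi_i|beta_j> matrix is a
double-precision complex number (16 bytes), which reproduces the quoted
350.63 Mb for dimensions (5440, 2, 2112). The function names and the
process counts per node (16 and 4) are hypothetical, chosen only for
illustration.

```python
# Rough per-node memory estimate for a non-distributed array.
# Assumption: each element is a double-precision complex number (16 bytes).
BYTES_PER_COMPLEX_DOUBLE = 16


def array_megabytes(dims):
    """Size in MB (1 MB = 1024**2 bytes) of one complex array with shape dims."""
    n_elements = 1
    for d in dims:
        n_elements *= d
    return n_elements * BYTES_PER_COMPLEX_DOUBLE / 1024**2


def per_node_megabytes(dims, mpi_per_node):
    """A non-distributed array is replicated once per MPI process on the node."""
    return array_megabytes(dims) * mpi_per_node


if __name__ == "__main__":
    dims = (5440, 2, 2112)  # dimensions quoted in the message
    print(f"one copy: {array_megabytes(dims):.1f} MB")
    # Hypothetical process counts per node, for illustration only.
    for n in (16, 4):
        total = per_node_megabytes(dims, n)
        print(f"{n:2d} MPI processes per node: {total:.1f} MB")
```

With these assumptions, going from 16 to 4 MPI processes per node cuts
the memory taken by this one array on each node by a factor of four; the
remaining cores could then be used through OpenMP threads.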