Hi,

I have access to two clusters as a low-level user. Cluster A consists of nodes with 8 cores and 8 GB of memory each. Cluster B has 24 GB of memory per node, and each node has 14 or more cores. The CPUs on cluster A are Xeon E5620 @ 2.40GHz, while those on cluster B are Xeon X5550 @ 2.67GHz. From the specifications (2.40GHz + 12288 KB cache vs. 2.67GHz + 8192 KB cache), the two machines should be very close in performance, but that does not seem to be the case.
I have a job with 72 atoms per unit cell. I initialized the job on cluster A and ran it for a few iterations; each iteration took 2 hours. Then I moved the job to cluster B (14 cores per node at 2.67GHz), where one iteration now takes more than 8 hours. On both clusters I request one core per node and 8 nodes per job (8 is the number of k-points). I compiled WIEN2k_13 on cluster A without MPI. On cluster B, WIEN2k_12 was compiled by the administrator with MPI. What could have caused the poor performance on cluster B? Is it because of MPI?

On an unrelated note, the job sometimes runs out of memory on cluster B, which has 24 GB of memory per node, yet the same job runs smoothly on cluster A, which has only 8 GB per node.

Thanks.