Hi libMesh developers,
When I run my code in parallel, the performance does not always improve as
the number of CPUs increases. I looked for the bottleneck using the libMesh
performance table, and the problem appears to come from
MeshCommunication::parallel_sort().
From 2 to 14 CPUs, MeshCommunication accounts for only a small percentage of
the total time (see the table below). Beyond 14 CPUs, however, the cost of
MeshCommunication grows quickly, especially parallel_sort(). I don't know
whether this is normal. Could you give me some advice? Thanks a lot.
Although each node of the cluster has 8 CPUs, I used "-machinefile" in MPI
so that each CPU I add comes from a different node than the previous one.
CPUs   MeshCommunication
   1      0
   2      2.6
   3      2.6
   4      2.7
   5      2.7
   6      2.7
   8      2.9
  10      2.6
  12      3.4
  14      4.2
  16      7.2
  18      7.2
  20      8.7
  22      9.5
  24     13.8
  26     15.3
  28     19.4
  30     24.2
  32     34.6
  34     32.4
  36     36.6
  38     39.1
  40     46.9
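To make the turning point easier to see, here is a small Python sketch (not part of libMesh) that takes the numbers from the table above and reports the first CPU count at which the MeshCommunication figure exceeds twice the baseline. The baseline (median of the 2 to 14 CPU runs) and the factor of 2 are my own assumptions, chosen only for illustration.

```python
from statistics import median

# (CPU count, MeshCommunication figure) pairs copied from the table above.
data = [
    (1, 0.0), (2, 2.6), (3, 2.6), (4, 2.7), (5, 2.7), (6, 2.7),
    (8, 2.9), (10, 2.6), (12, 3.4), (14, 4.2), (16, 7.2), (18, 7.2),
    (20, 8.7), (22, 9.5), (24, 13.8), (26, 15.3), (28, 19.4),
    (30, 24.2), (32, 34.6), (34, 32.4), (36, 36.6), (38, 39.1),
    (40, 46.9),
]

# Assumption: the 2 to 14 CPU runs are the "good" range described above.
baseline = median(value for cpus, value in data if 2 <= cpus <= 14)

# Assumption: a factor of 2 over the baseline counts as "clearly worse".
FACTOR = 2.0

# First parallel run whose figure exceeds FACTOR times the baseline.
knee = next(cpus for cpus, value in data
            if cpus > 1 and value > FACTOR * baseline)

print(f"baseline = {baseline}, first CPU count above {FACTOR}x baseline: {knee}")
```

With these numbers the baseline is 2.7 and the first run above twice that value is the one with 16 CPUs, which matches what I see: things get worse once I go past 14 CPUs.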
Regards,
Yujie
_______________________________________________
Libmesh-users mailing list
https://lists.sourceforge.net/lists/listinfo/libmesh-users