Dear all,

I want to analyze the strong scaling of our in-house FEM code. The test problem has about 20M DoFs. I ran the problem using various settings. The speedups for the assembly and solving procedures are as follows:

NProcessors  NNodes  CoresPerNode   Assembly    Solving
          1       1             1   1.0         1.0
          2       1             2   1.995246    1.898756
          2       2             1   2.121401    2.436149
          4       1             4   4.658187    6.004539
          4       2             2   4.666667    5.942085
          4       4             1   4.65272     6.101214
          8       2             4   9.380985   16.581135
          8       4             2   9.308575   17.258891
          8       8             1   9.314449   17.380612
         16       2             8  18.575953   34.483058
         16       4             4  18.745129   34.854409
         16       8             2  18.828393   36.45509
         32       4             8  37.140626   70.175879
         32       8             4  37.166421   71.533865
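To put the numbers in perspective, the parallel efficiency (speedup divided by the number of processors) can be computed from the solving speedups listed above. The following is only a minimal sketch in Python; the dictionary layout and the helper name `parallel_efficiency` are illustrative choices and not part of the FEM code.

```python
# Solving speedups copied from the measurements above.
# Key: (NProcessors, NNodes, CoresPerNode), value: solving speedup.
solving_speedup = {
    (1, 1, 1): 1.0,
    (2, 1, 2): 1.898756,
    (2, 2, 1): 2.436149,
    (4, 1, 4): 6.004539,
    (4, 2, 2): 5.942085,
    (4, 4, 1): 6.101214,
    (8, 2, 4): 16.581135,
    (8, 4, 2): 17.258891,
    (8, 8, 1): 17.380612,
    (16, 2, 8): 34.483058,
    (16, 4, 4): 34.854409,
    (16, 8, 2): 36.45509,
    (32, 4, 8): 70.175879,
    (32, 8, 4): 71.533865,
}


def parallel_efficiency(speedup, n_processors):
    """Hypothetical helper: efficiency = speedup / number of processors."""
    return speedup / n_processors


# Print one row per configuration and flag efficiencies above 1 (superlinear).
for (nproc, nnodes, cores), speedup in solving_speedup.items():
    eff = parallel_efficiency(speedup, nproc)
    flag = "superlinear" if eff > 1.0 else ""
    print(f"{nproc:>3} {nnodes:>3} {cores:>3} {speedup:>10.3f} {eff:>6.2f} {flag}")
```

An efficiency above 1 means the speedup is superlinear. For both 32-processor runs the solving efficiency comes out above 2.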
I don't quite understand this result. Why can we achieve a solving speedup of more than 70 using only 32 processors? Could you please help me understand this?

Thank you in advance.

Best,
Ce