I also ran the same code (the same version of libMesh, the same application code, the same number of CPUs, and the same version of PETSc) on another cluster. That environment is 32-bit Intel, Red Hat Enterprise, GCC 3.2, and MPICH 1.2.7p1. The timing table for this run follows. The total active time is about 88 seconds, compared with about 3895 seconds in the previous case, and "find_global_indices()" takes very little time here. I don't know where the problem is. Could you give me some help? Thanks a lot.
Regards,
Yujie

---------------------------------------------------------------------------------------------------------
libMesh Performance: Alive time=94.0735, Active time=88.2455
---------------------------------------------------------------------------------------------------------
Event                           nCalls   Total Time  Avg Time    Total Time  Avg Time    % of Active Time
                                         w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S
---------------------------------------------------------------------------------------------------------

DofMap
  add_neighbors_to_send_list()  3        0.2863      0.095427    0.3427      0.114217    0.32     0.39
  build_constraint_matrix()     33576    0.3030      0.000009    0.3030      0.000009    0.34     0.34
  cnstrn_elem_mat_vec()         33576    0.2127      0.000006    0.2127      0.000006    0.24     0.24
  compute_sparsity()            3        2.8864      0.962134    3.7640      1.254653    3.27     4.27
  create_dof_constraints()      3        0.4083      0.136117    0.8024      0.267480    0.46     0.91
  distribute_dofs()             3        0.6631      0.221018    1.4217      0.473890    0.75     1.61
  dof_indices()                 426112   3.7301      0.000009    3.7301      0.000009    4.23     4.23
  enforce_constraints_exactly() 2        0.0124      0.006191    0.0124      0.006191    0.01     0.01
  old_dof_indices()             67152    0.5640      0.000008    0.5640      0.000008    0.64     0.64
  prepare_send_list()           3        0.0102      0.003389    0.0102      0.003389    0.01     0.01
  reinit()                      3        0.7196      0.239869    0.7196      0.239869    0.82     0.82

FE
  compute_affine_map()          162423   6.5415      0.000040    6.5415      0.000040    7.41     7.41
  compute_face_map()            64735    3.1745      0.000049    3.1745      0.000049    3.60     3.60
  compute_shape_functions()     162423   7.4944      0.000046    7.4944      0.000046    8.49     8.49
  init_face_shape_functions()   54151    1.2186      0.000023    1.2186      0.000023    1.38     1.38
  init_shape_functions()        115651   6.6277      0.000057    6.6277      0.000057    7.51     7.51
  inverse_map()                 521467   7.4752      0.000014    7.4752      0.000014    8.47     8.47

GMVIO
  write_nodal_data()            1        0.5900      0.590033    0.5900      0.590033    0.67     0.67

JumpErrorEstimator
  estimate_error()              2        7.4065      3.703241    30.4493     15.224661   8.39     34.51

LocationMap
  find()                        50456    0.2714      0.000005    0.2714      0.000005    0.31     0.31
  init()                        4        0.0948      0.023701    0.0948      0.023701    0.11     0.11

Mesh
  contract()                    2        0.1155      0.057756    0.1519      0.075933    0.13     0.17
  find_neighbors()              3        4.9775      1.659166    4.9798      1.659928    5.64     5.64
  read()                        1        3.4427      3.442706    3.4427      3.442706    3.90     3.90
  renumber_nodes_and_elem()     8        0.1274      0.015928    0.1274      0.015928    0.14     0.14

MeshCommunication
  broadcast_bcs()               1        0.0030      0.002954    0.0092      0.009157    0.00     0.01
  broadcast_mesh()              1        0.1153      0.115250    0.1212      0.121190    0.13     0.14
  compute_hilbert_indices()     4        1.1428      0.285697    1.1428      0.285697    1.30     1.30
  find_global_indices()         4        0.3963      0.099081    1.7243      0.431078    0.45     1.95
  parallel_sort()               4        0.1184      0.029595    0.1558      0.038955    0.13     0.18

MeshRefinement
  _coarsen_elements()           4        0.0447      0.011166    0.0481      0.012035    0.05     0.05
  _refine_elements()            4        0.5717      0.142932    1.3206      0.330145    0.65     1.50
  add_point()                   50456    0.3634      0.000007    0.6900      0.000014    0.41     0.78
  make_coarsening_compatible()  11       0.7757      0.070514    0.7757      0.070514    0.88     0.88
  make_refinement_compatible()  11       0.1117      0.010153    0.1164      0.010585    0.13     0.13

MetisPartitioner
  partition()                   3        1.9354      0.645147    3.3210      1.106993    2.19     3.76

Parallel
  allgather()                   16       0.0124      0.000772    0.0124      0.000772    0.01     0.01
  broadcast()                   13       0.0119      0.000912    0.0119      0.000912    0.01     0.01
  gather()                      3        0.0001      0.000044    0.0001      0.000044    0.00     0.00
  max()                         267      0.0468      0.000175    0.0468      0.000175    0.05     0.05
  min()                         467      0.5519      0.001182    0.5519      0.001182    0.63     0.63
  probe()                       26       0.0155      0.000595    0.0155      0.000595    0.02     0.02
  receive()                     26       0.0095      0.000365    0.0250      0.000962    0.01     0.03
  send()                        26       0.0052      0.000201    0.0052      0.000201    0.01     0.01
  send_receive()                34       0.0041      0.000121    0.0350      0.001030    0.00     0.04
  sum()                         20       0.0889      0.004443    0.0889      0.004443    0.10     0.10
  wait()                        26       0.0005      0.000021    0.0005      0.000021    0.00     0.00

Partitioner
  set_node_processor_ids()      3        0.5467      0.182219    0.5731      0.191044    0.62     0.65
  set_parent_processor_ids()    3        0.0542      0.018076    0.0542      0.018076    0.06     0.06

PetscLinearSolver
  solve()                       3        8.0743      2.691427    8.0764      2.692117    9.15     9.15

ProjectVector
  operator()                    2        0.5417      0.270829    1.2560      0.627999    0.61     1.42

System
  assemble()                    3        13.0582     4.352722    26.0303     8.676766    14.80    29.50
  project_vector()              2        0.2916      0.145808    1.8486      0.924310    0.33     2.09
---------------------------------------------------------------------------------------------------------
Totals:                         1743206  88.2455                                         100.00
---------------------------------------------------------------------------------------------------------

On Tue, Jan 26, 2010 at 2:22 PM, Yujie <[email protected]> wrote:
> Dear Libmesh developers,
>
> In previous emails, I reported a problem that seemed to be related to data
> communication between nodes. However, when I run the code on the master
> node alone with 2 CPUs, so that there is no data communication between
> nodes, the problem is still there. The following is the timing table.
> You can see that "find_global_indices()" took a very long time.
>
> I am using AMD x86_64, Red Hat Enterprise, GCC 4.0, and MPICH 1.2.7p1.
> Could you give me some advice? Thanks a lot.
>
> ---------------------------------------------------------------------------------------------------------
> libMesh Performance: Alive time=3921.43, Active time=3895.64
> ---------------------------------------------------------------------------------------------------------
> Event                           nCalls   Total Time  Avg Time    Total Time  Avg Time    % of Active Time
>                                          w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S
> ---------------------------------------------------------------------------------------------------------
>
> DofMap
>   add_neighbors_to_send_list()  3        1.8758      0.625277    2.0094      0.669790    0.05     0.05
>   build_constraint_matrix()     36160    1.1394      0.000032    1.1394      0.000032    0.03     0.03
>   cnstrn_elem_mat_vec()         36160    1.0369      0.000029    1.0369      0.000029    0.03     0.03
>   compute_sparsity()            3        66.9598     22.319928   69.6673     23.222438   1.72     1.79
>   create_dof_constraints()      3        1.9375      0.645828    2.7731      0.924369    0.05     0.07
>   distribute_dofs()             3        5.9749      1.991641    11.1858     3.728586    0.15     0.29
>   dof_indices()                 451121   10.4046     0.000023    10.4046     0.000023    0.27     0.27
>   enforce_constraints_exactly() 2        0.0860      0.043013    0.0860      0.043013    0.00     0.00
>   old_dof_indices()             72320    1.6502      0.000023    1.6502      0.000023    0.04     0.04
>   prepare_send_list()           3        1.3888      0.462939    1.3888      0.462939    0.04     0.04
>   reinit()                      3        4.4227      1.474239    4.4227      1.474239    0.11     0.11
>
> FE
>   compute_affine_map()          166087   28.4779     0.000171    28.4779     0.000171    0.73     0.73
>   compute_face_map()            65290    12.9293     0.000198    12.9293     0.000198    0.33     0.33
>   compute_shape_functions()     166087   53.0771     0.000320    53.0771     0.000320    1.36     1.36
>   init_face_shape_functions()   54525    6.8743      0.000126    6.8743      0.000126    0.18     0.18
>   init_shape_functions()        116731   41.2603     0.000353    41.2603     0.000353    1.06     1.06
>   inverse_map()                 528671   15.4390     0.000029    15.4390     0.000029    0.40     0.40
>
> GMVIO
>   write_nodal_data()            1        2.0390      2.038986    2.0390      2.038986    0.05     0.05
>
> JumpErrorEstimator
>   estimate_error()              2        20.5754     10.287681   126.0162    63.008090   0.53     3.23
>
> LocationMap
>   find()                        69104    0.9536      0.000014    0.9536      0.000014    0.02     0.02
>   init()                        4        0.4922      0.123059    0.4922      0.123059    0.01     0.01
>
> Mesh
>   contract()                    2        0.6653      0.332647    1.1356      0.567810    0.02     0.03
>   find_neighbors()              3        30.7003     10.233423   30.8006     10.266880   0.79     0.79
>   read()                        1        5.7922      5.792197    5.7922      5.792197    0.15     0.15
>   renumber_nodes_and_elem()     8        1.9422      0.242779    1.9422      0.242779    0.05     0.05
>
> MeshCommunication
>   broadcast_bcs()               1        0.0604      0.060440    0.0743      0.074266    0.00     0.00
>   broadcast_mesh()              1        1.0069      1.006910    1.0271      1.027131    0.03     0.03
>   compute_hilbert_indices()     4        4.1264      1.031604    4.1264      1.031604    0.11     0.11
>   find_global_indices()         4        3172.3789   793.094713  3412.5255   853.131373  81.43    87.60
>   parallel_sort()               4        158.8466    39.711649   161.0895    40.272380   4.08     4.14
>
> MeshRefinement
>   _coarsen_elements()           4        0.4822      0.120546    0.4828      0.120710    0.01     0.01
>   _refine_elements()            4        2.7347      0.683675    5.5430      1.385758    0.07     0.14
>   add_point()                   69104    1.3920      0.000020    2.5913      0.000037    0.04     0.07
>   make_coarsening_compatible()  12       7.9254      0.660450    7.9254      0.660450    0.20     0.20
>   make_refinement_compatible()  12       1.3526      0.112718    1.3618      0.113480    0.03     0.03
>
> MetisPartitioner
>   partition()                   3        9.6725      3.224183    2854.3422   951.447412  0.25     73.27
>
> Parallel
>   allgather()                   16       0.4336      0.027100    0.4336      0.027100    0.01     0.01
>   broadcast()                   13       0.0327      0.002513    0.0327      0.002513    0.00     0.00
>   gather()                      3        0.0007      0.000229    0.0007      0.000229    0.00     0.00
>   max()                         275      0.5796      0.002108    0.5796      0.002108    0.01     0.01
>   min()                         482      38.8301     0.080560    38.8301     0.080560    1.00     1.00
>   probe()                       26       56.6142     2.177470    56.6142     2.177470    1.45     1.45
>   receive()                     26       0.0334      0.001284    56.6479     2.178767    0.00     1.45
>   send()                        26       18.5027     0.711642    18.5027     0.711642    0.47     0.47
>   send_receive()                34       0.0077      0.000225    75.1605     2.210604    0.00     1.93
>   sum()                         20       2.8493      0.142467    2.8493      0.142467    0.07     0.07
>   wait()                        26       0.0016      0.000061    0.0016      0.000061    0.00     0.00
>
> Partitioner
>   set_node_processor_ids()      3        3.7781      1.259356    4.5113      1.503763    0.10     0.12
>   set_parent_processor_ids()    3        0.5565      0.185501    0.5565      0.185501    0.01     0.01
>
> PetscLinearSolver
>   solve()                       3        27.1758     9.058608    27.1827     9.060906    0.70     0.70
>
> ProjectVector
>   operator()                    2        2.2219      1.110940    4.1955      2.097752    0.06     0.11
>
> System
>   assemble()                    3        57.2052     19.068412   125.2392    41.746413   1.47     3.21
>   project_vector()              2        8.7408      4.370384    13.9501     6.975064    0.22     0.36
> ---------------------------------------------------------------------------------------------------------
> Totals:                         1832413  3895.6373                                       100.00
> ---------------------------------------------------------------------------------------------------------
>
> Regards,
> Yujie
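To make the two runs easier to compare, here is a small Python sketch that computes the slowdown factor per event. The timings are the "Total Time w/o Sub" values copied by hand from the two tables in this thread; the selection of events, the dictionary name `TIMES`, and the helper `slowdown` are illustrative choices of mine and not part of libMesh.

```python
# Exclusive ("Total Time w/o Sub") timings in seconds, copied by hand from the
# two libMesh performance tables in this thread.
# Each entry is (fast run on the 32-bit Intel cluster, slow run on AMD x86_64).
# The choice of events is illustrative, not exhaustive.
TIMES = {
    "find_global_indices()": (0.3963, 3172.3789),
    "parallel_sort()": (0.1184, 158.8466),
    "probe()": (0.0155, 56.6142),
    "send()": (0.0052, 18.5027),
    "min()": (0.5519, 38.8301),
    "compute_sparsity()": (2.8864, 66.9598),
    "assemble()": (13.0582, 57.2052),
    "solve()": (8.0743, 27.1758),
}


def slowdown(times):
    """Return (event, slow/fast ratio) pairs, largest ratio first."""
    ratios = [(name, slow / fast) for name, (fast, slow) in times.items()]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in slowdown(TIMES):
        print(f"{name:<24s} {ratio:10.1f}x")
```

With these numbers, "find_global_indices()" ranks first and the communication calls ("probe()", "send()", "parallel_sort()") follow, while "assemble()" and "solve()" are only a few times slower. That pattern may point at the MPI layer on the AMD machine rather than at the compute kernels, but this is only a reading of the two tables, not a diagnosis.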
