I am having more issues with MPI and the Block Matrix. This time, it is the line:
system_matrix.compress(VectorOperation::add); I get a PETSc error when I do this. The error occurs within petsc_matrix_base.cc:275 // flush buffers PetscErrorCode ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); AssertThrow(ierr == 0, ExcPETScError(ierr)); I've been going through this with GDB and found that one processor will have no problem with line, while all of the others create an error that looks like this: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Inserting a new nonzero at global row/column (36, 48) into matrix [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.12.4, Feb, 04, 2020 I wanted to see if there was anything special about the row/column (36,48), so I've overlapped an image of the sparsity patterns for each CPU. The red image came from: if (this_mpi_process == 0) { sparsity_pattern.copy_from(dsp); sparsity_pattern.print_svg(out4); } while the blue image came from this_mpi_process == 1. I've marked the pixel with this matrix-entry in black in an image. It's a little hard to see, but it's just first entry into the blue, on the border with the purple overlapped area. Therefore, it is not included in the sparsity pattern for cpu0. I expect to get an error if cpu0 tries to do anything with this entry, but I don't understand why the system_matrix.compress(VectorOperation::add) is triggering this? I wasn't able to find an answer in the glossary entry https://www.dealii.org/current/doxygen/deal.II/DEALGlossary.html#GlossCompress . So, it seems that I'm doing something wrong with either my sparsity pattern or system matrix or maybe the constraints such that I can't properly do the system_matrix.compress() operation. I had this working in serial, but this problem arose when I switched to MPI. My setup and assembly looks like this: *void ElasticProblem::setup_system()* { // --- Create p:f:t's // Solid GridTools::partition_triangulation_zorder( Utilities::MPI::n_mpi_processes(mpi_communicator), triangulation_sld); // maybe divide n_mpi by 2 GridTools::partition_multigrid_levels(triangulation_sld); auto construction_data_sld = TriangulationDescription::Utilities::create_description_from_triangulation( triangulation_sld, mpi_communicator, TriangulationDescription::Settings::construct_multigrid_hierarchy); tria_pft_sld.create_triangulation(construction_data_sld); // Shell GridTools::partition_triangulation_zorder(Utilities::MPI::n_mpi_processes(mpi_communicator), triangulation_sh); GridTools::partition_multigrid_levels(triangulation_sh); auto construction_data_sh = TriangulationDescription::Utilities::create_description_from_triangulation( triangulation_sh, mpi_communicator, TriangulationDescription::Settings::construct_multigrid_hierarchy); tria_pft_sh.create_triangulation(construction_data_sh); // Solid dof_handler_sld.distribute_dofs(fe_sld); sld_owned_dofs = dof_handler_sld.locally_owned_dofs(); sld_relevant_dofs = DoFTools::extract_locally_relevant_dofs(dof_handler_sld); // Shell dof_handler_sh.distribute_dofs(fe_sh); sh_owned_dofs = dof_handler_sh.locally_owned_dofs(); sh_relevant_dofs = DoFTools::extract_locally_relevant_dofs(dof_handler_sh); const unsigned int n_dof_sld = dof_handler_sld.n_dofs(); const unsigned int n_dof_sh = dof_handler_sh.n_dofs(); std::string solidbcfile = "input/solid.csv"; std::string shellbcfile = "input/shell.csv"; constraints_sld.clear(); nodal_bcs<3,3>(solidbcfile, dof_handler_sld ,constraints_sld); constraints_sld.close(); constraints_sh.clear(); nodal_bcs<2,3>(shellbcfile, dof_handler_sh ,constraints_sh); constraints_sh.close(); constraints.copy_from(constraints_sld); constraints_sh.shift(n_dof_sld); constraints.merge(constraints_sh); const std::vector<unsigned int> dofs_per_block = {n_dof_sld, n_dof_sh}; const std::vector<unsigned int> locally_owned_sizes = {sld_owned_dofs.size(), sh_owned_dofs.size()}; const std::vector<IndexSet> v_locown = {sld_owned_dofs, sh_owned_dofs}; const std::vector<IndexSet> v_locrelv = {sld_relevant_dofs, sh_relevant_dofs}; BlockDynamicSparsityPattern dsp(v_locrelv); dsp.collect_sizes(); DoFTools::make_sparsity_pattern(dof_handler_sld, dsp.block(0,0)); // A DoFTools::make_sparsity_pattern(dof_handler_sh, dsp.block(1,1)); // D SparsityTools::distribute_sparsity_pattern(dsp, v_locown, mpi_communicator, v_locrelv); system_matrix.reinit(v_locown, dsp, mpi_communicator); system_rhs.reinit(v_locown, mpi_communicator); solution.reinit(dofs_per_block); system_matrix.collect_sizes(); } *void ElasticProblem::assemble_system()* { system_rhs = 0; system_matrix = 0; ... Create FE values objects ... for (const auto &cell : dof_handler_sld.active_cell_iterators()) if (cell->is_locally_owned()) { cell_matrix_sld = 0; cell_rhs_sld = 0; for (const unsigned int i : fe_values_sld.dof_indices()) for (const unsigned int j : fe_values_sld.dof_indices()) for (const unsigned int q_point : fe_values_sld.quadrature_point_indices()) { cell_matrix_sld(i,j) = ...long equation... } cell->get_dof_indices(local_dof_indices_sld); constraints.distribute_local_to_global( cell_matrix_sld, cell_rhs_sld, local_dof_indices_sld, system_matrix.block(0,0), system_rhs.block(0)); } // end of cell loop MPI_Barrier(mpi_communicator); system_matrix.compress(VectorOperation::add); *// <---- This is where my issue is* system_rhs.compress(VectorOperation::add); // Onto Shell element assembly ... ... ... } The same error occurs if I skip the system_matrix.compress() at the end of the solid elements, and instead wait until the end of the shell element assembly. I've looked at Step-55 and Step-70, which use the MPI::BlockSparseMatrix, but I don't see what I'm missing. These examples use an index set locally_relevant_dofs_per_processor, but I am using a vector of index sets, since I have two DoF handlers. Can anyone see where I may have gone wrong or what I have omitted? Many thanks, Alex On Wednesday, January 24, 2024 at 12:00:23 PM UTC-5 Wolfgang Bangerth wrote: > > On 1/24/24 09:07, Alex Quinlan wrote: > > > > I seem to have it working now, though I've hit some other unrelated > > snags that I need to resolve. Once I get those fixed and confirm that > > my program is running correctly, I would be willing to work on a patch. > > Excellent, thank you in advance! > Best > W. > -- The deal.II project is located at http://www.dealii.org/ For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en --- You received this message because you are subscribed to the Google Groups "deal.II User Group" group. To unsubscribe from this group and stop receiving emails from it, send an email to dealii+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/c5b355ca-72d5-4f79-9e24-011d4d28e0e1n%40googlegroups.com.