Dear Pascal,

This looks related to an issue we recently worked around in https://github.com/dealii/dealii/pull/4043

Can you check what happens if you call GrowingVectorMemory<TrilinosWrappers::MPI::Vector>::release_unused_memory()

between your optimization steps? If a communicator gets stuck in those places, the likely cause is a stale object somewhere that we fail to work around for some reason.
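
To make the "stale object" idea concrete, here is a minimal toy model. It is not deal.II code: FakeComm, FakeVector, ToyVectorPool and stale_communicators are hypothetical stand-ins, and the only assumption carried over from the library is that a vector pool keeps returned vectors cached until release_unused_memory() is called. In the sketch, a vector handed back to the pool stays alive together with the communicator it owns unless the pool is flushed between steps.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a communicator handle; counts live instances.
struct FakeComm {
  static int alive;
  FakeComm() { ++alive; }
  ~FakeComm() { --alive; }
};
int FakeComm::alive = 0;

// Hypothetical stand-in for a distributed vector that owns a communicator.
struct FakeVector {
  FakeComm comm;
};

// Toy pool (an assumption, not the library's implementation):
// free() only marks a vector as unused, it does not destroy it.
class ToyVectorPool {
  struct Entry {
    bool in_use;
    std::unique_ptr<FakeVector> vec;
  };
  std::vector<Entry> entries_;

public:
  FakeVector* alloc() {
    for (auto& e : entries_)
      if (!e.in_use) {          // reuse a cached vector if one is free
        e.in_use = true;
        return e.vec.get();
      }
    entries_.push_back(Entry{true, std::make_unique<FakeVector>()});
    return entries_.back().vec.get();
  }

  void free(const FakeVector* v) {
    for (auto& e : entries_)
      if (e.vec.get() == v)
        e.in_use = false;       // stays cached, communicator stays alive
  }

  // Destroys every cached vector that is currently not in use.
  void release_unused_memory() {
    entries_.erase(std::remove_if(entries_.begin(), entries_.end(),
                                  [](const Entry& e) { return !e.in_use; }),
                   entries_.end());
  }
};

// Runs three "optimization steps" and reports how many communicators are
// still alive afterwards, while the pool itself still exists.
int stale_communicators(bool release_between_steps) {
  ToyVectorPool pool;
  for (int step = 0; step < 3; ++step) {
    FakeVector* tmp = pool.alloc();  // a solver grabs a temporary vector
    pool.free(tmp);                  // and hands it back to the pool
    if (release_between_steps)
      pool.release_unused_memory();
  }
  return FakeComm::alive;
}
```

In the real application, the call named above would go where pool.release_unused_memory() appears in the sketch, i.e. once after each optimization step.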

Best,
Martin


On 15.03.2017 14:10, Pascal Kraft wrote:
Dear Timo,

I have done some more digging and found the following. The problems seem to happen in trilinos_vector.cc between lines 240 and 270. What I see in the call stacks is that one process reaches line 261 ( ierr = vector->GlobalAssemble (last_action); ) and then waits inside this call at an MPI_Barrier with the following stack:
20 <symbol is not available> 7fffd4d18f56
19 opal_progress()  7fffdc56dfca
18 ompi_request_default_wait_all()  7fffddd54b15
17 ompi_coll_tuned_barrier_intra_recursivedoubling()  7fffcf9abb5d
16 PMPI_Barrier()  7fffddd68a9c
15 Epetra_MpiDistributor::DoPosts()  7fffe4088b4f
14 Epetra_MpiDistributor::Do()  7fffe4089773
13 Epetra_DistObject::DoTransfer()  7fffe400a96a
12 Epetra_DistObject::Export()  7fffe400b7b7
11 int Epetra_FEVector::GlobalAssemble<int>()  7fffe4023d7f
10 Epetra_FEVector::GlobalAssemble()  7fffe40228e3

The other (in my case three) processes are stuck in the head of the if/else-if statement leading up to this point, namely in the line
if(vector->Map().SameAs(v.vector->Map()) == false)
inside the call to SameAs(...), with stacks like
15 opal_progress()  7fffdc56dfbc
14 ompi_request_default_wait_all()  7fffddd54b15
13 ompi_coll_tuned_allreduce_intra_recursivedoubling()  7fffcf9a4913
12 PMPI_Allreduce()  7fffddd6587f
11 Epetra_MpiComm::MinAll()  7fffe408739e
10 Epetra_BlockMap::SameAs()  7fffe3fb9d74

Maybe this helps. Producing a smaller example will likely not be possible in the coming two weeks, but if there is no solution by then I can try.
Greetings,
Pascal
--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group. To unsubscribe from this group and stop receiving emails from it, send an email to dealii+unsubscr...@googlegroups.com <mailto:dealii+unsubscr...@googlegroups.com>.
For more options, visit https://groups.google.com/d/optout.
