Hi all,
Sorry if the subject is misleading... what I'm experiencing is not
necessarily a deadlock, but it is very, very vexing. At a point in my
code where I have refined the mesh and distributed the DoFs, I am
reinit()-ing a parallel (PETScWrappers::MPI) vector. There is no error
message; the program simply hangs (confirmed by attaching gdb). This is
a time-stepping model, and the problem does not occur at every time
step. Indeed, it used to occur pretty infrequently and seemed sporadic
rather than deterministic, which made me think it had to do with
conditions on the computing cluster. But now it happens regularly,
i.e. consistently with the same set of inputs at the same point in the
simulation.
I'm sorry I'm not providing more detail; responses to this email will
almost certainly ask for some, but I did not want to dump all of my
code, most of it probably irrelevant, into this email. At this point I
would just like to know under what circumstances the above could
happen - that is, the program "hanging" on a reinit() call on a
PETScWrappers::MPI vector, almost as if there were an infinite loop
hiding somewhere, without throwing any error at all. When someone
suggested a deadlock to me, I was skeptical, but I still put a number of
MPI_Barrier() calls leading up to the reinit() call. I also checked
that the sum of the locally owned DoF counts equals the global DoF count.
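For concreteness, the pattern around the hang looks roughly like this
(a minimal sketch, not my actual code - names like triangulation,
dof_handler, fe, solution, and mpi_communicator are placeholders, and
the reinit() signature is the old-style one I believe my deal.II
version uses):

```cpp
// Sketch only: assumes deal.II built with PETSc support; all
// identifiers here are placeholders, not my real variable names.
#include <deal.II/lac/petsc_parallel_vector.h>

void refine_and_reinit()
{
  triangulation.execute_coarsening_and_refinement();
  dof_handler.distribute_dofs(fe);

  // Sanity check I performed: summing the per-rank local DoF counts
  // (e.g. via MPI_Allreduce) gives dof_handler.n_dofs() on all ranks.

  MPI_Barrier(mpi_communicator); // every rank reaches this barrier...

  // ...but this collective call never returns on some time steps:
  solution.reinit(mpi_communicator,
                  dof_handler.n_dofs(),   // global size
                  n_locally_owned_dofs);  // this rank's share
}
```

Since reinit() is collective, my understanding is that it can only hang
like this if the ranks disagree about something - either not all ranks
reach the call, or they reach it with inconsistent sizes - which is why
I did the barrier and DoF-count checks above.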
I have not run in debug mode yet. That is something I need to try,
although I am not sure whether the run will end (e.g. on an assertion)
before it gets to the "good" part.
I am using deal.II 6.1.0 with PETSc 2.x. (I am working with the server
team, unsuccessfully so far, to get deal.II 6.2.1 with PETSc 3.x going.)
Thank you,
Dan
_______________________________________________
dealii mailing list http://poisson.dealii.org/mailman/listinfo/dealii