Hi all,
Sorry if the subject is misleading... what I'm experiencing is not
necessarily a deadlock, but it is very, very vexing. At a point in my
code where I have refined the mesh and distributed the DoFs, I am
reinit()-ing a parallel (PETScWrappers::MPI) vector. There is no error
message; the program simply hangs (confirmed by attaching gdb). This is
a time-stepping model, and the problem does not occur at every time
step. Indeed, it used to occur pretty infrequently and seemed sporadic
rather than deterministic, which made me think it had to do with
conditions on the computing cluster. But now it happens regularly,
i.e. consistently with the same set of inputs at the same point in the
simulation.
I'm sorry I'm not providing more detail; responses to this email will
almost certainly ask for some, but I did not want to dump all of my
code, most of it probably irrelevant, into this email. At this point I
would just like to know under what circumstances the above could
happen - that is, the program "hanging" on a reinit() call on a
PETScWrappers::MPI vector, almost as if there were an infinite loop
hiding somewhere, without throwing any error at all. When someone
suggested a deadlock to me, I was skeptical, but I still put a number of
MPI_Barrier() calls leading up to the reinit() call. I also checked
that the sum of the locally owned DoF counts equals the global DoF count.
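For concreteness, the pattern around the hang looks roughly like this
(a minimal sketch, not my actual code - names like triangulation,
dof_handler, fe, solution, and mpi_communicator are placeholders, and
the reinit() signature is the old-style one I believe my deal.II
version uses):

```cpp
// Sketch only: assumes deal.II built with PETSc support; all
// identifiers here are placeholders, not my real variable names.
#include <deal.II/lac/petsc_parallel_vector.h>

void refine_and_reinit()
{
  triangulation.execute_coarsening_and_refinement();
  dof_handler.distribute_dofs(fe);

  // Sanity check I performed: summing the per-rank local DoF counts
  // (e.g. via MPI_Allreduce) gives dof_handler.n_dofs() on all ranks.

  MPI_Barrier(mpi_communicator); // every rank reaches this barrier...

  // ...but this collective call never returns on some time steps:
  solution.reinit(mpi_communicator,
                  dof_handler.n_dofs(),   // global size
                  n_locally_owned_dofs);  // this rank's share
}
```

Since reinit() is collective, my understanding is that it can only hang
like this if the ranks disagree about something - either not all ranks
reach the call, or they reach it with inconsistent sizes - which is why
I did the barrier and DoF-count checks above.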
I have not run in debug mode yet. That is something I need to try,
although I am not sure whether the run will end (e.g. on an assertion)
before it gets to the "good" part.
I am using deal.II 6.1.0 with PETSc 2.x. (I am working with the server
team, unsuccessfully so far, to get deal.II 6.2.1 with PETSc 3.x going.)
Thank you,
Dan
_______________________________________________
dealii mailing list http://poisson.dealii.org/mailman/listinfo/dealii