Ralph and all,

my understanding is that opal_finalize_util aggressively tries to free
memory that would otherwise still be allocated.

another way of saying "make valgrind happy" is "fully automate memory
leak detection"
(Joost pointed to the -fsanitize=leak feature of gcc 4.9 in
http://www.open-mpi.org/community/lists/devel/2014/05/14672.php)

the following simple program :

#include <mpi.h>

int main(int argc, char* argv[])
{
  int ret, provided;
  ret = MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
  ret = MPI_T_finalize();
  return 0;
}

leaks a *lot* of objects (and might remove some environment variables as
well) which have been half destroyed by opal_finalize_util, for example :
- classes are still marked as initialized *but* the cls_construct_array
has been free'd
- the oob framework was not deallocated and is still marked as
MCA_BASE_FRAMEWORK_FLAG_REGISTERED,
  but some mca variables were freed, and that will cause problems when
MPI_Init tries to (re)start the tcp component
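
to make the "half destroyed" state concrete, here is a minimal sketch of
the class case. it is a toy model only: toy_class_t and all toy_* functions
are hypothetical names made up for illustration, not the real opal class
code. it only shows why a flag that still says "initialized" after the
array has been freed prevents a clean re-initialization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a class descriptor (not the real opal type). */
typedef struct {
    bool initialized;      /* "this class is initialized" flag */
    int *construct_array;  /* stands in for the class construct array */
} toy_class_t;

/* Lazy init: does nothing when the flag says the class is already set up. */
void toy_class_init(toy_class_t *cls)
{
    if (cls->initialized) {
        return;
    }
    cls->construct_array = calloc(4, sizeof(int));
    assert(cls->construct_array != NULL);
    cls->initialized = true;
}

/* Half teardown: frees the array but leaves the flag set. */
void toy_class_finalize_half(toy_class_t *cls)
{
    free(cls->construct_array);
    cls->construct_array = NULL;
}

/* Consistent teardown: frees the array and clears the flag. */
void toy_class_finalize_full(toy_class_t *cls)
{
    free(cls->construct_array);
    cls->construct_array = NULL;
    cls->initialized = false;
}

/* The class is usable only when the flag and the array agree. */
bool toy_class_usable(const toy_class_t *cls)
{
    return cls->initialized && cls->construct_array != NULL;
}
```

in this model, init / finalize_half / init leaves an unusable class because
the second init returns early, whereas init / finalize_full / init works.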

now my 0.02$ :

ideally, neither MPI_Finalize nor MPI_T_finalize would leak any memory and
the framework would be re-initializable.
this could be a goal and George gave some good explanations on why it is
hard to achieve.
from my pragmatic point of view, and for this test case only, i am very
happy with a simple working solution,
even if it means that MPI_T_finalize leaks way too much memory in order
to work around the non re-initializable framework.
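
as an illustration of that trade-off, here is a toy model of the
"only decrement the counter" alternative mentioned in Ralph's message
quoted below. the counter name is loosely borrowed from
opal_util_initialized, but every toy_* name and the logic itself are
assumptions made for this sketch, not the actual opal implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a reference-counted utility layer. */
static int toy_util_initialized = 0;  /* number of active users */
static bool toy_state_alive = false;  /* is the shared state set up? */

/* First caller sets the state up, later callers only take a reference. */
void toy_init_util(void)
{
    if (toy_util_initialized++ > 0) {
        return;
    }
    toy_state_alive = true;
}

/* Workaround flavour: drop our reference but never tear anything down.
 * The state stays alive (i.e. it is leaked) so a later init can use it. */
void toy_release_only(void)
{
    assert(toy_util_initialized > 0);
    toy_util_initialized--;
}

/* Full teardown once the last user is gone. */
void toy_finalize_util(void)
{
    assert(toy_util_initialized > 0);
    if (--toy_util_initialized > 0) {
        return;
    }
    toy_state_alive = false;
}
```

in this model, init / release_only / init / finalize_util never exposes
a half destroyed state to the second init; the price is that whatever the
first user allocated is kept around until the very end.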

Cheers,

Gilles

On 2014/07/16 12:49, Ralph Castain wrote:
> I've attached a solution that blocks the segfault without requiring any 
> gyrations. Can someone explain why this isn't adequate?
>
> Alternate solution was to simply decrement opal_util_initialized in 
> MPI_T_finalize rather than calling finalize itself. Either way resolves the 
> problem in a very simple manner.
>
