Hi all,

I think I found a bug and a fix for it.
Could someone verify the rationale behind this bug? I get this SIGSEGV
on only one of two machines, and I don't quite see why it doesn't
always occur (same test program, identically compiled Open MPI 1.2.4).
The fix does prevent the segmentation fault, though. :)

Thanks,
Murat





Bug:
free() crashes when trying to free stack memory

Where:
ompi/communicator/comm_dyn.c:630

    OBJ_RELEASE(apps[i]);


SIGSEGV:
orte/mca/rmgr/rmgr_types.h:113

        free (app_context->cwd);



There are two ways that apps[i]->cwd is filled:
1. dynamically allocated memory
548     if ( !have_wdir ) {
            getcwd(cwd, OMPI_PATH_MAX);
            apps[i]->cwd = strdup(cwd);    // <--
        }

2. stack
354    char cwd[OMPI_PATH_MAX];
// ...
516         /* check for 'wdir' */
            ompi_info_get (array_of_info[i], "wdir", valuelen, cwd, &flag);
            if ( flag ) {
                apps[i]->cwd = cwd;  // <--
                have_wdir = 1;
            }
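
To make the difference between the two paths concrete, here is a minimal
sketch that models them outside of Open MPI. The names app_model,
fill_cwd_as_in_report and SKETCH_PLACEHOLDER_DIR are hypothetical, and
the placeholder directory stands in for the getcwd() call. It only
reports whether the stored pointer aliases the caller's buffer; it never
calls free() on it. Passing a pointer that did not come from
malloc()/strdup() to free() is undefined behavior in C, which would be
consistent with the crash showing up on one machine and not the other.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical placeholder for the directory getcwd() would return. */
#define SKETCH_PLACEHOLDER_DIR "/tmp"

/* Hypothetical stand-in for the app context; only cwd matters here. */
struct app_model {
    char *cwd;
};

/*
 * Mimics the two assignments quoted above.
 *   wdir == NULL : no "wdir" info key, cwd becomes a strdup() copy (heap).
 *   wdir != NULL : "wdir" branch, cwd points at the caller's buffer.
 * Returns 1 if app->cwd aliases the caller's buffer, 0 otherwise.
 */
int fill_cwd_as_in_report(struct app_model *app, char *buf, size_t buflen,
                          const char *wdir)
{
    assert(app != NULL && buf != NULL && buflen > 0);

    if (wdir != NULL) {
        strncpy(buf, wdir, buflen - 1);
        buf[buflen - 1] = '\0';
        app->cwd = buf;            /* not owned by app: must not be freed */
    } else {
        strncpy(buf, SKETCH_PLACEHOLDER_DIR, buflen - 1);
        buf[buflen - 1] = '\0';
        app->cwd = strdup(buf);    /* owned by app: free() is valid */
    }
    return app->cwd == buf;
}
```

With a local char buf[64], calling the function with a non-NULL wdir
returns 1 (the pointer aliases the buffer), and calling it with NULL
returns 0 (the pointer is a separate heap copy the caller has to free).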



Fix: Always allocate cwd on the heap and make sure it is freed afterwards.

1.
<    char cwd[OMPI_PATH_MAX];
---
>    char *cwd = (char*)malloc(OMPI_PATH_MAX);

2. And on cleanup (somewhere below line 624)

>        if ( !have_wdir ) {
>            getcwd(cwd, OMPI_PATH_MAX);
>            apps[i]->cwd = strdup(cwd);
>        }
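
As a possible variation on this fix (not what the patch above does), the
stack buffer could stay as it is and every path could store its own
strdup() copy, so that the free() in the release code is always handed a
heap pointer and no two apps share one. A minimal sketch, assuming
hypothetical names app_owned, app_set_cwd and app_release that are not
part of Open MPI:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the app context; only cwd matters here. */
struct app_owned {
    char *cwd;
};

/*
 * Store a private heap copy of dir in app->cwd.
 * Returns 0 on success, -1 if the copy could not be allocated
 * (in which case app->cwd is left unchanged).
 */
int app_set_cwd(struct app_owned *app, const char *dir)
{
    assert(app != NULL && dir != NULL);

    char *copy = strdup(dir);
    if (copy == NULL) {
        return -1;
    }
    free(app->cwd);   /* drop any previous copy; free(NULL) is a no-op */
    app->cwd = copy;
    return 0;
}

/* Release code in the style of the free() quoted earlier: always valid here. */
void app_release(struct app_owned *app)
{
    assert(app != NULL);
    free(app->cwd);
    app->cwd = NULL;
}
```

Because each app holds its own copy, setting several apps from the same
local buffer and releasing them one by one does not free the same
pointer twice.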
