Ah, I found the root cause of the problem. It appears that whoever wrote the 
test thought that setting MPI_CHECK_ARGS=1 in the environment would force the 
param check to be done. However, a quick scan shows that this envar is *never* 
looked at by OMPI.

So the question is: did the test writer make a mistake? Or are we supposed to 
be looking at that envar?
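
To make the gap concrete, here is a rough sketch of what honoring such an envar would involve. Every name prefixed with sketch_ or SKETCH_ is made up for illustration and none of this is OMPI code; the rule for what counts as "enabled" (any non-empty value other than "0") is purely my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT Open MPI symbols. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_WIN = 1 };

struct sketch_win {
    int error_handler;
};

/* Plays the role of MPI_PARAM_CHECK in a "runtime" build. */
static bool sketch_param_check = false;

/* Assumed rule: enabled for any non-empty value other than "0". */
bool sketch_flag_enabled(const char *val)
{
    return val != NULL && val[0] != '\0' && strcmp(val, "0") != 0;
}

/* The step that would have to exist for an envar such as
 * MPI_CHECK_ARGS to have any effect: something must read it. */
void sketch_read_env(const char *name)
{
    assert(name != NULL);
    sketch_param_check = sketch_flag_enabled(getenv(name));
}

/* Same shape as the function quoted below: arguments are validated
 * only when the switch is on; otherwise win is dereferenced as-is. */
int sketch_win_set_errhandler(struct sketch_win *win, int handler)
{
    if (sketch_param_check && win == NULL) {
        return SKETCH_ERR_WIN;
    }
    win->error_handler = handler; /* crashes on NULL when the check is off */
    return SKETCH_SUCCESS;
}
```

With something like sketch_read_env("MPI_CHECK_ARGS") called at startup, passing NULL would come back as an error code instead of crashing. Without that call the envar is just noise in the environment, which matches what I am seeing.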


> On Jul 18, 2015, at 1:59 AM, Ralph Castain <r...@open-mpi.org> wrote:
> 
> Yeah, I finally traced it to an MCA param setting in my default param file. I 
> swear, as much as I like our MCA param system, there are times like this when 
> it leaves something to be desired.
> 
> Sigh. Sorry for the “false” alarm.
> 
> 
>> On Jul 17, 2015, at 8:54 PM, Gilles Gouaillardet 
>> <gilles.gouaillar...@gmail.com> wrote:
>> 
>> Ralph,
>> 
>> based on the source code (ompi_mpi_params.c:91) I was expecting a Boolean 
>> ompi_mpi_param_check
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Saturday, July 18, 2015, Ralph Castain <r...@open-mpi.org> wrote:
>> Yep, I checked:
>> 
>>      MPI parameter check: runtime
>> 
>> 
>> 
>>> On Jul 17, 2015, at 8:00 PM, Gilles Gouaillardet 
>>> <gilles.gouaillar...@gmail.com> wrote:
>>> 
>>> Ralph,
>>> 
>>> I will try to reproduce this.
>>> I guess you already checked the output of ompi_info to confirm params are 
>>> checked at runtime.
>>> 
>>> Cheers,
>>> 
>>> Gilles
>>> 
>>> On Saturday, July 18, 2015, Ralph Castain <r...@open-mpi.org> wrote:
>>> Hi folks
>>> 
>>> I keep getting segfault errors when testing 1.10, while others say the 
>>> tests are passing for them. The tests are in the onesided area, but I don’t 
>>> believe they necessarily are a onesided issue.
>>> 
>>> Specifically, the tests (e.g., test_start1.c) call MPI_Win_set_errhandler 
>>> with a NULL argument for the first parameter (MPI_Win). Looking at the code 
>>> for that function, I see this:
>>> 
>>> int MPI_Win_set_errhandler(MPI_Win win, MPI_Errhandler errhandler)
>>> {
>>>     MPI_Errhandler tmp;
>>> 
>>>     OPAL_CR_NOOP_PROGRESS();
>>> 
>>>     if (MPI_PARAM_CHECK) {
>>>         OMPI_ERR_INIT_FINALIZE(FUNC_NAME);
>>> 
>>>         if (ompi_win_invalid(win)) {
>>>             return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_WIN,
>>>                                           FUNC_NAME);
>>>         } else if (NULL == errhandler ||
>>>                    MPI_ERRHANDLER_NULL == errhandler ||
>>>                    (OMPI_ERRHANDLER_TYPE_WIN != errhandler->eh_mpi_object_type &&
>>>                     OMPI_ERRHANDLER_TYPE_PREDEFINED != errhandler->eh_mpi_object_type) ) {
>>>             return OMPI_ERRHANDLER_INVOKE(win, MPI_ERR_ARG, FUNC_NAME);
>>>         }
>>>     }
>>> 
>>>     /* Prepare the new error handler */
>>>     OBJ_RETAIN(errhandler);
>>> 
>>>     /* Ditch the old errhandler, and decrement its refcount.  On 64
>>>        bits environments we have to make sure the reading of the
>>>        error_handler became atomic. */
>>>     do {
>>>         tmp = win->error_handler;
>>>     } while (!OPAL_ATOMIC_CMPSET(&(win->error_handler), tmp, errhandler));
>>>     OBJ_RELEASE(tmp);
>>> 
>>>     /* All done */
>>>     return MPI_SUCCESS;
>>> }
>>> 
>>> If someone built with --with-mpi-param-check=always or runtime, then this 
>>> function will return an error when given the NULL argument. Otherwise, it 
>>> will definitely segfault. According to the configure output, this option is 
>>> supposed to default to “runtime”. I don’t set it in my configury, so I 
>>> would have thought this was the case. And when I look at the config.log, I 
>>> see:
>>> 
>>> configure:10401: checking if want run-time MPI parameter checking
>>> configure:10425: result: runtime
>>> 
>>> 
>>> However, what I’m seeing implies that this is *not* the case - i.e., we 
>>> aren’t checking MPI params, and hence I am crashing. Does anyone have any 
>>> thoughts on what could be going on? Is this test itself even correct?
>>> 
>>> Ralph
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2015/07/17656.php
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2015/07/17661.php
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2015/07/17663.php
> 
