Oops!! You put the fix in the wrong place, I'm afraid:

    /* leave space for length field */
    memset(c, ' ', PMII_COMMANDLEN_SIZE);
    c += PMII_COMMANDLEN_SIZE;

    PMI2U_ERR_CHKANDJUMP(strlen(cmd) > PMI2_MAX_VALLEN, pmi2_errno, 
PMI2_ERR_OTHER, "**cmd_too_long");

    ret = snprintf(c, remaining_len, "cmd=%s;", cmd);  <<<<<  THIS IS THE 
PROBLEM POINT

    PMI2U_ERR_CHKANDJUMP(ret >= remaining_len, pmi2_errno, PMI2_ERR_OTHER, 
"**intern %s", "Ran out of room for command");
    c += ret;
    remaining_len -= ret;
    /* Subtract the PMII_COMMANDLEN_SIZE to prevent
     * certain implementation of snprintf() to
     * segfault when zero out the buffer.
     * PMII_COMMANDLEN_SIZE must be added later on
     * back again to send out the right protocol
     * message size.
     */
    remaining_len -= PMII_COMMANDLEN_SIZE;


That change needed to go *before* the highlighted snprintf. I'm afraid 2.6.3 
continue to segfault :-(


On Oct 3, 2013, at 12:12 PM, Ralph Castain <r...@open-mpi.org> wrote:

> 
> Cool - thanks David!
> 
> On Oct 3, 2013, at 11:31 AM, David Bigagli <da...@schedmd.com> wrote:
> 
>> Fixed. I chose the method proposed by Michael, subtract first, add later. :-)
>> 
>> On 10/03/2013 10:29 AM, Ralph Castain wrote:
>>> 
>>> 
>>> On Oct 3, 2013, at 10:16 AM, David Bigagli <da...@schedmd.com> wrote:
>>> 
>>>> 
>>>> 
>>>> I am not saying that remaining_len is correct or that mpich is bugless :-) 
>>>> I am only saying that decrementing remaining_len as proposed breaks the 
>>>> pmi2 protocol as since the client sends a wrong length to the server. We 
>>>> actually applied the very same fix in the past and then had to undo it 
>>>> commit: 1e25eb10df0f.
>>>> 
>>>> Funny enough Ralph was one of those that reported that broken protocol bug 
>>>> :-) "mpi/pmi2: no value for key %s in req".
>>> 
>>> Ha! Yes indeed, we do see that reappear. :-/
>>> 
>>>> 
>>>> We can either use sprintf() as there is no need to use snprintf() here or 
>>>> do snprintf(c, remaining_len - PMII_COMMANDLEN_SIZE, "thrid=%p;", resp);
>>> 
>>> Either way is fine with me - feel free to choose.
>>> Thanks!
>>> 
>>>> 
>>>> On 10/03/2013 09:50 AM, Ralph Castain wrote:
>>>>> 
>>>>> I'm afraid that isn't quite correct, David. If you walk thru the code, 
>>>>> you'll see that the snprintf is being given an incorrect length for the 
>>>>> buffer. When using some libc versions, snprintf automatically fills the 
>>>>> given buffer with NULLs to ensure that the string is NULL-terminated. 
>>>>> When it does so, the code as currently written causes a segfault.
>>>>> 
>>>>> We have tested the provided change and it fixes the problem. If your code 
>>>>> review indicates that remaining_len should not be changed where we did 
>>>>> it, then you need to at least change the snprintf command to reflect the 
>>>>> reduced size of the "c" buffer.
>>>>> 
>>>>> On Oct 3, 2013, at 9:44 AM, David Bigagli <da...@schedmd.com> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Hi,
>>>>>>  I don't know the details of the segfault but the code in question is 
>>>>>> correct. If you decrease the length then the file cmdlen:
>>>>>> 
>>>>>> cmdlen = PMII_MAX_COMMAND_LEN - remaining_len;
>>>>>> 
>>>>>> will not be correct and wrong length will be sent to the pmi2 server.
>>>>>> 
>>>>>> This code is taken verbatim from mpich2-1.5, precisely from: 
>>>>>> mpich2-1.5/src/pmi/pmi2/simple2pmi.c
>>>>>> 
>>>>>> On 10/03/2013 08:47 AM, Ralph Castain wrote:
>>>>>>> 
>>>>>>> Hi folks
>>>>>>> 
>>>>>>> We have uncovered a segfault-causing bug in contribs/pmi2/pmi2_api.c. 
>>>>>>> Looking at the code in the current trunk, you need to add the following 
>>>>>>> at line 1499:
>>>>>>> 
>>>>>>>    remaining_len -= PMII_COMMANDLEN_SIZE;
>>>>>>> 
>>>>>>> as you have changed the location of the pointer to the buffer, but 
>>>>>>> failed to account for that change in the size param being later passed 
>>>>>>> to snprintf.
>>>>>>> 
>>>>>>> Could you please fix this and include it in 2.6.3?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> Ralph
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Thanks,
>>>>>>     /David
>>>> 
>>>> --
>>>> 
>>>> Thanks,
>>>>     /David
>> 
>> -- 
>> 
>> Thanks,
>>     /David

Reply via email to