This seems to be working, but I think we now have a process group problem -- I
think we need to call setpgid() right after the fork. Otherwise, when we kill the
group, we might end up killing much more than just the one MPI process
(including the orted and/or the orted's parent!).
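
To make the concern concrete, here is a minimal sketch of the usual POSIX pattern, not the actual orted launch code. The helper names spawn_in_own_group() and kill_group() are hypothetical and exist only for illustration. The child is moved into its own process group immediately after fork(), and the parent makes the same call so the group exists no matter which side runs first. A later kill() on the negated pgid then reaches only that child and its descendants.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, put it in its own process group,
 * and exec argv. Returns the child's pid (also its pgid), or -1 if
 * fork() fails. */
static pid_t spawn_in_own_group(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: become the leader of a new process group before exec. */
        setpgid(0, 0);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    /* Parent: make the same call so the group is in place before we
     * return, whichever process is scheduled first. If the child has
     * already exec'd this fails with EACCES, which is harmless because
     * the child set its own group before the exec. */
    (void)setpgid(pid, pid);
    return pid;
}

/* Hypothetical helper: signal every process in the group led by pgid.
 * A negative pid argument to kill() targets the whole process group. */
static int kill_group(pid_t pgid, int sig)
{
    return kill(-pgid, sig);
}
```

Without the setpgid() calls the child would stay in its parent's process group, so the same kill(-pgid, ...) would also hit the parent and anything else sharing that group.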
Ping me on IM -- I'm test
Okay, fixed and cmr'd to you
On Mar 18, 2014, at 11:00 AM, Ralph Castain wrote:
>
> On Mar 18, 2014, at 10:54 AM, Dave Goodell (dgoodell)
> wrote:
>
>> Ralph,
>>
>> I'm seeing problems with MPIEXEC_TIMEOUT in v1.7 @ r31103 (fairly close to
>> HEAD):
>>
>> 8<
>> MPIEXEC_TIMEOUT=8
Ralph,
I'm seeing problems with MPIEXEC_TIMEOUT in v1.7 @ r31103 (fairly close to
HEAD):
8<
MPIEXEC_TIMEOUT=8 mpirun --mca btl usnic,sm,self -np 4 ./sleeper
--
The user-provided time limit for job execution has been