Just to be safe, I blew away my existing installations and got completely fresh
checkouts. I am doing a vanilla configure, with the only configure options
besides prefix being --enable-orterun-prefix-by-default and --enable-mpi-java
(so I can test the Java bindings).
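(For reference, the full configure invocation implied here would look something like "./configure --prefix=<installdir> --enable-orterun-prefix-by-default --enable-mpi-java", where <installdir> is just a placeholder for whatever prefix is being used.)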
For 1.7.5, running the IBM t
This seems to be working, but I think we now have a process group problem -- I
think we need to setpgid() right after the fork. Otherwise, when we kill the
group, we might end up killing much more than just the one MPI process
(including the orted and/or the orted's parent!).
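For context, a minimal C sketch of the fork()/setpgid()/killpg() pattern being described -- a simplified illustration of the idea, not the actual orted launch code:

#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

pid_t launch_child(char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (0 == pid) {
        /* child: become leader of its own process group */
        setpgid(0, 0);
        execve(argv[0], argv, envp);
        _exit(127);                 /* only reached if exec fails */
    } else if (pid > 0) {
        /* parent: set it too, closing the race with an early kill */
        setpgid(pid, pid);
    }
    return pid;
}

void terminate_child(pid_t pid)
{
    /* after setpgid(pid, pid), the group id equals the child's pid,
     * so this signals only that child's group -- not the orted or
     * the orted's parent */
    killpg(pid, SIGTERM);
}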
Ping me on IM -- I'm test
Okay, fixed and cmr'd to you
On Mar 18, 2014, at 11:00 AM, Ralph Castain wrote:
>
> On Mar 18, 2014, at 10:54 AM, Dave Goodell (dgoodell)
> wrote:
>
>> Ralph,
>>
>> I'm seeing problems with MPIEXEC_TIMEOUT in v1.7 @ r31103 (fairly close to
>> HEAD):
>>
>> 8<
>> MPIEXEC_TIMEOUT=8 mpirun --mca btl usnic,sm,self -np 4 ./sleeper
On Mar 18, 2014, at 10:54 AM, Dave Goodell (dgoodell)
wrote:
> Ralph,
>
> I'm seeing problems with MPIEXEC_TIMEOUT in v1.7 @ r31103 (fairly close to
> HEAD):
>
> 8<
> MPIEXEC_TIMEOUT=8 mpirun --mca btl usnic,sm,self -np 4 ./sleeper
> --
Tomorrow at 9am US Eastern, IU will be changing the IP address of open-mpi.org
(and all of its associated services: email, web, etc.).
They're hoping it causes no downtime -- there should be proxies in place to
relay traffic from the old IP addresses for the next week or two, so that no
one sho
Ralph,
I'm seeing problems with MPIEXEC_TIMEOUT in v1.7 @ r31103 (fairly close to
HEAD):
8<
MPIEXEC_TIMEOUT=8 mpirun --mca btl usnic,sm,self -np 4 ./sleeper
--
The user-provided time limit for job execution has been
It's on the trunk, but I imagine it is on 1.7 as well. I use the "simple_spawn"
program in orte/test/mpi, and the cmd line is just "mpirun -np 2 ./simple_spawn"
On Mar 18, 2014, at 7:42 AM, Nathan Hjelm wrote:
> Is this trunk or 1.7? Can you give me your mpirun command?
>
> -Nathan
>
> On Tue, Mar 18, 2014 at 07:35:01AM -0700, Ralph Castain wrote:
Is this trunk or 1.7? Can you give me your mpirun command?
-Nathan
On Tue, Mar 18, 2014 at 07:35:01AM -0700, Ralph Castain wrote:
>I'm seeing comm_spawn hang here:
>[bend001][[52890,1],0][coll_ml_module.c:3030:mca_coll_ml_comm_query]
>COLL-ML ml_coll_schedule_setup exit with error
>
I'm seeing comm_spawn hang here:
[bend001][[52890,1],0][coll_ml_module.c:3030:mca_coll_ml_comm_query] COLL-ML
ml_coll_schedule_setup exit with error
[bend001][[52890,1],1][coll_ml_module.c:3030:mca_coll_ml_comm_query] COLL-ML
ml_coll_schedule_setup exit with error
Setting -mca coll ^ml allows t
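For reference, combining that workaround with the simple_spawn reproducer mentioned elsewhere in the thread, the invocation would look something like (the exact combination is an assumption): mpirun -mca coll ^ml -np 2 ./simple_spawn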
Thanks for your fix.
You say that the environment is only taken into
account during registration. There is another variable set in the
environment in opal-restart.c. Does the following still work:
opal-restart.c:
(void) mca_base_var_env_name("crs", &tmp_env_var);
opal_setenv(tmp_env_var,
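For reference, the pattern being asked about -- building the MCA environment-variable name for the "crs" framework and forcing a value into the environment -- looks roughly like the sketch below. The value passed in is illustrative only; the actual arguments in opal-restart.c are cut off in the quote above.

/* sketch only: not the literal opal-restart.c code */
#include "opal/mca/base/mca_base_var.h"   /* mca_base_var_env_name() */
#include "opal/util/opal_environ.h"       /* opal_setenv() */
#include <stdlib.h>

extern char **environ;

static void force_crs_component(const char *crs_comp)
{
    char *tmp_env_var = NULL;

    /* yields the env-var name, e.g. "OMPI_MCA_crs" */
    (void) mca_base_var_env_name("crs", &tmp_env_var);

    /* overwrite any existing value in this process's environment */
    opal_setenv(tmp_env_var, crs_comp, true, &environ);

    free(tmp_env_var);
}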