On Apr 19, 2006, at 4:15 PM, Greg Watson wrote:
We've just run across a rather tricky issue. We're calling
opal_event_loop() to dispatch orte events to an orted that has been
launched separately. However if the orted dies for some reason (gets
a signal or whatever) then opal_event_loop() is call
Well, I actually don't know much about opal_event_loop and/or how it is
intended to work. My guess is that:
(a) your remote orted is acting as the seed and your local process (the
one in Eclipse) is running as a client to that seed - at least, that
was the case last I talked to Nathan
(b) whe
You make a good point about the library not calling exit(). I'll have
to recruit some help to look at the notion of opal_event_loop returning
an error value - it isn't entirely clear who it would return it to in
our system. Even though I understand how someone in your situation
would handle it,
The simplest thing for us would be for opal_event_loop() to return an
error value. That way we can detect the situation and clean up our
system. At the moment we're not trying to restart orted, so clean
recovery of orte is not that important, though ultimately I would
think it is desirable.
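To make the request concrete, here is a minimal sketch of how a caller might react if the event loop reported a failure. Everything in it is a hypothetical stand-in: fake_event_loop_once(), LOOP_PEER_LOST, do_our_stuff() and clean_up() are made up for illustration and are not Open MPI functions, and the "daemon dies on the third pass" behaviour is simulated.

```c
/* Hypothetical stand-ins only; none of these are Open MPI calls. */
enum { LOOP_OK = 0, LOOP_PEER_LOST = -1 };
enum { KEEP_GOING = 0, GAME_OVER = 1 };

static int ticks = 0;       /* how many times the fake loop has run */
static int cleaned_up = 0;  /* set once the caller has tidied up */

/* Simulated event loop: pretend the daemon goes away on the third pass. */
static int fake_event_loop_once(void)
{
    ticks++;
    return (ticks >= 3) ? LOOP_PEER_LOST : LOOP_OK;
}

/* Placeholder for the application's own work; never asks to stop here. */
static int do_our_stuff(void)
{
    return KEEP_GOING;
}

/* Placeholder for whatever the application needs to release. */
static void clean_up(void)
{
    cleaned_up = 1;
}

/* Runs the loop and returns the number of passes that completed normally. */
static int run_loop(void)
{
    int passes = 0;

    for (;;) {
        if (do_our_stuff() == GAME_OVER)
            break;
        if (fake_event_loop_once() != LOOP_OK) {
            /* The loop told us something went wrong: tidy up and leave. */
            clean_up();
            break;
        }
        passes++;
    }
    return passes;
}
```

With the simulated failure above, run_loop() completes two passes, calls clean_up() on the third, and returns. The point is only that the decision to stop stays with the caller rather than with the library.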
Ok, thanks.
For clarification, the model we're using at the moment looks roughly
like this:
    orte_init();
    forever () {
        if (do_our_stuff() == GAME_OVER)
            break;
        opal_event_loop(OPAL_EVLOOP_ONCE);
    }
    orte_finalize();
The simplest change for us would be something
Looks reasonable - let me see what can be done.
Thanks
Ralph
Hey Guys,
Not sure what is going on here, has anyone seen this before?
- Galen
Hi Galen,
Sorry to bother you.
I have installed the latest stable version of Open MPI (1.0) on two of
the spider nodes (s7, s4) for some experiments, but there seems to be a
configuration error or something else which I