Re: [OMPI devel] opal_event_loop exiting

2006-04-20 Thread Ralph Castain
Looks reasonable - let me see what can be done.

Thanks
Ralph

Greg Watson wrote:

Ok, thanks. For clarification, the model we're using at the moment looks roughly like this:

    orte_init();
    forever () {
        if (do_our_stuff() == GAME_OVER)
            break;
        opal_event_

Re: [OMPI devel] opal_event_loop exiting

2006-04-20 Thread Greg Watson
Ok, thanks. For clarification, the model we're using at the moment looks roughly like this:

    orte_init();
    forever () {
        if (do_our_stuff() == GAME_OVER)
            break;
        opal_event_loop(OPAL_EVLOOP_ONCE);
    }
    orte_finalize();

The simplest change for us would be something

Re: [OMPI devel] opal_event_loop exiting

2006-04-20 Thread Greg Watson
The simplest thing for us would be for opal_event_loop() to return an error value. That way we can detect the situation and clean up our system. At the moment we're not trying to restart orted, so clean recovery of orte is not that important, though ultimately I would think it is desirable.
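To make the request concrete, here is a minimal sketch of how a caller might use an event loop pass that reports a lost daemon through its return value instead of exiting. Everything prefixed with `sketch_` or `SKETCH_` is hypothetical and stands in for the real ORTE/OPAL calls; none of it is taken from the Open MPI headers, and the "daemon" is simulated by a simple countdown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes, not the real OPAL/ORTE constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_PEER_LOST = -1 };
enum { GAME_ON = 0, GAME_OVER = 1 };

/* Simulated daemon: survives this many loop passes; negative means forever. */
struct sketch_daemon {
    int polls_until_death;
};

/* Stand-in for one event loop pass that returns an error rather than exiting. */
static int sketch_event_loop_once(struct sketch_daemon *d)
{
    if (d->polls_until_death == 0)
        return SKETCH_ERR_PEER_LOST;   /* the daemon is gone: tell the caller */
    if (d->polls_until_death > 0)
        d->polls_until_death--;
    return SKETCH_SUCCESS;
}

/* Stand-in for do_our_stuff(): consumes one work item per call. */
static int sketch_do_our_stuff(int *work_left)
{
    if (*work_left == 0)
        return GAME_OVER;
    (*work_left)--;
    return GAME_ON;
}

/*
 * The loop from the thread, with the return value checked.
 * Returns SKETCH_SUCCESS when the work finishes normally, or
 * SKETCH_ERR_PEER_LOST after running caller-side cleanup once.
 */
static int sketch_run(struct sketch_daemon *d, int work_items, int *cleanups)
{
    assert(d != NULL && cleanups != NULL && work_items >= 0);

    int work_left = work_items;
    int rc = SKETCH_SUCCESS;

    for (;;) {
        if (sketch_do_our_stuff(&work_left) == GAME_OVER)
            break;
        rc = sketch_event_loop_once(d);
        if (rc != SKETCH_SUCCESS) {
            (*cleanups)++;             /* application cleanup would go here */
            break;
        }
    }
    return rc;
}
```

With this shape the application, not the library, decides what to do when the daemon disappears: it can clean up and carry on, or exit on its own terms.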

Re: [OMPI devel] opal_event_loop exiting

2006-04-20 Thread Ralph Castain
You make a good point about the library not calling exit(). I'll have to recruit some help to look at the notion of opal_event_loop returning an error value - it isn't entirely clear who it would return it to in our system. Even though I understand how someone in your situation would handle it,

Re: [OMPI devel] opal_event_loop exiting

2006-04-20 Thread Ralph Castain
Well, I actually don't know much about opal_event_loop and/or how it is intended to work. My guess is that: (a) your remote orted is acting as the seed and your local process (the one in Eclipse) is running as a client to that seed - at least, that was the case last I talked to Nathan (b) whe

Re: [OMPI devel] opal_event_loop exiting

2006-04-20 Thread Brian Barrett
On Apr 19, 2006, at 4:15 PM, Greg Watson wrote:

We've just run across a rather tricky issue. We're calling opal_event_loop() to dispatch orte events to an orted that has been launched separately. However if the orted dies for some reason (gets a signal or whatever) then opal_event_loop() is call

Re: [OMPI devel] opal_event_loop exiting

2006-04-19 Thread Greg Watson
Oops, should have mentioned that this is 1.0.2 on MacOSX.

Greg

On Apr 19, 2006, at 5:15 PM, Greg Watson wrote:

Hi all,

We've just run across a rather tricky issue. We're calling opal_event_loop() to dispatch orte events to an orted that has been launched separately. However if the orted d