This was happening on Alpha 1 as well, but I upgraded to Alpha 4 today to see whether it had gone away. It has not.

I register a callback on a spawn() inside ORTE. That callback receives the current state and should be called as the job moves through each of those states.

I am now noticing that jobs never go through the INIT state. They may be skipping other states as well, but they definitely never report ORTE_PROC_STATE_INIT.

I was registering the IOForwarding callback during the INIT phase so, consequently, I now do not have IOF. There are other side effects as well: jobs that I start seem to sit in the 'starting' state indefinitely and then, suddenly, they're done.
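
To make the failure mode concrete, here is a small self-contained C sketch of the pattern. It is not ORTE code: the demo_* types, the state values, and demo_deliver() are hypothetical stand-ins made up for illustration, and the "runtime" is just a loop that replays a list of states. It only shows why a setup step tied to INIT is silently skipped when INIT is never delivered.

```c
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; these are NOT ORTE's
 * real types, constants, or values. */
typedef enum {
    DEMO_STATE_INIT,
    DEMO_STATE_LAUNCHED,
    DEMO_STATE_RUNNING,
    DEMO_STATE_TERMINATED,
    DEMO_STATE_COUNT
} demo_state_t;

/* Shape of a state-change callback in this sketch (assumed, not ORTE's). */
typedef void (*demo_state_cb_t)(int jobid, demo_state_t state, void *user);

/* Records which states the callback observed and whether the
 * INIT-dependent setup (standing in for IOF registration) ran. */
typedef struct {
    int seen[DEMO_STATE_COUNT];
    int iof_registered;
} demo_tracker_t;

/* Callback: log every state and do the INIT-only setup. */
static void demo_on_state(int jobid, demo_state_t state, void *user)
{
    demo_tracker_t *t = (demo_tracker_t *)user;

    t->seen[state] = 1;
    if (state == DEMO_STATE_INIT) {
        /* This is the step that never happens if INIT is skipped. */
        t->iof_registered = 1;
    }
    printf("job %d -> state %d\n", jobid, (int)state);
}

/* Simulated runtime: invoke the callback once per state in the list. */
static void demo_deliver(int jobid, const demo_state_t *states, int n,
                         demo_state_cb_t cb, void *user)
{
    for (int i = 0; i < n; i++) {
        cb(jobid, states[i], user);
    }
}
```

Replaying INIT, LAUNCHED, RUNNING, TERMINATED through demo_deliver() leaves iof_registered set. Replaying only LAUNCHED and TERMINATED leaves it at 0 with no error anywhere, which matches what I am seeing. Logging every state from the callback like this is also a cheap way to find out which other states are being skipped.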

Can someone look into / comment on this please?

Thanks.

--
-- Nathan
Correspondence
---------------------------------------------------------------------
Nathan DeBardeleben, Ph.D.
Los Alamos National Laboratory
Parallel Tools Team
High Performance Computing Environments
phone: 505-667-3428
email: ndeb...@lanl.gov
---------------------------------------------------------------------
