Could be - let me investigate this weekend.
Thanks for all that parsing!!!
> On Oct 23, 2015, at 5:00 PM, Mark Santcroos
> wrote:
>
> Is this the culprit?
>
> 'ACTIVATING PROC [[8679,2],0] STATE IOF COMPLETE PRI 4',
> 'state:base:track_procs called for proc
Is this the culprit?
'ACTIVATING PROC [[8679,2],0] STATE IOF COMPLETE PRI 4',
'state:base:track_procs called for proc [[8679,2],0] state RUNNING',
That seems to be out of order for the hanging processes.
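
A minimal sketch of how this ordering could be checked automatically across a parsed log. The two regular expressions are modeled only on the two log lines quoted above; the function name `find_out_of_order` and the assumption that every relevant line follows one of these two shapes are illustrative and not part of the original parser script.

```python
import re

# Patterns modeled on the two quoted log lines; other line formats are ignored.
ACTIVATE_RE = re.compile(r"ACTIVATING PROC (\[\[\d+,\d+\],\d+\]) STATE (.+?) PRI \d+")
TRACK_RE = re.compile(r"track_procs called for proc (\[\[\d+,\d+\],\d+\]) state ([A-Z][A-Z ]*)")


def find_out_of_order(lines):
    """Return procs whose IOF COMPLETE activation is logged before RUNNING."""
    iof_done = set()   # procs already seen with STATE IOF COMPLETE
    suspects = []      # procs later reported as RUNNING
    for line in lines:
        match = ACTIVATE_RE.search(line)
        if match:
            if match.group(2) == "IOF COMPLETE":
                iof_done.add(match.group(1))
            continue
        match = TRACK_RE.search(line)
        if match and match.group(2).strip() == "RUNNING" and match.group(1) in iof_done:
            suspects.append(match.group(1))
    return suspects
```

Feeding it the lines of the raw log would list the process names for which the suspicious order occurs, which can then be compared against the processes that hang.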
> On 21 Oct 2015, at 2:50, Ralph Castain wrote:
> Can you do me a favor?
Hi Ralph,
It required some parsing-fu, but here you go! :-)
Three text files attached. One is the raw log, the second is output from my
parser script and the third is the output of pstree after it
No, you won’t see the change to the daemon-to-proc connection coming to the
1.10 series. It will only be upstream from that one, starting with 2.0
> On Oct 23, 2015, at 9:15 AM, Justin Cinkelj wrote:
>
>
>
> - Original Message -
>> From: "Justin Cinkelj"
- Original Message -
> From: "Justin Cinkelj"
> To: "Open MPI Developers"
> Sent: Friday, October 23, 2015 5:59:43 PM
> Subject: Re: [OMPI devel] How is session dir used?
>
> Shared memory file is used by mpi_program only, and not by orted,
Each module has the opportunity to provide an ft_event function, that is
supposedly called when a change in the module behavior is necessary. Thus,
it is relatively easy to let the BTL know about the fact that a particular
destination process will migrate to a new location.
George.
On Fri,
The session dir is also used by the shared memory system for its backing file,
so you may need it if you plan to run more than one proc in a VM. This has been
one of the sticking points for VM/container-based operations.
As for the orted: your description is pretty close. The socket you mention
On Oct 22, 2015, at 7:17 AM, Gilles Gouaillardet
wrote:
>
> Gianmario,
>
> there was c/r support in the v1.6 series but it has been removed.
To be specific: the C/R support was removed from the v2.x branch because it is
stale / not working. The support is
I see the issue in the current code:
1. The current code assumes that if you use the MTT database reporter, you can
reach the database. One of the first things it does is ping the server to
ensure that it's reachable. The rationale is that you don't want MTT to run
for a long time and then
George,
Then you cannot use https, otherwise the certificate check will fail.
Note that if you have a proxy, you can tunnel to the proxy and that should be fine.
The main drawback is the ssh connection must be active when contacting IU, and
if a batch manager is used, no one knows when that will be
Gianmario,
IIRC, there is one pipe between orted and each child's stderr.
stdout is a pty, and stdin is /dev/null, but it might be a pipe on task 0.
This is how stdout/stderr from tasks ends up being printed by mpirun: orted
does I/O forwarding (aka IOF).
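
As a rough illustration of the pipe-based forwarding idea only (this is not orted's code), the sketch below launches a child with stdin set to /dev/null and stderr connected to a pipe, then reads the pipe and re-emits each line with a tag. The function name `run_and_forward` and the `[rank N]` prefix are hypothetical, and the child's stdout is simply inherited here rather than attached to a pty.

```python
import subprocess
import sys


def run_and_forward(cmd, rank):
    """Launch cmd, read its stderr through a pipe, and forward tagged lines."""
    forwarded = []
    with subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,  # child's stdin is /dev/null
        stderr=subprocess.PIPE,    # one pipe between parent and child stderr
        text=True,
    ) as proc:
        for line in proc.stderr:
            tagged = f"[rank {rank}] {line.rstrip()}"
            forwarded.append(tagged)
            print(tagged, file=sys.stderr)  # parent prints on the child's behalf
    return forwarded
```

A real forwarder would multiplex many children at once (for example with `select` or an event loop) instead of blocking on a single pipe.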
are you trying to migrate only one
Hi Adrian and Gilles,
first of all thank you for your responses. I'm working with Gianmario on
this ambitious project.
2015-10-22 13:17 GMT+02:00 Gilles Gouaillardet <
gilles.gouaillar...@gmail.com>:
> Gianmario,
>
> there was c/r support in the v1.6 series but it has been removed.
> the
Howard,
that has already been raised in
http://www.open-mpi.org/community/lists/mtt-users/2014/10/0820.php
at the end, Christoph claimed he could achieve that with mtt-relay
(but provided no detail on how ...)
You might want to check the full thread and/or ask Christoph directly
Ralph,
I was thinking about this, and I believe it would require a change to the mtt
client to avoid it. I’m working on a new Python-based version of it, and I’ll
make sure to deal with this there.
In the interim, I’ll have to defer to some old, gray Perl guru to update the
current client
> On Oct