Re: [OMPI devel] MPI_Comm_spawn fails under certain conditions

2014-06-25 Thread Ralph Castain
I see your point, but I don't know how to make that happen. The problem is
that spawn really should fail under certain conditions because you asked us
to do something we couldn't do - i.e., you asked that we launch and bind
more processes than we could. Increasing the number of available resources
will always change the situation and make it more likely that spawn will
succeed.

You can still trigger the behavior by individually setting the
oversubscribe property in the --map-by option - instead of giving
"--oversubscribe", just use "--map-by :oversubscribe". This will allow
oversubscription but not overload, and you'll be back to the original
scenario.
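
With the two-host case from your original report, that would be something like

mpirun -np 16 --host slurm1,slurm2 --map-by :oversubscribe --mca coll ^ml
./intercomm_create

which should allow more procs than cpus while still refusing to bind more
than one proc per core.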

On Tue, Jun 24, 2014 at 10:03 PM, Gilles Gouaillardet <gilles.gouaillar...@iferc.org> wrote:

> Hi Ralph,
>
> On 2014/06/25 2:51, Ralph Castain wrote:
> > Had a chance to review this with folks here, and we think that having
> > oversubscribe automatically set overload makes some sense. However, we do
> > want to retain the ability to separately specify oversubscribe and
> > overload as well, since these two terms don't mean quite the same thing.
> >
> > Our proposal, therefore, is to have the --oversubscribe flag set both the
> > --map-by :oversubscribe and --bind-to :overload-allowed properties. If
> > someone specifies both the --oversubscribe flag and a conflicting
> > directive for one or both of the individual properties, then we'll error
> > out with a "bozo" message.
> I fully agree.
> > The use-cases you describe are (minus the crash) correct, as the warning
> > is only emitted when you are overloaded (i.e., trying to bind to more
> > cpus than you have). So you won't get any warning when running on three
> > nodes, as you have enough cpus for all the procs, etc.
> >
> > I'll investigate the crash once I get home and have access to a cluster
> > again. The problem likely has to do with not properly responding to the
> > failure to spawn.
> Hmm,
>
> Because you already made the change described above (r32072), the crash
> no longer occurs.
>
> About the crash, I see things the other way around: spawn should not
> have failed.
> /* or spawn should have failed when running on a single node, at least
> for the sake of consistency */
>
> But like I said, it works now, so it might be just pedantic to point out
> a bug that is still there but cannot be triggered ...
>
> Cheers,
>
> Gilles


Re: [OMPI devel] MPI_Comm_spawn fails under certain conditions

2014-06-25 Thread Gilles Gouaillardet
Hi Ralph,

On 2014/06/25 2:51, Ralph Castain wrote:
> Had a chance to review this with folks here, and we think that having
> oversubscribe automatically set overload makes some sense. However, we do
> want to retain the ability to separately specify oversubscribe and
> overload as well, since these two terms don't mean quite the same thing.
>
> Our proposal, therefore, is to have the --oversubscribe flag set both the
> --map-by :oversubscribe and --bind-to :overload-allowed properties. If
> someone specifies both the --oversubscribe flag and a conflicting
> directive for one or both of the individual properties, then we'll error
> out with a "bozo" message.
I fully agree.
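If I understand correctly, with this proposal

mpirun -np 16 --host slurm1,slurm2 --oversubscribe --mca coll ^ml
./intercomm_create

would then be equivalent to

mpirun -np 16 --host slurm1,slurm2 --map-by :oversubscribe --bind-to :overload-allowed --mca coll ^ml ./intercomm_create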
> The use-cases you describe are (minus the crash) correct, as the warning
> is only emitted when you are overloaded (i.e., trying to bind to more
> cpus than you have). So you won't get any warning when running on three
> nodes, as you have enough cpus for all the procs, etc.
>
> I'll investigate the crash once I get home and have access to a cluster
> again. The problem likely has to do with not properly responding to the
> failure to spawn.
Hmm,

Because you already made the change described above (r32072), the crash
no longer occurs.

About the crash, I see things the other way around: spawn should not
have failed.
/* or spawn should have failed when running on a single node, at least
for the sake of consistency */

But like I said, it works now, so it might be just pedantic to point out
a bug that is still there but cannot be triggered ...

Cheers,

Gilles


[OMPI devel] MPI_Comm_spawn fails under certain conditions

2014-06-24 Thread Gilles Gouaillardet
Folks,

This issue is related to the failures reported by MTT on the trunk when
the ibm test suite invokes MPI_Comm_spawn.

My test bed is made of 3 (virtual) machines, each with 2 sockets and 8
cpus per socket.

If I run on one host (without any batch manager)

mpirun -np 16 --host slurm1 --oversubscribe --mca coll ^ml
./intercomm_create

then the test succeeds with the following warning:

--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        slurm2
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------


Now if I run on three hosts

mpirun -np 16 --host slurm1,slurm2,slurm3 --oversubscribe --mca coll ^ml
./intercomm_create

then the test succeeds without any warning.


But now, if I run on two hosts

mpirun -np 16 --host slurm1,slurm2 --oversubscribe --mca coll ^ml
./intercomm_create

then the test fails.

First, I get the same warning:

--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        slurm2
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------

followed by a crash:

[slurm1:2482] *** An error occurred in MPI_Comm_spawn
[slurm1:2482] *** reported by process [2068512769,0]
[slurm1:2482] *** on communicator MPI_COMM_WORLD
[slurm1:2482] *** MPI_ERR_SPAWN: could not spawn processes
[slurm1:2482] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
will now abort,
[slurm1:2482] ***    and potentially your MPI job)


That being said, the following command works:

mpirun -np 16 --host slurm1,slurm2 --mca coll ^ml --bind-to none
./intercomm_create


1) What does the first message mean?
Is it a warning? /* if yes, why does mpirun on two hosts fail? */
Is it a fatal error? /* if yes, why does mpirun on one host succeed? */

2) Generally speaking, and assuming the first message is a warning,
should --oversubscribe automatically set overload-allowed?
/* as far as I am concerned, that would be much more intuitive */
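
For reference, the abort above comes from the default MPI_ERRORS_ARE_FATAL
error handler on MPI_COMM_WORLD. Below is a minimal sketch of how a caller
could observe the spawn failure instead of aborting /* this is not the
actual intercomm_create test, and the spawned binary name is made up */ :

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    int rc, errcodes[2];

    MPI_Init(&argc, &argv);

    /* by default errors are fatal; ask for error codes instead */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    /* "./spawned_child" is a placeholder binary name */
    rc = MPI_Comm_spawn("./spawned_child", MPI_ARGV_NULL, 2,
                        MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                        &intercomm, errcodes);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "MPI_Comm_spawn failed: %s\n", msg);
    } else {
        MPI_Comm_disconnect(&intercomm);
    }

    MPI_Finalize();
    return 0;
}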

Cheers,

Gilles