There is a known issue on BProc 4 w.r.t. pty support. Open MPI by
default will try to use ptys for I/O forwarding but will revert to
pipes if ptys are not available.
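For context, the fallback amounts to something like this (a minimal sketch of the try-pty-then-pipe pattern, not Open MPI's actual odls_bproc code; on Linux, link with -lutil):

#include <pty.h>     /* openpty() */
#include <unistd.h>  /* pipe() */
#include <stdio.h>

/* Prefer a pty for forwarding a child's stdio; if the node has no
 * working pty support, fall back to an ordinary pipe. */
static int setup_io_channel(int *parent_fd, int *child_fd)
{
    int fds[2];

    if (openpty(parent_fd, child_fd, NULL, NULL, NULL) == 0)
        return 0;                  /* got a pty pair */
    fprintf(stderr, "openpty failed, using pipes instead\n");
    if (pipe(fds) != 0)
        return -1;                 /* neither mechanism worked */
    *parent_fd = fds[0];           /* parent reads the child's output */
    *child_fd  = fds[1];           /* child writes its stdout here */
    return 0;
}

int main(void)
{
    int pfd, cfd;
    if (setup_io_channel(&pfd, &cfd) == 0)
        printf("I/O channel ready (fds %d, %d)\n", pfd, cfd);
    return 0;
}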
You can "safely" ignore the pty warnings, or you may want to rerun
configure and add:
--disable-pty-support
I say "safely" because my understanding is that some I/O data may be
lost if pipes are used during abnormal termination.
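For example (the prefix is just a placeholder for your own build options):

./configure --prefix=/usr/local/openmpi --disable-pty-support
make all install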
Alternatively, you might try getting pty support working; to do that,
you need to configure ptys on the backend nodes.
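On a typical Linux node this means making sure the Unix98 pty devices
are available, i.e. /dev/ptmx exists and the devpts filesystem is
mounted; the details depend on your Clustermatic node setup, but as an
illustration:

mount -t devpts devpts /dev/pts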
You can then try the following code to test whether ptys are working
correctly; if this fails (it does on our BProc 4 cluster), you
shouldn't use ptys on BProc.
#include <pty.h>     /* openpty(); on Linux, link with -lutil */
#include <stdio.h>
#include <string.h>
#include <errno.h>

int
main(int argc, char *argv[])
{
    int amaster, aslave;

    /* Try to allocate a master/slave pty pair. */
    if (openpty(&amaster, &aslave, NULL, NULL, NULL) < 0) {
        printf("openpty() failed with errno = %d, %s\n",
               errno, strerror(errno));
        return 1;
    }
    printf("openpty() succeeded\n");
    return 0;
}
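On Linux, openpty() lives in libutil, so link with -lutil when you
build the test, e.g. (the file name here is just an example):

gcc -o ptytest ptytest.c -lutil

Run it on a backend node (for instance via BProc's bpsh) rather than
on the head node, since the backend nodes are where the pty allocation
has to succeed.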
On Apr 26, 2007, at 2:06 PM, Daniel Gruner wrote:
Hi
I have been testing Open MPI 1.2, and now 1.2.1, on several BProc-
based clusters, and I have found some problems/issues. All my
clusters have standard Ethernet interconnects, either 100Base-T or
Gigabit, on standard switches.
The clusters are all running Clustermatic 5 (BProc 4.x), and range
from 32-bit Athlon, to 32-bit Xeon, to 64-bit Opteron. In all cases
the same problems occur, identically. I attach here the results
from "ompi_info --all" and the config.log, for my latest build on
an Opteron cluster, using the Pathscale compilers. I had exactly
the same problems when using the vanilla GNU compilers.
Now for a description of the problem:
When running an MPI code (cpi.c, from the standard MPI examples, also
attached) using the mpirun defaults (e.g. -byslot), with a single
process:
sonoma:dgruner{134}> mpirun -n 1 ./cpip
[n17:30019] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.000199
However, if one tries to run more than one process, this bombs:
sonoma:dgruner{134}> mpirun -n 2 ./cpip
.
.
.
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
.
. ad infinitum
If one uses the option "-bynode", things work:
sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
[n17:30055] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
Process 1 on n21
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.010375
Note that there is always the message about "openpty failed, using
pipes instead".
If I run more processes (on my 3-node cluster, with 2 CPUs per node),
the openpty message appears repeatedly for the first node:
sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
[n17:30061] odls_bproc: openpty failed, using pipes instead
[n17:30061] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
Process 2 on n49
Process 1 on n21
Process 5 on n49
Process 3 on n17
Process 4 on n21
pi is approximately 3.1415926544231239, Error is 0.0000000008333307
wall clock time = 0.050332
Should I worry about the openpty failure? I suspect that
communications may be slower this way. Using the -byslot option
always fails, so this is a bug. The same occurs for all the codes
that I have tried, both simple and complex.
Thanks for your attention to this.
Regards,
Daniel
--
Dr. Daniel Gruner dgru...@chem.utoronto.ca
Dept. of Chemistry daniel.gru...@utoronto.ca
University of Toronto phone: (416)-978-8689
80 St. George Street fax: (416)-978-5325
Toronto, ON M5S 3H6, Canada finger for PGP public key
<cpi.c.gz>
<config.log.gz>
<ompiinfo.gz>