Ralph,

This patch fixed it: num_nodes was being used uninitialised, so the
client was getting a bogus value for the number of nodes.
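
For anyone reading along, here is a minimal standalone sketch of the pattern the patch corrects. It is not the orted_comm.c code: node_pool_t and count_nodes are hypothetical names made up for illustration, and I am assuming the loop body simply increments the counter for each non-NULL slot, since the diff context does not show that part.

```c
#include <stddef.h>

/* Hypothetical stand-in for the node pool: an array of pointers in which
 * unused slots are NULL. Not the real opal_pointer_array_t. */
typedef struct {
    int size;
    void **items;
} node_pool_t;

/* Count the occupied (non-NULL) slots in the pool. */
static int count_nodes(const node_pool_t *pool)
{
    int num_nodes = 0;  /* the fix: without this the loop adds to an indeterminate value */
    int i;

    for (i = 0; i < pool->size; i++) {
        if (NULL != pool->items[i]) {
            num_nodes++;  /* assumed loop body, not shown in the diff */
        }
    }
    return num_nodes;
}
```

Because the counter is an automatic variable, leaving out the assignment means it starts with whatever happened to be on the stack, which would explain a bogus count being packed for the client.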

Ashley,

On Mon, 2009-05-18 at 10:09 +0100, Ashley Pittman wrote:
> No joy I'm afraid; now I get errors when I run it.  This is a single
> node job run with the command line "mpirun -n 3 ./a.out".  I've attached
> the strace output and gzipped /tmp files from the machine.  Valgrind on
> the ompi-ps process doesn't show anything interesting.
> 
> [alpha:29942] [[35044,0],0] ORTE_ERROR_LOG: Data unpack would read past
> end of buffer in
> file 
> /mnt/home/debian/ashley/code/OpenMPI/ompi-trunk-tes/trunk/orte/util/comm/comm.c
>  at line 242
> [alpha:29942] [[35044,0],0] ORTE_ERROR_LOG: Data unpack would read past
> end of buffer in
> file 
> /mnt/home/debian/ashley/code/OpenMPI/ompi-trunk-tes/trunk/orte/tools/orte-ps/orte-ps.c
>  at line 818
> 
> Ashley.
> 
> On Sat, 2009-05-16 at 08:15 -0600, Ralph Castain wrote:
> > This is fixed now, Ashley - sorry for the problem.
> > 
> > 
> > On May 15, 2009, at 4:47 AM, Ashley Pittman wrote:
> > 
> > > On Thu, 2009-05-14 at 22:49 -0600, Ralph Castain wrote:
> > >> It is definitely broken at the moment, Ashley. I have it pretty well
> > > >> fixed, but need/want to clean up some corner cases that have plagued us
> > > >> for a long time.
> > >>
> > >> Should have it for you sometime Friday.
> > >
> > > > Ok, thanks.  I might try switching to slurm in the meantime; I know my
> > > > code works with that.
> > >
> > > Can you let me know when it's fixed on or off list and I'll do an
> > > update.
> > >
> > > Ashley,
> > >
> > > _______________________________________________
> > > devel mailing list
> > > de...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > 
Index: orte/orted/orted_comm.c
===================================================================
--- orte/orted/orted_comm.c	(revision 21248)
+++ orte/orted/orted_comm.c	(working copy)
@@ -837,6 +837,7 @@
                         goto CLEANUP;
                     }
                 } else {
+                    num_nodes = 0;
                     /* count number of nodes */
                     for (i=0; i < orte_node_pool->size; i++) {
                         if (NULL != opal_pointer_array_get_item(orte_node_pool, i)) {
