Hi,

Thanks for the info about IMB. I will download the latest one.
Pallas was running fine in the intra-node case, but it hangs in the inter-node case.

I have a small MPI program that sends/receives a single char. I have tested this program across the nodes (inter-node) as follows, and it ran fine. Note: I used the same options given by Tim while running Pallas, mpi-ping, and my small test program.

# mpirun -np 2 -mca pml ob1 -mca btl_base_include self,mvapi -mca btl_base_debug 1 ./a.out

I have also run the mpi-ping.c program that is attached to the file given by the OMPI developer. This program hangs. I have run Pallas (only pingpong) in the inter-node case, and it hangs too.

The attached archive contains the following files:

Test_out.txt --> Works fine in the inter-node case. Sends/recvs only one char.
mpi_ping.txt --> Hangs in the inter-node case. I need to press Ctrl+C.
Pmb_out.txt  --> Hangs in the inter-node case. Just ran pingpong. I need to press Ctrl+C.
Test.c       --> My small MPI program.

The debug info is in the above .txt files. Tim might be interested in looking at the debug output.

I have also run Pallas in the intra-node case (same machine) with the options below, and it hangs there too. The output is similar to Pmb_out.txt except for the IP address and port number.

# mpirun -np 2 -mca pml ob1 -mca btl_base_include self,mvapi -mca btl_base_debug 1 ./PMB-MPI1

But when I run without any options, it runs fine:

# mpirun -np 2 ./PMB-MPI1

Thanks
-Sridhar
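P.S. For reference, the one-char send/recv test in Test.c is roughly the following (a minimal sketch; the actual attached source may differ in detail):

    /* Minimal sketch of a one-char send/recv test between two ranks.
     * This is an approximation; the actual attached Test.c may differ. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank;
        char c = 'x';

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* Rank 0 sends one char to rank 1 and waits for it to come back. */
            MPI_Send(&c, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&c, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 0: received '%c' back\n", c);
        } else if (rank == 1) {
            /* Rank 1 receives the char and echoes it back to rank 0. */
            MPI_Recv(&c, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&c, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            printf("rank 1: received '%c'\n", c);
        }

        MPI_Finalize();
        return 0;
    }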
-----Original Message-----
From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Wednesday, August 17, 2005 6:19 PM
To: Open MPI Developers
Subject: Re: [O-MPI devel] Fwd: Regarding MVAPI Component in Open MPI

On Aug 17, 2005, at 8:23 AM, Sridhar Chirravuri wrote:

> Can someone reply to my mail please?

I think you sent your first mail at 6:48am in my time zone (that is 4:48am Los Alamos time -- I strongly doubt that they are at work yet...); I'm still processing my mail from last night and am just now seeing your mail. Global software development is challenging. :-)

> I checked out the latest code drop r6911 today morning and ran Pallas
> within the same node (2 procs). It ran fine. I didn't see any hangs
> this time, whereas I could see the following statements in the Pallas
> output, and I feel they are just warnings, which can be ignored. Am I
> correct?
>
> Request for 0 bytes (coll_basic_reduce_scatter.c, 80)
> Request for 0 bytes (coll_basic_reduce.c, 194)
> Request for 0 bytes (coll_basic_reduce_scatter.c, 80)
> Request for 0 bytes (coll_basic_reduce.c, 194)
> Request for 0 bytes (coll_basic_reduce_scatter.c, 80)
> Request for 0 bytes (coll_basic_reduce.c, 194)

Hum. I was under the impression that George had fixed these, but I get the same warnings. I'll have a look...

> Here is the output of a sample MPI program which sends a char and
> recvs a char.
>
> [root@micrompi-1 ~]# mpirun -np 2 ./a.out
> Could not join a running, existing universe
> Establishing a new one named: default-universe-12913
> [0,0,0] mca_oob_tcp_init: calling orte_gpr.subscribe
> [0,0,0] mca_oob_tcp_init: calling orte_gpr.put(orte-job-0)
> [snipped]
> [0,0,0]-[0,0,1] mca_oob_tcp_send: tag 2
> [0,0,0]-[0,0,1] mca_oob_tcp_send: tag 2

This seems to be a *lot* of debugging output -- did you enable that on purpose? I don't get the majority of that output when I run a hello world or a ring MPI program (I only get the bit about the existing universe).

> My configure command looks like
>
> ./configure --prefix=/openmpi --with-btl-mvapi=/usr/local/topspin/
> --enable-mca-no-build=btl-openib,pml-teg,pml-uniq
>
> Since I am working with the mvapi component, I disabled openib.

Note that you can disable these things at run time; you don't have to disable them at configure time. I only mention this for completeness -- either way, it's disabled.

> But I could see that data is going over TCP/GigE and not on InfiniBand.

Tim: what's the status of the multi-rail stuff? I thought I saw a commit recently where the TCP BTL would automatically disable itself if it saw that one or more of the low-latency BTLs was available...?

Sridhar: Did you try explicitly requesting mvapi? Perhaps something like:

    mpirun --mca btl mvapi,self ....

This shouldn't be necessary -- mvapi should select itself automatically -- but perhaps something is going wrong with the mvapi selection sequence...? Tim/Galen -- got any insight here?

> I have run Pallas, and it simply hangs again :-(

I'm confused -- above, you said that you ran Pallas and it worked fine...? (It does not hang for me when I run with teg or ob1.)

> Note: I added pml=ob1 in the conf file
> /openmpi/etc/openmpi-mca-params.conf
>
> Any latest options being added to the configure command? Please let me
> know.

No, nothing has changed there AFAIK.

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
Attachment: output.tar.gz