I've committed a fix to the trunk (r29245) and scheduled it for v1.7.3 - thanks for the debug info!
Ralph

On Sep 25, 2013, at 5:00 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:

> Dear Ralph,
>
> I am sorry, but I think I forgot to set plm verbosity to 5 last time. Here is
> the output of the complete program with and without -novm, using the
> following mpiexec commands.
>
> mpiexec -mca state_base_verbose 10 -mca errmgr_base_verbose 10 -mca plm_base_verbose 5 -mca btl tcp,sm,self -np 2 ./addhosttest
> mpiexec -mca state_base_verbose 10 -mca errmgr_base_verbose 10 -mca plm_base_verbose 5 -mca btl tcp,sm,self -novm -np 2 ./addhosttest
>
> Here you can see that although I spawn only one process on grsacc18,
> something is also done with grsacc19.
>
> Sorry and thanks!
> Suraj
> <output.rtf><output-novm.rtf>
>
> On Sep 24, 2013, at 8:24 PM, Ralph Castain wrote:
>
>> What I find puzzling is that I don't see any output indicating that you went
>> thru the Torque launcher to launch the daemons - not a peep of debug output.
>> This makes me suspicious that something else is going on. Are you sure you
>> sent me all the output?
>>
>> Try adding -novm to your mpirun cmd line and let's see if that mode works
>>
>> On Sep 24, 2013, at 9:06 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>
>>> Hi Ralph,
>>>
>>> So here is what I do. I spawn just a "single" process on a new node which
>>> is not in the $PBS_NODEFILE list.
>>> My $PBS_NODEFILE list contains
>>> grsacc20
>>> grsacc19
>>>
>>> I then start the app with just 2 processes. So each host gets one process
>>> and they are successfully spawned through Torque (through tm_spawn()).
>>> MPI would have stored grsacc20 and grsacc19 in its list of hosts with
>>> launchid 0 and 1 respectively.
>>> I then use the add-host info and spawn ONE new process on a new host
>>> "grsacc18" through MPI_Comm_spawn. From what I saw in the code, the
>>> launchid of this new host is -1 since openmpi does not know about it and
>>> it is not available in the $PBS_NODEFILE. Since, without the launchid,
>>> Torque would not know where to spawn, I just retrieve the correct launchid
>>> of this host from a file just before tm_spawn() and use this launchid. This
>>> is the only modification that I made to openmpi.
>>> So, the host "grsacc18" will have a new launchid = 2 and will be used to
>>> spawn the process through Torque. This worked perfectly until 1.6.5.
>>>
>>> As we see here from the outputs, although I spawn only a single process on
>>> grsacc18, I too have no clue why openmpi tries to spawn something on
>>> grsacc19. Of course, without pbs/torque involved, everything works fine.
>>> I have attached the simple test code. Please modify hostnames and
>>> executable path before use.
>>>
>>> Best,
>>> Suraj
>>>
>>> <addhosttest.c>
>>>
>>> On Sep 24, 2013, at 4:59 PM, Ralph Castain wrote:
>>>
>>>> I'm going to need a little help here. The problem is that you launch two
>>>> new daemons, and one of them exits immediately because it thinks it lost
>>>> the connection back to mpirun - before it even gets a chance to create it.
>>>>
>>>> Can you give me a little more info as to exactly what you are doing?
>>>> Perhaps send me your test code?
>>>>
>>>> On Sep 24, 2013, at 7:48 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>
>>>>> Hi Ralph,
>>>>>
>>>>> Output attached in a file.
>>>>> Thanks a lot!
>>>>>
>>>>> Best,
>>>>> Suraj
>>>>>
>>>>> <output.rtf>
>>>>>
>>>>> On Sep 24, 2013, at 4:11 PM, Ralph Castain wrote:
>>>>>
>>>>>> Afraid I don't see the problem offhand - can you add the following to
>>>>>> your cmd line?
>>>>>>
>>>>>> -mca state_base_verbose 10 -mca errmgr_base_verbose 10
>>>>>>
>>>>>> Thanks
>>>>>> Ralph
>>>>>>
>>>>>> On Sep 24, 2013, at 6:35 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Ralph,
>>>>>>>
>>>>>>> I always got this output from any MPI job that ran on our nodes.
There >>>>>>> seems to be a problem somewhere but it never stopped the applications >>>>>>> from running. But anyway, I ran it again now with only tcp and excluded >>>>>>> the infiniband and I get the same output again. Except that this time, >>>>>>> the error related to this openib is not there anymore. Printing out the >>>>>>> log again. >>>>>>> >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive processing msg >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive job launch command from >>>>>>> [[6160,1],0] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive adding hosts >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive calling spawn >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive done processing commands >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:setup_job >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:setup_vm >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:setup_vm add new daemon >>>>>>> [[6160,0],2] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:setup_vm assigning new daemon >>>>>>> [[6160,0],2] to node grsacc18 >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm: launching vm >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm: final top-level argv: >>>>>>> orted -mca ess tm -mca orte_ess_jobid 403701760 -mca >>>>>>> orte_ess_vpid <template> -mca orte_ess_num_procs 3 -mca orte_hnp_uri >>>>>>> "403701760.0;tcp://192.168.222.20:35163" -mca plm_base_verbose 5 -mca >>>>>>> btl tcp,sm,self >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm: launching on node grsacc19 >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm: executing: >>>>>>> orted -mca ess tm -mca orte_ess_jobid 403701760 -mca >>>>>>> orte_ess_vpid 1 -mca orte_ess_num_procs 3 -mca orte_hnp_uri >>>>>>> "403701760.0;tcp://192.168.222.20:35163" -mca plm_base_verbose 5 -mca >>>>>>> btl tcp,sm,self >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm: launching on node grsacc18 >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm: executing: >>>>>>> orted -mca ess tm -mca orte_ess_jobid 403701760 -mca >>>>>>> orte_ess_vpid 2 
-mca orte_ess_num_procs 3 -mca orte_hnp_uri >>>>>>> "403701760.0;tcp://192.168.222.20:35163" -mca plm_base_verbose 5 -mca >>>>>>> btl tcp,sm,self >>>>>>> [grsacc20:04578] [[6160,0],0] plm:tm:launch: finished spawning orteds >>>>>>> [grsacc19:28821] mca:base:select:( plm) Querying component [rsh] >>>>>>> [grsacc19:28821] [[6160,0],1] plm:rsh_lookup on agent ssh : rsh path >>>>>>> NULL >>>>>>> [grsacc19:28821] mca:base:select:( plm) Query of component [rsh] set >>>>>>> priority to 10 >>>>>>> [grsacc19:28821] mca:base:select:( plm) Selected component [rsh] >>>>>>> [grsacc19:28821] [[6160,0],1] plm:rsh_setup on agent ssh : rsh path NULL >>>>>>> [grsacc19:28821] [[6160,0],1] plm:base:receive start comm >>>>>>> [grsacc19:28821] [[6160,0],1] plm:base:receive stop comm >>>>>>> [grsacc18:16717] mca:base:select:( plm) Querying component [rsh] >>>>>>> [grsacc18:16717] [[6160,0],2] plm:rsh_lookup on agent ssh : rsh path >>>>>>> NULL >>>>>>> [grsacc18:16717] mca:base:select:( plm) Query of component [rsh] set >>>>>>> priority to 10 >>>>>>> [grsacc18:16717] mca:base:select:( plm) Selected component [rsh] >>>>>>> [grsacc18:16717] [[6160,0],2] plm:rsh_setup on agent ssh : rsh path NULL >>>>>>> [grsacc18:16717] [[6160,0],2] plm:base:receive start comm >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:orted_report_launch from daemon >>>>>>> [[6160,0],2] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:orted_report_launch from daemon >>>>>>> [[6160,0],2] on node grsacc18 >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:orted_report_launch completed >>>>>>> for daemon [[6160,0],2] at contact >>>>>>> 403701760.2;tcp://192.168.222.18:44229 >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:launch_apps for job [6160,2] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive processing msg >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive update proc state >>>>>>> command from [[6160,0],2] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive got update_proc_state >>>>>>> for job [6160,2] 
>>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive got update_proc_state >>>>>>> for vpid 0 state RUNNING exit_code 0 >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive done processing commands >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:launch wiring up iof for job >>>>>>> [6160,2] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive processing msg >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive done processing commands >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:launch registered event >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:launch sending dyn release of >>>>>>> job [6160,2] to [[6160,1],0] >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:orted_cmd sending orted_exit >>>>>>> commands >>>>>>> [grsacc19:28815] [[6160,0],1] plm:base:receive stop comm >>>>>>> [grsacc20:04578] [[6160,0],0] plm:base:receive stop comm >>>>>>> -bash-4.1$ [grsacc18:16717] [[6160,0],2] plm:base:receive stop comm >>>>>>> >>>>>>> Best, >>>>>>> Suraj >>>>>>> On Sep 24, 2013, at 3:24 PM, Ralph Castain wrote: >>>>>>> >>>>>>>> Your output shows that it launched your apps, but they exited. The >>>>>>>> error is reported here, though it appears we aren't flushing the >>>>>>>> message out before exiting due to a race condition: >>>>>>>> >>>>>>>>> [grsacc20:04511] 1 more process has sent help message >>>>>>>>> help-mpi-btl-openib.txt / no active ports found >>>>>>>> >>>>>>>> Here is the full text: >>>>>>>> [no active ports found] >>>>>>>> WARNING: There is at least non-excluded one OpenFabrics device found, >>>>>>>> but there are no active ports detected (or Open MPI was unable to use >>>>>>>> them). This is most certainly not what you wanted. Check your >>>>>>>> cables, subnet manager configuration, etc. The openib BTL will be >>>>>>>> ignored for this job. >>>>>>>> >>>>>>>> Local host: %s >>>>>>>> >>>>>>>> Looks like at least one node being used doesn't have an active >>>>>>>> Infiniband port on it? 
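For readers following the launch-id workaround Suraj describes in this thread (retrieving the correct launchid for a host from a file just before tm_spawn()), a minimal sketch of such a lookup might look like the following. The actual patch is not shown in the thread, so everything here is an assumption made for illustration: the file format (one "hostname launchid" pair per line), the function name lookup_launch_id, and the convention of returning -1 for an unknown host.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Open MPI or Torque): look up the launch
 * id for `host` in a text file holding one "<hostname> <launchid>" pair per
 * line. Returns the id, or -1 if the file cannot be opened or the host is
 * not listed. */
int lookup_launch_id(const char *path, const char *host)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    char name[256];
    int id;
    int found = -1;

    /* %255s bounds the read to the size of `name`. */
    while (fscanf(fp, "%255s %d", name, &id) == 2) {
        if (strcmp(name, host) == 0) {
            found = id;
            break;
        }
    }

    fclose(fp);
    return found;
}
```

With a file listing "grsacc20 0", "grsacc19 1" and "grsacc18 2", a call such as lookup_launch_id(path, "grsacc18") would return 2, the value Suraj reports using for the new host.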
>>>>>>>>
>>>>>>>> On Sep 24, 2013, at 6:11 AM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Ralph,
>>>>>>>>>
>>>>>>>>> I tested it with the trunk r29228. I still have the following
>>>>>>>>> problem. Now, it even spawns the daemon on the new node through
>>>>>>>>> Torque but then suddenly quits. The following is the output. Can you
>>>>>>>>> please have a look?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Suraj
>>>>>>>>>
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive processing msg
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive job launch command from [[6253,1],0]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive adding hosts
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive calling spawn
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive done processing commands
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:setup_job
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:setup_vm
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:setup_vm add new daemon [[6253,0],2]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:setup_vm assigning new daemon [[6253,0],2] to node grsacc18
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm: launching vm
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm: final top-level argv:
>>>>>>>>>     orted -mca ess tm -mca orte_ess_jobid 409796608 -mca orte_ess_vpid <template> -mca orte_ess_num_procs 3 -mca orte_hnp_uri "409796608.0;tcp://192.168.222.20:53097" -mca plm_base_verbose 6
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm: launching on node grsacc19
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm: executing:
>>>>>>>>>     orted -mca ess tm -mca orte_ess_jobid 409796608 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 3 -mca orte_hnp_uri "409796608.0;tcp://192.168.222.20:53097" -mca plm_base_verbose 6
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm: launching on node grsacc18
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm: executing:
>>>>>>>>>     orted -mca ess tm -mca orte_ess_jobid 409796608 -mca orte_ess_vpid 2 -mca orte_ess_num_procs 3 -mca orte_hnp_uri "409796608.0;tcp://192.168.222.20:53097" -mca plm_base_verbose 6
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:tm:launch: finished spawning orteds
>>>>>>>>> [grsacc19:28754] mca:base:select:( plm) Querying component [rsh]
>>>>>>>>> [grsacc19:28754] [[6253,0],1] plm:rsh_lookup on agent ssh : rsh path NULL
>>>>>>>>> [grsacc19:28754] mca:base:select:( plm) Query of component [rsh] set priority to 10
>>>>>>>>> [grsacc19:28754] mca:base:select:( plm) Selected component [rsh]
>>>>>>>>> [grsacc19:28754] [[6253,0],1] plm:rsh_setup on agent ssh : rsh path NULL
>>>>>>>>> [grsacc19:28754] [[6253,0],1] plm:base:receive start comm
>>>>>>>>> [grsacc19:28754] [[6253,0],1] plm:base:receive stop comm
>>>>>>>>> [grsacc18:16648] mca:base:select:( plm) Querying component [rsh]
>>>>>>>>> [grsacc18:16648] [[6253,0],2] plm:rsh_lookup on agent ssh : rsh path NULL
>>>>>>>>> [grsacc18:16648] mca:base:select:( plm) Query of component [rsh] set priority to 10
>>>>>>>>> [grsacc18:16648] mca:base:select:( plm) Selected component [rsh]
>>>>>>>>> [grsacc18:16648] [[6253,0],2] plm:rsh_setup on agent ssh : rsh path NULL
>>>>>>>>> [grsacc18:16648] [[6253,0],2] plm:base:receive start comm
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:orted_report_launch from daemon [[6253,0],2]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:orted_report_launch from daemon [[6253,0],2] on node grsacc18
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:orted_report_launch completed for daemon [[6253,0],2] at contact 409796608.2;tcp://192.168.222.18:47974
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:launch_apps for job [6253,2]
>>>>>>>>> [grsacc20:04511] 1 more process has sent help message help-mpi-btl-openib.txt / no active ports found
>>>>>>>>> [grsacc20:04511] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>>>>>> [grsacc20:04511] 1 more process has sent help message help-mpi-btl-base.txt / btl:no-nics
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive processing msg
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive update proc state command from [[6253,0],2]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive got update_proc_state for job [6253,2]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive got update_proc_state for vpid 0 state RUNNING exit_code 0
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive done processing commands
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:launch wiring up iof for job [6253,2]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive processing msg
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive done processing commands
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:launch registered event
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:launch sending dyn release of job [6253,2] to [[6253,1],0]
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:orted_cmd sending orted_exit commands
>>>>>>>>> [grsacc19:28747] [[6253,0],1] plm:base:receive stop comm
>>>>>>>>> [grsacc20:04511] [[6253,0],0] plm:base:receive stop comm
>>>>>>>>> -bash-4.1$ [grsacc18:16648] [[6253,0],2] plm:base:receive stop comm
>>>>>>>>>
>>>>>>>>> On Sep 23, 2013, at 1:55 AM, Ralph Castain wrote:
>>>>>>>>>
>>>>>>>>>> Found a bug in the Torque support - we were trying to connect to the
>>>>>>>>>> MOM again, which would hang (I imagine). I pushed a fix to the trunk
>>>>>>>>>> (r29227) and scheduled it to come to 1.7.3 if you want to try it
>>>>>>>>>> again.
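The libevent warning reported in this thread ("reentrant invocation. Only one event_base_loop can run on each event_base at once.") indicates that an event loop was entered again while it was already running on the same base. The sketch below illustrates that kind of guard in plain C. It is not libevent or ORTE code: toy_base, toy_loop, reenter and nested_result are hypothetical names, and the "loop" is reduced to a single one-shot callback.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for an event base. */
struct toy_base {
    int running_loop;                      /* non-zero while toy_loop() is active */
    void (*callback)(struct toy_base *);   /* one pending callback, may be NULL */
};

/* Run the pending callback once. Returns 0 on success, or -1 if a loop is
 * already running on this base (a reentrant call). */
int toy_loop(struct toy_base *base)
{
    if (base->running_loop) {
        fprintf(stderr, "toy_loop: reentrant invocation refused\n");
        return -1;
    }

    base->running_loop = 1;
    if (base->callback != NULL) {
        void (*cb)(struct toy_base *) = base->callback;
        base->callback = NULL;             /* one-shot: clear before calling */
        cb(base);
    }
    base->running_loop = 0;
    return 0;
}

/* Demonstration callback: tries to re-enter the loop from inside it and
 * records what the nested call returned. */
int nested_result = 0;

void reenter(struct toy_base *base)
{
    nested_result = toy_loop(base);
}
```

If a base is set up with reenter as its callback, the outer toy_loop() call succeeds while the nested call is refused and stores -1 in nested_result. Code written for a blocking, nested style of progress would trip a check of this kind once the surrounding runtime becomes asynchronous.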
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Sep 22, 2013, at 4:21 PM, Suraj Prabhakaran >>>>>>>>>> <suraj.prabhaka...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Dear Ralph, >>>>>>>>>>> >>>>>>>>>>> This is the output I get when I execute with the verbose option. >>>>>>>>>>> >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:receive processing msg >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:receive job launch command >>>>>>>>>>> from [[23526,1],0] >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:receive adding hosts >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:receive calling spawn >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:receive done processing >>>>>>>>>>> commands >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:setup_job >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:setup_vm >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:setup_vm add new daemon >>>>>>>>>>> [[23526,0],2] >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:setup_vm assigning new >>>>>>>>>>> daemon [[23526,0],2] to node grsacc17/1-4 >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:setup_vm add new daemon >>>>>>>>>>> [[23526,0],3] >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:setup_vm assigning new >>>>>>>>>>> daemon [[23526,0],3] to node grsacc17/0-5 >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:tm: launching vm >>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:tm: final top-level argv: >>>>>>>>>>> orted -mca ess tm -mca orte_ess_jobid 1541799936 -mca >>>>>>>>>>> orte_ess_vpid <template> -mca orte_ess_num_procs 4 -mca >>>>>>>>>>> orte_hnp_uri "1541799936.0;tcp://192.168.222.20:49049" -mca >>>>>>>>>>> plm_base_verbose 5 >>>>>>>>>>> [warn] opal_libevent2021_event_base_loop: reentrant invocation. >>>>>>>>>>> Only one event_base_loop can run on each event_base at once. 
>>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:orted_cmd sending orted_exit commands
>>>>>>>>>>> [grsacc20:21012] [[23526,0],0] plm:base:receive stop comm
>>>>>>>>>>>
>>>>>>>>>>> Does this say something?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Suraj
>>>>>>>>>>>
>>>>>>>>>>> On Sep 22, 2013, at 9:45 PM, Ralph Castain wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'll still need to look at the intercomm_create issue, but I just
>>>>>>>>>>>> tested both the trunk and current 1.7.3 branch for "add-host" and
>>>>>>>>>>>> both worked just fine. This was on my little test cluster which
>>>>>>>>>>>> only has rsh available - no Torque.
>>>>>>>>>>>>
>>>>>>>>>>>> You might add "-mca plm_base_verbose 5" to your cmd line to get
>>>>>>>>>>>> some debug output as to the problem.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sep 21, 2013, at 5:48 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sep 21, 2013, at 4:54 PM, Suraj Prabhakaran <suraj.prabhaka...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks a lot for your efforts. I too downloaded the trunk
>>>>>>>>>>>>>> to check if it works for my case and as of revision 29215, it
>>>>>>>>>>>>>> works for the original case I reported. Although it works, I
>>>>>>>>>>>>>> still see the following in the output. Does it mean anything?
>>>>>>>>>>>>>> [grsacc17][[13611,1],0][btl_openib_proc.c:157:mca_btl_openib_proc_create] [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[13611,2],0]
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes - it means we don't quite have this right yet :-(
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, on another topic relevant to my use case, I have
>>>>>>>>>>>>>> another problem to report. I am having problems using the
>>>>>>>>>>>>>> "add-host" info to MPI_Comm_spawn() when MPI is compiled
>>>>>>>>>>>>>> with support for the Torque resource manager. This problem is
>>>>>>>>>>>>>> totally new in the 1.7 series and it worked perfectly until
>>>>>>>>>>>>>> 1.6.5.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Basically, I am working on implementing dynamic resource
>>>>>>>>>>>>>> management facilities in the Torque/Maui batch system. Through a
>>>>>>>>>>>>>> new tm call, an application can get new resources for a job.
>>>>>>>>>>>>>
>>>>>>>>>>>>> FWIW: you'll find that we added an API to the orte RAS framework
>>>>>>>>>>>>> to support precisely that operation. It allows an application to
>>>>>>>>>>>>> request that we dynamically obtain additional resources during
>>>>>>>>>>>>> execution (e.g., as part of a Comm_spawn call via an info_key).
>>>>>>>>>>>>> We originally implemented this with Slurm, but you could add the
>>>>>>>>>>>>> calls into the Torque component as well if you like.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is in the trunk now - will come over to 1.7.4
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I want to use MPI_Comm_spawn() to spawn new processes on the new
>>>>>>>>>>>>>> hosts. With my extended torque/maui batch system, I was able to
>>>>>>>>>>>>>> use the "add-host" info argument to MPI_Comm_spawn() without
>>>>>>>>>>>>>> problems to spawn new processes on these hosts. Since MPI and
>>>>>>>>>>>>>> Torque refer to the hosts through the nodeids, I made sure that
>>>>>>>>>>>>>> OpenMPI uses the correct nodeids for these new hosts.
>>>>>>>>>>>>>> Until 1.6.5, this worked perfectly fine, except that due to the
>>>>>>>>>>>>>> Intercomm_merge problem, I could not really run a real
>>>>>>>>>>>>>> application to its completion.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While this is now fixed in the trunk, I found that
>>>>>>>>>>>>>> when using the "add-host" info argument, everything collapses
>>>>>>>>>>>>>> after printing out the following error.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [warn] opal_libevent2021_event_base_loop: reentrant invocation. Only one event_base_loop can run on each event_base at once.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take a look - probably some stale code that hasn't been
>>>>>>>>>>>>> updated yet for async ORTE operations
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And due to this, I am still not really able to run my
>>>>>>>>>>>>>> application! I also compiled the MPI without any Torque/PBS
>>>>>>>>>>>>>> support and just used the "add-host" argument normally. Again,
>>>>>>>>>>>>>> this worked perfectly in 1.6.5. In the 1.7 series, it works,
>>>>>>>>>>>>>> but only after printing out the following error.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [grsacc17][[13731,1],0][btl_openib_proc.c:157:mca_btl_openib_proc_create] [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[13731,2],0]
>>>>>>>>>>>>>> [grsacc17][[13731,1],1][btl_openib_proc.c:157:mca_btl_openib_proc_create] [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[13731,2],0]
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yeah, the 1.7 series doesn't have the reentrant test in it - so
>>>>>>>>>>>>> we "illegally" re-enter libevent. The error again means we don't
>>>>>>>>>>>>> have Intercomm_create correct just yet.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll see what I can do about this and get back to you
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In short, with pbs/torque support, it fails and without
>>>>>>>>>>>>>> pbs/torque support, it runs after spitting out the above lines.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would really appreciate some help on this, since I need these
>>>>>>>>>>>>>> features to actually test my case and (at least in my short
>>>>>>>>>>>>>> experience) no other MPI implementation seems friendly to such
>>>>>>>>>>>>>> dynamic scenarios.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Best, >>>>>>>>>>>>>> Suraj >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sep 20, 2013, at 4:58 PM, Jeff Squyres (jsquyres) wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Just to close my end of this loop: as of trunk r29213, it all >>>>>>>>>>>>>>> works for me. Thanks! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sep 18, 2013, at 12:52 PM, Ralph Castain <r...@open-mpi.org> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks George - much appreciated >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Sep 18, 2013, at 9:49 AM, George Bosilca >>>>>>>>>>>>>>>> <bosi...@icl.utk.edu> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The test case was broken. I just pushed a fix. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> George. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Sep 18, 2013, at 16:49 , Ralph Castain <r...@open-mpi.org> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hangs with any np > 1 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> However, I'm not sure if that's an issue with the test vs >>>>>>>>>>>>>>>>>> the underlying implementation >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Sep 18, 2013, at 7:40 AM, "Jeff Squyres (jsquyres)" >>>>>>>>>>>>>>>>>> <jsquy...@cisco.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Does it hang when you run with -np 4? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Sent from my phone. No type good. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Sep 18, 2013, at 4:10 PM, "Ralph Castain" >>>>>>>>>>>>>>>>>>> <r...@open-mpi.org> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Strange - it works fine for me on my Mac. However, I see >>>>>>>>>>>>>>>>>>>> one difference - I only run it with np=1 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Sep 18, 2013, at 2:22 AM, Jeff Squyres (jsquyres) >>>>>>>>>>>>>>>>>>>> <jsquy...@cisco.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Sep 18, 2013, at 9:33 AM, George Bosilca >>>>>>>>>>>>>>>>>>>>> <bosi...@icl.utk.edu> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1. 
sm doesn't work between spawned processes. So you >>>>>>>>>>>>>>>>>>>>>> must have another network enabled. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I know :-). I have tcp available as well (OMPI will >>>>>>>>>>>>>>>>>>>>> abort if you only run with sm,self because the comm_spawn >>>>>>>>>>>>>>>>>>>>> will fail with unreachable errors -- I just tested/proved >>>>>>>>>>>>>>>>>>>>> this to myself). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2. Don't use the test case attached to my email, I left >>>>>>>>>>>>>>>>>>>>>> an xterm based spawn and the debugging. It can't work >>>>>>>>>>>>>>>>>>>>>> without xterm support. Instead try using the test case >>>>>>>>>>>>>>>>>>>>>> from the trunk, the one committed by Ralph. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I didn't see any "xterm" strings in there, but ok. :-) >>>>>>>>>>>>>>>>>>>>> I ran with orte/test/mpi/intercomm_create.c, and that >>>>>>>>>>>>>>>>>>>>> hangs for me as well: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create >>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create >>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 4] >>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 5] >>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 6] >>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 7] >>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 4] >>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 5] >>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 6] >>>>>>>>>>>>>>>>>>>>> c: 
MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 7] >>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>> [hang] >>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Similarly, on my Mac, it hangs with no output: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create >>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create >>>>>>>>>>>>>>>>>>>>> [hang] >>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> George. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Sep 18, 2013, at 07:53 , "Jeff Squyres (jsquyres)" >>>>>>>>>>>>>>>>>>>>>> <jsquy...@cisco.com> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> George -- >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> When I build the SVN trunk (r29201) on 64 bit linux, >>>>>>>>>>>>>>>>>>>>>>> your attached test case hangs: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create >>>>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create >>>>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, >>>>>>>>>>>>>>>>>>>>>>> MPI_COMM_NULL, 201, &inter) [rank 4] >>>>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, >>>>>>>>>>>>>>>>>>>>>>> MPI_COMM_NULL, 201, &inter) [rank 5] >>>>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, >>>>>>>>>>>>>>>>>>>>>>> MPI_COMM_NULL, 201, &inter) [rank 6] >>>>>>>>>>>>>>>>>>>>>>> b: MPI_Intercomm_create( intra, 0, intra, >>>>>>>>>>>>>>>>>>>>>>> 
MPI_COMM_NULL, 201, &inter) [rank 7] >>>>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>>>> a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 0, 201, >>>>>>>>>>>>>>>>>>>>>>> &inter) (0) >>>>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 4] >>>>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 5] >>>>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 6] >>>>>>>>>>>>>>>>>>>>>>> c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, >>>>>>>>>>>>>>>>>>>>>>> 201, &inter) [rank 7] >>>>>>>>>>>>>>>>>>>>>>> [hang] >>>>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On my Mac, it hangs without printing anything: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpicc intercomm_create.c -o intercomm_create >>>>>>>>>>>>>>>>>>>>>>> ❯❯❯ mpirun -np 4 intercomm_create >>>>>>>>>>>>>>>>>>>>>>> [hang] >>>>>>>>>>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Sep 18, 2013, at 1:48 AM, George Bosilca >>>>>>>>>>>>>>>>>>>>>>> <bosi...@icl.utk.edu> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Here is a quick (and definitively not the cleanest) >>>>>>>>>>>>>>>>>>>>>>>> patch that addresses the MPI_Intercomm issue at the >>>>>>>>>>>>>>>>>>>>>>>> MPI level. It should be applied after removal of 29166. 
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I also added the corrected test case stressing the
>>>>>>>>>>>>>>>>>>>>>>>> corner cases by doing barriers at every inter-comm
>>>>>>>>>>>>>>>>>>>>>>>> creation and doing a clean disconnect.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>> Jeff Squyres
>>>>>>>>>>>>>>>>>>>>>>> jsquy...@cisco.com
>>>>>>>>>>>>>>>>>>>>>>> For corporate legal information go to:
>>>>>>>>>>>>>>>>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>> devel mailing list
>>>>>>>>>>>>>>>>>>>>>>> de...@open-mpi.org
>>>>>>>>>>>>>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel