I think I have this working now - try anything on or after r23647
On Aug 23, 2010, at 1:36 PM, Philippe wrote:

> sure. I took a guess at ppn and nodes for the case where 2 processes
> are on the same node... I don't claim these are the right values ;-)
>
> c0301b10e1 ~/mpi> env|grep OMPI
> OMPI_MCA_orte_nodes=c0301b10e1
> OMPI_MCA_orte_rank=0
> OMPI_MCA_orte_ppn=2
> OMPI_MCA_orte_num_procs=2
> OMPI_MCA_oob_tcp_static_ports_v6=10000-11000
> OMPI_MCA_ess=generic
> OMPI_MCA_orte_jobid=9999
> OMPI_MCA_oob_tcp_static_ports=10000-11000
> c0301b10e1 ~/hpa/benchmark/mpi> ./ben1 1 1 1
> [c0301b10e1:22827] [[0,9999],0] assigned port 10001
> [c0301b10e1:22827] [[0,9999],0] accepting connections via event library
> minsize=1 maxsize=1 delay=1.000000
>
> <no more output after that>
>
> c0301b10e1 ~/mpi> env|grep OMPI
> OMPI_MCA_orte_nodes=c0301b10e1
> OMPI_MCA_orte_rank=1
> OMPI_MCA_orte_ppn=2
> OMPI_MCA_orte_num_procs=2
> OMPI_MCA_oob_tcp_static_ports_v6=10000-11000
> OMPI_MCA_ess=generic
> OMPI_MCA_orte_jobid=9999
> OMPI_MCA_oob_tcp_static_ports=10000-11000
> c0301b10e1 ~/hpa/benchmark/mpi> ./ben1 1 1 1
> [c0301b10e1:22830] [[0,9999],1] assigned port 10002
> [c0301b10e1:22830] [[0,9999],1] accepting connections via event library
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_send_nb: tag 15 size 189
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_try_connect:
> connecting port 10002 to: 10.4.72.110:10000
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_complete_connect:
> connection failed: Connection refused (111) - retrying
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_try_connect:
> connecting port 10002 to: 10.4.72.110:10000
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_complete_connect:
> connection failed: Connection refused (111) - retrying
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_try_connect:
> connecting port 10002 to: 10.4.72.110:10000
> [c0301b10e1:22830] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_complete_connect:
> connection failed:
> Connection refused (111) - retrying
>
> <repeats..>
>
> Thanks!
> p.
>
> On Mon, Aug 23, 2010 at 3:24 PM, Ralph Castain <[email protected]> wrote:
>> Can you send me the values you are using for the relevant envars? That way I
>> can try to replicate here
>>
>> On Aug 23, 2010, at 1:15 PM, Philippe wrote:
>>
>>> I took a look at the code but I'm afraid I don't see anything wrong.
>>>
>>> p.
>>>
>>> On Thu, Aug 19, 2010 at 2:32 PM, Ralph Castain <[email protected]> wrote:
>>>> Yes, that is correct - we reserve the first port in the range for a daemon,
>>>> should one exist.
>>>> The problem is clearly that get_node_rank is returning the wrong value for
>>>> the second process (your rank=1). If you want to dig deeper, look at the
>>>> orte/mca/ess/generic code where it generates the nidmap and pidmap. There is
>>>> a bug down there somewhere that gives the wrong answer when ppn > 1.
>>>>
>>>> On Thu, Aug 19, 2010 at 12:12 PM, Philippe <[email protected]> wrote:
>>>>>
>>>>> Ralph,
>>>>>
>>>>> somewhere in ./orte/mca/oob/tcp/oob_tcp.c, there is this comment:
>>>>>
>>>>>     orte_node_rank_t nrank;
>>>>>     /* do I know my node_local_rank yet? */
>>>>>     if (ORTE_NODE_RANK_INVALID != (nrank = orte_ess.get_node_rank(ORTE_PROC_MY_NAME)) &&
>>>>>         (nrank+1) < opal_argv_count(mca_oob_tcp_component.tcp4_static_ports)) {
>>>>>         /* any daemon takes the first entry, so we start with the second */
>>>>>
>>>>> which seems consistent with process #0 listening on 10001. The question
>>>>> would be why process #1 attempts to connect to port 10000 then?
>>>>> or maybe totally unrelated :-)
>>>>>
>>>>> btw, if I trick process #1 into opening the connection to 10001 by shifting
>>>>> the range, I now get this error and the process terminates immediately:
>>>>>
>>>>> [c0301b10e1:03919] [[0,9999],1]-[[0,0],0] mca_oob_tcp_peer_recv_connect_ack:
>>>>> received unexpected process identifier [[0,9999],0]
>>>>>
>>>>> good luck with the surgery and wishing you a prompt recovery!
>>>>>
>>>>> p.
>>>>>
>>>>> On Thu, Aug 19, 2010 at 2:02 PM, Ralph Castain <[email protected]> wrote:
>>>>>> Something doesn't look right - here is what the algo attempts to do:
>>>>>> given a port range of 10000-12000, the lowest rank'd process on the node
>>>>>> should open port 10000. The next lowest rank on the node will open 10001, etc.
>>>>>> So it looks to me like there is some confusion in the local rank algo. I'll
>>>>>> have to look at the generic module - must be a bug in it somewhere.
>>>>>> This might take a couple of days as I have surgery tomorrow morning, so
>>>>>> please forgive the delay.
>>>>>>
>>>>>> On Thu, Aug 19, 2010 at 11:13 AM, Philippe <[email protected]> wrote:
>>>>>>>
>>>>>>> Ralph,
>>>>>>>
>>>>>>> I'm able to use the generic module when the processes are on different
>>>>>>> machines.
>>>>>>>
>>>>>>> what would be the values of the EVs when two processes are on the same
>>>>>>> machine (hopefully talking over SHM)?
>>>>>>>
>>>>>>> i've played with combinations of nodelist and ppn but no luck. I get
>>>>>>> errors like:
>>>>>>>
>>>>>>> [c0301b10e1:03172] [[0,9999],1] -> [[0,0],0] (node: c0301b10e1)
>>>>>>> oob-tcp: Number of attempts to create TCP connection has been exceeded.
>>>>>>> Can not communicate with peer
>>>>>>> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
>>>>>>> grpcomm_hier_module.c at line 303
>>>>>>> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
>>>>>>> base/grpcomm_base_modex.c at line 470
>>>>>>> [c0301b10e1:03172] [[0,9999],1] ORTE_ERROR_LOG: Unreachable in file
>>>>>>> grpcomm_hier_module.c at line 484
>>>>>>> --------------------------------------------------------------------------
>>>>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>>>>> likely to abort. There are many reasons that a parallel process can
>>>>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>>>>> problems. This failure appears to be an internal failure; here's some
>>>>>>> additional information (which may only be relevant to an Open MPI
>>>>>>> developer):
>>>>>>>
>>>>>>>   orte_grpcomm_modex failed
>>>>>>>   --> Returned "Unreachable" (-12) instead of "Success" (0)
>>>>>>> --------------------------------------------------------------------------
>>>>>>> *** The MPI_Init() function was called before MPI_INIT was invoked.
>>>>>>> *** This is disallowed by the MPI standard.
>>>>>>> *** Your MPI job will now abort.
>>>>>>> [c0301b10e1:3172] Abort before MPI_INIT completed successfully; not
>>>>>>> able to guarantee that all other processes were killed!
>>>>>>>
>>>>>>> maybe a related question is how to assign the TCP port range and how
>>>>>>> it is used? when the processes are on different machines, I use the
>>>>>>> same range and that's ok as long as the range is free. but when the
>>>>>>> processes are on the same node, what value should the range be for
>>>>>>> each process? My range is 10000-12000 (for both processes) and I see
>>>>>>> that the process with rank #0 listens on port 10001 while the process
>>>>>>> with rank #1 tries to establish a connection to port 10000.
>>>>>>>
>>>>>>> Thanks so much!
>>>>>>> p. still here...
>>>>>>> still trying... ;-)
>>>>>>>
>>>>>>> On Tue, Jul 27, 2010 at 12:58 AM, Ralph Castain <[email protected]> wrote:
>>>>>>>> Use what hostname returns - don't worry about IP addresses as we'll
>>>>>>>> discover them.
>>>>>>>>
>>>>>>>> On Jul 26, 2010, at 10:45 PM, Philippe wrote:
>>>>>>>>
>>>>>>>>> Thanks a lot!
>>>>>>>>>
>>>>>>>>> now, for the EV "OMPI_MCA_orte_nodes", what do I put exactly? our
>>>>>>>>> nodes have a short/long name (it's RHEL 5.x, so the command hostname
>>>>>>>>> returns the long name) and at least 2 IP addresses.
>>>>>>>>>
>>>>>>>>> p.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 27, 2010 at 12:06 AM, Ralph Castain <[email protected]> wrote:
>>>>>>>>>> Okay, fixed in r23499. Thanks again...
>>>>>>>>>>
>>>>>>>>>> On Jul 26, 2010, at 9:47 PM, Ralph Castain wrote:
>>>>>>>>>>
>>>>>>>>>>> Doh - yes it should! I'll fix it right now.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> On Jul 26, 2010, at 9:28 PM, Philippe wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ralph,
>>>>>>>>>>>>
>>>>>>>>>>>> i was able to test the generic module and it seems to be working.
>>>>>>>>>>>>
>>>>>>>>>>>> one question tho: the function orte_ess_generic_component_query in
>>>>>>>>>>>> "orte/mca/ess/generic/ess_generic_component.c" calls getenv with
>>>>>>>>>>>> the argument "OMPI_MCA_enc", which seems to cause the module to
>>>>>>>>>>>> fail to load. shouldn't it be "OMPI_MCA_ess"?
>>>>>>>>>>>>
>>>>>>>>>>>> .....
>>>>>>>>>>>>
>>>>>>>>>>>>     /* only pick us if directed to do so */
>>>>>>>>>>>>     if (NULL != (pick = getenv("OMPI_MCA_env")) &&
>>>>>>>>>>>>         0 == strcmp(pick, "generic")) {
>>>>>>>>>>>>         *priority = 1000;
>>>>>>>>>>>>         *module = (mca_base_module_t *)&orte_ess_generic_module;
>>>>>>>>>>>>
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> p.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 22, 2010 at 5:53 PM, Ralph Castain <[email protected]> wrote:
>>>>>>>>>>>>> Dev trunk looks okay right now - I think you'll be fine using it.
>>>>>>>>>>>>> My new component -might- work with 1.5, but probably not with 1.4.
>>>>>>>>>>>>> I haven't checked either of them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anything at r23478 or above will have the new module. Let me know
>>>>>>>>>>>>> how it works for you. I haven't tested it myself, but am pretty
>>>>>>>>>>>>> sure it should work.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Jul 22, 2010, at 3:22 PM, Philippe wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ralph,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you so much!!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll give it a try and let you know.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I know it's a tough question, but how stable is the dev trunk?
>>>>>>>>>>>>>> Can I just grab the latest and run, or am I better off taking
>>>>>>>>>>>>>> your changes and copying them back into a stable release? (if
>>>>>>>>>>>>>> so, which one? 1.4? 1.5?)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> p.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jul 22, 2010 at 3:50 PM, Ralph Castain <[email protected]> wrote:
>>>>>>>>>>>>>>> It was easier for me to just construct this module than to
>>>>>>>>>>>>>>> explain how to do so :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will commit it this evening (couple of hours from now) as that
>>>>>>>>>>>>>>> is our standard practice. You'll need to use the developer's
>>>>>>>>>>>>>>> trunk, though, to use it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here are the envars you'll need to provide:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Each process needs to get the same following values:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * OMPI_MCA_ess=generic
>>>>>>>>>>>>>>> * OMPI_MCA_orte_num_procs=<number of MPI procs>
>>>>>>>>>>>>>>> * OMPI_MCA_orte_nodes=<a comma-separated list of nodenames where MPI procs reside>
>>>>>>>>>>>>>>> * OMPI_MCA_orte_ppn=<number of procs/node>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note that I have assumed this last value is a constant for
>>>>>>>>>>>>>>> simplicity. If that isn't the case, let me know - you could
>>>>>>>>>>>>>>> instead provide it as a comma-separated list of values with an
>>>>>>>>>>>>>>> entry for each node.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In addition, you need to provide the following value that will
>>>>>>>>>>>>>>> be unique to each process:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * OMPI_MCA_orte_rank=<MPI rank>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Finally, you have to provide a range of static TCP ports for use
>>>>>>>>>>>>>>> by the processes. Pick any range that you know will be available
>>>>>>>>>>>>>>> across all the nodes. You then need to ensure that each process
>>>>>>>>>>>>>>> sees the following envar:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * OMPI_MCA_oob_tcp_static_ports=6000-6010  <== obviously, replace this with your range
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You will need a port range that is at least equal to the ppn for
>>>>>>>>>>>>>>> the job (each proc on a node will take one of the provided ports).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That should do it. I compute everything else I need from those
>>>>>>>>>>>>>>> values.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does that work for you?
>>>>>>>>>>>>>>> Ralph

_______________________________________________
users mailing list
[email protected]
http://www.open-mpi.org/mailman/listinfo.cgi/users
