Hmmm...but then proc->proc_hostname will *never* be filled in, because it is only ever accessed in error messages - i.e., in opal_output and its variants.
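(For concreteness, a minimal standalone sketch of the tradeoff being discussed - a lazy accessor that may communicate on its first call vs. reading the raw field and tolerating (null). Every name below is a hypothetical stand-in, not actual Open MPI code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *proc_hostname;   /* NULL until fetched under direct launch */
} proc_t;

/* Stand-in for the PMI round trip; in the real code this is the
 * communication that must not happen inside an output path. */
static char *pmi_fetch_hostname(void)
{
    return strdup("node042");   /* hypothetical value */
}

/* Lazy accessor: the first call may communicate - the hazard. */
static const char *proc_get_hostname(proc_t *p)
{
    if (NULL == p->proc_hostname) {
        p->proc_hostname = pmi_fetch_hostname();
    }
    return p->proc_hostname;
}

int main(void)
{
    proc_t proc = { NULL };

    /* Raw field in an output path: never communicates, but under
     * direct launch the field is unset, so we print (null). */
    printf("error on peer %s\n",
           proc.proc_hostname ? proc.proc_hostname : "(null)");

    /* Outside output paths, the lazy accessor is safe to use. */
    printf("peer resolved to %s\n", proc_get_hostname(&proc));

    free(proc.proc_hostname);
    return 0;
}

End of sketch.)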
If we are not going to retrieve it by default, then we need another solution *if* we want hostnames for error messages under direct launch. If we don't care, then we can ignore this issue and follow the proposal. I suppose one could ask why we are even bothering with the hostname, since the opal_output message includes the hostname in its prefix anyway. Jeff: this was your baby - what do you think?

On Aug 19, 2013, at 3:43 PM, Nathan Hjelm <hje...@lanl.gov> wrote:

> That solution is fine with me.
> 
> -Nathan
> 
> On Tue, Aug 20, 2013 at 12:41:49AM +0200, George Bosilca wrote:
>> If your offer is between quadratic and non-deterministic, I'll take the former.
>> 
>> I would advocate for a middle-ground solution. Clearly document in the header file that ompi_proc_get_hostname is __not__ safe to use in all contexts, as it might exhibit recursive behavior due to communication. Then revert all of its uses in the context of opal_output, opal_output_verbose, and all variants back to using "->proc_hostname". We might get a (null) instead of the peer name, but this removes the potential loops.
>> 
>> George.
>> 
>> On Aug 19, 2013, at 23:52 , Nathan Hjelm <hje...@lanl.gov> wrote:
>> 
>>> It would require a db read from every rank, which is what we are trying to avoid. This scales quadratically at best on Cray systems.
>>> 
>>> -Nathan
>>> 
>>> On Mon, Aug 19, 2013 at 02:48:18PM -0700, Ralph Castain wrote:
>>>> Yeah, I have some concerns about it too - been trying to test it out some more. It would be good to see just how much that one change makes - maybe restoring just the hostname wouldn't have that big an impact.
>>>> 
>>>> I'm leery of trying to ensure we strip all the opal_output loops if we don't find the hostname.
>>>> 
>>>> On Aug 19, 2013, at 2:41 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>> 
>>>>> As a result of this patch, the first decode of a peer's host name might happen in the middle of a debug message (on the first call to ompi_proc_get_hostname). Such behavior might generate deadlocks depending on the level of output verbosity, and has significant potential to reintroduce the recursive behavior the new state machine was supposed to remove.
>>>>> 
>>>>> George.
>>>>> 
>>>>> 
>>>>> On Aug 17, 2013, at 02:49 , svn-commit-mai...@open-mpi.org wrote:
>>>>> 
>>>>>> Author: rhc (Ralph Castain)
>>>>>> Date: 2013-08-16 20:49:18 EDT (Fri, 16 Aug 2013)
>>>>>> New Revision: 29040
>>>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/29040
>>>>>> 
>>>>>> Log:
>>>>>> When we direct launch an application, we rely on PMI for wireup support. In doing so, we lose the de facto data compression we get from the ORTE modex, since we no longer get all the wireup info from every proc in a single blob. Instead, we have to iterate over all the procs, calling PMI_KVS_get for every value we require.
>>>>>> 
>>>>>> This creates really bad scaling behavior. Users have found a nearly 20% launch time differential between mpirun and PMI, with PMI being the slower method. Some of the problem is attributable to poor exchange algorithms in RMs like Slurm and ALPS, but we make things worse by calling "get" so many times.
>>>>>> 
>>>>>> Nathan (with a tad of advice from me) has attempted to alleviate this problem by reducing the number of "get" calls.
>>>>>> This required the following changes:
>>>>>> 
>>>>>> * upon first request for data, have the OPAL db pmi component fetch and decode *all* the info from a given remote proc. It turned out we weren't caching the info, so we would continually request it and only decode the piece we needed for the immediate request. We now decode all the info and push it into the db hash component for local storage - and then all subsequent retrievals are fulfilled locally.
>>>>>> 
>>>>>> * reduced the amount of data by eliminating the exchange of the OMPI_ARCH value if heterogeneity is not enabled. This was used solely as a check so we would error out if the system wasn't actually homogeneous, which was fine when we thought there was no cost to doing the check. Unfortunately, at large scale and with direct launch, there is a non-zero cost to making this test. We are open to finding a compromise (perhaps turning the test off if requested?) if people feel strongly about performing the test.
>>>>>> 
>>>>>> * reduced the amount of RTE data being automatically fetched, and fetched the rest only upon request. In particular, we no longer immediately fetch the hostname (which is only used for error reporting), but instead get it when needed. Likewise for the RML URI, as that info is only required for some (not all) environments. In addition, we no longer fetch the locality unless required, relying instead on the PMI clique info to tell us who is on our local node (if additional info is required, the fetch is performed when a modex_recv is issued).
>>>>>> 
>>>>>> Again, all this only impacts direct launch - all the info is provided when launched via mpirun, as there is no added cost to getting it.
>>>>>> 
>>>>>> Barring objections, we may move this (plus any required other pieces) to the 1.7 branch once it soaks for an appropriate time.
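To make the first bullet of the commit log concrete: below is a minimal standalone sketch of the fetch-once-and-cache pattern it describes - the first lookup for a peer pulls and decodes the entire blob into a local store, and every subsequent retrieval is fulfilled locally. All names are hypothetical stand-ins, not the actual OPAL db pmi/hash components.

#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8

typedef struct {
    const char *key;
    const char *value;
} kv_t;

/* Local store standing in for the db hash component. */
static kv_t cache[MAX_KEYS];
static int  cache_len = 0;
static int  fetched   = 0;   /* have we pulled the remote blob yet? */

/* Stand-in for one expensive PMI_KVS_get round trip that returns the
 * peer's entire encoded blob; the old code paid this cost per key. */
static void pmi_fetch_all(void)
{
    cache[cache_len++] = (kv_t){ "hostname", "node042" };
    cache[cache_len++] = (kv_t){ "rml.uri",  "tcp://10.0.0.42:1024" };
    fetched = 1;
}

static const char *db_fetch(const char *key)
{
    if (!fetched) {
        pmi_fetch_all();   /* one remote fetch decodes everything */
    }
    for (int i = 0; i < cache_len; i++) {   /* later hits are local */
        if (0 == strcmp(cache[i].key, key)) {
            return cache[i].value;
        }
    }
    return NULL;
}

int main(void)
{
    printf("hostname: %s\n", db_fetch("hostname"));  /* remote fetch */
    printf("rml.uri:  %s\n", db_fetch("rml.uri"));   /* local hit */
    return 0;
}

Note this is also where George's concern bites: if the first db_fetch happens inside an output path, the remote fetch (and any debug output it triggers) happens there too.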