> On Oct 17, 2014, at 10:23 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> 
> Hi Ralph
> 
> Thank you.
> Your fixes covered much more than I could find.
> The section about the three levels of process placement
> (" Mapping, Ranking, and Binding: Oh My!") really helps.
> I would just add at the very beginning
> short sentences quickly characterizing each of the three levels.
> Kind of an "abstract".
> Then explain each level in more detail.

Will do - thanks!

> 
> **
> 
> Also, I found Jeff's 2013 presentation about the new style
> of process placement.
> 
> http://www.slideshare.net/jsquyres/open-mpi-explorations-in-process-affinity-eurompi13-presentation
> 
> The title calls it "LAMA".
> (That is mud in Portuguese! But the presentation is clear.)
> OK, the acronym means "Locality Aware Mapping Algorithm".
> In any case, it sounds very similar to the current process placement
> features of OMPI 1.8, although only Jeff and you can really tell if it
> is exactly the same.
> 
> If it is the same, it may help to link to it from the OMPI FAQ,
> or somehow make it more visible, printable, etc.
> If there are differences between OMPI 1.8 and the presentation,
> it may be worth adjusting the presentation to the
> current OMPI 1.8, and posting it as well.
> That would be a good way to convey the OMPI 1.8
> process placement conceptual model, along with its syntax
> and examples.

Yeah, I need to do that. LAMA was an alternative implementation of the current 
map/rank/bind system. It hasn’t been fully maintained since it was introduced, 
and so I’m not sure how much of it is functional. I need to create an 
equivalent for the current implementation.
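
In the meantime, the rough equivalent under the current implementation is the
trio of mpirun options -map-by, -rank-by, and -bind-to. A quick sketch (adjust
the resource levels to your hardware):

  mpirun -np 16 -map-by socket -rank-by core -bind-to core -report-bindings ./a.out

That maps processes round-robin across sockets, numbers the ranks by core,
binds each process to a single core, and prints the resulting bindings so the
placement can be verified.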


> 
> Thank you,
> Gus Correa
> 
> 
> On 10/17/2014 12:10 AM, Ralph Castain wrote:
>> I know this commit could be a little hard to parse, but I have updated
>> the mpirun man page on the trunk and will port the change over to the
>> 1.8 series tomorrow. FWIW, I’ve provided the link to the commit below so
>> you can “preview” it.
>> 
>> https://github.com/open-mpi/ompi/commit/f9d620e3a772cdeddd40b4f0789cf59c75b44868
>> 
>> HTH
>> Ralph
>> 
>> 
>>> On Oct 16, 2014, at 9:43 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>> 
>>> Hi Ralph
>>> 
>>> Yes, I know the process placement features are powerful.
>>> They were already very good in 1.6, even in 1.4,
>>> and I just tried the new 1.8
>>> "-map-by l2cache" (works nicely on Opteron 6300).
>>> 
>>> Unfortunately I couldn't keep track of, test, and use the 1.7 series.
>>> I did that in the previous "odd/new feature" series (1.3, 1.5).
>>> However, my normal workload requires that
>>> I focus my attention on the "even/stable" series
>>> (less fun, more production).
>>> Hence I hopped directly from 1.6 to 1.8,
>>> although I read a number of mailing list postings about the new
>>> style of process placement.
>>> 
>>> Pestering you again about documentation (last time for now):
>>> The mpiexec man page also seems to need an update.
>>> That is probably the first place people look for information
>>> about runtime features.
>>> For instance, the process placement examples still
>>> use deprecated parameters and mpiexec options:
>>> -bind-to-core, rmaps_base_schedule_policy, orte_process_binding, etc.
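>>> (If it helps other readers, the current equivalents seem to be:
>>> -bind-to-core -> "-bind-to core",
>>> rmaps_base_schedule_policy -> rmaps_base_mapping_policy,
>>> orte_process_binding -> hwloc_base_binding_policy.)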
>>> 
>>> Thank you,
>>> Gus Correa
>>> 
>>> On 10/15/2014 11:10 PM, Ralph Castain wrote:
>>>> 
>>>> On Oct 15, 2014, at 11:46 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>>> 
>>>>> Thank you Ralph and Jeff for the help!
>>>>> 
>>>>> Glad to hear the segmentation fault is reproducible and will be fixed.
>>>>> 
>>>>> In any case, one can just avoid the old parameter name
>>>>> (rmaps_base_schedule_policy),
>>>>> and use instead the new parameter name
>>>>> (rmaps_base_mapping_policy)
>>>>> without any problem in OMPI 1.8.3.
>>>>> 
>>>> 
>>>> Fix is in the nightly 1.8 tarball - I'll release a 1.8.4 soon to cover
>>>> the problem.
>>>> 
>>>>> **
>>>>> 
>>>>> Thanks Ralph for sending the new (OMPI 1.8)
>>>>> parameter names for process binding.
>>>>> 
>>>>> My recollection is that sometime ago somebody (Jeff perhaps?)
>>>>> posted here a link to a presentation (PDF or PPT) explaining the
>>>>> new style of process binding, but I couldn't find it in the
>>>>> list archives.
>>>>> Maybe the link could be part of the FAQ (if not already there)?
>>>> 
>>>> I don't think it is, but I'll try to add it over the next day or so.
>>>> 
>>>>> 
>>>>> **
>>>>> 
>>>>> The Open MPI runtime environment is really great.
>>>>> However, to take advantage of it one often has to guess at parameters
>>>>> and run time-consuming tests by trial and error,
>>>>> because the main sources of documentation are
>>>>> the terse output of ompi_info and several scattered
>>>>> references in the FAQ.
>>>>> (Some of them outdated?)
>>>>> 
>>>>> In addition, the runtime environment has evolved over time,
>>>>> which is certainly a good thing.
>>>>> However, along with this evolution, several runtime parameters
>>>>> changed both name and functionality, new ones were introduced,
>>>>> and old ones were deprecated, which can be somewhat confusing
>>>>> and can lead to ineffective use of the runtime environment.
>>>>> (In 1.8.3 I was using several deprecated parameters from 1.6.5
>>>>> that seem to be silently ignored at runtime.
>>>>> I only noticed the problem because that segmentation fault happened.)
>>>>> 
>>>>> I know asking for thorough documentation is foolish,
>>>> 
>>>> Not really - it is something we need to get better about :-(
>>>> 
>>>>> but I guess a simple table of runtime parameter names and valid values
>>>>> in the FAQ, maybe sorted by their purpose/function, along with a few
>>>>> examples of use, could help a lot.
>>>>> Some of this material is now spread across several FAQ entries,
>>>>> but it is not so easy to find or compare.
>>>>> That doesn't need to be a comprehensive table, but commonly used
>>>>> items like selecting the btl, selecting interfaces,
>>>>> dealing with process binding,
>>>>> modifying/enriching the stdout/stderr output
>>>>> (tagging output, increasing verbosity, etc),
>>>>> probably have their place there.
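>>>>> For instance, a short annotated openmpi-mca-params.conf sketch
>>>>> (parameter names as I understand them in 1.8), such as:
>>>>> btl = self,sm,tcp (transport selection)
>>>>> btl_tcp_if_include = eth0 (interface selection)
>>>>> hwloc_base_binding_policy = core (process binding)
>>>>> orte_tag_output = 1 (tag stdout/stderr by rank)
>>>>> would already answer many day-to-day questions.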
>>>> 
>>>> Yeah, we fell down on this one. The changes were announced with each
>>>> step in the 1.7 series, but if you step from 1.6 directly to 1.8, you'll
>>>> get caught flat-footed. We honestly didn't think of that case, and so we
>>>> mentally assumed that "of course people have been following the series -
>>>> they know what happened".
>>>> 
>>>> You know what they say about those who "assume" :-/
>>>> 
>>>> I'll try to get something into the FAQ about the entire new mapping,
>>>> ranking, and binding system. It is actually VERY powerful, allowing you
>>>> to specify pretty much any placement pattern you can imagine and bind it
>>>> to whatever level you desire. It was developed in response to requests
>>>> from researchers who wanted to explore application performance versus
>>>> placement strategies - but we provided some simplified options to
>>>> support more common usage patterns.
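>>>> 
>>>> As one example of that flexibility (a sketch - see the mpirun man page for
>>>> the full grammar):
>>>> 
>>>> mpirun -np 8 -map-by ppr:2:socket -rank-by socket -bind-to core ./a.out
>>>> 
>>>> puts two processes on each socket, numbers the ranks round-robin across
>>>> sockets, and binds each process to its own core.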
>>>> 
>>>> 
>>>>> 
>>>>> 
>>>>> Many thanks,
>>>>> Gus Correa
>>>>> 
>>>>> 
>>>>> On 10/15/2014 11:12 AM, Jeff Squyres (jsquyres) wrote:
>>>>>> We talked off-list -- fixed this on master and just filed
>>>>>> https://github.com/open-mpi/ompi-release/pull/33 to get this into the
>>>>>> v1.8 branch.
>>>>>> 
>>>>>> 
>>>>>> On Oct 14, 2014, at 7:39 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Oct 14, 2014, at 5:32 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>>>>>> 
>>>>>>>> Dear Open MPI fans and experts
>>>>>>>> 
>>>>>>>> This is just a note in case other people run into the same problem.
>>>>>>>> 
>>>>>>>> I just built Open MPI 1.8.3.
>>>>>>>> As usual I put my old settings on openmpi-mca-params.conf,
>>>>>>>> with no further thinking.
>>>>>>>> Then I compiled the connectivity_c.c program and tried
>>>>>>>> to run it with mpiexec.
>>>>>>>> That is a routine that never failed before.
>>>>>>>> 
>>>>>>>> Bummer!
>>>>>>>> I got a segmentation fault right away.
>>>>>>> 
>>>>>>> Strange - it works fine from the cmd line:
>>>>>>> 
>>>>>>> 07:27:04  (v1.8) /home/common/openmpi/ompi-release$ mpirun -n 1 -mca
>>>>>>> rmaps_base_schedule_policy core hostname
>>>>>>> --------------------------------------------------------------------------
>>>>>>> A deprecated MCA variable value was specified in the environment or
>>>>>>> on the command line.  Deprecated MCA variables should be avoided;
>>>>>>> they may disappear in future releases.
>>>>>>> 
>>>>>>> Deprecated variable: rmaps_base_schedule_policy
>>>>>>> New variable:        rmaps_base_mapping_policy
>>>>>>> --------------------------------------------------------------------------
>>>>>>> bend001
>>>>>>> 
>>>>>>> HOWEVER, I can replicate that behavior when it is in the default
>>>>>>> params file! I don't see the immediate cause of the difference, but
>>>>>>> will investigate.
>>>>>>> 
>>>>>>>> 
>>>>>>>> After some head scratching, checking my environment, etc,
>>>>>>>> I thought I might have configured OMPI incorrectly.
>>>>>>>> Hence, I tried to get information from ompi_info.
>>>>>>>> Oh well, ompi_info also segfaulted!
>>>>>>>> 
>>>>>>>> It took me a while to realize that the runtime parameter
>>>>>>>> configuration file was the culprit.
>>>>>>>> 
>>>>>>>> When I inserted the runtime parameter settings one by one,
>>>>>>>> the segfault came with this one:
>>>>>>>> 
>>>>>>>> rmaps_base_schedule_policy = core
>>>>>>>> 
>>>>>>>> Ompi_info (when I got it to work) told me that the parameter above
>>>>>>>> is now a deprecated synonym of:
>>>>>>>> 
>>>>>>>> rmaps_base_mapping_policy = core
>>>>>>>> 
>>>>>>>> In any case, the old synonym doesn't work and makes ompi_info and
>>>>>>>> mpiexec segfault (and I'd guess anything else that requires the
>>>>>>>> OMPI runtime components).
>>>>>>>> Only the new parameter name works.
>>>>>>> 
>>>>>>> That's because the segfault is happening in the printing of the
>>>>>>> deprecation warning.
>>>>>>> 
>>>>>>>> 
>>>>>>>> ***
>>>>>>>> 
>>>>>>>> The following (OMPI 1.6.5) parameters also seem to be missing
>>>>>>>> from the ompi_info output (not reported by ompi_info --all):
>>>>>>>> 
>>>>>>> 
>>>>>>> 1) orte_process_binding  ===> hwloc_base_binding_policy
>>>>>>> 
>>>>>>> 2) orte_report_bindings   ===> hwloc_base_report_bindings
>>>>>>> 
>>>>>>> 3) opal_paffinity_alone  ===> gone, use
>>>>>>> hwloc_base_binding_policy=none if you don't want any binding
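>>>>>>> 
>>>>>>> So a 1.6-era stanza in openmpi-mca-params.conf like:
>>>>>>> 
>>>>>>> orte_process_binding = core
>>>>>>> orte_report_bindings = 1
>>>>>>> 
>>>>>>> becomes (a sketch):
>>>>>>> 
>>>>>>> hwloc_base_binding_policy = core
>>>>>>> hwloc_base_report_bindings = 1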
>>>>>>> 
>>>>>>>> 
>>>>>>>> Are they gone forever?
>>>>>>>> 
>>>>>>>> Are there replacements for them, with approximately the same
>>>>>>>> functionality?
>>>>>>>> 
>>>>>>>> Is there a list comparing the new (1.8) vs. old (1.6)
>>>>>>>> OMPI runtime parameters, and/or any additional documentation
>>>>>>>> about the new style of OMPI 1.8 runtime parameters?
>>>>>>> 
>>>>>>> Will try to add this to the web site
>>>>>>> 
>>>>>>>> 
>>>>>>>> Since there seems to have been a major revamping of the OMPI
>>>>>>>> runtime parameters, that would be a great help.
>>>>>>>> 
>>>>>>>> Thank you,
>>>>>>>> Gus Correa