On Mar 2, 2007, at 12:18 PM, George Bosilca wrote:

I think I missed the discussion about these AMCA files here at UTK, and
about the benefits they give us. Anyway, I have some comments about this
patch.

You seem to add the new AMCA files into the same string as the
default MCA param file, and then you call your new function
fixup_files. This function takes a directory as an argument, and you
will try to match everything with the path coming from an MCA
parameter described as AMCA specific. That doesn't really make sense
to me! If the prefix is AMCA specific then don't match the MCA param
files with it; if not, then correct the help message.

Actually, I think you have misread the code, if I understand what you are saying correctly. We are talking about the file "opal/mca/base/mca_base_param.c". On Line 213 we call the fixup_files() function, passing it the list of AMCA files (the -am command line parameter) and the AMCA specific search path. This resolves the AMCA files using this search path. The variable new_agg_path contains the AMCA specific path, which is kept separate from the default MCA param file variable 'new_files'. That is, until *after* we have resolved all relative AMCA files. At that point (Line 223) we prepend the specified AMCA parameter files to the default MCA param file list.

So it is doing exactly as it should, using the AMCA specific path to resolve only the AMCA parameter sets, never the default MCA parameter files.
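
As an illustration of the resolution order described above, here is a minimal Python sketch. It is not the C code in mca_base_param.c: the function names resolve_amca_files and build_file_list, the warning text, and the ':' handling are assumptions made for the example. It only shows the idea that the AMCA search path is applied to the AMCA files alone, and that the resolved files are then placed ahead of the default MCA param files.

```python
import os


def resolve_amca_files(amca_files, amca_path, warn=print):
    """Resolve a ':' separated list of AMCA file names (hypothetical helper).

    Relative names are looked up in each directory of amca_path, in order.
    Absolute names are used as given. Missing files trigger a warning.
    """
    resolved = []
    for name in amca_files.split(":"):
        if not name:
            continue
        if os.path.isabs(name):
            candidates = [name]
        else:
            candidates = [os.path.join(d, name)
                          for d in amca_path.split(":") if d]
        found = next((c for c in candidates if os.path.isfile(c)), None)
        if found is None:
            warn("AMCA parameter set not found: " + name)
        else:
            resolved.append(found)
    return resolved


def build_file_list(amca_files, amca_path, default_files, warn=print):
    """Prepend the resolved AMCA files to the default MCA param file list.

    default_files is passed through untouched: it is never matched
    against amca_path.
    """
    resolved = resolve_amca_files(amca_files, amca_path, warn)
    parts = resolved + ([default_files] if default_files else [])
    return ":".join(parts)
```

In this sketch the default file list is only ever concatenated, never searched, which mirrors the separation between new_agg_path and new_files described above.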


Last thing about this patch. Having the opal MCA layer export a bool
variable just to make life easier for orted and orterun (which in
fact don't really need it, as they set it to true multiple times ???)
doesn't look to me like a good approach.

It is not so much making life easier for orted/orterun as it is suppressing a warning that would otherwise be raised when the user specifies a relative path that needs to be resolved in the current working directory (e.g., -am ../adir/my-amca.conf). We need a way to tell the MCA layer that it shouldn't try to resolve the AMCA files yet, because not all of the environment variables are set up properly at that point.

It gets a bit tricky since the orted/orterun processes have to bootstrap the MCA layer in stages because of command line arguments. Upon first entering the mca_base_param_init() function the system has only part of the information (mostly environment variables), but nothing from the command line. The orted/orterun processes then parse the command line, thereby seeding the MCA layer with the 'correct' information. Once we have the correct information we want to recache the files (the mca_base_param_recache_files function) using the user provided information. Without some way of telling the MCA layer not to warn about AMCA files that were not found, certain relative paths could cause a warning on the first pass through the library (because it doesn't have the complete information yet) but not on the second, which would confuse the end user about what happened.

So this is a long way of saying this is the best way I could think of to do this without changing a whole lot of code and more interfaces in the MCA layer.
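
The two-pass startup described above can be sketched as follows. This is a toy Python model, not the MCA layer: the class ParamFileCache, the flag name suppress_missing_warning, and the function launcher_startup are all hypothetical. Where the real code sets or clears its exported bool is not shown in this thread, so the sketch simply assumes the flag is set before the first pass and cleared before the recache.

```python
import os


class ParamFileCache:
    """Toy stand-in for the MCA layer's file bookkeeping (hypothetical)."""

    def __init__(self):
        # Hypothetical name for the exported bool discussed in the thread.
        self.suppress_missing_warning = False
        self.warnings = []
        self.files = []

    def cache_files(self, amca_files, search_dirs):
        """(Re)build the list of resolved files from scratch."""
        self.files = []
        for name in amca_files:
            hits = [os.path.join(d, name) for d in search_dirs
                    if os.path.isfile(os.path.join(d, name))]
            if hits:
                self.files.append(hits[0])
            elif not self.suppress_missing_warning:
                self.warnings.append("AMCA file not found: " + name)


def launcher_startup(cache, amca_files, early_dirs, full_dirs):
    """Model of what an orted/orterun-like launcher would do."""
    # Pass 1: only partial information is available, so stay quiet.
    cache.suppress_missing_warning = True
    cache.cache_files(amca_files, early_dirs)
    # ... the command line would be parsed here ...
    # Pass 2: complete information, so a missing file is a real problem.
    cache.suppress_missing_warning = False
    cache.cache_files(amca_files, full_dirs)
```

With this arrangement a file that is only findable once the full search path is known produces no warning at all, while a file that is missing on both passes is reported exactly once.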



In fact, I was wondering what the real difference is between having
this new AMCA stuff and extending the mca_param_files default MCA
parameter?

Nothing much other than the way it expresses itself to the end user. As I mentioned in my original email the default MCA parameter files and the AMCA parameter set files are the same format, and only differ in when they are used. The default MCA parameter files are used *all* the time on every run. The AMCA parameter set files are only used when the user explicitly asks for them on the command line. As you may have noticed in the code they are parsed and processed in the same way. But by exposing special MCA parameters for the AMCA file sets and the AMCA special path we can logically separate them so the default MCA parameter files are in one place and the AMCA parameter set files are in another place on the system. This way it is a bit clearer that the AMCA parameter sets are opt-in functionality.

Certainly the end user could specify another file to use in addition to the default MCA parameter files (mca_param_files), but then they must also specify the other locations that already exist in that path (e.g., $HOME/.openmpi/mca-params.conf:$SYSCONFDIR/openmpi-mca-params.conf). This is a shortcut in a sense, so the end user doesn't have to know all of this ugliness every time they want to run a benchmark, or ...

Hopefully that explains things a bit more, sorry if it was overly confusing.

-- Josh



Thanks,
   george.

On Mar 1, 2007, at 8:52 AM, Josh Hursey wrote:

Developers,

I just committed back to the trunk the Aggregate MCA (AMCA) Parameter
Set work that Jeff Squyres and I have been working on. This will be a
part of the eventual v1.3 release.

The motivation for creating AMCA parameter sets came from the
realization that for certain applications a large number of MCA
parameters needed to be set for the job to run well and/or as the
user expects. So the goal of this work was to help reduce the number
of MCA parameters that the user has to manage, therefore leading to a
better end user experience with Open MPI.

AMCA parameter set files are formatted exactly like the
"~/.openmpi/mca-params.conf" configuration files. In addition, when
AMCA parameter sets are used the user may still override the
parameters on the command line if they like.

For example, let's say there is a set of MCA parameters that a user
would need to set to get good performance out of NetPIPE when using
OpenIB. They would typically run the application as:
   shell$ mpirun -np 2 NPmpi

To use the AMCA parameter set for OpenIB the user would run:
   shell$ mpirun -np 2 -am btl-openib-benchmark NPmpi

This will load a series of MCA parameters for the user. If they
wanted to override the max_btls MCA parameter for tuning reasons they
would run:
   shell$ mpirun -np 2 -am btl-openib-benchmark \
          -mca btl_open_ib_max_btls 10 NPmpi

AMCA parameter sets can be coupled. If we take the example above and
wanted to also use an AMCA parameter set for TCP, the user would run:
   shell$ mpirun -np 2 -am btl-openib-benchmark:btl-tcp-benchmark \
          -mca btl_open_ib_max_btls 10 NPmpi

The AMCA parameter sets are loaded in priority order, with sets listed
earlier on the command line taking priority. This means that the
OpenIB AMCA set has priority over the TCP AMCA set. So if the TCP
AMCA sets the MCA parameter "mpi_leave_pinned=0" and the OpenIB AMCA
sets it to "mpi_leave_pinned=1" then the latter, OpenIB version, will
be used.
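
The precedence rules above can be illustrated with a small Python sketch. It is only a model of the described behavior, not how Open MPI implements it: the function merge_param_sets is hypothetical, and each AMCA set is represented as a plain dict instead of a parsed file.

```python
def merge_param_sets(amca_sets, cli_overrides=None):
    """Combine parameter sets so that higher priority values win.

    amca_sets is ordered as on the command line (highest priority first).
    cli_overrides models "-mca name value" pairs, which beat every set.
    """
    merged = {}
    # Apply from lowest to highest priority so later writes win.
    for params in reversed(amca_sets):
        merged.update(params)
    merged.update(cli_overrides or {})
    return merged


openib = {"mpi_leave_pinned": "1"}
tcp = {"mpi_leave_pinned": "0"}

# Same ordering as "-am btl-openib-benchmark:btl-tcp-benchmark".
result = merge_param_sets([openib, tcp], {"btl_open_ib_max_btls": "10"})
print(result["mpi_leave_pinned"])      # prints 1 (the OpenIB value)
print(result["btl_open_ib_max_btls"])  # prints 10 (from the command line)
```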

Additional Related MCA parameters:
  - mca_base_param_file_prefix
      (Default: NULL)
      This is the full name of the "-am" mpirun option. Used to
      specify a ':' separated list of AMCA parameter set files.
  - mca_base_param_file_path
      (Default: $SYSCONFDIR/amca-param-sets/:$CWD)
      The path to search for AMCA files with relative paths. A
      warning will be printed if the AMCA file cannot be found.


If you have any problems with this new feature let me know. There
will be an FAQ coming shortly about this I suspect.

Cheers,
Josh


----
Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

"Half of what I say is meaningless; but I say it so that the other
half may reach you"
                                   Kahlil Gibran

