Hi Jeff,

I finally got an allocation on Cori - it's one busy machine.

Anyway, using the ompi I'd built on Edison with the above recommended
configure options, I was able to run using either srun or mpirun on Cori,
provided that in the latter case I used

mpirun -np X -N Y --mca plm slurm ./my_favorite_app

I will make an adjustment to the alps plm launcher so that it disqualifies
itself if the wlm_detect facility on the Cray reports that srun is the
launcher.  That's a minor fix and should make it into v2.x in a week or so.
It will be a runtime selection, so you only have to build ompi once for use
on either Edison or Cori.
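
Until that fix lands, note that any MCA parameter can also be set through
the environment rather than the command line (equivalent to the --mca flag
above), which saves retyping it on every launch:

export OMPI_MCA_plm=slurm
mpirun -np X -N Y ./my_favorite_app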

Howard


2015-11-19 17:11 GMT-07:00 Howard Pritchard <hpprit...@gmail.com>:

> Hi Jeff H.
>
> Why don't you just try configuring with
>
> ./configure --prefix=my_favorite_install_dir \
>             --with-libfabric=install_dir_for_libfabric
> make -j 8 install
>
> and see what happens?
>
> Make sure before you configure that you have PrgEnv-gnu or PrgEnv-intel
> module loaded.
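>
> For example, on a Cray where the default PrgEnv-cray module is loaded
> (adjust if your default differs), something like:
>
> module swap PrgEnv-cray PrgEnv-gnu   # or PrgEnv-intel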
>
> Those were the configure/compiler options I used to test the ofi mtl on
> Cori.
>
> Jeff S. - this thread has gotten intermingled with MPICH setup as well,
> hence the suggestion for the MPICH shm mechanism.
>
>
> Howard
>
>
>
> 2015-11-19 16:59 GMT-07:00 Jeff Hammond <jeff.scie...@gmail.com>:
>
>>
>>> How did you configure for Cori?  You need to be using the slurm plm
>>> component for that system.  I know this sounds like gibberish.
>>>
>>>
>> ../configure --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
>>              --enable-mca-static=mtl-ofi \
>>              --enable-mca-no-build=btl-openib,btl-vader,btl-ugni,btl-tcp \
>>              --enable-static --disable-shared --disable-dlopen \
>>              --prefix=$HOME/MPI/install-ompi-ofi-gcc-gni-xpmem-cori \
>>              --with-cray-pmi --with-alps --with-cray-xpmem --with-slurm \
>>              --without-verbs --without-fca --without-mxm --without-ucx \
>>              --without-portals4 --without-psm --without-psm2 \
>>              --without-udreg --without-ugni --without-munge \
>>              --without-sge --without-loadleveler --without-tm --without-lsf \
>>              --without-pvfs2 --without-plfs \
>>              --without-cuda --disable-oshmem \
>>              --disable-mpi-fortran --disable-oshmem-fortran \
>>              LDFLAGS="-L/opt/cray/ugni/default/lib64 -lugni \
>>                       -L/opt/cray/alps/default/lib64 -lalps -lalpslli -lalpsutil \
>>                       -ldl -lrt"
>>
>>
>> This is copied from
>> https://github.com/jeffhammond/HPCInfo/blob/master/ofi/README.md#open-mpi,
>> which I note in case you want to see what changes I've made at any point in
>> the future.
>>
>>
>>> There should be a with-slurm configure option to pick up this component.
>>>
>>
>> Indeed there is.
>>
>>
>>> Doesn't MPICH have the option to use sysv memory?  You may want to try
>>> that.
>>>
>>>
>> MPICH?  Look, I may have earned my way onto Santa's naughty list more
>> than a few times, but at least I have the decency not to post MPICH
>> questions to the Open-MPI list ;-)
>>
>> If there is a way to tell Open-MPI to use shm_open without filesystem
>> backing (if that is even possible) at configure time, I'd love to do that.
>>
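>> (One candidate I haven't verified: the orte_tmpdir_base MCA parameter is
>> supposed to relocate the session directory that the shared-memory backing
>> files live under, so something like
>>
>> export OMPI_MCA_orte_tmpdir_base=/dev/shm
>>
>> might be worth a shot at runtime, even if not at configure time.)
>>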
>>
>>> Oh, for tuning params you can use env variables.  For example, let's say
>>> that rather than using the gni provider in the ofi mtl you want to try
>>> sockets.  Then do
>>>
>>> export OMPI_MCA_mtl_ofi_provider_include=sockets
>>>
>>>
>> Thanks.  I'm glad that there is an option to set them this way.
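>>
>> For anyone searching the archives later, the pattern appears to be
>> OMPI_MCA_<param name> for any MCA parameter, e.g. in a batch script:
>>
>> export OMPI_MCA_mtl_ofi_provider_include=sockets
>> srun -n 2 ./driver.x 64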
>>
>>
>>> In the spirit of OMPI - may the force be with you.
>>>
>>>
>> All I will say here is that Open-MPI has a Vader BTL :-)
>>
>>>
>>> > On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
>>> > > I have no idea what this is trying to tell me. Help?
>>> > >
>>> > > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
>>> > > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
>>> > > ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418
>>> > >
>>> > > I can run the same job with srun without incident:
>>> > >
>>> > > jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
>>> > > MPI was initialized.
>>> > >
>>> > > This is on the NERSC Cori Cray XC40 system.  I built Open-MPI git
>>> > > head from source for OFI libfabric.
>>> > >
>>> > > I have many other issues, which I will report later.  As a spoiler,
>>> > > if I cannot use your mpirun, I cannot set any of the MCA options
>>> > > there.  Is there a method to set MCA options with environment
>>> > > variables?  I could not find this documented anywhere.
>>> > >
>>> > > In particular, is there a way to cause shm to not use the global
>>> > > filesystem?  I see this issue comes up a lot and I read the list
>>> > > archives, but the warning message (
>>> > > https://github.com/hpc/cce-mpi-openmpi-1.6.4/blob/master/ompi/mca/common/sm/help-mpi-common-sm.txt
>>> > > ) suggested that I could override it by setting TMP, TEMP or TEMPDIR,
>>> > > which I did to no avail.
>>> >
>>> > From my experience on Edison: the one environment variable that does
>>> > work is TMPDIR - the one that is not listed in the error message :-)
>>>
>>
>> That's great.  I will try that now.  Is there a GitHub issue open already
>> to fix that documentation?  If not...
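>>
>> Concretely, I'll try something like this, assuming the compute nodes have
>> a node-local tmpfs at /dev/shm:
>>
>> export TMPDIR=/dev/shm
>> srun -n 2 ./driver.x 64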
>>
>>
>>> > Can't help you with your mpirun problem though ...
>>>
>>
>> No worries.  I appreciate all the help I can get.
>>
>> Thanks,
>>
>> Jeff
>>
>> --
>> Jeff Hammond
>> jeff.scie...@gmail.com
>> http://jeffhammond.github.io/
>>
>
>
