Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread r...@open-mpi.org
You can add "OMPI_MCA_plm=rsh OMPI_MCA_sec=^munge” to your environment


> On Jun 22, 2017, at 7:28 AM, John Hearns via users  
> wrote:
> 
> Michael,  try
>  --mca plm_rsh_agent ssh
> 
> I've been fooling with this myself recently, in the context of a PBS cluster
> 
> On 22 June 2017 at 16:16, Michael Di Domenico  > wrote:
> is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
> command line or (better) using environment variables?
> 
> i'd like to use the installed version of openmpi i have on a
> workstation, but it's linked with slurm from one of my clusters.
> 
> mpi/slurm work just fine on the cluster, but when i run it on a
> workstation i get the below errors
> 
> mca_base_component_repository_open: unable to open mca_sec_munge:
> libmunge missing
> ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
> opal_pmix_base_select failed
> returned value not found (-13) instead of orte_success
> 
> there's probably a magical incantation of mca parameters, but i'm not
> adept enough at determining what they are
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-19 Thread r...@open-mpi.org
When you fork that process off, do you set its process group? Or is it in the 
same process group as the shell script?
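(A quick way to check, as a sketch — the process names are the ones used later in this thread:)

    # while the job is running, show pid, parent, process group and command:
    ps -eo pid,ppid,pgid,args | grep -E 'dum\.sh|aborttest' | grep -v grep
    # the same PGID for dum.sh and aborttest10.exe means the fork left the child
    # in the script's process group; a different PGID means it was moved out.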

> On Jun 19, 2017, at 10:19 AM, Ted Sussman <ted.suss...@adina.com> wrote:
> 
> If I replace the sleep with an infinite loop, I get the same behavior.  One 
> "aborttest" process 
> remains after all the signals are sent.
> 
> On 19 Jun 2017 at 10:10, r...@open-mpi.org wrote:
> 
>> 
>> That is typical behavior when you throw something into "sleep" - not much we 
>> can do about it, I 
>> think.
>> 
>>On Jun 19, 2017, at 9:58 AM, Ted Sussman <ted.suss...@adina.com> wrote:
>> 
>>Hello,
>> 
>>I have rebuilt Open MPI 2.1.1 on the same computer, including 
>> --enable-debug.
>> 
>>I have attached the abort test program aborttest10.tgz.  This version 
>> sleeps for 5 sec before
>>calling MPI_ABORT, so that I can check the pids using ps.
>> 
>>This is what happens (see run2.sh.out).
>> 
>>Open MPI invokes two instances of dum.sh.  Each instance of dum.sh 
>> invokes aborttest.exe.
>> 
>>Pid    Process
>>-----  ---------------
>>19565  dum.sh
>>19566  dum.sh
>>19567  aborttest10.exe
>>19568  aborttest10.exe
>> 
>>When MPI_ABORT is called, Open MPI sends SIGCONT, SIGTERM and SIGKILL to 
>> both
>>instances of dum.sh (pids 19565 and 19566).
>> 
>>ps shows that both the shell processes vanish, and that one of the 
>> aborttest10.exe processes
>>vanishes.  But the other aborttest10.exe remains and continues until it 
>> is finished sleeping.
>> 
>>Hope that this information is useful.
>> 
>>Sincerely,
>> 
>>Ted Sussman
>> 
>> 
>> 
>>On 19 Jun 2017 at 23:06,  gil...@rist.or.jp  wrote:
>> 
>> 
>> Ted,
>> 
>>some traces are missing  because you did not configure with --enable-debug
>>i am afraid you have to do it (and you probably want to install that 
>> debug version in an 
>>other
>>location since its performances are not good for production) in order to 
>> get all the logs.
>> 
>>Cheers,
>> 
>>Gilles
>> 
>>- Original Message -
>>   Hello Gilles,
>> 
>>   I retried my example, with the same results as I observed before.  The 
>> process with rank 
>>1
>>   does not get killed by MPI_ABORT.
>> 
>>   I have attached to this E-mail:
>> 
>> config.log.bz2
>> ompi_info.bz2  (uses ompi_info -a)
>> aborttest09.tgz
>> 
>>   This testing is done on a computer running Linux 3.10.0.  This is a 
>> different computer 
>>than
>>   the computer that I previously used for testing.  You can confirm that 
>> I am using Open 
>>MPI
>>   2.1.1.
>> 
>>   tar xvzf aborttest09.tgz
>>   cd aborttest09
>>   ./sh run2.sh
>> 
>>   run2.sh contains the command
>> 
>>   /opt/openmpi-2.1.1-GNU/bin/mpirun -np 2 -mca btl tcp,self --mca odls_base_verbose 10 ./dum.sh
>> 
>>   The output from this run is in aborttest09/run2.sh.out.
>> 
>>   The output shows that the "default" component is selected by odls.
>> 
>>   The only messages from odls are: odls: launch spawning child ...  (two 
>> messages). 
>>There
>>   are no messages from odls with "kill" and I see no SENDING SIGCONT / 
>> SIGKILL
>>   messages.
>> 
>>   I am not running from within any batch manager.
>> 
>>   Sincerely,
>> 
>>   Ted Sussman
>> 
>>   On 17 Jun 2017 at 16:02, gil...@rist.or.jp wrote:
>> 
>>Ted,
>> 
>>i do not observe the same behavior you describe with Open MPI 2.1.1
>> 
>># mpirun -np 2 -mca btl tcp,self --mca odls_base_verbose 5 ./abort.sh
>> 
>>abort.sh 31361 launching abort
>>abort.sh 31362 launching abort
>>I am rank 0 with pid 31363
>>I am rank 1 with pid 31364
>>
>>--
>>MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>with errorcode 1.
>> 
>>NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>You may or may not see output from other processes, depending on
>>exact

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-15 Thread r...@open-mpi.org
You have to understand that we have no way of knowing who is making MPI calls - 
all we see is the proc that we started, and we know someone of that rank is 
running (but we have no way of knowing which of the procs you sub-spawned it 
is).

So the behavior you are seeking only occurred in some earlier release by sheer 
accident. Nor will you find it portable as there is no specification directing 
that behavior.

The behavior I’ve provided is to either deliver the signal to _all_ child 
processes (including grandchildren etc.), or _only_ the immediate child of the 
daemon. It won’t do what you describe - kill the MPI proc underneath the shell, 
but not the shell itself.

What you can eventually do is use PMIx to ask the runtime to selectively 
deliver signals to pid/procs for you. We don’t have that capability implemented 
just yet, I’m afraid.

Meantime, when I get a chance, I can code an option that will record the pid of 
the subproc that calls MPI_Init, and then lets you deliver signals to just 
that proc. No promises as to when that will be done.
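(For reference, the two delivery modes described above map onto plain kill semantics — a sketch with a hypothetical pid:)

    PID=19565              # hypothetical pid of the daemon's immediate child (one dum.sh instance)
    kill -TERM "$PID"      # signal only that immediate child
    kill -TERM -- "-$PID"  # signal the whole process group it heads: every
                           # descendant, shells included, receives the signal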


> On Jun 15, 2017, at 1:37 PM, Ted Sussman <ted.suss...@adina.com> wrote:
> 
> Hello Ralph,
> 
> I am just an Open MPI end user, so I will need to wait for the next official 
> release.
> 
> mpirun --> shell for process 0 --> executable for process 0 --> MPI calls
>        --> shell for process 1 --> executable for process 1 --> MPI calls
> ...
> 
> I guess the question is, should MPI_ABORT kill the executables or the shells? 
>  I naively thought, that, since it is the executables that make the MPI 
> calls, it is the executables that should be aborted by the call to MPI_ABORT. 
>  Since the shells don't make MPI calls, the shells should not be aborted.
> 
> And users might have several layers of shells in between mpirun and the 
> executable.
> 
> So now I will look for the latest version of Open MPI that has the 1.4.3 
> behavior.
> 
> Sincerely,
> 
> Ted Sussman
> 
> On 15 Jun 2017 at 12:31, r...@open-mpi.org wrote:
> 
> >
> > Yeah, things jittered a little there as we debated the “right” behavior. 
> > Generally, when we see that
> > happening it means that a param is required, but somehow we never reached 
> > that point.
> >
> > See if https://github.com/open-mpi/ompi/pull/3704  helps - if so, I can 
> > schedule it for the next 2.x
> > release if the RMs agree to take it
> >
> > Ralph
> >
> > On Jun 15, 2017, at 12:20 PM, Ted Sussman <ted.suss...@adina.com> wrote:
> >
> > Thank you for your comments.
> >
> > Our application relies upon "dum.sh" to clean up after the process 
> > exits, either if the process
> > exits normally, or if the process exits abnormally because of 
> > MPI_ABORT.  If the process
> > group is killed by MPI_ABORT, this clean up will not be performed.  If 
> > exec is used to launch
> > the executable from dum.sh, then dum.sh is terminated by the exec, so 
> > dum.sh cannot
> > perform any clean up.
> >
> > I suppose that other user applications might work similarly, so it 
> > would be good to have an
> > MCA parameter to control the behavior of MPI_ABORT.
> >
> > We could rewrite our shell script that invokes mpirun, so that the 
> > cleanup that is now done
> > by
> > dum.sh is done by the invoking shell script after mpirun exits.  
> > Perhaps this technique is the
> > preferred way to clean up after mpirun is invoked.
> >
> > By the way, I have also tested with Open MPI 1.10.7, and Open MPI 
> > 1.10.7 has different
> > behavior than either Open MPI 1.4.3 or Open MPI 2.1.1.  In this 
> > explanation, it is important to
> > know that the aborttest executable sleeps for 20 sec.
> >
> > When running example 2:
> >
> > 1.4.3: process 1 immediately aborts
> > 1.10.7: process 1 doesn't abort and never stops.
> > 2.1.1 process 1 doesn't abort, but stops after it is finished sleeping
> >
> > Sincerely,
> >
> > Ted Sussman
> >
> > On 15 Jun 2017 at 9:18, r...@open-mpi.org wrote:
> >
> > Here is how the system is working:
> >
> > Master: each process is put into its own process group upon launch. 
> > When we issue a
> > "kill", however, we only issue it to the individual process (instead of 
> > the process group
> > that is headed by that child process). This is probably a bug as I 
> > don´t believe that is
> > what we intended, but

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-15 Thread r...@open-mpi.org
Yeah, things jittered a little there as we debated the “right” behavior. 
Generally, when we see that happening it means that a param is required, but 
somehow we never reached that point.

See if https://github.com/open-mpi/ompi/pull/3704 
<https://github.com/open-mpi/ompi/pull/3704> helps - if so, I can schedule it 
for the next 2.x release if the RMs agree to take it

Ralph

> On Jun 15, 2017, at 12:20 PM, Ted Sussman <ted.suss...@adina.com> wrote:
> 
> Thank you for your comments.
> 
> Our application relies upon "dum.sh" to clean up after the process exits, 
> either if the process 
> exits normally, or if the process exits abnormally because of MPI_ABORT.  If 
> the process 
> group is killed by MPI_ABORT, this clean up will not be performed.  If exec 
> is used to launch 
> the executable from dum.sh, then dum.sh is terminated by the exec, so dum.sh 
> cannot 
> perform any clean up.
> 
> I suppose that other user applications might work similarly, so it would be 
> good to have an 
> MCA parameter to control the behavior of MPI_ABORT.
> 
> We could rewrite our shell script that invokes mpirun, so that the cleanup 
> that is now done by 
> dum.sh is done by the invoking shell script after mpirun exits.  Perhaps this 
> technique is the 
> preferred way to clean up after mpirun is invoked.
> 
> By the way, I have also tested with Open MPI 1.10.7, and Open MPI 1.10.7 has 
> different 
> behavior than either Open MPI 1.4.3 or Open MPI 2.1.1.  In this explanation, 
> it is important to 
> know that the aborttest executable sleeps for 20 sec.
> 
> When running example 2:
> 
> 1.4.3: process 1 immediately aborts
> 1.10.7: process 1 doesn't abort and never stops.
> 2.1.1 process 1 doesn't abort, but stops after it is finished sleeping 
> 
> Sincerely,
> 
> Ted Sussman
> 
> On 15 Jun 2017 at 9:18, r...@open-mpi.org wrote:
> 
>> Here is how the system is working:
>> 
>> Master: each process is put into its own process group upon launch. When we 
>> issue a "kill", however, we only issue it to the individual process (instead 
>> of the process group that is headed by that child process). This is probably 
>> a bug as I don´t believe that is what we intended, but set that aside for 
>> now.
>> 
>> 2.x: each process is put into its own process group upon launch. When we 
>> issue a "kill", we issue it to the process group. Thus, every child proc of 
>> that child proc will receive it. IIRC, this was the intended behavior.
>> 
>> It is rather trivial to make the change (it only involves 3 lines of code), 
>> but I´m not sure of what our intended behavior is supposed to be. Once we 
>> clarify that, it is also trivial to add another MCA param (you can never 
>> have too many!) to allow you to select the other behavior.
>> 
>> 
>>> On Jun 15, 2017, at 5:23 AM, Ted Sussman <ted.suss...@adina.com> wrote:
>>> 
>>> Hello Gilles,
>>> 
>>> Thank you for your quick answer.  I confirm that if exec is used, both 
>>> processes immediately 
>>> abort.
>>> 
>>> Now suppose that the line
>>> 
>>> echo "After aborttest: OMPI_COMM_WORLD_RANK="$OMPI_COMM_WORLD_RANK
>>> 
>>> is added to the end of dum.sh.
>>> 
>>> If Example 2 is run with Open MPI 1.4.3, the output is
>>> 
>>> After aborttest: OMPI_COMM_WORLD_RANK=0
>>> 
>>> which shows that the shell script for the process with rank 0 continues 
>>> after the abort,
>>> but that the shell script for the process with rank 1 does not continue 
>>> after the abort.
>>> 
>>> If Example 2 is run with Open MPI 2.1.1, with exec used to invoke 
>>> aborttest02.exe, then 
>>> there is no such output, which shows that both shell scripts do not 
>>> continue after the abort.
>>> 
>>> I prefer the Open MPI 1.4.3 behavior because our original application 
>>> depends upon the 
>>> Open MPI 1.4.3 behavior.  (Our original application will also work if both 
>>> executables are 
>>> aborted, and if both shell scripts continue after the abort.)
>>> 
>>> It might be too much to expect, but is there a way to recover the Open MPI 
>>> 1.4.3 behavior 
>>> using Open MPI 2.1.1?  
>>> 
>>> Sincerely,
>>> 
>>> Ted Sussman
>>> 
>>> 
>>> On 15 Jun 2017 at 9:50, Gilles Gouaillardet wrote:
>>> 
>>>> Ted,
>>>> 
>>>> 
>>>> fwiw, the 'master' branch has the behavior you expec

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-15 Thread r...@open-mpi.org
Here is how the system is working:

Master: each process is put into its own process group upon launch. When we 
issue a “kill”, however, we only issue it to the individual process (instead of 
the process group that is headed by that child process). This is probably a bug 
as I don’t believe that is what we intended, but set that aside for now.

2.x: each process is put into its own process group upon launch. When we issue 
a “kill”, we issue it to the process group. Thus, every child proc of that 
child proc will receive it. IIRC, this was the intended behavior.

It is rather trivial to make the change (it only involves 3 lines of code), but 
I’m not sure of what our intended behavior is supposed to be. Once we clarify 
that, it is also trivial to add another MCA param (you can never have too 
many!) to allow you to select the other behavior.
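(Pulling the two workarounds from this thread together, a sketch with hypothetical paths — exec in the per-rank wrapper so the abort reaches the MPI binary, and the cleanup moved into the script that invokes mpirun:)

    # --- dum.sh (per-rank wrapper) ---
    #!/bin/sh
    # exec replaces the shell, so MPI_ABORT's signals hit the MPI binary directly;
    # nothing written after this line would ever run.
    exec /path/to/aborttest02.exe

    # --- script that invokes mpirun ---
    #!/bin/sh
    mpirun -np 2 ./dum.sh
    rc=$?
    /path/to/cleanup.sh    # hypothetical cleanup step; runs whether or not the job aborted
    exit $rc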


> On Jun 15, 2017, at 5:23 AM, Ted Sussman  wrote:
> 
> Hello Gilles,
> 
> Thank you for your quick answer.  I confirm that if exec is used, both 
> processes immediately 
> abort.
> 
> Now suppose that the line
> 
> echo "After aborttest: OMPI_COMM_WORLD_RANK="$OMPI_COMM_WORLD_RANK
> 
> is added to the end of dum.sh.
> 
> If Example 2 is run with Open MPI 1.4.3, the output is
> 
> After aborttest: OMPI_COMM_WORLD_RANK=0
> 
> which shows that the shell script for the process with rank 0 continues after 
> the abort,
> but that the shell script for the process with rank 1 does not continue after 
> the abort.
> 
> If Example 2 is run with Open MPI 2.1.1, with exec used to invoke 
> aborttest02.exe, then 
> there is no such output, which shows that both shell scripts do not continue 
> after the abort.
> 
> I prefer the Open MPI 1.4.3 behavior because our original application depends 
> upon the 
> Open MPI 1.4.3 behavior.  (Our original application will also work if both 
> executables are 
> aborted, and if both shell scripts continue after the abort.)
> 
> It might be too much to expect, but is there a way to recover the Open MPI 
> 1.4.3 behavior 
> using Open MPI 2.1.1?  
> 
> Sincerely,
> 
> Ted Sussman
> 
> 
> On 15 Jun 2017 at 9:50, Gilles Gouaillardet wrote:
> 
>> Ted,
>> 
>> 
>> fwiw, the 'master' branch has the behavior you expect.
>> 
>> 
>> meanwhile, you can simple edit your 'dum.sh' script and replace
>> 
>> /home/buildadina/src/aborttest02/aborttest02.exe
>> 
>> with
>> 
>> exec /home/buildadina/src/aborttest02/aborttest02.exe
>> 
>> 
>> Cheers,
>> 
>> 
>> Gilles
>> 
>> 
>> On 6/15/2017 3:01 AM, Ted Sussman wrote:
>>> Hello,
>>> 
>>> My question concerns MPI_ABORT, indirect execution of executables by mpirun 
>>> and Open
>>> MPI 2.1.1.  When mpirun runs executables directly, MPI_ABORT works as 
>>> expected, but
>>> when mpirun runs executables indirectly, MPI_ABORT does not work as 
>>> expected.
>>> 
>>> If Open MPI 1.4.3 is used instead of Open MPI 2.1.1, MPI_ABORT works as 
>>> expected in all
>>> cases.
>>> 
>>> The examples given below have been simplified as far as possible to show 
>>> the issues.
>>> 
>>> ---
>>> 
>>> Example 1
>>> 
>>> Consider an MPI job run in the following way:
>>> 
>>> mpirun ... -app addmpw1
>>> 
>>> where the appfile addmpw1 lists two executables:
>>> 
>>> -n 1 -host gulftown ... aborttest02.exe
>>> -n 1 -host gulftown ... aborttest02.exe
>>> 
>>> The two executables are executed on the local node gulftown.  aborttest02 
>>> calls MPI_ABORT
>>> for rank 0, then sleeps.
>>> 
>>> The above MPI job runs as expected.  Both processes immediately abort when 
>>> rank 0 calls
>>> MPI_ABORT.
>>> 
>>> ---
>>> 
>>> Example 2
>>> 
>>> Now change the above example as follows:
>>> 
>>> mpirun ... -app addmpw2
>>> 
>>> where the appfile addmpw2 lists shell scripts:
>>> 
>>> -n 1 -host gulftown ... dum.sh
>>> -n 1 -host gulftown ... dum.sh
>>> 
>>> dum.sh invokes aborttest02.exe.  So aborttest02.exe is executed indirectly 
>>> by mpirun.
>>> 
>>> In this case, the MPI job only aborts process 0 when rank 0 calls 
>>> MPI_ABORT.  Process 1
>>> continues to run.  This behavior is unexpected.
>>> 
>>> 
>>> 
>>> I have attached all files to this E-mail.  Since there are absolute 
>>> pathnames in the files, to
>>> reproduce my findings, you will need to update the pathnames in the 
>>> appfiles and shell
>>> scripts.  To run example 1,
>>> 
>>> sh run1.sh
>>> 
>>> and to run example 2,
>>> 
>>> sh run2.sh
>>> 
>>> ---
>>> 
>>> I have tested these examples with Open MPI 1.4.3 and 2.0.3.  In Open MPI 
>>> 1.4.3, both
>>> examples work as expected.  Open MPI 2.0.3 has the same behavior as Open 
>>> MPI 2.1.1.
>>> 
>>> ---
>>> 
>>> I would prefer that Open MPI 2.1.1 aborts both processes, even when the 
>>> executables are
>>> invoked indirectly by mpirun.  If there is an MCA setting that is needed to 
>>> make Open MPI
>>> 2.1.1 abort both processes, please let me know.
>>> 
>>> 
>>> Sincerely,
>>> 
>>> Theodore Sussman
>>> 
>>> 
>>> The following section of this message contains a file attachment
>>> prepared 

Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" error message when using Open MPI 1.6.2

2017-06-09 Thread r...@open-mpi.org
Well, of course it still needs to execute the orteds on those nodes - but that 
wasn’t what you asked. One way or another, the orteds must be available on the 
compute nodes.

> On Jun 9, 2017, at 8:53 AM, Arham Amouie <erham...@yahoo.com> wrote:
> 
> Hi. I had tried this. It still looks for ORTE file(s) on the hard disks of 
> compute nodes.
> 
> Now I know that I can install Open MPI in a shared directory. But is it 
> possible to make executable files that don't look for any Open MPI's files on 
> disk?
> 
> Arham
> 
> 
> From: "r...@open-mpi.org" <r...@open-mpi.org>
> To: Arham Amouie <erham...@yahoo.com>; Open MPI Users 
> <users@lists.open-mpi.org> 
> Sent: Friday, June 9, 2017 5:40 PM
> Subject: Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" 
> error message when using Open MPI 1.6.2
> 
> Sure - just configure OMPI with “--enable-static --disable-shared”
> 
>> On Jun 9, 2017, at 5:50 AM, Arham Amouie via users <users@lists.open-mpi.org 
>> <mailto:users@lists.open-mpi.org>> wrote:
>> 
>> Thank you very much. Could you please answer another somewhat related 
>> question? I'd like to know if ORTE could be linked statically like a library 
>> in order to have a completely stand-alone executable file. As you may have 
>> noticed I don't have a good knowledge of how Open MPI works.
>> 
>> Thanks in advance,
>> 
>> Arham
>> 
>> 
>> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com 
>> <mailto:gilles.gouaillar...@gmail.com>>
>> To: Arham Amouie <erham...@yahoo.com <mailto:erham...@yahoo.com>>; Open MPI 
>> Users <users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>> 
>> Sent: Thursday, June 8, 2017 2:41 PM
>> Subject: Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" 
>> error message when using Open MPI 1.6.2
>> 
>> MPI_Comm_create_group was not available in Open MPI v1.6.
>> so unless you are willing to create your own subroutine in your
>> application, you'd rather upgrade to Open MPI v2
>> 
>> i recommend you configure Open MPI with
>> --disable-dlopen --prefix=<directory available on both frontend and compute nodes>
>> 
>> unless you plan to scale on thousands of nodes, you should be just
>> fine with that.
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> 
>> On Thu, Jun 8, 2017 at 6:58 PM, Arham Amouie via users
>> <users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>> wrote:
>> > Hello. Open MPI 1.6.2 is installed on the cluster I'm using. At the moment 
>> > I
>> > can't upgrade Open MPI on the computing nodes of this system. My C code
>> > contains many calls to MPI functions. When I try to 'make' this code on the
>> > cluster, the only error that I get is "undefined reference to
>> > `MPI_Comm_create_group'".
>> >
>> > I'm able to install a newer version (like 2.1.1) of Open MPI only on the
>> > frontend of this cluster. Using newer version, the code is compiled and
>> > linked successfully. But in this case I face problem in running the 
>> > program,
>> > since the newer version of Open MPI is not installed on the computing 
>> > nodes.
>> >
>> > Is there any way that I can compile and link the code using Open MPI 1.6.2?
>> >
>> > Thanks,
>> > Arham Amouei
>> 
>> >
>> > ___
>> > users mailing list
>> > users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> > <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
>> 
>> 
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> 
> 


Re: [OMPI users] Node failure handling

2017-06-09 Thread r...@open-mpi.org
It has been awhile since I tested it, but I believe the --enable-recovery 
option might do what you want.
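(A sketch of how that might be tried — the option spelling and MCA parameter name should be checked against "mpirun --help" for your release, and the application name is a placeholder:)

    mpirun --enable-recovery -np 64 ./master_slave_app

    # assumed MCA-parameter form of the same request:
    mpirun --mca orte_enable_recovery 1 -np 64 ./master_slave_app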

> On Jun 8, 2017, at 6:17 AM, Tim Burgess  wrote:
> 
> Hi!
> 
> So I know from searching the archive that this is a repeated topic of
> discussion here, and apologies for that, but since it's been a year or
> so I thought I'd double-check whether anything has changed before
> really starting to tear my hair out too much.
> 
> Is there a combination of MCA parameters or similar that will prevent
> ORTE from aborting a job when it detects a node failure?  This is
> using the tcp btl, under slurm.
> 
> The application, not written by us and too complicated to re-engineer
> at short notice, has a strictly master-slave communication pattern.
> The master never blocks on communication from individual slaves, and
> apparently can itself detect slaves that have silently disappeared and
> reissue the work to those remaining.  So from an application
> standpoint I believe we should be able to handle this.  However, in
> all my testing so far the job is aborted as soon as the runtime system
> figures out what is going on.
> 
> If not, do any users know of another MPI implementation that might
> work for this use case?  As far as I can tell, FT-MPI has been pretty
> quiet the last couple of years?
> 
> Thanks in advance,
> 
> Tim
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users



Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" error message when using Open MPI 1.6.2

2017-06-09 Thread r...@open-mpi.org
Sure - just configure OMPI with “--enable-static --disable-shared”
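(A sketch of that build, plus a quick check that the resulting binary no longer needs the shared Open MPI libraries — prefix and file names are placeholders:)

    ./configure --prefix=/opt/openmpi-2.1.1-static --enable-static --disable-shared
    make -j8 && make install

    /opt/openmpi-2.1.1-static/bin/mpicc -o myapp myapp.c
    ldd ./myapp | grep -Ei 'libmpi|libopen-rte|libopen-pal'   # expect no matches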

> On Jun 9, 2017, at 5:50 AM, Arham Amouie via users  
> wrote:
> 
> Thank you very much. Could you please answer another somewhat related 
> question? I'd like to know if ORTE could be linked statically like a library 
> in order to have a completely stand-alone executable file. As you may have 
> noticed I don't have a good knowledge of how Open MPI works.
> 
> Thanks in advance,
> 
> Arham
> 
> 
> From: Gilles Gouaillardet 
> To: Arham Amouie ; Open MPI Users 
>  
> Sent: Thursday, June 8, 2017 2:41 PM
> Subject: Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" 
> error message when using Open MPI 1.6.2
> 
> MPI_Comm_create_group was not available in Open MPI v1.6.
> so unless you are willing to create your own subroutine in your
> application, you'd rather upgrade to Open MPI v2
> 
> i recommend you configure Open MPI with
> --disable-dlopen --prefix=<directory available on both frontend and compute nodes>
> 
> unless you plan to scale on thousands of nodes, you should be just
> fine with that.
> 
> Cheers,
> 
> Gilles
> 
> 
> On Thu, Jun 8, 2017 at 6:58 PM, Arham Amouie via users
> > wrote:
> > Hello. Open MPI 1.6.2 is installed on the cluster I'm using. At the moment I
> > can't upgrade Open MPI on the computing nodes of this system. My C code
> > contains many calls to MPI functions. When I try to 'make' this code on the
> > cluster, the only error that I get is "undefined reference to
> > `MPI_Comm_create_group'".
> >
> > I'm able to install a newer version (like 2.1.1) of Open MPI only on the
> > frontend of this cluster. Using newer version, the code is compiled and
> > linked successfully. But in this case I face problem in running the program,
> > since the newer version of Open MPI is not installed on the computing nodes.
> >
> > Is there any way that I can compile and link the code using Open MPI 1.6.2?
> >
> > Thanks,
> > Arham Amouei
> 
> >
> > ___
> > users mailing list
> > users@lists.open-mpi.org 
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> > 
> 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Issue in installing PMIx

2017-06-07 Thread r...@open-mpi.org
I guess I should also have clarified - I tested with PMIx v1.1.5 as that is the 
latest in the 1.1 series.

> On Jun 7, 2017, at 8:23 PM, r...@open-mpi.org wrote:
> 
> It built fine for me - on your configure path-to-pmix, what did you tell it? 
> It wants the path supplied as <prefix> when you configured pmix itself
> 
>> On Jun 7, 2017, at 2:50 PM, Marc Cooper <marccooper2...@gmail.com 
>> <mailto:marccooper2...@gmail.com>> wrote:
>> 
>> OpenMPI 2.1.1 and PMIx v1.1
>> 
>> On 7 June 2017 at 11:54, r...@open-mpi.org <mailto:r...@open-mpi.org> 
>> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
>> Ummm...what version of OMPI and PMIx are you talking about?
>> 
>> > On Jun 6, 2017, at 2:20 PM, Marc Cooper <marccooper2...@gmail.com 
>> > <mailto:marccooper2...@gmail.com>> wrote:
>> >
>> > Hi,
>> >
>> > I've been trying to install PMIx external to OpenMPI, with separate 
>> > libevent and hwloc. My configuration script is
>> >
>> >  ./configure --prefix=<install-dir> --with-platform=optimized
>> > --with-pmix=<pmix-install-dir> --with-libevent=<libevent-install-dir>
>> > --with-hwloc=<hwloc-install-dir>.
>> >
>> > This is done successfully. When I 'make' it, I get the following error
>> >
>> > Making all in mca/pmix/external
>> >   CC   mca_pmix_external_la-pmix_ext_component.lo
>> > In file included from pmix_ext_component.c:24:
>> > ./pmix_ext.h:31:10: fatal error: 'pmix/pmix_common.h' file not found
>> > #include "pmix/pmix_common.h"
>> >  ^
>> > 1 error generated.
>> > make[2]: *** [mca_pmix_external_la-pmix_ext_component.lo] Error 1
>> > make[1]: *** [all-recursive] Error 1
>> > make: *** [all-recursive] Error 1
>> >
>> > When I have given the path to the external PMIx, why is this error popping 
>> > up. Appreciate any help in resolving it.
>> >
>> > Cheers,
>> > Marc
>> > ___
>> > users mailing list
>> > users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> > <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 


Re: [OMPI users] Issue in installing PMIx

2017-06-07 Thread r...@open-mpi.org
It built fine for me - on your configure path-to-pmix, what did you tell it? It 
wants the path supplied as <prefix> when you configured pmix itself
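(In other words — a sketch with placeholder paths — the value handed to --with-pmix should be the same prefix PMIx itself was installed into:)

    # build the external PMIx first:
    cd pmix-1.1.5
    ./configure --prefix=/opt/pmix-1.1.5
    make -j8 && make install
    find /opt/pmix-1.1.5/include -name pmix_common.h   # the header OMPI's external component wants

    # then hand that same prefix to Open MPI:
    cd ../openmpi-2.1.1
    ./configure --prefix=/opt/openmpi-2.1.1 --with-pmix=/opt/pmix-1.1.5 \
                --with-libevent=/opt/libevent --with-hwloc=/opt/hwloc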

> On Jun 7, 2017, at 2:50 PM, Marc Cooper <marccooper2...@gmail.com> wrote:
> 
> OpenMPI 2.1.1 and PMIx v1.1
> 
> On 7 June 2017 at 11:54, r...@open-mpi.org <mailto:r...@open-mpi.org> 
> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
> Ummm...what version of OMPI and PMIx are you talking about?
> 
> > On Jun 6, 2017, at 2:20 PM, Marc Cooper <marccooper2...@gmail.com 
> > <mailto:marccooper2...@gmail.com>> wrote:
> >
> > Hi,
> >
> > I've been trying to install PMIx external to OpenMPI, with separate 
> > libevent and hwloc. My configuration script is
> >
> >  ./configure --prefix=<install-dir> --with-platform=optimized
> > --with-pmix=<pmix-install-dir> --with-libevent=<libevent-install-dir>
> > --with-hwloc=<hwloc-install-dir>.
> >
> > This is done successfully. When I 'make' it, I get the following error
> >
> > Making all in mca/pmix/external
> >   CC   mca_pmix_external_la-pmix_ext_component.lo
> > In file included from pmix_ext_component.c:24:
> > ./pmix_ext.h:31:10: fatal error: 'pmix/pmix_common.h' file not found
> > #include "pmix/pmix_common.h"
> >  ^
> > 1 error generated.
> > make[2]: *** [mca_pmix_external_la-pmix_ext_component.lo] Error 1
> > make[1]: *** [all-recursive] Error 1
> > make: *** [all-recursive] Error 1
> >
> > When I have given the path to the external PMIx, why is this error popping 
> > up. Appreciate any help in resolving it.
> >
> > Cheers,
> > Marc
> > ___
> > users mailing list
> > users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> > <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Issue in installing PMIx

2017-06-07 Thread r...@open-mpi.org
Ummm...what version of OMPI and PMIx are you talking about?

> On Jun 6, 2017, at 2:20 PM, Marc Cooper  wrote:
> 
> Hi,
> 
> I've been trying to install PMIx external to OpenMPI, with separate libevent 
> and hwloc. My configuration script is 
> 
>  ./configure --prefix=<install-dir> --with-platform=optimized
> --with-pmix=<pmix-install-dir> --with-libevent=<libevent-install-dir>
> --with-hwloc=<hwloc-install-dir>.
> 
> This is done successfully. When I 'make' it, I get the following error
> 
> Making all in mca/pmix/external
>   CC   mca_pmix_external_la-pmix_ext_component.lo
> In file included from pmix_ext_component.c:24:
> ./pmix_ext.h:31:10: fatal error: 'pmix/pmix_common.h' file not found
> #include "pmix/pmix_common.h"
>  ^
> 1 error generated.
> make[2]: *** [mca_pmix_external_la-pmix_ext_component.lo] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all-recursive] Error 1
> 
> When I have given the path to the external PMIx, why is this error popping 
> up. Appreciate any help in resolving it.
> 
> Cheers,
> Marc
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users



Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread r...@open-mpi.org
Until the fixes pending in the big ORTE update PR are committed, I suggest not 
wasting time chasing this down. I tested the “patched” version of the 3.x 
branch, and it works just fine.


> On May 30, 2017, at 7:43 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> 
> Ralph,
> 
> 
> the issue Siegmar initially reported was
> 
> loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi
> 
> 
> per what you wrote, this should be equivalent to
> 
> loki hello_1 111 mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi
> 
> and this is what i initially wanted to double check (but i made a typo in my 
> reply)
> 
> 
> anyway, the logs Siegmar posted indicate the two commands produce the same 
> output
> 
> --
> There are not enough slots available in the system to satisfy the 3 slots
> that were requested by the application:
>  hello_1_mpi
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> 
> 
> to me, this is incorrect since the command line made 3 slots available.
> also, i am unable to reproduce any of these issues :-(
> 
> 
> 
> Siegmar,
> 
> can you please post your configure command line, and try these commands from 
> loki
> 
> mpiexec -np 3 --host loki:2,exin --mca plm_base_verbose 5 hostname
> mpiexec -np 1 --host exin --mca plm_base_verbose 5 hostname
> mpiexec -np 1 --host exin ldd ./hello_1_mpi
> 
> if Open MPI is not installed on a shared filesystem (NFS for example), please 
> also double check
> both install were built from the same source and with the same options
> 
> 
> Cheers,
> 
> Gilles
> On 5/30/2017 10:20 PM, r...@open-mpi.org wrote:
>> This behavior is as-expected. When you specify "-host foo,bar”, you have 
>> told us to assign one slot to each of those nodes. Thus, running 3 procs 
>> exceeds the number of slots you assigned.
>> 
>> You can tell it to set the #slots to the #cores it discovers on the node by 
>> using “-host foo:*,bar:*”
>> 
>> I cannot replicate your behavior of "-np 3 -host foo:2,bar:3” running more 
>> than 3 procs
>> 
>> 
>>> On May 30, 2017, at 5:24 AM, Siegmar Gross 
>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>> 
>>> Hi Gilles,
>>> 
>>>> what if you ?
>>>> mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi
>>> I need as many slots as processes so that I use "-np 2".
>>> "mpiexec --host loki,exin -np 2 hello_1_mpi" works as well. The command
>>> breaks, if I use at least "-np 3" and distribute the processes across at
>>> least two machines.
>>> 
>>> loki hello_1 118 mpiexec --host loki:1,exin:1 -np 2 hello_1_mpi
>>> Process 0 of 2 running on loki
>>> Process 1 of 2 running on exin
>>> Now 1 slave tasks are sending greetings.
>>> Greetings from task 1:
>>>  message type:3
>>>  msg length:  131 characters
>>>  message:
>>>hostname:  exin
>>>operating system:  Linux
>>>release:   4.4.49-92.11-default
>>>processor: x86_64
>>> loki hello_1 119
>>> 
>>> 
>>> 
>>>> are loki and exin different ? (os, sockets, core)
>>> Yes, loki is a real machine and exin is a virtual one. "exin" uses a newer
>>> kernel.
>>> 
>>> loki fd1026 108 uname -a
>>> Linux loki 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016 (2d3e9d4) 
>>> x86_64 x86_64 x86_64 GNU/Linux
>>> 
>>> loki fd1026 109 ssh exin uname -a
>>> Linux exin 4.4.49-92.11-default #1 SMP Fri Feb 17 08:29:30 UTC 2017 
>>> (8f9478a) x86_64 x86_64 x86_64 GNU/Linux
>>> loki fd1026 110
>>> 
>>> The number of sockets and cores is identical, but the processor types are
>>> different as you can see at the end of my previous email. "loki" uses two
>>> "Intel(R) Xeon(R) CPU E5-2620 v3" processors and "exin" two "Intel Core
>>> Processor (Haswell, no TSX)" from QEMU. I can provide a pdf file with both
>>> topologies (89 K) if you are interested in the output from lstopo. I've
>>> added some runs. Most interesting in my opinion are the last two
>>> "mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi" and
>>> "mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi".
>>> Why does mpiexec create fiv

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread r...@open-mpi.org
This behavior is as-expected. When you specify "-host foo,bar”, you have told 
us to assign one slot to each of those nodes. Thus, running 3 procs exceeds the 
number of slots you assigned.

You can tell it to set the #slots to the #cores it discovers on the node by 
using “-host foo:*,bar:*”

I cannot replicate your behavior of "-np 3 -host foo:2,bar:3” running more than 
3 procs
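(Summarizing the slot rules as command lines, using the host names from this thread:)

    # one implicit slot per listed host: 2 slots total, so -np 3 must fail
    mpiexec -np 3 --host loki,exin hello_1_mpi

    # explicit slot counts: 2 + 1 = 3 slots, enough for -np 3
    mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi

    # let each host contribute one slot per discovered core
    mpiexec -np 3 --host "loki:*,exin:*" hello_1_mpi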


> On May 30, 2017, at 5:24 AM, Siegmar Gross 
>  wrote:
> 
> Hi Gilles,
> 
>> what if you ?
>> mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi
> 
> I need as many slots as processes so that I use "-np 2".
> "mpiexec --host loki,exin -np 2 hello_1_mpi" works as well. The command
> breaks, if I use at least "-np 3" and distribute the processes across at
> least two machines.
> 
> loki hello_1 118 mpiexec --host loki:1,exin:1 -np 2 hello_1_mpi
> Process 0 of 2 running on loki
> Process 1 of 2 running on exin
> Now 1 slave tasks are sending greetings.
> Greetings from task 1:
>  message type:3
>  msg length:  131 characters
>  message:
>hostname:  exin
>operating system:  Linux
>release:   4.4.49-92.11-default
>processor: x86_64
> loki hello_1 119
> 
> 
> 
>> are loki and exin different ? (os, sockets, core)
> 
> Yes, loki is a real machine and exin is a virtual one. "exin" uses a newer
> kernel.
> 
> loki fd1026 108 uname -a
> Linux loki 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016 (2d3e9d4) 
> x86_64 x86_64 x86_64 GNU/Linux
> 
> loki fd1026 109 ssh exin uname -a
> Linux exin 4.4.49-92.11-default #1 SMP Fri Feb 17 08:29:30 UTC 2017 (8f9478a) 
> x86_64 x86_64 x86_64 GNU/Linux
> loki fd1026 110
> 
> The number of sockets and cores is identical, but the processor types are
> different as you can see at the end of my previous email. "loki" uses two
> "Intel(R) Xeon(R) CPU E5-2620 v3" processors and "exin" two "Intel Core
> Processor (Haswell, no TSX)" from QEMU. I can provide a pdf file with both
> topologies (89 K) if you are interested in the output from lstopo. I've
> added some runs. Most interesting in my opinion are the last two
> "mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi" and
> "mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi".
> Why does mpiexec create five processes although I've asked for only three
> processes? Why do I have to break the program with <Ctrl-c> for the first
> of the above commands?
> 
> 
> 
> loki hello_1 110 mpiexec --host loki:2,exin:1 -np 3 hello_1_mpi
> --
> There are not enough slots available in the system to satisfy the 3 slots
> that were requested by the application:
>  hello_1_mpi
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> 
> 
> 
> loki hello_1 111 mpiexec --host exin:3 -np 3 hello_1_mpi
> Process 0 of 3 running on exin
> Process 1 of 3 running on exin
> Process 2 of 3 running on exin
> ...
> 
> 
> 
> loki hello_1 115 mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi
> Process 1 of 3 running on loki
> Process 0 of 3 running on loki
> Process 2 of 3 running on loki
> ...
> 
> Process 0 of 3 running on exin
> Process 1 of 3 running on exin
> [exin][[52173,1],1][../../../../../openmpi-v3.x-201705250239-d5200ea/opal/mca/btl/tcp/btl_tcp_endpoint.c:794:mca_btl_tcp_endpoint_complete_connect]
>  connect() to 193.xxx.xxx.xxx failed: Connection refused (111)
> 
> ^Cloki hello_1 116
> 
> 
> 
> 
> loki hello_1 116 mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi
> Process 0 of 3 running on loki
> Process 2 of 3 running on loki
> Process 1 of 3 running on loki
> ...
> Process 1 of 3 running on exin
> Process 0 of 3 running on exin
> [exin][[51638,1],1][../../../../../openmpi-v3.x-201705250239-d5200ea/opal/mca/btl/tcp/btl_tcp_endpoint.c:590:mca_btl_tcp_endpoint_recv_blocking]
>  recv(16, 0/8) failed: Connection reset by peer (104)
> [exin:31909] 
> ../../../../../openmpi-v3.x-201705250239-d5200ea/ompi/mca/pml/ob1/pml_ob1_sendreq.c:191
>  FATAL
> loki hello_1 117
> 
> 
> Do you need anything else?
> 
> 
> Kind regards and thank you very much for your help
> 
> Siegmar
> 
> 
> 
>> Cheers,
>> Gilles
>> - Original Message -
>>> Hi,
>>> 
>>> I have installed openmpi-v3.x-201705250239-d5200ea on my "SUSE Linux
>>> Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-7.1.0.
>>> Depending on the machine that I use to start my processes, I have
>>> a problem with "--host" for versions "v3.x" and "master", while
>>> everything works as expected with earlier versions.
>>> 
>>> 
>>> loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi
>>> --
>> 
>>> There are not enough slots available in the system to satisfy the 3
>> slots
>>> that were requested by the application:
>>>hello_1_mpi
>>> 
>>> Either request fewer slots for your 

Re: [OMPI users] Closing pipes associated with repeated MPI comm spawns

2017-05-29 Thread r...@open-mpi.org
It looks like v3.0 is clean - probably best to update when it is released. We 
know there are issues with dynamics in the 2.x series, and put a special effort 
to eliminate them in 3.x.
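(Until an upgrade is possible, a hedged way to watch whether mpirun/mpiexec really is accumulating pipe descriptors across spawn/disconnect cycles — the pid below is a placeholder, and /proc is the standard Linux layout:)

    MPIEXEC_PID=12345    # pid of the mpirun/mpiexec process for this job
    while sleep 1; do
        echo "$(date +%T) open fds: $(ls /proc/$MPIEXEC_PID/fd | wc -l)"
    done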


> On Apr 28, 2017, at 8:48 AM, Austin Herrema <aherr...@iastate.edu> wrote:
> 
> OMPI version 2.1.0. Should have clarified that initially, sorry. Running on 
> Ubuntu 12.04.5. 
> 
> On Fri, Apr 28, 2017 at 10:29 AM, r...@open-mpi.org 
> <mailto:r...@open-mpi.org> <r...@open-mpi.org <mailto:r...@open-mpi.org>> 
> wrote:
> What version of OMPI are you using?
> 
>> On Apr 28, 2017, at 8:26 AM, Austin Herrema <aherr...@iastate.edu 
>> <mailto:aherr...@iastate.edu>> wrote:
>> 
>> Hello all,
>> 
>> I am using mpi4py in an optimization code that iteratively spawns an MPI 
>> analysis code (fortran-based) via "MPI.COMM_SELF.Spawn" (I gather that this 
>> is not an ideal use for comm spawn but I don't have too many other options 
>> at this juncture). I am calling "child_comm.Disconnect()" on the parent side 
>> and "call MPI_COMM_DISCONNECT(parent, ier)" on the child side.
>> 
>> After a dozen or so iterations, it would appear I am running up against the 
>> system limit for number of open pipes:
>> 
>> [affogato:05553] [[63653,0],0] ORTE_ERROR_LOG: The system limit on number of 
>> pipes a process can open was reached in file odls_default_module.c at line 
>> 689
>> [affogato:05553] [[63653,0],0] usock_peer_send_blocking: send() to socket 
>> 998 failed: Broken pipe (32)
>> [affogato:05553] [[63653,0],0] ORTE_ERROR_LOG: Unreachable in file 
>> oob_usock_connection.c at line 316
>> 
>> From this Stackoverflow post 
>> <http://stackoverflow.com/questions/20698712/mpi4py-close-mpi-spawn> I have 
>> surmised that the opened pipes remain open on mpiexec despite no longer 
>> being used. I know I can increase system limits, but this will only get me 
>> so far as I intend to perform hundreds if not thousands of iterations. Is 
>> there a way to dynamically close the unused pipes on either the python or 
>> fortran side? Also, I've seen the "mca parameter" mentioned in regards to 
>> this topic. I don't fully understand what that is, but will setting it have 
>> an effect on this issue?
>> 
>> Thank you,
>> Austin
>> 
>> -- 
>> Austin Herrema
>> PhD Student | Graduate Research Assistant | Iowa State University
>> Wind Energy Science, Engineering, and Policy | Mechanical Engineering
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> 
> 
> -- 
> Austin Herrema
> PhD Student | Graduate Research Assistant | Iowa State University
> Wind Energy Science, Engineering, and Policy | Mechanical Engineering
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] How to launch ompi-server?

2017-05-27 Thread r...@open-mpi.org
This is now fixed in the master and will make it for v3.0, which is planned for 
release in the near future
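(For anyone retrying this once v3.0 is out, a sketch of the rendezvous-file form, which avoids copying the URI by hand — the shared path is a placeholder:)

    # on the machine hosting the rendezvous point:
    ompi-server --no-daemonize -r /shared/ompi-server.uri &

    # on each machine launching a job that calls MPI_Publish_name / MPI_Lookup_name:
    mpirun -np 1 --ompi-server file:/shared/ompi-server.uri ./server
    mpirun -np 1 --ompi-server file:/shared/ompi-server.uri ./client "<port name printed by server>"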

> On Mar 19, 2017, at 1:40 PM, Adam Sylvester <op8...@gmail.com 
> <mailto:op8...@gmail.com>> wrote:
> 
> I did a little more testing in case this helps... if I run ompi-server on the 
> same host as the one I call MPI_Publish_name() on, it does successfully 
> connect.  But when I run it on a separate machine (which is on the same 
> network and accessible via TCP), I get the issue above where it hangs.
> 
> Thanks for taking a look - if you'd like me to open a bug report for this one 
> somewhere, just let me know.
> 
> -Adam
> 
> On Sun, Mar 19, 2017 at 2:46 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> 
> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
> Well, your initial usage looks correct - you don’t launch ompi-server via 
> mpirun. However, it sounds like there is probably a bug somewhere if it hangs 
> as you describe.
> 
> Scratching my head, I can only recall less than a handful of people ever 
> using these MPI functions to cross-connect jobs, so it does tend to fall into 
> disrepair. As I said, I’ll try to repair it, at least for 3.0.
> 
> 
>> On Mar 19, 2017, at 4:37 AM, Adam Sylvester <op8...@gmail.com 
>> <mailto:op8...@gmail.com>> wrote:
>> 
>> I am trying to use ompi-server with Open MPI 1.10.6.  I'm wondering if I 
>> should run this with or without the mpirun command.  If I run this:
>> 
>> ompi-server --no-daemonize -r +
>> 
>> It prints something such as 959315968.0;tcp://172.31.3.57:45743 
>> <http://172.31.3.57:45743/> to stdout but I have thus far been unable to 
>> connect to it.  That is, in another application on another machine which is 
>> on the same network as the ompi-server machine, I try
>> 
>> MPI_Info info;
>> MPI_Info_create();
>> MPI_Info_set(info, "ompi_global_scope", "true");
>> 
>> char myport[MPI_MAX_PORT_NAME];
>> MPI_Open_port(MPI_INFO_NULL, myport);
>> MPI_Publish_name("adam-server", info, myport);
>> 
>> But the MPI_Publish_name() function hangs forever when I run it like
>> 
>> mpirun -np 1 --ompi-server "959315968.0;tcp://172.31.3.57:45743 
>> <http://172.31.3.57:45743/>" server
>> 
>> Blog posts are inconsistent as to if you should run ompi-server with mpirun 
>> or not so I tried using it but this seg faults:
>> 
>> mpirun -np 1 ompi-server --no-daemonize -r +
>> [ip-172-31-5-39:14785] *** Process received signal ***
>> [ip-172-31-5-39:14785] Signal: Segmentation fault (11)
>> [ip-172-31-5-39:14785] Signal code: Address not mapped (1)
>> [ip-172-31-5-39:14785] Failing at address: 0x6e0
>> [ip-172-31-5-39:14785] [ 0] /lib64/libpthread.so.0(+0xf370)[0x7f895d7a5370]
>> [ip-172-31-5-39:14785] [ 1] 
>> /usr/local/lib/libopen-pal.so.13(opal_hwloc191_hwloc_get_cpubind+0x9)[0x7f895e336839]
>> [ip-172-31-5-39:14785] [ 2] 
>> /usr/local/lib/libopen-rte.so.12(orte_ess_base_proc_binding+0x17a)[0x7f895e5d8fca]
>> [ip-172-31-5-39:14785] [ 3] 
>> /usr/local/lib/openmpi/mca_ess_env.so(+0x15dd)[0x7f895cdcd5dd]
>> [ip-172-31-5-39:14785] [ 4] 
>> /usr/local/lib/libopen-rte.so.12(orte_init+0x168)[0x7f895e5b5368]
>> [ip-172-31-5-39:14785] [ 5] ompi-server[0x4014d4]
>> [ip-172-31-5-39:14785] [ 6] 
>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f895d3f6b35]
>> [ip-172-31-5-39:14785] [ 7] ompi-server[0x40176b]
>> [ip-172-31-5-39:14785] *** End of error message ***
>> 
>> Am I doing something wrong?
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] MPI_Comm_accept()

2017-05-27 Thread r...@open-mpi.org
Hardly the hoped-for quick turnaround, but it has been fixed in master and will 
go into v3.0, which is planned for release in the near future

> On Mar 14, 2017, at 6:26 PM, Adam Sylvester <op8...@gmail.com 
> <mailto:op8...@gmail.com>> wrote:
> 
> Excellent - I appreciate the quick turnaround.
> 
> On Tue, Mar 14, 2017 at 10:24 AM, r...@open-mpi.org 
> <mailto:r...@open-mpi.org> <r...@open-mpi.org <mailto:r...@open-mpi.org>> 
> wrote:
> I don’t see an issue right away, though I know it has been brought up before. 
> I hope to resolve it either this week or next - will reply to this thread 
> with the PR link when ready.
> 
> 
>> On Mar 13, 2017, at 6:16 PM, Adam Sylvester <op8...@gmail.com 
>> <mailto:op8...@gmail.com>> wrote:
>> 
>> Bummer - thanks for the update.  I will revert back to 1.10.x for now then.  
>> Should I file a bug report for this on GitHub or elsewhere?  Or if there's 
>> an issue for this already open, can you point me to it so I can keep track 
>> of when it's fixed?  Any best guess calendar-wise as to when you expect this 
>> to be fixed?
>> 
>> Thanks.
>> 
>> On Mon, Mar 13, 2017 at 10:45 AM, r...@open-mpi.org 
>> <mailto:r...@open-mpi.org> <r...@open-mpi.org <mailto:r...@open-mpi.org>> 
>> wrote:
>> You should consider it a bug for now - it won’t work in the 2.0 series, and 
>> I don’t think it will work in the upcoming 2.1.0 release. Probably will be 
>> fixed after that.
>> 
>> 
>>> On Mar 13, 2017, at 5:17 AM, Adam Sylvester <op8...@gmail.com 
>>> <mailto:op8...@gmail.com>> wrote:
>>> 
>>> As a follow-up, I tried this with Open MPI 1.10.4 and this worked as 
>>> expected (the port formatting looks really different):
>>> 
>>> $ mpirun -np 1 ./server
>>> Port name is 
>>> 1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300
>>> Accepted!
>>> 
>>> $ mpirun -np 1 ./client 
>>> "1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300
>>>  <>"
>>> Trying with 
>>> '1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300'
>>> Connected!
>>> 
>>> I've found some other posts of users asking about similar things regarding 
>>> the 2.x release - is this a bug?
>>> 
>>> On Sun, Mar 12, 2017 at 9:38 PM, Adam Sylvester <op8...@gmail.com 
>>> <mailto:op8...@gmail.com>> wrote:
>>> I'm using Open MPI 2.0.2 on RHEL 7.  I'm trying to use MPI_Open_port() / 
>>> MPI_Comm_accept() / MPI_Conn_connect().  My use case is that I'll have two 
>>> processes running on two machines that don't initially know about each 
>>> other (i.e. I can't do the typical mpirun with a list of IPs); eventually I 
>>> think I may need to use ompi-server to accomplish what I want but for now 
>>> I'm trying to test this out running two processes on the same machine with 
>>> some toy programs.
>>> 
>>> server.cpp creates the port, prints it, and waits for a client to accept 
>>> using it:
>>> 
>>> #include 
>>> #include 
>>> 
>>> int main(int argc, char** argv)
>>> {
>>> MPI_Init(NULL, NULL);
>>> 
>>> char myport[MPI_MAX_PORT_NAME];
>>> MPI_Comm intercomm;
>>> 
>>> MPI_Open_port(MPI_INFO_NULL, myport);
>>> std::cout << "Port name is " << myport << std::endl;
>>> 
>>> MPI_Comm_accept(myport, MPI_INFO_NULL, 0, MPI_COMM_SELF, );
>>> 
>>> std::cout << "Accepted!" << std::endl;
>>> 
>>> MPI_Finalize();
>>> return 0;
>>> }
>>> 
>>> client.cpp takes in this port on the command line and tries to connect to 
>>> it:
>>> 
>>> #include 
>>> #include 
>>> 
>>> int main(int argc, char** argv)
>>> {
>>> MPI_Init(NULL, NULL);
>>> 
>>> MPI_Comm intercomm;
>>> 
>>> const std::string name(argv[1]);
>>> std::cout << "Trying with '" << name << "'" << std::endl;
>>> MPI_Comm_connect(name.c_str(), MPI_INFO_NULL, 0, MPI_COMM_SELF, 
>>> );
>>> 
>>> std::cout << "Connected!" << std::endl;
>>> 
>>>   

Re: [OMPI users] pmix, lxc, hpcx

2017-05-26 Thread r...@open-mpi.org
You can also get around it by configuring OMPI with “--disable-pmix-dstore”


> On May 26, 2017, at 3:02 PM, Howard Pritchard  wrote:
> 
> Hi John,
> 
> In the 2.1.x release stream a shared memory capability was introduced into 
> the PMIx component.
> 
> I know nothing about LXC containers, but it looks to me like there's some 
> issue when PMIx tries
> to create these shared memory segments.  I'd check to see if there's 
> something about your
> container configuration that is preventing the creation of shared memory 
> segments.
> 
> Howard
> 
> 
> 2017-05-26 15:18 GMT-06:00 John Marshall  >:
> Hi,
> 
> I have built openmpi 2.1.1 with hpcx-1.8 and tried to run some mpi code under
> ubuntu 14.04 and LXC (1.x) but I get the following:
> [ib7-bc2oo42-be10p16.science.gc.ca:16035 
> ] PMIX ERROR: 
> OUT-OF-RESOURCE in file src/dstore/pmix_esh.c at line 1651
> [ib7-bc2oo42-be10p16.science.gc.ca:16035 
> ] PMIX ERROR: 
> OUT-OF-RESOURCE in file src/dstore/pmix_esh.c at line 1751
> [ib7-bc2oo42-be10p16.science.gc.ca:16035 
> ] PMIX ERROR: 
> OUT-OF-RESOURCE in file src/dstore/pmix_esh.c at line 1114
> [ib7-bc2oo42-be10p16.science.gc.ca:16035 
> ] PMIX ERROR: 
> OUT-OF-RESOURCE in file src/common/pmix_jobdata.c at line 93
> [ib7-bc2oo42-be10p16.science.gc.ca:16035 
> ] PMIX ERROR: 
> OUT-OF-RESOURCE in file src/common/pmix_jobdata.c at line 333
> [ib7-bc2oo42-be10p16.science.gc.ca:16035 
> ] PMIX ERROR: 
> OUT-OF-RESOURCE in file src/server/pmix_server.c at line 606
> I do not get the same outside of the LXC container and my code runs fine.
> 
> I've looked for more info on these messages but could not find anything
> helpful. Are these messages indicative of something missing in, or some
> incompatibility with, the container?
> 
> When I build using 2.0.2, I do not have a problem running inside or outside of
> the container.
> 
> Thanks,
> John
> 
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread r...@open-mpi.org
If I might interject here before lots of time is wasted. Spectrum MPI is an IBM 
-product- and is not free. What you are likely running into is that their 
license manager is blocking you from running, albeit without a really nice 
error message. I’m sure that’s something they are working on.

If you really want to use Spectrum MPI, I suggest you contact them about 
purchasing it.


> On May 19, 2017, at 1:16 AM, Gabriele Fatigati  wrote:
> 
> Hi Gilles, in attach the outpuf of:
> 
> mpirun --mca btl_base_verbose 100 -np 2 ...
> 
> 2017-05-19 9:43 GMT+02:00 Gilles Gouaillardet  >:
> Gabriele,
> 
> 
> can you
> 
> mpirun --mca btl_base_verbose 100 -np 2 ...
> 
> 
> so we can figure out why nor sm nor vader is used ?
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> 
> On 5/19/2017 4:23 PM, Gabriele Fatigati wrote:
> Oh no, by using two procs:
> 
> 
> findActiveDevices Error
> We found no active IB device ports
> findActiveDevices Error
> We found no active IB device ports
> --
> At least one pair of MPI processes are unable to reach each other for
> MPI communications.  This means that no Open MPI device has indicated
> that it can be used to communicate between these processes.  This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other.  This error can sometimes be the result of forgetting to
> specify the "self" BTL.
> 
>   Process 1 ([[12380,1],0]) is on host: openpower
>   Process 2 ([[12380,1],1]) is on host: openpower
>   BTLs attempted: self
> 
> Your MPI job is now going to abort; sorry.
> --
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***and potentially your MPI job)
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***and potentially your MPI job)
> --
> MPI_INIT has failed because at least one MPI process is unreachable
> from another.  This *usually* means that an underlying communication
> plugin -- such as a BTL or an MTL -- has either not loaded or not
> allowed itself to be used.  Your MPI job will now abort.
> 
> You may wish to try to narrow down the problem;
>  * Check the output of ompi_info to see which BTL/MTL plugins are
>available.
>  * Run your application with MPI_THREAD_SINGLE.
>  * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
>if using MTL-based communications) to see exactly which
>communication plugins were considered and/or discarded.
> --
> [openpower:88867] 1 more process has sent help message help-mca-bml-r2.txt / 
> unreachable proc
> [openpower:88867] Set MCA parameter "orte_base_help_aggregate" to 0 to see 
> all help / error messages
> [openpower:88867] 1 more process has sent help message help-mpi-runtime.txt / 
> mpi_init:startup:pml-add-procs-fail
> 
> 
> 
> 
> 
> 2017-05-19 9:22 GMT+02:00 Gabriele Fatigati:
> 
> Hi GIlles,
> 
> using your command with one MPI procs I get:
> 
> findActiveDevices Error
> We found no active IB device ports
> Hello world from rank 0  out of 1 processors
> 
> So it seems to work apart the error message.
> 
> 
> 2017-05-19 9:10 GMT+02:00 Gilles Gouaillardet:
> 
> Gabriele,
> 
> 
> so it seems pml/pami assumes there is an infiniband card
> available (!)
> 
> i guess IBM folks will comment on that shortly.
> 
> 
> meanwhile, you do not need pami since you are running on a
> single node
> 
> mpirun --mca pml ^pami ...
> 
> should do the trick
> 
> (if it does not work, can run and post the logs)
> 
> mpirun --mca pml ^pami --mca pml_base_verbose 100 ...
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 5/19/2017 4:01 PM, Gabriele Fatigati wrote:
> 
> Hi John,
> Infiniband is not used, there is a single node on this
> machine.
> 
> 2017-05-19 8:50 GMT+02:00 John Hearns via users:
> 
> Gabriele, please run

Re: [OMPI users] Can OpenMPI support torque and slurm at the same time?

2017-05-10 Thread r...@open-mpi.org
Certainly. Just make sure you have the headers for both on the node where you 
build OMPI so we build the required components. Then we will auto-detect which 
one we are running under, so nothing further is required
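
For example (a sketch; the Torque path is an assumption, point it at wherever the TM headers live on your build host), a single configure line covering both looks like:

    ./configure --with-tm=/opt/torque --with-slurm ...

Afterwards, "ompi_info | grep -E 'plm|ras'" should list both the tm and slurm components.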

> On May 10, 2017, at 11:41 AM, Belgin, Mehmet  
> wrote:
> 
> Hello everyone,
> 
> Is it possible to compile OpenMPI to support torque and slurm at the same 
> time? We may end up using torque and slurm on different clusters in the near 
> future and hoping to continue maintain only one OpenMPI stack in our combined 
> repository.
> 
> Thank you!
> -Mehmet
> 
> =
> Mehmet Belgin, Ph.D.
> Scientific Computing Consultant 
> Partnership for an Advanced Computing Environment (PACE)
> Georgia Institute of Technology
> 258 4th Street NW, Rich Building, #326 
> Atlanta, GA  30332-0700
> Office: (404) 385-0665
> 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] [OMPI USERS] Jumbo frames

2017-05-05 Thread r...@open-mpi.org
If you are looking to use TCP packets, then you want to set the send/recv 
buffer size in the TCP btl, not the openib one, yes?
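
For example (a sketch only; check "ompi_info --param btl tcp --level 9" for the exact parameter names and defaults in your version), the TCP buffer sizes are set along these lines:

    mpirun --mca btl tcp,self --mca btl_tcp_sndbuf 262144 --mca btl_tcp_rcvbuf 262144 -np 4 ./a.out

The interface MTU itself is an operating system setting (e.g. "ip link set eth0 mtu 9000"), not something an MCA parameter controls.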

Also, what version of OMPI are you using?

> On May 5, 2017, at 7:16 AM, Alberto Ortiz  wrote:
> 
> Hi,
> I have a program running with openMPI over a network using a gigabit switch. 
> This switch supports jumbo frames up to 13.000 bytes, so, in order to test 
> and see if it would be faster communicating with this frame lengths, I am 
> trying to use them with my program. I have set the MTU in each node to be 
> 13.000 but when running the program it doesn't even initiate, it gets 
> blocked. I have tried different lengths from 1.500 up to 13.000 but it 
> doesn't work with any length.
> 
> I have searched and only found that I have to set OMPI with "-mca 
> btl_openib_ib_mtu 13000" or the length to be used, but I don't seem to get it 
> working.
> 
> Which are the steps to get OMPI to use larger TCP packets length? Is it 
> possible to reach 13000 bytes instead of the standard 1500?
> 
> Thank you in advance,
> Alberto
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Closing pipes associated with repeated MPI comm spawns

2017-04-28 Thread r...@open-mpi.org
What version of OMPI are you using?

> On Apr 28, 2017, at 8:26 AM, Austin Herrema  wrote:
> 
> Hello all,
> 
> I am using mpi4py in an optimization code that iteratively spawns an MPI 
> analysis code (fortran-based) via "MPI.COMM_SELF.Spawn" (I gather that this 
> is not an ideal use for comm spawn but I don't have too many other options at 
> this juncture). I am calling "child_comm.Disconnect()" on the parent side and 
> "call MPI_COMM_DISCONNECT(parent, ier)" on the child side.
> 
> After a dozen or so iterations, it would appear I am running up against the 
> system limit for number of open pipes:
> 
> [affogato:05553] [[63653,0],0] ORTE_ERROR_LOG: The system limit on number of 
> pipes a process can open was reached in file odls_default_module.c at line 689
> [affogato:05553] [[63653,0],0] usock_peer_send_blocking: send() to socket 998 
> failed: Broken pipe (32)
> [affogato:05553] [[63653,0],0] ORTE_ERROR_LOG: Unreachable in file 
> oob_usock_connection.c at line 316
> 
> From this Stackoverflow post 
>  I have 
> surmised that the opened pipes remain open on mpiexec despite no longer being 
> used. I know I can increase system limits, but this will only get me so far 
> as I intend to perform hundreds if not thousands of iterations. Is there a 
> way to dynamically close the unused pipes on either the python or fortran 
> side? Also, I've seen the "mca parameter" mentioned in regards to this topic. 
> I don't fully understand what that is, but will setting it have an effect on 
> this issue?
> 
> Thank you,
> Austin
> 
> -- 
> Austin Herrema
> PhD Student | Graduate Research Assistant | Iowa State University
> Wind Energy Science, Engineering, and Policy | Mechanical Engineering
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI 2.1.0: FAIL: opal_path_nfs

2017-04-26 Thread r...@open-mpi.org
We have a utility that checks to ensure that the shared memory backing file is 
not on a shared file system. The test is checking to see if that utility can 
stat and assess the nature of any file system on your node. It’s undoubtedly 
stale as things have changed over the years (new file systems appearing, etc). 
So all it is saying is “found something I don’t recognize”.
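
If you want to see how it classifies a particular directory, the test can also be run by hand from test/util with one or more paths as arguments (per its own usage message), e.g.:

    ./opal_path_nfs /tmp $HOME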

> On Apr 26, 2017, at 3:19 PM, Prentice Bisbal <pbis...@pppl.gov> wrote:
> 
> That's what I figured, but I wanted to check first. Any idea of exactly what 
> it's trying to check?
> 
> Prentice
> 
> On 04/26/2017 05:54 PM, r...@open-mpi.org wrote:
>> You can probably safely ignore it.
>> 
>>> On Apr 26, 2017, at 2:29 PM, Prentice Bisbal <pbis...@pppl.gov> wrote:
>>> 
>>> I'm trying to build OpenMPI 2.1.0 with GCC 5.4.0 on CentOS 6.8. After 
>>> working around the '-Lyes/lib' errors I reported in my previous post, 
>>> opal_path_nfs fails during 'make check' (see below). Is this failure 
>>> critical, or is it something I can ignore and continue with my install? 
>>> Googling only returned links to discussions of similar problems from 4-5 
>>> years ago with earlier versions of OpenMPI.
>>> 
>>> STDOUT and STDERR from 'make check':
>>> 
>>> make  check-TESTS
>>> make[3]: Entering directory `/local/pbisbal/openmpi-2.1.0/test/util'
>>> make[4]: Entering directory `/local/pbisbal/openmpi-2.1.0/test/util'
>>> PASS: opal_bit_ops
>>> FAIL: opal_path_nfs
>>> 
>>> Testsuite summary for Open MPI 2.1.0
>>> 
>>> # TOTAL: 2
>>> # PASS:  1
>>> # SKIP:  0
>>> # XFAIL: 0
>>> # FAIL:  1
>>> # XPASS: 0
>>> # ERROR: 0
>>> 
>>> See test/util/test-suite.log
>>> Please report to http://www.open-mpi.org/community/help/
>>> 
>>> 
>>> Contents of test/util/test-suite.log:
>>> 
>>> cat test/util/test-suite.log
>>> ==
>>>   Open MPI 2.1.0: test/util/test-suite.log
>>> ==
>>> 
>>> # TOTAL: 2
>>> # PASS:  1
>>> # SKIP:  0
>>> # XFAIL: 0
>>> # FAIL:  1
>>> # XPASS: 0
>>> # ERROR: 0
>>> 
>>> .. contents:: :depth: 2
>>> 
>>> FAIL: opal_path_nfs
>>> ===
>>> 
>>> Test usage: ./opal_path_nfs [DIR]
>>> On Linux interprets output from mount(8) to check for nfs and verify 
>>> opal_path_nfs()
>>> Additionally, you may specify multiple DIR on the cmd-line, of which you 
>>> the output
>>> get_mounts: dirs[0]:/ fs:rootfs nfs:No
>>> get_mounts: dirs[1]:/proc fs:proc nfs:No
>>> get_mounts: dirs[2]:/sys fs:sysfs nfs:No
>>> get_mounts: dirs[3]:/dev fs:devtmpfs nfs:No
>>> get_mounts: dirs[4]:/dev/pts fs:devpts nfs:No
>>> get_mounts: dirs[5]:/dev/shm fs:tmpfs nfs:No
>>> get_mounts: already know dir[0]:/
>>> get_mounts: dirs[0]:/ fs:nfs nfs:Yes
>>> get_mounts: dirs[6]:/proc/bus/usb fs:usbfs nfs:No
>>> get_mounts: dirs[7]:/var/lib/stateless/writable fs:tmpfs nfs:No
>>> get_mounts: dirs[8]:/var/cache/man fs:tmpfs nfs:No
>>> get_mounts: dirs[9]:/var/lock fs:tmpfs nfs:No
>>> get_mounts: dirs[10]:/var/log fs:tmpfs nfs:No
>>> get_mounts: dirs[11]:/var/run fs:tmpfs nfs:No
>>> get_mounts: dirs[12]:/var/lib/dbus fs:tmpfs nfs:No
>>> get_mounts: dirs[13]:/var/lib/nfs fs:tmpfs nfs:No
>>> get_mounts: dirs[14]:/tmp fs:tmpfs nfs:No
>>> get_mounts: dirs[15]:/var/cache/foomatic fs:tmpfs nfs:No
>>> get_mounts: dirs[16]:/var/cache/hald fs:tmpfs nfs:No
>>> get_mounts: dirs[17]:/var/cache/logwatch fs:tmpfs nfs:No
>>> get_mounts: dirs[18]:/var/lib/dhclient fs:tmpfs nfs:No
>>> get_mounts: dirs[19]:/var/tmp fs:tmpfs nfs:No
>>> get_mounts: dirs[20]:/media fs:tmpfs nfs:No
>>> get_mounts: dirs[21]:/etc/adjtime fs:tmpfs nfs:No
>>> get_mounts: dirs[22]:/etc/ntp.conf fs:tmpfs nfs:No
>>> get_mounts: dirs[23]:/etc/resolv.conf fs:tmpfs nfs:No
>>> get_mounts: dirs[24]:/etc/lvm/archive fs:tmpfs nfs:No
>>> get_mounts: dirs[25]:/etc/lvm/backup fs:tmpfs nfs:No
>>> get_mounts: dirs[26]:/var/account fs:tmpfs nfs:No
>

Re: [OMPI users] OpenMPI 2.1.0: FAIL: opal_path_nfs

2017-04-26 Thread r...@open-mpi.org
You can probably safely ignore it.

> On Apr 26, 2017, at 2:29 PM, Prentice Bisbal  wrote:
> 
> I'm trying to build OpenMPI 2.1.0 with GCC 5.4.0 on CentOS 6.8. After working 
> around the '-Lyes/lib' errors I reported in my previous post, opal_path_nfs 
> fails during 'make check' (see below). Is this failure critical, or is it 
> something I can ignore and continue with my install? Googling only returned 
> links to discussions of similar problems from 4-5 years ago with earlier 
> versions of OpenMPI.
> 
> STDOUT and STDERR from 'make check':
> 
> make  check-TESTS
> make[3]: Entering directory `/local/pbisbal/openmpi-2.1.0/test/util'
> make[4]: Entering directory `/local/pbisbal/openmpi-2.1.0/test/util'
> PASS: opal_bit_ops
> FAIL: opal_path_nfs
> 
> Testsuite summary for Open MPI 2.1.0
> 
> # TOTAL: 2
> # PASS:  1
> # SKIP:  0
> # XFAIL: 0
> # FAIL:  1
> # XPASS: 0
> # ERROR: 0
> 
> See test/util/test-suite.log
> Please report to http://www.open-mpi.org/community/help/
> 
> 
> Contents of test/util/test-suite.log:
> 
> cat test/util/test-suite.log
> ==
>   Open MPI 2.1.0: test/util/test-suite.log
> ==
> 
> # TOTAL: 2
> # PASS:  1
> # SKIP:  0
> # XFAIL: 0
> # FAIL:  1
> # XPASS: 0
> # ERROR: 0
> 
> .. contents:: :depth: 2
> 
> FAIL: opal_path_nfs
> ===
> 
> Test usage: ./opal_path_nfs [DIR]
> On Linux interprets output from mount(8) to check for nfs and verify 
> opal_path_nfs()
> Additionally, you may specify multiple DIR on the cmd-line, of which you the 
> output
> get_mounts: dirs[0]:/ fs:rootfs nfs:No
> get_mounts: dirs[1]:/proc fs:proc nfs:No
> get_mounts: dirs[2]:/sys fs:sysfs nfs:No
> get_mounts: dirs[3]:/dev fs:devtmpfs nfs:No
> get_mounts: dirs[4]:/dev/pts fs:devpts nfs:No
> get_mounts: dirs[5]:/dev/shm fs:tmpfs nfs:No
> get_mounts: already know dir[0]:/
> get_mounts: dirs[0]:/ fs:nfs nfs:Yes
> get_mounts: dirs[6]:/proc/bus/usb fs:usbfs nfs:No
> get_mounts: dirs[7]:/var/lib/stateless/writable fs:tmpfs nfs:No
> get_mounts: dirs[8]:/var/cache/man fs:tmpfs nfs:No
> get_mounts: dirs[9]:/var/lock fs:tmpfs nfs:No
> get_mounts: dirs[10]:/var/log fs:tmpfs nfs:No
> get_mounts: dirs[11]:/var/run fs:tmpfs nfs:No
> get_mounts: dirs[12]:/var/lib/dbus fs:tmpfs nfs:No
> get_mounts: dirs[13]:/var/lib/nfs fs:tmpfs nfs:No
> get_mounts: dirs[14]:/tmp fs:tmpfs nfs:No
> get_mounts: dirs[15]:/var/cache/foomatic fs:tmpfs nfs:No
> get_mounts: dirs[16]:/var/cache/hald fs:tmpfs nfs:No
> get_mounts: dirs[17]:/var/cache/logwatch fs:tmpfs nfs:No
> get_mounts: dirs[18]:/var/lib/dhclient fs:tmpfs nfs:No
> get_mounts: dirs[19]:/var/tmp fs:tmpfs nfs:No
> get_mounts: dirs[20]:/media fs:tmpfs nfs:No
> get_mounts: dirs[21]:/etc/adjtime fs:tmpfs nfs:No
> get_mounts: dirs[22]:/etc/ntp.conf fs:tmpfs nfs:No
> get_mounts: dirs[23]:/etc/resolv.conf fs:tmpfs nfs:No
> get_mounts: dirs[24]:/etc/lvm/archive fs:tmpfs nfs:No
> get_mounts: dirs[25]:/etc/lvm/backup fs:tmpfs nfs:No
> get_mounts: dirs[26]:/var/account fs:tmpfs nfs:No
> get_mounts: dirs[27]:/var/lib/iscsi fs:tmpfs nfs:No
> get_mounts: dirs[28]:/var/lib/logrotate.status fs:tmpfs nfs:No
> get_mounts: dirs[29]:/var/lib/ntp fs:tmpfs nfs:No
> get_mounts: dirs[30]:/var/spool fs:tmpfs nfs:No
> get_mounts: dirs[31]:/var/lib/sss fs:tmpfs nfs:No
> get_mounts: dirs[32]:/etc/sysconfig/network-scripts fs:tmpfs nfs:No
> get_mounts: dirs[33]:/var fs:ext4 nfs:No
> get_mounts: already know dir[14]:/tmp
> get_mounts: dirs[14]:/tmp fs:ext4 nfs:No
> get_mounts: dirs[34]:/local fs:ext4 nfs:No
> get_mounts: dirs[35]:/proc/sys/fs/binfmt_misc fs:binfmt_misc nfs:No
> get_mounts: dirs[36]:/local/cgroup/cpuset fs:cgroup nfs:No
> get_mounts: dirs[37]:/local/cgroup/cpu fs:cgroup nfs:No
> get_mounts: dirs[38]:/local/cgroup/cpuacct fs:cgroup nfs:No
> get_mounts: dirs[39]:/local/cgroup/memory fs:cgroup nfs:No
> get_mounts: dirs[40]:/local/cgroup/devices fs:cgroup nfs:No
> get_mounts: dirs[41]:/local/cgroup/freezer fs:cgroup nfs:No
> get_mounts: dirs[42]:/local/cgroup/net_cls fs:cgroup nfs:No
> get_mounts: dirs[43]:/local/cgroup/blkio fs:cgroup nfs:No
> get_mounts: dirs[44]:/usr/pppl fs:nfs nfs:Yes
> get_mounts: dirs[45]:/misc fs:autofs nfs:No
> get_mounts: dirs[46]:/net fs:autofs nfs:No
> get_mounts: dirs[47]:/v fs:autofs nfs:No
> get_mounts: dirs[48]:/u fs:autofs nfs:No
> get_mounts: dirs[49]:/w fs:autofs nfs:No
> get_mounts: dirs[50]:/l fs:autofs nfs:No
> get_mounts: dirs[51]:/p fs:autofs nfs:No
> get_mounts: dirs[52]:/pfs fs:autofs nfs:No
> get_mounts: dirs[53]:/proc/fs/nfsd fs:nfsd nfs:No
> get_mounts: dirs[54]:/u/gtchilin fs:nfs nfs:Yes
> get_mounts: dirs[55]:/u/ldelgado fs:nfs nfs:Yes

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
Sure - there is always an MCA param for everything: 
OMPI_MCA_rmaps_base_oversubscribe=1
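
For example (sketch; the executable name is a placeholder), either form should behave the same as passing --oversubscribe on the command line:

    export OMPI_MCA_rmaps_base_oversubscribe=1
    mpirun -n 8 ./my_app

    # or equivalently
    mpirun --mca rmaps_base_oversubscribe 1 -n 8 ./my_app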


> On Apr 25, 2017, at 2:10 PM, Eric Chamberland 
> <eric.chamberl...@giref.ulaval.ca> wrote:
> 
> On 25/04/17 04:36 PM, r...@open-mpi.org wrote:
>> add --oversubscribe to the cmd line
> 
> good, it works! :)
> 
> Is there an environment variable equivalent to --oversubscribe argument?
> 
> I can't find this option in near related FAQ entries, should it be added 
> here? :
> 
> https://www.open-mpi.org/faq/?category=running#oversubscribing
> 
> or here ? :
> 
> https://www.open-mpi.org/faq/?category=running#force-aggressive-degraded
> 
> I was using:
> 
> export OMPI_MCA_mpi_yield_when_idle=1
> export OMPI_MCA_hwloc_base_binding_policy=none
> 
> and it was ok before...
> 
> Thanks,
> 
> Eric
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
If it helps, I believe I added the ability to just use ‘:*’ to indicate “take 
them all” so you don’t have to remember the number.
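
For example (a sketch using the hosts from this thread; the :* form assumes a build recent enough to include it):

    mpirun -np 4 --host dancer00:8,dancer01:8 ./startup
    mpirun -np 4 --host dancer00:*,dancer01:* ./startup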

> On Apr 25, 2017, at 2:13 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
> Thanks Ralph,
> 
> Indeed, if I add :8 I get back the expected behavior. I can cope with this (I 
> don't usually restrict my runs to a subset of the nodes).
> 
>   George.
> 
> 
> On Tue, Apr 25, 2017 at 4:53 PM, r...@open-mpi.org wrote:
> I suspect it read the file just fine - what you are seeing in the output is a 
> reflection of the community’s design decision that only one slot would be 
> allocated for each time a node is listed in -host. This is why they added the 
> :N modifier so you can specify the #slots to use in lieu of writing the host 
> name N times
> 
> If this isn’t what you feel it should do, then please look at the files in 
> orte/util/dash_host and feel free to propose a modification to the behavior. 
> I personally am not bound to any particular answer, but I really don’t have 
> time to address it again.
> 
> 
> 
>> On Apr 25, 2017, at 1:35 PM, George Bosilca <bosi...@icl.utk.edu 
>> <mailto:bosi...@icl.utk.edu>> wrote:
>> 
>> Just to be clear, the hostfile contains the correct info:
>> 
>> dancer00 slots=8
>> dancer01 slots=8
>> 
>> The output regarding the 2 nodes (dancer00 and dancer01) is clearly wrong.
>> 
>>   George.
>> 
>> 
>> 
>> On Tue, Apr 25, 2017 at 4:32 PM, George Bosilca <bosi...@icl.utk.edu 
>> <mailto:bosi...@icl.utk.edu>> wrote:
>> I confirm a similar issue on a more managed environment. I have an hostfile 
>> that worked for the last few years, and that span across a small cluster (30 
>> nodes of 8 cores each). 
>> 
>> Trying to spawn any number of processes across P nodes fails if the number 
>> of processes is larger than P (despite the fact that there are largely 
>> enough resources, and that this information is provided via the hostfile).
>> 
>> George.
>> 
>> 
>> $ mpirun -mca ras_base_verbose 10 --display-allocation -np 4 --host 
>> dancer00,dancer01 --map-by
>> 
>> [dancer.icl.utk.edu:13457] mca: base: components_register: registering framework ras components
>> [dancer.icl.utk.edu:13457] mca: base: components_register: found loaded component simulator
>> [dancer.icl.utk.edu:13457] mca: base: components_register: component simulator register function successful
>> [dancer.icl.utk.edu:13457] mca: base: components_register: found loaded component slurm
>> [dancer.icl.utk.edu:13457] mca: base: components_register: component slurm register function successful
>> [dancer.icl.utk.edu:13457] mca: base: components_register: found loaded component loadleveler
>> [dancer.icl.utk.edu:13457] mca: base: components_register: component loadleveler register function successful
>> [dancer.icl.utk.edu:13457] mca: base: components_register: found loaded component tm
>> [dancer.icl.utk.edu:13457] mca: base: components_register: component tm register function successful
>> [dancer.icl.utk.edu:13457] mca: base: components_open: opening ras components
>> [dancer.icl.utk.edu:13457] mca: base: components_open: found loaded component simulator
>> [dancer.icl.utk.edu:13457] mca: base: components_open: found loaded component slurm
>> [dancer.icl.utk.edu:13457] mca: base: components_open: component slurm open function successful
>> [dancer.icl.utk.edu:13457] mca: base: components_open: found loaded component loadleveler
>> [dancer.icl.utk.edu:13457] mca: base: components_open: component loadleveler open function successful
>> [dancer.icl.utk.edu:13457] mca: base: components_open: found loaded component tm
>> [dancer.icl.utk.edu:13457] mca: base:

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
  dancer29: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer30: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer31: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> =
> --
> There are not enough slots available in the system to satisfy the 4 slots
> that were requested by the application:
>   startup
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> 
> 
> 
> 
> On Tue, Apr 25, 2017 at 4:00 PM, r...@open-mpi.org wrote:
> Okay - so effectively you have no hostfile, and no allocation. So this is 
> running just on the one node where mpirun exists?
> 
> Add “-mca ras_base_verbose 10 --display-allocation” to your cmd line and 
> let’s see what it found
> 
> > On Apr 25, 2017, at 12:56 PM, Eric Chamberland 
> > <eric.chamberl...@giref.ulaval.ca 
> > <mailto:eric.chamberl...@giref.ulaval.ca>> wrote:
> >
> > Hi,
> >
> > the host file has been constructed automatically by the 
> > configuration+installation process and seems to contain only comments and a 
> > blank line:
> >
> > (15:53:50) [zorg]:~> cat /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> > #
> > # Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
> > # University Research and Technology
> > # Corporation.  All rights reserved.
> > # Copyright (c) 2004-2005 The University of Tennessee and The University
> > # of Tennessee Research Foundation.  All rights
> > # reserved.
> > # Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
> > # University of Stuttgart.  All rights reserved.
> > # Copyright (c) 2004-2005 The Regents of the University of California.
> > # All rights reserved.
> > # $COPYRIGHT$
> > #
> > # Additional copyrights may follow
> > #
> > # $HEADER$
> > #
> > # This is the default hostfile for Open MPI.  Notice that it does not
> > # contain any hosts (not even localhost).  This file should only
> > # contain hosts if a system administrator wants users to always have
> > # the same set of default hosts, and is not using a batch scheduler
> > # (such as SLURM, PBS, etc.).
> > #
> > # Note that this file is *not* used when running in "managed"
> > # environments (e.g., running in a job under a job scheduler, such as
> > # SLURM or PBS / Torque).
> > #
> > # If you are primarily interested in running Open MPI on one node, you
> > # should *not* simply list "localhost" in here (contrary to prior MPI
> > # implementations, such as LAM/MPI).  A localhost-only node list is
> > # created by the RAS component named "localhost" if no other RAS
> > # components were able to find any hosts to run on (this behavior can
> > # be disabled by excluding the localhost RAS component by specifying
> > # the value "^localhost" [without the quotes] to the "ras" MCA
> > # parameter).
> >
> > (15:53:52) [zorg]:~>
> >
> > Thanks!
> >
> > Eric
> >
> >
> > On 25/04/17 03:52 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
> >> What is in your hostfile?
> >>
> >>
> >>> On Apr 25, 2017, at 11:39 AM, Eric Chamberland 
> >>> <eric.chamberl...@giref.ulaval.ca 
> >>> <mailto:eric.chamberl...@giref.ulaval.ca>> wrote:
> >>>
> >>> Hi,
> >>>
> >>> just testing the 3.x branch... I launch:
> >>>
> >>> mpirun -n 8 echo "hello"
> >>>
> >>> and I get:
> >>>
> >>> --
> >>> There are not enough slots available in the system to satisfy the 8 slots
> >>> that were requested by the application:
> >>> echo
> >>>
> >>> Either request fewer slots for your application, or make more slots 
> >>> available
> >>> for use.
> >>> --
> >>>
> >>> I have to oversubscr

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
tate=UNKNOWN
>   dancer11: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer12: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer13: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer14: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer15: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer16: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer17: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer18: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer19: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer20: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer21: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer22: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer23: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer24: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer25: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer26: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer27: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer28: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer29: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer30: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer31: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> =
> 
> ==   ALLOCATED NODES   ==
>   dancer00: flags=0x13 slots=1 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer01: flags=0x13 slots=1 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer02: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer03: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer04: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer05: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer06: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer07: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer08: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer09: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer10: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer11: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer12: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer13: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer14: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer15: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer16: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer17: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer18: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer19: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer20: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer21: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer22: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer23: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer24: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer25: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer26: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer27: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer28: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer29: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer30: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
>   dancer31: flags=0x10 slots=8 max_slots=0 slots_inuse=0 state=UNKNOWN
> =====
> --
> There are not enough slots available in the system to satisfy the 4 slots
> that were requested by the application:
>   startup
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> 
> 
> 
> 
> On Tue, Apr 25, 2017 at 4:00 PM, r...@open-mpi.org

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
able
> for use.
> --
> [zorg:22429] [[40249,0],0] plm:base:orted_cmd sending orted_exit commands
> [zorg:22429] [[40249,0],0] plm:base:receive stop comm
> 
> ===
> second with -n 4:
> ===
> (16:31:23) [zorg]:~> mpirun -mca ras_base_verbose 10 --display-allocation -n 
> 4 echo "Hello"
> 
> [zorg:22463] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL
> [zorg:22463] plm:base:set_hnp_name: initial bias 22463 nodename hash 810220270
> [zorg:22463] plm:base:set_hnp_name: final jobfam 40219
> [zorg:22463] [[40219,0],0] plm:rsh_setup on agent ssh : rsh path NULL
> [zorg:22463] [[40219,0],0] plm:base:receive start comm
> [zorg:22463] mca: base: components_register: registering framework ras 
> components
> [zorg:22463] mca: base: components_register: found loaded component 
> loadleveler
> [zorg:22463] mca: base: components_register: component loadleveler register 
> function successful
> [zorg:22463] mca: base: components_register: found loaded component slurm
> [zorg:22463] mca: base: components_register: component slurm register 
> function successful
> [zorg:22463] mca: base: components_register: found loaded component simulator
> [zorg:22463] mca: base: components_register: component simulator register 
> function successful
> [zorg:22463] mca: base: components_open: opening ras components
> [zorg:22463] mca: base: components_open: found loaded component loadleveler
> [zorg:22463] mca: base: components_open: component loadleveler open function 
> successful
> [zorg:22463] mca: base: components_open: found loaded component slurm
> [zorg:22463] mca: base: components_open: component slurm open function 
> successful
> [zorg:22463] mca: base: components_open: found loaded component simulator
> [zorg:22463] mca:base:select: Auto-selecting ras components
> [zorg:22463] mca:base:select:(  ras) Querying component [loadleveler]
> [zorg:22463] [[40219,0],0] ras:loadleveler: NOT available for selection
> [zorg:22463] mca:base:select:(  ras) Querying component [slurm]
> [zorg:22463] mca:base:select:(  ras) Querying component [simulator]
> [zorg:22463] mca:base:select:(  ras) No component selected!
> [zorg:22463] [[40219,0],0] plm:base:setup_job
> [zorg:22463] [[40219,0],0] ras:base:allocate
> [zorg:22463] [[40219,0],0] ras:base:allocate nothing found in module - 
> proceeding to hostfile
> [zorg:22463] [[40219,0],0] ras:base:allocate parsing default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:22463] [[40219,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22463] [[40219,0],0] ras:base:allocate nothing found in hostfiles - 
> checking for rankfile
> [zorg:22463] [[40219,0],0] ras:base:allocate nothing found in rankfile - 
> inserting current node
> [zorg:22463] [[40219,0],0] ras:base:node_insert inserting 1 nodes
> [zorg:22463] [[40219,0],0] ras:base:node_insert updating HNP [zorg] info to 1 
> slots
> 
> ==   ALLOCATED NODES   ==
>zorg: flags=0x01 slots=1 max_slots=0 slots_inuse=0 state=UP
> =
> [zorg:22463] [[40219,0],0] plm:base:setup_vm
> [zorg:22463] [[40219,0],0] plm:base:setup_vm creating map
> [zorg:22463] [[40219,0],0] setup:vm: working unmanaged allocation
> [zorg:22463] [[40219,0],0] using default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:22463] [[40219,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22463] [[40219,0],0] plm:base:setup_vm only HNP in allocation
> [zorg:22463] [[40219,0],0] plm:base:setting slots for node zorg by cores
> 
> ==   ALLOCATED NODES   ==
>zorg: flags=0x11 slots=4 max_slots=0 slots_inuse=0 state=UP
> =
> [zorg:22463] [[40219,0],0] complete_setup on job [40219,1]
> [zorg:22463] [[40219,0],0] plm:base:launch_apps for job [40219,1]
> [zorg:22463] [[40219,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22463] [[40219,0],0] plm:base:launch wiring up iof for job [40219,1]
> [zorg:22463] [[40219,0],0] plm:base:launch job [40219,1] is not a dynamic 
> spawn
> Hello
> Hello
> Hello
> Hello
> [zorg:22463] [[40219,0],0] plm:base:orted_cmd sending orted_exit commands
> [zorg:22463] [[40219,0],0] plm:base:receive stop comm
> 
> 
> Thanks!
> 
> Eric
> 
> On 25/04/17 04:00 PM, r...@open-mpi.org wrote:
>> -mca ras_base_verbose 10 --display-allocation
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
Okay - so effectively you have no hostfile, and no allocation. So this is 
running just on the one node where mpirun exists?

Add “-mca ras_base_verbose 10 --display-allocation” to your cmd line and let’s 
see what it found
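
With the command from your earlier mail that would be, for instance:

    mpirun -mca ras_base_verbose 10 --display-allocation -n 8 echo "hello"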

> On Apr 25, 2017, at 12:56 PM, Eric Chamberland 
> <eric.chamberl...@giref.ulaval.ca> wrote:
> 
> Hi,
> 
> the host file has been constructed automatically by the 
> configuration+installation process and seems to contain only comments and a 
> blank line:
> 
> (15:53:50) [zorg]:~> cat /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> #
> # Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
> # University Research and Technology
> # Corporation.  All rights reserved.
> # Copyright (c) 2004-2005 The University of Tennessee and The University
> # of Tennessee Research Foundation.  All rights
> # reserved.
> # Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
> # University of Stuttgart.  All rights reserved.
> # Copyright (c) 2004-2005 The Regents of the University of California.
> # All rights reserved.
> # $COPYRIGHT$
> #
> # Additional copyrights may follow
> #
> # $HEADER$
> #
> # This is the default hostfile for Open MPI.  Notice that it does not
> # contain any hosts (not even localhost).  This file should only
> # contain hosts if a system administrator wants users to always have
> # the same set of default hosts, and is not using a batch scheduler
> # (such as SLURM, PBS, etc.).
> #
> # Note that this file is *not* used when running in "managed"
> # environments (e.g., running in a job under a job scheduler, such as
> # SLURM or PBS / Torque).
> #
> # If you are primarily interested in running Open MPI on one node, you
> # should *not* simply list "localhost" in here (contrary to prior MPI
> # implementations, such as LAM/MPI).  A localhost-only node list is
> # created by the RAS component named "localhost" if no other RAS
> # components were able to find any hosts to run on (this behavior can
> # be disabled by excluding the localhost RAS component by specifying
> # the value "^localhost" [without the quotes] to the "ras" MCA
> # parameter).
> 
> (15:53:52) [zorg]:~>
> 
> Thanks!
> 
> Eric
> 
> 
> On 25/04/17 03:52 PM, r...@open-mpi.org wrote:
>> What is in your hostfile?
>> 
>> 
>>> On Apr 25, 2017, at 11:39 AM, Eric Chamberland 
>>> <eric.chamberl...@giref.ulaval.ca> wrote:
>>> 
>>> Hi,
>>> 
>>> just testing the 3.x branch... I launch:
>>> 
>>> mpirun -n 8 echo "hello"
>>> 
>>> and I get:
>>> 
>>> --
>>> There are not enough slots available in the system to satisfy the 8 slots
>>> that were requested by the application:
>>> echo
>>> 
>>> Either request fewer slots for your application, or make more slots 
>>> available
>>> for use.
>>> --
>>> 
>>> I have to oversubscribe, so what do I have to do to bypass this 
>>> "limitation"?
>>> 
>>> Thanks,
>>> 
>>> Eric
>>> 
>>> configure log:
>>> 
>>> http://www.giref.ulaval.ca/~cmpgiref/ompi_3.x/2017.04.25.10h46m08s_config.log
>>> http://www.giref.ulaval.ca/~cmpgiref/ompi_3.x/2017.04.25.10h46m08s_ompi_info_all.txt
>>> 
>>> 
>>> here is the complete message:
>>> 
>>> [zorg:30036] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL
>>> [zorg:30036] plm:base:set_hnp_name: initial bias 30036 nodename hash 
>>> 810220270
>>> [zorg:30036] plm:base:set_hnp_name: final jobfam 49136
>>> [zorg:30036] [[49136,0],0] plm:rsh_setup on agent ssh : rsh path NULL
>>> [zorg:30036] [[49136,0],0] plm:base:receive start comm
>>> [zorg:30036] [[49136,0],0] plm:base:setup_job
>>> [zorg:30036] [[49136,0],0] plm:base:setup_vm
>>> [zorg:30036] [[49136,0],0] plm:base:setup_vm creating map
>>> [zorg:30036] [[49136,0],0] setup:vm: working unmanaged allocation
>>> [zorg:30036] [[49136,0],0] using default hostfile 
>>> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
>>> [zorg:30036] [[49136,0],0] plm:base:setup_vm only HNP in allocation
>>> [zorg:30036] [[49

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread r...@open-mpi.org
What is in your hostfile?


> On Apr 25, 2017, at 11:39 AM, Eric Chamberland 
>  wrote:
> 
> Hi,
> 
> just testing the 3.x branch... I launch:
> 
> mpirun -n 8 echo "hello"
> 
> and I get:
> 
> --
> There are not enough slots available in the system to satisfy the 8 slots
> that were requested by the application:
>  echo
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> 
> I have to oversubscribe, so what do I have to do to bypass this "limitation"?
> 
> Thanks,
> 
> Eric
> 
> configure log:
> 
> http://www.giref.ulaval.ca/~cmpgiref/ompi_3.x/2017.04.25.10h46m08s_config.log
> http://www.giref.ulaval.ca/~cmpgiref/ompi_3.x/2017.04.25.10h46m08s_ompi_info_all.txt
> 
> 
> here is the complete message:
> 
> [zorg:30036] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL
> [zorg:30036] plm:base:set_hnp_name: initial bias 30036 nodename hash 810220270
> [zorg:30036] plm:base:set_hnp_name: final jobfam 49136
> [zorg:30036] [[49136,0],0] plm:rsh_setup on agent ssh : rsh path NULL
> [zorg:30036] [[49136,0],0] plm:base:receive start comm
> [zorg:30036] [[49136,0],0] plm:base:setup_job
> [zorg:30036] [[49136,0],0] plm:base:setup_vm
> [zorg:30036] [[49136,0],0] plm:base:setup_vm creating map
> [zorg:30036] [[49136,0],0] setup:vm: working unmanaged allocation
> [zorg:30036] [[49136,0],0] using default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:30036] [[49136,0],0] plm:base:setup_vm only HNP in allocation
> [zorg:30036] [[49136,0],0] plm:base:setting slots for node zorg by cores
> [zorg:30036] [[49136,0],0] complete_setup on job [49136,1]
> [zorg:30036] [[49136,0],0] plm:base:launch_apps for job [49136,1]
> --
> There are not enough slots available in the system to satisfy the 8 slots
> that were requested by the application:
>  echo
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> [zorg:30036] [[49136,0],0] plm:base:orted_cmd sending orted_exit commands
> [zorg:30036] [[49136,0],0] plm:base:receive stop comm
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-24 Thread r...@open-mpi.org
I’m afraid none of the current options is going to do that right now. I’ll put 
a note on my to-do list to look at this, but I can’t promise when I’ll get to 
it.

> On Apr 24, 2017, at 3:13 AM, Heinz-Ado Arnolds <arno...@mpa-garching.mpg.de> 
> wrote:
> 
> Dear Ralph,
> 
> thanks for this new hint. Unfortunately I don't see how that would fulfill 
> all my requirements:
> 
> I like to have 8 OpenMPI jobs for 2 nodes -> 4 OpenMPI jobs per node -> 2 per 
> socket, each executing one OpenMP job with 5 threads
> 
>   mpirun -np 8 --map-by ppr:4:node:pe=5 ...
> 
> How can I connect this with the constraint 1 threat per core:
> 
>   [pascal-3-06:14965] ... 
> [B./B./B./B./B./../../../../..][../../../../../../../../../..]
>   [pascal-3-06:14965] ... 
> [../../../../../B./B./B./B./B.][../../../../../../../../../..]
>   [pascal-3-06:14965] ... 
> [../../../../../../../../../..][B./B./B./B./B./../../../../..]
>   [pascal-3-06:14965] ... 
> [../../../../../../../../../..][../../../../../B./B./B./B./B./]
>   [pascal-3-07:21027] ... 
> [B./B./B./B./B./../../../../..][../../../../../../../../../..]
>   [pascal-3-07:21027] ... 
> [../../../../../B./B./B./B./B.][../../../../../../../../../..]
>   [pascal-3-07:21027] ... 
> [../../../../../../../../../..][B./B./B./B./B./../../../../..]
>   [pascal-3-07:21027] ... 
> [../../../../../../../../../..][../../../../../B./B./B./B./B./]
> 
> Cheers,
> 
> Ado
> 
> On 22.04.2017 16:45, r...@open-mpi.org wrote:
>> Sorry for delayed response. I’m glad that option solved the problem. We’ll 
>> have to look at that configure option - shouldn’t be too hard.
>> 
>> As for the mapping you requested - no problem! Here’s the cmd line:
>> 
>> mpirun --map-by ppr:1:core --bind-to hwthread
>> 
>> Ralph
>> 
>>> On Apr 19, 2017, at 2:51 AM, Heinz-Ado Arnolds 
>>> <arno...@mpa-garching.mpg.de> wrote:
>>> 
>>> Dear Ralph, dear Gilles,
>>> 
>>> thanks a lot for your help! The hints to use ":pe=" and to install 
>>> libnuma have been the keys to solve my problems.
>>> 
>>> Perhaps it would not be a bad idea to include --enable-libnuma in the 
>>> configure help, and make it a default, so that one has to specify 
>>> --disable-libnuma if he really likes to work without numactl. The option is 
>>> already checked in configure (framework in 
>>> opal/mca/hwloc/hwloc1112/hwloc/config/hwloc.m4).
>>> 
>>> One question remains: I now get a binding like
>>> [pascal-3-06:03036] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
>>> 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
>>> socket 0[core 4[hwt 0-1]]: 
>>> [BB/BB/BB/BB/BB/../../../../..][../../../../../../../../../..]
>>> and OpenMP uses just "hwt 0" of each core, what is very welcome. But is 
>>> there a way to get a binding like
>>> [pascal-3-06:03036] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 
>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
>>> 0[core 4[hwt 0]]: 
>>> [B./B./B./B./B./../../../../..][../../../../../../../../../..]
>>> from OpenMPI directly?
>>> 
>>> Cheers and thanks again,
>>> 
>>> Ado
>>> 
>>> On 13.04.2017 17:34, r...@open-mpi.org wrote:
>>>> Yeah, we need libnuma to set the memory binding. There is a param to turn 
>>>> off the warning if installing libnuma is problematic, but it helps your 
>>>> performance if the memory is kept local to the proc
>>>> 
>>>>> On Apr 13, 2017, at 8:26 AM, Heinz-Ado Arnolds 
>>>>> <arno...@mpa-garching.mpg.de> wrote:
>>>>> 
>>>>> Dear Ralph,
>>>>> 
>>>>> thanks a lot for this valuable advice. Binding now works like expected!
>>>>> 
>>>>> Since adding the ":pe=" option I'm getting warnings
>>>>> 
>>>>> WARNING: a request was made to bind a process. While the system
>>>>> supports binding the process itself, at least one node does NOT
>>>>> support binding memory to the process location.
>>>>> 
>>>>>   Node:  pascal-1-05
>>>>> ...
>>>>> 
>>>>> even if I choose parameters so that binding is like exactly as before 
>>>>> without ":pe=". I don't have libnuma installed on the cluster. Might that 
>>>>> really be the cause of the warning?
>>>>> 
>>>>> Thanks

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-22 Thread r...@open-mpi.org
Sorry for delayed response. I’m glad that option solved the problem. We’ll have 
to look at that configure option - shouldn’t be too hard.

As for the mapping you requested - no problem! Here’s the cmd line:

mpirun --map-by ppr:1:core --bind-to hwthread

Ralph

> On Apr 19, 2017, at 2:51 AM, Heinz-Ado Arnolds <arno...@mpa-garching.mpg.de> 
> wrote:
> 
> Dear Ralph, dear Gilles,
> 
> thanks a lot for your help! The hints to use ":pe=" and to install libnuma 
> have been the keys to solve my problems.
> 
> Perhaps it would not be a bad idea to include --enable-libnuma in the 
> configure help, and make it a default, so that one has to specify 
> --disable-libnuma if he really likes to work without numactl. The option is 
> already checked in configure (framework in 
> opal/mca/hwloc/hwloc1112/hwloc/config/hwloc.m4).
> 
> One question remains: I now get a binding like
>  [pascal-3-06:03036] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
> socket 0[core 4[hwt 0-1]]: 
> [BB/BB/BB/BB/BB/../../../../..][../../../../../../../../../..]
> and OpenMP uses just "hwt 0" of each core, what is very welcome. But is there 
> a way to get a binding like
>  [pascal-3-06:03036] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 
> 0[core 4[hwt 0]]: 
> [B./B./B./B./B./../../../../..][../../../../../../../../../..]
> from OpenMPI directly?
> 
> Cheers and thanks again,
> 
> Ado
> 
> On 13.04.2017 17:34, r...@open-mpi.org wrote:
>> Yeah, we need libnuma to set the memory binding. There is a param to turn 
>> off the warning if installing libnuma is problematic, but it helps your 
>> performance if the memory is kept local to the proc
>> 
>>> On Apr 13, 2017, at 8:26 AM, Heinz-Ado Arnolds 
>>> <arno...@mpa-garching.mpg.de> wrote:
>>> 
>>> Dear Ralph,
>>> 
>>> thanks a lot for this valuable advice. Binding now works like expected!
>>> 
>>> Since adding the ":pe=" option I'm getting warnings
>>> 
>>> WARNING: a request was made to bind a process. While the system
>>> supports binding the process itself, at least one node does NOT
>>> support binding memory to the process location.
>>> 
>>>Node:  pascal-1-05
>>> ...
>>> 
>>> even if I choose parameters so that binding is like exactly as before 
>>> without ":pe=". I don't have libnuma installed on the cluster. Might that 
>>> really be the cause of the warning?
>>> 
>>> Thanks a lot, have a nice Easter days
>>> 
>>> Ado
>>> 
>>> On 13.04.2017 15:49, r...@open-mpi.org wrote:
>>>> You can always specify a particular number of cpus to use for each process 
>>>> by adding it to the map-by directive:
>>>> 
>>>> mpirun -np 8 --map-by ppr:2:socket:pe=5 --use-hwthread-cpus 
>>>> -report-bindings --mca plm_rsh_agent "qrsh" ./myid
>>>> 
>>>> would map 2 processes to each socket, binding each process to 5 HTs on 
>>>> that socket (since you told us to treat HTs as independent cpus). If you 
>>>> want us to bind to you 5 cores, then you need to remove that 
>>>> --use-hwthread-cpus directive.
>>>> 
>>>> As I said earlier in this thread, we are actively working with the OpenMP 
>>>> folks on a mechanism by which the two sides can coordinate these actions 
>>>> so it will be easier to get the desired behavior. For now, though, 
>>>> hopefully this will suffice.
>>>> 
>>>>> On Apr 13, 2017, at 6:31 AM, Heinz-Ado Arnolds 
>>>>> <arno...@mpa-garching.mpg.de> wrote:
>>>>> 
>>>>> On 13.04.2017 15:20, gil...@rist.or.jp wrote:
>>>>> ...
>>>>>> in your second case, there are 2 things
>>>>>> - MPI binds to socket, that is why two MPI tasks are assigned the same 
>>>>>> hyperthreads
>>>>>> - the GNU OpenMP runtime looks unable to figure out 2 processes use the 
>>>>>> same cores, and hence end up binding
>>>>>> the OpenMP threads to the same cores.
>>>>>> my best bet is you should bind a MPI tasks to 5 cores instead of one 
>>>>>> socket.
>>>>>> i do not know the syntax off hand, and i am sure Ralph will help you 
>>>>>> with that
>>>>> 
>>>>> Thanks, would be great if someone has that syntax.
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> Ado
>>>>> 
>>>>> 
>>>>> ___
>>>>> users mailing list
>>>>> users@lists.open-mpi.org
>>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>> 
>>>> ___
>>>> users mailing list
>>>> users@lists.open-mpi.org
>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>> 
>>> 
>> 
>> 
> 

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] fatal error for openmpi-master-201704200300-ded63c with SuSE Linux and gcc-6.3.0

2017-04-20 Thread r...@open-mpi.org
This is a known issue due to something in the NVIDIA library and its 
interactions with hwloc. Your tarball tag indicates you should have the 
attempted fix in it, so likely that wasn’t adequate. See 
https://github.com/open-mpi/ompi/pull/3283 for the discussion.
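
As a quick sanity check (not a fix), you can see whether your CUDA install ships the header the hwloc NVML plugin is looking for, e.g.:

    find /usr/local/cuda -name nvml.h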


> On Apr 20, 2017, at 8:11 AM, Siegmar Gross 
>  wrote:
> 
> Hi,
> 
> I tried to install openmpi-master-201704200300-ded63c on my "SUSE Linux
> Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-6.3.0.
> Unfortunately, "make" breaks with the following error for gcc. I've had
> no problems with cc.
> 
> 
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 136 grep 
> topology log.make.Linux.x86_64.64_gcc
>  CC   topology.lo
>  CC   topology-noos.lo
>  CC   topology-synthetic.lo
>  CC   topology-custom.lo
>  CC   topology-xml.lo
>  CC   topology-xml-nolibxml.lo
>  CC   topology-pci.lo
>  CC   topology-nvml.lo
> ../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-nvml.c:14:18:
>  fatal error: nvml.h: No such file or directory
> Makefile:2181: recipe for target 'topology-nvml.lo' failed
> make[4]: *** [topology-nvml.lo] Error 1
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 137
> 
> 
> 
> 
> 
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 137 grep 
> topology 
> ../openmpi-master-201704200300-ded63c5-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc
>  CC   topology.lo
>  CC   topology-noos.lo
>  CC   topology-synthetic.lo
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-synthetic.c",
>  line 851: warning: initializer will be sign-extended: -1
>  CC   topology-custom.lo
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-custom.c",
>  line 88: warning: initializer will be sign-extended: -1
>  CC   topology-xml.lo
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-xml.c",
>  line 1815: warning: initializer will be sign-extended: -1
>  CC   topology-xml-nolibxml.lo
>  CC   topology-pci.lo
>  CC   topology-nvml.lo
>  CC   topology-linux.lo
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-linux.c",
>  line 2919: warning: initializer will be sign-extended: -1
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-linux.c",
>  line 2919: warning: initializer will be sign-extended: -1
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-linux.c",
>  line 2919: warning: initializer will be sign-extended: -1
>  CC   topology-hardwired.lo
>  CC   topology-x86.lo
> "../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-x86.c",
>  line 122: warning: initializer will be sign-extended: -1
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 138
> 
> 
> 
> 
> I used the following commands to configure the package.
> 
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 145 head -7 
> config.log |tail -1
>  $ ../openmpi-master-201704200300-ded63c5/configure 
> --prefix=/usr/local/openmpi-master_64_gcc 
> --libdir=/usr/local/openmpi-master_64_gcc/lib64 
> --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin 
> --with-jdk-headers=/usr/local/jdk1.8.0_66/include 
> JAVA_HOME=/usr/local/jdk1.8.0_66 LDFLAGS=-m64 CC=gcc CXX=g++ FC=gfortran 
> CFLAGS=-m64 CXXFLAGS=-m64 FCFLAGS=-m64 CPP=cpp CXXCPP=cpp --enable-mpi-cxx 
> --enable-cxx-exceptions --enable-mpi-java --with-cuda=/usr/local/cuda 
> --with-valgrind=/usr/local/valgrind --with-hwloc=internal --without-verbs 
> --with-wrapper-cflags=-std=c11 -m64 --with-wrapper-cxxflags=-m64 
> --with-wrapper-fcflags=-m64 --enable-debug
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 146
> 
> 
> 
> 
> loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 146 head -7 
> ../openmpi-master-201704200300-ded63c5-Linux.x86_64.64_cc/config.log | tail -1
>  $ ../openmpi-master-201704200300-ded63c5/configure 
> --prefix=/usr/local/openmpi-master_64_cc 
> --libdir=/usr/local/openmpi-master_64_cc/lib64 
> --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin 
> --with-jdk-headers=/usr/local/jdk1.8.0_66/include 
> JAVA_HOME=/usr/local/jdk1.8.0_66 LDFLAGS=-m64 -mt -Wl,-z -Wl,noexecstack 
> -L/usr/local/lib64 -L/usr/local/cuda/lib64 CC=cc CXX=CC FC=f95 CFLAGS=-m64 
> -mt -I/usr/local/include -I/usr/local/cuda/include CXXFLAGS=-m64 
> -I/usr/local/include -I/usr/local/cuda/include FCFLAGS=-m64 CPP=cpp 
> -I/usr/local/include -I/usr/local/cuda/include CXXCPP=cpp 
> -I/usr/local/include -I/usr/local/cuda/include --enable-mpi-cxx 
> --enable-cxx-exceptions --enable-mpi-java --with-cuda=/usr/local/cuda 
> 

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-13 Thread r...@open-mpi.org
You can always specify a particular number of cpus to use for each process by 
adding it to the map-by directive:

 mpirun -np 8 --map-by ppr:2:socket:pe=5 --use-hwthread-cpus -report-bindings 
--mca plm_rsh_agent "qrsh" ./myid

would map 2 processes to each socket, binding each process to 5 HTs on that 
socket (since you told us to treat HTs as independent cpus). If you want us to 
bind to you 5 cores, then you need to remove that --use-hwthread-cpus directive.

As I said earlier in this thread, we are actively working with the OpenMP folks 
on a mechanism by which the two sides can coordinate these actions so it will 
be easier to get the desired behavior. For now, though, hopefully this will 
suffice.

> On Apr 13, 2017, at 6:31 AM, Heinz-Ado Arnolds  
> wrote:
> 
> On 13.04.2017 15:20, gil...@rist.or.jp wrote:
> ...
>> in your second case, there are 2 things
>> - MPI binds to socket, that is why two MPI tasks are assigned the same 
>> hyperthreads
>> - the GNU OpenMP runtime looks unable to figure out 2 processes use the 
>> same cores, and hence end up binding
>>  the OpenMP threads to the same cores.
>> my best bet is you should bind a MPI tasks to 5 cores instead of one 
>> socket.
>> i do not know the syntax off hand, and i am sure Ralph will help you 
>> with that
> 
> Thanks, would be great if someone has that syntax.
> 
> Cheers,
> 
> Ado
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-12 Thread r...@open-mpi.org
 
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0009(pid 05884), 
> 030, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0001 of 0004 is on pascal-1-00: MP thread  #0010(pid 05884), 
> 032, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0002 of 0004 is on pascal-1-00, Cpus_allowed_list: 
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0001(pid 05883), 
> 031, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0002(pid 05883), 
> 017, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0003(pid 05883), 
> 027, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0004(pid 05883), 
> 039, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0005(pid 05883), 
> 011, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0006(pid 05883), 
> 033, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0007(pid 05883), 
> 015, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0008(pid 05883), 
> 021, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0009(pid 05883), 
> 003, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0002 of 0004 is on pascal-1-00: MP thread  #0010(pid 05883), 
> 025, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0003 of 0004 is on pascal-3-00, Cpus_allowed_list: 
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0001(pid 07513), 
> 016, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0002(pid 07513), 
> 020, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0003(pid 07513), 
> 022, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0004(pid 07513), 
> 018, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0005(pid 07513), 
> 012, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0006(pid 07513), 
> 004, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0007(pid 07513), 
> 008, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0008(pid 07513), 
> 006, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0009(pid 07513), 
> 030, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0003 of 0004 is on pascal-3-00: MP thread  #0010(pid 07513), 
> 034, Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>   MPI Instance 0004 of 0004 is on pascal-3-00, Cpus_allowed_list: 
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0001(pid 07514), 
> 017, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0002(pid 07514), 
> 025, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0003(pid 07514), 
> 029, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>   MPI Instance 0004 of 0004 is on pascal-3-00: MP thread  #0004(pid 07514), 
> 003, Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-10 Thread r...@open-mpi.org
I’m not entirely sure I understand your reference to “real cores”. When we bind 
you to a core, we bind you to all the HTs that comprise that core. So, yes, 
with HT enabled, the binding report will list things by HT, but you’ll always 
be bound to the full core if you tell us bind-to core.

The default binding directive is bind-to socket when more than 2 processes are 
in the job, and that’s what you are showing. You can override that by adding 
"-bind-to core" to your cmd line if that is what you desire.

If you want to use individual HTs as independent processors, then 
“--use-hwthread-cpus -bind-to hwthreads” would indeed be the right combination.
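
For example, a minimal sketch (the executable name is a placeholder; note the option value is spelled "hwthread" in recent versions):

 mpirun -np 4 --use-hwthread-cpus --bind-to hwthread --report-bindings ./myid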

> On Apr 10, 2017, at 3:55 AM, Heinz-Ado Arnolds  
> wrote:
> 
> Dear OpenMPI users & developers,
> 
> I'm trying to distribute my jobs (with SGE) to a machine with a certain 
> number of nodes, each node having 2 sockets, each socket having 10 cores & 10 
> hyperthreads. I like to use only the real cores, no hyperthreading.
> 
> lscpu -a -e
> 
> CPU NODE SOCKET CORE L1d:L1i:L2:L3
> 0   0    0      0    0:0:0:0
> 1   1    1      1    1:1:1:1
> 2   0    0      2    2:2:2:0
> 3   1    1      3    3:3:3:1
> 4   0    0      4    4:4:4:0
> 5   1    1      5    5:5:5:1
> 6   0    0      6    6:6:6:0
> 7   1    1      7    7:7:7:1
> 8   0    0      8    8:8:8:0
> 9   1    1      9    9:9:9:1
> 10  0    0      10   10:10:10:0
> 11  1    1      11   11:11:11:1
> 12  0    0      12   12:12:12:0
> 13  1    1      13   13:13:13:1
> 14  0    0      14   14:14:14:0
> 15  1    1      15   15:15:15:1
> 16  0    0      16   16:16:16:0
> 17  1    1      17   17:17:17:1
> 18  0    0      18   18:18:18:0
> 19  1    1      19   19:19:19:1
> 20  0    0      0    0:0:0:0
> 21  1    1      1    1:1:1:1
> 22  0    0      2    2:2:2:0
> 23  1    1      3    3:3:3:1
> 24  0    0      4    4:4:4:0
> 25  1    1      5    5:5:5:1
> 26  0    0      6    6:6:6:0
> 27  1    1      7    7:7:7:1
> 28  0    0      8    8:8:8:0
> 29  1    1      9    9:9:9:1
> 30  0    0      10   10:10:10:0
> 31  1    1      11   11:11:11:1
> 32  0    0      12   12:12:12:0
> 33  1    1      13   13:13:13:1
> 34  0    0      14   14:14:14:0
> 35  1    1      15   15:15:15:1
> 36  0    0      16   16:16:16:0
> 37  1    1      17   17:17:17:1
> 38  0    0      18   18:18:18:0
> 39  1    1      19   19:19:19:1
> 
> How do I have to choose the options & parameters of mpirun to achieve this 
> behavior?
> 
> mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh" -report-bindings 
> ./myid
> 
> distributes to
> 
> [pascal-1-04:35735] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
> socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
> 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core 
> 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> [pascal-1-04:35735] MCW rank 1 bound to socket 1[core 10[hwt 0-1]], socket 
> 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], 
> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], socket 1[core 16[hwt 
> 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt 0-1]], socket 1[core 
> 19[hwt 0-1]]: [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> [pascal-1-03:00787] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], socket 
> 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], 
> socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 6[hwt 
> 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core 
> 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> [pascal-1-03:00787] MCW rank 3 bound to socket 1[core 10[hwt 0-1]], socket 
> 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]], 
> socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], socket 1[core 16[hwt 
> 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt 0-1]], socket 1[core 
> 19[hwt 0-1]]: [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> MPI Instance 0001 of 0004 is on pascal-1-04,pascal-1-04.MPA-Garching.MPG.DE, 
> Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> MPI Instance 0002 of 0004 is on pascal-1-04,pascal-1-04.MPA-Garching.MPG.DE, 
> Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> MPI Instance 0003 of 0004 is on pascal-1-03,pascal-1-03.MPA-Garching.MPG.DE, 
> Cpus_allowed_list:   
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> MPI Instance 0004 of 0004 is on pascal-1-03,pascal-1-03.MPA-Garching.MPG.DE, 
> Cpus_allowed_list:   
> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> 
> 

Re: [OMPI users] No more default core binding since 2.0.2?

2017-04-10 Thread r...@open-mpi.org

> On Apr 10, 2017, at 1:37 AM, Reuti <re...@staff.uni-marburg.de> wrote:
> 
>> 
>> On 10.04.2017 at 01:58, r...@open-mpi.org wrote:
>> 
>> Let me try to clarify. If you launch a job that has only 1 or 2 processes in 
>> it (total), then we bind to core by default. This is done because a job that 
>> small is almost always some kind of benchmark.
> 
> Yes, I see. But only if libnuma was compiled in AFAICS.
> 
> 
>> If there are more than 2 processes in the job (total), then we default to 
>> binding to NUMA (if NUMA’s are present - otherwise, to socket) across the 
>> entire job.
> 
> Mmh - can I spot a difference in --report-bindings between these two? To me 
> both looks like being bound to socket.

You won’t see a difference if the NUMA and socket are identical in terms of the 
cores they cover.

> 
> -- Reuti
> 
> 
>> You can always override these behaviors.
>> 
>>> On Apr 9, 2017, at 3:45 PM, Reuti <re...@staff.uni-marburg.de> wrote:
>>> 
>>>>> But I can't see a binding by core for number of processes <= 2. Does it 
>>>>> mean 2 per node or 2 overall for the `mpiexec`?
>>>> 
>>>> It’s 2 processes overall
>>> 
>>> Having a round-robin allocation in the cluster, this might not be what was 
>>> intended (to bind only one or two cores per exechost)?
>>> 
>>> Obviously the default changes (from --bind-to core to --bind-to socket), 
>>> whether I compiled Open MPI with or w/o libnuma (I wanted to get rid of the 
>>> warning in the output only – now it works). But "--bind-to core" I could 
>>> also use w/o libnuma and it worked, I got only that warning in addition 
>>> about the memory couldn't be bound.
>>> 
>>> BTW: I always had to use -ldl when using `mpicc`. Now, that I compiled in 
>>> libnuma, this necessity is gone.
>>> 
>>> -- Reuti
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] No more default core binding since 2.0.2?

2017-04-09 Thread r...@open-mpi.org
Let me try to clarify. If you launch a job that has only 1 or 2 processes in it 
(total), then we bind to core by default. This is done because a job that small 
is almost always some kind of benchmark.

If there are more than 2 processes in the job (total), then we default to 
binding to NUMA (if NUMA’s are present - otherwise, to socket) across the 
entire job.

You can always override these behaviors.
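
For example (sketches only; ./a.out is a placeholder):

 mpirun -np 16 --bind-to core ./a.out   # force core binding for a larger job
 mpirun -np 16 --bind-to none ./a.out   # turn binding off entirely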

> On Apr 9, 2017, at 3:45 PM, Reuti  wrote:
> 
>>> But I can't see a binding by core for number of processes <= 2. Does it 
>>> mean 2 per node or 2 overall for the `mpiexec`? 
>> 
>> It’s 2 processes overall
> 
> Having a round-robin allocation in the cluster, this might not be what was 
> intended (to bind only one or two cores per exechost)?
> 
> Obviously the default changes (from --bind-to core to --bind-to socket), 
> whether I compiled Open MPI with or w/o libnuma (I wanted to get rid of the 
> warning in the output only – now it works). But "--bind-to core" I could also 
> use w/o libnuma and it worked, I got only that warning in addition about the 
> memory couldn't be bound.
> 
> BTW: I always had to use -ldl when using `mpicc`. Now, that I compiled in 
> libnuma, this necessity is gone.
> 
> -- Reuti
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] No more default core binding since 2.0.2?

2017-04-09 Thread r...@open-mpi.org

> On Apr 9, 2017, at 1:49 PM, Reuti <re...@staff.uni-marburg.de> wrote:
> 
> 
> Hi,
> 
> On 09.04.2017 at 16:35, r...@open-mpi.org wrote:
> 
>> There has been no change in the policy - however, if you are oversubscribed, 
>> we did fix a bug to ensure that we don’t auto-bind in that situation
>> 
>> Can you pass along your cmd line? So far as I can tell, it still seems to be 
>> working.
> 
> I'm not sure whether it was the case with 1.8, but according to the man page 
> it binds now to sockets for number of processes > 2 . And this can lead the 
> effect that one sometimes may notice a drop in performance when just this 
> socket has other jobs running (by accident).
> 
> So, this is solved - I wasn't aware of the binding by socket.
> 
> But I can't see a binding by core for number of processes <= 2. Does it mean 
> 2 per node or 2 overall for the `mpiexec`? 

It’s 2 processes overall

> 
> - -- Reuti
> 
> 
>> 
>>> On Apr 9, 2017, at 3:40 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>> 
>>> Hi,
>>> 
>>> While I noticed an automatic core binding in Open MPI 1.8 (which in a 
>>> shared cluster may lead to oversubscribing of cores), I can't spot this any 
>>> longer in the 2.x series. So the question arises:
>>> 
>>> - Was this a general decision to no longer enable automatic core binding?
>>> 
>>> First I thought it might be because of:
>>> 
>>> - We define plm_rsh_agent=foo in $OMPI_ROOT/etc/openmpi-mca-params.conf
>>> - We compiled with --with-sge
>>> 
>>> But also started on the command line by `ssh` to the nodes, there seems no 
>>> automatic core binding to take place any longer.
>>> 
>>> -- Reuti
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
>> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Install openmpi.2.0.2 with certain option

2017-04-04 Thread r...@open-mpi.org
--without-cuda --without-slurm

should do the trick
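
i.e., something along these lines (the install prefix is just an example):

 ./configure --prefix=/opt/openmpi-2.0.2 --without-cuda --without-slurm
 make -j 8 && make install

Afterwards, `ompi_info` can be used to confirm that no slurm or cuda components were built.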

> On Apr 4, 2017, at 4:49 AM, Andrey Shtyrov via users 
>  wrote:
> 
> Dear openmpi community,
> 
> I need to install openmpi-2.0.2 on a system with slurm and cuda, but without 
> support for them.
> 
> I have tried writing "./configure  ... (--without-cuda or 
> --enable-mca-no-build=cuda)"
> 
> but it hasn't solved my problem. As for switching off support for slurm, 
> I don't know what parameter should be written. 
> What would you advise about this problem?
> 
> And I would be glad if the abbreviation FOO could be spelled out.
> 
> Thank you for your help,
> Shtyrov
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Performance degradation of OpenMPI 1.10.2 when oversubscribed?

2017-03-27 Thread r...@open-mpi.org
I’m confused - mpi_yield_when_idle=1 is precisely the “oversubscribed” setting. 
So why would you expect different results?

> On Mar 27, 2017, at 3:52 AM, Jordi Guitart  wrote:
> 
> Hi Ben,
> 
> Thanks for your feedback. As described here 
> (https://www.open-mpi.org/faq/?category=running#oversubscribing 
> ), OpenMPI 
> detects that I'm oversubscribing and runs in degraded mode (yielding the 
> processor). Anyway, I repeated the experiments setting explicitly the 
> yielding flag, and I obtained the same weird results:
> 
> $HOME/openmpi-bin-1.10.1/bin/mpirun --mca mpi_yield_when_idle 1 -np 36 
> taskset -c 0-27 $HOME/NPB/NPB3.3-MPI/bin/bt.C.36 -> Time in seconds = 82.79
> $HOME/openmpi-bin-1.10.2/bin/mpirun --mca mpi_yield_when_idle 1 -np 36 
> taskset -c 0-27 $HOME/NPB/NPB3.3-MPI/bin/bt.C.36 -> Time in seconds = 110.93
> 
> Given these results, it seems that spin-waiting is not causing the issue. I 
> also agree that this should not be caused by HyperThreading, given that 0-27 
> correspond to single HW threads on distinct cores, as shown in the following 
> output returned by the lstopo command:
> 
> Machine (128GB total)
>   NUMANode L#0 (P#0 64GB)
> Package L#0 + L3 L#0 (35MB)
>   L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
> PU L#0 (P#0)
> PU L#1 (P#28)
>   L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
> PU L#2 (P#1)
> PU L#3 (P#29)
>   L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
> PU L#4 (P#2)
> PU L#5 (P#30)
>   L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
> PU L#6 (P#3)
> PU L#7 (P#31)
>   L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
> PU L#8 (P#4)
> PU L#9 (P#32)
>   L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
> PU L#10 (P#5)
> PU L#11 (P#33)
>   L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
> PU L#12 (P#6)
> PU L#13 (P#34)
>   L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
> PU L#14 (P#7)
> PU L#15 (P#35)
>   L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
> PU L#16 (P#8)
> PU L#17 (P#36)
>   L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
> PU L#18 (P#9)
> PU L#19 (P#37)
>   L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
> PU L#20 (P#10)
> PU L#21 (P#38)
>   L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
> PU L#22 (P#11)
> PU L#23 (P#39)
>   L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12
> PU L#24 (P#12)
> PU L#25 (P#40)
>   L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13
> PU L#26 (P#13)
> PU L#27 (P#41)
> HostBridge L#0
>   PCIBridge
> PCI 8086:24f0
>   Net L#0 "ib0"
>   OpenFabrics L#1 "hfi1_0"
>   PCIBridge
> PCI 14e4:1665
>   Net L#2 "eno1"
> PCI 14e4:1665
>   Net L#3 "eno2"
>   PCIBridge
> PCIBridge
>   PCIBridge
> PCIBridge
>   PCI 102b:0534
> GPU L#4 "card0"
> GPU L#5 "controlD64"
>   NUMANode L#1 (P#1 64GB) + Package L#1 + L3 L#1 (35MB)
> L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14
>   PU L#28 (P#14)
>   PU L#29 (P#42)
> L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15
>   PU L#30 (P#15)
>   PU L#31 (P#43)
> L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16
>   PU L#32 (P#16)
>   PU L#33 (P#44)
> L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17
>   PU L#34 (P#17)
>   PU L#35 (P#45)
> L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18
>   PU L#36 (P#18)
>   PU L#37 (P#46)
> L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19
>   PU L#38 (P#19)
>   PU L#39 (P#47)
> L2 L#20 (256KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20
>   PU L#40 (P#20)
>   PU L#41 (P#48)
> L2 L#21 (256KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21
>   PU L#42 (P#21)
>   PU L#43 (P#49)
> L2 L#22 (256KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22
>   PU L#44 (P#22)
>   PU L#45 (P#50)
> L2 L#23 (256KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23
>   PU L#46 (P#23)
>   PU L#47 (P#51)
> L2 L#24 (256KB) + L1d L#24 (32KB) + L1i L#24 (32KB) + Core L#24
>   PU L#48 (P#24)
>   PU L#49 (P#52)
> L2 L#25 (256KB) + L1d L#25 (32KB) + L1i L#25 (32KB) + Core L#25
>   PU L#50 (P#25)
>   PU L#51 (P#53)
> L2 L#26 (256KB) + L1d L#26 (32KB) + L1i L#26 (32KB) + Core L#26
>   PU L#52 (P#26)
>   PU L#53 (P#54)
> L2 L#27 (256KB) + L1d L#27 (32KB) + L1i L#27 (32KB) 

Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread r...@open-mpi.org
Sorry folks - for some reason (probably timing for getting 2.1.0 out), the fix 
for this got pushed to v2.1.1 - see the PR here: 
https://github.com/open-mpi/ompi/pull/3163 



> On Mar 22, 2017, at 7:49 AM, Reuti  wrote:
> 
>> 
>> On 22.03.2017 at 15:31, Heinz-Ado Arnolds wrote:
>> 
>> Dear Reuti,
>> 
>> thanks a lot, you're right! But why did the default behavior change but not 
>> the value of this parameter:
>> 
>> 2.1.0: MCA plm rsh: parameter "plm_rsh_agent" (current value: "ssh : rsh", 
>> data source: default, level: 2 user/detail, type: string, synonyms: 
>> pls_rsh_agent, orte_rsh_agent)
>> The command used to launch executables on remote 
>> nodes (typically either "ssh" or "rsh")
>> 
>> 1.10.6:  MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", 
>> data source: default, level: 2 user/detail, type: string, synonyms: 
>> pls_rsh_agent, orte_rsh_agent)
>> The command used to launch executables on remote 
>> nodes (typically either "ssh" or "rsh")
>> 
>> That means there must have been changes in the code regarding that, perhaps 
>> for detecting SGE? Do you know of a way to revert to the old style (e.g. 
>> configure option)? Otherwise all my users have to add this option.
> 
> There was a discussion in https://github.com/open-mpi/ompi/issues/2947 
> 
> 
> For now you can make use of 
> https://www.open-mpi.org/faq/?category=tuning#setting-mca-params 
> 
> 
> Essentially to have it set for all users automatically, put:
> 
> plm_rsh_agent=foo
> 
> in $prefix/etc/openmpi-mca-params.conf of your central Open MPI 2.1.0 
> installation.
> 
> -- Reuti
> 
> 
>> Thanks again, and have a nice day
>> 
>> Ado Arnolds
>> 
>> On 22.03.2017 13:58, Reuti wrote:
>>> Hi,
>>> 
 On 22.03.2017 at 10:44, Heinz-Ado Arnolds wrote:
 
 Dear users and developers,
 
 first of all many thanks for all the great work you have done for OpenMPI!
 
 Up to OpenMPI-1.10.6 the mechanism for starting orted was to use SGE/qrsh:
 mpirun -np 8 --map-by ppr:4:node ./myid
 /opt/sge-8.1.8/bin/lx-amd64/qrsh -inherit -nostdin -V >>> Machine> orted --hnp-topo-sig 2N:2S:2L3:20L2:20L1:20C:40H:x86_64 -mca ess 
 "env" -mca orte_ess_jobid "1621884928" -mca orte_ess_vpid 1 -mca 
 orte_ess_num_procs "2" -mca orte_hnp_uri "1621884928.0;tcp://>>> Master>:41031" -mca plm "rsh" -mca rmaps_base_mapping_policy "ppr:4:node" 
 --tree-spawn
 
 Now with OpenMPI-2.1.0 (and the release candidates) "ssh" is used to start 
 orted:
 mpirun -np 8 --map-by ppr:4:node -mca mca_base_env_list OMP_NUM_THREADS=5 
 ./myid
 /usr/bin/ssh -x  
 PATH=/afs/./openmpi-2.1.0/bin:$PATH ; export PATH ; 
 LD_LIBRARY_PATH=/afs/./openmpi-2.1.0/lib:$LD_LIBRARY_PATH ; export 
 LD_LIBRARY_PATH ; 
 DYLD_LIBRARY_PATH=/afs/./openmpi-2.1.0/lib:$DYLD_LIBRARY_PATH ; export 
 DYLD_LIBRARY_PATH ;   /afs/./openmpi-2.1.0/bin/orted --hnp-topo-sig 
 2N:2S:2L3:20L2:20L1:20C:40H:x86_64 -mca ess "env" -mca ess_base_jobid 
 "1626013696" -mca ess_base_vpid 1 -mca ess_base_num_procs "2" -mca 
 orte_hnp_uri "1626013696.0;usock;tcp://:43019" -mca 
 plm_rsh_args "-x" -mca plm "rsh" -mca rmaps_base_mapping_policy 
 "ppr:4:node" -mca pmix "^s1,s2,cray"
 
 qrsh set the environment properly on the remote side, so that environment 
 variables from job scripts are properly transferred. With the ssh variant 
 the environment is not set properly on the remote side, and it seems that 
 there are handling problems with Kerberos tickets and/or AFS tokens.
 
 Is there any way to revert the 2.1.0 behavior to the 1.10.6 (use SGE/qrsh) 
 one? Are there mca params to set this?
 
 If you need more info, please let me know. (Job submitting machine and 
 target cluster are the same with all tests. SW is residing in AFS 
 directories visible on all machines. Parameter "plm_rsh_disable_qrsh" 
 current value: "false")
>>> 
>>> It looks like `mpirun` still needs:
>>> 
>>> -mca plm_rsh_agent foo
>>> 
>>> to allow SGE to be detected.
>>> 
>>> -- Reuti
>>> 
>>> 
>>> 
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> 

Re: [OMPI users] How to launch ompi-server?

2017-03-19 Thread r...@open-mpi.org
Well, your initial usage looks correct - you don’t launch ompi-server via 
mpirun. However, it sounds like there is probably a bug somewhere if it hangs 
as you describe.

Scratching my head, I can only recall less than a handful of people ever using 
these MPI functions to cross-connect jobs, so it does tend to fall into 
disrepair. As I said, I’ll try to repair it, at least for 3.0.
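
For reference, the intended flow is roughly this (an untested sketch; the URI file name is arbitrary, and ./server and ./client stand for your test programs):

 # start the server standalone - no mpirun involved
 ompi-server --no-daemonize -r /tmp/ompi-server.uri
 # in other shells, point each mpirun at that URI file
 mpirun -np 1 --ompi-server file:/tmp/ompi-server.uri ./server
 mpirun -np 1 --ompi-server file:/tmp/ompi-server.uri ./client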


> On Mar 19, 2017, at 4:37 AM, Adam Sylvester  wrote:
> 
> I am trying to use ompi-server with Open MPI 1.10.6.  I'm wondering if I 
> should run this with or without the mpirun command.  If I run this:
> 
> ompi-server --no-daemonize -r +
> 
> It prints something such as 959315968.0;tcp://172.31.3.57:45743 
>  to stdout but I have thus far been unable to 
> connect to it.  That is, in another application on another machine which is 
> on the same network as the ompi-server machine, I try
> 
> MPI_Info info;
> MPI_Info_create(&info);
> MPI_Info_set(info, "ompi_global_scope", "true");
> 
> char myport[MPI_MAX_PORT_NAME];
> MPI_Open_port(MPI_INFO_NULL, myport);
> MPI_Publish_name("adam-server", info, myport);
> 
> But the MPI_Publish_name() function hangs forever when I run it like
> 
> mpirun -np 1 --ompi-server "959315968.0;tcp://172.31.3.57:45743 
> " server
> 
> Blog posts are inconsistent as to if you should run ompi-server with mpirun 
> or not so I tried using it but this seg faults:
> 
> mpirun -np 1 ompi-server --no-daemonize -r +
> [ip-172-31-5-39:14785] *** Process received signal ***
> [ip-172-31-5-39:14785] Signal: Segmentation fault (11)
> [ip-172-31-5-39:14785] Signal code: Address not mapped (1)
> [ip-172-31-5-39:14785] Failing at address: 0x6e0
> [ip-172-31-5-39:14785] [ 0] /lib64/libpthread.so.0(+0xf370)[0x7f895d7a5370]
> [ip-172-31-5-39:14785] [ 1] 
> /usr/local/lib/libopen-pal.so.13(opal_hwloc191_hwloc_get_cpubind+0x9)[0x7f895e336839]
> [ip-172-31-5-39:14785] [ 2] 
> /usr/local/lib/libopen-rte.so.12(orte_ess_base_proc_binding+0x17a)[0x7f895e5d8fca]
> [ip-172-31-5-39:14785] [ 3] 
> /usr/local/lib/openmpi/mca_ess_env.so(+0x15dd)[0x7f895cdcd5dd]
> [ip-172-31-5-39:14785] [ 4] 
> /usr/local/lib/libopen-rte.so.12(orte_init+0x168)[0x7f895e5b5368]
> [ip-172-31-5-39:14785] [ 5] ompi-server[0x4014d4]
> [ip-172-31-5-39:14785] [ 6] 
> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f895d3f6b35]
> [ip-172-31-5-39:14785] [ 7] ompi-server[0x40176b]
> [ip-172-31-5-39:14785] *** End of error message ***
> 
> Am I doing something wrong?
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_Comm_accept()

2017-03-14 Thread r...@open-mpi.org
I don’t see an issue right away, though I know it has been brought up before. I 
hope to resolve it either this week or next - will reply to this thread with 
the PR link when ready.


> On Mar 13, 2017, at 6:16 PM, Adam Sylvester <op8...@gmail.com> wrote:
> 
> Bummer - thanks for the update.  I will revert back to 1.10.x for now then.  
> Should I file a bug report for this on GitHub or elsewhere?  Or if there's an 
> issue for this already open, can you point me to it so I can keep track of 
> when it's fixed?  Any best guess calendar-wise as to when you expect this to 
> be fixed?
> 
> Thanks.
> 
> On Mon, Mar 13, 2017 at 10:45 AM, r...@open-mpi.org 
> <mailto:r...@open-mpi.org> <r...@open-mpi.org <mailto:r...@open-mpi.org>> 
> wrote:
> You should consider it a bug for now - it won’t work in the 2.0 series, and I 
> don’t think it will work in the upcoming 2.1.0 release. Probably will be 
> fixed after that.
> 
> 
>> On Mar 13, 2017, at 5:17 AM, Adam Sylvester <op8...@gmail.com 
>> <mailto:op8...@gmail.com>> wrote:
>> 
>> As a follow-up, I tried this with Open MPI 1.10.4 and this worked as 
>> expected (the port formatting looks really different):
>> 
>> $ mpirun -np 1 ./server
>> Port name is 
>> 1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300 
>> <>
>> Accepted!
>> 
>> $ mpirun -np 1 ./client 
>> "1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300
>>  <>"
>> Trying with 
>> '1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300'
>>  <>
>> Connected!
>> 
>> I've found some other posts of users asking about similar things regarding 
>> the 2.x release - is this a bug?
>> 
>> On Sun, Mar 12, 2017 at 9:38 PM, Adam Sylvester <op8...@gmail.com 
>> <mailto:op8...@gmail.com>> wrote:
>> I'm using Open MPI 2.0.2 on RHEL 7.  I'm trying to use MPI_Open_port() / 
>> MPI_Comm_accept() / MPI_Conn_connect().  My use case is that I'll have two 
>> processes running on two machines that don't initially know about each other 
>> (i.e. I can't do the typical mpirun with a list of IPs); eventually I think 
>> I may need to use ompi-server to accomplish what I want but for now I'm 
>> trying to test this out running two processes on the same machine with some 
>> toy programs.
>> 
>> server.cpp creates the port, prints it, and waits for a client to accept 
>> using it:
>> 
>> #include <mpi.h>
>> #include <iostream>
>> 
>> int main(int argc, char** argv)
>> {
>> MPI_Init(NULL, NULL);
>> 
>> char myport[MPI_MAX_PORT_NAME];
>> MPI_Comm intercomm;
>> 
>> MPI_Open_port(MPI_INFO_NULL, myport);
>> std::cout << "Port name is " << myport << std::endl;
>> 
>> MPI_Comm_accept(myport, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
>> 
>> std::cout << "Accepted!" << std::endl;
>> 
>> MPI_Finalize();
>> return 0;
>> }
>> 
>> client.cpp takes in this port on the command line and tries to connect to it:
>> 
>> #include <mpi.h>
>> #include <iostream>
>> 
>> int main(int argc, char** argv)
>> {
>> MPI_Init(NULL, NULL);
>> 
>> MPI_Comm intercomm;
>> 
>> const std::string name(argv[1]);
>> std::cout << "Trying with '" << name << "'" << std::endl;
>> MPI_Comm_connect(name.c_str(), MPI_INFO_NULL, 0, MPI_COMM_SELF, 
>> &intercomm);
>> 
>> std::cout << "Connected!" << std::endl;
>> 
>> MPI_Finalize();
>> return 0;
>> }
>> 
>> I run the server first:
>> $ mpirun ./server
>> Port name is 2720137217.0:595361386
>> 
>> Then a second later I run the client:
>> $ mpirun ./client 2720137217.0:595361386
>> Trying with '2720137217.0:595361386'
>> 
>> Both programs hang for awhile and then eventually time out.  I have a 
>> feeling I'm misunderstanding something and doing something dumb but from all 
>> the examples I've seen online it seems like this should work.
>> 
>> Thanks for the help.
>> -Adam
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_Comm_accept()

2017-03-13 Thread r...@open-mpi.org
You should consider it a bug for now - it won’t work in the 2.0 series, and I 
don’t think it will work in the upcoming 2.1.0 release. Probably will be fixed 
after that.


> On Mar 13, 2017, at 5:17 AM, Adam Sylvester  wrote:
> 
> As a follow-up, I tried this with Open MPI 1.10.4 and this worked as expected 
> (the port formatting looks really different):
> 
> $ mpirun -np 1 ./server
> Port name is 
> 1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300
> Accepted!
> 
> $ mpirun -np 1 ./client 
> "1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300"
> Trying with 
> '1286733824.0;tcp://10.102.16.135:43074+1286733825.0;tcp://10.102.16.135::300'
> Connected!
> 
> I've found some other posts of users asking about similar things regarding 
> the 2.x release - is this a bug?
> 
> On Sun, Mar 12, 2017 at 9:38 PM, Adam Sylvester  > wrote:
> I'm using Open MPI 2.0.2 on RHEL 7.  I'm trying to use MPI_Open_port() / 
> MPI_Comm_accept() / MPI_Conn_connect().  My use case is that I'll have two 
> processes running on two machines that don't initially know about each other 
> (i.e. I can't do the typical mpirun with a list of IPs); eventually I think I 
> may need to use ompi-server to accomplish what I want but for now I'm trying 
> to test this out running two processes on the same machine with some toy 
> programs.
> 
> server.cpp creates the port, prints it, and waits for a client to accept 
> using it:
> 
> #include <mpi.h>
> #include <iostream>
> 
> int main(int argc, char** argv)
> {
> MPI_Init(NULL, NULL);
> 
> char myport[MPI_MAX_PORT_NAME];
> MPI_Comm intercomm;
> 
> MPI_Open_port(MPI_INFO_NULL, myport);
> std::cout << "Port name is " << myport << std::endl;
> 
> MPI_Comm_accept(myport, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
> 
> std::cout << "Accepted!" << std::endl;
> 
> MPI_Finalize();
> return 0;
> }
> 
> client.cpp takes in this port on the command line and tries to connect to it:
> 
> #include <mpi.h>
> #include <iostream>
> 
> int main(int argc, char** argv)
> {
> MPI_Init(NULL, NULL);
> 
> MPI_Comm intercomm;
> 
> const std::string name(argv[1]);
> std::cout << "Trying with '" << name << "'" << std::endl;
> MPI_Comm_connect(name.c_str(), MPI_INFO_NULL, 0, MPI_COMM_SELF, 
> &intercomm);
> 
> std::cout << "Connected!" << std::endl;
> 
> MPI_Finalize();
> return 0;
> }
> 
> I run the server first:
> $ mpirun ./server
> Port name is 2720137217.0:595361386
> 
> Then a second later I run the client:
> $ mpirun ./client 2720137217.0:595361386
> Trying with '2720137217.0:595361386'
> 
> Both programs hang for awhile and then eventually time out.  I have a feeling 
> I'm misunderstanding something and doing something dumb but from all the 
> examples I've seen online it seems like this should work.
> 
> Thanks for the help.
> -Adam
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI in docker container

2017-03-11 Thread r...@open-mpi.org
Past attempts have indicated that only TCP works well with Docker - if you want 
to use OPA, you’re probably better off using Singularity as your container.

http://singularity.lbl.gov/ 

The OMPI master has some optimized integration for Singularity, but 2.0.2 will 
work with it just fine as well.
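
If you do stay with Docker over plain TCP, a sketch that forces the TCP path explicitly (host file and binary taken from your example below) would be:

 mpirun --allow-run-as-root -np 2 -machinefile mpd.hosts \
    --mca pml ob1 --mca btl tcp,self ./mpi_hello.x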


> On Mar 11, 2017, at 11:09 AM, Ender GÜLER  wrote:
> 
> Hi Josh,
> 
> Thanks for your suggestion. When I add "-mca pml ob1" it worked. Actually I 
> need the psm support (but not with this scenario). Here's the story: 
> 
> I compiled the openmpi source with psm2 support because the host has OmniPath 
> device and my first try is to test whether I can use the hardware or not and 
> I ended up testing the compiled OpenMPI against the different transport modes 
> without success. 
> 
> The psm2 support is working when running directly from physical host and I 
> suppose the docker layer has something to do with this error. But I cannot 
> figure out what causes this situation.
> 
> Do you guys, have any idea what to look at next? I'll ask opinion at the 
> Docker Forums but before that I try to get more information and I wondered 
> whether anyone else have this kind of problem before.
> 
> Regards,
> 
> Ender
> 
> On Sat, Mar 11, 2017 at 6:19 PM Josh Hursey  > wrote:
> From the stack track it looks like it's failing the PSM2 MTL, which you 
> shouldn't need (or want?) in this scenario.
> 
> Try adding this additional MCA parameter to your command line:
>  -mca pml ob1
> 
> That will force Open MPI's selection such that it avoids that component. That 
> might get you further along.
> 
> 
> On Sat, Mar 11, 2017 at 7:49 AM, Ender GÜLER  > wrote:
> Hi there,
> 
> I try to use openmpi in a docker container. My host and container OS is 
> CentOS 7 (7.2.1511 to be exact). When I try to run a simple MPI hello world 
> application, the app core dumps every time with BUS ERROR. The OpenMPI 
> version is 2.0.2 and I compiled in the container. When I copied the 
> installation from container to host, it runs without any problem.
> 
> Have you ever tried to run OpenMPI and encountered a problem like this one. 
> If so what can be wrong? What should I do to find the root cause and solve 
> the problem? The very same application can be run with IntelMPI in the 
> container without any problem.
> 
> I pasted the output of my mpirun command and its output below.
> 
> [root@cn15 ~]# mpirun --allow-run-as-root -mca btl sm -np 2 -machinefile 
> mpd.hosts ./mpi_hello.x
> [cn15:25287] *** Process received signal ***
> [cn15:25287] Signal: Bus error (7)
> [cn15:25287] Signal code: Non-existant physical address (2)
> [cn15:25287] Failing at address: 0x7fe2d0fbf000
> [cn15:25287] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fe2d53e9100]
> [cn15:25287] [ 1] /lib64/libpsm2.so.2(+0x4b034)[0x7fe2d5a9a034]
> [cn15:25287] [ 2] /lib64/libpsm2.so.2(+0xc45f)[0x7fe2d5a5b45f]
> [cn15:25287] [ 3] /lib64/libpsm2.so.2(+0xc706)[0x7fe2d5a5b706]
> [cn15:25287] [ 4] /lib64/libpsm2.so.2(+0x10d60)[0x7fe2d5a5fd60]
> [cn15:25287] [ 5] /lib64/libpsm2.so.2(psm2_ep_open+0x41e)[0x7fe2d5a5e8de]
> [cn15:25287] [ 6] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(ompi_mtl_psm2_module_init+0x1df)[0x7fe2d69b5d5b]
> [cn15:25287] [ 7] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x1b3249)[0x7fe2d69b7249]
> [cn15:25287] [ 8] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(ompi_mtl_base_select+0xc2)[0x7fe2d69b2956]
> [cn15:25287] [ 9] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x216c9f)[0x7fe2d6a1ac9f]
> [cn15:25287] [10] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(mca_pml_base_select+0x29b)[0x7fe2d69f7566]
> [cn15:25287] [11] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(ompi_mpi_init+0x665)[0x7fe2d687e0f4]
> [cn15:25287] [12] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(MPI_Init+0x99)[0x7fe2d68b1cb4]
> [cn15:25287] [13] ./mpi_hello.x[0x400927]
> [cn15:25287] [14] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fe2d5039b15]
> [cn15:25287] [15] ./mpi_hello.x[0x400839]
> [cn15:25287] *** End of error message ***
> [cn15:25286] *** Process received signal ***
> [cn15:25286] Signal: Bus error (7)
> [cn15:25286] Signal code: Non-existant physical address (2)
> [cn15:25286] Failing at address: 0x7fd4abb18000
> [cn15:25286] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fd4b3f56100]
> [cn15:25286] [ 1] /lib64/libpsm2.so.2(+0x4b034)[0x7fd4b4607034]
> [cn15:25286] [ 2] /lib64/libpsm2.so.2(+0xc45f)[0x7fd4b45c845f]
> [cn15:25286] [ 3] /lib64/libpsm2.so.2(+0xc706)[0x7fd4b45c8706]
> [cn15:25286] [ 4] /lib64/libpsm2.so.2(+0x10d60)[0x7fd4b45ccd60]
> [cn15:25286] [ 5] /lib64/libpsm2.so.2(psm2_ep_open+0x41e)[0x7fd4b45cb8de]
> [cn15:25286] [ 6] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(ompi_mtl_psm2_module_init+0x1df)[0x7fd4b5522d5b]
> [cn15:25286] [ 7] 
> /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x1b3249)[0x7fd4b5524249]
> [cn15:25286] [ 8] 
> 

Re: [OMPI users] MPI for microcontrolles without OS

2017-03-08 Thread r...@open-mpi.org
A quick web search can answer your quest, I believe - here are a few hits I got 
(Texas Instruments has been active in this area):

http://processors.wiki.ti.com/index.php/MCSDK_HPC_3.x_OpenMPI_Under_Review 
<http://processors.wiki.ti.com/index.php/MCSDK_HPC_3.x_OpenMPI_Under_Review>
https://e2e.ti.com/support/applications/high-performance-computing/f/952/t/440905
 
<https://e2e.ti.com/support/applications/high-performance-computing/f/952/t/440905>

Several of our members have Raspberry Pi systems running OMPI - looks something 
like this one:

https://www.hackster.io/darthbison/raspberry-pi-cluster-with-mpi-4602cb 
<https://www.hackster.io/darthbison/raspberry-pi-cluster-with-mpi-4602cb>

and here’s a little book on how to do it:

https://www.packtpub.com/hardware-and-creative/raspberry-pi-super-cluster 
<https://www.packtpub.com/hardware-and-creative/raspberry-pi-super-cluster>

or one of many online explanations:

http://www.southampton.ac.uk/~sjc/raspberrypi/pi_supercomputer_southampton_web.pdf
 
<http://www.southampton.ac.uk/~sjc/raspberrypi/pi_supercomputer_southampton_web.pdf>

HTH
Ralph

> On Mar 8, 2017, at 10:41 AM, Mateusz Tasz <mateusz.t...@gmail.com> wrote:
> 
> Hi,
> 
> Thanks for your answer. Although I am still confused. As far as I know, TCP
> communication is not a problem for microcontrollers, so that cannot be
> the crucial reason for requiring an OS. Maybe something else is also necessary -
> maybe regarding memory - do you know?
> Do you know where I can find a ported version of OMPI for
> microcontrollers (hopefully with documentation :) ?  I admit that
> having an OS on the board is nice and gives a high level of abstraction.
> But I believe that sometimes the lower level would be necessary - that's
> why I am pushing to find a solution.
> As for the ported version - was it working properly?
> 
> Thanks in advance,
> Mateusz Tasz
> 
> 
> 2017-03-08 18:23 GMT+01:00 r...@open-mpi.org <r...@open-mpi.org>:
>> OpenMPI has been ported to microcontrollers before, but it does require at 
>> least a minimal OS to provide support (e.g., TCP for communications). Most 
>> IoT systems already include an OS on them for just that reason. I personally 
>> have OMPI running on a little Edison board using the OS that comes with it - 
>> no major changes were required.
>> 
>> Others have used various typical microcontroller real-time OS on their 
>> systems, and porting OMPI to them usually isn’t that bad. May require some 
>> configuration mods.
>> 
>> 
>>> On Mar 8, 2017, at 9:00 AM, Mateusz Tasz <mateusz.t...@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I am a student. I am attracted by the concept of MPI and I would like to
>>> apply this idea to bare metal devices like microcontrollers, e.g.
>>> stm32. But your solution requires an operating system on board. May I
>>> ask why it is necessary? Can I neglect it? And if so, how can I do it?
>>> I ask because I'd like to apply this concept to IoT systems where data
>>> can be processed by a few devices in a local neighbourhood.
>>> 
>>> Thank in advance,
>>> Mateusz Tasz
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI for microcontrolles without OS

2017-03-08 Thread r...@open-mpi.org
OpenMPI has been ported to microcontrollers before, but it does require at 
least a minimal OS to provide support (e.g., TCP for communications). Most IoT 
systems already include an OS on them for just that reason. I personally have 
OMPI running on a little Edison board using the OS that comes with it - no 
major changes were required.

Others have used various typical microcontroller real-time OS on their systems, 
and porting OMPI to them usually isn’t that bad. May require some configuration 
mods.


> On Mar 8, 2017, at 9:00 AM, Mateusz Tasz  wrote:
> 
> Hello,
> 
> I am a student. I am attracted by the concept of MPI and I would like to
> apply this idea to bare metal devices like microcontrollers, e.g.
> stm32. But your solution requires an operating system on board. May I
> ask why it is necessary? Can I neglect it? And if so, how can I do it?
> I ask because I'd like to apply this concept to IoT systems where data
> can be processed by a few devices in a local neighbourhood.
> 
> Thank in advance,
> Mateusz Tasz
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Issues with different IB adapters and openmpi 2.0.2

2017-02-28 Thread r...@open-mpi.org
The root cause is that the nodes are defined as “heterogeneous” because the 
difference in HCAs causes a difference in selection logic. For scalability 
purposes, we don’t circulate the choice of PML as that isn’t something mpirun 
can “discover” and communicate.

One option we could pursue is to provide a mechanism by which we add the HCAs 
to the topology “signature” sent back by the daemon. This would allow us to 
detect the difference, and then ensure that the PML selection is included in 
the circulated wireup data so the system can at least warn you of the problem 
instead of silently hanging.
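
In the meantime, explicitly forcing the same PML everywhere remains the workaround, e.g. (a sketch; ./a.out is a placeholder):

 mpirun --mca pml ob1 -np <N> ./a.out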


> On Feb 28, 2017, at 10:38 AM, Orion Poplawski  wrote:
> 
> On 02/27/2017 05:19 PM, Howard Pritchard wrote:
>> Hi Orion
>> 
>> Does the problem occur if you only use font2 and 3?  Do you have MXM 
>> installed
>> on the font1 node?
> 
> No, running across font2/3 is fine.  No idea what MXM is.
> 
>> The 2.x series is using PMIX and it could be that is impacting the PML sanity
>> check.
>> 
>> Howard
>> 
>> 
>> Orion Poplawski > schrieb am
>> Mo. 27. Feb. 2017 um 14:50:
>> 
>>We have a couple nodes with different IB adapters in them:
>> 
>>font1/var/log/lspci:03:00.0 InfiniBand [0c06]: Mellanox Technologies 
>> MT25204
>>[InfiniHost III Lx HCA] [15b3:6274] (rev 20)
>>font2/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 
>> InfiniBand
>>HCA [1077:7220] (rev 02)
>>font3/var/log/lspci:03:00.0 InfiniBand [0c06]: QLogic Corp. IBA7220 
>> InfiniBand
>>HCA [1077:7220] (rev 02)
>> 
>>With 1.10.3 we saw the following errors with mpirun:
>> 
>>[font2.cora.nwra.com:13982 ]
>>[[23220,1],10] selected pml cm, but peer
>>[[23220,1],0] on font1 selected pml ob1
>> 
>>which crashed MPI_Init.
>> 
>>We worked around this by passing "--mca pml ob1".  I notice now with 
>> openmpi
>>2.0.2 without that option I no longer see errors, but the mpi program will
>>hang shortly after startup.  Re-adding the option makes it work, so I'm
>>assuming the underlying problem is still the same, but openmpi appears to 
>> have
>>stopped alerting me to the issue.
>> 
>>Thoughts?
>> 
>>--
>>Orion Poplawski
>>Technical Manager  720-772-5637
>>NWRA, Boulder/CoRA Office FAX: 303-415-9702
>>3380 Mitchell Lane   or...@nwra.com
>>
>>Boulder, CO 80301   http://www.nwra.com
>>___
>>users mailing list
>>users@lists.open-mpi.org 
>>https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> 
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
> 
> 
> -- 
> Orion Poplawski
> Technical Manager  720-772-5637
> NWRA, Boulder/CoRA Office FAX: 303-415-9702
> 3380 Mitchell Lane   or...@nwra.com
> Boulder, CO 80301   http://www.nwra.com
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] State of the DVM in Open MPI

2017-02-28 Thread r...@open-mpi.org
Hi Reuti

The DVM in master seems to be fairly complete, but several organizations are in 
the process of automating tests for it so it gets more regular exercise.

If you are using a version in OMPI 2.x, those are early prototype - we haven’t 
updated the code in the release branches. The more production-ready version 
will be in 3.0, and we’ll start supporting it there.

Meantime, we do appreciate any suggestions and bug reports as we polish it up.


> On Feb 28, 2017, at 2:17 AM, Reuti  wrote:
> 
> Hi,
> 
> Only by reading recent posts I got aware of the DVM. This would be a welcome 
> feature for our setup*. But I see not all options working as expected - is it 
> still a work in progress, or should all work as advertised?
> 
> 1)
> 
> $ soft@server:~> orte-submit -cf foo --hnp file:/home/reuti/dvmuri -n 1 touch 
> /home/reuti/hacked
> 
> Open MPI has detected that a parameter given to a command line
> option does not match the expected format:
> 
>  Option: np
>  Param:  foo
> 
> ==> The given option is -cf, not -np
> 
> 2)
> 
> According to `man orte-dvm` there is -H, -host, --host, -machinefile, 
> -hostfile but none of them seem operational (Open MPI 2.0.2). A given 
> hostlist given by SGE is honored though.
> 
> -- Reuti
> 
> 
> *) We run Open MPI jobs inside SGE. This works fine. Some applications invoke 
> several `mpiexec`-calls during their execution and rely on temporary files 
> they created in the last step(s). While this is working fine on one and the 
> same machine, it fails in case SGE granted slots on several machines as the 
> scratch directories created by `qrsh -inherit …` vanish once the 
> `mpiexec`-call on this particular node finishes (and not at the end of the 
> complete job). I can mimic persistent scratch directories in SGE for a 
> complete job, but invoking the DVM before and shutting it down later on 
> (either by hand in the job script or by SGE killing all remains at the end of 
> the job) might be more straight forward (looks like `orte-dvm` is started by 
> `qrsh -inherit …` too).
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Using OpenMPI / ORTE as cluster aware GNU Parallel

2017-02-27 Thread r...@open-mpi.org

> On Feb 27, 2017, at 9:39 AM, Reuti  wrote:
> 
> 
>> Am 27.02.2017 um 18:24 schrieb Angel de Vicente :
>> 
>> […]
>> 
>> For a small group of users if the DVM can run with my user and there is
>> no restriction on who can use it or if I somehow can authorize others to
>> use it (via an authority file or similar) that should be enough.
> 
> AFAICS there is no user authorization at all. Everyone can hijack a running 
> DVM once he knows the URI. The only problem might be, that all processes are 
> running under the account of the user who started the DVM. I.e. output files 
> have to go to the home directory of this user, as any other user can't write 
> to his own directory any longer this way.

We can add some authorization protection, at least at the user/group level. One 
can resolve the directory issue by creating some place that has group 
authorities, and then requesting that to be the working directory.
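
For example (a sketch only - the shared path, group name, and URI file are hypothetical):

 # a group-writable working area
 mkdir -p /shared/dvm-work && chgrp dvmusers /shared/dvm-work && chmod g+ws /shared/dvm-work
 # ask the DVM to run jobs with that working directory
 mpirun --hnp file:/home/dvmowner/dvmuri -wdir /shared/dvm-work -np 1 ./task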

> 
> Running the DVM under root might help, but this would be a high risk that any 
> faulty script might write to a place where sensible system information is 
> stored and may leave the machine unusable afterwards.
> 

I would advise against that

> My first attempts using DVM often leads to a terminated DVM once a process 
> returned with a non-zero exit code. But once the DVM is gone, the queued jobs 
> might be lost too I fear. I would wish that the DVM could be more forgivable 
> (or this feature be adjustable what to do in case of a non-zero exit code).

We just fixed that issue the other day :-)

> 
> -- Reuti
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Using OpenMPI / ORTE as cluster aware GNU Parallel

2017-02-27 Thread r...@open-mpi.org

> On Feb 27, 2017, at 4:58 AM, Angel de Vicente <ang...@iac.es> wrote:
> 
> Hi,
> 
> "r...@open-mpi.org" <r...@open-mpi.org> writes:
>> You might want to try using the DVM (distributed virtual machine)
>> mode in ORTE. You can start it on an allocation using the “orte-dvm”
>> cmd, and then submit jobs to it with “mpirun --hnp ”, where foo
>> is either the contact info printed out by orte-dvm, or the name of
>> the file you told orte-dvm to put that info in. You’ll need to take
>> it from OMPI master at this point.
> 
> this question looked interesting so I gave it a try. In a cluster with
> Slurm I had no problem submitting a job which launched an orte-dvm
> -report-uri ... and then use that file to launch jobs onto that virtual
> machine via orte-submit. 
> 
> To be useful to us at this point, I should be able to start executing
> jobs if there are cores available and just hold them in a queue if the
> cores are already filled. At this point this is not happenning, and if I
> try to submit a second job while the previous one has not finished, I
> get a message like:
> 
> ,
> | DVM ready
> | --
> | All nodes which are allocated for this job are already filled.
> | --
> `
> 
> With the DVM, is it possible to keep these jobs in some sort of queue,
> so that they will be executed when the cores get free?

It wouldn’t be hard to do so - as long as it was just a simple FIFO scheduler. 
I wouldn’t want it to get too complex.

> 
> Thanks,
> -- 
> Ángel de Vicente
> http://www.iac.es/galeria/angelv/  
> -
> ADVERTENCIA: Sobre la privacidad y cumplimiento de la Ley de Protección de 
> Datos, acceda a http://www.iac.es/disclaimer.php
> WARNING: For more information on privacy and fulfilment of the Law concerning 
> the Protection of Data, consult http://www.iac.es/disclaimer.php?lang=en
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Using OpenMPI / ORTE as cluster aware GNU Parallel

2017-02-23 Thread r...@open-mpi.org
You might want to try using the DVM (distributed virtual machine) mode in ORTE. 
You can start it on an allocation using the “orte-dvm” cmd, and then submit 
jobs to it with “mpirun --hnp <foo>”, where foo is either the contact info 
printed out by orte-dvm, or the name of the file you told orte-dvm to put that 
info in. You’ll need to take it from OMPI master at this point.
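
A minimal sketch of that flow (the URI file name is arbitrary; ./a.out is a placeholder):

 # inside the allocation: start the DVM and record its contact info
 orte-dvm -report-uri /tmp/dvm.uri &
 # submit work to the running DVM - serial commands work too
 mpirun --hnp file:/tmp/dvm.uri -np 4 ./a.out
 mpirun --hnp file:/tmp/dvm.uri -np 1 hostname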

Alternatively, you can get just the DVM bits by downloading the PMIx Reference 
Server (https://github.com/pmix/pmix-reference-server 
). It’s just ORTE, but with it 
locked to the DVM operation. So a simple “psrvr” starts the machine, and then 
“prun” executes cmds (supports all the orterun options, doesn’t need to be told 
how to contact psrvr).

Both will allow you to run serial as well as parallel codes (so long as they 
are built against OMPI master). We are working on providing cross-version PMIx 
support - at that time, you’ll be able to run OMPI v2.0 and above against 
either one as well.

HTH
Ralph

> On Feb 23, 2017, at 1:41 PM, Brock Palen  wrote:
> 
> Is it possible to use mpirun / orte as a load balancer for running serial
> jobs in parallel similar to GNU Parallel?
> https://www.biostars.org/p/63816/ 
> 
> Reason is on any major HPC system you normally want to use a resource
> manager launcher (TM, slurm etc)  and not ssh like gnu parallel.
> 
> I recall there being a way to give OMPI a stack of work todo from the talk
> at SC this year, but I can't figure it out if it does what I think it
> should do.
> 
> Thanks,
> 
> Brock Palen
> www.umich.edu/~brockp 
> Director Advanced Research Computing - TS
> XSEDE Campus Champion
> bro...@umich.edu 
> (734)936-1985
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] More confusion about --map-by!

2017-02-23 Thread r...@open-mpi.org
Just as a fun follow-up: if you wanted to load-balance across nodes as well as 
within nodes, then you would add the “span” modifier to map-by:

$ mpirun --map-by socket:span,pe=2 --rank-by core --report-bindings -n 8 
hostname
[rhc001:162391] SETTING BINDING TO CORE
[rhc001:162391] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 
1[hwt 0-1]]: 
[BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:162391] MCW rank 1 bound to socket 0[core 2[hwt 0-1]], socket 0[core 
3[hwt 0-1]]: 
[../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc001:162391] MCW rank 2 bound to socket 1[core 12[hwt 0-1]], socket 1[core 
13[hwt 0-1]]: 
[../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]
[rhc001:162391] MCW rank 3 bound to socket 1[core 14[hwt 0-1]], socket 1[core 
15[hwt 0-1]]: 
[../../../../../../../../../../../..][../../BB/BB/../../../../../../../..]


[rhc002.cluster:150295] MCW rank 4 bound to socket 0[core 0[hwt 0-1]], socket 
0[core 1[hwt 0-1]]: 
[BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:150295] MCW rank 5 bound to socket 0[core 2[hwt 0-1]], socket 
0[core 3[hwt 0-1]]: 
[../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc002.cluster:150295] MCW rank 6 bound to socket 1[core 12[hwt 0-1]], socket 
1[core 13[hwt 0-1]]: 
[../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]
[rhc002.cluster:150295] MCW rank 7 bound to socket 1[core 14[hwt 0-1]], socket 
1[core 15[hwt 0-1]]: 
[../../../../../../../../../../../..][../../BB/BB/../../../../../../../..]

“span” causes ORTE to treat all the sockets etc. as being on a single giant 
node.

HTH
Ralph


> On Feb 23, 2017, at 6:38 AM, r...@open-mpi.org wrote:
> 
> From the mpirun man page:
> 
> **
> Open MPI employs a three-phase procedure for assigning process locations and 
> ranks:
> mapping
> Assigns a default location to each process
> ranking
> Assigns an MPI_COMM_WORLD rank value to each process
> binding
> Constrains each process to run on specific processors
> The mapping step is used to assign a default location to each process based 
> on the mapper being employed. Mapping by slot, node, and sequentially results 
> in the assignment of the processes to the node level. In contrast, mapping by 
> object, allows the mapper to assign the process to an actual object on each 
> node.
> 
> Note: the location assigned to the process is independent of where it will be 
> bound - the assignment is used solely as input to the binding algorithm.
> 
> The mapping of processes to nodes can be defined not just with 
> general policies but also, if necessary, using arbitrary mappings that cannot 
> be described by a simple policy. One can use the "sequential mapper," which 
> reads the hostfile line by line, assigning processes to nodes in whatever 
> order the hostfile specifies. Use the -mca rmaps seq option. For example, 
> using the same hostfile as before:
> 
> mpirun -hostfile myhostfile -mca rmaps seq ./a.out
> 
> will launch three processes, one on each of nodes aa, bb, and cc, 
> respectively. The slot counts don’t matter; one process is launched per line 
> on whatever node is listed on the line.
> 
> Another way to specify arbitrary mappings is with a rankfile, which gives you 
> detailed control over process binding as well. Rankfiles are discussed below.
> 
> The second phase focuses on the ranking of the process within the job’s 
> MPI_COMM_WORLD. Open MPI separates this from the mapping procedure to allow 
> more flexibility in the relative placement of MPI processes. 
> 
> The binding phase actually binds each process to a given set of processors. 
> This can improve performance if the operating system is placing processes 
> suboptimally. For example, it might oversubscribe some multi-core processor 
> sockets, leaving other sockets idle; this can lead processes to contend 
> unnecessarily for common resources. Or, it might spread processes out too 
> widely; this can be suboptimal if application performance is sensitive to 
> interprocess communication costs. Binding can also keep the operating system 
> from migrating processes excessively, regardless of how optimally those 
> processes were placed to begin with.
> 
> 
> So what you probably want is:  --map-by socket:pe=N --rank-by core
> 
> Remember, the pe=N modifier automatically forces binding at the cpu level. 
> The rank-by directive defaults to rank-by socket when you map-by socket, 
> hence you need to specify that you want it to map by core instead. Here is 
> the result of doing that on my box:
> 
> $ mpirun --map-by socket:pe=2 --rank-by core --report-bindings -n 8 hostname
> [rhc001:1

Re: [OMPI users] More confusion about --map-by!

2017-02-23 Thread r...@open-mpi.org
From the mpirun man page:

**
Open MPI employs a three-phase procedure for assigning process locations and 
ranks:
mapping
Assigns a default location to each process
ranking
Assigns an MPI_COMM_WORLD rank value to each process
binding
Constrains each process to run on specific processors
The mapping step is used to assign a default location to each process based on 
the mapper being employed. Mapping by slot, node, and sequentially results in 
the assignment of the processes to the node level. In contrast, mapping by 
object, allows the mapper to assign the process to an actual object on each 
node.

Note: the location assigned to the process is independent of where it will be 
bound - the assignment is used solely as input to the binding algorithm.

The mapping of processes to nodes can be defined not just with general 
policies but also, if necessary, using arbitrary mappings that cannot be 
described by a simple policy. One can use the "sequential mapper," which reads 
the hostfile line by line, assigning processes to nodes in whatever order the 
hostfile specifies. Use the -mca rmaps seq option. For example, using the same 
hostfile as before:

mpirun -hostfile myhostfile -mca rmaps seq ./a.out

will launch three processes, one on each of nodes aa, bb, and cc, respectively. 
The slot counts don’t matter; one process is launched per line on whatever node 
is listed on the line.

Another way to specify arbitrary mappings is with a rankfile, which gives you 
detailed control over process binding as well. Rankfiles are discussed below.

The second phase focuses on the ranking of the process within the job’s 
MPI_COMM_WORLD. Open MPI separates this from the mapping procedure to allow 
more flexibility in the relative placement of MPI processes. 

The binding phase actually binds each process to a given set of processors. 
This can improve performance if the operating system is placing processes 
suboptimally. For example, it might oversubscribe some multi-core processor 
sockets, leaving other sockets idle; this can lead processes to contend 
unnecessarily for common resources. Or, it might spread processes out too 
widely; this can be suboptimal if application performance is sensitive to 
interprocess communication costs. Binding can also keep the operating system 
from migrating processes excessively, regardless of how optimally those 
processes were placed to begin with.


So what you probably want is:  --map-by socket:pe=N --rank-by core

Remember, the pe=N modifier automatically forces binding at the cpu level. The 
rank-by directive defaults to rank-by socket when you map-by socket, hence you 
need to specify that you want it to map by core instead. Here is the result of 
doing that on my box:

$ mpirun --map-by socket:pe=2 --rank-by core --report-bindings -n 8 hostname
[rhc001:154283] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 
1[hwt 0-1]]: 
[BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 1 bound to socket 0[core 2[hwt 0-1]], socket 0[core 
3[hwt 0-1]]: 
[../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 2 bound to socket 0[core 4[hwt 0-1]], socket 0[core 
5[hwt 0-1]]: 
[../../../../BB/BB/../../../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 3 bound to socket 0[core 6[hwt 0-1]], socket 0[core 
7[hwt 0-1]]: 
[../../../../../../BB/BB/../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 4 bound to socket 1[core 12[hwt 0-1]], socket 1[core 
13[hwt 0-1]]: 
[../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]
[rhc001:154283] MCW rank 5 bound to socket 1[core 14[hwt 0-1]], socket 1[core 
15[hwt 0-1]]: 
[../../../../../../../../../../../..][../../BB/BB/../../../../../../../..]
[rhc001:154283] MCW rank 6 bound to socket 1[core 16[hwt 0-1]], socket 1[core 
17[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../BB/BB/../../../../../..]
[rhc001:154283] MCW rank 7 bound to socket 1[core 18[hwt 0-1]], socket 1[core 
19[hwt 0-1]]: 
[../../../../../../../../../../../..][../../../../../../BB/BB/../../../..]


HTH
Ralph

> On Feb 23, 2017, at 6:18 AM,   wrote:
> 
> Mark,
> 
> what about
> mpirun -np 6 -map-by slot:PE=4 --bind-to core --report-bindings ./prog
> 
> it is a fit for 1) and 2) but not 3)
> 
> if you use OpenMP and want 2 threads per task, then you can
> export OMP_NUM_THREADS=2
> not to use 4 threads by default (with most OpenMP runtimes)
> 
> Cheers,
> 
> Gilles
> - Original Message -
>> Hi,
>> 
>> I'm still trying to figure out how to express the core binding I want 
> to 
>> openmpi 2.x via the --map-by option. Can anyone help, please?
>> 
>> I bet I'm being dumb, but it's proving tricky to achieve the following 
>> aims (most important first):
>> 
>> 1) Maximise memory bandwidth usage (e.g. load balance ranks across

Re: [OMPI users] Segmentation Fault when using OpenMPI 1.10.6 and PGI 17.1.0 on POWER8

2017-02-21 Thread r...@open-mpi.org
Can you provide a backtrace with line numbers from a debug build? We don’t get 
much testing with lsf, so it is quite possible there is a bug in there.
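
For reference, the usual recipe for that is to reconfigure with --enable-debug (keeping 
the rest of your configure line), enable core dumps, reproduce the crash, and then pull 
the backtrace out of the core file with gdb - roughly (paths and file names illustrative):

$ ./configure --enable-debug --with-lsf=...   # plus the other options used before
$ ulimit -c unlimited
$ mpirun ... ./IMB-MPI1                       # reproduce the segfault
$ gdb ./IMB-MPI1 core.<pid>
(gdb) bt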

> On Feb 21, 2017, at 7:39 PM, Hammond, Simon David (-EXP)  
> wrote:
> 
> Hi OpenMPI Users,
> 
> Has anyone successfully tested OpenMPI 1.10.6 with PGI 17.1.0 on POWER8 with 
> the LSF scheduler (—with-lsf=..)?
> 
> I am getting this error when the code hits MPI_Finalize. It causes the job to 
> abort (i.e. exit the LSF session) when I am running interactively.
> 
> Are there any materials we can supply to aid debugging/problem isolation?
> 
> [white23:58788] *** Process received signal ***
> [white23:58788] Signal: Segmentation fault (11)
> [white23:58788] Signal code: Invalid permissions (2)
> [white23:58788] Failing at address: 0x108e0810
> [white23:58788] [ 0] [0x10050478]
> [white23:58788] [ 1] [0x0]
> [white23:58788] [ 2] 
> /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libopen-rte.so.12(+0x1b6b0)[0x1071b6b0]
> [white23:58788] [ 3] 
> /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libopen-rte.so.12(orte_finalize+0x70)[0x1071b5b8]
> [white23:58788] [ 4] 
> /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libmpi.so.12(ompi_mpi_finalize+0x760)[0x10121dc8]
> [white23:58788] [ 5] 
> /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libmpi.so.12(PMPI_Finalize+0x6c)[0x10153154]
> [white23:58788] [ 6] ./IMB-MPI1[0x100028dc]
> [white23:58788] [ 7] /lib64/libc.so.6(+0x24700)[0x104b4700]
> [white23:58788] [ 8] /lib64/libc.so.6(__libc_start_main+0xc4)[0x104b48f4]
> [white23:58788] *** End of error message ***
> [white22:73620] *** Process received signal ***
> [white22:73620] Signal: Segmentation fault (11)
> [white22:73620] Signal code: Invalid permissions (2)
> [white22:73620] Failing at address: 0x108e0810
> 
> 
> Thanks,
> 
> S.
> 
> —
> 
> Si Hammond
> Scalable Computer Architectures
> Sandia National Laboratories, NM, USA
> 
> [Sent from Remote Connection, Please excuse typos]
> 
> 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI and Singularity

2017-02-20 Thread r...@open-mpi.org
If you can send us some more info on how it breaks, that would be helpful. I’ll 
file it as an issue so we can track things

Thanks
Ralph


> On Feb 20, 2017, at 9:13 AM, Bennet Fauber <ben...@umich.edu> wrote:
> 
> I got mixed results when bringing a container that doesn't have the IB
> and Torque libraries compiled into the OMPI inside the container to a
> cluster where it does.
> 
> The short summary is that mutlinode communication seems unreliable.  I
> can mostly get up to 8 procs, two-per-node, to run, but beyond that
> not.  In a couple of cases, a particular node seemed able to cause a
> problem.  I am going to try again making the configure line inside the
> container the same as outside, but I have to chase down the IB and
> Torque to do so.
> 
> If you're interested in how it breaks, I can send you some more
> information.  If there are diagnostics you would like, I can try to
> provide those.  I will be gone starting Thu for a week.
> 
> -- bennet
> 
> 
> 
> 
> On Fri, Feb 17, 2017 at 11:20 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:
>> I -think- that is correct, but you may need the verbs library as well - I 
>> honestly don’t remember if the configury checks for functions in the library 
>> or not. If so, then you’ll need that wherever you build OMPI, but everything 
>> else is accurate
>> 
>> Good luck - and let us know how it goes!
>> Ralph
>> 
>>> On Feb 17, 2017, at 4:34 PM, Bennet Fauber <ben...@umich.edu> wrote:
>>> 
>>> Ralph.
>>> 
>>> I will be building from the Master branch at github.com for testing
>>> purposes.  We are not 'supporting' Singularity container creation, but
>>> we do hope to be able to offer some guidance, so I think we can
>>> finesse the PMIx version, yes?
>>> 
>>> That is good to know about the verbs headers being the only thing
>>> needed; thanks for that detail.  Sometimes the library also needs to
>>> be present.
>>> 
>>> Also very good to know that the host mpirun will start processes, as
>>> we are using cgroups, and if the processes get started by a
>>> non-tm-supporting MPI, they will be outside the proper cgroup.
>>> 
>>> So, just to recap, if I install from the current master at
>>> http://github.com/open-mpi/ompi.git on the host system and within the
>>> container, I copy the verbs headers into the container, then configure
>>> and build OMPI within the container and ignore TM support, I should be
>>> able to copy the container to the cluster and run it with verbs and
>>> the system OMPI using tm.
>>> 
>>> If a user were to build without the verbs support, it would still run,
>>> but it would fall back to non-verbs communication, so it would just be
>>> commensurately slower.
>>> 
>>> Let me know if I've garbled things.  Otherwise, wish me luck, and have
>>> a good weekend!
>>> 
>>> Thanks,  -- bennet
>>> 
>>> 
>>> 
>>> On Fri, Feb 17, 2017 at 7:24 PM, r...@open-mpi.org <r...@open-mpi.org> 
>>> wrote:
>>>> The embedded Singularity support hasn’t made it into the OMPI 2.x release 
>>>> series yet, though OMPI will still work within a Singularity container 
>>>> anyway.
>>>> 
>>>> Compatibility across the container boundary is always a problem, as your 
>>>> examples illustrate. If the system is using one OMPI version and the 
>>>> container is using another, then the only concern is compatibility across 
>>>> the container boundary of the process-to-ORTE daemon communication. In the 
>>>> OMPI 2.x series and beyond, this is done with PMIx. OMPI v2.0 is based on 
>>>> PMIx v1.x, and so will OMPI v2.1. Thus, there is no compatibility issue 
>>>> there. However, that statement is _not_ true for OMPI v1.10 and earlier 
>>>> series.
>>>> 
>>>> Future OMPI versions will utilize PMIx v2 and above, which include a 
>>>> cross-version compatibility layer. Thus, you shouldn’t have any issues 
>>>> mixing and matching OMPI versions from this regard.
>>>> 
>>>> However, your second example is a perfect illustration of where 
>>>> containerization can break down. If you build your container on a system 
>>>> that doesn’t have (for example) tm and verbs installed on it, then those 
>>>> OMPI components will not be built. The tm component won’t matter as the 
>>>> system version of mpirun will be executing, and it presumably k

Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?

2017-02-18 Thread r...@open-mpi.org
FWIW: have you taken a look at the event notification mechanisms in PMIx yet? 
The intent there, among other features, is to provide async notification of 
events generated either by the system (e.g., node failures and/or congestion) 
or other application processes.

https://pmix.github.io/master

OMPI includes PMIx support beginning with OMPI v2.0, and various RMs are 
releasing their integrated support as well.
Ralph

> On Feb 18, 2017, at 10:07 AM, Michel Lesoinne <mlesoi...@cmsoftinc.com> wrote:
> 
> I am also a proponent of the multiple thread support. For many reasons:
>  - code simplification
>  - easier support of computation/communication overlap with fewer 
> synchronization points
>  - possibility of creating exception aware MPI Code (I think the MPI standard 
> cruelly lacks constructs for a natural clean handling of application 
> exceptions across processes)
> 
> So it is good to hear there is progress.
> 
> On Feb 18, 2017 7:43 AM, "r...@open-mpi.org <mailto:r...@open-mpi.org>" 
> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
> We have been making a concerted effort to resolve outstanding issues as the 
> interest in threaded applications has grown. It should be pretty good now, 
> but we do see occasional bug reports, so it isn’t perfect.
> 
> > On Feb 18, 2017, at 12:14 AM, Mark Dixon <m.c.di...@leeds.ac.uk 
> > <mailto:m.c.di...@leeds.ac.uk>> wrote:
> >
> > On Fri, 17 Feb 2017, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
> >
> >> Depends on the version, but if you are using something in the v2.x range, 
> >> you should be okay with just one installed version
> >
> > Thanks Ralph.
> >
> > How good is MPI_THREAD_MULTIPLE support these days and how far up the 
> > wishlist is it, please?
> >
> > We don't get many openmpi-specific queries from users but, other than core 
> > binding, it seems to be the thing we get asked about the most (I normally 
> > point those people at mvapich2 or intelmpi instead).
> >
> > Cheers,
> >
> > Mark
> > ___
> > users mailing list
> > users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> > <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?

2017-02-18 Thread r...@open-mpi.org
We have been making a concerted effort to resolve outstanding issues as the 
interest in threaded applications has grown. It should be pretty good now, but 
we do see occasional bug reports, so it isn’t perfect.
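
For anyone deciding which install to point users at, ompi_info reports the thread 
support compiled into a given copy, e.g.:

$ ompi_info | grep -i thread

which should show whether that build carries MPI_THREAD_MULTIPLE support.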

> On Feb 18, 2017, at 12:14 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
> 
> On Fri, 17 Feb 2017, r...@open-mpi.org wrote:
> 
>> Depends on the version, but if you are using something in the v2.x range, 
>> you should be okay with just one installed version
> 
> Thanks Ralph.
> 
> How good is MPI_THREAD_MULTIPLE support these days and how far up the 
> wishlist is it, please?
> 
> We don't get many openmpi-specific queries from users but, other than core 
> binding, it seems to be the thing we get asked about the most (I normally 
> point those people at mvapich2 or intelmpi instead).
> 
> Cheers,
> 
> Mark
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI and Singularity

2017-02-17 Thread r...@open-mpi.org
I -think- that is correct, but you may need the verbs library as well - I 
honestly don’t remember if the configury checks for functions in the library or 
not. If so, then you’ll need that wherever you build OMPI, but everything else 
is accurate

Good luck - and let us know how it goes!
Ralph

> On Feb 17, 2017, at 4:34 PM, Bennet Fauber <ben...@umich.edu> wrote:
> 
> Ralph.
> 
> I will be building from the Master branch at github.com for testing
> purposes.  We are not 'supporting' Singularity container creation, but
> we do hope to be able to offer some guidance, so I think we can
> finesse the PMIx version, yes?
> 
> That is good to know about the verbs headers being the only thing
> needed; thanks for that detail.  Sometimes the library also needs to
> be present.
> 
> Also very good to know that the host mpirun will start processes, as
> we are using cgroups, and if the processes get started by a
> non-tm-supporting MPI, they will be outside the proper cgroup.
> 
> So, just to recap, if I install from the current master at
> http://github.com/open-mpi/ompi.git on the host system and within the
> container, I copy the verbs headers into the container, then configure
> and build OMPI within the container and ignore TM support, I should be
> able to copy the container to the cluster and run it with verbs and
> the system OMPI using tm.
> 
> If a user were to build without the verbs support, it would still run,
> but it would fall back to non-verbs communication, so it would just be
> commensurately slower.
> 
> Let me know if I've garbled things.  Otherwise, wish me luck, and have
> a good weekend!
> 
> Thanks,  -- bennet
> 
> 
> 
> On Fri, Feb 17, 2017 at 7:24 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:
>> The embedded Singularity support hasn’t made it into the OMPI 2.x release 
>> series yet, though OMPI will still work within a Singularity container 
>> anyway.
>> 
>> Compatibility across the container boundary is always a problem, as your 
>> examples illustrate. If the system is using one OMPI version and the 
>> container is using another, then the only concern is compatibility across 
>> the container boundary of the process-to-ORTE daemon communication. In the 
>> OMPI 2.x series and beyond, this is done with PMIx. OMPI v2.0 is based on 
>> PMIx v1.x, and so will OMPI v2.1. Thus, there is no compatibility issue 
>> there. However, that statement is _not_ true for OMPI v1.10 and earlier 
>> series.
>> 
>> Future OMPI versions will utilize PMIx v2 and above, which include a 
>> cross-version compatibility layer. Thus, you shouldn’t have any issues 
>> mixing and matching OMPI versions from this regard.
>> 
>> However, your second example is a perfect illustration of where 
>> containerization can break down. If you build your container on a system 
>> that doesn’t have (for example) tm and verbs installed on it, then those 
>> OMPI components will not be built. The tm component won’t matter as the 
>> system version of mpirun will be executing, and it presumably knows how to 
>> interact with Torque.
>> 
>> However, if you run that container on a system that has verbs, your 
>> application won’t be able to utilize the verbs support because those 
>> components were never compiled. Note that the converse is not true: if you 
>> build your container on a system that has verbs installed, you can then run 
>> it on a system that doesn’t have verbs support and those components will 
>> dynamically disqualify themselves.
>> 
>> Remember, you only need the verbs headers to be installed - you don’t have 
>> to build on a machine that actually has a verbs-supporting NIC installed 
>> (this is how the distributions get around the problem). Thus, it isn’t hard 
>> to avoid this portability problem - you just need to think ahead a bit.
>> 
>> HTH
>> Ralph
>> 
>>> On Feb 17, 2017, at 3:49 PM, Bennet Fauber <ben...@umich.edu> wrote:
>>> 
>>> I am wishing to follow the instructions on the Singularity web site,
>>> 
>>>   http://singularity.lbl.gov/docs-hpc
>>> 
>>> to test Singularity and OMPI on our cluster.  My previously normal
>>> configure for the 1.x series looked like this.
>>> 
>>> ./configure --prefix=/usr/local \
>>>  --mandir=${PREFIX}/share/man \
>>>  --with-tm --with-verbs \
>>>  --disable-dlopen --enable-shared
>>>  CC=gcc CXX=g++ FC=gfortran
>>> 
>>> I have a couple of wonderments.
>>> 
>>> First, I presume it will be best to have the same version of OMPI
>>> inside th

Re: [OMPI users] OpenMPI and Singularity

2017-02-17 Thread r...@open-mpi.org
The embedded Singularity support hasn’t made it into the OMPI 2.x release 
series yet, though OMPI will still work within a Singularity container anyway.

Compatibility across the container boundary is always a problem, as your 
examples illustrate. If the system is using one OMPI version and the container 
is using another, then the only concern is compatibility across the container 
boundary of the process-to-ORTE daemon communication. In the OMPI 2.x series 
and beyond, this is done with PMIx. OMPI v2.0 is based on PMIx v1.x, and so 
will OMPI v2.1. Thus, there is no compatibility issue there. However, that 
statement is _not_ true for OMPI v1.10 and earlier series.

Future OMPI versions will utilize PMIx v2 and above, which include a 
cross-version compatibility layer. Thus, you shouldn’t have any issues mixing 
and matching OMPI versions from this regard.

However, your second example is a perfect illustration of where 
containerization can break down. If you build your container on a system that 
doesn’t have (for example) tm and verbs installed on it, then those OMPI 
components will not be built. The tm component won’t matter as the system 
version of mpirun will be executing, and it presumably knows how to interact 
with Torque.

However, if you run that container on a system that has verbs, your application 
won’t be able to utilize the verbs support because those components were never 
compiled. Note that the converse is not true: if you build your container on a 
system that has verbs installed, you can then run it on a system that doesn’t 
have verbs support and those components will dynamically disqualify themselves.

Remember, you only need the verbs headers to be installed - you don’t have to 
build on a machine that actually has a verbs-supporting NIC installed (this is 
how the distributions get around the problem). Thus, it isn’t hard to avoid 
this portability problem - you just need to think ahead a bit.
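
A quick way to confirm what actually got compiled into a given copy (host or container) 
is ompi_info, for example:

$ ompi_info | grep openib

If nothing comes back, the verbs (openib) components were not built into that install.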

HTH
Ralph

> On Feb 17, 2017, at 3:49 PM, Bennet Fauber  wrote:
> 
> I am wishing to follow the instructions on the Singularity web site,
> 
>http://singularity.lbl.gov/docs-hpc
> 
> to test Singularity and OMPI on our cluster.  My previously normal
> configure for the 1.x series looked like this.
> 
> ./configure --prefix=/usr/local \
>   --mandir=${PREFIX}/share/man \
>   --with-tm --with-verbs \
>   --disable-dlopen --enable-shared
>   CC=gcc CXX=g++ FC=gfortran
> 
> I have a couple of wonderments.
> 
> First, I presume it will be best to have the same version of OMPI
> inside the container as out, but how sensitive will it be to minor
> versions?  All 2.1.x version should be fine, but not mix 2.1.x outside
> with 2.2.x inside or vice-versa (might be backward compatible but not
> forward)?
> 
> Second, if someone builds OMPI inside their container on an external
> system, without tm and verbs, then brings the container to our system,
> will the tm and verbs be handled by the calling mpirun from the host
> system, and the OMPI inside the container won't care?  Will not having
> those inside the container cause them to be suppressed outside?
> 
> Thanks in advance,  -- bennet
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] "-map-by socket:PE=1" doesn't do what I expect

2017-02-17 Thread r...@open-mpi.org
Mark - this is now available in master. Will look at what might be required to 
bring it to 2.0

> On Feb 15, 2017, at 5:49 AM, r...@open-mpi.org wrote:
> 
> 
>> On Feb 15, 2017, at 5:45 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
>> 
>> On Wed, 15 Feb 2017, r...@open-mpi.org wrote:
>> 
>>> Ah, yes - I know what the problem is. We weren’t expecting a PE value of 1 
>>> - the logic is looking expressly for values > 1 as we hadn’t anticipated 
>>> this use-case.
>> 
>> Is it a sensible use-case, or am I crazy?
> 
> Not crazy, I’d say. The expected way of doing it would be “--map-by socket 
> --bind-to core”. However, I can see why someone might expect pe=1 to work.
> 
>> 
>>> I can make that change. I’m off to a workshop for the next day or so, but 
>>> can probably do this on the plane.
>> 
>> You're a star - thanks :)
>> 
>> Mark___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] fatal error with openmpi-master-201702150209-404fe32 on Linux with Sun C

2017-02-17 Thread r...@open-mpi.org
Thanks Gilles!

> On Feb 15, 2017, at 10:24 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> 
> Ralph,
> 
> 
> i was able to rewrite some macros to make Oracle compilers happy, and filed 
> https://github.com/pmix/master/pull/309 for that
> 
> 
> Siegmar,
> 
> 
> meanwhile, feel free to manually apply the attached patch
> 
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 2/16/2017 8:09 AM, r...@open-mpi.org wrote:
>> I guess it was the next nightly tarball, but not next commit. However, it 
>> was almost certainly 7acef48 from Gilles that updated the PMIx code.
>> 
>> Gilles: can you perhaps take a peek?
>> 
>> Sent from my iPad
>> 
>>> On Feb 15, 2017, at 11:43 AM, Siegmar Gross 
>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>> 
>>> Hi Ralph,
>>> 
>>> I get the error already with openmpi-master-201702100209-51def91 which
>>> is the next version after openmpi-master-201702080209-bc2890e, if I'm
>>> right.
>>> 
>>> loki openmpi-master 146 grep Error \
>>>  
>>> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc
>>>  \
>>>  
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc
>>> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:
>>>   GENERATE mpi/man/man3/MPI_Error_class.3
>>> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:
>>>   GENERATE mpi/man/man3/MPI_Error_string.3
>>> 
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[5]:
>>>  *** [dstore/pmix_esh.lo] Error 1
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[4]:
>>>  *** [all-recursive] Error 1
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[3]:
>>>  *** [all-recursive] Error 1
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[2]:
>>>  *** [all-recursive] Error 1
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[1]:
>>>  *** [all-recursive] Error 1
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make:
>>>  *** [all-recursive] Error 1
>>> 
>>> "pmix_esh.lo" isn't available for openmpi-master-201702100209-51def91. It's
>>> also not available for the other versions which break.
>>> 
>>> loki openmpi-master 147 find 
>>> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc -name pmix_esh.lo
>>> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/opal/mca/pmix/pmix2x/pmix/src/dstore/pmix_esh.lo
>>> 
>>> loki openmpi-master 148 find 
>>> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc -name pmix_esh.lo
>>> loki openmpi-master 149
>>> 
>>> Which files do you need? Which commands shall I run to get differences of
>>> files?
>>> 
>>> 
>>> Kind regards
>>> 
>>> Siegmar
>>> 
>>> 
>>>> Am 15.02.2017 um 17:42 schrieb r...@open-mpi.org:
>>>> If we knew what line in that file was causing the compiler to barf, we
>>>> could at least address it. There is probably something added in recent
>>>> commits that is causing problems for the compiler.
>>>> 
>>>> So checking to see what commit might be triggering the failure would be 
>>>> most helpful.
>>>> 
>>>> 
>>>>> On Feb 15, 2017, at 8:29 AM, Siegmar Gross 
>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>>> 
>>>>> Hi Gilles,
>>>>> 
>>>>>> this looks like a compiler crash, and it should be reported to Oracle.
>>>>> I can try, but I don't think that they are interested, because
>>>>> we don't have a contract any longer. I didn't get the error
>>>>> building openmpi-master-201702080209-bc2890e as you can see
>>>>> below. Would it be helpful to build all intermediate versions
>>>>> to find out when the error occurred the first time? Perhaps we
>>>>> can identify which change of code is responsible for the error.
>>>>> 
>>>>> loki openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc 111 grep 
>>>>> Error log.make.Linux.x86_64.64_cc
>>>>> GENERATE mpi/man/man3/

Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?

2017-02-17 Thread r...@open-mpi.org
Depends on the version, but if you are using something in the v2.x range, you 
should be okay with just one installed version

> On Feb 17, 2017, at 4:41 AM, Mark Dixon  wrote:
> 
> Hi,
> 
> We have some users who would like to try out openmpi MPI_THREAD_MULTIPLE 
> support on our InfiniBand cluster. I am wondering if we should enable it on 
> our production cluster-wide version, or install it as a separate "here be 
> dragons" copy.
> 
> I seem to recall openmpi folk cautioning that MPI_THREAD_MULTIPLE support was 
> pretty crazy and that enabling it could have problems for 
> non-MPI_THREAD_MULTIPLE codes (never mind codes that explicitly used it), so 
> such an install shouldn't be used unless for codes that actually need it.
> 
> Is that still the case, please?
> 
> Thanks,
> 
> Mark
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] fatal error with openmpi-master-201702150209-404fe32 on Linux with Sun C

2017-02-15 Thread r...@open-mpi.org
I guess it was the next nightly tarball, but not next commit. However, it was 
almost certainly 7acef48 from Gilles that updated the PMIx code.

Gilles: can you perhaps take a peek?

Sent from my iPad

> On Feb 15, 2017, at 11:43 AM, Siegmar Gross 
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
> 
> Hi Ralph,
> 
> I get the error already with openmpi-master-201702100209-51def91 which
> is the next version after openmpi-master-201702080209-bc2890e, if I'm
> right.
> 
> loki openmpi-master 146 grep Error \
>  
> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc
>  \
>  
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc
> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:
>   GENERATE mpi/man/man3/MPI_Error_class.3
> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:
>   GENERATE mpi/man/man3/MPI_Error_string.3
> 
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[5]:
>  *** [dstore/pmix_esh.lo] Error 1
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[4]:
>  *** [all-recursive] Error 1
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[3]:
>  *** [all-recursive] Error 1
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[2]:
>  *** [all-recursive] Error 1
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make[1]:
>  *** [all-recursive] Error 1
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc:make:
>  *** [all-recursive] Error 1
> 
> "pmix_esh.lo" isn't available for openmpi-master-201702100209-51def91. It's
> also not available for the other versions which break.
> 
> loki openmpi-master 147 find 
> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc -name pmix_esh.lo
> openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc/opal/mca/pmix/pmix2x/pmix/src/dstore/pmix_esh.lo
> 
> loki openmpi-master 148 find 
> openmpi-master-201702100209-51def91-Linux.x86_64.64_cc -name pmix_esh.lo
> loki openmpi-master 149
> 
> Which files do you need? Which commands shall I run to get differences of
> files?
> 
> 
> Kind regards
> 
> Siegmar
> 
> 
>> Am 15.02.2017 um 17:42 schrieb r...@open-mpi.org:
>> If we knew what line in that file was causing the compiler to barf, we
>> could at least address it. There is probably something added in recent
>> commits that is causing problems for the compiler.
>> 
>> So checking to see what commit might be triggering the failure would be most 
>> helpful.
>> 
>> 
>>> On Feb 15, 2017, at 8:29 AM, Siegmar Gross 
>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>> 
>>> Hi Gilles,
>>> 
>>>> this looks like a compiler crash, and it should be reported to Oracle.
>>> 
>>> I can try, but I don't think that they are interested, because
>>> we don't have a contract any longer. I didn't get the error
>>> building openmpi-master-201702080209-bc2890e as you can see
>>> below. Would it be helpful to build all intermediate versions
>>> to find out when the error occurred the first time? Perhaps we
>>> can identify which change of code is responsible for the error.
>>> 
>>> loki openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc 111 grep Error 
>>> log.make.Linux.x86_64.64_cc
>>> GENERATE mpi/man/man3/MPI_Error_class.3
>>> GENERATE mpi/man/man3/MPI_Error_string.3
>>> 
>>> loki openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc 112 cd 
>>> ../openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc
>>> loki openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc 113 grep Error 
>>> log.make.Linux.x86_64.64_cc
>>> make[5]: *** [dstore/pmix_esh.lo] Error 1
>>> make[4]: *** [all-recursive] Error 1
>>> make[3]: *** [all-recursive] Error 1
>>> make[2]: *** [all-recursive] Error 1
>>> make[1]: *** [all-recursive] Error 1
>>> make: *** [all-recursive] Error 1
>>> loki openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc 114
>>> 
>>> 
>>> Kind regards and thank you very much for your help
>>> 
>>> Siegmar
>>> 
>>> 
>>>> 
>>>> Cheers,
>>>> 
>>>> Gilles
>>>> 
>>>> On Wednesday, February 15, 2017, Siegmar Gross 
>>>> <siegmar.gr...@informatik.hs-fulda.de 
>>>> <mailto:sieg

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread r...@open-mpi.org
Yes, 2.0.1 has a spawn issue. We believe that 2.0.2 is okay if you want to give 
it a try 

Sent from my iPad

> On Feb 15, 2017, at 1:14 PM, Jason Maldonis <maldo...@wisc.edu> wrote:
> 
> Just to throw this out there -- to me, that doesn't seem to be just a problem 
> with SLURM. I'm guessing the exact same error would be thrown interactively 
> (unless I didn't read the above messages carefully enough).  I had a lot of 
> problems running spawned jobs on 2.0.x a few months ago, so I switched back 
> to 1.10.2 and everything worked. Just in case that helps someone.
> 
> Jason
> 
>> On Wed, Feb 15, 2017 at 1:09 PM, Anastasia Kruchinina 
>> <nastja.kruchin...@gmail.com> wrote:
>> Hi!
>> 
>> I am doing like this:
>> 
>> sbatch  -N 2 -n 5 ./job.sh
>> 
>> where job.sh is:
>> 
>> #!/bin/bash -l
>> module load openmpi/2.0.1-icc
>> mpirun -np 1 ./manager 4
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> On 15 February 2017 at 17:58, r...@open-mpi.org <r...@open-mpi.org> wrote:
>>> The cmd line looks fine - when you do your “sbatch” request, what is in the 
>>> shell script you give it? Or are you saying you just “sbatch” the mpirun 
>>> cmd directly?
>>> 
>>> 
>>>> On Feb 15, 2017, at 8:07 AM, Anastasia Kruchinina 
>>>> <nastja.kruchin...@gmail.com> wrote:
>>>> 
>>>> Hi, 
>>>> 
>>>> I am running like this: 
>>>> mpirun -np 1 ./manager
>>>> 
>>>> Should I do it differently?
>>>> 
>>>> I also thought that all sbatch does is create an allocation and then run 
>>>> my script in it. But it seems it is not since I am getting these results...
>>>> 
>>>> I would like to upgrade to OpenMPI, but no clusters near me have it yet :( 
>>>> So I even cannot check if it works with OpenMPI 2.0.2. 
>>>> 
>>>>> On 15 February 2017 at 16:04, Howard Pritchard <hpprit...@gmail.com> 
>>>>> wrote:
>>>>> Hi Anastasia,
>>>>> 
>>>>> Definitely check the mpirun when in batch environment but you may also 
>>>>> want to upgrade to Open MPI 2.0.2.
>>>>> 
>>>>> Howard
>>>>> 
>>>>> r...@open-mpi.org <r...@open-mpi.org> schrieb am Mi. 15. Feb. 2017 um 
>>>>> 07:49:
>>>>>> Nothing immediate comes to mind - all sbatch does is create an 
>>>>>> allocation and then run your script in it. Perhaps your script is using 
>>>>>> a different “mpirun” command than when you type it interactively?
>>>>>> 
>>>>>>> On Feb 14, 2017, at 5:11 AM, Anastasia Kruchinina 
>>>>>>> <nastja.kruchin...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi, 
>>>>>>> 
>>>>>>> I am trying to use MPI_Comm_spawn function in my code. I am having 
>>>>>>> trouble with openmpi 2.0.x + sbatch (batch system Slurm). 
>>>>>>> My test program is located here: 
>>>>>>> http://user.it.uu.se/~anakr367/files/MPI_test/ 
>>>>>>> 
>>>>>>> When I am running my code I am getting an error: 
>>>>>>> 
>>>>>>> OPAL ERROR: Timeout in file 
>>>>>>> ../../../../openmpi-2.0.1/opal/mca/pmix/base/pmix_base_fns.c at line 
>>>>>>> 193 
>>>>>>> *** An error occurred in MPI_Init_thread 
>>>>>>> *** on a NULL communicator 
>>>>>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
>>>>>>> abort, 
>>>>>>> ***and potentially your MPI job) 
>>>>>>> --
>>>>>>>  
>>>>>>> It looks like MPI_INIT failed for some reason; your parallel process is 
>>>>>>> likely to abort.  There are many reasons that a parallel process can 
>>>>>>> fail during MPI_INIT; some of which are due to configuration or 
>>>>>>> environment 
>>>>>>> problems.  This failure appears to be an internal failure; here's some 
>>>>>>> additional information (which may only be relevant to an Open MPI 
>>>>>>> developer): 
>>>>>>> 
>>>>>>>ompi_dpm_dyn_

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread r...@open-mpi.org
The cmd line looks fine - when you do your “sbatch” request, what is in the 
shell script you give it? Or are you saying you just “sbatch” the mpirun cmd 
directly?


> On Feb 15, 2017, at 8:07 AM, Anastasia Kruchinina 
> <nastja.kruchin...@gmail.com> wrote:
> 
> Hi, 
> 
> I am running like this: 
> mpirun -np 1 ./manager
> 
> Should I do it differently?
> 
> I also thought that all sbatch does is create an allocation and then run my 
> script in it. But it seems it is not since I am getting these results...
> 
> I would like to upgrade to OpenMPI, but no clusters near me have it yet :( So 
> I even cannot check if it works with OpenMPI 2.0.2. 
> 
> On 15 February 2017 at 16:04, Howard Pritchard <hpprit...@gmail.com 
> <mailto:hpprit...@gmail.com>> wrote:
> Hi Anastasia,
> 
> Definitely check the mpirun when in batch environment but you may also want 
> to upgrade to Open MPI 2.0.2.
> 
> Howard
> 
> r...@open-mpi.org <mailto:r...@open-mpi.org> <r...@open-mpi.org 
> <mailto:r...@open-mpi.org>> schrieb am Mi. 15. Feb. 2017 um 07:49:
> Nothing immediate comes to mind - all sbatch does is create an allocation and 
> then run your script in it. Perhaps your script is using a different “mpirun” 
> command than when you type it interactively?
> 
>> On Feb 14, 2017, at 5:11 AM, Anastasia Kruchinina 
>> <nastja.kruchin...@gmail.com <mailto:nastja.kruchin...@gmail.com>> wrote:
>> 
>> Hi, 
>> 
>> I am trying to use MPI_Comm_spawn function in my code. I am having trouble 
>> with openmpi 2.0.x + sbatch (batch system Slurm). 
>> My test program is located here: 
>> http://user.it.uu.se/~anakr367/files/MPI_test/ 
>> <http://user.it.uu.se/%7Eanakr367/files/MPI_test/> 
>> 
>> When I am running my code I am getting an error: 
>> 
>> OPAL ERROR: Timeout in file 
>> ../../../../openmpi-2.0.1/opal/mca/pmix/base/pmix_base_fns.c at line 193 
>> *** An error occurred in MPI_Init_thread 
>> *** on a NULL communicator 
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, 
>> ***and potentially your MPI job) 
>> -- 
>> It looks like MPI_INIT failed for some reason; your parallel process is 
>> likely to abort.  There are many reasons that a parallel process can 
>> fail during MPI_INIT; some of which are due to configuration or environment 
>> problems.  This failure appears to be an internal failure; here's some 
>> additional information (which may only be relevant to an Open MPI 
>> developer): 
>> 
>>ompi_dpm_dyn_init() failed 
>>--> Returned "Timeout" (-15) instead of "Success" (0) 
>> -- 
>> 
>> The interesting thing is that there is no error when I am firstly allocating 
>> nodes with salloc and then run my program. So, I noticed that the program 
>> works fine using openmpi 1.x+sbach/salloc or openmpi 2.0.x+salloc but not 
>> openmpi 2.0.x+sbatch. 
>> 
>> The error was reproduced on three different computer clusters. 
>> 
>> Best regards, 
>> Anastasia 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] fatal error with openmpi-master-201702150209-404fe32 on Linux with Sun C

2017-02-15 Thread r...@open-mpi.org
If we knew what line in that file was causing the compiler to barf, we could at 
least address it. There is probably something added in recent commits that is 
causing problems for the compiler.

So checking to see what commit might be triggering the failure would be most 
helpful.
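
If you have a git clone of master handy (rather than just the nightly tarballs), the 
commits between the last good and first bad snapshots can be listed and bisected, e.g. 
(using the snapshot hashes from this thread):

$ git log --oneline bc2890e..404fe32
$ git bisect start 404fe32 bc2890e   # bad first, then good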


> On Feb 15, 2017, at 8:29 AM, Siegmar Gross 
>  wrote:
> 
> Hi Gilles,
> 
>> this looks like a compiler crash, and it should be reported to Oracle.
> 
> I can try, but I don't think that they are interested, because
> we don't have a contract any longer. I didn't get the error
> building openmpi-master-201702080209-bc2890e as you can see
> below. Would it be helpful to build all intermediate versions
> to find out when the error occurred the first time? Perhaps we
> can identify which change of code is responsible for the error.
> 
> loki openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc 111 grep Error 
> log.make.Linux.x86_64.64_cc
>  GENERATE mpi/man/man3/MPI_Error_class.3
>  GENERATE mpi/man/man3/MPI_Error_string.3
> 
> loki openmpi-master-201702080209-bc2890e-Linux.x86_64.64_cc 112 cd 
> ../openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc
> loki openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc 113 grep Error 
> log.make.Linux.x86_64.64_cc
> make[5]: *** [dstore/pmix_esh.lo] Error 1
> make[4]: *** [all-recursive] Error 1
> make[3]: *** [all-recursive] Error 1
> make[2]: *** [all-recursive] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all-recursive] Error 1
> loki openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc 114
> 
> 
> Kind regards and thank you very much for your help
> 
> Siegmar
> 
> 
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Wednesday, February 15, 2017, Siegmar Gross 
>> > > wrote:
>> 
>>Hi,
>> 
>>I tried to install openmpi-master-201702150209-404fe32 on my "SUSE Linux
>>Enterprise Server 12.2 (x86_64)" with Sun C 5.14. Unfortunately, "make"
>>breaks with the following error. I've had no problems with gcc-6.3.0.
>> 
>> 
>>...
>>
>> "../../../../../../../openmpi-master-201702150209-404fe32/opal/mca/pmix/pmix2x/pmix/src/buffer_ops/copy.c",
>>  line 1004: warning: statement not reached
>>  CC   buffer_ops/internal_functions.lo
>>  CC   buffer_ops/open_close.lo
>>  CC   buffer_ops/pack.lo
>>  CC   buffer_ops/print.lo
>>  CC   buffer_ops/unpack.lo
>>  CC   sm/pmix_sm.lo
>>  CC   sm/pmix_mmap.lo
>>  CC   dstore/pmix_dstore.lo
>>  CC   dstore/pmix_esh.lo
>>cc: Fatal error in /opt/sun/developerstudio12.5/lib/compilers/bin/acomp : 
>> Signal number = 139
>>Makefile:1322: recipe for target 'dstore/pmix_esh.lo' failed
>>make[5]: *** [dstore/pmix_esh.lo] Error 1
>>make[5]: Leaving directory 
>> '/export2/src/openmpi-master/openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc/opal/mca/pmix/pmix2x/pmix/src'
>>Makefile:1375: recipe for target 'all-recursive' failed
>>make[4]: *** [all-recursive] Error 1
>>make[4]: Leaving directory 
>> '/export2/src/openmpi-master/openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc/opal/mca/pmix/pmix2x/pmix/src'
>>Makefile:652: recipe for target 'all-recursive' failed
>>make[3]: *** [all-recursive] Error 1
>>make[3]: Leaving directory 
>> '/export2/src/openmpi-master/openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc/opal/mca/pmix/pmix2x/pmix'
>>Makefile:2037: recipe for target 'all-recursive' failed
>>make[2]: *** [all-recursive] Error 1
>>make[2]: Leaving directory 
>> '/export2/src/openmpi-master/openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc/opal/mca/pmix/pmix2x'
>>Makefile:2386: recipe for target 'all-recursive' failed
>>make[1]: *** [all-recursive] Error 1
>>make[1]: Leaving directory 
>> '/export2/src/openmpi-master/openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc/opal'
>>Makefile:1903: recipe for target 'all-recursive' failed
>>make: *** [all-recursive] Error 1
>>loki openmpi-master-201702150209-404fe32-Linux.x86_64.64_cc 129
>> 
>> 
>>I would be grateful, if somebody can fix the problem. Do you need anything
>>else? Thank you very much for any help in advance.
>> 
>> 
>>Kind regards
>> 
>>Siegmar
>>___
>>users mailing list
>>users@lists.open-mpi.org
>>https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> 
>> 
>> 
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing 

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread r...@open-mpi.org
Nothing immediate comes to mind - all sbatch does is create an allocation and 
then run your script in it. Perhaps your script is using a different “mpirun” 
command than when you type it interactively?
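
One quick sanity check is to have the batch script print which mpirun it actually 
resolves, right before the launch, and compare with your interactive shell:

$ which mpirun
$ mpirun --version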

> On Feb 14, 2017, at 5:11 AM, Anastasia Kruchinina 
>  wrote:
> 
> Hi, 
> 
> I am trying to use MPI_Comm_spawn function in my code. I am having trouble 
> with openmpi 2.0.x + sbatch (batch system Slurm). 
> My test program is located here: 
> http://user.it.uu.se/~anakr367/files/MPI_test/ 
>  
> 
> When I am running my code I am getting an error: 
> 
> OPAL ERROR: Timeout in file 
> ../../../../openmpi-2.0.1/opal/mca/pmix/base/pmix_base_fns.c at line 193 
> *** An error occurred in MPI_Init_thread 
> *** on a NULL communicator 
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, 
> ***and potentially your MPI job) 
> -- 
> It looks like MPI_INIT failed for some reason; your parallel process is 
> likely to abort.  There are many reasons that a parallel process can 
> fail during MPI_INIT; some of which are due to configuration or environment 
> problems.  This failure appears to be an internal failure; here's some 
> additional information (which may only be relevant to an Open MPI 
> developer): 
> 
>ompi_dpm_dyn_init() failed 
>--> Returned "Timeout" (-15) instead of "Success" (0) 
> -- 
> 
> The interesting thing is that there is no error when I am firstly allocating 
> nodes with salloc and then run my program. So, I noticed that the program 
> works fine using openmpi 1.x+sbach/salloc or openmpi 2.0.x+salloc but not 
> openmpi 2.0.x+sbatch. 
> 
> The error was reproduced on three different computer clusters. 
> 
> Best regards, 
> Anastasia 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Specify the core binding when spawning a process

2017-02-15 Thread r...@open-mpi.org
Sorry for slow response - was away for awhile. What version of OMPI are you 
using?


> On Feb 8, 2017, at 1:59 PM, Allan Ma  wrote:
> 
> Hello,
> 
> I'm designing a program on a dual socket system that needs the parent process 
> and spawned child process to be at least running on (or bound to) the cores 
> of the same socket in the same node.
> 
> I wonder if anyone knows how to specify the core binding or socket binding 
> when spawning a single process using MPI_COMM_Spawn. 
> 
> Currently I tried using the setting key 'host' in mpiinfo when passing it to 
> Spawn and it appears to be working, but I don't know how to specify exactly 
> the logical core number to run on. When I bind processes to sockets when 
> starting with mpirun, I used the -cpu-set option for setting to the core 
> number in the desired socket.
> 
> Also, I was just checking the manual here:
> 
> https://www.open-mpi.org/doc/v2.0/man3/MPI_Comm_spawn.3.php#toc7 
> 
> 
> I found there is a "mapper" key in the MPI_INFO that might be useful in my 
> case:
> 
> mapper char* Mapper to be used for this job
> 
> I wonder if there's any more detailed documentation or any example on how to 
> use this mapper key.
> 
> Thanks
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] "-map-by socket:PE=1" doesn't do what I expect

2017-02-15 Thread r...@open-mpi.org

> On Feb 15, 2017, at 5:45 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
> 
> On Wed, 15 Feb 2017, r...@open-mpi.org wrote:
> 
>> Ah, yes - I know what the problem is. We weren’t expecting a PE value of 1 - 
>> the logic is looking expressly for values > 1 as we hadn’t anticipated this 
>> use-case.
> 
> Is it a sensible use-case, or am I crazy?

Not crazy, I’d say. The expected way of doing it would be “--map-by socket 
--bind-to core”. However, I can see why someone might expect pe=1 to work.
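
For comparison, that expected form - with the report_binding test program from the 
original mail - would be run as something like:

$ mpirun -np 2 --map-by socket --bind-to core --report-bindings ./report_binding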

> 
>> I can make that change. I’m off to a workshop for the next day or so, but 
>> can probably do this on the plane.
> 
> You're a star - thanks :)
> 
> Mark___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] "-map-by socket:PE=1" doesn't do what I expect

2017-02-15 Thread r...@open-mpi.org
Ah, yes - I know what the problem is. We weren’t expecting a PE value of 1 - 
the logic is looking expressly for values > 1 as we hadn’t anticipated this 
use-case.

I can make that change. I’m off to a workshop for the next day or so, but can 
probably do this on the plane.


> On Feb 15, 2017, at 3:17 AM, Mark Dixon  wrote:
> 
> Hi,
> 
> When combining OpenMPI 2.0.2 with OpenMP, I'm interested in launching a 
> number of ranks and allocating a number of cores to each rank. Using "-map-by 
> socket:PE=", switching to "-map-by node:PE=" if I want to allocate 
> more than a single socket to a rank, seems to do what I want.
> 
> Except for "-map-by socket:PE=1". That seems to allocate an entire socket to 
> each rank instead of a single core. Here's the output of a test program on a 
> dual socket non-hyperthreading system that reports rank core bindings (odd 
> cores on one socket, even on the other):
> 
>   $ mpirun -np 2 -map-by socket:PE=1 ./report_binding
>   Rank 0 bound somehost.somewhere:  0 2 4 6 8 10 12 14 16 18 20 22
>   Rank 1 bound somehost.somewhere:  1 3 5 7 9 11 13 15 17 19 21 23
> 
>   $ mpirun -np 2 -map-by socket:PE=2 ./report_binding
>   Rank 0 bound somehost.somewhere:  0 2
>   Rank 1 bound somehost.somewhere:  1 3
> 
>   $ mpirun -np 2 -map-by socket:PE=3 ./report_binding
>   Rank 0 bound somehost.somewhere:  0 2 4
>   Rank 1 bound somehost.somewhere:  1 3 5
> 
>   $ mpirun -np 2 -map-by socket:PE=4 ./report_binding
>   Rank 0 bound somehost.somewhere:  0 2 4 6
>   Rank 1 bound somehost.somewhere:  1 3 5 7
> 
> I get the same result if I change "socket" to "numa". Changing "socket" to 
> either "core", "node" or "slot" binds each rank to a single core (good), but 
> doesn't round-robin ranks across sockets like "socket" does (bad).
> 
> Is "-map-by socket:PE=1" doing the right thing, please? I tried reading the 
> man page but I couldn't work out what the expected behaviour is :o
> 
> Cheers,
> 
> Mark
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-13 Thread r...@open-mpi.org
I dug into this further, and the simplest solution for now is to simply do one 
of the following:

* replace the “!=“ with “==“ in the test, as Jeff indicated; or

* revert the commit Mark identified

Both options will restore the original logic. Given that someone already got it 
wrong, I have clarified the logic in the OMPI master repo. However, I don’t 
know how long it will be before a 2.0.3 release is issued, so GridEngine users 
might want to locally fix things in the interim.


> On Feb 12, 2017, at 1:52 PM, r...@open-mpi.org wrote:
> 
> Yeah, I’ll fix it this week. The problem is that you can’t check the source 
> as being default as the default is ssh - so the only way to get the current 
> code to check for qrsh is to specify something other than the default ssh (it 
> doesn’t matter what you specify - anything will get you past the erroneous 
> check so you look for qrsh).
> 
> 
>> On Feb 9, 2017, at 3:21 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
>> wrote:
>> 
>> Yes, we can get it fixed.
>> 
>> Ralph is unavailable this week; I don't know offhand what he meant by his 
>> prior remarks.  It's possible that 
>> https://github.com/open-mpi/ompi/commit/71ec5cfb436977ea9ad409ba634d27e6addf6fae;
>>  can you try changing the "!=" on line to be "=="?  I.e., from
>> 
>> if (MCA_BASE_VAR_SOURCE_DEFAULT != source) {
>> 
>> to
>> 
>> if (MCA_BASE_VAR_SOURCE_DEFAULT == source) {
>> 
>> I filed https://github.com/open-mpi/ompi/issues/2947 to track the issue.
>> 
>> 
>>> On Feb 9, 2017, at 6:01 PM, Glenn Johnson <glenn-john...@uiowa.edu> wrote:
>>> 
>>> Will this be fixed in the 2.0.3 release?
>>> 
>>> Thanks.
>>> 
>>> 
>>> Glenn
>>> 
>>> On Mon, Feb 6, 2017 at 10:45 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
>>> On Mon, 6 Feb 2017, Mark Dixon wrote:
>>> ...
>>> Ah-ha! "-mca plm_rsh_agent foo" fixes it!
>>> 
>>> Thanks very much - presumably I can stick that in the system-wide 
>>> openmpi-mca-params.conf for now.
>>> ...
>>> 
>>> Except if I do that, it means running ompi outside of the SGE environment 
>>> no longer works :(
>>> 
>>> Should I just revoke the following commit?
>>> 
>>> Cheers,
>>> 
>>> Mark
>>> 
>>> commit d51c2af76b0c011177aca8e08a5a5fcf9f5e67db
>>> Author: Jeff Squyres <jsquy...@cisco.com>
>>> Date:   Tue Aug 16 06:58:20 2016 -0500
>>> 
>>>   rsh: robustify the check for plm_rsh_agent default value
>>> 
>>>   Don't strcmp against the default value -- the default value may change
>>>   over time.  Instead, check to see if the MCA var source is not
>>>   DEFAULT.
>>> 
>>>   Signed-off-by: Jeff Squyres <jsquy...@cisco.com>
>>> 
>>>   (cherry picked from commit 
>>> open-mpi/ompi@71ec5cfb436977ea9ad409ba634d27e6addf6fae)
>>> 
>>> 
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>> 
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-12 Thread r...@open-mpi.org
Yeah, I’ll fix it this week. The problem is that you can’t check the source as 
being default as the default is ssh - so the only way to get the current code 
to check for qrsh is to specify something other than the default ssh (it 
doesn’t matter what you specify - anything will get you past the erroneous 
check so you look for qrsh).


> On Feb 9, 2017, at 3:21 PM, Jeff Squyres (jsquyres)  
> wrote:
> 
> Yes, we can get it fixed.
> 
> Ralph is unavailable this week; I don't know offhand what he meant by his 
> prior remarks.  It's possible that 
> https://github.com/open-mpi/ompi/commit/71ec5cfb436977ea9ad409ba634d27e6addf6fae;
>  can you try changing the "!=" on line to be "=="?  I.e., from
> 
> if (MCA_BASE_VAR_SOURCE_DEFAULT != source) {
> 
> to
> 
> if (MCA_BASE_VAR_SOURCE_DEFAULT == source) {
> 
> I filed https://github.com/open-mpi/ompi/issues/2947 to track the issue.
> 
> 
>> On Feb 9, 2017, at 6:01 PM, Glenn Johnson  wrote:
>> 
>> Will this be fixed in the 2.0.3 release?
>> 
>> Thanks.
>> 
>> 
>> Glenn
>> 
>> On Mon, Feb 6, 2017 at 10:45 AM, Mark Dixon  wrote:
>> On Mon, 6 Feb 2017, Mark Dixon wrote:
>> ...
>> Ah-ha! "-mca plm_rsh_agent foo" fixes it!
>> 
>> Thanks very much - presumably I can stick that in the system-wide 
>> openmpi-mca-params.conf for now.
>> ...
>> 
>> Except if I do that, it means running ompi outside of the SGE environment no 
>> longer works :(
>> 
>> Should I just revoke the following commit?
>> 
>> Cheers,
>> 
>> Mark
>> 
>> commit d51c2af76b0c011177aca8e08a5a5fcf9f5e67db
>> Author: Jeff Squyres 
>> Date:   Tue Aug 16 06:58:20 2016 -0500
>> 
>>rsh: robustify the check for plm_rsh_agent default value
>> 
>>Don't strcmp against the default value -- the default value may change
>>over time.  Instead, check to see if the MCA var source is not
>>DEFAULT.
>> 
>>Signed-off-by: Jeff Squyres 
>> 
>>(cherry picked from commit 
>> open-mpi/ompi@71ec5cfb436977ea9ad409ba634d27e6addf6fae)
>> 
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_Comm_spawn question

2017-02-03 Thread r...@open-mpi.org
We know v2.0.1 has problems with comm_spawn, and so you may be encountering one 
of those. Regardless, there is indeed a timeout mechanism in there. It was 
added because people would execute a comm_spawn, and then would hang and eat up 
their entire allocation time for nothing.

In v2.0.2, I see it is still hardwired at 60 seconds. I believe we eventually 
realized we needed to make that a variable, but it didn’t get into the 2.0.2 
release.


> On Feb 1, 2017, at 1:00 AM, elistrato...@info.sgu.ru wrote:
> 
> I am using Open MPI version 2.0.1.
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-03 Thread r...@open-mpi.org
I don’t think so - at least, that isn’t the code I was looking at.

> On Feb 3, 2017, at 9:43 AM, Glenn Johnson <glenn-john...@uiowa.edu> wrote:
> 
> Is this the same issue that was previously fixed in PR-1960?
> 
> https://github.com/open-mpi/ompi/pull/1960/files 
> <https://github.com/open-mpi/ompi/pull/1960/files>
> 
> 
> Glenn
> 
> On Fri, Feb 3, 2017 at 10:56 AM, r...@open-mpi.org <mailto:r...@open-mpi.org> 
> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
> I do see a diff between 2.0.1 and 2.0.2 that might have a related impact. The 
> way we handled the MCA param that specifies the launch agent (ssh, rsh, or 
> whatever) was modified, and I don’t think the change is correct. It basically 
> says that we don’t look for qrsh unless the MCA param has been changed from 
> the coded default, which means we are not detecting SGE by default.
> 
> Try setting "-mca plm_rsh_agent foo" on your cmd line - that will get past 
> the test, and then we should auto-detect SGE again
> 
> 
> > On Feb 3, 2017, at 8:49 AM, Mark Dixon <m.c.di...@leeds.ac.uk 
> > <mailto:m.c.di...@leeds.ac.uk>> wrote:
> >
> > On Fri, 3 Feb 2017, Reuti wrote:
> > ...
> >> SGE on its own is not configured to use SSH? (I mean the entries in `qconf 
> >> -sconf` for rsh_command resp. daemon).
> > ...
> >
> > Nope, everything left as the default:
> >
> > $ qconf -sconf | grep _command
> > qlogin_command   builtin
> > rlogin_command   builtin
> > rsh_command  builtin
> >
> > I have 2.0.1 and 2.0.2 installed side by side. 2.0.1 is happy but 2.0.2 
> > isn't.
> >
> > I'll start digging, but I'd appreciate hearing from any other SGE user who 
> > had tried 2.0.2 and tell me if it had worked for them, please? :)
> >
> > Cheers,
> >
> > Mark
> > ___
> > users mailing list
> > users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> > <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-03 Thread r...@open-mpi.org
I do see a diff between 2.0.1 and 2.0.2 that might have a related impact. The 
way we handled the MCA param that specifies the launch agent (ssh, rsh, or 
whatever) was modified, and I don’t think the change is correct. It basically 
says that we don’t look for qrsh unless the MCA param has been changed from the 
coded default, which means we are not detecting SGE by default.

Try setting "-mca plm_rsh_agent foo" on your cmd line - that will get past the 
test, and then we should auto-detect SGE again
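
For example (the agent value itself is irrelevant - anything other than the default gets past the check; the binary and rank count are placeholders):

   $ mpirun -mca plm_rsh_agent foo -np 16 ./a.out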


> On Feb 3, 2017, at 8:49 AM, Mark Dixon  wrote:
> 
> On Fri, 3 Feb 2017, Reuti wrote:
> ...
>> SGE on its own is not configured to use SSH? (I mean the entries in `qconf 
>> -sconf` for rsh_command resp. daemon).
> ...
> 
> Nope, everything left as the default:
> 
> $ qconf -sconf | grep _command
> qlogin_command   builtin
> rlogin_command   builtin
> rsh_command  builtin
> 
> I have 2.0.1 and 2.0.2 installed side by side. 2.0.1 is happy but 2.0.2 isn't.
> 
> I'll start digging, but I'd appreciate hearing from any other SGE user who 
> had tried 2.0.2 and tell me if it had worked for them, please? :)
> 
> Cheers,
> 
> Mark
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Performance Issues on SMP Workstation

2017-02-01 Thread r...@open-mpi.org
Simple test: replace your executable with “hostname”. If you see multiple hosts 
come out on your cluster, then you know why the performance is different.
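
For example, keeping the same submission script and just swapping the executable:

   mpirun hostname

If more than one hostname comes back on the cluster run, the job is not actually confined to a single node, which would explain the difference.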

> On Feb 1, 2017, at 2:46 PM, Andy Witzig  wrote:
> 
> Honestly, I’m not exactly sure what scheme is being used.  I am using the 
> default template from Penguin Computing for job submission.  It looks like:
> 
> #PBS -S /bin/bash
> #PBS -q T30
> #PBS -l walltime=24:00:00,nodes=1:ppn=20
> #PBS -j oe
> #PBS -N test
> #PBS -r n
> 
> mpirun $EXECUTABLE $INPUT_FILE
> 
> I’m not configuring OpenMPI anywhere else. It is possible the Penguin 
> Computing folks have pre-configured my MPI environment.  I’ll see what I can 
> find.
> 
> Best regards,
> Andy
> 
> On Feb 1, 2017, at 4:32 PM, Douglas L Reeder  > wrote:
> 
> Andy,
> 
> What allocation scheme are you using on the cluster. For some codes we see 
> noticeable differences using fillup vs round robin, not 4x though. Fillup is 
> more shared memory use while round robin uses more InfiniBand.
> 
> Doug
>> On Feb 1, 2017, at 3:25 PM, Andy Witzig > > wrote:
>> 
>> Hi Tom,
>> 
>> The cluster uses an Infiniband interconnect.  On the cluster I’m requesting: 
>> #PBS -l walltime=24:00:00,nodes=1:ppn=20.  So technically, the run on the 
>> cluster should be SMP on the node, since there are 20 cores/node.  On the 
>> workstation I’m just using the command: mpirun -np 20 …. I haven’t finished 
>> setting Torque/PBS up yet.
>> 
>> Best regards,
>> Andy
>> 
>> On Feb 1, 2017, at 4:10 PM, Elken, Tom > > wrote:
>> 
>> For this case:  " a cluster system with 2.6GHz Intel Haswell with 20 cores / 
>> node and 128GB RAM/node.  "
>> 
>> are you running 5 ranks per node on 4 nodes?
>> What interconnect are you using for the cluster?
>> 
>> -Tom
>> 
>>> -Original Message-
>>> From: users [mailto:users-boun...@lists.open-mpi.org 
>>> ] On Behalf Of Andrew
>>> Witzig
>>> Sent: Wednesday, February 01, 2017 1:37 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Performance Issues on SMP Workstation
>>> 
>>> By the way, the workstation has a total of 36 cores / 72 threads, so using 
>>> mpirun
>>> -np 20 is possible (and should be equivalent) on both platforms.
>>> 
>>> Thanks,
>>> cap79
>>> 
 On Feb 1, 2017, at 2:52 PM, Andy Witzig > wrote:
 
 Hi all,
 
 I’m testing my application on a SMP workstation (dual Intel Xeon E5-2697 V4
>>> 2.3 GHz Intel Broadwell (boost 2.8-3.1GHz) processors 128GB RAM) and am
>>> seeing a 4x performance drop compared to a cluster system with 2.6GHz Intel
>>> Haswell with 20 cores / node and 128GB RAM/node.  Both applications have
>>> been compiled using OpenMPI 1.6.4.  I have tried running:
 
 mpirun -np 20 $EXECUTABLE $INPUT_FILE
 mpirun -np 20 --mca btl self,sm $EXECUTABLE $INPUT_FILE
 
 and others, but cannot achieve the same performance on the workstation as 
 is
>>> seen on the cluster.  The workstation outperforms on other non-MPI but 
>>> multi-
>>> threaded applications, so I don’t think it’s a hardware issue.
 
 Any help you can provide would be appreciated.
 
 Thanks,
 cap79
 ___
 users mailing list
 users@lists.open-mpi.org 
 https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org 
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> ___
>> users mailing list
>> users@lists.open-mpi.org 
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_Comm_spawn question

2017-01-31 Thread r...@open-mpi.org
What version of OMPI are you using?

> On Jan 31, 2017, at 7:33 AM, elistrato...@info.sgu.ru wrote:
> 
> Hi,
> 
> I am trying to write trivial master-slave program. Master simply creates
> slaves, sends them a string, they print it out and exit. Everything works
> just fine, however, when I add a delay (more than 2 sec) before calling
> MPI_Init on slave, MPI fails with MPI_ERR_SPAWN. I am pretty sure that
> MPI_Comm_spawn has some kind of timeout on waiting for slaves to call
> MPI_Init, and if they fail to respond in time, it returns an error.
> 
> I believe there is a way to change this behaviour, but I wasn't able to
> find any suggestions/ideas in the internet.
> I would appreciate if someone could help with this.
> 
> ---
> --- terminal command i use to run program:
> mpirun -n 1 hello 2 2 // the first argument to "hello" is number of
> slaves, the second is delay in seconds
> 
> --- Error message I get when delay is >=2 sec:
> [host:2231] *** An error occurred in MPI_Comm_spawn
> [host:2231] *** reported by process [3453419521,0]
> [host:2231] *** on communicator MPI_COMM_SELF
> [host:2231] *** MPI_ERR_SPAWN: could not spawn processes
> [host:2231] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will
> now abort,
> [host:2231] ***and potentially your MPI job)
> 
> --- The program itself:
> #include "stdlib.h"
> #include "stdio.h"
> #include "mpi.h"
> #include "unistd.h"
> 
> MPI_Comm slave_comm;
> MPI_Comm new_world;
> #define MESSAGE_SIZE 40
> 
> void slave() {
>   printf("Slave initialized; ");
>   MPI_Comm_get_parent(&slave_comm);
>   MPI_Intercomm_merge(slave_comm, 1, &new_world);
> 
>   int slave_rank;
>   MPI_Comm_rank(new_world, &slave_rank);
> 
>   char message[MESSAGE_SIZE];
>   MPI_Bcast(message, MESSAGE_SIZE, MPI_CHAR, 0, new_world);
> 
>   printf("Slave %d received message from master: %s\n", slave_rank, 
> message);
> }
> 
> void master(int slave_count, char* executable, char* delay) {
>   char* slave_argv[] = { delay, NULL };
>   MPI_Comm_spawn( executable,
>   slave_argv,
>   slave_count,
>   MPI_INFO_NULL,
>   0,
>   MPI_COMM_SELF,
>   &slave_comm,
>   MPI_ERRCODES_IGNORE);
>   MPI_Intercomm_merge(slave_comm, 0, &new_world);
>   char* helloWorld = "Hello New World!\0";
>   MPI_Bcast(helloWorld, MESSAGE_SIZE, MPI_CHAR, 0, new_world);
>   printf("Processes spawned!\n");
> }
> 
> int main(int argc, char* argv[]) {
>   if (argc > 2) {
>   MPI_Init(&argc, &argv);
>   master(atoi(argv[1]), argv[0], argv[2]);
>   } else {
>   sleep(atoi(argv[1])); /// delay
>   MPI_Init(&argc, &argv);
>   slave();
>   }
>   MPI_Comm_free(&new_world);
>   MPI_Comm_free(&slave_comm);
>   MPI_Finalize();
> }
> 
> 
> Thank you,
> 
> Andrew Elistratov
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Startup limited to 128 remote hosts in some situations?

2017-01-20 Thread r...@open-mpi.org
Well, it appears we are already forwarding all envars, which should include 
PATH. Here is the qrsh command line we use:

“qrsh --inherit --nostdin -V"

So would you please try the following patch:

diff --git a/orte/mca/plm/rsh/plm_rsh_component.c b/orte/mca/plm/rsh/plm_rsh_component.c
index 0183bcc..1cc5aa4 100644
--- a/orte/mca/plm/rsh/plm_rsh_component.c
+++ b/orte/mca/plm/rsh/plm_rsh_component.c
@@ -288,8 +288,6 @@ static int rsh_component_query(mca_base_module_t **module, int *priority)
 }
 mca_plm_rsh_component.agent = tmp;
 mca_plm_rsh_component.using_qrsh = true;
-/* no tree spawn allowed under qrsh */
-mca_plm_rsh_component.no_tree_spawn = true;
 goto success;
 } else if (!mca_plm_rsh_component.disable_llspawn &&
NULL != getenv("LOADL_STEP_ID")) {


> On Jan 19, 2017, at 5:29 PM, r...@open-mpi.org wrote:
> 
> I’ll create a patch that you can try - if it works okay, we can commit it
> 
>> On Jan 18, 2017, at 3:29 AM, William Hay <w@ucl.ac.uk> wrote:
>> 
>> On Tue, Jan 17, 2017 at 09:56:54AM -0800, r...@open-mpi.org wrote:
>>> As I recall, the problem was that qrsh isn???t available on the backend 
>>> compute nodes, and so we can???t use a tree for launch. If that isn???t 
>>> true, then we can certainly adjust it.
>>> 
>> qrsh should be available on all nodes of a SoGE cluster but, depending on 
>> how things are set up, may not be 
>> findable (ie not in the PATH) when you qrsh -inherit into a node.  A 
>> workaround would be to start backend 
>> processes with qrsh -inherit -v PATH which will copy the PATH from the 
>> master node to the slave node 
>> process or otherwise pass the location of qrsh from one node or another.  
>> That of course assumes that 
>> qrsh is in the same location on all nodes.
>> 
>> I've tested that it is possible to qrsh from the head node of a job to a 
>> slave node and then on to
>> another slave node by this method.
>> 
>> William
>> 
>> 
>>>> On Jan 17, 2017, at 9:37 AM, Mark Dixon <m.c.di...@leeds.ac.uk> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> While commissioning a new cluster, I wanted to run HPL across the whole 
>>>> thing using openmpi 2.0.1.
>>>> 
>>>> I couldn't get it to start on more than 129 hosts under Son of Gridengine 
>>>> (128 remote plus the localhost running the mpirun command). openmpi would 
>>>> sit there, waiting for all the orted's to check in; however, there were 
>>>> "only" a maximum of 128 qrsh processes, therefore a maximum of 128 
>>>> orted's, therefore waiting a long time.
>>>> 
>>>> Increasing plm_rsh_num_concurrent beyond the default of 128 gets the job 
>>>> to launch.
>>>> 
>>>> Is this intentional, please?
>>>> 
>>>> Doesn't openmpi use a tree-like startup sometimes - any particular reason 
>>>> it's not using it here?
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Startup limited to 128 remote hosts in some situations?

2017-01-17 Thread r...@open-mpi.org
As I recall, the problem was that qrsh isn’t available on the backend compute 
nodes, and so we can’t use a tree for launch. If that isn’t true, then we can 
certainly adjust it.
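
In the meantime, raising that limit above the number of remote hosts (as you found) works around it, e.g.:

   mpirun --mca plm_rsh_num_concurrent 256 -np 4000 ./xhpl

where the rank count and binary are just placeholders.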

> On Jan 17, 2017, at 9:37 AM, Mark Dixon  wrote:
> 
> Hi,
> 
> While commissioning a new cluster, I wanted to run HPL across the whole thing 
> using openmpi 2.0.1.
> 
> I couldn't get it to start on more than 129 hosts under Son of Gridengine 
> (128 remote plus the localhost running the mpirun command). openmpi would sit 
> there, waiting for all the orted's to check in; however, there were "only" a 
> maximum of 128 qrsh processes, therefore a maximum of 128 orted's, therefore 
> waiting a long time.
> 
> Increasing plm_rsh_num_concurrent beyond the default of 128 gets the job to 
> launch.
> 
> Is this intentional, please?
> 
> Doesn't openmpi use a tree-like startup sometimes - any particular reason 
> it's not using it here?
> 
> Cheers,
> 
> Mark
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-10 Thread r...@open-mpi.org
I think there is some relevant discussion here: 
https://github.com/open-mpi/ompi/issues/1569 


It looks like Gilles had (at least at one point) a fix for master when 
enable-heterogeneous, but I don’t know if that was committed.
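
If rebuilding is practical, one quick diagnostic (only a sketch, using a throwaway prefix) would be the same build minus that flag, e.g.:

   ../openmpi-2.0.2rc3/configure --prefix=/tmp/ompi-no-hetero --enable-debug
   make -j 8 install

and then re-run spawn_master against it to see whether the failure is tied to --enable-heterogeneous.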

> On Jan 9, 2017, at 8:23 AM, Howard Pritchard  wrote:
> 
> HI Siegmar,
> 
> You have some config parameters I wasn't trying that may have some impact.
> I'll give a try with these parameters.
> 
> This should be enough info for now,
> 
> Thanks,
> 
> Howard
> 
> 
> 2017-01-09 0:59 GMT-07:00 Siegmar Gross  >:
> Hi Howard,
> 
> I use the following commands to build and install the package.
> ${SYSTEM_ENV} is "Linux" and ${MACHINE_ENV} is "x86_64" for my
> Linux machine.
> 
> mkdir openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> cd openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> 
> ../openmpi-2.0.2rc3/configure \
>   --prefix=/usr/local/openmpi-2.0.2_64_cc \
>   --libdir=/usr/local/openmpi-2.0.2_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
>   --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
>   JAVA_HOME=/usr/local/jdk1.8.0_66 \
>   LDFLAGS="-m64 -mt -Wl,-z -Wl,noexecstack" CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64 -mt" CXXFLAGS="-m64" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   --enable-mpi-cxx \
>   --enable-mpi-cxx-bindings \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-mpi-thread-multiple \
>   --with-hwloc=internal \
>   --without-verbs \
>   --with-wrapper-cflags="-m64 -mt" \
>   --with-wrapper-cxxflags="-m64" \
>   --with-wrapper-fcflags="-m64" \
>   --with-wrapper-ldflags="-mt" \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> 
> make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> rm -r /usr/local/openmpi-2.0.2_64_cc.old
> mv /usr/local/openmpi-2.0.2_64_cc /usr/local/openmpi-2.0.2_64_cc.old
> make install |& tee log.make-install.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> make check |& tee log.make-check.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> 
> 
> I get a different error if I run the program with gdb.
> 
> loki spawn 118 gdb /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec
> GNU gdb (GDB; SUSE Linux Enterprise 12) 7.11.1
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later  >
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-suse-linux".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> >.
> Find the GDB manual and other documentation resources online at:
>  >.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec...done.
> (gdb) r -np 1 --host loki --slot-list 0:0-5,1:0-5 spawn_master
> Starting program: /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec -np 1 --host 
> loki --slot-list 0:0-5,1:0-5 spawn_master
> Missing separate debuginfos, use: zypper install 
> glibc-debuginfo-2.24-2.3.x86_64
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x73b97700 (LWP 13582)]
> [New Thread 0x718a4700 (LWP 13583)]
> [New Thread 0x710a3700 (LWP 13584)]
> [New Thread 0x7fffebbba700 (LWP 13585)]
> Detaching after fork from child process 13586.
> 
> Parent process 0 running on loki
>   I create 4 slave processes
> 
> Detaching after fork from child process 13589.
> Detaching after fork from child process 13590.
> Detaching after fork from child process 13591.
> [loki:13586] OPAL ERROR: Timeout in file 
> ../../../../openmpi-2.0.2rc3/opal/mca/pmix/base/pmix_base_fns.c at line 193
> [loki:13586] *** An error occurred in MPI_Comm_spawn
> [loki:13586] *** reported by process [2873294849,0]
> [loki:13586] *** on communicator MPI_COMM_WORLD
> [loki:13586] *** MPI_ERR_UNKNOWN: unknown error
> [loki:13586] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
> now abort,
> [loki:13586] ***and potentially your MPI job)
> [Thread 0x7fffebbba700 (LWP 13585) exited]
> [Thread 0x710a3700 (LWP 13584) exited]
> [Thread 0x718a4700 (LWP 13583) exited]
> [Thread 0x73b97700 (LWP 13582) exited]
> [Inferior 1 (process 13567) exited with code 016]
> Missing separate debuginfos, use: zypper install 
> libpciaccess0-debuginfo-0.13.2-5.1.x86_64 
> libudev1-debuginfo-210-116.3.3.x86_64
> (gdb) bt
> No stack.
> (gdb)
> 
> Do you need anything else?
> 
> 
> Kind regards
> 
> 

Re: [OMPI users] OpenMPI + InfiniBand

2016-12-23 Thread r...@open-mpi.org
Also check to ensure you are using the same version of OMPI on all nodes - this 
message usually means that a different version was used on at least one node.
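
A quick sanity check (assuming you can ssh to the compute nodes; <node> is a placeholder) is something like:

   ssh <node> mpirun --version

for each node, comparing the reported version against the one on the launch node.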

> On Dec 23, 2016, at 1:58 AM, gil...@rist.or.jp wrote:
> 
>  Serguei,
> 
>  
> this looks like a very different issue, orted cannot be remotely started.
> 
>  
> that typically occurs if orted cannot find some dependencies
> 
> (the Open MPI libs and/or the compiler runtime)
> 
>  
> for example, from a node, ssh <node> orted should not fail because of 
> unresolved dependencies.
> 
> a simple trick is to replace
> 
> mpirun ...
> 
> with
> 
> `which mpirun` ...
> 
>  
> a better option (as long as you do not plan to relocate Open MPI install dir) 
> is to configure with
> 
> --enable-mpirun-prefix-by-default
> 
>  
> Cheers,
> 
>  
> Gilles
> 
> - Original Message -
> 
> Hi All !
> As there have been no positive changes with the "UDSM + IPoIB" problem since my 
> previous post, 
> we installed IPoIB on the cluster and "No OpenFabrics connection..." error 
> doesn't appear more.
> But now OpenMPI reports about another problem:
> 
> In app ERROR OUTPUT stream:
> 
> [node2:14142] [[37935,0],0] ORTE_ERROR_LOG: Data unpack had inadequate space 
> in file base/plm_base_launch_support.c at line 1035
> 
> In app OUTPUT stream:
> 
> --
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
> 
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
> 
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
> 
> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to use.
> 
> *  compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
> 
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --
> 
> When I'm trying to run the task using single node - all works properly.
> But when I specify "run on 2 nodes", the problem appears.
> 
> I tried to run ping using IPoIB addresses and all hosts are resolved 
> properly, 
> ping requests and replies are going over IB without any problems.
> So all nodes (including head) see each other via IPoIB.
> But MPI app fails.
> 
> Same test task works perfect on all nodes being run with Ethernet transport 
> instead of InfiniBand.
> 
> P.S. We use Torque resource manager to enqueue MPI tasks.
> 
> Best regards,
> Sergei.
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

[OMPI users] Release of OMPI v1.10.5

2016-12-19 Thread r...@open-mpi.org
The Open MPI Team, representing a consortium of research, academic, and 
industry partners, is pleased to announce the release of Open MPI version 
1.10.5.

v1.10.5 is a bug fix release that includes an important performance regression 
fix. All users are encouraged to upgrade to v1.10.5 when possible.  

Version 1.10.5 can be downloaded from the main Open MPI web site

https://www.open-mpi.org/software/ompi/v1.10/ 



NEWS

1.10.5 - 19 Dec 2016
--
- Update UCX APIs
- Fix bug in darray that caused MPI/IO failures
- Use a MPI_Get_library_version() like string to tag the debugger DLL.
  Thanks to Alastair McKinstry for the report
- Fix multi-threaded race condition in coll/libnbc
- Several fixes to OSHMEM
- Fix bug in UCX support due to uninitialized field
- Fix MPI_Ialltoallv with MPI_IN_PLACE and without MPI param check
- Correctly reset receive request type before init. Thanks Chris Pattison
  for the report and test case.
- Fix bug in iallgather[v]
- Fix concurrency issue with MPI_Comm_accept. Thanks to Pieter Noordhuis
  for the patch
- Fix ompi_coll_base_{gather,scatter}_intra_binomial
- Fixed an issue with MPI_Type_get_extent returning the wrong extent
  for distributed array datatypes.
- Re-enable use of rtdtsc instruction as a monotonic clock source if
  the processor has a core-invariant tsc. This is a partial fix for a
  performance regression introduced in Open MPI v1.10.3.


Ralph

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-08 Thread r...@open-mpi.org
23de369da405358ba2
>>>> Merge: ac8c019 b9420bb
>>>> Author: Jeff Squyres <jsquy...@users.noreply.github.com <javascript:;>>
>>>> Date:   Wed Dec 7 18:24:46 2016 -0500
>>>>Merge pull request #2528 from rhc54/cmr20x/signals
>>>> 
>>>> Unfortunately it changes nothing. The root rank stops and all other
>>>> ranks (and mpirun) just stay, the remaining ranks at 100 % CPU waiting
>>>> apparently in that allreduce. The stack trace looks a bit more
>>>> interesting (git is always debug build ?), so I include it at the very
>>>> bottom just in case.
>>>> 
>>>> Off-list Gilles Gouaillardet suggested to set breakpoints at exit,
>>>> __exit etc. to try to catch signals. Would that be useful ? I need a
>>>> moment to figure out how to do this, but I can definitively try.
>>>> 
>>>> Some remark: During "make install" from the git repo I see a
>>>> 
>>>> WARNING!  Common symbols found:
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_2complex
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_2double_complex
>>>>  mpi-f08-types.o: 0004 C
>>>> ompi_f08_mpi_2double_precision
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_2integer
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_2real
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_aint
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_band
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_bor
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_bxor
>>>>  mpi-f08-types.o: 0004 C ompi_f08_mpi_byte
>>>> 
>>>> I have never noticed this before.
>>>> 
>>>> 
>>>> Best Regards
>>>> 
>>>> Christof
>>>> 
>>>> Thread 1 (Thread 0x2af84cde4840 (LWP 11219)):
>>>> #0  0x2af84e4c669d in poll () from /lib64/libc.so.6
>>>> #1  0x2af850517496 in poll_dispatch () from /cluster/mpi/openmpi/2.0.2/
>>>> intel2016/lib/libopen-pal.so.20
>>>> #2  0x2af85050ffa5 in opal_libevent2022_event_base_loop () from
>>>> /cluster/mpi/openmpi/2.0.2/intel2016/lib/libopen-pal.so.20
>>>> #3  0x2af85049fa1f in opal_progress () at runtime/opal_progress.c:207
>>>> #4  0x2af84e02f7f7 in ompi_request_default_wait_all (count=233618144,
>>>> requests=0x2, statuses=0x0) at ../opal/threads/wait_sync.h:80
>>>> #5  0x2af84e0758a7 in ompi_coll_base_allreduce_intra_recursivedoubling
>>>> (sbuf=0xdecbae0,
>>>> rbuf=0x2, count=0, dtype=0x, op=0x0, comm=0x1,
>>>> module=0xdee69e0) at base/coll_base_allreduce.c:225
>>>> #6  0x2af84e07b747 in ompi_coll_tuned_allreduce_intra_dec_fixed
>>>> (sbuf=0xdecbae0, rbuf=0x2, count=0, dtype=0x, op=0x0,
>>>> comm=0x1, module=0x1) at coll_tuned_decision_fixed.c:66
>>>> #7  0x2af84e03e832 in PMPI_Allreduce (sendbuf=0xdecbae0, recvbuf=0x2,
>>>> count=0, datatype=0x, op=0x0, comm=0x1) at pallreduce.c:107
>>>> #8  0x2af84ddaac90 in ompi_allreduce_f (sendbuf=0xdecbae0 "\005",
>>>> recvbuf=0x2 , count=0x0,
>>>> datatype=0x, op=0x0, comm=0x1, ierr=0x7ffdf3cffe9c) at
>>>> pallreduce_f.c:87
>>>> #9  0x0045ecc6 in m_sum_i_ ()
>>>> #10 0x00e172c9 in mlwf_mp_mlwf_wannier90_ ()
>>>> #11 0x004325ff in vamp () at main.F:2640
>>>> #12 0x0040de1e in main ()
>>>> #13 0x2af84e3fbb15 in __libc_start_main () from /lib64/libc.so.6
>>>> #14 0x0040dd29 in _start ()
>>>> 
>>>> On Wed, Dec 07, 2016 at 09:47:48AM -0800, r...@open-mpi.org <javascript:;>
>>>> wrote:
>>>>> Hi Christof
>>>>> 
>>>>> Sorry if I missed this, but it sounds like you are saying that one of
>>>> your procs abnormally terminates, and we are failing to kill the remaining
>>>> job? Is that correct?
>>>>> 
>>>>> If so, I just did some work that might relate to that problem that is
>>>> pending in PR #2528: https://github.com/open-mpi/ompi/pull/2528 <
>>>> https://github.com/open-mpi/ompi/pull/2528>
>>>>> 
>>>>> Would you be able to try that?
>>>>> 
&

Re: [OMPI users] device failed to appear .. Connection timed out

2016-12-08 Thread r...@open-mpi.org
Sounds like something didn’t quite get configured right, or maybe you have a 
library installed that isn’t quite setup correctly, or...

Regardless, we generally advise building from source to avoid such problems. Is 
there some reason not to just do so?
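
The usual sequence is roughly (the prefix path is just an example):

   tar xf openmpi-1.10.3.tar.bz2 && cd openmpi-1.10.3
   ./configure --prefix=$HOME/sw/openmpi-1.10.3
   make -j 8 all install

and then put $HOME/sw/openmpi-1.10.3/bin first in PATH and the matching lib directory in LD_LIBRARY_PATH.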

> On Dec 8, 2016, at 6:16 AM, Daniele Tartarini  
> wrote:
> 
> Hi,
> 
> I've installed on a Red Hat 7.2 the OpenMPI distributed via Yum:
> 
> openmpi-devel.x86_64 1.10.3-3.el7  
> 
> any code I try to run (including the mpitests-*) I get the following message 
> with slight variants:
> 
>  my_machine.171619hfi_wait_for_device: The /dev/hfi1_0 device failed 
> to appear after 15.0 seconds: Connection timed out
> 
> Is anyone able to help me in identifying the source of the problem?
> Anyway,  /dev/hfi1_0 doesn't exist.
> 
> If I use an OpenMPI version compiled from source I have no issue (gcc 4.8.5).
> 
> many thanks in advance.
> 
> cheers
> Daniele
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread r...@open-mpi.org
Hi Christof

Sorry if I missed this, but it sounds like you are saying that one of your 
procs abnormally terminates, and we are failing to kill the remaining job? Is 
that correct?

If so, I just did some work that might relate to that problem that is pending 
in PR #2528: https://github.com/open-mpi/ompi/pull/2528 


Would you be able to try that?

Ralph

> On Dec 7, 2016, at 9:37 AM, Christof Koehler 
>  wrote:
> 
> Hello,
> 
> On Wed, Dec 07, 2016 at 10:19:10AM -0500, Noam Bernstein wrote:
>>> On Dec 7, 2016, at 10:07 AM, Christof Koehler 
>>>  wrote:
 
>>> I really think the hang is a consequence of
>>> unclean termination (in the sense that the non-root ranks are not
>>> terminated) and probably not the cause, in my interpretation of what I
>>> see. Would you have any suggestion to catch signals sent between orterun
>>> (mpirun) and the child tasks ?
>> 
>> Do you know where in the code the termination call is?  Is it actually 
>> calling mpi_abort(), or just doing something ugly like calling fortran 
>> “stop”?  If the latter, would that explain a possible hang?
> Well, basically it tries to use wannier90 (LWANNIER=.TRUE.). The wannier90 
> input contains
> an error, a restart is requested and the wannier90.chk file the restart
> information is missing.
> "
> Exiting...
> Error: restart requested but wannier90.chk file not found
> "
> So it must terminate.
> 
> The termination happens in the libwannier.a, source file io.F90:
> 
> write(stdout,*)  'Exiting...'
> write(stdout, '(1x,a)') trim(error_msg)
> close(stdout)
> stop "wannier90 error: examine the output/error file for details"
> 
> So it calls stop  as you assumed.
> 
>> Presumably someone here can comment on what the standard says about the 
>> validity of terminating without mpi_abort.
> 
> Well, probably stop is not a good way to terminate then.
> 
> My main point was the change relative to 1.10 anyway :-) 
> 
> 
>> 
>> Actually, if you’re willing to share enough input files to reproduce, I 
>> could take a look.  I just recompiled our VASP with openmpi 2.0.1 to fix a 
>> crash that was apparently addressed by some change in the memory allocator 
>> in a recent version of openmpi.  Just e-mail me if that’s the case.
> 
> I think that is no longer necessary? In principle it is no problem, but
> it is at the end of a (small) GW calculation, the Si tutorial example. 
> So the mail would be a bit larger due to the WAVECAR.
> 
> 
>> 
>>  Noam
>> 
>> 
>> 
>> ||
>> |U.S. NAVAL|
>> |_RESEARCH_|
>> LABORATORY
>> Noam Bernstein, Ph.D.
>> Center for Materials Physics and Technology
>> U.S. Naval Research Laboratory
>> T +1 202 404 8628  F +1 202 404 7546
>> https://www.nrl.navy.mil 
> 
> -- 
> Dr. rer. nat. Christof Köhler   email: c.koeh...@bccms.uni-bremen.de
> Universitaet Bremen/ BCCMS  phone:  +49-(0)421-218-62334
> Am Fallturm 1/ TAB/ Raum 3.12   fax: +49-(0)421-218-62770
> 28359 Bremen  
> 
> PGP: http://www.bccms.uni-bremen.de/cms/people/c_koehler/
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Signal propagation in 2.0.1

2016-12-02 Thread r...@open-mpi.org
Fix is on the way: https://github.com/open-mpi/ompi/pull/2498 
<https://github.com/open-mpi/ompi/pull/2498>

Thanks
Ralph

> On Dec 1, 2016, at 10:49 AM, r...@open-mpi.org wrote:
> 
> Yeah, that’s a bug - we’ll have to address it
> 
> Thanks
> Ralph
> 
>> On Nov 28, 2016, at 9:29 AM, Noel Rycroft <noel.rycr...@cd-adapco.com 
>> <mailto:noel.rycr...@cd-adapco.com>> wrote:
>> 
>> I'm seeing different behaviour between Open MPI 1.8.4 and 2.0.1 with regards 
>> to signal propagation.
>> 
>> With version 1.8.4 mpirun seems to propagate SIGTERM to the tasks it starts 
>> which enables the tasks to handle SIGTERM.
>> 
>> In version 2.0.1 mpirun does not seem to propagate SIGTERM and instead I 
>> suspect it's sending SIGKILL immediately. Because the child tasks are not 
>> given a chance to handle SIGTERM they end up orphaning their child processes.
>> 
>> I have a pretty simple reproducer which consists of:
>> A simple MPI application that sleeps for a number of seconds.
>> A simple bash script which launches mpirun.  
>> A second bash script which is used to launch a 'child' MPI application 
>> 'sleep' binary
>> Both scripts launch their children in the background, and 'wait' on 
>> completion. They both install signal handlers for SIGTERM.
>> 
>> When SIGTERM is sent to the top level script it is explicitly propagated to 
>> 'mpirun' via the signal handler. 
>> 
>> In Open MPI 1.8.4 SIGTERM is propagated to the child MPI tasks which in turn 
>> explicitly propagate the signal to the child binary processes.
>> 
>> In Open MPI 2.0.1 I see no evidence that SIGTERM is propagated to the child 
>> MPI tasks. Instead those tasks are killed and their children (the 
>> application binaries) are orphaned.
>> 
>> Is the difference in behaviour between the different versions expected..?
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Signal propagation in 2.0.1

2016-12-01 Thread r...@open-mpi.org
Yeah, that’s a bug - we’ll have to address it

Thanks
Ralph

> On Nov 28, 2016, at 9:29 AM, Noel Rycroft  wrote:
> 
> I'm seeing different behaviour between Open MPI 1.8.4 and 2.0.1 with regards 
> to signal propagation.
> 
> With version 1.8.4 mpirun seems to propagate SIGTERM to the tasks it starts 
> which enables the tasks to handle SIGTERM.
> 
> In version 2.0.1 mpirun does not seem to propagate SIGTERM and instead I 
> suspect it's sending SIGKILL immediately. Because the child tasks are not 
> given a chance to handle SIGTERM they end up orphaning their child processes.
> 
> I have a pretty simple reproducer which consists of:
> A simple MPI application that sleeps for a number of seconds.
> A simple bash script which launches mpirun.  
> A second bash script which is used to launch a 'child' MPI application 
> 'sleep' binary
> Both scripts launch their children in the background, and 'wait' on 
> completion. They both install signal handlers for SIGTERM.
> 
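
A minimal sketch of that kind of forwarding wrapper (structure only; names are placeholders):

   #!/bin/bash
   # forward SIGTERM to the backgrounded mpirun
   trap 'kill -TERM "$child"' TERM
   mpirun ./sleeper &
   child=$!
   wait "$child"
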
> When SIGTERM is sent to the top level script it is explicitly propagated to 
> 'mpirun' via the signal handler. 
> 
> In Open MPI 1.8.4 SIGTERM is propagated to the child MPI tasks which in turn 
> explicitly propagate the signal to the child binary processes.
> 
> In Open MPI 2.0.1 I see no evidence that SIGTERM is propagated to the child 
> MPI tasks. Instead those tasks are killed and their children (the application 
> binaries) are orphaned.
> 
> Is the difference in behaviour between the different versions expected..?
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] question about "--rank-by slot" behavior

2016-11-30 Thread r...@open-mpi.org
“slot’ never became equivalent to “socket”, or to “core”. Here is what happened:

*for your first example: the mapper assigns the first process to the first node 
because there is a free core there, and you said to map-by core. It goes on to 
assign the second process to the second core, and the third process to the 
third core, etc. until we reach the defined #procs for that node (i.e., the 
number of assigned “slots” for that node). When it goes to rank the procs, the 
ranker starts with the first process assigned on the first node - this process 
occupies the first “slot”, and so it gets rank 0. The ranker then assigns rank 
1 to the second process it assigned to the first node, as that process occupies 
the second “slot”. Etc.

* your 2nd example: the mapper assigns the first process to the first socket of 
the first node, the second process to the second socket of the first node, and 
the third process to the first socket of the first node, until all the “slots” 
for that node have been filled. The ranker then starts with the first process 
that was assigned to the first node, and gives it rank 0. The ranker then 
assigns rank 1 to the second process that was assigned to the node - that would 
be the first proc mapped to the second socket. The ranker then assigns rank 2 
to the third proc assigned to the node - that would be the 2nd proc assigned to 
the first socket.

* your 3rd example: the mapper assigns the first process to the first socket of 
the first node, the second process to the second socket of the first node, and 
the third process to the first socket of the second node, continuing around 
until all procs have been mapped. The ranker then starts with the first proc 
assigned to the first node, and gives it rank 0. The ranker then assigns rank 1 
to the second process assigned to the first node (because we are ranking by 
slot!), which corresponds to the first proc mapped to the second socket. The 
ranker then assigns rank 2 to the third process assigned to the first node, 
which corresponds to the second proc mapped to the first socket of that node.

So you can see that you will indeed get the same relative ranking, even though 
the mapping was done using a different algorithm.
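
To make this concrete, here is a small schematic (4 ranks on a hypothetical 2-socket node, ranking by slot in both cases):

   --map-by core  : procs are mapped to cores 0, 1, 2, 3 in order, so ranks 0-3
                    end up on four consecutive cores of the first socket.
   --map-by socket: procs are mapped to socket 0, socket 1, socket 0, socket 1 in
                    order, so ranks 0 and 2 land on socket 0 while ranks 1 and 3
                    land on socket 1.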

HTH
Ralph

> On Nov 30, 2016, at 2:16 PM, David Shrader <dshra...@lanl.gov> wrote:
> 
> Hello Ralph,
> 
> I do understand that "slot" is an abstract term and isn't tied down to any 
> particular piece of hardware. What I am trying to understand is how "slot" 
> came to be equivalent to "socket" in my second and third example, but "core" 
> in my first example. As far as I can tell, MPI ranks should have been 
> assigned the same in all three examples. Why weren't they?
> 
> You mentioned that, when using "--rank-by slot", the ranks are assigned 
> round-robin by scheduler entry; does this mean that the scheduler entries 
> change based on the mapping algorithm (the only thing I changed in my 
> examples) and this results in ranks being assigned differently?
> 
> Thanks again,
> David
> 
> On 11/30/2016 01:23 PM, r...@open-mpi.org wrote:
>> I think you have confused “slot” with a physical “core”. The two have 
>> absolutely nothing to do with each other.
>> 
>> A “slot” is nothing more than a scheduling entry in which a process can be 
>> placed. So when you --rank-by slot, the ranks are assigned round-robin by 
>> scheduler entry - i.e., you assign all the ranks on the first node, then 
>> assign all the ranks on the next node, etc.
>> 
>> It doesn’t matter where those ranks are placed, or what core or socket they 
>> are running on. We just blindly go thru and assign numbers.
>> 
>> If you rank-by core, then we cycle across the procs by looking at the core 
>> number they are bound to, assigning all the procs on a node before moving to 
>> the next node. If you rank-by socket, then you cycle across the procs on a 
>> node by round-robin of sockets, assigning all procs on the node before 
>> moving to the next node. If you then added “span” to that directive, we’d 
>> round-robin by socket across all nodes before circling around to the next 
>> proc on this node.
>> 
>> HTH
>> Ralph
>> 
>> 
>>> On Nov 30, 2016, at 11:26 AM, David Shrader <dshra...@lanl.gov> wrote:
>>> 
>>> Hello All,
>>> 
>>> The man page for mpirun says that the default ranking procedure is 
>>> round-robin by slot. It doesn't seem to be that straight-forward to me, 
>>> though, and I wanted to ask about the behavior.
>>> 
>>> To help illustrate my confusion, here are a few examples where the ranking 
>>> behavior changed based on the mapping behavior, which doesn't make sense to 
>>> me, yet. F

Re: [OMPI users] question about "--rank-by slot" behavior

2016-11-30 Thread r...@open-mpi.org
I think you have confused “slot” with a physical “core”. The two have 
absolutely nothing to do with each other.

A “slot” is nothing more than a scheduling entry in which a process can be 
placed. So when you --rank-by slot, the ranks are assigned round-robin by 
scheduler entry - i.e., you assign all the ranks on the first node, then assign 
all the ranks on the next node, etc.

It doesn’t matter where those ranks are placed, or what core or socket they are 
running on. We just blindly go thru and assign numbers.

If you rank-by core, then we cycle across the procs by looking at the core 
number they are bound to, assigning all the procs on a node before moving to 
the next node. If you rank-by socket, then you cycle across the procs on a node 
by round-robin of sockets, assigning all procs on the node before moving to the 
next node. If you then added “span” to that directive, we’d round-robin by 
socket across all nodes before circling around to the next proc on this node.

HTH
Ralph


> On Nov 30, 2016, at 11:26 AM, David Shrader  wrote:
> 
> Hello All,
> 
> The man page for mpirun says that the default ranking procedure is 
> round-robin by slot. It doesn't seem to be that straight-forward to me, 
> though, and I wanted to ask about the behavior.
> 
> To help illustrate my confusion, here are a few examples where the ranking 
> behavior changed based on the mapping behavior, which doesn't make sense to 
> me, yet. First, here is a simple map by core (using 4 nodes of 32 cpu cores 
> each):
> 
> $> mpirun -n 128 --map-by core --report-bindings true
> [gr0649.localdomain:119614] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
> [B/././././././././././././././././.][./././././././././././././././././.]
> [gr0649.localdomain:119614] MCW rank 1 bound to socket 0[core 1[hwt 0]]: 
> [./B/./././././././././././././././.][./././././././././././././././././.]
> [gr0649.localdomain:119614] MCW rank 2 bound to socket 0[core 2[hwt 0]]: 
> [././B/././././././././././././././.][./././././././././././././././././.]
> ...output snipped...
> 
> Things look as I would expect: ranking happens round-robin through the cpu 
> cores. Now, here's a map by socket example:
> 
> $> mpirun -n 128 --map-by socket --report-bindings true
> [gr0649.localdomain:119926] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
> [B/././././././././././././././././.][./././././././././././././././././.]
> [gr0649.localdomain:119926] MCW rank 1 bound to socket 1[core 18[hwt 0]]: 
> [./././././././././././././././././.][B/././././././././././././././././.]
> [gr0649.localdomain:119926] MCW rank 2 bound to socket 0[core 1[hwt 0]]: 
> [./B/./././././././././././././././.][./././././././././././././././././.]
> ...output snipped...
> 
> Why is rank 1 on a different socket? I know I am mapping by socket in this 
> example, but, fundamentally, nothing should really be different in terms of 
> ranking, correct? The same number of processes are available on each host as 
> in the first example, and available in the same locations. How is "slot" 
> different in this case? If I use "--rank-by core," I recover the output from 
> the first example.
> 
> I thought that maybe "--rank-by slot" might be following something laid down 
> by "--map-by", but the following example shows that isn't completely correct, 
> either:
> 
> $> mpirun -n 128 --map-by socket:span --report-bindings true
> [gr0649.localdomain:119319] MCW rank 0 bound to socket 0[core 0[hwt 0]]: 
> [B/././././././././././././././././.][./././././././././././././././././.]
> [gr0649.localdomain:119319] MCW rank 1 bound to socket 1[core 18[hwt 0]]: 
> [./././././././././././././././././.][B/././././././././././././././././.]
> [gr0649.localdomain:119319] MCW rank 2 bound to socket 0[core 1[hwt 0]]: 
> [./B/./././././././././././././././.][./././././././././././././././././.]
> ...output snipped...
> 
> If ranking by slot were somehow following something left over by mapping, I 
> would have expected rank 2 to end up on a different host. So, now I don't 
> know what to expect from using "--rank-by slot." Does anyone have any 
> pointers?
> 
> Thank you for the help!
> David
> 
> -- 
> David Shrader
> HPC-ENV High Performance Computer Systems
> Los Alamos National Lab
> Email: dshrader  lanl.gov
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] malloc related crash inside openmpi

2016-11-24 Thread r...@open-mpi.org
Just to be clear: are you saying that mpirun exits with that message? Or is 
your application process exiting with it?

There is no reason for mpirun to be looking for that library.

The library in question is in the lib/openmpi directory under the install prefix, and is 
named mca_ess_pmi.[la,so]
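
A quick way to check on the affected node (a sketch only; substitute the actual install prefix used at configure time for <prefix>):

$ ssh compute-1-35 'ls <prefix>/lib/openmpi/ | grep mca_ess'
$ ssh compute-1-35 'echo $LD_LIBRARY_PATH; ompi_info | grep " ess"'

If mca_ess_pmi.so is not listed in that directory, or ompi_info on that node resolves to a different install, that would explain the error quoted below.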


> On Nov 23, 2016, at 2:31 PM, Noam Bernstein <noam.bernst...@nrl.navy.mil> 
> wrote:
> 
> 
>> On Nov 23, 2016, at 5:26 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> 
>> wrote:
>> 
>> It looks like the library may not have been fully installed on that node - 
>> can you see if the prefix location is present, and that the LD_LIBRARY_PATH 
>> on that node is correctly set? The referenced component did not exist prior 
>> to the 2.0 series, so I’m betting that your LD_LIBRARY_PATH isn’t correct on 
>> that node.
> 
> The LD_LIBRARY_PATH is definitely correct on the node that’s running the 
> mpirun, I checked that, and the openmpi directory is supposedly NFS mounted 
> everywhere.  I suppose installation may have not fully worked and I didn’t 
> notice.  What’s the name of the library it’s looking for?
> 
>   
> Noam
> 
> 
> 
> ||
> |U.S. NAVAL|
> |_RESEARCH_|
> LABORATORY
> 
> Noam Bernstein, Ph.D.
> Center for Materials Physics and Technology
> U.S. Naval Research Laboratory
> T +1 202 404 8628  F +1 202 404 7546
> https://www.nrl.navy.mil <https://www.nrl.navy.mil/>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] malloc related crash inside openmpi

2016-11-23 Thread r...@open-mpi.org
It looks like the library may not have been fully installed on that node - can 
you see if the prefix location is present, and that the LD_LIBRARY_PATH on that 
node is correctly set? The referenced component did not exist prior to the 2.0 
series, so I’m betting that your LD_LIBRARY_PATH isn’t correct on that node.


> On Nov 23, 2016, at 2:21 PM, Noam Bernstein  
> wrote:
> 
> 
>> On Nov 23, 2016, at 3:45 PM, George Bosilca wrote:
>> 
>> Thousands reasons ;)
> 
> Still trying to check if 2.0.1 fixes the problem, and discovered that earlier 
> runs weren’t actually using the version I intended.  When I do use 2.0.1, I 
> get the following errors:
> --
> A requested component was not found, or was unable to be opened.  This
> means that this component is either not installed or is unable to be
> used on your system (e.g., sometimes this means that shared libraries
> that the component requires are unable to be found/loaded).  Note that
> Open MPI stopped checking at the first component that it did not find.
> 
> Host:  compute-1-35
> Framework: ess
> Component: pmi
> --
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_ess_base_open failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --
> 
> I’ve confirmed that mpirun PATH and LD_LIBRARY_PATH are pointing to 2.0.1 
> version of things within the job script.  Configure line is as I’ve used for 
> 1.8.x, i.e.
> export CC=gcc
> export CXX=g++
> export F77=ifort
> export FC=ifort 
> 
> ./configure \
> --prefix=${DEST} \
> --with-tm=/usr/local/torque \
> --enable-mpirun-prefix-by-default \
> --with-verbs=/usr \
> --with-verbs-libdir=/usr/lib64
> Followed by “make install”. Any suggestions for getting 2.0.1 working?
> 
>   thanks,
>   Noam
> 
> 
> ||
> |U.S. NAVAL|
> |_RESEARCH_|
> LABORATORY
> 
> Noam Bernstein, Ph.D.
> Center for Materials Physics and Technology
> U.S. Naval Research Laboratory
> T +1 202 404 8628  F +1 202 404 7546
> https://www.nrl.navy.mil 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Slurm binding not propagated to MPI jobs

2016-11-04 Thread r...@open-mpi.org
See https://github.com/open-mpi/ompi/pull/2365 
<https://github.com/open-mpi/ompi/pull/2365>

Let me know if that solves it for you


> On Nov 3, 2016, at 9:48 AM, Andy Riebs <andy.ri...@hpe.com> wrote:
> 
> Getting that support into 2.1 would be terrific -- and might save us from 
> having to write some Slurm prolog scripts to effect that.
> 
> Thanks Ralph!
> 
> On 11/01/2016 11:36 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
>> Ah crumby!! We already solved this on master, but it cannot be backported to 
>> the 1.10 series without considerable pain. For some reason, the support for 
>> it has been removed from the 2.x series as well. I’ll try to resolve that 
>> issue and get the support reinstated there (probably not until 2.1).
>> 
>> Can you manage until then? I think the v2 RM’s are thinking Dec/Jan for 
>> 2.1.
>> Ralph
>> 
>> 
>>> On Nov 1, 2016, at 11:38 AM, Riebs, Andy <andy.ri...@hpe.com 
>>> <mailto:andy.ri...@hpe.com>> wrote:
>>> 
>>> To close the thread here… I got the following information:
>>>  
>>> Looking at SLURM_CPU_BIND is the right idea, but there are quite a few more 
>>> options. It misses map_cpu, rank, plus the NUMA-based options:
>>> rank_ldom, map_ldom, and mask_ldom. See the srun man pages for 
>>> documentation.
>>>  
>>>  
>>> From: Riebs, Andy 
>>> Sent: Thursday, October 27, 2016 1:53 PM
>>> To: users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>>> Subject: Re: [OMPI users] Slurm binding not propagated to MPI jobs
>>>  
>>> Hi Ralph,
>>> 
>>> I haven't played around in this code, so I'll flip the question over to the 
>>> Slurm list, and report back here when I learn anything.
>>> 
>>> Cheers
>>> Andy
>>> 
>>> On 10/27/2016 01:44 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
>>> Sigh - of course it wouldn’t be simple :-( 
>>>  
>>> All right, let’s suppose we look for SLURM_CPU_BIND:
>>>  
>>> * if it includes the word “none”, then we know the user specified that 
>>> they don’t want us to bind
>>>  
>>> * if it includes the word mask_cpu, then we have to check the value of that 
>>> option.
>>>  
>>> * If it is all F’s, then they didn’t specify a binding and we should do 
>>> our thing.
>>>  
>>> * If it is anything else, then we assume they _did_ specify a binding, and 
>>> we leave it alone
>>>  
>>> Would that make sense? Is there anything else that could be in that envar 
>>> which would trip us up?
>>>  
>>>  
>>> On Oct 27, 2016, at 10:37 AM, Andy Riebs <andy.ri...@hpe.com 
>>> <mailto:andy.ri...@hpe.com>> wrote:
>>>  
>>> Yes, they still exist:
>>> $ srun --ntasks-per-node=2 -N1 env | grep BIND | sort -u
>>> SLURM_CPU_BIND_LIST=0x
>>> SLURM_CPU_BIND=quiet,mask_cpu:0x
>>> SLURM_CPU_BIND_TYPE=mask_cpu:
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> Here are the relevant Slurm configuration options that could conceivably 
>>> change the behavior from system to system:
>>> SelectType  = select/cons_res
>>> SelectTypeParameters= CR_CPU
>>> 
>>>  
>>> On 10/27/2016 01:17 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
>>> And if there is no --cpu_bind on the cmd line? Do these not exist?
>>>  
>>> On Oct 27, 2016, at 10:14 AM, Andy Riebs <andy.ri...@hpe.com 
>>> <mailto:andy.ri...@hpe.com>> wrote:
>>>  
>>> Hi Ralph,
>>> 
>>> I think I've found the magic keys...
>>> 
>>> $ srun --ntasks-per-node=2 -N1 --cpu_bind=none env | grep BIND
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=none
>>> SLURM_CPU_BIND_LIST=
>>> SLURM_CPU_BIND=quiet,none
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=none
>>> SLURM_CPU_BIND_LIST=
>>> SLURM_CPU_BIND=quiet,none
>>> $ srun --ntasks-per-node=2 -N1 --cpu_bind=core env | grep BIND
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=mask_cpu:
>>> SLURM_CPU_BIND_LIST=0x,0x
>>> SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=mask_cpu:
>>> SLURM_CPU_BIND_LIST=0x,0x
>>> SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
>>> 
>>> Andy
>&g

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
All true - but I reiterate. The source of the problem is that the "--map-by 
node” on the cmd line must come *before* your application. Otherwise, none of 
these suggestions will help.

> On Nov 4, 2016, at 6:52 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
> In your case, using slots or --npernode or --map-by node will result in the 
> same distribution of processes because you're only launching 1 process per 
> node (a.k.a. "1ppn").
> 
> They have more pronounced differences when you're launching more than 1ppn.
> 
> Let's take a step back: you should know that Open MPI uses 3 phases to plan 
> out how it will launch your MPI job:
> 
> 1. Mapping: where each process will go
> 2. Ordering: after mapping, how each process will be numbered (this 
> translates to rank ordering MPI_COMM_WORLD)
> 3. Binding: binding processes to processors
> 
> #3 is not pertinent to this conversation, so I'll leave it out of my 
> discussion below.
> 
> We're mostly talking about #1 here.  Let's look at each of the three options 
> mentioned in this thread individually.  In each of the items below, I assume 
> you are using *just* that option, and *neither of the other 2 options*:
> 
> 1. slots: this tells Open MPI the maximum number of processes that can be 
> placed on a server before it is considered to be "oversubscribed" (and Open 
> MPI won't let you oversubscribe by default).
> 
> So when you say "slots=1", you're basically telling Open MPI to launch 1 
> process per node and then to move on to the next node.  If you said 
> "slots=3", then Open MPI would launch up to 3 processes per node before 
> moving on to the next (until the total np processes were launched).
> 
> *** Be aware that we have changed the hostfile default value of slots (i.e., 
> what number of slots to use if it is not specified in the hostfile) in 
> different versions of Open MPI.  When using hostfiles, in most cases, you'll 
> see either a default value of 1 or the total number of cores on the node.
> 
> 2. --map-by node: in this case, Open MPI will map out processes round robin 
> by *node* instead of its default by *core*.  Hence, even if you had "slots=3" 
> and -np 9, Open MPI would first put a process on node A, then put a process 
> on node B, then a process on node C, and then loop back to putting a 2nd 
> process on node A, ...etc.
> 
> 3. --npernode: in this case, you're telling Open MPI how many processes to 
> put on each node before moving on to the next node.  E.g., if you "mpirun -np 
> 9 ..." (and assuming you have >=3 slots per node), Open MPI will put 3 
> processes on each node before moving on to the next node.
> 
> With the default MPI_COMM_WORLD rank ordering, the practical difference in 
> these three options is:
> 
> Case 1:
> 
> $ cat hostfile
> a slots=3
> b slots=3
> c slots=3
> $ mpirun --hostfile hostfile -np 9 my_mpi_executable
> 
> In this case, you'll end up with MCW ranks 0-2 on a, 3-5 on b, and 6-8 on c.
> 
> Case 2:
> 
> # Setting an arbitrarily large number of slots per host just to be explicitly 
> clear for this example
> $ cat hostfile
> a slots=20
> b slots=20
> c slots=20
> $ mpirun --hostfile hostfile -np 9 --map-by node my_mpi_executable
> 
> In this case, you'll end up with MCW ranks 0,3,6 on a, 1,4,7 on b, and 2,5,8 
> on c.
> 
> Case 3:
> 
> # Setting an arbitrarily large number of slots per host just to be explicitly 
> clear for this example
> $ cat hostfile
> a slots=20
> b slots=20
> c slots=20
> $ mpirun --hostfile hostfile -np 9 --npernode 3 my_mpi_executable
> 
> In this case, you'll end up with the same distribution / rank ordering as 
> case #1, but you'll still have 17 more slots you could have used.
> 
> There are lots of variations on this, too, because these mpirun options (and 
> many others) can be used in conjunction with each other.  But that gets 
> pretty esoteric pretty quickly; most users don't have a need for such 
> complexity.
> 
> 
> 
>> On Nov 4, 2016, at 8:57 AM, Bennet Fauber <ben...@umich.edu> wrote:
>> 
>> Mahesh,
>> 
>> Depending on what you are trying to accomplish, might using the mpirun option
>> 
>> -pernode (or --pernode)
>> 
>> work for you?  That requests that only one process be spawned per
>> available node.
>> 
>> We generally use this for hybrid codes, where the single process will
>> spawn threads to the remaining processors.
>> 
>> Just a thought,   -- bennet
>> 
>> 
>> 
>> 
>> 
>> On Fri, Nov 4, 2016 at 8:39 AM, Mahesh Nanavalla
>> <mahesh.na

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
My apologies - the problem is that you list the option _after_ your executable 
name, and so we think it is an argument for your executable. You need to list 
the option _before_ your executable on the cmd line
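
For example, something along these lines should give one process per node (illustrative only, reusing the hostfile from the quoted message):

/usr/bin/mpirun --allow-run-as-root --map-by node -np 3 --hostfile myhostfile /usr/bin/openmpiWiFiBulb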


> On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla <mahesh.nanavalla...@gmail.com> 
> wrote:
> 
> Thanks for the reply,
> 
> But with the space it is still not running one process on each node.
> 
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
> myhostfile /usr/bin/openmpiWiFiBulb --map-by node
> 
> And 
> 
> If I use it like this, it's working fine (running one process on each node):
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host 
> root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 
> /usr/bin/openmpiWiFiBulb 
> 
> But I want to use the hostfile only.
> Kindly help me.
> 
> 
> On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> 
> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
> you mistyped the option - it is “--map-by node”. Note the space between “by” 
> and “node” - you had typed it with a “-“ instead of a “space”
> 
> 
>> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla <mahesh.nanavalla...@gmail.com 
>> <mailto:mahesh.nanavalla...@gmail.com>> wrote:
>> 
>> Hi all,
>> 
>> I am using openmpi-1.10.3 on quad-core processor nodes.
>> 
>> I am running 3 processes on three nodes (provided by a hostfile); processes 
>> are limited to one per node with --map-by-node, as below
>> 
>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
>> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
>> 
>> root@OpenWrt:~# cat myhostfile 
>> root@10.73.145.1:1
>> root@10.74.25.1:1
>> root@10.74.46.1:1
>> 
>> 
>> The problem is that all 3 processes run on one node. It's not mapping 
>> one process per node.
>> 
>> Is there any library needed to run it like the above? If yes, please tell me.
>> 
>> Kindly help me figure out where I am going wrong...
>> 
>> Thanks,
>> Mahesh N
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/users>
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
you mistyped the option - it is “--map-by node”. Note the space between “by” 
and “node” - you had typed it with a “-“ instead of a “space”


> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla  
> wrote:
> 
> Hi all,
> 
> I am using openmpi-1.10.3 on quad-core processor nodes.
> 
> I am running 3 processes on three nodes (provided by a hostfile); processes 
> are limited to one per node with --map-by-node, as below
> 
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
> 
> root@OpenWrt:~# cat myhostfile 
> root@10.73.145.1:1 
> root@10.74.25.1:1 
> root@10.74.46.1:1 
> 
> 
> The problem is that all 3 processes run on one node. It's not mapping 
> one process per node.
> 
> Is there any library needed to run it like the above? If yes, please tell me.
> 
> Kindly help me figure out where I am going wrong...
> 
> Thanks,
> Mahesh N
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Slurm binding not propagated to MPI jobs

2016-11-01 Thread r...@open-mpi.org
Ah crumby!! We already solved this on master, but it cannot be backported to 
the 1.10 series without considerable pain. For some reason, the support for it 
has been removed from the 2.x series as well. I’ll try to resolve that issue 
and get the support reinstated there (probably not until 2.1).

Can you manage until then? I think the v2 RM’s are thinking Dec/Jan for 2.1.
Ralph
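
For reference, the check sketched in the exchange quoted below boils down to roughly the following (shell pseudocode only, not the actual implementation; the all-F test is simplified):

case "$SLURM_CPU_BIND" in
  *none*)
      ;;  # user explicitly told srun not to bind -- we should not bind either
  *mask_cpu*)
      # an all-F mask means no explicit binding was requested
      if echo "$SLURM_CPU_BIND_LIST" | tr ',' '\n' | grep -qvi '^0xf*$'; then
          :   # user supplied an explicit mask -- leave the srun binding alone
      else
          :   # no explicit binding -- apply the default binding policy
      fi
      ;;
  *)
      ;;  # nothing recognizable -- apply the default binding policy
esac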


> On Nov 1, 2016, at 11:38 AM, Riebs, Andy <andy.ri...@hpe.com> wrote:
> 
> To close the thread here… I got the following information:
>  
> Looking at SLURM_CPU_BIND is the right idea, but there are quite a few more 
> options. It misses map_cpu, rank, plus the NUMA-based options:
> rank_ldom, map_ldom, and mask_ldom. See the srun man pages for documentation.
>  
>  
> From: Riebs, Andy 
> Sent: Thursday, October 27, 2016 1:53 PM
> To: users@lists.open-mpi.org
> Subject: Re: [OMPI users] Slurm binding not propagated to MPI jobs
>  
> Hi Ralph,
> 
> I haven't played around in this code, so I'll flip the question over to the 
> Slurm list, and report back here when I learn anything.
> 
> Cheers
> Andy
> 
> On 10/27/2016 01:44 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
> Sigh - of course it wouldn’t be simple :-( 
>  
> All right, let’s suppose we look for SLURM_CPU_BIND:
>  
> * if it includes the word “none”, then we know the user specified that they 
> don’t want us to bind
>  
> * if it includes the word mask_cpu, then we have to check the value of that 
> option.
>  
> * If it is all F’s, then they didn’t specify a binding and we should do our 
> thing.
>  
> * If it is anything else, then we assume they _did_ specify a binding, and we 
> leave it alone
>  
> Would that make sense? Is there anything else that could be in that envar 
> which would trip us up?
>  
>  
> On Oct 27, 2016, at 10:37 AM, Andy Riebs <andy.ri...@hpe.com 
> <mailto:andy.ri...@hpe.com>> wrote:
>  
> Yes, they still exist:
> $ srun --ntasks-per-node=2 -N1 env | grep BIND | sort -u
> SLURM_CPU_BIND_LIST=0x
> SLURM_CPU_BIND=quiet,mask_cpu:0x
> SLURM_CPU_BIND_TYPE=mask_cpu:
> SLURM_CPU_BIND_VERBOSE=quiet
> Here are the relevant Slurm configuration options that could conceivably 
> change the behavior from system to system:
> SelectType  = select/cons_res
> SelectTypeParameters= CR_CPU
> 
>  
> On 10/27/2016 01:17 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
> And if there is no --cpu_bind on the cmd line? Do these not exist?
>  
> On Oct 27, 2016, at 10:14 AM, Andy Riebs <andy.ri...@hpe.com 
> <mailto:andy.ri...@hpe.com>> wrote:
>  
> Hi Ralph,
> 
> I think I've found the magic keys...
> 
> $ srun --ntasks-per-node=2 -N1 --cpu_bind=none env | grep BIND
> SLURM_CPU_BIND_VERBOSE=quiet
> SLURM_CPU_BIND_TYPE=none
> SLURM_CPU_BIND_LIST=
> SLURM_CPU_BIND=quiet,none
> SLURM_CPU_BIND_VERBOSE=quiet
> SLURM_CPU_BIND_TYPE=none
> SLURM_CPU_BIND_LIST=
> SLURM_CPU_BIND=quiet,none
> $ srun --ntasks-per-node=2 -N1 --cpu_bind=core env | grep BIND
> SLURM_CPU_BIND_VERBOSE=quiet
> SLURM_CPU_BIND_TYPE=mask_cpu:
> SLURM_CPU_BIND_LIST=0x,0x
> SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
> SLURM_CPU_BIND_VERBOSE=quiet
> SLURM_CPU_BIND_TYPE=mask_cpu:
> SLURM_CPU_BIND_LIST=0x,0x
> SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
> 
> Andy
> 
> On 10/27/2016 11:57 AM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote:
> 
> Hey Andy
> 
> Is there a SLURM envar that would tell us the binding option from the srun 
> cmd line? We automatically bind when direct launched due to user complaints 
> of poor performance if we don’t. If the user specifies a binding 
> option, then we detect that we were already bound and don’t do it.
> 
> However, if the user specifies that they not be bound, then we think they 
> simply didn’t specify anything - and that isn’t the case. If we 
> can see something that tells us “they explicitly said not to do 
> it”, then we can avoid the situation.
> 
> Ralph
> 
> 
> On Oct 27, 2016, at 8:48 AM, Andy Riebs <andy.ri...@hpe.com 
> <mailto:andy.ri...@hpe.com>> wrote:
> 
> Hi All,
> 
> We are running Open MPI version 1.10.2, built with support for Slurm version 
> 16.05.0. When a user specifies "--cpu_bind=none", MPI tries to bind by core, 
> which segv's if there are more processes than cores.
> 
> The user reports:
> 
> What I found is that
> 
> % srun --ntasks-per-node=8 --cpu_bind=none  \
> env SHMEM_SYMMETRIC_HEAP_SIZE=1024M bin/all2all.shmem.exe 0
> 
> will have the problem, but:
> 
> % srun --ntasks-per-node=8 --cpu_

Re: [OMPI users] MCA compilation later

2016-10-31 Thread r...@open-mpi.org
Here’s a link on how to create components:

https://github.com/open-mpi/ompi/wiki/devel-CreateComponent

and if you want to create a completely new framework:

https://github.com/open-mpi/ompi/wiki/devel-CreateFramework

If you want to distribute a proprietary plugin, you first develop and build it 
within the OMPI code base on your own machines. Then, just take the dll for 
your plugin from the lib/openmpi directory under the install prefix and distribute that “blob”.

I’ll correct my comment: you need the headers and the libraries. You just don’t 
need the hardware, though it means you cannot test those features.
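
A concrete sketch of that workflow (the paths and the openib component name here are illustrative only):

# on a build host that has the IB headers/libraries installed, build the same
# Open MPI version, then copy just the plugin into the deployed tree:
$ scp build-host:/opt/openmpi-2.0.1/lib/openmpi/mca_btl_openib.so /deployed/openmpi/lib/openmpi/
# confirm the deployed install now picks it up:
$ /deployed/openmpi/bin/ompi_info | grep openib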


> On Oct 31, 2016, at 6:19 AM, Sean Ahern <s...@ensight.com> wrote:
> 
> Thanks. That's what I expected and hoped. But is there a pointer about how to 
> get started? If I've got an existing OpenMPI build, what's the process to get 
> a new MCA plugin built with a new set of header files?
> 
> (I'm a bit surprised only header files are necessary. Shouldn't the plugin 
> require at least runtime linking with a low-level transport library?)
> 
> -Sean
> 
> --
> Sean Ahern
> Computational Engineering International
> 919-363-0883
> 
> On Fri, Oct 28, 2016 at 3:40 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> 
> <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote:
> You don’t need any of the hardware - you just need the headers. Things like 
> libfabric and libibverbs are all publicly available, and so you can build all 
> that support even if you cannot run it on your machine.
> 
> Once your customer installs the binary, the various plugins will check for 
> their required library and hardware and disqualify themselves if it isn’t 
> found.
> 
>> On Oct 28, 2016, at 12:33 PM, Sean Ahern <s...@ensight.com 
>> <mailto:s...@ensight.com>> wrote:
>> 
>> There's been discussion on the OpenMPI list recently about static linking of 
>> OpenMPI with all of the desired MCAs in it. I've got the opposite question. 
>> I'd like to add MCAs later on to an already-compiled version of OpenMPI and 
>> am not quite sure how to do it.
>> 
>> Let me summarize. We've got a commercial code that we deploy on customer 
>> machines in binary form. We're working to integrate OpenMPI into the 
>> installer, and things seem to be progressing well. (Note: because we're a 
>> commercial code, making the customer compile something doesn't work for us 
>> like it can for open source or research codes.)
>> 
>> Now, we want to take advantage of OpenMPI's ability to find MCAs at runtime, 
>> pointing to the various plugins that might apply to a deployed system. I've 
>> configured and compiled OpenMPI on one of our build machines, one that 
>> doesn't have any special interconnect hardware or software installed. We 
>> take this compiled version of OpenMPI and use it on all of our machines. 
>> (Yes, I've read Building FAQ #39 
>> <https://www.open-mpi.org/faq/?category=building#installdirs> about 
>> relocating OpenMPI. Useful, that.) I'd like to take our pre-compiled version 
>> of OpenMPI and add MCA libraries to it, giving OpenMPI the ability to 
>> communicate via transport mechanisms that weren't available on the original 
>> build machine. Things like InfiniBand, OmniPath, or one of Cray's 
>> interconnects.
>> 
>> How would I go about doing this? And what are the limitations?
>> 
>> I'm guessing that I need to go configure and compile the same version of 
>> OpenMPI on a machine that has the desired interconnect installation (headers 
>> and libraries), then go grab the corresponding lib/openmpi/mca_*{la,so} 
>> files. Take those files and drop them in our pre-built OpenMPI from our 
>> build machine in the same relative plugin location (lib/openmpi). If I stick 
>> with the same compiler (gcc, in this case), I'm hoping that symbols will all 
>> resolve themselves at runtime. (I probably will have to do some 
>> LD_LIBRARY_PATH games to be sure to find the appropriate underlying 
>> libraries unless OpenMPI's process for building MCAs links them in 
>> statically somehow.)
>> 
>> Am I even on the right track here? (The various system-level FAQs (here 
>> <https://www.open-mpi.org/faq/?category=supported-systems>, here 
>> <https://www.open-mpi.org/faq/?category=developers>, and especially here 
>> <https://www.open-mpi.org/faq/?category=sysadmin>) seem to suggest that I 
>> am.)
>> 
>> Our first test platform will be getting OpenMPI via IB working on our 
>> cluster, where we have IB (and TCP/IP) functional and not OpenMPI. This will 
>> be a great stand-in for a customer that has an I
