Re: [OMPI users] Setting bind-to none as default via environment?
No problem. It wasn't much of a delay. The scenario involves a combination of MPI and OpenMP (or other threading scheme). Basically, the software will launch one or more processes via MPI, which then spawn threads to do the work. What we've been seeing is that, without something like '--bind-to none' or similar, those threads end up being pinned to the same processor as the process that spawned them. We're okay with a bind=none, since we already have cgroups in place to constrain the user to the resources they request. We might get more process/thread migration between processors (but within the cgroup) than we would like, but that's still probably acceptable in this scenario. If there's a better solution, we'd love to hear it. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 11/03/2015 08:16 AM, Ralph Castain wrote: > Sorry for delay - was on travel. > > hwloc_base_binding_policy=none > > Alternatively, you may get better performance if you bind to numa or > socket levels, assuming you want one proc per socket: > > hwloc_base_binding_policy=socket [or numa] > rmaps_base_mapping_policy=socket [or numa] > > HTH > Ralph > >> On Nov 2, 2015, at 8:31 AM, Lloyd Brown > <mailto:lloyd_br...@byu.edu>> wrote: >> >> Is there an environment variable option, as well as the >> openmpi-mca-params.conf to set the equivalent of "--bind-to none"? >> Similar to how I can specify the environment variable >> "OMPI_MCA_btl=^openib" instead of the cli param "--mca btl ^openib"? >> >> We're running into a situation where users have a combination of OpenMPI >> and OpenMP threads, and the threads get constrained to the same >> processor where the OpenMPI process was launched. As far as we can >> tell, this started with v1.8.x. >> >> >> Lloyd Brown >> Systems Administrator >> Fulton Supercomputing Lab >> Brigham Young University >> http://marylou.byu.edu <http://marylou.byu.edu/> >> >> On 10/01/2015 09:02 AM, Nick Papior wrote: >>> You can define default mca parameters in this file: >>> /etc/openmpi-mca-params.conf >>> >>> 2015-10-01 16:57 GMT+02:00 Grigory Shamov >>> mailto:grigory.sha...@umanitoba.ca> >>> <mailto:grigory.sha...@umanitoba.ca>>: >>> >>>Hi All, >>> >>>A parhaps naive question: is it possible to set ' mpiexec —bind-to >>>none ' as a system-wide default in 1.10, like, by setting an >>>OMPI_xxx variable? >>> >>>-- >>>Grigory Shamov >>>Westgrid/ComputeCanada Site Lead >>>University of Manitoba >>>E2-588 EITC Building, >>>(204) 474-9625 >>> >>> >>> >>>___ >>>users mailing list >>>us...@open-mpi.org >>> <mailto:us...@open-mpi.org> <mailto:us...@open-mpi.org> >>>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >>>Link to this post: >>>http://www.open-mpi.org/community/lists/users/2015/10/27764.php >>> >>> >>> >>> >>> -- >>> Kind regards Nick >>> >>> >>> ___ >>> users mailing list >>> us...@open-mpi.org <mailto:us...@open-mpi.org> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >>> Link to this post: >>> http://www.open-mpi.org/community/lists/users/2015/10/27765.php >>> >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >> Link to this >> post: http://www.open-mpi.org/community/lists/users/2015/11/27974.php > > > > ___ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/11/27978.php >
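For reference, here is how the hwloc/rmaps parameters above combine with the OMPI_MCA_ environment-variable convention mentioned in the original question. The conf-file path assumes a default system-wide install, and the socket lines are only needed if you prefer mapping/binding by socket instead of turning binding off entirely:

    # /etc/openmpi-mca-params.conf (or $prefix/etc/openmpi-mca-params.conf)
    hwloc_base_binding_policy = none

    # equivalent environment-variable form (e.g., in a profile script or module file)
    export OMPI_MCA_hwloc_base_binding_policy=none

    # or, to map and bind by socket rather than disabling binding:
    export OMPI_MCA_hwloc_base_binding_policy=socket
    export OMPI_MCA_rmaps_base_mapping_policy=socket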
Re: [OMPI users] Setting bind-to none as default via environment?
Is there an environment variable option, as well as the openmpi-mca-params.conf to set the equivalent of "--bind-to none"? Similar to how I can specify the environment variable "OMPI_MCA_btl=^openib" instead of the cli param "--mca btl ^openib"? We're running into a situation where users have a combination of OpenMPI and OpenMP threads, and the threads get constrained to the same processor where the OpenMPI process was launched. As far as we can tell, this started with v1.8.x. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 10/01/2015 09:02 AM, Nick Papior wrote: > You can define default mca parameters in this file: > /etc/openmpi-mca-params.conf > > 2015-10-01 16:57 GMT+02:00 Grigory Shamov <mailto:grigory.sha...@umanitoba.ca>>: > > Hi All, > > A parhaps naive question: is it possible to set ' mpiexec —bind-to > none ' as a system-wide default in 1.10, like, by setting an > OMPI_xxx variable? > > -- > Grigory Shamov > Westgrid/ComputeCanada Site Lead > University of Manitoba > E2-588 EITC Building, > (204) 474-9625 > > > > ___ > users mailing list > us...@open-mpi.org <mailto:us...@open-mpi.org> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/10/27764.php > > > > > -- > Kind regards Nick > > > ___ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/10/27765.php >
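Any MCA parameter can be expressed in three equivalent forms, which is what the question is driving at; an illustration using the btl parameter from the question itself:

    mpirun --mca btl ^openib ./myapp        # command-line parameter
    export OMPI_MCA_btl=^openib             # environment variable
    btl = ^openib                           # line in openmpi-mca-params.conf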
Re: [OMPI users] understanding BTL selection process
As much as I hate to reply to myself, I'm going to in this case. Digging deeper into the old OS image (I found a couple of nodes that I forgot to image), it looks like libibverbs and librdmacm were, in fact installed. That explains how the previous image was able to avoid the "cannot open shared object file" messages. My current theory is that somewhere between the (very) old version of librdmacm on the old image, and the new version on the new image, that there was a change that started to emit the "librdmacm: Fatal: no RDMA devices found" messages. All of this implies that the difference is related to something that happened with librdmacm, not something that changed in OpenMPI. Sorry for the list noise. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 03/02/2015 02:42 PM, Lloyd Brown wrote: > I hope this isn't too basic of a question, but is there a document > somewhere that describes how the selection of which BTL components (eg. > openib, tcp) to use occurs when mpirun/mpiexec is launched? I know it > can be influenced by conf files, parameters, and env variables. But > lacking those, how does it choose which components to use? > > I'm trying to diagnose an issue involving OpenMPI, OFED, and an OS > upgrade. I'm hoping that better understanding of how components are > selected, will help me figure out what changed with the OS upgrade. > > > > > Here's a longer explanation. > > We recently upgraded our HPC cluster from RHEL 6.2 to 6.6. We have > several versions of OpenMPI availale from a central NFS store. Our > cluster has some nodes with IB hardware, and some without. > > On the old OS image, we did not install any of the OFED components on > the non-IB nodes, and OpenMPI was able to somehow figure out that it > shouldn't even try the openib btl, without any runtime warnings. We got > the speeds we were expecting, when running osu_bw tests from the OMB > test suite, for either the IB nodes (about 3800 MB/s for 4xQDR IB), or > the non-IB nodes (about 115 MB/s for 1GbE). > > Since the OS upgrade, we start to get warnings like this on non-IB nodes > without OFED installed: > >> $ mpirun -np 2 hello_world >> [m7stage-1-1:09962] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot >> open shared object file: No such file or directory (ignored) >> [m7stage-1-1:09961] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot >> open shared object file: No such file or directory (ignored) >> [m7stage-1-1:09961] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: >> cannot open shared object file: No such file or directory (ignored) >> [m7stage-1-1:09962] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: >> cannot open shared object file: No such file or directory (ignored) >> Hello from process # 0 of 2 on host m7stage-1-1 >> Hello from process # 1 of 2 on host m7stage-1-1 > > Obviously these are references to software components associated with > OFED. 
We can install OFED on the non-IB nodes, but then we get warnings > more like this: > >> $ mpirun -np 2 hello_world >> librdmacm: Fatal: no RDMA devices found >> librdmacm: Fatal: no RDMA devices found >> -- >> [[63448,1],0]: A high-performance Open MPI point-to-point messaging module >> was unable to find any relevant network interfaces: >> >> Module: OpenFabrics (openib) >> Host: m7stage-1-1 >> >> Another transport will be used instead, although this may result in >> lower performance. >> -- >> Hello from process # 0 of 2 on host m7stage-1-1 >> Hello from process # 1 of 2 on host m7stage-1-1 >> [m7stage-1-1:18753] 1 more process has sent help message >> help-mpi-btl-base.txt / btl:no-nics >> [m7stage-1-1:18753] Set MCA parameter "orte_base_help_aggregate" to 0 to see >> all help / error messages > > Obviously we can work with this by using "--mca btl ^openib" or similar > on the non-IB nodes. And we're pursuing that option. > > But I'm struggling to understand what happened to cause OpenMPI on the > non-IB node, without OFED installed, to no longer be able to figure out > that it shouldn't use the openib btl. Thus the reason why I ask for > more information about how that decision is being made. Maybe that will > clue me in, as to what changed. > > > > Thanks, >
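A quick way to compare the old and new images with respect to the libraries suspected above; the package names are the usual RHEL/CentOS ones, so adjust if your OFED stack packages them differently:

    # is the library present, and which package/version provides it?
    ldconfig -p | grep -E 'librdmacm|libibverbs'
    rpm -q librdmacm libibverbs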
[OMPI users] understanding BTL selection process
I hope this isn't too basic of a question, but is there a document somewhere that describes how the selection of which BTL components (eg. openib, tcp) to use occurs when mpirun/mpiexec is launched? I know it can be influenced by conf files, parameters, and env variables. But lacking those, how does it choose which components to use? I'm trying to diagnose an issue involving OpenMPI, OFED, and an OS upgrade. I'm hoping that better understanding of how components are selected, will help me figure out what changed with the OS upgrade. Here's a longer explanation. We recently upgraded our HPC cluster from RHEL 6.2 to 6.6. We have several versions of OpenMPI availale from a central NFS store. Our cluster has some nodes with IB hardware, and some without. On the old OS image, we did not install any of the OFED components on the non-IB nodes, and OpenMPI was able to somehow figure out that it shouldn't even try the openib btl, without any runtime warnings. We got the speeds we were expecting, when running osu_bw tests from the OMB test suite, for either the IB nodes (about 3800 MB/s for 4xQDR IB), or the non-IB nodes (about 115 MB/s for 1GbE). Since the OS upgrade, we start to get warnings like this on non-IB nodes without OFED installed: > $ mpirun -np 2 hello_world > [m7stage-1-1:09962] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot > open shared object file: No such file or directory (ignored) > [m7stage-1-1:09961] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot > open shared object file: No such file or directory (ignored) > [m7stage-1-1:09961] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: > cannot open shared object file: No such file or directory (ignored) > [m7stage-1-1:09962] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: > cannot open shared object file: No such file or directory (ignored) > Hello from process # 0 of 2 on host m7stage-1-1 > Hello from process # 1 of 2 on host m7stage-1-1 Obviously these are references to software components associated with OFED. We can install OFED on the non-IB nodes, but then we get warnings more like this: > $ mpirun -np 2 hello_world > librdmacm: Fatal: no RDMA devices found > librdmacm: Fatal: no RDMA devices found > -- > [[63448,1],0]: A high-performance Open MPI point-to-point messaging module > was unable to find any relevant network interfaces: > > Module: OpenFabrics (openib) > Host: m7stage-1-1 > > Another transport will be used instead, although this may result in > lower performance. > -- > Hello from process # 0 of 2 on host m7stage-1-1 > Hello from process # 1 of 2 on host m7stage-1-1 > [m7stage-1-1:18753] 1 more process has sent help message > help-mpi-btl-base.txt / btl:no-nics > [m7stage-1-1:18753] Set MCA parameter "orte_base_help_aggregate" to 0 to see > all help / error messages Obviously we can work with this by using "--mca btl ^openib" or similar on the non-IB nodes. And we're pursuing that option. But I'm struggling to understand what happened to cause OpenMPI on the non-IB node, without OFED installed, to no longer be able to figure out that it shouldn't use the openib btl. Thus the reason why I ask for more information about how that decision is being made. Maybe that will clue me in, as to what changed. 
Thanks, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
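One way to watch the selection process the question asks about is to raise the BTL framework's verbosity; Open MPI then logs which components it opens, which ones exclude themselves, and which ones are used for each peer (the exact messages vary by version):

    # list the BTL components that are installed at all
    ompi_info | grep -i btl

    # watch the selection decisions at run time
    mpirun --mca btl_base_verbose 100 -np 2 ./osu_bw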
Re: [OMPI users] busy waiting and oversubscriptions
I don't know about your users, but experience has, unfortunately, taught us to assume that users' jobs are very, very badly-behaved. I choose to assume that it's incompetence on the part of programmers and users, rather than malice, though. :-) Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 03/27/2014 04:49 PM, Dave Love wrote: > Actually there's no need for cpusets unless jobs are badly-behaved and > escape their bindings.
Re: [OMPI users] Oversubscription of nodes with Torque and OpenMPI
As far as I understand, the mpirun will assign processes to hosts in the hostlist ($PBS_NODEFILE) sequentially, and if it runs out of hosts in the list, it starts over at the top of the file. Theoretically, you should be able to request specific hostnames, and the processor counts per hostname, in your torque submit request. I'm not sure if this is correct (we don't use Torque here anymore, and I'm going off memory), but it should be approximately correct: > qsub -l nodes=n:2+n0001:2+n0002:8+n0003:8+n0004:2+n0005:2+n0006:2+n0007:4 > ... Granted, that's awkward, but I'm not sure if there's another way in Torque to request different numbers of processors per node. You might ask on the Torque Users list. They might tell you to change the nodes file to reflect the number of actual processes you want on each node, rather than the number of physical processors on the hosts. Whether this works for you, depends on whether you want this type of oversubscription to happen all the time, or on a per-job basis, etc. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 11/22/2013 11:11 AM, Gans, Jason D wrote: > I have tried the 1.7 series (specifically 1.7.3) and I get the same > behavior. > > When I run "mpirun -oversubscribe -np 24 hostname", three instances of > "hostname" are run on each node. > > The contents of the $PBS_NODEFILE are: > n0007 > n0006 > n0005 > n0004 > n0003 > n0002 > n0001 > n > > but, since I have compiled OpenMPI using the "--with-tm", it appears > that OpenMPI is not using the $PBS_NODEFILE (which I tested by modifying > the torque pbs_mom to write a $PBS_NODEFILE that contained "slot=xx" > information for each node. mpirun complained when I did this). > > Regards, > > Jason > > > *From:* users [users-boun...@open-mpi.org] on behalf of Ralph Castain > [r...@open-mpi.org] > *Sent:* Friday, November 22, 2013 11:04 AM > *To:* Open MPI Users > *Subject:* Re: [OMPI users] Oversubscription of nodes with Torque and > OpenMPI > > Really shouldn't matter - this is clearly a bug in OMPI if it is doing > mapping as you describe. Out of curiosity, have you tried the 1.7 > series? Does it behave the same? > > I can take a look at the code later today and try to figure out what > happened. > > On Nov 22, 2013, at 9:56 AM, Jason Gans <mailto:jg...@lanl.gov>> wrote: > >> On 11/22/13 10:47 AM, Reuti wrote: >>> Hi, >>> >>> Am 22.11.2013 um 17:32 schrieb Gans, Jason D: >>> >>>> I would like to run an instance of my application on every *core* of >>>> a small cluster. I am using Torque 2.5.12 to run jobs on the >>>> cluster. The cluster in question is a heterogeneous collection of >>>> machines that are all past their prime. Specifically, the number of >>>> cores ranges from 2-8. Here is the Torque "nodes" file: >>>> >>>> n np=2 >>>> n0001 np=2 >>>> n0002 np=8 >>>> n0003 np=8 >>>> n0004 np=2 >>>> n0005 np=2 >>>> n0006 np=2 >>>> n0007 np=4 >>>> >>>> When I use openmpi-1.6.3, I can oversubscribe nodes but the tasks >>>> are allocated to nodes without regard to the number of cores on each >>>> node (specified by the "np=xx" in the nodes file). For example, when >>>> I run "mpirun -np 24 hostname", mpirun places three instances of >>>> "hostname" on each node, despite the fact that some nodes only have >>>> two processors and some have more. >>> You submitted the job itself by requesting 24 cores for it too? >>> >>> -- Reuti >> Since there are only 8 Torque nodes in the cluster, I submitted the >> job by requesting 8 nodes, i.e. 
"qsub -I -l nodes=8". >>> >>> >>>> Is there a way to have OpenMPI "gracefully" oversubscribe nodes by >>>> allocating instances based on the "np=xx" information in the Torque >>>> nodes file? It this a Torque problem? >>>> >>>> p.s. I do get the desired behavior when I run *without* Torque and >>>> specify the following machine file to mpirun: >>>> >>>> n slots=2 >>>> n0001 slots=2 >>>> n0002 slots=8 >>>> n0003 slots=8 >>>> n0004 slots=2 >>>> n0005 slots=2 >>>> n0006 slots=2 >>>> n0007 slots=4 >>>> >>>> Regards, >>>> >>>> Jason >>>> >>>> >>>> >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org <mailto:us...@open-mpi.org> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> ___ >>> users mailing list >>> us...@open-mpi.org <mailto:us...@open-mpi.org> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] Debugging Runtime/Ethernet Problems
1 - How do I check the BTLs available? Something like "ompi_info | grep -i btl"? If so, here's the list: > MCA btl: ofud (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: openib (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: self (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: sm (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: tcp (MCA v2.0, API v2.0, Component v1.6.3) 2 - The IP interfaces on all nodes are: - em1 - Ethernet - IP in the 192.168.216.0/22 range - ib0 - IPoIB (only on IB-enabled nodes) - IP in the 192.168.212.0/22 range - lo - loopback - 127.0.0.1/8 And I think that Jeff is absolutely right. This syntax did work: > mpirun --mca btl ^openib --mca btl_tcp_if_exclude > 192.168.212.0/22,127.0.0.1/8 ./osu_bw And this one too, which is basically equivalent in this case: > mpirun --mca btl ^openib --mca btl_tcp_if_exclude ib0,lo ./osu_bw It is interesting to me, though, that I need to explicitly exclude lo/127.0.0.1 in this case, but when I'm on an Ethernet-only node, and I just do the plain "mpirun ./appname", I don't have to exclude anything, and it figures out to use em1, and not lo. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 09/20/2013 10:31 AM, Jeff Squyres (jsquyres) wrote: > On Sep 20, 2013, at 12:27 PM, Lloyd Brown wrote: > >> Interesting. I was taking the approach of "only exclude what you're >> certain you don't want" (the native IB and TCP/IPoIB stuff) since I >> wasn't confident enough in my knowledge of the OpenMPI internals, to >> know what I should explicitly include. >> >> However, taking Jeff's suggestion, this does seem to work, and gives me >> the expected Ethernet performance: >> >> "mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include em1 ./osu_bw" >> >> So, in short, I'm still not sure why my exclude syntax doesn't work. > > Check two things: > > 1. What BTLs are available? Is there some other BTL that may be used instead > of openib? > > 2. (this one is more likely) What IP interfaces are available on all nodes? > The most obvious guess here is that you didn't exclude 127.0.0.1/8, and OMPI > found this interface on all nodes, and therefore assumed that it was > routable/usable on all nodes. Hence, one quick experiment might be to try > your exclude syntax again, but *also* exclude 127.0.0.8/8. >
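If the exclude form above is what you end up wanting on the Ethernet-only nodes, it can be made the default on those nodes instead of a per-job flag by putting the same parameters into the MCA params file under the install prefix (a sketch):

    # openmpi-mca-params.conf on the Ethernet-only nodes
    btl = ^openib
    btl_tcp_if_exclude = ib0,lo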
Re: [OMPI users] Debugging Runtime/Ethernet Problems
Interesting. I was taking the approach of "only exclude what you're certain you don't want" (the native IB and TCP/IPoIB stuff) since I wasn't confident enough in my knowledge of the OpenMPI internals, to know what I should explicitly include. However, taking Jeff's suggestion, this does seem to work, and gives me the expected Ethernet performance: "mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include em1 ./osu_bw" So, in short, I'm still not sure why my exclude syntax doesn't work. But the include-driven syntax that Jeff suggested, does seem to work. I admit I'm still curious to understand how to get OpenMPI to give me the details of what's going on. But the immediate problem of getting the numbers out of osu_bw and osu_latency, seems to be solved. Thanks everyone. I really appreciate it. -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 09/20/2013 09:33 AM, Jeff Squyres (jsquyres) wrote: > Correct -- it doesn't make sense to specify both include *and* exclude: by > specifying one, you're implicitly (but exactly/precisely) specifying the > other. > > My suggestion would be to use positive notation, not negative notation. For > example: > > mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 ... > > That way, you *know* you're only getting the TCP and self BTLs, and you > *know* you're only getting eth0. If that works, then spread out from there, > e.g.: > > mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include eth0,eth1 ... > > E.g., also include the "sm" BTL (which is only used for shared memory > communications between 2 procs on the same server, and is therefore useless > for a 2-proc-across-2-server run of osu_bw, but you get the idea), but also > use eth0 and eth1. > > And so on. > > The problem with using ^openib and/or btl_tcp_if_exclude is that you might > end up using some BTLs and/or TCP interfaces that you don't expect, and > therefore can run into problems. > > Make sense? > > > > On Sep 20, 2013, at 11:17 AM, Ralph Castain wrote: > >> I don't think you are allowed to specify both include and exclude options at >> the same time as they conflict - you should either exclude ib0 or include >> eth0 (or whatever). >> >> My guess is that the various nodes are trying to communicate across disjoint >> networks. We've seen that before when, for example, eth0 on one node is on >> one subnet, and eth0 on another node is on a different subnet. You might >> look for that kind of arrangement. >> >> >> On Sep 20, 2013, at 8:05 AM, "Elken, Tom" wrote: >> >>>> The trouble is when I try to add some "--mca" parameters to force it to >>>> use TCP/Ethernet, the program seems to hang. I get the headers of the >>>> "osu_bw" output, but no results, even on the first case (1 byte payload >>>> per packet). This is occurring on both the IB-enabled nodes, and on the >>>> Ethernet-only nodes. The specific syntax I was using was: "mpirun >>>> --mca btl ^openib --mca btl_tcp_if_exclude ib0 ./osu_bw" >>> >>> When we want to run over TCP and IPoIB on an IB/PSM equipped cluster, we >>> use: >>> --mca btl sm --mca btl tcp,self --mca btl_tcp_if_exclude eth0 --mca >>> btl_tcp_if_include ib0 --mca mtl ^psm >>> >>> based on this, it looks like the following might work for you: >>> --mca btl sm,tcp,self --mca btl_tcp_if_exclude ib0 --mca btl_tcp_if_include >>> eth0 --mca btl ^openib >>> >>> If you don't have ib0 ports configured on the IB nodes, probably you don't >>> need the" --mca btl_tcp_if_exclude ib0." 
>>> >>> -Tom >>> >>>> >>>> The problem occurs at least with OpenMPI 1.6.3 compiled with GNU 4.4 >>>> compilers, with 1.6.3 compiled with Intel 13.0.1 compilers, and with >>>> 1.6.5 compiled with Intel 13.0.1 compilers. I haven't tested any other >>>> combinations yet. >>>> >>>> Any ideas here? It's very possible this is a system configuration >>>> problem, but I don't know where to look. At this point, any ideas would >>>> be welcome, either about the specific situation, or general pointers on >>>> mpirun debugging flags to use. I can't find much in the docs yet on >>>> run-time debugging for OpenMPI, as opposed to debugging the application. >>>> Maybe I'm just looking in the wrong place. >>>> >>>> >>>> Thanks, >>>> >>>> -- >>>> Lloyd Brown >>>> Systems Administrator >>>> Fulton Supercomputing Lab >>>> Brigham Young University >>>> http://marylou.byu.edu >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >
[OMPI users] Debugging Runtime/Ethernet Problems
Hi, all. We've got a couple of clusters running RHEL 6.2, and have several centrally-installed versions/compilations of OpenMPI. Some of the nodes have 4xQDR Infiniband, and all the nodes have 1 gigabit ethernet. I was gathering some bandwidth and latency numbers using the OSU/OMB tests, and noticed some weird behavior. When I run a simple "mpirun ./osu_bw" on a couple of IB-enabled node, I get numbers consistent with our IB speed (up to about 3800 MB/s), and when I run the same thing on two nodes with only Ethernet, I get speeds consistent with that (up to about 120 MB/s). So far, so good. The trouble is when I try to add some "--mca" parameters to force it to use TCP/Ethernet, the program seems to hang. I get the headers of the "osu_bw" output, but no results, even on the first case (1 byte payload per packet). This is occurring on both the IB-enabled nodes, and on the Ethernet-only nodes. The specific syntax I was using was: "mpirun --mca btl ^openib --mca btl_tcp_if_exclude ib0 ./osu_bw" The problem occurs at least with OpenMPI 1.6.3 compiled with GNU 4.4 compilers, with 1.6.3 compiled with Intel 13.0.1 compilers, and with 1.6.5 compiled with Intel 13.0.1 compilers. I haven't tested any other combinations yet. Any ideas here? It's very possible this is a system configuration problem, but I don't know where to look. At this point, any ideas would be welcome, either about the specific situation, or general pointers on mpirun debugging flags to use. I can't find much in the docs yet on run-time debugging for OpenMPI, as opposed to debugging the application. Maybe I'm just looking in the wrong place. Thanks, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Re: [OMPI users] check point restart
I know that in the past it has been supported via toolkits like BLCR, but I don't know the current level of support, to be honest. I think I heard somewhere that the checkpoint/restart support in OpenMPI was going away in some fashion. In any case, if you have the ability to set up application-aware, application-specific checkpointing, it will be a much better solution than something that's application-agnostic. The checkpoint files will be smaller (the application knows what in memory is important, and what isn't), coordination will be better between processes, you have some level of assurance that you won't have PID conflicts or problems when the PID ends up different, etc. I suspect someone on the list can answer your question about the built-in checkpoint/restart code better than I can. But in general, if you have a choice between checkpointing external and internal to your application, choose the application-internal checkpointing. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 07/19/2013 01:34 PM, Erik Nelson wrote: > I run mpi on an NSF computer. One of the conditions of use is that jobs > are limited to 24 hr > duration to provide democratic allotment to its users. > > A long program can require many restarts, so it becomes necessary to > store the state of the > program in memory, print it, recompile, and and read the state to start > again. > > I seem to remember a simpler approach (check point restart?) in which > the state of the .exe > code is saved and then simply restarted from its current position. > > Is there something like this for restarting an mpi program? > > Thanks, Erik > > > -- > Erik Nelson > > Howard Hughes Medical Institute > 6001 Forest Park Blvd., Room ND10.124 > Dallas, Texas 75235-9050 > > p : 214 645 5981 > f : 214 645 5948 > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
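To make the application-internal suggestion above concrete, here is a minimal sketch of the pattern in C/MPI. This is not Open MPI's built-in checkpoint/restart support; the file names, checkpoint interval, and the single "state" variable are placeholders for whatever your application actually needs to save:

    /* ckpt_sketch.c - minimal application-level checkpoint/restart in MPI.
     * Each rank periodically writes its own state to a per-rank file and,
     * at startup, resumes from that file if one exists. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i, start = 0;
        double state = 0.0;
        char ckpt[64], tmp[72];
        FILE *f;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        snprintf(ckpt, sizeof(ckpt), "ckpt.rank%d.dat", rank);
        snprintf(tmp, sizeof(tmp), "%s.tmp", ckpt);

        /* Restart: if a checkpoint file exists, resume where it left off. */
        if ((f = fopen(ckpt, "rb")) != NULL) {
            if (fread(&start, sizeof(start), 1, f) != 1 ||
                fread(&state, sizeof(state), 1, f) != 1) {
                start = 0;
                state = 0.0;
            }
            fclose(f);
        }

        for (i = start; i < 1000000; i++) {
            state += 1.0;                  /* stand-in for the real work */

            if (i % 100000 == 0 && i > start) {
                /* Checkpoint: write a temp file, then rename, so a crash
                 * mid-write never corrupts the previous checkpoint. */
                if ((f = fopen(tmp, "wb")) != NULL) {
                    int next = i + 1;
                    fwrite(&next, sizeof(next), 1, f);
                    fwrite(&state, sizeof(state), 1, f);
                    fclose(f);
                    rename(tmp, ckpt);
                }
            }
        }

        if (rank == 0)
            printf("final state on rank 0: %f\n", state);
        MPI_Finalize();
        return 0;
    }

Compile with mpicc and launch under mpirun as usual; if the job dies, re-running the same mpirun command resumes from the last per-rank checkpoint files, assuming the working directory (and therefore the files) is visible to the restarted ranks.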
Re: [OMPI users] PBS jobs with OPENMPI
As far as I know, the mpdboot is not needed with OpenMPI. You should just be able to call mpirun or mpiexec directly. If your OpenMPI installation was compiled to use the TM API with Torque, you just do it like this, and it figures it all out: mpirun myprogram Otherwise, you will need to supply the number of nodes and nodefile, like this: NP=`wc -l $PBS_NODEFILE | awk '{print $1}'` mpirun -n $NP -hostfile $PBS_NODEFILE myprogram Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 11/19/2012 03:28 PM, Mariana Vargas Magana wrote: > Hi all > Help !! I have to send a job using #PBS and in the script example there is > something like this because the cluster is using MPICH2 > In my case i nee Openmpi to run my code so I installed locally, in this case > anyone knows what it is the equivalent of this commands because it is not > recognized like that... > > mpdboot -n ${NNODES} -f ${PBS_NODEFILE} -v --remcons > Thanks !! > > Mariana > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
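Putting that together, a skeleton Torque submission script might look like the following; the resource request, walltime, and program name are placeholders, and the commented-out lines are the fallback for an Open MPI build without TM support:

    #!/bin/bash
    #PBS -l nodes=2:ppn=8
    #PBS -l walltime=01:00:00
    cd $PBS_O_WORKDIR

    # With TM support compiled into Open MPI, mpirun gets the node list
    # and process count from Torque automatically:
    mpirun ./myprogram

    # Without TM support, fall back to the $PBS_NODEFILE:
    # NP=`wc -l $PBS_NODEFILE | awk '{print $1}'`
    # mpirun -n $NP -hostfile $PBS_NODEFILE ./myprogram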
Re: [OMPI users] PG compilers and OpenMPI 1.6.1
Thanks for getting this in so quickly. Yes, the nightly tarball from Aug 25 (a1r27142), seems to get through a configure and make stage at least. Thanks, Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/25/2012 05:18 AM, Jeff Squyres wrote: > I've merged the VT fix into the 1.6 branch; it will be available in tonight's > tarball. > > Can you give a nightly v1.6 tarball a whirl to ensure it fixes your problem? > > http://www.open-mpi.org/nightly/v1.6/ > > > On Aug 23, 2012, at 7:30 PM, Jeff Squyres (jsquyres) wrote: > >> Yes. VT = vampirtrace. >> >> Sent from my phone. No type good. >> >> On Aug 23, 2012, at 6:49 PM, "Lloyd Brown" wrote: >> >>> Okay. Sounds good. I'll watch that bug. >>> >>> For my own sanity check, "vt" means VampirTrace stuff, right? In our >>> environment, I don't think it'll be a problem to disable VampirTrace >>> temporarily. More people here use the Intel and GNU compiled versions >>> anyway, both of which compile just fine with 1.6.1. >>> >>> Lloyd Brown >>> Systems Administrator >>> Fulton Supercomputing Lab >>> Brigham Young University >>> http://marylou.byu.edu >>> >>> On 08/23/2012 04:43 PM, Jeff Squyres wrote: >>>> This was reported earlier today: >>>> >>>> https://svn.open-mpi.org/trac/ompi/ticket/3251 >>>> >>>> I've alerted the VT guys to have a look. For a workaround, you can >>>> --disable-vt. >>>> >>>> >>>> On Aug 23, 2012, at 6:00 PM, Ralph Castain wrote: >>>> >>>>> Just looking at your output, it looks like there is a missing header that >>>>> PGI requires - I have no idea what that might be. You might do a search >>>>> for omp_lock_t to see where it is defined and add that head to the >>>>> vt_wrapper.cc file and see if that fixes the problem >>>>> >>>>> On Aug 23, 2012, at 2:44 PM, Lloyd Brown wrote: >>>>> >>>>>> Has anyone been able to get OpenMPI 1.6.1 to compile with a recent >>>>>> Portland Group compiler set? I'm currently trying on RHEL 6.2 with PG >>>>>> compilers v12.5 (2012), and I keep getting errors like the ones below. >>>>>> It could easily be a problem with the compiler code, but since this >>>>>> doesn't happen with OpenMPI 1.6, I'm not sure. Can anyone provide any >>>>>> insight on what might have changed with respect to that file >>>>>> ('ompi/contrib/vt/vt/tools/vtwrapper/vt_wrapper.cc') between 1.6 and >>>>>> 1.6.1? >>>>>> >>>>>> Thanks, >>>>>> Lloyd >>>>>> >>>>>> >>>>>> Error Messages: >>>>>> >>>>>>> [root@rocks6staging vtwrapper]# pwd >>>>>>> /tmp/openmpi-1.6.1/ompi/contrib/vt/vt/tools/vtwrapper >>>>>>> [root@rocks6staging vtwrapper]# make V=1 >>>>>>> source='vt_wrapper.cc' object='vtwrapper-vt_wrapper.o' libtool=no \ >>>>>>> DEPDIR=.deps depmode=none /bin/sh ../../config/depcomp \ >>>>>>> pgcpp -DHAVE_CONFIG_H -I. -I../.. 
-I../../include -I../../include >>>>>>> -I../../util -I../../util -DINSIDE_OPENMPI -D_REENTRANT >>>>>>> -I/tmp/openmpi-1.6.1/opal/mca/hwloc/hwloc132/hwloc/include >>>>>>> -I/usr/include/infiniband -I/usr/include/infiniband -DHAVE_FC >>>>>>> -DHAVE_MPI -DHAVE_FMPI -DHAVE_THREADS -DHAVE_OMP -fast -c -o >>>>>>> vtwrapper-vt_wrapper.o `test -f 'vt_wrapper.cc' || echo >>>>>>> './'`vt_wrapper.cc >>>>>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 356: error: >>>>>>> identifier "omp_lock_t" is undefined >>>>>>> omp_lock_t _M_lock; >>>>>>> ^ >>>>>>> >>>>>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 359: error: >>>>>>> identifier "omp_init_lock" is undefined >>>>>>> omp_init_lock(&_M_lock); >>>>>>> ^ >>>>>>> >>>>>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.
Re: [OMPI users] PG compilers and OpenMPI 1.6.1
Okay. Sounds good. I'll watch that bug. For my own sanity check, "vt" means VampirTrace stuff, right? In our environment, I don't think it'll be a problem to disable VampirTrace temporarily. More people here use the Intel and GNU compiled versions anyway, both of which compile just fine with 1.6.1. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/23/2012 04:43 PM, Jeff Squyres wrote: > This was reported earlier today: > > https://svn.open-mpi.org/trac/ompi/ticket/3251 > > I've alerted the VT guys to have a look. For a workaround, you can > --disable-vt. > > > On Aug 23, 2012, at 6:00 PM, Ralph Castain wrote: > >> Just looking at your output, it looks like there is a missing header that >> PGI requires - I have no idea what that might be. You might do a search for >> omp_lock_t to see where it is defined and add that head to the vt_wrapper.cc >> file and see if that fixes the problem >> >> On Aug 23, 2012, at 2:44 PM, Lloyd Brown wrote: >> >>> Has anyone been able to get OpenMPI 1.6.1 to compile with a recent >>> Portland Group compiler set? I'm currently trying on RHEL 6.2 with PG >>> compilers v12.5 (2012), and I keep getting errors like the ones below. >>> It could easily be a problem with the compiler code, but since this >>> doesn't happen with OpenMPI 1.6, I'm not sure. Can anyone provide any >>> insight on what might have changed with respect to that file >>> ('ompi/contrib/vt/vt/tools/vtwrapper/vt_wrapper.cc') between 1.6 and 1.6.1? >>> >>> Thanks, >>> Lloyd >>> >>> >>> Error Messages: >>> >>>> [root@rocks6staging vtwrapper]# pwd >>>> /tmp/openmpi-1.6.1/ompi/contrib/vt/vt/tools/vtwrapper >>>> [root@rocks6staging vtwrapper]# make V=1 >>>> source='vt_wrapper.cc' object='vtwrapper-vt_wrapper.o' libtool=no \ >>>> DEPDIR=.deps depmode=none /bin/sh ../../config/depcomp \ >>>> pgcpp -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../include >>>> -I../../util -I../../util -DINSIDE_OPENMPI -D_REENTRANT >>>> -I/tmp/openmpi-1.6.1/opal/mca/hwloc/hwloc132/hwloc/include >>>> -I/usr/include/infiniband -I/usr/include/infiniband -DHAVE_FC -DHAVE_MPI >>>> -DHAVE_FMPI -DHAVE_THREADS -DHAVE_OMP -fast -c -o vtwrapper-vt_wrapper.o >>>> `test -f 'vt_wrapper.cc' || echo './'`vt_wrapper.cc >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 356: error: >>>> identifier "omp_lock_t" is undefined >>>> omp_lock_t _M_lock; >>>> ^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 359: error: >>>> identifier "omp_init_lock" is undefined >>>> omp_init_lock(&_M_lock); >>>> ^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 364: error: >>>> identifier "omp_destroy_lock" is undefined >>>>omp_destroy_lock(&_M_lock); >>>>^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 369: error: >>>> identifier "omp_set_lock" is undefined >>>>omp_set_lock(&_M_lock); >>>>^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 375: error: >>>> identifier "omp_set_lock" is undefined >>>>omp_set_lock(&_M_lock); >>>>^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 380: error: >>>> identifier "omp_unset_lock" is undefined >>>> omp_unset_lock(&_M_lock); >>>> ^ >>>> >>>> 6 errors detected in the compilation of "vt_wrapper.cc". 
>>>> make: *** [vtwrapper-vt_wrapper.o] Error 2 >>>> [root@rocks6staging vtwrapper]# >>> >>> >>> >>> -- >>> Lloyd Brown >>> Systems Administrator >>> Fulton Supercomputing Lab >>> Brigham Young University >>> http://marylou.byu.edu >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >
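For anyone hitting the same build failure, the workaround mentioned above amounts to adding --disable-vt at configure time. A sketch only: the PGI compiler names match the 12.x release discussed here, and the install prefix is just an example:

    ./configure CC=pgcc CXX=pgcpp F77=pgfortran FC=pgfortran \
        --prefix=/opt/openmpi_pgi/1.6.1 --disable-vt
    make -j4 && make install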
[OMPI users] PG compilers and OpenMPI 1.6.1
Has anyone been able to get OpenMPI 1.6.1 to compile with a recent Portland Group compiler set? I'm currently trying on RHEL 6.2 with PG compilers v12.5 (2012), and I keep getting errors like the ones below. It could easily be a problem with the compiler code, but since this doesn't happen with OpenMPI 1.6, I'm not sure. Can anyone provide any insight on what might have changed with respect to that file ('ompi/contrib/vt/vt/tools/vtwrapper/vt_wrapper.cc') between 1.6 and 1.6.1? Thanks, Lloyd Error Messages: > [root@rocks6staging vtwrapper]# pwd > /tmp/openmpi-1.6.1/ompi/contrib/vt/vt/tools/vtwrapper > [root@rocks6staging vtwrapper]# make V=1 > source='vt_wrapper.cc' object='vtwrapper-vt_wrapper.o' libtool=no \ > DEPDIR=.deps depmode=none /bin/sh ../../config/depcomp \ > pgcpp -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../include > -I../../util -I../../util -DINSIDE_OPENMPI -D_REENTRANT > -I/tmp/openmpi-1.6.1/opal/mca/hwloc/hwloc132/hwloc/include > -I/usr/include/infiniband -I/usr/include/infiniband -DHAVE_FC -DHAVE_MPI > -DHAVE_FMPI -DHAVE_THREADS -DHAVE_OMP -fast -c -o vtwrapper-vt_wrapper.o > `test -f 'vt_wrapper.cc' || echo './'`vt_wrapper.cc > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 356: error: > identifier "omp_lock_t" is undefined > omp_lock_t _M_lock; > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 359: error: > identifier "omp_init_lock" is undefined > omp_init_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 364: error: > identifier "omp_destroy_lock" is undefined > omp_destroy_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 369: error: > identifier "omp_set_lock" is undefined > omp_set_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 375: error: > identifier "omp_set_lock" is undefined > omp_set_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 380: error: > identifier "omp_unset_lock" is undefined > omp_unset_lock(&_M_lock); > ^ > > 6 errors detected in the compilation of "vt_wrapper.cc". > make: *** [vtwrapper-vt_wrapper.o] Error 2 > [root@rocks6staging vtwrapper]# -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Re: [OMPI users] Measuring latency
That's fine. In that case, you just compile it with your MPI implementation and do something like this: mpiexec -np 2 -H masterhostname,slavehostname ./osu_latency There may be some all-to-all latency tools too. I don't really remember. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/21/2012 03:41 PM, Maginot Junior wrote: > Sorry for the type, what I meant was "and" not "em". > Thank you for the quick response, I will take a look at your suggestion
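For anyone searching the archives later, building and running the OSU tests is typically just the following; the exact source file names and any helper headers vary between OMB releases, so treat this as a sketch:

    mpicc -O2 -o osu_latency osu_latency.c
    mpicc -O2 -o osu_bw osu_bw.c
    mpirun -np 2 -H masterhostname,slavehostname ./osu_latency
    mpirun -np 2 -H masterhostname,slavehostname ./osu_bw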
Re: [OMPI users] Measuring latency
I'm not really familiar enough to know what you mean by "em slaves", but for general testing of bandwidth and latency, I usually use the "OSU Micro-benchmarks" (see http://mvapich.cse.ohio-state.edu/benchmarks/). Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/21/2012 03:32 PM, Maginot Junior wrote: > Hello. > How do you suggest me to measure the latency between master em slaves > in my cluster? Is there any tool that I can use to test the > performance of my environment? > Thanks > > > -- > Maginot Júnior > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] rpmbuild defining opt install path
That's a really good idea. The trouble is that I need to have multiple versions installed (eg. compiled with the various compilers), so I think I still need to manipulate name in some way, so the packages will be named differently. But _prefix should definitely give me more flexibility as to where it's installed. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 06/27/2012 11:12 AM, Jeff Squyres wrote: > On Jun 26, 2012, at 2:40 PM, Lloyd Brown wrote: > >> Is there an easy way with the .spec file and the rpmbuild command, for >> me to override the path the OpenMPI RPM installs into, in /opt? >> Basically, I'm already doing something like this: > > I think all you need to do is override the RPM-builtin names, like _prefix > (and possibly some others). For example, I did this in RHEL 6.2: > > rpmbuild --rebuild --define '_prefix /tmp/bogus' \ > /home/jsquyres/RPMS/SRPMS/openmpi-1.6-1.src.rpm > > Which resulted in: > > + ./configure --build=x86_64-unknown-linux-gnu > --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu > --program-prefix= --prefix=/tmp/bogus --exec-prefix=/tmp/bogus > --bindir=/tmp/bogus/bin --sbindir=/tmp/bogus/sbin --sysconfdir=/tmp/bogus/etc > --datadir=/tmp/bogus/share --includedir=/tmp/bogus/include > --libdir=/tmp/bogus/lib64 --libexecdir=/tmp/bogus/libexec > --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man > --infodir=/usr/share/info > > For some reason, this didn't override localstatedir, sharedstatedir, mandir, > and infodir (gotta love RPM! :-) ), so I did: > > rpmbuild --rebuild --define '_prefix /tmp/bogus' --define '_localstatedir > /tmp/bogus/var' --define '_sharedstatedir /tmp/bogus/var/lib' --define > '_mandir /tmp/bogus/share/man' --define '_infodir /tmp/bogus/share/info' > /home/jsquyres/RPMS/SRPMS/openmpi-1.6-1.src.rpm > > When then gave me what I think you want: > > + ./configure --build=x86_64-unknown-linux-gnu > --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu > --program-prefix= --prefix=/tmp/bogus --exec-prefix=/tmp/bogus > --bindir=/tmp/bogus/bin --sbindir=/tmp/bogus/sbin --sysconfdir=/tmp/bogus/etc > --datadir=/tmp/bogus/share --includedir=/tmp/bogus/include > --libdir=/tmp/bogus/lib64 --libexecdir=/tmp/bogus/libexec > --localstatedir=/tmp/bogus/var --sharedstatedir=/tmp/bogus/var/lib > --mandir=/tmp/bogus/share/man --infodir=/tmp/bogus/share/info >
Re: [OMPI users] rpmbuild defining opt install path
Something else interesting that I just discovered. If I do this, I have the problem: rpmbuild --rebuild -bb path/to/openmpi-1.6-2.src.rpm However, if I do an "rpm -i path/to/openmpi-1.6-2.src.rpm", and then do very-similar rpmbuild syntax, it puts everything where I want it: rpmbuild -bb path/to/openmpi-1.6.spec In this case, the "" are all exactly the same. Clearly there's something I'm missing about the RPM build process. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 06/26/2012 12:40 PM, Lloyd Brown wrote: > Is there an easy way with the .spec file and the rpmbuild command, for > me to override the path the OpenMPI RPM installs into, in /opt? > Basically, I'm already doing something like this: > > rpmbuild --rebuild --define 'install_in_opt 1' --define '_name > fsl_openmpi_intel' --define 'name fsl_openmpi_intel' ... -bb > openmpi-1.6-2.src.rpm > > For some reason, though, while most of it ends up in > "/opt/fsl_openmpi_intel/1.6", as I intend, a few files still get put > into "/opt/openmpi/1.6/etc", and I'm not sure what else I can do to put > it where I want: > >> # rpm -q -l -p /root/rpmbuild/RPMS/x86_64/fsl_openmpi_intel-1.6-2.x86_64.rpm >> /opt/fsl_openmpi_intel >> /opt/fsl_openmpi_intel/1.6 >> /opt/fsl_openmpi_intel/1.6/bin >> /opt/fsl_openmpi_intel/1.6/bin/mpiCC >> /opt/fsl_openmpi_intel/1.6/bin/mpiCC-vt >> /opt/fsl_openmpi_intel/1.6/bin/mpic++ >> /opt/fsl_openmpi_intel/1.6/bin/mpic++-vt >> /opt/fsl_openmpi_intel/1.6/bin/mpicc >> ... >> /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.dtd >> /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.xml >> /opt/openmpi/1.6/etc >> /opt/openmpi/1.6/etc/openmpi-default-hostfile >> /opt/openmpi/1.6/etc/openmpi-mca-params.conf >> /opt/openmpi/1.6/etc/openmpi-totalview.tcl >> /opt/openmpi/1.6/etc/vt-java-default-filter.spec >> /opt/openmpi/1.6/etc/vtsetup-config.dtd >> /opt/openmpi/1.6/etc/vtsetup-config.xml > > I realize it might not be a good idea in general to override "name" and > "_name" like this, so if there's an easier way, I'd be happy to do it. > I just haven't found anything yet, and haven't yet found the place in > the spec file where it's being set to "/opt/openmpi" again. > > We're probably going to end up with at least 3 versions of v1.6 (gcc > compilers, intel compilers, pgi compilers) and possibly a few of a > previous version, so putting everything in /opt/openmpi/VERSION, is a > little problematic. > > Thanks,
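Spelled out, the two-step recipe that worked here looks roughly like this; the --define list is the same one shown in the quoted message below, and the SPECS path is wherever your rpmbuild tree lives:

    rpm -i openmpi-1.6-2.src.rpm
    cd ~/rpmbuild/SPECS
    rpmbuild -bb --define 'install_in_opt 1' \
        --define '_name fsl_openmpi_intel' \
        --define 'name fsl_openmpi_intel' \
        openmpi-1.6.spec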
[OMPI users] rpmbuild defining opt install path
Is there an easy way with the .spec file and the rpmbuild command, for me to override the path the OpenMPI RPM installs into, in /opt? Basically, I'm already doing something like this: rpmbuild --rebuild --define 'install_in_opt 1' --define '_name fsl_openmpi_intel' --define 'name fsl_openmpi_intel' ... -bb openmpi-1.6-2.src.rpm For some reason, though, while most of it ends up in "/opt/fsl_openmpi_intel/1.6", as I intend, a few files still get put into "/opt/openmpi/1.6/etc", and I'm not sure what else I can do to put it where I want: > # rpm -q -l -p /root/rpmbuild/RPMS/x86_64/fsl_openmpi_intel-1.6-2.x86_64.rpm > /opt/fsl_openmpi_intel > /opt/fsl_openmpi_intel/1.6 > /opt/fsl_openmpi_intel/1.6/bin > /opt/fsl_openmpi_intel/1.6/bin/mpiCC > /opt/fsl_openmpi_intel/1.6/bin/mpiCC-vt > /opt/fsl_openmpi_intel/1.6/bin/mpic++ > /opt/fsl_openmpi_intel/1.6/bin/mpic++-vt > /opt/fsl_openmpi_intel/1.6/bin/mpicc > ... > /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.dtd > /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.xml > /opt/openmpi/1.6/etc > /opt/openmpi/1.6/etc/openmpi-default-hostfile > /opt/openmpi/1.6/etc/openmpi-mca-params.conf > /opt/openmpi/1.6/etc/openmpi-totalview.tcl > /opt/openmpi/1.6/etc/vt-java-default-filter.spec > /opt/openmpi/1.6/etc/vtsetup-config.dtd > /opt/openmpi/1.6/etc/vtsetup-config.xml I realize it might not be a good idea in general to override "name" and "_name" like this, so if there's an easier way, I'd be happy to do it. I just haven't found anything yet, and haven't yet found the place in the spec file where it's being set to "/opt/openmpi" again. We're probably going to end up with at least 3 versions of v1.6 (gcc compilers, intel compilers, pgi compilers) and possibly a few of a previous version, so putting everything in /opt/openmpi/VERSION, is a little problematic. Thanks, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Re: [OMPI users] regarding the problem occurred while running anmpi programs
Yes, but what happens when you run a remote, non-login shell? By that, I mean something like this: ssh master@ip-10-80-106-70 'echo $LD_LIBRARY_PATH' Assuming I got the syntax right, I suspect you'll find that the contents of the variable, do not include /usr/local/openmpi-1.4.5/lib. You really need that to be in LD_LIBRARY_PATH (or some other method) on all nodes, in all shells for the user. One simple way to do this is via the startup files (eg. .bashrc and .bash_profile for bash, .cshrc for csh/tcsh, etc.) Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 04/25/2012 09:43 AM, seshendra seshu wrote: > Hi > I have exported the library files as below > > [master@ip-10-80-106-70 ~]$ export > LD_LIBRARY_PATH=/usr/local/openmpi-1.4.5/lib:$LD_LIBRARY_PATH > > > [master@ip-10-80-106-70 ~]$ mpirun --prefix /usr/local/openmpi-1.4.5 -n > 1 --hostfile hostfile out > out: error while loading shared libraries: libmpi_cxx.so.0: cannot open > shared object file: No such file or directory > [master@ip-10-80-106-70 ~]$ mpirun --prefix /usr/local/lib/ -n 1 > --hostfile hostfile > out > > > out: error while loading shared libraries: libmpi_cxx.so.0: cannot open > shared object file: No such file or directory > > But still iam getting the same error. > > > > > > On Wed, Apr 25, 2012 at 5:36 PM, Jeff Squyres (jsquyres) > mailto:jsquy...@cisco.com>> wrote: > > See the FAQ item I cited. > > Sent from my phone. No type good. > > On Apr 25, 2012, at 11:24 AM, "seshendra seshu" <mailto:seshu...@gmail.com>> wrote: > >> Hi >> now i have created an used and tried to run the program but i got >> the following error >> >> [master@ip-10-80-106-70 ~]$ mpirun -n 1 --hostfile hostfile >> out >> >> >> out: error while loading shared libraries: libmpi_cxx.so.0: cannot >> open shared object file: No such file or directory >> >> >> thanking you >> >> >> >> On Wed, Apr 25, 2012 at 5:12 PM, Jeff Squyres > <mailto:jsquy...@cisco.com>> wrote: >> >> On Apr 25, 2012, at 11:06 AM, seshendra seshu wrote: >> >> > so should i need to create an user and run the mpi program. >> or how can i run in cluster >> >> It is a "best practice" to not run real applications as root >> (e.g., MPI applications). Create a non-privlidged user to run >> your applications. >> >> Then be sure to set your LD_LIBRARY_PATH if you installed Open >> MPI into a non-system-default location. See this FAQ item: >> >> >> http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path >> >> -- >> Jeff Squyres >> jsquy...@cisco.com <mailto:jsquy...@cisco.com> >> For corporate legal information go to: >> http://www.cisco.com/web/about/doing_business/legal/cri/ >> >> >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> >> >> >> -- >> WITH REGARDS >> M.L.N.Seshendra >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org <mailto:us...@open-mpi.org> > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > -- > WITH REGARDS > M.L.N.Seshendra > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
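Concretely, for the install path in this thread, that means making sure something like the following is set for non-interactive remote shells on every node (e.g., in ~/.bashrc, since ~/.bash_profile alone is normally only read by login shells):

    export PATH=/usr/local/openmpi-1.4.5/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/openmpi-1.4.5/lib:$LD_LIBRARY_PATH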
Re: [OMPI users] ssh between nodes
It really depends. You certainly CAN have mpirun/mpiexec use ssh to launch the remote processes. If you're using Torque, though, I strongly recommend using OpenMPI's hooks into the Torque TM API (see http://www.open-mpi.org/faq/?category=building#build-rte-tm). That will use the pbs_moms themselves to launch all the processes, which has several advantages. Using the TM API for job launch means that remote processes will be children of the Torque pbs_mom process rather than the sshd process, so Torque will be able to do a better job of killing rogue processes, reporting resources utilized, etc. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 02/29/2012 02:09 PM, Denver Smith wrote: > Hello, > > On my cluster running moab and torque, I cannot ssh without a password > between compute nodes. I can however request multiple node jobs fine. I > was wondering if passwordless ssh keys need to be set up between compute > nodes in order for mpi applications to run correctly. > > Thanks > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
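For completeness, the TM integration is a configure-time option; a sketch, where the Torque install path is a guess you should replace with your own (the FAQ link above has the details):

    ./configure --with-tm=/usr/local/torque --prefix=/usr/local/openmpi ...
    make && make install
    # afterwards, look for the Torque "tm" components (plm, ras) in:
    ompi_info | grep -i tm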
Re: [OMPI users] Mpirun: How to print STDOUT of just one process?
I don't know about using mpirun to do it, but you can actually call mpirun on a script, and have that script individually call a single instance of your program. Then that script could use shell redirection to redirect the output of the program's instance to a separate file. I've used this technique to play with ulimit sort of things in the script before. I'm not entirely sure what variables are exposed to you in the script, such that you could come up with a unique filename to output to, though. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 02/01/2012 08:59 AM, Frank wrote: > When running > > mpirun -n 2 > > the STDOUT streams of both processes are combined and are displayed by > the shell. In such an interleaved format its hard to tell what line > comes from which node. > > Is there a way to have mpirun just merger STDOUT of one process to its > STDOUT stream? > > Best, > Frank > > Cross-reference: > http://stackoverflow.com/questions/9098781/mpirun-how-to-print-stdout-of-just-one-process > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
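A sketch of that wrapper approach; the script name is made up, and it relies on OMPI_COMM_WORLD_RANK, which Open MPI exports into each launched process's environment and so answers the unique-filename question at the end of the post:

    #!/bin/bash
    # usage: mpirun -n 2 ./per-rank-output.sh ./myapp [args...]
    exec "$@" > output.rank${OMPI_COMM_WORLD_RANK}.log 2>&1

Recent mpirun versions also have an --output-filename option that sends each rank's output to its own file, which may be simpler still if your version supports it.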
Re: [OMPI users] Checkpoint an MPI process
Since you're looking for a function call, I'm going to assume that you are writing this application, and it's not a pre-compiled, commercial application. Given that, it's going to be significantly better to have an internal application checkpointing mechanism, where it serializes and stores the data, etc., than to use an external, application-agnostic checkpointing mechanism like BLCR or similar. The application should be aware of what data is important, how to most efficiently store it, etc. A generic library has to assume that everything is important, and store it all. Don't get me wrong. Libraries like BLCR are great for applications that don't have that visibility, and even as a tool for the application-internal checkpointing mechanism (where the application deliberately interacts with the library to annotate what's important to store, and how to do so, etc.). But if you're writing the application, you're better off handling it internally than externally. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 01/19/2012 08:05 AM, Josh Hursey wrote: > Currently Open MPI only supports the checkpointing of the whole > application. There has been some work on uncoordinated checkpointing > with message logging, though I do not know the state of that work with > regards to availability. That work has been undertaken by the University > of Tennessee Knoxville, so maybe they can provide more information. > > -- Josh > > On Wed, Jan 18, 2012 at 3:24 PM, Rodrigo Oliveira > mailto:rsilva.olive...@gmail.com>> wrote: > > Hi, > > I'd like to know if there is a way to checkpoint a specific process > running under an mpirun call. In other words, is there a function > CHECKPOINT(rank) in which I can pass the rank of the process I want > to checkpoint? I do not want to checkpoint the entire application, > but just one of its processes. > > Thanks > > ___ > users mailing list > us...@open-mpi.org <mailto:us...@open-mpi.org> > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > -- > Joshua Hursey > Postdoctoral Research Associate > Oak Ridge National Laboratory > http://users.nccs.gov/~jjhursey > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
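For the whole-application checkpointing that Josh describes above, the command-level workflow looks roughly like this, assuming Open MPI was built with its checkpoint/restart support (e.g., --with-ft=cr plus BLCR); the PID and snapshot name are illustrative:

    # launch with checkpoint/restart enabled
    mpirun -np 4 -am ft-enable-cr ./myapp
    # from another shell, checkpoint the whole job via mpirun's PID
    ompi-checkpoint 12345
    # later, restart from the global snapshot that was written
    ompi-restart ompi_global_snapshot_12345.ckpt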
Re: [OMPI users] segfault when resuming on different host
Josh, When I use cr_{run,checkpoint,restart} to start a checkpoint and restart a single-threaded, single-process app on a different host, it works, even with prelinking enabled. That's kinda why I assumed the problem was with the OpenMPI code, and didn't look at the BLCR FAQ that closely, to be honest. Having said that, I did temporarily disable prelink on my two hosts, and tried my MPI test again, and it seemed to work. I'll have to do more tests with something more intense (xhpl, maybe), and so on, but preliminary results look good. Thanks for pointing me in the right direction. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 12/29/2011 02:31 PM, Josh Hursey wrote: > Often this type of problem is due to the 'prelink' option in Linux. > BLCR has a FAQ item that discusses this issue and how to resolve it: > https://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink > > I would give that a try. If that does not help then you might want to > try checkpointing a single (non-MPI) process on one node with BLCR and > restart it on the other node. If that fails, then it is likely a > BLCR/system configuration issue that is the cause. If it does work, > then we can dig more into the Open MPI causes. > > Let me know if disabling prelink works for you. > > -- Josh >
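For anyone who lands on this thread later: the prelink change that resolved it amounts to roughly the following on RHEL/CentOS-style systems, though the BLCR FAQ entry linked above is the authoritative recipe:

    # on every node: stop future prelinking...
    sed -i 's/^PRELINKING=yes/PRELINKING=no/' /etc/sysconfig/prelink
    # ...and undo what prelink has already done to existing binaries and libraries
    prelink --undo --all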
[OMPI users] segfault when resuming on different host
Hi, all. I'm in the middle of testing some of the checkpoint/restart capabilities of OpenMPI with BLCR on our cluster. I've been able to checkpoint and restart successfully when I restart on the same nodes as it was running previously. But when I try to restart on a different host, I always get an error like this: > $ ompi-restart ompi_global_snapshot_15935.ckpt > -- > mpirun noticed that process rank 1 with PID 15201 on node m7stage-1-2.local > exited on signal 11 (Segmentation fault). > -- Now, it's very possible that I've missed something during the setup, or that this is already answered somewhere despite my failure to find it while searching the mailing list, but none of the threads I could find seemed to apply (e.g., cr_restart *is* installed, etc.). I'm attaching a tarball that contains the source code of the very simple test application, as well as some example output of "ompi_info --all" and "ompi_info -v ompi full --parsable". I don't know if this will be useful or not. This is being tested on CentOS v5.4 with BLCR v0.8.4. I've seen this problem with OpenMPI v1.4.2, v1.4.4, and v1.5.4. If anyone has any ideas on what's going on, or how to best debug this, I'd love to hear about it. I don't mind doing the legwork too, but I'm just stumped where to go from here. I have some core files, but I'm having trouble getting the symbols from the backtrace in gdb. Maybe I'm doing it wrong. TIA, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu byufsl_debugging_segfault_on_resume.tar.gz Description: application/gzip