Re: [OMPI users] Setting bind-to none as default via environment?
No problem. It wasn't much of a delay. The scenario involves a combination of MPI and OpenMP (or other threading scheme). Basically, the software will launch one or more processes via MPI, which then spawn threads to do the work. What we've been seeing is that, without something like '--bind-to none' or similar, those threads end up being pinned to the same processor as the process that spawned them. We're okay with a bind=none, since we already have cgroups in place to constrain the user to the resources they request. We might get more process/thread migration between processors (but within the cgroup) than we would like, but that's still probably acceptable in this scenario. If there's a better solution, we'd love to hear it. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 11/03/2015 08:16 AM, Ralph Castain wrote: > Sorry for delay - was on travel. > > hwloc_base_binding_policy=none > > Alternatively, you may get better performance if you bind to numa or > socket levels, assuming you want one proc per socket: > > hwloc_base_binding_policy=socket [or numa] > rmaps_base_mapping_policy=socket [or numa] > > HTH > Ralph > >> On Nov 2, 2015, at 8:31 AM, Lloyd Brown > <mailto:lloyd_br...@byu.edu>> wrote: >> >> Is there an environment variable option, as well as the >> openmpi-mca-params.conf to set the equivalent of "--bind-to none"? >> Similar to how I can specify the environment variable >> "OMPI_MCA_btl=^openib" instead of the cli param "--mca btl ^openib"? >> >> We're running into a situation where users have a combination of OpenMPI >> and OpenMP threads, and the threads get constrained to the same >> processor where the OpenMPI process was launched. As far as we can >> tell, this started with v1.8.x. >> >> >> Lloyd Brown >> Systems Administrator >> Fulton Supercomputing Lab >> Brigham Young University >> http://marylou.byu.edu <http://marylou.byu.edu/> >> >> On 10/01/2015 09:02 AM, Nick Papior wrote: >>> You can define default mca parameters in this file: >>> /etc/openmpi-mca-params.conf >>> >>> 2015-10-01 16:57 GMT+02:00 Grigory Shamov >>> mailto:grigory.sha...@umanitoba.ca> >>> <mailto:grigory.sha...@umanitoba.ca>>: >>> >>>Hi All, >>> >>>A parhaps naive question: is it possible to set ' mpiexec —bind-to >>>none ' as a system-wide default in 1.10, like, by setting an >>>OMPI_xxx variable? >>> >>>-- >>>Grigory Shamov >>>Westgrid/ComputeCanada Site Lead >>>University of Manitoba >>>E2-588 EITC Building, >>>(204) 474-9625 >>> >>> >>> >>>___ >>>users mailing list >>>us...@open-mpi.org >>> <mailto:us...@open-mpi.org> <mailto:us...@open-mpi.org> >>>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >>>Link to this post: >>>http://www.open-mpi.org/community/lists/users/2015/10/27764.php >>> >>> >>> >>> >>> -- >>> Kind regards Nick >>> >>> >>> ___ >>> users mailing list >>> us...@open-mpi.org <mailto:us...@open-mpi.org> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >>> Link to this post: >>> http://www.open-mpi.org/community/lists/users/2015/10/27765.php >>> >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >> Link to this >> post: http://www.open-mpi.org/community/lists/users/2015/11/27974.php > > > > ___ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/11/27978.php >
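For reference, here is how the hwloc/rmaps parameters above combine with the OMPI_MCA_ environment-variable convention mentioned in the original question. The conf-file path assumes a default system-wide install, and the socket lines are only needed if you prefer mapping/binding by socket instead of turning binding off entirely:

    # /etc/openmpi-mca-params.conf (or $prefix/etc/openmpi-mca-params.conf)
    hwloc_base_binding_policy = none

    # equivalent environment-variable form (e.g., in a profile script or module file)
    export OMPI_MCA_hwloc_base_binding_policy=none

    # or, to map and bind by socket rather than disabling binding:
    export OMPI_MCA_hwloc_base_binding_policy=socket
    export OMPI_MCA_rmaps_base_mapping_policy=socket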
Re: [OMPI users] Setting bind-to none as default via environment?
Is there an environment variable option, as well as the openmpi-mca-params.conf to set the equivalent of "--bind-to none"? Similar to how I can specify the environment variable "OMPI_MCA_btl=^openib" instead of the cli param "--mca btl ^openib"? We're running into a situation where users have a combination of OpenMPI and OpenMP threads, and the threads get constrained to the same processor where the OpenMPI process was launched. As far as we can tell, this started with v1.8.x. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 10/01/2015 09:02 AM, Nick Papior wrote: > You can define default mca parameters in this file: > /etc/openmpi-mca-params.conf > > 2015-10-01 16:57 GMT+02:00 Grigory Shamov <mailto:grigory.sha...@umanitoba.ca>>: > > Hi All, > > A parhaps naive question: is it possible to set ' mpiexec —bind-to > none ' as a system-wide default in 1.10, like, by setting an > OMPI_xxx variable? > > -- > Grigory Shamov > Westgrid/ComputeCanada Site Lead > University of Manitoba > E2-588 EITC Building, > (204) 474-9625 > > > > ___ > users mailing list > us...@open-mpi.org <mailto:us...@open-mpi.org> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/10/27764.php > > > > > -- > Kind regards Nick > > > ___ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2015/10/27765.php >
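Any MCA parameter can be expressed in three equivalent forms, which is what the question is driving at; an illustration using the btl parameter from the question itself:

    mpirun --mca btl ^openib ./myapp        # command-line parameter
    export OMPI_MCA_btl=^openib             # environment variable
    btl = ^openib                           # line in openmpi-mca-params.conf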
Re: [OMPI users] understanding BTL selection process
As much as I hate to reply to myself, I'm going to in this case. Digging deeper into the old OS image (I found a couple of nodes that I forgot to image), it looks like libibverbs and librdmacm were, in fact installed. That explains how the previous image was able to avoid the "cannot open shared object file" messages. My current theory is that somewhere between the (very) old version of librdmacm on the old image, and the new version on the new image, that there was a change that started to emit the "librdmacm: Fatal: no RDMA devices found" messages. All of this implies that the difference is related to something that happened with librdmacm, not something that changed in OpenMPI. Sorry for the list noise. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 03/02/2015 02:42 PM, Lloyd Brown wrote: > I hope this isn't too basic of a question, but is there a document > somewhere that describes how the selection of which BTL components (eg. > openib, tcp) to use occurs when mpirun/mpiexec is launched? I know it > can be influenced by conf files, parameters, and env variables. But > lacking those, how does it choose which components to use? > > I'm trying to diagnose an issue involving OpenMPI, OFED, and an OS > upgrade. I'm hoping that better understanding of how components are > selected, will help me figure out what changed with the OS upgrade. > > > > > Here's a longer explanation. > > We recently upgraded our HPC cluster from RHEL 6.2 to 6.6. We have > several versions of OpenMPI availale from a central NFS store. Our > cluster has some nodes with IB hardware, and some without. > > On the old OS image, we did not install any of the OFED components on > the non-IB nodes, and OpenMPI was able to somehow figure out that it > shouldn't even try the openib btl, without any runtime warnings. We got > the speeds we were expecting, when running osu_bw tests from the OMB > test suite, for either the IB nodes (about 3800 MB/s for 4xQDR IB), or > the non-IB nodes (about 115 MB/s for 1GbE). > > Since the OS upgrade, we start to get warnings like this on non-IB nodes > without OFED installed: > >> $ mpirun -np 2 hello_world >> [m7stage-1-1:09962] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot >> open shared object file: No such file or directory (ignored) >> [m7stage-1-1:09961] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot >> open shared object file: No such file or directory (ignored) >> [m7stage-1-1:09961] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: >> cannot open shared object file: No such file or directory (ignored) >> [m7stage-1-1:09962] mca: base: component_find: unable to open >> /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: >> cannot open shared object file: No such file or directory (ignored) >> Hello from process # 0 of 2 on host m7stage-1-1 >> Hello from process # 1 of 2 on host m7stage-1-1 > > Obviously these are references to software components associated with > OFED. 
We can install OFED on the non-IB nodes, but then we get warnings > more like this: > >> $ mpirun -np 2 hello_world >> librdmacm: Fatal: no RDMA devices found >> librdmacm: Fatal: no RDMA devices found >> -- >> [[63448,1],0]: A high-performance Open MPI point-to-point messaging module >> was unable to find any relevant network interfaces: >> >> Module: OpenFabrics (openib) >> Host: m7stage-1-1 >> >> Another transport will be used instead, although this may result in >> lower performance. >> -- >> Hello from process # 0 of 2 on host m7stage-1-1 >> Hello from process # 1 of 2 on host m7stage-1-1 >> [m7stage-1-1:18753] 1 more process has sent help message >> help-mpi-btl-base.txt / btl:no-nics >> [m7stage-1-1:18753] Set MCA parameter "orte_base_help_aggregate" to 0 to see >> all help / error messages > > Obviously we can work with this by using "--mca btl ^openib" or similar > on the non-IB nodes. And we're pursuing that option. > > But I'm struggling to understand what happened to cause OpenMPI on the > non-IB node, without OFED installed, to no longer be able to figure out > that it shouldn't use the openib btl. Thus the reason why I ask for > more information about how that decision is being made. Maybe that will > clue me in, as to what changed. > > > > Thanks, >
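A quick way to compare the old and new images with respect to the libraries suspected above; the package names are the usual RHEL/CentOS ones, so adjust if your OFED stack packages them differently:

    # is the library present, and which package/version provides it?
    ldconfig -p | grep -E 'librdmacm|libibverbs'
    rpm -q librdmacm libibverbs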
[OMPI users] understanding BTL selection process
I hope this isn't too basic of a question, but is there a document somewhere that describes how the selection of which BTL components (eg. openib, tcp) to use occurs when mpirun/mpiexec is launched? I know it can be influenced by conf files, parameters, and env variables. But lacking those, how does it choose which components to use? I'm trying to diagnose an issue involving OpenMPI, OFED, and an OS upgrade. I'm hoping that better understanding of how components are selected, will help me figure out what changed with the OS upgrade. Here's a longer explanation. We recently upgraded our HPC cluster from RHEL 6.2 to 6.6. We have several versions of OpenMPI availale from a central NFS store. Our cluster has some nodes with IB hardware, and some without. On the old OS image, we did not install any of the OFED components on the non-IB nodes, and OpenMPI was able to somehow figure out that it shouldn't even try the openib btl, without any runtime warnings. We got the speeds we were expecting, when running osu_bw tests from the OMB test suite, for either the IB nodes (about 3800 MB/s for 4xQDR IB), or the non-IB nodes (about 115 MB/s for 1GbE). Since the OS upgrade, we start to get warnings like this on non-IB nodes without OFED installed: > $ mpirun -np 2 hello_world > [m7stage-1-1:09962] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot > open shared object file: No such file or directory (ignored) > [m7stage-1-1:09961] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_ofud: librdmacm.so.1: cannot > open shared object file: No such file or directory (ignored) > [m7stage-1-1:09961] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: > cannot open shared object file: No such file or directory (ignored) > [m7stage-1-1:09962] mca: base: component_find: unable to open > /apps/openmpi/1.6.3_gnu-4.4/lib/openmpi/mca_btl_openib: librdmacm.so.1: > cannot open shared object file: No such file or directory (ignored) > Hello from process # 0 of 2 on host m7stage-1-1 > Hello from process # 1 of 2 on host m7stage-1-1 Obviously these are references to software components associated with OFED. We can install OFED on the non-IB nodes, but then we get warnings more like this: > $ mpirun -np 2 hello_world > librdmacm: Fatal: no RDMA devices found > librdmacm: Fatal: no RDMA devices found > -- > [[63448,1],0]: A high-performance Open MPI point-to-point messaging module > was unable to find any relevant network interfaces: > > Module: OpenFabrics (openib) > Host: m7stage-1-1 > > Another transport will be used instead, although this may result in > lower performance. > -- > Hello from process # 0 of 2 on host m7stage-1-1 > Hello from process # 1 of 2 on host m7stage-1-1 > [m7stage-1-1:18753] 1 more process has sent help message > help-mpi-btl-base.txt / btl:no-nics > [m7stage-1-1:18753] Set MCA parameter "orte_base_help_aggregate" to 0 to see > all help / error messages Obviously we can work with this by using "--mca btl ^openib" or similar on the non-IB nodes. And we're pursuing that option. But I'm struggling to understand what happened to cause OpenMPI on the non-IB node, without OFED installed, to no longer be able to figure out that it shouldn't use the openib btl. Thus the reason why I ask for more information about how that decision is being made. Maybe that will clue me in, as to what changed. 
Thanks, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
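One way to watch the selection process the question asks about is to raise the BTL framework's verbosity; Open MPI then logs which components it opens, which ones exclude themselves, and which ones are used for each peer (the exact messages vary by version):

    # list the BTL components that are installed at all
    ompi_info | grep -i btl

    # watch the selection decisions at run time
    mpirun --mca btl_base_verbose 100 -np 2 ./osu_bw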
Re: [OMPI users] busy waiting and oversubscriptions
I don't know about your users, but experience has, unfortunately, taught us to assume that users' jobs are very, very badly-behaved. I choose to assume that it's incompetence on the part of programmers and users, rather than malice, though. :-) Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 03/27/2014 04:49 PM, Dave Love wrote: > Actually there's no need for cpusets unless jobs are badly-behaved and > escape their bindings.
Re: [OMPI users] Oversubscription of nodes with Torque and OpenMPI
As far as I understand, the mpirun will assign processes to hosts in the hostlist ($PBS_NODEFILE) sequentially, and if it runs out of hosts in the list, it starts over at the top of the file. Theoretically, you should be able to request specific hostnames, and the processor counts per hostname, in your torque submit request. I'm not sure if this is correct (we don't use Torque here anymore, and I'm going off memory), but it should be approximately correct: > qsub -l nodes=n:2+n0001:2+n0002:8+n0003:8+n0004:2+n0005:2+n0006:2+n0007:4 > ... Granted, that's awkward, but I'm not sure if there's another way in Torque to request different numbers of processors per node. You might ask on the Torque Users list. They might tell you to change the nodes file to reflect the number of actual processes you want on each node, rather than the number of physical processors on the hosts. Whether this works for you, depends on whether you want this type of oversubscription to happen all the time, or on a per-job basis, etc. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 11/22/2013 11:11 AM, Gans, Jason D wrote: > I have tried the 1.7 series (specifically 1.7.3) and I get the same > behavior. > > When I run "mpirun -oversubscribe -np 24 hostname", three instances of > "hostname" are run on each node. > > The contents of the $PBS_NODEFILE are: > n0007 > n0006 > n0005 > n0004 > n0003 > n0002 > n0001 > n > > but, since I have compiled OpenMPI using the "--with-tm", it appears > that OpenMPI is not using the $PBS_NODEFILE (which I tested by modifying > the torque pbs_mom to write a $PBS_NODEFILE that contained "slot=xx" > information for each node. mpirun complained when I did this). > > Regards, > > Jason > > > *From:* users [users-boun...@open-mpi.org] on behalf of Ralph Castain > [r...@open-mpi.org] > *Sent:* Friday, November 22, 2013 11:04 AM > *To:* Open MPI Users > *Subject:* Re: [OMPI users] Oversubscription of nodes with Torque and > OpenMPI > > Really shouldn't matter - this is clearly a bug in OMPI if it is doing > mapping as you describe. Out of curiosity, have you tried the 1.7 > series? Does it behave the same? > > I can take a look at the code later today and try to figure out what > happened. > > On Nov 22, 2013, at 9:56 AM, Jason Gans <mailto:jg...@lanl.gov>> wrote: > >> On 11/22/13 10:47 AM, Reuti wrote: >>> Hi, >>> >>> Am 22.11.2013 um 17:32 schrieb Gans, Jason D: >>> >>>> I would like to run an instance of my application on every *core* of >>>> a small cluster. I am using Torque 2.5.12 to run jobs on the >>>> cluster. The cluster in question is a heterogeneous collection of >>>> machines that are all past their prime. Specifically, the number of >>>> cores ranges from 2-8. Here is the Torque "nodes" file: >>>> >>>> n np=2 >>>> n0001 np=2 >>>> n0002 np=8 >>>> n0003 np=8 >>>> n0004 np=2 >>>> n0005 np=2 >>>> n0006 np=2 >>>> n0007 np=4 >>>> >>>> When I use openmpi-1.6.3, I can oversubscribe nodes but the tasks >>>> are allocated to nodes without regard to the number of cores on each >>>> node (specified by the "np=xx" in the nodes file). For example, when >>>> I run "mpirun -np 24 hostname", mpirun places three instances of >>>> "hostname" on each node, despite the fact that some nodes only have >>>> two processors and some have more. >>> You submitted the job itself by requesting 24 cores for it too? >>> >>> -- Reuti >> Since there are only 8 Torque nodes in the cluster, I submitted the >> job by requesting 8 nodes, i.e. 
"qsub -I -l nodes=8". >>> >>> >>>> Is there a way to have OpenMPI "gracefully" oversubscribe nodes by >>>> allocating instances based on the "np=xx" information in the Torque >>>> nodes file? It this a Torque problem? >>>> >>>> p.s. I do get the desired behavior when I run *without* Torque and >>>> specify the following machine file to mpirun: >>>> >>>> n slots=2 >>>> n0001 slots=2 >>>> n0002 slots=8 >>>> n0003 slots=8 >>>> n0004 slots=2 >>>> n0005 slots=2 >>>> n0006 slots=2 >>>> n0007 slots=4 >>>> >>>> Regards, >>>> >>>> Jason >>>> >>>> >>>> >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org <mailto:us...@open-mpi.org> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> ___ >>> users mailing list >>> us...@open-mpi.org <mailto:us...@open-mpi.org> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] Debugging Runtime/Ethernet Problems
1 - How do I check the BTLs available? Something like "ompi_info | grep -i btl"? If so, here's the list: > MCA btl: ofud (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: openib (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: self (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: sm (MCA v2.0, API v2.0, Component v1.6.3) > MCA btl: tcp (MCA v2.0, API v2.0, Component v1.6.3) 2 - The IP interfaces on all nodes are: - em1 - Ethernet - IP in the 192.168.216.0/22 range - ib0 - IPoIB (only on IB-enabled nodes) - IP in the 192.168.212.0/22 range - lo - loopback - 127.0.0.1/8 And I think that Jeff is absolutely right. This syntax did work: > mpirun --mca btl ^openib --mca btl_tcp_if_exclude > 192.168.212.0/22,127.0.0.1/8 ./osu_bw And this one too, which is basically equivalent in this case: > mpirun --mca btl ^openib --mca btl_tcp_if_exclude ib0,lo ./osu_bw It is interesting to me, though, that I need to explicitly exclude lo/127.0.0.1 in this case, but when I'm on an Ethernet-only node, and I just do the plain "mpirun ./appname", I don't have to exclude anything, and it figures out to use em1, and not lo. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 09/20/2013 10:31 AM, Jeff Squyres (jsquyres) wrote: > On Sep 20, 2013, at 12:27 PM, Lloyd Brown wrote: > >> Interesting. I was taking the approach of "only exclude what you're >> certain you don't want" (the native IB and TCP/IPoIB stuff) since I >> wasn't confident enough in my knowledge of the OpenMPI internals, to >> know what I should explicitly include. >> >> However, taking Jeff's suggestion, this does seem to work, and gives me >> the expected Ethernet performance: >> >> "mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include em1 ./osu_bw" >> >> So, in short, I'm still not sure why my exclude syntax doesn't work. > > Check two things: > > 1. What BTLs are available? Is there some other BTL that may be used instead > of openib? > > 2. (this one is more likely) What IP interfaces are available on all nodes? > The most obvious guess here is that you didn't exclude 127.0.0.1/8, and OMPI > found this interface on all nodes, and therefore assumed that it was > routable/usable on all nodes. Hence, one quick experiment might be to try > your exclude syntax again, but *also* exclude 127.0.0.8/8. >
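If the exclude form above is what you end up wanting on the Ethernet-only nodes, it can be made the default on those nodes instead of a per-job flag by putting the same parameters into the MCA params file under the install prefix (a sketch):

    # openmpi-mca-params.conf on the Ethernet-only nodes
    btl = ^openib
    btl_tcp_if_exclude = ib0,lo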
Re: [OMPI users] Debugging Runtime/Ethernet Problems
Interesting. I was taking the approach of "only exclude what you're certain you don't want" (the native IB and TCP/IPoIB stuff) since I wasn't confident enough in my knowledge of the OpenMPI internals, to know what I should explicitly include. However, taking Jeff's suggestion, this does seem to work, and gives me the expected Ethernet performance: "mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include em1 ./osu_bw" So, in short, I'm still not sure why my exclude syntax doesn't work. But the include-driven syntax that Jeff suggested, does seem to work. I admit I'm still curious to understand how to get OpenMPI to give me the details of what's going on. But the immediate problem of getting the numbers out of osu_bw and osu_latency, seems to be solved. Thanks everyone. I really appreciate it. -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 09/20/2013 09:33 AM, Jeff Squyres (jsquyres) wrote: > Correct -- it doesn't make sense to specify both include *and* exclude: by > specifying one, you're implicitly (but exactly/precisely) specifying the > other. > > My suggestion would be to use positive notation, not negative notation. For > example: > > mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 ... > > That way, you *know* you're only getting the TCP and self BTLs, and you > *know* you're only getting eth0. If that works, then spread out from there, > e.g.: > > mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include eth0,eth1 ... > > E.g., also include the "sm" BTL (which is only used for shared memory > communications between 2 procs on the same server, and is therefore useless > for a 2-proc-across-2-server run of osu_bw, but you get the idea), but also > use eth0 and eth1. > > And so on. > > The problem with using ^openib and/or btl_tcp_if_exclude is that you might > end up using some BTLs and/or TCP interfaces that you don't expect, and > therefore can run into problems. > > Make sense? > > > > On Sep 20, 2013, at 11:17 AM, Ralph Castain wrote: > >> I don't think you are allowed to specify both include and exclude options at >> the same time as they conflict - you should either exclude ib0 or include >> eth0 (or whatever). >> >> My guess is that the various nodes are trying to communicate across disjoint >> networks. We've seen that before when, for example, eth0 on one node is on >> one subnet, and eth0 on another node is on a different subnet. You might >> look for that kind of arrangement. >> >> >> On Sep 20, 2013, at 8:05 AM, "Elken, Tom" wrote: >> >>>> The trouble is when I try to add some "--mca" parameters to force it to >>>> use TCP/Ethernet, the program seems to hang. I get the headers of the >>>> "osu_bw" output, but no results, even on the first case (1 byte payload >>>> per packet). This is occurring on both the IB-enabled nodes, and on the >>>> Ethernet-only nodes. The specific syntax I was using was: "mpirun >>>> --mca btl ^openib --mca btl_tcp_if_exclude ib0 ./osu_bw" >>> >>> When we want to run over TCP and IPoIB on an IB/PSM equipped cluster, we >>> use: >>> --mca btl sm --mca btl tcp,self --mca btl_tcp_if_exclude eth0 --mca >>> btl_tcp_if_include ib0 --mca mtl ^psm >>> >>> based on this, it looks like the following might work for you: >>> --mca btl sm,tcp,self --mca btl_tcp_if_exclude ib0 --mca btl_tcp_if_include >>> eth0 --mca btl ^openib >>> >>> If you don't have ib0 ports configured on the IB nodes, probably you don't >>> need the" --mca btl_tcp_if_exclude ib0." 
>>> >>> -Tom >>> >>>> >>>> The problem occurs at least with OpenMPI 1.6.3 compiled with GNU 4.4 >>>> compilers, with 1.6.3 compiled with Intel 13.0.1 compilers, and with >>>> 1.6.5 compiled with Intel 13.0.1 compilers. I haven't tested any other >>>> combinations yet. >>>> >>>> Any ideas here? It's very possible this is a system configuration >>>> problem, but I don't know where to look. At this point, any ideas would >>>> be welcome, either about the specific situation, or general pointers on >>>> mpirun debugging flags to use. I can't find much in the docs yet on >>>> run-time debugging for OpenMPI, as opposed to debugging the application. >>>> Maybe I'm just looking in the wrong place. >>>> >>>> >>>> Thanks, >>>> >>>> -- >>>> Lloyd Brown >>>> Systems Administrator >>>> Fulton Supercomputing Lab >>>> Brigham Young University >>>> http://marylou.byu.edu >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >
[OMPI users] Debugging Runtime/Ethernet Problems
Hi, all. We've got a couple of clusters running RHEL 6.2, and have several centrally-installed versions/compilations of OpenMPI. Some of the nodes have 4xQDR Infiniband, and all the nodes have 1 gigabit ethernet. I was gathering some bandwidth and latency numbers using the OSU/OMB tests, and noticed some weird behavior. When I run a simple "mpirun ./osu_bw" on a couple of IB-enabled node, I get numbers consistent with our IB speed (up to about 3800 MB/s), and when I run the same thing on two nodes with only Ethernet, I get speeds consistent with that (up to about 120 MB/s). So far, so good. The trouble is when I try to add some "--mca" parameters to force it to use TCP/Ethernet, the program seems to hang. I get the headers of the "osu_bw" output, but no results, even on the first case (1 byte payload per packet). This is occurring on both the IB-enabled nodes, and on the Ethernet-only nodes. The specific syntax I was using was: "mpirun --mca btl ^openib --mca btl_tcp_if_exclude ib0 ./osu_bw" The problem occurs at least with OpenMPI 1.6.3 compiled with GNU 4.4 compilers, with 1.6.3 compiled with Intel 13.0.1 compilers, and with 1.6.5 compiled with Intel 13.0.1 compilers. I haven't tested any other combinations yet. Any ideas here? It's very possible this is a system configuration problem, but I don't know where to look. At this point, any ideas would be welcome, either about the specific situation, or general pointers on mpirun debugging flags to use. I can't find much in the docs yet on run-time debugging for OpenMPI, as opposed to debugging the application. Maybe I'm just looking in the wrong place. Thanks, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Re: [OMPI users] check point restart
I know that in the past it has been supported via toolkits like BLCR, but I don't know the current level of support, to be honest. I think I heard somewhere that the checkpoint/restart support in OpenMPI was going away in some fashion. In any case, if you have the ability to set up application-aware, application-specific checkpointing, it will be a much better solution than something that's application-agnostic. The checkpoint files will be smaller (the application knows what in memory is important, and what isn't), coordination will be better between processes, you have some level of assurance that you won't have PID conflicts or problems when the PID ends up different, etc. I suspect someone on the list can answer your question about the built-in checkpoint/restart code better than I can. But in general, if you have a choice between checkpointing external and internal to your application, choose the application-internal checkpointing. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 07/19/2013 01:34 PM, Erik Nelson wrote: > I run mpi on an NSF computer. One of the conditions of use is that jobs > are limited to 24 hr > duration to provide democratic allotment to its users. > > A long program can require many restarts, so it becomes necessary to > store the state of the > program in memory, print it, recompile, and and read the state to start > again. > > I seem to remember a simpler approach (check point restart?) in which > the state of the .exe > code is saved and then simply restarted from its current position. > > Is there something like this for restarting an mpi program? > > Thanks, Erik > > > -- > Erik Nelson > > Howard Hughes Medical Institute > 6001 Forest Park Blvd., Room ND10.124 > Dallas, Texas 75235-9050 > > p : 214 645 5981 > f : 214 645 5948 > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
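To make the application-internal suggestion above concrete, here is a minimal sketch of the pattern in C/MPI. This is not Open MPI's built-in checkpoint/restart support; the file names, checkpoint interval, and the single "state" variable are placeholders for whatever your application actually needs to save:

    /* ckpt_sketch.c - minimal application-level checkpoint/restart in MPI.
     * Each rank periodically writes its own state to a per-rank file and,
     * at startup, resumes from that file if one exists. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i, start = 0;
        double state = 0.0;
        char ckpt[64], tmp[72];
        FILE *f;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        snprintf(ckpt, sizeof(ckpt), "ckpt.rank%d.dat", rank);
        snprintf(tmp, sizeof(tmp), "%s.tmp", ckpt);

        /* Restart: if a checkpoint file exists, resume where it left off. */
        if ((f = fopen(ckpt, "rb")) != NULL) {
            if (fread(&start, sizeof(start), 1, f) != 1 ||
                fread(&state, sizeof(state), 1, f) != 1) {
                start = 0;
                state = 0.0;
            }
            fclose(f);
        }

        for (i = start; i < 1000000; i++) {
            state += 1.0;                  /* stand-in for the real work */

            if (i % 100000 == 0 && i > start) {
                /* Checkpoint: write a temp file, then rename, so a crash
                 * mid-write never corrupts the previous checkpoint. */
                if ((f = fopen(tmp, "wb")) != NULL) {
                    int next = i + 1;
                    fwrite(&next, sizeof(next), 1, f);
                    fwrite(&state, sizeof(state), 1, f);
                    fclose(f);
                    rename(tmp, ckpt);
                }
            }
        }

        if (rank == 0)
            printf("final state on rank 0: %f\n", state);
        MPI_Finalize();
        return 0;
    }

Compile with mpicc and launch under mpirun as usual; if the job dies, re-running the same mpirun command resumes from the last per-rank checkpoint files, assuming the working directory (and therefore the files) is visible to the restarted ranks.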
Re: [OMPI users] PBS jobs with OPENMPI
As far as I know, the mpdboot is not needed with OpenMPI. You should just be able to call mpirun or mpiexec directly. If your OpenMPI installation was compiled to use the TM API with Torque, you just do it like this, and it figures it all out: mpirun myprogram Otherwise, you will need to supply the number of nodes and nodefile, like this: NP=`wc -l $PBS_NODEFILE | awk '{print $1}'` mpirun -n $NP -hostfile $PBS_NODEFILE myprogram Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 11/19/2012 03:28 PM, Mariana Vargas Magana wrote: > Hi all > Help !! I have to send a job using #PBS and in the script example there is > something like this because the cluster is using MPICH2 > In my case i nee Openmpi to run my code so I installed locally, in this case > anyone knows what it is the equivalent of this commands because it is not > recognized like that... > > mpdboot -n ${NNODES} -f ${PBS_NODEFILE} -v --remcons > Thanks !! > > Mariana > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
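Putting that together, a skeleton Torque submission script might look like the following; the resource request, walltime, and program name are placeholders, and the commented-out lines are the fallback for an Open MPI build without TM support:

    #!/bin/bash
    #PBS -l nodes=2:ppn=8
    #PBS -l walltime=01:00:00
    cd $PBS_O_WORKDIR

    # With TM support compiled into Open MPI, mpirun gets the node list
    # and process count from Torque automatically:
    mpirun ./myprogram

    # Without TM support, fall back to the $PBS_NODEFILE:
    # NP=`wc -l $PBS_NODEFILE | awk '{print $1}'`
    # mpirun -n $NP -hostfile $PBS_NODEFILE ./myprogram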
Re: [OMPI users] PG compilers and OpenMPI 1.6.1
Thanks for getting this in so quickly. Yes, the nightly tarball from Aug 25 (a1r27142), seems to get through a configure and make stage at least. Thanks, Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/25/2012 05:18 AM, Jeff Squyres wrote: > I've merged the VT fix into the 1.6 branch; it will be available in tonight's > tarball. > > Can you give a nightly v1.6 tarball a whirl to ensure it fixes your problem? > > http://www.open-mpi.org/nightly/v1.6/ > > > On Aug 23, 2012, at 7:30 PM, Jeff Squyres (jsquyres) wrote: > >> Yes. VT = vampirtrace. >> >> Sent from my phone. No type good. >> >> On Aug 23, 2012, at 6:49 PM, "Lloyd Brown" wrote: >> >>> Okay. Sounds good. I'll watch that bug. >>> >>> For my own sanity check, "vt" means VampirTrace stuff, right? In our >>> environment, I don't think it'll be a problem to disable VampirTrace >>> temporarily. More people here use the Intel and GNU compiled versions >>> anyway, both of which compile just fine with 1.6.1. >>> >>> Lloyd Brown >>> Systems Administrator >>> Fulton Supercomputing Lab >>> Brigham Young University >>> http://marylou.byu.edu >>> >>> On 08/23/2012 04:43 PM, Jeff Squyres wrote: >>>> This was reported earlier today: >>>> >>>> https://svn.open-mpi.org/trac/ompi/ticket/3251 >>>> >>>> I've alerted the VT guys to have a look. For a workaround, you can >>>> --disable-vt. >>>> >>>> >>>> On Aug 23, 2012, at 6:00 PM, Ralph Castain wrote: >>>> >>>>> Just looking at your output, it looks like there is a missing header that >>>>> PGI requires - I have no idea what that might be. You might do a search >>>>> for omp_lock_t to see where it is defined and add that head to the >>>>> vt_wrapper.cc file and see if that fixes the problem >>>>> >>>>> On Aug 23, 2012, at 2:44 PM, Lloyd Brown wrote: >>>>> >>>>>> Has anyone been able to get OpenMPI 1.6.1 to compile with a recent >>>>>> Portland Group compiler set? I'm currently trying on RHEL 6.2 with PG >>>>>> compilers v12.5 (2012), and I keep getting errors like the ones below. >>>>>> It could easily be a problem with the compiler code, but since this >>>>>> doesn't happen with OpenMPI 1.6, I'm not sure. Can anyone provide any >>>>>> insight on what might have changed with respect to that file >>>>>> ('ompi/contrib/vt/vt/tools/vtwrapper/vt_wrapper.cc') between 1.6 and >>>>>> 1.6.1? >>>>>> >>>>>> Thanks, >>>>>> Lloyd >>>>>> >>>>>> >>>>>> Error Messages: >>>>>> >>>>>>> [root@rocks6staging vtwrapper]# pwd >>>>>>> /tmp/openmpi-1.6.1/ompi/contrib/vt/vt/tools/vtwrapper >>>>>>> [root@rocks6staging vtwrapper]# make V=1 >>>>>>> source='vt_wrapper.cc' object='vtwrapper-vt_wrapper.o' libtool=no \ >>>>>>> DEPDIR=.deps depmode=none /bin/sh ../../config/depcomp \ >>>>>>> pgcpp -DHAVE_CONFIG_H -I. -I../.. 
-I../../include -I../../include >>>>>>> -I../../util -I../../util -DINSIDE_OPENMPI -D_REENTRANT >>>>>>> -I/tmp/openmpi-1.6.1/opal/mca/hwloc/hwloc132/hwloc/include >>>>>>> -I/usr/include/infiniband -I/usr/include/infiniband -DHAVE_FC >>>>>>> -DHAVE_MPI -DHAVE_FMPI -DHAVE_THREADS -DHAVE_OMP -fast -c -o >>>>>>> vtwrapper-vt_wrapper.o `test -f 'vt_wrapper.cc' || echo >>>>>>> './'`vt_wrapper.cc >>>>>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 356: error: >>>>>>> identifier "omp_lock_t" is undefined >>>>>>> omp_lock_t _M_lock; >>>>>>> ^ >>>>>>> >>>>>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 359: error: >>>>>>> identifier "omp_init_lock" is undefined >>>>>>> omp_init_lock(&_M_lock); >>>>>>> ^ >>>>>>> >>>>>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.
Re: [OMPI users] PG compilers and OpenMPI 1.6.1
Okay. Sounds good. I'll watch that bug. For my own sanity check, "vt" means VampirTrace stuff, right? In our environment, I don't think it'll be a problem to disable VampirTrace temporarily. More people here use the Intel and GNU compiled versions anyway, both of which compile just fine with 1.6.1. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/23/2012 04:43 PM, Jeff Squyres wrote: > This was reported earlier today: > > https://svn.open-mpi.org/trac/ompi/ticket/3251 > > I've alerted the VT guys to have a look. For a workaround, you can > --disable-vt. > > > On Aug 23, 2012, at 6:00 PM, Ralph Castain wrote: > >> Just looking at your output, it looks like there is a missing header that >> PGI requires - I have no idea what that might be. You might do a search for >> omp_lock_t to see where it is defined and add that head to the vt_wrapper.cc >> file and see if that fixes the problem >> >> On Aug 23, 2012, at 2:44 PM, Lloyd Brown wrote: >> >>> Has anyone been able to get OpenMPI 1.6.1 to compile with a recent >>> Portland Group compiler set? I'm currently trying on RHEL 6.2 with PG >>> compilers v12.5 (2012), and I keep getting errors like the ones below. >>> It could easily be a problem with the compiler code, but since this >>> doesn't happen with OpenMPI 1.6, I'm not sure. Can anyone provide any >>> insight on what might have changed with respect to that file >>> ('ompi/contrib/vt/vt/tools/vtwrapper/vt_wrapper.cc') between 1.6 and 1.6.1? >>> >>> Thanks, >>> Lloyd >>> >>> >>> Error Messages: >>> >>>> [root@rocks6staging vtwrapper]# pwd >>>> /tmp/openmpi-1.6.1/ompi/contrib/vt/vt/tools/vtwrapper >>>> [root@rocks6staging vtwrapper]# make V=1 >>>> source='vt_wrapper.cc' object='vtwrapper-vt_wrapper.o' libtool=no \ >>>> DEPDIR=.deps depmode=none /bin/sh ../../config/depcomp \ >>>> pgcpp -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../include >>>> -I../../util -I../../util -DINSIDE_OPENMPI -D_REENTRANT >>>> -I/tmp/openmpi-1.6.1/opal/mca/hwloc/hwloc132/hwloc/include >>>> -I/usr/include/infiniband -I/usr/include/infiniband -DHAVE_FC -DHAVE_MPI >>>> -DHAVE_FMPI -DHAVE_THREADS -DHAVE_OMP -fast -c -o vtwrapper-vt_wrapper.o >>>> `test -f 'vt_wrapper.cc' || echo './'`vt_wrapper.cc >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 356: error: >>>> identifier "omp_lock_t" is undefined >>>> omp_lock_t _M_lock; >>>> ^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 359: error: >>>> identifier "omp_init_lock" is undefined >>>> omp_init_lock(&_M_lock); >>>> ^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 364: error: >>>> identifier "omp_destroy_lock" is undefined >>>>omp_destroy_lock(&_M_lock); >>>>^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 369: error: >>>> identifier "omp_set_lock" is undefined >>>>omp_set_lock(&_M_lock); >>>>^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 375: error: >>>> identifier "omp_set_lock" is undefined >>>>omp_set_lock(&_M_lock); >>>>^ >>>> >>>> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 380: error: >>>> identifier "omp_unset_lock" is undefined >>>> omp_unset_lock(&_M_lock); >>>> ^ >>>> >>>> 6 errors detected in the compilation of "vt_wrapper.cc". 
>>>> make: *** [vtwrapper-vt_wrapper.o] Error 2 >>>> [root@rocks6staging vtwrapper]# >>> >>> >>> >>> -- >>> Lloyd Brown >>> Systems Administrator >>> Fulton Supercomputing Lab >>> Brigham Young University >>> http://marylou.byu.edu >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >
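For anyone hitting the same build failure, the workaround mentioned above amounts to adding --disable-vt at configure time. A sketch only: the PGI compiler names match the 12.x release discussed here, and the install prefix is just an example:

    ./configure CC=pgcc CXX=pgcpp F77=pgfortran FC=pgfortran \
        --prefix=/opt/openmpi_pgi/1.6.1 --disable-vt
    make -j4 && make install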
[OMPI users] PG compilers and OpenMPI 1.6.1
Has anyone been able to get OpenMPI 1.6.1 to compile with a recent Portland Group compiler set? I'm currently trying on RHEL 6.2 with PG compilers v12.5 (2012), and I keep getting errors like the ones below. It could easily be a problem with the compiler code, but since this doesn't happen with OpenMPI 1.6, I'm not sure. Can anyone provide any insight on what might have changed with respect to that file ('ompi/contrib/vt/vt/tools/vtwrapper/vt_wrapper.cc') between 1.6 and 1.6.1? Thanks, Lloyd Error Messages: > [root@rocks6staging vtwrapper]# pwd > /tmp/openmpi-1.6.1/ompi/contrib/vt/vt/tools/vtwrapper > [root@rocks6staging vtwrapper]# make V=1 > source='vt_wrapper.cc' object='vtwrapper-vt_wrapper.o' libtool=no \ > DEPDIR=.deps depmode=none /bin/sh ../../config/depcomp \ > pgcpp -DHAVE_CONFIG_H -I. -I../.. -I../../include -I../../include > -I../../util -I../../util -DINSIDE_OPENMPI -D_REENTRANT > -I/tmp/openmpi-1.6.1/opal/mca/hwloc/hwloc132/hwloc/include > -I/usr/include/infiniband -I/usr/include/infiniband -DHAVE_FC -DHAVE_MPI > -DHAVE_FMPI -DHAVE_THREADS -DHAVE_OMP -fast -c -o vtwrapper-vt_wrapper.o > `test -f 'vt_wrapper.cc' || echo './'`vt_wrapper.cc > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 356: error: > identifier "omp_lock_t" is undefined > omp_lock_t _M_lock; > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 359: error: > identifier "omp_init_lock" is undefined > omp_init_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 364: error: > identifier "omp_destroy_lock" is undefined > omp_destroy_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 369: error: > identifier "omp_set_lock" is undefined > omp_set_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 375: error: > identifier "omp_set_lock" is undefined > omp_set_lock(&_M_lock); > ^ > > "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 380: error: > identifier "omp_unset_lock" is undefined > omp_unset_lock(&_M_lock); > ^ > > 6 errors detected in the compilation of "vt_wrapper.cc". > make: *** [vtwrapper-vt_wrapper.o] Error 2 > [root@rocks6staging vtwrapper]# -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Re: [OMPI users] Measuring latency
That's fine. In that case, you just compile it with your MPI implementation and do something like this: mpiexec -np 2 -H masterhostname,slavehostname ./osu_latency There may be some all-to-all latency tools too. I don't really remember. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/21/2012 03:41 PM, Maginot Junior wrote: > Sorry for the type, what I meant was "and" not "em". > Thank you for the quick response, I will take a look at your suggestion
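For anyone searching the archives later, building and running the OSU tests is typically just the following; the exact source file names and any helper headers vary between OMB releases, so treat this as a sketch:

    mpicc -O2 -o osu_latency osu_latency.c
    mpicc -O2 -o osu_bw osu_bw.c
    mpirun -np 2 -H masterhostname,slavehostname ./osu_latency
    mpirun -np 2 -H masterhostname,slavehostname ./osu_bw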
Re: [OMPI users] Measuring latency
I'm not really familiar enough to know what you mean by "em slaves", but for general testing of bandwidth and latency, I usually use the "OSU Micro-benchmarks" (see http://mvapich.cse.ohio-state.edu/benchmarks/). Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 08/21/2012 03:32 PM, Maginot Junior wrote: > Hello. > How do you suggest me to measure the latency between master em slaves > in my cluster? Is there any tool that I can use to test the > performance of my environment? > Thanks > > > -- > Maginot Júnior > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] rpmbuild defining opt install path
That's a really good idea. The trouble is that I need to have multiple versions installed (eg. compiled with the various compilers), so I think I still need to manipulate name in some way, so the packages will be named differently. But _prefix should definitely give me more flexibility as to where it's installed. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 06/27/2012 11:12 AM, Jeff Squyres wrote: > On Jun 26, 2012, at 2:40 PM, Lloyd Brown wrote: > >> Is there an easy way with the .spec file and the rpmbuild command, for >> me to override the path the OpenMPI RPM installs into, in /opt? >> Basically, I'm already doing something like this: > > I think all you need to do is override the RPM-builtin names, like _prefix > (and possibly some others). For example, I did this in RHEL 6.2: > > rpmbuild --rebuild --define '_prefix /tmp/bogus' \ > /home/jsquyres/RPMS/SRPMS/openmpi-1.6-1.src.rpm > > Which resulted in: > > + ./configure --build=x86_64-unknown-linux-gnu > --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu > --program-prefix= --prefix=/tmp/bogus --exec-prefix=/tmp/bogus > --bindir=/tmp/bogus/bin --sbindir=/tmp/bogus/sbin --sysconfdir=/tmp/bogus/etc > --datadir=/tmp/bogus/share --includedir=/tmp/bogus/include > --libdir=/tmp/bogus/lib64 --libexecdir=/tmp/bogus/libexec > --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man > --infodir=/usr/share/info > > For some reason, this didn't override localstatedir, sharedstatedir, mandir, > and infodir (gotta love RPM! :-) ), so I did: > > rpmbuild --rebuild --define '_prefix /tmp/bogus' --define '_localstatedir > /tmp/bogus/var' --define '_sharedstatedir /tmp/bogus/var/lib' --define > '_mandir /tmp/bogus/share/man' --define '_infodir /tmp/bogus/share/info' > /home/jsquyres/RPMS/SRPMS/openmpi-1.6-1.src.rpm > > When then gave me what I think you want: > > + ./configure --build=x86_64-unknown-linux-gnu > --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu > --program-prefix= --prefix=/tmp/bogus --exec-prefix=/tmp/bogus > --bindir=/tmp/bogus/bin --sbindir=/tmp/bogus/sbin --sysconfdir=/tmp/bogus/etc > --datadir=/tmp/bogus/share --includedir=/tmp/bogus/include > --libdir=/tmp/bogus/lib64 --libexecdir=/tmp/bogus/libexec > --localstatedir=/tmp/bogus/var --sharedstatedir=/tmp/bogus/var/lib > --mandir=/tmp/bogus/share/man --infodir=/tmp/bogus/share/info >
Re: [OMPI users] rpmbuild defining opt install path
Something else interesting that I just discovered. If I do this, I have the problem: rpmbuild --rebuild -bb path/to/openmpi-1.6-2.src.rpm However, if I do an "rpm -i path/to/openmpi-1.6-2.src.rpm", and then do very-similar rpmbuild syntax, it puts everything where I want it: rpmbuild -bb path/to/openmpi-1.6.spec In this case, the "" are all exactly the same. Clearly there's something I'm missing about the RPM build process. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 06/26/2012 12:40 PM, Lloyd Brown wrote: > Is there an easy way with the .spec file and the rpmbuild command, for > me to override the path the OpenMPI RPM installs into, in /opt? > Basically, I'm already doing something like this: > > rpmbuild --rebuild --define 'install_in_opt 1' --define '_name > fsl_openmpi_intel' --define 'name fsl_openmpi_intel' ... -bb > openmpi-1.6-2.src.rpm > > For some reason, though, while most of it ends up in > "/opt/fsl_openmpi_intel/1.6", as I intend, a few files still get put > into "/opt/openmpi/1.6/etc", and I'm not sure what else I can do to put > it where I want: > >> # rpm -q -l -p /root/rpmbuild/RPMS/x86_64/fsl_openmpi_intel-1.6-2.x86_64.rpm >> /opt/fsl_openmpi_intel >> /opt/fsl_openmpi_intel/1.6 >> /opt/fsl_openmpi_intel/1.6/bin >> /opt/fsl_openmpi_intel/1.6/bin/mpiCC >> /opt/fsl_openmpi_intel/1.6/bin/mpiCC-vt >> /opt/fsl_openmpi_intel/1.6/bin/mpic++ >> /opt/fsl_openmpi_intel/1.6/bin/mpic++-vt >> /opt/fsl_openmpi_intel/1.6/bin/mpicc >> ... >> /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.dtd >> /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.xml >> /opt/openmpi/1.6/etc >> /opt/openmpi/1.6/etc/openmpi-default-hostfile >> /opt/openmpi/1.6/etc/openmpi-mca-params.conf >> /opt/openmpi/1.6/etc/openmpi-totalview.tcl >> /opt/openmpi/1.6/etc/vt-java-default-filter.spec >> /opt/openmpi/1.6/etc/vtsetup-config.dtd >> /opt/openmpi/1.6/etc/vtsetup-config.xml > > I realize it might not be a good idea in general to override "name" and > "_name" like this, so if there's an easier way, I'd be happy to do it. > I just haven't found anything yet, and haven't yet found the place in > the spec file where it's being set to "/opt/openmpi" again. > > We're probably going to end up with at least 3 versions of v1.6 (gcc > compilers, intel compilers, pgi compilers) and possibly a few of a > previous version, so putting everything in /opt/openmpi/VERSION, is a > little problematic. > > Thanks,
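Spelled out, the two-step recipe that worked here looks roughly like this; the --define list is the same one shown in the quoted message below, and the SPECS path is wherever your rpmbuild tree lives:

    rpm -i openmpi-1.6-2.src.rpm
    cd ~/rpmbuild/SPECS
    rpmbuild -bb --define 'install_in_opt 1' \
        --define '_name fsl_openmpi_intel' \
        --define 'name fsl_openmpi_intel' \
        openmpi-1.6.spec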
[OMPI users] rpmbuild defining opt install path
Is there an easy way with the .spec file and the rpmbuild command, for me to override the path the OpenMPI RPM installs into, in /opt? Basically, I'm already doing something like this: rpmbuild --rebuild --define 'install_in_opt 1' --define '_name fsl_openmpi_intel' --define 'name fsl_openmpi_intel' ... -bb openmpi-1.6-2.src.rpm For some reason, though, while most of it ends up in "/opt/fsl_openmpi_intel/1.6", as I intend, a few files still get put into "/opt/openmpi/1.6/etc", and I'm not sure what else I can do to put it where I want: > # rpm -q -l -p /root/rpmbuild/RPMS/x86_64/fsl_openmpi_intel-1.6-2.x86_64.rpm > /opt/fsl_openmpi_intel > /opt/fsl_openmpi_intel/1.6 > /opt/fsl_openmpi_intel/1.6/bin > /opt/fsl_openmpi_intel/1.6/bin/mpiCC > /opt/fsl_openmpi_intel/1.6/bin/mpiCC-vt > /opt/fsl_openmpi_intel/1.6/bin/mpic++ > /opt/fsl_openmpi_intel/1.6/bin/mpic++-vt > /opt/fsl_openmpi_intel/1.6/bin/mpicc > ... > /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.dtd > /opt/fsl_openmpi_intel/1.6/share/vtsetup-data.xml > /opt/openmpi/1.6/etc > /opt/openmpi/1.6/etc/openmpi-default-hostfile > /opt/openmpi/1.6/etc/openmpi-mca-params.conf > /opt/openmpi/1.6/etc/openmpi-totalview.tcl > /opt/openmpi/1.6/etc/vt-java-default-filter.spec > /opt/openmpi/1.6/etc/vtsetup-config.dtd > /opt/openmpi/1.6/etc/vtsetup-config.xml I realize it might not be a good idea in general to override "name" and "_name" like this, so if there's an easier way, I'd be happy to do it. I just haven't found anything yet, and haven't yet found the place in the spec file where it's being set to "/opt/openmpi" again. We're probably going to end up with at least 3 versions of v1.6 (gcc compilers, intel compilers, pgi compilers) and possibly a few of a previous version, so putting everything in /opt/openmpi/VERSION, is a little problematic. Thanks, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu
Re: [OMPI users] regarding the problem occurred while running anmpi programs
Yes, but what happens when you run a remote, non-login shell? By that, I mean something like this: ssh master@ip-10-80-106-70 'echo $LD_LIBRARY_PATH' Assuming I got the syntax right, I suspect you'll find that the contents of the variable, do not include /usr/local/openmpi-1.4.5/lib. You really need that to be in LD_LIBRARY_PATH (or some other method) on all nodes, in all shells for the user. One simple way to do this is via the startup files (eg. .bashrc and .bash_profile for bash, .cshrc for csh/tcsh, etc.) Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 04/25/2012 09:43 AM, seshendra seshu wrote: > Hi > I have exported the library files as below > > [master@ip-10-80-106-70 ~]$ export > LD_LIBRARY_PATH=/usr/local/openmpi-1.4.5/lib:$LD_LIBRARY_PATH > > > [master@ip-10-80-106-70 ~]$ mpirun --prefix /usr/local/openmpi-1.4.5 -n > 1 --hostfile hostfile out > out: error while loading shared libraries: libmpi_cxx.so.0: cannot open > shared object file: No such file or directory > [master@ip-10-80-106-70 ~]$ mpirun --prefix /usr/local/lib/ -n 1 > --hostfile hostfile > out > > > out: error while loading shared libraries: libmpi_cxx.so.0: cannot open > shared object file: No such file or directory > > But still iam getting the same error. > > > > > > On Wed, Apr 25, 2012 at 5:36 PM, Jeff Squyres (jsquyres) > mailto:jsquy...@cisco.com>> wrote: > > See the FAQ item I cited. > > Sent from my phone. No type good. > > On Apr 25, 2012, at 11:24 AM, "seshendra seshu" <mailto:seshu...@gmail.com>> wrote: > >> Hi >> now i have created an used and tried to run the program but i got >> the following error >> >> [master@ip-10-80-106-70 ~]$ mpirun -n 1 --hostfile hostfile >> out >> >> >> out: error while loading shared libraries: libmpi_cxx.so.0: cannot >> open shared object file: No such file or directory >> >> >> thanking you >> >> >> >> On Wed, Apr 25, 2012 at 5:12 PM, Jeff Squyres > <mailto:jsquy...@cisco.com>> wrote: >> >> On Apr 25, 2012, at 11:06 AM, seshendra seshu wrote: >> >> > so should i need to create an user and run the mpi program. >> or how can i run in cluster >> >> It is a "best practice" to not run real applications as root >> (e.g., MPI applications). Create a non-privlidged user to run >> your applications. >> >> Then be sure to set your LD_LIBRARY_PATH if you installed Open >> MPI into a non-system-default location. See this FAQ item: >> >> >> http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path >> >> -- >> Jeff Squyres >> jsquy...@cisco.com <mailto:jsquy...@cisco.com> >> For corporate legal information go to: >> http://www.cisco.com/web/about/doing_business/legal/cri/ >> >> >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> >> >> >> -- >> WITH REGARDS >> M.L.N.Seshendra >> ___ >> users mailing list >> us...@open-mpi.org <mailto:us...@open-mpi.org> >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org <mailto:us...@open-mpi.org> > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > -- > WITH REGARDS > M.L.N.Seshendra > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
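Concretely, for the install path in this thread, that means making sure something like the following is set for non-interactive remote shells on every node (e.g., in ~/.bashrc, since ~/.bash_profile alone is normally only read by login shells):

    export PATH=/usr/local/openmpi-1.4.5/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/openmpi-1.4.5/lib:$LD_LIBRARY_PATH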
Re: [OMPI users] ssh between nodes
It really depends. You certainly CAN have mpirun/mpiexec use ssh to launch the remote processes. If you're using Torque, though, I strongly recommend using OpenMPI's hooks into the Torque TM API (see http://www.open-mpi.org/faq/?category=building#build-rte-tm). That will use the pbs_moms themselves to launch all the processes, which has several advantages. Using the TM API for job launch means that remote processes will be children of the Torque pbs_mom process rather than the sshd process, so Torque will be able to do a better job of killing rogue processes, reporting resources utilized, etc. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 02/29/2012 02:09 PM, Denver Smith wrote: > Hello, > > On my cluster running moab and torque, I cannot ssh without a password > between compute nodes. I can however request multiple node jobs fine. I > was wondering if passwordless ssh keys need to be set up between compute > nodes in order for mpi applications to run correctly. > > Thanks > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
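For completeness, the TM integration is a configure-time option; a sketch, where the Torque install path is a guess you should replace with your own (the FAQ link above has the details):

    ./configure --with-tm=/usr/local/torque --prefix=/usr/local/openmpi ...
    make && make install
    # afterwards, look for the Torque "tm" components (plm, ras) in:
    ompi_info | grep -i tm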
Re: [OMPI users] Mpirun: How to print STDOUT of just one process?
I don't know about using mpirun to do it, but you can actually call mpirun on a script, and have that script individually call a single instance of your program. Then that script could use shell redirection to redirect the output of the program's instance to a separate file. I've used this technique to play with ulimit sort of things in the script before. I'm not entirely sure what variables are exposed to you in the script, such that you could come up with a unique filename to output to, though. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 02/01/2012 08:59 AM, Frank wrote: > When running > > mpirun -n 2 > > the STDOUT streams of both processes are combined and are displayed by > the shell. In such an interleaved format its hard to tell what line > comes from which node. > > Is there a way to have mpirun just merger STDOUT of one process to its > STDOUT stream? > > Best, > Frank > > Cross-reference: > http://stackoverflow.com/questions/9098781/mpirun-how-to-print-stdout-of-just-one-process > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
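A sketch of that wrapper approach; the script name is made up, and it relies on OMPI_COMM_WORLD_RANK, which Open MPI exports into each launched process's environment and so answers the unique-filename question at the end of the post:

    #!/bin/bash
    # usage: mpirun -n 2 ./per-rank-output.sh ./myapp [args...]
    exec "$@" > output.rank${OMPI_COMM_WORLD_RANK}.log 2>&1

Recent mpirun versions also have an --output-filename option that sends each rank's output to its own file, which may be simpler still if your version supports it.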
Re: [OMPI users] Checkpoint an MPI process
Since you're looking for a function call, I'm going to assume that you are writing this application, and it's not a pre-compiled, commercial application. Given that, it's going to be significantly better to have an internal application checkpointing mechanism, where it serializes and stores the data, etc., than to use an external, application-agnostic checkpointing mechanism like BLCR or similar. The application should be aware of what data is important, how to most efficiently store it, etc. A generic library has to assume that everything is important, and store it all. Don't get me wrong. Libraries like BLCR are great for applications that don't have that visibility, and even as a tool for the application-internal checkpointing mechanism (where the application deliberately interacts with the library to annotate what's important to store, and how to do so, etc.). But if you're writing the application, you're better off handling it internally than externally. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 01/19/2012 08:05 AM, Josh Hursey wrote: > Currently Open MPI only supports the checkpointing of the whole > application. There has been some work on uncoordinated checkpointing > with message logging, though I do not know the state of that work with > regards to availability. That work has been undertaken by the University > of Tennessee Knoxville, so maybe they can provide more information. > > -- Josh > > On Wed, Jan 18, 2012 at 3:24 PM, Rodrigo Oliveira > mailto:rsilva.olive...@gmail.com>> wrote: > > Hi, > > I'd like to know if there is a way to checkpoint a specific process > running under an mpirun call. In other words, is there a function > CHECKPOINT(rank) in which I can pass the rank of the process I want > to checkpoint? I do not want to checkpoint the entire application, > but just one of its processes. > > Thanks > > ___ > users mailing list > us...@open-mpi.org <mailto:us...@open-mpi.org> > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > -- > Joshua Hursey > Postdoctoral Research Associate > Oak Ridge National Laboratory > http://users.nccs.gov/~jjhursey > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
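For the whole-application checkpointing that Josh describes above, the command-level workflow looks roughly like this, assuming Open MPI was built with its checkpoint/restart support (e.g., --with-ft=cr plus BLCR); the PID and snapshot name are illustrative:

    # launch with checkpoint/restart enabled
    mpirun -np 4 -am ft-enable-cr ./myapp
    # from another shell, checkpoint the whole job via mpirun's PID
    ompi-checkpoint 12345
    # later, restart from the global snapshot that was written
    ompi-restart ompi_global_snapshot_12345.ckpt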
Re: [OMPI users] segfault when resuming on different host
Josh, When I use cr_{run,checkpoint,restart} to start a checkpoint and restart a single-threaded, single-process app on a different host, it works, even with prelinking enabled. That's kinda why I assumed the problem was with the OpenMPI code, and didn't look at the BLCR FAQ that closely, to be honest. Having said that, I did temporarily disable prelink on my two hosts, and tried my MPI test again, and it seemed to work. I'll have to do more tests with something more intense (xhpl, maybe), and so on, but preliminary results look good. Thanks for pointing me in the right direction. Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu On 12/29/2011 02:31 PM, Josh Hursey wrote: > Often this type of problem is due to the 'prelink' option in Linux. > BLCR has a FAQ item that discusses this issue and how to resolve it: > https://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink > > I would give that a try. If that does not help then you might want to > try checkpointing a single (non-MPI) process on one node with BLCR and > restart it on the other node. If that fails, then it is likely a > BLCR/system configuration issue that is the cause. If it does work, > then we can dig more into the Open MPI causes. > > Let me know if disabling prelink works for you. > > -- Josh >
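For anyone who lands on this thread later: the prelink change that resolved it amounts to roughly the following on RHEL/CentOS-style systems, though the BLCR FAQ entry linked above is the authoritative recipe:

    # on every node: stop future prelinking...
    sed -i 's/^PRELINKING=yes/PRELINKING=no/' /etc/sysconfig/prelink
    # ...and undo what prelink has already done to existing binaries and libraries
    prelink --undo --all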
[OMPI users] segfault when resuming on different host
Hi, all. I'm in the middle of testing some of the checkpoint/restart capabilities of OpenMPI with BLCR on our cluster. I've been able to checkpoint and restart successfully when I restart on the same nodes as it was running previously. But when I try to restart on a different host, I always get an error like this: > $ ompi-restart ompi_global_snapshot_15935.ckpt > -- > mpirun noticed that process rank 1 with PID 15201 on node m7stage-1-2.local > exited on signal 11 (Segmentation fault). > -- Now, it's very possible that I've missed something during the setup, or that this is already answered somewhere despite my failure to find it while searching the mailing list, but none of the threads I could find seemed to apply (e.g., cr_restart *is* installed, etc.). I'm attaching a tarball that contains the source code of the very simple test application, as well as some example output of "ompi_info --all" and "ompi_info -v ompi full --parsable". I don't know if this will be useful or not. This is being tested on CentOS v5.4 with BLCR v0.8.4. I've seen this problem with OpenMPI v1.4.2, v1.4.4, and v1.5.4. If anyone has any ideas on what's going on, or how to best debug this, I'd love to hear about it. I don't mind doing the legwork too, but I'm just stumped where to go from here. I have some core files, but I'm having trouble getting the symbols from the backtrace in gdb. Maybe I'm doing it wrong. TIA, -- Lloyd Brown Systems Administrator Fulton Supercomputing Lab Brigham Young University http://marylou.byu.edu byufsl_debugging_segfault_on_resume.tar.gz Description: application/gzip