[OMPI devel] Ticket #1982 - Fortran MPI_IN_PLACE issue
I've been playing around with Jeff's "bogus" tarball and I, too, see it fail on OS X. If I make the following changes in configure.in it works perfectly:

1) replace -fno-common with -fcommon
2) add -flat_namespace as part of the arguments for creating shared libs.

After that, things work fine:

(dog@domdechant 63%) main
Fortran MPI_BOTTOM is 93
Assigning C variables
MPI_SEND_F: This is BOTTOM: 0x2040 == (0x6020/17, 0x6024/18, 0x2040/19, 0x602c/20)
Fortran MPI_BOTTOM is 19
Fortran MPI_BOTTOM is 32
MPI_SEND_F: This is BOTTOM: 0x2040 == (0x6020/17, 0x6024/18, 0x2040/32, 0x602c/20)
Fortran MPI_BOTTOM is 32

I still don't see what the difference between the two versions of OMPI is. This is OS X 10.5.8, GCC 4.4.1, and the most recent libtool, autoconf, automake, and m4.

-david

--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory
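[Editor's note: for context on why those two flags matter, here is a minimal, self-contained C sketch. It is not taken from Jeff's tarball; the symbol name mpi_fortran_bottom_ and the helper function are hypothetical, chosen only to illustrate the address-identity test that the MPI_SEND_F lines above are exercising. The check only works if the C and Fortran sides agree on a single copy of the common symbol, which is exactly what -fno-common vs. -fcommon and two-level namespaces vs. -flat_namespace control on OS X.]

/* Hypothetical sketch -- not from the tarball.  A C-side Fortran binding
 * typically detects the MPI_BOTTOM sentinel by address identity.  The
 * symbol name below is an assumption; the real name depends on the
 * compiler's Fortran name mangling. */
#include <stdio.h>

/* Tentative definition: with -fcommon this becomes a "common" symbol that
 * the linker merges with same-named definitions elsewhere; with -fno-common
 * it is an ordinary definition private to this object/library. */
int mpi_fortran_bottom_;

static int buffer_is_bottom(const void *addr)
{
    /* If the C library and the Fortran program end up with *different*
     * copies of this symbol (e.g. two-level namespaces without
     * -flat_namespace), the comparison fails even for a correct program. */
    return addr == (const void *) &mpi_fortran_bottom_;
}

int main(void)
{
    printf("BOTTOM sentinel lives at %p\n", (void *) &mpi_fortran_bottom_);
    printf("self-test: %d (expected 1)\n", buffer_is_bottom(&mpi_fortran_bottom_));
    return 0;
}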
Re: [OMPI devel] Ticket #1982 - Fortran MPI_IN_PLACE issue
I meant to say "configure", not "configure.in" below.

--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory

On Sep 22, 2009, at 8:05 AM, David Gunter wrote:

> I've been playing around with Jeff's "bogus" tarball and I, too, see it
> fail on OS X. If I make the following changes in configure.in it works
> perfectly:
>
> 1) replace -fno-common with -fcommon
> 2) add -flat_namespace as part of the arguments for creating shared libs.
>
> [... test output and signature snipped ...]
Re: [OMPI devel] Ticket #1982 - Fortran MPI_IN_PLACE issue
Thanks! I added these comments to #1982 (don't hesitate to add comments yourself :-) ).

On Sep 22, 2009, at 10:05 AM, David Gunter wrote:

> I've been playing around with Jeff's "bogus" tarball and I, too, see it
> fail on OS X. If I make the following changes in configure.in it works
> perfectly:
>
> 1) replace -fno-common with -fcommon
> 2) add -flat_namespace as part of the arguments for creating shared libs.
>
> [... test output and signature snipped ...]

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] Ticket #1982 - Fortran MPI_IN_PLACE issue
I don't believe I have an account to add comments - I would appreciate one!

Thanks,
david

--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory

On Sep 22, 2009, at 8:24 AM, Jeff Squyres wrote:

> Thanks! I added these comments to #1982 (don't hesitate to add comments
> yourself :-) ).
>
> [... earlier quoted message snipped ...]
>
> --
> Jeff Squyres
> jsquy...@cisco.com
Re: [OMPI devel] Dynamic languages, dlopen() issues, and symbol visibility of libtool ltdl API in current trunk
On Mon, Sep 21, 2009 at 9:45 AM, Jeff Squyres wrote:
> Ick; I appreciate Lisandro's quandary, but don't quite know what to do.

I'm just asking for the library "libopen-pal.so" to expose ltdl calls wrapped with an "opal_" prefix. This way, the original ltdl calls are hidden (no chance to collide with user code using an incompatible libtool version), but Open MPI provides a portable way to dlopen() shared libs/dynamic modules. In simple terms, I'm asking "libopen-pal.so" to contain ltdl wrapper calls like this one:

OMPI_DECLSPEC lt_dlhandle opal_lt_dlopenadvise(const char *filename, lt_dladvise advise) /* note opal_ prefix! */
{
    return lt_dlopenadvise(filename, advise); /* original ltdl call */
}

Then, third-party code (like mpi4py or any other dynamic MPI module for any other dynamic language) can do this:

#include <mpi.h>
#if defined(OPEN_MPI)
typedef void *lt_dlhandle;
typedef void *lt_dladvise;
OMPI_DECLSPEC extern lt_dlhandle opal_lt_dlopenadvise(const char *, lt_dladvise);
#endif
...
#if defined(OPEN_MPI)
/* init advice, not shown ... */
opal_lt_dlopenadvise("mpi", advice);
/* destroy advice, not shown ... */
#endif
MPI_Init(0, 0);

> How about keeping libltdl -fvisibility=hidden inside mpi4py?

Not sure if I was clear enough in my comments above, but mpi4py does not bundle/link libtool. It just takes advantage of the libtool symbols available in "libopen-pal.so" for the sake of portability.

> On Sep 17, 2009, at 11:16 AM, Josh Hursey wrote:
>
>> So I started down this road a couple months ago. I was using
>> lt_dlopen() and friends in the OPAL CRS self module. The visibility
>> changes broke that functionality. The one solution that I started
>> implementing was precisely what you suggested: wrapping a subset of the
>> libtool calls and prefixing them with opal_*. The email thread is below:
>> http://www.open-mpi.org/community/lists/devel/2009/07/6531.php
>>
>> The problem that I hit was that libtool's build system did not play
>> well with the visibility symbols. This caused dlopen to be disabled
>> incorrectly. The libtool folks have a patch and, I believe, they are
>> planning on incorporating it in the next release. The email thread is
>> below:
>> http://thread.gmane.org/gmane.comp.gnu.libtool.patches/9446
>>
>> So we would (others can speak up if not) certainly consider such a
>> wrapper, but I think we need to wait for the next libtool release
>> (unless there is other magic we can do) before it would be usable.
>>
>> Do others have any other ideas on how we might get around this in the
>> mean time?
>>
>> -- Josh
>>
>> On Sep 16, 2009, at 5:59 PM, Lisandro Dalcin wrote:
>>
>>> Hi all.. I have to contact you again about the issues related to
>>> dlopen()ing libmpi with RTLD_LOCAL, as many dynamic languages (Python
>>> in my case) do.
>>>
>>> So far, I've been able to manage the issues (despite the "do nothing"
>>> policy from Open MPI devs, which I understand) in a more or less
>>> portable manner by taking advantage of the availability of libtool
>>> ltdl symbols in the Open MPI libraries (specifically, in libopen-pal).
>>> For reference, all this hackery is here:
>>> http://code.google.com/p/mpi4py/source/browse/trunk/src/compat/openmpi.h
>>>
>>> However, I noticed that in the current trunk (v1.4, IIUC) things have
>>> changed and libtool symbols are not externally available. Again, I
>>> understand the reason and acknowledge that such a change is a really
>>> good thing. However, this change has broken all my hackery for
>>> dlopen()ing libmpi before the call to MPI_Init().
>>>
>>> Is there any chance that libopen-pal could provide some properly
>>> prefixed (let's say, using "opal_" as a prefix) wrapper calls to a small
>>> subset of the libtool ltdl API? The following set of wrapper calls
>>> is the minimum required to properly load libmpi in a portable
>>> manner and clean up resources (let me abuse my previous suggestion
>>> and add the opal_ prefix):
>>>
>>> opal_lt_dlinit()
>>> opal_lt_dlexit()
>>>
>>> opal_lt_dladvise_init(a)
>>> opal_lt_dladvise_destroy(a)
>>> opal_lt_dladvise_global(a)
>>> opal_lt_dladvise_ext(a)
>>>
>>> opal_lt_dlopenadvise(n,a)
>>> opal_lt_dlclose(h)
>>>
>>> Any chance this request could be considered? I would really like to
>>> have this before any Open MPI tarball gets released without libtool
>>> symbols exposed...
>>>
>>> --
>>> Lisandro Dalcín
>>> ---
>>> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
>>> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
>>> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
>>> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
>>> Tel/Fax: +54-(0)342-451.1594
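[Editor's note: if the opal_-prefixed wrappers proposed above were added to libopen-pal, a dynamic-language extension could preload libmpi along the following lines. This is only a sketch of the proposal, not an existing API: the prototypes are guesses modeled on the standard ltdl calls listed in the message, and OMPI_DECLSPEC/OPEN_MPI come from mpi.h as in the snippet earlier in the thread.]

#include <mpi.h>

#if defined(OPEN_MPI)
/* Opaque stand-ins, as in the snippet above; the real ltdl types stay hidden. */
typedef void *lt_dlhandle;
typedef void *lt_dladvise;
OMPI_DECLSPEC extern int opal_lt_dlinit(void);
OMPI_DECLSPEC extern int opal_lt_dlexit(void);
OMPI_DECLSPEC extern int opal_lt_dladvise_init(lt_dladvise *advise);
OMPI_DECLSPEC extern int opal_lt_dladvise_destroy(lt_dladvise *advise);
OMPI_DECLSPEC extern int opal_lt_dladvise_global(lt_dladvise *advise);
OMPI_DECLSPEC extern int opal_lt_dladvise_ext(lt_dladvise *advise);
OMPI_DECLSPEC extern lt_dlhandle opal_lt_dlopenadvise(const char *name, lt_dladvise advise);
OMPI_DECLSPEC extern int opal_lt_dlclose(lt_dlhandle handle);
#endif

int main(int argc, char *argv[])
{
#if defined(OPEN_MPI)
    /* Re-open the MPI library with global symbol visibility before MPI_Init(),
     * so that lazily dlopen()ed MCA components can resolve MPI/OPAL symbols
     * even when the host interpreter loaded this module with RTLD_LOCAL. */
    lt_dladvise advise = 0;
    lt_dlhandle handle;
    opal_lt_dlinit();
    opal_lt_dladvise_init(&advise);
    opal_lt_dladvise_global(&advise);  /* RTLD_GLOBAL-like behavior */
    opal_lt_dladvise_ext(&advise);     /* try platform shared-lib extensions */
    handle = opal_lt_dlopenadvise("mpi", advise);  /* name as in the snippet above */
    opal_lt_dladvise_destroy(&advise);
#endif

    MPI_Init(&argc, &argv);
    /* ... application code ... */
    MPI_Finalize();

#if defined(OPEN_MPI)
    if (handle) opal_lt_dlclose(handle);
    opal_lt_dlexit();
#endif
    return 0;
}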
[OMPI devel] coll sm ramifications
Someday soon, coll sm will be reliable. Really. :-)

One thing I noticed is that coll sm is "slow" in communicator construction and destruction because it mmap's upon creation and munmap's upon deletion. For most apps, this probably doesn't matter. For apps that create bajillions of communicators, the effect can be noticeable.

There's at least one way to alleviate this effect, but I don't have time to implement this optimization. I wrote up a ticket with a few more details:

https://svn.open-mpi.org/trac/ompi/ticket/2027

--
Jeff Squyres
jsquy...@cisco.com
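[Editor's note: to make the effect concrete, here is a small, self-contained microbenchmark sketch. It is not from the ticket, and the iteration count is arbitrary; it simply stresses the path described above -- repeated communicator construction and destruction -- so the per-communicator mmap()/munmap() cost becomes visible when the sm coll component is in use. Running it with the sm coll component enabled and disabled would show the difference.]

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    const int iterations = 1000;   /* arbitrary; large enough to see a trend */
    int i, rank;
    double start, elapsed;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    start = MPI_Wtime();
    for (i = 0; i < iterations; ++i) {
        MPI_Comm dup;
        MPI_Comm_dup(MPI_COMM_WORLD, &dup);  /* coll sm may mmap() a segment here */
        MPI_Comm_free(&dup);                 /* ...and munmap() it here */
    }
    elapsed = MPI_Wtime() - start;

    if (rank == 0) {
        printf("%d dup/free cycles: %.3f s (%.1f us per communicator)\n",
               iterations, elapsed, 1.0e6 * elapsed / iterations);
    }

    MPI_Finalize();
    return 0;
}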
Re: [OMPI devel] [OMPI users] Open-MPI between Mac and Linux (ubuntu 9.04) over wireless
Hi Rolf,

I ran the following:

pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca btl_tcp_if_include en0,wlan0 -np 2 -hetero -H localhost,10.11.14.205 /tmp/hello

[fuji.local:02267] mca: base: components_open: Looking for btl components
[fuji.local:02267] mca: base: components_open: opening btl components
[fuji.local:02267] mca: base: components_open: found loaded component self
[fuji.local:02267] mca: base: components_open: component self has no register function
[fuji.local:02267] mca: base: components_open: component self open function successful
[fuji.local:02267] mca: base: components_open: found loaded component sm
[fuji.local:02267] mca: base: components_open: component sm has no register function
[fuji.local:02267] mca: base: components_open: component sm open function successful
[fuji.local:02267] mca: base: components_open: found loaded component tcp
[fuji.local:02267] mca: base: components_open: component tcp has no register function
[fuji.local:02267] mca: base: components_open: component tcp open function successful
[fuji.local:02267] select: initializing btl component self
[fuji.local:02267] select: init of component self returned success
[fuji.local:02267] select: initializing btl component sm
[fuji.local:02267] select: init of component sm returned success
[fuji.local:02267] select: initializing btl component tcp
[fuji.local][[59424,1],0][btl_tcp_component.c:468:mca_btl_tcp_component_create_instances] invalid interface "wlan0"
[fuji.local:02267] select: init of component tcp returned success
[apex-backpack:31956] mca: base: components_open: Looking for btl components
[apex-backpack:31956] mca: base: components_open: opening btl components
[apex-backpack:31956] mca: base: components_open: found loaded component self
[apex-backpack:31956] mca: base: components_open: component self has no register function
[apex-backpack:31956] mca: base: components_open: component self open function successful
[apex-backpack:31956] mca: base: components_open: found loaded component sm
[apex-backpack:31956] mca: base: components_open: component sm has no register function
[apex-backpack:31956] mca: base: components_open: component sm open function successful
[apex-backpack:31956] mca: base: components_open: found loaded component tcp
[apex-backpack:31956] mca: base: components_open: component tcp has no register function
[apex-backpack:31956] mca: base: components_open: component tcp open function successful
[apex-backpack:31956] select: initializing btl component self
[apex-backpack:31956] select: init of component self returned success
[apex-backpack:31956] select: initializing btl component sm
[apex-backpack:31956] select: init of component sm returned success
[apex-backpack:31956] select: initializing btl component tcp
[apex-backpack][[59424,1],1][btl_tcp_component.c:468:mca_btl_tcp_component_create_instances] invalid interface "en0"
[apex-backpack:31956] select: init of component tcp returned success
Process 0 on fuji.local out of 2
Process 1 on apex-backpack out of 2
[apex-backpack:31956] btl: tcp: attempting to connect() to address 10.11.14.203 on port 9360

It launches the processes on both ends and then it hangs at the send/receive part! What is the other thing that you were mentioning which makes you think that it's not working? Please suggest.

--regards, pallab

> The --enable-heterogeneous should do the trick. And to answer the
> previous question, yes, put both of the interfaces in the include list.
>
> --mca btl_tcp_if_include en0,wlan0
>
> If that does not work, then I may have one other thought why it might
> not work, although perhaps not a solution.
>
> Rolf
>
> Pallab Datta wrote:
>> Hi Rolf,
>>
>> Do I need to configure openmpi with some specific options apart from
>> --enable-heterogeneous?
>> I am currently using
>> ./configure --prefix=/usr/local/ --enable-heterogeneous --disable-static
>> --enable-shared --enable-debug
>>
>> on both ends... is the above correct? Please let me know.
>> thanks and regards,
>> pallab
>>
>>> Hi:
>>> I assume if you wait several minutes then your program will actually
>>> time out, yes? I guess I have two suggestions. First, can you run a
>>> non-MPI job using the wireless? Something like hostname? Secondly, you
>>> may want to specify the specific interfaces you want it to use on the
>>> two machines. You can do that via the "--mca btl_tcp_if_include"
>>> run-time parameter. Just list the ones that you expect it to use.
>>>
>>> Also, this is not right - "--mca OMPI_mca_mpi_preconnect_all 1". It
>>> should be --mca mpi_preconnect_mpi 1 if you want to do the connection
>>> during MPI_Init.
>>>
>>> Rolf
>>>
>>> Pallab Datta wrote:
>>>> The following is the error dump
>>>> fuji:src pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4
>>>> 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca
>>>> btl tcp,self --mca OMPI_mca_
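[Editor's note: since the hang is at the first point-to-point exchange, it may help to pin down exactly what /tmp/hello does. Its source is not in this thread, so the following is only a hypothetical stand-in that produces the same "Process N on host out of 2" output and then does a simple round-trip send/receive. If even this minimal exchange hangs, the problem is in the TCP BTL connection setup rather than in the application.]

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);
    printf("Process %d on %s out of %d\n", rank, host, size);

    if (size >= 2) {
        int token = 42;
        if (rank == 0) {
            /* first point-to-point traffic: this is where the reported hang occurs */
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("round trip completed: token=%d\n", token);
        } else if (rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}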
Re: [OMPI devel] [OMPI users] Open-MPI between Mac and Linux (ubuntu 9.04) over wireless
Is this a bug when running Open MPI in a heterogeneous environment (between a Mac and Linux) over wireless links? Please suggest what needs to be done or what I am missing. Any clues as to how to debug this will be of great help.

thanks and regards,
pallab

> Hi Rolf,
>
> I ran the following:
>
> pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca
> btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca
> btl_tcp_if_include en0,wlan0 -np 2 -hetero -H localhost,10.11.14.205
> /tmp/hello
>
> [... mpirun output and earlier quoted replies, identical to the previous message, snipped ...]
>
> It launches the processes on both ends and then it hangs at the
> send/receive part! What is the other thing that you were mentioning
> which makes you think that it's not working? Please suggest.
>
> --regards, pallab
Re: [OMPI devel] [OMPI users] Open-MPI between Mac and Linux (ubuntu 9.04) over wireless
The following is the ifconfig output for the Mac and the Linux box, respectively:

fuji:openmpi-1.3.3 pallabdatta$ ifconfig
lo0: flags=8049 mtu 16384
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	inet 127.0.0.1 netmask 0xff00
	inet6 ::1 prefixlen 128
gif0: flags=8010 mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863 mtu 1500
	inet6 fe80::21f:5bff:fe3d:eaac%en0 prefixlen 64 scopeid 0x4
	inet 10.11.14.203 netmask 0xf000 broadcast 10.11.15.255
	ether 00:1f:5b:3d:ea:ac
	media: autoselect (100baseTX) status: active
	supported media: autoselect 10baseT/UTP 10baseT/UTP 10baseT/UTP 10baseT/UTP 100baseTX 100baseTX 100baseTX 100baseTX 1000baseT 1000baseT 1000baseT
en1: flags=8863 mtu 1500
	ether 00:1f:5b:3d:ea:ad
	media: autoselect status: inactive
	supported media: autoselect 10baseT/UTP 10baseT/UTP 10baseT/UTP 10baseT/UTP 100baseTX 100baseTX 100baseTX 100baseTX 1000baseT 1000baseT 1000baseT
fw0: flags=8863 mtu 4078
	lladdr 00:22:41:ff:fe:ed:7d:a8
	media: autoselect status: inactive
	supported media: autoselect

LINUX:

pallabdatta@apex-backpack:~/backpack/src$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:116 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11788 (11.7 KB) TX bytes:11788 (11.7 KB)

wlan0     Link encap:Ethernet HWaddr 00:21:79:c2:54:c7
          inet addr:10.11.14.205 Bcast:10.11.14.255 Mask:255.255.240.0
          inet6 addr: fe80::221:79ff:fec2:54c7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:72531 errors:0 dropped:0 overruns:0 frame:0
          TX packets:28894 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:5459312 (5.4 MB) TX bytes:7264193 (7.2 MB)

wmaster0  Link encap:UNSPEC HWaddr 00-21-79-C2-54-C7-34-63-00-00-00-00-00-00-00-00
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

The Mac is a Mac Pro with two 2.26GHz quad-core Intel Xeons, and the Linux box is running Ubuntu Server Edition 9.04. The Mac connects to the network over its Ethernet interface, and the Linux box connects via a wireless adapter (IOGEAR).

Please help me fix this issue any way I can. It really needs to work for our project.

thanks in advance, regards,
pallab

> My other concern was the following, but I am not sure it applies here.
> If you have multiple interfaces on the node, and they are on the same
> subnet, then you cannot actually select which IP address to go out of.
> You can only select the IP address you want to connect to. In these
> cases, I have seen a hang because we think we are selecting an IP
> address to go out of, but it actually goes out the other one.
>
> Perhaps you can send the users list the output from "ifconfig" on each
> of the machines, which would show all the interfaces. You need to get the
> right arguments for ifconfig depending on the OS you are running on.
>
> One thought is to make sure the ethernet interface is marked down on both
> boxes, if that is possible.
>
> Pallab Datta wrote:
>> Any suggestions on how to debug this further?
>> Do you think I need to enable any other option besides heterogeneous at
>> the configure prompt?
>>
>> [... earlier quoted messages, identical to the previous posts, snipped ...]
Re: [OMPI devel] application hangs with multiple dup
Hi Edgar,

- "Edgar Gabriel" wrote:

> just wanted to give a heads-up that I *think* I know what the problem
> is. I should have a fix (with a description) either later today or
> tomorrow morning...

I see that changeset 21970 is on the trunk to fix this issue; is that backportable to the 1.3.x branch? I'd love to see if this fixes up our users' issues with Gadget!

cheers,
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
Re: [OMPI devel] application hangs with multiple dup
it will be available in 1.3.4...

Thanks
Edgar

Chris Samuel wrote:
> Hi Edgar,
>
> - "Edgar Gabriel" wrote:
>
>> just wanted to give a heads-up that I *think* I know what the problem
>> is. I should have a fix (with a description) either later today or
>> tomorrow morning...
>
> I see that changeset 21970 is on the trunk to fix this issue; is that
> backportable to the 1.3.x branch? I'd love to see if this fixes up our
> users' issues with Gadget!
>
> cheers,
> Chris

--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335