Re: [O-MPI devel] Path detection patch
Hi George,

* George Bosilca wrote on Wed, Jan 18, 2006 at 10:39:48PM CET:
> I have some trouble on Windows getting the correct path for the Open MPI
> installation directory as well as for all the tools used inside. We need
> this path in order to be able to use the wrapper compilers, to load the
> shared libraries, and so on. I dug around on the web and came up with a
> solution. It involves replacing the path define (it's always a define for
> us) with a shell command. Depending on the OS, this shell command will do
> the magic to set up the path correctly. Here is an example:
>
> Actual code:
>   -DOMPI_PKGLIBDIR=\"$(pkglibdir)\"
>
> Patched code:
>   -DOMPI_PKGLIBDIR=\""`@PATH_CONVERTOR@ '$(pkglibdir)'`\""
>
> On all UNIX flavors PATH_CONVERTOR will be set to echo. On Cygwin it will
> be set to "cygpath -m" so we get the correct Windows path. I'm still
> looking into how to set it correctly on MinGW.

You might want to try

  cmd.exe //c $program_to_execute "$@"

which will cause the MinGW runtime to do path translation on all arguments.
Watch out, though: this will generate backslashes, similar to 'cygpath -w'
but unlike 'cygpath -m'. Likely some other Open MPI macros will need to be
adjusted for this.

Within Libtool, we mostly use this idiom to detect absolute path names:

  case $dir in
    [\\/]* | [A-Za-z]:[\\/]*) $commands_for_absolute_paths ;;
    *)                        $other_commands ;;
  esac

(The nontrivial bit here is that using [/\\] instead of [\\/] is not portable
due to buggy shells.)

These threads may also be important to you (especially for paths that do not
exist yet):
  http://article.gmane.org/gmane.comp.gnu.mingw.msys/2785
  http://thread.gmane.org/gmane.comp.gnu.mingw.user/18035

Also note that Automake has $(CYGPATH_W), which may be useful for you.

> I attached the patch to this email. If you know of or can find a simpler
> way, I will be happy to hear about it. As usual, all comments are welcome :)

| +-DOMPI_PREFIX=\""`@PATH_CONVERTOR@ '$(prefix)'`\"" \

Since you already AC_SUBST([PATH_CONVERTOR]), you can write

  -DOMPI_PREFIX=\""`$(PATH_CONVERTOR) '$(prefix)'`\"" \

here. Instead of changing all the paths in the Makefiles only, you could try
to do the translation at configure time already. A related thing you will
encounter is that users will pass translated paths already (rather, some
automatism will cause translated paths to be passed), so the macros should
probably be aware of that anyway.

Cheers,
Ralf
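For reference, the reason the converted path matters is that the value of
these -D defines is compiled into the binary verbatim and used at run time to
locate components and wrapper data. A minimal sketch of how such a define is
typically consumed (illustrative only; the fallback value and the printout
are assumptions, not the actual Open MPI code):

    /* Illustrative sketch only -- not the actual Open MPI source.  It shows
     * why a -DOMPI_PKGLIBDIR=... value must already be in a form the target
     * platform understands: the string is compiled in verbatim. */
    #include <stdio.h>

    #ifndef OMPI_PKGLIBDIR
    /* Hypothetical fallback so this snippet builds standalone. */
    #define OMPI_PKGLIBDIR "/usr/local/lib/openmpi"
    #endif

    int main(void)
    {
        /* On Cygwin/MinGW this string needs to be a Windows-style path
         * (what `cygpath -m` produces), or the components won't be found. */
        printf("component search path: %s\n", OMPI_PKGLIBDIR);
        return 0;
    }

Building the snippet with something like
cc -DOMPI_PKGLIBDIR='"C:/ompi/lib/openmpi"' shows how the converted value
ends up baked into the executable.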
Re: [O-MPI devel] posix threads
Luke,

I don't have access to exactly the same OS as you describe, but I was able to
run your example (using the same compilation flags as you) on the following
environments:

  - 32-bit Intel Red Hat Nahant 2
  - 32-bit Debian testing/unstable
  - 64-bit Mac OS X

I even ran the NetPIPE benchmark successfully on the same systems. However,
there was an issue with our internal list management that was showing up
[mostly] for the shared memory device. It's fixed now in SVN (as of revision
8749). Please update your Open MPI copy (the changes are not yet in the
nightly build, so you have to check out via SVN) and try again.

  Thanks,
    george.

On Jan 18, 2006, at 2:12 PM, Luke Schierer wrote:

> If we compile Open MPI with support for POSIX threads (./configure
> --prefix=/usr/local --enable-mpi-threads --with-threads=posix), MPI hangs:
> even simple commands like
>
>   mpiexec --host localhost --np 1 hostname
>
> lock up and have to be killed. If we compile without --with-threads=posix,
> it works fine. We've tried this with both 1.0.1 and 1.0.2a3 on a CentOS
> release 4.2 install (CentOS is the same as Red Hat, built from the same
> sources, just without the trademarked name and the support), and on a
> Debian unstable install. Does anyone have any ideas on getting Open MPI to
> work with POSIX thread support?
>
> Luke

"Half of what I say is meaningless; but I say it so that the other half may
reach you"
  Kahlil Gibran
Re: [O-MPI devel] posix threads
Actually, I have been able to duplicate Luke's problem and hope to have a fix
this morning. I'll post more details when I have the fix.

Brian

On Jan 19, 2006, at 10:37 AM, George Bosilca wrote:

> Luke,
>
> I don't have access to exactly the same OS as you describe, but I was able
> to run your example (using the same compilation flags as you) on the
> following environments:
>
>   - 32-bit Intel Red Hat Nahant 2
>   - 32-bit Debian testing/unstable
>   - 64-bit Mac OS X
>
> I even ran the NetPIPE benchmark successfully on the same systems. However,
> there was an issue with our internal list management that was showing up
> [mostly] for the shared memory device. It's fixed now in SVN (as of revision
> 8749). Please update your Open MPI copy (the changes are not yet in the
> nightly build, so you have to check out via SVN) and try again.
>
>   Thanks,
>     george.
>
> On Jan 18, 2006, at 2:12 PM, Luke Schierer wrote:
>
>> If we compile Open MPI with support for POSIX threads (./configure
>> --prefix=/usr/local --enable-mpi-threads --with-threads=posix), MPI hangs:
>> even simple commands like
>>
>>   mpiexec --host localhost --np 1 hostname
>>
>> lock up and have to be killed. If we compile without --with-threads=posix,
>> it works fine. We've tried this with both 1.0.1 and 1.0.2a3 on a CentOS
>> release 4.2 install (CentOS is the same as Red Hat, built from the same
>> sources, just without the trademarked name and the support), and on a
>> Debian unstable install. Does anyone have any ideas on getting Open MPI to
>> work with POSIX thread support?
>>
>> Luke
Re: [O-MPI devel] while-loop around opal_condition_wait
Hello dear all,

George's patch (svn:open-mpi r8741) makes the deadlock we experienced on a
threaded build of the mpi_test_suite (compiled with --enable-progress-threads)
sometimes go away.

Previously, we would hang here:

  rusraink@pcglap12:~/WORK/OPENMPI/ompi-tests/mpi_test_suite/COMPILE-clean-threads>
      mpirun -np 2 ./mpi_test_suite -r FULL -c MPI_COMM_WORLD -d MPI_INT
  P2P tests Ring (3/31), comm MPI_COMM_WORLD (1/1), type MPI_INT (6/1)
  [... Tests snipped ...]
  P2P tests Alltoall with MPI_Probe (MPI_ANY_SOURCE) (20/31), comm MPI_COMM_WORLD (1/1), type MPI_INT (6/1)
  Collective tests Bcast (23/31), comm MPI_COMM_WORLD (1/1), type MPI_INT (6/1)
  ...

Here we used to always hang. Now we get through most of the time (9 out of 10
runs). This is all without the patch below.

CU,
Rainer

On Wednesday 18 January 2006 22:39, Brian Barrett wrote:
> > Occurrences:
> > ompi/class/ompi_free_list.h
>
> This is OK as is, because the loop protecting against a spurious wakeup is
> already there. If two threads are waiting, both are woken up, and there's
> only one request (or somehow, no requests), then they'll try to remove from
> the list, get NULL, and continue through the bigger while() loop. So that
> works as expected.
>
> > opal/class/opal_free_list.h
>
> Same reasoning as ompi_free_list.
>
> > ompi/request/req_wait.c   /* Two occurrences: not a must, but... */
>
> I believe these are both correct. The first is in a larger do { ... } while
> loop that will handle the case of a wakeup with no requests ready. The
> second is in a tight while() loop already, so we're OK there.
>
> > orte/mca/gpr/proxy/gpr_proxy_compound_cmd.c
>
> This one I'd like Ralph to look at, because I'm not sure I understand the
> logic completely. It looks like this is potentially a problem. Only one
> thread will be woken up at a time, since the mutex has to be re-acquired.
> So the question becomes: will anyone give up the mutex with
> component.compound_cmd_mode left set to true? I think the answer is yes.
> This looks like it could be a possible bug if people are using the compound
> command code when multiple threads are active. Thankfully, I don't think
> this happens very often.
>
> > orte/mca/iof/base/iof_base_flush.c:108
>
> This looks like it's wrapped in a larger while loop and is safe from any
> restart of the wait conditions.
>
> > orte/mca/pls/rsh/pls_rsh_module.c:892
>
> This could be a bit of a problem, but I don't think spurious wake-ups will
> cause any real problems. The worst case is that we possibly end up trying
> to concurrently start more processes than we really intended. But Tim might
> have more insight than I do.
>
> Just my $0.02
>
> Brian

--
Dipl.-Inf. Rainer Keller          email: kel...@hlrs.de
High Performance Computing        Tel:   ++49 (0)711-685 5858
Center Stuttgart (HLRS)           Fax:   ++49 (0)711-685 5832
POSTAL: Nobelstrasse 19           http://www.hlrs.de/people/keller
ACTUAL: Allmandring 30, R. O.030  AIM:   rusraink
        70550 Stuttgart
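For readers following along, the pattern Brian is auditing for at each of
these call sites is the standard guard against spurious wakeups: the
predicate is re-checked in a while loop around the condition wait. A generic
pthreads sketch of that pattern (not Open MPI's actual opal_condition code;
work_ready is a made-up predicate used only for illustration):

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
    static bool work_ready = false;    /* the predicate the waiter cares about */

    static void *waiter(void *arg)
    {
        pthread_mutex_lock(&lock);
        /* The while loop (not an if) is what protects against spurious
         * wakeups and against another thread consuming the work first:
         * after every wakeup the predicate is re-checked before going on. */
        while (!work_ready) {
            pthread_cond_wait(&cond, &lock);
        }
        work_ready = false;            /* consume the work */
        pthread_mutex_unlock(&lock);
        return arg;
    }

    static void signaler(void)
    {
        pthread_mutex_lock(&lock);
        work_ready = true;
        pthread_cond_signal(&cond);    /* or broadcast if several waiters */
        pthread_mutex_unlock(&lock);
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, waiter, NULL);
        signaler();
        pthread_join(t, NULL);
        return 0;
    }

A bare "if" around the wait instead of the "while" is exactly the kind of
call site the audit above is meant to flag.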
Re: [O-MPI devel] while-loop around opal_condition_wait
Rainer,

I was hoping my patch would solve the problem completely ... it looks like
that's not the case :( How exactly do you get the deadlock in the
mpi_test_suite? Which configure options? Only --enable-progress-threads?

  Thanks,
    george.

On Jan 19, 2006, at 11:12 AM, Rainer Keller wrote:

> Hello dear all,
>
> George's patch (svn:open-mpi r8741) makes the deadlock we experienced on a
> threaded build of the mpi_test_suite (compiled with
> --enable-progress-threads) sometimes go away.
>
> Previously, we would hang here:
>
>   rusraink@pcglap12:~/WORK/OPENMPI/ompi-tests/mpi_test_suite/COMPILE-clean-threads>
>       mpirun -np 2 ./mpi_test_suite -r FULL -c MPI_COMM_WORLD -d MPI_INT
>   P2P tests Ring (3/31), comm MPI_COMM_WORLD (1/1), type MPI_INT (6/1)
>   [... Tests snipped ...]
>   P2P tests Alltoall with MPI_Probe (MPI_ANY_SOURCE) (20/31), comm MPI_COMM_WORLD (1/1), type MPI_INT (6/1)
>   Collective tests Bcast (23/31), comm MPI_COMM_WORLD (1/1), type MPI_INT (6/1)
>   ...
>
> Here we used to always hang. Now we get through most of the time (9 out of
> 10 runs). This is all without the patch below.
>
> CU,
> Rainer
>
> On Wednesday 18 January 2006 22:39, Brian Barrett wrote:
>> [... Brian's analysis of the opal_condition_wait call sites snipped;
>> quoted in full in Rainer's message above ...]

"Half of what I say is meaningless; but I say it so that the other half may
reach you"
  Kahlil Gibran
Re: [O-MPI devel] while-loop around opal_condition_wait
Hi George,

On Thursday 19 January 2006 17:22, George Bosilca wrote:
> I was hoping my patch would solve the problem completely ... it looks like
> that's not the case :( How exactly do you get the deadlock in the
> mpi_test_suite? Which configure options? Only --enable-progress-threads?

This happens with both --enable-progress-threads and an additional
--enable-mpi-threads. Both hang in the same places.

Process 0:

  #4  0x40315a56 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
  #5  0x40222513 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
  #6  0x4007d7a2 in opal_condition_wait (c=0x4013c6c0, m=0x4013c720) at condition.h:64
  #7  0x4007d40b in ompi_request_wait_all (count=1, requests=0x80bc1c0, statuses=0x0) at req_wait.c:159
  #8  0x4073083f in ompi_coll_tuned_bcast_intra_basic_linear (buff=0x80c9c90, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at coll_tuned_bcast.c:762
  #9  0x4072b002 in ompi_coll_tuned_bcast_intra_dec_fixed (buff=0x80c9c90, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at coll_tuned_decision_fixed.c:175
  #10 0x40083dae in PMPI_Bcast (buffer=0x80c9c90, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at pbcast.c:88
  #11 0x0804f2cf in tst_coll_bcast_run (env=0xbfffeac0) at tst_coll_bcast.c:74
  #12 0x0804bf21 in tst_test_run_func (env=0xbfffeac0) at tst_tests.c:377
  #13 0x0804a46a in main (argc=7, argv=0xbfffeb74) at mpi_test_suite.c:319

Process 1:

  #4  0x40315a56 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
  #5  0x40222513 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
  #6  0x406941e3 in opal_condition_wait (c=0x4013c6c0, m=0x4013c720) at condition.h:64
  #7  0x406939f2 in mca_pml_ob1_recv (addr=0x80c9c58, count=1000, datatype=0x8061de8, src=0, tag=-17, comm=0x80627e0, status=0x0) at pml_ob1_irecv.c:96
  #8  0x407307a4 in ompi_coll_tuned_bcast_intra_basic_linear (buff=0x80c9c58, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at coll_tuned_bcast.c:729
  #9  0x4072b002 in ompi_coll_tuned_bcast_intra_dec_fixed (buff=0x80c9c58, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at coll_tuned_decision_fixed.c:175
  #10 0x40083dae in PMPI_Bcast (buffer=0x80c9c58, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at pbcast.c:88
  #11 0x0804f2cf in tst_coll_bcast_run (env=0xbfffeac0) at tst_coll_bcast.c:74
  #12 0x0804bf21 in tst_test_run_func (env=0xbfffeac0) at tst_tests.c:377
  #13 0x0804a46a in main (argc=7, argv=0xbfffeb74) at mpi_test_suite.c:319

And yes, when I run with the basic coll, we also hang ;-]

  mpirun -np 2 --mca coll basic ./mpi_test_suite -r FULL -c MPI_COMM_WORLD -d MPI_INT

  #4  0x40315a56 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
  #5  0x40222513 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libc.so.6
  #6  0x406941e3 in opal_condition_wait (c=0x4013c6c0, m=0x4013c720) at condition.h:64
  #7  0x406939f2 in mca_pml_ob1_recv (addr=0x80c4ca0, count=1000, datatype=0x8061de8, src=0, tag=-17, comm=0x80627e0, status=0x0) at pml_ob1_irecv.c:96
  #8  0x4070e402 in mca_coll_basic_bcast_lin_intra (buff=0x80c4ca0, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at coll_basic_bcast.c:57
  #9  0x40083dae in PMPI_Bcast (buffer=0x80c4ca0, count=1000, datatype=0x8061de8, root=0, comm=0x80627e0) at pbcast.c:88
  #10 0x0804f2cf in tst_coll_bcast_run (env=0xbfffeab0) at tst_coll_bcast.c:74
  #11 0x0804bf21 in tst_test_run_func (env=0xbfffeab0) at tst_tests.c:377
  #12 0x0804a46a in main (argc=7, argv=0xbfffeb64) at mpi_test_suite.c:319

Now, for what it's worth, I ran with helgrind to check for possible race
conditions, and it spews out:

  ==20240== Possible data race writing variable at 0x1D84F46C
  ==20240==    at 0x1DA8BE61: mca_oob_tcp_recv (oob_tcp_recv.c:129)
  ==20240==    by 0x1D73A636: mca_oob_recv_packed (oob_base_recv.c:69)
  ==20240==    by 0x1D73B2B0: mca_oob_xcast (oob_base_xcast.c:133)
  ==20240==    by 0x1D511138: ompi_mpi_init (ompi_mpi_init.c:421)
  ==20240==  Address 0x1D84F46C is 1020 bytes inside a block of size 3168 alloc'd by thread 1
  ==20240==    at 0x1D4A80B4: malloc (in /usr/lib/valgrind/vgpreload_helgrind.so)
  ==20240==    by 0x1D7DF7BE: opal_free_list_grow (opal_free_list.c:94)
  ==20240==    by 0x1D7DF754: opal_free_list_init (opal_free_list.c:79)
  ==20240==    by 0x1DA815E3: mca_oob_tcp_component_init (oob_tcp.c:530)

So, this was my initial search for whether we may have races in the
opal/ompi free lists.

CU,
Rainer

--
Dipl.-Inf. Rainer Keller          email: kel...@hlrs.de
High Performance Computing        Tel:   ++49 (0)711-685 5858
Center Stuttgart (HLRS)           Fax:   ++49 (0)711-685 5832
POSTAL: Nobelstrasse 19           http://www.hlrs.de/people/keller
ACTUAL: Allmandring 30, R. O.030  AIM:   rusraink
        70550 Stuttgart
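For anyone who wants to poke at this outside the full test suite: the
operation that hangs in the traces above is just a 1000-int MPI_Bcast on
MPI_COMM_WORLD between two ranks. A stripped-down reproducer along those
lines (illustrative only; the real test lives in mpi_test_suite's
tst_coll_bcast.c):

    /* Minimal sketch of the operation that hangs in the traces above:
     * a 1000-int MPI_Bcast on MPI_COMM_WORLD.  Compile with mpicc and run
     * with e.g. "mpirun -np 2 ./a.out" under a threaded build. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i, buf[1000];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                 /* root fills the buffer */
            for (i = 0; i < 1000; i++) {
                buf[i] = i;
            }
        }
        MPI_Bcast(buf, 1000, MPI_INT, 0, MPI_COMM_WORLD);

        printf("rank %d: buf[999] = %d\n", rank, buf[999]);
        MPI_Finalize();
        return 0;
    }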
Re: [O-MPI devel] posix threads
On Jan 18, 2006, at 2:12 PM, Luke Schierer wrote:

> If we compile Open MPI with support for POSIX threads (./configure
> --prefix=/usr/local --enable-mpi-threads --with-threads=posix), MPI hangs:
> even simple commands like
>
>   mpiexec --host localhost --np 1 hostname
>
> lock up and have to be killed. If we compile without --with-threads=posix,
> it works fine. We've tried this with both 1.0.1 and 1.0.2a3 on a CentOS
> release 4.2 install (CentOS is the same as Red Hat, built from the same
> sources, just without the trademarked name and the support), and on a
> Debian unstable install. Does anyone have any ideas on getting Open MPI to
> work with POSIX thread support?

It looks like there was an inadvertent double-lock in the startup sequence
for the standard I/O forwarding in our Subversion branch for the 1.0
releases. I'm not sure how the problem snuck into the release branch, but it
has been fixed. You can either get an SVN checkout of the 1.0 branch, get the
nightly tarball build tomorrow, wait for us to release Open MPI 1.0.2, or
apply the attached patch to the v1.0.1 tarball.

Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/

orte_lock.diff
Description: Binary data
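For context on the failure mode Brian describes: re-locking a plain
(non-recursive) mutex from the thread that already holds it blocks forever,
which is consistent with even a trivial mpiexec run hanging during startup.
A generic pthreads illustration (not the actual Open MPI I/O-forwarding code;
iof_lock and setup_io_forwarding are made-up names):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t iof_lock = PTHREAD_MUTEX_INITIALIZER;

    static void setup_io_forwarding(void)
    {
        /* Takes the lock again, not knowing the caller already holds it. */
        pthread_mutex_lock(&iof_lock);
        /* ... startup work would go here ... */
        pthread_mutex_unlock(&iof_lock);
    }

    int main(void)
    {
        pthread_mutex_lock(&iof_lock);   /* first lock: fine */
        setup_io_forwarding();           /* second lock on a non-recursive
                                          * mutex never succeeds, so the
                                          * process hangs right here */
        pthread_mutex_unlock(&iof_lock);
        puts("never reached");
        return 0;
    }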
Re: [O-MPI devel] while-loop around opal_condition_wait
On Thu, 19 Jan 2006, Rainer Keller wrote:

> And yes, when I run with the basic coll, we also hang ;-]

In the first case you're running:

> #8  0x407307a4 in ompi_coll_tuned_bcast_intra_basic_linear (buff=0x80c9c58,

which is actually the basic collective anyway; it just got there via a
different path (in this case the collective decision logic, since for 2 procs
a linear bcast is faster than a segmented one for small messages).

> mpirun -np 2 --mca coll basic ./mpi_test_suite -r FULL -c MPI_COMM_WORLD -d MPI_INT
>
> #8  0x4070e402 in mca_coll_basic_bcast_lin_intra (buff=0x80c4ca0, count=1000,
>
> So, this was my initial search for whether we may have races in the
> opal/ompi free lists.

G
Re: [O-MPI devel] debugging methods
Hello,

Apologies for the late response. I've been learning the BTL interface myself
recently, and was asked to come up with answers for you. Hopefully my
response is useful; let me know if you have more questions.

Andrew

Leslie Watter wrote:
> What I need: to know how and which functions are necessary to perform a
> minimalist implementation of a new BTL, registering it and making it
> usable.

First, two component functions are required - mca_component_open and
mca_component_close. Two structs need to be set up - one for the component,
and one for each module. The component struct is called
mca_btl_<name>_component_t, and extends mca_btl_base_component_t. The module
struct is called mca_btl_<name>_module_t, and extends mca_btl_base_module_t.
Only one instance of the component struct is created, while many module
structs may be created (usually one per network interface).

Inside these structs are several function pointers that must be filled in.
For the component, the btl_init and btl_progress fields are required. For
each module, the following functions are required:

  btl_add_procs
  btl_del_procs
  btl_register
  btl_finalize
  btl_alloc
  btl_free
  btl_prepare_src
  btl_send

The remaining three - btl_prepare_dst, btl_put, and btl_get - are optional
RDMA functions. Their presence is indicated by the btl_flags field in the
module struct. If either MCA_BTL_FLAGS_PUT or MCA_BTL_FLAGS_GET is set, the
respective put/get function must be set in the struct, as well as
prepare_dst. See btl.h and tcp/btl_tcp.h for examples.

> 1) Initialization
>    component open
>    component init
>      component create instances
>        btl tcp create
>      component create_listen
>        btl tcp set socket options
>    component exchange
>    btl tcp add procs
>      endpoint construct (executed once per endpoint)
>    btl tcp del procs
>    btl tcp register

del_procs should not be getting called here. Otherwise this looks correct.

> This is the sequence I have found executing the TCP BTL code. Please feel
> free to correct the placement of sections.

Other than the del_procs call, this looks correct. Tim Woodall had some
additional comments about typical send cases:

  From the perspective of the PML<->BTL interface, the PML will in general
  call:

  1) btl_alloc followed by btl_send for short control messages
  2) btl_prepare_src followed by btl_send for send/recv semantics
  3) btl_prepare_dst/btl_prepare_src/btl_put for RDMA semantics
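To make the required entry points easier to visualize, here is a small
compilable sketch. The two structs below are simplified placeholders standing
in for mca_btl_base_component_t and mca_btl_base_module_t (the real
definitions in btl.h carry version information and much richer function
signatures), and "foo" is a hypothetical BTL name; the point is only to show
which entry points a minimal BTL fills in:

    /* Schematic only: the structs below are simplified placeholders, NOT the
     * real Open MPI definitions from btl.h.  "foo" is a hypothetical BTL. */
    #include <stdio.h>

    typedef int (*btl_fn_t)(void);   /* placeholder signature for all entries */

    typedef struct {                 /* stand-in for the component struct */
        btl_fn_t btl_init;
        btl_fn_t btl_progress;
    } btl_component_sketch_t;

    typedef struct {                 /* stand-in for the module struct */
        unsigned int btl_flags;      /* would carry MCA_BTL_FLAGS_PUT/GET */
        btl_fn_t btl_add_procs, btl_del_procs, btl_register, btl_finalize;
        btl_fn_t btl_alloc, btl_free, btl_prepare_src, btl_send;
        btl_fn_t btl_prepare_dst, btl_put, btl_get;   /* optional RDMA trio */
    } btl_module_sketch_t;

    /* Trivial stub implementations for the hypothetical "foo" BTL. */
    static int foo_init(void)     { puts("foo btl_init"); return 0; }
    static int foo_progress(void) { return 0; }
    static int foo_stub(void)     { return 0; }

    static btl_component_sketch_t foo_component = {
        .btl_init     = foo_init,
        .btl_progress = foo_progress,
    };

    static btl_module_sketch_t foo_module = {
        .btl_flags       = 0,        /* no RDMA: prepare_dst/put/get left NULL */
        .btl_add_procs   = foo_stub, .btl_del_procs = foo_stub,
        .btl_register    = foo_stub, .btl_finalize  = foo_stub,
        .btl_alloc       = foo_stub, .btl_free      = foo_stub,
        .btl_prepare_src = foo_stub, .btl_send      = foo_stub,
    };

    int main(void)                   /* exercise the sketch */
    {
        foo_component.btl_init();
        foo_module.btl_send();
        return 0;
    }

In the real code the component's btl_init returns the set of module structs
(usually one per usable network interface), which is where the per-module
function pointers above get wired up.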
Re: [O-MPI devel] while-loop around opal_condition_wait
Rainer,

I found it. Please update to revision 8760. Now it looks like we are
completely multi-threaded ... at least on all the tests I ran :)

  george.

On Jan 19, 2006, at 12:23 PM, Rainer Keller wrote:

> Hi George,
>
> On Thursday 19 January 2006 17:22, George Bosilca wrote:
>> I was hoping my patch would solve the problem completely ... it looks
>> like that's not the case :( How exactly do you get the deadlock in the
>> mpi_test_suite? Which configure options? Only --enable-progress-threads?
>
> This happens with both --enable-progress-threads and an additional
> --enable-mpi-threads.

"Half of what I say is meaningless; but I say it so that the other half may
reach you"
  Kahlil Gibran