Re: [OMPI users] poor performance using the openib btl
You might try restarting the device drivers:

$ pdsh -g yourcluster service openibd restart

Josh

> On Jun 26, 2014, at 6:55 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>
> Just curious -- if you run standard ping-pong kinds of MPI benchmarks with
> the same kind of mpirun command line that you run your application, do you
> see the expected level of performance? (i.e., verification that you're using
> the low latency transport, etc.)
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24707.php
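One way to confirm that the ports came back after a driver restart is to summarize the ibstat output on each node. The sketch below is only illustrative: port_summary is a hypothetical helper, and it assumes ibstat prints "Port N:", "State:" and "Rate:" lines in that order, as the infiniband-diags version is understood to do.

```shell
#!/bin/sh
# port_summary: read ibstat-style text on stdin and print one line per port
# with its logical state and rate. Hypothetical helper, not part of OFED.
port_summary() {
    awk '
        /^[ \t]*Port [0-9]+:/ { port = $2; sub(":", "", port) }
        /^[ \t]*State:/       { state = $2 }
        /^[ \t]*Rate:/        { print "port " port ": " state ", rate " $2 }
    '
}

# Usage on a node (requires ibstat to be installed):
#   /usr/sbin/ibstat | port_summary
```

A port that is not Active, or a rate lower than the fabric is expected to deliver, would be worth chasing before looking at Open MPI itself.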
Re: [OMPI users] poor performance using the openib btl
Just curious -- if you run standard ping-pong kinds of MPI benchmarks with the same kind of mpirun command line that you run your application, do you see the expected level of performance? (i.e., verification that you're using the low latency transport, etc.)

On Jun 25, 2014, at 9:52 AM, Fischer, Greg A. <fisch...@westinghouse.com> wrote:

> I looked through my configure log, and that option is not enabled. Thanks for
> the suggestion.
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24702.php

--
Jeff Squyres
jsquy...@cisco.com
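For the comparison Jeff describes to be meaningful, the benchmark and the application should be launched with exactly the same flags. A minimal sketch, assuming the flags from the command line quoted in this thread; mpi_cmd is a hypothetical helper and ./osu_latency stands in for whatever ping-pong benchmark happens to be available:

```shell
#!/bin/sh
# Flags copied from the mpirun command line quoted in this thread.
MPI_FLAGS="-x LD_LIBRARY_PATH -mca btl openib,sm,self -mca btl_openib_verbose 1"

# mpi_cmd NP PROGRAM: print the full mpirun command for PROGRAM, so that the
# benchmark and the application are launched with identical flags.
mpi_cmd() {
    printf 'mpirun %s -np %s %s\n' "$MPI_FLAGS" "$1" "$2"
}

# Example (./osu_latency is an assumed benchmark binary):
#   eval "$(mpi_cmd 2 ./osu_latency)"
#   eval "$(mpi_cmd 31 "$CTF_EXEC")"
```

Printing the command first also makes it easy to paste the exact line into a reply to the list. For a ping-pong test the two ranks would need to land on different nodes, otherwise the sm BTL is exercised instead of openib.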
Re: [OMPI users] poor performance using the openib btl
I looked through my configure log, and that option is not enabled. Thanks for the suggestion.

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime Boissonneault
Sent: Wednesday, June 25, 2014 10:51 AM
To: Open MPI Users
Subject: Re: [OMPI users] poor performance using the openib btl

Hi,
I recovered the name of the option that caused problems for us. It is --enable-mpi-thread-multiple
Re: [OMPI users] poor performance using the openib btl
Hi,

I recovered the name of the option that caused problems for us. It is
--enable-mpi-thread-multiple

This option enables threading within OPAL, which was buggy (at least in the 1.6.x series). I don't know if it has been fixed in the 1.8 series.

I do not see your configure line in the attached file, so I cannot tell whether it was enabled or not.

Maxime

On 2014-06-25 10:46, Fischer, Greg A. wrote:
> Attached are the results of "grep thread" on my configure output. There
> appears to be some amount of threading, but is there anything I should look
> for in particular?
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2014/06/24700.php

--
Maxime Boissonneault
Computing Analyst - Calcul Québec, Université Laval
Ph.D. in physics
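The configure line Maxime asks about is normally recorded near the top of config.log in the build directory, on a line that starts with two spaces and a dollar sign. A small sketch for checking it, assuming that autoconf layout; configure_used_option is a hypothetical helper name:

```shell
#!/bin/sh
# configure_used_option LOG OPTION
# Succeeds if the first "  $ ./configure ..." line in LOG (an autoconf
# config.log) contains OPTION as a fixed string.
configure_used_option() {
    sed -n '/^  \$ /{p;q;}' "$1" | grep -q -F -e "$2"
}

# Example (the path to config.log is an assumption):
#   if configure_used_option ./config.log --enable-mpi-thread-multiple; then
#       echo "thread-multiple was requested at configure time"
#   fi
```

This is a plain substring match, so it only shows that the option text appears on the invocation line; a value such as "=no" appended to the option would still match and needs a look by eye.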
Re: [OMPI users] poor performance using the openib btl
Attached are the results of "grep thread" on my configure output. There appears to be some amount of threading, but is there anything I should look for in particular?

I see Mike Dubman's questions on the mailing list website, but his message didn't appear to make it to my inbox. The answers to his questions are:

[binford:fischega] $ rpm -qa | grep ofed
ofed-doc-1.5.4.1-0.11.5
ofed-kmp-default-1.5.4.1_3.0.76_0.11-0.11.5
ofed-1.5.4.1-0.11.5

Distro: SLES11 SP3

HCA:
[binf102:fischega] $ /usr/sbin/ibstat
CA 'mlx4_0'
CA type: MT26428

Command line (path and LD_LIBRARY_PATH are set correctly):
mpirun -x LD_LIBRARY_PATH -mca btl openib,sm,self -mca btl_openib_verbose 1 -np 31 $CTF_EXEC

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime Boissonneault
Sent: Tuesday, June 24, 2014 6:41 PM
To: Open MPI Users
Subject: Re: [OMPI users] poor performance using the openib btl

What are your threading options for OpenMPI (when it was built)?

I have previously seen the OpenIB BTL lock up completely when some level of threading is enabled.

Attached "grep thread" output:

checking for a thread-safe mkdir -p... /bin/mkdir -p
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking if C compiler and POSIX threads work as is... no
checking if C++ compiler and POSIX threads work as is... no
checking if Fortran compiler and POSIX threads work as is... no
checking if C compiler and POSIX threads work with -Kthread... no
checking if C compiler and POSIX threads work with -kthread... no
checking if C compiler and POSIX threads work with -pthread... yes
checking if C++ compiler and POSIX threads work with -Kthread... no
checking if C++ compiler and POSIX threads work with -kthread... no
checking if C++ compiler and POSIX threads work with -pthread... yes
checking if Fortran compiler and POSIX threads work with -Kthread... no
checking if Fortran compiler and POSIX threads work with -kthread... no
checking if Fortran compiler and POSIX threads work with -pthread... yes
checking for pthread_mutexattr_setpshared... yes
checking for pthread_condattr_setpshared... yes
checking for working POSIX threads package... yes
checking for type of thread support... posix
checking if threads have different pids (pthreads on linux)... no
checking for pthread_t... yes
checking pthread_np.h usability... no
checking pthread_np.h presence... no
checking for pthread_np.h... no
checking whether pthread_setaffinity_np is declared... yes
checking whether pthread_getaffinity_np is declared... yes
checking for library containing pthread_getthrds_np... no
checking for pthread_mutex_lock... yes
checking libevent configuration args... --disable-dns --disable-http --disable-rpc --disable-openssl --enable-thread-support --disable-evport
configure: running /bin/sh '../../../../../../openmpi-1.8.1/opal/mca/event/libevent2021/libevent/configure' --disable-dns --disable-http --disable-rpc --disable-openssl --enable-thread-support --disable-evport '--prefix=/casl/vera_ib/gcc-4.8.3/toolset/openmpi-1.8.1' --cache-file=/dev/null --srcdir=../../../../../../openmpi-1.8.1/opal/mca/event/libevent2021/libevent --disable-option-checking
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for the pthreads library -lpthreads... no
checking whether pthreads work without any flags... yes
checking for joinable pthread attribute... PTHREAD_CREATE_JOINABLE
checking if more special flags are required for pthreads... no
checking size of pthread_t... 8
config.status: creating libevent_pthreads.pc
checking for thread support (needed for rdmacm/udcm)... posix
configure: running /bin/sh '../../../../../../openmpi-1.8
Re: [OMPI users] poor performance using the openib btl
Hi,

What OFED/MOFED are you using? What HCA, distro, and command line?

M

On Wed, Jun 25, 2014 at 1:40 AM, Maxime Boissonneault <maxime.boissonnea...@calculquebec.ca> wrote:

> What are your threading options for OpenMPI (when it was built)?
>
> I have previously seen the OpenIB BTL lock up completely when some level of
> threading is enabled.
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24698.php
Re: [OMPI users] poor performance using the openib btl
What are your threading options for OpenMPI (when it was built)?

I have previously seen the OpenIB BTL lock up completely when some level of threading is enabled.

Maxime Boissonneault

On 2014-06-24 18:18, Fischer, Greg A. wrote:
> However, now that I've had some time to test the applications, I'm seeing
> abysmal performance over the openib layer. Applications run with the tcp btl
> execute about 10x faster than with the openib btl. Clearly something still
> isn't quite right.
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2014/06/24697.php

--
Maxime Boissonneault
Computing Analyst - Calcul Québec, Université Laval
Ph.D. in physics
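For an installed build, ompi_info can answer the threading question without the configure log. The sketch below assumes its output contains a "Thread support" line that mentions "MPI_THREAD_MULTIPLE: yes" or "MPI_THREAD_MULTIPLE: no"; treat that exact wording as an assumption and check it against your own installation. thread_multiple_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# thread_multiple_enabled: read ompi_info-style text on stdin and succeed
# only if a "Thread support" line reports MPI_THREAD_MULTIPLE: yes.
# The line format is an assumption about ompi_info output.
thread_multiple_enabled() {
    grep -i 'thread support' | grep -q 'MPI_THREAD_MULTIPLE: yes'
}

# Usage (requires ompi_info from the Open MPI installation in question):
#   if ompi_info | thread_multiple_enabled; then
#       echo "this build reports MPI_THREAD_MULTIPLE support"
#   fi
```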
[OMPI users] poor performance using the openib btl
Hello openmpi-users,

A few weeks ago, I posted to the list about difficulties I was having getting openib to work with Torque (see "openib segfaults with Torque", June 6, 2014). The issues were related to Torque imposing restrictive limits on locked memory, and have since been resolved.

However, now that I've had some time to test the applications, I'm seeing abysmal performance over the openib layer. Applications run with the tcp btl execute about 10x faster than with the openib btl. Clearly something still isn't quite right.

I tried running with "-mca btl_openib_verbose 1", but didn't see anything resembling a smoking gun. How should I go about determining the source of the problem? (This uses the same OpenMPI Version 1.8.1 / SLES11 SP3 / GCC 4.8.3 setup discussed previously.)

Thanks,
Greg
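A repeatable way to put numbers on the "about 10x" difference is to time the same job under each BTL selection. This is a rough sketch only: elapsed is a hypothetical helper, it relies on "date +%s" (available on Linux, though not guaranteed by POSIX), and the tcp flag list in the usage comment is an assumption modelled on the openib command line from this thread.

```shell
#!/bin/sh
# elapsed CMD...: run CMD, discard its output, and print the wall-clock
# time it took in whole seconds.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Assumed usage, with the application path in $CTF_EXEC as in this thread:
#   t_tcp=$(elapsed mpirun -mca btl tcp,sm,self -np 31 "$CTF_EXEC")
#   t_ib=$(elapsed mpirun -mca btl openib,sm,self -np 31 "$CTF_EXEC")
#   echo "tcp: ${t_tcp}s  openib: ${t_ib}s"
```

Running both cases inside the same Torque job keeps the node set and the locked-memory limits identical between the two measurements.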