Re: [OMPI users] quick patch to buildrpm.sh to enable building on SuSE
Committed -- thanks!

On Oct 23, 2006, at 7:14 PM, Joe Landman wrote:

--- buildrpm.sh	2006-10-23 17:59:33.729764603 -0400
+++ buildrpm-fixed.sh	2006-10-23 17:58:33.145635240 -0400
@@ -11,6 +11,7 @@
 #
 prefix="/opt/openmpi"
+#/1.1.2/pgi"
 specfile="openmpi.spec"
 rpmbuild_options="--define 'mflags -j4'"
 configure_options=

@@ -22,10 +23,10 @@
 # Some distro's will attempt to force using bizarre, custom compiler
 # names (e.g., i386-redhat-linux-gnu-gcc). So hardwire them to use
 # "normal" names.
-#export CC=gcc
-#export CXX=g++
-#export F77=f77
-#export FC=
+#export CC=pgcc
+#export CXX=pgCC
+#export F77=pgf77
+#export FC=pgf90

 # Note that this script can build one or all of the following RPMs:
 # SRPM, all-in-one, multiple.
@@ -35,7 +36,7 @@
 # If you want to build the "all in one RPM", put "yes" here
 build_single=no
 # If you want to build the "multiple" RPMs, put "yes" here
-build_multiple=no
+build_multiple=yes
 ##
 ###
 # You should not need to change anything below this line
@@ -109,6 +110,9 @@
 elif test -d /usr/src/RPM; then
     need_root=1
     rpmtopdir="/usr/src/RPM"
+elif test -d /usr/src/packages; then
+    need_root=1
+    rpmtopdir="/usr/src/packages"
 else
     need_root=1
     rpmtopdir="/usr/src/redhat"

--
Joe Landman
landman |at| scalableinformatics |dot| com

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
[OMPI users] quick patch to buildrpm.sh to enable building on SuSE
--- buildrpm.sh	2006-10-23 17:59:33.729764603 -0400
+++ buildrpm-fixed.sh	2006-10-23 17:58:33.145635240 -0400
@@ -11,6 +11,7 @@
 #
 prefix="/opt/openmpi"
+#/1.1.2/pgi"
 specfile="openmpi.spec"
 rpmbuild_options="--define 'mflags -j4'"
 configure_options=

@@ -22,10 +23,10 @@
 # Some distro's will attempt to force using bizarre, custom compiler
 # names (e.g., i386-redhat-linux-gnu-gcc). So hardwire them to use
 # "normal" names.
-#export CC=gcc
-#export CXX=g++
-#export F77=f77
-#export FC=
+#export CC=pgcc
+#export CXX=pgCC
+#export F77=pgf77
+#export FC=pgf90

 # Note that this script can build one or all of the following RPMs:
 # SRPM, all-in-one, multiple.
@@ -35,7 +36,7 @@
 # If you want to build the "all in one RPM", put "yes" here
 build_single=no
 # If you want to build the "multiple" RPMs, put "yes" here
-build_multiple=no
+build_multiple=yes
 ##
 ###
 # You should not need to change anything below this line
@@ -109,6 +110,9 @@
 elif test -d /usr/src/RPM; then
     need_root=1
     rpmtopdir="/usr/src/RPM"
+elif test -d /usr/src/packages; then
+    need_root=1
+    rpmtopdir="/usr/src/packages"
 else
     need_root=1
     rpmtopdir="/usr/src/redhat"

--
Joe Landman
landman |at| scalableinformatics |dot| com
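To use the patch, one would save the diff above to a file at the top of the Open MPI source tree and apply it before running buildrpm.sh (the patch file name here is illustrative):

patch -p0 < buildrpm-suse.patch

With that applied, the script also probes /usr/src/packages (the SuSE RPM tree) in addition to the /usr/src/RPM and /usr/src/redhat locations.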
Re: [OMPI users] dual Gigabit ethernet support
On 10/23/06, Tony Ladd wrote:

> A couple of comments regarding issues raised by this thread.
>
> 1) In my opinion Netpipe is not such a great network benchmarking tool
> for HPC applications. It measures timings based on the completion of
> the send call on the transmitter, not the completion of the receive.
> Thus, if there is a delay in copying the send buffer across the net,
> it will report a misleading timing compared with the wall-clock time.
> This is particularly problematic with multiple pairs of edge
> exchanges, which can oversubscribe most GigE switches. Here the
> Netpipe timings can be off by orders of magnitude compared with the
> wall clock. The good thing about writing your own code is that you
> know what it has done (of course no one else knows, which can be a
> problem). But it seems many people are unaware of the timing issue in
> Netpipe.

Yes! I've noticed that. I am now using the Intel MPI Benchmark; the PingPong, PingPing, and SendRecv test cases seem to be more realistic. Does anyone have any comments about this test suite?

> 2) It's worth distinguishing between Ethernet and TCP/IP. With
> MPIGAMMA, the Intel Pro 1000 NIC has a latency of 12 microsecs
> including the switch, and a duplex bandwidth of 220 MBytes/sec. With
> the Extreme Networks X450a-48t switch we can sustain 220 MBytes/sec
> over 48 ports at once. This is not IB performance, but it seems
> sufficient to scale a number of applications to the 100-cpu level, and
> perhaps beyond.

GAMMA seems to be great work, judging by some of the reports on its web site. However, I have not tried it yet, and I am not sure I will, mainly because it only supports MPICH-1. Does anyone have a rough idea of how much work it would be to make it available for Open MPI? It seems like a very interesting student project...

--
Lisandro Dalcín
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
[OMPI users] dual Gigabit ethernet support
A couple of comments regarding issues raised by this thread.

1) In my opinion Netpipe is not such a great network benchmarking tool for HPC applications. It measures timings based on the completion of the send call on the transmitter, not the completion of the receive. Thus, if there is a delay in copying the send buffer across the net, it will report a misleading timing compared with the wall-clock time. This is particularly problematic with multiple pairs of edge exchanges, which can oversubscribe most GigE switches. Here the Netpipe timings can be off by orders of magnitude compared with the wall clock. The good thing about writing your own code is that you know what it has done (of course no one else knows, which can be a problem). But it seems many people are unaware of the timing issue in Netpipe.

2) It's worth distinguishing between Ethernet and TCP/IP. With MPIGAMMA, the Intel Pro 1000 NIC has a latency of 12 microsecs including the switch, and a duplex bandwidth of 220 MBytes/sec. With the Extreme Networks X450a-48t switch we can sustain 220 MBytes/sec over 48 ports at once. This is not IB performance, but it seems sufficient to scale a number of applications to the 100-cpu level, and perhaps beyond.

Tony

---
Tony Ladd
Professor, Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005
Tel: 352-392-6509
FAX: 352-392-9513
Email: tl...@che.ufl.edu
Web: http://ladd.che.ufl.edu
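For reference, here is a minimal sketch (not Tony's code) of a benchmark timed the way he advocates: the clock stops only after a receive completes, so each iteration measures a full round trip. The message size, repetition count, and two-rank layout are all illustrative.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int n = 1 << 20;   /* 1 MiB message, illustrative */
    int rank, i, reps = 100;
    char *buf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* assumes exactly 2 ranks */
    buf = malloc(n);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            /* the timed loop includes the matching receive,
             * not just the completion of the send call */
            MPI_Send(buf, n, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, n, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, n, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, n, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)   /* 2*reps messages of n bytes were delivered */
        printf("%.1f MB/s\n", 2.0 * reps * n / ((t1 - t0) * 1e6));

    free(buf);
    MPI_Finalize();
    return 0;
}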
Re: [OMPI users] dual Gigabit ethernet support
We manage to get 900+ Mbps on a Broadcom 570x chip. We run jumbo frames and use a Force10 switch. This is also with openmpi-1.0.2 (have not tried rebuilding netpipe with 1.1.2). We also see great results with netpipe (MPI) on InfiniBand. Great work so far, guys.

120: 6291459 bytes 3 times --> 930.47 Mbps in 51586.67 usec
121: 8388605 bytes 3 times --> 932.60 Mbps in 68625.17 usec
122: 8388608 bytes 3 times --> 932.65 Mbps in 68621.83 usec
123: 8388611 bytes 3 times --> 932.59 Mbps in 68625.85 usec

Brock Palen
Center for Advanced Computing
bro...@umich.edu
(734) 936-1985

On Oct 23, 2006, at 1:57 PM, George Bosilca wrote:

> I don't know what your bandwidth tester looks like, but 140 MB/s is way
> too much for a single GigE card, unless it's a bidirectional bandwidth.
> Usually, on a new-generation GigE card (Broadcom Corporation NetXtreme
> BCM5751 Gigabit Ethernet PCI Express) with an AMD processor (AMD
> Athlon(tm) 64 Processor 3500+), I only manage to get around 800 Mb/s
> out of a point-to-point transfer. With an external card not on the
> PCI Express bus I barely get 600 Mb/s... Why don't you use a real
> network performance tool such as Netpipe? At least it will ensure that
> the bandwidth is the one you expect.
>
>   Thanks,
>     george.
>
> On Oct 23, 2006, at 4:56 AM, Jayanta Roy wrote:
>
>> Hi,
>>
>> Some time ago I posted doubts about fully using dual gigabit support.
>> I get a ~140 MB/s full-duplex transfer rate in each of the following
>> runs:
>>
>> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>>
>> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>>
>> How can I combine these two ports, or use a proper routing table in
>> place of the host file? I am using the openmpi-1.1 version.
>>
>> -Jayanta
Re: [OMPI users] dual Gigabit ethernet support
What I think is happening is this: the initial transfer rate you are seeing is the burst rate; after averaging over a long run, your sustained transfer rate emerges. Like George said, you should use a proven tool to measure your bandwidth. We use netperf, a freeware tool from HP.

That said, Ethernet technology is not a good candidate for HPC (one reason people don't use it in backplanes, despite the low cost). Do the math yourself: there is a 54-byte overhead (14 B Ethernet + 20 B IP + 20 B TCP) on every packet sent over a socket. That is why protocols like uDAPL over InfiniBand are gaining in popularity.

Durga

On 10/23/06, Jayanta Roy wrote:

Hi,

I have tried lamboot with a host file where odd-even nodes talk among themselves using eth0 and talk across pairs using eth1. My transfer runs at 230 MB/s at the start, but after a few transfers the rate falls to ~130 MB/s, and after a long run it finally comes down to ~54 MB/s. Why does the network slow down over time like this?

Regards,
Jayanta

On Mon, 23 Oct 2006, Durga Choudhury wrote:

> Did you try channel bonding? If your OS is Linux, there are plenty of
> howtos on the internet that will tell you how to do it.
>
> However, your CPU might be the bottleneck in this case. How much CPU
> horsepower is available at 140 MB/s?
>
> If the CPU *is* the bottleneck, changing your network driver (e.g. from
> interrupt-based to poll-based packet transfer) might help. If you are
> unfamiliar with writing network drivers for your OS, this may not be a
> trivial task, though.
>
> Oh, and like I pointed out last time, if all of the above seem OK, try
> putting your second link to a separate PC and see if you can get twice
> the throughput. If so, then the ECMP implementation of your IP stack is
> what is causing the problem. This is the hardest one to fix. You could
> rewrite a few routines in IPv4 processing and recompile the kernel, if
> you are familiar with kernel building and your OS is Linux.
>
> On 10/23/06, Jayanta Roy wrote:
>>
>> Hi,
>>
>> Some time ago I posted doubts about fully using dual gigabit support.
>> I get a ~140 MB/s full-duplex transfer rate in each of the following
>> runs:
>>
>> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>>
>> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>>
>> How can I combine these two ports, or use a proper routing table in
>> place of the host file? I am using the openmpi-1.1 version.
>>
>> -Jayanta
>
> --
> Devil wanted omnipresence;
> He therefore created communists.

Jayanta Roy
National Centre for Radio Astrophysics | Phone : +91-20-25697107
Tata Institute of Fundamental Research | Fax   : +91-20-25692149
Pune University Campus, Pune 411 007   | e-mail : j...@ncra.tifr.res.in
India

--
Devil wanted omnipresence;
He therefore created communists.
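Durga's overhead numbers can be pushed one step further with a back-of-the-envelope check (a sketch; the 4 B FCS and 20 B preamble/inter-frame-gap figures are standard Ethernet constants added here, not from his message):

#include <stdio.h>

int main(void)
{
    /* 14 B Ethernet + 20 B IP + 20 B TCP headers, per the message above,
     * plus 4 B FCS and 20 B preamble/inter-frame gap on the wire. */
    const double payload = 1500.0 - 20.0 - 20.0;        /* 1460 B TCP payload */
    const double on_wire = 1500.0 + 14.0 + 4.0 + 20.0;  /* 1538 B per frame   */
    printf("efficiency %.1f%% => ~%.0f Mbit/s goodput on GigE\n",
           100.0 * payload / on_wire, 1000.0 * payload / on_wire);
    return 0;
}

This prints roughly 94.9% and ~949 Mbit/s: even a perfect TCP stream tops out well below the 1000 Mbit/s line rate.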
Re: [OMPI users] dual Gigabit ethernet support
Hi George,

Yes, it is duplex bandwidth. The benchmark is a simple timing call around an MPI_Alltoall call; the network traffic is then estimated from the send buffer size to get the rate.

Regards,
Jayanta

On Mon, 23 Oct 2006, George Bosilca wrote:

> I don't know what your bandwidth tester looks like, but 140 MB/s is way
> too much for a single GigE card, unless it's a bidirectional bandwidth.
> Usually, on a new-generation GigE card (Broadcom Corporation NetXtreme
> BCM5751 Gigabit Ethernet PCI Express) with an AMD processor (AMD
> Athlon(tm) 64 Processor 3500+), I only manage to get around 800 Mb/s
> out of a point-to-point transfer. With an external card not on the
> PCI Express bus I barely get 600 Mb/s... Why don't you use a real
> network performance tool such as Netpipe? At least it will ensure that
> the bandwidth is the one you expect.
>
>   Thanks,
>     george.
>
> On Oct 23, 2006, at 4:56 AM, Jayanta Roy wrote:
>
>> Hi,
>>
>> Some time ago I posted doubts about fully using dual gigabit support.
>> I get a ~140 MB/s full-duplex transfer rate in each of the following
>> runs:
>>
>> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>>
>> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>>
>> How can I combine these two ports, or use a proper routing table in
>> place of the host file? I am using the openmpi-1.1 version.
>>
>> -Jayanta

Jayanta Roy
National Centre for Radio Astrophysics | Phone : +91-20-25697107
Tata Institute of Fundamental Research | Fax   : +91-20-25692149
Pune University Campus, Pune 411 007   | e-mail : j...@ncra.tifr.res.in
India
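Jayanta's code was not posted, but a sketch of such a benchmark might look like the following (buffer size and repetition count are illustrative). Note the accounting: the full send buffer is counted on every rank per iteration, which is also how the reported figure can exceed a single link's wire speed.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int per_peer = 1 << 20;   /* bytes sent to each rank, illustrative */
    int np, rank, i, reps = 50;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *sbuf = malloc((size_t)np * per_peer);
    char *rbuf = malloc((size_t)np * per_peer);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++)
        MPI_Alltoall(sbuf, per_peer, MPI_BYTE,
                     rbuf, per_peer, MPI_BYTE, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    /* rate estimated from the send buffer size, as described above */
    if (rank == 0)
        printf("%.1f MB/s per rank\n",
               (double)np * per_peer * reps / ((t1 - t0) * 1e6));

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}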
Re: [OMPI users] dual Gigabit ethernet support
I don't know what your bandwidth tester looks like, but 140 MB/s is way too much for a single GigE card, unless it's a bidirectional bandwidth. Usually, on a new-generation GigE card (Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express) with an AMD processor (AMD Athlon(tm) 64 Processor 3500+), I only manage to get around 800 Mb/s out of a point-to-point transfer. With an external card not on the PCI Express bus I barely get 600 Mb/s... Why don't you use a real network performance tool such as Netpipe? At least it will ensure that the bandwidth is the one you expect.

  Thanks,
    george.

On Oct 23, 2006, at 4:56 AM, Jayanta Roy wrote:

> Hi,
>
> Some time ago I posted doubts about fully using dual gigabit support.
> I get a ~140 MB/s full-duplex transfer rate in each of the following
> runs:
>
> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>
> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>
> How can I combine these two ports, or use a proper routing table in
> place of the host file? I am using the openmpi-1.1 version.
>
> -Jayanta
Re: [OMPI users] dual Gigabit ethernet support
Hello,

On 10/23/06, Jayanta Roy wrote:
> Hi,
>
> Some time ago I posted doubts about fully using dual gigabit support.
> I get a ~140 MB/s full-duplex transfer rate in each of the following
> runs.

That's impressive, since it's _more_ than the theoretical limit of 1 Gb Ethernet: 140 MB/s = 140 x 8 Mbit/s = 1120 Mbit/s > 1 Gbit/s...

> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>
> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>
> How can I combine these two ports, or use a proper routing table in
> place of the host file? I am using the openmpi-1.1 version.
>
> -Jayanta

--
Miguel Sousa Filipe
Re: [OMPI users] dual Gigabit ethernet support
Did you try channel bonding? If your OS is Linux, there are plenty of howtos on the internet that will tell you how to do it. A sketch of what that looks like follows below.

However, your CPU might be the bottleneck in this case. How much CPU horsepower is available at 140 MB/s?

If the CPU *is* the bottleneck, changing your network driver (e.g. from interrupt-based to poll-based packet transfer) might help. If you are unfamiliar with writing network drivers for your OS, this may not be a trivial task, though.

Oh, and like I pointed out last time, if all of the above seem OK, try putting your second link to a separate PC and see if you can get twice the throughput. If so, then the ECMP implementation of your IP stack is what is causing the problem. This is the hardest one to fix. You could rewrite a few routines in IPv4 processing and recompile the kernel, if you are familiar with kernel building and your OS is Linux.

On 10/23/06, Jayanta Roy wrote:
> Hi,
>
> Some time ago I posted doubts about fully using dual gigabit support.
> I get a ~140 MB/s full-duplex transfer rate in each of the following
> runs:
>
> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>
> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>
> How can I combine these two ports, or use a proper routing table in
> place of the host file? I am using the openmpi-1.1 version.
>
> -Jayanta

--
Devil wanted omnipresence;
He therefore created communists.
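For the record, a round-robin bonding setup on a 2.6-era Linux kernel looks roughly like this (the address, interface names, and module options are illustrative, and distros package this differently, so treat it as a sketch rather than a recipe):

modprobe bonding mode=balance-rr miimon=100
ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1

The balance-rr mode stripes packets across both slave interfaces, which is the mode most likely to speed up a single MPI stream, at the cost of possible packet reordering.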
[OMPI users] dual Gigabit ethernet support
Hi,

Some time ago I posted doubts about fully using dual gigabit support. I get a ~140 MB/s full-duplex transfer rate in each of the following runs:

mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out

mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out

How can I combine these two ports, or use a proper routing table in place of the host file? I am using the openmpi-1.1 version.

-Jayanta
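One avenue worth trying (not raised in the thread itself): btl_tcp_if_include accepts a comma-separated interface list, so both NICs can be offered to the TCP BTL in a single run, e.g.

mpirun --mca btl_tcp_if_include eth0,eth1 -n 4 -bynode -hostfile host a.out

Open MPI can then stripe large messages across both interfaces, provided each interface on one node can route to its counterpart on the other nodes.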