Re: [Veritas-bu] Some info on my experiences with 10GbE
I don't think it's the T2000s that are so special; I think it may have more to do with Sun's cards. I recently did some 10G testing with a Sun X4100 with two Opteron 2216s (dual-core, 2.4GHz) running RHEL 4, and found that it drastically outperformed the 4-core 1GHz T2000s. I swapped one of the cards from a T2000 into the X4100 and tested sending data from the X4100->T2000, and then from the T2000->X4100. When sending data from the X4100->T2000, the throughput was the same as T2000->T2000, but when sending data from the T2000->X4100, the maximum throughput achieved was 9.8Gb/s with 16 threads. An interesting observation is that all 4 cores on the X4100 were 50% idle when running at this rate.

Also worth mentioning: I did some tests between a couple of 8-core 1.2GHz T2000s, and I got them to 9.3Gb/s with 16 threads, so the 8-core 1.2GHz T2000s definitely outperform the 4-core 1GHz ones. Big surprise. :)

Similar to the previous poster, iperf seemed to perform best with a 512k buffer and 512k TCP window size, and I also saw some large fluctuations in total throughput results. I'm guessing this is due to where the Solaris scheduler is running threads, since with mpstat I'd occasionally see 1-2 cores (8 CPUs in mpstat) at 99-100% sys while the other 2 cores sat 100% idle. If the scheduler spread the load across the cores more evenly, then perhaps more throughput could be achieved. To help smooth the results, all the numbers I've reported on the list are the average of 3 separate 5-minute iperf runs.

Btw, I'm also finding that single-threaded performance is poor with these cards on the 1GHz T2000s, though with the 1.2GHz T2000s or the X4100 single-threaded performance was slightly better (interestingly, the 8-core T2000s consistently stumbled with 3 threads). All figures in Mbit/sec:

Threads   X4100->T2000   T2000->X4100   T2000->T2000 (8-core/1.2GHz)
      1        944            2143           1686
      2       1867            3988           1937
      3       2558            4772           1897
      4       3146            5096           3704
      6       4368            8071           5934
      8       5468            8282           6908
     16       6472            9842           9311
     32       6513            9893           9283

-devon

------------
Date: Mon, 7 Jan 2008 15:33:08 -0500
From: "Curtis Preston" <[EMAIL PROTECTED]>
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE
To:
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

This is another pro-T2000 report. What makes them special?

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of pancamo
Sent: Friday, January 04, 2008 11:43 PM
To: VERITAS-BU@mailman.eng.auburn.edu
Subject: [Veritas-bu] Some info on my experiences with 10GbE

I just started testing 2 T2000s with dual 10Gbps Sun NICs directly connected to each other...

I'm somewhat pissed that I'm only able to get about 658Mbps from one thread on the 10Gbps NIC, while I'm able to get 938Mbps using the onboard 1Gbps NIC when using the iperf default values. Which means in some cases the 10Gbit NIC is actually slower than the onboard 1Gbit NIC.

However, I was able to get 7.2Gbps using 10 threads. Here are some of my max results with different thread (-P) values:

./iperf -c 192.168.1.2 -f m -w 512K -l 512 -P x

TCP Win   Buffer   Threads   Gbps
    512      512         1    1.4
    512      512         2    2.5
    512      512         4    4.3
    512      512         6    6.4
    512      512         8    6.1
    512      512        10    7.2
    512      512        15    4.6
    512      512        18    3.6
    512      512        20    3.0
    512      512        30    2.5
    512      512        60    2.3

Another annoying deal was that the results from iperf were not the same each time I ran the test. The results were as much as 3Gbps different from run to run.
The results should be the same for each run.

My /etc/system settings (settings that I added as suggested by Sun):

set ddi_msix_alloc_limit=8
set ip:ip_soft_rings_cnt=8
set ip:ip_squeue_fanout=1
set ip:tcp_squeue_wput=1
set ip:ip_squeue_bind=0
set ipge:ipge_tx_syncq=1
set ipge:ipge_bcopy_thresh=512
set ipge:ipge_dvma_thresh=1
set consistent_coloring=2
set pcie:pcie_aer_ce_mask=0x1

Here are the NDD settings that I found here: http://www.sun.com/servers/coolthreads/tnb/parameters.jsp#2

ndd -set /dev/tcp tcp_conn_req_max_q 16384
ndd -set /dev/tcp tcp_conn_req_max_q0 16384
ndd -set /dev/tcp tcp_max_buf 10485760
ndd -set /dev/tcp tcp_cwnd_max 10485760
ndd -set /dev/tcp tcp_xmit_hiwat 131072
ndd -set /dev/tcp tcp_recv_hiwat 131072
ndd -set /dev/nxge0 accept_jumbo 1

I also found information here: http://blogs.sun.com/sunay/entry/the_solaris_networking_the_magic
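Devon's practice above of averaging three separate 5-minute runs is easy to script. A rough sketch, assuming the classic iperf2 binary in the current directory, a server already running with matching -w/-l options, and the client address from pancamo's example (the thread count and awk parsing are likewise assumptions):

    #!/bin/sh
    # Hypothetical wrapper: run iperf three times for 5 minutes each and
    # report the mean aggregate throughput. With -f m and -P > 1, iperf
    # prints a final [SUM] line whose second-to-last field is Mbits/sec.
    SERVER=192.168.1.2
    TOTAL=0
    for RUN in 1 2 3; do
        MBPS=`./iperf -c $SERVER -f m -w 512K -l 512K -t 300 -P 16 | \
            awk '/\[SUM\]/ { v = $(NF-1) } END { print v }'`
        echo "run $RUN: $MBPS Mbit/sec"
        TOTAL=`echo "$TOTAL + $MBPS" | bc`
    done
    echo "average of 3 runs: `echo "scale=1; $TOTAL / 3" | bc` Mbit/sec"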
Re: [Veritas-bu] Some info on my experiences with 10GbE
[EMAIL PROTECTED] said:
> This is another pro-T2000 report. What makes them special?

I'll chime in with additional comments along the lines of the other responder, who mentioned the large number of cores and threads, etc. Some of the kernel tunables mentioned have to do with "interrupt fanout" support for the Sun 10GbE NIC -- so the NIC's interrupts can be handled by a number of CPU cores in parallel, which is pretty important when each core isn't all that speedy (in the T2000, anyway). The T2000's onboard NICs also take advantage of interrupt fanout, and it's making its way into Sun's other hardware platforms (both SPARC and x86) as well.

To go along with the driver support, it's recommended that you use the Solaris "psradm" and "pooladm" commands to arrange for device interrupts to be limited to only 1 thread per physical core on the T2000. This way non-interrupt threads won't compete with the device-based threads. I'm not an expert with this, but I can report that following the recommendations makes a marked difference in our overall T2000 system throughput, even with only a single GbE interface. Here are a couple of references:

http://www.solarisinternals.com/wiki/index.php/Networks#Tuning_Sun_Dual_10GbE_on_T1000.2FT2000
http://serversidetechnologies.blogspot.com/

As a clue to my vintage (:-), I'll say that the T2000 reminds me of the Sequent Balance and Symmetry computer systems. These had up to 30 little CPUs in them (NS32032s or Intel 80386s), as "fast" as 16 or 20MHz. The systems were not real speedy on a per-Unix-process basis, but they had a huge amount of bandwidth -- it took quite an effort to bog one down. My recollection of those days is a bit rusty, but I remember hearing that much of Sequent's expertise with parallelizing the Unix kernel ended up getting folded into Solaris/SVR4 back when AT&T and Sun made their grand SVR4/BSD reunion deal.

Regards, Marion
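On a 4-core (16-thread) T2000, the one-interruptible-thread-per-core arrangement Marion describes works out to roughly the following, based on the psradm invocation Devon posts later in this thread (CPU IDs vary per machine; check psrinfo first):

    # Threads 0, 4, 8 and 12 are thread 0 of each physical core; marking
    # the remaining threads no-intr leaves exactly one interrupt-capable
    # thread per core, so device interrupts can't crowd out other work.
    psradm -i 1-3 5-7 9-11 13-15
    psrinfo    # threads taken out of interrupt service now show "no-intr"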
Re: [Veritas-bu] Some info on my experiences with 10GbE
This is another pro-T2000 report. What makes them special?

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of pancamo
Sent: Friday, January 04, 2008 11:43 PM
To: VERITAS-BU@mailman.eng.auburn.edu
Subject: [Veritas-bu] Some info on my experiences with 10GbE

I just started testing 2 T2000s with dual 10Gbps Sun NICs directly connected to each other...

I'm somewhat pissed that I'm only able to get about 658Mbps from one thread on the 10Gbps NIC, while I'm able to get 938Mbps using the onboard 1Gbps NIC when using the iperf default values. Which means in some cases the 10Gbit NIC is actually slower than the onboard 1Gbit NIC.

However, I was able to get 7.2Gbps using 10 threads. Here are some of my max results with different thread (-P) values:

./iperf -c 192.168.1.2 -f m -w 512K -l 512 -P x

TCP Win   Buffer   Threads   Gbps
    512      512         1    1.4
    512      512         2    2.5
    512      512         4    4.3
    512      512         6    6.4
    512      512         8    6.1
    512      512        10    7.2
    512      512        15    4.6
    512      512        18    3.6
    512      512        20    3.0
    512      512        30    2.5
    512      512        60    2.3

Another annoying deal was that the results from iperf were not the same each time I ran the test. The results were as much as 3Gbps different from run to run. The results should be the same for each run.

My /etc/system settings (settings that I added as suggested by Sun):

set ddi_msix_alloc_limit=8
set ip:ip_soft_rings_cnt=8
set ip:ip_squeue_fanout=1
set ip:tcp_squeue_wput=1
set ip:ip_squeue_bind=0
set ipge:ipge_tx_syncq=1
set ipge:ipge_bcopy_thresh=512
set ipge:ipge_dvma_thresh=1
set consistent_coloring=2
set pcie:pcie_aer_ce_mask=0x1

Here are the NDD settings that I found here: http://www.sun.com/servers/coolthreads/tnb/parameters.jsp#2

ndd -set /dev/tcp tcp_conn_req_max_q 16384
ndd -set /dev/tcp tcp_conn_req_max_q0 16384
ndd -set /dev/tcp tcp_max_buf 10485760
ndd -set /dev/tcp tcp_cwnd_max 10485760
ndd -set /dev/tcp tcp_xmit_hiwat 131072
ndd -set /dev/tcp tcp_recv_hiwat 131072
ndd -set /dev/nxge0 accept_jumbo 1

I also found information here: http://blogs.sun.com/sunay/entry/the_solaris_networking_the_magic

cpreston wrote:
> 7500 Mbit/sec! Those are the most impressive numbers I've ever seen by FAR. I may have to take back my "10 GbE is a Lie!" blog post, and I'd be happy to do so.
>
> Can you share things besides the T2000? For example,
>
> What OS and patch levels are you running?
> Any IP patches?
> Any IP-specific patches?
> What ndd settings are you using?
> Is rss enabled?
>
> "Input, I need Input!"
>
> ---
> W. Curtis Preston
> Backup Blog @ www.backupcentral.com (http://www.backupcentral.com)
> VP Data Protection, GlassHouse Technologies
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
> Sent: Wednesday, October 17, 2007 12:12 PM
> To: VERITAS-BU@mailman.eng.auburn.edu
> Subject: [Veritas-bu] Some info on my experiences with 10GbE
>
> Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...
>
> I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).
>
> So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf.
> When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.
>
> The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.
>
> Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).
>
> Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.
>
> --
> Devon Peters
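A quick unit check on the figures in Devon's post (simple arithmetic, not from the original message) - the 300MB/sec restore target and the 650MB/sec backup rate convert as:

    echo "300 * 8" | bc            # 2400 Mbit/sec - the restore target
    echo "650 * 8" | bc            # 5200 Mbit/sec aggregate backup rate
    echo "scale=1; 650 / 3" | bc   # ~216.7 MB/sec per drive across 3 LTO3 drives

The per-drive figure lines up with the ~220MB/s per active drive Devon reports elsewhere in this thread.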
Re: [Veritas-bu] Some info on my experiences with 10GbE
Have you tried an aggregate of the 4 onboard 1Gb/s NICs?

Paul

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of pancamo
> Sent: January 5, 2008 2:43 AM
> To: VERITAS-BU@mailman.eng.auburn.edu
> Subject: [Veritas-bu] Some info on my experiences with 10GbE
>
> I just started testing 2 T2000s with dual 10Gbps Sun NICs directly connected to each other...
>
> I'm somewhat pissed that I'm only able to get about 658Mbps from one thread on the 10Gbps NIC while I'm able to get 938Mbps using the onboard 1Gb/s NIC when using the iperf default values. Which means in some cases the 10Gbit NIC is actually slower than the onboard 1Gbit NIC.
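For anyone wanting to try Paul's suggestion, a Solaris 10 link aggregation over the four onboard ports might look like the sketch below. This is an illustration rather than anything from the thread: the e1000g device names, aggregation key, and address are assumptions (dladm aggregation needs a GLDv3 driver such as e1000g, not the older ipge driver named in pancamo's /etc/system settings), and the switch ports need a matching channel/LACP setup.

    # Build link aggregation key 1 from the four onboard ports, then
    # plumb the resulting aggr1 interface. (Device names are assumed.)
    dladm create-aggr -d e1000g0 -d e1000g1 -d e1000g2 -d e1000g3 1
    ifconfig aggr1 plumb 192.168.1.1 netmask 255.255.255.0 up
    dladm show-aggr    # verify port membership and aggregation state

Bear in mind that a single TCP stream still travels one physical link, so an aggregate helps multi-stream workloads like the ones discussed here, not single-stream throughput.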
Re: [Veritas-bu] Some info on my experiences with 10GbE
> in my mind, using 2 Gb/s or 4 Gb/s shouldn't make a bit of difference
> for a drive that natively writes at 80 MB/s

It shouldn't make a difference if you're sending incompressible data to it. If you send highly compressible data to the drive, then there are 2 places on the drive (that I can think of) that could be bottlenecks - the FC interface and the compression ASIC:

[ Server ]                [ Tape Drive ]
[2Gb FC HBA]--[2Gb FC]--[Compression ASIC]--[Write Head]--[Physical Tape]

If the 2Gb FC interface is receiving data at 170MB/s, and the compression ASIC does 4:1 compression, then the write head will be sending this compressed data to tape at 42.5MB/s. With the 4Gb drives, I'm still not able to push the write head to its native max speed of 80MB/s: I'm getting 265MB/s, with 4:1 compression, so the native write speed is around 66MB/s.

-devon

-----Original Message-----
From: Nick Majeran [mailto:[EMAIL PROTECTED]
Sent: Friday, October 19, 2007 9:02 AM
To: veritas-bu@mailman.eng.auburn.edu; Peters, Devon C
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

Regarding the tape drives and compression -- this is the part that confuses me. I can max out an LTO-3 drive at its native write speed of 80MB/s with no problem using pre-compressed data (compressed Sybase dbdumps), even with a measly 64kb block size. This is using direct NDMP with 2 Gb/s FC IBM LTO-3 drives. Using contrived data, i.e. large files dd'ed from /dev/zero or hpcreatedata, I have in the past maxed out 2 Gb/s LTO-3 drives at approximately 170 MB/s, as you claim above. However, this was using 256kb block sizes. I have read reports where 2 Gb/s LTO-3 drives can be pushed to 220-230 MB/s using the maximum block size supported by LTO-3 (2 MB) and contrived data.

Now, if compression is done at the drive, I would think that with a 2 Gb/s interface it should be able to receive data at roughly 170 MB/s, but since the drive natively spins at 80 MB/s, it would compress that data, 4x, as you claim, to get that 240 MB/s top end. But, in my mind, using 2 Gb/s or 4 Gb/s shouldn't make a bit of difference for a drive that natively writes at 80 MB/s. Does anyone else have experience with this? Also, I've seen LTO-3 tapes in our environment marked as "FULL" by NetBackup with close to 2 TB of data on them.

-- nick

Yep, I'm using jumbo frames. The performance was around 50% lower without them. I'm not currently using any switches for 10GbE; the servers are connected directly together.

Re 4Gb vs 2Gb tape drives - since the data is compressed at the drive, we still need to be able to transfer the data to the drives as fast as possible. The highest throughput we've been able to get with a single 2Gb fibre HBA is about 190MB/s (using multiple 2Gb disk-subsystem ports zoned to a single HBA port). The highest throughput we've gotten with a single 2Gb tape drive is 170MB/s. Since this is near the peak we can get with 2Gb, I assume that the 2Gb interface on the tape drive is what's limiting our throughput. Also, we get about 4x compression of this data on the tapes (~1600GB on an LTO3 tape). So, with 265MB/s at 4x compression, the physical write speed of the drive is probably somewhere around 65MB/s (265/4). Since the tape compression ratio has remained the same with both 2Gb and 4Gb drives, I'd guess that the physical drive speeds with the 2Gb drives were probably closer to 40MB/s (170/4)...

-devon
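Devon's reasoning reduces to: write-head speed = FC ingest rate / compression ratio, so the head only becomes the bottleneck once ingest exceeds native speed times ratio. Plugging in the numbers quoted above:

    echo "scale=1; 170 / 4" | bc   # 2Gb FC drive:  42.5 MB/s at the write head
    echo "scale=1; 265 / 4" | bc   # 4Gb FC drive: ~66.3 MB/s at the write head
    echo "80 * 4" | bc             # 320 MB/s of 4:1-compressible data is needed
                                   # to saturate an 80 MB/s head - more than a
                                   # 2Gb FC link (~200MB/s) can deliver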
Re: [Veritas-bu] Some info on my experiences with 10GbE
Regarding the tape drives and compression -- this is the part that confuses me. I can max out an LTO-3 drive at its native write speed of 80MB/s with no problem using pre-compressed data (compressed Sybase dbdumps), even with a measly 64kb block size. This is using direct NDMP with 2 Gb/s FC IBM LTO-3 drives. Using contrived data, i.e. large files dd'ed from /dev/zero or hpcreatedata, I have in the past maxed out 2 Gb/s LTO-3 drives at approximately 170 MB/s, as you claim above. However, this was using 256kb block sizes. I have read reports where 2 Gb/s LTO-3 drives can be pushed to 220-230 MB/s using the maximum block size supported by LTO-3 (2 MB) and contrived data.

Now, if compression is done at the drive, I would think that with a 2 Gb/s interface it should be able to receive data at roughly 170 MB/s, but since the drive natively spins at 80 MB/s, it would compress that data, 4x, as you claim, to get that 240 MB/s top end. But, in my mind, using 2 Gb/s or 4 Gb/s shouldn't make a bit of difference for a drive that natively writes at 80 MB/s. Does anyone else have experience with this? Also, I've seen LTO-3 tapes in our environment marked as "FULL" by NetBackup with close to 2 TB of data on them.

-- nick

Yep, I'm using jumbo frames. The performance was around 50% lower without them. I'm not currently using any switches for 10GbE; the servers are connected directly together.

Re 4Gb vs 2Gb tape drives - since the data is compressed at the drive, we still need to be able to transfer the data to the drives as fast as possible. The highest throughput we've been able to get with a single 2Gb fibre HBA is about 190MB/s (using multiple 2Gb disk-subsystem ports zoned to a single HBA port). The highest throughput we've gotten with a single 2Gb tape drive is 170MB/s. Since this is near the peak we can get with 2Gb, I assume that the 2Gb interface on the tape drive is what's limiting our throughput. Also, we get about 4x compression of this data on the tapes (~1600GB on an LTO3 tape). So, with 265MB/s at 4x compression, the physical write speed of the drive is probably somewhere around 65MB/s (265/4). Since the tape compression ratio has remained the same with both 2Gb and 4Gb drives, I'd guess that the physical drive speeds with the 2Gb drives were probably closer to 40MB/s (170/4)...

-devon
Re: [Veritas-bu] Some info on my experiences with 10GbE
Yep, I'm using jumbo frames. The performance was around 50% lower without them. I'm not currently using any switches for 10GbE; the servers are connected directly together.

Re 4Gb vs 2Gb tape drives - since the data is compressed at the drive, we still need to be able to transfer the data to the drives as fast as possible. The highest throughput we've been able to get with a single 2Gb fibre HBA is about 190MB/s (using multiple 2Gb disk-subsystem ports zoned to a single HBA port). The highest throughput we've gotten with a single 2Gb tape drive is 170MB/s. Since this is near the peak we can get with 2Gb, I assume that the 2Gb interface on the tape drive is what's limiting our throughput. Also, we get about 4x compression of this data on the tapes (~1600GB on an LTO3 tape). So, with 265MB/s at 4x compression, the physical write speed of the drive is probably somewhere around 65MB/s (265/4). Since the tape compression ratio has remained the same with both 2Gb and 4Gb drives, I'd guess that the physical drive speeds with the 2Gb drives were probably closer to 40MB/s (170/4)...

-devon

-----Original Message-----
From: Nick Majeran [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 2:18 PM
To: veritas-bu@mailman.eng.auburn.edu; Peters, Devon C
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

Devon, just a few more questions: So you *are* using jumbo frames? I saw that it was enabled in ndd, but you haven't mentioned it outright. Also, what network switching equipment are you using for these tests? Also, I'm curious, how is it that 4Gb/s LTO-3 drives can write "faster" than 2Gb/s drives with contrived data? It seems like it shouldn't make a difference, since the data stream is compressed at the drive.

thanks!
-- nick

We've been pretty happy with the T2000s. The tape library is an IBM 3584, the tape drives are IBM's 4Gb FC LTO-3 drives, there's a dedicated 4Gb HBA for each drive, and everything is connected to 4Gb McData switches. We used to have IBM's 2Gb FC LTO-3 drives, and with those the peak performance was around 165MB/s per drive. These 4Gb drives peak at around 265MB/s per drive, though with all 3 tape drives active we see throughput closer to 220MB/s per drive... I'm guessing we're bottlenecked by the ports on our disk subsystem at the moment, but since performance is more than acceptable we're not looking to tune this any further - at least not until our LTO-4 drives are installed next month ;).

-devon
Re: [Veritas-bu] Some info on my experiences with 10GbE
Devon, just a few more questions: So you *are* using jumbo frames? I saw that it was enabled in ndd, but you haven't mentioned it outright. Also, what network switching equipment are you using for these tests? Also, I'm curious, how is it that 4Gb/s LTO-3 drives can write "faster" than 2Gb/s drives with contrived data? It seems like it shouldn't make a difference, since the data stream is compressed at the drive.

thanks!
-- nick

We've been pretty happy with the T2000s. The tape library is an IBM 3584, the tape drives are IBM's 4Gb FC LTO-3 drives, there's a dedicated 4Gb HBA for each drive, and everything is connected to 4Gb McData switches. We used to have IBM's 2Gb FC LTO-3 drives, and with those the peak performance was around 165MB/s per drive. These 4Gb drives peak at around 265MB/s per drive, though with all 3 tape drives active we see throughput closer to 220MB/s per drive... I'm guessing we're bottlenecked by the ports on our disk subsystem at the moment, but since performance is more than acceptable we're not looking to tune this any further - at least not until our LTO-4 drives are installed next month ;).

-devon
Re: [Veritas-bu] Some info on my experiences with 10GbE
Regarding the poor performance you've seen, I'm curious how many data streams you were using? The number of simultaneous streams seems to have a big impact on total throughput. Also, if you haven't done so, you should check out the Solaris Internals website for some recommended tunables for these cards: http://www.solarisinternals.com/wiki/index.php/Networks

I sent out the settings I'm using in a previous email, so you might be able to take those and see if they work for you...

-devon

--
Date: Thu, 18 Oct 2007 16:30:40 +0800
From: "Mellor, Adam A." <[EMAIL PROTECTED]>
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE
To:
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

Also, very interested. My backup servers are 8-core T2000s; we will be putting the Sun nxge 10Gbit cards into them. I have only seen poor results from the card in back-to-back configuration (the network is not at 10Gbit yet) - so far my results have been miserable, in line with Mr Preston's. More than happy (change control permitting) to duplicate your setup with the 8-core boxes.

Adam.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Curtis Preston
Sent: Thursday, 18 October 2007 4:07 PM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

7500 Mbit/sec! Those are the most impressive numbers I've ever seen by FAR. I may have to take back my "10 GbE is a Lie!" blog post, and I'd be happy to do so.

Can you share things besides the T2000? For example, what OS and patch levels are you running? Any IP patches? Any IP-specific patches? What ndd settings are you using? Is rss enabled?

"Input, I need Input!"

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
Sent: Wednesday, October 17, 2007 12:12 PM
To: VERITAS-BU@mailman.eng.auburn.edu
Subject: [Veritas-bu] Some info on my experiences with 10GbE

Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...

I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).

So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.

The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.

Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time.
These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).

Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.

--
Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
The data is Oracle database files and archive logs, and they compress really well. The largest single database is about 4TB.

-devon

From: Hall, Christian N. [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 12:22 PM
To: Peters, Devon C; [EMAIL PROTECTED]; VERITAS-BU@mailman.eng.auburn.edu
Subject: RE: [Veritas-bu] Some info on my experiences with 10GbE

Devon, what type of data are you backing up? How much data?

Thanks, Chris Hall

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
Sent: Thursday, October 18, 2007 3:10 PM
To: [EMAIL PROTECTED]; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

We've been pretty happy with the T2000s. The tape library is an IBM 3584, the tape drives are IBM's 4Gb FC LTO-3 drives, there's a dedicated 4Gb HBA for each drive, and everything is connected to 4Gb McData switches. We used to have IBM's 2Gb FC LTO-3 drives, and with those the peak performance was around 165MB/s per drive. These 4Gb drives peak at around 265MB/s per drive, though with all 3 tape drives active we see throughput closer to 220MB/s per drive... I'm guessing we're bottlenecked by the ports on our disk subsystem at the moment, but since performance is more than acceptable we're not looking to tune this any further - at least not until our LTO-4 drives are installed next month ;).

-devon

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 6:10 AM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

Devon,

Good to hear that T2000s are screamers. What are the library/tape drive specs? Are the drives FC attached, or are they attached via SCSI to the media server?

Thanks, Karl

> From: [EMAIL PROTECTED] [mailto:veritas-bu-[EMAIL PROTECTED] On Behalf Of Peters, Devon C
> Sent: Wednesday, October 17, 2007 12:12 PM
> To: VERITAS-BU@mailman.eng.auburn.edu
> Subject: [Veritas-bu] Some info on my experiences with 10GbE
>
> Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...
>
> I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).
>
> So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.
>
> The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.
>
> Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time.
> These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).
>
> Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.
>
> --
> Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
That sounds pretty similar to what ip_squeue_fanout does for Solaris 10 - and using it made a noticeable performance improvement:

ip_squeue_fanout - determines the mode of associating TCP/IP connections with squeues. A value of 0 associates a new TCP/IP connection with the CPU that creates the connection. A value of 1 associates the connection with multiple squeues that belong to different CPUs. The number of squeues used to fan out the connection is based upon ip_soft_rings_cnt.

-devon

-----Original Message-----
From: Curtis Preston [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 12:47 PM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: RE: [Veritas-bu] Some info on my experiences with 10GbE

"Is rss enabled?" Not sure what you're asking here...

RSS is receive-side scaling, which apparently helps improve performance: http://www.microsoft.com/whdc/device/network/ndis_rss.mspx

I actually just learned about it the other day talking to a 10 GbE NIC vendor. He asked me that question about another user who had posted different results (but still on Solaris), and so I thought I'd ask you. Upon further research, it appears to be a Microsoft-only thing. So, never mind! ;)
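For convenience, the fanout-related pieces of the /etc/system settings posted earlier in this digest look like this in isolation (same values as posted; '*' starts a comment in /etc/system, and a reboot is required for the changes to take effect):

    * Fan new TCP/IP connections out across multiple squeues/CPUs
    * instead of binding each connection to the CPU that created it.
    set ip:ip_squeue_fanout=1
    * The soft-ring count controls how many squeues the fanout spreads over.
    set ip:ip_soft_rings_cnt=8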
Re: [Veritas-bu] Some info on my experiences with 10GbE
"Is rss enabled?" Not sure what you're asking here...

RSS is receive-side scaling, which apparently helps improve performance: http://www.microsoft.com/whdc/device/network/ndis_rss.mspx

I actually just learned about it the other day talking to a 10 GbE NIC vendor. He asked me that question about another user who had posted different results (but still on Solaris), and so I thought I'd ask you. Upon further research, it appears to be a Microsoft-only thing. So, never mind! ;)
Re: [Veritas-bu] Some info on my experiences with 10GbE
Devon, what type of data are you backing up? How much data?

Thanks, Chris Hall

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
Sent: Thursday, October 18, 2007 3:10 PM
To: [EMAIL PROTECTED]; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

We've been pretty happy with the T2000s. The tape library is an IBM 3584, the tape drives are IBM's 4Gb FC LTO-3 drives, there's a dedicated 4Gb HBA for each drive, and everything is connected to 4Gb McData switches. We used to have IBM's 2Gb FC LTO-3 drives, and with those the peak performance was around 165MB/s per drive. These 4Gb drives peak at around 265MB/s per drive, though with all 3 tape drives active we see throughput closer to 220MB/s per drive... I'm guessing we're bottlenecked by the ports on our disk subsystem at the moment, but since performance is more than acceptable we're not looking to tune this any further - at least not until our LTO-4 drives are installed next month ;).

-devon

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 6:10 AM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

Devon,

Good to hear that T2000s are screamers. What are the library/tape drive specs? Are the drives FC attached, or are they attached via SCSI to the media server?

Thanks, Karl

> From: [EMAIL PROTECTED] [mailto:veritas-bu-[EMAIL PROTECTED] On Behalf Of Peters, Devon C
> Sent: Wednesday, October 17, 2007 12:12 PM
> To: VERITAS-BU@mailman.eng.auburn.edu
> Subject: [Veritas-bu] Some info on my experiences with 10GbE
>
> Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...
>
> I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).
>
> So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.
>
> The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.
>
> Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000).
> We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).
>
> Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.
>
> --
> Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
We've been pretty happy with the T2000s. The tape library is an IBM 3584, the tape drives are IBM's 4Gb FC LTO-3 drives, there's a dedicated 4Gb HBA for each drive, and everything is connected to 4Gb McData switches. We used to have IBM's 2Gb FC LTO-3 drives, and with those the peak performance was around 165MB/s per drive. These 4Gb drives peak at around 265MB/s per drive, though with all 3 tape drives active we see throughput closer to 220MB/s per drive... I'm guessing we're bottlenecked by the ports on our disk subsystem at the moment, but since performance is more than acceptable we're not looking to tune this any further - at least not until our LTO-4 drives are installed next month ;).

-devon

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 6:10 AM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

Devon,

Good to hear that T2000s are screamers. What are the library/tape drive specs? Are the drives FC attached, or are they attached via SCSI to the media server?

Thanks, Karl

> From: [EMAIL PROTECTED] [mailto:veritas-bu-[EMAIL PROTECTED] On Behalf Of Peters, Devon C
> Sent: Wednesday, October 17, 2007 12:12 PM
> To: VERITAS-BU@mailman.eng.auburn.edu
> Subject: [Veritas-bu] Some info on my experiences with 10GbE
>
> Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...
>
> I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).
>
> So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.
>
> The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.
>
> Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).
>
> Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is.
> I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.
>
> --
> Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
I'd be glad to share... The OS is Solaris 10 11/06, and I'm running the recommended patch cluster that was available on 9/12 - the kernel patch is 125100-10.

For tunables, I've tested quite a few different permutations of TCP settings, but I didn't find a whole lot to be gained from this. Performance seemed to be best as long as I was using a TCP congestion window of 512k or 1024k (the Solaris 10 default max is 1024k). In the end I basically bumped up the max buffer and window sizes to 10MB, enabled window scaling, and bumped up the connection queues:

tcp_conn_req_max_q    8192
tcp_conn_req_max_q0   8192
tcp_max_buf           10485760
tcp_cwnd_max          10485760
tcp_recv_hiwat        65536
tcp_xmit_hiwat        65536

The tunables that made a noticeable difference regarding performance are:

ddi_msix_alloc_limit   8
tcp_squeue_wput        1
ip_soft_rings_cnt      64
ip_squeue_fanout       1
nxge0 accept_jumbo     1
only one CPU/thread per core is interruptible (set using: psradm -i 1-3 5-7 9-11 13-15)

You can find Sun's recommended settings for these cards here: http://www.solarisinternals.com/wiki/index.php/Networks

Also, the iperf commands that have provided the highest throughput are:

Server: iperf -s -f m -w 512K -l 512K
Client: iperf -c <server> -f m -w 512K -l 512K -t 600 -P <threads>

"Is rss enabled?" Not sure what you're asking here...

-devon

From: Curtis Preston [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 18, 2007 1:07 AM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: RE: [Veritas-bu] Some info on my experiences with 10GbE

7500 Mbit/sec! Those are the most impressive numbers I've ever seen by FAR. I may have to take back my "10 GbE is a Lie!" blog post, and I'd be happy to do so.

Can you share things besides the T2000? For example, what OS and patch levels are you running? Any IP patches? Any IP-specific patches? What ndd settings are you using? Is rss enabled?

"Input, I need Input!"

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
Sent: Wednesday, October 17, 2007 12:12 PM
To: VERITAS-BU@mailman.eng.auburn.edu
Subject: [Veritas-bu] Some info on my experiences with 10GbE

Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...

I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).

So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.

The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.

Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec).
We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).

Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.

--
Devon Peters
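One way to confirm that tunables like these are actually spreading the load is the mpstat check Devon describes at the top of this digest - looking for a few hardware threads pegged at 99-100% sys while others sit idle. A minimal check while an iperf run is active:

    # One row per hardware thread, refreshed every 5 seconds. With fanout
    # working, the intr and sys columns should be spread across CPUs rather
    # than concentrated on one or two threads.
    mpstat 5
    # Columns of interest: intr (interrupts taken), sys (% system time),
    # idl (% idle).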
Re: [Veritas-bu] Some info on my experiences with 10GbE
Devon,

Good to hear that T2000s are screamers. What are the library/tape drive specs? Are the drives FC attached, or are they attached via SCSI to the media server?

Thanks, Karl

> From: [EMAIL PROTECTED] [mailto:veritas-bu-[EMAIL PROTECTED] On Behalf Of Peters, Devon C
> Sent: Wednesday, October 17, 2007 12:12 PM
> To: VERITAS-BU@mailman.eng.auburn.edu
> Subject: [Veritas-bu] Some info on my experiences with 10GbE
>
> Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...
>
> I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).
>
> So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.
>
> The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.
>
> Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).
>
> Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.
>
> --
> Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
Also, very interested. My backup servers are 8-core T2000s; we will be putting the Sun nxge 10Gbit cards into them. I have only seen poor results from the card in back-to-back configuration (the network is not at 10Gbit yet) - so far my results have been miserable, in line with Mr Preston's. More than happy (change control permitting) to duplicate your setup with the 8-core boxes.

Adam.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Curtis Preston
Sent: Thursday, 18 October 2007 4:07 PM
To: Peters, Devon C; VERITAS-BU@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Some info on my experiences with 10GbE

7500 Mbit/sec! Those are the most impressive numbers I've ever seen by FAR. I may have to take back my "10 GbE is a Lie!" blog post, and I'd be happy to do so.

Can you share things besides the T2000? For example, what OS and patch levels are you running? Any IP patches? Any IP-specific patches? What ndd settings are you using? Is rss enabled?

"Input, I need Input!"

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
Sent: Wednesday, October 17, 2007 12:12 PM
To: VERITAS-BU@mailman.eng.auburn.edu
Subject: [Veritas-bu] Some info on my experiences with 10GbE

Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...

I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).

So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.

The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.

Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).

Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is.
I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.

--
Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
7500 Mbit/sec! Those are the most impressive numbers I've ever seen by FAR. I may have to take back my "10 GbE is a Lie!" blog post, and I'd be happy to do so.

Can you share things besides the T2000? For example, what OS and patch levels are you running? Any IP patches? Any IP-specific patches? What ndd settings are you using? Is rss enabled?

"Input, I need Input!"

---
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peters, Devon C
Sent: Wednesday, October 17, 2007 12:12 PM
To: VERITAS-BU@mailman.eng.auburn.edu
Subject: [Veritas-bu] Some info on my experiences with 10GbE

Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...

I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).

So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.

The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.

Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).

Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.

--
Devon Peters
Re: [Veritas-bu] Some info on my experiences with 10GbE
Very nice write-up and useful information, thanks!

On Wed, 17 Oct 2007, Peters, Devon C wrote:

> Since I've seen a little bit of talk about 10GbE on here in the past I figured I'd share some of my experiences...
>
> I've recently been testing some of Sun's dual-port 10GbE NICs on some small T2000s (1GHz, 4-core). I'm only using a single port on each card, and the servers are currently directly connected to each other (waiting for my network team to get switches and fibre in place).
>
> So far, I've been able to drive throughput between these two systems to about 7500Mbit/sec using iperf. When the throughput gets this high, all the cores/threads on the receiving T2000 become saturated and TCP retransmits start climbing, but both systems remain quite responsive. Since these are only 4-core T2000s, I would guess that the 6 or 8-core T2000s (especially with 1.2GHz or 1.4GHz processors) should be capable of more throughput, possibly near line speed.
>
> The downside of achieving throughput this high is that it requires lots of data streams. When transmitting with a single data stream, the most throughput I've gotten is about 1500Mbit/sec. I only got up to 7500Mbit/s when using 64 data streams... Also, the biggest gains seem to be in the jump from 1 to 8 data streams; with 8 streams I was able to get throughput up to 6500Mbit/sec.
>
> Our goal for 10GbE is to be able to restore data from tape at a speed of at least 2400Mbit/sec (300MB/sec). We have large daily backups (3-4TB) that we would like to be able to restore (not back up) in a reasonable amount of time. These restores are used to refresh our test and development environments with current data. The actual backups are done with array-based snapshots (HDS ShadowCopy), which then get mounted and backed up by a dedicated media server (6-core T2000). We're currently getting about 650MB/sec of throughput with the backups (9 streams on 3 LTO3 tape drives - MPX=3 and it's very compressible data).
>
> Going off my iperf results, restoring this data using 9 streams should get us well over 2400Mbit/sec. But - we haven't installed the cards on our media servers yet, so I have yet to see what the actual performance of NetBackup and LTO3 over 10GbE is. I'm hopeful it'll be close to the iperf results, but if it doesn't meet the goal then we'll be looking at other options.
>
> --
> Devon Peters