Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
Craig,

In another SHARE session, I seem to remember hearing that if you need very high communications throughput, you do not want to use VSWITCHes. Instead, you want to dedicate a connection to the OSA adapter. (I am unable at the moment to find that presentation.)

However, Mario Held's presentation, where he goes through the performance characteristics of the various types of communications, includes the phrase "Use direct OSA for outside connection of demanding guests".

Also, Rob van der Heij's presentation "Linux on z/VM Understanding CPU Usage" has a section called "Improving TSM Throughput" that presents a case study. You might find it interesting.

Ron

Craig Collins wrote:

> We are just starting to use OSA-Express3 10 Gb ports for SLES10-SP2 Linux guests. We're trying to use these with TSM servers running on SLES10 to back up other non-zSeries servers in our environment. We are using the 10 Gb OSAs connected to VSWITCHes in z/VM 5.4. Currently we have only one SLES10 TSM server connected to a VSWITCH that is the only thing connected to a 10 Gb OSA, and we are seeing throughput of less than 1 Gb/s. The Cisco switch the OSA port is connected to recognizes the speed as 10 Gb. The TSM server and all of the servers it is backing up are on the same subnet and there is no firewall involved.
>
> Thanks to linuxvm.org, we found SHARE presentation 2192 by Mario Held from August 2009, named "Linux on System z Performance Update - Part 2: Networking and Crypto", which we found helpful, and we have altered some of our settings based upon its recommendations. We have also been over the OSA and VSWITCH documentation from IBM looking for any speed settings related to this type of OSA card, a VSWITCH, or a NIC definition for a Linux guest on z/VM, but did not find anything. We still are not getting the throughput we expect (or maybe desire).
>
> We wondered if a NIC setting in SLES10 could be keeping the connection from getting above the 1 Gb/s mark, but we cannot find a parameter to change, as the normal ethtool parameters don't seem to apply in this environment.
>
> Is anyone else using a 10 Gb OSA through a VSWITCH and getting throughput greater than 1 Gb/s for a single server instance? If so, are there any settings you needed to change to get to that performance level? Or is there a maximum of around 1 Gb/s that a single server instance can achieve?
>
> We're grasping at straws at this point. Any ideas are appreciated.
>
> Craig Collins
> State of WI, DOA, DET

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
Hi Craig and Ron,

I am the customer that Rob's case study was based on. I had exactly the same problem (with 1 GbE OSAs) that you are having. Your problem is indeed the VSWITCH. Kick it and use native bonding in Linux. If you need more than one guest, buy more OSAs; they are way cheaper than IFLs ;-))

Best regards,
Pieter Harder

Brabant Water N.V.
Postbus 1068
5200 BC 's-Hertogenbosch
http://www.brabantwater.nl
Handelsregister: 16005077
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
Craig Collins grizl...@gmail.com wrote:

> We wondered if a NIC setting in SLES10 could be keeping the connection from getting above the 1 Gb/s mark,

We don't have an OSA-Express3 10 Gb, but what are you driving that file transfer with? ftp? iperf? One thread? Have you looked at the CPU consumption of the server during your test?
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
Hi Pieter,

Do you have any numbers to compare vswitch vs native? OSA throughput, and cpu?

Thanks
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
For the finer points of performance measurement I defer to Rob. This is a high-level, coarse view:

With a VSWITCH with LACP involved and using two full IFLs with 4 OSA GbE ports, I could barely exceed 100 MB/s with no other activity at all.

With the same 2 IFLs and 4 OSAs dedicated and bonded within Linux, I have seen 175 MB/s with other activity going on. See this sample from the TSM performance instrumentation:

TOTAL SERVER SUMMARY

Operation           Count     Tottime  Avgtime   Maxtime  InstTput  RealTput   Total KB
Disk Read          354044    1929.524    0.005     0.517   17011.1    9270.3   32823388
Disk Write        2693971   16382.664    0.006     0.766   35619.5  164809.4  583542652
Disk Commit           378       1.432    0.004     0.090
Tape Read            6307      26.320    0.004     0.147   61100.4     454.2    1608193
Tape Write         131942     464.451    0.004     1.061   72719.1    9538.9   33774493
Tape Locate             5     139.922   27.985    64.225
Tape Commit            10      32.018    3.202    14.161
Tape Data Copy     241324      26.308    0.000     0.190
Tape Misc               3      49.823   16.608    16.861
Data Copy            6268       1.511    0.000     0.036
Network Recv     56592336   65035.178    0.001   940.637    8939.0  164189.4  581347509
Network Send         2767       0.141    0.000     0.006    1522.7       0.1        216
Tm Lock Wait           47       8.911    0.190     0.932
Acquire Latch       49921       5.080    0.000     0.275
Acquire XLatch      70034       1.402    0.000     0.288
Thread Wait       5340114  105207.622    0.020  3501.058

Best regards,
Pieter Harder
pieter.har...@brabantwater.nl
tel +31-73-6837133 / +31-6-47272537
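To put the two figures above on the same scale as the link speeds, here is a minimal sketch that converts MB/s to Gb/s and to a share of the aggregate wire speed of four 1 GbE ports. It assumes decimal units (1 MB = 10**6 bytes, 1 Gb/s = 10**9 bits/s); the function names are made up for illustration and are not part of any tool mentioned in the thread.

```python
# Assumption: decimal units, 1 MB = 10**6 bytes and 1 Gb/s = 10**9 bits/s.

def mb_per_s_to_gbit(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second."""
    return mb_per_s * 8 / 1000


def link_utilization(mb_per_s: float, links: int, link_gbit: float = 1.0) -> float:
    """Fraction of the aggregate wire speed of `links` ports that is in use."""
    return mb_per_s_to_gbit(mb_per_s) / (links * link_gbit)


# The two throughput figures quoted in the message above.
for label, rate in (("VSWITCH with LACP", 100.0), ("dedicated and bonded", 175.0)):
    print(f"{label}: {mb_per_s_to_gbit(rate):.2f} Gb/s, "
          f"{link_utilization(rate, links=4):.0%} of 4 x 1 GbE")
```

Under these unit assumptions, neither configuration comes close to the combined capacity of the four ports, which fits the later remarks in the thread that CPU and per-conversation load balancing matter as much as raw link speed.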
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
Craig,

Pieter gave you the information on how to bond multiple OSA adapters together. He did not cover the user directory entries. You will need some DEDICATE statements in your directory.

Let's say that your Linux guest is looking for the VSWITCH at 0.0.0600, 0.0.0601, and 0.0.0602, and the OSA subchannels that you can use are at 0.0.0c00, 0.0.0c01, and 0.0.0c02. Then your DEDICATE statements would look like this:

    DEDICATE 600 c00
    DEDICATE 601 c01
    DEDICATE 602 c02

Pieter's example below has 4 OSA adapters bonded together, so you would need 3 more sets of DEDICATE statements.

Remember, when dedicating OSA adapters for zLinux guests, that each set of subchannels can only be used once in an LPAR. You can have a Linux LPAR that uses c00, c01, and c02. You can have two LPARs that use c00, c01, and c02. You cannot have two Linux guests in the same LPAR use c00, c01, and c02. Assuming the folks that did your I/O configuration allow it (and you have the right hardware), you can have one guest use c00, c01, and c02, and another guest use c03, c04, and c05. (I don't know if bonding will prevent this.)

The other thing to look at is MTU size. If you want to move a lot of data, you want to be using jumbo frames, whether you are using VSWITCHes or dedicated OSAs.

The other wild card is z/VM 6.1. According to Reed Mullen's SHARE presentation, it contains changes to enhance z/VM's virtual networking. I don't know by how much. At SHARE, Bill Bitner said that z/VM 6.1 was not ready to be talked about.

Ron

From: Linux on 390 Port [linux-...@vm.marist.edu] On Behalf Of Harder, Pieter [pieter.har...@brabantwater.nl]
Sent: Monday, October 26, 2009 3:44 PM
To: LINUX-390@VM.MARIST.EDU
Subject: FW: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput

Might be of public interest...

Hi Craig,

Using SLES10 SP2, in /etc/sysconfig/hardware/hwcfg-qeth-bus-ccw-0.0.0c00:

    STARTMODE=auto
    MODULE=qeth
    MODULE_OPTIONS=
    MODULE_UNLOAD=yes
    SCRIPTUP=hwup-ccw
    SCRIPTUP_ccw=hwup-ccw
    SCRIPTUP_ccwgroup=hwup-qeth
    SCRIPTDOWN=hwdown-ccw
    CCW_CHAN_IDS=0.0.0c00 0.0.0c01 0.0.0c02
    CCW_CHAN_NUM=3
    CCW_CHAN_MODE=0
    QETH_LAYER2_SUPPORT=1
    QETH_OPTIONS=buffer_count=128

Three more of those: 1c00/02, 2c00/02, 3c00/02.

In /etc/sysconfig/network/ifcfg-qeth-bus-ccw-0.0.0c00:

    BOOTPROTO=static
    STARTMODE=onboot
    LLADDR=02:00:00:02:03:96
    IPADDR=
    SLAVE='yes'
    MASTER='bond0'
    MTU='8992'
    _nm_name='qeth-bus-ccw-0.0.0c00'

Three more of those, you get the picture.

In /etc/sysconfig/network/ifcfg-bond0:

    BOOTPROTO=static
    STARTMODE=onboot
    IPADDR=10.2.3.150
    NETMASK=255.255.255.0
    NETWORK=10.2.3.0
    BROADCAST=10.2.3.255
    MTU='8992'
    BONDING_MASTER='yes'
    BONDING_MODULE_OPTS='mode=802.3ad miimon=500 xmit_hash_policy=layer2'
    BONDING_SLAVE0='qeth-bus-ccw-0.0.0c00'
    BONDING_SLAVE1='qeth-bus-ccw-0.0.1c00'
    BONDING_SLAVE2='qeth-bus-ccw-0.0.2c00'
    BONDING_SLAVE3='qeth-bus-ccw-0.0.3c00'

And that is basically it.

Best regards,
Pieter Harder
pieter.har...@brabantwater.nl
tel +31-73-6837133 / +31-6-47272537

From: Craig Collins [mailto:grizl...@gmail.com]
Sent: Monday, 26 October 2009 20:58
To: Harder, Pieter
Subject: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput

Hi Pieter,

Thanks for your response. We've never tried connecting them directly to a guest, so we're working on finding the network config parameters we need to use on the Linux guest to connect the OSA devices and get the VLAN ID set correctly. If you have any hints or config parameters that you would be willing to share, or manuals we should look into, please send them along. Hopefully tomorrow we will take a shot at this.

Craig Collins
State of WI, DOA, DET
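Since four bonded OSAs need four sets of three DEDICATE statements, it can help to generate them rather than type them. The following is only a sketch: the helper name is hypothetical, the virtual device numbers follow the addresses in Pieter's hwcfg files (0c00, 1c00, 2c00, 3c00), and the real device numbers are assumed to be identical to the virtual ones, which the thread does not state. Substitute the real subchannel addresses from your own I/O configuration.

```python
def dedicate_statements(virtual_base: int, real_base: int, count: int = 3) -> list[str]:
    """Build DEDICATE lines mapping consecutive virtual to real device numbers.

    An OSA QDIO connection uses three consecutive subchannels, hence count=3.
    """
    return [
        f"DEDICATE {virtual_base + i:X} {real_base + i:X}"
        for i in range(count)
    ]


# Virtual device numbers as seen by the guest in Pieter's hwcfg files.
# Assumption for illustration only: real device numbers equal the virtual ones.
for base in (0x0C00, 0x1C00, 0x2C00, 0x3C00):
    for line in dedicate_statements(virtual_base=base, real_base=base):
        print(line)
```

Calling `dedicate_statements(0x600, 0xC00)` reproduces the three statements from Ron's example, with the device numbers printed in upper case.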
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
On Monday, 10/26/2009 at 05:10 EDT, Ron Foster at Baldor-IS rfos...@baldor.com wrote:

> The other wild card is z/VM 6.1. According to Reed Mullen's SHARE presentation, it contains changes to enhance z/VM's virtual networking. I don't know by how much. At SHARE, Bill Bitner said that z/VM 6.1 was not ready to be talked about.

The enhancements to VSWITCH involve guest-to-guest communications. There shouldn't be much change in over-the-wire speeds and feeds. z/VM 6.1 GA'd last week, so we can talk about it all we want now. :-)

Alan Altmark
z/VM Development
IBM Endicott
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
The other thing we will need to do is set the VLAN ID, since we won't be doing that through the VSWITCH as we have been up until now. We found the command to do that against the NIC definition, which I am guessing is what we will need to do after everything is set up with the bonding.

Thanks for the info and the quick responses.

Craig Collins
State of WI, DOA, DET
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
On Monday, 10/26/2009 at 04:21 EDT, Harder, Pieter pieter.har...@brabantwater.nl wrote:

> For the finer points of performance measurement I defer to Rob. This is a high-level, coarse view: With a VSWITCH with LACP involved and using two full IFLs with 4 OSA GbE ports, I could barely exceed 100 MB/s with no other activity at all. With the same 2 IFLs and 4 OSAs dedicated and bonded within Linux, I have seen 175 MB/s with other activity going on.

Do you have statistics for packets on each interface in the port group? If you read "Load Balancing within the Virtual Switch" in the z/VM Connectivity book, you will discover that load balancing takes place on a per-conversation basis, where a conversation is defined as a unique pairing of (vnic origin MAC, dest MAC). Since most guests on a VSWITCH will tend to speak to the same host (the gateway), the destination MAC can be considered a constant. Consequently, you will only see the load balancing in action when you have multiple VNICs in the guest or multiple guests.

This means that a single virtual NIC on a single guest is still limited to a single OSA, so any single-guest performance measurement of an LACP-enabled VSWITCH can give misleading results as to the capacity of the VSWITCH as a whole. (Physical switches have multiple load balancing algorithms to choose from.)

The z/VM Performance Report contains IBM's analysis of Link Aggregation.

Alan Altmark
z/VM Development
IBM Endicott
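The per-conversation behaviour described above can be illustrated with a small model. This is a sketch only: the round-robin placement of conversations onto ports is a stand-in chosen for simplicity, not the actual VSWITCH algorithm, and the MAC addresses and function names are invented for the example.

```python
from collections import OrderedDict


def assign_ports(frames, port_count):
    """Map each (source MAC, destination MAC) conversation to one port.

    Conversations are placed round-robin in order of first appearance.
    That placement rule is a hypothetical stand-in; the point is only that
    every frame of one conversation stays on one port.
    """
    assignment = OrderedDict()
    for src, dst in frames:
        if (src, dst) not in assignment:
            assignment[(src, dst)] = len(assignment) % port_count
    return assignment


def ports_in_use(frames, port_count):
    """Count how many distinct ports carry traffic for these frames."""
    return len(set(assign_ports(frames, port_count).values()))


GATEWAY = "00:11:22:33:44:55"  # hypothetical gateway MAC

# One guest with one virtual NIC: a single conversation, so a single port.
single_guest = [("02:00:00:00:00:01", GATEWAY)] * 1000
print(ports_in_use(single_guest, port_count=4))

# Four guests talking to the same gateway: four conversations.
four_guests = [(f"02:00:00:00:00:0{n}", GATEWAY) for n in range(1, 5)] * 250
print(ports_in_use(four_guests, port_count=4))
```

However many frames the single guest sends, the model never uses more than one port for it, which is why a single-guest test says little about the capacity of the port group as a whole.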
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
Also keep in mind that the TSM window size and buffer size will have a significant impact on throughput. For a 10G interface, larger is better for both.

A sadly realistic checkpoint, though: you are unlikely to EVER get an appreciable percentage of wire speed out of the 10G interfaces using TSM, no matter what you do. The protocol is half duplex in a lot of places, which kills the interface data pipelining. If you can get it to exceed 15-20% of wire speed, you're in a very good place.
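Why the window size matters can be seen from the general TCP rule that one connection cannot carry more than one window of data per round trip. The sketch below applies that rule; the 0.5 ms round-trip time and the window sizes are hypothetical values picked for illustration, not measurements from this thread, and the function names are invented.

```python
def max_throughput_gbit(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP connection: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e9


def window_for_rate(rate_gbit: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to sustain rate_gbit (bandwidth-delay product)."""
    return rate_gbit * 1e9 / 8 * rtt_seconds


RTT = 0.0005  # hypothetical 0.5 ms LAN round-trip time

# Ceiling for a few example window sizes (KiB).
for window_kib in (64, 256, 1024):
    ceiling = max_throughput_gbit(window_kib * 1024, RTT)
    print(f"{window_kib:5d} KiB window: at most {ceiling:.2f} Gb/s per connection")

# Window needed to fill a 10 Gb/s link at that round-trip time.
print(f"10 Gb/s needs about {window_for_rate(10, RTT) / 1024:.0f} KiB in flight")
```

The ceiling scales linearly with the window and inversely with the round-trip time, so measuring the real round-trip time between the TSM server and its clients is the first step before choosing values.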
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
> Do you have statistics for packets on each interface in the port group?

When I had the VSWITCH set up, I was looking closely into this. While not perfectly balanced, due to our client distribution, there was reasonable traffic on all interfaces. The limiting factor was the processor power available: when cycles were taken away by our higher-priority SAP engines, throughput immediately suffered and dropped way below the 100 MB/s mark.

> Physical switches have multiple load balancing algorithms to choose from.

Exactly. Our Cisco VSS switch pair does, as is shown by the distribution I am currently seeing coming in on the dedicated OSAs. Nothing changed outside of the zSeries. The problem still exists on outgoing traffic, due to the old bonding driver level in SLES10 SP2, which only allows layer2 hashing. The one outgoing OSA used is pegged at 80-90 percent utilization according to PTK.

Pieter Harder
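The outgoing imbalance with layer2 hashing can be reproduced in a few lines. The Linux bonding documentation describes the layer2 policy as (source MAC XOR destination MAC) modulo slave count; the sketch below uses only the last byte of each MAC, which is a simplification assumed here. The bond MAC is the LLADDR from the ifcfg file earlier in the thread; the gateway and peer MACs are hypothetical.

```python
def mac_bytes(mac: str) -> bytes:
    """Parse a colon-separated MAC address into its six bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def layer2_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Pick the outgoing slave as (src MAC XOR dst MAC) modulo slave count.

    Simplification (assumption): only the last byte of each MAC is hashed.
    """
    return (mac_bytes(src_mac)[5] ^ mac_bytes(dst_mac)[5]) % slave_count


BOND_MAC = "02:00:00:02:03:96"     # LLADDR from the ifcfg file in this thread
GATEWAY_MAC = "00:11:22:33:44:55"  # hypothetical next-hop MAC

# Everything sent to one destination MAC leaves on the same slave.
routed = {layer2_slave(BOND_MAC, GATEWAY_MAC, 4) for _ in range(1000)}
print(sorted(routed))

# Peers with different MACs (hypothetical) can spread across the slaves.
peers = [f"00:11:22:33:44:{last:02x}" for last in (0x10, 0x11, 0x12, 0x13)]
spread = {layer2_slave(BOND_MAC, peer, 4) for peer in peers}
print(sorted(spread))
```

With a fixed source MAC, the spread depends entirely on how many distinct destination MACs the server talks to, so traffic that funnels through a single next hop stays on one slave regardless of the number of bonded OSAs.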
Re: OSA-Express3 10 Gb, Vswitch, and SLES10 Linux Throughput
> A sadly realistic checkpoint, though: you are unlikely to EVER get an appreciable percentage of wire speed out of the 10G interfaces using TSM, no matter what you do. The protocol is half duplex in a lot of places, which kills the interface data pipelining. If you can get it to exceed 15-20% of wire speed, you're in a very good place.

For one client. If multiple clients are involved, you get more out of the wire. My rule of thumb is that one client is needed for each 25% of utilization on a 1 GbE link. We currently use 7 clients to get to 175%, which is all our two IFLs can handle.

Pieter Harder