Re performance: there has been no need to change the time slice since QDIO, about 5 years ago.

Offer Baruch wrote:
Hi all,

Well, we conducted another test (much better than the last one, which had too
many unpredictable variables):
1. 1 destination machine, let's call it "dest1"
2. dest1 has 2 NICs:
        2.1 VLAN 411, access mode, through a VSWITCH
        2.2 VLAN 420, trunk mode, directly through OSA devices (different OSA
            than the VSWITCH)
3. 1 source machine, let's call it "source1"
4. source1 has the same NIC configuration as dest1
5. Using two different ssh connections to source1, we ran the ping command to
the dest1 machine:
        5.1 the first ping from source1 is to dest1 on VLAN 411
        5.2 the second ping from source1 is to dest1 on VLAN 420
6. What we were trying to prove is that the same 2 machines, at the same time,
will respond much better using the OSA directly than using the VSWITCH.
7. Well... we failed to prove that, as when the VSWITCH pings peaked so did
the OSA pings. We didn't expect that.

So, the problem is not the VSWITCH and not the OSA, as the peaks happen on both
of them simultaneously.
Our next guess is that the SRM dispatching time slice is too big. Please share if
you have a better idea...

The default is 5 ms? Really? So, if I have 5 machines in the dispatch list
(let's say with 1 CPU), it is very possible that a simple ping will take 25 ms.
Now we have 2 IFLs running about 15 guests on 1 z/VM, and another z/VM on the
same CEC (sharing the 2 IFLs) with about 4 guests that do not do much (PR/SM
weight is 98% to the big z/VM and 2% to the small).
Total CPU utilization is at about 80%.
We are having trouble on the big z/VM :-)
Although not all guests have the same share (and I admit that I don't fully
understand how the shares really work, even though I have read more than one
article about it), 15 times 5 ms divided by 2 IFLs is 37.5 ms.
And what if the guest answering the ping is busy with other stuff (like real
application work)? It can be in the next time slice; that is 75 ms.
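
The back-of-the-envelope numbers above can be reproduced with a few lines of
Python. This is only a sketch of the arithmetic in this message: it assumes
plain round-robin dispatching in which every guest ahead of ours consumes a
full slice, and it ignores shares entirely. The function name and parameters
are made up for illustration and are not any z/VM interface.

```python
def worst_case_wait_ms(guests, slice_ms=5.0, cpus=1):
    """Naive upper bound on the wait before a guest is dispatched.

    Assumption (not a scheduler model): every guest in the dispatch list
    uses one full time slice, and the work is spread evenly over the CPUs.
    """
    return guests * slice_ms / cpus


# 5 guests on 1 CPU with a 5 ms slice
print(worst_case_wait_ms(5))                 # 25.0
# 15 guests on 2 IFLs
print(worst_case_wait_ms(15, cpus=2))        # 37.5
# the answering guest misses its turn once and waits a second round
print(2 * worst_case_wait_ms(15, cpus=2))    # 75.0
```

Treat these figures as a ceiling under the stated assumption, not as a
prediction of what the dispatcher actually does.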

Just to be clear: we really don't care about the ping response time. We are
having real performance issues and just can’t find the reason. The CPU is not
fully utilized and we can still see stolen time on all guests.

Have any of you changed your time slice settings? Does this make any sense to
you CPU performance experts?

Thanks!
Offer Baruch


-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Leland 
Lucius
Sent: Friday, October 28, 2011 7:41 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: z/VM Switch performance

On 10/27/11 11:07 AM, Marcy Cortes wrote:
Offer, I do get the erratic pings too.  Not as high as 200, but some 50s.

It was reported to me a few weeks ago by one of our more sophisticated users 
that traceroute occasionally fails over the same vswitch.
To recreate it, run it about 20 times, one right after another, or use the -q
option with something like -q 8.
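
For anyone who wants to automate that test, here is a minimal Python sketch
that runs traceroute with -q and counts answered versus lost probes. The
function names are invented for this example, and the parser assumes the
output layout shown in the transcripts in this thread (a hop number, then
either "<time> ms" or "*" per probe); other traceroute versions may print
differently.

```python
import subprocess


def run_traceroute(host, queries=8):
    """Run the system traceroute with `queries` probes per hop; return stdout."""
    result = subprocess.run(
        ["traceroute", "-q", str(queries), host],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def count_probes(output):
    """Return (answered, lost) probe counts summed over all hop lines."""
    answered = lost = 0
    for line in output.splitlines():
        tokens = line.split()
        # Hop lines start with the hop number; skip headers, prompts, blanks.
        if not tokens or not tokens[0].isdigit():
            continue
        lost += tokens.count("*")        # each "*" is one unanswered probe
        answered += tokens.count("ms")   # each "ms" follows one reply time
    return answered, lost
```

A loop such as `for _ in range(20): print(count_probes(run_traceroute(host)))`
would then show how often probes go missing on a given path.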

I checked with another customer and he also saw the same behavior.

I opened a PMR with VM but they said to open one with Linux.  I have not gotten 
around to opening one with Novell yet.

Could some of you others out there try this simple ping test?

We are also vlan aware and it does happen on both LACP and non-LACP.

We have both aware and unaware VSWITCHes.


Results of two guests connected to the same aware VSWITCH and on the same VLAN:


pzawap01:~ # traceroute pzawap03
traceroute to pzawap03 (172.2.2.211), 30 hops max, 40 byte packets
 1  pzawap03.svc (172.2.2.211)  0.114 ms   0.134 ms   0.016 ms
pzawap01:~ # traceroute pzawap03
traceroute to pzawap03 (172.2.2.211), 30 hops max, 40 byte packets
 1  pzawap03.svc (172.2.2.211)  0.055 ms   0.136 ms   0.024 ms
pzawap01:~ # traceroute pzawap03
traceroute to pzawap03 (172.2.2.211), 30 hops max, 40 byte packets
 1  pzawap03.svc (172.2.2.211)  0.000 ms * *


Results of two guests connected to the same unaware VSWITCH:


pzsdns01:~ # traceroute 192.1.1.28
traceroute to 192.1.1.28 (192.1.1.28), 30 hops max, 40 byte packets
 1  192.1.1.28 (192.1.1.28)  0.199 ms   0.036 ms   0.074 ms
pzsdns01:~ # traceroute 192.1.1.28
traceroute to 192.1.1.28 (192.1.1.28), 30 hops max, 40 byte packets
 1  192.1.1.28 (192.1.1.28)  0.189 ms   0.492 ms *
pzsdns01:~ # traceroute 192.1.1.28
traceroute to 192.1.1.28 (192.1.1.28), 30 hops max, 40 byte packets
 1  192.1.1.28 (192.1.1.28)  0.140 ms   0.038 ms   0.087 ms
pzsdns01:~ # traceroute 192.1.1.28
traceroute to 192.1.1.28 (192.1.1.28), 30 hops max, 40 byte packets
 1  192.1.1.28 (192.1.1.28)  0.244 ms   0.044 ms   0.026 ms
pzsdns01:~ # traceroute 192.1.1.28
traceroute to 192.1.1.28 (192.1.1.28), 30 hops max, 40 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  192.1.1.28 (192.1.1.28)  0.306 ms   0.196 ms   0.286 ms


Results from a guest on an unaware VSWITCH to a guest on a different unaware 
VSWITCH (same LPAR)


pzsdns01:~ # traceroute pzsadm01
traceroute to pzsadm01 (172.1.1.35), 30 hops max, 40 byte packets
 1  192.1.1.1 (192.1.1.1)  0.289 ms   0.254 ms   0.230 ms
 2  pzsadm01.svc (172.1.1.35)  0.271 ms   0.311 ms   0.324 ms
pzsdns01:~ # traceroute pzsadm01
traceroute to pzsadm01 (172.1.1.35), 30 hops max, 40 byte packets
 1  192.1.1.1 (192.1.1.1)  0.320 ms   0.276 ms   0.682 ms
 2  pzsadm01.svc (172.1.1.35)  0.321 ms   0.263 ms   0.296 ms
pzsdns01:~ # traceroute pzsadm01
traceroute to pzsadm01 (172.1.1.35), 30 hops max, 40 byte packets
 1  192.1.1.1 (192.1.1.1)  0.322 ms   0.278 ms   0.259 ms
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  pzsadm01.svc (172.1.1.35)  0.755 ms * *


Results from a guest on an aware VSWITCH to a guest on an aware VSWITCH on 
different LPARs (same VLAN)


pzawap01:~ # traceroute pzawap04
traceroute to pzawap04 (172.3.3.212), 30 hops max, 40 byte packets
 1  pzawap04.svc (172.3.3.212)  0.540 ms   0.298 ms   0.310 ms
pzawap01:~ # traceroute pzawap04
traceroute to pzawap04 (172.3.3.212), 30 hops max, 40 byte packets
 1  pzawap04.svc (172.3.3.212)  0.463 ms   0.312 ms   0.313 ms
pzawap01:~ # traceroute pzawap04
traceroute to pzawap04 (172.3.3.212), 30 hops max, 40 byte packets
 1  pzawap04.svc (172.3.3.212)  0.381 ms * *


All guests are SLES10-SP3 (kernel 2.6.16.60-0.76.8-default)

Leland

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------
For more information on Linux on System z, visit
http://wiki.linuxvm.org/
