Thanks a lot for the quick reply, Stefan.

The following is from the problem VM:

18:56:29  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
18:56:44    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.20      0.00
18:56:49    1    0.00    0.00    0.00    0.00    3.21   10.22    0.00   79.56    908.22
18:56:54    1    0.00    0.00    0.00    0.00   11.47   54.93    0.00    2.82   5527.77
18:56:59    1    0.00    0.00    0.00    0.00   10.04   66.06    0.00    2.21   7160.64
18:57:04    1    0.00    0.00    0.00    0.00   10.42   65.13    0.00    2.00   7295.99
18:57:09    1    0.00    0.00    0.00    0.00   12.53   50.51    0.00    4.04   5700.20
18:57:14    1    0.00    0.00    0.00    0.00   16.43   65.53    0.00    8.62   9572.34
18:57:19    1    0.00    0.00    0.00    0.00   11.45   60.64    0.00    4.02   5798.19
18:57:24    1    0.00    0.00    0.00    0.00   11.45   81.33    0.00    0.80   6064.26
18:57:29    1    0.00    0.00    0.00    0.00    7.65   85.11    0.00    0.80   7578.27
18:57:34    1    0.00    0.00    0.00    0.00    9.42   84.17    0.00    1.40   9083.97
18:57:39    1    0.00    0.00    0.00    0.00    7.78   82.83    0.00    1.60   7264.87
18:57:44    1    0.00    0.00    0.00    0.00    8.62   87.78    0.00    0.60   8597.80
18:57:49    1    0.00    0.00    0.00    0.00   10.02   82.16    0.00    2.40   7750.90
18:57:54    1    0.00    0.00    0.00    0.00    8.42   81.76    0.00    1.00   6303.41
18:57:59    1    0.00    0.00    0.00    0.00    7.63   87.35    0.00    1.20   9422.49
18:58:04    1    0.00    0.00    0.00    0.00   10.44   80.32    0.00    2.21   7496.79
18:58:09    1    0.00    0.00    0.00    0.00    6.43   59.84    0.00   26.91   5019.28
18:58:14    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      1.00
18:58:19    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00

I set the affinity of both the tx and rx interfaces to CPU 1, so I am
showing only CPU 1.

The NAPI weight is 128 in this version; I changed it to 64 just to see.
This version of the code seems to be changing quota and budget (which I
did not see in newer versions), and I am thinking of playing around with
that. I also see that this version kicks for every packet on the tx side.

Any other pointers would be really helpful.

Thanks,
Kal

On Wed, Mar 2, 2016 at 3:28 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:

> On Tue, Mar 01, 2016 at 04:06:16PM -0800, kalyan tata wrote:
> > Hi All,
> >
> > I am new to qemu development.
> > Sorry If this is not the correct forum for this question, it would be great
> > if you could direct me to correct forum.
> >
> > I am seeing very low virtio network throughput on an older (2.6.18) linux
> > guest vs another newer guest (3.10) both running on the same host. (same
> > config 2 vcpus, no multi Q etc.) I see very high CPU usage on the 2.6.18
> > guest at very low network throughput and want to profile to find
> > bottleneck.
> >
> > I tried to use "perf kvm" but the analysis shows overhead as max .25 %
> > where as top in VM shows 100% cpu. (I used following as a guide
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/Virtualization_Tuning_and_Optimization_Guide/index.html#sect-Virtualization_Tuning_Optimization_Guide-Monitoring_Tools-perf_kvm
> > )
> >
> >      0.25%  :5235  [uhci_hcd]     [g] 0xffffffff80182236
> >      0.24%  :5235  [uhci_hcd]     [g] 0xffffffff8018226a
> >      0.23%  :5235  [virtio_ring]  [g] vring_new_virtqueue
> >      0.20%  :5236  [uhci_hcd]     [g] 0xffffffff80182236
> >      0.18%  :5236  [uhci_hcd]     [g] 0xffffffff8018226a
> >      0.18%  :5235  [uhci_hcd]     [g] 0xffffffff8016f385
> >      0.14%  :5236  [uhci_hcd]     [g] 0xffffffff802fbe0f
> >      0.14%  :5235  [uhci_hcd]     [g] 0xffffffff8001161a
> >      0.14%  :5235  [virtio_ring]  [g] virtqueue_is_broken
> >
> >
> > My basic question is - Is there a way I can profile the older version of
> > linux guest so i can see the bottleneck (where the guest is spending CPU
> > cycles) My aim is to see if i can patch the older version in the critical
> > path with improvements made in newer version
>
> What is the output of "mpstat 5" in the guest and on the host?  mpstat
> is part of the "sysstat" package.
>
> mpstat is similar to vmstat but also shows "guest time" and "steal
> time".  Both are relevant to virtualization and will help show which
> component is using so much CPU time.
>
> Stefan
>
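
For reference, the CPU 1 samples from the mpstat output at the top of this message can be summarised with a short script. This is only a rough sketch: the function names and the 50% idle cut-off used to pick the "busy" samples are my own assumptions and not anything mpstat provides. It averages %irq and %soft over the busy interval and reports the peak intr/s.

```python
# Samples copied from the mpstat output in this message (CPU 1 only).
# Columns: time CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
MPSTAT_ROWS = """\
18:56:44 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.20 0.00
18:56:49 1 0.00 0.00 0.00 0.00 3.21 10.22 0.00 79.56 908.22
18:56:54 1 0.00 0.00 0.00 0.00 11.47 54.93 0.00 2.82 5527.77
18:56:59 1 0.00 0.00 0.00 0.00 10.04 66.06 0.00 2.21 7160.64
18:57:04 1 0.00 0.00 0.00 0.00 10.42 65.13 0.00 2.00 7295.99
18:57:09 1 0.00 0.00 0.00 0.00 12.53 50.51 0.00 4.04 5700.20
18:57:14 1 0.00 0.00 0.00 0.00 16.43 65.53 0.00 8.62 9572.34
18:57:19 1 0.00 0.00 0.00 0.00 11.45 60.64 0.00 4.02 5798.19
18:57:24 1 0.00 0.00 0.00 0.00 11.45 81.33 0.00 0.80 6064.26
18:57:29 1 0.00 0.00 0.00 0.00 7.65 85.11 0.00 0.80 7578.27
18:57:34 1 0.00 0.00 0.00 0.00 9.42 84.17 0.00 1.40 9083.97
18:57:39 1 0.00 0.00 0.00 0.00 7.78 82.83 0.00 1.60 7264.87
18:57:44 1 0.00 0.00 0.00 0.00 8.62 87.78 0.00 0.60 8597.80
18:57:49 1 0.00 0.00 0.00 0.00 10.02 82.16 0.00 2.40 7750.90
18:57:54 1 0.00 0.00 0.00 0.00 8.42 81.76 0.00 1.00 6303.41
18:57:59 1 0.00 0.00 0.00 0.00 7.63 87.35 0.00 1.20 9422.49
18:58:04 1 0.00 0.00 0.00 0.00 10.44 80.32 0.00 2.21 7496.79
18:58:09 1 0.00 0.00 0.00 0.00 6.43 59.84 0.00 26.91 5019.28
18:58:14 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 1.00
18:58:19 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
"""

COLUMNS = ["user", "nice", "sys", "iowait", "irq", "soft", "steal", "idle", "intr_s"]


def parse_rows(text):
    """Turn whitespace-separated mpstat rows into a list of dicts."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Expect: time, CPU number, then one value per name in COLUMNS.
        if len(fields) != 2 + len(COLUMNS):
            continue
        sample = dict(zip(COLUMNS, (float(f) for f in fields[2:])))
        sample["time"] = fields[0]
        samples.append(sample)
    return samples


def summarize(samples, idle_threshold=50.0):
    """Average %irq and %soft over samples whose %idle is below the threshold.

    The 50% threshold is an arbitrary assumption used to select the interval
    in which the guest was under load.
    """
    busy = [s for s in samples if s["idle"] < idle_threshold]
    if not busy:
        raise ValueError("no samples below the idle threshold")
    count = len(busy)
    return {
        "busy_samples": count,
        "mean_irq": sum(s["irq"] for s in busy) / count,
        "mean_soft": sum(s["soft"] for s in busy) / count,
        "peak_intr_s": max(s["intr_s"] for s in busy),
    }


if __name__ == "__main__":
    for name, value in summarize(parse_rows(MPSTAT_ROWS)).items():
        print(f"{name}: {value:.2f}")
```

With these numbers, almost all of the non-idle time during the busy interval is in %soft and %irq rather than in %user or %sys.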