Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-19 Thread Gleb Natapov
On Wed, Apr 18, 2012 at 09:44:47PM -0700, Chegu Vinod wrote:
 On 4/17/2012 6:25 AM, Chegu Vinod wrote:
 On 4/17/2012 2:49 AM, Gleb Natapov wrote:
 On Mon, Apr 16, 2012 at 07:44:39AM -0700, Chegu Vinod wrote:
 On 4/16/2012 5:18 AM, Gleb Natapov wrote:
 On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:
 On 04/11/2012 01:21 PM, Chegu Vinod wrote:
 Hello,
 
 While running AIM7 (workfile.high_systime) in a single 40-way (or a single
 60-way) KVM guest, I noticed pretty bad performance when the guest was
 booted with the 3.3.1 kernel compared to the same guest booted with the
 2.6.32-220 (RHEL6.2) kernel.
 For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x
 better than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run
 the older guest kernel was nearly 12x better!
 How many CPUs does your host have?
 80 cores on the DL980 (i.e. 8 Westmere sockets).
 
 So you are not oversubscribing CPUs at all. Are those real cores,
 or does that count include HT?
 
 HT is off.
 
 Do you have other CPU hogs running on the host while testing the guest?
 
 Nope.  Sometimes I do run utilities like perf, sar, or
 mpstat on NUMA node 0 (where
 the guest is not running).
 
 
 I was using numactl to bind the qemu of the 40-way guest to NUMA
 nodes 4-7 (or, for a 60-way guest,
 binding it to nodes 2-7).
 
 /etc/qemu-ifup tap0
 
 numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7
 /usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu 
 Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
 -enable-kvm \
 -m 65536 -smp 40 \
 -name vm1 -chardev 
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait
 \
 -drive 
 file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
 -monitor stdio \
 -net nic,macaddr=..mac_addr..  \
 -net tap,ifname=tap0,script=no,downscript=no \
 -vnc :4
 
 /etc/qemu-ifdown tap0
 
 
 I knew that there would be a few additional temporary qemu worker
 threads created, i.e. there would be some
 oversubscription.
 
 The 4 nodes above have 40 real cores, yes?
 
 Yes.
 Other than qemu's related threads and some of the generic
 per-CPU Linux kernel threads (e.g. migration, etc.),
 there isn't anything else running on these NUMA nodes.
 
 Can you try to run the upstream
 kernel without binding at all and check the performance?
 
 
 Re-ran the same workload *without* binding the qemu, but using the
 3.3.1 kernel:
 
 20-way guest: performance got much worse when compared to the case
 where we bind the qemu.
 40-way guest: about the same as in the case where we bind the qemu.
 60-way guest: about the same as in the case where we bind the qemu.
 
 Trying out a couple of other experiments...
 
With 8 sockets the NUMA effects are probably very strong. A couple of things to
try:
1. Run a VM that fits into one NUMA node and bind it to that node. Compare the
   performance of the RHEL kernel and the upstream kernel.
2. Run a VM bigger than a NUMA node, bind the vcpus to NUMA nodes separately,
   and pass the resulting topology to the guest using the -numa flag (see the
   sketch below).
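
A rough sketch of option 2 for the 40-way case. The two-node guest split, the
host cpu numbers, and the vcpu-thread pinning step below are illustrative
assumptions (and the exact -numa option spelling is worth double-checking
against the qemu version used), not details taken from the runs above:

  numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 \
  /usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu Westmere -m 65536 -smp 40 \
  -numa node,nodeid=0,cpus=0-19,mem=32768 \
  -numa node,nodeid=1,cpus=20-39,mem=32768 \
  -drive file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
  -monitor stdio -vnc :4

  # Then pin each vcpu thread to a host node, using the thread ids reported
  # by "info cpus" in the qemu monitor, e.g. (example cpu range and tid):
  taskset -pc 40-49 <vcpu0-thread-id>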

--
Gleb.


Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-18 Thread Chegu Vinod

On 4/17/2012 6:25 AM, Chegu Vinod wrote:

On 4/17/2012 2:49 AM, Gleb Natapov wrote:

On Mon, Apr 16, 2012 at 07:44:39AM -0700, Chegu Vinod wrote:

On 4/16/2012 5:18 AM, Gleb Natapov wrote:

On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:

On 04/11/2012 01:21 PM, Chegu Vinod wrote:

Hello,

While running AIM7 (workfile.high_systime) in a single 40-way (or a single
60-way) KVM guest, I noticed pretty bad performance when the guest was booted
with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
(RHEL6.2) kernel.

For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x
better than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run
the older guest kernel was nearly 12x better!

How many CPUs does your host have?

80 cores on the DL980 (i.e. 8 Westmere sockets).

So you are not oversubscribing CPUs at all. Are those real cores, or does that
count include HT?


HT is off.


Do you have other CPU hogs running on the host while testing the guest?


Nope.  Sometimes I do run utilities like perf, sar, or
mpstat on NUMA node 0 (where
the guest is not running).




I was using numactl to bind the qemu of the 40-way guest to NUMA
nodes 4-7 (or, for a 60-way guest,
binding it to nodes 2-7).

/etc/qemu-ifup tap0

numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7
/usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu 
Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme

-enable-kvm \
-m 65536 -smp 40 \
-name vm1 -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait

\
-drive 
file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none

-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-monitor stdio \
-net nic,macaddr=..mac_addr..  \
-net tap,ifname=tap0,script=no,downscript=no \
-vnc :4

/etc/qemu-ifdown tap0


I knew that there would be a few additional temporary qemu worker
threads created, i.e. there would be some
oversubscription.


The 4 nodes above have 40 real cores, yes?


Yes.
Other than qemu's related threads and some of the generic per-CPU
Linux kernel threads (e.g. migration, etc.),
there isn't anything else running on these NUMA nodes.


Can you try to run the upstream
kernel without binding at all and check the performance?




Re-ran the same workload *without* binding the qemu, but using the
3.3.1 kernel:

20-way guest: performance got much worse when compared to the case where
we bind the qemu.
40-way guest: about the same as in the case where we bind the qemu.
60-way guest: about the same as in the case where we bind the qemu.

Trying out a couple of other experiments...

FYI
Vinod





Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-17 Thread Gleb Natapov
On Mon, Apr 16, 2012 at 07:44:39AM -0700, Chegu Vinod wrote:
 On 4/16/2012 5:18 AM, Gleb Natapov wrote:
 On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:
 On 04/11/2012 01:21 PM, Chegu Vinod wrote:
 Hello,
 
 While running AIM7 (workfile.high_systime) in a single 40-way (or a
 single 60-way) KVM guest, I noticed pretty bad performance when the guest
 was booted with the 3.3.1 kernel compared to the same guest booted with
 the 2.6.32-220 (RHEL6.2) kernel.
 For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x
 better than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run
 the older guest kernel was nearly 12x better!
 How many CPUs does your host have?
 
 80 cores on the DL980 (i.e. 8 Westmere sockets).
 
So you are not oversubscribing CPUs at all. Are those real cores, or does that
count include HT?
Do you have other CPU hogs running on the host while testing the guest?

 I was using numactl to bind the qemu of the 40-way guest to NUMA
 nodes 4-7 (or, for a 60-way guest,
 binding it to nodes 2-7).
 
 /etc/qemu-ifup tap0
 
 numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7
 /usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu 
 Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
 -enable-kvm \
 -m 65536 -smp 40 \
 -name vm1 -chardev 
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait
 \
 -drive 
 file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
 -monitor stdio \
 -net nic,macaddr=..mac_addr.. \
 -net tap,ifname=tap0,script=no,downscript=no \
 -vnc :4
 
 /etc/qemu-ifdown tap0
 
 
 I knew that there would be a few additional temporary qemu worker
 threads created, i.e. there would be some
 oversubscription.
 
The 4 nodes above have 40 real cores, yes? Can you try to run the upstream
kernel without binding at all and check the performance?

 
 Will have to retry by doing some explicit pinning of the vcpus to
 native cores (without using virsh).
 
 Turned on function tracing and found that there appears to be more time 
 being
 spent around the lock code in the 3.3.1 guest when compared to the 
 2.6.32-220
 guest.
 Looks like you may be running into the ticket spinlock
 code. During the early RHEL 6 days, Gleb came up with a
 patch to automatically disable ticket spinlocks when
 running inside a KVM guest.
 
 IIRC that patch got rejected upstream at the time,
 with upstream developers preferring to wait for a
 better solution.
 
 If such a better solution is not on its way upstream
 now (two years later), maybe we should just merge
 Gleb's patch upstream for the time being?
 I think the pv spinlock that is actively being discussed currently should
 address the issue, but I am not sure anyone has tested it against a
 non-ticket lock in a guest to see which one performs better.
 
 I did see that discussion...seems to have originated from the Xen context.
 
Yes, the problem is the same for both hypervisors.

--
Gleb.


Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-17 Thread Chegu Vinod

On 4/17/2012 2:49 AM, Gleb Natapov wrote:

On Mon, Apr 16, 2012 at 07:44:39AM -0700, Chegu Vinod wrote:

On 4/16/2012 5:18 AM, Gleb Natapov wrote:

On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:

On 04/11/2012 01:21 PM, Chegu Vinod wrote:

Hello,

While running AIM7 (workfile.high_systime) in a single 40-way (or a single
60-way) KVM guest, I noticed pretty bad performance when the guest was booted
with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
(RHEL6.2) kernel.
For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x better than
Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run the older guest
kernel was nearly 12x better!

How many CPUs does your host have?

80 cores on the DL980 (i.e. 8 Westmere sockets).


So you are not oversubscribing CPUs at all. Are those real cores, or does that
count include HT?


HT is off.


Do you have other CPU hogs running on the host while testing the guest?


Nope.  Sometimes I do run utilities like perf, sar, or mpstat
on NUMA node 0 (where
the guest is not running).




I was using numactl to bind the qemu of the 40-way guest to NUMA
nodes 4-7 (or, for a 60-way guest,
binding it to nodes 2-7).

/etc/qemu-ifup tap0

numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7
/usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu 
Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
-enable-kvm \
-m 65536 -smp 40 \
-name vm1 -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait
\
-drive 
file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-monitor stdio \
-net nic,macaddr=..mac_addr..  \
-net tap,ifname=tap0,script=no,downscript=no \
-vnc :4

/etc/qemu-ifdown tap0


I knew that there would be a few additional temporary qemu worker
threads created, i.e. there would be some
oversubscription.


The 4 nodes above have 40 real cores, yes?


Yes.
Other than qemu's related threads and some of the generic per-CPU
Linux kernel threads (e.g. migration, etc.),
there isn't anything else running on these NUMA nodes.


Can you try to run the upstream
kernel without binding at all and check the performance?



I shall re-run and get back to you with this info.

Typically for the native runs, binding the workload results in better
numbers. Hence I chose to do the binding for the guest too, i.e. on the
same NUMA nodes as the native case, for virt. vs. native comparison
purposes. Having said that, in the past I had seen a couple of cases
where the non-bound guest performed better than the native case. Need to
re-run and dig into this further...





Will have to retry by doing some explicit pinning of the vcpus to
native cores (without using virsh).


Turned on function tracing and found that there appears to be more time being
spent around the lock code in the 3.3.1 guest when compared to the 2.6.32-220
guest.

Looks like you may be running into the ticket spinlock
code. During the early RHEL 6 days, Gleb came up with a
patch to automatically disable ticket spinlocks when
running inside a KVM guest.

IIRC that patch got rejected upstream at the time,
with upstream developers preferring to wait for a
better solution.

If such a better solution is not on its way upstream
now (two years later), maybe we should just merge
Gleb's patch upstream for the time being?

I think the pv spinlock that is actively being discussed currently should
address the issue, but I am not sure anyone has tested it against a non-ticket
lock in a guest to see which one performs better.

I did see that discussion...seems to have originated from the Xen context.


Yes, the problem is the same for both hypervisors.

--
Gleb.


Thanks
Vinod



Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-16 Thread Gleb Natapov
On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:
 On 04/11/2012 01:21 PM, Chegu Vinod wrote:
 
 Hello,
 
 While running AIM7 (workfile.high_systime) in a single 40-way (or a single
 60-way) KVM guest, I noticed pretty bad performance when the guest was booted
 with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
 (RHEL6.2) kernel.
 
 For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x
 better than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run
 the older guest kernel was nearly 12x better!
 
How many CPUs does your host have?

 Turned on function tracing and found that there appears to be more time being
 spent around the lock code in the 3.3.1 guest when compared to the 2.6.32-220
 guest.
 
 Looks like you may be running into the ticket spinlock
 code. During the early RHEL 6 days, Gleb came up with a
 patch to automatically disable ticket spinlocks when
 running inside a KVM guest.
 
 IIRC that patch got rejected upstream at the time,
 with upstream developers preferring to wait for a
 better solution.
 
 If such a better solution is not on its way upstream
 now (two years later), maybe we should just merge
 Gleb's patch upstream for the time being?
I think the pv spinlock that is actively being discussed currently should
address the issue, but I am not sure anyone has tested it against a non-ticket
lock in a guest to see which one performs better.

--
Gleb.


Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-16 Thread Chegu Vinod

On 4/16/2012 5:18 AM, Gleb Natapov wrote:

On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:

On 04/11/2012 01:21 PM, Chegu Vinod wrote:

Hello,

While running AIM7 (workfile.high_systime) in a single 40-way (or a single
60-way) KVM guest, I noticed pretty bad performance when the guest was booted
with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
(RHEL6.2) kernel.
For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x better than
Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run the older guest
kernel was nearly 12x better!

How many CPUs does your host have?

80 cores on the DL980 (i.e. 8 Westmere sockets).

I was using numactl to bind the qemu of the 40-way guest to NUMA nodes
4-7 (or, for a 60-way guest,
binding it to nodes 2-7).

/etc/qemu-ifup tap0

numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 
/usr/local/bin/qemu-system-x86_64 -enable-kvm -cpu 
Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme 
-enable-kvm \

-m 65536 -smp 40 \
-name vm1 -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait 
\
-drive 
file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none 
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-monitor stdio \
-net nic,macaddr=..mac_addr.. \
-net tap,ifname=tap0,script=no,downscript=no \
-vnc :4

/etc/qemu-ifdown tap0


I knew that there would be a few additional temporary qemu worker threads
created, i.e. there would be some
oversubscription.


Will have to retry by doing some explicit pinning of the vcpus to native 
cores (without using virsh).



Turned on function tracing and found that there appears to be more time being
spent around the lock code in the 3.3.1 guest when compared to the 2.6.32-220
guest.

Looks like you may be running into the ticket spinlock
code. During the early RHEL 6 days, Gleb came up with a
patch to automatically disable ticket spinlocks when
running inside a KVM guest.

IIRC that patch got rejected upstream at the time,
with upstream developers preferring to wait for a
better solution.

If such a better solution is not on its way upstream
now (two years later), maybe we should just merge
Gleb's patch upstream for the time being?

I think the pv spinlock that is actively being discussed currently should
address the issue, but I am not sure anyone has tested it against a non-ticket
lock in a guest to see which one performs better.


I did see that discussion...seems to have originated from the Xen context.

Vinod



--
Gleb.





Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-15 Thread Chegu Vinod
Rik van Riel riel at redhat.com writes:

 
 On 04/11/2012 01:21 PM, Chegu Vinod wrote:
 
  Hello,
 
  While running AIM7 (workfile.high_systime) in a single 40-way (or a single
  60-way) KVM guest, I noticed pretty bad performance when the guest was booted
  with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
  (RHEL6.2) kernel.
 
  For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x
  better than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run
  the older guest kernel was nearly 12x better!
 
  Turned on function tracing and found that there appears to be more time
  being spent around the lock code in the 3.3.1 guest when compared to the
  2.6.32-220 guest.
 
 Looks like you may be running into the ticket spinlock
 code. During the early RHEL 6 days, Gleb came up with a
 patch to automatically disable ticket spinlocks when
 running inside a KVM guest.
 

Thanks for the pointer. 
Perhaps that is the issue.  
I did look up that old discussion thread.


 IIRC that patch got rejected upstream at the time,
 with upstream developers preferring to wait for a
 better solution.
 
 If such a better solution is not on its way upstream
 now (two years later), maybe we should just merge
 Gleb's patch upstream for the time being?



Also noticed a recent discussion thread (that originated from the Xen context)

http://article.gmane.org/gmane.linux.kernel.virtualization/15078

Not yet sure if this recent discussion is also in some way related to
the older one initiated by Gleb.

Thanks
Vinod





Re: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-12 Thread Rik van Riel

On 04/11/2012 01:21 PM, Chegu Vinod wrote:


Hello,

While running AIM7 (workfile.high_systime) in a single 40-way (or a single
60-way) KVM guest, I noticed pretty bad performance when the guest was booted
with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
(RHEL6.2) kernel.

For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x better than
Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run the older guest
kernel was nearly 12x better!



Turned on function tracing and found that there appears to be more time being
spent around the lock code in the 3.3.1 guest when compared to the 2.6.32-220
guest.


Looks like you may be running into the ticket spinlock
code. During the early RHEL 6 days, Gleb came up with a
patch to automatically disable ticket spinlocks when
running inside a KVM guest.

IIRC that patch got rejected upstream at the time,
with upstream developers preferring to wait for a
better solution.

If such a better solution is not on its way upstream
now (two years later), maybe we should just merge
Gleb's patch upstream for the time being?


Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS

2012-04-11 Thread Chegu Vinod

Hello,

While running AIM7 (workfile.high_systime) in a single 40-way (or a single
60-way) KVM guest, I noticed pretty bad performance when the guest was booted
with the 3.3.1 kernel compared to the same guest booted with the 2.6.32-220
(RHEL6.2) kernel.

I am still trying to dig more into the details here. Wondering if some changes in
the upstream kernel (i.e. since 2.6.32-220) might be causing this to show up in
a guest environment (esp. for this system-intensive workload).

Has anyone else observed this kind of behavior? Is it a known issue with a fix
in the pipeline? If not, are there any special knobs/tunables that one needs to
explicitly set/clear etc. when using newer kernels like 3.3.1 in a guest?

I have included some info. below. 

Also, any pointers on what else I could capture would be helpful.

Thanks!
Vinod

---

Platform used:
DL980 G7 (80 cores + 128G RAM).  Hyper-threading is turned off.

Workload used:
AIM7 (workfile.high_systime), using RAM disks. This is
primarily a CPU-intensive workload... not much I/O.

Software used:
qemu-system-x86_64   :  1.0.50 (i.e. latest as of about a week or so ago).
Native/Host  OS  :  3.3.1 (SLUB allocator explicitly enabled)
Guest-RunA   OS  :  2.6.32-220 (i.e. RHEL6.2 kernel)
Guest-RunB   OS  :  3.3.1

Guest was pinned on:
numa nodes 4,5,6,7       -  40 VCPUs + 64G   (i.e. 40-way guest)
numa nodes 2,3,4,5,6,7   -  60 VCPUs + 96G   (i.e. 60-way guest)
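
(A quick way to sanity-check this kind of pinning from the host, as a sketch;
the assumption that only one qemu-system-x86_64 instance is running is mine:)

numactl --hardware     # shows which host cpus and how much memory belong to each node
grep Cpus_allowed_list /proc/$(pidof qemu-system-x86_64)/status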

For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x better
than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest run the older
guest kernel was nearly 12x better!

For the Guest-RunB (3.3.1) case I ran "mpstat -P ALL 1" on the host and
observed that a very high % of time was being spent by the CPUs outside
guest mode and mostly in the host (i.e. sys). Looking at the perf related
traces, it seemed like there were long pauses in the guest, perhaps waiting
for the zone->lru_lock as part of release_pages(), and this caused the VT
PLE related code to kick in on the host.
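
(One way to check the PLE theory from the host side, as a sketch; it assumes the
kvm tracepoints are available on the host kernel, and the 10 second window is
arbitrary:)

perf stat -e 'kvm:kvm_exit' -a sleep 10     # count all guest exits host-wide

perf record -a -e 'kvm:kvm_exit' sleep 10   # record exits along with their exit reasons;
perf script | less                          # PAUSE-induced (PLE) exits show up with their
                                            # own exit reason in the trace output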

Turned on function tracing and found that there appears to be more time being
spent around the lock code in the 3.3.1 guest when compared to the 2.6.32-220
guest.  Here is a small sampling of these traces... Notice the time stamp jump
around _spin_lock_irqsave <-release_pages in the case of Guest-RunB.


1) 40-way Guest-RunA (2.6.32-220 kernel):
-


#   TASK-PID   CPU#  TIMESTAMP  FUNCTION

   <...>-32147 [020] 145783.127452: native_flush_tlb <-flush_tlb_mm
   <...>-32147 [020] 145783.127452: free_pages_and_swap_cache <-unmap_region
   <...>-32147 [020] 145783.127452: lru_add_drain <-free_pages_and_swap_cache
   <...>-32147 [020] 145783.127452: release_pages <-free_pages_and_swap_cache
   <...>-32147 [020] 145783.127452: _spin_lock_irqsave <-release_pages
   <...>-32147 [020] 145783.127452: __mod_zone_page_state <-release_pages
   <...>-32147 [020] 145783.127452: mem_cgroup_del_lru_list <-release_pages

...

   <...>-32147 [022] 145783.133536: release_pages <-free_pages_and_swap_cache
   <...>-32147 [022] 145783.133536: _spin_lock_irqsave <-release_pages
   <...>-32147 [022] 145783.133536: __mod_zone_page_state <-release_pages
   <...>-32147 [022] 145783.133536: mem_cgroup_del_lru_list <-release_pages
   <...>-32147 [022] 145783.133537: lookup_page_cgroup <-mem_cgroup_del_lru_list




2) 40-way Guest-RunB (3.3.1):
-


#   TASK-PID   CPU#  TIMESTAMP  FUNCTION
   <...>-16459 [009]  101757.383125: free_pages_and_swap_cache <-tlb_flush_mmu
   <...>-16459 [009]  101757.383125: lru_add_drain <-free_pages_and_swap_cache
   <...>-16459 [009]  101757.383125: release_pages <-free_pages_and_swap_cache
   <...>-16459 [009]  101757.383125: _raw_spin_lock_irqsave <-release_pages
   <...>-16459 [009] d... 101757.384861: mem_cgroup_lru_del_list <-release_pages
   <...>-16459 [009] d... 101757.384861: lookup_page_cgroup <-mem_cgroup_lru_del_list


   <...>-16459 [009] .N.. 101757.390385: release_pages <-free_pages_and_swap_cache
   <...>-16459 [009] .N.. 101757.390385: _raw_spin_lock_irqsave <-release_pages
   <...>-16459 [009] dN.. 101757.392983: mem_cgroup_lru_del_list <-release_pages
   <...>-16459 [009] dN.. 101757.392983: lookup_page_cgroup <-mem_cgroup_lru_del_list
   <...>-16459 [009] dN.. 101757.392983: __mod_zone_page_state <-release_pages
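
(For reference, the kind of function-tracer setup that produces listings like
the ones above, as a sketch; it assumes debugfs is mounted at /sys/kernel/debug
inside the guest, and the filter patterns are illustrative rather than the exact
ones used for these runs:)

cd /sys/kernel/debug/tracing
echo function > current_tracer             # plain function tracer ("func <-caller" lines)
echo 'release_pages *spin_lock_irqsave*' > set_ftrace_filter   # optional; an empty filter traces everything
echo 1 > tracing_on
# ... run the AIM7 workload for a short while ...
echo 0 > tracing_on
cat trace > /tmp/guest-ftrace.txt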



