David S. Ahern wrote:
> RHEL3 is in Maintenance mode (for an explanation see
> http://www.redhat.com/security/updates/errata/) which means performance
> enhancement patches will not make it in.

Scratch that idea, then.

Also, I'm going to be out of the office for a couple of weeks in July,
so I
Avi Kivity wrote:
> David S. Ahern wrote:
>> Avi:
>>
>> We did not get a chance to do this at the Forum. I'd be interested in
>> whatever options you have for reducing the scan time further (e.g., try
>> to get scan time down to 1-2 seconds).
>>
>> thanks,
>> david
>
> I'm unlikely to get time to do this properly for at least a week, as
> this will be
Avi Kivity wrote:
> David S. Ahern wrote:
>> I gave a shot at implementing your suggestion, but evidently I am still
>> not understanding the shadow implementation. Can you suggest a patch to
>> try this out?
>
> We can have a hacking session at the KVM Forum. Bring a guest on your
> laptop.
>
> It isn't going to be easy to both fi
Avi Kivity wrote:
> David S. Ahern wrote:
>>> Oh! Only 45K pages were direct, so the other 45K were shared, with
>>> perhaps many ptes. We should count ptes, not pages.
>>>
>>> Can you modify page_referenced() to count the number of ptes mapped (1
>>> for direct pages, nr_chains for indirect pages) and print the total
>>> deltas in acti
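In code, the request amounts to something like the sketch below. This is
a hypothetical illustration in the 2.4-rmap style under discussion;
PageDirect() and the page->pte.direct/page->pte.chain union follow that
design but are assumptions about the exact RHEL3 field names:

/*
 * Hypothetical: count how many ptes map a page (1 for a direct page,
 * the chain length for a shared page). Summed over an aging pass and
 * printk()ed as a delta, this would show whether the shared pages
 * carry the bulk of the pte work. Note the real RHEL3 pte_chain may
 * batch several pte slots per chain entry, in which case each entry's
 * occupied slots would be summed instead.
 */
static unsigned long count_ptes_mapped(struct page *page)
{
        struct pte_chain *pc;
        unsigned long nr = 0;

        if (PageDirect(page))
                return 1;               /* direct page: exactly one pte */

        for (pc = page->pte.chain; pc; pc = pc->next)
                nr++;                   /* shared: one count per mapping pte */

        return nr;
}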
David S. Ahern wrote:
I haven't been able to reproduce this:

[EMAIL PROTECTED] root]# ps -elf | grep -E 'memuser|kscand'
1 S root          7     1  1  75   0 -      0 schedu 10:07 ?        00:00:26 [kscand]
0 S root       1464     1  1  75   0 - 196986 schedu 10:20 pts/0    00:00:21 ./me
Andrea Arcangeli wrote:
On Thu, May 29, 2008 at 06:16:55PM +0300, Avi Kivity wrote:
> Yes. We need a fault in order to set the guest accessed bit.

So what I'm missing now is why the spte corresponding to the user pte
whose accessed bit is under test_and_clear will not be zapped
immediately. If we don't zap
Andrea Arcangeli wrote:
Here I count the second write, and this one isn't done on the fixmap
area like the first write above; it is a write to the real user pte,
pointed to by the fixmap. So if this write is emulated, it means the
shadow of the user pte pointing to the real data page is still active.
Andrea Arcangeli wrote:
On Thu, May 29, 2008 at 01:01:06PM +0300, Avi Kivity wrote:
> No, two:
>
> static inline void set_pte(pte_t *ptep, pte_t pte)
> {
>         ptep->pte_high = pte.pte_high;
>         smp_wmb();
>         ptep->pte_low = pte.pte_low;
> }

Right, that can be 2 or 1 depending on PAE vs. non-PAE; other 2.4
ente
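For contrast, the non-PAE definition is a single store. The version
below is a sketch of the classic 2.4-era form (the exact macro/inline
spelling varied between trees), which is why the count is "2 or 1":

/* Non-PAE sketch: a pte is one 32-bit word, so one store suffices. */
static inline void set_pte(pte_t *ptep, pte_t pte)
{
        *ptep = pte;
}

/*
 * Under PAE the pte is 64 bits wide but the CPU can observe each
 * 32-bit half separately, so pte_high is written first and pte_low
 * (which holds the present bit) last, with a write barrier between
 * them. From kvm's point of view that is two emulated writes per
 * guest pte update instead of one.
 */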
David S. Ahern wrote:
> This is 2.4/RHEL3, so HZ=100. 848 jiffies = 8.48 seconds -- and that's
> just the one age bucket, and this is just one example pulled randomly
> (well after boot). During that time kscand does get scheduled out, but
> ultimately guest time is at 100% during the scans.

Er, yes
Andrea Arcangeli wrote:
> - set up kmap to point at pte
> - test_and_clear_bit(pte)
> - kunmap
>
> From kvm's point of view this looks like
> - several accesses to set up the kmap

Hmm, the kmap establishment takes a single guest operation in the
fixmap area. That's a single write to the pte, to
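Spelled out, the guest-side sequence under discussion looks roughly like
the sketch below. kmap_atomic/kunmap_atomic and ptep_test_and_clear_young
are real kernel primitives of that era, but the function as a whole is an
illustration, not the actual RHEL3 call site:

/*
 * Sketch: aging one pte on a CONFIG_HIGHPTE kernel. The pte page may
 * live in highmem, so it is temporarily mapped through a fixmap slot.
 */
static int pte_young_and_clear(struct page *pte_page, unsigned long offset)
{
        pte_t *pte;
        int young;

        /* establish the kmap: one write to the fixmap pte */
        pte = (pte_t *) kmap_atomic(pte_page, KM_PTE0) + offset;

        /* read-modify-write of the real user pte (the second write) */
        young = ptep_test_and_clear_young(pte);

        /* tear down the kmap */
        kunmap_atomic(pte, KM_PTE0);

        return young;
}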
David S. Ahern wrote:
I have a clone of the kvm repository, but evidently I am not running the
right magic to see the changes in the per-page-pte-tracking branch. I
ran the following:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
git branch per-page-pte-tracking

[EMAIL PROTECTED] kvm]$ git branch
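For what it's worth, "git branch per-page-pte-tracking" only creates a
new local branch at the current HEAD; it does not fetch or switch to the
published branch. Assuming the branch exists on that remote, checking
out the remote-tracking ref is what's needed:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
cd kvm
git branch -r                        # list remote-tracking branches
git checkout -b per-page-pte-tracking origin/per-page-pte-tracking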
David S. Ahern wrote:
Yes, I've tried changing kscand_work_percent (values of 50 and 30).
Basically it makes kscand wake more often (i.e., MIN_AGING_INTERVAL
declines in proportion) but do less work each trip through the lists.
I have not seen a noticeable change in guest behavior.

david
Andrea Arcangeli wrote:
On Wed, May 28, 2008 at 05:57:21PM +0300, Avi Kivity wrote:
> What about CONFIG_HIGHPTE?

Ah yes, sorry! Official 2.4 has no highpte capability, but surely RH
backported highpte to 2.4, so that would explain the CPU time spent in
kswapd in _guest_ context.

If highpte is the problem and you've troubles r
David S. Ahern wrote:
This is the code in the RHEL3.8 kernel:

static int scan_active_list(struct zone_struct * zone, int age,
                            struct list_head * list, int count)
{
        struct list_head *page_lru, *next;
        struct page * page;
        int over_rsslimit;

        count = count * kscand_work_perc
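For readers without the RHEL3 tree at hand, the function presumably
continues along these lines. This is a paraphrase assembled from the
discussion in this thread, not the verbatim RHEL3 code; in particular
the page_referenced() signature with an over_rsslimit out-parameter and
the age_page_up()/age_page_down() calls are assumptions:

        count = count * kscand_work_percent / 100;

        list_for_each_safe(page_lru, next, list) {
                page = list_entry(page_lru, struct page, lru);

                /*
                 * page_referenced() test-and-clears the accessed bit of
                 * every pte mapping the page; in a shadowed guest each
                 * of those pte writes traps into kvm, which is where
                 * the scan time goes.
                 */
                if (page_referenced(page, &over_rsslimit) && !over_rsslimit)
                        age_page_up(page);
                else
                        age_page_down(page);

                if (!--count)
                        break;
        }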
David S. Ahern wrote:
I've been instrumenting the guest kernel as well. It's the scanning of
the active lists that triggers a lot of calls to paging64_prefetch_page
and, as you guys know, that correlates with the number of direct pages
in the list. Earlier in this thread I traced the kvm cycles to
paging64_prefetch_page().
Andrea Arcangeli wrote:
So I never found a relation between the reported symptom of guest VM
kernel threads going weird and KVM's handling of kmap ptes, which looks
optimal. The problem is this code:

static int scan_active_list(struct zone_struct * zone, int age,
                            struct list_head * list)
{
Andrea Arcangeli wrote:
On Wed, May 28, 2008 at 08:13:44AM -0600, David S. Ahern wrote:
> Weird. Could it be something about the hosts?

Note that the VM itself will never make use of kmap. The VM is "data"
agnostic: it never has any idea about the data contained in the pages.
kmap/kmap_atomic/kunmap_atomic are only n
David S. Ahern wrote:
> Weird. Could it be something about the hosts?
>
> I have been running these tests on a DL320G5 with a Xeon 3050 CPU, 2.13
> GHz. Host OS is Fedora 8 with the 2.6.25.3 kernel.
>
> I'll rebuild kvm-69 with your latest patch and try the test programs
> again.

I've pushed it into kvm
David S. Ahern wrote:
The short answer is that I am still seeing large system time hiccups in
the guests due to kscand in the guest scanning its active lists. I do
see better response for a KVM_MAX_PTE_HISTORY of 3 than with 4. (For
completeness I also tried a history of 2, but it performed worse than 3,
which is no surpris
Avi Kivity wrote:
The answer turns out to be "yes", so here's a patch that adds a pte
access history table for each shadowed guest page-table. Let me know
if it helps. Benchmarking a variety of workloads on all guests
supported by kvm is left as an exercise for the reader, but I suspect
th
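Conceptually, the direction is something like the sketch below. This is
an illustration of the idea only, not Avi's actual patch; the struct
layout and helper are assumptions (gpa_t is kvm's guest-physical-address
type, and KVM_MAX_PTE_HISTORY is the knob the thread experiments with at
values 2, 3 and 4):

#define KVM_MAX_PTE_HISTORY 4

/*
 * Sketch: remember the last few guest pte offsets written through a
 * shadowed page table. A table whose recent writes keep landing on
 * fresh ptes (a linear scan such as kscand's aging pass) can then be
 * told apart from one being genuinely rewritten, instead of relying
 * on a single per-vcpu write-flood counter.
 */
struct kvm_pte_history {
        gpa_t writes[KVM_MAX_PTE_HISTORY];      /* recent write offsets */
        int   head;                             /* next slot to reuse */
};

static void pte_history_record(struct kvm_pte_history *h, gpa_t gpa)
{
        h->writes[h->head] = gpa;
        h->head = (h->head + 1) % KVM_MAX_PTE_HISTORY;
}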
Avi Kivity wrote:
There are (at least) three options available:
- detect and special-case this scenario
- change the flood detector to be per page table instead of per vcpu
- change the flood detector to look at a list of recently used page
tables instead of the last page table
I'm having a h
David S. Ahern wrote:
> Does the fact that the hugemem kernel works just fine have any bearing
> on your options? Or rather, is there something unique about the way
> kscand works in the hugemem kernel that makes its performance ok?
>
> I mentioned last month (so without your first patch) that running the
> hugemem kernel showed a re

Yes. If your guest has < 4GB of memory, then all of it is lowmem i
David S. Ahern wrote:
[dsa] No. I saw the same problem with the flood count at 5. The
attachment in the last email shows kvm_stat data during a kscand event.
The data was collected with the patch you posted. With the flood count
at 3, the mmu cache/flood counters are in the 18,000/sec range and pte
upda
[resend to new list].
David S. Ahern wrote:
> I was just digging through the sysstat history files, and I was not
> imagining it: I did have an excellent overnight run on 5/13-5/14 with
> your patch and the standard RHEL3U8 smp kernel in the guest. I have no
> idea why I cannot get anywhere close