RE: Xen common code across architecture
Dong, Eddie wrote:
> Jeremy/Andrew:
>
> Isaku Yamahata, I and some other IA64/Xen community members are
> working together to enable pv_ops for IA64 Linux. This patch is a
> preparation to move common arch/x86/xen/events.c to drivers/xen
> (contents are identical) against mm tree, it is based on Yamahata's
> IA64/pv_ops patch series.
> In case you want to have a brief view of the whole pv_ops/IA64 patch
> series, please refer to the IA64 Linux mailing list.
>
> Thanks, Eddie
>

Fix a typo. Merged one is attached too.

Signed-off-by: Yaozu (Eddie) Dong <[EMAIL PROTECTED]>

--- drivers/xen/events_old.c	2008-03-25 14:31:40.503525471 +0800
+++ drivers/xen/events.c	2008-03-25 14:19:39.841851430 +0800
@@ -37,7 +37,7 @@
 #include
 #include
-#include "xen-ops.h"
+#include
 /*
  * This lock protects updates to the following mapping and reference-count

typo
Description: typo
move_xenirq3.patch
Description: move_xenirq3.patch

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable
Avi Kivity wrote:
> Well, dup_mm() can't work (and now that I think about it, for more
> reasons -- what if the process has threads?).

We lock out multithreaded users already, -EINVAL.
Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable
Carsten Otte wrote:
> Avi Kivity wrote:
>> Well, dup_mm() can't work (and now that I think about it, for more
>> reasons -- what if the process has threads?).
> We lock out multithreaded users already, -EINVAL.
>

Would be much better if this can be avoided. It's surprising.

--
Any sufficiently difficult bug is indistinguishable from a feature.
Re: [Xen-devel] Re: Xen paravirt frontend block hang
Jeremy Fitzhardinge wrote:
> Christopher S. Aker wrote:
>> Jeremy Fitzhardinge wrote:
>>> Are you running an SMP or UP domain? I found I could get hangs very
>>> easily with UP (but I need to confirm it isn't a result of some other
>>> very experimental patches).
>>
>> The hang occurs with both SMP and UP compiled pv_ops kernels. SMP
>> kernels are still slightly responsive after the hang occurs, which
>> makes me think only one proc gets stuck at a time, not the entire kernel.
>
> The patch I posted yesterday - "xen: fix RMW when unmasking events" -
> should definitively fix the hanging-under-load bugs (I hope).

Confirmed-by: [EMAIL PROTECTED]

Nice work!

-Chris
Re: [RFC/PATCH 02/15 v2] preparation: host memory management changes for s390 kvm
On Sat, 22 Mar 2008 18:02:39 +0100 Carsten Otte <[EMAIL PROTECTED]> wrote:

> From: Heiko Carstens <[EMAIL PROTECTED]>
> From: Christian Borntraeger <[EMAIL PROTECTED]>
>
> This patch changes the s390 memory management definitions to use the pgste
> field for dirty and reference bit tracking of host and guest code. Usually
> on s390, dirty and referenced are tracked in storage keys, which belong to
> the physical page. This changes with virtualization: The guest and host
> dirty/reference bits are defined to be the logical OR of the values for the
> mapping and the physical page. This patch implements the necessary changes
> in pgtable.h for s390.
>
> There is a common code change in mm/rmap.c, the call to
> page_test_and_clear_young must be moved. This is a no-op for all
> architectures but s390. page_referenced checks the referenced bits for the
> physical page and for all mappings:
> o The physical page is checked with page_test_and_clear_young.
> o The mappings are checked with ptep_test_and_clear_young and friends.
>
> Without pgstes (the current implementation on Linux s390) the physical page
> check is implemented but the mapping callbacks are no-ops because dirty
> and referenced are not tracked in the s390 page tables. The pgstes introduce
> guest and host dirty and reference bits for s390 in the host mapping. These
> mappings must be checked before page_test_and_clear_young resets the
> reference bit.
>
> ...
>
> --- linux-host.orig/mm/rmap.c
> +++ linux-host/mm/rmap.c
> @@ -413,9 +413,6 @@ int page_referenced(struct page *page, i
>  {
>  	int referenced = 0;
>
> -	if (page_test_and_clear_young(page))
> -		referenced++;
> -
>  	if (TestClearPageReferenced(page))
>  		referenced++;
>
> @@ -433,6 +430,10 @@ int page_referenced(struct page *page, i
>  			unlock_page(page);
>  		}
>  	}
> +
> +	if (page_test_and_clear_young(page))
> +		referenced++;
> +
>  	return referenced;
>  }

ack.
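To make the ordering argument in the patch description concrete, here is a small userspace toy model in C. It is only a sketch, not kernel code: every `toy_` type, field and function is hypothetical, and the behaviour of the mapping check (it looks at the page-level bit and saves it for the guest view) is an assumption derived from the "logical OR" wording above, not something taken from the actual pgtable.h changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names below are hypothetical, not kernel API. */
struct toy_page {
	bool hw_young;		/* stands in for the page-level reference bit */
};

struct toy_mapping {
	bool host_young;	/* host reference bit kept with the mapping */
	bool guest_young;	/* guest view, filled in by the mapping check */
};

/* Physical check: report and reset the page-level reference bit. */
bool toy_page_test_and_clear_young(struct toy_page *page)
{
	bool young = page->hw_young;

	page->hw_young = false;
	return young;
}

/*
 * Mapping check (assumed behaviour): referenced means "mapping bit OR
 * page-level bit". The page-level bit is copied into the guest view so
 * that a later reset of the page-level bit does not lose it.
 */
bool toy_ptep_test_and_clear_young(struct toy_page *page,
				   struct toy_mapping *map)
{
	bool young = map->host_young || page->hw_young;

	if (page->hw_young)
		map->guest_young = true;
	map->host_young = false;
	return young;
}

/* Patched order: walk the mappings first, reset the page-level bit last. */
int toy_page_referenced(struct toy_page *page, struct toy_mapping *maps,
			size_t n)
{
	int referenced = 0;

	assert(page != NULL && (maps != NULL || n == 0));
	for (size_t i = 0; i < n; i++)
		if (toy_ptep_test_and_clear_young(page, &maps[i]))
			referenced++;
	if (toy_page_test_and_clear_young(page))
		referenced++;
	return referenced;
}

/* Pre-patch order, kept only for comparison: page-level reset comes first. */
int toy_page_referenced_old(struct toy_page *page, struct toy_mapping *maps,
			    size_t n)
{
	int referenced = 0;

	assert(page != NULL && (maps != NULL || n == 0));
	if (toy_page_test_and_clear_young(page))
		referenced++;
	for (size_t i = 0; i < n; i++)
		if (toy_ptep_test_and_clear_young(page, &maps[i]))
			referenced++;
	return referenced;
}
```

Under these assumptions, calling `toy_page_referenced_old` on a page whose page-level bit is set leaves every mapping's `guest_young` clear, because the bit is already gone when the mappings are walked. `toy_page_referenced` preserves it.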
Re: [RFC/PATCH 01/15 v2] preparation: provide hook to enable pgstes in user pagetable
On Sat, 22 Mar 2008 18:02:37 +0100 Carsten Otte <[EMAIL PROTECTED]> wrote:

> From: Martin Schwidefsky <[EMAIL PROTECTED]>
>
> The SIE instruction on s390 uses the 2nd half of the page table page to
> virtualize the storage keys of a guest. This patch offers the s390_enable_sie
> function, which reorganizes the page tables of a single-threaded process to
> reserve space in the page table:
> s390_enable_sie makes sure that the process is single threaded and then uses
> dup_mm to create a new mm with reorganized page tables. The old mm is freed
> and the process has now a page status extended field after every page table.
>
> Code that wants to exploit pgstes should SELECT CONFIG_PGSTE.
>
> This patch has a small common code hit, namely making dup_mm non-static.
>
> Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's
> review feedback. Now we do have the prototype for dup_mm in
> include/linux/sched.h.
>
> ...
>
> --- linux-host.orig/kernel/fork.c
> +++ linux-host/kernel/fork.c
> @@ -498,7 +498,7 @@ void mm_release(struct task_struct *tsk,
>   * Allocate a new mm structure and copy contents from the
>   * mm structure of the passed in task structure.
>   */
> -static struct mm_struct *dup_mm(struct task_struct *tsk)
> +struct mm_struct *dup_mm(struct task_struct *tsk)
>  {
>  	struct mm_struct *mm, *oldmm = current->mm;
>  	int err;

ack

> --- linux-host.orig/include/linux/sched.h
> +++ linux-host/include/linux/sched.h
> @@ -1758,6 +1758,8 @@ extern void mmput(struct mm_struct *);
>  extern struct mm_struct *get_task_mm(struct task_struct *task);
>  /* Remove the current tasks stale references to the old mm_struct */
>  extern void mm_release(struct task_struct *, struct mm_struct *);
> +/* Allocate a new mm structure and copy contents from tsk->mm */
> +extern struct mm_struct *dup_mm(struct task_struct *tsk);
>
>  extern int copy_thread(int, unsigned long, unsigned long, unsigned long,
>  			struct task_struct *, struct pt_regs *);
>  extern void flush_thread(void);
>

hm, why did we put these in sched.h?

oh well - acked-by-me.
Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable
Martin Schwidefsky wrote:
> On Sun, 2008-03-23 at 12:15 +0200, Avi Kivity wrote:
>>>> Can you convert the page tables at a later time without doing a
>>>> wholesale replacement of the mm? It should be a bit easier to keep
>>>> people off the pagetables than keep their grubby mitts off the mm
>>>> itself.
>>> Yes, as far as I can see you're right. And whatever we do in arch code,
>>> after all it's just a work around to avoid a new clone flag.
>>> If something like clone() with CLONE_KVM would be useful for more
>>> architectures than just s390 then maybe we should try to get a flag.
>>>
>>> Oh... there are just two unused clone flag bits left. Looks like the
>>> namespace changes ate up a lot of them lately.
>>>
>>> Well, we could still play dirty tricks like setting a bit in current
>>> via whatever mechanism which indicates child-wants-extended-page-tables
>>> and then just fork and be happy.
>>>
>> How about taking mmap_sem for write and converting all page tables
>> in-place? I'd rather avoid the need to fork() when creating a VM.
>>
> That was my initial approach as well. If all the page table allocations
> can be fulfilled the code is not too complicated. To handle allocation
> failures gets tricky. At this point I realized that dup_mmap already
> does what we want to do. It walks all the page tables, allocates new
> page tables and copies the ptes. In principle I would reinvent the wheel
> if we can not use dup_mmap

Well, dup_mm() can't work (and now that I think about it, for more
reasons -- what if the process has threads?).

I don't think conversion is too bad. You'd need a four-level loop to
allocate and convert, and another loop to deallocate in case of error.
If, as I don't doubt, s390 hardware can modify the ptes, you'd need
cmpxchg to read and clear a pte in one operation.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
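The allocate-and-convert loop with an undo path, and the atomic read-and-clear of a pte, could look roughly like the userspace sketch below. It is a simplified illustration and not kernel code. It uses C11 atomics instead of the kernel's cmpxchg, and a single flat level of tables instead of four. Every `toy_` name, the table layout, and the allocation budget hook are hypothetical. A real implementation would also need the locking discussed above (mmap_sem held for write).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PTRS 4	/* entries per toy table (hypothetical size) */

typedef _Atomic unsigned long toy_pte_t;

struct toy_table {			/* plain table, before conversion */
	toy_pte_t pte[TOY_PTRS];
};

struct toy_table_ext {			/* table followed by extra status slots */
	toy_pte_t pte[TOY_PTRS];
	unsigned long pgste[TOY_PTRS];
};

/* Allocation hook so the failure path can be exercised; -1 means unlimited. */
int toy_alloc_budget = -1;

void *toy_alloc(size_t size)
{
	if (toy_alloc_budget == 0)
		return NULL;
	if (toy_alloc_budget > 0)
		toy_alloc_budget--;
	return malloc(size);
}

/* Read and clear a pte in one atomic step via a compare-and-exchange loop. */
unsigned long toy_pte_get_and_clear(toy_pte_t *ptep)
{
	unsigned long old = atomic_load(ptep);

	/* On failure 'old' is refreshed with the current value, so retry. */
	while (!atomic_compare_exchange_weak(ptep, &old, 0UL))
		;
	return old;
}

/*
 * Convert n plain tables into extended ones. On allocation failure, move
 * the already converted entries back and free what was allocated.
 * Returns 0 on success, -1 on failure.
 */
int toy_convert(struct toy_table **old, struct toy_table_ext **ext, size_t n)
{
	size_t i, j;

	assert(old != NULL && ext != NULL);
	for (i = 0; i < n; i++) {
		ext[i] = toy_alloc(sizeof(*ext[i]));
		if (!ext[i])
			goto undo;
		for (j = 0; j < TOY_PTRS; j++) {
			atomic_init(&ext[i]->pte[j],
				    toy_pte_get_and_clear(&old[i]->pte[j]));
			ext[i]->pgste[j] = 0;
		}
	}
	return 0;

undo:
	while (i-- > 0) {
		for (j = 0; j < TOY_PTRS; j++)
			atomic_store(&old[i]->pte[j],
				     toy_pte_get_and_clear(&ext[i]->pte[j]));
		free(ext[i]);
		ext[i] = NULL;
	}
	return -1;
}
```

In plain C11 an `atomic_exchange(ptep, 0UL)` would do the same read-and-clear in a single call. The compare-and-exchange loop is used here only because it mirrors the cmpxchg wording in the mail.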