RE: Xen common code across architecture

2008-03-24 Thread Dong, Eddie
Dong, Eddie wrote:
> Jeremy/Andrew:
> 
>   Isaku Yamahata, I, and some other IA64/Xen community members are
> working together to enable pv_ops for IA64 Linux. This patch is a
> preparation to move the common arch/x86/xen/events.c to drivers/xen
> (contents are identical) against the mm tree; it is based on
> Yamahata's IA64/pv_ops patch series.
>   In case you want a brief view of the whole pv_ops/IA64 patch
> series, please refer to the IA64 Linux mailing list.
> 
> Thanks, Eddie
> 
> 
Fixed a typo. The merged patch is attached too.


Signed-off-by: Yaozu (Eddie) Dong <[EMAIL PROTECTED]>

--- drivers/xen/events_old.c	2008-03-25 14:31:40.503525471 +0800
+++ drivers/xen/events.c	2008-03-25 14:19:39.841851430 +0800
@@ -37,7 +37,7 @@
 #include 
 #include 
 
-#include "xen-ops.h"
+#include 
 
 /*
  * This lock protects updates to the following mapping and reference-count


Attachment: typo

Attachment: move_xenirq3.patch
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization

Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable

2008-03-24 Thread Carsten Otte
Avi Kivity wrote:
> Well, dup_mm() can't work (and now that I think about it, for more 
> reasons -- what if the process has threads?).
We lock out multithreaded users already, -EINVAL.



Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable

2008-03-24 Thread Avi Kivity
Carsten Otte wrote:
> Avi Kivity wrote:
>> Well, dup_mm() can't work (and now that I think about it, for more 
>> reasons -- what if the process has threads?).
> We lock out multithreaded users already, -EINVAL.
>

Would be much better if this can be avoided.  It's surprising.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.



Re: [Xen-devel] Re: Xen paravirt frontend block hang

2008-03-24 Thread Christopher S. Aker
Jeremy Fitzhardinge wrote:
> Christopher S. Aker wrote:
>> Jeremy Fitzhardinge wrote:
>>> Are you running an SMP or UP domain?  I found I could get hangs very 
>>> easily with UP (but I need to confirm it isn't a result of some other 
>>> very experimental patches).
>>
>> The hang occurs with both SMP and UP compiled pv_ops kernels.  SMP 
>> kernels are still slightly responsive after the hang occurs, which 
>> makes me think only one proc gets stuck at a time, not the entire kernel. 
> 
> The patch I posted yesterday - "xen: fix RMW when unmasking events" - 
> should definitively fix the hanging-under-load bugs (I hope). 

Confirmed-by: [EMAIL PROTECTED]

Nice work!

-Chris



Re: [RFC/PATCH 02/15 v2] preparation: host memory management changes for s390 kvm

2008-03-24 Thread Andrew Morton
On Sat, 22 Mar 2008 18:02:39 +0100
Carsten Otte <[EMAIL PROTECTED]> wrote:

> From: Heiko Carstens <[EMAIL PROTECTED]>
> From: Christian Borntraeger <[EMAIL PROTECTED]>
> 
> This patch changes the s390 memory management definitions to use the pgste
> field for dirty and reference bit tracking of host and guest code. Usually
> on s390, dirty and referenced are tracked in storage keys, which belong to
> the physical page. This changes with virtualization: The guest and host
> dirty/reference bits are defined to be the logical OR of the values for the
> mapping and the physical page. This patch implements the necessary changes
> in pgtable.h for s390.
> 
> 
> There is a common code change in mm/rmap.c: the call to
> page_test_and_clear_young must be moved. This is a no-op for all
> architectures but s390. page_referenced checks the referenced bits for the
> physical page and for all mappings:
> o The physical page is checked with page_test_and_clear_young.
> o The mappings are checked with ptep_test_and_clear_young and friends.
> 
> Without pgstes (the current implementation on Linux s390) the physical page
> check is implemented but the mapping callbacks are no-ops because dirty
> and referenced are not tracked in the s390 page tables. The pgstes introduce
> guest and host dirty and reference bits for s390 in the host mapping. These
> mappings must be checked before page_test_and_clear_young resets the
> reference bit.
>
> ...
>
> --- linux-host.orig/mm/rmap.c
> +++ linux-host/mm/rmap.c
> @@ -413,9 +413,6 @@ int page_referenced(struct page *page, i
>  {
>  	int referenced = 0;
>  
> -	if (page_test_and_clear_young(page))
> -		referenced++;
> -
>  	if (TestClearPageReferenced(page))
>  		referenced++;
>  
> @@ -433,6 +430,10 @@ int page_referenced(struct page *page, i
>  			unlock_page(page);
>  		}
>  	}
> +
> +	if (page_test_and_clear_young(page))
> +		referenced++;
> +
>  	return referenced;
>  }

ack.
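
For readers following the ordering argument in the quoted patch description,
here is a minimal toy model in C, not kernel code. It assumes (this is an
assumption for illustration, not something the patch text spells out) that
the per-mapping check looks at the physical page's reference bit and saves it
into a per-mapping guest bit before testing the host bit. Every name with a
toy_ prefix is hypothetical.

```c
#include <stdbool.h>

/* Toy model only: all toy_* names are hypothetical, not kernel symbols. */
struct toy_page {
    bool key_referenced;    /* reference bit kept with the physical page */
};

struct toy_mapping {
    struct toy_page *page;
    bool host_referenced;   /* assumed per-mapping host reference bit */
    bool guest_referenced;  /* assumed per-mapping guest reference bit */
};

/* Physical check: test the page's reference bit and reset it. */
static bool toy_page_test_and_clear_young(struct toy_page *page)
{
    bool young = page->key_referenced;

    page->key_referenced = false;
    return young;
}

/*
 * Mapping check (assumed behaviour): remember the physical reference bit
 * for the guest, then test and clear the host bit of this mapping.
 */
static bool toy_ptep_test_and_clear_young(struct toy_mapping *m)
{
    bool young = m->page->key_referenced;

    if (young)
        m->guest_referenced = true;
    young |= m->host_referenced;
    m->host_referenced = false;
    return young;
}

/* Order before the patch: physical page first, then the mapping. */
static int toy_page_referenced_old(struct toy_mapping *m)
{
    int referenced = 0;

    if (toy_page_test_and_clear_young(m->page))
        referenced++;
    if (toy_ptep_test_and_clear_young(m))
        referenced++;
    return referenced;
}

/* Order after the patch: mapping first, then the physical page. */
static int toy_page_referenced_new(struct toy_mapping *m)
{
    int referenced = 0;

    if (toy_ptep_test_and_clear_young(m))
        referenced++;
    if (toy_page_test_and_clear_young(m->page))
        referenced++;
    return referenced;
}
```

In this model the old order resets the physical bit before the mapping check
can see it, so the guest bit is never set; the new order lets the mapping
check observe the bit first.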


Re: [RFC/PATCH 01/15 v2] preparation: provide hook to enable pgstes in user pagetable

2008-03-24 Thread Andrew Morton
On Sat, 22 Mar 2008 18:02:37 +0100
Carsten Otte <[EMAIL PROTECTED]> wrote:

> From: Martin Schwidefsky <[EMAIL PROTECTED]>
> 
> The SIE instruction on s390 uses the 2nd half of the page table page to
> virtualize the storage keys of a guest. This patch offers the s390_enable_sie
> function, which reorganizes the page tables of a single-threaded process to
> reserve space in the page table:
> s390_enable_sie makes sure that the process is single threaded and then uses
> dup_mm to create a new mm with reorganized page tables. The old mm is freed 
> and the process now has a page status extended field after every page table.
> 
> Code that wants to exploit pgstes should SELECT CONFIG_PGSTE.
> 
> This patch has a small common code hit, namely making dup_mm non-static.
> 
> Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's
> review feedback. Now we do have the prototype for dup_mm in
> include/linux/sched.h.
> 
> ...
>
> --- linux-host.orig/kernel/fork.c
> +++ linux-host/kernel/fork.c
> @@ -498,7 +498,7 @@ void mm_release(struct task_struct *tsk,
>   * Allocate a new mm structure and copy contents from the
>   * mm structure of the passed in task structure.
>   */
> -static struct mm_struct *dup_mm(struct task_struct *tsk)
> +struct mm_struct *dup_mm(struct task_struct *tsk)
>  {
>  	struct mm_struct *mm, *oldmm = current->mm;
>  	int err;

ack

> --- linux-host.orig/include/linux/sched.h
> +++ linux-host/include/linux/sched.h
> @@ -1758,6 +1758,8 @@ extern void mmput(struct mm_struct *);
>  extern struct mm_struct *get_task_mm(struct task_struct *task);
>  /* Remove the current tasks stale references to the old mm_struct */
>  extern void mm_release(struct task_struct *, struct mm_struct *);
> +/* Allocate a new mm structure and copy contents from tsk->mm */
> +extern struct mm_struct *dup_mm(struct task_struct *tsk);
>  
>  extern int  copy_thread(int, unsigned long, unsigned long, unsigned long, struct task_struct *, struct pt_regs *);
>  extern void flush_thread(void);
> 

hm, why did we put these in sched.h?

oh well - acked-by-me.


Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable

2008-03-24 Thread Avi Kivity
Martin Schwidefsky wrote:
> On Sun, 2008-03-23 at 12:15 +0200, Avi Kivity wrote:
>   
>>>> Can you convert the page tables at a later time without doing a
>>>> wholesale replacement of the mm?  It should be a bit easier to keep
>>>> people off the pagetables than keep their grubby mitts off the mm
>>>> itself.
>>>>
>>> Yes, as far as I can see you're right. And whatever we do in arch code,
>>> after all it's just a work around to avoid a new clone flag.
>>> If something like clone() with CLONE_KVM would be useful for more
>>> architectures than just s390 then maybe we should try to get a flag.
>>>
>>> Oh... there are just two unused clone flag bits left. Looks like the
>>> namespace changes ate up a lot of them lately.
>>>
>>> Well, we could still play dirty tricks like setting a bit in current
>>> via whatever mechanism which indicates child-wants-extended-page-tables
>>> and then just fork and be happy.
>>>   
>>>   
>> How about taking mmap_sem for write and converting all page tables 
>> in-place?  I'd rather avoid the need to fork() when creating a VM.
>> 
>
> That was my initial approach as well. If all the page table allocations
> can be fulfilled the code is not too complicated. Handling allocation
> failures gets tricky. At this point I realized that dup_mmap already
> does what we want to do. It walks all the page tables, allocates new
> page tables and copies the ptes. In principle I would reinvent the wheel
> if we cannot use dup_mmap.

Well, dup_mm() can't work (and now that I think about it, for more 
reasons -- what if the process has threads?).

I don't think conversion is too bad.  You'd need a four-level loop to 
allocate and convert, and another loop to deallocate in case of error.  
If, as I don't doubt, s390 hardware can modify the ptes, you'd need 
cmpxchg to read and clear a pte in one operation.
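
As a rough userspace illustration of the two ideas above (not s390 or kernel
code), the sketch below uses C11 atomics. A compare-and-exchange loop reads
and clears an entry in one atomic step, and a flat loop over tables stands in
for the four-level walk, with a second loop that frees the new tables if an
allocation fails. It simplifies by allocating every replacement before
converting anything. All toy_* names, the table size, and the allocation
budget hook are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PTRS_PER_TABLE 4

typedef _Atomic uint64_t toy_pte_t;

struct toy_table {
    toy_pte_t pte[TOY_PTRS_PER_TABLE];
};

/* Hypothetical hook to simulate allocation failure; < 0 means unlimited. */
static int toy_alloc_budget = -1;

static void *toy_alloc(size_t size)
{
    if (toy_alloc_budget == 0)
        return NULL;
    if (toy_alloc_budget > 0)
        toy_alloc_budget--;
    return malloc(size);
}

/* Read an entry and clear it in one atomic step via compare-and-exchange. */
static uint64_t toy_pte_read_and_clear(toy_pte_t *ptep)
{
    uint64_t old = atomic_load(ptep);

    /* On failure 'old' is refreshed with the current value; retry. */
    while (!atomic_compare_exchange_weak(ptep, &old, (uint64_t)0))
        ;
    return old;
}

/*
 * Replace each heap-allocated table with a freshly allocated one.
 * Returns 0 on success. On allocation failure the new tables are freed,
 * the originals are left untouched, and -1 is returned.
 */
static int toy_convert_tables(struct toy_table **tables, size_t count)
{
    struct toy_table **fresh;
    size_t i, j;

    if (count == 0)
        return 0;
    fresh = calloc(count, sizeof(*fresh));
    if (!fresh)
        return -1;

    /* Allocation pass: nothing is modified until every table exists. */
    for (i = 0; i < count; i++) {
        fresh[i] = toy_alloc(sizeof(struct toy_table));
        if (!fresh[i])
            goto rollback;
    }

    /* Conversion pass: move each entry with an atomic read-and-clear. */
    for (i = 0; i < count; i++) {
        for (j = 0; j < TOY_PTRS_PER_TABLE; j++)
            atomic_init(&fresh[i]->pte[j],
                        toy_pte_read_and_clear(&tables[i]->pte[j]));
        free(tables[i]);
        tables[i] = fresh[i];
    }
    free(fresh);
    return 0;

rollback:
    /* Deallocation loop: free only the tables allocated so far. */
    while (i-- > 0)
        free(fresh[i]);
    free(fresh);
    return -1;
}
```

Allocating up front keeps the error path simple, because a failure never
leaves a half-converted set of tables behind.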

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
