[kvm-devel] [PATCH 001/001] mmu-notifier-core v17

2008-05-09 Thread Andrea Arcangeli
From: Andrea Arcangeli <[EMAIL PROTECTED]> With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages. There are secondary MMUs (with secondary sptes and secondary tlbs) too. sptes in the kvm case are shadow pagetables, but when I say spte in mmu-notifier context, I mean

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-09 Thread Andrea Arcangeli
On Fri, May 09, 2008 at 08:37:29PM +0200, Peter Zijlstra wrote: > Another possibility, would something like this work? > > > /* > * null out the begin function, no new begin calls can be made > */ > rcu_assign_pointer(my_notifier.invalidate_start_begin, NULL); > > /* > * lock/unlock a

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-08 Thread Andrea Arcangeli
On Thu, May 08, 2008 at 09:11:33AM -0700, Linus Torvalds wrote: > Btw, this is an issue only on 32-bit x86, because on 64-bit one we already > have the padding due to the alignment of the 64-bit pointers in the > list_head (so there's already empty space there). > > On 32-bit, the alignment of l

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Thu, May 08, 2008 at 08:30:20AM +0300, Pekka Enberg wrote: > On Thu, May 8, 2008 at 8:27 AM, Pekka Enberg <[EMAIL PROTECTED]> wrote: > > You might want to read carefully what Linus wrote: > > > > > The one that already has a 4 byte padding thing on x86-64 just after the > > > spinlock? And tha

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 09:14:45PM -0700, Linus Torvalds wrote: > IOW, you didn't even look at it, did you? Actually I looked both at the struct and at the slab alignment just in case it was changed recently. Now after reading your mail I also compiled it just in case. 2.6.26-rc1 # name

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 08:10:33PM -0700, Christoph Lameter wrote: > On Thu, 8 May 2008, Andrea Arcangeli wrote: > > > to the sort function to break the loop. After that we remove the 512 > > vma cap and mm_lock is free to run as long as it wants like > > /dev/urandom,

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 06:12:32PM -0700, Christoph Lameter wrote: > Andrea's mm_lock could have wider impact. It is the first effective > way that I have seen of temporarily holding off reclaim from an address > space. It sure is a brute force approach. The only improvement I can imagine on mm_

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 06:57:05PM -0700, Linus Torvalds wrote: > Take five minutes. Take a deep breath. And *think* about actually reading > what I wrote. > > The bitflag *can* prevent taking the same lock twice. It just needs to be > in the right place. It's not that I didn't read it, but to

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 06:39:48PM -0700, Linus Torvalds wrote: > > > On Wed, 7 May 2008, Christoph Lameter wrote: > > > > > (That said, we're not running out of vm flags yet, and if we were, we > > > could just add another word. We're already wasting that space right now > > > on > > > 64-bi

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
Sorry for not having completely answered to this. I initially thought stop_machine could work when you mentioned it, but I don't think it can even removing xpmem block-inside-mmu-notifier-method requirements. For stop_machine to solve this (besides being slower and potentially not more safe as run

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 06:02:49PM -0700, Linus Torvalds wrote: > You replace mm_lock() with the sequence that Andrew gave you (and I > described): > > spin_lock(&global_lock) > .. get all locks UNORDERED .. > spin_unlock(&global_lock) > > and you're now done. You have your "mm
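The sequence quoted above (take a global lock, grab every per-object lock in any order, drop the global lock) can be sketched in userspace C. This is only an illustration, not the kernel code under discussion: `spin_t`, `lock_many_unordered` and `unlock_many` are hypothetical names, the spinlock is a toy built on C11 `atomic_flag`, and the scheme assumes that every other path holds at most one of these locks at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal spinlock; a stand-in for the kernel's spinlock_t. */
typedef struct { atomic_flag held; } spin_t;

static void spin_init(spin_t *l) { atomic_flag_clear(&l->held); }

/* Returns true if the lock was free and is now held by the caller. */
static bool spin_trylock(spin_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void spin_lock(spin_t *l)
{
    while (!spin_trylock(l))
        ; /* busy-wait until the current holder releases */
}

static void spin_unlock(spin_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Serializes only the callers that need more than one lock. */
static spin_t global_lock = { ATOMIC_FLAG_INIT };

/*
 * Take every lock in the array in whatever order it is given.
 * Because only one multi-lock taker can be inside this loop at a
 * time, two such callers cannot deadlock on each other even though
 * no ordering is imposed. Assumption: all other code paths hold at
 * most one of these locks at once.
 */
static void lock_many_unordered(spin_t **locks, size_t n)
{
    assert(locks != NULL || n == 0);
    spin_lock(&global_lock);
    for (size_t i = 0; i < n; i++)
        spin_lock(locks[i]);
    spin_unlock(&global_lock);
}

/* Release order does not matter for correctness. */
static void unlock_many(spin_t **locks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        spin_unlock(locks[i]);
}
```

Note that the global lock is held only while the set is being acquired, so paths that take a single lock never touch it.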

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Thu, May 08, 2008 at 09:28:38AM +1000, Benjamin Herrenschmidt wrote: > > On Thu, 2008-05-08 at 00:44 +0200, Andrea Arcangeli wrote: > > > > Please note, we can't allow a thread to be in the middle of > > zap_page_range while mmu_notifier_register runs

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
Hi Andrew, On Wed, May 07, 2008 at 03:59:14PM -0700, Andrew Morton wrote: > CPU0: CPU1: > > spin_lock(global_lock) > spin_lock(a->lock); spin_lock(b->lock); == mmu_notifier_register() > spin_lock(b->lo

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
To remove mm_lock without adding an horrible system-wide lock before every i_mmap_lock etc.. we've to remove invalidate_range_begin/end. Then we can return to an older approach of doing only invalidate_page and serializing it with the PT lock against get_user_pages. That works fine for KVM but GRU

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 03:44:24PM -0700, Linus Torvalds wrote: > > > On Thu, 8 May 2008, Andrea Arcangeli wrote: > > > > Unfortunately the lock you're talking about would be: > > > > static spinlock_t global_lock = ... > > > > There'

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 03:31:03PM -0700, Andrew Morton wrote: > Nope. We only need to take the global lock before taking *two or more* of > the per-vma locks. > > I really wish I'd thought of that. I don't see how you can avoid taking the system-wide-global lock before every single anon_vma->lo

Re: [kvm-devel] [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 03:31:08PM -0700, Roland Dreier wrote: > I think the point you're missing is that any patches written by > Christoph need a line like > > From: Christoph Lameter <[EMAIL PROTECTED]> > > at the top of the body so that Christoph becomes the author when it is > committed into

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrea Arcangeli
On Thu, May 08, 2008 at 12:27:58AM +0200, Andrea Arcangeli wrote: > I rechecked and I guarantee that the patches where Christoph isn't > listed are developed by myself and he didn't write a single line on > them. In any case I expect Christoph to review (he's CCed)

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 03:11:10PM -0700, Linus Torvalds wrote: > > > On Wed, 7 May 2008, Andrea Arcangeli wrote: > > > > As far as I can tell, authorship has been destroyed by at least two of > > > the > > > patches (ie Christoph seems to

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 02:36:57PM -0700, Linus Torvalds wrote: > > had to do any blocking I/O during vmtruncate before, now we have to. > > I really suspect we don't really have to, and that it would be better to > just fix the code that does that. I'll let you discuss with Christoph and Robin

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 01:30:39PM -0700, Linus Torvalds wrote: > > > On Wed, 7 May 2008, Andrew Morton wrote: > > > > The patch looks OK to me. > > As far as I can tell, authorship has been destroyed by at least two of the > patches (ie Christoph seems to be the author, but Andrea seems to ha

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 01:56:23PM -0700, Linus Torvalds wrote: > This also looks very debatable indeed. The only performance numbers quoted > are: > > > This results in f.e. the Aim9 brk performance test to got down by 10-15%. > > which just seems like a total disaster. > > The whole series

Re: [kvm-devel] [PATCH 03 of 11] invalidate_page outside PT lock

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 01:39:43PM -0400, Rik van Riel wrote: > Would it be an idea to merge them into one, so the first patch > introduces the right conventions directly? The only reason this isn't merged into one is that this requires non-obvious (though not difficult) changes to the core VM code. I wa

Re: [kvm-devel] [PATCH 02 of 11] get_task_mm

2008-05-07 Thread Andrea Arcangeli
On Wed, May 07, 2008 at 10:59:48AM -0500, Robin Holt wrote: > You can drop this patch. > > This turned out to be a race in xpmem. It "appeared" as if it were a > race in get_task_mm, but it really is not. The current->mm field is > cleared under the task_lock and the task_lock is grabbed by get_

[kvm-devel] [PATCH 00 of 11] mmu notifier #v16

2008-05-07 Thread Andrea Arcangeli
Hello, this is the latest update of the mmu notifier patch. Jack asked for a __mmu_notifier_register that can be called with mmap_sem held in write mode. Here is an update with that change plus allowing ->release not to be implemented (a two-line change to mmu_notifier.c). The entire diff between v15 and v16 mmu-notifi

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-05-07 Thread Andrea Arcangeli
On Tue, Apr 29, 2008 at 06:03:40PM +0200, Andrea Arcangeli wrote: > Christoph if you've interest in evolving anon-vma-sem and i_mmap_sem > yourself in this direction, you're very welcome to go ahead while I In case you didn't notice this already, for a further explanation o

[kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210096013 -7200 # Node ID e20917dcc8284b6a07cfcced13dda4cbca850a9c # Parent 5026689a3bc323a26d33ad882c34c4c9c9a3ecd8 mmu-notifier-core With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages

[kvm-devel] [PATCH 07 of 11] i_mmap_rwsem

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115135 -7200 # Node ID 58f716ad4d067afb6bdd1b5f7042e19d854aae0d # Parent 0621238970155f8ff2d60ca4996dcdd470f9c6ce i_mmap_rwsem The conversion to a rwsem allows notifier callbacks during rmap traversal for files

[kvm-devel] [PATCH 05 of 11] unmap vmas tlb flushing

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115131 -7200 # Node ID 20bc6a66a86ef6bd60919cc77ff51d4af741b057 # Parent 34f6a4bf67ce66714ba2d5c13a5fed241d34fb09 unmap vmas tlb flushing Move the tlb flushing inside of unmap vmas. This saves us from passing a p

[kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115136 -7200 # Node ID 6b384bb988786aa78ef07440180e4b2948c4c6a2 # Parent 58f716ad4d067afb6bdd1b5f7042e19d854aae0d anon-vma-rwsem Convert the anon_vma spinlock to a rw semaphore. This allows concurrent traver

[kvm-devel] [PATCH 09 of 11] mm_lock-rwsem

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115508 -7200 # Node ID 94eaa1515369e8ef183e2457f6f25a7f36473d70 # Parent 6b384bb988786aa78ef07440180e4b2948c4c6a2 mm_lock-rwsem Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock conversion.

[kvm-devel] [PATCH 06 of 11] rwsem contended

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115132 -7200 # Node ID 0621238970155f8ff2d60ca4996dcdd470f9c6ce # Parent 20bc6a66a86ef6bd60919cc77ff51d4af741b057 rwsem contended Add a function to rw_semaphores to check if there are any processes waiting f

[kvm-devel] [PATCH 11 of 11] mmap sems

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115798 -7200 # Node ID eb924315351f6b056428e35c983ad28040420fea # Parent 5b2eb7d28a4517daf91b08b4dcfbb58fd2b42d0b mmap sems This patch adds a lock ordering rule to avoid a potential deadlock when multiple mmap_sem

[kvm-devel] [PATCH 10 of 11] export zap_page_range for XPMEM

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115797 -7200 # Node ID 5b2eb7d28a4517daf91b08b4dcfbb58fd2b42d0b # Parent 94eaa1515369e8ef183e2457f6f25a7f36473d70 export zap_page_range for XPMEM XPMEM would have used sys_madvise() except that madvise_dontneed() r

[kvm-devel] [PATCH 03 of 11] invalidate_page outside PT lock

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115129 -7200 # Node ID d60d200565abde6a8ed45271e53cde9c5c75b426 # Parent c5badbefeee07518d9d1acca13e94c981420317c invalidate_page outside PT lock Moves all mmu notifier methods outside the PT lock (first and no

[kvm-devel] [PATCH 02 of 11] get_task_mm

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115127 -7200 # Node ID c5badbefeee07518d9d1acca13e94c981420317c # Parent e20917dcc8284b6a07cfcced13dda4cbca850a9c get_task_mm get_task_mm should not succeed if mmput() is running and has reduced the mm_users co

[kvm-devel] [PATCH 04 of 11] free-pgtables

2008-05-07 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1210115130 -7200 # Node ID 34f6a4bf67ce66714ba2d5c13a5fed241d34fb09 # Parent d60d200565abde6a8ed45271e53cde9c5c75b426 free-pgtables Move the tlb flushing into free_pgtables. The conversion of the locks taken for rever

[kvm-devel] mmu notifier v15 -> v16 diff

2008-05-06 Thread Andrea Arcangeli
Hello everyone, This is to allow GRU code to call __mmu_notifier_register inside the mmap_sem (write mode is required as documented in the patch). It also removes the requirement to implement ->release as it's not guaranteed all users will really need it. I didn't integrate the search function a

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-06 Thread Andrea Arcangeli
On Mon, May 05, 2008 at 02:46:25PM -0500, Jack Steiner wrote: > If a task fails to unmap a GRU segment, they still exist at the start of Yes, this will also happen in case the well behaved task receives SIGKILL, so you can test it that way too. > exit. On the ->release callout, I set a flag in th

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-05 Thread Andrea Arcangeli
On Mon, May 05, 2008 at 12:25:06PM -0500, Jack Steiner wrote: > Agree. My apologies... I should have caught it. No problem. > __mmu_notifier_register/__mmu_notifier_unregister seems like a better way to > go, although either is ok. If you also like __mmu_notifier_register more I'll go with it. T

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-05 Thread Andrea Arcangeli
On Mon, May 05, 2008 at 11:21:13AM -0500, Jack Steiner wrote: > The GRU does the registration/deregistration of mmu notifiers from > mmap/munmap. > At this point, the mmap_sem is already held writeable. I hit a deadlock > in mm_lock. It'd been better to know about this detail earlier, but frankly

Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-04 Thread Andrea Arcangeli
On Sun, May 04, 2008 at 02:13:45PM -0500, Robin Holt wrote: > > diff --git a/mm/Kconfig b/mm/Kconfig > > --- a/mm/Kconfig > > +++ b/mm/Kconfig > > @@ -205,3 +205,6 @@ config VIRT_TO_BUS > > config VIRT_TO_BUS > > def_bool y > > depends on !ARCH_NO_VIRT_TO_BUS > > + > > +config MMU_NOTIFIER

[kvm-devel] kvm mmu notifier update

2008-05-03 Thread Andrea Arcangeli
odule isn't updated, to decrease the risk of runtime failure). Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 8d45fab..ce3251c 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -21,6 +21,7 @@ config KVM

Re: [kvm-devel] fx_init schedule in atomic

2008-05-02 Thread Andrea Arcangeli
On Fri, May 02, 2008 at 12:28:32PM +0300, Avi Kivity wrote: > Applied, thanks. Dynamic allocation for the fpu state was introduced in > 2.6.26-rc, right? It seems very recent, hit mainline on 30 Apr. Also we may want to think if there's something cheaper than fx_save to trigger a math exception

[kvm-devel] [PATCH 11 of 11] mmap sems

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740229 -7200 # Node ID b4bf6df98bc00bfbef9423b0dd31cfdba63a5eeb # Parent 4f462fb3dff614cd7d971219c3feaef0b43359c1 mmap sems This patch adds a lock ordering rule to avoid a potential deadlock when multiple mmap_sem

[kvm-devel] [PATCH 10 of 11] export zap_page_range for XPMEM

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740229 -7200 # Node ID 4f462fb3dff614cd7d971219c3feaef0b43359c1 # Parent 721c3787cd42043734331e54a42eb20c51766f71 export zap_page_range for XPMEM XPMEM would have used sys_madvise() except that madvise_dontneed() r

[kvm-devel] [PATCH 09 of 11] mm_lock-rwsem

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740226 -7200 # Node ID 721c3787cd42043734331e54a42eb20c51766f71 # Parent 0be678c52e540d5f5d5fd9af549b57b9bb018d32 mm_lock-rwsem Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock conversion.

[kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740186 -7200 # Node ID 0be678c52e540d5f5d5fd9af549b57b9bb018d32 # Parent de28c85baef11b90c993047ca851a2f52c85a5be anon-vma-rwsem Convert the anon_vma spinlock to a rw semaphore. This allows concurrent traver

[kvm-devel] [PATCH 07 of 11] i_mmap_rwsem

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740186 -7200 # Node ID de28c85baef11b90c993047ca851a2f52c85a5be # Parent 74b873f3ea07012e2fc864f203edf1179865feb1 i_mmap_rwsem The conversion to a rwsem allows notifier callbacks during rmap traversal for files

[kvm-devel] [PATCH 06 of 11] rwsem contended

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740186 -7200 # Node ID 74b873f3ea07012e2fc864f203edf1179865feb1 # Parent a8ac53b928dfcea0ccb326fb7d71f908f0df85f4 rwsem contended Add a function to rw_semaphores to check if there are any processes waiting f

[kvm-devel] [PATCH 04 of 11] free-pgtables

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740185 -7200 # Node ID 14e9f5a12bb1657fa6756e18d5dac71d4ad1a55e # Parent ea8fc9187b6d3ef2742061b4f62598afe55281cf free-pgtables Move the tlb flushing into free_pgtables. The conversion of the locks taken for rever

[kvm-devel] [PATCH 03 of 11] invalidate_page outside PT lock

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740185 -7200 # Node ID ea8fc9187b6d3ef2742061b4f62598afe55281cf # Parent c85c85c4be165eb6de16136bb97cf1fa7fd5c88f invalidate_page outside PT lock Moves all mmu notifier methods outside the PT lock (first and no

[kvm-devel] [PATCH 05 of 11] unmap vmas tlb flushing

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740186 -7200 # Node ID a8ac53b928dfcea0ccb326fb7d71f908f0df85f4 # Parent 14e9f5a12bb1657fa6756e18d5dac71d4ad1a55e unmap vmas tlb flushing Move the tlb flushing inside of unmap vmas. This saves us from passing a p

[kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740175 -7200 # Node ID 1489529e7b53d3f2dab8431372aa4850ec821caa # Parent 5026689a3bc323a26d33ad882c34c4c9c9a3ecd8 mmu-notifier-core With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages

[kvm-devel] [PATCH 02 of 11] get_task_mm

2008-05-02 Thread Andrea Arcangeli
# HG changeset patch # User Andrea Arcangeli <[EMAIL PROTECTED]> # Date 1209740185 -7200 # Node ID c85c85c4be165eb6de16136bb97cf1fa7fd5c88f # Parent 1489529e7b53d3f2dab8431372aa4850ec821caa get_task_mm get_task_mm should not succeed if mmput() is running and has reduced the mm_users co

[kvm-devel] [PATCH 00 of 11] mmu notifier #v15

2008-05-02 Thread Andrea Arcangeli
Hello everyone, 1/11 is the latest version of the mmu-notifier-core patch. As usual all later 2-11/11 patches follows but those aren't meant for 2.6.26. Thanks! Andrea

[kvm-devel] mmu notifier-core v14->v15 diff for review

2008-05-01 Thread Andrea Arcangeli
Hello everyone, this is the v14 to v15 difference to the mmu-notifier-core patch. This is just for review of the difference, I'll post full v15 soon, please review the diff in the meantime. Lots of those cleanups are thanks to Andrew review on mmu-notifier-core in v14. He also spotted the GFP_KERN

[kvm-devel] fx_init schedule in atomic

2008-05-01 Thread Andrea Arcangeli
Hello, This makes sure not to schedule in atomic during fx_init. I also changed the name of fpu_init to fx_finit to avoid duplicating the name with fpu_init that is already used in the kernel, this makes grep simpler if nothing else. Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> diff

Re: [kvm-devel] [PATCH] Handle vma regions with no backing page (v2)

2008-04-30 Thread Andrea Arcangeli
On Wed, Apr 30, 2008 at 11:59:47AM +0300, Avi Kivity wrote: > The code is not trying to find a vma for the address, but a vma for the > address which also has VM_PFNMAP set. The cases for vma not found, or vma > found, but not VM_PFNMAP, are folded together. Muli's saying the comparison is rever

Re: [kvm-devel] [PATCH] Handle vma regions with no backing page (v2)

2008-04-30 Thread Andrea Arcangeli
On Tue, Apr 29, 2008 at 06:12:51PM -0500, Anthony Liguori wrote: > IIUC PPC correctly, all IO pages have corresponding struct pages. This > means that get_user_pages() would succeed and you can reference count them? > In this case, we would never take the VM_PFNMAP path. get_user_pages only wo

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-29 Thread Andrea Arcangeli
On Tue, Apr 29, 2008 at 10:50:30AM -0500, Robin Holt wrote: > You have said this continually about a CONFIG option. I am unsure how > that could be achieved. Could you provide a patch? I'm busy with the reserved ram patch against 2.6.25 and latest kvm.git that is moving from pages to pfn for pci

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-29 Thread Andrea Arcangeli
On Mon, Apr 28, 2008 at 06:28:06PM -0700, Christoph Lameter wrote: > On Tue, 29 Apr 2008, Andrea Arcangeli wrote: > > > Frankly I've absolutely no idea why rcu is needed in all rmap code > > when walking the page->mapping. Definitely the PG_locked is taken so > &

Re: [kvm-devel] [PATCH] Handle vma regions with no backing page

2008-04-29 Thread Andrea Arcangeli
On Tue, Apr 29, 2008 at 09:32:09AM -0500, Anthony Liguori wrote: > + vma = find_vma(current->mm, addr); > + if (vma == NULL) { > + get_page(bad_page); > + return page_to_pfn(bad_page); > + } Here you must check vm_start ad
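As background for the vm_start remark above: the kernel's find_vma() returns the first VMA whose vm_end lies above the address, so the returned region may start after the address and a NULL check alone does not prove the address is mapped. A minimal userspace sketch of that behavior follows; `struct vma`, `find_vma_sketch` and `addr_is_mapped` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's vm_area_struct. */
struct vma {
    unsigned long vm_start; /* first address inside the region */
    unsigned long vm_end;   /* first address past the region */
};

/*
 * Mimics find_vma() semantics: return the first region with
 * addr < vm_end, or NULL if there is none. The array must be
 * sorted by ascending address and must not overlap.
 */
static const struct vma *find_vma_sketch(const struct vma *vmas, size_t n,
                                         unsigned long addr)
{
    assert(vmas != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (addr < vmas[i].vm_end)
            return &vmas[i];
    return NULL;
}

/* Nonzero only when addr really falls inside the returned region. */
static int addr_is_mapped(const struct vma *vmas, size_t n,
                          unsigned long addr)
{
    const struct vma *vma = find_vma_sketch(vmas, n, addr);

    /* The extra vm_start comparison rejects addresses in a hole. */
    return vma != NULL && vma->vm_start <= addr;
}
```

An address sitting in the gap between two regions gets the higher region back from the lookup, which is exactly the case the extra comparison filters out.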

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-29 Thread Andrea Arcangeli
Hi Hugh!! On Tue, Apr 29, 2008 at 11:49:11AM +0100, Hugh Dickins wrote: > [I'm scarcely following the mmu notifiers to-and-fro, which seems > to be in good hands, amongst faster thinkers than me: who actually > need and can test this stuff. Don't let me slow you down; but I > can quickly clarify

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-28 Thread Andrea Arcangeli
On Mon, Apr 28, 2008 at 01:34:11PM -0700, Christoph Lameter wrote: > On Sun, 27 Apr 2008, Andrea Arcangeli wrote: > > > Talking about post 2.6.26: the refcount with rcu in the anon-vma > > conversion seems unnecessary and may explain part of the AIM slowdown > > to

Re: [kvm-devel] fork() within a VM with MMU notifiers

2008-04-28 Thread Andrea Arcangeli
On Mon, Apr 28, 2008 at 11:11:56AM -0500, Anthony Liguori wrote: > Here's my thinking as to why we don't want to destroy the VM in the mmu > notifiers ->release method. I don't have a valid use-case for this but my > argument depends on the fact that this is something that should work. > Daemo

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-27 Thread Andrea Arcangeli
On Sat, Apr 26, 2008 at 08:17:34AM -0500, Robin Holt wrote: > the first four sets. The fifth is the oversubscription test which trips > my xpmem bug. This is as good as the v12 runs from before. Now that mmu-notifier-core #v14 seems finished and hopefully will appear in 2.6.26 ;), I started exer

Re: [kvm-devel] mmu notifier #v14

2008-04-26 Thread Andrea Arcangeli
the sptes. The ioctl of the qemu userland could run in any other task with a mm different than the one of the guest and ->release allows this to work fine without memory corruption and without requiring page pinning. As far as I can tell your example explains why we need this fix ;). Here an

Re: [kvm-devel] mmu notifier #v14

2008-04-26 Thread Andrea Arcangeli
On Sat, Apr 26, 2008 at 01:59:23PM -0500, Anthony Liguori wrote: >> +static void kvm_unmap_spte(struct kvm *kvm, u64 *spte) >> +{ >> +struct page *page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> >> PAGE_SHIFT); >> +get_page(page); >> > > You should not assume a struct page exists fo

[kvm-devel] mmu notifier #v14

2008-04-26 Thread Andrea Arcangeli
disarmed regardless of MMU_NOTIFIER=y or =n. http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.25/mmu-notifier-v14/mmu-notifier-core I'll be sending that patch to Andrew inbox. Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> diff --git a/arch/x86/kv

[kvm-devel] fix external module compile

2008-04-26 Thread Andrea Arcangeli
oving the external-module-compat in the same place with the other includes where `pwd` works instead of $(src) that doesn't work anymore for whatever reason. Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> diff --git a/kernel/Kbuild b/kernel/Kbuild index cabfc75..d9245eb 100644 ---

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-26 Thread Andrea Arcangeli
On Sat, Apr 26, 2008 at 08:17:34AM -0500, Robin Holt wrote: > Since this include and the one for mm_types.h both are build breakages > for ia64, I think you need to apply your ia64_cpumask and the following > (possibly as a single patch) first or in your patch 1. Without that, > ia64 doing a git-b

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Andrea Arcangeli
On Fri, Apr 25, 2008 at 02:25:32PM -0500, Robin Holt wrote: > I think you still need mm_lock (unless I miss something). What happens > when one callout is scanning mmu_notifier_invalidate_range_start() and > you unlink. That list next pointer with LIST_POISON1 which is a really > bad address for

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Andrea Arcangeli
On Fri, Apr 25, 2008 at 06:56:39PM +0200, Andrea Arcangeli wrote: > > > + data->i_mmap_locks = vmalloc(nr_i_mmap_locks * > > > + sizeof(spinlock_t)); > > > > This is why non-typesafe allocators suck. You want

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Andrea Arcangeli
I somehow missed this email in my inbox, found it now because it was strangely still unread... Sorry for the late reply! On Tue, Apr 22, 2008 at 03:06:24PM +1000, Rusty Russell wrote: > On Wednesday 09 April 2008 01:44:04 Andrea Arcangeli wrote: > > --- a/include/linux/mm.h > >

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-24 Thread Andrea Arcangeli
On Thu, Apr 24, 2008 at 05:39:43PM +0200, Andrea Arcangeli wrote: > There's at least one small issue I noticed so far, that while _release > don't need to care about _register, but _unregister definitely need to > care about _register. I've to take the mmap_sem in additi

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-24 Thread Andrea Arcangeli
eed to care about _register. I've to take the mmap_sem in addition or in replacement of the unregister_lock. The srcu_read_lock can also likely moved just before releasing the unregister_lock but that's just a minor optimization to make the code more strict. > On Thu, Apr 24, 2008 a

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Thu, Apr 24, 2008 at 12:19:28AM +0200, Andrea Arcangeli wrote: > /dev/kvm closure. Given this can be a considered a bugfix to > mmu_notifier_unregister I'll apply it to 1/N and I'll release a new I'm not sure anymore this can be considered a bugfix given how large change

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 06:37:13PM +0200, Andrea Arcangeli wrote: > I'm afraid if you don't want to worst-case unregister with ->release > you need to have a better idea than my mm_lock and personally I can't > see any other way than mm_lock to ensure not to mi

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 11:27:21AM -0700, Christoph Lameter wrote: > There is a potential issue in move_ptes where you call > invalidate_range_end after dropping i_mmap_sem whereas my patches did the > opposite. Mmap_sem saves you there? Yes, there's really no risk of races in this area after in

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 11:21:49AM -0700, Christoph Lameter wrote: > No I really want you to do this. I have no interest in a takeover in the Ok if you want me to do this, I definitely prefer the core to go in now. It's so much easier to concentrate on two problems at different times than to atta

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 11:19:26AM -0700, Christoph Lameter wrote: > If unregister fails then the driver should not detach from the address > space immediately but wait until -->release is called. That may be > a possible solution. It will be rare that the unregister fails. This is the current ide

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 11:09:35AM -0700, Christoph Lameter wrote: > Why is there still the hlist stuff being used for the mmu notifier list? > And why is this still unsafe? What's the problem with hlist, it saves 8 bytes for each mm_struct, you should be using it too instead of list. > There ar

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 11:02:18AM -0700, Christoph Lameter wrote: > We have had this workaround effort done years ago and have been > suffering the ill effects of pinning for years. Had to deal with Yes. In addition to the pinning, there's a lot of additional tlb flushing work to do in kvm withou

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 12:09:09PM -0500, Jack Steiner wrote: > > You may have spotted this already. If so, just ignore this. > > It looks like there is a bug in copy_page_range() around line 667. > It's possible to do a mmu_notifier_invalidate_range_start(), then > return -ENOMEM w/o doing a cor

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 06:26:29PM +0200, Andrea Arcangeli wrote: > On Tue, Apr 22, 2008 at 04:20:35PM -0700, Christoph Lameter wrote: > > I guess I have to prepare another patchset then? Apologies for my previous not too polite comment in answer to the above, but I thought this double

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 07:28:49PM -0500, Jack Steiner wrote: > The GRU driver unregisters the notifier when all GRU mappings > are unmapped. I could make it work either way - either with or without > an unregister function. However, unregister is the most logical > action to take when all mappings

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 04:20:35PM -0700, Christoph Lameter wrote: > I guess I have to prepare another patchset then? If you want to embarrass yourself three times in a row go ahead ;). I thought two failed takeovers were enough.

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-23 Thread Andrea Arcangeli
On Wed, Apr 23, 2008 at 10:45:36AM -0500, Robin Holt wrote: > XPMEM has passed all regression tests using your version 12 notifiers. That's great news, thanks! I'd greatly appreciate if you could test #v13 too as I posted it. It already passed GRU and KVM regression tests and it should work fine

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
t we can't focus on this for 2.6.26. We can also consider making mmu_notifier_register safe against double calls on the same structure but again that's not something we should be doing in 1/N and it can be done later in a backwards compatible way (plus we're perfectly fine with the API

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 06:07:27PM -0500, Robin Holt wrote: > > The only other change I did has been to move mmu_notifier_unregister > > at the end of the patchset after getting more questions about its > > reliability and I documented a bit the rmmod requirements for > > ->release. we'll think lat

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-23 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 04:14:26PM -0700, Christoph Lameter wrote: > We want a full solution and this kind of patching makes the patches > difficult to review because later patches revert earlier ones. I know you rather want to see KVM development stalled for more months than to get a partial so

Re: [kvm-devel] [PATCH 00 of 12] mmu notifier #v13

2008-04-23 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:30:53PM -0700, Christoph Lameter wrote: > One solution would be to separate the invalidate_page() callout into a > patch at the very end that can be omitted. AFAICT There is no compelling > reason to have this callback and it complicates the API for the device > driver

Re: [kvm-devel] [PATCH 10 of 12] Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:26:13PM -0700, Christoph Lameter wrote: > Doing the right patch ordering would have avoided this patch and allow > better review. I didn't actually write this patch myself. This did it instead: s/anon_vma_lock/anon_vma_sem/ s/i_mmap_lock/i_mmap_sem/ s/locks/sems/ s/spi

Re: [kvm-devel] [PATCH 02 of 12] Fix ia64 compilation failure because of common code include bug

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:22:55PM -0700, Christoph Lameter wrote: > Looks like this is not complete. There are numerous .h files missing which > means that various structs are undefined (fs.h and rmap.h are needed > f.e.) which leads to surprises when dereferencing fields of these struct. > > I

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:24:21PM -0700, Christoph Lameter wrote: > Reverts a part of an earlier patch. Why isn't this merged into 1 of 12? To give zero regression risk to 1/12 when MMU_NOTIFIER=y or =n and the mmu notifiers aren't registered by GRU or KVM. Keep in mind that the whole point of my

Re: [kvm-devel] [PATCH 03 of 12] get_task_mm should not succeed if mmput() is running and has reduced

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:23:16PM -0700, Christoph Lameter wrote: > Missing signoff by you. I thought I had to signoff if I contributed with anything that could resemble copyright? Given I only merged that patch, I can add an Acked-by if you like, but merging this in my patchset was already an imp

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:19:29PM -0700, Christoph Lameter wrote: > 3. As noted by Eric and also contained in private post from yesterday by >me: The cmp function needs to retrieve the value before >doing comparisons which is not done for the == of a and b. I retrieved the value, which i

Re: [kvm-devel] [PATCH 00 of 12] mmu notifier #v13

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 01:22:13PM -0500, Robin Holt wrote: > 1) invalidate_page: You retain an invalidate_page() callout. I believe > we have progressed that discussion to the point that it requires some > direction for Andrew, Linus, or somebody in authority. The basics > of the difference dis

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-22 Thread Andrea Arcangeli
On Tue, Apr 22, 2008 at 05:37:38PM +0200, Eric Dumazet wrote: > I am saying your intent was probably to test > > else if ((unsigned long)*(spinlock_t **)a == > (unsigned long)*(spinlock_t **)b) > return 0; Indeed... > Hum, it's not a micro-optimization, but a bug fix. :)
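The comparison Eric describes can be sketched in userspace C. This is only an illustration, not the code from the patch: fake_lock_t, lock_ptr_cmp and sort_locks are hypothetical names, and qsort() stands in for the kernel's sort(). The point it shows is that the comparator receives pointers to array slots, so each argument has to be dereferenced once to get the lock address before any of the <, > or == tests.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's spinlock_t; only its address matters. */
typedef struct { int dummy; } fake_lock_t;

/*
 * Comparator for an array of lock pointers. a and b point at array slots,
 * so each is dereferenced once to obtain the lock address being ordered.
 * Comparing a and b directly would compare slot addresses, which are never
 * equal for two distinct slots.
 */
static int lock_ptr_cmp(const void *a, const void *b)
{
    uintptr_t la = (uintptr_t)*(fake_lock_t *const *)a;
    uintptr_t lb = (uintptr_t)*(fake_lock_t *const *)b;

    if (la < lb)
        return -1;
    if (la > lb)
        return 1;
    return 0; /* same lock referenced from two slots */
}

/* Sort the array so that equal lock pointers end up adjacent. */
static void sort_locks(fake_lock_t **locks, size_t n)
{
    qsort(locks, n, sizeof(*locks), lock_ptr_cmp);
}
```

With the equality case returning 0, duplicate pointers to the same lock sort next to each other, so a caller walking the sorted array can skip a slot that repeats the previous one.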
