From: Andrea Arcangeli <[EMAIL PROTECTED]>
With KVM/GRU/XPMEM there isn't just the primary CPU MMU pointing to
pages. There are secondary MMUs (with secondary sptes and secondary
tlbs) too. sptes in the kvm case are shadow pagetables, but when I say
spte in mmu-notifier context, I mean
On Fri, May 09, 2008 at 08:37:29PM +0200, Peter Zijlstra wrote:
> Another possibility, would something like this work?
>
>
> /*
> * null out the begin function, no new begin calls can be made
> */
> rcu_assign_pointer(my_notifier.invalidate_start_begin, NULL);
>
> /*
> * lock/unlock a
On Thu, May 08, 2008 at 09:11:33AM -0700, Linus Torvalds wrote:
> Btw, this is an issue only on 32-bit x86, because on 64-bit one we already
> have the padding due to the alignment of the 64-bit pointers in the
> list_head (so there's already empty space there).
>
> On 32-bit, the alignment of l
On Thu, May 08, 2008 at 08:30:20AM +0300, Pekka Enberg wrote:
> On Thu, May 8, 2008 at 8:27 AM, Pekka Enberg <[EMAIL PROTECTED]> wrote:
> > You might want to read carefully what Linus wrote:
> >
> > > The one that already has a 4 byte padding thing on x86-64 just after the
> > > spinlock? And tha
On Wed, May 07, 2008 at 09:14:45PM -0700, Linus Torvalds wrote:
> IOW, you didn't even look at it, did you?
Actually I looked both at the struct and at the slab alignment just in
case it was changed recently. Now after reading your mail I also
compiled it just in case.
2.6.26-rc1
# name
On Wed, May 07, 2008 at 08:10:33PM -0700, Christoph Lameter wrote:
> On Thu, 8 May 2008, Andrea Arcangeli wrote:
>
> > to the sort function to break the loop. After that we remove the 512
> > vma cap and mm_lock is free to run as long as it wants like
> > /dev/urandom,
On Wed, May 07, 2008 at 06:12:32PM -0700, Christoph Lameter wrote:
> Andrea's mm_lock could have wider impact. It is the first effective
> way that I have seen of temporarily holding off reclaim from an address
> space. It sure is a brute force approach.
The only improvement I can imagine on mm_
On Wed, May 07, 2008 at 06:57:05PM -0700, Linus Torvalds wrote:
> Take five minutes. Take a deep breath. And *think* about actually reading
> what I wrote.
>
> The bitflag *can* prevent taking the same lock twice. It just needs to be
> in the right place.
It's not that I didn't read it, but to
On Wed, May 07, 2008 at 06:39:48PM -0700, Linus Torvalds wrote:
>
>
> On Wed, 7 May 2008, Christoph Lameter wrote:
> >
> > > (That said, we're not running out of vm flags yet, and if we were, we
> > > could just add another word. We're already wasting that space right now
> > > on
> > > 64-bi
Sorry for not having completely answered this. I initially thought
stop_machine could work when you mentioned it, but I don't think it
can, even after removing xpmem's block-inside-mmu-notifier-method requirement.
For stop_machine to solve this (besides being slower and potentially
not more safe as run
On Wed, May 07, 2008 at 06:02:49PM -0700, Linus Torvalds wrote:
> You replace mm_lock() with the sequence that Andrew gave you (and I
> described):
>
> spin_lock(&global_lock)
> .. get all locks UNORDERED ..
> spin_unlock(&global_lock)
>
> and you're now done. You have your "mm
On Thu, May 08, 2008 at 09:28:38AM +1000, Benjamin Herrenschmidt wrote:
>
> On Thu, 2008-05-08 at 00:44 +0200, Andrea Arcangeli wrote:
> >
> > Please note, we can't allow a thread to be in the middle of
> > zap_page_range while mmu_notifier_register runs
Hi Andrew,
On Wed, May 07, 2008 at 03:59:14PM -0700, Andrew Morton wrote:
> CPU0: CPU1:
>
> spin_lock(global_lock)
> spin_lock(a->lock); spin_lock(b->lock);
== mmu_notifier_register()
> spin_lock(b->lo
To remove mm_lock without adding a horrible system-wide lock before
every i_mmap_lock etc., we have to remove
invalidate_range_begin/end. Then we can return to an older approach of
doing only invalidate_page and serializing it with the PT lock against
get_user_pages. That works fine for KVM but GRU
On Wed, May 07, 2008 at 03:44:24PM -0700, Linus Torvalds wrote:
>
>
> On Thu, 8 May 2008, Andrea Arcangeli wrote:
> >
> > Unfortunately the lock you're talking about would be:
> >
> > static spinlock_t global_lock = ...
> >
> > There'
On Wed, May 07, 2008 at 03:31:03PM -0700, Andrew Morton wrote:
> Nope. We only need to take the global lock before taking *two or more* of
> the per-vma locks.
>
> I really wish I'd thought of that.
I don't see how you can avoid taking the system-wide-global lock
before every single anon_vma->lo
On Wed, May 07, 2008 at 03:31:08PM -0700, Roland Dreier wrote:
> I think the point you're missing is that any patches written by
> Christoph need a line like
>
> From: Christoph Lameter <[EMAIL PROTECTED]>
>
> at the top of the body so that Christoph becomes the author when it is
> committed into
On Thu, May 08, 2008 at 12:27:58AM +0200, Andrea Arcangeli wrote:
> I rechecked and I guarantee that the patches where Christoph isn't
> listed are developed by myself and he didn't write a single line on
> them. In any case I expect Christoph to review (he's CCed)
On Wed, May 07, 2008 at 03:11:10PM -0700, Linus Torvalds wrote:
>
>
> On Wed, 7 May 2008, Andrea Arcangeli wrote:
>
> > > As far as I can tell, authorship has been destroyed by at least two of
> > > the
> > > patches (ie Christoph seems to
On Wed, May 07, 2008 at 02:36:57PM -0700, Linus Torvalds wrote:
> > had to do any blocking I/O during vmtruncate before, now we have to.
>
> I really suspect we don't really have to, and that it would be better to
> just fix the code that does that.
I'll let you discuss with Christoph and Robin
On Wed, May 07, 2008 at 01:30:39PM -0700, Linus Torvalds wrote:
>
>
> On Wed, 7 May 2008, Andrew Morton wrote:
> >
> > The patch looks OK to me.
>
> As far as I can tell, authorship has been destroyed by at least two of the
> patches (ie Christoph seems to be the author, but Andrea seems to ha
On Wed, May 07, 2008 at 01:56:23PM -0700, Linus Torvalds wrote:
> This also looks very debatable indeed. The only performance numbers quoted
> are:
>
> > This results in f.e. the Aim9 brk performance test to got down by 10-15%.
>
> which just seems like a total disaster.
>
> The whole series
On Wed, May 07, 2008 at 01:39:43PM -0400, Rik van Riel wrote:
> Would it be an idea to merge them into one, so the first patch
> introduces the right conventions directly?
The only reason this isn't merged into one is that it requires
non-obvious (though not difficult) changes to the core VM code. I wa
On Wed, May 07, 2008 at 10:59:48AM -0500, Robin Holt wrote:
> You can drop this patch.
>
> This turned out to be a race in xpmem. It "appeared" as if it were a
> race in get_task_mm, but it really is not. The current->mm field is
> cleared under the task_lock and the task_lock is grabbed by get_
Hello,
this is the last update of the mmu notifier patch.
Jack asked for __mmu_notifier_register to be callable under mmap_sem held in write mode.
Here is an update with that change, plus allowing ->release not to be implemented
(a two-liner change to mmu_notifier.c).
The entire diff between v15 and v16 mmu-notifi
On Tue, Apr 29, 2008 at 06:03:40PM +0200, Andrea Arcangeli wrote:
> Christoph if you've interest in evolving anon-vma-sem and i_mmap_sem
> yourself in this direction, you're very welcome to go ahead while I
In case you didn't notice this already, for a further explanation o
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210096013 -7200
# Node ID e20917dcc8284b6a07cfcced13dda4cbca850a9c
# Parent 5026689a3bc323a26d33ad882c34c4c9c9a3ecd8
mmu-notifier-core
With KVM/GRU/XPMEM there isn't just the primary CPU MMU pointing to
pages
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115135 -7200
# Node ID 58f716ad4d067afb6bdd1b5f7042e19d854aae0d
# Parent 0621238970155f8ff2d60ca4996dcdd470f9c6ce
i_mmap_rwsem
The conversion to a rwsem allows notifier callbacks during rmap traversal
for files
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115131 -7200
# Node ID 20bc6a66a86ef6bd60919cc77ff51d4af741b057
# Parent 34f6a4bf67ce66714ba2d5c13a5fed241d34fb09
unmap vmas tlb flushing
Move the tlb flushing inside of unmap vmas. This saves us from passing
a p
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115136 -7200
# Node ID 6b384bb988786aa78ef07440180e4b2948c4c6a2
# Parent 58f716ad4d067afb6bdd1b5f7042e19d854aae0d
anon-vma-rwsem
Convert the anon_vma spinlock to a rw semaphore. This allows concurrent
traver
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115508 -7200
# Node ID 94eaa1515369e8ef183e2457f6f25a7f36473d70
# Parent 6b384bb988786aa78ef07440180e4b2948c4c6a2
mm_lock-rwsem
Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock
conversion.
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115132 -7200
# Node ID 0621238970155f8ff2d60ca4996dcdd470f9c6ce
# Parent 20bc6a66a86ef6bd60919cc77ff51d4af741b057
rwsem contended
Add a function to rw_semaphores to check if there are any processes
waiting f
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115798 -7200
# Node ID eb924315351f6b056428e35c983ad28040420fea
# Parent 5b2eb7d28a4517daf91b08b4dcfbb58fd2b42d0b
mmap sems
This patch adds a lock ordering rule to avoid a potential deadlock when
multiple mmap_sem
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115797 -7200
# Node ID 5b2eb7d28a4517daf91b08b4dcfbb58fd2b42d0b
# Parent 94eaa1515369e8ef183e2457f6f25a7f36473d70
export zap_page_range for XPMEM
XPMEM would have used sys_madvise() except that madvise_dontneed()
r
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115129 -7200
# Node ID d60d200565abde6a8ed45271e53cde9c5c75b426
# Parent c5badbefeee07518d9d1acca13e94c981420317c
invalidate_page outside PT lock
Moves all mmu notifier methods outside the PT lock (first and no
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115127 -7200
# Node ID c5badbefeee07518d9d1acca13e94c981420317c
# Parent e20917dcc8284b6a07cfcced13dda4cbca850a9c
get_task_mm
get_task_mm should not succeed if mmput() is running and has reduced
the mm_users co
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1210115130 -7200
# Node ID 34f6a4bf67ce66714ba2d5c13a5fed241d34fb09
# Parent d60d200565abde6a8ed45271e53cde9c5c75b426
free-pgtables
Move the tlb flushing into free_pgtables. The conversion of the locks
taken for rever
Hello everyone,
This is to allow GRU code to call __mmu_notifier_register inside the
mmap_sem (write mode is required as documented in the patch).
It also removes the requirement to implement ->release as it's not
guaranteed all users will really need it.
I didn't integrate the search function a
On Mon, May 05, 2008 at 02:46:25PM -0500, Jack Steiner wrote:
> If a task fails to unmap a GRU segment, they still exist at the start of
Yes, this will also happen in case the well-behaved task receives
SIGKILL, so you can test it that way too.
> exit. On the ->release callout, I set a flag in th
On Mon, May 05, 2008 at 12:25:06PM -0500, Jack Steiner wrote:
> Agree. My apologies... I should have caught it.
No problem.
> __mmu_notifier_register/__mmu_notifier_unregister seems like a better way to
> go, although either is ok.
If you also like __mmu_notifier_register more I'll go with it. T
On Mon, May 05, 2008 at 11:21:13AM -0500, Jack Steiner wrote:
> The GRU does the registration/deregistration of mmu notifiers from
> mmap/munmap.
> At this point, the mmap_sem is already held writeable. I hit a deadlock
> in mm_lock.
It'd have been better to know about this detail earlier, but frankly
On Sun, May 04, 2008 at 02:13:45PM -0500, Robin Holt wrote:
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -205,3 +205,6 @@ config VIRT_TO_BUS
> > config VIRT_TO_BUS
> > def_bool y
> > depends on !ARCH_NO_VIRT_TO_BUS
> > +
> > +config MMU_NOTIFIER
odule isn't updated,
to decrease the risk of runtime failure).
Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 8d45fab..ce3251c 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
On Fri, May 02, 2008 at 12:28:32PM +0300, Avi Kivity wrote:
> Applied, thanks. Dynamic allocation for the fpu state was introduced in
> 2.6.26-rc, right?
It seems very recent, hit mainline on 30 Apr.
Also we may want to think if there's something cheaper than fx_save to
trigger a math exception
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740229 -7200
# Node ID b4bf6df98bc00bfbef9423b0dd31cfdba63a5eeb
# Parent 4f462fb3dff614cd7d971219c3feaef0b43359c1
mmap sems
This patch adds a lock ordering rule to avoid a potential deadlock when
multiple mmap_sem
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740229 -7200
# Node ID 4f462fb3dff614cd7d971219c3feaef0b43359c1
# Parent 721c3787cd42043734331e54a42eb20c51766f71
export zap_page_range for XPMEM
XPMEM would have used sys_madvise() except that madvise_dontneed()
r
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740226 -7200
# Node ID 721c3787cd42043734331e54a42eb20c51766f71
# Parent 0be678c52e540d5f5d5fd9af549b57b9bb018d32
mm_lock-rwsem
Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock
conversion.
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740186 -7200
# Node ID 0be678c52e540d5f5d5fd9af549b57b9bb018d32
# Parent de28c85baef11b90c993047ca851a2f52c85a5be
anon-vma-rwsem
Convert the anon_vma spinlock to a rw semaphore. This allows concurrent
traver
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740186 -7200
# Node ID de28c85baef11b90c993047ca851a2f52c85a5be
# Parent 74b873f3ea07012e2fc864f203edf1179865feb1
i_mmap_rwsem
The conversion to a rwsem allows notifier callbacks during rmap traversal
for files
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740186 -7200
# Node ID 74b873f3ea07012e2fc864f203edf1179865feb1
# Parent a8ac53b928dfcea0ccb326fb7d71f908f0df85f4
rwsem contended
Add a function to rw_semaphores to check if there are any processes
waiting f
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740185 -7200
# Node ID 14e9f5a12bb1657fa6756e18d5dac71d4ad1a55e
# Parent ea8fc9187b6d3ef2742061b4f62598afe55281cf
free-pgtables
Move the tlb flushing into free_pgtables. The conversion of the locks
taken for rever
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740185 -7200
# Node ID ea8fc9187b6d3ef2742061b4f62598afe55281cf
# Parent c85c85c4be165eb6de16136bb97cf1fa7fd5c88f
invalidate_page outside PT lock
Moves all mmu notifier methods outside the PT lock (first and no
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740186 -7200
# Node ID a8ac53b928dfcea0ccb326fb7d71f908f0df85f4
# Parent 14e9f5a12bb1657fa6756e18d5dac71d4ad1a55e
unmap vmas tlb flushing
Move the tlb flushing inside of unmap vmas. This saves us from passing
a p
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740175 -7200
# Node ID 1489529e7b53d3f2dab8431372aa4850ec821caa
# Parent 5026689a3bc323a26d33ad882c34c4c9c9a3ecd8
mmu-notifier-core
With KVM/GRU/XPMEM there isn't just the primary CPU MMU pointing to
pages
# HG changeset patch
# User Andrea Arcangeli <[EMAIL PROTECTED]>
# Date 1209740185 -7200
# Node ID c85c85c4be165eb6de16136bb97cf1fa7fd5c88f
# Parent 1489529e7b53d3f2dab8431372aa4850ec821caa
get_task_mm
get_task_mm should not succeed if mmput() is running and has reduced
the mm_users co
Hello everyone,
1/11 is the latest version of the mmu-notifier-core patch.
As usual all later 2-11/11 patches follows but those aren't meant for 2.6.26.
Thanks!
Andrea
Hello everyone,
this is the v14 to v15 difference to the mmu-notifier-core patch. This
is just for review of the difference, I'll post full v15 soon, please
review the diff in the meantime. Lots of those cleanups are thanks to
Andrew review on mmu-notifier-core in v14. He also spotted the
GFP_KERN
Hello,
This makes sure not to schedule in atomic context during fx_init. I also
changed the name of fpu_init to fx_finit to avoid duplicating the name
of the fpu_init that is already used in the kernel; this makes grep
simpler if nothing else.
Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>
diff
On Wed, Apr 30, 2008 at 11:59:47AM +0300, Avi Kivity wrote:
> The code is not trying to find a vma for the address, but a vma for the
> address which also has VM_PFNMAP set. The cases for vma not found, or vma
> found, but not VM_PFNMAP, are folded together.
Muli's saying the comparison is rever
On Tue, Apr 29, 2008 at 06:12:51PM -0500, Anthony Liguori wrote:
> IIUC PPC correctly, all IO pages have corresponding struct pages. This
> means that get_user_pages() would succeed and you can reference count them?
> In this case, we would never take the VM_PFNMAP path.
get_user_pages only wo
On Tue, Apr 29, 2008 at 10:50:30AM -0500, Robin Holt wrote:
> You have said this continually about a CONFIG option. I am unsure how
> that could be achieved. Could you provide a patch?
I'm busy with the reserved ram patch against 2.6.25 and latest kvm.git
that is moving from pages to pfn for pci
On Mon, Apr 28, 2008 at 06:28:06PM -0700, Christoph Lameter wrote:
> On Tue, 29 Apr 2008, Andrea Arcangeli wrote:
>
> > Frankly I've absolutely no idea why rcu is needed in all rmap code
> > when walking the page->mapping. Definitely the PG_locked is taken so
> &
On Tue, Apr 29, 2008 at 09:32:09AM -0500, Anthony Liguori wrote:
> + vma = find_vma(current->mm, addr);
> + if (vma == NULL) {
> + get_page(bad_page);
> + return page_to_pfn(bad_page);
> + }
Here you must check vm_start ad
Hi Hugh!!
On Tue, Apr 29, 2008 at 11:49:11AM +0100, Hugh Dickins wrote:
> [I'm scarcely following the mmu notifiers to-and-fro, which seems
> to be in good hands, amongst faster thinkers than me: who actually
> need and can test this stuff. Don't let me slow you down; but I
> can quickly clarify
On Mon, Apr 28, 2008 at 01:34:11PM -0700, Christoph Lameter wrote:
> On Sun, 27 Apr 2008, Andrea Arcangeli wrote:
>
> > Talking about post 2.6.26: the refcount with rcu in the anon-vma
> > conversion seems unnecessary and may explain part of the AIM slowdown
> > to
On Mon, Apr 28, 2008 at 11:11:56AM -0500, Anthony Liguori wrote:
> Here's my thinking as to why we don't want to destroy the VM in the mmu
> notifiers ->release method. I don't have a valid use-case for this but my
> argument depends on the fact that this is something that should work.
> Daemo
On Sat, Apr 26, 2008 at 08:17:34AM -0500, Robin Holt wrote:
> the first four sets. The fifth is the oversubscription test which trips
> my xpmem bug. This is as good as the v12 runs from before.
Now that mmu-notifier-core #v14 seems finished and hopefully will
appear in 2.6.26 ;), I started exer
the sptes.
The ioctl of the qemu userland could run in any other task with a mm
different than the one of the guest and ->release allows this to work
fine without memory corruption and without requiring page pinning.
As far as I can tell your example explains why we need this fix ;).
Here an
On Sat, Apr 26, 2008 at 01:59:23PM -0500, Anthony Liguori wrote:
>> +static void kvm_unmap_spte(struct kvm *kvm, u64 *spte)
>> +{
>> +struct page *page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >>
>> PAGE_SHIFT);
>> +get_page(page);
>>
>
> You should not assume a struct page exists fo
disarmed regardless of MMU_NOTIFIER=y or =n.
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.25/mmu-notifier-v14/mmu-notifier-core
I'll be sending that patch to Andrew inbox.
Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>
diff --git a/arch/x86/kv
oving the external-module-compat in the same
place with the other includes where `pwd` works instead of $(src) that
doesn't work anymore for whatever reason.
Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>
diff --git a/kernel/Kbuild b/kernel/Kbuild
index cabfc75..d9245eb 100644
---
On Sat, Apr 26, 2008 at 08:17:34AM -0500, Robin Holt wrote:
> Since this include and the one for mm_types.h both are build breakages
> for ia64, I think you need to apply your ia64_cpumask and the following
> (possibly as a single patch) first or in your patch 1. Without that,
> ia64 doing a git-b
On Fri, Apr 25, 2008 at 02:25:32PM -0500, Robin Holt wrote:
> I think you still need mm_lock (unless I miss something). What happens
> when one callout is scanning mmu_notifier_invalidate_range_start() and
> you unlink. That list next pointer with LIST_POISON1 which is a really
> bad address for
On Fri, Apr 25, 2008 at 06:56:39PM +0200, Andrea Arcangeli wrote:
> > > + data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
> > > + sizeof(spinlock_t));
> >
> > This is why non-typesafe allocators suck. You want
I somehow missed this email in my inbox, and found it now because it
was strangely still unread... Sorry for the late reply!
On Tue, Apr 22, 2008 at 03:06:24PM +1000, Rusty Russell wrote:
> On Wednesday 09 April 2008 01:44:04 Andrea Arcangeli wrote:
> > --- a/include/linux/mm.h
> >
On Thu, Apr 24, 2008 at 05:39:43PM +0200, Andrea Arcangeli wrote:
> There's at least one small issue I noticed so far, that while _release
> don't need to care about _register, but _unregister definitely need to
> care about _register. I've to take the mmap_sem in additi
eed to
care about _register. I've to take the mmap_sem in addition or in
replacement of the unregister_lock. The srcu_read_lock can also likely
moved just before releasing the unregister_lock but that's just a
minor optimization to make the code more strict.
> On Thu, Apr 24, 2008 a
On Thu, Apr 24, 2008 at 12:19:28AM +0200, Andrea Arcangeli wrote:
> /dev/kvm closure. Given this can be a considered a bugfix to
> mmu_notifier_unregister I'll apply it to 1/N and I'll release a new
I'm not sure anymore this can be considered a bugfix given how large
change
On Wed, Apr 23, 2008 at 06:37:13PM +0200, Andrea Arcangeli wrote:
> I'm afraid if you don't want to worst-case unregister with ->release
> you need to have a better idea than my mm_lock and personally I can't
> see any other way than mm_lock to ensure not to mi
On Wed, Apr 23, 2008 at 11:27:21AM -0700, Christoph Lameter wrote:
> There is a potential issue in move_ptes where you call
> invalidate_range_end after dropping i_mmap_sem whereas my patches did the
> opposite. Mmap_sem saves you there?
Yes, there's really no risk of races in this area after in
On Wed, Apr 23, 2008 at 11:21:49AM -0700, Christoph Lameter wrote:
> No I really want you to do this. I have no interest in a takeover in the
Ok if you want me to do this, I definitely prefer the core to go in
now. It's so much easier to concentrate on two problems at different
times than to atta
On Wed, Apr 23, 2008 at 11:19:26AM -0700, Christoph Lameter wrote:
> If unregister fails then the driver should not detach from the address
> space immediately but wait until -->release is called. That may be
> a possible solution. It will be rare that the unregister fails.
This is the current ide
On Wed, Apr 23, 2008 at 11:09:35AM -0700, Christoph Lameter wrote:
> Why is there still the hlist stuff being used for the mmu notifier list?
> And why is this still unsafe?
What's the problem with hlist? It saves 8 bytes for each mm_struct;
you should be using it too instead of list.
> There ar
On Wed, Apr 23, 2008 at 11:02:18AM -0700, Christoph Lameter wrote:
> We have had this workaround effort done years ago and have been
> suffering the ill effects of pinning for years. Had to deal with
Yes. In addition to the pinning, there's lot of additional tlb
flushing work to do in kvm withou
On Wed, Apr 23, 2008 at 12:09:09PM -0500, Jack Steiner wrote:
>
> You may have spotted this already. If so, just ignore this.
>
> It looks like there is a bug in copy_page_range() around line 667.
> It's possible to do a mmu_notifier_invalidate_range_start(), then
> return -ENOMEM w/o doing a cor
On Wed, Apr 23, 2008 at 06:26:29PM +0200, Andrea Arcangeli wrote:
> On Tue, Apr 22, 2008 at 04:20:35PM -0700, Christoph Lameter wrote:
> > I guess I have to prepare another patchset then?
Apologies for my previous not too polite comment in answer to the
above, but I thought this double
On Tue, Apr 22, 2008 at 07:28:49PM -0500, Jack Steiner wrote:
> The GRU driver unregisters the notifier when all GRU mappings
> are unmapped. I could make it work either way - either with or without
> an unregister function. However, unregister is the most logical
> action to take when all mappings
On Tue, Apr 22, 2008 at 04:20:35PM -0700, Christoph Lameter wrote:
> I guess I have to prepare another patchset then?
If you want to embarrass yourself three times in a row go ahead ;). I
thought two failed takeovers was enough.
On Wed, Apr 23, 2008 at 10:45:36AM -0500, Robin Holt wrote:
> XPMEM has passed all regression tests using your version 12 notifiers.
That's great news, thanks! I'd greatly appreciate it if you could test
#v13 too as I posted it. It already passed GRU and KVM regression
tests and it should work fine
t we can't
focus on this for 2.6.26. We can also consider making
mmu_notifier_register safe against double calls on the same structure
but again that's not something we should be doing in 1/N and it can be
done later in a backwards compatible way (plus we're perfectly fine
with the API
On Tue, Apr 22, 2008 at 06:07:27PM -0500, Robin Holt wrote:
> > The only other change I did has been to move mmu_notifier_unregister
> > at the end of the patchset after getting more questions about its
> > reliability and I documented a bit the rmmod requirements for
> > ->release. we'll think lat
On Tue, Apr 22, 2008 at 04:14:26PM -0700, Christoph Lameter wrote:
> We want a full solution and this kind of patching makes the patches
> difficuilt to review because later patches revert earlier ones.
I know you'd rather see KVM development stalled for more months
than get a partial so
On Tue, Apr 22, 2008 at 01:30:53PM -0700, Christoph Lameter wrote:
> One solution would be to separate the invalidate_page() callout into a
> patch at the very end that can be omitted. AFACIT There is no compelling
> reason to have this callback and it complicates the API for the device
> driver
On Tue, Apr 22, 2008 at 01:26:13PM -0700, Christoph Lameter wrote:
> Doing the right patch ordering would have avoided this patch and allow
> better review.
I didn't actually write this patch myself. This did it instead:
s/anon_vma_lock/anon_vma_sem/
s/i_mmap_lock/i_mmap_sem/
s/locks/sems/
s/spi
On Tue, Apr 22, 2008 at 01:22:55PM -0700, Christoph Lameter wrote:
> Looks like this is not complete. There are numerous .h files missing which
> means that various structs are undefined (fs.h and rmap.h are needed
> f.e.) which leads to surprises when dereferencing fields of these struct.
>
> I
On Tue, Apr 22, 2008 at 01:24:21PM -0700, Christoph Lameter wrote:
> Reverts a part of an earlier patch. Why isnt this merged into 1 of 12?
To give zero regression risk to 1/12 when MMU_NOTIFIER=y or =n and the
mmu notifiers aren't registered by GRU or KVM. Keep in mind that the
whole point of my
On Tue, Apr 22, 2008 at 01:23:16PM -0700, Christoph Lameter wrote:
> Missing signoff by you.
I thought I had to sign off if I contributed anything that could
resemble copyright? Given I only merged that patch, I can add an
Acked-by if you like, but merging this into my patchset was already an
imp
On Tue, Apr 22, 2008 at 01:19:29PM -0700, Christoph Lameter wrote:
> 3. As noted by Eric and also contained in private post from yesterday by
>me: The cmp function needs to retrieve the value before
>doing comparisons which is not done for the == of a and b.
I retrieved the value, which i
On Tue, Apr 22, 2008 at 01:22:13PM -0500, Robin Holt wrote:
> 1) invalidate_page: You retain an invalidate_page() callout. I believe
> we have progressed that discussion to the point that it requires some
> direction for Andrew, Linus, or somebody in authority. The basics
> of the difference dis
On Tue, Apr 22, 2008 at 05:37:38PM +0200, Eric Dumazet wrote:
> I am saying your intent was probably to test
>
> else if ((unsigned long)*(spinlock_t **)a ==
> (unsigned long)*(spinlock_t **)b)
> return 0;
Indeed...
> Hum, it's not a micro-optimization, but a bug fix. :)