Philippe Gerum wrote:
> On Tue, 2007-08-07 at 17:09 +0200, Gilles Chanteperdrix wrote:
>> On 8/7/07, Jan Kiszka <[EMAIL PROTECTED]> wrote:
>>> Gilles Chanteperdrix wrote:
>>>> The fact that you are in a hurry should not be an excuse to propose a
>>>> fix which is much worse than the bug itself.
>>> Please explain.
>> Using GFP_ATOMIC is not a good idea, because the kernel needs the
>> GFP_ATOMIC pools for allocations that take place in really atomic
>> contexts, such as interrupt handlers. fork should not be an atomic
>> context.
>>
>> So the only reason I see why you propose this patch is that you are in
>> a hurry. OK, before we fix this properly, you can use GFP_ATOMIC
>> locally, but please do not make this an official patch.
>>
>
> I'm afraid you are both right. Draining pages from the atomic pool might
> starve truly atomic contexts, and releasing the lock guarding the
> pagetable pages of the source mm seems contradictory to what the current
> code tries to achieve by holding it.
>
> Here is a patch preallocating the page for the COW-breaking code, which
> relies on the lock-breaking pattern already in place. Not exactly
> pretty, lightly tested here (UP, spinlock debug), but it looks OK so
> far. It needs SMP testing before any attempt is made to merge it.
Funny. I just finished my own patch of this kind and made it pass a
basic COW stress test here. Find it below; it is very similar. I will
now analyse yours to see which one looks nicer and/or is more correct.
Jan
---
mm/memory.c | 38 +++++++++++++++++++++++++++++---------
1 file changed, 29 insertions(+), 9 deletions(-)
Index: linux-2.6.20.15/mm/memory.c
===================================================================
--- linux-2.6.20.15.orig/mm/memory.c
+++ linux-2.6.20.15/mm/memory.c
@@ -456,7 +456,7 @@ static inline void cow_user_page(struct
 static inline int
 copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pte_t *dst_pte, pte_t *src_pte, struct vm_area_struct *vma,
-		unsigned long addr, int *rss)
+		unsigned long addr, int *rss, struct page **new_page)
 {
 	unsigned long vm_flags = vma->vm_flags;
 	pte_t pte = *src_pte;
@@ -498,13 +498,15 @@ copy_one_pte(struct mm_struct *dst_mm, s
 #ifdef CONFIG_IPIPE
 	if (((vm_flags|src_mm->def_flags) & (VM_LOCKED|VM_PINNED)) ==
 	    (VM_LOCKED|VM_PINNED)) {
 		struct page *old_page = vm_normal_page(vma, addr, pte);
-		page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+
+		page = *new_page;
 		if (!page)
 			return -ENOMEM;
+		*new_page = NULL;
 		cow_user_page(page, old_page, addr, vma);
 		pte = mk_pte(page, vma->vm_page_prot);
-
+
 		if (vm_flags & VM_SHARED)
 			pte = pte_mkclean(pte);
 		pte = pte_mkold(pte);
@@ -546,12 +548,23 @@ static int copy_pte_range(struct mm_stru
 	spinlock_t *src_ptl, *dst_ptl;
 	int progress = 0;
 	int rss[2];
+	int err = 0;
+	int need_new_pages = 0;
+	struct page *new_page = NULL;
 
 again:
+	if (need_new_pages && !new_page) {
+		new_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+		if (!new_page)
+			return -ENOMEM;
+	}
+
 	rss[1] = rss[0] = 0;
 	dst_pte = pte_alloc_map_lock(dst_mm, dst_pmd, addr, &dst_ptl);
-	if (!dst_pte)
-		return -ENOMEM;
+	if (!dst_pte) {
+		err = -ENOMEM;
+		goto cleanup_exit;
+	}
 	src_pte = pte_offset_map_nested(src_pmd, addr);
 	src_ptl = pte_lockptr(src_mm, src_pmd);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
@@ -574,10 +587,13 @@ again:
 			continue;
 		}
 		if (copy_one_pte(dst_mm, src_mm, dst_pte,
-				 src_pte, vma, addr, rss))
-			return -ENOMEM;
+				 src_pte, vma, addr, rss, &new_page) < 0) {
+			need_new_pages = 1;
+			break;
+		}
 		progress += 8;
-	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
+	} while (dst_pte++, src_pte++, addr += PAGE_SIZE,
+		 addr != end && (!need_new_pages || new_page));
 
 	arch_leave_lazy_mmu_mode();
 	spin_unlock(src_ptl);
@@ -587,7 +603,11 @@ again:
 	cond_resched();
 	if (addr != end)
 		goto again;
-	return 0;
+
+cleanup_exit:
+	if (new_page)
+		page_cache_release(new_page);
+	return err;
 }
 
 static inline int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
_______________________________________________
Adeos-main mailing list
[email protected]
https://mail.gna.org/listinfo/adeos-main
