On Tue, 2 Aug 2005, Martin Schwidefsky wrote: > > Why do we require the !pte_dirty(pte) check? I don't get it. If a writeable > clean pte is just fine then why do we check the dirty bit at all? Doesn't > pte_dirty() imply pte_write()?
Not quite. This is all about the peculiar ptrace case, which sets "force" to get_user_pages, and ends up handled by the little maybe_mkwrite function: we sometimes allow ptrace to modify the page while the user does not have have write access to it via the pte. Robin discovered a race which proves it's unsafe for get_user_pages to reset its lookup_write flag (another stage in this peculiar path) after a single try, Nick proposed a patch which adds another VM_ return code which each arch would need to handle, Linus looked for something simpler and hit upon checking pte_dirty rather than pte_write (and removing the then unnecessary lookup_write flag). Linus' changes are in the 2.6.13-rc5 mm/memory.c, but that leaves s390 broken at present. > With the additional !pte_write(pte) check (and if I haven't overlooked > something which is not unlikely) s390 should work fine even without the > software-dirty bit hack. I agree the pte_write check ought to go back in next to the pte_dirty check, and that will leave s390 handling most uses of get_user_pages correctly, but still failing to handle the peculiar case of strace modifying a page to which the user does not currently have write access (e.g. setting a breakpoint in readonly text). Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/