On Thu, Aug 04, 2005 at 04:35:06PM +0100, Hugh Dickins wrote:
> And it does miss arm, the only arch which actually needs changing
> right now, if we simply restore the original values which Nick shifted
> - although arm references the VM_FAULT_ codes in some places, it also
> uses "> 0". arm26 loo
On Fri, Aug 05, 2005 at 12:30:07AM +1000, Nick Piggin wrote:
> Alexander Nyberg wrote:
> > On Wed, Aug 03, 2005 at 09:12:37AM -0700 Linus Torvalds wrote:
> >
> >
> >>
> >>Ok, I applied this because it was reasonably pretty and I liked the
> >>approach. It seems buggy, though, since it was using
On Thu, 4 Aug 2005, Alexander Nyberg wrote:
>
> Hardcoding is evil so it's good it gets cleaned up anyway.
Yes.
> > parisc, cris, m68k, frv, sh64, arm26 are also broken. Would you mind
> > resending a patch that fixes them all?
>
> Remove the hardcoding in return value checking of handle_mm_f
On Thu, 4 Aug 2005, Alexander Nyberg wrote:
>
> Hardcoding is evil so it's good it gets cleaned up anyway.
>
> > parisc, cris, m68k, frv, sh64, arm26 are also broken.
> > Would you mind resending a patch that fixes them all?
>
> Remove the hardcoding in return value checking of handle_mm_fault()
> >
> >x86_64 had hardcoded the VM_ numbers so it broke down when the numbers
> >were changed.
> >
>
> Ugh, sorry I should have audited this but I really wasn't expecting
> it (famous last words). Hasn't been a good week for me.
Hardcoding is evil so it's good it gets cleaned up anyway.
> parisc
Alexander Nyberg wrote:
On Wed, Aug 03, 2005 at 09:12:37AM -0700 Linus Torvalds wrote:
Ok, I applied this because it was reasonably pretty and I liked the
approach. It seems buggy, though, since it was using "switch ()" to test
the bits (wrongly, afaik), and I'm going to apply the appended
On Wed, Aug 03, 2005 at 09:12:37AM -0700 Linus Torvalds wrote:
>
>
> On Wed, 3 Aug 2005, Nick Piggin wrote:
> >
> > Oh, it gets rid of the -1 for VM_FAULT_OOM. Doesn't seem like there
> > is a good reason for it, but might that break out of tree drivers?
>
> Ok, I applied this because it was r
On Thu, 4 Aug 2005, Robin Holt wrote:
> On Wed, Aug 03, 2005 at 12:31:34PM +0100, Hugh Dickins wrote:
> > On Wed, 3 Aug 2005, Robin Holt wrote:
> > > On Mon, Aug 01, 2005 at 11:18:42AM -0700, Linus Torvalds wrote:
> > > >
> > > > Can somebody who saw the problem in the first place please verify?
>
On Wed, Aug 03, 2005 at 12:31:34PM +0100, Hugh Dickins wrote:
> On Wed, 3 Aug 2005, Robin Holt wrote:
> > On Mon, Aug 01, 2005 at 11:18:42AM -0700, Linus Torvalds wrote:
> > >
> > > Can somebody who saw the problem in the first place please verify?
OK. I took the three commits:
4ceb5db9757aaeadc
Linus Torvalds wrote:
On Wed, 3 Aug 2005, Nick Piggin wrote:
Oh, it gets rid of the -1 for VM_FAULT_OOM. Doesn't seem like there
is a good reason for it, but might that break out of tree drivers?
Ok, I applied this because it was reasonably pretty and I liked the
approach. It seems buggy,
On Wed, 3 Aug 2005, Linus Torvalds wrote:
>
> Ok, I applied this because it was reasonably pretty and I liked the
> approach. It seems buggy, though, since it was using "switch ()" to test
> the bits (wrongly, afaik), and I'm going to apply the appended on top of
> it. Holler quickly if you dis
On Wed, 3 Aug 2005, Linus Torvalds wrote:
>
> What makes us not get into an infinite loop there?
Ahh, never mind, I didn't notice that we never set the "read" thing at
all, so ptrace will never care if it's readable or not. No problem.
Linus
-
To unsubscribe from this list: s
On Wed, 3 Aug 2005, Linus Torvalds wrote:
>
> Ok, I applied this because it was reasonably pretty and I liked the
> approach. It seems buggy, though, since it was using "switch ()" to test
> the bits (wrongly, afaik), and I'm going to apply the appended on top of
> it. Holler quickly if you d
On Wed, 3 Aug 2005, Nick Piggin wrote:
>
> Oh, it gets rid of the -1 for VM_FAULT_OOM. Doesn't seem like there
> is a good reason for it, but might that break out of tree drivers?
Ok, I applied this because it was reasonably pretty and I liked the
approach. It seems buggy, though, since it was
Hugh Dickins wrote:
Stupidity was the reason I thought handle_mm_fault couldn't be inline:
I was picturing it static inline within mm/memory.c, failed to make the
great intellectual leap you've achieved by moving it to include/linux/mm.h.
Well it was one of my finer moments, so don't be too h
On Wed, 3 Aug 2005, Martin Schwidefsky wrote:
> Hugh Dickins <[EMAIL PROTECTED]> wrote on 08/02/2005 10:55:31 PM:
> >
> > Here we are: get_user_pages quite untested, let alone the racy case,
>
> Ahh, just tested it and everythings seems to work (even for s390)..
> I'm happy :-)
Thanks for testing
On Wed, 3 Aug 2005, Nick Piggin wrote:
> Hugh Dickins wrote:
> >
> > Here we are: get_user_pages quite untested, let alone the racy case,
> > but I think it should work. Please all hack it around as you see fit,
> > I'll check mail when I get home, but won't be very responsive...
>
> Seems OK to
On Wed, 3 Aug 2005, Robin Holt wrote:
> On Mon, Aug 01, 2005 at 11:18:42AM -0700, Linus Torvalds wrote:
> >
> > Can somebody who saw the problem in the first place please verify?
>
> Unfortunately, I can not get the user test to run against anything but the
> SLES9 SP2 kernel. I took the commit
Hugh Dickins <[EMAIL PROTECTED]> wrote on 08/02/2005 10:55:31 PM:
> > Go for it, I think whatever we do won't be wonderfully pretty.
>
> Here we are: get_user_pages quite untested, let alone the racy case,
> but I think it should work. Please all hack it around as you see fit,
> I'll check mail w
Hugh Dickins wrote:
On Tue, 2 Aug 2005, Linus Torvalds wrote:
Go for it, I think whatever we do won't be wonderfully pretty.
Here we are: get_user_pages quite untested, let alone the racy case,
but I think it should work. Please all hack it around as you see fit,
I'll check mail when I get
On Mon, Aug 01, 2005 at 11:18:42AM -0700, Linus Torvalds wrote:
> On Mon, 1 Aug 2005, Linus Torvalds wrote:
> >
> > Ie something like the below (which is totally untested, obviously, but I
> > think conceptually is a lot more correct, and obviously a lot simpler).
>
> I've tested it, and thought
On Tue, 2 Aug 2005, Linus Torvalds wrote:
>
> Go for it, I think whatever we do won't be wonderfully pretty.
Here we are: get_user_pages quite untested, let alone the racy case,
but I think it should work. Please all hack it around as you see fit,
I'll check mail when I get home, but won't be ve
On Tue, 2 Aug 2005, Hugh Dickins wrote:
> >
> > Yes, good point. If the thing is still marked dirty in the TLB, some other
> > thread might be writing to the page after we've cleared dirty but before
> > we've flushed the TLB - causing the new dirty bit to be lost. I think.
>
> Would that mat
On Tue, 2 Aug 2005, Linus Torvalds wrote:
> On Tue, 2 Aug 2005, Hugh Dickins wrote:
> >
> > It might not be so bad. It's going to access the struct page anyway.
> > And clearing dirty from parent and child at fork time could save two
> > set_page_dirtys at exit time. But I'm not sure that we cou
On Tue, 2 Aug 2005, Hugh Dickins wrote:
>
> It might not be so bad. It's going to access the struct page anyway.
> And clearing dirty from parent and child at fork time could save two
> set_page_dirtys at exit time. But I'm not sure that we could batch the
> the dirty bit clearing into one TLB
On Tue, 2 Aug 2005, Linus Torvalds wrote:
>
> Since we will have dropped the page table lock when calling
> handle_mm_fault() (which will just re-get the lock and then drop it
> again) _and_ since we don't actually mark the page dirty if it was
> writable, it's entirely possible that the VM scan
On Tue, 2 Aug 2005, Linus Torvalds wrote:
> On Tue, 2 Aug 2005, Hugh Dickins wrote:
> >
> > But have I just realized a non-s390 problem with your pte_dirty
> > technique? The ptep_set_wrprotect in fork's copy_one_pte.
> >
> > That's specifically write-protecting the pte to force COW, but leaving
On Tue, 2 Aug 2005, Linus Torvalds wrote:
>
> In fact, that brings up another race altogether: a thread that does a
> fork() at the same time [...]
You don't even need that, actually. There's another race by which the
write could have gotten lost both with the new code _and_ the old code.
Sin
Linus Torvalds <[EMAIL PROTECTED]> wrote on 08/02/2005 05:30:37 PM:
> > With the additional !pte_write(pte) check (and if I haven't overlooked
> > something which is not unlikely) s390 should work fine even without the
> > software-dirty bit hack.
>
> No it won't. It will just loop forever in a ti
On Tue, 2 Aug 2005, Hugh Dickins wrote:
>
> But have I just realized a non-s390 problem with your pte_dirty
> technique? The ptep_set_wrprotect in fork's copy_one_pte.
>
> That's specifically write-protecting the pte to force COW, but leaving
> the dirty bit: so now get_user_pages will skip CO
On Tue, 2 Aug 2005, Linus Torvalds wrote:
>
> On the other hand, this being s390, maybe nobody cares?
You have a cruel streak.
But have I just realized a non-s390 problem with your pte_dirty
technique? The ptep_set_wrprotect in fork's copy_one_pte.
That's specifically write-protecting the pte
On Tue, 2 Aug 2005, Martin Schwidefsky wrote:
>
> Why do we require the !pte_dirty(pte) check? I don't get it. If a writeable
> clean pte is just fine then why do we check the dirty bit at all? Doesn't
> pte_dirty() imply pte_write()?
A _non_writable and clean pty is _also_ fine sometimes. But
Hugh Dickins <[EMAIL PROTECTED]> wrote on 08/02/2005 02:26:09 PM:
> On Tue, 2 Aug 2005, Martin Schwidefsky wrote:
> >
> > Why do we require the !pte_dirty(pte) check? I don't get it. If a
writeable
> > clean pte is just fine then why do we check the dirty bit at all?
Doesn't
> > pte_dirty() imply
> On Monday, August 01, 2005, Linus Torvalds wrote:
>
> Also, I haven't actually heard from whoever actually
> noticed the problem in the first place (Robin?) whether
> the fix does fix it. It "obviously does", but testing
> is always good ;)
Robin took yesterday & today (Tues) off but will test
Hugh Dickins wrote:
On Tue, 2 Aug 2005, Martin Schwidefsky wrote:
With the additional !pte_write(pte) check (and if I haven't overlooked
something which is not unlikely) s390 should work fine even without the
software-dirty bit hack.
I agree the pte_write check ought to go back in next to t
On Tue, 2 Aug 2005, Martin Schwidefsky wrote:
>
> Why do we require the !pte_dirty(pte) check? I don't get it. If a writeable
> clean pte is just fine then why do we check the dirty bit at all? Doesn't
> pte_dirty() imply pte_write()?
Not quite. This is all about the peculiar ptrace case, which
> > Any chance you can change the __follow_page test to account for
> > writeable clean ptes? Something like
> >
> > if (write && !pte_dirty(pte) && !pte_write(pte))
> > goto out;
> >
> > And then you would re-add the set_page_dirty logic further on.
>
> Hmm.. That should be pos
Linus Torvalds <[EMAIL PROTECTED]> wrote on 08/01/2005 09:48:40 PM:
> > Attractive, I very much wanted to do that rather than change all the
> > arches, but I think s390 rules it out: its pte_mkdirty does nothing,
> > its pte_dirty just says no.
>
> How does s390 work at all?
The big difference b
On Tue, 2 Aug 2005, Nick Piggin wrote:
>
> Any chance you can change the __follow_page test to account for
> writeable clean ptes? Something like
>
> if (write && !pte_dirty(pte) && !pte_write(pte))
> goto out;
>
> And then you would re-add the set_page_dirty logic further
On Mon, 2005-08-01 at 20:45 -0700, Linus Torvalds wrote:
>
> On Tue, 2 Aug 2005, Nick Piggin wrote:
> >
> > Surely this introduces integrity problems when `force` is not set?
>
> "force" changes how we test the vma->vm_flags, that was always the
> meaning from a security standpoint (and that ha
On Tue, 2 Aug 2005, Nick Piggin wrote:
>
> Surely this introduces integrity problems when `force` is not set?
"force" changes how we test the vma->vm_flags, that was always the
meaning from a security standpoint (and that hasn't changed).
The old code had this "lookup_write = write && !force;
Linus Torvalds wrote:
Instead, I'd suggest changing the logic for "lookup_write". Make it
require that the page table entry is _dirty_ (not writable), and then
remove the line that says:
lookup_write = write && !force;
and you're now done. A successful mm fault for write _should_ a
Linus Torvalds wrote:
On Mon, 1 Aug 2005, Nick Piggin wrote:
Not sure if this should be fixed for 2.6.13. It can result in
pagecache corruption: so I guess that answers my own question.
Hell no.
This patch is clearly untested and must _not_ be applied:
Yes, I meant that the problem sh
On Mon, 1 Aug 2005, Linus Torvalds wrote:
>
> Of course, if VM_MAYWRITE is not set, you could just convert it silently
> to a MAP_PRIVATE at the VM level (that's literally what we used to do,
> back when we didn't support writable shared mappings at all, all those
> years ago), so at least now
On Mon, 1 Aug 2005, Hugh Dickins wrote:
> >
> > We have always just done a COW if it's read-only - even if it's shared.
> >
> > The point being that if a process mapped did a read-only mapping, and a
> > tracer wants to modify memory, the tracer is always allowed to do so, but
> > it's _not_
On Mon, 1 Aug 2005, Linus Torvalds wrote:
> On Mon, 1 Aug 2005, Hugh Dickins wrote:
> >
> > > Aside, that brings up an interesting question - why should readonly
> > > mappings of writeable files (with VM_MAYWRITE set) disallow ptrace
> > > write access while readonly mappings of readonly files no
On Mon, 1 Aug 2005, Andrew Morton wrote:
> static inline int handle_mm_fault(...)
> {
> int ret = __handle_mm_fault(...);
>
> if (unlikely(ret == VM_FAULT_RACE))
> ret = VM_FAULT_MINOR;
> return ret;
> }
> because VM_FAULT_RACE is some internal private thing.
> It d
On Mon, 1 Aug 2005, Andrew Morton wrote:
>
> We could just do:
>
> static inline int handle_mm_fault(...)
> {
> int ret = __handle_mm_fault(...);
>
> if (unlikely(ret == VM_FAULT_RACE))
> ret = VM_FAULT_MINOR;
The reason I really dislike this whole VM_FAULT_RACE thing
On Mon, 1 Aug 2005, Andrew Morton wrote:
>
> That was introduced 19 months ago by the s390 guys (see patch below).
This really is a very broken patch, btw.
> + if (write && !pte_write(pte))
> + goto out;
> + if (write && !pte_dirty(pte)) {
> +
Hugh Dickins <[EMAIL PROTECTED]> wrote:
>
> There are currently 21 architectures,
> but so far your patch only updates 14 of them?
We could just do:
static inline int handle_mm_fault(...)
{
int ret = __handle_mm_fault(...);
if (unlikely(ret == VM_FAULT_RACE))
ret
On Mon, 1 Aug 2005, Hugh Dickins wrote:
>
> > Aside, that brings up an interesting question - why should readonly
> > mappings of writeable files (with VM_MAYWRITE set) disallow ptrace
> > write access while readonly mappings of readonly files not? Or am I
> > horribly confused?
>
> Either you
On Mon, 1 Aug 2005, Nick Piggin wrote:
>
> This was tested by Robin and appears to solve the problem. Roland
> had a quick look and thought the basic idea was sound. I'd like to
> get a couple more acks before going forward, and in particular
> Robin was contemplating possible efficiency improveme
Hugh Dickins <[EMAIL PROTECTED]> wrote:
>
> On Mon, 1 Aug 2005, Linus Torvalds wrote:
> >
> > that "continue" will continue without the spinlock held, and now do
>
> Yes, I was at last about to reply on that point and others.
> I'll make those comments in a separate mail to Nick and all.
>
> >
On Mon, 1 Aug 2005, Hugh Dickins wrote:
>
> Attractive, I very much wanted to do that rather than change all the
> arches, but I think s390 rules it out: its pte_mkdirty does nothing,
> its pte_dirty just says no.
How does s390 work at all?
> Or should we change s390 to set a flag in the pte j
On Mon, 1 Aug 2005, Nick Piggin wrote:
> Ingo Molnar wrote:
> > Hugh's posting said:
> >
> > "it's trying to avoid an endless loop of finding the pte not writable
> > when ptrace is modifying a page which the user is currently protected
> > against writing to (setting a breakpoint in readonly tex
On Mon, 1 Aug 2005, Linus Torvalds wrote:
>
> that "continue" will continue without the spinlock held, and now do
Yes, I was at last about to reply on that point and others.
I'll make those comments in a separate mail to Nick and all.
> Instead, I'd suggest changing the logic for "lookup_write"
On Mon, 1 Aug 2005, Linus Torvalds wrote:
>
> Ie something like the below (which is totally untested, obviously, but I
> think conceptually is a lot more correct, and obviously a lot simpler).
I've tested it, and thought more about it, and I can't see any fault with
the approach. In fact, I lik
On Mon, 1 Aug 2005, Nick Piggin wrote:
>
> Not sure if this should be fixed for 2.6.13. It can result in
> pagecache corruption: so I guess that answers my own question.
Hell no.
This patch is clearly untested and must _not_ be applied:
+ case VM_FAULT_RACE:
+
Ingo Molnar wrote:
Hugh's posting said:
"it's trying to avoid an endless loop of finding the pte not writable
when ptrace is modifying a page which the user is currently protected
against writing to (setting a breakpoint in readonly text, perhaps?)"
i'm wondering, why should that case
* Nick Piggin <[EMAIL PROTECTED]> wrote:
> Ingo Molnar wrote:
> >* Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> >>Feedback please, anyone.
> >
> >
> >it looks good to me, but wouldnt it be simpler (in terms of patch and
> >architecture impact) to always retry the follow_page() in
> >get_user_pag
Ingo Molnar wrote:
* Nick Piggin <[EMAIL PROTECTED]> wrote:
Feedback please, anyone.
it looks good to me, but wouldnt it be simpler (in terms of patch and
architecture impact) to always retry the follow_page() in
get_user_pages(), in case of a minor fault? The sequence of minor faults
* Nick Piggin <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Not sure if this should be fixed for 2.6.13. It can result in
> pagecache corruption: so I guess that answers my own question.
>
> This was tested by Robin and appears to solve the problem. Roland had
> a quick look and thought the basic idea
62 matches
Mail list logo