Re: pgd_none_or_clear_bad strangeness?
On Wed, Oct 03, 2007 at 07:18:23PM +0100, Hugh Dickins wrote:
> On Wed, 3 Oct 2007, Nick Piggin wrote:
> > On Tue, Oct 02, 2007 at 05:20:03PM -0500, Matt Mackall wrote:
> > > In lib/pagewalk.c, I've been using the various forms of
> > > {pgd,pud,pmd}_none_or_clear_bad while walking page tables as that
> > > seemed the canonical way to do things. Lately (eg with -rc7-mm1),
> > > these have been triggering messages like "bad pgd 0x01e3" and causing
> > > nasty double faults. It appears this is actually triggered at the pmd
> > > level (mm/memory.c:116), though it appears to produce the wrong
> > > message.
>
> I guess the "wrong message" is an artifact of pud/pmd folding;
> but I get too confused by the different levels myself to want to
> think more about it - I'll just assume it's "right" somehow ;)
>
> > >
> > > Has something changed here? I'm pretty sure this used to work! Is this
>
> I don't know of anything changing here, sorry.
>
> > > not a kosher thing to do? Does it make any sense I'd repeatedly run
> > > into a bad pmd in the middle of bash's page table right after boot?
> > > The simple _none variant seems to work, but I worry that it's papering
> > > over a real problem.
> >
> > No, I think that should be the right thing to do for userspace pages.
> > You're not walking into a hugetlb area or a kernel mapping are you?
> > (the bad pgd: line could be important... 0x01e3 would be a linear kernel
> > mapping I think?).
>
> I should have spent more time reading Nick's reply and less time trying
> to work it out for myself! Yes, that's the conclusion I came to, for
> some reason you're now going beyond the user vmas and walking into the
> linear kernel mapping, which has _PAGE_GLOBAL and _PAGE_PSE bits set.

Indeed, that's precisely what's happening. I'm walking one page past
the end of userspace. And the reason is I changed my walker from using
for loops to do/while loops at Nick's insistence, so start==end no
longer gets noticed immediately.

This also explains why the bug doesn't manifest in lguest: no PSE
mappings.

Thanks, guys!

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
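
[Editorial sketch, not code from the thread.] The off-by-one described in the message above comes down to where the loop condition is tested. The following is a minimal userspace model in C, not the actual lib/pagewalk.c code: the names `sketch_walk_for`, `sketch_walk_do_while`, `sketch_walk_guarded` and `SKETCH_PAGE_SIZE` are hypothetical, and the "visit" is reduced to a counter. The guarded variant is only one possible way to restore the empty-range check; the thread does not say how the walker was actually fixed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* for-loop form: the condition is tested before the first iteration,
 * so an empty range (start == end) visits nothing. */
static size_t sketch_walk_for(unsigned long start, unsigned long end)
{
	size_t visited = 0;
	unsigned long addr;

	assert(start % SKETCH_PAGE_SIZE == 0 && end % SKETCH_PAGE_SIZE == 0);
	for (addr = start; addr < end; addr += SKETCH_PAGE_SIZE)
		visited++;
	return visited;
}

/* do/while form: the body runs once before the condition is tested,
 * so an empty range still touches the entry at 'start'. */
static size_t sketch_walk_do_while(unsigned long start, unsigned long end)
{
	size_t visited = 0;
	unsigned long addr = start;

	assert(start % SKETCH_PAGE_SIZE == 0 && end % SKETCH_PAGE_SIZE == 0);
	do {
		visited++;
		addr += SKETCH_PAGE_SIZE;
	} while (addr < end);
	return visited;
}

/* Hypothetical guard: reject the empty range before entering do/while. */
static size_t sketch_walk_guarded(unsigned long start, unsigned long end)
{
	if (start >= end)
		return 0;
	return sketch_walk_do_while(start, end);
}
```

For a non-empty range the three forms agree. For start == end the for loop visits nothing, while the unguarded do/while still touches one entry at `start`. If `start` is the end of userspace, that entry is the first kernel mapping.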
Re: pgd_none_or_clear_bad strangeness?
On Wed, 3 Oct 2007, Nick Piggin wrote:
> On Tue, Oct 02, 2007 at 05:20:03PM -0500, Matt Mackall wrote:
> > In lib/pagewalk.c, I've been using the various forms of
> > {pgd,pud,pmd}_none_or_clear_bad while walking page tables as that
> > seemed the canonical way to do things. Lately (eg with -rc7-mm1),
> > these have been triggering messages like "bad pgd 0x01e3" and causing
> > nasty double faults. It appears this is actually triggered at the pmd
> > level (mm/memory.c:116), though it appears to produce the wrong
> > message.

I guess the "wrong message" is an artifact of pud/pmd folding;
but I get too confused by the different levels myself to want to
think more about it - I'll just assume it's "right" somehow ;)

> >
> > Has something changed here? I'm pretty sure this used to work! Is this

I don't know of anything changing here, sorry.

> > not a kosher thing to do? Does it make any sense I'd repeatedly run
> > into a bad pmd in the middle of bash's page table right after boot?
> > The simple _none variant seems to work, but I worry that it's papering
> > over a real problem.
>
> No, I think that should be the right thing to do for userspace pages.
> You're not walking into a hugetlb area or a kernel mapping are you?
> (the bad pgd: line could be important... 0x01e3 would be a linear kernel
> mapping I think?).

I should have spent more time reading Nick's reply and less time trying
to work it out for myself! Yes, that's the conclusion I came to, for
some reason you're now going beyond the user vmas and walking into the
linear kernel mapping, which has _PAGE_GLOBAL and _PAGE_PSE bits set.

Hugh
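
[Editorial sketch, not code from the thread.] The value 0x01e3 can be decoded by hand to confirm the diagnosis above. The C model below uses the architectural x86 bit positions for the low flag bits of a page-table entry. The `SK_` names are local to the sketch, not kernel identifiers. The "bad" test is a simplified model of how the i386 `pmd_bad()` check of that era is believed to work (flag bits, ignoring the user bit, must equal the bits expected of a pointer to a lower-level table), so treat the exact mask as an assumption rather than a quotation of kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Low flag bits of an x86 page-table entry (architectural positions). */
#define SK_PRESENT  0x001UL
#define SK_RW       0x002UL
#define SK_USER     0x004UL
#define SK_ACCESSED 0x020UL
#define SK_DIRTY    0x040UL
#define SK_PSE      0x080UL	/* large page: entry maps memory directly */
#define SK_GLOBAL   0x100UL	/* translation survives address-space switches */

#define SK_FLAG_MASK  0xfffUL
/* Bits assumed to be expected in an entry that points to a lower table. */
#define SK_TABLE_BITS (SK_PRESENT | SK_RW | SK_ACCESSED | SK_DIRTY)

/* Simplified model of a "bad entry" test: any flag besides the user bit
 * that differs from SK_TABLE_BITS makes the entry look corrupt. */
static bool sk_entry_bad(unsigned long entry)
{
	return (entry & SK_FLAG_MASK & ~SK_USER) != SK_TABLE_BITS;
}

/* Sanity checks on the value reported in the thread. */
static void sk_check_thread_value(void)
{
	unsigned long v = 0x1e3UL;

	/* 0x1e3 is exactly the table bits plus PSE and GLOBAL. */
	assert(v == (SK_TABLE_BITS | SK_PSE | SK_GLOBAL));
	assert(sk_entry_bad(v));		/* large kernel mapping is flagged */
	assert(!sk_entry_bad(SK_TABLE_BITS));	/* ordinary table pointer passes */
}
```

Under this model a large-page kernel mapping is a perfectly valid entry that merely fails a test written for table pointers, which is consistent with the plain _none variant appearing to work.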
Re: pgd_none_or_clear_bad strangeness?
On Tue, Oct 02, 2007 at 05:20:03PM -0500, Matt Mackall wrote:
> In lib/pagewalk.c, I've been using the various forms of
> {pgd,pud,pmd}_none_or_clear_bad while walking page tables as that
> seemed the canonical way to do things. Lately (eg with -rc7-mm1),
> these have been triggering messages like "bad pgd 0x01e3" and causing
> nasty double faults. It appears this is actually triggered at the pmd
> level (mm/memory.c:116), though it appears to produce the wrong
> message.
>
> Has something changed here? I'm pretty sure this used to work! Is this
> not a kosher thing to do? Does it make any sense I'd repeatedly run
> into a bad pmd in the middle of bash's page table right after boot?
> The simple _none variant seems to work, but I worry that it's papering
> over a real problem.

No, I think that should be the right thing to do for userspace pages.
You're not walking into a hugetlb area or a kernel mapping are you?
(the bad pgd: line could be important... 0x01e3 would be a linear kernel
mapping I think?).
pgd_none_or_clear_bad strangeness?
In lib/pagewalk.c, I've been using the various forms of
{pgd,pud,pmd}_none_or_clear_bad while walking page tables as that
seemed the canonical way to do things. Lately (eg with -rc7-mm1),
these have been triggering messages like "bad pgd 0x01e3" and causing
nasty double faults. It appears this is actually triggered at the pmd
level (mm/memory.c:116), though it appears to produce the wrong
message.

Has something changed here? I'm pretty sure this used to work! Is this
not a kosher thing to do? Does it make any sense I'd repeatedly run
into a bad pmd in the middle of bash's page table right after boot?
The simple _none variant seems to work, but I worry that it's papering
over a real problem.

--
Mathematics is the supreme nostalgia of our time.