On Mon, Jan 22, 2018 at 5:26 AM, Rasmus Villemoes
wrote:
> On 2018-01-19 19:42, Linus Torvalds wrote:
>>
>> I actually asked (long long ago) for an optional compiler warning for
>> "pointer subtraction with non-power-of-2 sizes". Not because of it
>> being undefined, but simply because it's
On 2018-01-19 19:42, Linus Torvalds wrote:
>
> I actually asked (long long ago) for an optional compiler warning for
> "pointer subtraction with non-power-of-2 sizes". Not because of it
> being undefined, but simply because it's expensive. The
> divide->multiply thing doesn't always work,
Huh? If
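To make the cost argument above concrete, here is a minimal C sketch of what a pointer difference computes. The two element types and all function names are hypothetical, chosen only so that one size is a power of two (64) and the other is not (56). With a 64-byte element the compiler can typically turn the division into a shift; with a 56-byte element it needs a real divide or a multiply-based substitute.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element types: 64 is a power of two, 56 is not. */
struct pow2_elem { char bytes[64]; };
struct odd_elem  { char bytes[56]; };

/* b - a on a 64-byte element: byte distance divided by 64 (a shift). */
static ptrdiff_t diff_pow2(const struct pow2_elem *a, const struct pow2_elem *b)
{
    return b - a;
}

/* b - a on a 56-byte element: byte distance divided by 56 (not a shift). */
static ptrdiff_t diff_odd(const struct odd_elem *a, const struct odd_elem *b)
{
    return b - a;
}

/* The same result spelled out by hand: byte distance / element size.
 * This is only meaningful when both pointers are in the same array,
 * so the byte distance is an exact multiple of the element size. */
static ptrdiff_t diff_odd_by_hand(const struct odd_elem *a, const struct odd_elem *b)
{
    ptrdiff_t bytes = (const char *)b - (const char *)a;

    assert(bytes % (ptrdiff_t)sizeof(*a) == 0);
    return bytes / (ptrdiff_t)sizeof(*a);
}
```

How the division is actually lowered depends on the compiler and target, so treat the comments about shifts as the usual case rather than a guarantee.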
On Sat, Jan 20, 2018 at 05:24:32AM +, Al Viro wrote:
> On Sat, Jan 20, 2018 at 02:02:37AM +, Al Viro wrote:
>
> > Note that those sizes are rather sensitive to lockdep, spinlock debugging,
> > etc.
>
> That they certainly are: on one of the testing .config I'm using it gave this:
>
On Sat, Jan 20, 2018 at 02:02:37AM +, Al Viro wrote:
> Note that those sizes are rather sensitive to lockdep, spinlock debugging,
> etc.
That they certainly are: on one of the testing .config I'm using it gave this:
1104 sizeof struct page = 56
81 sizeof struct
On Fri, Jan 19, 2018 at 02:53:25PM -0800, Linus Torvalds wrote:
> It would probably be good to add the size too, just to explain why
> it's potentially expensive.
>
> That said, apparently we do have hundreds of them, with just
> cpufreq_frequency_table having a ton. Maybe some are hidden in
On Fri, Jan 19, 2018 at 2:12 PM, Al Viro wrote:
> On Fri, Jan 19, 2018 at 10:42:18AM -0800, Linus Torvalds wrote:
>>
>> We *should* be careful about it. I guess sparse could be made to warn,
>> but I'm afraid that we have so many of these things that a warning
>> isn't reasonable.
>
> You mean
On Fri, Jan 19, 2018 at 10:42:18AM -0800, Linus Torvalds wrote:
> On Fri, Jan 19, 2018 at 4:55 AM, Matthew Wilcox wrote:
> >
> > So really we should be casting 'b' and 'a' to uintptr_t to be fully
> > compliant with the spec.
>
> That's an unnecessary technicality.
>
> Any compiler that doesn't
On Fri, Jan 19, 2018 at 4:55 AM, Matthew Wilcox wrote:
>
> So really we should be casting 'b' and 'a' to uintptr_t to be fully
> compliant with the spec.
That's an unnecessary technicality.
Any compiler that doesn't get pointer inequality testing right is not
worth even worrying about. We
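For readers unfamiliar with the uintptr_t suggestion quoted above, a minimal sketch of comparing addresses as integers follows. The helper name `addr_in_range` is hypothetical, and the sketch assumes a flat address space in which converting a pointer to `uintptr_t` yields its numeric address (the conversion is implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if p lies in [start, start + len), comparing integer
 * addresses rather than pointers, so that p may point into a
 * different object than start without a relational comparison
 * between unrelated pointers. */
static bool addr_in_range(const void *p, const void *start, size_t len)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)start;

    assert(len <= UINTPTR_MAX - lo);   /* the range must not wrap */
    return addr >= lo && addr - lo < len;
}
```

Whether the cast is needed in practice is exactly the point under dispute in the quoted exchange; the sketch only shows what the strictly conforming form looks like.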
On Fri, Jan 19, 2018 at 02:49:55AM +0300, Kirill A. Shutemov wrote:
> > So that's why you can't do pointer diffs between two arrays. Not
> > because you can't subtract the two pointers, but because the
> > *division* part of the C pointer diff rules leads to issues.
>
> Thanks a lot for the
On Fri, Jan 19, 2018 at 12:07:47PM +, Michal Hocko wrote:
> > From 861f68c555b87fd6c0ccc3428ace91b7e185b73a Mon Sep 17 00:00:00 2001
> > From: "Kirill A. Shutemov"
> > Date: Thu, 18 Jan 2018 18:24:07 +0300
> > Subject: [PATCH] mm, page_vma_mapped: Drop faulty pointer arithmetics in
> >
On Fri 19-01-18 14:49:17, Kirill A. Shutemov wrote:
> On Fri, Jan 19, 2018 at 11:33:42AM +0100, Michal Hocko wrote:
> > On Fri 19-01-18 13:02:59, Kirill A. Shutemov wrote:
> > > On Thu, Jan 18, 2018 at 06:22:13PM +0100, Michal Hocko wrote:
> > > > On Thu 18-01-18 18:40:26, Kirill A. Shutemov
On Fri, Jan 19, 2018 at 11:33:42AM +0100, Michal Hocko wrote:
> On Fri 19-01-18 13:02:59, Kirill A. Shutemov wrote:
> > On Thu, Jan 18, 2018 at 06:22:13PM +0100, Michal Hocko wrote:
> > > On Thu 18-01-18 18:40:26, Kirill A. Shutemov wrote:
> > > [...]
> > > > + /*
> > > > + * Make
On Fri 19-01-18 13:02:59, Kirill A. Shutemov wrote:
> On Thu, Jan 18, 2018 at 06:22:13PM +0100, Michal Hocko wrote:
> > On Thu 18-01-18 18:40:26, Kirill A. Shutemov wrote:
> > [...]
> > > + /*
> > > + * Make sure that pages are in the same section before doing pointer
> > > + * arithmetics.
> >
On Thu, Jan 18, 2018 at 06:22:13PM +0100, Michal Hocko wrote:
> On Thu 18-01-18 18:40:26, Kirill A. Shutemov wrote:
> [...]
> > + /*
> > + * Make sure that pages are in the same section before doing pointer
> > + * arithmetics.
> > + */
> > + if (page_to_section(pvmw->page) !=
Kirill A. Shutemov wrote:
> Something like this?
>
>
> From 251e124630da82482e8b320c73162ce89af04d5d Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov"
> Date: Thu, 18 Jan 2018 18:24:07 +0300
> Subject: [PATCH] mm, page_vma_mapped: Fix pointer arithmetics in check_pte()
>
> Tetsuo reported
On Thu, Jan 18, 2018 at 09:26:25AM -0800, Linus Torvalds wrote:
> On Thu, Jan 18, 2018 at 8:56 AM, Kirill A. Shutemov
> wrote:
> >
> > I can't say I fully grasp how 'diff' got this value and how it leads to both
> > checks being false.
>
> I think the problem is that page difference when they
On Thu, Jan 18, 2018 at 9:26 AM, Luck, Tony wrote:
>> Both are real pages. But why do you expect pages to be 64-byte aligned?
>> Both are aligned to 64-bit as they are supposed to be, IIUC.
>
> On a 64-bit kernel sizeof struct page == 64 (after much work by people to
> trim out excess stuff). So I
On Thu, Jan 18, 2018 at 8:56 AM, Kirill A. Shutemov
wrote:
>
> I can't say I fully grasp how 'diff' got this value and how it leads to both
> checks being false.
I think the problem is the page difference when they are in different sections.
When you do
pte_page(*pvmw->pte) - pvmw->page
> Both are real pages. But why do you expect pages to be 64-byte aligned?
> Both are aligned to 64-bit as they are supposed to be, IIUC.
On a 64-bit kernel sizeof struct page == 64 (after much work by people to
trim out excess stuff). So I thought we made sure to align the base address
of blocks of
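The snippet above is cut off before the explanation, so the following is only a sketch of one way a pointer difference across unrelated blocks can turn into a huge, meaningless number. It assumes a hypothetical 40-byte element and 32-bit arithmetic, neither of which is stated in the quoted text. When a compiler knows a division is exact, it may replace "divide by 40" with "shift right by 3, then multiply by the inverse of 5 modulo 2^32". That is correct for multiples of 40 and garbage for anything else.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative inverse of 5 modulo 2^32: 5 * 0xCCCCCCCD == 1 (mod 2^32). */
#define INV5 0xCCCCCCCDu

/* "Exact" division by 40 = 8 * 5, the way a compiler may emit it when
 * it assumes the dividend is a multiple of 40: shift out the factor 8,
 * then multiply by the inverse of 5. No real division is performed. */
static uint32_t exact_div40(uint32_t byte_diff)
{
    assert(byte_diff % 8 == 0);   /* the sketch only covers 8-byte-aligned distances */
    return (uint32_t)((uint64_t)(byte_diff >> 3) * INV5);
}
```

For a distance of 280 bytes (7 elements) this returns 7. For 288 bytes, which is 8-byte aligned but not a multiple of 40, it returns 0xCCCCCCD4, a large negative number when read as a signed 32-bit value. Two pointers into separately allocated arrays have no reason to be a multiple of the element size apart, which is why both range checks in the quoted debug output can come out false.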
On Thu 18-01-18 18:40:26, Kirill A. Shutemov wrote:
[...]
> + /*
> + * Make sure that pages are in the same section before doing pointer
> + * arithmetics.
> + */
> + if (page_to_section(pvmw->page) != page_to_section(page))
> + return false;
OK, THPs shouldn't
On Thu, Jan 18, 2018 at 6:38 AM, Dave Hansen
wrote:
> On 01/18/2018 05:12 AM, Kirill A. Shutemov wrote:
>> - if (pte_page(*pvmw->pte) - pvmw->page >=
>> - hpage_nr_pages(pvmw->page)) {
>
> Is ->pte guaranteed to map a page which is within the same section
On Thu, Jan 18, 2018 at 03:58:30PM +0100, Andrea Arcangeli wrote:
> On Thu, Jan 18, 2018 at 06:45:00AM -0800, Dave Hansen wrote:
> > On 01/18/2018 04:25 AM, Kirill A. Shutemov wrote:
> > > [ 10.084024] diff: -858690919
> > > [ 10.084258] hpage_nr_pages: 1
> > > [ 10.084386] check1: 0
> > > [
On Thu, Jan 18, 2018 at 06:45:00AM -0800, Dave Hansen wrote:
> On 01/18/2018 04:25 AM, Kirill A. Shutemov wrote:
> > [ 10.084024] diff: -858690919
> > [ 10.084258] hpage_nr_pages: 1
> > [ 10.084386] check1: 0
> > [ 10.084478] check2: 0
> ...
> > diff --git a/mm/page_vma_mapped.c
On 01/18/2018 06:45 AM, Kirill A. Shutemov wrote:
> On Thu, Jan 18, 2018 at 06:38:10AM -0800, Dave Hansen wrote:
>> On 01/18/2018 05:12 AM, Kirill A. Shutemov wrote:
>>> - if (pte_page(*pvmw->pte) - pvmw->page >=
>>> - hpage_nr_pages(pvmw->page)) {
>> Is ->pte
On Thu, Jan 18, 2018 at 06:38:10AM -0800, Dave Hansen wrote:
> On 01/18/2018 05:12 AM, Kirill A. Shutemov wrote:
> > - if (pte_page(*pvmw->pte) - pvmw->page >=
> > - hpage_nr_pages(pvmw->page)) {
>
> Is ->pte guaranteed to map a page which is within the same
On 01/18/2018 04:25 AM, Kirill A. Shutemov wrote:
> [ 10.084024] diff: -858690919
> [ 10.084258] hpage_nr_pages: 1
> [ 10.084386] check1: 0
> [ 10.084478] check2: 0
...
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index d22b84310f6d..57b4397f1ea5 100644
> ---
On 01/18/2018 05:12 AM, Kirill A. Shutemov wrote:
> - if (pte_page(*pvmw->pte) - pvmw->page >=
> - hpage_nr_pages(pvmw->page)) {
Is ->pte guaranteed to map a page which is within the same section as
pvmw->page? Otherwise, with sparsemem (non-vmemmap), the
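The concern raised above can be illustrated with a toy userspace model; this is not the kernel's code. Descriptors live in separately allocated blocks ("sections"), each block knows the global index of its first entry, and the range check is done on those indices instead of on raw descriptor pointers. All types and names here (`struct item`, `struct section`, `item_to_idx`, `within_run`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct item { char payload[40]; };   /* hypothetical descriptor */

struct section {
    struct item *base;         /* first descriptor in this block */
    unsigned long first_idx;   /* global index of base[0] */
    size_t count;              /* number of descriptors in the block */
};

/* Translate a descriptor pointer into its global index. Pointer
 * subtraction happens only after the pointer is known to lie inside
 * the block it is subtracted from. */
static bool item_to_idx(const struct section *secs, size_t nsecs,
                        const struct item *it, unsigned long *idx)
{
    assert(idx != NULL);
    for (size_t i = 0; i < nsecs; i++) {
        uintptr_t p = (uintptr_t)it;
        uintptr_t lo = (uintptr_t)secs[i].base;
        uintptr_t hi = lo + secs[i].count * sizeof(struct item);

        if (p >= lo && p < hi) {
            *idx = secs[i].first_idx + (unsigned long)(it - secs[i].base);
            return true;
        }
    }
    return false;
}

/* Is candidate one of the nr descriptors starting at head?
 * Compared by global index, so blocks may be far apart in memory. */
static bool within_run(const struct section *secs, size_t nsecs,
                       const struct item *head, unsigned long nr,
                       const struct item *candidate)
{
    unsigned long h, c;

    if (!item_to_idx(secs, nsecs, head, &h) ||
        !item_to_idx(secs, nsecs, candidate, &c))
        return false;
    return c >= h && c - h < nr;
}
```

The alternative shown in the quoted patch, bailing out early when the two pages are in different sections, keeps the pointer subtraction but guarantees both operands come from the same array.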
On Thu, Jan 18, 2018 at 04:12:10PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jan 18, 2018 at 03:25:50PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Jan 18, 2018 at 05:12:45PM +0900, Tetsuo Handa wrote:
> > > Tetsuo Handa wrote:
> > > > OK. I missed the mark. I overlooked that 4.11 already has
On Thu, Jan 18, 2018 at 03:25:50PM +0300, Kirill A. Shutemov wrote:
> On Thu, Jan 18, 2018 at 05:12:45PM +0900, Tetsuo Handa wrote:
> > Tetsuo Handa wrote:
> > > OK. I missed the mark. I overlooked that 4.11 already has this problem.
> > >
> > > I needed to bisect between 4.10 and 4.11, and I got
On Thu, Jan 18, 2018 at 05:12:45PM +0900, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > OK. I missed the mark. I overlooked that 4.11 already has this problem.
> >
> > I needed to bisect between 4.10 and 4.11, and I got a plausible culprit.
> >
> > I haven't completed bisecting between
Tetsuo Handa wrote:
> OK. I missed the mark. I overlooked that 4.11 already has this problem.
>
> I needed to bisect between 4.10 and 4.11, and I got a plausible culprit.
>
> I haven't completed bisecting between b4fb8f66f1ae2e16 and c470abd4fde40ea6,
> but
> b4fb8f66f1ae2e16 ("mm, page_alloc:
On Wed, Jan 17, 2018 at 2:00 PM, Dave Hansen
wrote:
>
> I thought that page_zone_id() stuff was there to prevent this kind of
> cross-zone stuff from happening.
Ahh, that was the part I missed. Yeah looks like that checks things
properly. Although the mask generation is *so* confusing that I
On 01/17/2018 01:51 PM, Linus Torvalds wrote:
> In fact, it seems to be such a fundamental bug that I suspect I'm
> entirely wrong, and full of shit. So it's an interesting and not
> _obviously_ incorrect theory, but I suspect I must be missing
> something.
I'll just note that a few of the pfns I
On 01/17/2018 01:39 PM, Linus Torvalds wrote:
>
> So maybe something like this to test the theory?
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 76c9688b6a0a..f919a5548943 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -756,6 +756,8 @@ static inline
On Wed, Jan 17, 2018 at 1:39 PM, Linus Torvalds
wrote:
>
> In fact, the whole
>
>     pfn_valid_within(buddy_pfn)
>
> test looks very odd. Maybe the pfn of the buddy is valid, but it's not
> in the same zone? Then we'd combine the two pages in two different
> zones into one combined page.
It
On Wed, Jan 17, 2018 at 3:08 AM, Tetsuo Handa
wrote:
>
> I needed to bisect between 4.10 and 4.11, and I got a plausible culprit.
> [...]
> git bisect bad b4fb8f66f1ae2e167d06c12d018025a8d4d3ba7e
> # first bad commit: [b4fb8f66f1ae2e167d06c12d018025a8d4d3ba7e] mm,
> page_alloc: Add missing check
Linus Torvalds wrote:
> > It turned out that CONFIG_FLATMEM was irrelevant. I just did not hit it.
>
> So have you actually been able to see the problem with FLATMEM, or is
> this based on the bisect? Because I really think the bisect is pretty
> much guaranteed to be wrong.
Oops, this "it" is
On Tue, Jan 16, 2018 at 9:33 AM, Tetsuo Handa
wrote:
>
> Since I got a faster reproducer, I tried full bisection between 4.11 and
> 4.12-rc1.
> But I have no idea why bisection arrives at c0332694903a37cf.
I don't think your reproducer is 100% reliable.
And bisection is great because it's very
On Tue, Jan 16, 2018 at 12:06 AM, Dave Hansen
wrote:
> On 01/15/2018 06:14 PM, Linus Torvalds wrote:
>> But I'm adding Dave Hansen explicitly to the cc, in case he has any
>> ideas. Not because I blame him, but he's touched the sparsemem code
>> fairly recently, so maybe he'd have some idea on
Linus Torvalds wrote:
> On Mon, Jan 15, 2018 at 5:15 PM, Tetsuo Handa
> wrote:
> >
> > I can't reproduce this with CONFIG_FLATMEM=y . But I'm not sure whether
> > we are hitting a bug in CONFIG_SPARSEMEM=y code, for the bug is highly
> > timing dependent.
>
> Hmm. Maybe. But sparsemem really
* Dave Hansen wrote:
> Did anyone else notice the
>
> [ 31.068198] ? vmalloc_sync_all+0x150/0x150
>
> present in a bunch of the stack traces? That should be pretty uncommon.
I think that's pretty unusual:
> Is it just part of the normal do_page_fault() stack and the stack
>
On 01/15/2018 06:14 PM, Linus Torvalds wrote:
> But I'm adding Dave Hansen explicitly to the cc, in case he has any
> ideas. Not because I blame him, but he's touched the sparsemem code
> fairly recently, so maybe he'd have some idea on adding sanity
> checking to the sparsemem version of
On Mon, Jan 15, 2018 at 5:15 PM, Tetsuo Handa
wrote:
>
> I can't reproduce this with CONFIG_FLATMEM=y . But I'm not sure whether
> we are hitting a bug in CONFIG_SPARSEMEM=y code, for the bug is highly
> timing dependent.
Hmm. Maybe. But sparsemem really also generates *much* more complex
code
Linus Torvalds wrote:
> On Sun, Jan 14, 2018 at 3:54 AM, Tetsuo Handa
> wrote:
> > This memory corruption bug occurs even on CONFIG_SMP=n CONFIG_PREEMPT_NONE=y
> > kernel. This bug depends heavily on timing and is thus too difficult to bisect.
> > This bug seems to exist at least since Linux 4.8