Re: oops in copy_page_rep()

2013-01-11 Andrea Arcangeli
On Fri, Jan 11, 2013 at 01:50:44AM -0600, Simon Jeons wrote: > On Tue, 2013-01-08 at 18:49 +0100, Andrea Arcangeli wrote: > > Hi Kirill, > > > > On Tue, Jan 08, 2013 at 07:30:58PM +0200, Kirill A. Shutemov wrote: > > > Merged patch is obviously broken: huge_pmd_set_accessed() can be called > > > o

Re: oops in copy_page_rep()

2013-01-10 Simon Jeons
On Tue, 2013-01-08 at 18:49 +0100, Andrea Arcangeli wrote: > Hi Kirill, > > On Tue, Jan 08, 2013 at 07:30:58PM +0200, Kirill A. Shutemov wrote: > > Merged patch is obviously broken: huge_pmd_set_accessed() can be called > > only if the pmd is under splitting. > > Of course I assume you meant "onl

Re: oops in copy_page_rep()

2013-01-09 Mel Gorman
On Tue, Jan 08, 2013 at 08:52:14AM -0800, Linus Torvalds wrote: > On Tue, Jan 8, 2013 at 8:31 AM, Kirill A. Shutemov > wrote: > >> > >> Heh. I was more thinking about why do_huge_pmd_wp_page() needs it, but > >> do_huge_pmd_numa_page() does not. > > > > It does. The check should be moved up. > >

Re: oops in copy_page_rep()

2013-01-09 Hillf Danton
On Wed, Jan 9, 2013 at 2:21 AM, Linus Torvalds wrote: > On Tue, Jan 8, 2013 at 10:03 AM, Andrea Arcangeli wrote: >> >> It looks very fine to me, but I suggest to move it above the >> pmd_numa() check because of the newly introduced >> migrate_misplaced_transhuge_page method relying on pmd_same to

Re: oops in copy_page_rep()

2013-01-08 Hugh Dickins
On Tue, 8 Jan 2013, Andrea Arcangeli wrote: > > Looking at this, one thing that isn't clear is where the page_count is > checked in migrate_misplaced_transhuge_page. Ok that it's unable to > migrate anon transhuge COW shared pages so it doesn't need to mess > with rmap (the mapcount check makes it

Re: oops in copy_page_rep()

2013-01-08 Linus Torvalds
On Tue, Jan 8, 2013 at 10:03 AM, Andrea Arcangeli wrote: > > It looks very fine to me, but I suggest to move it above the > pmd_numa() check because of the newly introduced > migrate_misplaced_transhuge_page method relying on pmd_same too. Hmm. If we need it there, then we need to fix the *later*

Re: oops in copy_page_rep()

2013-01-08 Andrea Arcangeli
On Tue, Jan 08, 2013 at 09:51:47AM -0800, Linus Torvalds wrote: > On Tue, Jan 8, 2013 at 9:37 AM, Andrea Arcangeli wrote: > > > > The reason it returned to userland and retried the fault is that this > > should be infrequent enough not to worry about it and this was > > marginally simpler but it c

Re: oops in copy_page_rep()

2013-01-08 Kirill A. Shutemov
On Tue, Jan 08, 2013 at 06:49:51PM +0100, Andrea Arcangeli wrote: > Hi Kirill, > > On Tue, Jan 08, 2013 at 07:30:58PM +0200, Kirill A. Shutemov wrote: > > Merged patch is obviously broken: huge_pmd_set_accessed() can be called > > only if the pmd is under splitting. > > Of course I assume you mea

Re: oops in copy_page_rep()

2013-01-08 Linus Torvalds
On Tue, Jan 8, 2013 at 9:37 AM, Andrea Arcangeli wrote: > > The reason it returned to userland and retried the fault is that this > should be infrequent enough not to worry about it and this was > marginally simpler but it could be changed. Yeah, that was my suspicion. And as mentioned, returning

Re: oops in copy_page_rep()

2013-01-08 Andrea Arcangeli
Hi Kirill, On Tue, Jan 08, 2013 at 07:30:58PM +0200, Kirill A. Shutemov wrote: > Merged patch is obviously broken: huge_pmd_set_accessed() can be called > only if the pmd is under splitting. Of course I assume you meant "only if the pmd is not under splitting". But no, setting a bitflag like the

Re: oops in copy_page_rep()

2013-01-08 Linus Torvalds
On Tue, Jan 8, 2013 at 9:30 AM, Kirill A. Shutemov wrote: > > Check difference between patch above and merged one -- a1dd450. > Merged patch is obviously broken: huge_pmd_set_accessed() can be called > only if the pmd is under splitting. Ok, that's a totally different issue, and seems to be due t

Re: oops in copy_page_rep()

2013-01-08 Andrea Arcangeli
Hi, On Tue, Jan 08, 2013 at 08:52:14AM -0800, Linus Torvalds wrote: > On Tue, Jan 8, 2013 at 8:31 AM, Kirill A. Shutemov > wrote: > >> > >> Heh. I was more thinking about why do_huge_pmd_wp_page() needs it, but > >> do_huge_pmd_numa_page() does not. > > > > It does. The check should be moved up.

Re: oops in copy_page_rep()

2013-01-08 Kirill A. Shutemov
On Tue, Jan 08, 2013 at 08:52:14AM -0800, Linus Torvalds wrote: > On Tue, Jan 8, 2013 at 8:31 AM, Kirill A. Shutemov > wrote: > >> > >> Heh. I was more thinking about why do_huge_pmd_wp_page() needs it, but > >> do_huge_pmd_numa_page() does not. > > > > It does. The check should be moved up. > >

Re: oops in copy_page_rep()

2013-01-08 Linus Torvalds
On Tue, Jan 8, 2013 at 8:31 AM, Kirill A. Shutemov wrote: >> >> Heh. I was more thinking about why do_huge_pmd_wp_page() needs it, but >> do_huge_pmd_numa_page() does not. > > It does. The check should be moved up. > >> Also, do we actually need it for huge_pmd_set_accessed()? The >> *placement* o

Re: oops in copy_page_rep()

2013-01-08 Kirill A. Shutemov
On Tue, Jan 08, 2013 at 07:37:06AM -0800, Linus Torvalds wrote: > On Tue, Jan 8, 2013 at 5:04 AM, Hillf Danton wrote: > > On Tue, Jan 8, 2013 at 1:34 AM, Linus Torvalds > > wrote: > >> > >> Hmm. Is there some reason we never need to worry about it for the > >> "pmd_numa()" case just above? > >> >

Re: oops in copy_page_rep()

2013-01-08 Linus Torvalds
On Tue, Jan 8, 2013 at 5:04 AM, Hillf Danton wrote: > On Tue, Jan 8, 2013 at 1:34 AM, Linus Torvalds > wrote: >> >> Hmm. Is there some reason we never need to worry about it for the >> "pmd_numa()" case just above? >> >> A comment about this all might be a really good idea. >> > Yes Sir, added.

Re: oops in copy_page_rep()

2013-01-08 Hillf Danton
On Tue, Jan 8, 2013 at 1:34 AM, Linus Torvalds wrote: > On Mon, Jan 7, 2013 at 4:24 AM, Hillf Danton wrote: >> >> I take another try with waiting added, take a look please. > > Hmm. Is there some reason we never need to worry about it for the > "pmd_numa()" case just above? > > A comment about th

Re: oops in copy_page_rep()

2013-01-07 Linus Torvalds
On Mon, Jan 7, 2013 at 4:24 AM, Hillf Danton wrote: > > I take another try with waiting added, take a look please. Hmm. Is there some reason we never need to worry about it for the "pmd_numa()" case just above? A comment about this all might be a really good idea. Linus

Re: oops in copy_page_rep()

2013-01-07 Hillf Danton
Hello Hugh On Mon, Jan 7, 2013 at 3:06 AM, Hugh Dickins wrote: > I don't entirely like your patch (or the original code): shouldn't > there be a wait_split_huge_page(), rather than hammering back with > repeated faults until the split has completed? > I take another try with waiting added, take a

Re: oops in copy_page_rep()

2013-01-06 Hugh Dickins
On Sun, 6 Jan 2013, Dave Jones wrote: > > investigating the huge page theory a little further I'm a bit confused. > The kernel on that machine has THP enabled, and the cpu supports it (an old > amd64), but.. > > $ cat /sys/kernel/mm/hugepages/hugepages-2048kB/* > 0 > 0 > 0 > 0 > 0 > 0 > > I was

Re: oops in copy_page_rep()

2013-01-06 Dave Jones
On Sat, Jan 05, 2013 at 07:57:39PM -0800, Linus Torvalds wrote: > Adding more people in case somebody else has any idea. Anybody? > > On Sat, Jan 5, 2013 at 7:22 AM, Dave Jones wrote: > > I have no idea what happened here, but this is the first time I've seen > > this one. > > This was run

Re: oops in copy_page_rep()

2013-01-06 Hugh Dickins
On Sun, 6 Jan 2013, Hillf Danton wrote: > On Sat, Jan 5, 2013 at 11:22 PM, Dave Jones wrote: > > I have no idea what happened here, but this is the first time I've seen > > this one. > > This was running a tree pulled yesterday afternoon. > > > > BUG: unable to handle kernel paging request at fff

Re: oops in copy_page_rep()

2013-01-06 Dave Jones
On Sun, Jan 06, 2013 at 07:55:53PM +0800, Hillf Danton wrote: > On Sat, Jan 5, 2013 at 11:22 PM, Dave Jones wrote: > > I have no idea what happened here, but this is the first time I've seen > > this one. > > This was running a tree pulled yesterday afternoon. > > > Would you please try th

Re: oops in copy_page_rep()

2013-01-06 Hillf Danton
Hi Dave On Sat, Jan 5, 2013 at 11:22 PM, Dave Jones wrote: > I have no idea what happened here, but this is the first time I've seen this > one. > This was running a tree pulled yesterday afternoon. > > BUG: unable to handle kernel paging request at ffff880100201000 > IP: [] copy_page_rep+0x5/0x

Re: oops in copy_page_rep()

2013-01-05 Linus Torvalds
Adding more people in case somebody else has any idea. Anybody? On Sat, Jan 5, 2013 at 7:22 AM, Dave Jones wrote: > I have no idea what happened here, but this is the first time I've seen this > one. > This was running a tree pulled yesterday afternoon. > > BUG: unable to handle kernel paging re