Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-02 Thread Michal Hocko
On Thu 02-08-12 14:33:10, Mel Gorman wrote: > On Thu, Aug 02, 2012 at 02:36:58PM +0200, Michal Hocko wrote: > > On Thu 02-08-12 08:37:57, Mel Gorman wrote: > > > On Thu, Aug 02, 2012 at 09:19:34AM +0200, Michal Hocko wrote: > > [...] > > > > On the other hand, mine is more coupled with the sharing

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-02 Thread Mel Gorman
On Thu, Aug 02, 2012 at 02:36:58PM +0200, Michal Hocko wrote: > On Thu 02-08-12 08:37:57, Mel Gorman wrote: > > On Thu, Aug 02, 2012 at 09:19:34AM +0200, Michal Hocko wrote: > [...] > > > On the other hand, mine is more coupled with the sharing code so it > > > makes the code easier to follow and a

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-02 Thread Michal Hocko
On Thu 02-08-12 08:37:57, Mel Gorman wrote: > On Thu, Aug 02, 2012 at 09:19:34AM +0200, Michal Hocko wrote: [...] > > On the other hand, mine is more coupled with the sharing code so it > > makes the code easier to follow and also makes the sharing more > > effective because racing processes see pm

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-02 Thread Mel Gorman
On Thu, Aug 02, 2012 at 09:19:34AM +0200, Michal Hocko wrote: > Hi Larry, > > On Wed 01-08-12 11:06:33, Larry Woodman wrote: > > On 08/01/2012 08:32 AM, Michal Hocko wrote: > > > > > >I am really lame :/. The previous patch is wrong as well for goto out > > >branch. The updated patch as follows: >

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-02 Thread Michal Hocko
Hi Larry, On Wed 01-08-12 11:06:33, Larry Woodman wrote: > On 08/01/2012 08:32 AM, Michal Hocko wrote: > > > >I am really lame :/. The previous patch is wrong as well for goto out > >branch. The updated patch as follows: > This patch worked fine Michal! Thanks for the good news! > You and Mel

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-01 Thread Larry Woodman
On 08/01/2012 08:32 AM, Michal Hocko wrote: I am really lame :/. The previous patch is wrong as well for goto out branch. The updated patch as follows: This patch worked fine Michal! You and Mel can duke it out over whose is best. :) Larry

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-01 Thread Michal Hocko
On Wed 01-08-12 10:20:36, Michal Hocko wrote: > On Tue 31-07-12 22:45:43, Larry Woodman wrote: > > On 07/31/2012 04:06 PM, Michal Hocko wrote: > > >On Tue 31-07-12 13:49:21, Larry Woodman wrote: > > >>On 07/31/2012 08:46 AM, Mel Gorman wrote: > > >>>Fundamentally I think the problem is that we are

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-08-01 Thread Michal Hocko
On Tue 31-07-12 22:45:43, Larry Woodman wrote: > On 07/31/2012 04:06 PM, Michal Hocko wrote: > >On Tue 31-07-12 13:49:21, Larry Woodman wrote: > >>On 07/31/2012 08:46 AM, Mel Gorman wrote: > >>>Fundamentally I think the problem is that we are not correctly detecting > >>>that page table sharing too

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Larry Woodman
On 07/31/2012 04:06 PM, Michal Hocko wrote: On Tue 31-07-12 13:49:21, Larry Woodman wrote: On 07/31/2012 08:46 AM, Mel Gorman wrote: Fundamentally I think the problem is that we are not correctly detecting that page table sharing took place during huge_pte_alloc(). This patch is longer and make

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Michal Hocko
On Tue 31-07-12 13:49:21, Larry Woodman wrote: > On 07/31/2012 08:46 AM, Mel Gorman wrote: > > > >Fundamentally I think the problem is that we are not correctly detecting > >that page table sharing took place during huge_pte_alloc(). This patch is > >longer and makes an API change but if I'm right,

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Larry Woodman
On 07/31/2012 08:46 AM, Mel Gorman wrote: Fundamentally I think the problem is that we are not correctly detecting that page table sharing took place during huge_pte_alloc(). This patch is longer and makes an API change but if I'm right, it addresses the underlying problem. The first VM_MAYSHARE

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Rik van Riel
On 07/31/2012 08:46 AM, Mel Gorman wrote: mm: hugetlbfs: Correctly detect if page tables have just been shared Each page mapped in a process's address space must be correctly accounted for in _mapcount. Normally the rules for this are straightforward but hugetlbfs page table sharing is differe

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Mel Gorman
On Tue, Jul 31, 2012 at 09:07:14AM -0400, Larry Woodman wrote: > On 07/31/2012 08:46 AM, Mel Gorman wrote: > >On Mon, Jul 30, 2012 at 03:11:27PM -0400, Larry Woodman wrote: > >>> > >>>That is a surprise. Can you try your test case on 3.4 and tell us if the > >>>patch fixes the problem there? I woul

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Michal Hocko
On Tue 31-07-12 13:46:50, Mel Gorman wrote: [...] > mm: hugetlbfs: Correctly detect if page tables have just been shared > > Each page mapped in a process's address space must be correctly > accounted for in _mapcount. Normally the rules for this are > straightforward but hugetlbfs page table sha

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Larry Woodman
On 07/31/2012 08:46 AM, Mel Gorman wrote: On Mon, Jul 30, 2012 at 03:11:27PM -0400, Larry Woodman wrote: That is a surprise. Can you try your test case on 3.4 and tell us if the patch fixes the problem there? I would like to rule out the possibility that the locking rules are slightly different

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Mel Gorman
On Mon, Jul 30, 2012 at 03:11:27PM -0400, Larry Woodman wrote: > > > >That is a surprise. Can you try your test case on 3.4 and tell us if the > >patch fixes the problem there? I would like to rule out the possibility > >that the locking rules are slightly different in RHEL. If it hits on 3.4 > >t

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-31 Thread Hillf Danton
On Tue, Jul 31, 2012 at 3:11 AM, Larry Woodman wrote:
> [ 1106.156569] ------------[ cut here ]------------
> [ 1106.161731] kernel BUG at mm/filemap.c:135!
> [ 1106.166395] invalid opcode: [#1] SMP
> [ 1106.170975] CPU 22
> [ 1106.173115] Modules linked in: bridge stp llc sunrpc binfmt_misc

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-30 Thread Larry Woodman
On 07/27/2012 06:23 AM, Mel Gorman wrote: On Thu, Jul 26, 2012 at 11:48:56PM -0400, Larry Woodman wrote: On 07/26/2012 02:37 PM, Rik van Riel wrote: On 07/23/2012 12:04 AM, Hugh Dickins wrote: I spent hours trying to dream up a better patch, trying various approaches. I think I have a nice o

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-27 Thread Larry Woodman
On 07/27/2012 06:23 AM, Mel Gorman wrote: On Thu, Jul 26, 2012 at 11:48:56PM -0400, Larry Woodman wrote: On 07/26/2012 02:37 PM, Rik van Riel wrote: On 07/23/2012 12:04 AM, Hugh Dickins wrote: I spent hours trying to dream up a better patch, trying various approaches. I think I have a nice o

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-27 Thread Mel Gorman
On Thu, Jul 26, 2012 at 11:48:56PM -0400, Larry Woodman wrote: > On 07/26/2012 02:37 PM, Rik van Riel wrote: > >On 07/23/2012 12:04 AM, Hugh Dickins wrote: > > > >>I spent hours trying to dream up a better patch, trying various > >>approaches. I think I have a nice one now, what do you think? And

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-27 Thread Larry Woodman
On 07/26/2012 11:48 PM, Larry Woodman wrote: Mel, did you see this??? Larry This patch looks good to me. Larry, does Hugh's patch survive your testing? Like I said earlier, no. However, I finally set up a reproducer that only takes a few seconds on a large system and this totally fixe

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-27 Thread Michal Hocko
On Thu 26-07-12 14:31:50, Rik van Riel wrote:
> On 07/20/2012 10:36 AM, Michal Hocko wrote:
>
> >--- a/arch/x86/mm/hugetlbpage.c
> >+++ b/arch/x86/mm/hugetlbpage.c
> >@@ -81,7 +81,12 @@ static void huge_pmd_share(struct mm_struct *mm, unsigned
> >long addr, pud_t *pud)
> > 		if (saddr)

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-27 Thread Mel Gorman
On Thu, Jul 26, 2012 at 01:42:26PM -0400, Rik van Riel wrote: > On 07/23/2012 12:04 AM, Hugh Dickins wrote: > > >Please don't be upset if I say that I don't like either of your patches. > >Mainly for obvious reasons - I don't like Mel's because anything with > >trylock retries and nested spinlocks

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-26 Thread Larry Woodman
On 07/26/2012 02:37 PM, Rik van Riel wrote: On 07/23/2012 12:04 AM, Hugh Dickins wrote: I spent hours trying to dream up a better patch, trying various approaches. I think I have a nice one now, what do you think? And more importantly, does it work? I have not tried to test it at all, that I

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-26 Thread Rik van Riel
On 07/23/2012 12:04 AM, Hugh Dickins wrote: I spent hours trying to dream up a better patch, trying various approaches. I think I have a nice one now, what do you think? And more importantly, does it work? I have not tried to test it at all, that I'm hoping to leave to you, I'm sure you'll at

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-26 Thread Rik van Riel
On 07/20/2012 10:36 AM, Michal Hocko wrote:
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -81,7 +81,12 @@ static void huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 		if (saddr) {
 			spte = huge_pte_offset(svma->vm_mm

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-26 Thread Rik van Riel
On 07/23/2012 12:04 AM, Hugh Dickins wrote: Please don't be upset if I say that I don't like either of your patches. Mainly for obvious reasons - I don't like Mel's because anything with trylock retries and nested spinlocks worries me before I can even start to think about it; and I don't like M

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-26 Thread Larry Woodman
On 07/26/2012 01:42 PM, Rik van Riel wrote: On 07/23/2012 12:04 AM, Hugh Dickins wrote: Please don't be upset if I say that I don't like either of your patches. Mainly for obvious reasons - I don't like Mel's because anything with trylock retries and nested spinlocks worries me before I can eve

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-25 Thread Mel Gorman
On Tue, Jul 24, 2012 at 12:23:58PM -0700, Hugh Dickins wrote: > On Tue, 24 Jul 2012, Mel Gorman wrote: > > On Mon, Jul 23, 2012 at 06:08:05PM -0700, Hugh Dickins wrote: > > > > > > So, after a bout of anxiety, I think my &= ~VM_MAYSHARE remains good. > > > > > > > I agree with you. When I was th

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-24 Thread Hugh Dickins
On Tue, 24 Jul 2012, Mel Gorman wrote: > On Mon, Jul 23, 2012 at 06:08:05PM -0700, Hugh Dickins wrote: > > > > So, after a bout of anxiety, I think my &= ~VM_MAYSHARE remains good. > > > > I agree with you. When I was thinking about the potential problems, I was > thinking of them in the general

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-24 Thread Michal Hocko
On Tue 24-07-12 10:34:06, Mel Gorman wrote: [...] > ---8<--- > mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables > > If a process creates a large hugetlbfs mapping that is eligible for page > table sharing and forks heavily with children some of whom fault and > others whic

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-24 Thread Mel Gorman
On Mon, Jul 23, 2012 at 06:08:05PM -0700, Hugh Dickins wrote: > On Mon, 23 Jul 2012, Mel Gorman wrote: > > On Sun, Jul 22, 2012 at 09:04:33PM -0700, Hugh Dickins wrote: > > > On Fri, 20 Jul 2012, Mel Gorman wrote: > > > > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote: > > > > I like

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-24 Thread Michal Hocko
On Mon 23-07-12 18:08:05, Hugh Dickins wrote: > On Mon, 23 Jul 2012, Mel Gorman wrote: > > On Sun, Jul 22, 2012 at 09:04:33PM -0700, Hugh Dickins wrote: > > > On Fri, 20 Jul 2012, Mel Gorman wrote: > > > > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote: > > > > I like it in that it's

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-23 Thread Hugh Dickins
On Mon, 23 Jul 2012, Mel Gorman wrote: > On Sun, Jul 22, 2012 at 09:04:33PM -0700, Hugh Dickins wrote: > > On Fri, 20 Jul 2012, Mel Gorman wrote: > > > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote: > > I like it in that it's simple and I can confirm it works for the test case > of

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-23 Thread Mel Gorman
On Sun, Jul 22, 2012 at 09:04:33PM -0700, Hugh Dickins wrote: > On Fri, 20 Jul 2012, Mel Gorman wrote: > > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote: > > > And here is my attempt for the fix (Hugh mentioned something similar > > > earlier but he suggested using special flags in p

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-22 Thread Hugh Dickins
On Fri, 20 Jul 2012, Mel Gorman wrote: > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote: > > And here is my attempt for the fix (Hugh mentioned something similar > > earlier but he suggested using special flags in ptes or VMAs). I still > > owe doc. update and it hasn't been tested wi

Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-20 Thread Mel Gorman
On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote: > And here is my attempt for the fix (Hugh mentioned something similar > earlier but he suggested using special flags in ptes or VMAs). I still > owe doc. update and it hasn't been tested with too many configs and I > could have missed some d

[PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-20 Thread Michal Hocko
And here is my attempt for the fix (Hugh mentioned something similar earlier but he suggested using special flags in ptes or VMAs). I still owe doc. update and it hasn't been tested with too many configs and I could have missed some definition updates. I also think that changelog could be much better, I