Re: [PATCH] x86: fix ref-counting bug in change_page_attr()

2007-12-17 Thread Ingo Molnar

* Jan Beulich <[EMAIL PROTECTED]> wrote:

> > hm, does this patch fix any real bug seen live? (if yes then do you 
> > have links to it, etc?) It would be a v2.6.24 fix in theory but it 
> > looks too dangerous for that. So i've queued it up for v2.6.25, for 
> > the time being.
> 
> I had run into the issue with experimental code of mine quite a while 
> back (i.e. I don't even recall what exactly I was doing back then). 
> After that I just continued to keep the fix (which I had submitted 
> before, but which collided with something in Andi's tree back then 
> iirc).

i'd be inclined to have it in the for-2.6.25 queue then - even plain 
bugfixes in the pageattr code have broken boxes in the past, so this 
isn't something we want to touch so late in the cycle. (We can still 
reprioritize it if a serious regression pops up that this addresses.)

Ingo


Re: [PATCH] x86: fix ref-counting bug in change_page_attr()

2007-12-17 Thread Jan Beulich
>>> Ingo Molnar <[EMAIL PROTECTED]> 17.12.07 14:28 >>>
>
>* Jan Beulich <[EMAIL PROTECTED]> wrote:
>
>> The reference counting broke in two cases: when change_page_attr() was 
>> called with the default attributes that pages in the direct mapping have 
>> while the page's attributes were already set to that default, and when a 
>> page's attributes were changed from one non-default value to another. 
>> Either case led to premature restoration of a large page or to a missed 
>> opportunity to restore one.
>> 
>> At the same time, make __PHYSICAL_MASK_SHIFT on 64-bits the value it 
>> architecturally ought to have.
>
>hm, does this patch fix any real bug seen live? (if yes then do you have 
>links to it, etc?) It would be a v2.6.24 fix in theory but it looks too 
>dangerous for that. So i've queued it up for v2.6.25, for the time 
>being.

I had run into the issue with experimental code of mine quite a while back
(i.e. I don't even recall what exactly I was doing back then). After that I
just continued to keep the fix (which I had submitted before, but which
collided with something in Andi's tree back then iirc).

Jan



Re: [PATCH] x86: fix ref-counting bug in change_page_attr()

2007-12-17 Thread Ingo Molnar

* Jan Beulich <[EMAIL PROTECTED]> wrote:

> The reference counting broke in two cases: when change_page_attr() was 
> called with the default attributes that pages in the direct mapping have 
> while the page's attributes were already set to that default, and when a 
> page's attributes were changed from one non-default value to another. 
> Either case led to premature restoration of a large page or to a missed 
> opportunity to restore one.
> 
> At the same time, make __PHYSICAL_MASK_SHIFT on 64-bits the value it 
> architecturally ought to have.

hm, does this patch fix any real bug seen live? (if yes then do you have 
links to it, etc?) It would be a v2.6.24 fix in theory but it looks too 
dangerous for that. So i've queued it up for v2.6.25, for the time 
being.

Ingo


Re: [PATCH] x86: fix ref-counting bug in change_page_attr()

2007-12-14 Thread Jan Beulich
>>> Jeremy Fitzhardinge <[EMAIL PROTECTED]> 14.12.07 08:12 >>>
>Jan Beulich wrote:
>> The reference counting broke in two cases: when change_page_attr() was
>> called with the default attributes that pages in the direct mapping have
>> while the page's attributes were already set to that default, and when a
>> page's attributes were changed from one non-default value to another.
>> Either case led to premature restoration of a large page or to a missed
>> opportunity to restore one.
>>
>> At the same time, make __PHYSICAL_MASK_SHIFT on 64-bits the value it
>> architecturally ought to have.
>>   
>
>Could you put this in a separate patch?  I have a bunch of page*.h and
>pgtable*.h refactoring patches which will conflict with this.

It doesn't seem logical to do so: the patch needs to introduce the definitions
for 32-bit (in order to define pte_pgprot()), and not doing the adjustment
for 64-bit here means (a) becoming inconsistent and (b) the pte_pgprot()
there would be incorrect. So such a split-out patch would need to be a
prerequisite to the one here, which wouldn't help avoid the collisions with
your unification patches.
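
For reference, the dependency works like this: the pgprot part of a PTE
is everything outside the physical-frame-number field, and the width of
that field follows from __PHYSICAL_MASK_SHIFT, which on x86-64 is
architecturally 52 bits. A sketch with simplified types (not the
kernel's actual macros):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT		12
#define PHYSICAL_MASK_SHIFT	52	/* x86-64 architectural limit */
#define PHYSICAL_MASK		((1ULL << PHYSICAL_MASK_SHIFT) - 1)
#define PTE_PFN_MASK		(PHYSICAL_MASK & ~((1ULL << PAGE_SHIFT) - 1))

static uint64_t pte_pgprot(uint64_t pte)
{
	/* keep the low flag bits and the bits above the PFN (e.g. NX) */
	return pte & ~PTE_PFN_MASK;
}

int main(void)
{
	uint64_t pte = (0x123456789ULL << PAGE_SHIFT) | 0x63;	/* pfn|flags */

	printf("prot bits: %#llx\n", (unsigned long long)pte_pgprot(pte));
	return 0;
}

With a too-small shift the high PFN bits would wrongly be reported as
protection bits, which is point (b) above.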

Jan



Re: [PATCH] x86: fix ref-counting bug in change_page_attr()

2007-12-13 Thread Jeremy Fitzhardinge
Jan Beulich wrote:
> The reference counting broke in two cases: when change_page_attr() was
> called with the default attributes that pages in the direct mapping have
> while the page's attributes were already set to that default, and when a
> page's attributes were changed from one non-default value to another.
> Either case led to premature restoration of a large page or to a missed
> opportunity to restore one.
>
> At the same time, make __PHYSICAL_MASK_SHIFT on 64-bits the value it
> architecturally ought to have.
>   

Could you put this in a separate patch?  I have a bunch of page*.h and
pgtable*.h refactoring patches which will conflict with this.

J

> Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
> Cc: Andi Kleen <[EMAIL PROTECTED]>
>
>  arch/x86/mm/ioremap_64.c |4 +-
>  arch/x86/mm/pageattr_32.c|   84 
> +--
>  arch/x86/mm/pageattr_64.c|   57 +++--
>  include/asm-x86/page_32.h|   10 +
>  include/asm-x86/page_64.h|2 -
>  include/asm-x86/pgtable_32.h |3 +
>  include/asm-x86/pgtable_64.h |4 +-
>  7 files changed, 114 insertions(+), 50 deletions(-)
>
> --- linux-2.6.24-rc5/arch/x86/mm/ioremap_64.c 2007-12-12 11:28:18.0 
> +0100
> +++ 2.6.24-rc5-x86-change_page_attr/arch/x86/mm/ioremap_64.c  2007-12-04 
> 16:01:11.0 +0100
> @@ -48,7 +48,7 @@ ioremap_change_attr(unsigned long phys_a
>* Must use a address here and not struct page because the phys 
> addr
>* can be a in hole between nodes and not have an memmap entry.
>*/
> - err = 
> change_page_attr_addr(vaddr,npages,__pgprot(__PAGE_KERNEL|flags));
> + err = 
> change_page_attr_addr(vaddr,npages,MAKE_GLOBAL(__PAGE_KERNEL|flags));
>   if (!err)
>   global_flush_tlb();
>   }
> @@ -199,7 +199,7 @@ void iounmap(volatile void __iomem *addr
>  
>   /* Reset the direct mapping. Can block */
>   if (p->flags >> 20)
> - ioremap_change_attr(p->phys_addr, p->size, 0);
> + ioremap_change_attr(p->phys_addr, get_vm_area_size(p), 0);
>  
>   /* Finally remove it */
>   o = remove_vm_area((void *)addr);
> --- linux-2.6.24-rc5/arch/x86/mm/pageattr_32.c2007-12-12 
> 11:28:18.0 +0100
> +++ 2.6.24-rc5-x86-change_page_attr/arch/x86/mm/pageattr_32.c 2007-12-04 
> 16:01:11.0 +0100
> @@ -116,24 +116,22 @@ static void set_pmd_pte(pte_t *kpte, uns
>   spin_unlock_irqrestore(&pgd_lock, flags);
>  }
>  
> +static pgprot_t _ref_prot[KERNEL_PGD_PTRS * PTRS_PER_PMD];
> +#define ref_prot(addr) _ref_prot[__pa(addr) >> PMD_SHIFT]
> +
>  /* 
>   * No more special protections in this 2/4MB area - revert to a
>   * large page again. 
>   */
>  static inline void revert_page(struct page *kpte_page, unsigned long address)
>  {
> - pgprot_t ref_prot;
>   pte_t *linear;
>  
> - ref_prot =
> - ((address & LARGE_PAGE_MASK) < (unsigned long)&_etext)
> - ? PAGE_KERNEL_LARGE_EXEC : PAGE_KERNEL_LARGE;
> -
>   linear = (pte_t *)
>   pmd_offset(pud_offset(pgd_offset_k(address), address), address);
>   set_pmd_pte(linear,  address,
> - pfn_pte((__pa(address) & LARGE_PAGE_MASK) >> PAGE_SHIFT,
> - ref_prot));
> + pte_mkhuge(pfn_pte((__pa(address) & LARGE_PAGE_MASK) >> 
> PAGE_SHIFT,
> +ref_prot(address))));
>  }
>  
>  static inline void save_page(struct page *kpte_page)
> @@ -142,12 +140,22 @@ static inline void save_page(struct page
>   list_add(&kpte_page->lru, &df_list);
>  }
>  
> +static inline int pgprot_match(pgprot_t prot1, pgprot_t prot2)
> +{
> + return !((pgprot_val(prot1) ^ pgprot_val(prot2))
> +#ifdef CONFIG_X86_PAE
> +  & __supported_pte_mask
> +#endif
> +  & ~(_PAGE_ACCESSED|_PAGE_DIRTY));
> +}
> +
>  static int
>  __change_page_attr(struct page *page, pgprot_t prot)
>  { 
>   pte_t *kpte; 
>   unsigned long address;
>   struct page *kpte_page;
> + pgprot_t old_prot, ref_prot;
>  
>   BUG_ON(PageHighMem(page));
>   address = (unsigned long)page_address(page);
> @@ -159,29 +167,31 @@ __change_page_attr(struct page *page, pg
>   BUG_ON(PageLRU(kpte_page));
>   BUG_ON(PageCompound(kpte_page));
>  
> - if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL)) { 
> + old_prot = pte_pgprot(pte_clrhuge(*kpte));
> + ref_prot = ref_prot(address);
> + if (!pgprot_match(prot, ref_prot)) {
>   if (!pte_huge(*kpte)) {
>   set_pte_atomic(kpte, mk_pte(page, prot)); 
>   } else {
> - pgprot_t ref_prot;
> - struct page *split;
> -
> - ref_prot =
> - ((address & LARGE_PAGE_MASK) < (unsigned long)&_etext)
> - ? PAGE_KERNEL_EXEC : PAGE_KERNEL;
> - split = 

[PATCH] x86: fix ref-counting bug in change_page_attr()

2007-12-13 Thread Jan Beulich
The reference counting broke in two cases: when change_page_attr() was
called with the default attributes that pages in the direct mapping have
while the page's attributes were already set to that default, and when a
page's attributes were changed from one non-default value to another.
Either case led to premature restoration of a large page or to a missed
opportunity to restore one.

At the same time, make __PHYSICAL_MASK_SHIFT on 64-bits the value it
architecturally ought to have.
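
To make the two failure modes concrete, here is a minimal user-space
sketch of the intended accounting (illustrative only: struct region,
change_page_attr_model and the sizes are made-up names, not the
kernel's). The count of non-default pages must change only on
transitions between matching and not matching the region's reference
protection, which is exactly what comparing the requested protection
against PAGE_KERNEL alone gets wrong:

#include <assert.h>
#include <stdio.h>

#define SMALL_PAGES 1024	/* 4k pages per 4MB large page, illustrative */

struct region {
	unsigned prot[SMALL_PAGES];	/* current per-page protection */
	unsigned ref_prot;		/* the region's default protection */
	int nondefault;			/* pages differing from ref_prot */
};

static void change_page_attr_model(struct region *r, int page, unsigned prot)
{
	int was = r->prot[page] != r->ref_prot;	/* differed before */
	int is  = prot != r->ref_prot;		/* differs afterwards */

	r->nondefault += is - was;	/* only transitions change the count */
	r->prot[page] = prot;

	if (r->nondefault == 0)
		printf("revert region to a single large page\n");
}

int main(void)
{
	struct region r = { .ref_prot = 1 };
	int i;

	for (i = 0; i < SMALL_PAGES; i++)
		r.prot[i] = r.ref_prot;

	change_page_attr_model(&r, 0, 2);	/* default -> non-default: +1 */
	change_page_attr_model(&r, 0, 3);	/* non-default -> non-default: 0 */
	change_page_attr_model(&r, 0, 1);	/* back to default: -1, revert */
	assert(r.nondefault == 0);
	return 0;
}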

Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>

 arch/x86/mm/ioremap_64.c |4 +-
 arch/x86/mm/pageattr_32.c|   84 +--
 arch/x86/mm/pageattr_64.c|   57 +++--
 include/asm-x86/page_32.h|   10 +
 include/asm-x86/page_64.h|2 -
 include/asm-x86/pgtable_32.h |3 +
 include/asm-x86/pgtable_64.h |4 +-
 7 files changed, 114 insertions(+), 50 deletions(-)

--- linux-2.6.24-rc5/arch/x86/mm/ioremap_64.c   2007-12-12 11:28:18.0 
+0100
+++ 2.6.24-rc5-x86-change_page_attr/arch/x86/mm/ioremap_64.c2007-12-04 
16:01:11.0 +0100
@@ -48,7 +48,7 @@ ioremap_change_attr(unsigned long phys_a
 * Must use a address here and not struct page because the phys 
addr
 * can be a in hole between nodes and not have an memmap entry.
 */
-   err = 
change_page_attr_addr(vaddr,npages,__pgprot(__PAGE_KERNEL|flags));
+   err = 
change_page_attr_addr(vaddr,npages,MAKE_GLOBAL(__PAGE_KERNEL|flags));
if (!err)
global_flush_tlb();
}
@@ -199,7 +199,7 @@ void iounmap(volatile void __iomem *addr
 
/* Reset the direct mapping. Can block */
if (p->flags >> 20)
-   ioremap_change_attr(p->phys_addr, p->size, 0);
+   ioremap_change_attr(p->phys_addr, get_vm_area_size(p), 0);
 
/* Finally remove it */
o = remove_vm_area((void *)addr);
--- linux-2.6.24-rc5/arch/x86/mm/pageattr_32.c  2007-12-12 11:28:18.0 
+0100
+++ 2.6.24-rc5-x86-change_page_attr/arch/x86/mm/pageattr_32.c   2007-12-04 
16:01:11.0 +0100
@@ -116,24 +116,22 @@ static void set_pmd_pte(pte_t *kpte, uns
	spin_unlock_irqrestore(&pgd_lock, flags);
 }
 
+static pgprot_t _ref_prot[KERNEL_PGD_PTRS * PTRS_PER_PMD];
+#define ref_prot(addr) _ref_prot[__pa(addr) >> PMD_SHIFT]
+
 /* 
  * No more special protections in this 2/4MB area - revert to a
  * large page again. 
  */
 static inline void revert_page(struct page *kpte_page, unsigned long address)
 {
-   pgprot_t ref_prot;
pte_t *linear;
 
-   ref_prot =
-   ((address & LARGE_PAGE_MASK) < (unsigned long)&_etext)
-   ? PAGE_KERNEL_LARGE_EXEC : PAGE_KERNEL_LARGE;
-
linear = (pte_t *)
pmd_offset(pud_offset(pgd_offset_k(address), address), address);
set_pmd_pte(linear,  address,
-   pfn_pte((__pa(address) & LARGE_PAGE_MASK) >> PAGE_SHIFT,
-   ref_prot));
+   pte_mkhuge(pfn_pte((__pa(address) & LARGE_PAGE_MASK) >> 
PAGE_SHIFT,
+		   ref_prot(address))));
 }
 
 static inline void save_page(struct page *kpte_page)
@@ -142,12 +140,22 @@ static inline void save_page(struct page
	list_add(&kpte_page->lru, &df_list);
 }
 
+static inline int pgprot_match(pgprot_t prot1, pgprot_t prot2)
+{
+   return !((pgprot_val(prot1) ^ pgprot_val(prot2))
+#ifdef CONFIG_X86_PAE
+& __supported_pte_mask
+#endif
+& ~(_PAGE_ACCESSED|_PAGE_DIRTY));
+}
+
 static int
 __change_page_attr(struct page *page, pgprot_t prot)
 { 
pte_t *kpte; 
unsigned long address;
struct page *kpte_page;
+   pgprot_t old_prot, ref_prot;
 
BUG_ON(PageHighMem(page));
address = (unsigned long)page_address(page);
@@ -159,29 +167,31 @@ __change_page_attr(struct page *page, pg
BUG_ON(PageLRU(kpte_page));
BUG_ON(PageCompound(kpte_page));
 
-   if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL)) { 
+   old_prot = pte_pgprot(pte_clrhuge(*kpte));
+   ref_prot = ref_prot(address);
+   if (!pgprot_match(prot, ref_prot)) {
if (!pte_huge(*kpte)) {
set_pte_atomic(kpte, mk_pte(page, prot)); 
} else {
-   pgprot_t ref_prot;
-   struct page *split;
-
-   ref_prot =
-   ((address & LARGE_PAGE_MASK) < (unsigned long)&_etext)
-   ? PAGE_KERNEL_EXEC : PAGE_KERNEL;
-   split = split_large_page(address, prot, ref_prot);
-   if (!split)
+   BUG_ON(!pgprot_match(old_prot, ref_prot));
+   kpte_page = split_large_page(address, prot, ref_prot);
+   if (!kpte_page)
return -ENOMEM;
-

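The pgprot_match() helper introduced above treats two protections as
equal when they differ only in the hardware-managed accessed/dirty
bits (and, under PAE, only considers bits within __supported_pte_mask).
A stand-alone sketch of the same comparison, with the pgprot type
simplified to a plain integer and the PAE mask omitted:

#include <stdio.h>

#define _PAGE_RW	0x002UL
#define _PAGE_ACCESSED	0x020UL
#define _PAGE_DIRTY	0x040UL

static int pgprot_match(unsigned long prot1, unsigned long prot2)
{
	/* XOR exposes the differing bits; ignore accessed/dirty */
	return !((prot1 ^ prot2) & ~(_PAGE_ACCESSED | _PAGE_DIRTY));
}

int main(void)
{
	unsigned long prot = 0x163;	/* e.g. present|rw|accessed|dirty|global */

	printf("%d\n", pgprot_match(prot, prot | _PAGE_DIRTY));	/* 1 */
	printf("%d\n", pgprot_match(prot, prot ^ _PAGE_RW));	/* 0 */
	return 0;
}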