Re: [PATCH] swapin flush cache bug

2001-06-27 Thread NIIBE Yutaka

Hello Stephen, 

Stephen C. Tweedie wrote:
 > First, don't we want to do a flush_page_to_ram() *before* starting the
 > swap IO?

Well, let me explain the issue.  It is the thing we need to do
flushing *after* I/O.

--
Problem with virtually indexed physically tagged write-back cache.

(1) Page got swapped out

   Swap out
   [ Page ] > [ Disk ]

(2) Page got swapped in asynchronously, possibly by read-ahead

   Swap in
   [ Page ] < [ Disk ]
   K

   The I/O from disk goes through kernel virtual address K.
   We have cache entries indexed by K.

(3) Page fault occurs at user space U

   U > [ Page ] -> [ Disk ]
   K

   The control goes to do_swap_page, found the page at
   lookup_swap_cache.

   If K and U indexes differently, we have cache alias issues, we need
   to flush the entries indexed by K and let them go to memory.  Or else, 
   user space will see bogus data in Page.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread Stephen C. Tweedie

Hi,

On Thu, Jun 28, 2001 at 09:07:52AM +0900, NIIBE Yutaka wrote:
> Marcelo Tosatti wrote:
>  > I think Stephen C. Tweedie has some considerations about the cache
>  > flushing calls on do_swap_page().
> 
> Yup.  IIRC, he said that flushing cache at do_swap_page() (which I've
> tried at first) is not good, because it's the hot path and it causes
> another performance problem in apache or sendmail, where many
> processes share the pages in swap cache.
> 
> Then, the fix I sent yesterday is second try.  The flush is moved to
> I/O routine, instead of do_swap_page().

I really want somebody who has worked on weird caching architectures
to look at it too, but I don't see that the new code works well.
First, don't we want to do a flush_page_to_ram() *before* starting the
swap IO?  There's no point dma'ing the swap page to ram if old, dirty
cache data is then going to be written back on top of that.

Secondly, the flushing of icache/dcache only needs to be done by the
time we come to use the page, so can be performed any time, before or
after the IO.  We're not going to be accessing the page during IO so
if we flush first we won't risk having stale cache data by the time
the IO completes.

So, why not just do the flushing before the IO?  Adding an async trap
to flush the page after swap IO seems the wrong approach: this should
be possible to fix synchronously.

Cheers,
 Stephen
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread Marcelo Tosatti



On Thu, 28 Jun 2001, NIIBE Yutaka wrote:

> Marcelo Tosatti wrote:
>  > I think Stephen C. Tweedie has some considerations about the cache
>  > flushing calls on do_swap_page().
> 
> Yup.  IIRC, he said that flushing cache at do_swap_page() (which I've
> tried at first) is not good, because it's the hot path and it causes
> another performance problem in apache or sendmail, where many
> processes share the pages in swap cache.

I guess he has some other comments about it... 

Again, Stephen ? 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread NIIBE Yutaka

Marcelo Tosatti wrote:
 > I think Stephen C. Tweedie has some considerations about the cache
 > flushing calls on do_swap_page().

Yup.  IIRC, he said that flushing cache at do_swap_page() (which I've
tried at first) is not good, because it's the hot path and it causes
another performance problem in apache or sendmail, where many
processes share the pages in swap cache.

Then, the fix I sent yesterday is second try.  The flush is moved to
I/O routine, instead of do_swap_page().

Actually, I really wonder why current code causes no problem in other
architectures.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread Marcelo Tosatti



Hi, 

I think Stephen C. Tweedie has some considerations about the cache
flushing calls on do_swap_page().

Stephen? 


On Wed, 27 Jun 2001, NIIBE Yutaka wrote:

> Hello Marcelo, 
> 
> This is follow-up to the mail in February.  You may perhaps forget the
> context, it's the bug of MM about cache flushing for swapped-in-pages.
> I see this bug on SuperH port (SH-4).
> 
> I think that we have this issue on the machine whose flush_dcache_page()
> is defined.  In current code, the cache aren't flushed for 
> the asynchronously-swapped-in pages which is cached in swap cache.
> This is the problem.
> 
> Marcelo Tosatti writes:
>  > Yet another thing (1) on end_buffer_io_async() to handle a case which is
>  > only true for a specific user of it. Since the other special case handling
>  > is for swap IO too, I think a separate IO end operation for swap would be
>  > interesting.
>  > 
>  > (1) The current one is SetPageDecrAfter handling.
> 
> How about this?  I've updated MM bugzilla already.
> 
> 2001-06-26  NIIBE Yutaka  <[EMAIL PROTECTED]>
> 
>   * include/linux/mm.h (PG_flush_after, PageFlushAfter,
>   SetPageFlushAfter, PageTestandClearFlushAfter): New bit.
>   * mm/page_io.c (rw_swap_page_base): Set flush-after bit.
>   * fs/buffer.c (end_buffer_io_async): Implement flush-ing
>   with PG_flush_after.
> 
>   * mm/memory.c (do_swap_page): Remove flush-ing the page.
> 
> diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/fs/buffer.c 
>kernel/fs/buffer.c
> --- v2.4.6-pre5/fs/buffer.c   Mon Jun 25 18:48:07 2001
> +++ kernel/fs/buffer.cTue Jun 26 15:11:17 2001
> @@ -831,6 +831,9 @@ static void end_buffer_io_async(struct b
>   if (PageTestandClearDecrAfter(page))
>   atomic_dec(_async_pages);
>  
> + if (PageTestandClearFlushAfter(page))
> + flush_dcache_page(page);
> +
>   UnlockPage(page);
>  
>   return;
> diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/include/linux/mm.h 
>kernel/include/linux/mm.h
> --- v2.4.6-pre5/include/linux/mm.hMon Jun 25 18:48:09 2001
> +++ kernel/include/linux/mm.h Tue Jun 26 14:58:56 2001
> @@ -282,6 +282,7 @@ typedef struct page {
>  #define PG_inactive_clean11
>  #define PG_highmem   12
>  #define PG_checked   13  /* kill me in 2.5.. */
> +#define PG_flush_after   14
>   /* bits 21-29 unused */
>  #define PG_arch_130
>  #define PG_reserved  31
> @@ -364,6 +365,10 @@ static inline void set_page_dirty(struct
>  
>  #define SetPageReserved(page)set_bit(PG_reserved, &(page)->flags)
>  #define ClearPageReserved(page)  clear_bit(PG_reserved, &(page)->flags)
> +
> +#define PageFlushAfter(page) test_bit(PG_flush_after, &(page)->flags)
> +#define SetPageFlushAfter(page)  set_bit(PG_flush_after, &(page)->flags)
> +#define PageTestandClearFlushAfter(page) test_and_clear_bit(PG_flush_after, 
>&(page)->flags)
>  
>  /*
>   * Error return values for the *_nopage functions
> diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/memory.c 
>kernel/mm/memory.c
> --- v2.4.6-pre5/mm/memory.c   Mon Jun 25 18:48:10 2001
> +++ kernel/mm/memory.cTue Jun 26 14:48:15 2001
> @@ -1109,8 +1109,6 @@ static int do_swap_page(struct mm_struct
>   return -1;
>   }
>   wait_on_page(page);
> - flush_page_to_ram(page);
> - flush_icache_page(vma, page);
>   }
>  
>   /*
> diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/page_io.c 
>kernel/mm/page_io.c
> --- v2.4.6-pre5/mm/page_io.c  Mon Apr 30 16:15:32 2001
> +++ kernel/mm/page_io.c   Tue Jun 26 15:01:00 2001
> @@ -50,6 +50,7 @@ static int rw_swap_page_base(int rw, swp
>  
>   if (rw == READ) {
>   ClearPageUptodate(page);
> + SetPageFlushAfter(page);
>   kstat.pswpin++;
>   } else
>   kstat.pswpout++;
> -- 
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread Marcelo Tosatti



Hi, 

I think Stephen C. Tweedie has some considerations about the cache
flushing calls on do_swap_page().

Stephen? 


On Wed, 27 Jun 2001, NIIBE Yutaka wrote:

 Hello Marcelo, 
 
 This is follow-up to the mail in February.  You may perhaps forget the
 context, it's the bug of MM about cache flushing for swapped-in-pages.
 I see this bug on SuperH port (SH-4).
 
 I think that we have this issue on the machine whose flush_dcache_page()
 is defined.  In current code, the cache aren't flushed for 
 the asynchronously-swapped-in pages which is cached in swap cache.
 This is the problem.
 
 Marcelo Tosatti writes:
   Yet another thing (1) on end_buffer_io_async() to handle a case which is
   only true for a specific user of it. Since the other special case handling
   is for swap IO too, I think a separate IO end operation for swap would be
   interesting.
   
   (1) The current one is SetPageDecrAfter handling.
 
 How about this?  I've updated MM bugzilla already.
 
 2001-06-26  NIIBE Yutaka  [EMAIL PROTECTED]
 
   * include/linux/mm.h (PG_flush_after, PageFlushAfter,
   SetPageFlushAfter, PageTestandClearFlushAfter): New bit.
   * mm/page_io.c (rw_swap_page_base): Set flush-after bit.
   * fs/buffer.c (end_buffer_io_async): Implement flush-ing
   with PG_flush_after.
 
   * mm/memory.c (do_swap_page): Remove flush-ing the page.
 
 diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/fs/buffer.c 
kernel/fs/buffer.c
 --- v2.4.6-pre5/fs/buffer.c   Mon Jun 25 18:48:07 2001
 +++ kernel/fs/buffer.cTue Jun 26 15:11:17 2001
 @@ -831,6 +831,9 @@ static void end_buffer_io_async(struct b
   if (PageTestandClearDecrAfter(page))
   atomic_dec(nr_async_pages);
  
 + if (PageTestandClearFlushAfter(page))
 + flush_dcache_page(page);
 +
   UnlockPage(page);
  
   return;
 diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/include/linux/mm.h 
kernel/include/linux/mm.h
 --- v2.4.6-pre5/include/linux/mm.hMon Jun 25 18:48:09 2001
 +++ kernel/include/linux/mm.h Tue Jun 26 14:58:56 2001
 @@ -282,6 +282,7 @@ typedef struct page {
  #define PG_inactive_clean11
  #define PG_highmem   12
  #define PG_checked   13  /* kill me in 2.5.early. */
 +#define PG_flush_after   14
   /* bits 21-29 unused */
  #define PG_arch_130
  #define PG_reserved  31
 @@ -364,6 +365,10 @@ static inline void set_page_dirty(struct
  
  #define SetPageReserved(page)set_bit(PG_reserved, (page)-flags)
  #define ClearPageReserved(page)  clear_bit(PG_reserved, (page)-flags)
 +
 +#define PageFlushAfter(page) test_bit(PG_flush_after, (page)-flags)
 +#define SetPageFlushAfter(page)  set_bit(PG_flush_after, (page)-flags)
 +#define PageTestandClearFlushAfter(page) test_and_clear_bit(PG_flush_after, 
(page)-flags)
  
  /*
   * Error return values for the *_nopage functions
 diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/memory.c 
kernel/mm/memory.c
 --- v2.4.6-pre5/mm/memory.c   Mon Jun 25 18:48:10 2001
 +++ kernel/mm/memory.cTue Jun 26 14:48:15 2001
 @@ -1109,8 +1109,6 @@ static int do_swap_page(struct mm_struct
   return -1;
   }
   wait_on_page(page);
 - flush_page_to_ram(page);
 - flush_icache_page(vma, page);
   }
  
   /*
 diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/page_io.c 
kernel/mm/page_io.c
 --- v2.4.6-pre5/mm/page_io.c  Mon Apr 30 16:15:32 2001
 +++ kernel/mm/page_io.c   Tue Jun 26 15:01:00 2001
 @@ -50,6 +50,7 @@ static int rw_swap_page_base(int rw, swp
  
   if (rw == READ) {
   ClearPageUptodate(page);
 + SetPageFlushAfter(page);
   kstat.pswpin++;
   } else
   kstat.pswpout++;
 -- 
 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread NIIBE Yutaka

Marcelo Tosatti wrote:
  I think Stephen C. Tweedie has some considerations about the cache
  flushing calls on do_swap_page().

Yup.  IIRC, he said that flushing cache at do_swap_page() (which I've
tried at first) is not good, because it's the hot path and it causes
another performance problem in apache or sendmail, where many
processes share the pages in swap cache.

Then, the fix I sent yesterday is second try.  The flush is moved to
I/O routine, instead of do_swap_page().

Actually, I really wonder why current code causes no problem in other
architectures.
-- 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread Marcelo Tosatti



On Thu, 28 Jun 2001, NIIBE Yutaka wrote:

 Marcelo Tosatti wrote:
   I think Stephen C. Tweedie has some considerations about the cache
   flushing calls on do_swap_page().
 
 Yup.  IIRC, he said that flushing cache at do_swap_page() (which I've
 tried at first) is not good, because it's the hot path and it causes
 another performance problem in apache or sendmail, where many
 processes share the pages in swap cache.

I guess he has some other comments about it... 

Again, Stephen ? 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-27 Thread Stephen C. Tweedie

Hi,

On Thu, Jun 28, 2001 at 09:07:52AM +0900, NIIBE Yutaka wrote:
 Marcelo Tosatti wrote:
   I think Stephen C. Tweedie has some considerations about the cache
   flushing calls on do_swap_page().
 
 Yup.  IIRC, he said that flushing cache at do_swap_page() (which I've
 tried at first) is not good, because it's the hot path and it causes
 another performance problem in apache or sendmail, where many
 processes share the pages in swap cache.
 
 Then, the fix I sent yesterday is second try.  The flush is moved to
 I/O routine, instead of do_swap_page().

I really want somebody who has worked on weird caching architectures
to look at it too, but I don't see that the new code works well.
First, don't we want to do a flush_page_to_ram() *before* starting the
swap IO?  There's no point dma'ing the swap page to ram if old, dirty
cache data is then going to be written back on top of that.

Secondly, the flushing of icache/dcache only needs to be done by the
time we come to use the page, so can be performed any time, before or
after the IO.  We're not going to be accessing the page during IO so
if we flush first we won't risk having stale cache data by the time
the IO completes.

So, why not just do the flushing before the IO?  Adding an async trap
to flush the page after swap IO seems the wrong approach: this should
be possible to fix synchronously.

Cheers,
 Stephen
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-26 Thread NIIBE Yutaka

Hello Marcelo, 

This is follow-up to the mail in February.  You may perhaps forget the
context, it's the bug of MM about cache flushing for swapped-in-pages.
I see this bug on SuperH port (SH-4).

I think that we have this issue on the machine whose flush_dcache_page()
is defined.  In current code, the cache aren't flushed for 
the asynchronously-swapped-in pages which is cached in swap cache.
This is the problem.

Marcelo Tosatti writes:
 > Yet another thing (1) on end_buffer_io_async() to handle a case which is
 > only true for a specific user of it. Since the other special case handling
 > is for swap IO too, I think a separate IO end operation for swap would be
 > interesting.
 > 
 > (1) The current one is SetPageDecrAfter handling.

How about this?  I've updated MM bugzilla already.

2001-06-26  NIIBE Yutaka  <[EMAIL PROTECTED]>

* include/linux/mm.h (PG_flush_after, PageFlushAfter,
SetPageFlushAfter, PageTestandClearFlushAfter): New bit.
* mm/page_io.c (rw_swap_page_base): Set flush-after bit.
* fs/buffer.c (end_buffer_io_async): Implement flush-ing
with PG_flush_after.

* mm/memory.c (do_swap_page): Remove flush-ing the page.

diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/fs/buffer.c 
kernel/fs/buffer.c
--- v2.4.6-pre5/fs/buffer.c Mon Jun 25 18:48:07 2001
+++ kernel/fs/buffer.c  Tue Jun 26 15:11:17 2001
@@ -831,6 +831,9 @@ static void end_buffer_io_async(struct b
if (PageTestandClearDecrAfter(page))
atomic_dec(_async_pages);
 
+   if (PageTestandClearFlushAfter(page))
+   flush_dcache_page(page);
+
UnlockPage(page);
 
return;
diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/include/linux/mm.h 
kernel/include/linux/mm.h
--- v2.4.6-pre5/include/linux/mm.h  Mon Jun 25 18:48:09 2001
+++ kernel/include/linux/mm.h   Tue Jun 26 14:58:56 2001
@@ -282,6 +282,7 @@ typedef struct page {
 #define PG_inactive_clean  11
 #define PG_highmem 12
 #define PG_checked 13  /* kill me in 2.5.. */
+#define PG_flush_after 14
/* bits 21-29 unused */
 #define PG_arch_1  30
 #define PG_reserved31
@@ -364,6 +365,10 @@ static inline void set_page_dirty(struct
 
 #define SetPageReserved(page)  set_bit(PG_reserved, &(page)->flags)
 #define ClearPageReserved(page)clear_bit(PG_reserved, &(page)->flags)
+
+#define PageFlushAfter(page)   test_bit(PG_flush_after, &(page)->flags)
+#define SetPageFlushAfter(page)set_bit(PG_flush_after, &(page)->flags)
+#define PageTestandClearFlushAfter(page)   test_and_clear_bit(PG_flush_after, 
+&(page)->flags)
 
 /*
  * Error return values for the *_nopage functions
diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/memory.c 
kernel/mm/memory.c
--- v2.4.6-pre5/mm/memory.c Mon Jun 25 18:48:10 2001
+++ kernel/mm/memory.c  Tue Jun 26 14:48:15 2001
@@ -1109,8 +1109,6 @@ static int do_swap_page(struct mm_struct
return -1;
}
wait_on_page(page);
-   flush_page_to_ram(page);
-   flush_icache_page(vma, page);
}
 
/*
diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/page_io.c 
kernel/mm/page_io.c
--- v2.4.6-pre5/mm/page_io.cMon Apr 30 16:15:32 2001
+++ kernel/mm/page_io.c Tue Jun 26 15:01:00 2001
@@ -50,6 +50,7 @@ static int rw_swap_page_base(int rw, swp
 
if (rw == READ) {
ClearPageUptodate(page);
+   SetPageFlushAfter(page);
kstat.pswpin++;
} else
kstat.pswpout++;
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-06-26 Thread NIIBE Yutaka

Hello Marcelo, 

This is follow-up to the mail in February.  You may perhaps forget the
context, it's the bug of MM about cache flushing for swapped-in-pages.
I see this bug on SuperH port (SH-4).

I think that we have this issue on the machine whose flush_dcache_page()
is defined.  In current code, the cache aren't flushed for 
the asynchronously-swapped-in pages which is cached in swap cache.
This is the problem.

Marcelo Tosatti writes:
  Yet another thing (1) on end_buffer_io_async() to handle a case which is
  only true for a specific user of it. Since the other special case handling
  is for swap IO too, I think a separate IO end operation for swap would be
  interesting.
  
  (1) The current one is SetPageDecrAfter handling.

How about this?  I've updated MM bugzilla already.

2001-06-26  NIIBE Yutaka  [EMAIL PROTECTED]

* include/linux/mm.h (PG_flush_after, PageFlushAfter,
SetPageFlushAfter, PageTestandClearFlushAfter): New bit.
* mm/page_io.c (rw_swap_page_base): Set flush-after bit.
* fs/buffer.c (end_buffer_io_async): Implement flush-ing
with PG_flush_after.

* mm/memory.c (do_swap_page): Remove flush-ing the page.

diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/fs/buffer.c 
kernel/fs/buffer.c
--- v2.4.6-pre5/fs/buffer.c Mon Jun 25 18:48:07 2001
+++ kernel/fs/buffer.c  Tue Jun 26 15:11:17 2001
@@ -831,6 +831,9 @@ static void end_buffer_io_async(struct b
if (PageTestandClearDecrAfter(page))
atomic_dec(nr_async_pages);
 
+   if (PageTestandClearFlushAfter(page))
+   flush_dcache_page(page);
+
UnlockPage(page);
 
return;
diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/include/linux/mm.h 
kernel/include/linux/mm.h
--- v2.4.6-pre5/include/linux/mm.h  Mon Jun 25 18:48:09 2001
+++ kernel/include/linux/mm.h   Tue Jun 26 14:58:56 2001
@@ -282,6 +282,7 @@ typedef struct page {
 #define PG_inactive_clean  11
 #define PG_highmem 12
 #define PG_checked 13  /* kill me in 2.5.early. */
+#define PG_flush_after 14
/* bits 21-29 unused */
 #define PG_arch_1  30
 #define PG_reserved31
@@ -364,6 +365,10 @@ static inline void set_page_dirty(struct
 
 #define SetPageReserved(page)  set_bit(PG_reserved, (page)-flags)
 #define ClearPageReserved(page)clear_bit(PG_reserved, (page)-flags)
+
+#define PageFlushAfter(page)   test_bit(PG_flush_after, (page)-flags)
+#define SetPageFlushAfter(page)set_bit(PG_flush_after, (page)-flags)
+#define PageTestandClearFlushAfter(page)   test_and_clear_bit(PG_flush_after, 
+(page)-flags)
 
 /*
  * Error return values for the *_nopage functions
diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/memory.c 
kernel/mm/memory.c
--- v2.4.6-pre5/mm/memory.c Mon Jun 25 18:48:10 2001
+++ kernel/mm/memory.c  Tue Jun 26 14:48:15 2001
@@ -1109,8 +1109,6 @@ static int do_swap_page(struct mm_struct
return -1;
}
wait_on_page(page);
-   flush_page_to_ram(page);
-   flush_icache_page(vma, page);
}
 
/*
diff -ruNp --exclude=CVS --exclude=.cvsignore v2.4.6-pre5/mm/page_io.c 
kernel/mm/page_io.c
--- v2.4.6-pre5/mm/page_io.cMon Apr 30 16:15:32 2001
+++ kernel/mm/page_io.c Tue Jun 26 15:01:00 2001
@@ -50,6 +50,7 @@ static int rw_swap_page_base(int rw, swp
 
if (rw == READ) {
ClearPageUptodate(page);
+   SetPageFlushAfter(page);
kstat.pswpin++;
} else
kstat.pswpout++;
-- 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-14 Thread Marcelo Tosatti


On Wed, 14 Feb 2001, NIIBE Yutaka wrote:

> Alan Cox wrote:
>  > Ok we need to handle that case a bit more intelligently so those flushes dont
>  > get into other ports code paths. 
> 
> Possibly at fs/buffer.c:end_buffer_io_async?
> 
> We need to flush the cache when I/O was READ or READA.

Yet another thing (1) on end_buffer_io_async() to handle a case which is
only true for a specific user of it. Since the other special case handling
is for swap IO too, I think a separate IO end operation for swap would be
interesting.

(1) The current one is SetPageDecrAfter handling.

> Is there any way for end_buffer_io_async to distinguish which I/O (READ or WRITE)
> has been done?

Yes. If the buffer_head is uptodated (BH_Uptodate) then its a WRITE,
otherwise its a READ (this is only true before mark_buffer_uptodate() call
inside end_buffer_io_async(), of course). 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-14 Thread Marcelo Tosatti


On Wed, 14 Feb 2001, NIIBE Yutaka wrote:

 Alan Cox wrote:
   Ok we need to handle that case a bit more intelligently so those flushes dont
   get into other ports code paths. 
 
 Possibly at fs/buffer.c:end_buffer_io_async?
 
 We need to flush the cache when I/O was READ or READA.

Yet another thing (1) on end_buffer_io_async() to handle a case which is
only true for a specific user of it. Since the other special case handling
is for swap IO too, I think a separate IO end operation for swap would be
interesting.

(1) The current one is SetPageDecrAfter handling.

 Is there any way for end_buffer_io_async to distinguish which I/O (READ or WRITE)
 has been done?

Yes. If the buffer_head is uptodated (BH_Uptodate) then its a WRITE,
otherwise its a READ (this is only true before mark_buffer_uptodate() call
inside end_buffer_io_async(), of course). 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread NIIBE Yutaka

Alan Cox wrote:
 > Ok we need to handle that case a bit more intelligently so those flushes dont
 > get into other ports code paths. 

Possibly at fs/buffer.c:end_buffer_io_async?

We need to flush the cache when I/O was READ or READA.  Is there any
way for end_buffer_io_async to distinguish which I/O (READ or WRITE)
has been done?

--
Problem with write-back cache.

(1) Page got swapped out

   Swap out
   [ Disk ] < P [ Page ]

(2) Page got swapped in asynchronously, possibly by read-ahead

   Swap in
   [ Disk ] > P [ Page ]
  K

   The I/O from disk goes through kernel virtual address K.
   We have cache entries indexed by K.

(3) Page fault occurs at user space U

   [ Disk ]   P [ Page ] <- U
  K

   The control goes to do_swap_page, found the page at
   lookup_swap_cache.

   If K and U indexes differently, we have cache alias issues, 
   we need to flush the entries indexed by K.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread NIIBE Yutaka

Russell King wrote:
 > Unless someone else (Rik/DaveM) says otherwise, it is my understanding
 > that any IO for page P will only ever be a write to disk.  Therefore,
 > when you get a copy of the page from the swap cache, the physical memory
 > for that page is the same as it was when the process was using it last.
[...]
 > The data from memory will still be up to date though.  However, I agree
 > that you will end up with cache aliases.  I will also end up with cache
 > aliases.  The question now is, do these aliases really matter?
 > 
 > On my caches, the answer is no because they're not marked dirty, and
 > therefore will get dropped from the cache without writeback to memory.
 > 
 > If your cache doesn't write back clean cache data to memory, then you
 > should also behave well.

Yes, that's the difference.  It's write back cache, in my case.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread Alan Cox

> Suppose there's I/O to the physical page P asynchronously, and the
> page is placed in the swap cache.  It remains cache entry, say,
> indexed kernel virtual address K.  Then, process maps P at U.  U and K
> (may) indexes differently.  The process will get the data from memory
> (not the one in the cashe), if it's not flushed.

Ok we need to handle that case a bit more intelligently so those flushes dont
get into other ports code paths. 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread Russell King

NIIBE Yutaka writes:
> My case (SH-4) is: virtual address indexed, physical address tagged cache
> (which has alias issue).

vivt caches have the same alias issue.

> Suppose there's I/O to the physical page P asynchronously, and the
> page is placed in the swap cache.

Unless someone else (Rik/DaveM) says otherwise, it is my understanding
that any IO for page P will only ever be a write to disk.  Therefore,
when you get a copy of the page from the swap cache, the physical memory
for that page is the same as it was when the process was using it last.

> It remains cache entry, say, indexed kernel virtual address K.  Then,
> process maps P at U.  U and K (may) indexes differently.  The process
> will get the data from memory (not the one in the cashe), if it's not
> flushed.

The data from memory will still be up to date though.  However, I agree
that you will end up with cache aliases.  I will also end up with cache
aliases.  The question now is, do these aliases really matter?

On my caches, the answer is no because they're not marked dirty, and
therefore will get dropped from the cache without writeback to memory.

If your cache doesn't write back clean cache data to memory, then you
should also behave well.

However, that said, someone more experienced with the Linux MM should
comment.

--
Russell King ([EMAIL PROTECTED])The developer of ARM Linux
 http://www.arm.linux.org.uk/personal/aboutme.html

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread NIIBE Yutaka

Russell King wrote:
 > What was the problem?  The old code seems to behave well on a virtual
 > address indexed virtual address tagged cache.

My case (SH-4) is: virtual address indexed, physical address tagged cache
(which has alias issue).

Suppose there's I/O to the physical page P asynchronously, and the
page is placed in the swap cache.  It remains cache entry, say,
indexed kernel virtual address K.  Then, process maps P at U.  U and K
(may) indexes differently.  The process will get the data from memory
(not the one in the cashe), if it's not flushed.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread Russell King

Marcelo Tosatti writes:
> If lookup_swap_cache() finds a page in the swap cache, and that page was
> in memory because of the swapin readahead, the cache is not flushed.
> 
> Here is a patch to fix the problem by always flushing the cache including
> for pages in the swap cache:

> -
> -   flush_page_to_ram(page);
> -   flush_icache_page(vma, page);
> }
>  
> mm->rss++;
> +
> +   flush_page_to_ram(page);
> +   flush_icache_page(vma, page);

Surely if the page is in the swap cache, we don't need the
flush_page_to_ram() because the data is already written to the page.  Yes,
there may be some reminents of it in the cache due to it being written
to disk via PIO.

Thinking about it some more - we have a process.  It used to contain page
P at address V.  We unmapped the page (and did the right thing with the
caches).  Now, something wants to access address V, so we pull the page
from the swap cache, and place page P back at address V.  We therefore
shouldn't need any cache manipulation at this point.

What was the problem?  The old code seems to behave well on a virtual
address indexed virtual address tagged cache.

--
Russell King ([EMAIL PROTECTED])The developer of ARM Linux
 http://www.arm.linux.org.uk/personal/aboutme.html

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread Russell King

Marcelo Tosatti writes:
 If lookup_swap_cache() finds a page in the swap cache, and that page was
 in memory because of the swapin readahead, the cache is not flushed.
 
 Here is a patch to fix the problem by always flushing the cache including
 for pages in the swap cache:

 -
 -   flush_page_to_ram(page);
 -   flush_icache_page(vma, page);
 }
  
 mm-rss++;
 +
 +   flush_page_to_ram(page);
 +   flush_icache_page(vma, page);

Surely if the page is in the swap cache, we don't need the
flush_page_to_ram() because the data is already written to the page.  Yes,
there may be some reminents of it in the cache due to it being written
to disk via PIO.

Thinking about it some more - we have a process.  It used to contain page
P at address V.  We unmapped the page (and did the right thing with the
caches).  Now, something wants to access address V, so we pull the page
from the swap cache, and place page P back at address V.  We therefore
shouldn't need any cache manipulation at this point.

What was the problem?  The old code seems to behave well on a virtual
address indexed virtual address tagged cache.

--
Russell King ([EMAIL PROTECTED])The developer of ARM Linux
 http://www.arm.linux.org.uk/personal/aboutme.html

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread NIIBE Yutaka

Russell King wrote:
  What was the problem?  The old code seems to behave well on a virtual
  address indexed virtual address tagged cache.

My case (SH-4) is: virtual address indexed, physical address tagged cache
(which has alias issue).

Suppose there's I/O to the physical page P asynchronously, and the
page is placed in the swap cache.  It remains cache entry, say,
indexed kernel virtual address K.  Then, process maps P at U.  U and K
(may) indexes differently.  The process will get the data from memory
(not the one in the cashe), if it's not flushed.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread Russell King

NIIBE Yutaka writes:
 My case (SH-4) is: virtual address indexed, physical address tagged cache
 (which has alias issue).

vivt caches have the same alias issue.

 Suppose there's I/O to the physical page P asynchronously, and the
 page is placed in the swap cache.

Unless someone else (Rik/DaveM) says otherwise, it is my understanding
that any IO for page P will only ever be a write to disk.  Therefore,
when you get a copy of the page from the swap cache, the physical memory
for that page is the same as it was when the process was using it last.

 It remains cache entry, say, indexed kernel virtual address K.  Then,
 process maps P at U.  U and K (may) indexes differently.  The process
 will get the data from memory (not the one in the cashe), if it's not
 flushed.

The data from memory will still be up to date though.  However, I agree
that you will end up with cache aliases.  I will also end up with cache
aliases.  The question now is, do these aliases really matter?

On my caches, the answer is no because they're not marked dirty, and
therefore will get dropped from the cache without writeback to memory.

If your cache doesn't write back clean cache data to memory, then you
should also behave well.

However, that said, someone more experienced with the Linux MM should
comment.

--
Russell King ([EMAIL PROTECTED])The developer of ARM Linux
 http://www.arm.linux.org.uk/personal/aboutme.html

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread Alan Cox

 Suppose there's I/O to the physical page P asynchronously, and the
 page is placed in the swap cache.  It remains cache entry, say,
 indexed kernel virtual address K.  Then, process maps P at U.  U and K
 (may) indexes differently.  The process will get the data from memory
 (not the one in the cashe), if it's not flushed.

Ok we need to handle that case a bit more intelligently so those flushes dont
get into other ports code paths. 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread NIIBE Yutaka

Russell King wrote:
  Unless someone else (Rik/DaveM) says otherwise, it is my understanding
  that any IO for page P will only ever be a write to disk.  Therefore,
  when you get a copy of the page from the swap cache, the physical memory
  for that page is the same as it was when the process was using it last.
[...]
  The data from memory will still be up to date though.  However, I agree
  that you will end up with cache aliases.  I will also end up with cache
  aliases.  The question now is, do these aliases really matter?
  
  On my caches, the answer is no because they're not marked dirty, and
  therefore will get dropped from the cache without writeback to memory.
  
  If your cache doesn't write back clean cache data to memory, then you
  should also behave well.

Yes, that's the difference.  It's write back cache, in my case.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] swapin flush cache bug

2001-02-13 Thread NIIBE Yutaka

Alan Cox wrote:
  Ok we need to handle that case a bit more intelligently so those flushes dont
  get into other ports code paths. 

Possibly at fs/buffer.c:end_buffer_io_async?

We need to flush the cache when I/O was READ or READA.  Is there any
way for end_buffer_io_async to distinguish which I/O (READ or WRITE)
has been done?

--
Problem with write-back cache.

(1) Page got swapped out

   Swap out
   [ Disk ]  P [ Page ]

(2) Page got swapped in asynchronously, possibly by read-ahead

   Swap in
   [ Disk ]  P [ Page ]
  K

   The I/O from disk goes through kernel virtual address K.
   We have cache entries indexed by K.

(3) Page fault occurs at user space U

   [ Disk ]   P [ Page ] - U
  K

   The control goes to do_swap_page, found the page at
   lookup_swap_cache.

   If K and U indexes differently, we have cache alias issues, 
   we need to flush the entries indexed by K.
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



[PATCH] swapin flush cache bug

2001-02-12 Thread Marcelo Tosatti


Hi,

Niibe Yutaka noted (and added an entry on the MM bugzilla system) that
cache flushing on do_swap_page() is buggy. Here: 

---
struct page *page = lookup_swap_cache(entry);
pte_t pte;

if (!page) {
lock_kernel();
swapin_readahead(entry);
page = read_swap_cache(entry);
unlock_kernel();
if (!page)
return -1;

flush_page_to_ram(page);
flush_icache_page(vma, page);
}

mm->rss++;
--

If lookup_swap_cache() finds a page in the swap cache, and that page was
in memory because of the swapin readahead, the cache is not flushed.

Here is a patch to fix the problem by always flushing the cache including
for pages in the swap cache:

--- linux.orig/mm/memory.c.orig   Thu Feb  8 10:52:20 2001
+++ linux/mm/memory.cThu Feb  8 10:54:07 2001
@@ -1033,12 +1033,12 @@
unlock_kernel();
if (!page)
return -1;
-
-   flush_page_to_ram(page);
-   flush_icache_page(vma, page);
}
 
mm->rss++;
+
+   flush_page_to_ram(page);
+   flush_icache_page(vma, page);
 
pte = mk_pte(page, vma->vm_page_prot);




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://vger.kernel.org/lkml/