Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-12 Thread Jan Kara
On Tue 10-04-07 19:27:22, Andrew Morton wrote:
> And shouldn't offset be called prev_offset?  Or should prev_page be called
> page?  Or index?  Or prev_index?  Or Marmaduke?  The naming is all a mess.
> 
> Wanna take a look at all of this, please?
  OK, attached is a patch which changes prev_page to prev_index and offset
to prev_offset. It's still a bit misleading that in mm/filemap.c, variables
with page-numbers are called _index while in mm/readahead.c they
are often called just offset... But let's leave it for next time.

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Rename file_ra_state.prev_page to prev_index and file_ra_state.offset to
prev_offset.  Also update of prev_index in do_generic_mapping_read() is now
moved close to the update of prev_offset.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.21-rc6-1-readahead/include/linux/fs.h linux-2.6.21-rc6-2-readahead-tidy/include/linux/fs.h
--- linux-2.6.21-rc6-1-readahead/include/linux/fs.h	2007-04-10 17:14:42.0 +0200
+++ linux-2.6.21-rc6-2-readahead-tidy/include/linux/fs.h	2007-04-12 17:01:08.0 +0200
@@ -696,13 +696,13 @@ struct file_ra_state {
 	unsigned long size;
 	unsigned long flags;		/* ra flags RA_FLAG_xxx*/
 	unsigned long cache_hit;	/* cache hit count*/
-	unsigned long prev_page;	/* Cache last read() position */
+	unsigned long prev_index;	/* Cache last read() position */
 	unsigned long ahead_start;	/* Ahead window */
 	unsigned long ahead_size;
 	unsigned long ra_pages;		/* Maximum readahead window */
 	unsigned long mmap_hit;		/* Cache hit stat for mmap accesses */
 	unsigned long mmap_miss;	/* Cache miss stat for mmap accesses */
-	unsigned int offset;		/* Offset where last read() ended in a page */
+	unsigned int prev_offset;	/* Offset where last read() ended in a page */
 };
 #define RA_FLAG_MISS 0x01	/* a cache miss occured against this file */
 #define RA_FLAG_INCACHE 0x02	/* file is already in cache */
diff -rupX /home/jack/.kerndiffexclude linux-2.6.21-rc6-1-readahead/mm/filemap.c linux-2.6.21-rc6-2-readahead-tidy/mm/filemap.c
--- linux-2.6.21-rc6-1-readahead/mm/filemap.c	2007-04-10 17:18:10.0 +0200
+++ linux-2.6.21-rc6-2-readahead-tidy/mm/filemap.c	2007-04-12 16:19:25.0 +0200
@@ -877,8 +877,8 @@ void do_generic_mapping_read(struct addr
 	cached_page = NULL;
 	index = *ppos >> PAGE_CACHE_SHIFT;
 	next_index = index;
-	prev_index = ra.prev_page;
-	prev_offset = ra.offset;
+	prev_index = ra.prev_index;
+	prev_offset = ra.prev_offset;
 	last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
 	offset = *ppos & ~PAGE_CACHE_MASK;
 
@@ -931,7 +931,6 @@ page_ok:
 		 */
 		if (prev_index != index || offset != prev_offset)
 			mark_page_accessed(page);
-		prev_index = index;
 
 		/*
 		 * Ok, we have the page, and it's up-to-date, so
@@ -947,7 +946,8 @@ page_ok:
 		offset += ret;
 		index += offset >> PAGE_CACHE_SHIFT;
 		offset &= ~PAGE_CACHE_MASK;
-		prev_offset = ra.offset = offset;
+		prev_index = index;
+		prev_offset = ra.prev_offset = offset;
 
 		page_cache_release(page);
 		if (ret == nr && desc->count)
diff -rupX /home/jack/.kerndiffexclude linux-2.6.21-rc6-1-readahead/mm/readahead.c linux-2.6.21-rc6-2-readahead-tidy/mm/readahead.c
--- linux-2.6.21-rc6-1-readahead/mm/readahead.c	2007-04-10 17:14:42.0 +0200
+++ linux-2.6.21-rc6-2-readahead-tidy/mm/readahead.c	2007-04-12 17:03:18.0 +0200
@@ -37,7 +37,7 @@ void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
 {
 	ra->ra_pages = mapping->backing_dev_info->ra_pages;
-	ra->prev_page = -1;
+	ra->prev_index = -1;
 }
 EXPORT_SYMBOL_GPL(file_ra_state_init);
 
@@ -202,19 +202,19 @@ out:
  * size:	Number of pages in that read
  *  Together, these form the "current window".
  *  Together, start and size represent the `readahead window'.
- * prev_page:   The page which the readahead algorithm most-recently inspected.
+ * prev_index:  The page which the readahead algorithm most-recently inspected.
  *  It is mainly used to detect sequential file reading.
  *  If page_cache_readahead sees that it is again being called for
  *  a page which it just looked at, it can return immediately without
  *  making any state changes.
- * offset:  Offset in the prev_page where the last read ended - used for
+ * offset:  Offset in the prev_index where the last read ended - used for
  *  detection of sequential file reading.
  * ahead_start,
  * ahead_size:  Together, these form the "ahead window".
  * ra_pages:	The externally controlled max readahead for this fd.
  *
  * When readahead is in the off state (size == 0), readahead is disabled.
- * In this state, prev_page is used to detect the resumption of sequential I/O.
+ * In this state, prev_index is used to detect the resumption of sequential I/O.
  

Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-11 Thread Jan Kara
On Tue 10-04-07 19:27:22, Andrew Morton wrote:
> On Tue, 10 Apr 2007 17:54:11 +0200 Jan Kara <[EMAIL PROTECTED]> wrote:
> 
> > Introduce ra.offset and store in it an offset where the previous read 
> > ended. This way
> > we can detect whether reads are really sequential (and thus we should not 
> > mark the page
> > as accessed repeatedly) or whether they are random and just happen to be in 
> > the same page
> > (and the page should really be marked accessed again).
> 
> (less columns, please)
  Oops, sorry.

> OK.  So prev_page and prev_offset are now a complexified representation of a
> loff_t, no?  
  Yes.

> So why don't we just use a loff_t for this?
 I did not merge them because most other things in file_ra_state are in
pages and thus comparisons and assignments would require additional
shifts, which would IMHO make the whole thing less clear.

> Anyway, the asymmetry in our handling of prev_index (sometimes called
> prev_page!) and prev_offset is unpleasing.  This:
  prev_page is a member of the file_ra_state. Probably prev_index would be
better name there...

> --- a/mm/filemap.c~readahead-improve-heuristic-detecting-sequential-reads-tidy
> +++ a/mm/filemap.c
> @@ -933,6 +933,7 @@ page_ok:
>   if (prev_index != index || offset != prev_offset)
>   mark_page_accessed(page);
>   prev_index = index;
> + prev_offset = ra.offset = offset;
>  
>   /*
>* Ok, we have the page, and it's up-to-date, so
> @@ -948,7 +949,6 @@ page_ok:
>   offset += ret;
>   index += offset >> PAGE_CACHE_SHIFT;
>   offset &= ~PAGE_CACHE_MASK;
> - prev_offset = ra.offset = offset;
>  
>   page_cache_release(page);
>   if (ret == nr && desc->count)
> 
> improves things somewhat.  But I think it would be nicer if their handling
> was unified, or at least consistent.  We update ra.offset here, and we
> update ra.prev_page over there.
  It's clearer but wrong, as Wu already noted :). But the assignment
prev_index = index
  can be shifted somewhat lower (definitely after the comment) which makes
things slightly more readable.

> And shouldn't offset be called prev_offset?  Or should prev_page be called
> page?  Or index?  Or prev_index?  Or Marmaduke?  The naming is all a mess.
  I vote for prev_index and 'offset' could be prev_offset. I'll create a
tidy-up patch...

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-11 Thread Andi Kleen
> adaptive readahead?  Has been in -mm for a year.  Problem is, it is
> _so_ complete that I just don't know how to merge it.  It's huge, and
> only Wu understands it. So it's really rather stuck.

No benchmark numbers for/against it at all?

-Andi
> 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-10 Thread WU Fengguang
On Tue, Apr 10, 2007 at 04:45:40PM -0700, Andrew Morton wrote:
> On 11 Apr 2007 00:56:51 +0200
> Andi Kleen <[EMAIL PROTECTED]> wrote:
> 
> > There's a much more complete patchkit for this that gets reposted
> > regularly on l-k. Perhaps it would make sense to test that first?
> 
> adaptive readahead?  Has been in -mm for a year.  Problem is, it is
> _so_ complete that I just don't know how to merge it.  It's huge, and
> only Wu understands it. So it's really rather stuck.

Yeah, it is.

I'll come up with a minimal patch to first simplify the current
readahead algorithm. The code is ready, if ignoring the document ;-)

Wu
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 17:54:11 +0200 Jan Kara <[EMAIL PROTECTED]> wrote:

> Introduce ra.offset and store in it an offset where the previous read ended. 
> This way
> we can detect whether reads are really sequential (and thus we should not 
> mark the page
> as accessed repeatedly) or whether they are random and just happen to be in 
> the same page
> (and the page should really be marked accessed again).

(less columns, please)

OK.  So prev_page and prev_offset are now a complexified representation of a
loff_t, no?  

So why don't we just use a loff_t for this?

Anyway, the asymmetry in our handling of prev_index (sometimes called
prev_page!) and prev_offset is unpleasing.  This:

--- a/mm/filemap.c~readahead-improve-heuristic-detecting-sequential-reads-tidy
+++ a/mm/filemap.c
@@ -933,6 +933,7 @@ page_ok:
if (prev_index != index || offset != prev_offset)
mark_page_accessed(page);
prev_index = index;
+   prev_offset = ra.offset = offset;
 
/*
 * Ok, we have the page, and it's up-to-date, so
@@ -948,7 +949,6 @@ page_ok:
offset += ret;
index += offset >> PAGE_CACHE_SHIFT;
offset &= ~PAGE_CACHE_MASK;
-   prev_offset = ra.offset = offset;
 
page_cache_release(page);
if (ret == nr && desc->count)

improves things somewhat.  But I think it would be nicer if their handling
was unified, or at least consistent.  We update ra.offset here, and we
update ra.prev_page over there.

And shouldn't offset be called prev_offset?  Or should prev_page be called
page?  Or index?  Or prev_index?  Or Marmaduke?  The naming is all a mess.

Wanna take a look at all of this, please?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-10 Thread Andrew Morton
On 11 Apr 2007 00:56:51 +0200
Andi Kleen <[EMAIL PROTECTED]> wrote:

> Jan Kara <[EMAIL PROTECTED]> writes:
> 
> >   Hello!
> > 
> >   In thread http://lkml.org/lkml/2007/3/9/403, we discussed a problem
> > with the current heuristic for detecting sequential IO in
> > do_generic_mapping_read() - for small files a page is marked as accessed
> > only once which can cause a performance problems.
> >   Attached is a patch that improves the heuristic by introducing a
> > ra.offset variable so now we can reliably detect sequetial reads.
> > The patch replaces Nick's patch from http://lkml.org/lkml/2007/3/15/177
> > (Nick has acked to replace the patch). Could you please put the patch
> > into -mm?
> 
> There's a much more complete patchkit for this that gets reposted
> regularly on l-k. Perhaps it would make sense to test that first?

adaptive readahead?  Has been in -mm for a year.  Problem is, it is
_so_ complete that I just don't know how to merge it.  It's huge, and
only Wu understands it. So it's really rather stuck.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Improve heuristic detecting sequential reads

2007-04-10 Thread Andi Kleen
Jan Kara <[EMAIL PROTECTED]> writes:

>   Hello!
> 
>   In thread http://lkml.org/lkml/2007/3/9/403, we discussed a problem
> with the current heuristic for detecting sequential IO in
> do_generic_mapping_read() - for small files a page is marked as accessed
> only once which can cause a performance problems.
>   Attached is a patch that improves the heuristic by introducing a
> ra.offset variable so now we can reliably detect sequetial reads.
> The patch replaces Nick's patch from http://lkml.org/lkml/2007/3/15/177
> (Nick has acked to replace the patch). Could you please put the patch
> into -mm?

There's a much more complete patchkit for this that gets reposted
regularly on l-k. Perhaps it would make sense to test that first?

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/