Re: [PATCH] Improve heuristic detecting sequential reads
On Tue 10-04-07 19:27:22, Andrew Morton wrote: > And shouldn't offset be called prev_offset? Or should prev_page be called > page? Or index? Or prev_index? Or Marmaduke? The naming is all a mess. > > Wanna take a look at all of this, please? OK, attached is a patch which changes prev_page to prev_index and offset to prev_offset. It's still a bit misleading that in mm/filemap.c, variables with page-numbers are called _index while in mm/readahead.c they are often called just offset... But let's leave it for next time. Honza -- Jan Kara <[EMAIL PROTECTED]> SuSE CR Labs Rename file_ra_state.prev_page to prev_index and file_ra_state.offset to prev_offset. Also update of prev_index in do_generic_mapping_read() is now moved close to the update of prev_offset. Signed-off-by: Jan Kara <[EMAIL PROTECTED]> diff -rupX /home/jack/.kerndiffexclude linux-2.6.21-rc6-1-readahead/include/linux/fs.h linux-2.6.21-rc6-2-readahead-tidy/include/linux/fs.h --- linux-2.6.21-rc6-1-readahead/include/linux/fs.h 2007-04-10 17:14:42.0 +0200 +++ linux-2.6.21-rc6-2-readahead-tidy/include/linux/fs.h 2007-04-12 17:01:08.0 +0200 @@ -696,13 +696,13 @@ struct file_ra_state { unsigned long size; unsigned long flags; /* ra flags RA_FLAG_xxx*/ unsigned long cache_hit; /* cache hit count*/ - unsigned long prev_page; /* Cache last read() position */ + unsigned long prev_index; /* Cache last read() position */ unsigned long ahead_start; /* Ahead window */ unsigned long ahead_size; unsigned long ra_pages; /* Maximum readahead window */ unsigned long mmap_hit; /* Cache hit stat for mmap accesses */ unsigned long mmap_miss; /* Cache miss stat for mmap accesses */ - unsigned int offset; /* Offset where last read() ended in a page */ + unsigned int prev_offset; /* Offset where last read() ended in a page */ }; #define RA_FLAG_MISS 0x01 /* a cache miss occured against this file */ #define RA_FLAG_INCACHE 0x02 /* file is already in cache */ diff -rupX /home/jack/.kerndiffexclude linux-2.6.21-rc6-1-readahead/mm/filemap.c linux-2.6.21-rc6-2-readahead-tidy/mm/filemap.c --- linux-2.6.21-rc6-1-readahead/mm/filemap.c 2007-04-10 17:18:10.0 +0200 +++ linux-2.6.21-rc6-2-readahead-tidy/mm/filemap.c 2007-04-12 16:19:25.0 +0200 @@ -877,8 +877,8 @@ void do_generic_mapping_read(struct addr cached_page = NULL; index = *ppos >> PAGE_CACHE_SHIFT; next_index = index; - prev_index = ra.prev_page; - prev_offset = ra.offset; + prev_index = ra.prev_index; + prev_offset = ra.prev_offset; last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT; offset = *ppos & ~PAGE_CACHE_MASK; @@ -931,7 +931,6 @@ page_ok: */ if (prev_index != index || offset != prev_offset) mark_page_accessed(page); - prev_index = index; /* * Ok, we have the page, and it's up-to-date, so @@ -947,7 +946,8 @@ page_ok: offset += ret; index += offset >> PAGE_CACHE_SHIFT; offset &= ~PAGE_CACHE_MASK; - prev_offset = ra.offset = offset; + prev_index = index; + prev_offset = ra.prev_offset = offset; page_cache_release(page); if (ret == nr && desc->count) diff -rupX /home/jack/.kerndiffexclude linux-2.6.21-rc6-1-readahead/mm/readahead.c linux-2.6.21-rc6-2-readahead-tidy/mm/readahead.c --- linux-2.6.21-rc6-1-readahead/mm/readahead.c 2007-04-10 17:14:42.0 +0200 +++ linux-2.6.21-rc6-2-readahead-tidy/mm/readahead.c 2007-04-12 17:03:18.0 +0200 @@ -37,7 +37,7 @@ void file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping) { ra->ra_pages = mapping->backing_dev_info->ra_pages; - ra->prev_page = -1; + ra->prev_index = -1; } EXPORT_SYMBOL_GPL(file_ra_state_init); @@ -202,19 +202,19 @@ out: * size: Number of pages in that read * Together, these form the "current window". * Together, start and size represent the `readahead window'. - * prev_page: The page which the readahead algorithm most-recently inspected. + * prev_index: The page which the readahead algorithm most-recently inspected. * It is mainly used to detect sequential file reading. * If page_cache_readahead sees that it is again being called for * a page which it just looked at, it can return immediately without * making any state changes. - * offset: Offset in the prev_page where the last read ended - used for + * offset: Offset in the prev_index where the last read ended - used for * detection of sequential file reading. * ahead_start, * ahead_size: Together, these form the "ahead window". * ra_pages: The externally controlled max readahead for this fd. * * When readahead is in the off state (size == 0), readahead is disabled. - * In this state, prev_page is used to detect the resumption of sequential I/O. + * In this state, prev_index is used to detect the resumption of sequential I/O.
Re: [PATCH] Improve heuristic detecting sequential reads
On Tue 10-04-07 19:27:22, Andrew Morton wrote: > On Tue, 10 Apr 2007 17:54:11 +0200 Jan Kara <[EMAIL PROTECTED]> wrote: > > > Introduce ra.offset and store in it an offset where the previous read > > ended. This way > > we can detect whether reads are really sequential (and thus we should not > > mark the page > > as accessed repeatedly) or whether they are random and just happen to be in > > the same page > > (and the page should really be marked accessed again). > > (less columns, please) Oops, sorry. > OK. So prev_page and prev_offset are now a complexified representation of a > loff_t, no? Yes. > So why don't we just use a loff_t for this? I did not merge them because most other things in file_ra_state are in pages and thus comparisons and assignments would require additional shifts, which would IMHO make the whole thing less clear. > Anyway, the asymmetry in our handling of prev_index (sometimes called > prev_page!) and prev_offset is unpleasing. This: prev_page is a member of the file_ra_state. Probably prev_index would be better name there... > --- a/mm/filemap.c~readahead-improve-heuristic-detecting-sequential-reads-tidy > +++ a/mm/filemap.c > @@ -933,6 +933,7 @@ page_ok: > if (prev_index != index || offset != prev_offset) > mark_page_accessed(page); > prev_index = index; > + prev_offset = ra.offset = offset; > > /* >* Ok, we have the page, and it's up-to-date, so > @@ -948,7 +949,6 @@ page_ok: > offset += ret; > index += offset >> PAGE_CACHE_SHIFT; > offset &= ~PAGE_CACHE_MASK; > - prev_offset = ra.offset = offset; > > page_cache_release(page); > if (ret == nr && desc->count) > > improves things somewhat. But I think it would be nicer if their handling > was unified, or at least consistent. We update ra.offset here, and we > update ra.prev_page over there. It's clearer but wrong, as Wu already noted :). But the assignment prev_index = index can be shifted somewhat lower (definitely after the comment) which makes things slightly more readable. > And shouldn't offset be called prev_offset? Or should prev_page be called > page? Or index? Or prev_index? Or Marmaduke? The naming is all a mess. I vote for prev_index and 'offset' could be prev_offset. I'll create a tidy-up patch... Honza -- Jan Kara <[EMAIL PROTECTED]> SuSE CR Labs - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Improve heuristic detecting sequential reads
> adaptive readahead? Has been in -mm for a year. Problem is, it is > _so_ complete that I just don't know how to merge it. It's huge, and > only Wu understands it. So it's really rather stuck. No benchmark numbers for/against it at all? -Andi > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Improve heuristic detecting sequential reads
On Tue, Apr 10, 2007 at 04:45:40PM -0700, Andrew Morton wrote: > On 11 Apr 2007 00:56:51 +0200 > Andi Kleen <[EMAIL PROTECTED]> wrote: > > > There's a much more complete patchkit for this that gets reposted > > regularly on l-k. Perhaps it would make sense to test that first? > > adaptive readahead? Has been in -mm for a year. Problem is, it is > _so_ complete that I just don't know how to merge it. It's huge, and > only Wu understands it. So it's really rather stuck. Yeah, it is. I'll come up with a minimal patch to first simplify the current readahead algorithm. The code is ready, if ignoring the document ;-) Wu - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Improve heuristic detecting sequential reads
On Tue, 10 Apr 2007 17:54:11 +0200 Jan Kara <[EMAIL PROTECTED]> wrote: > Introduce ra.offset and store in it an offset where the previous read ended. > This way > we can detect whether reads are really sequential (and thus we should not > mark the page > as accessed repeatedly) or whether they are random and just happen to be in > the same page > (and the page should really be marked accessed again). (less columns, please) OK. So prev_page and prev_offset are now a complexified representation of a loff_t, no? So why don't we just use a loff_t for this? Anyway, the asymmetry in our handling of prev_index (sometimes called prev_page!) and prev_offset is unpleasing. This: --- a/mm/filemap.c~readahead-improve-heuristic-detecting-sequential-reads-tidy +++ a/mm/filemap.c @@ -933,6 +933,7 @@ page_ok: if (prev_index != index || offset != prev_offset) mark_page_accessed(page); prev_index = index; + prev_offset = ra.offset = offset; /* * Ok, we have the page, and it's up-to-date, so @@ -948,7 +949,6 @@ page_ok: offset += ret; index += offset >> PAGE_CACHE_SHIFT; offset &= ~PAGE_CACHE_MASK; - prev_offset = ra.offset = offset; page_cache_release(page); if (ret == nr && desc->count) improves things somewhat. But I think it would be nicer if their handling was unified, or at least consistent. We update ra.offset here, and we update ra.prev_page over there. And shouldn't offset be called prev_offset? Or should prev_page be called page? Or index? Or prev_index? Or Marmaduke? The naming is all a mess. Wanna take a look at all of this, please? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Improve heuristic detecting sequential reads
On 11 Apr 2007 00:56:51 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote: > Jan Kara <[EMAIL PROTECTED]> writes: > > > Hello! > > > > In thread http://lkml.org/lkml/2007/3/9/403, we discussed a problem > > with the current heuristic for detecting sequential IO in > > do_generic_mapping_read() - for small files a page is marked as accessed > > only once which can cause a performance problems. > > Attached is a patch that improves the heuristic by introducing a > > ra.offset variable so now we can reliably detect sequetial reads. > > The patch replaces Nick's patch from http://lkml.org/lkml/2007/3/15/177 > > (Nick has acked to replace the patch). Could you please put the patch > > into -mm? > > There's a much more complete patchkit for this that gets reposted > regularly on l-k. Perhaps it would make sense to test that first? adaptive readahead? Has been in -mm for a year. Problem is, it is _so_ complete that I just don't know how to merge it. It's huge, and only Wu understands it. So it's really rather stuck. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Improve heuristic detecting sequential reads
Jan Kara <[EMAIL PROTECTED]> writes: > Hello! > > In thread http://lkml.org/lkml/2007/3/9/403, we discussed a problem > with the current heuristic for detecting sequential IO in > do_generic_mapping_read() - for small files a page is marked as accessed > only once which can cause a performance problems. > Attached is a patch that improves the heuristic by introducing a > ra.offset variable so now we can reliably detect sequetial reads. > The patch replaces Nick's patch from http://lkml.org/lkml/2007/3/15/177 > (Nick has acked to replace the patch). Could you please put the patch > into -mm? There's a much more complete patchkit for this that gets reposted regularly on l-k. Perhaps it would make sense to test that first? -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/