Re: [patch] truncate_inode_pages

2001-06-12 Thread Dieter Nützel

Am Dienstag, 12. Juni 2001 14:32 schrieb Daniel Phillips:
> On Tuesday 12 June 2001 02:00, you wrote:
> > Now with -ac13 and the third try.
> > Is it final?
>
> There have been no further problems reported, but it's Andrew Morton's
> patch, his decision.
>
> --
> Daniel

Hello Andrew,

have you forwarded it to Alan for apply?

Thanks,
Dieter
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-11 Thread Andrew Morton

Daniel Phillips wrote:
> 
> On Monday 11 June 2001 14:45, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > > On Sunday 10 June 2001 03:31, Andrew Morton wrote:
> > > > Daniel Phillips wrote:
> > > > > This is easy, just set the list head to the page about to be
> > > > > truncated.
> > > >
> > > > Works for me.
> > >
> > > It looks good, but it's black magic
> >
> > No, it's wrong.  I'm getting BUG()s in clear_inode():
> > [...]
> > The lists are mangled.
> 
> curr is being advanced in the wrong place.

Yes.

> /me makes note to self: never resist the temptation to clean things up the
> rest of the way
> 
> I'll actually apply the patch and try it this time ;-)

The bug is surprisingly hard to trigger.


Take three:


--- linux-2.4.5/mm/filemap.cMon May 28 13:31:49 2001
+++ linux-akpm/mm/filemap.c Mon Jun 11 23:31:08 2001
@@ -230,17 +230,17 @@
unsigned long offset;
 
page = list_entry(curr, struct page, list);
-   curr = curr->next;
offset = page->index;
 
/* Is one of the pages to truncate? */
if ((offset >= start) || (*partial && (offset + 1) == start)) {
+   list_del(head);
+   list_add(head, curr);
if (TryLockPage(page)) {
page_cache_get(page);
spin_unlock(&pagecache_lock);
wait_on_page(page);
-   page_cache_release(page);
-   return 1;
+   goto out_restart;
}
page_cache_get(page);
spin_unlock(&pagecache_lock);
@@ -252,11 +252,15 @@
truncate_complete_page(page);
 
UnlockPage(page);
-   page_cache_release(page);
-   return 1;
+   goto out_restart;
}
+   curr = curr->next;
}
return 0;
+out_restart:
+   page_cache_release(page);
+   spin_lock(&pagecache_lock);
+   return 1;
 }
 
 
@@ -273,15 +277,19 @@
 {
unsigned long start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
+   int complete;
 
-repeat:
spin_lock(&pagecache_lock);
-   if (truncate_list_pages(&mapping->clean_pages, start, &partial))
-   goto repeat;
-   if (truncate_list_pages(&mapping->dirty_pages, start, &partial))
-   goto repeat;
-   if (truncate_list_pages(&mapping->locked_pages, start, &partial))
-   goto repeat;
+   do {
+   complete = 1;
+   while (truncate_list_pages(&mapping->clean_pages, start, &partial))
+   complete = 0;
+   while (truncate_list_pages(&mapping->dirty_pages, start, &partial))
+   complete = 0;
+   while (truncate_list_pages(&mapping->locked_pages, start, &partial))
+   complete = 0;
+   } while (!complete);
+   /* Traversed all three lists without dropping the lock */
spin_unlock(&pagecache_lock);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-11 Thread Daniel Phillips

On Monday 11 June 2001 14:45, Andrew Morton wrote:
> Daniel Phillips wrote:
> > On Sunday 10 June 2001 03:31, Andrew Morton wrote:
> > > Daniel Phillips wrote:
> > > > This is easy, just set the list head to the page about to be
> > > > truncated.
> > >
> > > Works for me.
> >
> > It looks good, but it's black magic
>
> No, it's wrong.  I'm getting BUG()s in clear_inode():
> [...]
> The lists are mangled.

curr is being advanced in the wrong place.

/me makes note to self: never resist the temptation to clean things up the 
rest of the way

I'll actually apply the patch and try it this time ;-)

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-11 Thread Andrew Morton

Daniel Phillips wrote:
> 
> On Sunday 10 June 2001 03:31, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > > This is easy, just set the list head to the page about to be truncated.
> >
> > Works for me.
> 
> It looks good, but it's black magic

No, it's wrong.  I'm getting BUG()s in clear_inode():

485 if (inode->i_data.nrpages)
486 BUG();

(gdb) p inode->i_data
$5 = {clean_pages = {next = 0xd7423ae4, prev = 0xc1628280}, dirty_pages = {next = 
0xd7423aec, prev = 0xd7423aec}, 
  locked_pages = {next = 0xd7423af4, prev = 0xd7423af4}, nrpages = 0x1, a_ops = 
0xc02e4940, host = 0xd7423a40, 
  i_mmap = 0x0, i_mmap_shared = 0x0, i_shared_lock = {lock = 0x1}, gfp_mask = 0x5}

(gdb) p *(struct page *)0xd7423ae4
$6 = {list = {next = 0xd7423ae4, prev = 0xc1628280}, mapping = 0xd7423aec, index = 
0xd7423aec, 
  next_hash = 0xd7423af4, count = {counter = 0xd7423af4}, flags = 0x1, lru = {next = 
0xc02e4940, 
prev = 0xd7423a40}, age = 0x0, wait = {lock = {lock = 0x0}, task_list = {next = 
0x1, prev = 0x5}}, 
  pprev_hash = 0x0, buffers = 0x0, virtual = 0x0, zone = 0x0}
(gdb) p &inode->i_data

The lists are mangled.

I think what's happening is:

curr = curr->next;
offset = page->index;

/* Is one of the pages to truncate? */
if ((offset >= start) || (*partial && (offset + 1) == start)) {
list_del(head);
list_add_tail(head, curr);

`curr' can point to `head' when we get inside the `if'.
So we do, effectively,

list_del(head);
list_add_tail(head, head);

Which turns mapping->clean_pages into an empty list,
and the target page.list thinks it's on a list, but
really isn't.  A subsequent list_del() on the page
screws things up somehow.

Ho-hum.  I'll have another go tomorrow.  Probably a
simple `if (curr != head)' will cure it.

-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-10 Thread Dieter Nützel

> Daniel Phillips wrote:
> > 
> > This is easy, just set the list head to the page about to be truncated.
>
> Works for me.
>
> --- linux-2.4.5/mm/filemap.cMon May 28 13:31:49 2001
> +++ linux-akpm/mm/filemap.c Sun Jun 10 11:29:19 2001
> @@ -235,12 +235,13 @@

[snip]

Works for me 12 times faster on my Athlon 550 and ReiserFS.

System:

Athlon 550 (old one, 0.25 µm)
MSI MS-6167 Rev 1.0B (Irongate C4)
320 MB SDRAM PC100-2-2-2
AHA-2940UW
IBM DDYS-T18350N 18 GB (UW/U160)
Kernel 2.4.5-ac12
Glibc-2.2 (SuSE 7.1)

SunWave1>cat /proc/version
Linux version 2.4.5-ac12 (root@SunWave1) (gcc version 2.95.2 19991024 
(release)) #1 Sat Jun 9 17:41:07 CEST 2001
 
SunWave1>time ./ftruncate
0.430u 54.790s 1:00.09 91.8%0+0k 0+0io 32887pf+0w
 
With the mm/filemap.c fix:
 
SunWave1>cat /proc/version
Linux version 2.4.5-ac12 (root@SunWave1) (gcc version 2.95.2 19991024 
(release)) #1 Sun Jun 10 22:49:07 CEST 2001
 
SunWave1>time ./ftruncate
0.220u 1.670s 0:05.13 36.8% 0+0k 0+0io 32852pf+0w

Thanks,
Dieter
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-10 Thread Daniel Phillips

On Sunday 10 June 2001 03:31, Andrew Morton wrote:
> Daniel Phillips wrote:
> > This is easy, just set the list head to the page about to be truncated.
>
> Works for me.

It looks good, but it's black magic - it could use a comment along the lines 
of:

/*
 * Ensure at least one pass through all three lists with lock held
 */

There is some fuzz on the original code that could use a scrubbing.  "start" 
should not be the 1st full page needing truncating, it should be the first 
page needing truncating at all, full or partial.

The truncation condition becomes:

-   if ((offset >= start) || (*partial && (offset + 1) == start)) {
+   if (offset >= start) {
[...]
-   if (*partial && (offset + 1) == start)
+   if (*partial && offset == start)

Then the conditions are so easy to test it's not worth passing the &partial, 
just pass lstart... and... oh bother, it's not broke, why fix it ;-)

--
Daniel

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-09 Thread Andrew Morton

Daniel Phillips wrote:
> 
> This is easy, just set the list head to the page about to be truncated.

Works for me.

--- linux-2.4.5/mm/filemap.cMon May 28 13:31:49 2001
+++ linux-akpm/mm/filemap.c Sun Jun 10 11:29:19 2001
@@ -235,12 +235,13 @@
 
/* Is one of the pages to truncate? */
if ((offset >= start) || (*partial && (offset + 1) == start)) {
+   list_del(head);
+   list_add_tail(head, curr);
if (TryLockPage(page)) {
page_cache_get(page);
spin_unlock(&pagecache_lock);
wait_on_page(page);
-   page_cache_release(page);
-   return 1;
+   goto out_restart;
}
page_cache_get(page);
spin_unlock(&pagecache_lock);
@@ -252,11 +253,14 @@
truncate_complete_page(page);
 
UnlockPage(page);
-   page_cache_release(page);
-   return 1;
+   goto out_restart;
}
}
return 0;
+out_restart:
+   page_cache_release(page);
+   spin_lock(&pagecache_lock);
+   return 1;
 }
 
 
@@ -273,15 +277,18 @@
 {
unsigned long start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
+   int complete;
 
-repeat:
spin_lock(&pagecache_lock);
-   if (truncate_list_pages(&mapping->clean_pages, start, &partial))
-   goto repeat;
-   if (truncate_list_pages(&mapping->dirty_pages, start, &partial))
-   goto repeat;
-   if (truncate_list_pages(&mapping->locked_pages, start, &partial))
-   goto repeat;
+   do {
+   complete = 1;
+   while (truncate_list_pages(&mapping->clean_pages, start, &partial))
+   complete = 0;
+   while (truncate_list_pages(&mapping->dirty_pages, start, &partial))
+   complete = 0;
+   while (truncate_list_pages(&mapping->locked_pages, start, &partial))
+   complete = 0;
+   } while (!complete);
spin_unlock(&pagecache_lock);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-09 Thread Daniel Phillips

On Saturday 09 June 2001 19:40, Alexander Viro wrote:
> > takes 45 seconds CPU time due to the O(clean * dirty) algorithm in
> > truncate_inode_pages().  The machine is locked up for the duration.
> > The patch reduces this to 20 milliseconds via an O(clean + dirty)
> > algorithm.
>
> Unfortunately, it's _not_ O(clean + dirty).
>
> > +   while (truncate_list_pages(&mapping->clean_pages, start,
> > &partial)) { +  spin_lock(&pagecache_lock);
> > +   complete = 0;
> > +   }
>
> Cool. Now think what happens if pages with large indices are in the
> very end of list. Half of them. You skip clean/2 pages on each of
> clean/2 passes. Hardly a linear behaviour - all you need is a
> different program to trigger it.
>
> Now, having a separate pass that would reorder the pages on list,
> moving the to-kill ones in the beginning might help.

This is easy, just set the list head to the page about to be truncated.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [patch] truncate_inode_pages

2001-06-09 Thread Alexander Viro



> takes 45 seconds CPU time due to the O(clean * dirty) algorithm in
> truncate_inode_pages().  The machine is locked up for the duration.
> The patch reduces this to 20 milliseconds via an O(clean + dirty)
> algorithm.

Unfortunately, it's _not_ O(clean + dirty).

> + while (truncate_list_pages(&mapping->clean_pages, start, &partial)) {
> + spin_lock(&pagecache_lock);
> + complete = 0;
> + }

Cool. Now think what happens if pages with large indices are in the
very end of list. Half of them. You skip clean/2 pages on each of
clean/2 passes. Hardly a linear behaviour - all you need is a different
program to trigger it.

Now, having a separate pass that would reorder the pages on list,
moving the to-kill ones in the beginning might help.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



[patch] truncate_inode_pages

2001-06-09 Thread Andrew Morton

The ftruncate() in this program:

#include 
#include 
#include 

int clean = 64 * 1024 * 1024;
int dirty = 64 * 1024 * 1024;

main()
{
int fd = open("foo", O_RDWR|O_TRUNC|O_CREAT, 0666);
void *mem;

ftruncate(fd, clean + dirty);
mem = mmap(0, clean + dirty, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
memset(mem, 0, clean + dirty);
msync(mem, clean, MS_SYNC);
ftruncate(fd, clean);
}


takes 45 seconds CPU time due to the O(clean * dirty) algorithm in
truncate_inode_pages().  The machine is locked up for the duration.
The patch reduces this to 20 milliseconds via an O(clean + dirty)
algorithm.


--- linux-2.4.5/mm/filemap.cMon May 28 13:31:49 2001
+++ linux-akpm/mm/filemap.c Sun Jun 10 01:27:09 2001
@@ -273,15 +273,24 @@
 {
unsigned long start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
+   int complete;
 
-repeat:
spin_lock(&pagecache_lock);
-   if (truncate_list_pages(&mapping->clean_pages, start, &partial))
-   goto repeat;
-   if (truncate_list_pages(&mapping->dirty_pages, start, &partial))
-   goto repeat;
-   if (truncate_list_pages(&mapping->locked_pages, start, &partial))
-   goto repeat;
+   do {
+   complete = 1;
+   while (truncate_list_pages(&mapping->clean_pages, start, &partial)) {
+   spin_lock(&pagecache_lock);
+   complete = 0;
+   }
+   while (truncate_list_pages(&mapping->dirty_pages, start, &partial)) {
+   spin_lock(&pagecache_lock);
+   complete = 0;
+   }
+   while (truncate_list_pages(&mapping->locked_pages, start, &partial)) {
+   spin_lock(&pagecache_lock);
+   complete = 0;
+   }
+   } while (!complete);
spin_unlock(&pagecache_lock);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/