Re: [CALL FOR TESTING] Make Ext3 fsck way faster [2.6.24-rc6 -mm patch]

2008-01-15 Thread Andrew Morton

I'm wondering about the real value of this change, really.

In any decent environment, people will fsck their ext3 filesystems during
planned downtime, and the benefit of reducing that downtime from 6
hours/machine to 2 hours/machine is probably fairly small, given that there
is no service interruption.  (The same applies to desktops and laptops).

Sure, the benefit is not *zero*, but it's small.  Much less than it would
be with ext2.  I mean, the avoid unplanned fscks feature is the whole
reason why ext3 has journalling (and boy is that feature expensive during
normal operation).

So...  it's unobvious that the benefit of this feature is worth its risks
and costs?

-
To unsubscribe from this list: send the line unsubscribe linux-ext4 in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [CALL FOR TESTING] Make Ext3 fsck way faster [2.6.24-rc6 -mm patch]

2008-01-15 Thread Christoph Hellwig
On Tue, Jan 15, 2008 at 03:04:41AM -0800, Andrew Morton wrote:
> I'm wondering about the real value of this change, really.
>
> In any decent environment, people will fsck their ext3 filesystems during
> planned downtime, and the benefit of reducing that downtime from 6
> hours/machine to 2 hours/machine is probably fairly small, given that there
> is no service interruption.  (The same applies to desktops and laptops).
>
> Sure, the benefit is not *zero*, but it's small.  Much less than it would
> be with ext2.  I mean, the avoid unplanned fscks feature is the whole
> reason why ext3 has journalling (and boy is that feature expensive during
> normal operation).
>
> So...  it's unobvious that the benefit of this feature is worth its risks
> and costs?

They won't fsck in planned downtimes.  They will have to use fsck when
the shit hits the fan and they need to.  Not sure about ext3, but a big
XFS user with close ties to the US government was concerned about this
case for really big filesystems and has sponsored speedups, including
multithreading xfs_repair.  I'm pretty sure the same arguments apply
to ext3, even if the filesystems are a few orders of magnitude smaller.

 


Re: [CALL FOR TESTING] Make Ext3 fsck way faster [2.6.24-rc6 -mm patch]

2008-01-15 Thread Christoph Hellwig
On Tue, Jan 15, 2008 at 01:15:33PM +, Christoph Hellwig wrote:
> They won't fsck in planned downtimes.  They will have to use fsck when
> the shit hits the fan and they need to.   Not sure about ext3, but big
> XFS user with a close tie to the US government were concerned about this
> case for really big filesystems and have sponsored speedup including
> multithreading xfs_repair.  I'm pretty sure the same arguments apply
> to ext3, even if the filesystems are a few magnitudes smaller.

And to add to that: thanks to the not quite optimal default of periodic
checking, which I always forget to turn off on test machines, an ext3
fsck speedup would be in my personal interest, and probably that of
tons of developers :)


Re: checkpatch.pl warnings

2008-01-15 Thread Aneesh Kumar K.V
On Mon, Jan 14, 2008 at 12:49:27PM -0800, Mingming Cao wrote:
> Hi Guys,
>
> Could you check the checkpatch.pl warnings and see if it makes sense to fix
> them? Thanks!
>
> [EMAIL PROTECTED]:~/fs/ext4/stylecheck$ grep "has style problems" *
> linux-2.6.24-rc7-48-bit-i_blocks.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-ext3-4-migrate.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-ext4_export_iov_shorten_from_kernel_for_ext4.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-ext4-journal_chksum-2.6.20.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-ext4_rec_len_overflow_with_64kblk_fix-v2.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-ext4_store_maxbytes_for_bitmaped_files.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-inode-version-ext4.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-jbd-stats-through-procfs.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-large-file-blocktype.patch.out:Your patch has style problems, please review.  If any of these errors
> linux-2.6.24-rc7-mballoc-core.patch.out:Your patch has style problems, please review.  If any of these errors
>

Fixed the checkpatch.pl warnings for all the patches in the patch queue.
The diff is attached below for review.

The patch queue is at
http://www.radian.org/~kvaneesh/ext4/jan-15-2008/
http://www.radian.org/~kvaneesh/ext4/jan-15-2008/patch-queue.tar

This includes the complete patch queue.

Tested with
--
fsx_linux, fs_inode, fsstress on x86_64 
fsx_linux, fs_inode, fsstress on ppc64


diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 1e46997..2ea7ef4 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1798,8 +1798,9 @@ static int ext4_remove_blocks(handle_t *handle, struct inode *inode,
 		printk(KERN_INFO "strange request: removal %u-%u from %u:%u\n",
 			from, to, le32_to_cpu(ex->ee_block), ee_len);
 	} else {
-		printk(KERN_INFO "strange request: removal(2) %u-%u from %u:%u\n",
-			from, to, le32_to_cpu(ex->ee_block), ee_len);
+		printk(KERN_INFO "strange request: removal(2) "
+				"%u-%u from %u:%u\n",
+				from, to, le32_to_cpu(ex->ee_block), ee_len);
 	}
 	return 0;
 }
@@ -2140,10 +2141,11 @@ void ext4_ext_release(struct super_block *sb)
  *   b> Splits in two extents: Write is happening at either end of the extent
  *   c> Splits in three extents: Somone is writing in middle of the extent
  */
-static int ext4_ext_convert_to_initialized(handle_t *handle, struct inode *inode,
-						struct ext4_ext_path *path,
-						ext4_lblk_t iblock,
-						unsigned long max_blocks)
+static int ext4_ext_convert_to_initialized(handle_t *handle,
+						struct inode *inode,
+						struct ext4_ext_path *path,
+						ext4_lblk_t iblock,
+						unsigned long max_blocks)
 {
 	struct ext4_extent *ex, newex;
 	struct ext4_extent *ex1 = NULL;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f7dc5f3..6bb788d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -751,7 +751,8 @@ err_out:
 	for (i = 1; i <= num; i++) {
 		BUFFER_TRACE(where[i].bh, "call jbd2_journal_forget");
 		ext4_journal_forget(handle, where[i].bh);
-		ext4_free_blocks(handle,inode,le32_to_cpu(where[i-1].key),1, 0);
+		ext4_free_blocks(handle, inode,
+					le32_to_cpu(where[i-1].key), 1, 0);
 	}
 	ext4_free_blocks(handle, inode, le32_to_cpu(where[num].key), blks, 0);
 
@@ -2829,7 +2830,7 @@ static blkcnt_t ext4_inode_blocks(struct ext4_inode *raw_inode,
 				EXT4_FEATURE_RO_COMPAT_HUGE_FILE)) {
 		/* we are using combined 48 bit field */
 		i_blocks = ((u64)le16_to_cpu(raw_inode->i_blocks_high)) << 32 |
-								le32_to_cpu(raw_inode->i_blocks_lo);
+					le32_to_cpu(raw_inode->i_blocks_lo);
 		if (ei->i_flags & EXT4_HUGE_FILE_FL) {
 			/* i_blocks represent file system block size */
 			return i_blocks  << (inode->i_blkbits - 9);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 16854fd..d8cd81e 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -337,7 +337,7 @@ struct ext4_group_info 

Re: [CALL FOR TESTING] Make Ext3 fsck way faster [2.6.24-rc6 -mm patch]

2008-01-15 Thread Ric Wheeler

Andrew Morton wrote:

> I'm wondering about the real value of this change, really.
>
> In any decent environment, people will fsck their ext3 filesystems during
> planned downtime, and the benefit of reducing that downtime from 6
> hours/machine to 2 hours/machine is probably fairly small, given that there
> is no service interruption.  (The same applies to desktops and laptops).
>
> Sure, the benefit is not *zero*, but it's small.  Much less than it would
> be with ext2.  I mean, the avoid unplanned fscks feature is the whole
> reason why ext3 has journalling (and boy is that feature expensive during
> normal operation).
>
> So...  it's unobvious that the benefit of this feature is worth its risks
> and costs?


I actually think that the value of this kind of reduction is huge. We 
have seen fsck run for days (not just hours) which makes the restore 
from backup versus fsck decision favor the tapes...


ric


Re: [CALL FOR TESTING] Make Ext3 fsck way faster [2.6.24-rc6 -mm patch]

2008-01-15 Thread Theodore Tso
On Tue, Jan 15, 2008 at 01:15:33PM +, Christoph Hellwig wrote:
> They won't fsck in planned downtimes.  They will have to use fsck when
> the shit hits the fan and they need to.   Not sure about ext3, but big
> XFS user with a close tie to the US government were concerned about this
> case for really big filesystems and have sponsored speedup including
> multithreading xfs_repair.  I'm pretty sure the same arguments apply
> to ext3, even if the filesystems are a few magnitudes smaller.

Agreed, 100%.  Even if you fsck snapshots during slow periods, it
still doesn't help you if the filesystem gets corrupted due to a
hardware or software error.  That's where this will matter the most.

Val Henson has done a proof-of-concept patch that multi-threads e2fsck
(and she's working on one that would be long-term supportable) that
might reduce the value of this patch, but metaclustering should still
help.

> > In any decent environment, people will fsck their ext3 filesystems during
> > planned downtime, and the benefit of reducing that downtime from 6
> > hours/machine to 2 hours/machine is probably fairly small, given that there
> > is no service interruption.  (The same applies to desktops and laptops).
> >
> > Sure, the benefit is not *zero*, but it's small.  Much less than it would
> > be with ext2.  I mean, the avoid unplanned fscks feature is the whole
> > reason why ext3 has journalling (and boy is that feature expensive during
> > normal operation).

Also, it's not just reducing fsck times, although that's the main one.
The last time this was suggested, the rationale was to speed up the
rm dvd.iso case.  Also, something which *could* be done, if Abhishek
wants to pursue it, would be to pull in all of the indirect blocks
when the file is opened, and create an in-memory extent tree that
would speed up access to the file.  It's rarely worth doing this
without metaclustering, since it doesn't help for sequential I/O, only
random I/O, but with metaclustering it would also be a win for
sequential I/O.  (This would also remove the minor performance
degradation for sequential I/O imposed by metaclustering, and in fact
improve it slightly for really big files.)
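
As a rough illustration of the in-memory extent tree idea above: once the
indirect blocks have been read, the per-block mapping can be coalesced into
runs of contiguous blocks and then searched by logical block number. This is
only a user-space sketch under stated assumptions. struct mem_extent,
build_extents() and lookup_block() are hypothetical names, a flat sorted
array stands in for a real tree, and none of it reflects actual ext3 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory extent: a run of blocks that is contiguous both
 * logically and physically. Not an ext3/ext4 on-disk or kernel layout. */
struct mem_extent {
	uint32_t lblk;	/* first logical block of the run */
	uint32_t pblk;	/* physical block backing lblk */
	uint32_t len;	/* number of blocks in the run */
};

/*
 * Coalesce a per-block map (map[i] = physical block of logical block i,
 * 0 = hole) into extents. Returns the number of extents written, or
 * (size_t)-1 if out[] is too small.
 */
size_t build_extents(const uint32_t *map, size_t nblocks,
		     struct mem_extent *out, size_t max_out)
{
	size_t n = 0;

	assert(map != NULL && out != NULL);
	for (size_t i = 0; i < nblocks; i++) {
		if (map[i] == 0)
			continue;		/* hole: nothing to record */
		if (n > 0 &&
		    out[n - 1].lblk + out[n - 1].len == i &&
		    out[n - 1].pblk + out[n - 1].len == map[i]) {
			out[n - 1].len++;	/* extends the previous run */
			continue;
		}
		if (n == max_out)
			return (size_t)-1;
		out[n].lblk = (uint32_t)i;
		out[n].pblk = map[i];
		out[n].len = 1;
		n++;
	}
	return n;
}

/* Binary-search the sorted extent array; returns 0 for a hole. */
uint32_t lookup_block(const struct mem_extent *ext, size_t n, uint32_t lblk)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (lblk < ext[mid].lblk)
			hi = mid;
		else if (lblk >= ext[mid].lblk + ext[mid].len)
			lo = mid + 1;
		else
			return ext[mid].pblk + (lblk - ext[mid].lblk);
	}
	return 0;
}
```

The better the indirect and data blocks are laid out contiguously, the fewer
extents such a map needs, so a random-access lookup touches a small array
instead of walking several levels of indirect blocks.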

- Ted


Re: checkpatch.pl warnings

2008-01-15 Thread Mingming Cao
On Tue, 2008-01-15 at 18:22 +0530, Aneesh Kumar K.V wrote:
> On Mon, Jan 14, 2008 at 12:49:27PM -0800, Mingming Cao wrote:
> > Hi Guys,
> >
> > Could you check the checkpatch.pl warnings and see if it makes sense to fix
> > them? Thanks!
> >
> > [EMAIL PROTECTED]:~/fs/ext4/stylecheck$ grep "has style problems" *
> > linux-2.6.24-rc7-48-bit-i_blocks.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-ext3-4-migrate.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-ext4_export_iov_shorten_from_kernel_for_ext4.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-ext4-journal_chksum-2.6.20.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-ext4_rec_len_overflow_with_64kblk_fix-v2.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-ext4_store_maxbytes_for_bitmaped_files.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-inode-version-ext4.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-jbd-stats-through-procfs.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-large-file-blocktype.patch.out:Your patch has style problems, please review.  If any of these errors
> > linux-2.6.24-rc7-mballoc-core.patch.out:Your patch has style problems, please review.  If any of these errors
> >
>
> Fixed the checkpatch.pl warning for all the patches in the patch queue.
> The diff is attached below for review.

Thanks!

patch queue has been updated.

Mingming



Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-15 Thread Mike Snitzer
On Jan 14, 2008 7:50 AM, Fengguang Wu [EMAIL PROTECTED] wrote:
> On Mon, Jan 14, 2008 at 12:41:26PM +0100, Peter Zijlstra wrote:
> >
> > On Mon, 2008-01-14 at 12:30 +0100, Joerg Platte wrote:
> > > On Monday, 14 January 2008, Fengguang Wu wrote:
> > >
> > > > Joerg, this patch fixed the bug for me :-)
> > >
> > > Fengguang, congratulations, I can confirm that your patch fixed the
> > > bug! With previous kernels the bug showed up after each reboot. Now,
> > > when booting the patched kernel everything is fine and there is no
> > > longer any suspicious iowait!
> > >
> > > Do you have an idea why this problem appeared in 2.6.24? Did somebody
> > > change the ext2 code or is it related to the changes in the scheduler?
> >
> > It was Fengguang who changed the inode writeback code, and I guess the
> > new and improved code was less able to deal with these funny corner
> > cases. But he has been very good in tracking them down and solving them,
> > kudos to him for that work!
>
> Thank you.
>
> In particular the bug is triggered by the patch named:
> writeback: introduce writeback_control.more_io to indicate more io
> That patch means to speed up writeback, but unfortunately its
> aggressiveness has disclosed bugs in reiserfs, jfs and now ext2.
>
> Linus, given the number of bugs it triggered, I'd recommend reverting
> this patch (git commit 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b). Let's
> push it back to the -mm tree for more testing?

Fengguang,

I'd like to better understand where your writeback work stands
relative to 2.6.24-rcX and -mm.  To be clear, your changes in
2.6.24-rc7 have been benchmarked to provide a ~33% sequential write
performance improvement with ext3 (as compared to 2.6.22, CFS could be
helping, etc but...).  Very impressive!

Given this improvement it is unfortunate to see your request to revert
2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b but it is understandable if
you're not confident in it for 2.6.24.

That said, you recently posted an -mm patchset that first reverts
2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b and then goes on to address
the slow writes for concurrent large and small file writes bug:
http://lkml.org/lkml/2008/1/15/132

For those interested in using your writeback improvements in
production sooner rather than later (primarily with ext3); what
recommendations do you have?  Just heavily test our own 2.6.24 + your
evolving close, but not ready for merge -mm writeback patchset?

regards,
Mike


Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-15 Thread Ingo Molnar

* Fengguang Wu [EMAIL PROTECTED] wrote:

> On Mon, Jan 14, 2008 at 12:41:26PM +0100, Peter Zijlstra wrote:
> >
> > On Mon, 2008-01-14 at 12:30 +0100, Joerg Platte wrote:
> > > On Monday, 14 January 2008, Fengguang Wu wrote:
> > >
> > > > Joerg, this patch fixed the bug for me :-)
> > >
> > > Fengguang, congratulations, I can confirm that your patch fixed the
> > > bug! With previous kernels the bug showed up after each reboot. Now,
> > > when booting the patched kernel everything is fine and there is no
> > > longer any suspicious iowait!
> > >
> > > Do you have an idea why this problem appeared in 2.6.24? Did somebody
> > > change the ext2 code or is it related to the changes in the scheduler?
> >
> > It was Fengguang who changed the inode writeback code, and I guess the
> > new and improved code was less able to deal with these funny corner
> > cases. But he has been very good in tracking them down and solving them,
> > kudos to him for that work!
>
> Thank you.
>
> In particular the bug is triggered by the patch named:
> writeback: introduce writeback_control.more_io to indicate more io
> That patch means to speed up writeback, but unfortunately its
> aggressiveness has disclosed bugs in reiserfs, jfs and now ext2.
>
> Linus, given the number of bugs it triggered, I'd recommend reverting
> this patch (git commit 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b). Let's
> push it back to the -mm tree for more testing?

i dont think a revert at this stage is a good idea and i'm not sure 
pushing it back into -mm would really expose more of these bugs. And 
these are real bugs in filesystems - bugs which we want to see fixed 
anyway. You are also tracking down those bugs very fast.

[ perhaps, if it's possible technically (and if it is clean enough), you
  might want to offer a runtime debug tunable that can be used to switch
  off the new aspects of your code. That would speed up testing, in case
  anyone suspects the new writeback code. ]

Ingo


[PATCH] Do not try lock_acquire after handle made invalid

2008-01-15 Thread Jonas Bonn
This likely fixes the oops in __lock_acquire reported as:

http://www.kerneloops.org/raw.php?rawid=2753msgid=
http://www.kerneloops.org/raw.php?rawid=2749msgid=

In these reported oopses, start_this_handle is returning -EROFS.

Signed-off-by: Jonas Bonn [EMAIL PROTECTED]
---
 fs/jbd/transaction.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
index 08ff6c7..038ed74 100644
--- a/fs/jbd/transaction.c
+++ b/fs/jbd/transaction.c
@@ -288,10 +288,12 @@ handle_t *journal_start(journal_t *journal, int nblocks)
 		jbd_free_handle(handle);
 		current->journal_info = NULL;
 		handle = ERR_PTR(err);
+		goto out;
 	}
 
 	lock_acquire(&handle->h_lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 
+out:
 	return handle;
 }
 
 
-- 
1.5.2.5
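
To make the failure mode concrete, here is a small user-space sketch of the
same control flow. It is not the jbd code: ERR_PTR(), PTR_ERR() and IS_ERR()
below are simplified stand-ins for the kernel helpers of the same names, and
struct toy_handle, toy_start_this_handle() and toy_journal_start() are
hypothetical. The point it tries to show is that once the handle variable
holds an encoded error, nothing may dereference it, so the function has to
skip straight to the return.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct toy_handle {
	int lock_depth;	/* stands in for the lockdep map in the real handle */
};

/* Hypothetical: fails with -EROFS when the "journal" is read-only. */
static int toy_start_this_handle(int read_only)
{
	return read_only ? -EROFS : 0;
}

struct toy_handle *toy_journal_start(int read_only)
{
	struct toy_handle *handle = calloc(1, sizeof(*handle));
	int err;

	if (!handle)
		return ERR_PTR(-ENOMEM);

	err = toy_start_this_handle(read_only);
	if (err < 0) {
		free(handle);
		handle = ERR_PTR(err);
		goto out;	/* handle is no longer a valid pointer */
	}

	handle->lock_depth++;	/* only safe on a real handle */
out:
	return handle;
}
```

Without the goto, the error path would fall through to the line that touches
the handle, which in this sketch corresponds to the lock_acquire() call in
the patch.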



Re: [CALL FOR TESTING] Make Ext3 fsck way faster [2.6.24-rc6 -mm patch]

2008-01-15 Thread Valdis . Kletnieks
On Tue, 15 Jan 2008 10:09:16 EST, Ric Wheeler said:
> I actually think that the value of this kind of reduction is huge. We
> have seen fsck run for days (not just hours) which makes the restore
> from backup versus fsck decision favor the tapes...

Funny thing is that for many of these sorts of cases, restore from backup
is *also* a days issue unless you do a *lot* of very clever planning
ahead to be able to get multiple tape drives moving at the same time while
not causing issues at the receiving end either







Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-15 Thread Fengguang Wu
On Tue, Jan 15, 2008 at 10:42:13PM +0100, Ingo Molnar wrote:
 
> * Fengguang Wu [EMAIL PROTECTED] wrote:
>
> > On Mon, Jan 14, 2008 at 12:41:26PM +0100, Peter Zijlstra wrote:
> > >
> > > On Mon, 2008-01-14 at 12:30 +0100, Joerg Platte wrote:
> > > > On Monday, 14 January 2008, Fengguang Wu wrote:
> > > >
> > > > > Joerg, this patch fixed the bug for me :-)
> > > >
> > > > Fengguang, congratulations, I can confirm that your patch fixed the
> > > > bug! With previous kernels the bug showed up after each reboot. Now,
> > > > when booting the patched kernel everything is fine and there is no
> > > > longer any suspicious iowait!
> > > >
> > > > Do you have an idea why this problem appeared in 2.6.24? Did somebody
> > > > change the ext2 code or is it related to the changes in the scheduler?
> > >
> > > It was Fengguang who changed the inode writeback code, and I guess the
> > > new and improved code was less able to deal with these funny corner
> > > cases. But he has been very good in tracking them down and solving them,
> > > kudos to him for that work!
> >
> > Thank you.
> >
> > In particular the bug is triggered by the patch named:
> > writeback: introduce writeback_control.more_io to indicate more io
> > That patch means to speed up writeback, but unfortunately its
> > aggressiveness has disclosed bugs in reiserfs, jfs and now ext2.
> >
> > Linus, given the number of bugs it triggered, I'd recommend reverting
> > this patch (git commit 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b). Let's
> > push it back to the -mm tree for more testing?
>
> i dont think a revert at this stage is a good idea and i'm not sure
> pushing it back into -mm would really expose more of these bugs. And
> these are real bugs in filesystems - bugs which we want to see fixed
> anyway. You are also tracking down those bugs very fast.
>
> [ perhaps, if it's possible technically (and if it is clean enough), you
>   might want to offer a runtime debug tunable that can be used to switch
>   off the new aspects of your code. That would speed up testing, in case
>   anyone suspects the new writeback code. ]

The patch is too aggressive in itself, so we'd better not take the risk.
The iowait is only unpleasant, not destructive, but it will hurt if
many users complain. A comment says that nfs_writepages() sometimes
bails out without doing anything.

However, I now have an improved and safer patch. It won't iowait
when nfs_writepages() bails out without increasing pages_skipped, or
even when some buggy filesystem forgets to clear PAGECACHE_TAG_DIRTY.
(The magic lies in the first chunk below.)

Mike, you can use this one on 2.6.24.


---
 fs/fs-writeback.c         |   17 +++++++++++++++--
 include/linux/writeback.h |    1 +
 mm/page-writeback.c       |    9 ++++++---
 3 files changed, 22 insertions(+), 5 deletions(-)

--- linux.orig/fs/fs-writeback.c
+++ linux/fs/fs-writeback.c
@@ -284,7 +284,16 @@ __sync_single_inode(struct inode *inode,
 				 * soon as the queue becomes uncongested.
 				 */
 				inode->i_state |= I_DIRTY_PAGES;
-				requeue_io(inode);
+				if (wbc->nr_to_write <= 0)
+					/*
+					 * slice used up: queue for next turn
+					 */
+					requeue_io(inode);
+				else
+					/*
+					 * somehow blocked: retry later
+					 */
+					redirty_tail(inode);
 			} else {
 				/*
 				 * Otherwise fully redirty the inode so that
@@ -479,8 +488,12 @@ sync_sb_inodes(struct super_block *sb, s
 		iput(inode);
 		cond_resched();
 		spin_lock(&inode_lock);
-		if (wbc->nr_to_write <= 0)
+		if (wbc->nr_to_write <= 0) {
+			wbc->more_io = 1;
 			break;
+		}
+		if (!list_empty(&sb->s_more_io))
+			wbc->more_io = 1;
 	}
 	return;		/* Leave any unwritten inodes on s_io */
 }
--- linux.orig/include/linux/writeback.h
+++ linux/include/linux/writeback.h
@@ -62,6 +62,7 @@ struct writeback_control {
 	unsigned for_reclaim:1;		/* Invoked from the page allocator */
 	unsigned for_writepages:1;	/* This is a writepages() call */
 	unsigned range_cyclic:1;	/* range_start is cyclic */
+	unsigned more_io:1;		/* more io to be dispatched */
 };
 
 /*
--- linux.orig/mm/page-writeback.c
+++ linux/mm/page-writeback.c
@@ -558,6 +558,7 @@ static void background_writeout(unsigned
 			global_page_state(NR_UNSTABLE_NFS) < background_thresh
 				&& min_pages <= 0)
 			break;
+  
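
The decision made in the first two hunks can be restated outside the kernel
as a pair of small functions. This is only a sketch under stated assumptions:
enum requeue_action, after_partial_writeback() and more_io_pending() are
hypothetical names, and the real code calls requeue_io() or redirty_tail()
on the inode instead of returning a value.

```c
#include <assert.h>

/* Hypothetical labels for the two queueing outcomes in the first hunk. */
enum requeue_action {
	REQUEUE_IO,	/* corresponds to requeue_io(): next writeback turn */
	REDIRTY_TAIL,	/* corresponds to redirty_tail(): retry later */
};

/*
 * Decide what to do with an inode that still has dirty pages after a
 * writeback attempt, given the remaining page budget (nr_to_write).
 */
enum requeue_action after_partial_writeback(long nr_to_write)
{
	assert(nr_to_write > -1000000L);	/* sanity check for the sketch */
	if (nr_to_write <= 0)
		return REQUEUE_IO;	/* slice used up: queue for next turn */
	return REDIRTY_TAIL;		/* somehow blocked: retry later */
}

/*
 * Restates the second hunk: more_io is set when the budget ran out or
 * when the superblock's s_more_io list is not empty.
 */
int more_io_pending(long nr_to_write, int s_more_io_nonempty)
{
	return nr_to_write <= 0 || s_more_io_nonempty;
}
```

Read this way, an inode that still has budget left but made no progress is
assumed to be blocked for some other reason, so it is pushed back instead of
being retried immediately, which appears to be how the patch avoids spinning
in iowait.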

Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-15 Thread Fengguang Wu
On Tue, Jan 15, 2008 at 04:13:22PM -0500, Mike Snitzer wrote:
> On Jan 14, 2008 7:50 AM, Fengguang Wu [EMAIL PROTECTED] wrote:
> > On Mon, Jan 14, 2008 at 12:41:26PM +0100, Peter Zijlstra wrote:
> > >
> > > On Mon, 2008-01-14 at 12:30 +0100, Joerg Platte wrote:
> > > > On Monday, 14 January 2008, Fengguang Wu wrote:
> > > >
> > > > > Joerg, this patch fixed the bug for me :-)
> > > >
> > > > Fengguang, congratulations, I can confirm that your patch fixed the
> > > > bug! With previous kernels the bug showed up after each reboot. Now,
> > > > when booting the patched kernel everything is fine and there is no
> > > > longer any suspicious iowait!
> > > >
> > > > Do you have an idea why this problem appeared in 2.6.24? Did somebody
> > > > change the ext2 code or is it related to the changes in the scheduler?
> > >
> > > It was Fengguang who changed the inode writeback code, and I guess the
> > > new and improved code was less able to deal with these funny corner
> > > cases. But he has been very good in tracking them down and solving them,
> > > kudos to him for that work!
> >
> > Thank you.
> >
> > In particular the bug is triggered by the patch named:
> > writeback: introduce writeback_control.more_io to indicate more io
> > That patch means to speed up writeback, but unfortunately its
> > aggressiveness has disclosed bugs in reiserfs, jfs and now ext2.
> >
> > Linus, given the number of bugs it triggered, I'd recommend reverting
> > this patch (git commit 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b). Let's
> > push it back to the -mm tree for more testing?
>
> Fengguang,
>
> I'd like to better understand where your writeback work stands
> relative to 2.6.24-rcX and -mm.  To be clear, your changes in
> 2.6.24-rc7 have been benchmarked to provide a ~33% sequential write
> performance improvement with ext3 (as compared to 2.6.22, CFS could be
> helping, etc but...).  Very impressive!

Wow, glad to hear that.

> Given this improvement it is unfortunate to see your request to revert
> 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b but it is understandable if
> you're not confident in it for 2.6.24.
>
> That said, you recently posted an -mm patchset that first reverts
> 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b and then goes on to address
> the slow writes for concurrent large and small file writes bug:
> http://lkml.org/lkml/2008/1/15/132
>
> For those interested in using your writeback improvements in
> production sooner rather than later (primarily with ext3); what
> recommendations do you have?  Just heavily test our own 2.6.24 + your
> evolving close, but not ready for merge -mm writeback patchset?

It's not ready mainly because it is freshly made and needs more
feedback. It's doing OK on my desktop :-)
