Re: [PATCH] ext4: dir inode reservation V3

2007-11-13 Thread Alex Tomas
hmm. so you trade 265% degradation of creation for 40% improvement of unlink? thanks, Alex Coly Li wrote: normal ext4 ext4 with dir inode reservation mount options: -o data=writeback -o data=writeback,dir_ireserve=low

Re: [00/41] Large Blocksize Support V7 (adds memmap support)

2007-09-19 Thread Alex Tomas
On 9/19/07, David Chinner <[EMAIL PROTECTED]> wrote: > The problem is this: to alter the fundamental block size of the > filesystem we also need to alter the data block size and that is > exactly the piece that linux does not support right now. So while > we have the capability to use large block

Re: [PATCH] ext4:fix unexpected error from ext4_reserve_global

2007-06-19 Thread Alex Tomas
ACK, of course. thanks, Alex Mingming Cao wrote: On Thu, 2007-06-14 at 19:29 +0400, Dmitriy Monakhov wrote: I just can't believe my eyes when I saw this for the first time... simple test: strace dd if=/dev/zero of=/mnt/file Thanks for reporting it. open("/dev/zero", O_RDONLY) = 0

Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)

2007-05-04 Thread Alex Tomas
Andrew Morton wrote: I'm still not understanding. The terms you're using are a bit ambiguous. What does "find some dirty unallocated blocks" mean? Find a page which is dirty and which does not have a disk mapping? Normally the above operation would be implemented via

Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)

2007-05-04 Thread Alex Tomas
Andrew Morton wrote: On Fri, 04 May 2007 10:18:12 +0400 Alex Tomas <[EMAIL PROTECTED]> wrote: Andrew Morton wrote: Yes, there can be issues with needing to allocate journal space within the context of a commit. But no-no, this isn't required. we only need to mark pages/blocks

Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)

2007-05-04 Thread Alex Tomas
Andrew Morton wrote: Yes, there can be issues with needing to allocate journal space within the context of a commit. But no-no, this isn't required. we only need to mark pages/blocks within transaction, otherwise race is possible when we allocate blocks in transaction, then transaction starts

Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS is under heavy write load (massive starvation)

2007-05-03 Thread Alex Tomas
Andrew Morton wrote: We can make great improvements here, and I've (twice) previously described how: hoist the entire ordered-mode data handling out of ext3, and out of the buffer_head layer and move it up into the VFS pagecache layer. Basically, do ordered-data with a commit-time inode walk,

Re: 2.6.21-ext4-1

2007-04-30 Thread Alex Tomas
Theodore Ts'o wrote: P.S. One bug which I've noted --- if there is a failure due to disk filling up, running e2fsck on the filesystem will show that the i_blocks fields on the inodes where there was a failure to allocate disk blocks are left incorrect. I'm guessing this is a bug in the delayed

Re: O_DIRECT question

2007-01-17 Thread Alex Tomas
I think one problem with mmap/msync is that they can't maintain i_size atomically like regular write does. so, one needs to implement own i_size management in userspace. thanks, Alex > Side note: the only reason O_DIRECT exists is because database people are > too used to it, because other

Re: [PATCH] return ENOENT from ext3_link when racing with unlink

2007-01-16 Thread Alex Tomas
> Peter Staubach (PS) writes: PS> Just out of curiosity, what keeps i_nlink from going to 0 immediately PS> after the new test is executed? i_mutex in vfs_link() and vfs_unlink() thanks, Alex - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a

Re: [PATCH] [RFC] remove ext3 inode from orphan list when link and unlink race

2007-01-12 Thread Alex Tomas
> Eric Sandeen (ES) writes: ES> Al says "no" and I'm not arguing. :) ES> Apparently this may be OK with some filesystems, and Al says he doesn't ES> want to know about i_nlink in the vfs in any case. well, generic_drop_inode() uses i_nlink ... ES> But I suppose there may be other

Re: [PATCH] [RFC] remove ext3 inode from orphan list when link and unlink race

2007-01-12 Thread Alex Tomas
> Eric Sandeen (ES) writes: ES> I tend to agree, chatting w/ Al I think he does too. :) I'll test ES> a patch that kicks out ext3_link() with -ENOENT at the top, and resubmit ES> that if things go well. shouldn't VFS do that? thanks, Alex - To unsubscribe from this list: send the line
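
The check discussed in this thread (refuse to link an inode whose link count has already dropped to zero) can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct inode_model` and `link_model()` are hypothetical names, not kernel code, and the caller is assumed to hold whatever lock keeps `i_nlink` stable while the test runs (the thread names i_mutex in vfs_link() and vfs_unlink()).

```c
#include <errno.h>

/* Hypothetical stand-in for the one inode field this sketch needs. */
struct inode_model {
    unsigned int i_nlink;   /* number of directory entries pointing here */
};

/*
 * Model of "kick out link with -ENOENT at the top": if the last name was
 * already removed, the inode is on its way to deletion and must not be
 * linked back into the namespace.
 */
static int link_model(struct inode_model *inode)
{
    if (inode->i_nlink == 0)
        return -ENOENT;     /* lost the race with unlink */

    inode->i_nlink++;       /* normal case: one more name for the inode */
    return 0;
}
```

Whether the test belongs in the filesystem or in the VFS is exactly the question raised in the thread; the sketch takes no position on that.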

Re: [PATCH] [RFC] remove ext3 inode from orphan list when link and unlink race

2007-01-12 Thread Alex Tomas
> Eric Sandeen (ES) writes: ES> so I think it's possible that link can sneak in there & find it after ES> the mutex is dropped...? Is this ok? :) It's certainly -happening- ES> anyway yes, but it shouldn't allow to re-link such inode back, IMHO. a filesystem may start some

Re: [PATCH] [RFC] remove ext3 inode from orphan list when link and unlink race

2007-01-12 Thread Alex Tomas
. thanks, Alex >>>>> Alex Tomas (AT) writes: AT> interesting .. AT> I thought VFS doesn't allow concurrent operations. AT> if unlink goes first, then link should wait on the AT> parent's i_mutex and then found no source name. AT> thanks, Alex >>>>>

Re: [PATCH] [RFC] remove ext3 inode from orphan list when link and unlink race

2007-01-12 Thread Alex Tomas
interesting .. I thought VFS doesn't allow concurrent operations. if unlink goes first, then link should wait on the parent's i_mutex and then found no source name. thanks, Alex > Eric Sandeen (ES) writes: ES> ) ES> I've been looking at a case where many threads are opening, unlinking,

Re: [RFC] delayed allocation for ext4

2006-12-28 Thread Alex Tomas
> David Chinner (DC) writes: DC> So that means we'll have 2 separate mechanisms for marking DC> pages as delalloc. XFS uses the BH_delay flag to indicate DC> that a buffer (block) attached to the page is using delalloc. >> >> well, for blocksize=pagesize we can save 56 bytes on every

Re: [RFC] ext4-block-reservation.patch

2006-12-23 Thread Alex Tomas
Hi, > Andrew Morton (AM) writes: AM> Should be cacheline_aligned_in_smp. AM> That's assuming it needs to be cacheline aligned at all. It can consume a AM> lot of space. the idea is to make block reservation cheap because it's called for every page. AM> AM> oh, this should be

Re: [RFC] delayed allocation for ext4

2006-12-23 Thread Alex Tomas
> Christoph Hellwig (CH) writes: CH> Note that recording delayed alloc state at a page granularity in addition CH> to just the buffer heads has a lot of advantages as well and would help CH> xfs, too. But I think it makes a lot more sense to record it as a radix CH> tree tag to speed up

Re: [RFC] delayed allocation for ext4

2006-12-23 Thread Alex Tomas
Good day, > David Chinner (DC) writes: DC> So that means we'll have 2 separate mechanisms for marking DC> pages as delalloc. XFS uses the BH_delay flag to indicate DC> that a buffer (block) attached to the page is using delalloc. well, for blocksize=pagesize we can save 56 bytes on

[RFC] ext4-block-reservation.patch

2006-12-22 Thread Alex Tomas
Index: linux-2.6.20-rc1/include/linux/ext4_fs.h === --- linux-2.6.20-rc1.orig/include/linux/ext4_fs.h 2006-12-14 04:14:23.0 +0300 +++ linux-2.6.20-rc1/include/linux/ext4_fs.h2006-12-22 20:21:12.0 +0300 @@

[RFC] ext4-delayed-allocation.patch

2006-12-22 Thread Alex Tomas
/ext4/writeback.c2006-12-22 22:59:33.0 +0300 @@ -0,0 +1,1167 @@ +/* + * Copyright (c) 2003-2006, Cluster File Systems, Inc, [EMAIL PROTECTED] + * Written by Alex Tomas <[EMAIL PROTECTED]> + * + * This program is free software; you can redistribute it and/or modi

[RFC] booked-page-flag.patch

2006-12-22 Thread Alex Tomas
Index: linux-2.6.20-rc1/include/linux/page-flags.h === --- linux-2.6.20-rc1.orig/include/linux/page-flags.h2006-12-14 04:14:23.0 +0300 +++ linux-2.6.20-rc1/include/linux/page-flags.h 2006-12-22 20:05:31.0 +0300

[RFC] delayed allocation for ext4

2006-12-22 Thread Alex Tomas
Good day, probably the previous set of patches (including mballoc/lg) is too large. so, I reworked delayed allocation a bit so that it can be used on top of regular balloc, though it still can be used with extents-enabled files only. this time series contains just 3 patches: -

Re: Boot failure with ext2 and initrds

2006-11-16 Thread Alex Tomas
> Andrew Morton (AM) writes: AM> What lock protects the fields in struct ext[234]_reserve_window from being AM> concurrently modified by two CPUs? None, it seems. Ditto AM> ext[234]_reserve_window_node. i_mutex will cover it for write(), but not AM> for pageout over a file hole. If we

Re: [RFC] pdirops: vfs patch

2005-02-23 Thread Alex Tomas
> Jan Blunck (JB) writes: JB> Nope, d_alloc() is setting d_flags to DCACHE_UNHASHED. Therefore it is not found JB> by __d_lookup() until it is rehashed which is implicitly done by ->lookup(). that means we can have two processes allocated dentry for same name. they'll call ->lookup() each

Re: [RFC] pdirops: vfs patch

2005-02-22 Thread Alex Tomas
> Jan Blunck (JB) writes: JB> i_sem does NOT protect the dcache. Also not in real_lookup(). The lock must be JB> acquired for ->lookup() and because we might sleep on i_sem, we have to get it JB> early and check for repopulation of the dcache. dentry is part of dcache, right? i_sem

Re: [RFC] pdirops: vfs patch

2005-02-22 Thread Alex Tomas
> Jan Blunck (JB) writes: >> 1) i_sem protects dcache too JB> Where? i_sem is the per-inode lock, and shouldn't be used else. read comments in fs/namei.c:real_lookup() >> 2) tmpfs has no "own" data, so we can use it this way (see 2nd patch) >> 3) I have pdirops patch for ext3, but it

Re: [RFC] pdirops: vfs patch

2005-02-20 Thread Alex Tomas
> Jan Blunck (JB) writes: JB> With luck you have s_pdirops_size (or 1024) different renames altering JB> concurrently one directory inode. Therefore you need a lock protecting JB> your filesystem data. This is basically the job done by i_sem. So in JB> my opinion you only move "The

Re: [RFC] pdirops: tmpfs patch

2005-02-19 Thread Alex Tomas
Index: linux-2.6.10/mm/shmem.c === --- linux-2.6.10.orig/mm/shmem.c2005-01-28 19:32:16.0 +0300 +++ linux-2.6.10/mm/shmem.c 2005-02-19 20:05:32.642599576 +0300 @@ -1849,7 +1849,7 @@ #endif }; -static int

Re: [RFC] pdirops: vfs patch

2005-02-19 Thread Alex Tomas
fs/inode.c |1 fs/namei.c | 66 ++--- include/linux/fs.h | 11 3 files changed, 54 insertions(+), 24 deletions(-) Index: linux-2.6.10/fs/namei.c ===

[RFC] parallel directory operations

2005-02-19 Thread Alex Tomas
Good day Al and all could you review couple patches that implement $subj for vfs and tmpfs. In short the idea is that we can protect operations taking semaphore related for set of names. definitely, protection at vfs layer isn't enough and filesystem will need to protect their own structures by

Re: [Ext2-devel] Re: Latest ext3 patches (extents, mballoc, delayed allocation)

2005-02-15 Thread Alex Tomas
> Sonny Rao (SR) writes: SR> Alex, small buglet, If the FIBMAP-ioctl gets called on a file with SR> delayed allocation, you need to flush it (or at least allocate) before SR> returning the mappings. This doesn't seem to work properly at SR> present. good catch. thanks. - To

Re: Latest ext3 patches (extents, mballoc, delayed allocation)

2005-02-11 Thread Alex Tomas
Good day all, I've updated the patchset against 2.6.10. A bunch of bugs have been fixed and mballoc now behaves a bit smarter. The extents and mballoc patches collect some stats that they print upon umount. NOTE: they must not be used to store important data. A lot of things are to be done. Please

Re: [Ext2-devel] [PATCH] JBD: journal_release_buffer()

2005-01-30 Thread Alex Tomas
>>>>> Stephen C Tweedie (SCT) writes: SCT> Hi, SCT> On Tue, 2005-01-25 at 19:30, Alex Tomas wrote: >> >> journal_dirty_metadata(handle, bh) >> >> { >> >> transaction->t_reserved--; >> >> handle->h_buf

Re: [Ext2-devel] [PATCH] JBD: journal_release_buffer()

2005-01-25 Thread Alex Tomas
> Stephen C Tweedie (SCT) writes: >> journal_dirty_metadata(handle, bh) >> { >> transaction->t_reserved--; >> handle->h_buffer_credits--; >> if (jh->b_tcount > 0) { >> /* modified, no need to track it any more */ >> transaction-> t_outstanding_credits++; >>

Re: [Ext2-devel] [PATCH] JBD: journal_release_buffer()

2005-01-25 Thread Alex Tomas
Hi, could you review the following solution? t_outstanding_credits - number of _modified_ blocks in the transaction t_reserved - number of blocks all running handles reserved transaction size = t_outstanding_credits + t_reserved; #define TSIZE(t) ((t)->t_outstanding_credits +
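
The accounting proposed in this message can be illustrated with a small user-space model. It is a sketch, not the patch itself: the `*_model` structs and functions are hypothetical stand-ins for the JBD transaction, handle and journal head. The `counted` flag is a simplification of the `b_tcount` tracking quoted elsewhere in the thread. The behaviour of `handle_stop_model()` (returning unspent credits to the transaction) is an assumption, since the snippet is cut off before that part.

```c
/* Hypothetical, simplified stand-ins; not the kernel definitions. */
struct txn_model {
    int t_outstanding_credits;  /* blocks actually modified in the transaction */
    int t_reserved;             /* blocks reserved by running handles */
};

struct handle_model {
    struct txn_model *h_transaction;
    int h_buffer_credits;       /* credits this handle may still spend */
};

struct buf_model {
    int counted;                /* nonzero once counted as modified */
};

/* Transaction size as defined in the message. */
#define TSIZE(t) ((t)->t_outstanding_credits + (t)->t_reserved)

/* A new handle reserves its worst-case number of blocks up front. */
static void handle_start_model(struct txn_model *t, struct handle_model *h,
                               int nblocks)
{
    h->h_transaction = t;
    h->h_buffer_credits = nblocks;
    t->t_reserved += nblocks;
}

/* Dirtying a buffer spends one reserved credit; the block is added to the
 * modified count only the first time it is dirtied in this transaction. */
static void dirty_metadata_model(struct handle_model *h, struct buf_model *b)
{
    struct txn_model *t = h->h_transaction;

    t->t_reserved--;
    h->h_buffer_credits--;
    if (!b->counted) {
        t->t_outstanding_credits++;
        b->counted = 1;
    }
}

/* Assumption: stopping a handle gives back whatever it did not spend. */
static void handle_stop_model(struct handle_model *h)
{
    h->h_transaction->t_reserved -= h->h_buffer_credits;
    h->h_buffer_credits = 0;
}
```

In this model a handle that reserves 136 blocks but dirties a single buffer leaves a transaction size of 1 after it stops, which is the kind of gap between reserved and logged blocks the thread is concerned with.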

Re: [Ext2-devel] [PATCH] JBD: journal_release_buffer()

2005-01-24 Thread Alex Tomas
> Stephen C Tweedie (SCT) writes: >> + /* return credit back to the handle if it was really spent */ >> + if (credits) { >> + handle->h_buffer_credits++; >> + spin_lock(&handle->h_transaction->t_handle_lock); >> +

Re: [Ext2-devel] [PATCH] JBD: log space management optimization

2005-01-24 Thread Alex Tomas
is expensive and correct reservation allows us to avoid needless commits. here is the patch. tested on UP. Signed-off-by: Alex Tomas <[EMAIL PROTECTED]> Index: linux-2.6.7/fs/jbd/transaction.c === --- linux-2.6.7.orig/fs/jbd/transa

Re: [Ext2-devel] [PATCH] JBD: journal_release_buffer()

2005-01-24 Thread Alex Tomas
> Stephen C Tweedie (SCT) writes: >> + /* return credit back to the handle if it was really spent */ >> + if (credits) >> + handle->h_buffer_credits++; >> + jh->b_tcount--; >> + if (jh->b_tcount == 0) { >> + /* >> +* this was last reference to

Re: [Ext2-devel] [PATCH] JBD: fix against journal overflow

2005-01-24 Thread Alex Tomas
> Stephen C Tweedie (SCT) writes: SCT> /* SCT>* Be pessimistic here about the number of those free blocks which SCT>* might be required for log descriptor control blocks. SCT>*/ SCT> ... SCT> left -= (left >> 3); oops. i overlooked this line. so, the fix becomes minor
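
The quoted line `left -= (left >> 3);` holds back one eighth of the free blocks as a pessimistic allowance for descriptor blocks. A minimal sketch of just that arithmetic follows; `usable_log_blocks()` is a hypothetical helper name and omits any other adjustments the real function may make.

```c
/*
 * Given the number of free log blocks, return how many may be promised
 * to transactions after holding back 1/8 (rounded down) for log
 * descriptor blocks.
 */
static int usable_log_blocks(int free_blocks)
{
    int left = free_blocks;

    left -= (left >> 3);    /* reserve left/8 blocks for descriptors */
    return left;
}
```

With 1024 free blocks this leaves 896 usable; below 8 free blocks the shift yields zero and nothing is held back.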

Re: [Ext2-devel] [PATCH] JBD: fix against journal overflow

2005-01-24 Thread Alex Tomas
> Stephen C Tweedie (SCT) writes: SCT> I don't see how that "limit" is relevant here. wbuf is nothing but the SCT> size of the IO batches we pass to ll_rw_block() during that commit SCT> phase. j_free affects the total size of space the *entire* commit has SCT> to run into, and (as akpm

[PATCH] JBD: log space management optimization

2005-01-19 Thread Alex Tomas
ion. for example, removal of 500MB file reserves 136 blocks, but only 10 blocks go to the log. a commit is expensive and correct reservation allows us to avoid needless commits. here is the patch. tested on UP. thanks, Alex Signed-off-by: Alex Tomas <[EMAIL PROTECTED]> Index: linux-2.6

[PATCH] JBD: journal_release_buffer()

2005-01-19 Thread Alex Tomas
. Signed-off-by: Alex Tomas <[EMAIL PROTECTED]> Index: linux-2.6.7/include/linux/journal-head.h === --- linux-2.6.7.orig/include/linux/journal-head.h 2003-06-24 18:05:26.0 +0400 +++ linux-2.6.7/include/linux/journal-

[PATCH] JBD: fix against journal overflow

2005-01-19 Thread Alex Tomas
y descriptor blocks because static array wbuf can hold 64 blocks only. The fix is to have persistent array big enough to hold max. possible blocks. Signed-off-by: Alex Tomas <[EMAIL PROTECTED]> Index: linux-2.6.7/include/linux/jbd.h ===
