Re: [PATCH] Ext4: Uninitialized Block Groups
Hi Avantika, I ran some tests with the uninit_groups feature enabled and got error messages when running e2fsck on my ext4 partition. e2fsck complains of an invalid unused inodes count in some group descriptors. These errors occur when checking groups which have only one inode in use. The free inodes count has been decremented by one in these groups but not the unused inodes count. The following patch fixes the problem. Andreas, could you check if my patch is correct? Thanks a lot, Valérie Index: linux-2.6.23-rc6/fs/ext4/ialloc.c === --- linux-2.6.23-rc6.orig/fs/ext4/ialloc.c 2007-09-19 11:31:01.0 +0200 +++ linux-2.6.23-rc6/fs/ext4/ialloc.c 2007-09-19 11:31:41.0 +0200 @@ -633,13 +633,10 @@ got: /* If we didn't allocate from within the initialized part of the inode * table then we need to initialize up to this inode. */ if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) { - if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { + if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); - free = EXT4_INODES_PER_GROUP(sb); - } else { - free = EXT4_INODES_PER_GROUP(sb) - + free = EXT4_INODES_PER_GROUP(sb) - le16_to_cpu(gdp-bg_itable_unused); - } if (ino free) gdp-bg_itable_unused = - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Ext4: Uninitialized Block Groups
Valerie Clement wrote: Hi Avantika, I ran some tests with the uninit_groups feature enabled and got error messages when running e2fsck on my ext4 partition. e2fsck complains of an invalid unused inodes count in some group descriptors. These errors occur when checking groups which have only one inode in use. The free inodes count has been decremented by one in these groups but not the unused inodes count. The following patch fixes the problem. Andreas, could you check if my patch is correct? Thanks a lot, Valérie Index: linux-2.6.23-rc6/fs/ext4/ialloc.c === --- linux-2.6.23-rc6.orig/fs/ext4/ialloc.c 2007-09-19 11:31:01.0 +0200 +++ linux-2.6.23-rc6/fs/ext4/ialloc.c 2007-09-19 11:31:41.0 +0200 @@ -633,13 +633,10 @@ got: /* If we didn't allocate from within the initialized part of the inode * table then we need to initialize up to this inode. */ if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) { - if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { + if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); - free = EXT4_INODES_PER_GROUP(sb); - } else { - free = EXT4_INODES_PER_GROUP(sb) - + free = EXT4_INODES_PER_GROUP(sb) - le16_to_cpu(gdp-bg_itable_unused); - } the variable free is confusingly named here. It is not the free inode count. rather it indicate the last used relative inode number in the group. How about the below ? -aneesh diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 4250c02..cfe2e09 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -633,17 +633,21 @@ got: /* If we didn't allocate from within the initialized part of the inode * table then we need to initialize up to this inode. */ if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) { - if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { + if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); - free = EXT4_INODES_PER_GROUP(sb); - } else { - free = EXT4_INODES_PER_GROUP(sb) - - le16_to_cpu(gdp-bg_itable_unused); - } - if (ino free) + /* +* Check the relative inode number against the last used +* relative inode number in this group. if it is greater +* we need to update the bg_itable_unused count +* +*/ + if (ino (EXT4_INODES_PER_GROUP(sb) - + le16_to_cpu(gdp-bg_itable_unused))) { + gdp-bg_itable_unused = cpu_to_le16(EXT4_INODES_PER_GROUP(sb) - ino); + } } gdp-bg_free_inodes_count = - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Ext4: Uninitialized Block Groups
Aneesh Kumar K.V wrote: the variable free is confusingly named here. It is not the free inode count. rather it indicate the last used relative inode number in the group. How about the below ? -aneesh I agree, it's better. Valérie diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 4250c02..cfe2e09 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -633,17 +633,21 @@ got: /* If we didn't allocate from within the initialized part of the inode * table then we need to initialize up to this inode. */ if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) { -if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { +if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); -free = EXT4_INODES_PER_GROUP(sb); -} else { -free = EXT4_INODES_PER_GROUP(sb) - -le16_to_cpu(gdp-bg_itable_unused); -} -if (ino free) +/* + * Check the relative inode number against the last used + * relative inode number in this group. if it is greater + * we need to update the bg_itable_unused count + * + */ +if (ino (EXT4_INODES_PER_GROUP(sb) - +le16_to_cpu(gdp-bg_itable_unused))) { + gdp-bg_itable_unused = cpu_to_le16(EXT4_INODES_PER_GROUP(sb) - ino); +} } gdp-bg_free_inodes_count = - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Enabling h-trees too early?
Hi, I was just wondering: Currently we start to build h-tree in a directory already when the size of directory exceeds one block. But honestly, it does not seem to make much sence to use this feature until the directory is much larger (I'd say at least 16 or 32 KB). It actually slows down some operations like deleting the whole directory, etc. So what is the reason for starting building the tree so early? Just the simplicity of building it when the directory is just one block large? Honza -- Jan Kara [EMAIL PROTECTED] SUSE Labs, CR - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Ext4: Uninitialized Block Groups
On Sep 19, 2007 14:06 +0200, Valerie Clement wrote: I ran some tests with the uninit_groups feature enabled and got error messages when running e2fsck on my ext4 partition. e2fsck complains of an invalid unused inodes count in some group descriptors. These errors occur when checking groups which have only one inode in use. The free inodes count has been decremented by one in these groups but not the unused inodes count. Index: linux-2.6.23-rc6/fs/ext4/ialloc.c === --- linux-2.6.23-rc6.orig/fs/ext4/ialloc.c2007-09-19 11:31:01.0 +0200 +++ linux-2.6.23-rc6/fs/ext4/ialloc.c 2007-09-19 11:31:41.0 +0200 @@ -633,13 +633,10 @@ got: /* If we didn't allocate from within the initialized part of the inode * table then we need to initialize up to this inode. */ if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) { - if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { + if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); - free = EXT4_INODES_PER_GROUP(sb); - } else { - free = EXT4_INODES_PER_GROUP(sb) - + free = EXT4_INODES_PER_GROUP(sb) - le16_to_cpu(gdp-bg_itable_unused); - } Hmm, this is indeed incorrect, but I'm not sure solution is the right one. I guess in our testing we ran it for a long time and must have created more than a single inode per group... What about the following instead? I think the assumption in the original code is that it's a new group, all the inodes are free, but that is not correct - we want to make NONE of the inodes free initially so that bt_itable_unused is recalculated below: if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); - free = EXT4_INODES_PER_GROUP(sb); + free = 0; } else { free = EXT4_INODES_PER_GROUP(sb) - le16_to_cpu(gdp-bg_itable_unused); } if (ino free) gdp-bg_itable_unused = cpu_to_le16(EXT3_INODES_PER_GROUP(sb) - ino); Still a bit uneasy about off-by-one errors here though. Is ino 0 or 1 for the first inode in the group. We might need to have a -1 in the bg_itable_unused calculation still. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Enabling h-trees too early?
On Sep 19, 2007 17:07 +0200, Jan Kara wrote: I was just wondering: Currently we start to build h-tree in a directory already when the size of directory exceeds one block. But honestly, it does not seem to make much sence to use this feature until the directory is much larger (I'd say at least 16 or 32 KB). It actually slows down some operations like deleting the whole directory, etc. So what is the reason for starting building the tree so early? Just the simplicity of building it when the directory is just one block large? Yes, doing it at one block is easy, doing it with more blocks is complex. At the time we added this feature, tests showed no performance difference between 2 unhashed blocks and 2 hashed blocks so the easier code was used. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Enabling h-trees too early?
On Wed, Sep 19, 2007 at 05:07:15PM +0200, Jan Kara wrote: I was just wondering: Currently we start to build h-tree in a directory already when the size of directory exceeds one block. But honestly, it does not seem to make much sence to use this feature until the directory is much larger (I'd say at least 16 or 32 KB). It actually slows down some operations like deleting the whole directory, etc. So what is the reason for starting building the tree so early? Just the simplicity of building it when the directory is just one block large? How much is it slowing down operations such as rm -rf? For a small directory ( 32k), I would assume that the difference would be relatively small. What sort of differences have you measured, and is this a common case problem? Certainly one of the things that we could consider is for small directories to do an in-memory sort of all of the directory entries at opendir() time, and keeping that list until it is closed. We can't do this for really big directories, but we could easily do it for directories under 32k or 64k. - Ted - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] JBD slab cleanups
On Tue, 2007-09-18 at 19:19 -0700, Andrew Morton wrote: On Tue, 18 Sep 2007 18:00:01 -0700 Mingming Cao [EMAIL PROTECTED] wrote: JBD: Replace slab allocations with page cache allocations JBD allocate memory for committed_data and frozen_data from slab. However JBD should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset. Also this patch cleans up jbd_kmalloc and replace it with kmalloc directly __GFP_NOFAIL should only be used when we have no way of recovering from failure. The allocation in journal_init_common() (at least) _can_ recover and hence really shouldn't be using __GFP_NOFAIL. (Actually, nothing in the kernel should be using __GFP_NOFAIL. It is there as a marker which says we really shouldn't be doing this but we don't know how to fix it). So sometime it'd be good if you could review all the __GFP_NOFAILs in there and see if we can remove some, thanks. Here is the patch to clean up __GFP_NOFAIL flag in jbd/jbd2. In all cases except one handles memory allocation failure so I get rid of those GFP_NOFAIL flags. Also, shouldn't we use GFP_KERNEL instead of GFP_NOFS flag for kmalloc in jbd/jbd2? I will send a separate patch to cleanup that. Signed-off-by: Mingming Cao [EMAIL PROTECTED] --- fs/jbd/journal.c |2 +- fs/jbd/transaction.c |3 +-- fs/jbd2/journal.c |2 +- fs/jbd2/transaction.c |3 +-- 4 files changed, 4 insertions(+), 6 deletions(-) Index: linux-2.6.23-rc6/fs/jbd/journal.c === --- linux-2.6.23-rc6.orig/fs/jbd/journal.c 2007-09-19 11:47:58.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/journal.c 2007-09-19 11:48:40.0 -0700 @@ -653,7 +653,7 @@ static journal_t * journal_init_common ( journal_t *journal; int err; - journal = kmalloc(sizeof(*journal), GFP_KERNEL|__GFP_NOFAIL); + journal = kmalloc(sizeof(*journal), GFP_KERNEL); if (!journal) goto fail; memset(journal, 0, sizeof(*journal)); Index: linux-2.6.23-rc6/fs/jbd/transaction.c === --- linux-2.6.23-rc6.orig/fs/jbd/transaction.c 2007-09-19 11:48:05.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/transaction.c 2007-09-19 11:49:10.0 -0700 @@ -96,8 +96,7 @@ static int start_this_handle(journal_t * alloc_transaction: if (!journal-j_running_transaction) { - new_transaction = kmalloc(sizeof(*new_transaction), - GFP_NOFS|__GFP_NOFAIL); + new_transaction = kmalloc(sizeof(*new_transaction), GFP_NOFS); if (!new_transaction) { ret = -ENOMEM; goto out; Index: linux-2.6.23-rc6/fs/jbd2/journal.c === --- linux-2.6.23-rc6.orig/fs/jbd2/journal.c 2007-09-19 11:48:14.0 -0700 +++ linux-2.6.23-rc6/fs/jbd2/journal.c 2007-09-19 11:49:46.0 -0700 @@ -654,7 +654,7 @@ static journal_t * journal_init_common ( journal_t *journal; int err; - journal = kmalloc(sizeof(*journal), GFP_KERNEL|__GFP_NOFAIL); + journal = kmalloc(sizeof(*journal), GFP_KERNEL); if (!journal) goto fail; memset(journal, 0, sizeof(*journal)); Index: linux-2.6.23-rc6/fs/jbd2/transaction.c === --- linux-2.6.23-rc6.orig/fs/jbd2/transaction.c 2007-09-19 11:48:08.0 -0700 +++ linux-2.6.23-rc6/fs/jbd2/transaction.c 2007-09-19 11:50:12.0 -0700 @@ -96,8 +96,7 @@ static int start_this_handle(journal_t * alloc_transaction: if (!journal-j_running_transaction) { - new_transaction = kmalloc(sizeof(*new_transaction), - GFP_NOFS|__GFP_NOFAIL); + new_transaction = kmalloc(sizeof(*new_transaction), GFP_NOFS); if (!new_transaction) { ret = -ENOMEM; goto out; - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] JBD: use GFP_NOFS in kmalloc
Convert the GFP_KERNEL flag used in JBD/JBD2 to GFP_NOFS, consistent with the rest of kmalloc flag used in the JBD/JBD2 layer. Signed-off-by: Mingming Cao [EMAIL PROTECTED] --- fs/jbd/journal.c |6 +++--- fs/jbd/revoke.c |8 fs/jbd2/journal.c |6 +++--- fs/jbd2/revoke.c |8 4 files changed, 14 insertions(+), 14 deletions(-) Index: linux-2.6.23-rc6/fs/jbd/journal.c === --- linux-2.6.23-rc6.orig/fs/jbd/journal.c 2007-09-19 11:51:10.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/journal.c 2007-09-19 11:51:57.0 -0700 @@ -653,7 +653,7 @@ static journal_t * journal_init_common ( journal_t *journal; int err; - journal = kmalloc(sizeof(*journal), GFP_KERNEL); + journal = kmalloc(sizeof(*journal), GFP_NOFS); if (!journal) goto fail; memset(journal, 0, sizeof(*journal)); @@ -723,7 +723,7 @@ journal_t * journal_init_dev(struct bloc journal-j_blocksize = blocksize; n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); if (!journal-j_wbuf) { printk(KERN_ERR %s: Cant allocate bhs for commit thread\n, __FUNCTION__); @@ -777,7 +777,7 @@ journal_t * journal_init_inode (struct i /* journal descriptor can store up to n blocks -bzzz */ n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); if (!journal-j_wbuf) { printk(KERN_ERR %s: Cant allocate bhs for commit thread\n, __FUNCTION__); Index: linux-2.6.23-rc6/fs/jbd/revoke.c === --- linux-2.6.23-rc6.orig/fs/jbd/revoke.c 2007-09-19 11:51:30.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/revoke.c2007-09-19 11:52:34.0 -0700 @@ -206,7 +206,7 @@ int journal_init_revoke(journal_t *journ while((tmp = 1UL) != 0UL) shift++; - journal-j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL); + journal-j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_NOFS); if (!journal-j_revoke_table[0]) return -ENOMEM; journal-j_revoke = journal-j_revoke_table[0]; @@ -219,7 +219,7 @@ int journal_init_revoke(journal_t *journ journal-j_revoke-hash_shift = shift; journal-j_revoke-hash_table = - kmalloc(hash_size * sizeof(struct list_head), GFP_KERNEL); + kmalloc(hash_size * sizeof(struct list_head), GFP_NOFS); if (!journal-j_revoke-hash_table) { kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); journal-j_revoke = NULL; @@ -229,7 +229,7 @@ int journal_init_revoke(journal_t *journ for (tmp = 0; tmp hash_size; tmp++) INIT_LIST_HEAD(journal-j_revoke-hash_table[tmp]); - journal-j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL); + journal-j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_NOFS); if (!journal-j_revoke_table[1]) { kfree(journal-j_revoke_table[0]-hash_table); kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); @@ -246,7 +246,7 @@ int journal_init_revoke(journal_t *journ journal-j_revoke-hash_shift = shift; journal-j_revoke-hash_table = - kmalloc(hash_size * sizeof(struct list_head), GFP_KERNEL); + kmalloc(hash_size * sizeof(struct list_head), GFP_NOFS); if (!journal-j_revoke-hash_table) { kfree(journal-j_revoke_table[0]-hash_table); kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); Index: linux-2.6.23-rc6/fs/jbd2/journal.c === --- linux-2.6.23-rc6.orig/fs/jbd2/journal.c 2007-09-19 11:52:48.0 -0700 +++ linux-2.6.23-rc6/fs/jbd2/journal.c 2007-09-19 11:53:12.0 -0700 @@ -654,7 +654,7 @@ static journal_t * journal_init_common ( journal_t *journal; int err; - journal = kmalloc(sizeof(*journal), GFP_KERNEL); + journal = kmalloc(sizeof(*journal), GFP_NOFS); if (!journal) goto fail; memset(journal, 0, sizeof(*journal)); @@ -724,7 +724,7 @@ journal_t * jbd2_journal_init_dev(struct journal-j_blocksize = blocksize; n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); +
Re: [PATCH] JBD slab cleanups
On Wed, 2007-09-19 at 12:15 -0700, Mingming Cao wrote: Here is the patch to clean up __GFP_NOFAIL flag in jbd/jbd2. In all cases except one handles memory allocation failure so I get rid of those GFP_NOFAIL flags. Also, shouldn't we use GFP_KERNEL instead of GFP_NOFS flag for kmalloc in jbd/jbd2? I will send a separate patch to cleanup that. No. GFP_NOFS avoids deadlock. It prevents the allocation from making recursive calls back into the file system that could end up blocking on jbd code. Shaggy -- David Kleikamp IBM Linux Technology Center - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] JBD slab cleanups
On Sep 19, 2007 12:15 -0700, Mingming Cao wrote: @@ -96,8 +96,7 @@ static int start_this_handle(journal_t * alloc_transaction: if (!journal-j_running_transaction) { - new_transaction = kmalloc(sizeof(*new_transaction), - GFP_NOFS|__GFP_NOFAIL); + new_transaction = kmalloc(sizeof(*new_transaction), GFP_NOFS); This should probably be a __GFP_NOFAIL if we are trying to start a new handle in truncate, as there is no way to propagate an error to the caller. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH e2fsprogs] - ignore bind mounts in fsck
An entry like this in /etc/fstab: /foo/barext3bind,defaults 1 3 will stop boot, as fsck.ext3 tries to check it and fails: e2fsck 1.40.2 (12-Jul-2007) fsck.ext3: Is a directory while trying to open /foo The superblock could not be read or does not describe a correct ext2 filesystem. ... Granted, asking for fsck of a bind mount in the fstab is a bit odd, but it doesn't seem like it should stop the boot process if you make this mistake. Signed-off-by: Eric Sandeen [EMAIL PROTECTED] Index: e2fsprogs-1.40.2/misc/fsck.c === --- e2fsprogs-1.40.2.orig/misc/fsck.c +++ e2fsprogs-1.40.2/misc/fsck.c @@ -867,6 +867,12 @@ static int ignore(struct fs_info *fs) if (fs-passno == 0) return 1; + /* +* If this is a bind mount, ignore it. +*/ + if (opt_in_list(bind, fs-opts)) + return 1; + interpret_type(fs); /* - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] JBD slab cleanups
On Wed, 2007-09-19 at 19:28 +, Dave Kleikamp wrote: On Wed, 2007-09-19 at 14:26 -0500, Dave Kleikamp wrote: On Wed, 2007-09-19 at 12:15 -0700, Mingming Cao wrote: Here is the patch to clean up __GFP_NOFAIL flag in jbd/jbd2. In all cases except one handles memory allocation failure so I get rid of those GFP_NOFAIL flags. Also, shouldn't we use GFP_KERNEL instead of GFP_NOFS flag for kmalloc in jbd/jbd2? I will send a separate patch to cleanup that. No. GFP_NOFS avoids deadlock. It prevents the allocation from making recursive calls back into the file system that could end up blocking on jbd code. Oh, I see your patch now. You mean use GFP_NOFS instead of GFP_KERNEL. :-) OK then. oops, I did mean what you say here.:-) Shaggy - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] JBD: use GFP_NOFS in kmalloc
On Wed, 19 Sep 2007 12:22:09 -0700 Mingming Cao [EMAIL PROTECTED] wrote: Convert the GFP_KERNEL flag used in JBD/JBD2 to GFP_NOFS, consistent with the rest of kmalloc flag used in the JBD/JBD2 layer. Signed-off-by: Mingming Cao [EMAIL PROTECTED] --- fs/jbd/journal.c |6 +++--- fs/jbd/revoke.c |8 fs/jbd2/journal.c |6 +++--- fs/jbd2/revoke.c |8 4 files changed, 14 insertions(+), 14 deletions(-) Index: linux-2.6.23-rc6/fs/jbd/journal.c === --- linux-2.6.23-rc6.orig/fs/jbd/journal.c2007-09-19 11:51:10.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/journal.c 2007-09-19 11:51:57.0 -0700 @@ -653,7 +653,7 @@ static journal_t * journal_init_common ( journal_t *journal; int err; - journal = kmalloc(sizeof(*journal), GFP_KERNEL); + journal = kmalloc(sizeof(*journal), GFP_NOFS); if (!journal) goto fail; memset(journal, 0, sizeof(*journal)); @@ -723,7 +723,7 @@ journal_t * journal_init_dev(struct bloc journal-j_blocksize = blocksize; n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); if (!journal-j_wbuf) { printk(KERN_ERR %s: Cant allocate bhs for commit thread\n, __FUNCTION__); @@ -777,7 +777,7 @@ journal_t * journal_init_inode (struct i /* journal descriptor can store up to n blocks -bzzz */ n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); if (!journal-j_wbuf) { printk(KERN_ERR %s: Cant allocate bhs for commit thread\n, __FUNCTION__); Index: linux-2.6.23-rc6/fs/jbd/revoke.c === --- linux-2.6.23-rc6.orig/fs/jbd/revoke.c 2007-09-19 11:51:30.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/revoke.c 2007-09-19 11:52:34.0 -0700 @@ -206,7 +206,7 @@ int journal_init_revoke(journal_t *journ while((tmp = 1UL) != 0UL) shift++; - journal-j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL); + journal-j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_NOFS); if (!journal-j_revoke_table[0]) return -ENOMEM; journal-j_revoke = journal-j_revoke_table[0]; @@ -219,7 +219,7 @@ int journal_init_revoke(journal_t *journ journal-j_revoke-hash_shift = shift; journal-j_revoke-hash_table = - kmalloc(hash_size * sizeof(struct list_head), GFP_KERNEL); + kmalloc(hash_size * sizeof(struct list_head), GFP_NOFS); if (!journal-j_revoke-hash_table) { kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); journal-j_revoke = NULL; @@ -229,7 +229,7 @@ int journal_init_revoke(journal_t *journ for (tmp = 0; tmp hash_size; tmp++) INIT_LIST_HEAD(journal-j_revoke-hash_table[tmp]); - journal-j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL); + journal-j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_NOFS); if (!journal-j_revoke_table[1]) { kfree(journal-j_revoke_table[0]-hash_table); kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); @@ -246,7 +246,7 @@ int journal_init_revoke(journal_t *journ journal-j_revoke-hash_shift = shift; journal-j_revoke-hash_table = - kmalloc(hash_size * sizeof(struct list_head), GFP_KERNEL); + kmalloc(hash_size * sizeof(struct list_head), GFP_NOFS); if (!journal-j_revoke-hash_table) { kfree(journal-j_revoke_table[0]-hash_table); kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); These were all OK using GFP_KERNEL. GFP_NOFS should only be used when the caller is holding some fs locks which might cause a deadlock if that caller reentered the fs in -writepage (and maybe put_inode and such). That isn't the case in any of the above code, which is all mount time stuff (I think). ext3/4 should be using GFP_NOFS when the caller has a transaction open, has a page locked, is holding i_mutex, etc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] JBD: use GFP_NOFS in kmalloc
On Wed, 2007-09-19 at 14:34 -0700, Andrew Morton wrote: On Wed, 19 Sep 2007 12:22:09 -0700 Mingming Cao [EMAIL PROTECTED] wrote: Convert the GFP_KERNEL flag used in JBD/JBD2 to GFP_NOFS, consistent with the rest of kmalloc flag used in the JBD/JBD2 layer. Signed-off-by: Mingming Cao [EMAIL PROTECTED] --- fs/jbd/journal.c |6 +++--- fs/jbd/revoke.c |8 fs/jbd2/journal.c |6 +++--- fs/jbd2/revoke.c |8 4 files changed, 14 insertions(+), 14 deletions(-) Index: linux-2.6.23-rc6/fs/jbd/journal.c === --- linux-2.6.23-rc6.orig/fs/jbd/journal.c 2007-09-19 11:51:10.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/journal.c 2007-09-19 11:51:57.0 -0700 @@ -653,7 +653,7 @@ static journal_t * journal_init_common ( journal_t *journal; int err; - journal = kmalloc(sizeof(*journal), GFP_KERNEL); + journal = kmalloc(sizeof(*journal), GFP_NOFS); if (!journal) goto fail; memset(journal, 0, sizeof(*journal)); @@ -723,7 +723,7 @@ journal_t * journal_init_dev(struct bloc journal-j_blocksize = blocksize; n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); if (!journal-j_wbuf) { printk(KERN_ERR %s: Cant allocate bhs for commit thread\n, __FUNCTION__); @@ -777,7 +777,7 @@ journal_t * journal_init_inode (struct i /* journal descriptor can store up to n blocks -bzzz */ n = journal-j_blocksize / sizeof(journal_block_tag_t); journal-j_wbufsize = n; - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); if (!journal-j_wbuf) { printk(KERN_ERR %s: Cant allocate bhs for commit thread\n, __FUNCTION__); Index: linux-2.6.23-rc6/fs/jbd/revoke.c === --- linux-2.6.23-rc6.orig/fs/jbd/revoke.c 2007-09-19 11:51:30.0 -0700 +++ linux-2.6.23-rc6/fs/jbd/revoke.c2007-09-19 11:52:34.0 -0700 @@ -206,7 +206,7 @@ int journal_init_revoke(journal_t *journ while((tmp = 1UL) != 0UL) shift++; - journal-j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL); + journal-j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_NOFS); if (!journal-j_revoke_table[0]) return -ENOMEM; journal-j_revoke = journal-j_revoke_table[0]; @@ -219,7 +219,7 @@ int journal_init_revoke(journal_t *journ journal-j_revoke-hash_shift = shift; journal-j_revoke-hash_table = - kmalloc(hash_size * sizeof(struct list_head), GFP_KERNEL); + kmalloc(hash_size * sizeof(struct list_head), GFP_NOFS); if (!journal-j_revoke-hash_table) { kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); journal-j_revoke = NULL; @@ -229,7 +229,7 @@ int journal_init_revoke(journal_t *journ for (tmp = 0; tmp hash_size; tmp++) INIT_LIST_HEAD(journal-j_revoke-hash_table[tmp]); - journal-j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL); + journal-j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_NOFS); if (!journal-j_revoke_table[1]) { kfree(journal-j_revoke_table[0]-hash_table); kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); @@ -246,7 +246,7 @@ int journal_init_revoke(journal_t *journ journal-j_revoke-hash_shift = shift; journal-j_revoke-hash_table = - kmalloc(hash_size * sizeof(struct list_head), GFP_KERNEL); + kmalloc(hash_size * sizeof(struct list_head), GFP_NOFS); if (!journal-j_revoke-hash_table) { kfree(journal-j_revoke_table[0]-hash_table); kmem_cache_free(revoke_table_cache, journal-j_revoke_table[0]); These were all OK using GFP_KERNEL. GFP_NOFS should only be used when the caller is holding some fs locks which might cause a deadlock if that caller reentered the fs in -writepage (and maybe put_inode and such). That isn't the case in any of the above code, which is all mount time stuff (I think). You are right they are all occur at initialization time. ext3/4 should be using GFP_NOFS when the caller has a transaction open, has a page locked, is holding i_mutex, etc. Thanks for your feedback. Mingming - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Ext4: Uninitialized Block Groups
Andreas Dilger wrote: On Sep 18, 2007 20:03 -0700, Andrew Morton wrote: On Tue, 18 Sep 2007 17:25:31 -0700 Avantika Mathur [EMAIL PROTECTED] wrote: +#if !defined(CONFIG_CRC16) +/** CRC table for the CRC-16. The poly is 0x8005 (x16 + x15 + x2 + 1) */ +__u16 const crc16_table[256] = { me + 0x, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241, That's rather sad. A plain old depends on would be better. My bad. We wrote this patch and it had to run on older kernels that might not even have lib/crc16.c (it was added around 2.6.15 or so, so e.g. RHEL4 doesn't have it). I forgot to remove it from the upstream submission, and since it didn't cause problems nobody else complained... The incremental patch below removes the local crc16 code, and also resolves an issue with properly updating bg_itable_unused when an inode is allocated in an unused block groups. Thanks Avantika --- fs/Kconfig |1 + fs/ext4/ialloc.c |8 +++- fs/ext4/super.c | 51 +-- 3 files changed, 9 insertions(+), 51 deletions(-) Index: linux-2.6.23-rc6/fs/ext4/ialloc.c === --- linux-2.6.23-rc6.orig/fs/ext4/ialloc.c 2007-09-19 15:38:21.0 -0700 +++ linux-2.6.23-rc6/fs/ext4/ialloc.c 2007-09-19 15:41:11.0 -0700 @@ -635,12 +635,18 @@ if (EXT4_HAS_RO_COMPAT_FEATURE(sb, EXT4_FEATURE_RO_COMPAT_GDT_CSUM)) { if (gdp-bg_flags cpu_to_le16(EXT4_BG_INODE_UNINIT)) { gdp-bg_flags = cpu_to_le16(~EXT4_BG_INODE_UNINIT); - free = EXT4_INODES_PER_GROUP(sb); + free = 0; } else { free = EXT4_INODES_PER_GROUP(sb) - le16_to_cpu(gdp-bg_itable_unused); } + /* +* Check the relative inode number against the last used +* relative inode number in this group. if it is greater +* we need to update the bg_itable_unused count +* +*/ if (ino free) gdp-bg_itable_unused = cpu_to_le16(EXT4_INODES_PER_GROUP(sb) - ino); Index: linux-2.6.23-rc6/fs/ext4/super.c === --- linux-2.6.23-rc6.orig/fs/ext4/super.c 2007-09-19 15:38:21.0 -0700 +++ linux-2.6.23-rc6/fs/ext4/super.c2007-09-19 15:38:51.0 -0700 @@ -37,6 +37,7 @@ #include linux/quotaops.h #include linux/seq_file.h #include linux/log2.h +#include linux/crc16.h #include asm/uaccess.h @@ -1248,56 +1249,6 @@ return res; } -#if !defined(CONFIG_CRC16) -/** CRC table for the CRC-16. The poly is 0x8005 (x16 + x15 + x2 + 1) */ -__u16 const crc16_table[256] = { - 0x, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241, - 0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440, - 0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40, - 0x0A00, 0xCAC1, 0xCB81, 0x0B40, 0xC901, 0x09C0, 0x0880, 0xC841, - 0xD801, 0x18C0, 0x1980, 0xD941, 0x1B00, 0xDBC1, 0xDA81, 0x1A40, - 0x1E00, 0xDEC1, 0xDF81, 0x1F40, 0xDD01, 0x1DC0, 0x1C80, 0xDC41, - 0x1400, 0xD4C1, 0xD581, 0x1540, 0xD701, 0x17C0, 0x1680, 0xD641, - 0xD201, 0x12C0, 0x1380, 0xD341, 0x1100, 0xD1C1, 0xD081, 0x1040, - 0xF001, 0x30C0, 0x3180, 0xF141, 0x3300, 0xF3C1, 0xF281, 0x3240, - 0x3600, 0xF6C1, 0xF781, 0x3740, 0xF501, 0x35C0, 0x3480, 0xF441, - 0x3C00, 0xFCC1, 0xFD81, 0x3D40, 0xFF01, 0x3FC0, 0x3E80, 0xFE41, - 0xFA01, 0x3AC0, 0x3B80, 0xFB41, 0x3900, 0xF9C1, 0xF881, 0x3840, - 0x2800, 0xE8C1, 0xE981, 0x2940, 0xEB01, 0x2BC0, 0x2A80, 0xEA41, - 0xEE01, 0x2EC0, 0x2F80, 0xEF41, 0x2D00, 0xEDC1, 0xEC81, 0x2C40, - 0xE401, 0x24C0, 0x2580, 0xE541, 0x2700, 0xE7C1, 0xE681, 0x2640, - 0x2200, 0xE2C1, 0xE381, 0x2340, 0xE101, 0x21C0, 0x2080, 0xE041, - 0xA001, 0x60C0, 0x6180, 0xA141, 0x6300, 0xA3C1, 0xA281, 0x6240, - 0x6600, 0xA6C1, 0xA781, 0x6740, 0xA501, 0x65C0, 0x6480, 0xA441, - 0x6C00, 0xACC1, 0xAD81, 0x6D40, 0xAF01, 0x6FC0, 0x6E80, 0xAE41, - 0xAA01, 0x6AC0, 0x6B80, 0xAB41, 0x6900, 0xA9C1, 0xA881, 0x6840, - 0x7800, 0xB8C1, 0xB981, 0x7940, 0xBB01, 0x7BC0, 0x7A80, 0xBA41, - 0xBE01, 0x7EC0, 0x7F80, 0xBF41, 0x7D00, 0xBDC1, 0xBC81, 0x7C40, - 0xB401, 0x74C0, 0x7580, 0xB541, 0x7700, 0xB7C1, 0xB681, 0x7640, - 0x7200, 0xB2C1, 0xB381, 0x7340, 0xB101, 0x71C0, 0x7080, 0xB041, - 0x5000, 0x90C1, 0x9181, 0x5140, 0x9301, 0x53C0, 0x5280, 0x9241, - 0x9601, 0x56C0, 0x5780, 0x9741, 0x5500, 0x95C1, 0x9481, 0x5440, - 0x9C01, 0x5CC0, 0x5D80, 0x9D41, 0x5F00, 0x9FC1, 0x9E81, 0x5E40, - 0x5A00, 0x9AC1, 0x9B81, 0x5B40, 0x9901, 0x59C0, 0x5880, 0x9841, - 0x8801, 0x48C0, 0x4980, 0x8941, 0x4B00, 0x8BC1, 0x8A81,
Re: [PATCH] obsolete libcom-err for SuSE e2fsprogs
Andreas Dilger wrote: It isn't possible to build an e2fsprogs via make rpm on SuSE and have it install cleanly, because they split out some of the libraries into separate packages. We've got the current patch to the .spec file, but I'm open to discussion if it is more desirable to change the .spec to continue to build separate RPMs (though that is more of a distribution hassle and might need major changes in the .spec file). FWIW, I also have an RFE assigned to me for RHEL/Fedora to split up our e2fsprogs packages for libcom_err and libuuid... since many non-filesystem things now require them. So, this is sort of going in the opposite direction. :) Any idea how many distros already split it out? -Eric Index: e2fsprogs-1.40.2/e2fsprogs.spec.in === --- e2fsprogs-1.40.2.orig/e2fsprogs.spec.in +++ e2fsprogs-1.40.2/e2fsprogs.spec.in @@ -13,6 +13,10 @@ Url: http://e2fsprogs.sourceforge.net/ Prereq: /sbin/ldconfig BuildRoot: %{_tmppath}/%{name}-root +%if %{_vendor} == suse +Obsoletes: libcom_err %{version} +Provides: libcom_err = %{version} +%endif %description The e2fsprogs package contains a number of utilities for creating, Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] JBD: use GFP_NOFS in kmalloc
On Sep 19, 2007 12:22 -0700, Mingming Cao wrote: Convert the GFP_KERNEL flag used in JBD/JBD2 to GFP_NOFS, consistent with the rest of kmalloc flag used in the JBD/JBD2 layer. @@ -653,7 +653,7 @@ static journal_t * journal_init_common ( - journal = kmalloc(sizeof(*journal), GFP_KERNEL); + journal = kmalloc(sizeof(*journal), GFP_NOFS); @@ -723,7 +723,7 @@ journal_t * journal_init_dev(struct bloc - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); @@ -777,7 +777,7 @@ journal_t * journal_init_inode (struct i - journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_KERNEL); + journal-j_wbuf = kmalloc(n * sizeof(struct buffer_head*), GFP_NOFS); Is there a reason for this change except it's in a filesystem, so it should be GFP_NOFS? We are only doing journal setup during mount so there shouldn't be any problem using GFP_KERNEL. I don't think it will inject any defect into the code, but I don't think it is needed either. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] obsolete libcom-err for SuSE e2fsprogs
On Sep 19, 2007 20:41 -0500, Eric Sandeen wrote: Andreas Dilger wrote: It isn't possible to build an e2fsprogs via make rpm on SuSE and have it install cleanly, because they split out some of the libraries into separate packages. We've got the current patch to the .spec file, but I'm open to discussion if it is more desirable to change the .spec to continue to build separate RPMs (though that is more of a distribution hassle and might need major changes in the .spec file). FWIW, I also have an RFE assigned to me for RHEL/Fedora to split up our e2fsprogs packages for libcom_err and libuuid... since many non-filesystem things now require them. So, this is sort of going in the opposite direction. :) Any idea how many distros already split it out? I know Debian-based distros have done this for ages... I'd also welcome someone with rpm-fu split it into separate packages. Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc. - To unsubscribe from this list: send the line unsubscribe linux-ext4 in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html