Hi,

On Mon, Jul 03, 2000 at 02:24:07AM +0200, Juan J. Quintela wrote:

>         This patch is against test3-pre2.
> It gives here good performance in the first run, and very bad
> in the following ones of dbench 48.  I am hitting here problems with
> the locking scheme.  I get a lot of contention in __wait_on_supper.
> Almost all the dbench processes are waiting in:
> 
>            0xc013639c __wait_on_super+0x184 (0xc13f4c00)
>            0xc01523e5 ext2_alloc_block+0x21 (0xc4840c20, 0x12901d, 0xc7427ea0)

Known, and I did a patch for this ages ago.  It actually didn't make a
whole lot of difference.  The last version of the ext2 diffs I did for
this are included below.

--Stephen

>From [EMAIL PROTECTED]  Mon May  1 18:18:15 2000
Return-Path: <sct>
Received: (from sct@localhost)
        by dukat.scot.redhat.com (8.9.3/8.9.3) id SAA16268
        for [EMAIL PROTECTED]; Mon, 1 May 2000 18:18:14 +0100
Resent-From: "Stephen C. Tweedie" <[EMAIL PROTECTED]>
Resent-Message-ID: <[EMAIL PROTECTED]>
Resent-Date: Mon, 1 May 2000 18:18:14 +0100 (BST)
Resent-To: [EMAIL PROTECTED]
X-Authentication-Warning: worf.scot.redhat.com: sct set sender to 
[EMAIL PROTECTED] using -f
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <[EMAIL PROTECTED]>
In-Reply-To: <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>
        <[EMAIL PROTECTED]>
X-Mailer: VM 6.59 under Emacs 20.3.1
From: "Stephen C. Tweedie" <[EMAIL PROTECTED]>
To: Linus Torvalds <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>, "Stephen C. Tweedie" <[EMAIL PROTECTED]>,
        [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: DANGER: DONT apply it. Re: [patch] ext2nolock-2.2.8-A0
Date: Sat, 15 May 1999 03:05:41 +0100 (BST)
Status: RO
Content-Length: 22531
Lines: 755

Hi,

Linus Torvalds writes:

 > In particular, ext2_new_block() returns a block number. That's all fine
 > and dandy, but it's extremely and utterly idiotic. It means that
 > ext2_new_block() needs to clear the block, which is where all the
 > race-avoidance comes from as far as I can tell. 

There are two sets of races involved.  One involves consistency of the
bitmaps themselves: we need to make sure that concurrent allocations
are consistent.  The second is consistency of inodes, and that is
already handled in fs/ext2/inode.c: when we call ext2_alloc_block(),
we follow it up with a check to see if anybody else has allocated the
same block in the same inode while we slept.  If so, we free the block
and repeat the block search.

So, there's no reason why block clearing needs to be locked: the block
cannot be installed in the inode indirection maps until we have
completely returned from ext2_new_block() anyway.

Anyway, the patch below (against 2.2.8) is an extension of the last
one I posted, and it gets rid of the superblock lock in balloc.c
entirely.  We do have to be a bit more careful about allocation
consitency without the lock: in particular, quota operations used to
happen between bitmap and group descriptor updates, but now we can't
risk blocking while the bitmaps and group descriptors are
inconsistent.

It compiles but I can't test right now.  Feel free to play with it if
you want to, but I do believe we can safely go without superblock
locks in both ialloc.c and balloc.c if we are careful.

The main change that dropping the lock imposes on us is that we cannot
rely on the bitmap buffer remaining pinned in cache for the duration
of the allocation, so we have to bump the buffer_head b_count in
load_block_bitmap() and brelse it once we are finished with it.
Shift-scrlck will be very useful here for spotting the buffer_head
leaks I've introduced. :)

--Stephen

alloc-2.2.8.diff:
----------------------------------------------------------------
--- fs/ext2/balloc.c.~1~        Thu Oct 29 05:54:56 1998
+++ fs/ext2/balloc.c    Fri May 14 14:46:59 1999
@@ -9,6 +9,13 @@
  *  Enhanced block allocation by Stephen Tweedie ([EMAIL PROTECTED]), 1993
  *  Big-endian to little-endian byte-swapping/bitmaps by
  *        David S. Miller ([EMAIL PROTECTED]), 1995
+ *
+ *  Dropped use of the superblock lock.
+ *  This means that we have to take extra care not to block between any
+ *  operations on the bitmap and the corresponding update to the group 
+ *  descriptors.
+ *       Stephen C. Tweedie ([EMAIL PROTECTED]), 1999
+ * 
  */
 
 /*
@@ -74,42 +81,33 @@
 }
 
 /*
- * Read the bitmap for a given block_group, reading into the specified 
- * slot in the superblock's bitmap cache.
+ * Read the bitmap for a given block_group.
  *
- * Return >=0 on success or a -ve error code.
+ * Return the buffer_head on success or NULL on IO error.
  */
 
-static int read_block_bitmap (struct super_block * sb,
-                              unsigned int block_group,
-                              unsigned long bitmap_nr)
+static struct buffer_head * read_block_bitmap (struct super_block * sb,
+                                            unsigned int block_group)
 {
        struct ext2_group_desc * gdp;
        struct buffer_head * bh = NULL;
-       int retval = -EIO;
        
        gdp = ext2_get_group_desc (sb, block_group, NULL);
        if (!gdp)
-               goto error_out;
-       retval = 0;
+               return NULL;
+
        bh = bread (sb->s_dev, le32_to_cpu(gdp->bg_block_bitmap), sb->s_blocksize);
        if (!bh) {
                ext2_error (sb, "read_block_bitmap",
                            "Cannot read block bitmap - "
                            "block_group = %d, block_bitmap = %lu",
                            block_group, (unsigned long) gdp->bg_block_bitmap);
-               retval = -EIO;
+               return NULL;
        }
-       /*
-        * On IO error, just leave a zero in the superblock's block pointer for
-        * this group.  The IO will be retried next time.
-        */
-error_out:
-       sb->u.ext2_sb.s_block_bitmap_number[bitmap_nr] = block_group;
-       sb->u.ext2_sb.s_block_bitmap[bitmap_nr] = bh;
-       return retval;
+       return bh;
 }
 
+
 /*
  * load_block_bitmap loads the block bitmap for a blocks group
  *
@@ -126,9 +124,9 @@
 static int load__block_bitmap (struct super_block * sb,
                               unsigned int block_group)
 {
-       int i, j, retval = 0;
+       int i, j;
        unsigned long block_bitmap_number;
-       struct buffer_head * block_bitmap;
+       struct buffer_head * block_bitmap, *cached_bitmap = NULL;
 
        if (block_group >= sb->u.ext2_sb.s_groups_count)
                ext2_panic (sb, "load_block_bitmap",
@@ -136,27 +134,35 @@
                            "block_group = %d, groups_count = %lu",
                            block_group, sb->u.ext2_sb.s_groups_count);
 
+ retry:
+
        if (sb->u.ext2_sb.s_groups_count <= EXT2_MAX_GROUP_LOADED) {
                if (sb->u.ext2_sb.s_block_bitmap[block_group]) {
                        if (sb->u.ext2_sb.s_block_bitmap_number[block_group] ==
-                           block_group)
+                           block_group) {
+                               brelse(cached_bitmap);
                                return block_group;
+                       }
                        ext2_error (sb, "load_block_bitmap",
                                    "block_group != block_bitmap_number");
                }
-               retval = read_block_bitmap (sb, block_group, block_group);
-               if (retval < 0)
-                       return retval;
+               if (!cached_bitmap)
+                       goto load;
+               sb->u.ext2_sb.s_block_bitmap_number[block_group] = block_group;
+               sb->u.ext2_sb.s_block_bitmap[block_group] = cached_bitmap;
                return block_group;
        }
 
        for (i = 0; i < sb->u.ext2_sb.s_loaded_block_bitmaps &&
                    sb->u.ext2_sb.s_block_bitmap_number[i] != block_group; i++)
                ;
+
+       /* Already cached? */
        if (i < sb->u.ext2_sb.s_loaded_block_bitmaps &&
            sb->u.ext2_sb.s_block_bitmap_number[i] == block_group) {
                block_bitmap_number = sb->u.ext2_sb.s_block_bitmap_number[i];
                block_bitmap = sb->u.ext2_sb.s_block_bitmap[i];
+
                for (j = i; j > 0; j--) {
                        sb->u.ext2_sb.s_block_bitmap_number[j] =
                                sb->u.ext2_sb.s_block_bitmap_number[j - 1];
@@ -165,28 +171,36 @@
                }
                sb->u.ext2_sb.s_block_bitmap_number[0] = block_bitmap_number;
                sb->u.ext2_sb.s_block_bitmap[0] = block_bitmap;
+               touch_buffer(block_bitmap);
+               brelse(cached_bitmap);
+               return 0;
+       } 
 
-               /*
-                * There's still one special case here --- if block_bitmap == 0
-                * then our last attempt to read the bitmap failed and we have
-                * just ended up caching that failure.  Try again to read it.
-                */
-               if (!block_bitmap)
-                       retval = read_block_bitmap (sb, block_group, 0);
-       } else {
-               if (sb->u.ext2_sb.s_loaded_block_bitmaps < EXT2_MAX_GROUP_LOADED)
-                       sb->u.ext2_sb.s_loaded_block_bitmaps++;
-               else
-                       brelse (sb->u.ext2_sb.s_block_bitmap[EXT2_MAX_GROUP_LOADED - 
1]);
-               for (j = sb->u.ext2_sb.s_loaded_block_bitmaps - 1; j > 0;  j--) {
-                       sb->u.ext2_sb.s_block_bitmap_number[j] =
-                               sb->u.ext2_sb.s_block_bitmap_number[j - 1];
-                       sb->u.ext2_sb.s_block_bitmap[j] =
-                               sb->u.ext2_sb.s_block_bitmap[j - 1];
-               }
-               retval = read_block_bitmap (sb, block_group, 0);
+       /* Have we loaded the right bitmap yet? */
+       if (!cached_bitmap)
+               goto load;
+       
+       /* OK, shift the bitmaps round and install this new one in the list. */
+       if (sb->u.ext2_sb.s_loaded_block_bitmaps < EXT2_MAX_GROUP_LOADED)
+               sb->u.ext2_sb.s_loaded_block_bitmaps++;
+       else
+               brelse (sb->u.ext2_sb.s_block_bitmap[EXT2_MAX_GROUP_LOADED - 1]);
+       for (j = sb->u.ext2_sb.s_loaded_block_bitmaps - 1; j > 0;  j--) {
+               sb->u.ext2_sb.s_block_bitmap_number[j] =
+                       sb->u.ext2_sb.s_block_bitmap_number[j - 1];
+               sb->u.ext2_sb.s_block_bitmap[j] =
+                       sb->u.ext2_sb.s_block_bitmap[j - 1];
        }
-       return retval;
+
+       sb->u.ext2_sb.s_block_bitmap_number[0] = block_group;
+       sb->u.ext2_sb.s_block_bitmap[0] = cached_bitmap;
+       return 0;
+
+ load:
+       cached_bitmap = read_block_bitmap(sb, block_group);
+       if (!cached_bitmap)
+               return -EIO;
+       goto retry;
 }
 
 /*
@@ -202,19 +216,21 @@
  * differentiating between a group for which we have never performed a bitmap
  * IO request, and a group for which the last bitmap read request failed.
  */
-static inline int load_block_bitmap (struct super_block * sb,
-                                    unsigned int block_group)
+static inline struct buffer_head * load_block_bitmap (struct super_block * sb,
+                                                     unsigned int block_group)
 {
        int slot;
-       
+       struct buffer_head *bh;
+
        /*
         * Do the lookup for the slot.  First of all, check if we're asking
         * for the same slot as last time, and did we succeed that last time?
         */
        if (sb->u.ext2_sb.s_loaded_block_bitmaps > 0 &&
            sb->u.ext2_sb.s_block_bitmap_number[0] == block_group &&
-           sb->u.ext2_sb.s_block_bitmap[block_group]) {
-               return 0;
+           (bh = sb->u.ext2_sb.s_block_bitmap[block_group])) {
+               bh->b_count++;
+               return bh;
        }
        /*
         * Or can we do a fast lookup based on a loaded group on a filesystem
@@ -236,30 +252,31 @@
         * <0 means we just got an error
         */
        if (slot < 0)
-               return slot;
+               return NULL;
        
        /*
         * If it's a valid slot, we may still have cached a previous IO error,
         * in which case the bh in the superblock cache will be zero.
         */
        if (!sb->u.ext2_sb.s_block_bitmap[slot])
-               return -EIO;
+               return NULL;
        
        /*
         * Must have been read in OK to get this far.
         */
-       return slot;
+       bh = sb->u.ext2_sb.s_block_bitmap[slot];
+       bh->b_count++;
+       return bh;
 }
 
 void ext2_free_blocks (const struct inode * inode, unsigned long block,
                       unsigned long count)
 {
-       struct buffer_head * bh;
+       struct buffer_head * bh = NULL;
        struct buffer_head * bh2;
        unsigned long block_group;
        unsigned long bit;
        unsigned long i;
-       int bitmap_nr;
        unsigned long overflow;
        struct super_block * sb;
        struct ext2_group_desc * gdp;
@@ -270,7 +287,6 @@
                printk ("ext2_free_blocks: nonexistent device");
                return;
        }
-       lock_super (sb);
        es = sb->u.ext2_sb.s_es;
        if (block < le32_to_cpu(es->s_first_data_block) || 
            (block + count) > le32_to_cpu(es->s_blocks_count)) {
@@ -296,11 +312,11 @@
                overflow = bit + count - EXT2_BLOCKS_PER_GROUP(sb);
                count -= overflow;
        }
-       bitmap_nr = load_block_bitmap (sb, block_group);
-       if (bitmap_nr < 0)
+       brelse (bh);
+       bh = load_block_bitmap (sb, block_group);
+       if (bh == NULL)
                goto error_return;
        
-       bh = sb->u.ext2_sb.s_block_bitmap[bitmap_nr];
        gdp = ext2_get_group_desc (sb, block_group, &bh2);
        if (!gdp)
                goto error_return;
@@ -323,11 +339,11 @@
                                      "bit already cleared for block %lu", 
                                      block);
                else {
-                       DQUOT_FREE_BLOCK(sb, inode, 1);
                        gdp->bg_free_blocks_count =
                                cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count)+1);
                        es->s_free_blocks_count =
                                cpu_to_le32(le32_to_cpu(es->s_free_blocks_count)+1);
+                       DQUOT_FREE_BLOCK(sb, inode, 1);
                }
        }
        
@@ -346,7 +362,7 @@
        }
        sb->s_dirt = 1;
 error_return:
-       unlock_super (sb);
+       brelse (bh);
        return;
 }
 
@@ -360,11 +376,10 @@
 int ext2_new_block (const struct inode * inode, unsigned long goal,
                    u32 * prealloc_count, u32 * prealloc_block, int * err)
 {
-       struct buffer_head * bh;
+       struct buffer_head * bh = NULL;
        struct buffer_head * bh2;
        char * p, * r;
        int i, j, k, tmp;
-       int bitmap_nr;
        struct super_block * sb;
        struct ext2_group_desc * gdp;
        struct ext2_super_block * es;
@@ -379,14 +394,12 @@
                return 0;
        }
 
-       lock_super (sb);
        es = sb->u.ext2_sb.s_es;
        if (le32_to_cpu(es->s_free_blocks_count) <= le32_to_cpu(es->s_r_blocks_count) 
&&
            ((sb->u.ext2_sb.s_resuid != current->fsuid) &&
             (sb->u.ext2_sb.s_resgid == 0 ||
              !in_group_p (sb->u.ext2_sb.s_resgid)) && 
             !capable(CAP_SYS_RESOURCE))) {
-               unlock_super (sb);
                return 0;
        }
 
@@ -410,12 +423,11 @@
                if (j)
                        goal_attempts++;
 #endif
-               bitmap_nr = load_block_bitmap (sb, i);
-               if (bitmap_nr < 0)
+               brelse (bh);
+               bh = load_block_bitmap (sb, i);
+               if (bh == NULL)
                        goto io_error;
                
-               bh = sb->u.ext2_sb.s_block_bitmap[bitmap_nr];
-
                ext2_debug ("goal is at %d:%d.\n", i, j);
 
                if (!ext2_test_bit(j, bh->b_data)) {
@@ -469,6 +481,8 @@
        }
 
        ext2_debug ("Bit not found in block group %d.\n", i);
+       brelse (bh);
+       bh = NULL;
 
        /*
         * Now search the rest of the groups.  We assume that 
@@ -479,23 +493,18 @@
                if (i >= sb->u.ext2_sb.s_groups_count)
                        i = 0;
                gdp = ext2_get_group_desc (sb, i, &bh2);
-               if (!gdp) {
-                       *err = -EIO;
-                       unlock_super (sb);
-                       return 0;
-               }
+               if (!gdp) 
+                       goto io_error;
                if (le16_to_cpu(gdp->bg_free_blocks_count) > 0)
                        break;
        }
        if (k >= sb->u.ext2_sb.s_groups_count) {
-               unlock_super (sb);
                return 0;
        }
-       bitmap_nr = load_block_bitmap (sb, i);
-       if (bitmap_nr < 0)
+       bh =  load_block_bitmap (sb, i);
+       if (bh == NULL)
                goto io_error;
        
-       bh = sb->u.ext2_sb.s_block_bitmap[bitmap_nr];
        r = memscan(bh->b_data, 0, EXT2_BLOCKS_PER_GROUP(sb) >> 3);
        j = (r - bh->b_data) << 3;
        if (j < EXT2_BLOCKS_PER_GROUP(sb))
@@ -506,7 +515,7 @@
        if (j >= EXT2_BLOCKS_PER_GROUP(sb)) {
                ext2_error (sb, "ext2_new_block",
                            "Free blocks count corrupted for block group %d", i);
-               unlock_super (sb);
+               brelse (bh);
                return 0;
        }
 
@@ -526,13 +535,13 @@
         * Check quota for allocation of this block.
         */
        if(DQUOT_ALLOC_BLOCK(sb, inode, 1)) {
-               unlock_super(sb);
+               brelse (bh);
                *err = -EDQUOT;
                return 0;
        }
-
+       
        tmp = j + i * EXT2_BLOCKS_PER_GROUP(sb) + le32_to_cpu(es->s_first_data_block);
-
+       
        if (test_opt (sb, CHECK_STRICT) &&
            (tmp == le32_to_cpu(gdp->bg_block_bitmap) ||
             tmp == le32_to_cpu(gdp->bg_inode_bitmap) ||
@@ -541,14 +550,20 @@
                            "Allocating block in system zone - "
                            "block = %u", tmp);
 
+       /* 
+        * The quota check may have blocked.  That should be rare, but
+        * if it does happen, there's a chance another process will
+        * have allocated the current block in the mean time.  We need
+        * to check and retry if the block is no longer free.  
+        */
        if (ext2_set_bit (j, bh->b_data)) {
-               ext2_warning (sb, "ext2_new_block",
-                             "bit already set for block %d", j);
                DQUOT_FREE_BLOCK(sb, inode, 1);
                goto repeat;
        }
 
        ext2_debug ("found bit %d\n", j);
+       gdp->bg_free_blocks_count = cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) 
+- 1);
+       es->s_free_blocks_count = cpu_to_le32(le32_to_cpu(es->s_free_blocks_count) - 
+1);
 
        /*
         * Do block preallocation now if required.
@@ -562,25 +577,26 @@
 
                *prealloc_count = 0;
                *prealloc_block = tmp + 1;
-               for (k = 1;
-                    k < prealloc_goal && (j + k) < EXT2_BLOCKS_PER_GROUP(sb);
-                    k++) {
-                       if (DQUOT_PREALLOC_BLOCK(sb, inode, 1))
-                               break;
-                       if (ext2_set_bit (j + k, bh->b_data)) {
-                               DQUOT_FREE_BLOCK(sb, inode, 1);
-                               break;
-                       }
-                       (*prealloc_count)++;
-               }       
-               gdp->bg_free_blocks_count =
+               if (!DQUOT_PREALLOC_BLOCK(sb, inode, 1)) {
+                       for (k = 1;
+                            k < prealloc_goal &&
+                                    (j + k) < EXT2_BLOCKS_PER_GROUP(sb);
+                            k++) {
+                               if (ext2_set_bit (j + k, bh->b_data))
+                                       break;
+                               (*prealloc_count)++;
+                       }       
+                       gdp->bg_free_blocks_count =
                        cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) -
                               *prealloc_count);
-               es->s_free_blocks_count =
-                       cpu_to_le32(le32_to_cpu(es->s_free_blocks_count) -
-                              *prealloc_count);
-               ext2_debug ("Preallocated a further %lu bits.\n",
-                           *prealloc_count);
+                       es->s_free_blocks_count =
+                               cpu_to_le32(le32_to_cpu(es->s_free_blocks_count) -
+                                           *prealloc_count);
+                       if (*prealloc_count < prealloc_goal)
+                               DQUOT_FREE_BLOCK(sb, inode, prealloc_goal - 
+*prealloc_count);
+                       ext2_debug ("Preallocated a further %lu bits.\n",
+                                   *prealloc_count);
+               }
        }
 #endif
 
@@ -592,16 +608,18 @@
                wait_on_buffer (bh);
        }
 
+       /* We are finished with the bitmap buffer now */
+       brelse (bh);
+
        if (j >= le32_to_cpu(es->s_blocks_count)) {
                ext2_error (sb, "ext2_new_block",
                            "block >= blocks count - "
                            "block_group = %d, block=%d", i, j);
-               unlock_super (sb);
                return 0;
        }
-       if (!(bh = getblk (sb->s_dev, j, sb->s_blocksize))) {
+
+               if (!(bh = getblk (sb->s_dev, j, sb->s_blocksize))) {
                ext2_error (sb, "ext2_new_block", "cannot get block %d", j);
-               unlock_super (sb);
                return 0;
        }
        memset(bh->b_data, 0, sb->s_blocksize);
@@ -612,18 +630,15 @@
        ext2_debug ("allocating block %d. "
                    "Goal hits %d of %d.\n", j, goal_hits, goal_attempts);
 
-       gdp->bg_free_blocks_count = cpu_to_le16(le16_to_cpu(gdp->bg_free_blocks_count) 
- 1);
        mark_buffer_dirty(bh2, 1);
-       es->s_free_blocks_count = cpu_to_le32(le32_to_cpu(es->s_free_blocks_count) - 
1);
        mark_buffer_dirty(sb->u.ext2_sb.s_sbh, 1);
        sb->s_dirt = 1;
-       unlock_super (sb);
        *err = 0;
        return j;
        
 io_error:
        *err = -EIO;
-       unlock_super (sb);
+       brelse (bh);
        return 0;
        
 }
@@ -699,11 +714,9 @@
        struct ext2_super_block * es;
        unsigned long desc_count, bitmap_count, x;
        unsigned long desc_blocks;
-       int bitmap_nr;
        struct ext2_group_desc * gdp;
        int i, j;
 
-       lock_super (sb);
        es = sb->u.ext2_sb.s_es;
        desc_count = 0;
        bitmap_count = 0;
@@ -715,11 +728,9 @@
                if (!gdp)
                        continue;
                desc_count += le16_to_cpu(gdp->bg_free_blocks_count);
-               bitmap_nr = load_block_bitmap (sb, i);
-               if (bitmap_nr < 0)
+               bh = load_block_bitmap (sb, i);
+               if (bh == NULL)
                        continue;
-               
-               bh = sb->u.ext2_sb.s_block_bitmap[bitmap_nr];
 
                if (!(sb->u.ext2_sb.s_feature_ro_compat &
                     EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER) ||
@@ -760,11 +771,11 @@
                                    "stored = %d, counted = %lu", i,
                                    le16_to_cpu(gdp->bg_free_blocks_count), x);
                bitmap_count += x;
+               brelse (bh);
        }
        if (le32_to_cpu(es->s_free_blocks_count) != bitmap_count)
                ext2_error (sb, "ext2_check_blocks_bitmap",
                            "Wrong free blocks count in super block, "
                            "stored = %lu, counted = %lu",
                            (unsigned long) le32_to_cpu(es->s_free_blocks_count), 
bitmap_count);
-       unlock_super (sb);
 }
--- fs/ext2/ialloc.c.~1~        Tue Oct 20 22:08:14 1998
+++ fs/ext2/ialloc.c    Wed May 12 19:33:06 1999
@@ -39,42 +39,35 @@
 #include <asm/byteorder.h>
 
 /*
- * Read the inode allocation bitmap for a given block_group, reading
- * into the specified slot in the superblock's bitmap cache.
+ * Read the bitmap for a given block_group.
  *
- * Return >=0 on success or a -ve error code.
+ * Return the buffer_head on success or NULL on IO error.
  */
-static int read_inode_bitmap (struct super_block * sb,
-                              unsigned long block_group,
-                              unsigned int bitmap_nr)
+
+static struct buffer_head * read_inode_bitmap (struct super_block * sb,
+                                              unsigned long block_group)
 {
        struct ext2_group_desc * gdp;
        struct buffer_head * bh = NULL;
-       int retval = 0;
 
        gdp = ext2_get_group_desc (sb, block_group, NULL);
-       if (!gdp) {
-               retval = -EIO;
-               goto error_out;
-       }
+       if (!gdp)
+               return NULL;
+       
+       unlock_super(sb);
        bh = bread (sb->s_dev, le32_to_cpu(gdp->bg_inode_bitmap), sb->s_blocksize);
+       lock_super(sb);
        if (!bh) {
                ext2_error (sb, "read_inode_bitmap",
                            "Cannot read inode bitmap - "
                            "block_group = %lu, inode_bitmap = %lu",
                            block_group, (unsigned long) gdp->bg_inode_bitmap);
-               retval = -EIO;
+               return NULL;
        }
-       /*
-        * On IO error, just leave a zero in the superblock's block pointer for
-        * this group.  The IO will be retried next time.
-        */
-error_out:
-       sb->u.ext2_sb.s_inode_bitmap_number[bitmap_nr] = block_group;
-       sb->u.ext2_sb.s_inode_bitmap[bitmap_nr] = bh;
-       return retval;
+       return bh;
 }
 
+
 /*
  * load_inode_bitmap loads the inode bitmap for a blocks group
  *
@@ -91,9 +84,9 @@
 static int load_inode_bitmap (struct super_block * sb,
                              unsigned int block_group)
 {
-       int i, j, retval = 0;
+       int i, j;
        unsigned long inode_bitmap_number;
-       struct buffer_head * inode_bitmap;
+       struct buffer_head * inode_bitmap, * cached_bitmap = NULL;
 
        if (block_group >= sb->u.ext2_sb.s_groups_count)
                ext2_panic (sb, "load_inode_bitmap",
@@ -104,18 +97,23 @@
            sb->u.ext2_sb.s_inode_bitmap_number[0] == block_group &&
            sb->u.ext2_sb.s_inode_bitmap[0] != NULL)
                return 0;
+
+ retry:
+
        if (sb->u.ext2_sb.s_groups_count <= EXT2_MAX_GROUP_LOADED) {
                if (sb->u.ext2_sb.s_inode_bitmap[block_group]) {
                        if (sb->u.ext2_sb.s_inode_bitmap_number[block_group] != 
block_group)
                                ext2_panic (sb, "load_inode_bitmap",
                                            "block_group != inode_bitmap_number");
-                       else
+                       else {
+                               brelse(cached_bitmap);
                                return block_group;
+                       }
                } else {
-                       retval = read_inode_bitmap (sb, block_group,
-                                                   block_group);
-                       if (retval < 0)
-                               return retval;
+                       if (!cached_bitmap)
+                               goto load;
+                       sb->u.ext2_sb.s_inode_bitmap_number[block_group] = block_group;
+                       sb->u.ext2_sb.s_inode_bitmap[block_group] = cached_bitmap;
                        return block_group;
                }
        }
@@ -124,10 +122,13 @@
                    sb->u.ext2_sb.s_inode_bitmap_number[i] != block_group;
             i++)
                ;
+
+       /* Already cached? */
        if (i < sb->u.ext2_sb.s_loaded_inode_bitmaps &&
            sb->u.ext2_sb.s_inode_bitmap_number[i] == block_group) {
                inode_bitmap_number = sb->u.ext2_sb.s_inode_bitmap_number[i];
                inode_bitmap = sb->u.ext2_sb.s_inode_bitmap[i];
+
                for (j = i; j > 0; j--) {
                        sb->u.ext2_sb.s_inode_bitmap_number[j] =
                                sb->u.ext2_sb.s_inode_bitmap_number[j - 1];
@@ -136,29 +137,35 @@
                }
                sb->u.ext2_sb.s_inode_bitmap_number[0] = inode_bitmap_number;
                sb->u.ext2_sb.s_inode_bitmap[0] = inode_bitmap;
+               touch_buffer(inode_bitmap);
+               brelse(cached_bitmap);
+               return 0;
+       } 
 
-               /*
-                * There's still one special case here --- if inode_bitmap == 0
-                * then our last attempt to read the bitmap failed and we have
-                * just ended up caching that failure.  Try again to read it.
-                */
-               if (!inode_bitmap)
-                       retval = read_inode_bitmap (sb, block_group, 0);
-               
-       } else {
-               if (sb->u.ext2_sb.s_loaded_inode_bitmaps < EXT2_MAX_GROUP_LOADED)
-                       sb->u.ext2_sb.s_loaded_inode_bitmaps++;
-               else
-                       brelse (sb->u.ext2_sb.s_inode_bitmap[EXT2_MAX_GROUP_LOADED - 
1]);
-               for (j = sb->u.ext2_sb.s_loaded_inode_bitmaps - 1; j > 0; j--) {
-                       sb->u.ext2_sb.s_inode_bitmap_number[j] =
-                               sb->u.ext2_sb.s_inode_bitmap_number[j - 1];
-                       sb->u.ext2_sb.s_inode_bitmap[j] =
-                               sb->u.ext2_sb.s_inode_bitmap[j - 1];
-               }
-               retval = read_inode_bitmap (sb, block_group, 0);
+       /* Have we loaded the right bitmap yet? */
+       if (!cached_bitmap)
+               goto load;
+       
+       if (sb->u.ext2_sb.s_loaded_inode_bitmaps < EXT2_MAX_GROUP_LOADED)
+               sb->u.ext2_sb.s_loaded_inode_bitmaps++;
+       else
+               brelse (sb->u.ext2_sb.s_inode_bitmap[EXT2_MAX_GROUP_LOADED - 1]);
+       for (j = sb->u.ext2_sb.s_loaded_inode_bitmaps - 1; j > 0; j--) {
+               sb->u.ext2_sb.s_inode_bitmap_number[j] =
+                       sb->u.ext2_sb.s_inode_bitmap_number[j - 1];
+               sb->u.ext2_sb.s_inode_bitmap[j] =
+                       sb->u.ext2_sb.s_inode_bitmap[j - 1];
        }
-       return retval;
+
+       sb->u.ext2_sb.s_inode_bitmap_number[0] = block_group;
+       sb->u.ext2_sb.s_inode_bitmap[0] = cached_bitmap;
+       return 0;
+
+ load:
+       cached_bitmap = read_inode_bitmap(sb, block_group);
+       if (!cached_bitmap)
+               return -EIO;
+       goto retry;
 }
 
 /*

Reply via email to