--- Begin Message ---
Package: e2fsprogs
Version: 1.42.1-2
Severity: normal
Tags: patch
Shrinking a filesystem is currently very CPU intensive. On a 100G ext4
filesystem filled to 50% full by making copies of /usr/share/doc to
$MNTPOINT/doc{1,2,3,...}, running resize2fs on a fairly slow (~5400
rpm) disk and a fast Core i7 processor:
# time resize2fs -p $FILESYSTEM 75G
[...]
real 96m26.852s
user 56m39.768s
sys 0m27.922s
Apparently shrinking larger filesystems causes even larger slowdowns
(CPU time consumed grows superlinearly), but I don't have the numbers
to back this up right now.
Turns out the slow and CPU intensive phase is the "Scanning inode
table" phase. The reason for this is that the inode table is scanned
bit by bit in ext2fs_new_inode(). It appears that in many cases a huge
string of bits needs to be tested to find the first zero bit. It
doesn't help that for each bit test a modulo operation is done in
ext2fs_new_inode(), after which the bit test is done by calling this
chain of functions for each bit:
ext2fs_new_inode() -> ext2fs_fast_test_inode_bitmap2() [inlined] ->
ext2fs_test_generic_bmap() -> ext2fs_test_generic_bitmap() ->
ext2fs_test_bit().
I also noted that resize2fs does not call ext2fs_open2() with
EXT2_FLAG_64BITS set; is there a reason for that? Adding that flag
change causes a minor speedup by eliminating
ext2fs_test_generic_bitmap() from the call chain.
I don't know if an algorithmic change would be the "proper" solution
for this, but I managed to make the process a lot faster (fast enough
to not be a bottleneck in this case, but perhaps it should also be
tested on both bigger filesystems and SSDs) by changing the following
while trying to affect the performance of other code in libext2fs as
little as possible:
0. Original CPU time: 3400 seconds.
1. Move the modulo out of the loop in ext2fs_new_inode().
CPU time: 2585 seconds.
2. Call ext2fs_open2() with EXT2_FLAG_64BITS. CPU time: 2057 seconds.
3a. Implement a function, ext2fs_find_first_zero_generic_bmap(), which
finds the first zero bit in a bitmap. Defaults to doing pretty
much what ext2fs_new_inode() was doing if the relevant (new)
function in the bitmap_ops struct is NULL. Change
ext2fs_new_inode() to use it.
3b. Implement optimized ba_find_first_zero() for bitarray bitmaps.
Simply scan the bitarray byte by byte to find a value != 0xff.
CPU time: 45 seconds. This still appears to translate to 45
seconds real time.
4. Further optimize ba_find_first_zero() to scan in (aligned) 64-bit
chunks CPU time: 17 seconds
After these optimizations, the wall time taken to resize the same
filesystem has decreased from 96m27s to 39m49s:
# time resize2fs -p $FILESYSTEM 75G
real 39m49.287s
user 0m16.601s
sys 0m26.606s
dumpe2fs output from resulting filesystems before and after my
modifications only differ in last write timestamp. e2fsck -f
$FILESYSTEM reports no problems.
A tentative (or proof of concept) patch is attached. Some concerns
that I have:
1. How should internal errors, like functions returning EINVAL when
that ostensibly cannot be due to invalid filesystem, be handled?
Currently ext2fs_new_inode() just propagates the error further:
+ if (err != ENOENT)
+ return err; /* Internal error? */
2. My additions in blkmap64_ba.c rely on the availability of
<stdint.h> and intptr_t (to detect when a pointer is 8-byte
aligned). Perhaps that should be done in another way if C89
compatibility is a concern?
3. 64-bit scans might be slower on a system with native register size
<64 bits. I doubt they will be slower than the code used to be,
though.
4. What is bitmap->cluster_bits and how should it be handled? in
ba_find_first_zero(); similar code in
ext2fs_find_first_zero_generic_bmap():
+ /* FIXME what is this? */
+ if (bitmap->cluster_bits)
+ return EINVAL;
5. Is returning ENOENT the preferred way of indicating "not found"?
How is this supposed to integrate in the com_err framework?
I'd be happy to update the patch to fix any remaining issues, unless
you think there's another, cleaner solution to the slowness.
By the way, compiling e2fsprogs with -Wall reveals things like missing
headers:
../../lib/ext2fs/ext2fs.h:1712:3: warning: implicit declaration of function
‘open64’ [-Wimplicit-function-declaration]
Sami
-- System Information:
Debian Release: wheezy/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.2.9 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages e2fsprogs depends on:
ii e2fslibs 1.42.1-2
ii libblkid1 2.20.1-4
ii libc6 2.13-27
ii libcomerr2 1.42.1-2
ii libss2 1.42.1-2
ii libuuid1 2.20.1-4
ii util-linux 2.20.1-4
e2fsprogs recommends no packages.
Versions of packages e2fsprogs suggests:
ii e2fsck-static 1.42.1-2
ii gpart <none>
ii parted 2.3-8
-- no debconf information
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index 948a0ec..f203284 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -126,17 +126,33 @@ errcode_t ext2fs_new_inode(ext2_filsys fs, ext2_ino_t dir,
if (start_inode > fs->super->s_inodes_count)
return EXT2_ET_INODE_ALLOC_FAIL;
i = start_inode;
-
+ ext2_ino_t modulo = (i - 1) % EXT2_INODES_PER_GROUP(fs->super);
do {
- if (((i - 1) % EXT2_INODES_PER_GROUP(fs->super)) == 0)
+ if (modulo == 0)
check_inode_uninit(fs, map, (i - 1) /
EXT2_INODES_PER_GROUP(fs->super));
- if (!ext2fs_fast_test_inode_bitmap2(map, i))
+ ext2_ino_t upto = i + (EXT2_INODES_PER_GROUP(fs->super) - modulo);
+ if (i < start_inode && upto >= start_inode)
+ upto = start_inode - 1;
+ if (upto > fs->super->s_inodes_count)
+ upto = fs->super->s_inodes_count;
+
+ ext2_ino_t first_zero;
+ errcode_t err = ext2fs_find_first_zero_inode_bitmap2(map, i, upto, &first_zero);
+ if (!err) {
+ i = first_zero;
break;
- i++;
- if (i > fs->super->s_inodes_count)
+ } else {
+ if (err != ENOENT)
+ return err; /* Internal error? */
+ i = upto;
+ }
+
+ if (++i > fs->super->s_inodes_count) {
i = EXT2_FIRST_INODE(fs->super);
+ modulo = (i - 1) % EXT2_INODES_PER_GROUP(fs->super);
+ }
} while (i != start_inode);
if (ext2fs_test_inode_bitmap2(map, i))
diff --git a/lib/ext2fs/bitops.h b/lib/ext2fs/bitops.h
index 83a01e4..70caa86 100644
--- a/lib/ext2fs/bitops.h
+++ b/lib/ext2fs/bitops.h
@@ -188,6 +188,9 @@ extern void ext2fs_mark_block_bitmap_range2(ext2fs_block_bitmap bitmap,
blk64_t block, unsigned int num);
extern void ext2fs_unmark_block_bitmap_range2(ext2fs_block_bitmap bitmap,
blk64_t block, unsigned int num);
+extern errcode_t ext2fs_find_first_zero_generic_bmap(ext2fs_generic_bitmap bitmap,
+ __u64 start, __u64 end,
+ __u64 *out);
/*
* The inline routines themselves...
@@ -593,6 +596,19 @@ _INLINE_ int ext2fs_fast_test_inode_bitmap2(ext2fs_inode_bitmap bitmap,
inode);
}
+_INLINE_ errcode_t ext2fs_find_first_zero_inode_bitmap2(ext2fs_inode_bitmap bitmap,
+ ext2_ino_t start,
+ ext2_ino_t end,
+ ext2_ino_t *out)
+{
+ __u64 o;
+ errcode_t rv = ext2fs_find_first_zero_generic_bmap((ext2fs_generic_bitmap) bitmap,
+ start, end, &o);
+ if (!rv)
+ *out = o;
+ return rv;
+}
+
_INLINE_ blk64_t ext2fs_get_block_bitmap_start2(ext2fs_block_bitmap bitmap)
{
return ext2fs_get_generic_bmap_start((ext2fs_generic_bitmap) bitmap);
diff --git a/lib/ext2fs/blkmap64_ba.c b/lib/ext2fs/blkmap64_ba.c
index 3f0c643..968ffac 100644
--- a/lib/ext2fs/blkmap64_ba.c
+++ b/lib/ext2fs/blkmap64_ba.c
@@ -12,6 +12,7 @@
#include "config.h"
#include <stdio.h>
#include <string.h>
+#include <stdint.h>
#if HAVE_UNISTD_H
#include <unistd.h>
#endif
@@ -317,6 +318,86 @@ static void ba_print_stats(ext2fs_generic_bitmap bitmap)
sizeof(struct ext2fs_ba_private_struct));
}
+/* Find the first zero bit between start and end, inclusive */
+static errcode_t ba_find_first_zero(ext2fs_generic_bitmap bitmap,
+ __u64 start, __u64 end, __u64 *out)
+{
+ if (start < bitmap->start || end > bitmap->end || start > end)
+ return EINVAL;
+
+ /* FIXME what is this? */
+ if (bitmap->cluster_bits)
+ return EINVAL;
+
+ ext2fs_ba_private bp = (ext2fs_ba_private)bitmap->private;
+ unsigned long bitpos = start - bitmap->start;
+ unsigned long count = end - start + 1;
+
+ /* scan bits until a byte boundary */
+ while ((bitpos & 0x7) != 0 && count > 0) {
+ if (!ext2fs_test_bit64(bitpos, bp->bitarray)) {
+ *out = bitpos + bitmap->start;
+ return 0;
+ }
+ bitpos++;
+ count--;
+ }
+
+ if (!count)
+ return ENOENT;
+
+ int byte_found = 0; /* whether a != 0xff byte has been found */
+
+ const unsigned char *pos = ((unsigned char *)bp->bitarray) + (bitpos >> 3);
+ /* scan bytes until 8-byte (64-bit) aligned */
+ while (count >= 8 && (((intptr_t)pos) & 0x07)) {
+ if (*pos != 0xff) {
+ byte_found = 1;
+ break;
+ }
+ pos++;
+ count -= 8;
+ bitpos += 8;
+ }
+
+ if (!byte_found) {
+ unsigned long max_loop_count = count >> 6; /* 8-byte blocks */
+ unsigned long i = max_loop_count;
+ while (i) {
+ if (*((const __u64 *)pos) != ((__u64)-1))
+ break;
+ pos += 8;
+ i--;
+ }
+ count -= 64 * (max_loop_count - i);
+ bitpos += 64 * (max_loop_count - i);
+
+ max_loop_count = count >> 3;
+ i = max_loop_count;
+ while (i) {
+ if (*pos != 0xff) {
+ byte_found = 1;
+ break;
+ }
+ pos++;
+ i--;
+ }
+ count -= 8 * (max_loop_count - i);
+ bitpos += 8 * (max_loop_count - i);
+ }
+
+ /* Here either count < 8 or byte_found == 1. */
+ while (count-- > 0) {
+ if (!ext2fs_test_bit64(bitpos, bp->bitarray)) {
+ *out = bitpos + bitmap->start;
+ return 0;
+ }
+ bitpos++;
+ }
+
+ return ENOENT;
+}
+
struct ext2_bitmap_ops ext2fs_blkmap64_bitarray = {
.type = EXT2FS_BMAP64_BITARRAY,
.new_bmap = ba_new_bmap,
@@ -333,4 +414,5 @@ struct ext2_bitmap_ops ext2fs_blkmap64_bitarray = {
.get_bmap_range = ba_get_bmap_range,
.clear_bmap = ba_clear_bmap,
.print_stats = ba_print_stats,
+ .find_first_zero = ba_find_first_zero
};
diff --git a/lib/ext2fs/bmap64.h b/lib/ext2fs/bmap64.h
index 288e1b6..38a20bf 100644
--- a/lib/ext2fs/bmap64.h
+++ b/lib/ext2fs/bmap64.h
@@ -89,6 +89,11 @@ struct ext2_bitmap_ops {
__u64 start, size_t num, void *out);
void (*clear_bmap)(ext2fs_generic_bitmap bitmap);
void (*print_stats)(ext2fs_generic_bitmap);
+
+ /* Find first zero bit between start and end, inclusive.
+ * May be NULL, in which case a generic function is used. */
+ errcode_t (*find_first_zero)(ext2fs_generic_bitmap bitmap,
+ __u64 start, __u64 end, __u64 *out);
};
extern struct ext2_bitmap_ops ext2fs_blkmap64_bitarray;
diff --git a/lib/ext2fs/gen_bitmap64.c b/lib/ext2fs/gen_bitmap64.c
index bf1a76b..14089b2 100644
--- a/lib/ext2fs/gen_bitmap64.c
+++ b/lib/ext2fs/gen_bitmap64.c
@@ -759,3 +759,30 @@ errcode_t ext2fs_convert_subcluster_bitmap(ext2_filsys fs,
*bitmap = cmap;
return 0;
}
+
+errcode_t ext2fs_find_first_zero_generic_bmap(ext2fs_generic_bitmap bitmap,
+ __u64 start, __u64 end, __u64 *out)
+{
+ if (bitmap->bitmap_ops->find_first_zero)
+ return bitmap->bitmap_ops->find_first_zero(bitmap, start, end, out);
+
+ /* FIXME what is cluster_bits? */
+ if (!bitmap || !EXT2FS_IS_64_BITMAP(bitmap) || bitmap->cluster_bits)
+ return EINVAL;
+
+ if (start < bitmap->start || end > bitmap->end || start > end) {
+ warn_bitmap(bitmap, EXT2FS_TEST_ERROR, start);
+ return EINVAL;
+ }
+
+ while (start <= end) {
+ int b = bitmap->bitmap_ops->test_bmap(bitmap, start);
+ if (!b) {
+ *out = start;
+ return 0;
+ }
+ start++;
+ }
+
+ return ENOENT;
+}
diff --git a/lib/ext2fs/openfs.c b/lib/ext2fs/openfs.c
index 32e068c..7c49da6 100644
--- a/lib/ext2fs/openfs.c
+++ b/lib/ext2fs/openfs.c
@@ -87,6 +87,7 @@ errcode_t ext2fs_open(const char *name, int flags, int superblock,
* features aren't supported.
* EXT2_FLAG_JOURNAL_DEV_OK - Open an ext3 journal device
* EXT2_FLAG_SKIP_MMP - Open without multi-mount protection check.
+ * EXT2_FLAG_64BITS - Allow 64-bit bitfields (needed for large filesystems)
*/
errcode_t ext2fs_open2(const char *name, const char *io_options,
int flags, int superblock,
diff --git a/resize/main.c b/resize/main.c
index 1ab0e04..ec0686e 100644
--- a/resize/main.c
+++ b/resize/main.c
@@ -302,6 +302,9 @@ int main (int argc, char ** argv)
if (!(mount_flags & EXT2_MF_MOUNTED))
io_flags = EXT2_FLAG_RW | EXT2_FLAG_EXCLUSIVE;
+
+ io_flags |= EXT2_FLAG_64BITS;
+
retval = ext2fs_open2(device_name, io_options, io_flags,
0, 0, io_ptr, &fs);
if (retval) {
signature.asc
Description: Digital signature
--- End Message ---