Re: Testing ext4 persistent preallocation patches for 64 bit features
On Wed, 2007-02-07 at 13:18 +0530, Amit K. Arora wrote:
> I plan to test the persistent preallocation patches on a huge sparse
> device, to know if >32 bit physical block numbers (up to 48 bit) behave
> as expected.

Thanks!

> I have the following questions for this and will appreciate suggestions
> here:
> c) Do I need to put some hack in the filesystem code for above (to
>    allocate >32 bit physical block numbers) ?

I had an ext3 hack patch before that lets an application specify, through
an ioctl command, which block group is the target for block allocation. To
allocate >32 bit physical block numbers it just sets the target block group
beyond 2**(32-15) = 2**17. The patch is below.

BTW, have you considered
- moving the preallocation code in the ioctl to a separate function, and
  calling that function from the ioctl? That way we could easily switch to
  posix_falloc later.
- testing preallocation with mapped IO?
- disabling preallocation if the filesystem's free block count is under
  some low watermark, to save space for near-future real block allocation?
- is de-preallocation something worth doing?

Mingming

---

 linux-2.6.16-ming/fs/ext3/balloc.c          |   24 ++-
 linux-2.6.16-ming/fs/ext3/ioctl.c           |   29
 linux-2.6.16-ming/include/linux/ext3_fs.h   |    1
 linux-2.6.16-ming/include/linux/ext3_fs_i.h |    1
 4 files changed, 46 insertions(+), 9 deletions(-)

diff -puN fs/ext3/ioctl.c~ext3_set_alloc_blk_group_hack fs/ext3/ioctl.c
--- linux-2.6.16/fs/ext3/ioctl.c~ext3_set_alloc_blk_group_hack	2006-03-28 15:19:58.0 -0800
+++ linux-2.6.16-ming/fs/ext3/ioctl.c	2006-03-28 15:54:14.507288400 -0800
@@ -22,6 +22,7 @@ int ext3_ioctl (struct inode * inode, st
 	struct ext3_inode_info *ei = EXT3_I(inode);
 	unsigned int flags;
 	unsigned short rsv_window_size;
+	unsigned int blk_group;
 
 	ext3_debug ("cmd = %u, arg = %lu\n", cmd, arg);
 
@@ -193,6 +194,34 @@ flags_err:
 		mutex_unlock(&ei->truncate_mutex);
 		return 0;
 	}
+	case EXT3_IOC_SETALLOCBLKGRP: {
+
+		if (!test_opt(inode->i_sb, RESERVATION) || !S_ISREG(inode->i_mode))
+			return -ENOTTY;
+
+		if (IS_RDONLY(inode))
+			return -EROFS;
+
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
+			return -EACCES;
+
+		if (get_user(blk_group, (int __user *)arg))
+			return -EFAULT;
+
+		/*
+		 * need to allocate reservation structure for this inode
+		 * before set the window size
+		 */
+		mutex_lock(&ei->truncate_mutex);
+		if (!ei->i_block_alloc_info)
+			ext3_init_block_alloc_info(inode);
+
+		if (ei->i_block_alloc_info){
+			ei->i_block_alloc_info->goal_block_group = blk_group;
+		}
+		mutex_unlock(&ei->truncate_mutex);
+		return 0;
+	}
 	case EXT3_IOC_GROUP_EXTEND: {
 		unsigned long n_blocks_count;
 		struct super_block *sb = inode->i_sb;
diff -puN include/linux/ext3_fs.h~ext3_set_alloc_blk_group_hack include/linux/ext3_fs.h
--- linux-2.6.16/include/linux/ext3_fs.h~ext3_set_alloc_blk_group_hack	2006-03-28 15:42:51.0 -0800
+++ linux-2.6.16-ming/include/linux/ext3_fs.h	2006-03-28 15:51:48.321237417 -0800
@@ -238,6 +238,7 @@ struct ext3_new_group_data {
 #endif
 #define EXT3_IOC_GETRSVSZ		_IOR('f', 5, long)
 #define EXT3_IOC_SETRSVSZ		_IOW('f', 6, long)
+#define EXT3_IOC_SETALLOCBLKGRP	_IOW('f', 9, long)
 
 /*
  * Mount options
diff -puN include/linux/ext3_fs_i.h~ext3_set_alloc_blk_group_hack include/linux/ext3_fs_i.h
--- linux-2.6.16/include/linux/ext3_fs_i.h~ext3_set_alloc_blk_group_hack	2006-03-28 15:43:59.0 -0800
+++ linux-2.6.16-ming/include/linux/ext3_fs_i.h	2006-03-28 15:47:54.274367219 -0800
@@ -51,6 +51,7 @@ struct ext3_block_alloc_info {
 	 * allocation when we detect linearly ascending requests.
 	 */
 	__u32 last_alloc_physical_block;
+	__u32 goal_block_group;
 };
 
 #define rsv_start rsv_window._rsv_start
diff -puN fs/ext3/balloc.c~ext3_set_alloc_blk_group_hack fs/ext3/balloc.c
--- linux-2.6.16/fs/ext3/balloc.c~ext3_set_alloc_blk_group_hack	2006-03-28 15:45:30.0 -0800
+++ linux-2.6.16-ming/fs/ext3/balloc.c	2006-03-28 16:03:55.770850040 -0800
@@ -285,6 +285,7 @@ void ext3_init_block_alloc_info(struct i
 		rsv->rsv_alloc_hit = 0;
 		block_i->last_alloc_logical_block = 0;
 		block_i->last_alloc_physical_block = 0;
+		block_i->goal_block_group = 0;
 	}
 	ei->i_block_alloc_info = block_i;
 }
@@ -1263,15 +1264,20 @@ unsigned long ext3_new_blocks(handle_t *
 		*errp = -ENOSPC;
 		goto out;
 	}
-
-
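
The 2**(32-15) = 2**17 figure in the message above follows from the block group geometry: one bitmap block tracks 8 * block_size blocks, so a 4096-byte block size gives 2**15 blocks per group. Below is a minimal user-space sketch of that arithmetic together with a wrapper for the ioctl the hack adds. It is only an illustration, not part of the posted patch: the helper names (blocks_per_group, first_group_above_32bit, set_goal_block_group) are hypothetical, the arithmetic assumes the first data block is 0, and the ioctl can only succeed on a kernel that actually carries the hack.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Request number copied from the hack patch; it is not defined by
 * unpatched kernel headers. */
#define EXT3_IOC_SETALLOCBLKGRP _IOW('f', 9, long)

/* One block bitmap covers 8 bits per byte of a block, so a block group
 * holds 8 * block_size blocks. */
static inline uint64_t blocks_per_group(uint32_t block_size)
{
	return 8ULL * block_size;
}

/* Index of the first block group whose starting block number no longer
 * fits in 32 bits. Assumption: first data block is 0 (block size > 1 KiB). */
static inline uint64_t first_group_above_32bit(uint32_t block_size)
{
	uint64_t bpg = blocks_per_group(block_size);

	assert(bpg != 0);
	return (1ULL << 32) / bpg;
}

/* Ask a patched kernel to use `group` as the goal block group for the file
 * behind `fd`. The patch reads the argument with get_user() as an int, so a
 * pointer to unsigned int is passed. Returns 0 on success, -1 with errno set
 * on failure (for example on a kernel without the hack). */
static inline int set_goal_block_group(int fd, unsigned int group)
{
	return ioctl(fd, EXT3_IOC_SETALLOCBLKGRP, &group);
}
```

With 4096-byte blocks, first_group_above_32bit(4096) evaluates to 131072, i.e. 2**17, matching the threshold quoted above. A test program would open a regular file on the filesystem under test, call set_goal_block_group() with a group at or above that value, and then write or preallocate to see where the blocks land.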
Re: Testing ext4 persistent preallocation patches for 64 bit features
On Wed, Feb 07, 2007 at 12:25:50AM -0800, Mingming Cao wrote:
> On Wed, 2007-02-07 at 13:18 +0530, Amit K. Arora wrote:
> > I plan to test the persistent preallocation patches on a huge sparse
> > device, to know if >32 bit physical block numbers (up to 48 bit) behave
> > as expected.
>
> Thanks!
>
> > I have the following questions for this and will appreciate suggestions
> > here:
> > c) Do I need to put some hack in the filesystem code for above (to
> >    allocate >32 bit physical block numbers) ?
>
> I had an ext3 hack patch before that lets an application specify, through
> an ioctl command, which block group is the target for block allocation. To
> allocate >32 bit physical block numbers it just sets the target block group
> beyond 2**(32-15) = 2**17. The patch is below.
>
> BTW, have you considered
> - moving the preallocation code in the ioctl to a separate function, and
>   calling that function from the ioctl? That way we could easily switch to
>   posix_falloc later.

Good suggestion.

> - testing preallocation with mapped IO?
> - disabling preallocation if the filesystem's free block count is under
>   some low watermark, to save space for near-future real block allocation?

A policy decision like this is probably worth a discussion during today's call.

> - is de-preallocation something worth doing?

Wouldn't truncate do that? Or are you thinking of something like hole punching?

Regards
Suparna
Re: Testing ext4 persistent preallocation patches for 64 bit features
On Feb 07, 2007 16:06 +0530, Suparna Bhattacharya wrote:
> On Wed, Feb 07, 2007 at 12:25:50AM -0800, Mingming Cao wrote:
> > - disable preallocation if the filesystem free blocks is under some low
> >   watermarks, to save space for near future real block allocation?
>
> A policy decision like this is probably worth a discussion during today's
> call.
>
> > - is de-preallocation something worth doing?

As discussed in the call - I don't think we can remove preallocations. The
whole point of database preallocation is to guarantee that this space is
available in the filesystem when writing into a file at random offsets
(which would otherwise be sparse). Similarly, persistent preallocation
shouldn't be considered differently than an efficient way of doing zero
filling of blocks.

At least that is my understanding... Is this code implementing the
uninitialized extents for databases (via explicit preallocation via
fallocate/ioctl) so that they don't have to zero-fill large files, or is
there also automatic preallocation of space to files (e.g. for O_APPEND
files)?

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
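
The contrast drawn above between a preallocated file and one that "would otherwise be sparse" can be observed from user space. The following is a rough sketch using the standard posix_fallocate() call rather than the ioctl interface discussed in this thread; the helper name allocated_512b_units and the use of /tmp are assumptions made for illustration. Note that on filesystems without native preallocation support, glibc's posix_fallocate() may fall back to writing into each block, which is exactly the zero-filling cost that persistent preallocation is meant to avoid.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an unlinked temporary file of `size` bytes and report how much
 * space the filesystem charged to it, in 512-byte units (st_blocks).
 * prealloc != 0: reserve the space with posix_fallocate().
 * prealloc == 0: only extend the file size with ftruncate(), leaving it sparse.
 * Returns -1 if any step fails. */
static inline long long allocated_512b_units(off_t size, int prealloc)
{
	char path[] = "/tmp/prealloc-demo-XXXXXX";
	struct stat st;
	long long units = -1;
	int fd, rc;

	assert(size > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file disappears once fd is closed */

	/* posix_fallocate() returns an error number, ftruncate() returns -1;
	 * both return 0 on success. */
	rc = prealloc ? posix_fallocate(fd, 0, size) : ftruncate(fd, size);
	if (rc == 0 && fstat(fd, &st) == 0)
		units = (long long)st.st_blocks;

	close(fd);
	return units;
}
```

Calling allocated_512b_units(1 << 20, 0) and allocated_512b_units(1 << 20, 1) on the same filesystem should normally show the sparse file charged with few or no blocks and the preallocated one charged with roughly the full megabyte, which is the guarantee a database wants before writing at random offsets.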