Re: Testing ext4 persistent preallocation patches for 64 bit features
On Wed, Feb 07, 2007 at 12:25:50AM -0800, Mingming Cao wrote:
> On Wed, 2007-02-07 at 13:18 +0530, Amit K. Arora wrote:
> > c) Do I need to put some hack in the filesystem code for above (to
> > allocate >32 bit physical block numbers) ?
>
> I had an ext3 hack patch before that allows an application to specify
> which block group is the targeted block allocation group, using an ioctl
> command. So to allocate >32 bit physical block numbers, it just sets the
> target block group beyond 2**(32-15) = 2**17. The patch is below.

Thanks for the patch!

> BTW, have you considered
> - move the preallocation code in ioctl to a separate function, and call
>   that function from ioctl? That way we could easily switch to
>   posix_falloc later.

OK.

> - Test preallocation with mapped IO?

I haven't done that yet. Will test it out too. Thanks!

--
Regards,
Amit Arora
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
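
The mapped-IO test mentioned above could be sketched roughly as follows. This is only an illustration: it uses the standard posix_fallocate() call as a stand-in for the preallocation ioctl from the patches under test, and the helper name mmap_write_preallocated is hypothetical, not something from the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: preallocate len bytes for fd, store msg at byte
 * offset off through a shared mapping, and flush it to the file.
 * posix_fallocate() stands in for the ext4 preallocation ioctl here.
 * Returns 0 on success, -1 on failure.
 */
int mmap_write_preallocated(int fd, size_t len, size_t off, const char *msg)
{
	size_t n = strlen(msg);
	char *map;
	int rc;

	/* The write must land inside the preallocated range. */
	assert(off + n <= len);

	if (posix_fallocate(fd, 0, (off_t)len) != 0)
		return -1;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	memcpy(map + off, msg, n);	/* dirty the page via the mapping */
	rc = msync(map, len, MS_SYNC);	/* push it out to the file */
	munmap(map, len);
	return rc;
}
```

A test would then read the same offset back with pread() and compare, ideally after dropping caches or remounting so the data really comes from disk.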
Re: Testing ext4 persistent preallocation patches for 64 bit features
On Wed, Feb 07, 2007 at 02:11:17PM -0700, Andreas Dilger wrote:
> On Feb 07, 2007 16:06 +0530, Suparna Bhattacharya wrote:
> > On Wed, Feb 07, 2007 at 12:25:50AM -0800, Mingming Cao wrote:
> > > - disable preallocation if the filesystem free blocks is under some low
> > > watermarks, to save space for near future real block allocation?
> >
> > A policy decision like this is probably worth a discussion during today's
> > call.
> >
> > > - is de-preallocation something worth doing?
>
> As discussed in the call - I don't think we can remove preallocations.
> The whole point of database preallocation is to guarantee that this space
> is available in the filesystem when writing into a file at random offsets
> (which would otherwise be sparse).
>
> Similarly, persistent preallocation shouldn't be considered differently
> than an efficient way of doing zero filling of blocks. At least that is
> my understanding... Is this code implementing the "uninitialized extents"
> for databases (via explicit preallocation via fallocate/ioctl) so that
> they don't have to zero-fill large files, or is there also automatic
> preallocation of space to files (e.g. for O_APPEND files)?

You are right. There is no automatic preallocation of space being done
here. This code just implements the explicit (persistent) preallocation
of blocks via ioctl.

--
Regards,
Amit Arora
Re: Testing ext4 persistent preallocation patches for 64 bit features
On Feb 07, 2007 16:06 +0530, Suparna Bhattacharya wrote:
> On Wed, Feb 07, 2007 at 12:25:50AM -0800, Mingming Cao wrote:
> > - disable preallocation if the filesystem free blocks is under some low
> > watermarks, to save space for near future real block allocation?
>
> A policy decision like this is probably worth a discussion during today's
> call.
>
> > - is de-preallocation something worth doing?

As discussed in the call - I don't think we can remove preallocations.
The whole point of database preallocation is to guarantee that this space
is available in the filesystem when writing into a file at random offsets
(which would otherwise be sparse).

Similarly, persistent preallocation shouldn't be considered differently
than an efficient way of doing zero filling of blocks. At least that is
my understanding... Is this code implementing the "uninitialized extents"
for databases (via explicit preallocation via fallocate/ioctl) so that
they don't have to zero-fill large files, or is there also automatic
preallocation of space to files (e.g. for O_APPEND files)?

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
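
To make the guarantee described above concrete, here is a minimal userspace sketch. It assumes the standard posix_fallocate() interface rather than the ioctl from the patches being discussed, and the helper names (preallocate_file, write_byte_at, file_size) are hypothetical. Once the reservation succeeds, a later write at a random offset inside the range should not fail for lack of space.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reserve len bytes starting at offset 0. posix_fallocate() returns 0 or
 * an error number directly; it does not set errno.
 */
int preallocate_file(int fd, off_t len)
{
	assert(len > 0);
	return posix_fallocate(fd, 0, len);
}

/* Write a single byte at offset off; returns 0 on success, -1 on error. */
int write_byte_at(int fd, off_t off, char value)
{
	return pwrite(fd, &value, 1, off) == 1 ? 0 : -1;
}

/* Apparent file size in bytes, or -1 on error. */
off_t file_size(int fd)
{
	struct stat st;

	return fstat(fd, &st) == 0 ? st.st_size : -1;
}
```

Without the preallocate_file() step, the same pwrite() into the middle of an empty file would create a sparse file, and the blocks would only be allocated (or fail with ENOSPC) at write time.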
Re: Testing ext4 persistent preallocation patches for 64 bit features
On Wed, Feb 07, 2007 at 12:25:50AM -0800, Mingming Cao wrote:
> On Wed, 2007-02-07 at 13:18 +0530, Amit K. Arora wrote:
> > I plan to test the persistent preallocation patches on a huge sparse
> > device, to know if >32 bit physical block numbers (up to 48 bit) behave as
> > expected.
>
> Thanks!
>
> > I have the following questions for this and will appreciate
> > suggestions here:
>
> > c) Do I need to put some hack in the filesystem code for above (to
> > allocate >32 bit physical block numbers) ?
>
> I had an ext3 hack patch before that allows an application to specify
> which block group is the targeted block allocation group, using an ioctl
> command. So to allocate >32 bit physical block numbers, it just sets the
> target block group beyond 2**(32-15) = 2**17. The patch is below.
>
> BTW, have you considered
> - move the preallocation code in ioctl to a separate function, and call
>   that function from ioctl? That way we could easily switch to
>   posix_falloc later.

Good suggestion.

> - Test preallocation with mapped IO?
> - disable preallocation if the filesystem free blocks is under some low
>   watermarks, to save space for near future real block allocation?

A policy decision like this is probably worth a discussion during today's
call.

> - is de-preallocation something worth doing?

Wouldn't truncate do that? Or are you thinking of something like hole
punching?

Regards
Suparna

> Mingming
>
> ---
>
>  linux-2.6.16-ming/fs/ext3/balloc.c          |   24 ++-
>  linux-2.6.16-ming/fs/ext3/ioctl.c           |   29
>  linux-2.6.16-ming/include/linux/ext3_fs.h   |    1
>  linux-2.6.16-ming/include/linux/ext3_fs_i.h |    1
>  4 files changed, 46 insertions(+), 9 deletions(-)
>
> diff -puN fs/ext3/ioctl.c~ext3_set_alloc_blk_group_hack fs/ext3/ioctl.c
> --- linux-2.6.16/fs/ext3/ioctl.c~ext3_set_alloc_blk_group_hack	2006-03-28 15:19:58.0 -0800
> +++ linux-2.6.16-ming/fs/ext3/ioctl.c	2006-03-28 15:54:14.507288400 -0800
> @@ -22,6 +22,7 @@ int ext3_ioctl (struct inode * inode, st
>  	struct ext3_inode_info *ei = EXT3_I(inode);
>  	unsigned int flags;
>  	unsigned short rsv_window_size;
> +	unsigned int blk_group;
>  
>  	ext3_debug ("cmd = %u, arg = %lu\n", cmd, arg);
>  
> @@ -193,6 +194,34 @@ flags_err:
>  		mutex_unlock(&ei->truncate_mutex);
>  		return 0;
>  	}
> +	case EXT3_IOC_SETALLOCBLKGRP: {
> +
> +		if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
> +			return -ENOTTY;
> +
> +		if (IS_RDONLY(inode))
> +			return -EROFS;
> +
> +		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
> +			return -EACCES;
> +
> +		if (get_user(blk_group, (int __user *)arg))
> +			return -EFAULT;
> +
> +		/*
> +		 * need to allocate reservation structure for this inode
> +		 * before set the window size
> +		 */
> +		mutex_lock(&ei->truncate_mutex);
> +		if (!ei->i_block_alloc_info)
> +			ext3_init_block_alloc_info(inode);
> +
> +		if (ei->i_block_alloc_info){
> +			ei->i_block_alloc_info->goal_block_group = blk_group;
> +		}
> +		mutex_unlock(&ei->truncate_mutex);
> +		return 0;
> +	}
>  	case EXT3_IOC_GROUP_EXTEND: {
>  		unsigned long n_blocks_count;
>  		struct super_block *sb = inode->i_sb;
> diff -puN include/linux/ext3_fs.h~ext3_set_alloc_blk_group_hack include/linux/ext3_fs.h
> --- linux-2.6.16/include/linux/ext3_fs.h~ext3_set_alloc_blk_group_hack	2006-03-28 15:42:51.0 -0800
> +++ linux-2.6.16-ming/include/linux/ext3_fs.h	2006-03-28 15:51:48.321237417 -0800
> @@ -238,6 +238,7 @@ struct ext3_new_group_data {
>  #endif
>  #define EXT3_IOC_GETRSVSZ		_IOR('f', 5, long)
>  #define EXT3_IOC_SETRSVSZ		_IOW('f', 6, long)
> +#define EXT3_IOC_SETALLOCBLKGRP		_IOW('f', 9, long)
>  
>  /*
>   * Mount options
> diff -puN include/linux/ext3_fs_i.h~ext3_set_alloc_blk_group_hack include/linux/ext3_fs_i.h
> --- linux-2.6.16/include/linux/ext3_fs_i.h~ext3_set_alloc_blk_group_hack	2006-03-28 15:43:59.0 -0800
> +++ linux-2.6.16-ming/include/linux/ext3_fs_i.h	2006-03-28 15:47:54.274367219 -0800
> @@ -51,6 +51,7 @@ struct ext3_block_alloc_info {
>  	 * allocation when we detect linearly ascending requests.
>  	 */
>  	__u32			last_alloc_physical_block;
> +	__u32			goal_block_group;
>  };
>  
>  #define rsv_start rsv_window._rsv_start
> diff -puN fs/ext3/balloc.c~ext3_set_alloc_blk_group_hack fs/ext3/balloc.c
> --- linux-2.6.16/fs/ext3/balloc.c~ext3_set_alloc_blk_group_hack	2006-03-28 15:45:30.0 -0800
> +++ linux-2.6.16-ming/fs/ext3/balloc.c	2006-03-28 16:
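
For anyone wanting to drive the quoted hack from userspace, a caller could look roughly like the sketch below. It assumes a kernel carrying the patch; the ioctl number is copied from the patch's ext3_fs.h hunk and is not a mainline interface, and set_alloc_block_group is a hypothetical wrapper name. Since the handler reads the argument with get_user() through an int pointer, the wrapper passes the address of a 32-bit value.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

/* Copied from the patch above; only meaningful on a kernel that carries it. */
#define EXT3_IOC_SETALLOCBLKGRP _IOW('f', 9, long)

/* The patched handler reads an int-sized value from the user pointer. */
static_assert(sizeof(unsigned int) == 4, "blk_group is expected to be 32 bits");

/*
 * Hypothetical wrapper: ask the patched kernel to use blk_group as the
 * goal block group for later allocations to the file behind fd.
 * Returns 0 on success or a negative errno value.
 */
int set_alloc_block_group(int fd, unsigned int blk_group)
{
	if (ioctl(fd, EXT3_IOC_SETALLOCBLKGRP, &blk_group) < 0)
		return -errno;
	return 0;
}
```

On an unpatched kernel the call is expected to fail (typically with -ENOTTY), so a test program should treat that as "hack not present" rather than as a filesystem error.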
Re: Testing ext4 persistent preallocation patches for 64 bit features
On Wed, 2007-02-07 at 13:18 +0530, Amit K. Arora wrote:
> I plan to test the persistent preallocation patches on a huge sparse
> device, to know if >32 bit physical block numbers (up to 48 bit) behave as
> expected.

Thanks!

> I have the following questions for this and will appreciate
> suggestions here:

> c) Do I need to put some hack in the filesystem code for above (to
> allocate >32 bit physical block numbers) ?

I had an ext3 hack patch before that allows an application to specify
which block group is the targeted block allocation group, using an ioctl
command. So to allocate >32 bit physical block numbers, it just sets the
target block group beyond 2**(32-15) = 2**17. The patch is below.

BTW, have you considered
- move the preallocation code in ioctl to a separate function, and call
  that function from ioctl? That way we could easily switch to
  posix_falloc later.
- Test preallocation with mapped IO?
- disable preallocation if the filesystem free blocks is under some low
  watermarks, to save space for near future real block allocation?
- is de-preallocation something worth doing?

Mingming
---

 linux-2.6.16-ming/fs/ext3/balloc.c          |   24 ++-
 linux-2.6.16-ming/fs/ext3/ioctl.c           |   29
 linux-2.6.16-ming/include/linux/ext3_fs.h   |    1
 linux-2.6.16-ming/include/linux/ext3_fs_i.h |    1
 4 files changed, 46 insertions(+), 9 deletions(-)

diff -puN fs/ext3/ioctl.c~ext3_set_alloc_blk_group_hack fs/ext3/ioctl.c
--- linux-2.6.16/fs/ext3/ioctl.c~ext3_set_alloc_blk_group_hack	2006-03-28 15:19:58.0 -0800
+++ linux-2.6.16-ming/fs/ext3/ioctl.c	2006-03-28 15:54:14.507288400 -0800
@@ -22,6 +22,7 @@ int ext3_ioctl (struct inode * inode, st
 	struct ext3_inode_info *ei = EXT3_I(inode);
 	unsigned int flags;
 	unsigned short rsv_window_size;
+	unsigned int blk_group;
 
 	ext3_debug ("cmd = %u, arg = %lu\n", cmd, arg);
 
@@ -193,6 +194,34 @@ flags_err:
 		mutex_unlock(&ei->truncate_mutex);
 		return 0;
 	}
+	case EXT3_IOC_SETALLOCBLKGRP: {
+
+		if (!test_opt(inode->i_sb, RESERVATION) ||!S_ISREG(inode->i_mode))
+			return -ENOTTY;
+
+		if (IS_RDONLY(inode))
+			return -EROFS;
+
+		if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER))
+			return -EACCES;
+
+		if (get_user(blk_group, (int __user *)arg))
+			return -EFAULT;
+
+		/*
+		 * need to allocate reservation structure for this inode
+		 * before set the window size
+		 */
+		mutex_lock(&ei->truncate_mutex);
+		if (!ei->i_block_alloc_info)
+			ext3_init_block_alloc_info(inode);
+
+		if (ei->i_block_alloc_info){
+			ei->i_block_alloc_info->goal_block_group = blk_group;
+		}
+		mutex_unlock(&ei->truncate_mutex);
+		return 0;
+	}
 	case EXT3_IOC_GROUP_EXTEND: {
 		unsigned long n_blocks_count;
 		struct super_block *sb = inode->i_sb;
diff -puN include/linux/ext3_fs.h~ext3_set_alloc_blk_group_hack include/linux/ext3_fs.h
--- linux-2.6.16/include/linux/ext3_fs.h~ext3_set_alloc_blk_group_hack	2006-03-28 15:42:51.0 -0800
+++ linux-2.6.16-ming/include/linux/ext3_fs.h	2006-03-28 15:51:48.321237417 -0800
@@ -238,6 +238,7 @@ struct ext3_new_group_data {
 #endif
 #define EXT3_IOC_GETRSVSZ		_IOR('f', 5, long)
 #define EXT3_IOC_SETRSVSZ		_IOW('f', 6, long)
+#define EXT3_IOC_SETALLOCBLKGRP		_IOW('f', 9, long)
 
 /*
  * Mount options
diff -puN include/linux/ext3_fs_i.h~ext3_set_alloc_blk_group_hack include/linux/ext3_fs_i.h
--- linux-2.6.16/include/linux/ext3_fs_i.h~ext3_set_alloc_blk_group_hack	2006-03-28 15:43:59.0 -0800
+++ linux-2.6.16-ming/include/linux/ext3_fs_i.h	2006-03-28 15:47:54.274367219 -0800
@@ -51,6 +51,7 @@ struct ext3_block_alloc_info {
 	 * allocation when we detect linearly ascending requests.
 	 */
 	__u32			last_alloc_physical_block;
+	__u32			goal_block_group;
 };
 
 #define rsv_start rsv_window._rsv_start
diff -puN fs/ext3/balloc.c~ext3_set_alloc_blk_group_hack fs/ext3/balloc.c
--- linux-2.6.16/fs/ext3/balloc.c~ext3_set_alloc_blk_group_hack	2006-03-28 15:45:30.0 -0800
+++ linux-2.6.16-ming/fs/ext3/balloc.c	2006-03-28 16:03:55.770850040 -0800
@@ -285,6 +285,7 @@ void ext3_init_block_alloc_info(struct i
 		rsv->rsv_alloc_hit = 0;
 		block_i->last_alloc_logical_block = 0;
 		block_i->last_alloc_physical_block = 0;
+		block_i->goal_block_group = 0;
 	}
 	ei->i_block_alloc_info = block_i;
 }
@@ -1263,15 +1264,20 @@ unsigned long ext3_new_blocks(handle_t *
 		*errp = -ENOSPC;
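
The 2**(32-15) = 2**17 figure in the message above can be checked with a few lines of arithmetic. The sketch assumes 4 KiB blocks and one bitmap block per group, which gives 2**15 blocks per group; the function names are hypothetical, and the first-data-block offset is taken as 0.

```c
#include <assert.h>
#include <stdint.h>

/* Blocks per group when a single bitmap block covers the group (8 bits per byte). */
uint64_t blocks_per_group(uint32_t block_size)
{
	assert(block_size != 0);
	return (uint64_t)block_size * 8;
}

/* Lowest block group number whose first block no longer fits in 32 bits. */
uint64_t first_group_above_32bit(uint32_t block_size)
{
	return (UINT64_C(1) << 32) / blocks_per_group(block_size);
}

/* First physical block of a group, assuming the first data block is block 0. */
uint64_t group_first_block(uint64_t group, uint32_t block_size)
{
	return group * blocks_per_group(block_size);
}
```

Under those assumptions, group 2**17 starts exactly at block 2**32, and the last block of group 2**17 - 1 is block 2**32 - 1, so any goal group at or beyond 2**17 should force allocations above the 32-bit boundary.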
Testing ext4 persistent preallocation patches for 64 bit features
I plan to test the persistent preallocation patches on a huge sparse
device, to know if >32 bit physical block numbers (up to 48 bit) behave as
expected. I have the following questions for this and will appreciate
suggestions here:

a) What should be the sparse device size which I should use for testing?
   Should a size of > 8TB (say, 100 TB) be enough ? The physical device
   (backing store device) size I can have is up to 70GB.

b) How do I test allocation of >32 bit physical block numbers ? I can not
   fill > 8TB, since the physical storage available with me is just 70GB.

c) Do I need to put some hack in the filesystem code for above (to
   allocate >32 bit physical block numbers) ?

Any further ideas on how to test this will help. Thanks!

--
Regards,
Amit Arora
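
As a rough aid for question (a), the arithmetic below estimates whether a sparse device of a given size contains block numbers beyond 32 bits. The 4 KiB block size is an assumption, sizes are treated as binary units (1 TB taken as 2**40 bytes), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of filesystem blocks on a device of dev_bytes bytes. */
uint64_t fs_block_count(uint64_t dev_bytes, uint32_t block_size)
{
	assert(block_size != 0);
	return dev_bytes / block_size;
}

/* Nonzero if the highest block number on the device does not fit in 32 bits. */
int needs_more_than_32_bits(uint64_t dev_bytes, uint32_t block_size)
{
	uint64_t blocks = fs_block_count(dev_bytes, block_size);

	return blocks > 0 && (blocks - 1) > UINT32_MAX;
}

/* Nonzero if every block number on the device fits in 48 bits. */
int fits_in_48_bits(uint64_t dev_bytes, uint32_t block_size)
{
	return fs_block_count(dev_bytes, block_size) <= (UINT64_C(1) << 48);
}
```

Under these assumptions a 100 TB device has 100 * 2**28 blocks, which is well past 2**32 and still far below the 48-bit limit, so the size itself looks sufficient; the harder part is steering allocations into the high block groups.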