Re: [RFC] Implement ->page_mkwrite for XFS
On Wed, Feb 07, 2007 at 10:18:23AM +, Christoph Hellwig wrote: > > This looks to me. But given that this is generic code except for the > get_block callback, shouldn't we put the guts into buffer.c and wire > all filesystems up to use it? e.g. > > > int block_page_mkwrite(struct vm_area_struct *vma, struct page *page, > get_block_t get_block) > { > struct inode *inode = vma->vm_file->f_path.dentry->d_inode; > unsigned long end; > int ret = 0; > > if ((page->index + 1) << PAGE_CACHE_SHIFT > i_size_read(inode)) > end = i_size_read(inode) & ~PAGE_CACHE_MASK; > else > end = PAGE_CACHE_SIZE; > > lock_page(page); > ret = block_prepare_write(page, 0, end, block); > if (!ret) > ret = block_commit_write(page, 0, end); > unlock_page(page); > return ret; > } > > and then in xfs and similar in other filesystems: > > STATIC int > xfs_vm_page_mkwrite( > struct vm_area_struct *vma, > struct page *page) > { > return block_page_mkwrite(vma, page, xfs_get_blocks); > } Yes, that can be done. block_page_mkwrite() would then go into fs/buffer.c? My patch originally had a bunch of other stuff and i wasn't sure that it could be done with generic code. I'll send an updated patch in a little while. > BTW, why is xfs_get_blocks not called xfs_get_block? I presume because it replaced the xfs_get_block() function when the block mapping callouts were modified to support mapping of multiple blocks. Maybe you should ask Nathan that question. ;) Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Implement ->page_mkwrite for XFS
On Wed, Feb 07, 2007 at 09:53:25AM +1100, David Chinner wrote: > Folks, > > I'm not sure of the exact locking rules and constraints for > ->page_mkwrite(), so I thought I better fish around for comments. > > With XFS, we need to hook pages being dirtied by mmap writes so that > we can attach buffers of the correct state tothe pages. This means > that when we write them back, the correct thing happens. > > For example, if you mmap an unwritten extent (preallocated), > currently your data will get written to disk but the extent will not > get converted to a written extent. IOWs, you lose the data because > when you read it back it will seen as unwritten and treated as a > hole. > > AFAICT, it is safe to lock the page during ->page_mkwrite and that > it is safe to issue I/O (e.g. metadata reads) to determine the > current state of the file. I am also assuming that, at this point, > we are not allowed to change the file size and so we have to be > careful in ->page_mkwrite we don't do that. What else have I missed > here? > > IOWs, I've basically treated ->page_mkwrite() as wrapper for > block_prepare_write/block_commit_write because they do all the > buffer mapping and state manipulation I think is necessary. Is it > safe to call these functions, or are there some other constraints we > have to work under here? > > Patch below. Comments? > > Cheers, > > Dave. > -- > Dave Chinner > Principal Engineer > SGI Australian Software Group > > > --- > fs/xfs/linux-2.6/xfs_file.c | 34 ++ > 1 file changed, 34 insertions(+) > > Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c > === > --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_file.c2007-01-16 > 10:54:15.0 +1100 > +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c 2007-02-07 09:49:00.508017483 > +1100 > @@ -446,6 +446,38 @@ xfs_file_open_exec( > } > #endif /* HAVE_FOP_OPEN_EXEC */ > > +/* > + * mmap()d file has taken write protection fault and is being made > + * writable. We can set the page state up correctly for a writable > + * page, which means we can do correct delalloc accounting (ENOSPC > + * checking!) and unwritten extent mapping. > + */ > +STATIC int > +xfs_vm_page_mkwrite( > + struct vm_area_struct *vma, > + struct page *page) > +{ > + struct inode*inode = vma->vm_file->f_path.dentry->d_inode; > + unsigned long end; > + int ret = 0; > + > + end = page->index + 1; > + end <<= PAGE_CACHE_SHIFT; > + if (end > i_size_read(inode)) > + end = i_size_read(inode) & ~PAGE_CACHE_MASK; > + else > + end = PAGE_CACHE_SIZE; > + > + lock_page(page); > + ret = block_prepare_write(page, 0, end, xfs_get_blocks); > + if (!ret) > + ret = block_commit_write(page, 0, end); > + unlock_page(page); > + > + return ret; > +} This looks to me. But given that this is generic code except for the get_block callback, shouldn't we put the guts into buffer.c and wire all filesystems up to use it? e.g. int block_page_mkwrite(struct vm_area_struct *vma, struct page *page, get_block_t get_block) { struct inode *inode = vma->vm_file->f_path.dentry->d_inode; unsigned long end; int ret = 0; if ((page->index + 1) << PAGE_CACHE_SHIFT > i_size_read(inode)) end = i_size_read(inode) & ~PAGE_CACHE_MASK; else end = PAGE_CACHE_SIZE; lock_page(page); ret = block_prepare_write(page, 0, end, block); if (!ret) ret = block_commit_write(page, 0, end); unlock_page(page); return ret; } and then in xfs and similar in other filesystems: STATIC int xfs_vm_page_mkwrite( struct vm_area_struct *vma, struct page *page) { return block_page_mkwrite(vma, page, xfs_get_blocks); } BTW, why is xfs_get_blocks not called xfs_get_block? - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] Implement ->page_mkwrite for XFS
Folks, I'm not sure of the exact locking rules and constraints for ->page_mkwrite(), so I thought I better fish around for comments. With XFS, we need to hook pages being dirtied by mmap writes so that we can attach buffers of the correct state tothe pages. This means that when we write them back, the correct thing happens. For example, if you mmap an unwritten extent (preallocated), currently your data will get written to disk but the extent will not get converted to a written extent. IOWs, you lose the data because when you read it back it will seen as unwritten and treated as a hole. AFAICT, it is safe to lock the page during ->page_mkwrite and that it is safe to issue I/O (e.g. metadata reads) to determine the current state of the file. I am also assuming that, at this point, we are not allowed to change the file size and so we have to be careful in ->page_mkwrite we don't do that. What else have I missed here? IOWs, I've basically treated ->page_mkwrite() as wrapper for block_prepare_write/block_commit_write because they do all the buffer mapping and state manipulation I think is necessary. Is it safe to call these functions, or are there some other constraints we have to work under here? Patch below. Comments? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/linux-2.6/xfs_file.c | 34 ++ 1 file changed, 34 insertions(+) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c === --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_file.c 2007-01-16 10:54:15.0 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c 2007-02-07 09:49:00.508017483 +1100 @@ -446,6 +446,38 @@ xfs_file_open_exec( } #endif /* HAVE_FOP_OPEN_EXEC */ +/* + * mmap()d file has taken write protection fault and is being made + * writable. We can set the page state up correctly for a writable + * page, which means we can do correct delalloc accounting (ENOSPC + * checking!) and unwritten extent mapping. + */ +STATIC int +xfs_vm_page_mkwrite( + struct vm_area_struct *vma, + struct page *page) +{ + struct inode*inode = vma->vm_file->f_path.dentry->d_inode; + unsigned long end; + int ret = 0; + + end = page->index + 1; + end <<= PAGE_CACHE_SHIFT; + if (end > i_size_read(inode)) + end = i_size_read(inode) & ~PAGE_CACHE_MASK; + else + end = PAGE_CACHE_SIZE; + + lock_page(page); + ret = block_prepare_write(page, 0, end, xfs_get_blocks); + if (!ret) + ret = block_commit_write(page, 0, end); + unlock_page(page); + + return ret; +} + + const struct file_operations xfs_file_operations = { .llseek = generic_file_llseek, .read = do_sync_read, @@ -503,12 +535,14 @@ const struct file_operations xfs_dir_fil static struct vm_operations_struct xfs_file_vm_ops = { .nopage = filemap_nopage, .populate = filemap_populate, + .page_mkwrite = xfs_vm_page_mkwrite, }; #ifdef HAVE_DMAPI static struct vm_operations_struct xfs_dmapi_file_vm_ops = { .nopage = xfs_vm_nopage, .populate = filemap_populate, + .page_mkwrite = xfs_vm_page_mkwrite, #ifdef HAVE_VMOP_MPROTECT .mprotect = xfs_vm_mprotect, #endif - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html