Re: [PATCH 1/2] fs: use memclear_highpage_flush to zero page data
On Mon, 09 Apr 2007 21:31:37 -0700 Nate Diller [EMAIL PROTECTED] wrote:

It's very common for file systems to need to zero part or all of a page; the simplest way is just to use kmap_atomic() and memset(). There's actually a library function in include/linux/highmem.h that does exactly that, but it's confusingly named memclear_highpage_flush(), which is descriptive of *how* it does the work rather than what the *purpose* is. So this patch renames the function to zero_page_data(), and calls it from the various places that currently open code it.

Compile tested in x86_64.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---
 drivers/block/loop.c                     |  6 ---
 fs/affs/file.c                           |  6 ---
 fs/buffer.c                              | 53 +--
 fs/direct-io.c                           |  8 +---
 fs/ecryptfs/mmap.c                       | 14 +---
 fs/ext3/inode.c                          | 12 +--
 fs/ext4/inode.c                          | 12 +--
 fs/ext4/writeback.c                      | 12 +--
 fs/gfs2/bmap.c                           |  6 ---
 fs/mpage.c                               | 11 +-
 fs/nfs/read.c                            | 10 ++---
 fs/nfs/write.c                           |  2 -
 fs/ntfs/aops.c                           | 32 +++---
 fs/ntfs/file.c                           | 47 +--
 fs/ocfs2/aops.c                          |  5 --
 fs/reiser4/plugin/file/cryptcompress.c   | 19 +--
 fs/reiser4/plugin/file/file.c            |  6 ---
 fs/reiser4/plugin/item/ctail.c           |  6 ---
 fs/reiser4/plugin/item/extent_file_ops.c | 19 +++
 fs/reiser4/plugin/item/tail.c            |  8 +---
 fs/reiserfs/file.c                       | 39 ++
 fs/reiserfs/inode.c                      | 13 +--
 fs/xfs/linux-2.6/xfs_lrw.c               |  2 -
 include/linux/highmem.h                  |  2 -
 mm/filemap_xip.c                         |  7
 mm/truncate.c                            |  2 -
 26 files changed, 78 insertions(+), 281 deletions(-)

Not sure that I agree with the name zero_page_data(). People might use it to, err, zero a page's data. Whereas it is really only for use against *user* pages. zero_user_page(), perhaps.

Plus.. This patch as presented causes me surprising amounts of trouble. I need to split it up into

- core plus filesystems which don't have maintainers (for me to merge)
- filesystems which do have maintainers (one patch per), for maintainers to merge.
- another patch for reiser4, to remain in -mm.
And this is actually not possible to do, because my merge and the subsystem maintainers' merges will happen at different times. In the intervening window, the kernel won't compile.

So instead I need to

- split off the reiser4 bit
- get acks from fs maintainers on the rest
- merge the whole thing in one hit (minus reiser4)

And I can do that, but it is the less preferable option. The better way to do this merge is:

patch #1:

	static inline void memclear_highpage_flush(...) __deprecated
	{
		zero_user_page(...);
	}

patch #2..n: convert filesystems.

then, when all filesystems are converted, we're ready to remove memclear_highpage_flush(). But we do that six months later - let's not screw out-of-tree fs maintainers (and their users) unnecessarily.

- To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
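The staged rename suggested in this message can be modelled in plain C. What follows is a userspace sketch rather than kernel code: `struct page` is a hypothetical stand-in that just holds a byte buffer, `MODEL_PAGE_SIZE` is an assumed 4096, and the kernel's `__deprecated` macro is written out as the GCC/Clang attribute it wraps. The real helper also maps the page and flushes the data cache, which this model leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the kernel's struct page: only a byte buffer. */
struct page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* New name: zero `size` bytes of the page, starting at `offset`. */
static inline void zero_user_page(struct page *page, unsigned int offset,
				  unsigned int size)
{
	assert(offset <= MODEL_PAGE_SIZE && size <= MODEL_PAGE_SIZE - offset);
	memset(page->data + offset, 0, size);
}

/*
 * Old name, kept as a thin wrapper so unconverted callers still build.
 * The attribute makes GCC and Clang warn at every remaining call site.
 */
__attribute__((deprecated))
static inline void memclear_highpage_flush(struct page *page,
					   unsigned int offset,
					   unsigned int size)
{
	zero_user_page(page, offset, size);
}
```

With this arrangement each filesystem can be converted in its own patch, and the wrapper can be deleted once no caller triggers the warning any more.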
Re: [patch 0/8] unprivileged mount syscall
On Mon, 2007-04-09 at 22:10 +0200, Miklos Szeredi wrote:

The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are you interested in the details? I can reproduce it, but forgot to note down the details of the brokenness.

I don't know how far removed that is from the one being used by redhat, but assuming it's the same, then redhat-lspp@redhat.com will be very interested.

OK.

- user namespace setup: what if user has multiple sessions?

  1) namespaces are shared? That's tricky because the session needs to be a child of a namespace server, not of login. I'm not sure PAM can handle this

  2) or mounts are copied on login? That's not possible currently, as there's no way to send a mount between namespaces. Also it's tricky to make sure that new mounts are also shared

See toward the end of the 'shared subtrees' OLS paper from last year for a suggestion on how to let users effectively 'log in to' an existing private mounts ns.

This?

1. create a new namespace
2. bind /share/$USER to /share
3. for each pair ($who, $what) such that /share/$USER/$who/$what exists, look in /share/$who/allowed for peer $what $USER or slave $what $USER. If the former is found, rbind /share/$who/$what on /share/$USER/$who/$what; if the latter is found, do the same and follow with marking subtree under /share/$USER/$who/$what as slave.
4. rbind /share/$USER to /share
5. mark subtree under /share as private.
6. umount -l /share

Well, someone please explain using short words, because I don't understand at all.

I am trying to re-construct Viro's thoughts. I think the steps outlined above, though not accurate, are still insightful. The idea is: there is one master namespace, which has under /share a replica of the mount tree of the namespaces belonging to all users. For example, if there are two users A and B, then in the master namespace under /share you will find /share/A and /share/B, each reflecting the mount tree for the namespaces belonging to user-A and user-B respectively.

Note: /share is a shared mount-tree, which means it can propagate mount events.

Every time the user logs on the machine, a new namespace is created which is a clone of the master namespace. In this new namespace, /share/$user is made the root of the namespace. Also, if other users have made part of their namespace available to this user, then those mounts are also brought under this namespace. And finally the entire tree under /share is unmounted.

Note, though multiple namespaces can exist simultaneously for the same user, the user is provided the illusion of per-process-namespace since all the namespaces look identical.

I am trying to rewrite the steps outlined above, which may or may not reflect Viro's thoughts, but certainly reflect my reconstruction of Viro's thoughts.

1. clone the master namespace.
2. in the new namespace, move the tree under /share/$me to /

   for each ($user, $what, $how) {
       move /share/$user/$what to /$what
       if ($how == slave) {
           make the mount tree under /$what as slave
       }
   }

3. in the new namespace, make the tree under /share private and unmount /share

RP

Thanks, Miklos
Re: [patch 0/8] unprivileged mount syscall
On Fri, 2007-04-06 at 16:16 -0700, H. Peter Anvin wrote:

- users can use bind mounts without having to pre-configure them in /etc/fstab

This is by far the biggest concern I see. I think the security implications of allowing anyone to do bind mounts are poorly understood.

And especially so since there is no way for a filesystem module to veto such requests.

Ian
Re: [PATCH 1/2] fs: use memclear_highpage_flush to zero page data
On 10 Apr 2007, at 07:10, Andrew Morton wrote: On Mon, 09 Apr 2007 21:31:37 -0700 Nate Diller [EMAIL PROTECTED] wrote: It's very common for file systems to need to zero part or all of a page, the simplist way is just to use kmap_atomic() and memset(). There's actually a library function in include/linux/highmem.h that does exactly that, but it's confusingly named memclear_highpage_flush(), which is descriptive of *how* it does the work rather than what the *purpose* is. So this patch renames the function to zero_page_data(), and calls it from the various places that currently open code it. Compile tested in x86_64. signed-off-by: Nate Diller [EMAIL PROTECTED] --- drivers/block/loop.c |6 --- fs/affs/file.c |6 --- fs/buffer.c | 53 +-- fs/direct-io.c |8 +--- fs/ecryptfs/mmap.c | 14 +--- fs/ext3/inode.c | 12 +-- fs/ext4/inode.c | 12 +-- fs/ext4/writeback.c | 12 +-- fs/gfs2/bmap.c |6 --- fs/mpage.c | 11 +- fs/nfs/read.c| 10 ++--- fs/nfs/write.c |2 - fs/ntfs/aops.c | 32 +++--- fs/ntfs/file.c | 47 +-- fs/ocfs2/aops.c |5 -- fs/reiser4/plugin/file/cryptcompress.c | 19 +-- fs/reiser4/plugin/file/file.c|6 --- fs/reiser4/plugin/item/ctail.c |6 --- fs/reiser4/plugin/item/extent_file_ops.c | 19 +++ fs/reiser4/plugin/item/tail.c|8 +--- fs/reiserfs/file.c | 39 + + fs/reiserfs/inode.c | 13 +-- fs/xfs/linux-2.6/xfs_lrw.c |2 - include/linux/highmem.h |2 - mm/filemap_xip.c |7 mm/truncate.c|2 - 26 files changed, 78 insertions(+), 281 deletions(-) Not sure that I agree with the name zero_page_data(). People might use it to, err, zero a page's data. Whereas it is really only for use against *user* pages. zero_user_page(), perhaps. Plus.. This patch as presented causes me surprising amounts of trouble. I need to split it up into - core plus filesystems which don't have maintainers (for me to merge) - filesystems which do have maintainers (one patch per), for maintainers to merge. - another patch for reiser4, to remain in -mm. 
And this is actually not possible to do, because my merge and the subsystem maintainers' merges will happen at different times. In the intervening window, the kernel won't compile.

So instead I need to

- split off the reiser4 bit
- get acks from fs maintainers on the rest
- merge the whole thing in one hit (minus reiser4)

And I can do that, but it is the less preferable option. The better way to do this merge is:

patch #1:

	static inline void memclear_highpage_flush(...) __deprecated
	{
		zero_user_page(...);
	}

patch #2..n: convert filesystems.

then, when all filesystems are converted, we're ready to remove memclear_highpage_flush(). But we do that six months later - let's not screw out-of-tree fs maintainers (and their users) unnecessarily.

Nate, I think you either do not understand what the KM_* constants passed to kmap_atomic() mean or you were overeager in your code replacement... You really, really cannot replace KM_BIO_SRC_IRQ with KM_USER0 in the NTFS i/o completion handler without trashing people's data left, right and centre!

Best regards,

Anton
--
Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/
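One way to read the objection about the KM_* constants is with a toy model of the per-CPU atomic-kmap slots. The sketch below is not kernel code. It assumes that each slot type names one mapping window per CPU, that mapping a page into a slot silently replaces whatever was there, and that an I/O completion handler can run in interrupt context on top of process-context code that already holds a KM_USER0 mapping. The `toy_*` helpers and the tiny `struct page` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Two of the kernel's slot names; the slot table itself is a toy. */
enum km_type { KM_USER0, KM_BIO_SRC_IRQ, KM_TYPE_NR };

/* Hypothetical miniature page. */
struct page {
	unsigned char data[16];
};

/* One CPU's atomic-kmap slots: each slot maps at most one page at a time. */
static struct page *km_slot[KM_TYPE_NR];

/* Toy kmap_atomic(): point the slot at the page; the slot is the "address". */
static enum km_type toy_kmap_atomic(struct page *page, enum km_type type)
{
	assert(type < KM_TYPE_NR);
	km_slot[type] = page;
	return type;
}

/* Toy store through a mapped "address": hits whatever page the slot holds now. */
static void toy_store(enum km_type vaddr, size_t off, unsigned char val)
{
	assert(off < sizeof km_slot[vaddr]->data);
	km_slot[vaddr]->data[off] = val;
}

/*
 * Process context maps user_page with KM_USER0, an "interrupt handler" then
 * maps bio_page with irq_type, and finally process context stores 0xAA
 * through the address it obtained before the interrupt.
 * Returns 1 if the store reached user_page, 0 if it landed elsewhere.
 */
static int store_reaches_user_page(enum km_type irq_type)
{
	struct page user_page = { {0} }, bio_page = { {0} };
	enum km_type vaddr = toy_kmap_atomic(&user_page, KM_USER0);

	/* "Interrupt": the completion handler maps and zeroes its own page. */
	toy_store(toy_kmap_atomic(&bio_page, irq_type), 0, 0);

	/* Back in process context. */
	toy_store(vaddr, 0, 0xAA);
	return user_page.data[0] == 0xAA;
}
```

In this model `store_reaches_user_page(KM_BIO_SRC_IRQ)` returns 1, while `store_reaches_user_page(KM_USER0)` returns 0 because the handler has re-pointed the shared slot and the later store lands in the handler's page.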
Re: [patch 0/8] unprivileged mount syscall
On Mon, Apr 09, 2007 at 10:46:25AM -0700, Ram Pai wrote:

On Mon, 2007-04-09 at 12:07 -0500, Serge E. Hallyn wrote:

Quoting Miklos Szeredi ([EMAIL PROTECTED]):

- need to set up mount propagation from global namespace to private ones, mount(8) does not yet have options to configure propagation

Hmm, I guess I get lost using my own little systems, and just assumed that shared subtree functionality was making its way up into mount(8). Ram, have you been working on that?

It is in FC6. I don't know the status of upstream util-linux. I did submit the patch many times to Adrian Bunk (the then util-linux maintainer) and got no response. I have not pushed the patches to the new maintainer (Karel Zak?) though.

The shared-subtree patch has been applied:

http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commitdiff;h=389fbea536e4308d9475fa2a89e53e188ce8a0e3;hp=939a997de0c761d29fb7530976ca20da4898703a

Karel

--
Karel Zak [EMAIL PROTECTED]
Linux 2007 File System IO Workshop notes talks
We have some of the material reviewed and posted now from the IO FS workshop.

USENIX has posted the talks at:

http://www.usenix.org/events/lsf07/tech/tech.html

A write-up of the workshop went out at LWN and prompted a healthy discussion:

http://lwn.net/Articles/226351/

At that LWN article, there is a link to the Linux FS wiki with good notes:

http://linuxfs.pbwiki.com/LSF07-Workshop-Notes

Another summary will go out in the next USENIX ;login: edition.

ric
Re: [GIT PULL -mm] Unionfs branch management code
Andrew Morton wrote:

On Mon, 9 Apr 2007 10:53:51 -0400 Josef 'Jeff' Sipek [EMAIL PROTECTED] wrote:

The following patches introduce new branch-management code into Unionfs as well as fix a number of stability issues and resource leaks.

I have a mental note that unionfs is in the stuck state, due to general agreement that we should implement this functionality at the VFS level, one reason for which is unionfs's upper-vs-lower coherency problems.

How can a union file system with a decent set of useful semantics be fully implemented at the VFS layer in a clean manner?

For instance, a major use of unionfs is live CDs, namely unionfs w/ a read-only and read-write layer. Unionfs enables files to be copied up from the read-only layer to the read-write layer. Does one really want to implement copyup in the VFS?

just my 2 agarot.
Re: [GIT PULL -mm] Unionfs branch management code
On Tue, Apr 10, 2007 at 01:22:52PM -0400, Shaya Potter wrote:

Andrew Morton wrote:

On Mon, 9 Apr 2007 10:53:51 -0400 Josef 'Jeff' Sipek [EMAIL PROTECTED] wrote:

The following patches introduce new branch-management code into Unionfs as well as fix a number of stability issues and resource leaks.

I have a mental note that unionfs is in the stuck state, due to general agreement that we should implement this functionality at the VFS level, one reason for which is unionfs's upper-vs-lower coherency problems.

How can a union file system with a decent set of useful semantics be fully implemented at the VFS layer in a clean manner?

Unioning is quite odd. It uses concepts, some of which do indeed belong in the VFS (like actual merging of the lower directories), but others that most definitely do not (like whiteouts).

For instance, a major use of unionfs is live CDs, namely unionfs w/ a read-only and read-write layer. Unionfs enables files to be copied up from the read-only layer to the read-write layer. Does one really want to implement copyup in the VFS?

I don't think that copyup is the problem, but whiteouts...oh boy. Whiteouts/some kind of persistent storage is most definitely a filesystem construct, and it does not belong in the VFS.

Josef 'Jeff' Sipek.

--
If I have trouble installing Linux, something is wrong. Very wrong.
		- Linus Torvalds
Re: [PATCH 1/2] fs: use memclear_highpage_flush to zero page data
On 4/10/07, Anton Altaparmakov [EMAIL PROTECTED] wrote: On 10 Apr 2007, at 07:10, Andrew Morton wrote: On Mon, 09 Apr 2007 21:31:37 -0700 Nate Diller [EMAIL PROTECTED] wrote: It's very common for file systems to need to zero part or all of a page, the simplist way is just to use kmap_atomic() and memset(). There's actually a library function in include/linux/highmem.h that does exactly that, but it's confusingly named memclear_highpage_flush(), which is descriptive of *how* it does the work rather than what the *purpose* is. So this patch renames the function to zero_page_data(), and calls it from the various places that currently open code it. Compile tested in x86_64. signed-off-by: Nate Diller [EMAIL PROTECTED] --- drivers/block/loop.c |6 --- fs/affs/file.c |6 --- fs/buffer.c | 53 +-- fs/direct-io.c |8 +--- fs/ecryptfs/mmap.c | 14 +--- fs/ext3/inode.c | 12 +-- fs/ext4/inode.c | 12 +-- fs/ext4/writeback.c | 12 +-- fs/gfs2/bmap.c |6 --- fs/mpage.c | 11 +- fs/nfs/read.c| 10 ++--- fs/nfs/write.c |2 - fs/ntfs/aops.c | 32 +++--- fs/ntfs/file.c | 47 +-- fs/ocfs2/aops.c |5 -- fs/reiser4/plugin/file/cryptcompress.c | 19 +-- fs/reiser4/plugin/file/file.c|6 --- fs/reiser4/plugin/item/ctail.c |6 --- fs/reiser4/plugin/item/extent_file_ops.c | 19 +++ fs/reiser4/plugin/item/tail.c|8 +--- fs/reiserfs/file.c | 39 + + fs/reiserfs/inode.c | 13 +-- fs/xfs/linux-2.6/xfs_lrw.c |2 - include/linux/highmem.h |2 - mm/filemap_xip.c |7 mm/truncate.c|2 - 26 files changed, 78 insertions(+), 281 deletions(-) Not sure that I agree with the name zero_page_data(). People might use it to, err, zero a page's data. Whereas it is really only for use against *user* pages. zero_user_page(), perhaps. Plus.. This patch as presented causes me surprising amounts of trouble. I need to split it up into - core plus filesystems which don't have maintainers (for me to merge) - filesystems which do have maintainers (one patch per), for maintainers to merge. - another patch for reiser4, to remain in -mm. 
And this is actually not possible to do, because my merge and the subsystem maintainers' merges will happen at different times. In the intervening window, the kernel won't compile.

So instead I need to

- split off the reiser4 bit
- get acks from fs maintainers on the rest
- merge the whole thing in one hit (minus reiser4)

And I can do that, but it is the less preferable option. The better way to do this merge is:

patch #1:

	static inline void memclear_highpage_flush(...) __deprecated
	{
		zero_user_page(...);
	}

patch #2..n: convert filesystems.

then, when all filesystems are converted, we're ready to remove memclear_highpage_flush(). But we do that six months later - let's not screw out-of-tree fs maintainers (and their users) unnecessarily.

Nate, I think you either do not understand what the KM_* constants passed to kmap_atomic() mean or you were overeager in your code replacement... You really, really cannot replace KM_BIO_SRC_IRQ with KM_USER0 in the NTFS i/o completion handler without trashing people's data left, right and centre!

good catch, I was indeed careless on that one. I just double checked all the other changes and that was the only non-KM_USER0 that slipped through. Thanks! I will submit a new patch later today that fixes this problem and the issues AKPM raised.

NATE
Re: [PATCH 8/17] locks: add fl_notify arguments for asynchronous lock return
On Mon, Apr 09, 2007 at 07:40:41PM +0100, Christoph Hellwig wrote:

On Thu, Apr 05, 2007 at 07:40:58PM -0400, J. Bruce Fields wrote:

We're using fl_notify to asynchronously return the result of a lock request. So we want fl_notify to be able to return a status and, if appropriate, a conflicting lock.

The only current caller of fl_notify is in the blocked case, in which case we don't use these extra arguments.

We also allow fl_notify to return an error. (Also ignored for now.)

I don't really like the overload of fl_notify. What's the reason not to use a separate callback?

My vague memory is that Trond said something to the effect of "fl_notify is there, let's use it rather than adding yet another callback". But our new usage of fl_notify does require slightly different arguments and returns, and is used in a subtly different case. So I wouldn't object to a new callback. Trond?

--b.
Re: [PATCH 9/17] locks: add lock cancel command
On Mon, Apr 09, 2007 at 07:41:44PM +0100, Christoph Hellwig wrote:

On Thu, Apr 05, 2007 at 07:40:59PM -0400, J. Bruce Fields wrote:

We do this by adding a new fcntl lock command: FL_CANCELLK. Some day this might also be made available to userspace applications that could benefit from an asynchronous locking api.

Should we really add more and more subcases to ->lock that probably don't share implementation code? I'd much prefer adding different operations.

That'd be OK. We considered both--

http://marc.info/?l=linux-fsdevel&m=116616992004056&w=2

--but chose a new ->lock case just because that might provide a cleaner mapping to the userspace interface if we ended up doing that some day. Is there any hard reason why it wouldn't work?

--b.
Re: [PATCH 12/17] lockd: handle fl_notify callbacks
On Mon, Apr 09, 2007 at 07:43:52PM +0100, Christoph Hellwig wrote:

On Thu, Apr 05, 2007 at 07:41:02PM -0400, J. Bruce Fields wrote:

+	if (block->b_fl)
+		kfree(block->b_fl);

kfree(NULL) is fine.

Whoops, thanks, will fix.

+static void
+nlmsvc_update_deferred_block(struct nlm_block *block, struct file_lock *conf,
+			     int result)
+{
+	block->b_flags |= B_GOT_CALLBACK;
+	if (result == 0)
+		block->b_granted = 1;
+	else
+		block->b_flags |= B_TOO_LATE;
+	if (conf) {
+		block->b_fl = kzalloc(sizeof(struct file_lock), GFP_KERNEL);
+		if (block->b_fl)
+			locks_copy_lock(block->b_fl, conf);
+	}
+}

Shouldn't there be a way to propagate errors back to the caller when the kzalloc fails?

That's fixed in a later patch, so there may be a problem with how I split up those patches--I'll check.

--b.
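Both review comments can be illustrated with a userspace analogue. This is only a sketch under stated assumptions: `calloc()` and `free()` stand in for `kzalloc()` and `kfree()`, the two structures are reduced to hypothetical minimal versions, the flag values are placeholders, and the `-ENOMEM` return is one possible way to report the failure, not necessarily what the later patch in the series does.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, much-reduced stand-ins for the lockd structures. */
struct file_lock {
	int fl_type;
};

struct nlm_block {
	unsigned int b_flags;
	int b_granted;
	struct file_lock *b_fl;
};

#define B_GOT_CALLBACK 0x1u	/* placeholder values */
#define B_TOO_LATE     0x2u

/*
 * Record the callback result in the block.
 * Returns 0 on success, or -ENOMEM if the conflicting lock cannot be copied.
 */
static int update_deferred_block(struct nlm_block *block,
				 const struct file_lock *conf, int result)
{
	assert(block != NULL);
	block->b_flags |= B_GOT_CALLBACK;
	if (result == 0)
		block->b_granted = 1;
	else
		block->b_flags |= B_TOO_LATE;
	if (conf) {
		struct file_lock *copy = calloc(1, sizeof(*copy));

		if (!copy)
			return -ENOMEM;	/* let the caller see the failure */
		*copy = *conf;
		free(block->b_fl);	/* drop any earlier copy; NULL is fine */
		block->b_fl = copy;
	}
	return 0;
}

/* Release the saved lock; free(NULL) is a no-op, so no NULL test is needed. */
static void release_block(struct nlm_block *block)
{
	free(block->b_fl);
	block->b_fl = NULL;
}
```

Calling `release_block()` twice in a row is harmless here, which is the same property that makes the `if (block->b_fl)` guard before `kfree()` redundant.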
[PATCH 1/13] fs: convert core functions to zero_user_page
It's very common for file systems to need to zero part or all of a page, the simplist way is just to use kmap_atomic() and memset(). There's actually a library function in include/linux/highmem.h that does exactly that, but it's confusingly named memclear_highpage_flush(), which is descriptive of *how* it does the work rather than what the *purpose* is. So this patchset renames the function to zero_user_page(), and calls it from the various places that currently open code it. This first patch introduces the new function call, and converts all the core kernel callsites, both the open-coded ones and the old memclear_highpage_flush() ones. Following this patch is a series of conversions for each file system individually, per AKPM, and finally a patch deprecating the old call. The diffstat below shows the entire patchset. Compile tested in x86_64. signed-off-by: Nate Diller [EMAIL PROTECTED] --- drivers/block/loop.c |6 --- fs/affs/file.c |6 --- fs/buffer.c | 53 +-- fs/direct-io.c |8 +--- fs/ecryptfs/mmap.c | 14 +--- fs/ext3/inode.c | 12 +-- fs/ext4/inode.c | 12 +-- fs/ext4/writeback.c | 12 +-- fs/gfs2/bmap.c |6 --- fs/mpage.c | 11 +- fs/nfs/read.c| 10 ++--- fs/nfs/write.c |2 - fs/ntfs/aops.c | 26 ++- fs/ntfs/file.c | 47 +-- fs/ocfs2/aops.c |5 -- fs/reiser4/plugin/file/cryptcompress.c | 19 +-- fs/reiser4/plugin/file/file.c|6 --- fs/reiser4/plugin/item/ctail.c |6 --- fs/reiser4/plugin/item/extent_file_ops.c | 19 +++ fs/reiser4/plugin/item/tail.c|8 +--- fs/reiserfs/file.c | 39 ++ fs/reiserfs/inode.c | 13 +-- fs/xfs/linux-2.6/xfs_lrw.c |2 - include/linux/highmem.h |7 +++- mm/filemap_xip.c |7 mm/truncate.c|2 - 26 files changed, 82 insertions(+), 276 deletions(-) --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/drivers/block/loop.c linux-2.6.21-rc6-mm1-test/drivers/block/loop.c --- linux-2.6.21-rc6-mm1/drivers/block/loop.c 2007-04-10 18:27:04.0 -0700 +++ linux-2.6.21-rc6-mm1-test/drivers/block/loop.c 2007-04-10 18:18:16.0 -0700 @@ -244,17 +244,13 @@ static int 
do_lo_send_aops(struct loop_d transfer_result = lo_do_transfer(lo, WRITE, page, offset, bvec-bv_page, bv_offs, size, IV); if (unlikely(transfer_result)) { - char *kaddr; - /* * The transfer failed, but we still write the data to * keep prepare/commit calls balanced. */ printk(KERN_ERR loop: transfer error block %llu\n, (unsigned long long)index); - kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + offset, 0, size); - kunmap_atomic(kaddr, KM_USER0); + zero_user_page(page, offset, size); } flush_dcache_page(page); ret = aops-commit_write(file, page, offset, diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/buffer.c linux-2.6.21-rc6-mm1-test/fs/buffer.c --- linux-2.6.21-rc6-mm1/fs/buffer.c2007-04-10 18:27:04.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/buffer.c 2007-04-10 18:18:16.0 -0700 @@ -1862,13 +1862,8 @@ static int __block_prepare_write(struct if (block_start = to) break; if (buffer_new(bh)) { - void *kaddr; - clear_buffer_new(bh); - kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr+block_start, 0, bh-b_size); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); + zero_user_page(page, block_start, bh-b_size); set_buffer_uptodate(bh); mark_buffer_dirty(bh); } @@ -1956,10 +1951,7 @@ int block_read_full_page(struct page *pa SetPageError(page); } if (!buffer_mapped(bh)) { - void *kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + i *
[PATCH 2/13] affs: use zero_user_page
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/affs/file.c linux-2.6.21-rc6-mm1-test/fs/affs/file.c
--- linux-2.6.21-rc6-mm1/fs/affs/file.c	2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/affs/file.c	2007-04-09 18:18:23.0 -0700
@@ -628,11 +628,7 @@ static int affs_prepare_write_ofs(struct
 		return err;
 	}
 	if (to < PAGE_CACHE_SIZE) {
-		char *kaddr = kmap_atomic(page, KM_USER0);
-
-		memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
-		flush_dcache_page(page);
-		kunmap_atomic(kaddr, KM_USER0);
+		zero_user_page(page, to, PAGE_CACHE_SIZE - to);
 		if (size > offset + to) {
 			if (size < offset + PAGE_CACHE_SIZE)
 				tmp = size & ~PAGE_CACHE_MASK;
[PATCH 3/13] ecryptfs: use zero_user_page
Use zero_user_page() instead of open-coding it. Signed-off-by: Nate Diller [EMAIL PROTECTED] --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ecryptfs/mmap.c linux-2.6.21-rc6-mm1-test/fs/ecryptfs/mmap.c --- linux-2.6.21-rc6-mm1/fs/ecryptfs/mmap.c 2007-04-09 17:24:03.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/ecryptfs/mmap.c2007-04-09 18:19:34.0 -0700 @@ -364,18 +364,14 @@ static int fill_zeros_to_end_of_page(str { struct inode *inode = page-mapping-host; int end_byte_in_page; - char *page_virt; if ((i_size_read(inode) / PAGE_CACHE_SIZE) != page-index) goto out; end_byte_in_page = i_size_read(inode) % PAGE_CACHE_SIZE; if (to end_byte_in_page) end_byte_in_page = to; - page_virt = kmap_atomic(page, KM_USER0); - memset((page_virt + end_byte_in_page), 0, - (PAGE_CACHE_SIZE - end_byte_in_page)); - kunmap_atomic(page_virt, KM_USER0); - flush_dcache_page(page); + zero_user_page(page, end_byte_in_page, + PAGE_CACHE_SIZE - end_byte_in_page); out: return 0; } @@ -740,7 +736,6 @@ int write_zeros(struct file *file, pgoff { int rc = 0; struct page *tmp_page; - char *tmp_page_virt; tmp_page = ecryptfs_get1page(file, index); if (IS_ERR(tmp_page)) { @@ -757,10 +752,7 @@ int write_zeros(struct file *file, pgoff page_cache_release(tmp_page); goto out; } - tmp_page_virt = kmap_atomic(tmp_page, KM_USER0); - memset(((char *)tmp_page_virt + start), 0, num_zeros); - kunmap_atomic(tmp_page_virt, KM_USER0); - flush_dcache_page(tmp_page); + zero_user_page(tmp_page, start, num_zeros); rc = ecryptfs_commit_write(file, tmp_page, start, start + num_zeros); if (rc 0) { ecryptfs_printk(KERN_ERR, Error attempting to write zero's - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/13] ext4: use zero_user_page
Use zero_user_page() instead of open-coding it. Signed-off-by: Nate Diller [EMAIL PROTECTED] --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/inode.c linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c --- linux-2.6.21-rc6-mm1/fs/ext4/inode.c2007-04-10 17:15:04.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c 2007-04-10 18:33:04.0 -0700 @@ -1791,7 +1791,6 @@ int ext4_block_truncate_page(handle_t *h struct inode *inode = mapping-host; struct buffer_head *bh; int err = 0; - void *kaddr; if ((EXT4_I(inode)-i_flags EXT4_EXTENTS_FL) test_opt(inode-i_sb, EXTENTS) @@ -1808,10 +1807,7 @@ int ext4_block_truncate_page(handle_t *h */ if (!page_has_buffers(page) test_opt(inode-i_sb, NOBH) ext4_should_writeback_data(inode) PageUptodate(page)) { - kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + offset, 0, length); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); + zero_user_page(page, offset, length); set_page_dirty(page); goto unlock; } @@ -1864,11 +1860,7 @@ int ext4_block_truncate_page(handle_t *h goto unlock; } - kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + offset, 0, length); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); - + zero_user_page(page, offset, length); BUFFER_TRACE(bh, zeroed end of block); err = 0; diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/writeback.c linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c --- linux-2.6.21-rc6-mm1/fs/ext4/writeback.c2007-04-10 18:05:52.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c 2007-04-10 18:33:04.0 -0700 @@ -961,7 +961,6 @@ int ext4_wb_writepage(struct page *page, loff_t i_size = i_size_read(inode); pgoff_t end_index = i_size PAGE_CACHE_SHIFT; unsigned offset; - void *kaddr; wb_debug(writepage %lu from inode %lu\n, page-index, inode-i_ino); @@ -1011,10 +1010,7 @@ int ext4_wb_writepage(struct page *page, * the page size, the remaining memory is zeroed when mapped, and * writes to that region are not written out to the file. 
*/ - kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + offset, 0, PAGE_CACHE_SIZE - offset); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); + zero_user_page(page, offset, PAGE_CACHE_SIZE - offset); return ext4_wb_write_single_page(page, wbc); } @@ -1065,7 +1061,6 @@ int ext4_wb_block_truncate_page(handle_t struct inode *inode = mapping-host; struct buffer_head bh, *bhw = bh; unsigned blocksize, length; - void *kaddr; int err = 0; wb_debug(partial truncate from %lu on page %lu from inode %lu\n, @@ -1104,10 +1099,7 @@ int ext4_wb_block_truncate_page(handle_t } } - kaddr = kmap_atomic(page, KM_USER0); - memset(kaddr + offset, 0, length); - flush_dcache_page(page); - kunmap_atomic(kaddr, KM_USER0); + zero_user_page(page, offset, length); SetPageUptodate(page); __set_page_dirty_nobuffers(page); - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/13] gfs2: use zero_user_page
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/gfs2/bmap.c linux-2.6.21-rc6-mm1-test/fs/gfs2/bmap.c
--- linux-2.6.21-rc6-mm1/fs/gfs2/bmap.c	2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/gfs2/bmap.c	2007-04-09 18:18:23.0 -0700
@@ -885,7 +885,6 @@ static int gfs2_block_truncate_page(stru
 	unsigned blocksize, iblock, length, pos;
 	struct buffer_head *bh;
 	struct page *page;
-	void *kaddr;
 	int err;
 
 	page = grab_cache_page(mapping, index);
@@ -933,10 +932,7 @@ static int gfs2_block_truncate_page(stru
 	if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED || gfs2_is_jdata(ip))
 		gfs2_trans_add_bh(ip->i_gl, bh, 0);
 
-	kaddr = kmap_atomic(page, KM_USER0);
-	memset(kaddr + offset, 0, length);
-	flush_dcache_page(page);
-	kunmap_atomic(kaddr, KM_USER0);
+	zero_user_page(page, offset, length);
 
 unlock:
 	unlock_page(page);
[PATCH 7/13] nfs: use zero_user_page
Use zero_user_page() instead of the newly deprecated memclear_highpage_flush().

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/nfs/read.c linux-2.6.21-rc6-mm1-test/fs/nfs/read.c
--- linux-2.6.21-rc6-mm1/fs/nfs/read.c	2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/nfs/read.c	2007-04-09 18:18:23.0 -0700
@@ -79,7 +79,7 @@ void nfs_readdata_release(void *data)
 static int nfs_return_empty_page(struct page *page)
 {
-	memclear_highpage_flush(page, 0, PAGE_CACHE_SIZE);
+	zero_user_page(page, 0, PAGE_CACHE_SIZE);
 	SetPageUptodate(page);
 	unlock_page(page);
 	return 0;
@@ -103,10 +103,10 @@ static void nfs_readpage_truncate_uninit
 	pglen = PAGE_CACHE_SIZE - base;
 	for (;;) {
 		if (remainder <= pglen) {
-			memclear_highpage_flush(*pages, base, remainder);
+			zero_user_page(*pages, base, remainder);
 			break;
 		}
-		memclear_highpage_flush(*pages, base, pglen);
+		zero_user_page(*pages, base, pglen);
 		pages++;
 		remainder -= pglen;
 		pglen = PAGE_CACHE_SIZE;
@@ -130,7 +130,7 @@ static int nfs_readpage_async(struct nfs
 		return PTR_ERR(new);
 	}
 	if (len < PAGE_CACHE_SIZE)
-		memclear_highpage_flush(page, len, PAGE_CACHE_SIZE - len);
+		zero_user_page(page, len, PAGE_CACHE_SIZE - len);
 
 	nfs_list_add_request(new, &one_request);
 	nfs_pagein_one(&one_request, inode);
@@ -561,7 +561,7 @@ readpage_async_filler(void *data, struct
 		return PTR_ERR(new);
 	}
 	if (len < PAGE_CACHE_SIZE)
-		memclear_highpage_flush(page, len, PAGE_CACHE_SIZE - len);
+		zero_user_page(page, len, PAGE_CACHE_SIZE - len);
 	nfs_list_add_request(new, desc->head);
 	return 0;
 }
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/nfs/write.c linux-2.6.21-rc6-mm1-test/fs/nfs/write.c
--- linux-2.6.21-rc6-mm1/fs/nfs/write.c	2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/nfs/write.c	2007-04-09 18:18:23.0 -0700
@@ -169,7 +169,7 @@ static void nfs_mark_uptodate(struct pag
 	if (count != nfs_page_length(page))
 		return;
 	if (count != PAGE_CACHE_SIZE)
-		memclear_highpage_flush(page, count, PAGE_CACHE_SIZE - count);
+		zero_user_page(page, count, PAGE_CACHE_SIZE - count);
 	SetPageUptodate(page);
 }

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
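The loop touched in nfs_readpage_truncate_uninit... zeroes a byte range that can span several pages, one zero_user_page() call per page. A minimal user-space sketch of that pattern follows. Everything prefixed `sketch_` or `SKETCH_` is a hypothetical stand-in (a "page" here is just a small byte array), and the `base = 0` reset at the end of the loop body is assumed, since the hunk's context ends one line before it.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 16u	/* tiny stand-in for PAGE_CACHE_SIZE */

/* Hypothetical stand-in for struct page: just a data buffer. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
};

/* Stand-in for zero_user_page(): zero `size` bytes at `offset` in one page. */
static void sketch_zero_user_page(struct sketch_page *page,
				  unsigned int offset, unsigned int size)
{
	assert(offset + size <= SKETCH_PAGE_SIZE);
	memset(page->data + offset, 0, size);
}

/*
 * Zero `remainder` bytes, starting `base` bytes into the first page of
 * `pages` and continuing into the following pages as needed.
 */
static void sketch_zero_range(struct sketch_page **pages, unsigned int base,
			      unsigned int remainder)
{
	unsigned int pglen = SKETCH_PAGE_SIZE - base;

	for (;;) {
		if (remainder <= pglen) {
			/* The rest of the range fits in this page. */
			sketch_zero_user_page(*pages, base, remainder);
			break;
		}
		/* Zero to the end of this page, then move to the next one. */
		sketch_zero_user_page(*pages, base, pglen);
		pages++;
		remainder -= pglen;
		pglen = SKETCH_PAGE_SIZE;
		base = 0;	/* assumed: later pages start at offset 0 */
	}
}
```

With 16-byte pages, `sketch_zero_range(pages, 10, 12)` clears the last 6 bytes of the first page and the first 6 bytes of the second, and leaves any later page untouched.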
[PATCH 11/13] reiserfs: use zero_user_page
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiserfs/file.c linux-2.6.21-rc6-mm1-test/fs/reiserfs/file.c
--- linux-2.6.21-rc6-mm1/fs/reiserfs/file.c	2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiserfs/file.c	2007-04-09 18:18:23.0 -0700
@@ -1059,20 +1059,12 @@ static int reiserfs_prepare_file_region_
 	   maping blocks, since there is none, so we just zero out remaining
 	   parts of first and last pages in write area (if needed) */
 	if ((pos & ~((loff_t) PAGE_CACHE_SIZE - 1)) > inode->i_size) {
-		if (from != 0) {	/* First page needs to be partially zeroed */
-			char *kaddr = kmap_atomic(prepared_pages[0], KM_USER0);
-			memset(kaddr, 0, from);
-			kunmap_atomic(kaddr, KM_USER0);
-			flush_dcache_page(prepared_pages[0]);
-		}
-		if (to != PAGE_CACHE_SIZE) {	/* Last page needs to be partially zeroed */
-			char *kaddr =
-			    kmap_atomic(prepared_pages[num_pages - 1],
-					KM_USER0);
-			memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
-			kunmap_atomic(kaddr, KM_USER0);
-			flush_dcache_page(prepared_pages[num_pages - 1]);
-		}
+		if (from != 0)	/* First page needs to be partially zeroed */
+			zero_user_page(prepared_pages[0], 0, from);
+
+		if (to != PAGE_CACHE_SIZE)	/* Last page needs to be partially zeroed */
+			zero_user_page(prepared_pages[num_pages-1], to,
+					PAGE_CACHE_SIZE - to);
 
 		/* Since all blocks are new - use already calculated value */
 		return blocks;
@@ -1199,13 +1191,9 @@ static int reiserfs_prepare_file_region_
 				ll_rw_block(READ, 1, &bh);
 				*wait_bh++ = bh;
 			} else {	/* Not mapped, zero it */
-				char *kaddr =
-				    kmap_atomic(prepared_pages[0],
-						KM_USER0);
-				memset(kaddr + block_start, 0,
-				       from - block_start);
-				kunmap_atomic(kaddr, KM_USER0);
-				flush_dcache_page(prepared_pages[0]);
+				zero_user_page(prepared_pages[0],
+					       block_start,
+					       from - block_start);
 				set_buffer_uptodate(bh);
 			}
 		}
@@ -1237,13 +1225,8 @@ static int reiserfs_prepare_file_region_
 				ll_rw_block(READ, 1, &bh);
 				*wait_bh++ = bh;
 			} else {	/* Not mapped, zero it */
-				char *kaddr =
-				    kmap_atomic(prepared_pages
-						[num_pages - 1],
-						KM_USER0);
-				memset(kaddr + to, 0, block_end - to);
-				kunmap_atomic(kaddr, KM_USER0);
-				flush_dcache_page(prepared_pages[num_pages - 1]);
+				zero_user_page(prepared_pages[num_pages-1],
+					       to, block_end - to);
 				set_buffer_uptodate(bh);
 			}
 		}
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiserfs/inode.c linux-2.6.21-rc6-mm1-test/fs/reiserfs/inode.c
--- linux-2.6.21-rc6-mm1/fs/reiserfs/inode.c	2007-04-09 10:41:47.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiserfs/inode.c	2007-04-09 18:18:23.0 -0700
@@ -2148,13 +2148,8 @@ int reiserfs_truncate_file(struct inode
 	length = offset & (blocksize - 1);
 	/* if we are not on a block boundary */
 	if (length) {
-		char *kaddr;
-
 		length = blocksize - length;
-		kaddr = kmap_atomic(page, KM_USER0);
-		memset(kaddr + offset, 0, length);
-		flush_dcache_page(page);
-		kunmap_atomic(kaddr, KM_USER0);
+		zero_user_page(page, offset,
[PATCH 10/13] reiser4: use zero_user_page
Use zero_user_page() instead of open-coding it. Also replace the (mostly) redundant zero_page() function.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/cryptcompress.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/cryptcompress.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/cryptcompress.c	2007-04-10 17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/cryptcompress.c	2007-04-10 18:35:44.0 -0700
@@ -1897,7 +1897,6 @@ static int
 write_hole(struct inode *inode, reiser4_cluster_t * clust, loff_t file_off,
 	   loff_t to_file)
 {
-	char *data;
 	int result = 0;
 	unsigned cl_off, cl_count = 0;
 	unsigned to_pg, pg_off;
@@ -1934,10 +1933,7 @@ write_hole(struct inode *inode, reiser4_
 		to_pg = min_count(PAGE_CACHE_SIZE - pg_off, cl_count);
 		lock_page(page);
-		data = kmap_atomic(page, KM_USER0);
-		memset(data + pg_off, 0, to_pg);
-		flush_dcache_page(page);
-		kunmap_atomic(data, KM_USER0);
+		zero_user_page(page, pg_off, to_pg);
 		SetPageUptodate(page);
 		unlock_page(page);
@@ -2167,7 +2163,6 @@ read_some_cluster_pages(struct inode *in
 	if (clust->nr_pages) {
 		int off;
-		char *data;
 		struct page * pg;
 		assert("edward-1419", clust->pages != NULL);
 		pg = clust->pages[clust->nr_pages - 1];
@@ -2175,10 +2170,7 @@ read_some_cluster_pages(struct inode *in
 		off = off_to_pgoff(win->off+win->count+win->delta);
 		if (off) {
 			lock_page(pg);
-			data = kmap_atomic(pg, KM_USER0);
-			memset(data + off, 0, PAGE_CACHE_SIZE - off);
-			flush_dcache_page(pg);
-			kunmap_atomic(data, KM_USER0);
+			zero_user_page(pg, off, PAGE_CACHE_SIZE - off);
 			unlock_page(pg);
 		}
 	}
@@ -2217,20 +2209,15 @@ read_some_cluster_pages(struct inode *in
 		    (count_to_nrpages(inode->i_size) <= pg->index)) {
 			/* .. and appended,
 			   so set zeroes to the rest */
-			char *data;
 			int offset;
 			lock_page(pg);
-			data = kmap_atomic(pg, KM_USER0);
-
 			assert("edward-1260",
 			       count_to_nrpages(win->off + win->count +
 						win->delta) - 1 == i);
 			offset =
 			    off_to_pgoff(win->off + win->count + win->delta);
-			memset(data + offset, 0, PAGE_CACHE_SIZE - offset);
-			flush_dcache_page(pg);
-			kunmap_atomic(data, KM_USER0);
+			zero_user_page(pg, offset, PAGE_CACHE_SIZE - offset);
 			unlock_page(pg);
 			/* still not uptodate */
 			break;
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/file.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/file.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/file.c	2007-04-10 17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/file.c	2007-04-10 18:35:44.0 -0700
@@ -433,7 +433,6 @@ static int shorten_file(struct inode *in
 	struct page *page;
 	int padd_from;
 	unsigned long index;
-	char *kaddr;
 	unix_file_info_t *uf_info;
 
 	/*
@@ -523,10 +522,7 @@ static int shorten_file(struct inode *in
 	lock_page(page);
 	assert("vs-1066", PageLocked(page));
-	kaddr = kmap_atomic(page, KM_USER0);
-	memset(kaddr + padd_from, 0, PAGE_CACHE_SIZE - padd_from);
-	flush_dcache_page(page);
-	kunmap_atomic(kaddr, KM_USER0);
+	zero_user_page(page, padd_from, PAGE_CACHE_SIZE - padd_from);
 	unlock_page(page);
 	page_cache_release(page);
 	/* the below does up(sbinfo->delete_mutex). Do not get confused */
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/ctail.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/ctail.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/ctail.c	2007-04-10 17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/ctail.c	2007-04-10 18:35:44.0 -0700
@@ -627,11 +627,7 @@ int do_readpage_ctail(struct inode * ino
 #endif
 	case FAKE_DISK_CLUSTER:
 		/* fill the page by zeroes */
-		data = kmap_atomic(page, KM_USER0);
-
-		memset(data, 0, PAGE_CACHE_SIZE);
-
[PATCH 13/13] fs: deprecate memclear_highpage_flush
Now that all the in-tree users are converted over to zero_user_page(), deprecate the old memclear_highpage_flush() call.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/include/linux/highmem.h linux-2.6.21-rc6-mm1-test/include/linux/highmem.h
--- linux-2.6.21-rc6-mm1/include/linux/highmem.h	2007-04-10 18:32:41.0 -0700
+++ linux-2.6.21-rc6-mm1-test/include/linux/highmem.h	2007-04-10 19:40:14.0 -0700
@@ -149,6 +149,8 @@ static inline void zero_user_page(struct
 	kunmap_atomic(kaddr, KM_USER0);
 }
 
+static void memclear_highpage_flush(struct page *page, unsigned int offset,
+		unsigned int size) __deprecated;
 static inline void memclear_highpage_flush(struct page *page, unsigned int offset, unsigned int size)
 {
 	return zero_user_page(page, offset, size);
[PATCH 12/13] xfs: use zero_user_page
Use zero_user_page() instead of the newly deprecated memclear_highpage_flush().

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/xfs/linux-2.6/xfs_lrw.c linux-2.6.21-rc6-mm1-test/fs/xfs/linux-2.6/xfs_lrw.c
--- linux-2.6.21-rc6-mm1/fs/xfs/linux-2.6/xfs_lrw.c	2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/xfs/linux-2.6/xfs_lrw.c	2007-04-09 18:18:23.0 -0700
@@ -159,7 +159,7 @@ xfs_iozero(
 		if (status)
 			goto unlock;
 
-		memclear_highpage_flush(page, offset, bytes);
+		zero_user_page(page, offset, bytes);
 
 		status = mapping->a_ops->commit_write(NULL, page, offset,
 							offset + bytes);
[PATCH 8/13] ntfs: use zero_user_page
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ntfs/aops.c linux-2.6.21-rc6-mm1-test/fs/ntfs/aops.c
--- linux-2.6.21-rc6-mm1/fs/ntfs/aops.c	2007-04-09 10:41:47.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ntfs/aops.c	2007-04-09 18:18:23.0 -0700
@@ -245,8 +241,7 @@ static int ntfs_read_block(struct page *
 	rl = NULL;
 	nr = i = 0;
 	do {
-		u8 *kaddr;
-		int err;
+		int err = 0;
 
 		if (unlikely(buffer_uptodate(bh)))
 			continue;
@@ -254,7 +249,6 @@ static int ntfs_read_block(struct page *
 			arr[nr++] = bh;
 			continue;
 		}
-		err = 0;
 		bh->b_bdev = vol->sb->s_bdev;
 		/* Is the block within the allowed limits? */
 		if (iblock < lblock) {
@@ -340,10 +334,7 @@ handle_hole:
 		bh->b_blocknr = -1UL;
 		clear_buffer_mapped(bh);
 handle_zblock:
-		kaddr = kmap_atomic(page, KM_USER0);
-		memset(kaddr + i * blocksize, 0, blocksize);
-		kunmap_atomic(kaddr, KM_USER0);
-		flush_dcache_page(page);
+		zero_user_page(page, i * blocksize, blocksize);
 		if (likely(!err))
 			set_buffer_uptodate(bh);
 	} while (i++, iblock++, (bh = bh->b_this_page) != head);
@@ -460,10 +451,7 @@ retry_readpage:
 	 * ok to ignore the compressed flag here.
 	 */
 	if (unlikely(page->index > 0)) {
-		kaddr = kmap_atomic(page, KM_USER0);
-		memset(kaddr, 0, PAGE_CACHE_SIZE);
-		flush_dcache_page(page);
-		kunmap_atomic(kaddr, KM_USER0);
+		zero_user_page(page, 0, PAGE_CACHE_SIZE);
 		goto done;
 	}
 	if (!NInoAttr(ni))
@@ -790,14 +778,9 @@ lock_retry_remap:
 			 * uptodate so it can get discarded by the VM.
 			 */
 			if (err == -ENOENT || lcn == LCN_ENOENT) {
-				u8 *kaddr;
-
 				bh->b_blocknr = -1;
 				clear_buffer_dirty(bh);
-				kaddr = kmap_atomic(page, KM_USER0);
-				memset(kaddr + bh_offset(bh), 0, blocksize);
-				kunmap_atomic(kaddr, KM_USER0);
-				flush_dcache_page(page);
+				zero_user_page(page, bh_offset(bh), blocksize);
 				set_buffer_uptodate(bh);
 				err = 0;
 				continue;
@@ -1422,10 +1405,7 @@ retry_writepage:
 	if (page->index >= (i_size >> PAGE_CACHE_SHIFT)) {
 		/* The page straddles i_size. */
 		unsigned int ofs = i_size & ~PAGE_CACHE_MASK;
-		kaddr = kmap_atomic(page, KM_USER0);
-		memset(kaddr + ofs, 0, PAGE_CACHE_SIZE - ofs);
-		kunmap_atomic(kaddr, KM_USER0);
-		flush_dcache_page(page);
+		zero_user_page(page, ofs, PAGE_CACHE_SIZE - ofs);
 	}
 	/* Handle mst protected attributes. */
 	if (NInoMstProtected(ni))
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ntfs/file.c linux-2.6.21-rc6-mm1-test/fs/ntfs/file.c
--- linux-2.6.21-rc6-mm1/fs/ntfs/file.c	2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ntfs/file.c	2007-04-09 18:18:23.0 -0700
@@ -606,11 +606,8 @@ do_next_page:
 					ntfs_submit_bh_for_read(bh);
 					*wait_bh++ = bh;
 				} else {
-					u8 *kaddr = kmap_atomic(page, KM_USER0);
-					memset(kaddr + bh_offset(bh), 0,
+					zero_user_page(page, bh_offset(bh),
 							blocksize);
-					kunmap_atomic(kaddr, KM_USER0);
-					flush_dcache_page(page);
 					set_buffer_uptodate(bh);
 				}
 			}
@@ -685,12 +682,8 @@ map_buffer_cached:
 						ntfs_submit_bh_for_read(bh);
 						*wait_bh++ = bh;
 					} else {
-						u8 *kaddr = kmap_atomic(page,
-								KM_USER0);
-						memset(kaddr + bh_offset(bh),
-								0, blocksize);
-						kunmap_atomic(kaddr, KM_USER0);
-
Re: ext3, BKL, journal replay, multiple non-bind mounts of same device
On Apr 10, 2007 20:49 -0400, John Anthony Kazos Jr. wrote:

Since it is possible for the same block device to be mounted multiple times concurrently by the same filesystem, and since ext3 explicitly disables the BKL during its fill_super operation which would prevent this, what is the result of mounting it multiple times this way? Especially if the filesystem is dirty and a journal is replayed. (In any case, what operation is being performed by ext3/ext4 that requires the BKL to be dropped? What's the need to even consider the BKL during fill_super?)

And in general, how does a filesystem deal with being mounted multiple times in this way? In my testing and exploration so far, everything seems to generally work, but I haven't tried deliberately using different instances of the mount concurrently. Do we end up with locks not being held properly on the superblock because the super_block structure instances don't know about each other? Has dealing with this behavior of bd_claim really been considered before, and if so, what's the general scheme for handling it?

It is a myth (that actually frightened me quite a bit when I first did it) that the filesystem is mounted twice in this case. The truth of the matter is if you mount -t ext3 /dev/ /mnt/1 and ... /mnt/2 you actually get the equivalent of a bind mount for this block device on the two mount points. You can see this easily because e.g. you don't get two kjournald threads for the two mounts, and it doesn't completely blow up.

If, on the other hand, you tried one mount with ext3 and another with ext4 it will fail the second with -EBUSY.

As for the BKL changes, your best bet is to go back through GIT and/or BK or search the mailing lists to see when and why that was added. It appears to have been 2.6.11, but I don't know why.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Re: [PATCH 5/13] ext4: use zero_user_page
On Apr 10, 2007 20:36 -0700, Nate Diller wrote:

Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]
To: Andrew Morton [EMAIL PROTECTED], Alexander Viro [EMAIL PROTECTED]
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

Would have been better to CC the filesystem maintainers directly (which was one of the reasons Andrew wanted per-fs patches, so they can be Ack/Nack independently).

Looks good in any case,
Signed-off-by: Andreas Dilger [EMAIL PROTECTED]

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/inode.c linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c
--- linux-2.6.21-rc6-mm1/fs/ext4/inode.c	2007-04-10 17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c	2007-04-10 18:33:04.0 -0700
@@ -1791,7 +1791,6 @@ int ext4_block_truncate_page(handle_t *h
 	struct inode *inode = mapping->host;
 	struct buffer_head *bh;
 	int err = 0;
-	void *kaddr;
 
 	if ((EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL) &&
 			test_opt(inode->i_sb, EXTENTS) &&
@@ -1808,10 +1807,7 @@ int ext4_block_truncate_page(handle_t *h
 	 */
 	if (!page_has_buffers(page) && test_opt(inode->i_sb, NOBH) &&
 	     ext4_should_writeback_data(inode) && PageUptodate(page)) {
-		kaddr = kmap_atomic(page, KM_USER0);
-		memset(kaddr + offset, 0, length);
-		flush_dcache_page(page);
-		kunmap_atomic(kaddr, KM_USER0);
+		zero_user_page(page, offset, length);
 		set_page_dirty(page);
 		goto unlock;
 	}
@@ -1864,11 +1860,7 @@ int ext4_block_truncate_page(handle_t *h
 			goto unlock;
 	}
 
-	kaddr = kmap_atomic(page, KM_USER0);
-	memset(kaddr + offset, 0, length);
-	flush_dcache_page(page);
-	kunmap_atomic(kaddr, KM_USER0);
-
+	zero_user_page(page, offset, length);
 	BUFFER_TRACE(bh, "zeroed end of block");
 
 	err = 0;
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/writeback.c linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c
--- linux-2.6.21-rc6-mm1/fs/ext4/writeback.c	2007-04-10 18:05:52.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c	2007-04-10 18:33:04.0 -0700
@@ -961,7 +961,6 @@ int ext4_wb_writepage(struct page *page,
 	loff_t i_size = i_size_read(inode);
 	pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
 	unsigned offset;
-	void *kaddr;
 
 	wb_debug("writepage %lu from inode %lu\n",
 			page->index, inode->i_ino);
@@ -1011,10 +1010,7 @@ int ext4_wb_writepage(struct page *page,
 	 * the page size, the remaining memory is zeroed when mapped, and
 	 * writes to that region are not written out to the file.
 	 */
-	kaddr = kmap_atomic(page, KM_USER0);
-	memset(kaddr + offset, 0, PAGE_CACHE_SIZE - offset);
-	flush_dcache_page(page);
-	kunmap_atomic(kaddr, KM_USER0);
+	zero_user_page(page, offset, PAGE_CACHE_SIZE - offset);
 	return ext4_wb_write_single_page(page, wbc);
 }
 
@@ -1065,7 +1061,6 @@ int ext4_wb_block_truncate_page(handle_t
 	struct inode *inode = mapping->host;
 	struct buffer_head bh, *bhw = &bh;
 	unsigned blocksize, length;
-	void *kaddr;
 	int err = 0;
 
 	wb_debug("partial truncate from %lu on page %lu from inode %lu\n",
@@ -1104,10 +1099,7 @@ int ext4_wb_block_truncate_page(handle_t
 		}
 	}
 
-	kaddr = kmap_atomic(page, KM_USER0);
-	memset(kaddr + offset, 0, length);
-	flush_dcache_page(page);
-	kunmap_atomic(kaddr, KM_USER0);
+	zero_user_page(page, offset, length);
 	SetPageUptodate(page);
 	__set_page_dirty_nobuffers(page);

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Re: [PATCH 1/13] fs: convert core functions to zero_user_page
On Tue, 10 Apr 2007 20:36:00 -0700 Nate Diller [EMAIL PROTECTED] wrote:

It's very common for file systems to need to zero part or all of a page; the simplest way is just to use kmap_atomic() and memset(). There's actually a library function in include/linux/highmem.h that does exactly that, but it's confusingly named memclear_highpage_flush(), which is descriptive of *how* it does the work rather than what the *purpose* is. So this patchset renames the function to zero_user_page(), and calls it from the various places that currently open code it.

This first patch introduces the new function call, and converts all the core kernel callsites, both the open-coded ones and the old memclear_highpage_flush() ones. Following this patch is a series of conversions for each file system individually, per AKPM, and finally a patch deprecating the old call.

For the reasons Anton identified, I think it is better design while we're here to force callers to pass in the kmap-type which they wish to use for the atomic kmap. It makes the programmer think about what he wants to happen. The price of getting this wrong tends to be revoltingly rare file corruption.

But we cannot make this change in the obvious fashion, because the KM_FOO identifiers are undefined if CONFIG_HIGHMEM=n. So

	zero_user_page(page, 1, 2, KM_USER0);

won't compile on non-highmem. So we are forced to use a macro, like below.

Also, you forgot to mark memclear_highpage_flush() __deprecated. And I'm surprised that this:

+static inline void memclear_highpage_flush(struct page *page, unsigned int offset, unsigned int size)
+{
+	return zero_user_page(page, offset, size);
+}

compiled. zero_user_page() returns void...
 drivers/block/loop.c    |    2 +-
 fs/buffer.c             |   21 -
 fs/direct-io.c          |    2 +-
 fs/mpage.c              |    6 --
 include/linux/highmem.h |   29 +
 mm/filemap_xip.c        |    2 +-
 6 files changed, 36 insertions(+), 26 deletions(-)

diff -puN drivers/block/loop.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type drivers/block/loop.c
--- a/drivers/block/loop.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type
+++ a/drivers/block/loop.c
@@ -250,7 +250,7 @@ static int do_lo_send_aops(struct loop_d
 			 */
 			printk(KERN_ERR "loop: transfer error block %llu\n",
 			       (unsigned long long)index);
-			zero_user_page(page, offset, size);
+			zero_user_page(page, offset, size, KM_USER0);
 		}
 		flush_dcache_page(page);
 		ret = aops->commit_write(file, page, offset,
diff -puN fs/buffer.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type fs/buffer.c
--- a/fs/buffer.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type
+++ a/fs/buffer.c
@@ -1855,7 +1855,7 @@ static int __block_prepare_write(struct
 			break;
 		if (buffer_new(bh)) {
 			clear_buffer_new(bh);
-			zero_user_page(page, block_start, bh->b_size);
+			zero_user_page(page, block_start, bh->b_size, KM_USER0);
 			set_buffer_uptodate(bh);
 			mark_buffer_dirty(bh);
 		}
@@ -1943,7 +1943,8 @@ int block_read_full_page(struct page *pa
 				SetPageError(page);
 		}
 		if (!buffer_mapped(bh)) {
-			zero_user_page(page, i * blocksize, blocksize);
+			zero_user_page(page, i * blocksize, blocksize,
+					KM_USER0);
 			if (!err)
 				set_buffer_uptodate(bh);
 			continue;
@@ -2107,7 +2108,8 @@ int cont_prepare_write(struct page *page
 						PAGE_CACHE_SIZE, get_block);
 		if (status)
 			goto out_unmap;
-		zero_user_page(page, zerofrom, PAGE_CACHE_SIZE-zerofrom);
+		zero_user_page(page, zerofrom, PAGE_CACHE_SIZE - zerofrom,
+				KM_USER0);
 		generic_commit_write(NULL, new_page, zerofrom, PAGE_CACHE_SIZE);
 		unlock_page(new_page);
 		page_cache_release(new_page);
@@ -2134,7 +2136,7 @@ int cont_prepare_write(struct page *page
 	if (status)
 		goto out1;
 	if (zerofrom < offset) {
-		zero_user_page(page, zerofrom, offset-zerofrom);
+		zero_user_page(page, zerofrom, offset - zerofrom, KM_USER0);
 		__block_commit_write(inode, page, zerofrom, offset);
 	}
 	return 0;
@@ -2333,7 +2335,7 @@ failed:
 	 * Error recovery is pretty slack. Clear the page and mark it
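The reason a macro sidesteps the undefined-KM_FOO problem can be shown in user space. The sketch below is not the kernel's header: every `sketch_` name is a hypothetical stand-in, and it assumes that a non-highmem build maps and unmaps pages through macros that never evaluate their kmap-type argument. Under that assumption the preprocessor drops the `KM_USER0` token before the compiler ever sees it, so the identifier does not need to exist. An inline function could not do this, because its arguments are evaluated at the call site.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct page. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
	int dcache_flushes;	/* counts sketch_flush_dcache_page() calls */
};

/*
 * Non-highmem-style mocks (an assumption of this sketch): the km_type
 * token is discarded during macro expansion, so it may be undefined.
 */
#define sketch_kmap_atomic(page, km_type)	((void *)(page)->data)
#define sketch_kunmap_atomic(addr, km_type)	((void)(addr))
#define sketch_flush_dcache_page(page)		((page)->dcache_flushes++)

/* Macro form of the helper: map, zero, flush the dcache, unmap. */
#define sketch_zero_user_page(page, offset, size, km_type)		\
	do {								\
		void *kaddr_;						\
									\
		assert((offset) + (size) <= SKETCH_PAGE_SIZE);		\
		kaddr_ = sketch_kmap_atomic(page, km_type);		\
		memset((char *)kaddr_ + (offset), 0, (size));		\
		sketch_flush_dcache_page(page);				\
		sketch_kunmap_atomic(kaddr_, km_type);			\
	} while (0)
```

Note that `KM_USER0` is deliberately left undefined here, yet `sketch_zero_user_page(&pg, 1, 2, KM_USER0);` still compiles and zeroes bytes 1 and 2 of `pg`.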
Re: [PATCH 13/13] fs: deprecate memclear_highpage_flush
On Tue, 10 Apr 2007 20:36:00 -0700 Nate Diller [EMAIL PROTECTED] wrote:

Now that all the in-tree users are converted over to zero_user_page(), deprecate the old memclear_highpage_flush() call.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/include/linux/highmem.h linux-2.6.21-rc6-mm1-test/include/linux/highmem.h
--- linux-2.6.21-rc6-mm1/include/linux/highmem.h	2007-04-10 18:32:41.0 -0700
+++ linux-2.6.21-rc6-mm1-test/include/linux/highmem.h	2007-04-10 19:40:14.0 -0700
@@ -149,6 +149,8 @@ static inline void zero_user_page(struct
 	kunmap_atomic(kaddr, KM_USER0);
 }
 
+static void memclear_highpage_flush(struct page *page, unsigned int offset,
+		unsigned int size) __deprecated;
 static inline void memclear_highpage_flush(struct page *page, unsigned int offset, unsigned int size)
 {
 	return zero_user_page(page, offset, size);

oh, there it is. one can stick the __deprecated at the end of the definition, actually.
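For readers who want to try the deprecation pattern outside the kernel, here is a rough user-space sketch. It assumes GCC or Clang, where an attribute like the kernel's __deprecated is spelled `__attribute__((deprecated))`; all `sketch_` names are hypothetical. It puts the attribute before the function name in a single definition, a placement both compilers are known to accept, and it drops the `return` in the wrapper, since ISO C does not allow returning an expression from a void function (GCC tolerates it as an extension).

```c
#include <assert.h>
#include <string.h>

/* Assumption: GCC/Clang spelling of a "deprecated" marker. */
#define sketch_deprecated __attribute__((deprecated))

/* Hypothetical stand-in for struct page. */
struct sketch_page {
	unsigned char data[64];
};

/* New name: stand-in for zero_user_page(). */
static inline void sketch_zero_user_page(struct sketch_page *page,
					 unsigned int offset,
					 unsigned int size)
{
	memset(page->data + offset, 0, size);
}

/* Old name kept as a forwarding wrapper; callers get a compile-time warning. */
static inline void sketch_deprecated
sketch_memclear_highpage_flush(struct sketch_page *page, unsigned int offset,
			       unsigned int size)
{
	sketch_zero_user_page(page, offset, size);
}

/*
 * A hypothetical out-of-tree caller that has not been converted yet.
 * The pragmas only silence the warning so the sketch builds cleanly;
 * without them the call still compiles, but with a deprecation warning.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
static void sketch_legacy_caller(struct sketch_page *page)
{
	sketch_memclear_highpage_flush(page, 8, 16);
}
#pragma GCC diagnostic pop
```

This is the trade-off discussed in the thread: old callers keep building during the transition window, and the warning tells their maintainers to move to the new name before the wrapper is removed.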