Re: [PATCH 1/2] fs: use memclear_highpage_flush to zero page data

2007-04-10 Thread Andrew Morton
On Mon, 09 Apr 2007 21:31:37 -0700 Nate Diller [EMAIL PROTECTED] wrote:

 It's very common for file systems to need to zero part or all of a page; the
 simplest way is just to use kmap_atomic() and memset().  There's actually a
 library function in include/linux/highmem.h that does exactly that, but it's
 confusingly named memclear_highpage_flush(), which is descriptive of *how*
 it does the work rather than what the *purpose* is.  So this patch renames
 the function to zero_page_data(), and calls it from the various places that
 currently open code it.
 
 Compile tested in x86_64.
 
 signed-off-by: Nate Diller [EMAIL PROTECTED]
 
 ---
 
  drivers/block/loop.c                     |    6 ---
  fs/affs/file.c                           |    6 ---
  fs/buffer.c                              |   53 +--
  fs/direct-io.c                           |    8 +---
  fs/ecryptfs/mmap.c                       |   14 +---
  fs/ext3/inode.c                          |   12 +--
  fs/ext4/inode.c                          |   12 +--
  fs/ext4/writeback.c                      |   12 +--
  fs/gfs2/bmap.c                           |    6 ---
  fs/mpage.c                               |   11 +-
  fs/nfs/read.c                            |   10 ++---
  fs/nfs/write.c                           |    2 -
  fs/ntfs/aops.c                           |   32 +++---
  fs/ntfs/file.c                           |   47 +--
  fs/ocfs2/aops.c                          |    5 --
  fs/reiser4/plugin/file/cryptcompress.c   |   19 +--
  fs/reiser4/plugin/file/file.c            |    6 ---
  fs/reiser4/plugin/item/ctail.c           |    6 ---
  fs/reiser4/plugin/item/extent_file_ops.c |   19 +++
  fs/reiser4/plugin/item/tail.c            |    8 +---
  fs/reiserfs/file.c                       |   39 ++
  fs/reiserfs/inode.c                      |   13 +--
  fs/xfs/linux-2.6/xfs_lrw.c               |    2 -
  include/linux/highmem.h                  |    2 -
  mm/filemap_xip.c                         |    7
  mm/truncate.c                            |    2 -
  26 files changed, 78 insertions(+), 281 deletions(-)
 

Not sure that I agree with the name zero_page_data().  People might use it
to, err, zero a page's data.  Whereas it is really only for use against
*user* pages.   zero_user_page(), perhaps.

Plus..

This patch as presented causes me surprising amounts of trouble.  I need to
split it up into

  - core plus filesystems which don't have maintainers (for me to merge)

  - filesystems which do have maintainers (one patch per), for
maintainers to merge.

  - another patch for reiser4, to remain in -mm.

And this is actually not possible to do, because my merge and the subsystem
maintainers' merges will happen at different times.  In the intervening
window, the kernel won't compile.

So instead I need to

  - split off the reiser4 bit

  - get acks from fs maintainers on the rest

  - merge the whole thing in one hit (minus reiser4)

And I can do that, but it is the less preferable option.


The better way to do this merge is:

patch #1:

static inline void memclear_highpage_flush(...) __deprecated
{
	zero_user_page(...);
}

patch #2..n:  convert filesystems.


then, when all filesystems are converted, we're ready to remove
memclear_highpage_flush().  But we do that six months later - let's not
screw out-of-tree fs maintainers (and their users) unnecessarily.
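
A rough userspace illustration of the staged rename described above: the old name survives as a deprecated inline wrapper around the new one, so unconverted callers keep building while they are migrated. This is only a sketch under stated assumptions. The page is modelled as a plain byte buffer, `SKETCH_PAGE_SIZE` and the argument lists are made up (the mail elides them with `...`), and `__attribute__((deprecated))` stands in for the kernel's `__deprecated` macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a "page" is modelled as a plain buffer of this size. */
#define SKETCH_PAGE_SIZE 4096

/* New name: zero `size` bytes of the page starting at `offset`. */
static inline void zero_user_page(unsigned char *page, size_t offset,
				  size_t size)
{
	assert(offset <= SKETCH_PAGE_SIZE && size <= SKETCH_PAGE_SIZE - offset);
	memset(page + offset, 0, size);
}

/*
 * Old name, kept as a thin wrapper so unconverted callers still build.
 * The attribute makes GCC and Clang warn at each remaining call site.
 */
__attribute__((deprecated))
static inline void memclear_highpage_flush(unsigned char *page,
					   size_t offset, size_t size)
{
	zero_user_page(page, offset, size);
}
```

Callers converted to zero_user_page() compile silently; any caller still using the old name gets a deprecation warning until the wrapper is finally removed.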

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 0/8] unprivileged mount syscall

2007-04-10 Thread Ram Pai
On Mon, 2007-04-09 at 22:10 +0200, Miklos Szeredi wrote:
   The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken.  Are
   you interested in the details?  I can reproduce it, but forgot to note
   down the details of the brokenness.
  
  I don't know how far removed that is from the one being used by redhat,
  but assuming it's the same, then redhat-lspp@redhat.com will be
  very interested.
 
 OK.
 
- user namespace setup: what if user has multiple sessions?
   
  1) namespaces are shared?  That's tricky because the session needs to
  be a child of a namespace server, not of login.  I'm not sure PAM
  can handle this
   
  2) or mounts are copied on login?  That's not possible currently,
  as there's no way to send a mount between namespaces.  Also it's
  tricky to make sure that new mounts are also shared
  
  See toward the end of the 'shared subtrees' OLS paper from last year for
  a suggestion on how to let users effectively 'log in to' an existing
  private mounts ns.
 
 This?
 
   1. create a new namespace
   2. bind /share/$USER to /share
   3. for each pair ($who, $what) such that
  /share/$USER/$who/$what exists, look
  in /share/$who/allowed for peer $what
  $USER or slave $what $USER. If the
  former is found, rbind /share/$who/$what
  on /share/$USER/$who/$what; if the
  latter is found, do the same and
  follow with marking subtree under
  /share/$USER/$who/$what as slave.
   4. rbind /share/$USER to /share
   5. mark subtree under /share as private.
   6. umount -l /share
 
 Well, someone please explain using short words, because I don't
 understand at all.

I am trying to reconstruct Viro's thoughts.  I think the steps outlined
above, though not accurate, are still insightful.

The idea is that there is one master namespace which has, under /share,
a replica of the mount trees of the namespaces belonging to all users.

For example, if there are two users A and B, then in the master namespace
under /share you will find /share/A and /share/B, each reflecting the
mount tree for the namespaces belonging to user-A and user-B
respectively.

Note: /share is a shared mount-tree, which means it can propagate mount
events.

Every time the user logs on to the machine, a new namespace is created
which is a clone of the master namespace. In this new namespace,
/share/$user is made the root of the namespace. Also, if other users
have made part of their namespace available to this user, then those
mounts are also brought under this namespace. And finally the entire
tree under /share is unmounted.

Note, though multiple namespaces can exist simultaneously for the same
user, the user is provided the illusion of per-process-namespace since
all the namespaces look identical.  

I am trying to rewrite the steps outlined above, which may or may not
reflect Viro's thoughts, but certainly reflect my reconstruction of
viro's thoughts.

1. clone the master namespace.

2. in the new namespace

	move the tree under /share/$me to /
	for each ($user, $what, $how) {
		move /share/$user/$what to /$what
		if ($how == slave) {
			make the mount tree under /$what as slave
		}
	}

3. in the new namespace make the tree under
   /share as private and unmount /share
  
 
   
RP


 
 Thanks,
 Miklos



Re: [patch 0/8] unprivileged mount syscall

2007-04-10 Thread Ian Kent
On Fri, 2007-04-06 at 16:16 -0700, H. Peter Anvin wrote:
 
  - users can use bind mounts without having to pre-configure them in
/etc/fstab
 
 
 This is by far the biggest concern I see.  I think the security 
 implications of allowing anyone to do bind mounts are poorly understood.

And especially so since there is no way for a filesystem module to veto
such requests.

Ian




Re: [PATCH 1/2] fs: use memclear_highpage_flush to zero page data

2007-04-10 Thread Anton Altaparmakov

On 10 Apr 2007, at 07:10, Andrew Morton wrote:
[snip]

Nate, I think you either do not understand what the KM_* constants
passed to kmap_atomic() mean or you were overeager in your code
replacement...  You really, really cannot replace KM_BIO_SRC_IRQ with
KM_USER0 in the NTFS i/o completion handler without trashing people's
data left, right and centre!
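
A toy userspace model of why the slot matters. It assumes, as on highmem configurations of kernels from this era, that each KM_* constant names one per-CPU mapping slot and that a second kmap_atomic() on the same slot silently replaces the first. Everything prefixed `toy_` is a hypothetical stand-in, and the "interrupt" is just an ordinary function call made while the first mapping is still in use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: one mapping slot per type; the names mirror the kernel's. */
enum km_type { KM_USER0, KM_BIO_SRC_IRQ, KM_TYPE_NR };

#define TOY_PAGE_SIZE 64

struct toy_page { unsigned char data[TOY_PAGE_SIZE]; };

/* Which page each slot currently points at (NULL when unused). */
static struct toy_page *slot[KM_TYPE_NR];

static void toy_kmap_atomic(struct toy_page *page, enum km_type type)
{
	slot[type] = page;	/* silently replaces any earlier mapping */
}

static void toy_kunmap_atomic(enum km_type type)
{
	slot[type] = NULL;
}

/* All access goes through the slot, as it would through a fixed address. */
static void toy_memset_slot(enum km_type type, size_t off, int c, size_t len)
{
	assert(slot[type] != NULL && off + len <= TOY_PAGE_SIZE);
	memset(slot[type]->data + off, c, len);
}

/* Simulated I/O completion handler: zeroes a page using the given slot. */
static void toy_irq_zero(struct toy_page *page, enum km_type type)
{
	toy_kmap_atomic(page, type);
	toy_memset_slot(type, 0, 0, TOY_PAGE_SIZE);
	/*
	 * The mapping is deliberately left in place so the model shows the
	 * misdirected write below instead of crashing on an empty slot.
	 */
}

/*
 * Process context fills `user` with 0xAA through KM_USER0; the "interrupt"
 * arrives halfway through. Returns how many bytes of `user` ended up 0xAA.
 */
static size_t toy_fill_with_irq(struct toy_page *user, struct toy_page *bio,
				enum km_type irq_type)
{
	size_t i, good = 0;

	toy_kmap_atomic(user, KM_USER0);
	toy_memset_slot(KM_USER0, 0, 0xAA, TOY_PAGE_SIZE / 2);
	toy_irq_zero(bio, irq_type);	/* interrupt fires here */
	toy_memset_slot(KM_USER0, TOY_PAGE_SIZE / 2, 0xAA, TOY_PAGE_SIZE / 2);
	toy_kunmap_atomic(KM_USER0);
	if (irq_type != KM_USER0)
		toy_kunmap_atomic(irq_type);

	for (i = 0; i < TOY_PAGE_SIZE; i++)
		good += (user->data[i] == 0xAA);
	return good;
}
```

With a distinct interrupt slot the user page is filled completely. When the handler reuses KM_USER0, the second half of the fill lands in the handler's page instead, which is the kind of corruption the warning above is about.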


Best regards,

Anton
--
Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/




Re: [patch 0/8] unprivileged mount syscall

2007-04-10 Thread Karel Zak
On Mon, Apr 09, 2007 at 10:46:25AM -0700, Ram Pai wrote:
 On Mon, 2007-04-09 at 12:07 -0500, Serge E. Hallyn wrote:
  Quoting Miklos Szeredi ([EMAIL PROTECTED]):
 
- need to set up mount propagation from global namespace to private
  ones, mount(8) does not yet have options to configure propagation
  
  Hmm, I guess I get lost using my own little systems, and just assumed
  that shared subtree functionality was making its way up into mount(8).
  Ram, have you been working on that?
 
 It is in FC6. I don't know the status of upstream util-linux. I did
 submit the patch many times to Adrian Bunk (the then util-linux
 maintainer) and got no response. I have not pushed the patches to the
 new maintainer (Karel Zak?) though.

 The shared-subtree patch has been applied:
 
http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commitdiff;h=389fbea536e4308d9475fa2a89e53e188ce8a0e3;hp=939a997de0c761d29fb7530976ca20da4898703a

 
Karel

-- 
 Karel Zak  [EMAIL PROTECTED]


Linux 2007 File System & IO Workshop notes & talks

2007-04-10 Thread Ric Wheeler


We have some of the material reviewed and posted now from the IO & FS
workshop.


USENIX has posted the talks at:

http://www.usenix.org/events/lsf07/tech/tech.html

A write-up of the workshop went out at LWN and prompted a healthy discussion:

http://lwn.net/Articles/226351/

At that LWN article, there is a link to the Linux FS wiki with good notes:

http://linuxfs.pbwiki.com/LSF07-Workshop-Notes

Another summary will go out in the next USENIX ;login edition.

ric



Re: [GIT PULL -mm] Unionfs branch management code

2007-04-10 Thread Shaya Potter

Andrew Morton wrote:

On Mon,  9 Apr 2007 10:53:51 -0400 Josef 'Jeff' Sipek [EMAIL PROTECTED] 
wrote:


The following patches introduce new branch-management code into Unionfs as
well as fix a number of stability issues and resource leaks.


I have a mental note that unionfs is in the stuck state, due to general
agreement that we should implement this functionality at the VFS level, one
reason for which is unionfs's upper-vs-lower coherency problems.


How can a union file system with a decent set of useful semantics be 
fully implemented at the VFS layer in a clean manner?


For instance, a major use of unionfs is live CDs, namely unionfs w/ a 
read-only and read-write layer.  Unionfs enables files to be copied up 
from the read-only layer to the read-write layer.


Does one really want to implement copyup in the VFS?

just my 2 agarot.


Re: [GIT PULL -mm] Unionfs branch management code

2007-04-10 Thread Josef Sipek
On Tue, Apr 10, 2007 at 01:22:52PM -0400, Shaya Potter wrote:
 Andrew Morton wrote:
 On Mon,  9 Apr 2007 10:53:51 -0400 Josef 'Jeff' Sipek 
 [EMAIL PROTECTED] wrote:
 
 The following patches introduce new branch-management code into Unionfs as
 well as fix a number of stability issues and resource leaks.
 
 I have a mental note that unionfs is in the stuck state, due to general
 agreement that we should implement this functionality at the VFS level, one
 reason for which is unionfs's upper-vs-lower coherency problems.
 
 How can a union file system with a decent set of useful semantics be 
 fully implemented at the VFS layer in a clean manner?

Unioning is quite odd. It uses concepts, some of which do indeed belong in
the VFS (like actual merging of the lower directories), but others that most
definitely do not (like whiteouts).

 For instance, a major use of unionfs is live CDs, namely unionfs w/ a 
 read-only and read-write layer.  Unionfs enables files to be copied up 
 from the read-only layer to the read-write layer.
 
 Does one really want to implement copyup in the VFS?
 
I don't think that copyup is the problem, but whiteouts...oh boy.
Whiteouts/some kind of persistent storage is most definitely a filesystem
construct, and it does not belong in the VFS.

Josef Jeff Sipek.

-- 
If I have trouble installing Linux, something is wrong. Very wrong.
- Linus Torvalds


Re: [PATCH 1/2] fs: use memclear_highpage_flush to zero page data

2007-04-10 Thread Nate Diller

On 4/10/07, Anton Altaparmakov [EMAIL PROTECTED] wrote:

[snip]

Nate, I think you either do not understand what the KM_* constants
passed to kmap_atomic() mean or you were overeager in your code
replacement...  You really, really cannot replace KM_BIO_SRC_IRQ with
KM_USER0 in the NTFS i/o completion handler without trashing people's
data left, right and centre!


good catch, I was indeed careless on that one.  I just double checked
all the other changes and that was the only non-KM_USER0 that slipped
through.  Thanks!

I will submit a new patch later today that fixes this problem and the
issues AKPM raised.

NATE


Re: [PATCH 8/17] locks: add fl_notify arguments for asynchronous lock return

2007-04-10 Thread J. Bruce Fields
On Mon, Apr 09, 2007 at 07:40:41PM +0100, Christoph Hellwig wrote:
 On Thu, Apr 05, 2007 at 07:40:58PM -0400, J. Bruce Fields wrote:
  We're using fl_notify to asynchronously return the result of a lock
  request.  So we want fl_notify to be able to return a status and, if
  appropriate, a conflicting lock.
  
  The only current caller of fl_notify is in the blocked case, in which case
  we don't use these extra arguments.
  
  We also allow fl_notify to return an error.  (Also ignored for now.)
 
 I don't really like the overload of fl_notify.  What's the reason not
 to use a separate callback?

My vague memory is that Trond said something to the effect of "fl_notify
is there, let's use it rather than adding yet another callback."

But our new usage of fl_notify does require slightly different
arguments and returns, and is used in a subtly different case.  So I
wouldn't object to a new callback.  Trond?

--b.


Re: [PATCH 9/17] locks: add lock cancel command

2007-04-10 Thread J. Bruce Fields
On Mon, Apr 09, 2007 at 07:41:44PM +0100, Christoph Hellwig wrote:
 On Thu, Apr 05, 2007 at 07:40:59PM -0400, J. Bruce Fields wrote:
  We do this by adding a new fcntl lock command: FL_CANCELLK.  Some day this
  might also be made available to userspace applications that could benefit
  from an asynchronous locking api.
 
 Should we really add more and more subcases to ->lock that probably don't
 share implementation code?  I'd much prefer adding different operations.

That'd be OK.  We considered both--

http://marc.info/?l=linux-fsdevel&m=116616992004056&w=2

--but chose a new ->lock case just because that might provide a cleaner
mapping to the userspace interface if we ended up doing that some day.

Is there any hard reason why it wouldn't work?

--b.


Re: [PATCH 12/17] lockd: handle fl_notify callbacks

2007-04-10 Thread J. Bruce Fields
On Mon, Apr 09, 2007 at 07:43:52PM +0100, Christoph Hellwig wrote:
 On Thu, Apr 05, 2007 at 07:41:02PM -0400, J. Bruce Fields wrote:
  +	if (block->b_fl)
  +		kfree(block->b_fl);
 
 kfree(NULL) is fine.

Whoops, thanks, will fix.
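
The same rule holds for userspace free(): the C standard defines free(NULL) as a no-op, so the guard can simply be dropped. A minimal sketch, in which `struct toy_block` and `toy_block_release()` are hypothetical stand-ins for the lockd structures rather than anything from the patch:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nlm_block; only the field used here. */
struct toy_block {
	int *b_fl;	/* may legitimately be NULL */
};

/* Release the optional member without testing it first. */
static void toy_block_release(struct toy_block *block)
{
	assert(block != NULL);
	free(block->b_fl);	/* free(NULL) is defined to do nothing */
	block->b_fl = NULL;
}
```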

  +static void
  +nlmsvc_update_deferred_block(struct nlm_block *block, struct file_lock *conf,
  +				int result)
  +{
  +	block->b_flags |= B_GOT_CALLBACK;
  +	if (result == 0)
  +		block->b_granted = 1;
  +	else
  +		block->b_flags |= B_TOO_LATE;
  +	if (conf) {
  +		block->b_fl = kzalloc(sizeof(struct file_lock), GFP_KERNEL);
  +		if (block->b_fl)
  +			locks_copy_lock(block->b_fl, conf);
  +	}
  +}
 
 Shouldn't there be a way to propagate errors back to the caller when
 the kzalloc fails?

That's fixed in a later patch, so there may be a problem with how I
split up those patches--I'll check.
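
One possible shape for such error propagation, sketched in userspace with calloc() in place of kzalloc(). This is not the later patch referred to above: `struct toy_lock`, `struct toy_block` and `toy_update_deferred_block()` are hypothetical, and returning -ENOMEM before touching any state is an assumption about what a caller would want.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_lock { int start, end; };

struct toy_block {
	int granted;
	struct toy_lock *conflict;	/* copy of the conflicting lock, if any */
};

/*
 * Record the outcome of a deferred lock request. Returns 0 on success or
 * -ENOMEM if the conflicting lock could not be copied; on failure the
 * block is left unchanged so the caller can retry or report the error.
 */
static int toy_update_deferred_block(struct toy_block *block,
				     const struct toy_lock *conf, int result)
{
	struct toy_lock *copy = NULL;

	assert(block != NULL);
	if (conf) {
		copy = calloc(1, sizeof(*copy));
		if (!copy)
			return -ENOMEM;
		*copy = *conf;
	}
	free(block->conflict);	/* drop any previously recorded conflict */
	block->conflict = copy;
	block->granted = (result == 0);
	return 0;
}
```

Doing the allocation first means a failure leaves the block exactly as it was, which keeps the error path simple for whoever consumes the return value.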

--b.


[PATCH 1/13] fs: convert core functions to zero_user_page

2007-04-10 Thread Nate Diller
It's very common for file systems to need to zero part or all of a page; the
simplest way is just to use kmap_atomic() and memset().  There's actually a
library function in include/linux/highmem.h that does exactly that, but it's
confusingly named memclear_highpage_flush(), which is descriptive of *how*
it does the work rather than what the *purpose* is.  So this patchset
renames the function to zero_user_page(), and calls it from the various
places that currently open code it.

This first patch introduces the new function call, and converts all the core
kernel callsites, both the open-coded ones and the old
memclear_highpage_flush() ones.  Following this patch is a series of
conversions for each file system individually, per AKPM, and finally a patch
deprecating the old call.  The diffstat below shows the entire patchset.
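
A minimal userspace sketch of the pattern this series consolidates: map the page, clear a byte range, flush, unmap. It is not the code from the patch. `struct fake_page`, `fake_kmap_atomic()`, `fake_kunmap_atomic()` and `fake_flush_dcache_page()` are hypothetical stand-ins for struct page, kmap_atomic(), kunmap_atomic() and flush_dcache_page(), and the counters exist only so the sequence can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_PAGE_SIZE 4096

/* Hypothetical stand-in for struct page: the data plus some bookkeeping. */
struct fake_page {
	unsigned char data[FAKE_PAGE_SIZE];
	int mapped;	/* nonzero while a temporary mapping is live */
	int flushes;	/* number of simulated dcache flushes */
};

/* Stand-in for kmap_atomic(): hand out a temporary address for the page. */
static unsigned char *fake_kmap_atomic(struct fake_page *page)
{
	page->mapped++;
	return page->data;
}

/* Stand-in for kunmap_atomic(): drop the temporary mapping. */
static void fake_kunmap_atomic(struct fake_page *page)
{
	page->mapped--;
}

/* Stand-in for flush_dcache_page(). */
static void fake_flush_dcache_page(struct fake_page *page)
{
	page->flushes++;
}

/* The helper: map, clear the byte range, flush, unmap. */
static void zero_user_page(struct fake_page *page, size_t offset, size_t size)
{
	unsigned char *kaddr;

	assert(offset <= FAKE_PAGE_SIZE && size <= FAKE_PAGE_SIZE - offset);
	kaddr = fake_kmap_atomic(page);
	memset(kaddr + offset, 0, size);
	fake_flush_dcache_page(page);
	fake_kunmap_atomic(page);
}
```

Each converted call site then shrinks from four or five open-coded lines to a single call, which is where most of the deletions in the diffstat below come from.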

Compile tested in x86_64.

signed-off-by: Nate Diller [EMAIL PROTECTED]

---

 drivers/block/loop.c                     |    6 ---
 fs/affs/file.c                           |    6 ---
 fs/buffer.c                              |   53 +--
 fs/direct-io.c                           |    8 +---
 fs/ecryptfs/mmap.c                       |   14 +---
 fs/ext3/inode.c                          |   12 +--
 fs/ext4/inode.c                          |   12 +--
 fs/ext4/writeback.c                      |   12 +--
 fs/gfs2/bmap.c                           |    6 ---
 fs/mpage.c                               |   11 +-
 fs/nfs/read.c                            |   10 ++---
 fs/nfs/write.c                           |    2 -
 fs/ntfs/aops.c                           |   26 ++-
 fs/ntfs/file.c                           |   47 +--
 fs/ocfs2/aops.c                          |    5 --
 fs/reiser4/plugin/file/cryptcompress.c   |   19 +--
 fs/reiser4/plugin/file/file.c            |    6 ---
 fs/reiser4/plugin/item/ctail.c           |    6 ---
 fs/reiser4/plugin/item/extent_file_ops.c |   19 +++
 fs/reiser4/plugin/item/tail.c            |    8 +---
 fs/reiserfs/file.c                       |   39 ++
 fs/reiserfs/inode.c                      |   13 +--
 fs/xfs/linux-2.6/xfs_lrw.c               |    2 -
 include/linux/highmem.h                  |    7 +++-
 mm/filemap_xip.c                         |    7
 mm/truncate.c                            |    2 -
 26 files changed, 82 insertions(+), 276 deletions(-)

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/drivers/block/loop.c linux-2.6.21-rc6-mm1-test/drivers/block/loop.c
--- linux-2.6.21-rc6-mm1/drivers/block/loop.c	2007-04-10 18:27:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/drivers/block/loop.c	2007-04-10 18:18:16.0 -0700
@@ -244,17 +244,13 @@ static int do_lo_send_aops(struct loop_d
 		transfer_result = lo_do_transfer(lo, WRITE, page, offset,
 				bvec->bv_page, bv_offs, size, IV);
 		if (unlikely(transfer_result)) {
-			char *kaddr;
-
 			/*
 			 * The transfer failed, but we still write the data to
 			 * keep prepare/commit calls balanced.
 			 */
 			printk(KERN_ERR "loop: transfer error block %llu\n",
 			       (unsigned long long)index);
-			kaddr = kmap_atomic(page, KM_USER0);
-			memset(kaddr + offset, 0, size);
-			kunmap_atomic(kaddr, KM_USER0);
+			zero_user_page(page, offset, size);
 		}
 		flush_dcache_page(page);
 		ret = aops->commit_write(file, page, offset,
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/buffer.c linux-2.6.21-rc6-mm1-test/fs/buffer.c
--- linux-2.6.21-rc6-mm1/fs/buffer.c	2007-04-10 18:27:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/buffer.c	2007-04-10 18:18:16.0 -0700
@@ -1862,13 +1862,8 @@ static int __block_prepare_write(struct 
 		if (block_start >= to)
 			break;
 		if (buffer_new(bh)) {
-			void *kaddr;
-
 			clear_buffer_new(bh);
-			kaddr = kmap_atomic(page, KM_USER0);
-			memset(kaddr+block_start, 0, bh->b_size);
-			flush_dcache_page(page);
-			kunmap_atomic(kaddr, KM_USER0);
+			zero_user_page(page, block_start, bh->b_size);
 			set_buffer_uptodate(bh);
 			mark_buffer_dirty(bh);
 		}
@@ -1956,10 +1951,7 @@ int block_read_full_page(struct page *pa
 					SetPageError(page);
 			}
 			if (!buffer_mapped(bh)) {
-				void *kaddr = kmap_atomic(page, KM_USER0);
-				memset(kaddr + i * 

[PATCH 2/13] affs: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/affs/file.c linux-2.6.21-rc6-mm1-test/fs/affs/file.c
--- linux-2.6.21-rc6-mm1/fs/affs/file.c	2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/affs/file.c	2007-04-09 18:18:23.0 -0700
@@ -628,11 +628,7 @@ static int affs_prepare_write_ofs(struct
 			return err;
 	}
 	if (to < PAGE_CACHE_SIZE) {
-		char *kaddr = kmap_atomic(page, KM_USER0);
-
-		memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
-		flush_dcache_page(page);
-		kunmap_atomic(kaddr, KM_USER0);
+		zero_user_page(page, to, PAGE_CACHE_SIZE - to);
 		if (size > offset + to) {
 			if (size < offset + PAGE_CACHE_SIZE)
 				tmp = size & ~PAGE_CACHE_MASK;


[PATCH 3/13] ecryptfs: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ecryptfs/mmap.c linux-2.6.21-rc6-mm1-test/fs/ecryptfs/mmap.c
--- linux-2.6.21-rc6-mm1/fs/ecryptfs/mmap.c	2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ecryptfs/mmap.c	2007-04-09 18:19:34.0 -0700
@@ -364,18 +364,14 @@ static int fill_zeros_to_end_of_page(str
 {
 	struct inode *inode = page->mapping->host;
 	int end_byte_in_page;
-	char *page_virt;
 
 	if ((i_size_read(inode) / PAGE_CACHE_SIZE) != page->index)
 		goto out;
 	end_byte_in_page = i_size_read(inode) % PAGE_CACHE_SIZE;
 	if (to > end_byte_in_page)
 		end_byte_in_page = to;
-	page_virt = kmap_atomic(page, KM_USER0);
-	memset((page_virt + end_byte_in_page), 0,
-	       (PAGE_CACHE_SIZE - end_byte_in_page));
-	kunmap_atomic(page_virt, KM_USER0);
-	flush_dcache_page(page);
+	zero_user_page(page, end_byte_in_page,
+		PAGE_CACHE_SIZE - end_byte_in_page);
 out:
 	return 0;
 }
@@ -740,7 +736,6 @@ int write_zeros(struct file *file, pgoff
 {
 	int rc = 0;
 	struct page *tmp_page;
-	char *tmp_page_virt;
 
 	tmp_page = ecryptfs_get1page(file, index);
 	if (IS_ERR(tmp_page)) {
@@ -757,10 +752,7 @@ int write_zeros(struct file *file, pgoff
 		page_cache_release(tmp_page);
 		goto out;
 	}
-	tmp_page_virt = kmap_atomic(tmp_page, KM_USER0);
-	memset(((char *)tmp_page_virt + start), 0, num_zeros);
-	kunmap_atomic(tmp_page_virt, KM_USER0);
-	flush_dcache_page(tmp_page);
+	zero_user_page(tmp_page, start, num_zeros);
 	rc = ecryptfs_commit_write(file, tmp_page, start, start + num_zeros);
 	if (rc < 0) {
 		ecryptfs_printk(KERN_ERR, "Error attempting to write zero's "


[PATCH 5/13] ext4: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it. 

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/inode.c 
linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c
--- linux-2.6.21-rc6-mm1/fs/ext4/inode.c2007-04-10 17:15:04.0 
-0700
+++ linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c   2007-04-10 18:33:04.0 
-0700
@@ -1791,7 +1791,6 @@ int ext4_block_truncate_page(handle_t *h
struct inode *inode = mapping-host;
struct buffer_head *bh;
int err = 0;
-   void *kaddr;
 
if ((EXT4_I(inode)-i_flags  EXT4_EXTENTS_FL) 
test_opt(inode-i_sb, EXTENTS) 
@@ -1808,10 +1807,7 @@ int ext4_block_truncate_page(handle_t *h
 */
if (!page_has_buffers(page)  test_opt(inode-i_sb, NOBH) 
 ext4_should_writeback_data(inode)  PageUptodate(page)) {
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + offset, 0, length);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, offset, length);
set_page_dirty(page);
goto unlock;
}
@@ -1864,11 +1860,7 @@ int ext4_block_truncate_page(handle_t *h
goto unlock;
}
 
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + offset, 0, length);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
-
+   zero_user_page(page, offset, length);
BUFFER_TRACE(bh, zeroed end of block);
 
err = 0;
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/writeback.c 
linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c
--- linux-2.6.21-rc6-mm1/fs/ext4/writeback.c2007-04-10 18:05:52.0 
-0700
+++ linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c   2007-04-10 
18:33:04.0 -0700
@@ -961,7 +961,6 @@ int ext4_wb_writepage(struct page *page,
loff_t i_size = i_size_read(inode);
pgoff_t end_index = i_size  PAGE_CACHE_SHIFT;
unsigned offset;
-   void *kaddr;
 
wb_debug(writepage %lu from inode %lu\n, page-index, inode-i_ino);
 
@@ -1011,10 +1010,7 @@ int ext4_wb_writepage(struct page *page,
 * the  page size, the remaining memory is zeroed when mapped, and
 * writes to that region are not written out to the file.
 */
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + offset, 0, PAGE_CACHE_SIZE - offset);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, offset, PAGE_CACHE_SIZE - offset);
return ext4_wb_write_single_page(page, wbc);
 }
 
@@ -1065,7 +1061,6 @@ int ext4_wb_block_truncate_page(handle_t
struct inode *inode = mapping-host;
struct buffer_head bh, *bhw = bh;
unsigned blocksize, length;
-   void *kaddr;
int err = 0;
 
wb_debug(partial truncate from %lu on page %lu from inode %lu\n,
@@ -1104,10 +1099,7 @@ int ext4_wb_block_truncate_page(handle_t
}
}
 
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + offset, 0, length);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, offset, length);
SetPageUptodate(page);
__set_page_dirty_nobuffers(page);
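
Several of the call sites converted above zero the tail of the page that straddles the end of the file. The offset arithmetic can be sketched in userspace as follows. This is only an illustration, assuming a 4096-byte page; every sketch_* name is hypothetical, and a plain memset() stands in for the kmap_atomic()/flush_dcache_page() work that the real helper performs.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Userspace stand-in for zero_user_page(): no kmap, no cache flush. */
static void sketch_zero_range(unsigned char *page, unsigned offset,
			      unsigned length)
{
	/* The range must stay inside one page. */
	assert(offset <= SKETCH_PAGE_SIZE &&
	       length <= SKETCH_PAGE_SIZE - offset);
	memset(page + offset, 0, length);
}

/*
 * Zero the part of the last page that lies beyond file_size.
 * Returns the number of bytes zeroed (0 if the file ends exactly
 * on a page boundary).
 */
static unsigned sketch_zero_tail(unsigned char *page,
				 unsigned long long file_size)
{
	unsigned offset = (unsigned)(file_size & (SKETCH_PAGE_SIZE - 1));

	if (offset == 0)
		return 0;
	sketch_zero_range(page, offset, SKETCH_PAGE_SIZE - offset);
	return SKETCH_PAGE_SIZE - offset;
}
```

For a file of three full pages plus 100 bytes, sketch_zero_tail() clears bytes 100 through 4095 of the fourth page and leaves the first 100 bytes alone.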
 
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/13] gfs2: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it. 

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/gfs2/bmap.c 
linux-2.6.21-rc6-mm1-test/fs/gfs2/bmap.c
--- linux-2.6.21-rc6-mm1/fs/gfs2/bmap.c 2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/gfs2/bmap.c2007-04-09 18:18:23.0 
-0700
@@ -885,7 +885,6 @@ static int gfs2_block_truncate_page(stru
unsigned blocksize, iblock, length, pos;
struct buffer_head *bh;
struct page *page;
-   void *kaddr;
int err;
 
page = grab_cache_page(mapping, index);
@@ -933,10 +932,7 @@ static int gfs2_block_truncate_page(stru
if (sdp->sd_args.ar_data == GFS2_DATA_ORDERED || gfs2_is_jdata(ip))
gfs2_trans_add_bh(ip->i_gl, bh, 0);
 
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + offset, 0, length);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, offset, length);
 
 unlock:
unlock_page(page);


[PATCH 7/13] nfs: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of the newly deprecated memclear_highpage_flush().

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/nfs/read.c 
linux-2.6.21-rc6-mm1-test/fs/nfs/read.c
--- linux-2.6.21-rc6-mm1/fs/nfs/read.c  2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/nfs/read.c 2007-04-09 18:18:23.0 
-0700
@@ -79,7 +79,7 @@ void nfs_readdata_release(void *data)
 static
 int nfs_return_empty_page(struct page *page)
 {
-   memclear_highpage_flush(page, 0, PAGE_CACHE_SIZE);
+   zero_user_page(page, 0, PAGE_CACHE_SIZE);
SetPageUptodate(page);
unlock_page(page);
return 0;
@@ -103,10 +103,10 @@ static void nfs_readpage_truncate_uninit
pglen = PAGE_CACHE_SIZE - base;
for (;;) {
if (remainder <= pglen) {
-   memclear_highpage_flush(*pages, base, remainder);
+   zero_user_page(*pages, base, remainder);
break;
}
-   memclear_highpage_flush(*pages, base, pglen);
+   zero_user_page(*pages, base, pglen);
pages++;
remainder -= pglen;
pglen = PAGE_CACHE_SIZE;
@@ -130,7 +130,7 @@ static int nfs_readpage_async(struct nfs
return PTR_ERR(new);
}
if (len < PAGE_CACHE_SIZE)
-   memclear_highpage_flush(page, len, PAGE_CACHE_SIZE - len);
+   zero_user_page(page, len, PAGE_CACHE_SIZE - len);
 
nfs_list_add_request(new, &one_request);
nfs_pagein_one(&one_request, inode);
@@ -561,7 +561,7 @@ readpage_async_filler(void *data, struct
return PTR_ERR(new);
}
if (len < PAGE_CACHE_SIZE)
-   memclear_highpage_flush(page, len, PAGE_CACHE_SIZE - len);
+   zero_user_page(page, len, PAGE_CACHE_SIZE - len);
nfs_list_add_request(new, desc->head);
return 0;
 }
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/nfs/write.c 
linux-2.6.21-rc6-mm1-test/fs/nfs/write.c
--- linux-2.6.21-rc6-mm1/fs/nfs/write.c 2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/nfs/write.c2007-04-09 18:18:23.0 
-0700
@@ -169,7 +169,7 @@ static void nfs_mark_uptodate(struct pag
if (count != nfs_page_length(page))
return;
if (count != PAGE_CACHE_SIZE)
-   memclear_highpage_flush(page, count, PAGE_CACHE_SIZE - count);
+   zero_user_page(page, count, PAGE_CACHE_SIZE - count);
SetPageUptodate(page);
 }
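
The loop in nfs_readpage_truncate_uninit... above zeroes a byte range that may start mid-page and run across several pages. A minimal userspace sketch of that loop shape follows, assuming 4096-byte pages held in an array of plain buffers. The sketch_* names are hypothetical and memset() replaces the real mapping and cache-flush steps.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Userspace stand-in for zero_user_page(): no kmap, no cache flush. */
static void sketch_zero_range(unsigned char *page, unsigned offset,
			      unsigned length)
{
	assert(offset <= SKETCH_PAGE_SIZE &&
	       length <= SKETCH_PAGE_SIZE - offset);
	memset(page + offset, 0, length);
}

/*
 * Zero `remainder` bytes starting at byte `base` of pages[0],
 * continuing at offset 0 of each following page as needed.
 */
static void sketch_zero_across_pages(unsigned char **pages, unsigned base,
				     unsigned remainder)
{
	unsigned pglen = SKETCH_PAGE_SIZE - base;	/* room left in first page */

	if (remainder == 0)
		return;
	for (;;) {
		if (remainder <= pglen) {
			/* The rest fits in the current page: finish here. */
			sketch_zero_range(*pages, base, remainder);
			break;
		}
		/* Clear to the end of this page, then move to the next one. */
		sketch_zero_range(*pages, base, pglen);
		pages++;
		remainder -= pglen;
		pglen = SKETCH_PAGE_SIZE;
		base = 0;
	}
}
```

Only the first page uses a non-zero base; every later page is cleared from offset 0, which is why base and pglen are reset inside the loop.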
 


[PATCH 11/13] reiserfs: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it. 

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiserfs/file.c 
linux-2.6.21-rc6-mm1-test/fs/reiserfs/file.c
--- linux-2.6.21-rc6-mm1/fs/reiserfs/file.c 2007-04-09 17:24:03.0 
-0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiserfs/file.c2007-04-09 
18:18:23.0 -0700
@@ -1059,20 +1059,12 @@ static int reiserfs_prepare_file_region_
   maping blocks, since there is none, so we just zero out remaining
   parts of first and last pages in write area (if needed) */
if ((pos & ~((loff_t) PAGE_CACHE_SIZE - 1)) > inode->i_size) {
-   if (from != 0) {/* First page needs to be partially zeroed */
-   char *kaddr = kmap_atomic(prepared_pages[0], KM_USER0);
-   memset(kaddr, 0, from);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(prepared_pages[0]);
-   }
-   if (to != PAGE_CACHE_SIZE) {/* Last page needs to be partially zeroed */
-   char *kaddr =
-   kmap_atomic(prepared_pages[num_pages - 1],
-   KM_USER0);
-   memset(kaddr + to, 0, PAGE_CACHE_SIZE - to);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(prepared_pages[num_pages - 1]);
-   }
+   if (from != 0)  /* First page needs to be partially zeroed */
+   zero_user_page(prepared_pages[0], 0, from);
+
+   if (to != PAGE_CACHE_SIZE)  /* Last page needs to be partially zeroed */
+   zero_user_page(prepared_pages[num_pages-1], to,
+   PAGE_CACHE_SIZE - to);
 
/* Since all blocks are new - use already calculated value */
return blocks;
@@ -1199,13 +1191,9 @@ static int reiserfs_prepare_file_region_
ll_rw_block(READ, 1, &bh);
*wait_bh++ = bh;
} else {/* Not mapped, zero it */
-   char *kaddr =
-   kmap_atomic(prepared_pages[0],
-   KM_USER0);
-   memset(kaddr + block_start, 0,
-  from - block_start);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(prepared_pages[0]);
+   zero_user_page(prepared_pages[0],
+  block_start,
+  from - block_start);
set_buffer_uptodate(bh);
}
}
@@ -1237,13 +1225,8 @@ static int reiserfs_prepare_file_region_
ll_rw_block(READ, 1, &bh);
*wait_bh++ = bh;
} else {/* Not mapped, zero it */
-   char *kaddr =
-   kmap_atomic(prepared_pages
-   [num_pages - 1],
-   KM_USER0);
-   memset(kaddr + to, 0, block_end - to);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(prepared_pages[num_pages - 1]);
+   zero_user_page(prepared_pages[num_pages-1],
+   to, block_end - to);
set_buffer_uptodate(bh);
}
}
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiserfs/inode.c 
linux-2.6.21-rc6-mm1-test/fs/reiserfs/inode.c
--- linux-2.6.21-rc6-mm1/fs/reiserfs/inode.c2007-04-09 10:41:47.0 
-0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiserfs/inode.c   2007-04-09 
18:18:23.0 -0700
@@ -2148,13 +2148,8 @@ int reiserfs_truncate_file(struct inode 
length = offset & (blocksize - 1);
/* if we are not on a block boundary */
if (length) {
-   char *kaddr;
-
length = blocksize - length;
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + offset, 0, length);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, offset, 

[PATCH 10/13] reiser4: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it.  Also replace the (mostly)
redundant zero_page() function.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff 
linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/cryptcompress.c 
linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/cryptcompress.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/cryptcompress.c 2007-04-10 
17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/cryptcompress.c
2007-04-10 18:35:44.0 -0700
@@ -1897,7 +1897,6 @@ static int
 write_hole(struct inode *inode, reiser4_cluster_t * clust, loff_t file_off,
   loff_t to_file)
 {
-   char *data;
int result = 0;
unsigned cl_off, cl_count = 0;
unsigned to_pg, pg_off;
@@ -1934,10 +1933,7 @@ write_hole(struct inode *inode, reiser4_
 
to_pg = min_count(PAGE_CACHE_SIZE - pg_off, cl_count);
lock_page(page);
-   data = kmap_atomic(page, KM_USER0);
-   memset(data + pg_off, 0, to_pg);
-   flush_dcache_page(page);
-   kunmap_atomic(data, KM_USER0);
+   zero_user_page(page, pg_off, to_pg);
SetPageUptodate(page);
unlock_page(page);
 
@@ -2167,7 +2163,6 @@ read_some_cluster_pages(struct inode *in
 
if (clust->nr_pages) {
int off;
-   char *data;
struct page * pg;
assert("edward-1419", clust->pages != NULL);
pg = clust->pages[clust->nr_pages - 1];
@@ -2175,10 +2170,7 @@ read_some_cluster_pages(struct inode *in
off = off_to_pgoff(win->off+win->count+win->delta);
if (off) {
lock_page(pg);
-   data = kmap_atomic(pg, KM_USER0);
-   memset(data + off, 0, PAGE_CACHE_SIZE - off);
-   flush_dcache_page(pg);
-   kunmap_atomic(data, KM_USER0);
+   zero_user_page(pg, off, PAGE_CACHE_SIZE - off);
unlock_page(pg);
}
}
@@ -2217,20 +2209,15 @@ read_some_cluster_pages(struct inode *in
(count_to_nrpages(inode->i_size) <= pg->index)) {
/* .. and appended,
   so set zeroes to the rest */
-   char *data;
int offset;
lock_page(pg);
-   data = kmap_atomic(pg, KM_USER0);
-
assert("edward-1260",
   count_to_nrpages(win->off + win->count +
win->delta) - 1 == i);
 
offset =
off_to_pgoff(win->off + win->count + win->delta);
-   memset(data + offset, 0, PAGE_CACHE_SIZE - offset);
-   flush_dcache_page(pg);
-   kunmap_atomic(data, KM_USER0);
+   zero_user_page(pg, offset, PAGE_CACHE_SIZE - offset);
unlock_page(pg);
/* still not uptodate */
break;
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/file.c 
linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/file.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/file.c  2007-04-10 
17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/file.c 2007-04-10 
18:35:44.0 -0700
@@ -433,7 +433,6 @@ static int shorten_file(struct inode *in
struct page *page;
int padd_from;
unsigned long index;
-   char *kaddr;
unix_file_info_t *uf_info;
 
/*
@@ -523,10 +522,7 @@ static int shorten_file(struct inode *in
 
lock_page(page);
assert("vs-1066", PageLocked(page));
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + padd_from, 0, PAGE_CACHE_SIZE - padd_from);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, padd_from, PAGE_CACHE_SIZE - padd_from);
unlock_page(page);
page_cache_release(page);
/* the below does up(sbinfo->delete_mutex). Do not get confused */
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/ctail.c 
linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/ctail.c
--- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/ctail.c 2007-04-10 
17:15:04.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/ctail.c2007-04-10 
18:35:44.0 -0700
@@ -627,11 +627,7 @@ int do_readpage_ctail(struct inode * ino
 #endif
case FAKE_DISK_CLUSTER:
/* fill the page by zeroes */
-   data = kmap_atomic(page, KM_USER0);
-
-   memset(data, 0, PAGE_CACHE_SIZE);
-   

[PATCH 13/13] fs: deprecate memclear_highpage_flush

2007-04-10 Thread Nate Diller
Now that all the in-tree users are converted over to zero_user_page(),
deprecate the old memclear_highpage_flush() call.

Signed-off-by: Nate Diller [EMAIL PROTECTED]

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/include/linux/highmem.h 
linux-2.6.21-rc6-mm1-test/include/linux/highmem.h
--- linux-2.6.21-rc6-mm1/include/linux/highmem.h2007-04-10 
18:32:41.0 -0700
+++ linux-2.6.21-rc6-mm1-test/include/linux/highmem.h   2007-04-10 
19:40:14.0 -0700
@@ -149,6 +149,8 @@ static inline void zero_user_page(struct
kunmap_atomic(kaddr, KM_USER0);
 }
 
+static void memclear_highpage_flush(struct page *page, unsigned int offset,
+   unsigned int size) __deprecated;
 static inline void memclear_highpage_flush(struct page *page, unsigned int 
offset, unsigned int size)
 {
return zero_user_page(page, offset, size);


[PATCH 12/13] xfs: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of the newly deprecated memclear_highpage_flush(). 

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/xfs/linux-2.6/xfs_lrw.c 
linux-2.6.21-rc6-mm1-test/fs/xfs/linux-2.6/xfs_lrw.c
--- linux-2.6.21-rc6-mm1/fs/xfs/linux-2.6/xfs_lrw.c 2007-04-09 
17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/xfs/linux-2.6/xfs_lrw.c2007-04-09 
18:18:23.0 -0700
@@ -159,7 +159,7 @@ xfs_iozero(
if (status)
goto unlock;
 
-   memclear_highpage_flush(page, offset, bytes);
+   zero_user_page(page, offset, bytes);
 
status = mapping->a_ops->commit_write(NULL, page, offset,
offset + bytes);


[PATCH 8/13] ntfs: use zero_user_page

2007-04-10 Thread Nate Diller
Use zero_user_page() instead of open-coding it. 

Signed-off-by: Nate Diller [EMAIL PROTECTED]

--- 

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ntfs/aops.c 
linux-2.6.21-rc6-mm1-test/fs/ntfs/aops.c
--- linux-2.6.21-rc6-mm1/fs/ntfs/aops.c 2007-04-09 10:41:47.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ntfs/aops.c2007-04-09 18:18:23.0 
-0700
@@ -245,8 +241,7 @@ static int ntfs_read_block(struct page *
rl = NULL;
nr = i = 0;
do {
-   u8 *kaddr;
-   int err;
+   int err = 0;
 
if (unlikely(buffer_uptodate(bh)))
continue;
@@ -254,7 +249,6 @@ static int ntfs_read_block(struct page *
arr[nr++] = bh;
continue;
}
-   err = 0;
bh->b_bdev = vol->sb->s_bdev;
/* Is the block within the allowed limits? */
if (iblock < lblock) {
@@ -340,10 +334,7 @@ handle_hole:
bh->b_blocknr = -1UL;
clear_buffer_mapped(bh);
 handle_zblock:
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + i * blocksize, 0, blocksize);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(page);
+   zero_user_page(page, i * blocksize, blocksize);
if (likely(!err))
set_buffer_uptodate(bh);
} while (i++, iblock++, (bh = bh->b_this_page) != head);
@@ -460,10 +451,7 @@ retry_readpage:
 * ok to ignore the compressed flag here.
 */
if (unlikely(page->index > 0)) {
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr, 0, PAGE_CACHE_SIZE);
-   flush_dcache_page(page);
-   kunmap_atomic(kaddr, KM_USER0);
+   zero_user_page(page, 0, PAGE_CACHE_SIZE);
goto done;
}
if (!NInoAttr(ni))
@@ -790,14 +778,9 @@ lock_retry_remap:
 * uptodate so it can get discarded by the VM.
 */
if (err == -ENOENT || lcn == LCN_ENOENT) {
-   u8 *kaddr;
-
bh->b_blocknr = -1;
clear_buffer_dirty(bh);
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + bh_offset(bh), 0, blocksize);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(page);
+   zero_user_page(page, bh_offset(bh), blocksize);
set_buffer_uptodate(bh);
err = 0;
continue;
@@ -1422,10 +1405,7 @@ retry_writepage:
if (page->index >= (i_size >> PAGE_CACHE_SHIFT)) {
/* The page straddles i_size. */
unsigned int ofs = i_size & ~PAGE_CACHE_MASK;
-   kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + ofs, 0, PAGE_CACHE_SIZE - ofs);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(page);
+   zero_user_page(page, ofs, PAGE_CACHE_SIZE - ofs);
}
/* Handle mst protected attributes. */
if (NInoMstProtected(ni))
diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ntfs/file.c 
linux-2.6.21-rc6-mm1-test/fs/ntfs/file.c
--- linux-2.6.21-rc6-mm1/fs/ntfs/file.c 2007-04-09 17:24:03.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/ntfs/file.c2007-04-09 18:18:23.0 
-0700
@@ -606,11 +606,8 @@ do_next_page:
ntfs_submit_bh_for_read(bh);
*wait_bh++ = bh;
} else {
-   u8 *kaddr = kmap_atomic(page, KM_USER0);
-   memset(kaddr + bh_offset(bh), 0,
+   zero_user_page(page, bh_offset(bh),
blocksize);
-   kunmap_atomic(kaddr, KM_USER0);
-   flush_dcache_page(page);
set_buffer_uptodate(bh);
}
}
@@ -685,12 +682,8 @@ map_buffer_cached:
ntfs_submit_bh_for_read(bh);
*wait_bh++ = bh;
} else {
-   u8 *kaddr = kmap_atomic(page,
-   KM_USER0);
-   memset(kaddr + bh_offset(bh),
-   0, blocksize);
-   kunmap_atomic(kaddr, KM_USER0);
- 

Re: ext3, BKL, journal replay, multiple non-bind mounts of same device

2007-04-10 Thread Andreas Dilger
On Apr 10, 2007  20:49 -0400, John Anthony Kazos Jr. wrote:
 Since it is possible for the same block device to be mounted multiple 
 times concurrently by the same filesystem, and since ext3 explicitly 
 disables the BKL during its fill_super operation which would prevent this, 
 what is the result of mounting it multiple times this way? Especially if 
 the filesystem is dirty and a journal is replayed. (In any case, what 
 operation is being performed by ext3/ext4 that requires the BKL to be 
 dropped? What's the need to even consider the BKL during fill_super?)
 
 And in general, how does a filesystem deal with being mounted multiple 
 times in this way? In my testing and exploration so far, everything seems 
 to generally work, but I haven't tried deliberately using different 
 instances of the mount concurrently. Do we end up with locks not being 
 held properly on the superblock because the super_block structure 
 instances don't know about each other? Has dealing with this behavior of 
 bd_claim really been considered before, and if so, what's the general 
 scheme for handling it?

It is a myth (that actually frightened me quite a bit when I first did it)
that the filesystem is mounted twice in this case.  The truth of the matter
is if you mount -t ext3 /dev/ /mnt/1 and ... /mnt/2 you actually
get the equivalent of a bind mount for this block device on the two mount
points.  You can see this easily because e.g. you don't get two kjournald
threads for the two mounts, and it doesn't completely blow up.

If, on the other hand, you tried one mount with ext3 and another with ext4
it will fail the second with -EBUSY.

As for the BKL changes, your best bet is to go back through GIT and/or BK
or search the mailing lists to see when and why that was added.  It appears
to have been 2.6.11, but I don't know why.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



Re: [PATCH 5/13] ext4: use zero_user_page

2007-04-10 Thread Andreas Dilger
On Apr 10, 2007  20:36 -0700, Nate Diller wrote:
 Use zero_user_page() instead of open-coding it. 
 
 Signed-off-by: Nate Diller [EMAIL PROTECTED]

 To: Andrew Morton [EMAIL PROTECTED],
Alexander Viro [EMAIL PROTECTED]
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org

Would have been better to CC the filesystem maintainers directly
(which was one of the reasons Andrew wanted per-fs patches, so they
can be Acked/Nacked independently).

Looks good in any case,

Signed-off-by: Andreas Dilger [EMAIL PROTECTED]

 diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/inode.c 
 linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c
 --- linux-2.6.21-rc6-mm1/fs/ext4/inode.c  2007-04-10 17:15:04.0 
 -0700
 +++ linux-2.6.21-rc6-mm1-test/fs/ext4/inode.c 2007-04-10 18:33:04.0 
 -0700
 @@ -1791,7 +1791,6 @@ int ext4_block_truncate_page(handle_t *h
   struct inode *inode = mapping->host;
   struct buffer_head *bh;
   int err = 0;
 - void *kaddr;
  
   if ((EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL) &&
   test_opt(inode->i_sb, EXTENTS) &&
 @@ -1808,10 +1807,7 @@ int ext4_block_truncate_page(handle_t *h
*/
   if (!page_has_buffers(page) && test_opt(inode->i_sb, NOBH) &&
ext4_should_writeback_data(inode) && PageUptodate(page)) {
 - kaddr = kmap_atomic(page, KM_USER0);
 - memset(kaddr + offset, 0, length);
 - flush_dcache_page(page);
 - kunmap_atomic(kaddr, KM_USER0);
 + zero_user_page(page, offset, length);
   set_page_dirty(page);
   goto unlock;
   }
 @@ -1864,11 +1860,7 @@ int ext4_block_truncate_page(handle_t *h
   goto unlock;
   }
  
 - kaddr = kmap_atomic(page, KM_USER0);
 - memset(kaddr + offset, 0, length);
 - flush_dcache_page(page);
 - kunmap_atomic(kaddr, KM_USER0);
 -
 + zero_user_page(page, offset, length);
   BUFFER_TRACE(bh, "zeroed end of block");
  
   err = 0;
 diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/ext4/writeback.c 
 linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c
 --- linux-2.6.21-rc6-mm1/fs/ext4/writeback.c  2007-04-10 18:05:52.0 
 -0700
 +++ linux-2.6.21-rc6-mm1-test/fs/ext4/writeback.c 2007-04-10 
 18:33:04.0 -0700
 @@ -961,7 +961,6 @@ int ext4_wb_writepage(struct page *page,
   loff_t i_size = i_size_read(inode);
   pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
   unsigned offset;
 - void *kaddr;
  
   wb_debug("writepage %lu from inode %lu\n", page->index, inode->i_ino);
  
 @@ -1011,10 +1010,7 @@ int ext4_wb_writepage(struct page *page,
* the  page size, the remaining memory is zeroed when mapped, and
* writes to that region are not written out to the file.
*/
 - kaddr = kmap_atomic(page, KM_USER0);
 - memset(kaddr + offset, 0, PAGE_CACHE_SIZE - offset);
 - flush_dcache_page(page);
 - kunmap_atomic(kaddr, KM_USER0);
 + zero_user_page(page, offset, PAGE_CACHE_SIZE - offset);
   return ext4_wb_write_single_page(page, wbc);
  }
  
 @@ -1065,7 +1061,6 @@ int ext4_wb_block_truncate_page(handle_t
   struct inode *inode = mapping->host;
   struct buffer_head bh, *bhw = &bh;
   unsigned blocksize, length;
 - void *kaddr;
   int err = 0;
  
   wb_debug("partial truncate from %lu on page %lu from inode %lu\n",
 @@ -1104,10 +1099,7 @@ int ext4_wb_block_truncate_page(handle_t
   }
   }
  
 - kaddr = kmap_atomic(page, KM_USER0);
 - memset(kaddr + offset, 0, length);
 - flush_dcache_page(page);
 - kunmap_atomic(kaddr, KM_USER0);
 + zero_user_page(page, offset, length);
   SetPageUptodate(page);
   __set_page_dirty_nobuffers(page);
  

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



Re: [PATCH 1/13] fs: convert core functions to zero_user_page

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 20:36:00 -0700 Nate Diller [EMAIL PROTECTED] wrote:

 It's very common for file systems to need to zero part or all of a page, the
 simplest way is just to use kmap_atomic() and memset().  There's actually a
 library function in include/linux/highmem.h that does exactly that, but it's
 confusingly named memclear_highpage_flush(), which is descriptive of *how*
 it does the work rather than what the *purpose* is.  So this patchset
 renames the function to zero_user_page(), and calls it from the various
 places that currently open code it.
 
 This first patch introduces the new function call, and converts all the core
 kernel callsites, both the open-coded ones and the old
 memclear_highpage_flush() ones.  Following this patch is a series of
 conversions for each file system individually, per AKPM, and finally a patch
 deprecating the old call.

For the reasons Anton identified, I think it is better design while we're here
to force callers to pass in the kmap-type which they wish to use for the atomic
kmap.  It makes the programmer think about what he wants to happen.  The price
of getting this wrong tends to be revoltingly rare file corruption.

But we cannot make this change in the obvious fashion, because the KM_FOO
identifiers are undefined if CONFIG_HIGHMEM=n.  So

zero_user_page(page, 1, 2, KM_USER0);

won't compile on non-highmem.

So we are forced to use a macro, like below.
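
To see why a macro sidesteps the undefined-identifier problem, here is a userspace sketch. It is not the patch itself: all sketch_* names are hypothetical, and it assumes that the non-highmem mapping helpers are themselves macros that discard the kmap-type argument. Because the preprocessor drops the KM_USER0 token before the compiler proper sees it, the call compiles even though KM_USER0 is never defined.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
};

/*
 * Stand-ins for a non-highmem configuration (an assumption of this
 * sketch): the mapping helpers are macros that never expand their
 * type argument.  KM_USER0 is deliberately NOT defined anywhere here.
 */
#define sketch_kmap_atomic(page, km_type)	((void *)(page)->data)
#define sketch_kunmap_atomic(addr, km_type)	do { (void)(addr); } while (0)
#define sketch_flush_dcache_page(page)		do { (void)(page); } while (0)

/* Macro form: km_type is only ever passed on to other macros. */
#define sketch_zero_user_page(page, offset, size, km_type)		\
	do {								\
		void *sketch_kaddr;					\
									\
		assert((offset) + (size) <= SKETCH_PAGE_SIZE);		\
		sketch_kaddr = sketch_kmap_atomic(page, km_type);	\
		memset((char *)sketch_kaddr + (offset), 0, (size));	\
		sketch_flush_dcache_page(page);				\
		sketch_kunmap_atomic(sketch_kaddr, km_type);		\
	} while (0)

/* Compiles although KM_USER0 is undefined: the token is discarded. */
static void sketch_clear_tail(struct sketch_page *page, unsigned from)
{
	sketch_zero_user_page(page, from, SKETCH_PAGE_SIZE - from, KM_USER0);
}
```

Had sketch_zero_user_page() been a static inline function instead, KM_USER0 would be an ordinary argument expression and the compiler would reject it as an undeclared identifier.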

Also, you forgot to mark memclear_highpage_flush() __deprecated.

And I'm surprised that this:

+static inline void memclear_highpage_flush(struct page *page, unsigned int 
offset, unsigned int size)
+{
+   return zero_user_page(page, offset, size);
+}

compiled.  zero_user_page() returns void...
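
For reference, ISO C does not allow a return statement with an expression in a function whose return type is void. As far as I know GCC accepts "return f();" there as an extension when f() also returns void, and only diagnoses it under -pedantic, which would explain why it compiled. A portable wrapper simply calls the new function as a statement. The sketch below uses hypothetical sketch_* names and a plain buffer in place of a struct page.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the new helper: returns nothing. */
static void sketch_zero(unsigned char *buf, unsigned offset, unsigned size)
{
	memset(buf + offset, 0, size);
}

/*
 * Compatibility wrapper under the old name.  There is no value to
 * hand back, so the helper is called as a plain statement rather
 * than via "return sketch_zero(...)".
 */
static void sketch_old_name(unsigned char *buf, unsigned offset,
			    unsigned size)
{
	sketch_zero(buf, offset, size);
}
```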


 drivers/block/loop.c|2 +-
 fs/buffer.c |   21 -
 fs/direct-io.c  |2 +-
 fs/mpage.c  |6 --
 include/linux/highmem.h |   29 +
 mm/filemap_xip.c|2 +-
 6 files changed, 36 insertions(+), 26 deletions(-)

diff -puN 
drivers/block/loop.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type 
drivers/block/loop.c
--- 
a/drivers/block/loop.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type
+++ a/drivers/block/loop.c
@@ -250,7 +250,7 @@ static int do_lo_send_aops(struct loop_d
 */
printk(KERN_ERR "loop: transfer error block %llu\n",
   (unsigned long long)index);
-   zero_user_page(page, offset, size);
+   zero_user_page(page, offset, size, KM_USER0);
}
flush_dcache_page(page);
ret = aops->commit_write(file, page, offset,
diff -puN 
fs/buffer.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type 
fs/buffer.c
--- a/fs/buffer.c~fs-convert-core-functions-to-zero_user_page-pass-kmap-type
+++ a/fs/buffer.c
@@ -1855,7 +1855,7 @@ static int __block_prepare_write(struct 
break;
if (buffer_new(bh)) {
clear_buffer_new(bh);
-   zero_user_page(page, block_start, bh->b_size);
+   zero_user_page(page, block_start, bh->b_size, KM_USER0);
set_buffer_uptodate(bh);
mark_buffer_dirty(bh);
}
@@ -1943,7 +1943,8 @@ int block_read_full_page(struct page *pa
SetPageError(page);
}
if (!buffer_mapped(bh)) {
-   zero_user_page(page, i * blocksize, blocksize);
+   zero_user_page(page, i * blocksize, blocksize,
+   KM_USER0);
if (!err)
set_buffer_uptodate(bh);
continue;
@@ -2107,7 +2108,8 @@ int cont_prepare_write(struct page *page
PAGE_CACHE_SIZE, get_block);
if (status)
goto out_unmap;
-   zero_user_page(page, zerofrom, PAGE_CACHE_SIZE-zerofrom);
+   zero_user_page(page, zerofrom, PAGE_CACHE_SIZE - zerofrom,
+   KM_USER0);
generic_commit_write(NULL, new_page, zerofrom, PAGE_CACHE_SIZE);
unlock_page(new_page);
page_cache_release(new_page);
@@ -2134,7 +2136,7 @@ int cont_prepare_write(struct page *page
if (status)
goto out1;
if (zerofrom < offset) {
-   zero_user_page(page, zerofrom, offset-zerofrom);
+   zero_user_page(page, zerofrom, offset - zerofrom, KM_USER0);
__block_commit_write(inode, page, zerofrom, offset);
}
return 0;
@@ -2333,7 +2335,7 @@ failed:
 * Error recovery is pretty slack.  Clear the page and mark it 

Re: [PATCH 13/13] fs: deprecate memclear_highpage_flush

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 20:36:00 -0700 Nate Diller [EMAIL PROTECTED] wrote:

 Now that all the in-tree users are converted over to zero_user_page(),
 deprecate the old memclear_highpage_flush() call.
 
 Signed-off-by: Nate Diller [EMAIL PROTECTED]
 
 ---
 
 diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/include/linux/highmem.h 
 linux-2.6.21-rc6-mm1-test/include/linux/highmem.h
 --- linux-2.6.21-rc6-mm1/include/linux/highmem.h  2007-04-10 
 18:32:41.0 -0700
 +++ linux-2.6.21-rc6-mm1-test/include/linux/highmem.h 2007-04-10 
 19:40:14.0 -0700
 @@ -149,6 +149,8 @@ static inline void zero_user_page(struct
   kunmap_atomic(kaddr, KM_USER0);
  }
  
 +static void memclear_highpage_flush(struct page *page, unsigned int offset,
 + unsigned int size) __deprecated;
  static inline void memclear_highpage_flush(struct page *page, unsigned int 
 offset, unsigned int size)
  {
   return zero_user_page(page, offset, size);

oh, there it is.

one can stick the __deprecated at the end of the definition, actually.