date:20071025

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Neil Brown

On Thursday October 25, [EMAIL PROTECTED] wrote:
> On Mon, 22 Oct 2007, Pekka Enberg wrote:
> > On 10/22/07, Hugh Dickins <[EMAIL PROTECTED]> wrote:
> > > Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE.
> > > Both of those set BDI_CAP_NO_WRITEBACK.  ramdisk never returned it
> > > if !wbc->for_reclaim.  I contend that shmem shouldn't either: it's
> > > a special code to get the LRU rotation right, not useful elsewhere.
> > > Though Documentation/filesystems/vfs.txt does imply wider use.
> > >
> > > I think this is where people use the phrase "go figure" ;)
> > 
> > Heh. As far as I can tell, the implication of "wider use" was added by
> > Neil in commit "341546f5ad6fce584531f744853a5807a140f2a9 Update some
> > VFS documentation", so perhaps he might know? Neil?

I just read the code, tried to understand it, translated that
understanding into English, and put that in vfs.txt.
It is very possible that what I wrote didn't match the intention of
the author, but it seemed to match the behaviour of the code.

The patch looks like it makes perfect sense to me.
Before the change, ->writepage could return AOP_WRITEPAGE_ACTIVATE
without unlocking the page, and this has precisely the effect of:
   ClearPageReclaim();  (if the call path was through pageout)
   SetPageActive();  (if the call was through shrink_page_list)
   unlock_page();

With the patch, the ->writepage method does the SetPageActive and the
unlock_page, which on the whole seems cleaner.

We seem to have lost a call to ClearPageReclaim - I don't know if that
is significant.

> 
> Special, hidden, undocumented, secret hack!  Then in 2.6.7 Andrew
> stole his own secret and used it when concocting ramdisk_writepage.
> Oh, and NFS made some kind of use of it in 2.6.6 only.  Then Neil
> revealed the secret to the uninitiated in 2.6.17: now, what's the
> appropriate punishment for that?

Surely the punishment should be for writing hidden undocumented hacks
in the first place!  I vote we just make him maintainer for the whole
kernel - that will keep him so busy that he will never have a chance
to do it again :-)

> --- 2.6.24-rc1/Documentation/filesystems/vfs.txt  2007-10-24 
> 07:15:11.0 +0100
> +++ linux/Documentation/filesystems/vfs.txt   2007-10-24 08:42:07.0 
> +0100
> @@ -567,9 +567,7 @@ struct address_space_operations {
>If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
>try too hard if there are problems, and may choose to write out
>other pages from the mapping if that is easier (e.g. due to
> -  internal dependencies).  If it chooses not to start writeout, it
> -  should return AOP_WRITEPAGE_ACTIVATE so that the VM will not keep
> -  calling ->writepage on that page.
> +  internal dependencies).
>  

It seems that the new requirement is that if the address_space
chooses not to write out the page, it should now call SetPageActive().
If that is the case, I think it should be explicit in the
documentation - please?

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 1/1] Drop CAP_SYS_RAWIO requirement for FIBMAP

2007-10-25 Thread Alan Cox

> I found Chris's comment about negative block numbers, I'll send a patch 
> out for that.
> 
> You mentioned back in 99 about racing with ftruncate.  Is it sufficient 
> to mutex_lock(i_mutex) and down_read(i_alloc_sem)?

One for the fs guys. That code has changed far beyond anything I
understand any more 8)

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 1/1] Drop CAP_SYS_RAWIO requirement for FIBMAP

2007-10-25 Thread Mike Waychison


Alan Cox wrote:

On Thu, 25 Oct 2007 16:06:40 -0700
Mike Waychison <[EMAIL PROTECTED]> wrote:


Remove the need for having CAP_SYS_RAWIO when doing a FIBMAP call on an open 
file descriptor.

It would be nice to allow users to have permission to see where their data is 
landing on disk, and there really isn't a good reason to keep them from getting 
at this information.


Historically this was done because people felt it was more secure. It
also allows you to make some deductions about other activities on the
disk but thats probably only a concern for very very security crazed
compartmentalised boxes

Also historically at least FIBMAP could be abused to crash the system.
Now if you can verify that has been fixed I have no problem, but given
that I can find no record of that being fixed it would be wise to audit
it first and review Chris Evans and other reports about what occurs when
FIBMAP is passed random block numbers.

FIBMAP has another problem for this general use as well - it takes an int
but the block number can now be bigger for very large files on 32bit.

Alan


I found Chris's comment about negative block numbers, I'll send a patch 
out for that.


You mentioned back in 99 about racing with ftruncate.  Is it sufficient 
to mutex_lock(i_mutex) and down_read(i_alloc_sem)?


Mike Waychison
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/31] Add an ERR_CAST() macro to complement ERR_PTR and co. [try #5]

2007-10-25 Thread Zach Brown

Roland Dreier wrote:
>  > > +static inline void *ERR_CAST(const void *ptr)
>  > > +{
>  > > +return (void *) ptr;
>  > > +}
>  > 
>  > Just to nit, surely you don't need the cast inside the function.  The
>  > casting happens at the call site between the argument and returned pointer.
> 
> The way it's written you kinda do, since it takes a const void * and
> returns a plain void *.

Ah, right, I missed that.

- z
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch 1/1] Drop CAP_SYS_RAWIO requirement for FIBMAP

2007-10-25 Thread Alan Cox

On Thu, 25 Oct 2007 16:06:40 -0700
Mike Waychison <[EMAIL PROTECTED]> wrote:

> Remove the need for having CAP_SYS_RAWIO when doing a FIBMAP call on an open 
> file descriptor.
> 
> It would be nice to allow users to have permission to see where their data is 
> landing on disk, and there really isn't a good reason to keep them from 
> getting at this information.

Historically this was done because people felt it was more secure. It
also allows you to make some deductions about other activities on the
disk but thats probably only a concern for very very security crazed
compartmentalised boxes

Also historically at least FIBMAP could be abused to crash the system.
Now if you can verify that has been fixed I have no problem, but given
that I can find no record of that being fixed it would be wise to audit
it first and review Chris Evans and other reports about what occurs when
FIBMAP is passed random block numbers.

FIBMAP has another problem for this general use as well - it takes an int
but the block number can now be bigger for very large files on 32bit.

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] VFS/Security: Rework inode_getsecurity and callers to return resulting buffer

2007-10-25 Thread Serge E. Hallyn

Quoting David P. Quigley ([EMAIL PROTECTED]):
>   This patch modifies the interface to inode_getsecurity to have the
> function return a buffer containing the security blob and its length via
> parameters instead of relying on the calling function to give it an
> appropriately sized buffer. Security blobs obtained with this function
> should be freed using the release_secctx LSM hook. This alleviates the
> problem of the caller having to guess a length and preallocate a buffer
> for this function allowing it to be used elsewhere for Labeled NFS. The
> patch also removed the unused err parameter. The conversion is similar
> to the one performed by Al Viro for the security_getprocattr hook.
> 
> Signed-off-by: David P. Quigley <[EMAIL PROTECTED]>
> ---
>  fs/xattr.c   |   26 --
>  include/linux/security.h |   27 ++-
>  include/linux/xattr.h|1 +
>  mm/shmem.c   |3 +--
>  security/dummy.c |4 +++-
>  security/selinux/hooks.c |   38 ++

(Hmm, I was about to ask if this diffstat could be complete, as it
doesn't have for instance security/security.c, but I guess this predates
the staticlsm patch...)

>  6 files changed, 53 insertions(+), 46 deletions(-)
> 
> diff --git a/fs/xattr.c b/fs/xattr.c
> index a44fd92..d45c7ef 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -105,6 +105,29 @@ out:
>  EXPORT_SYMBOL_GPL(vfs_setxattr);
>  
>  ssize_t
> +xattr_getsecurity(struct inode *inode, const char *name, void *value,
> + size_t size)
> +{
> + void *buffer = NULL;
> + ssize_t len;
> +
> + len = security_inode_getsecurity(inode, name, &buffer);
> + if (len < 0)
> + return len;
> + if (!value || !size)
> + goto out;
> + if (size < len) {
> + len = -ERANGE;
> + goto out;
> + }
> + memcpy(value, buffer, len);
> +out:
> + security_release_secctx(buffer, len);

This is mighty misleading in -ERANGE case :)  I realize that
selinux_release_secctx() ignores len anyway.  But given the description
in security.h, I'd say either you need to keep the actual length
allocated and pass that in here, or (probably better) have another patch
remove the second argument from security_release_secctx().

> + return len;
> +}
> +EXPORT_SYMBOL_GPL(xattr_getsecurity);
> +
> +ssize_t
>  vfs_getxattr(struct dentry *dentry, char *name, void *value, size_t size)
>  {
>   struct inode *inode = dentry->d_inode;
> @@ -126,8 +149,7 @@ vfs_getxattr(struct dentry *dentry, char *name, void 
> *value, size_t size)
>   if (!strncmp(name, XATTR_SECURITY_PREFIX,
>   XATTR_SECURITY_PREFIX_LEN)) {
>   const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
> - int ret = security_inode_getsecurity(inode, suffix, value,
> -  size, error);
> + int ret = xattr_getsecurity(inode, suffix, value, size);
>   /*
>* Only overwrite the return value if a security module
>* is actually active.
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 1a15526..8658929 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -391,15 +391,11 @@ struct request_sock;
>   *   identified by @name for @dentry.
>   *   Return 0 if permission is granted.
>   * @inode_getsecurity:
> - *   Copy the extended attribute representation of the security label 
> - *   associated with @name for @inode into @buffer.  @buffer may be
> - *   NULL to request the size of the buffer required.  @size indicates
> - *   the size of @buffer in bytes.  Note that @name is the remainder
> - *   of the attribute name after the security. prefix has been removed.
> - *   @err is the return value from the preceding fs getxattr call,
> - *   and can be used by the security module to determine whether it
> - *   should try and canonicalize the attribute value.
> - *   Return number of bytes used/required on success.
> + *   Retrieve a copy of the extended attribute representation of the
> + *   security label associated with @name for @inode via @buffer.  Note that
> + *   @name is the remainder of the attribute name after the security prefix
> + *   has been removed.
> + *   Return size of buffer on success.
>   * @inode_setsecurity:
>   *   Set the security label associated with @name for @inode from the
>   *   extended attribute value @value.  @size indicates the size of the
> @@ -1233,7 +1229,8 @@ struct security_operations {
>   int (*inode_listxattr) (struct dentry *dentry);
>   int (*inode_removexattr) (struct dentry *dentry, char *name);
>   const char *(*inode_xattr_getsuffix) (void);
> - int (*inode_getsecurity)(const struct inode *inode, const char *name, 
> void *buffer, size_t size, int err);
> + int (*inode_getsecurity)(const struct inode *ino

Re: [PATCH] VFS: new fgetattr() file operation

2007-10-25 Thread David Chinner

On Fri, Oct 26, 2007 at 01:10:14AM +0200, Miklos Szeredi wrote:
> > On Wed, Oct 24, 2007 at 05:27:04PM +0200, Miklos Szeredi wrote:
> > > > >> Wouldn't you be better off by attempting to implement an "open
> > > > >> by ino" operation and an operation to get the generation count
> > > > >> for the file and then modifying the network protocol of interest
> > > > >> to use these as identifiers for the file to be manipulated?
> > > > >> 
> > > > >
> > > > > You mean an "open by inode" on the userspace API?  My guess, it
> > > > > wouldn't get very far.
> > > > 
> > > > This isn't a new idea and has been implemented on a variety of
> > > > different systems.
> > > 
> > > Like?
> > 
> > XFS.
> > 
> > 'man open_by_handle'
> 
> Doesn't seem widely used, with 600 something google hits.

from the man page:

"They are intended  for  use  by a limited set of system utilities such
as backup programs."

It also gets used by HSMs and so it is current, tested and is not
going away

> And in this
> old thread Linus is not entirely enthusiastic about the concept:
> 
>   http://lkml.org/lkml/1999/1/11/244

That was "open by inode number", AFAICT. A handle is an opaque
blob that can be an arbitrary length defined by the filesystem.
You have to convert a fd or path to a handle first before you can
use it later, so any filesystem can implement it...

i.e. it is exactly what this (unanswered) post suggested:

http://lkml.org/lkml/1999/1/13/186

Just my 2c worth

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] VFS: Reorder vfs_getxattr to avoid unnecessary calls to the LSM

2007-10-25 Thread Serge E. Hallyn

Quoting James Morris ([EMAIL PROTECTED]):
> On Mon, 22 Oct 2007, David P. Quigley wrote:
> 
> > Originally vfs_getxattr would pull the security xattr variable using
> > the inode getxattr handle and then proceed to clobber it with a subsequent 
> > call
> > to the LSM. This patch reorders the two operations such that when the xattr
> > requested is in the security namespace it first attempts to grab the value 
> > from
> > the LSM directly. If it fails to obtain the value because there is no module
> > present or the module does not support the operation it will fall back to 
> > using
> > the inode getxattr operation. In the event that both are inaccessible it
> > returns EOPNOTSUPP.
> > 
> > Signed-off-by: David P. Quigley <[EMAIL PROTECTED]>
> 
> Acked-by: James Morris <[EMAIL PROTECTED]>

(not that it matters much, esp with selinux being the only current user,
but)

Acked-by: Serge Hallyn <[EMAIL PROTECTED]>

Makes sense and looks good.

thanks,
-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/31] Add an ERR_CAST() macro to complement ERR_PTR and co. [try #5]

2007-10-25 Thread Andrew Morton

On Thu, 25 Oct 2007 16:09:09 -0700
Zach Brown <[EMAIL PROTECTED]> wrote:

> 
> > + * ERR_CAST - Explicitly cast an error-valued pointer to another pointer 
> > type
> > + * @ptr: The pointer to cast.
> > + *
> > + * Explicitly cast an error-valued pointer to another pointer type in such 
> > a
> > + * way as to make it clear that's what's going on.
> > + */
> > +static inline void *ERR_CAST(const void *ptr)
> > +{
> > +   return (void *) ptr;
> > +}
> 
> Just to nit, surely you don't need the cast inside the function.  The
> casting happens at the call site between the argument and returned pointer.
> 

It'll warn without the cast.

btw, nit time.  This style:

return (void *)ptr;

outnumbers this style

return (void *) ptr;

by 4 to 1.  And I don't find the space attractive, useful or logical,
personally.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/31] Add an ERR_CAST() macro to complement ERR_PTR and co. [try #5]

2007-10-25 Thread Roland Dreier

 > > +static inline void *ERR_CAST(const void *ptr)
 > > +{
 > > +  return (void *) ptr;
 > > +}
 > 
 > Just to nit, surely you don't need the cast inside the function.  The
 > casting happens at the call site between the argument and returned pointer.

The way it's written you kinda do, since it takes a const void * and
returns a plain void *.  But I don't think that's the best way to
write it.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[patch 1/1] Drop CAP_SYS_RAWIO requirement for FIBMAP

2007-10-25 Thread Mike Waychison

Remove the need for having CAP_SYS_RAWIO when doing a FIBMAP call on an open 
file descriptor.

It would be nice to allow users to have permission to see where their data is 
landing on disk, and there really isn't a good reason to keep them from getting 
at this information.

Signed-off-by: Mike Waychison <[EMAIL PROTECTED]>
 fs/ioctl.c |2 --
 1 file changed, 2 deletions(-)

Index: linux-2.6.23/fs/ioctl.c
===
--- linux-2.6.23.orig/fs/ioctl.c2007-10-09 13:31:38.0 -0700
+++ linux-2.6.23/fs/ioctl.c 2007-10-25 15:48:24.0 -0700
@@ -56,8 +56,6 @@ static int file_ioctl(struct file *filp,
/* do we support this mess? */
if (!mapping->a_ops->bmap)
return -EINVAL;
-   if (!capable(CAP_SYS_RAWIO))
-   return -EPERM;
if ((error = get_user(block, p)) != 0)
return error;
 

--

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] VFS: new fgetattr() file operation

2007-10-25 Thread Miklos Szeredi

> On Wed, Oct 24, 2007 at 05:27:04PM +0200, Miklos Szeredi wrote:
> > > >> Wouldn't you be better off by attempting to implement an "open
> > > >> by ino" operation and an operation to get the generation count
> > > >> for the file and then modifying the network protocol of interest
> > > >> to use these as identifiers for the file to be manipulated?
> > > >> 
> > > >
> > > > You mean an "open by inode" on the userspace API?  My guess, it
> > > > wouldn't get very far.
> > > 
> > > This isn't a new idea and has been implemented on a variety of
> > > different systems.
> > 
> > Like?
> 
> XFS.
> 
> 'man open_by_handle'

Doesn't seem widely used, with 600 something google hits.  And in this
old thread Linus is not entirely enthusiastic about the concept:

  http://lkml.org/lkml/1999/1/11/244

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/31] Add an ERR_CAST() macro to complement ERR_PTR and co. [try #5]

2007-10-25 Thread Zach Brown


> + * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type
> + * @ptr: The pointer to cast.
> + *
> + * Explicitly cast an error-valued pointer to another pointer type in such a
> + * way as to make it clear that's what's going on.
> + */
> +static inline void *ERR_CAST(const void *ptr)
> +{
> + return (void *) ptr;
> +}

Just to nit, surely you don't need the cast inside the function.  The
casting happens at the call site between the argument and returned pointer.

- z
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] Add an ERR_CAST() function to complement ERR_PTR and co. [try #5.1]

2007-10-25 Thread David Howells

Add an ERR_CAST() function to complement ERR_PTR and co. for the purposes of
casting an error entyped as one pointer type to an error of another pointer
type whilst making it explicit as to what is going on.

This provides a replacement for the ERR_PTR(PTR_ERR(p)) construct.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 include/linux/err.h |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/linux/err.h b/include/linux/err.h
index 1ab1d44..08409cd 100644
--- a/include/linux/err.h
+++ b/include/linux/err.h
@@ -34,6 +34,18 @@ static inline long IS_ERR(const void *ptr)
return IS_ERR_VALUE((unsigned long)ptr);
 }
 
+/**
+ * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type
+ * @ptr: The pointer to cast.
+ *
+ * Explicitly cast an error-valued pointer to another pointer type in such a
+ * way as to make it clear that's what's going on.
+ */
+static inline void *ERR_CAST(const void *ptr)
+{
+   return (void *) ptr;
+}
+
 #endif
 
 #endif /* _LINUX_ERR_H */

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch] fs: restore nobh

2007-10-25 Thread Nick Piggin

On Thu, Oct 25, 2007 at 09:07:36PM +0200, Jan Kara wrote:
>   Hi,
> 
> > This is overdue, sorry. Got a little complicated, and I've been away from
> > my filesystem test setup so I didn't want ot send it (lucky, coz I found
> > a bug after more substantial testing).
> > 
> > Anyway, RFC?
>   Hmm, maybe one comment/question:
> 
> > Index: linux-2.6/fs/buffer.c
> > ===
> > --- linux-2.6.orig/fs/buffer.c  2007-10-08 14:09:35.0 +1000
> > +++ linux-2.6/fs/buffer.c   2007-10-08 16:45:28.0 +1000
> ...
> 
> > -/*
> > - * Make sure any changes to nobh_commit_write() are reflected in
> > - * nobh_truncate_page(), since it doesn't call commit_write().
> > - */
> > -int nobh_commit_write(struct file *file, struct page *page,
> > -   unsigned from, unsigned to)
> > +int nobh_write_end(struct file *file, struct address_space *mapping,
> > +   loff_t pos, unsigned len, unsigned copied,
> > +   struct page *page, void *fsdata)
> >  {
> > struct inode *inode = page->mapping->host;
> > -   loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
> > +   struct buffer_head *head = NULL;
> > +   struct buffer_head *bh;
> >  
> > -   if (page_has_buffers(page))
> > -   return generic_commit_write(file, page, from, to);
> > +   if (!PageMappedToDisk(page)) {
> > +   if (unlikely(copied < len) && !page_has_buffers(page))
> > +   attach_nobh_buffers(page, head);
> > +   if (page_has_buffers(page))
> > +   return generic_write_end(file, mapping, pos, len, 
> > copied, page, fsdata);
> > +   }
>   So we can have a page with attached buffers but we decide to not call
> generic_write_end()...
> 
> > SetPageUptodate(page);
> > set_page_dirty(page);
>   And we just set the page dirty but we leave buffers clean. So cannot
> subsequent try_to_free_buffers() come, mark page as clean (as buffers
> are clean) and discard the data?

Hmm, we might just be OK here because set_page_dirty should dirty the
buffers. However, I think you have spotted a mistake here and it would be
better if the generic_write_end test was outside the PageMapedToDisk
check.

Thanks!

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] VFS: new fgetattr() file operation

2007-10-25 Thread David Chinner

On Wed, Oct 24, 2007 at 05:27:04PM +0200, Miklos Szeredi wrote:
> > >> Wouldn't you be better off by attempting to implement an "open
> > >> by ino" operation and an operation to get the generation count
> > >> for the file and then modifying the network protocol of interest
> > >> to use these as identifiers for the file to be manipulated?
> > >> 
> > >
> > > You mean an "open by inode" on the userspace API?  My guess, it
> > > wouldn't get very far.
> > 
> > This isn't a new idea and has been implemented on a variety of
> > different systems.
> 
> Like?

XFS.

'man open_by_handle'

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH+comment] fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE

2007-10-25 Thread H. Peter Anvin


Erez Zadok wrote:

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:

On Thu, 25 Oct 2007, Pekka Enberg wrote:



With unionfs also fixed, we don't know of an absolute need for this
patch (and so, on that basis, the !wbc->for_reclaim case could indeed
be removed very soon); but as I see it, the unionfs case has shown
that it's time to future-proof this code against whatever stacking
filesystems come along.  Hence I didn't mention the names of such
filesystems in the source comment.


I think "future proof" for other stackable f/s is a good idea, esp. since
many of the stackable f/s we've developed and distributed over the past 10
years are in some use in various places: gzipfs, avfs, tracefs, replayfs,
ncryptfs, versionfs, wrapfs, i3fs, and more (see www.filesystems.org).



A number of filesystems want partial or full stackability, so getting 
rid of lack-of-stackability whereever it may be is highly valuable.


-hpa

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch] fs: restore nobh

2007-10-25 Thread Jan Kara

  Hi,

> This is overdue, sorry. Got a little complicated, and I've been away from
> my filesystem test setup so I didn't want ot send it (lucky, coz I found
> a bug after more substantial testing).
> 
> Anyway, RFC?
  Hmm, maybe one comment/question:

> Index: linux-2.6/fs/buffer.c
> ===
> --- linux-2.6.orig/fs/buffer.c2007-10-08 14:09:35.0 +1000
> +++ linux-2.6/fs/buffer.c 2007-10-08 16:45:28.0 +1000
...

> -/*
> - * Make sure any changes to nobh_commit_write() are reflected in
> - * nobh_truncate_page(), since it doesn't call commit_write().
> - */
> -int nobh_commit_write(struct file *file, struct page *page,
> - unsigned from, unsigned to)
> +int nobh_write_end(struct file *file, struct address_space *mapping,
> + loff_t pos, unsigned len, unsigned copied,
> + struct page *page, void *fsdata)
>  {
>   struct inode *inode = page->mapping->host;
> - loff_t pos = ((loff_t)page->index << PAGE_CACHE_SHIFT) + to;
> + struct buffer_head *head = NULL;
> + struct buffer_head *bh;
>  
> - if (page_has_buffers(page))
> - return generic_commit_write(file, page, from, to);
> + if (!PageMappedToDisk(page)) {
> + if (unlikely(copied < len) && !page_has_buffers(page))
> + attach_nobh_buffers(page, head);
> + if (page_has_buffers(page))
> + return generic_write_end(file, mapping, pos, len, 
> copied, page, fsdata);
> + }
  So we can have a page with attached buffers but we decide to not call
generic_write_end()...

>   SetPageUptodate(page);
>   set_page_dirty(page);
  And we just set the page dirty but we leave buffers clean. So cannot
subsequent try_to_free_buffers() come, mark page as clean (as buffers
are clean) and discard the data?

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch] fs: restore nobh

2007-10-25 Thread Jan Kara

  Hi,

> This is overdue, sorry. Got a little complicated, and I've been away from
> my filesystem test setup so I didn't want ot send it (lucky, coz I found
> a bug after more substantial testing).
> 
> Anyway, RFC?
  A bit overdue review but since I haven't seen anybody else replying...
I've read the patch and it looks fine. So feel free to add
Acked-by (or even Reviewed-by if we use it...): Jan Kara <[EMAIL PROTECTED]>

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Hugh Dickins

On Thu, 25 Oct 2007, Erez Zadok wrote:
> 
> On a related note, I would just love to get rid of calling the lower
> ->writepage in unionfs b/c I can't even tell if I have a lower page to use
> all the time.  I'd prefer to call vfs_write() if I can, but I'll need a
> struct file, or at least a dentry.

Why do you want to do that?  You gave a good reason why it's easier
for ecryptfs, but I doubt it's robust.  The higher the level you
choose to use, the harder to guarantee it won't deadlock.

Or that's my gut feeling anyway.  It's several years since I've
thought about such issues: just because I came into this knowing
about shmem_writepage, is perhaps not a good reason to choose me
as advisor!

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Hugh Dickins

On Mon, 22 Oct 2007, Erez Zadok wrote:
> 
> What's the precise semantics of AOP_WRITEPAGE_ACTIVATE?

Sigh - not at you, at it!  It's a secret that couldn't be kept secret,
a hack for tmpfs reclaim, let's just look forward to it going away.

> Is it considered an error or not?

No, it's definitely not an error.  It'a a private note from tmpfs
(or ramdisk) to vmscan, saying "don't waste your time coming back
to me with this page until you have to, please move on to another
more likely to be freeable".

> If it's an error, then I usually feel that it's important for
> a stacked f/s to return that error indication upwards.

Indeed, but this is not an error.  Remember, neither ramdisk nor
tmpfs is stable storage: okay, tmpfs can go out to disk by using
swap, but that's not stable storage - it's not reconstituted after
reboot.  (If there's an error in writing to swap, well, that's a
different issue; and there's few filesystems where such an I/O
error would be reported from ->writepage.)

> 
> The unionfs page and the lower page are somewhat tied together, at least
> logically.  For unionfs's page to be considered to have been written
> successfully, the lower page has to be written successfully.  So again, if
> the lower f/s returns AOP_WRITEPAGE_ACTIVATE, should I consider my unionfs
> page to have been written successfully or not?

Consider it written successfully.  (What does written mean with tmpfs?
it means a page can be freed, it doesn't mean the data is forever safe.)

> If I don't return
> AOP_WRITEPAGE_ACTIVATE up, can there be any chance that some vital data may
> never get flushed out?

Things should work better if you don't return AOP_WRITEPAGE_ACTIVATE.
If you mark your page as clean and successfully written, vmscan will
be able to free it.  If needed, we can get the data back from the
lower page on demand, but meanwhile a page has been freed, which
is what vmscan reclaim is all about.  (But of course, in the case
where you couldn't get hold of a page for the lower, you must redirty
yours before returning.)

> > unionfs_writepage also sets AOP_WRITEPAGE_ACTIVATE when it cannot
> > find_lock_page: that case may be appropriate.  Though I don't really
> > understand it: seems dangerous to be relying upon the lower level page
> > just happening to be there already.  Isn't memory pressure then likely
> > to clog up with lots of upper level dirty pages which cannot get
> > written out to the lower level?
> 
> Based on vfs.txt (which perhaps should be revised :-), I was trying to do
> the best I can to ensure that no data is lost if the current page cannot be
> written out to the lower f/s.
> 
> I used to do grab_cache_page() before, but that caused problems: writepage
> is not the right place to _increase_ memory pressure by allocating a new
> page...

Yes, but just hoping the lower page will be there, and doing nothing
to encourage it to become there, sounds an even poorer strategy to me.

It's not easy, I know.  Your position reminds me of the loop driver
(drivers/block/loop.c), which has long handled this situation (with
great success, though I doubt an absolute guarantee) by taking
__GFP_IO|__GFP_FS off the mapping_gfp_mask of the underlying file:
look for gfp_mask in loop_set_fd() (and I think ignore do_loop_switch(),
that's new to me and seems to be for a very special case).

I grepped for gfp in unionfs, and there seems to be nothing: I doubt
you can be robust under memory pressure without doing something about
that.  If you can take __GFP_IO|__GFP_FS off the lower mapping (just
while in unionfs_writepage, or longer term? what locking needed?),
then you should be able to go back to using grab_cache_page().

> 
> One solution I thought of is do what ecryptfs does: keep an open struct file
> in my inode and call vfs_write(), but I don't see that as a significantly
> cleaner/better solution.

I agree with you.

> (BTW, ecrypfts kinda had to go for vfs_write b/c
> it changes the data size and content of what it writes below; unionfs is
> simpler in that manner b/c it needs to write the same data to the lower file
> at the same offset.)

Ah, yes.

> 
> Another idea we've experimented with before is "page pointer flipping."  In
> writepage, we temporarily set the page->mapping->host to the lower_inode;
> then we call the lower writepage with OUR page; then fix back the
> page->mapping->host to the upper inode.  This had two benefits: first we can
> guarantee that we always have a page to write below, and second we don't
> need to keep both upper and lower pages (reduces memory pressure).  Before
> we did this page pointer flipping, we verified that the page is locked so no
> other user could be written the page->mapping->host in this transient state,
> and we ensured that no lower f/s was somehow caching the temporarily changed
> value of page->mapping->host for later use.  But, mucking with the pointers
> in this manner is kinda ugly, to say the least.  Still, I'd love to find a
> clean and simple way t

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Erez Zadok

That's a nice historical review, Huge, of how got into these mess we're in
now -- it all starts with good intentions. :-)

On a related note, I would just love to get rid of calling the lower
->writepage in unionfs b/c I can't even tell if I have a lower page to use
all the time.  I'd prefer to call vfs_write() if I can, but I'll need a
struct file, or at least a dentry.

What ecryptfs does is store a struct file inside it's inode, so it can use
it later in ->writepage to call vfs_write on the lower f/s.  And Unionfs may
have to go in that direction too, but this trick is not terribly clean --
storing a file inside an inode.

I realize that the calling path to ->writepage doesn't have a file/dentry
any more, but if we're considering larger changes to the writepage related
code, can we perhaps consider passing a file or dentry to >writepage (same
as commit_write, perhaps).

Thanks,
Erez.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Hugh Dickins

On Mon, 22 Oct 2007, Erez Zadok wrote:
> In message <[EMAIL PROTECTED]>, Hugh Dickins writes:
> > 
> > Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE.
> > Both of those set BDI_CAP_NO_WRITEBACK.  ramdisk never returned it
> > if !wbc->for_reclaim.  I contend that shmem shouldn't either: it's
> > a special code to get the LRU rotation right, not useful elsewhere.
> > Though Documentation/filesystems/vfs.txt does imply wider use.
> 
> Yes, based on vfs.txt I figured unionfs should return
> AOP_WRITEPAGE_ACTIVATE.

unionfs_writepage returns it in two different cases: when it can't
find the underlying page; and when the underlying writepage returns
it.  I'd say it's wrong to return it in both cases.

In the first case, you don't really want your page put back to the head
of the active list, you want to come back to try it again quite soon
(I think): so you should just redirty and unlock and pretend success.

ramdisk uses A_W_A because none of its pages will ever become freeable
(and comment points out it'd be better if they weren't even on the
LRUs - I think several people have recently been putting forward
patches to keep such timewasters off the LRUs).

shmem uses A_W_A when there's no swap (left), or when the underlying
shm is marked as locked in memory: in each case, best to move on to
look for other pages to swap out.  (But I'm not quite convincing myself
that the temporarily out-of-swap case is different from yours above.)
It's about fixing some horrid busy loops where vmscan kept going
over the same hopeless pages repeatedly, instead of moving on to
better candidates.  Oh, there's a third case, when move_to_swap_cache
fails: that's rare, and I think I was just too lazy to separate them.

In your second case, I fail to see why the unionfs level should
mimic the lower level: you've successfully copied data and marked
the lower level pages as dirty, vmscan will come back to those in
due course, but it's just a waste of time for it to come back to
the unionfs pages again - isn't it?

> But, now that unionfs has ->writepages which won't
> even call the lower writepage if BDI_CAP_NO_WRITEBACK is on, then perhaps I
> no longer need unionfs_writepage to bother checking for
> AOP_WRITEPAGE_ACTIVATE, or even return it up?

unionfs_writepages handles the sync/msync/fsync leaking of A_W_A to
userspace issue, as does Pekka & Andrew's patch to write_cache_pages,
as does my patch to shmem_writepage.  And I'm contending that
unionfs_writepage should in no case return A_W_A up.

But so long as A_W_A is still defined, unionfs_writepage does
still need to check for it after calling the lower level ->writepage
(because it needs to do the missing unlock_page): unionfs_writepages
prevents unionfs_writepage being called on the normal writeout path,
but it's still getting called by vmscan under memory pressure.

(I'm in the habit of saying "vmscan" rather than naming the functions
in question, because every few months someone restructures that file
and changes their names.  I exaggerate, but it's happened often enough.)

> But, a future file system _could_ return AOP_WRITEPAGE_ACTIVATE w/o setting
> BDI_CAP_NO_WRITEBACK, right?  In that case, unionfs will still need to
> handle AOP_WRITEPAGE_ACTIVATE in ->writepage, right?

For so long as AOP_WRITEPAGE_ACTIVATE exists, unionfs_writepage needs to
check for it coming from the lower level ->writepage, as I said above.

But your/Pekka's unionfs_writepages doesn't need to worry about it
at all, because Andrew/Pekka's write_cache_pages fix prevents it
leaking up in the !reclaim case (as does my shmem_writepage fix):
please remove that AOP_WRITEPAGE_ACTIVATE comment from unionfs_writepages.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 31/31] IGET: Remove iget() and the read_inode() super op as being obsolete [try #5]

2007-10-25 Thread David Howells

Remove the old iget() call and the read_inode() superblock operation it uses
as these are really obsolete, and the use of read_inode() does not produce
proper error handling (no distinction between ENOMEM and EIO when marking
an inode bad).

Furthermore, this removes the temptation to use iget() to find an inode by
number in a filesystem from code outside that filesystem.

iget_locked() should be used instead.  A new function is added in an earlier
patch (iget_failed) that is to be called to mark an inode as bad, unlock it and
release it should the get routine fail.
Mark iget() and read_inode() as being obsolete and remove references to them
from the documentation.


Typically a filesystem will be modified such that the read_inode function
becomes an internal iget function, for example the following:

void thingyfs_read_inode(struct inode *inode)
{
...
}

would be changed into something like:

struct inode *thingyfs_iget(struct super_block *sp, unsigned long ino)
{
struct inode *inode;
int ret;

inode = iget_locked(sb, ino);
if (!inode)
return ERR_PTR(-ENOMEM);
if (!(inode->i_state & I_NEW))
return inode;

...
unlock_new_inode(inode);
return inode;
error:
iget_failed(inode);
return ERR_PTR(ret);
}

and then thingyfs_iget() would be called rather than iget(), for example:

ret = -EINVAL;
inode = iget(sb, ino);
if (!inode || is_bad_inode(inode))
goto error;

becomes:

inode = thingyfs_iget(sb, ino);
if (IS_ERR(inode)) {
ret = PTR_ERR(inode);
goto error;
}

Note that is_bad_inode() does not need to be called.  The error returned by
thingyfs_iget() should render it unnecessary.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 Documentation/filesystems/Locking |3 ---
 Documentation/filesystems/porting |   12 ++--
 Documentation/filesystems/vfs.txt |   17 +++--
 fs/inode.c|4 
 include/linux/fs.h|   14 --
 5 files changed, 9 insertions(+), 41 deletions(-)

diff --git a/Documentation/filesystems/Locking 
b/Documentation/filesystems/Locking
index 37c10cb..42d4b30 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -90,7 +90,6 @@ of the locking scheme for directory operations.
 prototypes:
struct inode *(*alloc_inode)(struct super_block *sb);
void (*destroy_inode)(struct inode *);
-   void (*read_inode) (struct inode *);
void (*dirty_inode) (struct inode *);
int (*write_inode) (struct inode *, int);
void (*put_inode) (struct inode *);
@@ -114,7 +113,6 @@ locking rules:
BKL s_lock  s_umount
 alloc_inode:   no  no  no
 destroy_inode: no
-read_inode:no  (see below)
 dirty_inode:   no  (must not sleep)
 write_inode:   no
 put_inode: no
@@ -133,7 +131,6 @@ show_options:   no  
(vfsmount->sem)
 quota_read:no  no  no  (see below)
 quota_write:   no  no  no  (see below)
 
-->read_inode() is not a method - it's a callback used in iget().
 ->remount_fs() will have the s_umount lock if it's already mounted.
 When called from get_sb_single, it does NOT have the s_umount lock.
 ->quota_read() and ->quota_write() functions are both guaranteed to
diff --git a/Documentation/filesystems/porting 
b/Documentation/filesystems/porting
index 3b0fb22..93d9362 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -34,8 +34,8 @@ FOO_I(inode) (see in-tree filesystems for examples).
 
 Make them ->alloc_inode and ->destroy_inode in your super_operations.
 
-Keep in mind that now you need explicit initialization of private data -
-typically in ->read_inode() and after getting an inode from new_inode().
+Keep in mind that now you need explicit initialization of private data
+typically between calling iget_locked() and unlocking the inode.
 
 At some point that will become mandatory.
 
@@ -173,10 +173,10 @@ should be a non-blocking function that initializes those 
parts of a
 newly created inode to allow the test function to succeed. 'data' is
 passed as an opaque value to both test and set functions.
 
-When the inode has been created by iget5_locked(), it will be returned with
-the I_NEW flag set and will still be locked. read_inode has not been
-called so the file system still has to finalize the initialization. Once
-the inode is initialized it must be unlocked by calling unlock_new_inode().
+When the inode h

[PATCH 29/31] IGET: Stop HOSTFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the HOSTFS filesystem from using iget() and read_inode().  Provide
hostfs_iget(), and call that instead of iget().  hostfs_iget() then uses
iget_locked() directly and returns a proper error code instead of an inode in
the event of an error.

hostfs_fill_sb_common() returns any error incurred when getting the root inode
instead of EINVAL.

Note that the contents of hostfs_kern.c need to be examined:

 (*) hostfs_iget() should perhaps subsume init_inode() and hostfs_read_inode().

 (*) It would appear that all hostfs inodes are the same inode because iget()
 was being called with inode number 0 - which forms the lookup key.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/hostfs/hostfs_kern.c |   59 +++
 1 files changed, 39 insertions(+), 20 deletions(-)

diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index 8966b05..8aafb53 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -202,7 +202,7 @@ static char *follow_link(char *link)
return ERR_PTR(n);
 }
 
-static int read_inode(struct inode *ino)
+static int hostfs_read_inode(struct inode *ino)
 {
char *name;
int err = 0;
@@ -233,6 +233,25 @@ static int read_inode(struct inode *ino)
return err;
 }
 
+static struct inode *hostfs_iget(struct super_block *sb)
+{
+   struct inode *inode;
+   long ret;
+
+   inode = iget_locked(sb, 0);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (inode->i_state & I_NEW) {
+   ret = hostfs_read_inode(inode);
+   if (ret < 0) {
+   iget_failed(inode);
+   return ERR_PTR(ret);
+   }
+   unlock_new_inode(inode);
+   }
+   return inode;
+}
+
 int hostfs_statfs(struct dentry *dentry, struct kstatfs *sf)
 {
/*
@@ -303,17 +322,11 @@ static void hostfs_destroy_inode(struct inode *inode)
kfree(HOSTFS_I(inode));
 }
 
-static void hostfs_read_inode(struct inode *inode)
-{
-   read_inode(inode);
-}
-
 static const struct super_operations hostfs_sbops = {
.alloc_inode= hostfs_alloc_inode,
.drop_inode = generic_delete_inode,
.delete_inode   = hostfs_delete_inode,
.destroy_inode  = hostfs_destroy_inode,
-   .read_inode = hostfs_read_inode,
.statfs = hostfs_statfs,
 };
 
@@ -571,10 +584,11 @@ int hostfs_create(struct inode *dir, struct dentry 
*dentry, int mode,
char *name;
int error, fd;
 
-   error = -ENOMEM;
-   inode = iget(dir->i_sb, 0);
-   if (inode == NULL)
+   inode = hostfs_iget(dir->i_sb);
+   if (IS_ERR(inode)) {
+   error = PTR_ERR(inode);
goto out;
+   }
 
error = init_inode(inode, dentry);
if (error)
@@ -615,10 +629,11 @@ struct dentry *hostfs_lookup(struct inode *ino, struct 
dentry *dentry,
char *name;
int err;
 
-   err = -ENOMEM;
-   inode = iget(ino->i_sb, 0);
-   if (inode == NULL)
+   inode = hostfs_iget(ino->i_sb);
+   if (IS_ERR(inode)) {
+   err = PTR_ERR(inode);
goto out;
+   }
 
err = init_inode(inode, dentry);
if (err)
@@ -736,11 +751,13 @@ int hostfs_mknod(struct inode *dir, struct dentry 
*dentry, int mode, dev_t dev)
 {
struct inode *inode;
char *name;
-   int err = -ENOMEM;
+   int err;
 
-   inode = iget(dir->i_sb, 0);
-   if (inode == NULL)
+   inode = hostfs_iget(dir->i_sb);
+   if (IS_ERR(inode)) {
+   err = PTR_ERR(inode);
goto out;
+   }
 
err = init_inode(inode, dentry);
if (err)
@@ -952,9 +969,11 @@ static int hostfs_fill_sb_common(struct super_block *sb, 
void *d, int silent)
 
sprintf(host_root_path, "%s/%s", root_ino, req_root);
 
-   root_inode = iget(sb, 0);
-   if (root_inode == NULL)
+   root_inode = hostfs_iget(sb);
+   if (IS_ERR(root_inode)) {
+   err = PTR_ERR(root_inode);
goto out_free;
+   }
 
err = init_inode(root_inode, NULL);
if (err)
@@ -972,8 +991,8 @@ static int hostfs_fill_sb_common(struct super_block *sb, 
void *d, int silent)
if (sb->s_root == NULL)
goto out_put;
 
-   err = read_inode(root_inode);
-   if (err) {
+   err = hostfs_read_inode(root_inode);
+   if (err){
/* No iput in this case because the dput does that for us */
dput(sb->s_root);
sb->s_root = NULL;

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 30/31] IGET: Stop HPPFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the HPPFS filesystem from using iget() and read_inode().  Provide an
hppfs_iget(), and call that instead of iget().  hppfs_iget() then uses
iget_locked() directly and returns a proper error code instead of an inode in
the event of an error.

hppfs_fill_sb_common() returns any error incurred when getting the root inode
instead of EINVAL.

Note that the contents of hppfs_kern.c need to be examined:

 (*) The HPPFS inode retains a pointer to the proc dentry it is shadowing, but
 whilst it does appear to retain a reference to it, it doesn't appear to
 destroy the reference if the inode goes away.

 (*) hppfs_iget() should perhaps subsume init_inode() and hppfs_read_inode().

 (*) It would appear that all hppfs inodes are the same inode because iget()
 was being called with inode number 0, which forms the lookup key.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/hppfs/hppfs_kern.c |   27 ++-
 1 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/hppfs/hppfs_kern.c b/fs/hppfs/hppfs_kern.c
index affb741..a1e1f0f 100644
--- a/fs/hppfs/hppfs_kern.c
+++ b/fs/hppfs/hppfs_kern.c
@@ -155,6 +155,20 @@ static void hppfs_read_inode(struct inode *ino)
ino->i_blocks = proc_ino->i_blocks;
 }
 
+static struct inode *hppfs_iget(struct super_block *sb)
+{
+   struct inode *inode;
+
+   inode = iget_locked(sb, 0);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (inode->i_state & I_NEW) {
+   hppfs_read_inode(inode);
+   unlock_new_inode(inode);
+   }
+   return inode;
+}
+
 static struct dentry *hppfs_lookup(struct inode *ino, struct dentry *dentry,
   struct nameidata *nd)
 {
@@ -190,9 +204,11 @@ static struct dentry *hppfs_lookup(struct inode *ino, 
struct dentry *dentry,
if(IS_ERR(proc_dentry))
return(proc_dentry);
 
-   inode = iget(ino->i_sb, 0);
-   if(inode == NULL)
+   inode = hppfs_iget(ino->i_sb);
+   if (IS_ERR(inode)) {
+   err = PTR_ERR(inode);
goto out_dput;
+   }
 
err = init_inode(inode, proc_dentry);
if(err)
@@ -652,7 +668,6 @@ static void hppfs_destroy_inode(struct inode *inode)
 static const struct super_operations hppfs_sbops = {
.alloc_inode= hppfs_alloc_inode,
.destroy_inode  = hppfs_destroy_inode,
-   .read_inode = hppfs_read_inode,
.delete_inode   = hppfs_delete_inode,
.statfs = hppfs_statfs,
 };
@@ -745,9 +760,11 @@ static int hppfs_fill_super(struct super_block *sb, void 
*d, int silent)
sb->s_magic = HPPFS_SUPER_MAGIC;
sb->s_op = &hppfs_sbops;
 
-   root_inode = iget(sb, 0);
-   if(root_inode == NULL)
+   root_inode = hppfs_iget(sb);
+   if (IS_ERR(root_inode)) {
+   err = PTR_ERR(root_inode);
goto out;
+   }
 
err = init_inode(root_inode, proc_sb->s_root);
if(err)

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 28/31] IGET: Stop OPENPROMFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the OPENPROMFS filesystem from using iget() and read_inode().  Replace
openpromfs_read_inode() with openpromfs_iget(), and call that instead of
iget().  openpromfs_iget() then uses iget_locked() directly and returns a
proper error code instead of an inode in the event of an error.

openpromfs_fill_super() returns any error incurred when getting the root inode
instead of ENOMEM (not that it currently incurs any other error).

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/openpromfs/inode.c |   45 ++---
 1 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index d881738..554586b 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -38,6 +38,8 @@ struct op_inode_info {
union op_inode_data u;
 };
 
+static struct inode *openprom_iget(struct super_block *sb, ino_t ino);
+
 static inline struct op_inode_info *OP_I(struct inode *inode)
 {
return container_of(inode, struct op_inode_info, vfs_inode);
@@ -226,10 +228,10 @@ static struct dentry *openpromfs_lookup(struct inode 
*dir, struct dentry *dentry
return ERR_PTR(-ENOENT);
 
 found:
-   inode = iget(dir->i_sb, ino);
+   inode = openprom_iget(dir->i_sb, ino);
mutex_unlock(&op_mutex);
-   if (!inode)
-   return ERR_PTR(-EINVAL);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
ent_oi = OP_I(inode);
ent_oi->type = ent_type;
ent_oi->u = ent_data;
@@ -348,14 +350,23 @@ static void openprom_destroy_inode(struct inode *inode)
kmem_cache_free(op_inode_cachep, OP_I(inode));
 }
 
-static void openprom_read_inode(struct inode * inode)
+static struct inode *openprom_iget(struct super_block *sb, ino_t ino)
 {
-   inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
-   if (inode->i_ino == OPENPROM_ROOT_INO) {
-   inode->i_op = &openprom_inode_operations;
-   inode->i_fop = &openprom_operations;
-   inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO;
+   struct inode *inode;
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (inode->i_state & I_NEW) {
+   inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
+   if (inode->i_ino == OPENPROM_ROOT_INO) {
+   inode->i_op = &openprom_inode_operations;
+   inode->i_fop = &openprom_operations;
+   inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO;
+   }
+   unlock_new_inode(inode);
}
+   return inode;
 }
 
 static int openprom_remount(struct super_block *sb, int *flags, char *data)
@@ -367,7 +378,6 @@ static int openprom_remount(struct super_block *sb, int 
*flags, char *data)
 static const struct super_operations openprom_sops = {
.alloc_inode= openprom_alloc_inode,
.destroy_inode  = openprom_destroy_inode,
-   .read_inode = openprom_read_inode,
.statfs = simple_statfs,
.remount_fs = openprom_remount,
 };
@@ -376,6 +386,7 @@ static int openprom_fill_super(struct super_block *s, void 
*data, int silent)
 {
struct inode *root_inode;
struct op_inode_info *oi;
+   int ret;
 
s->s_flags |= MS_NOATIME;
s->s_blocksize = 1024;
@@ -383,9 +394,11 @@ static int openprom_fill_super(struct super_block *s, void 
*data, int silent)
s->s_magic = OPENPROM_SUPER_MAGIC;
s->s_op = &openprom_sops;
s->s_time_gran = 1;
-   root_inode = iget(s, OPENPROM_ROOT_INO);
-   if (!root_inode)
+   root_inode = openprom_iget(s, OPENPROM_ROOT_INO);
+   if (IS_ERR(root_inode)) {
+   ret = PTR_ERR(root_inode);
goto out_no_root;
+   }
 
oi = OP_I(root_inode);
oi->type = op_inode_node;
@@ -393,13 +406,15 @@ static int openprom_fill_super(struct super_block *s, 
void *data, int silent)
 
s->s_root = d_alloc_root(root_inode);
if (!s->s_root)
-   goto out_no_root;
+   goto out_no_root_dentry;
return 0;
 
+out_no_root_dentry:
+   iput(root_inode);
+   ret = -ENOMEM;
 out_no_root:
printk("openprom_fill_super: get root inode failed\n");
-   iput(root_inode);
-   return -ENOMEM;
+   return ret;
 }
 
 static int openprom_get_sb(struct file_system_type *fs_type,

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 23/31] IGET: Stop PROCFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the PROCFS filesystem from using iget() and read_inode().  Merge
procfs_read_inode() into procfs_get_inode(), and have that call iget_locked()
instead of iget().

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/proc/inode.c |   60 ++-
 1 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index abe6a3f..1b597b7 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -78,11 +78,6 @@ static void proc_delete_inode(struct inode *inode)
 
 struct vfsmount *proc_mnt;
 
-static void proc_read_inode(struct inode * inode)
-{
-   inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
-}
-
 static struct kmem_cache * proc_inode_cachep;
 
 static struct inode *proc_alloc_inode(struct super_block *sb)
@@ -133,7 +128,6 @@ static int proc_remount(struct super_block *sb, int *flags, 
char *data)
 static const struct super_operations proc_sops = {
.alloc_inode= proc_alloc_inode,
.destroy_inode  = proc_destroy_inode,
-   .read_inode = proc_read_inode,
.drop_inode = generic_delete_inode,
.delete_inode   = proc_delete_inode,
.statfs = simple_statfs,
@@ -406,39 +400,41 @@ struct inode *proc_get_inode(struct super_block *sb, 
unsigned int ino,
if (de != NULL && !try_module_get(de->owner))
goto out_mod;
 
-   inode = iget(sb, ino);
+   inode = iget_locked(sb, ino);
if (!inode)
goto out_ino;
-
-   PROC_I(inode)->fd = 0;
-   PROC_I(inode)->pde = de;
-   if (de) {
-   if (de->mode) {
-   inode->i_mode = de->mode;
-   inode->i_uid = de->uid;
-   inode->i_gid = de->gid;
-   }
-   if (de->size)
-   inode->i_size = de->size;
-   if (de->nlink)
-   inode->i_nlink = de->nlink;
-   if (de->proc_iops)
-   inode->i_op = de->proc_iops;
-   if (de->proc_fops) {
-   if (S_ISREG(inode->i_mode)) {
+   if (inode->i_state & I_NEW) {
+   inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
+   PROC_I(inode)->fd = 0;
+   PROC_I(inode)->pde = de;
+   if (de) {
+   if (de->mode) {
+   inode->i_mode = de->mode;
+   inode->i_uid = de->uid;
+   inode->i_gid = de->gid;
+   }
+   if (de->size)
+   inode->i_size = de->size;
+   if (de->nlink)
+   inode->i_nlink = de->nlink;
+   if (de->proc_iops)
+   inode->i_op = de->proc_iops;
+   if (de->proc_fops) {
+   if (S_ISREG(inode->i_mode)) {
 #ifdef CONFIG_COMPAT
-   if (!de->proc_fops->compat_ioctl)
-   inode->i_fop =
-   &proc_reg_file_ops_no_compat;
-   else
+   if (!de->proc_fops->compat_ioctl)
+   inode->i_fop =
+   
&proc_reg_file_ops_no_compat;
+   else
 #endif
-   inode->i_fop = &proc_reg_file_ops;
+   inode->i_fop = 
&proc_reg_file_ops;
+   }
+   else
+   inode->i_fop = de->proc_fops;
}
-   else
-   inode->i_fop = de->proc_fops;
}
+   unlock_new_inode(inode);
}
-
return inode;
 
 out_ino:

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 26/31] IGET: Stop the SYSV filesystem from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the SYSV filesystem from using iget() and read_inode().  Replace
sysv_read_inode() with sysv_iget(), and call that instead of iget().
sysv_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/sysv/inode.c |   25 -
 fs/sysv/namei.c |6 +++---
 fs/sysv/super.c |4 ++--
 fs/sysv/sysv.h  |1 +
 4 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 81ec6c5..c5d60de 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -169,20 +169,27 @@ void sysv_set_inode(struct inode *inode, dev_t rdev)
init_special_inode(inode, inode->i_mode, rdev);
 }
 
-static void sysv_read_inode(struct inode *inode)
+struct inode *sysv_iget(struct super_block *sb, unsigned int ino)
 {
-   struct super_block * sb = inode->i_sb;
struct sysv_sb_info * sbi = SYSV_SB(sb);
struct buffer_head * bh;
struct sysv_inode * raw_inode;
struct sysv_inode_info * si;
-   unsigned int block, ino = inode->i_ino;
+   struct inode *inode;
+   unsigned int block;
 
if (!ino || ino > sbi->s_ninodes) {
printk("Bad inode number on dev %s: %d is out of range\n",
-  inode->i_sb->s_id, ino);
-   goto bad_inode;
+  sb->s_id, ino);
+   return ERR_PTR(-EIO);
}
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
raw_inode = sysv_raw_inode(sb, ino, &bh);
if (!raw_inode) {
printk("Major problem: unable to read inode from dev %s\n",
@@ -214,11 +221,12 @@ static void sysv_read_inode(struct inode *inode)
   old_decode_dev(fs32_to_cpu(sbi, si->i_data[0])));
else
sysv_set_inode(inode, 0);
-   return;
+   unlock_new_inode(inode);
+   return inode;
 
 bad_inode:
-   make_bad_inode(inode);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
 }
 
 static struct buffer_head * sysv_update_inode(struct inode * inode)
@@ -328,7 +336,6 @@ static void init_once(struct kmem_cache *cachep, void *p)
 const struct super_operations sysv_sops = {
.alloc_inode= sysv_alloc_inode,
.destroy_inode  = sysv_destroy_inode,
-   .read_inode = sysv_read_inode,
.write_inode= sysv_write_inode,
.delete_inode   = sysv_delete_inode,
.put_super  = sysv_put_super,
diff --git a/fs/sysv/namei.c b/fs/sysv/namei.c
index 6bd850b..a1f1ef3 100644
--- a/fs/sysv/namei.c
+++ b/fs/sysv/namei.c
@@ -53,9 +53,9 @@ static struct dentry *sysv_lookup(struct inode * dir, struct 
dentry * dentry, st
ino = sysv_inode_by_name(dentry);
 
if (ino) {
-   inode = iget(dir->i_sb, ino);
-   if (!inode)
-   return ERR_PTR(-EACCES);
+   inode = sysv_iget(dir->i_sb, ino);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
}
d_add(dentry, inode);
return NULL;
diff --git a/fs/sysv/super.c b/fs/sysv/super.c
index 6f9707a..a92d104 100644
--- a/fs/sysv/super.c
+++ b/fs/sysv/super.c
@@ -332,8 +332,8 @@ static int complete_read_super(struct super_block *sb, int 
silent, int size)
sb->s_magic = SYSV_MAGIC_BASE + sbi->s_type;
/* set up enough so that it can read an inode */
sb->s_op = &sysv_sops;
-   root_inode = iget(sb,SYSV_ROOT_INO);
-   if (!root_inode || is_bad_inode(root_inode)) {
+   root_inode = sysv_iget(sb,SYSV_ROOT_INO);
+   if (IS_ERR(root_inode)) {
printk("SysV FS: get root inode failed\n");
return 0;
}
diff --git a/fs/sysv/sysv.h b/fs/sysv/sysv.h
index 64c03bd..42d51d1 100644
--- a/fs/sysv/sysv.h
+++ b/fs/sysv/sysv.h
@@ -141,6 +141,7 @@ extern int __sysv_write_begin(struct file *file, struct 
address_space *mapping,
struct page **pagep, void **fsdata);
 
 /* inode.c */
+extern struct inode *sysv_iget(struct super_block *, unsigned int);
 extern int sysv_write_inode(struct inode *, int);
 extern int sysv_sync_inode(struct inode *);
 extern int sysv_sync_file(struct file *, struct dentry *, int);

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 27/31] IGET: Stop UFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the UFS filesystem from using iget() and read_inode().  Replace
ufs_read_inode() with ufs_iget(), and call that instead of iget().
ufs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

ufs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/ufs/inode.c |   34 --
 fs/ufs/namei.c |6 +++---
 fs/ufs/super.c |   14 +-
 fs/ufs/ufs.h   |2 +-
 4 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index 4320782..73539ef 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -714,26 +714,30 @@ static int ufs2_read_inode(struct inode *inode, struct 
ufs2_inode *ufs2_inode)
return 0;
 }
 
-void ufs_read_inode(struct inode * inode)
+struct inode *ufs_iget (struct super_block *sb, unsigned long ino)
 {
-   struct ufs_inode_info *ufsi = UFS_I(inode);
-   struct super_block * sb;
-   struct ufs_sb_private_info * uspi;
+   struct ufs_inode_info *ufsi;
+   struct ufs_sb_private_info * uspi = UFS_SB(sb)->s_uspi;
struct buffer_head * bh;
+   struct inode *inode;
int err;
 
-   UFSD("ENTER, ino %lu\n", inode->i_ino);
-
-   sb = inode->i_sb;
-   uspi = UFS_SB(sb)->s_uspi;
+   UFSD("ENTER, ino %lu\n", ino);
 
-   if (inode->i_ino < UFS_ROOTINO ||
-   inode->i_ino > (uspi->s_ncg * uspi->s_ipg)) {
+   if (ino < UFS_ROOTINO || ino > (uspi->s_ncg * uspi->s_ipg)) {
ufs_warning(sb, "ufs_read_inode", "bad inode number (%lu)\n",
-   inode->i_ino);
-   goto bad_inode;
+   ino);
+   return ERR_PTR(-EIO);
}
 
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   ufsi = UFS_I(inode);
+
bh = sb_bread(sb, uspi->s_sbbase + ufs_inotofsba(inode->i_ino));
if (!bh) {
ufs_warning(sb, "ufs_read_inode", "unable to read inode %lu\n",
@@ -765,10 +769,12 @@ void ufs_read_inode(struct inode * inode)
brelse(bh);
 
UFSD("EXIT\n");
-   return;
+   unlock_new_inode(inode);
+   return inode;
 
 bad_inode:
-   make_bad_inode(inode);
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
 }
 
 static void ufs1_update_inode(struct inode *inode, struct ufs_inode *ufs_inode)
diff --git a/fs/ufs/namei.c b/fs/ufs/namei.c
index d8bfbee..747a4de 100644
--- a/fs/ufs/namei.c
+++ b/fs/ufs/namei.c
@@ -57,10 +57,10 @@ static struct dentry *ufs_lookup(struct inode * dir, struct 
dentry *dentry, stru
lock_kernel();
ino = ufs_inode_by_name(dir, dentry);
if (ino) {
-   inode = iget(dir->i_sb, ino);
-   if (!inode) {
+   inode = ufs_iget(dir->i_sb, ino);
+   if (IS_ERR(inode)) {
unlock_kernel();
-   return ERR_PTR(-EACCES);
+   return ERR_CAST(inode);
}
}
unlock_kernel();
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 584cf12..e58045f 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -633,6 +633,7 @@ static int ufs_fill_super(struct super_block *sb, void 
*data, int silent)
unsigned block_size, super_block_size;
unsigned flags;
unsigned super_block_offset;
+   int ret = -EINVAL;
 
uspi = NULL;
ubh = NULL;
@@ -1066,12 +1067,16 @@ magic_found:
uspi->s_maxsymlinklen =
fs32_to_cpu(sb, usb3->fs_un2.fs_44.fs_maxsymlinklen);
 
-   inode = iget(sb, UFS_ROOTINO);
-   if (!inode || is_bad_inode(inode))
+   inode = ufs_iget(sb, UFS_ROOTINO);
+   if (IS_ERR(inode)) {
+   ret = PTR_ERR(inode);
goto failed;
+   }
sb->s_root = d_alloc_root(inode);
-   if (!sb->s_root)
+   if (!sb->s_root) {
+   ret = -ENOMEM;
goto dalloc_failed;
+   }
 
ufs_setup_cstotal(sb);
/*
@@ -1093,7 +1098,7 @@ failed:
kfree(sbi);
sb->s_fs_info = NULL;
UFSD("EXIT (FAILED)\n");
-   return -EINVAL;
+   return ret;
 
 failed_nomem:
UFSD("EXIT (NOMEM)\n");
@@ -1327,7 +1332,6 @@ static ssize_t ufs_quota_write(struct super_block *, int, 
const char *, size_t,
 static const struct super_operations ufs_super_ops = {
.alloc_inode= ufs_alloc_inode,
.destroy_inode  = ufs_destroy_inode,
-   .read_inode = ufs_read_inode,
.write_inode= ufs_write_inode,
.delete_inode   = ufs_delete_inode,
.put_super  = ufs_put_super,
diff --git a/fs/ufs/ufs.h b/fs/ufs/ufs.h
index 7faa4cd..fcb9231 100644
--- a/fs/ufs/ufs.h
+++ b/fs/ufs/ufs.h
@@ -106,7 +106,7 @@ extern void ufs_fre

[PATCH 25/31] IGET: Stop ROMFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the ROMFS filesystem from using iget() and read_inode().  Replace
romfs_read_inode() with romfs_iget(), and call that instead of iget().
romfs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

romfs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/romfs/inode.c |   45 +++--
 1 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/fs/romfs/inode.c b/fs/romfs/inode.c
index a49cf5b..18f1fe3 100644
--- a/fs/romfs/inode.c
+++ b/fs/romfs/inode.c
@@ -84,6 +84,8 @@ struct romfs_inode_info {
struct inode vfs_inode;
 };
 
+static struct inode *romfs_iget(struct super_block *, unsigned long);
+
 /* instead of private superblock data */
 static inline unsigned long romfs_maxsize(struct super_block *sb)
 {
@@ -117,7 +119,7 @@ static int romfs_fill_super(struct super_block *s, void 
*data, int silent)
struct buffer_head *bh;
struct romfs_super_block *rsb;
struct inode *root;
-   int sz;
+   int sz, ret = -EINVAL;
 
/* I would parse the options here, but there are none.. :) */
 
@@ -157,10 +159,13 @@ static int romfs_fill_super(struct super_block *s, void 
*data, int silent)
 & ROMFH_MASK;
 
s->s_op = &romfs_ops;
-   root = iget(s, sz);
-   if (!root)
+   root = romfs_iget(s, sz);
+   if (IS_ERR(root)) {
+   ret = PTR_ERR(root);
goto out;
+   }
 
+   ret = -ENOMEM;
s->s_root = d_alloc_root(root);
if (!s->s_root)
goto outiput;
@@ -173,7 +178,7 @@ outiput:
 out:
brelse(bh);
 outnobh:
-   return -EINVAL;
+   return ret;
 }
 
 /* That's simple too. */
@@ -389,8 +394,11 @@ romfs_lookup(struct inode *dir, struct dentry *dentry, 
struct nameidata *nd)
if ((be32_to_cpu(ri.next) & ROMFH_TYPE) == ROMFH_HRD)
offset = be32_to_cpu(ri.spec) & ROMFH_MASK;
 
-   if ((inode = iget(dir->i_sb, offset)))
-   goto outi;
+   inode = romfs_iget(dir->i_sb, offset);
+   if (IS_ERR(inode)) {
+   res = PTR_ERR(inode);
+   goto out;
+   }
 
/*
 * it's a bit funky, _lookup needs to return an error code
@@ -402,7 +410,7 @@ romfs_lookup(struct inode *dir, struct dentry *dentry, 
struct nameidata *nd)
 */
 
 out0:  inode = NULL;
-outi:  res = 0;
+   res = 0;
d_add (dentry, inode);
 
 out:   unlock_kernel();
@@ -478,20 +486,28 @@ static mode_t romfs_modemap[] =
S_IFBLK+0600, S_IFCHR+0600, S_IFSOCK+0644, S_IFIFO+0644
 };
 
-static void
-romfs_read_inode(struct inode *i)
+static struct inode *
+romfs_iget(struct super_block *sb, unsigned long ino)
 {
-   int nextfh, ino;
+   int nextfh;
struct romfs_inode ri;
+   struct inode *i;
+
+   ino &= ROMFH_MASK;
+   i = iget_locked(sb, ino);
+   if (!i)
+   return ERR_PTR(-ENOMEM);
+   if (!(i->i_state & I_NEW))
+   return i;
 
-   ino = i->i_ino & ROMFH_MASK;
i->i_mode = 0;
 
/* Loop for finding the real hard link */
for(;;) {
if (romfs_copyfrom(i, &ri, ino, ROMFH_SIZE) <= 0) {
-   printk("romfs: read error for inode 0x%x\n", ino);
-   return;
+   printk("romfs: read error for inode 0x%lx\n", ino);
+   iget_failed(i);
+   return ERR_PTR(-EIO);
}
/* XXX: do romfs_checksum here too (with name) */
 
@@ -548,6 +564,8 @@ romfs_read_inode(struct inode *i)
init_special_inode(i, ino,
MKDEV(nextfh>>16,nextfh&0x));
}
+   unlock_new_inode(i);
+   return i;
 }
 
 static struct kmem_cache * romfs_inode_cachep;
@@ -599,7 +617,6 @@ static int romfs_remount(struct super_block *sb, int 
*flags, char *data)
 static const struct super_operations romfs_ops = {
.alloc_inode= romfs_alloc_inode,
.destroy_inode  = romfs_destroy_inode,
-   .read_inode = romfs_read_inode,
.statfs = romfs_statfs,
.remount_fs = romfs_remount,
 };

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 24/31] IGET: Stop QNX4 from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the QNX4 filesystem from using iget() and read_inode().  Replace
qnx4_read_inode() with qnx4_iget(), and call that instead of iget().
qnx4_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

qnx4_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/qnx4/inode.c |   45 ++---
 fs/qnx4/namei.c |8 +---
 include/linux/qnx4_fs.h |1 +
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index 638bdb9..d23bfe3 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -125,7 +125,6 @@ static int qnx4_write_inode(struct inode *inode, int unused)
 static void qnx4_put_super(struct super_block *sb);
 static struct inode *qnx4_alloc_inode(struct super_block *sb);
 static void qnx4_destroy_inode(struct inode *inode);
-static void qnx4_read_inode(struct inode *);
 static int qnx4_remount(struct super_block *sb, int *flags, char *data);
 static int qnx4_statfs(struct dentry *, struct kstatfs *);
 
@@ -133,7 +132,6 @@ static const struct super_operations qnx4_sops =
 {
.alloc_inode= qnx4_alloc_inode,
.destroy_inode  = qnx4_destroy_inode,
-   .read_inode = qnx4_read_inode,
.put_super  = qnx4_put_super,
.statfs = qnx4_statfs,
.remount_fs = qnx4_remount,
@@ -357,6 +355,7 @@ static int qnx4_fill_super(struct super_block *s, void 
*data, int silent)
struct inode *root;
const char *errmsg;
struct qnx4_sb_info *qs;
+   int ret = -EINVAL;
 
qs = kzalloc(sizeof(struct qnx4_sb_info), GFP_KERNEL);
if (!qs)
@@ -396,12 +395,14 @@ static int qnx4_fill_super(struct super_block *s, void 
*data, int silent)
}
 
/* does root not have inode number QNX4_ROOT_INO ?? */
-   root = iget(s, QNX4_ROOT_INO * QNX4_INODES_PER_BLOCK);
-   if (!root) {
+   root = qnx4_iget(s, QNX4_ROOT_INO * QNX4_INODES_PER_BLOCK);
+   if (IS_ERR(root)) {
printk("qnx4: get inode failed\n");
+   ret = PTR_ERR(root);
goto out;
}
 
+   ret = -ENOMEM;
s->s_root = d_alloc_root(root);
if (s->s_root == NULL)
goto outi;
@@ -417,7 +418,7 @@ static int qnx4_fill_super(struct super_block *s, void 
*data, int silent)
   outnobh:
kfree(qs);
s->s_fs_info = NULL;
-   return -EINVAL;
+   return ret;
 }
 
 static void qnx4_put_super(struct super_block *sb)
@@ -462,29 +463,37 @@ static const struct address_space_operations qnx4_aops = {
.bmap   = qnx4_bmap
 };
 
-static void qnx4_read_inode(struct inode *inode)
+struct inode *qnx4_iget(struct super_block *sb, unsigned long ino)
 {
struct buffer_head *bh;
struct qnx4_inode_entry *raw_inode;
-   int block, ino;
-   struct super_block *sb = inode->i_sb;
-   struct qnx4_inode_entry *qnx4_inode = qnx4_raw_inode(inode);
+   int block;
+   struct qnx4_inode_entry *qnx4_inode;
+   struct inode *inode;
 
-   ino = inode->i_ino;
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   qnx4_inode = qnx4_raw_inode(inode);
inode->i_mode = 0;
 
QNX4DEBUG(("Reading inode : [%d]\n", ino));
if (!ino) {
-   printk("qnx4: bad inode number on dev %s: %d is out of range\n",
+   printk("qnx4: bad inode number on dev %s: %lu is out of 
range\n",
   sb->s_id, ino);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
}
block = ino / QNX4_INODES_PER_BLOCK;
 
if (!(bh = sb_bread(sb, block))) {
printk("qnx4: major problem: unable to read inode from dev "
   "%s\n", sb->s_id);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
}
raw_inode = ((struct qnx4_inode_entry *) bh->b_data) +
(ino % QNX4_INODES_PER_BLOCK);
@@ -515,9 +524,15 @@ static void qnx4_read_inode(struct inode *inode)
inode->i_op = &page_symlink_inode_operations;
inode->i_mapping->a_ops = &qnx4_aops;
qnx4_i(inode)->mmu_private = inode->i_size;
-   } else
-   printk("qnx4: bad inode %d on dev %s\n",ino,sb->s_id);
+   } else {
+   printk("qnx4: bad inode %lu on dev %s\n",ino,sb->s_id);
+   iget_failed(inode);
+   brelse(bh);
+   return ERR_PTR(-EIO);
+   }
brelse(bh);
+   unlock_new_inode(inode);
+   return inode;
 }
 
 static struct kmem_cache *qnx4_inode_cachep;
diff --git a/fs/qnx4/namei.c b/fs/qnx4/namei.c
index 733cdf

[PATCH 17/31] IGET: Stop FUSE from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the FUSE filesystem from using read_inode(), which it doesn't use anyway.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/fuse/inode.c |6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 9a68d69..e50be15 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -75,11 +75,6 @@ static void fuse_destroy_inode(struct inode *inode)
kmem_cache_free(fuse_inode_cachep, inode);
 }
 
-static void fuse_read_inode(struct inode *inode)
-{
-   /* No op */
-}
-
 void fuse_send_forget(struct fuse_conn *fc, struct fuse_req *req,
  unsigned long nodeid, u64 nlookup)
 {
@@ -513,7 +508,6 @@ static struct inode *get_root_inode(struct super_block *sb, 
unsigned mode)
 static const struct super_operations fuse_super_operations = {
.alloc_inode= fuse_alloc_inode,
.destroy_inode  = fuse_destroy_inode,
-   .read_inode = fuse_read_inode,
.clear_inode= fuse_clear_inode,
.drop_inode = generic_delete_inode,
.remount_fs = fuse_remount_fs,

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 22/31] IGET: Stop the MINIX filesystem from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the MINIX filesystem from using iget() and read_inode().  Replace
minix_read_inode() with minix_iget(), and call that instead of iget().
minix_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

minix_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/minix/inode.c |   43 +--
 fs/minix/minix.h |1 +
 fs/minix/namei.c |7 +++
 3 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index bf4cd31..5523407 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 
-static void minix_read_inode(struct inode * inode);
 static int minix_write_inode(struct inode * inode, int wait);
 static int minix_statfs(struct dentry *dentry, struct kstatfs *buf);
 static int minix_remount (struct super_block * sb, int * flags, char * data);
@@ -96,7 +95,6 @@ static void destroy_inodecache(void)
 static const struct super_operations minix_sops = {
.alloc_inode= minix_alloc_inode,
.destroy_inode  = minix_destroy_inode,
-   .read_inode = minix_read_inode,
.write_inode= minix_write_inode,
.delete_inode   = minix_delete_inode,
.put_super  = minix_put_super,
@@ -149,6 +147,7 @@ static int minix_fill_super(struct super_block *s, void 
*data, int silent)
unsigned long i, block;
struct inode *root_inode;
struct minix_sb_info *sbi;
+   int ret = -EINVAL;
 
sbi = kzalloc(sizeof(struct minix_sb_info), GFP_KERNEL);
if (!sbi)
@@ -246,10 +245,13 @@ static int minix_fill_super(struct super_block *s, void 
*data, int silent)
 
/* set up enough so that it can read an inode */
s->s_op = &minix_sops;
-   root_inode = iget(s, MINIX_ROOT_INO);
-   if (!root_inode || is_bad_inode(root_inode))
+   root_inode = minix_iget(s, MINIX_ROOT_INO);
+   if (IS_ERR(root_inode)) {
+   ret = PTR_ERR(root_inode);
goto out_no_root;
+   }
 
+   ret = -ENOMEM;
s->s_root = d_alloc_root(root_inode);
if (!s->s_root)
goto out_iput;
@@ -290,6 +292,7 @@ out_freemap:
goto out_release;
 
 out_no_map:
+   ret = -ENOMEM;
if (!silent)
printk("MINIX-fs: can't allocate map\n");
goto out_release;
@@ -316,7 +319,7 @@ out_bad_sb:
 out:
s->s_fs_info = NULL;
kfree(sbi);
-   return -EINVAL;
+   return ret;
 }
 
 static int minix_statfs(struct dentry *dentry, struct kstatfs *buf)
@@ -409,7 +412,7 @@ void minix_set_inode(struct inode *inode, dev_t rdev)
 /*
  * The minix V1 function to read an inode.
  */
-static void V1_minix_read_inode(struct inode * inode)
+static struct inode *V1_minix_iget(struct inode * inode)
 {
struct buffer_head * bh;
struct minix_inode * raw_inode;
@@ -418,8 +421,8 @@ static void V1_minix_read_inode(struct inode * inode)
 
raw_inode = minix_V1_raw_inode(inode->i_sb, inode->i_ino, &bh);
if (!raw_inode) {
-   make_bad_inode(inode);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
}
inode->i_mode = raw_inode->i_mode;
inode->i_uid = (uid_t)raw_inode->i_uid;
@@ -435,12 +438,14 @@ static void V1_minix_read_inode(struct inode * inode)
minix_inode->u.i1_data[i] = raw_inode->i_zone[i];
minix_set_inode(inode, old_decode_dev(raw_inode->i_zone[0]));
brelse(bh);
+   unlock_new_inode(inode);
+   return inode;
 }
 
 /*
  * The minix V2 function to read an inode.
  */
-static void V2_minix_read_inode(struct inode * inode)
+static struct inode *V2_minix_iget(struct inode * inode)
 {
struct buffer_head * bh;
struct minix2_inode * raw_inode;
@@ -449,8 +454,8 @@ static void V2_minix_read_inode(struct inode * inode)
 
raw_inode = minix_V2_raw_inode(inode->i_sb, inode->i_ino, &bh);
if (!raw_inode) {
-   make_bad_inode(inode);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
}
inode->i_mode = raw_inode->i_mode;
inode->i_uid = (uid_t)raw_inode->i_uid;
@@ -468,17 +473,27 @@ static void V2_minix_read_inode(struct inode * inode)
minix_inode->u.i2_data[i] = raw_inode->i_zone[i];
minix_set_inode(inode, old_decode_dev(raw_inode->i_zone[0]));
brelse(bh);
+   unlock_new_inode(inode);
+   return inode;
 }
 
 /*
  * The global function to read an inode.
  */
-static void minix_read_inode(struct inode * inode)
+struct inode *minix_iget(struct super_block *sb, unsigned long ino)
 {
+   struct inode *inode;
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inod

[PATCH 18/31] IGET: Stop HFSPLUS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the HFSPLUS filesystem from using iget() and read_inode().  Replace
hfsplus_read_inode() with hfsplus_iget(), and call that instead of iget().
hfsplus_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

hfsplus_fill_super() returns any error incurred when getting the root inode.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/hfsplus/btree.c  |6 --
 fs/hfsplus/dir.c|6 +++---
 fs/hfsplus/hfsplus_fs.h |3 +++
 fs/hfsplus/super.c  |   47 +--
 4 files changed, 43 insertions(+), 19 deletions(-)

diff --git a/fs/hfsplus/btree.c b/fs/hfsplus/btree.c
index 050d29c..bb54336 100644
--- a/fs/hfsplus/btree.c
+++ b/fs/hfsplus/btree.c
@@ -22,6 +22,7 @@ struct hfs_btree *hfs_btree_open(struct super_block *sb, u32 
id)
struct hfs_btree *tree;
struct hfs_btree_header_rec *head;
struct address_space *mapping;
+   struct inode *inode;
struct page *page;
unsigned int size;
 
@@ -33,9 +34,10 @@ struct hfs_btree *hfs_btree_open(struct super_block *sb, u32 
id)
spin_lock_init(&tree->hash_lock);
tree->sb = sb;
tree->cnid = id;
-   tree->inode = iget(sb, id);
-   if (!tree->inode)
+   inode = hfsplus_iget(sb, id);
+   if (IS_ERR(inode))
goto free_tree;
+   tree->inode = inode;
 
mapping = tree->inode->i_mapping;
page = read_mapping_page(mapping, 0, NULL);
diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c
index 1955ee6..2968364 100644
--- a/fs/hfsplus/dir.c
+++ b/fs/hfsplus/dir.c
@@ -97,9 +97,9 @@ again:
goto fail;
}
hfs_find_exit(&fd);
-   inode = iget(dir->i_sb, cnid);
-   if (!inode)
-   return ERR_PTR(-EACCES);
+   inode = hfsplus_iget(dir->i_sb, cnid);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
if (S_ISREG(inode->i_mode))
HFSPLUS_I(inode).dev = linkid;
 out:
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index d9f5eda..d72d0a8 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -345,6 +345,9 @@ int hfsplus_parse_options(char *, struct hfsplus_sb_info *);
 void hfsplus_fill_defaults(struct hfsplus_sb_info *);
 int hfsplus_show_options(struct seq_file *, struct vfsmount *);
 
+/* super.c */
+struct inode *hfsplus_iget(struct super_block *, unsigned long);
+
 /* tables.c */
 extern u16 hfsplus_case_fold_table[];
 extern u16 hfsplus_decompose_table[];
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index ecf70da..b0f9ad3 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -20,11 +20,18 @@ static void hfsplus_destroy_inode(struct inode *inode);
 
 #include "hfsplus_fs.h"
 
-static void hfsplus_read_inode(struct inode *inode)
+struct inode *hfsplus_iget(struct super_block *sb, unsigned long ino)
 {
struct hfs_find_data fd;
struct hfsplus_vh *vhdr;
-   int err;
+   struct inode *inode;
+   long err = -EIO;
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
 
INIT_LIST_HEAD(&HFSPLUS_I(inode).open_dir_list);
init_MUTEX(&HFSPLUS_I(inode).extents_lock);
@@ -41,7 +48,7 @@ static void hfsplus_read_inode(struct inode *inode)
hfs_find_exit(&fd);
if (err)
goto bad_inode;
-   return;
+   goto done;
}
vhdr = HFSPLUS_SB(inode->i_sb).s_vhdr;
switch(inode->i_ino) {
@@ -70,10 +77,13 @@ static void hfsplus_read_inode(struct inode *inode)
goto bad_inode;
}
 
-   return;
+done:
+   unlock_new_inode(inode);
+   return inode;
 
- bad_inode:
-   make_bad_inode(inode);
+bad_inode:
+   iget_failed(inode);
+   return ERR_PTR(err);
 }
 
 static int hfsplus_write_inode(struct inode *inode, int unused)
@@ -262,7 +272,6 @@ static int hfsplus_remount(struct super_block *sb, int 
*flags, char *data)
 static const struct super_operations hfsplus_sops = {
.alloc_inode= hfsplus_alloc_inode,
.destroy_inode  = hfsplus_destroy_inode,
-   .read_inode = hfsplus_read_inode,
.write_inode= hfsplus_write_inode,
.clear_inode= hfsplus_clear_inode,
.put_super  = hfsplus_put_super,
@@ -278,7 +287,7 @@ static int hfsplus_fill_super(struct super_block *sb, void 
*data, int silent)
struct hfsplus_sb_info *sbi;
hfsplus_cat_entry entry;
struct hfs_find_data fd;
-   struct inode *root;
+   struct inode *root, *inode;
struct qstr str;
struct nls_table *nls = NULL;
int err = -EINVAL;
@@ -366,18 +375,25 @@ static int hfsplus_fill_super(struct super_block *sb, 
void *data, int silent)
goto cleanup;
}
 
-   HFSPLUS_

[PATCH 16/31] IGET: Stop FreeVXFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the FreeVXFS filesystem from using iget() and read_inode().  Replace
vxfs_read_inode() with vxfs_iget(), and call that instead of iget().
vxfs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

vxfs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/freevxfs/vxfs_extern.h |2 +-
 fs/freevxfs/vxfs_inode.c  |   45 +
 fs/freevxfs/vxfs_lookup.c |6 +++---
 fs/freevxfs/vxfs_super.c  |   10 +++---
 4 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/fs/freevxfs/vxfs_extern.h b/fs/freevxfs/vxfs_extern.h
index 91ccee8..f307694 100644
--- a/fs/freevxfs/vxfs_extern.h
+++ b/fs/freevxfs/vxfs_extern.h
@@ -58,7 +58,7 @@ extern struct inode * vxfs_get_fake_inode(struct 
super_block *,
 extern voidvxfs_put_fake_inode(struct inode *);
 extern struct vxfs_inode_info *vxfs_blkiget(struct super_block *, 
u_long, ino_t);
 extern struct vxfs_inode_info *vxfs_stiget(struct super_block *, 
ino_t);
-extern voidvxfs_read_inode(struct inode *);
+extern struct inode *  vxfs_iget(struct super_block *, unsigned long);
 extern voidvxfs_clear_inode(struct inode *);
 
 /* vxfs_lookup.c */
diff --git a/fs/freevxfs/vxfs_inode.c b/fs/freevxfs/vxfs_inode.c
index d1f7c5b..95a0af9 100644
--- a/fs/freevxfs/vxfs_inode.c
+++ b/fs/freevxfs/vxfs_inode.c
@@ -129,7 +129,7 @@ fail:
  * Description:
  *  Search the for inode number @ino in the filesystem
  *  described by @sbp.  Use the specified inode table (@ilistp).
- *  Returns the matching VxFS inode on success, else a NULL pointer.
+ *  Returns the matching VxFS inode on success, else an error code.
  */
 static struct vxfs_inode_info *
 __vxfs_iget(ino_t ino, struct inode *ilistp)
@@ -157,12 +157,12 @@ __vxfs_iget(ino_t ino, struct inode *ilistp)
}
 
printk(KERN_WARNING "vxfs: error on page %p\n", pp);
-   return NULL;
+   return ERR_CAST(pp);
 
 fail:
printk(KERN_WARNING "vxfs: unable to read inode %ld\n", (unsigned 
long)ino);
vxfs_put_page(pp);
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 }
 
 /**
@@ -178,7 +178,10 @@ fail:
 struct vxfs_inode_info *
 vxfs_stiget(struct super_block *sbp, ino_t ino)
 {
-return __vxfs_iget(ino, VXFS_SBI(sbp)->vsi_stilist);
+   struct vxfs_inode_info *vip;
+
+vip = __vxfs_iget(ino, VXFS_SBI(sbp)->vsi_stilist);
+   return IS_ERR(vip) ? NULL : vip;
 }
 
 /**
@@ -282,23 +285,32 @@ vxfs_put_fake_inode(struct inode *ip)
 }
 
 /**
- * vxfs_read_inode - fill in inode information
- * @ip:inode pointer to fill
+ * vxfs_iget - get an inode
+ * @sbp:   the superblock to get the inode for
+ * @ino:   the number of the inode to get
  *
  * Description:
- *  vxfs_read_inode reads the disk inode for @ip and fills
- *  in all relevant fields in @ip.
+ *  vxfs_read_inode creates an inode, reads the disk inode for @ino and fills
+ *  in all relevant fields in the new inode.
  */
-void
-vxfs_read_inode(struct inode *ip)
+struct inode *
+vxfs_iget(struct super_block *sbp, ino_t ino)
 {
-   struct super_block  *sbp = ip->i_sb;
struct vxfs_inode_info  *vip;
const struct address_space_operations   *aops;
-   ino_t   ino = ip->i_ino;
-
-   if (!(vip = __vxfs_iget(ino, VXFS_SBI(sbp)->vsi_ilist)))
-   return;
+   struct inode *ip;
+
+   ip = iget_locked(sbp, ino);
+   if (!ip)
+   return ERR_PTR(-ENOMEM);
+   if (!(ip->i_state & I_NEW))
+   return ip;
+
+   vip = __vxfs_iget(ino, VXFS_SBI(sbp)->vsi_ilist);
+   if (IS_ERR(vip)) {
+   iget_failed(ip);
+   return ERR_CAST(vip);
+   }
 
vxfs_iinit(ip, vip);
 
@@ -323,7 +335,8 @@ vxfs_read_inode(struct inode *ip)
} else
init_special_inode(ip, ip->i_mode, 
old_decode_dev(vip->vii_rdev));
 
-   return;
+   unlock_new_inode(ip);
+   return ip;
 }
 
 /**
diff --git a/fs/freevxfs/vxfs_lookup.c b/fs/freevxfs/vxfs_lookup.c
index bf86e54..aee049c 100644
--- a/fs/freevxfs/vxfs_lookup.c
+++ b/fs/freevxfs/vxfs_lookup.c
@@ -213,10 +213,10 @@ vxfs_lookup(struct inode *dip, struct dentry *dp, struct 
nameidata *nd)
lock_kernel();
ino = vxfs_inode_by_name(dip, dp);
if (ino) {
-   ip = iget(dip->i_sb, ino);
-   if (!ip) {
+   ip = vxfs_iget(dip->i_sb, ino);
+   if (IS_ERR(ip)) {
unlock_kernel();
-   return ERR_PTR(-EACCES);
+   return ERR_CAST(ip);
}
}
unlock_kernel();
diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c
index 4f95572..1dacda8 100644
--- a/fs/freevxf

[PATCH 21/31] IGET: Stop JFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the JFS filesystem from using iget() and read_inode().  Replace
jfs_read_inode() with jfs_iget(), and call that instead of iget().
jfs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

jfs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Acked-by: Dave Kleikamp <[EMAIL PROTECTED]>
---

 fs/jfs/inode.c |   20 
 fs/jfs/jfs_inode.h |2 +-
 fs/jfs/namei.c |   34 ++
 fs/jfs/super.c |   15 +--
 4 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/fs/jfs/inode.c b/fs/jfs/inode.c
index 4672013..2103397 100644
--- a/fs/jfs/inode.c
+++ b/fs/jfs/inode.c
@@ -31,11 +31,21 @@
 #include "jfs_debug.h"
 
 
-void jfs_read_inode(struct inode *inode)
+struct inode *jfs_iget(struct super_block *sb, unsigned long ino)
 {
-   if (diRead(inode)) {
-   make_bad_inode(inode);
-   return;
+   struct inode *inode;
+   int ret;
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   ret = diRead(inode);
+   if (ret < 0) {
+   iget_failed(inode);
+   return ERR_PTR(ret);
}
 
if (S_ISREG(inode->i_mode)) {
@@ -55,6 +65,8 @@ void jfs_read_inode(struct inode *inode)
inode->i_op = &jfs_file_inode_operations;
init_special_inode(inode, inode->i_mode, inode->i_rdev);
}
+   unlock_new_inode(inode);
+   return inode;
 }
 
 /*
diff --git a/fs/jfs/jfs_inode.h b/fs/jfs/jfs_inode.h
index 8e2cf2c..95a6a11 100644
--- a/fs/jfs/jfs_inode.h
+++ b/fs/jfs/jfs_inode.h
@@ -24,7 +24,7 @@ extern struct inode *ialloc(struct inode *, umode_t);
 extern int jfs_fsync(struct file *, struct dentry *, int);
 extern int jfs_ioctl(struct inode *, struct file *,
unsigned int, unsigned long);
-extern void jfs_read_inode(struct inode *);
+extern struct inode *jfs_iget(struct super_block *, unsigned long);
 extern int jfs_commit_inode(struct inode *, int);
 extern int jfs_write_inode(struct inode*, int);
 extern void jfs_delete_inode(struct inode *);
diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
index 4e0a849..4be6939 100644
--- a/fs/jfs/namei.c
+++ b/fs/jfs/namei.c
@@ -1462,12 +1462,10 @@ static struct dentry *jfs_lookup(struct inode *dip, 
struct dentry *dentry, struc
}
}
 
-   ip = iget(dip->i_sb, inum);
-   if (ip == NULL || is_bad_inode(ip)) {
+   ip = jfs_iget(dip->i_sb, inum);
+   if (IS_ERR(ip)) {
jfs_err("jfs_lookup: iget failed on inum %d", (uint) inum);
-   if (ip)
-   iput(ip);
-   return ERR_PTR(-EACCES);
+   return ERR_CAST(ip);
}
 
dentry = d_splice_alias(ip, dentry);
@@ -1485,12 +1483,11 @@ static struct inode *jfs_nfs_get_inode(struct 
super_block *sb,
 
if (ino == 0)
return ERR_PTR(-ESTALE);
-   inode = iget(sb, ino);
-   if (inode == NULL)
-   return ERR_PTR(-ENOMEM);
+   inode = jfs_iget(sb, ino);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
 
-   if (is_bad_inode(inode) ||
-   (generation && inode->i_generation != generation)) {
+   if (generation && inode->i_generation != generation) {
iput(inode);
return ERR_PTR(-ESTALE);
}
@@ -1521,17 +1518,14 @@ struct dentry *jfs_get_parent(struct dentry *dentry)
 
parent_ino =
le32_to_cpu(JFS_IP(dentry->d_inode)->i_dtroot.header.idotdot);
-   inode = iget(sb, parent_ino);
-   if (inode) {
-   if (is_bad_inode(inode)) {
+   inode = jfs_iget(sb, parent_ino);
+   if (IS_ERR(inode)) {
+   parent = ERR_CAST(inode);
+   } else {
+   parent = d_alloc_anon(inode);
+   if (!parent) {
+   parent = ERR_PTR(-ENOMEM);
iput(inode);
-   parent = ERR_PTR(-EACCES);
-   } else {
-   parent = d_alloc_anon(inode);
-   if (!parent) {
-   parent = ERR_PTR(-ENOMEM);
-   iput(inode);
-   }
}
}
 
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 314bb4f..428b1ca 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -414,7 +414,7 @@ static int jfs_fill_super(struct super_block *sb, void 
*data, int silent)
struct inode *inode;
int rc;
s64 newLVSize = 0;
-   int flag;
+   int flag, ret = -EINVAL;
 
jfs_info("In jfs_read_super: s_flags=0x%lx", sb->s_flags);
 
@@ -461,8 +461,10 @@ static int jfs_fill_super(struct super_block *sb, void

[PATCH 15/31] IGET: Stop FAT from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the FAT filesystem from using iget() and read_inode().  Replace
the call to iget() with a call to ilookup().

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/fat/inode.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index edb1a2c..4e33fde 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -634,8 +634,6 @@ static const struct super_operations fat_sops = {
.clear_inode= fat_clear_inode,
.remount_fs = fat_remount,
 
-   .read_inode = make_bad_inode,
-
.show_options   = fat_show_options,
 };
 
@@ -663,8 +661,8 @@ static struct dentry *fat_fh_to_dentry(struct super_block 
*sb,
if (fh_len < 5 || fh_type != 3)
return NULL;
 
-   inode = iget(sb, fh[0]);
-   if (!inode || is_bad_inode(inode) || inode->i_generation != fh[1]) {
+   inode = ilookup(sb, fh[0]);
+   if (!inode || inode->i_generation != fh[1]) {
if (inode)
iput(inode);
inode = NULL;

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 20/31] IGET: Stop JFFS2 from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the JFFS2 filesystem from using iget() and read_inode().  Replace
jffs2_read_inode() with jffs2_iget(), and call that instead of iget().
jffs2_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

jffs2_do_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/jffs2/dir.c  |6 +++--
 fs/jffs2/fs.c   |   56 ---
 fs/jffs2/os-linux.h |2 +-
 fs/jffs2/super.c|1 -
 4 files changed, 39 insertions(+), 26 deletions(-)

diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c
index 787e392..f948f7e 100644
--- a/fs/jffs2/dir.c
+++ b/fs/jffs2/dir.c
@@ -101,10 +101,10 @@ static struct dentry *jffs2_lookup(struct inode *dir_i, 
struct dentry *target,
ino = fd->ino;
up(&dir_f->sem);
if (ino) {
-   inode = iget(dir_i->i_sb, ino);
-   if (!inode) {
+   inode = jffs2_iget(dir_i->i_sb, ino);
+   if (IS_ERR(inode)) {
printk(KERN_WARNING "iget() failed for ino #%u\n", ino);
-   return (ERR_PTR(-EIO));
+   return ERR_CAST(inode);
}
}
 
diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c
index d2e06f7..6d1eadd 100644
--- a/fs/jffs2/fs.c
+++ b/fs/jffs2/fs.c
@@ -230,16 +230,23 @@ void jffs2_clear_inode (struct inode *inode)
jffs2_do_clear_inode(c, f);
 }
 
-void jffs2_read_inode (struct inode *inode)
+struct inode *jffs2_iget(struct super_block *sb, unsigned long ino)
 {
struct jffs2_inode_info *f;
struct jffs2_sb_info *c;
struct jffs2_raw_inode latest_node;
union jffs2_device_node jdev;
+   struct inode *inode;
dev_t rdev = 0;
int ret;
 
-   D1(printk(KERN_DEBUG "jffs2_read_inode(): inode->i_ino == %lu\n", 
inode->i_ino));
+   D1(printk(KERN_DEBUG "jffs2_iget(): ino == %lu\n", ino));
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
 
f = JFFS2_INODE_INFO(inode);
c = JFFS2_SB_INFO(inode->i_sb);
@@ -250,9 +257,9 @@ void jffs2_read_inode (struct inode *inode)
ret = jffs2_do_read_inode(c, f, inode->i_ino, &latest_node);
 
if (ret) {
-   make_bad_inode(inode);
up(&f->sem);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(ret);
}
inode->i_mode = jemode_to_cpu(latest_node.mode);
inode->i_uid = je16_to_cpu(latest_node.uid);
@@ -303,19 +310,14 @@ void jffs2_read_inode (struct inode *inode)
if (f->metadata->size != sizeof(jdev.old) &&
f->metadata->size != sizeof(jdev.new)) {
printk(KERN_NOTICE "Device node has strange size %d\n", 
f->metadata->size);
-   up(&f->sem);
-   jffs2_do_clear_inode(c, f);
-   make_bad_inode(inode);
-   return;
+   goto error_io;
}
D1(printk(KERN_DEBUG "Reading device numbers from flash\n"));
-   if (jffs2_read_dnode(c, f, f->metadata, (char *)&jdev, 0, 
f->metadata->size) < 0) {
+   ret = jffs2_read_dnode(c, f, f->metadata, (char *)&jdev, 0, 
f->metadata->size);
+   if (ret < 0) {
/* Eep */
printk(KERN_NOTICE "Read device numbers for inode %lu 
failed\n", (unsigned long)inode->i_ino);
-   up(&f->sem);
-   jffs2_do_clear_inode(c, f);
-   make_bad_inode(inode);
-   return;
+   goto error;
}
if (f->metadata->size == sizeof(jdev.old))
rdev = old_decode_dev(je16_to_cpu(jdev.old));
@@ -335,6 +337,16 @@ void jffs2_read_inode (struct inode *inode)
up(&f->sem);
 
D1(printk(KERN_DEBUG "jffs2_read_inode() returning\n"));
+   unlock_new_inode(inode);
+   return inode;
+
+error_io:
+   ret = -EIO;
+error:
+   up(&f->sem);
+   jffs2_do_clear_inode(c, f);
+   iget_failed(inode);
+   return ERR_PTR(ret);
 }
 
 void jffs2_dirty_inode(struct inode *inode)
@@ -522,15 +534,16 @@ int jffs2_do_fill_super(struct super_block *sb, void 
*data, int silent)
if ((ret = jffs2_do_mount_fs(c)))
goto out_inohash;
 
-   ret = -EINVAL;
-
D1(printk(KERN_DEBUG "jffs2_do_fill_super(): Getting root inode\n"));
-   root_i = iget(sb, 1);
-   if (is_bad_inode(root_i)) {
+   root_i = jffs2_iget(sb, 1);
+   if (IS_ERR(root_i)) {
D1(printk(KERN_WARNING "get root inode failed\n"));
-   goto out_root_i;
+   ret = PTR_ERR

[PATCH 19/31] IGET: Stop ISOFS from using read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the ISOFS filesystem from using read_inode().  Make isofs_read_inode()
return an error code, and make isofs_iget() pass it on.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/isofs/inode.c |   25 +
 1 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 09e3d30..a854831 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -54,7 +54,7 @@ static void isofs_put_super(struct super_block *sb)
return;
 }
 
-static void isofs_read_inode(struct inode *);
+static int isofs_read_inode(struct inode *);
 static int isofs_statfs (struct dentry *, struct kstatfs *);
 
 static struct kmem_cache *isofs_inode_cachep;
@@ -107,7 +107,6 @@ static int isofs_remount(struct super_block *sb, int 
*flags, char *data)
 static const struct super_operations isofs_sops = {
.alloc_inode= isofs_alloc_inode,
.destroy_inode  = isofs_destroy_inode,
-   .read_inode = isofs_read_inode,
.put_super  = isofs_put_super,
.statfs = isofs_statfs,
.remount_fs = isofs_remount,
@@ -1186,7 +1185,7 @@ out_toomany:
goto out;
 }
 
-static void isofs_read_inode(struct inode *inode)
+static int isofs_read_inode(struct inode *inode)
 {
struct super_block *sb = inode->i_sb;
struct isofs_sb_info *sbi = ISOFS_SB(sb);
@@ -1199,6 +1198,7 @@ static void isofs_read_inode(struct inode *inode)
unsigned int de_len;
unsigned long offset;
struct iso_inode_info *ei = ISOFS_I(inode);
+   int ret = -EIO;
 
block = ei->i_iget5_block;
bh = sb_bread(inode->i_sb, block);
@@ -1216,6 +1216,7 @@ static void isofs_read_inode(struct inode *inode)
tmpde = kmalloc(de_len, GFP_KERNEL);
if (tmpde == NULL) {
printk(KERN_INFO "%s: out of memory\n", __func__);
+   ret = -ENOMEM;
goto fail;
}
memcpy(tmpde, bh->b_data + offset, frag1);
@@ -1259,8 +1260,10 @@ static void isofs_read_inode(struct inode *inode)
 
ei->i_section_size = isonum_733(de->size);
if (de->flags[-high_sierra] & 0x80) {
-   if(isofs_read_level3_size(inode))
+   ret = isofs_read_level3_size(inode);
+   if (ret < 0)
goto fail;
+   ret = -EIO;
} else {
ei->i_next_section_block = 0;
ei->i_next_section_offset = 0;
@@ -1346,16 +1349,16 @@ static void isofs_read_inode(struct inode *inode)
/* XXX - parse_rock_ridge_inode() had already set i_rdev. */
init_special_inode(inode, inode->i_mode, inode->i_rdev);
 
+   ret = 0;
 out:
kfree(tmpde);
if (bh)
brelse(bh);
-   return;
+   return ret;
 
 out_badread:
printk(KERN_WARNING "ISOFS: unable to read i-node block\n");
 fail:
-   make_bad_inode(inode);
goto out;
 }
 
@@ -1394,6 +1397,7 @@ struct inode *isofs_iget(struct super_block *sb,
unsigned long hashval;
struct inode *inode;
struct isofs_iget5_callback_data data;
+   long ret;
 
if (offset >= 1ul << sb->s_blocksize_bits)
return NULL;
@@ -1407,8 +1411,13 @@ struct inode *isofs_iget(struct super_block *sb,
&isofs_iget5_set, &data);
 
if (inode && (inode->i_state & I_NEW)) {
-   sb->s_op->read_inode(inode);
-   unlock_new_inode(inode);
+   ret = isofs_read_inode(inode);
+   if (ret < 0) {
+   iget_failed(inode);
+   inode = ERR_PTR(ret);
+   } else {
+   unlock_new_inode(inode);
+   }
}
 
return inode;

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 14/31] IGET: Stop EXT4 from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the EXT4 filesystem from using iget() and read_inode().  Replace
ext4_read_inode() with ext4_iget(), and call that instead of iget().
ext4_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

ext4_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Acked-by: "Theodore Ts'o" <[EMAIL PROTECTED]>
Acked-by: Jan Kara <[EMAIL PROTECTED]>
---

 fs/ext4/ialloc.c|   58 ---
 fs/ext4/inode.c |   25 +++-
 fs/ext4/namei.c |   29 +++-
 fs/ext4/resize.c|7 ++
 fs/ext4/super.c |   36 -
 include/linux/ext4_fs.h |2 +-
 6 files changed, 87 insertions(+), 70 deletions(-)

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index c61f37f..fdf170a 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -779,14 +779,15 @@ struct inode *ext4_orphan_get(struct super_block *sb, 
unsigned long ino)
unsigned long max_ino = le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count);
unsigned long block_group;
int bit;
-   struct buffer_head *bitmap_bh = NULL;
+   struct buffer_head *bitmap_bh;
struct inode *inode = NULL;
+   long err = -EIO;
 
/* Error cases - e2fsck has already cleaned up for us */
if (ino > max_ino) {
ext4_warning(sb, __FUNCTION__,
 "bad orphan ino %lu!  e2fsck was run?", ino);
-   goto out;
+   goto error;
}
 
block_group = (ino - 1) / EXT4_INODES_PER_GROUP(sb);
@@ -795,38 +796,49 @@ struct inode *ext4_orphan_get(struct super_block *sb, 
unsigned long ino)
if (!bitmap_bh) {
ext4_warning(sb, __FUNCTION__,
 "inode bitmap error for orphan %lu", ino);
-   goto out;
+   goto error;
}
 
/* Having the inode bit set should be a 100% indicator that this
 * is a valid orphan (no e2fsck run on fs).  Orphans also include
 * inodes that were being truncated, so we can't check i_nlink==0.
 */
-   if (!ext4_test_bit(bit, bitmap_bh->b_data) ||
-   !(inode = iget(sb, ino)) || is_bad_inode(inode) ||
-   NEXT_ORPHAN(inode) > max_ino) {
-   ext4_warning(sb, __FUNCTION__,
-"bad orphan inode %lu!  e2fsck was run?", ino);
-   printk(KERN_NOTICE "ext4_test_bit(bit=%d, block=%llu) = %d\n",
-  bit, (unsigned long long)bitmap_bh->b_blocknr,
-  ext4_test_bit(bit, bitmap_bh->b_data));
-   printk(KERN_NOTICE "inode=%p\n", inode);
-   if (inode) {
-   printk(KERN_NOTICE "is_bad_inode(inode)=%d\n",
-  is_bad_inode(inode));
-   printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
-  NEXT_ORPHAN(inode));
-   printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
-   }
+   if (!ext4_test_bit(bit, bitmap_bh->b_data))
+   goto bad_orphan;
+
+   inode = ext4_iget(sb, ino);
+   if (IS_ERR(inode))
+   goto iget_failed;
+
+   if (NEXT_ORPHAN(inode) > max_ino)
+   goto bad_orphan;
+   brelse(bitmap_bh);
+   return inode;
+
+iget_failed:
+   err = PTR_ERR(inode);
+   inode = NULL;
+bad_orphan:
+   ext4_warning(sb, __FUNCTION__,
+"bad orphan inode %lu!  e2fsck was run?", ino);
+   printk(KERN_NOTICE "ext4_test_bit(bit=%d, block=%llu) = %d\n",
+  bit, (unsigned long long)bitmap_bh->b_blocknr,
+  ext4_test_bit(bit, bitmap_bh->b_data));
+   printk(KERN_NOTICE "inode=%p\n", inode);
+   if (inode) {
+   printk(KERN_NOTICE "is_bad_inode(inode)=%d\n",
+  is_bad_inode(inode));
+   printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
+  NEXT_ORPHAN(inode));
+   printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
/* Avoid freeing blocks if we got a bad deleted inode */
-   if (inode && inode->i_nlink == 0)
+   if (inode->i_nlink == 0)
inode->i_blocks = 0;
iput(inode);
-   inode = NULL;
}
-out:
brelse(bitmap_bh);
-   return inode;
+error:
+   return ERR_PTR(err);
 }
 
 unsigned long ext4_count_free_inodes (struct super_block * sb)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5489703..942a9ef 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2661,21 +2661,31 @@ void ext4_get_inode_flags(struct ext4_inode_info *ei)
ei->i_flags |= EXT4_DIRSYNC_FL;
 }
 
-void ext4_read_inode(struct inode * inode)
+struct inode *ext4_iget(struct

[PATCH 13/31] IGET: Stop EXT3 from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the EXT3 filesystem from using iget() and read_inode().  Replace
ext3_read_inode() with ext3_iget(), and call that instead of iget().
ext3_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

ext3_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Acked-by: "Theodore Ts'o" <[EMAIL PROTECTED]>
Acked-by: Jan Kara <[EMAIL PROTECTED]>
---

 fs/ext3/ialloc.c|   58 ---
 fs/ext3/inode.c |   25 +++-
 fs/ext3/namei.c |   29 +++-
 fs/ext3/resize.c|7 ++
 fs/ext3/super.c |   40 ++--
 include/linux/ext3_fs.h |2 +-
 6 files changed, 89 insertions(+), 72 deletions(-)

diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 1bc8cd8..58ae2f9 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -642,14 +642,15 @@ struct inode *ext3_orphan_get(struct super_block *sb, 
unsigned long ino)
unsigned long max_ino = le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count);
unsigned long block_group;
int bit;
-   struct buffer_head *bitmap_bh = NULL;
+   struct buffer_head *bitmap_bh;
struct inode *inode = NULL;
+   long err = -EIO;
 
/* Error cases - e2fsck has already cleaned up for us */
if (ino > max_ino) {
ext3_warning(sb, __FUNCTION__,
 "bad orphan ino %lu!  e2fsck was run?", ino);
-   goto out;
+   goto error;
}
 
block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb);
@@ -658,38 +659,49 @@ struct inode *ext3_orphan_get(struct super_block *sb, 
unsigned long ino)
if (!bitmap_bh) {
ext3_warning(sb, __FUNCTION__,
 "inode bitmap error for orphan %lu", ino);
-   goto out;
+   goto error;
}
 
/* Having the inode bit set should be a 100% indicator that this
 * is a valid orphan (no e2fsck run on fs).  Orphans also include
 * inodes that were being truncated, so we can't check i_nlink==0.
 */
-   if (!ext3_test_bit(bit, bitmap_bh->b_data) ||
-   !(inode = iget(sb, ino)) || is_bad_inode(inode) ||
-   NEXT_ORPHAN(inode) > max_ino) {
-   ext3_warning(sb, __FUNCTION__,
-"bad orphan inode %lu!  e2fsck was run?", ino);
-   printk(KERN_NOTICE "ext3_test_bit(bit=%d, block=%llu) = %d\n",
-  bit, (unsigned long long)bitmap_bh->b_blocknr,
-  ext3_test_bit(bit, bitmap_bh->b_data));
-   printk(KERN_NOTICE "inode=%p\n", inode);
-   if (inode) {
-   printk(KERN_NOTICE "is_bad_inode(inode)=%d\n",
-  is_bad_inode(inode));
-   printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
-  NEXT_ORPHAN(inode));
-   printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
-   }
+   if (!ext3_test_bit(bit, bitmap_bh->b_data))
+   goto bad_orphan;
+
+   inode = ext3_iget(sb, ino);
+   if (IS_ERR(inode))
+   goto iget_failed;
+
+   if (NEXT_ORPHAN(inode) > max_ino)
+   goto bad_orphan;
+   brelse(bitmap_bh);
+   return inode;
+
+iget_failed:
+   err = PTR_ERR(inode);
+   inode = NULL;
+bad_orphan:
+   ext3_warning(sb, __FUNCTION__,
+"bad orphan inode %lu!  e2fsck was run?", ino);
+   printk(KERN_NOTICE "ext3_test_bit(bit=%d, block=%llu) = %d\n",
+  bit, (unsigned long long)bitmap_bh->b_blocknr,
+  ext3_test_bit(bit, bitmap_bh->b_data));
+   printk(KERN_NOTICE "inode=%p\n", inode);
+   if (inode) {
+   printk(KERN_NOTICE "is_bad_inode(inode)=%d\n",
+  is_bad_inode(inode));
+   printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
+  NEXT_ORPHAN(inode));
+   printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
/* Avoid freeing blocks if we got a bad deleted inode */
-   if (inode && inode->i_nlink == 0)
+   if (inode->i_nlink == 0)
inode->i_blocks = 0;
iput(inode);
-   inode = NULL;
}
-out:
brelse(bitmap_bh);
-   return inode;
+error:
+   return ERR_PTR(err);
 }
 
 unsigned long ext3_count_free_inodes (struct super_block * sb)
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 9b162cd..164934a 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -2658,21 +2658,31 @@ void ext3_get_inode_flags(struct ext3_inode_info *ei)
ei->i_flags |= EXT3_DIRSYNC_FL;
 }
 
-void ext3_read_inode(struct inode * inode)
+struct inode *ext3_iget(str

[PATCH 12/31] IGET: Stop EXT2 from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the EXT2 filesystem from using iget() and read_inode().  Replace
ext2_read_inode() with ext2_iget(), and call that instead of iget().
ext2_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

ext2_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Acked-by: "Theodore Ts'o" <[EMAIL PROTECTED]>
---

 fs/ext2/ext2.h  |2 +-
 fs/ext2/inode.c |   29 +
 fs/ext2/namei.c |   12 ++--
 fs/ext2/super.c |   32 ++--
 4 files changed, 46 insertions(+), 29 deletions(-)

diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index 7730388..5890fc0 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -124,7 +124,7 @@ extern void ext2_check_inodes_bitmap (struct super_block *);
 extern unsigned long ext2_count_free (struct buffer_head *, unsigned);
 
 /* inode.c */
-extern void ext2_read_inode (struct inode *);
+extern struct inode *ext2_iget (struct super_block *, unsigned long);
 extern int ext2_write_inode (struct inode *, int);
 extern void ext2_put_inode (struct inode *);
 extern void ext2_delete_inode (struct inode *);
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index b1ab32a..864a534 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1185,22 +1185,33 @@ void ext2_get_inode_flags(struct ext2_inode_info *ei)
ei->i_flags |= EXT2_DIRSYNC_FL;
 }
 
-void ext2_read_inode (struct inode * inode)
+struct inode *ext2_iget (struct super_block *sb, unsigned long ino)
 {
-   struct ext2_inode_info *ei = EXT2_I(inode);
-   ino_t ino = inode->i_ino;
+   struct ext2_inode_info *ei;
struct buffer_head * bh;
-   struct ext2_inode * raw_inode = ext2_get_inode(inode->i_sb, ino, &bh);
+   struct ext2_inode * raw_inode;
+   struct inode *inode;
+   long ret = -EIO;
int n;
 
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   ei = EXT2_I(inode);
 #ifdef CONFIG_EXT2_FS_POSIX_ACL
ei->i_acl = EXT2_ACL_NOT_CACHED;
ei->i_default_acl = EXT2_ACL_NOT_CACHED;
 #endif
ei->i_block_alloc_info = NULL;
 
-   if (IS_ERR(raw_inode))
+   raw_inode = ext2_get_inode(inode->i_sb, ino, &bh);
+   if (IS_ERR(raw_inode)) {
+   ret = PTR_ERR(raw_inode);
goto bad_inode;
+   }
 
inode->i_mode = le16_to_cpu(raw_inode->i_mode);
inode->i_uid = (uid_t)le16_to_cpu(raw_inode->i_uid_low);
@@ -1224,6 +1235,7 @@ void ext2_read_inode (struct inode * inode)
if (inode->i_nlink == 0 && (inode->i_mode == 0 || ei->i_dtime)) {
/* this inode is deleted */
brelse (bh);
+   ret = -ESTALE;
goto bad_inode;
}
inode->i_blocks = le32_to_cpu(raw_inode->i_blocks);
@@ -1290,11 +1302,12 @@ void ext2_read_inode (struct inode * inode)
}
brelse (bh);
ext2_set_inode_flags(inode);
-   return;
+   unlock_new_inode(inode);
+   return inode;

 bad_inode:
-   make_bad_inode(inode);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(ret);
 }
 
 static int ext2_update_inode(struct inode * inode, int do_sync)
diff --git a/fs/ext2/namei.c b/fs/ext2/namei.c
index e69beed..80c97fd 100644
--- a/fs/ext2/namei.c
+++ b/fs/ext2/namei.c
@@ -63,9 +63,9 @@ static struct dentry *ext2_lookup(struct inode * dir, struct 
dentry *dentry, str
ino = ext2_inode_by_name(dir, dentry);
inode = NULL;
if (ino) {
-   inode = iget(dir->i_sb, ino);
-   if (!inode)
-   return ERR_PTR(-EACCES);
+   inode = ext2_iget(dir->i_sb, ino);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
}
return d_splice_alias(inode, dentry);
 }
@@ -83,10 +83,10 @@ struct dentry *ext2_get_parent(struct dentry *child)
ino = ext2_inode_by_name(child->d_inode, &dotdot);
if (!ino)
return ERR_PTR(-ENOENT);
-   inode = iget(child->d_inode->i_sb, ino);
+   inode = ext2_iget(child->d_inode->i_sb, ino);
 
-   if (!inode)
-   return ERR_PTR(-EACCES);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
parent = d_alloc_anon(inode);
if (!parent) {
iput(inode);
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 154e25f..c405c8a 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -296,7 +296,6 @@ static ssize_t ext2_quota_write(struct super_block *sb, int 
type, const char *da
 static const struct super_operations ext2_sops = {
.alloc_inode= ext2_alloc_inode,
.destroy_inode  = ext2_destroy_inode,
-   .read_inode = ext2_read_inode,
.write_inode= ext2_write_inode,
.de

[PATCH 09/31] IGET: Stop BFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the BFS filesystem from using iget() and read_inode().  Replace
bfs_read_inode() with bfs_iget(), and call that instead of iget().
bfs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

bfs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/bfs/bfs.h   |2 ++
 fs/bfs/dir.c   |6 +++---
 fs/bfs/inode.c |   32 ++--
 3 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/fs/bfs/bfs.h b/fs/bfs/bfs.h
index 130f6c6..5cf50d4 100644
--- a/fs/bfs/bfs.h
+++ b/fs/bfs/bfs.h
@@ -46,6 +46,8 @@ static inline struct bfs_inode_info *BFS_I(struct inode 
*inode)
 #define printf(format, args...) \
printk(KERN_ERR "BFS-fs: %s(): " format, __FUNCTION__, ## args)
 
+/* inode.c */
+extern struct inode *bfs_iget(struct super_block *sb, unsigned long ino);
 
 /* file.c */
 extern const struct inode_operations bfs_file_inops;
diff --git a/fs/bfs/dir.c b/fs/bfs/dir.c
index 097f149..03c8bdb 100644
--- a/fs/bfs/dir.c
+++ b/fs/bfs/dir.c
@@ -141,10 +141,10 @@ static struct dentry * bfs_lookup(struct inode * dir, 
struct dentry * dentry, st
if (bh) {
unsigned long ino = (unsigned long)le16_to_cpu(de->ino);
brelse(bh);
-   inode = iget(dir->i_sb, ino);
-   if (!inode) {
+   inode = bfs_iget(dir->i_sb, ino);
+   if (IS_ERR(inode)) {
unlock_kernel();
-   return ERR_PTR(-EACCES);
+   return ERR_CAST(inode);
}
}
unlock_kernel();
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index 7bd9c2b..d279382 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -32,25 +32,29 @@ MODULE_LICENSE("GPL");
 
 void dump_imap(const char *prefix, struct super_block * s);
 
-static void bfs_read_inode(struct inode * inode)
+struct inode *bfs_iget(struct super_block *sb, unsigned long ino)
 {
-   unsigned long ino = inode->i_ino;
struct bfs_inode * di;
struct buffer_head * bh;
+   struct inode *inode;
int block, off;
 
+   inode = iget_locked(sb, ino);
+   if (IS_ERR(inode))
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
if (ino < BFS_ROOT_INO || ino > BFS_SB(inode->i_sb)->si_lasti) {
printf("Bad inode number %s:%08lx\n", inode->i_sb->s_id, ino);
-   make_bad_inode(inode);
-   return;
+   goto error;
}
 
block = (ino - BFS_ROOT_INO)/BFS_INODES_PER_BLOCK + 1;
bh = sb_bread(inode->i_sb, block);
if (!bh) {
printf("Unable to read inode %s:%08lx\n", inode->i_sb->s_id, 
ino);
-   make_bad_inode(inode);
-   return;
+   goto error;
}
 
off = (ino - BFS_ROOT_INO) % BFS_INODES_PER_BLOCK;
@@ -85,6 +89,12 @@ static void bfs_read_inode(struct inode * inode)
BFS_I(inode)->i_dsk_ino = le16_to_cpu(di->i_ino); /* can be 0 so we 
store a copy */
 
brelse(bh);
+   unlock_new_inode(inode);
+   return inode;
+
+error:
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
 }
 
 static int bfs_write_inode(struct inode * inode, int unused)
@@ -271,7 +281,6 @@ static void destroy_inodecache(void)
 static const struct super_operations bfs_sops = {
.alloc_inode= bfs_alloc_inode,
.destroy_inode  = bfs_destroy_inode,
-   .read_inode = bfs_read_inode,
.write_inode= bfs_write_inode,
.delete_inode   = bfs_delete_inode,
.put_super  = bfs_put_super,
@@ -306,6 +315,7 @@ static int bfs_fill_super(struct super_block *s, void 
*data, int silent)
struct inode * inode;
unsigned i, imap_len;
struct bfs_sb_info * info;
+   long ret = -EINVAL;
 
info = kzalloc(sizeof(*info), GFP_KERNEL);
if (!info)
@@ -340,14 +350,16 @@ static int bfs_fill_super(struct super_block *s, void 
*data, int silent)
set_bit(i, info->si_imap);
 
s->s_op = &bfs_sops;
-   inode = iget(s, BFS_ROOT_INO);
-   if (!inode) {
+   inode = bfs_iget(s, BFS_ROOT_INO);
+   if (IS_ERR(inode)) {
+   ret = PTR_ERR(inode);
kfree(info->si_imap);
goto out;
}
s->s_root = d_alloc_root(inode);
if (!s->s_root) {
iput(inode);
+   ret = -ENOMEM;
kfree(info->si_imap);
goto out;
}
@@ -402,7 +414,7 @@ out:
brelse(bh);
kfree(info);
s->s_fs_info = NULL;
-   return -EINVAL;
+   return ret;
 }
 
 static int bfs_get_sb(struct file_system_type *fs_type,

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More m

[PATCH 10/31] IGET: Stop CIFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the CIFS filesystem from using iget() and read_inode().  Replace
cifs_read_inode() with cifs_iget(), and call that instead of iget().
cifs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

cifs_read_super() now returns any error incurred when getting the root inode
instead of ENOMEM.

cifs_iget() needs examining.  The comment "can not call macro FreeXid here
since in a void func" is no longer true.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/cifs/cifsfs.c |8 
 fs/cifs/cifsfs.h |1 +
 fs/cifs/inode.c  |   22 +++---
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index a6fbea5..c802ea0 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -122,10 +122,11 @@ cifs_read_super(struct super_block *sb, void *data,
 #endif
sb->s_blocksize = CIFS_MAX_MSGSIZE;
sb->s_blocksize_bits = 14;  /* default 2**14 = CIFS_MAX_MSGSIZE */
-   inode = iget(sb, ROOT_I);
+   inode = cifs_iget(sb, ROOT_I);
 
-   if (!inode) {
-   rc = -ENOMEM;
+   if (IS_ERR(inode)) {
+   rc = PTR_ERR(inode);
+   inode = NULL;
goto out_no_root;
}
 
@@ -477,7 +478,6 @@ static int cifs_remount(struct super_block *sb, int *flags, 
char *data)
 }
 
 static const struct super_operations cifs_super_ops = {
-   .read_inode = cifs_read_inode,
.put_super = cifs_put_super,
.statfs = cifs_statfs,
.alloc_inode = cifs_alloc_inode,
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 5574ba3..f85e07d 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -43,6 +43,7 @@ extern void cifs_read_inode(struct inode *);
 
 /* Functions related to inodes */
 extern const struct inode_operations cifs_dir_inode_ops;
+extern struct inode *cifs_iget(struct super_block *, unsigned long);
 extern int cifs_create(struct inode *, struct dentry *, int,
   struct nameidata *);
 extern struct dentry *cifs_lookup(struct inode *, struct dentry *,
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 5e8b388..6e9c364 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -580,10 +580,18 @@ static const struct inode_operations cifs_ipc_inode_ops = 
{
 };
 
 /* gets root inode */
-void cifs_read_inode(struct inode *inode)
+struct inode *cifs_iget(struct super_block *sb, unsigned long ino)
 {
-   int xid, rc;
+   int xid;
struct cifs_sb_info *cifs_sb;
+   struct inode *inode;
+   long rc;
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
 
cifs_sb = CIFS_SB(inode->i_sb);
xid = GetXid();
@@ -600,10 +608,18 @@ void cifs_read_inode(struct inode *inode)
inode->i_fop = &simple_dir_operations;
inode->i_uid = cifs_sb->mnt_uid;
inode->i_gid = cifs_sb->mnt_gid;
+   _FreeXid(xid);
+   iget_failed(inode);
+   return ERR_PTR(rc);
}
 
-   /* can not call macro FreeXid here since in a void func */
+   unlock_new_inode(inode);
+
+   /* can not call macro FreeXid here since in a void func
+* TODO: This is no longer true
+*/
_FreeXid(xid);
+   return inode;
 }
 
 int cifs_unlink(struct inode *inode, struct dentry *direntry)

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 11/31] IGET: Stop EFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the EFS filesystem from using iget() and read_inode().  Replace
efs_read_inode() with efs_iget(), and call that instead of iget().
efs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

efs_fill_super() returns any error incurred when getting the root inode
instead of EACCES.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/efs/inode.c |   25 +
 fs/efs/namei.c |   25 +
 fs/efs/super.c |   18 --
 include/linux/efs_fs.h |2 +-
 4 files changed, 43 insertions(+), 27 deletions(-)

diff --git a/fs/efs/inode.c b/fs/efs/inode.c
index 174696f..627c302 100644
--- a/fs/efs/inode.c
+++ b/fs/efs/inode.c
@@ -45,17 +45,26 @@ static inline void extent_copy(efs_extent *src, efs_extent 
*dst) {
return;
 }
 
-void efs_read_inode(struct inode *inode)
+struct inode *efs_iget(struct super_block *super, unsigned long ino)
 {
int i, inode_index;
dev_t device;
u32 rdev;
struct buffer_head *bh;
-   struct efs_sb_info*sb = SUPER_INFO(inode->i_sb);
-   struct efs_inode_info *in = INODE_INFO(inode);
+   struct efs_sb_info*sb = SUPER_INFO(super);
+   struct efs_inode_info *in;
efs_block_t block, offset;
struct efs_dinode *efs_inode;
-  
+   struct inode *inode;
+
+   inode = iget_locked(super, ino);
+   if (IS_ERR(inode))
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   in = INODE_INFO(inode);
+
/*
** EFS layout:
**
@@ -159,13 +168,13 @@ void efs_read_inode(struct inode *inode)
break;
}
 
-   return;
+   unlock_new_inode(inode);
+   return inode;
 
 read_inode_error:
printk(KERN_WARNING "EFS: failed to read inode %lu\n", inode->i_ino);
-   make_bad_inode(inode);
-
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
 }
 
 static inline efs_block_t
diff --git a/fs/efs/namei.c b/fs/efs/namei.c
index f7f4070..69486dd 100644
--- a/fs/efs/namei.c
+++ b/fs/efs/namei.c
@@ -66,9 +66,10 @@ struct dentry *efs_lookup(struct inode *dir, struct dentry 
*dentry, struct namei
lock_kernel();
inodenum = efs_find_entry(dir, dentry->d_name.name, dentry->d_name.len);
if (inodenum) {
-   if (!(inode = iget(dir->i_sb, inodenum))) {
+   inode = efs_iget(dir->i_sb, inodenum);
+   if (IS_ERR(inode)) {
unlock_kernel();
-   return ERR_PTR(-EACCES);
+   return ERR_CAST(inode);
}
}
unlock_kernel();
@@ -84,14 +85,13 @@ static struct inode *efs_nfs_get_inode(struct super_block 
*sb, u64 ino,
 
if (ino == 0)
return ERR_PTR(-ESTALE);
-   inode = iget(sb, ino);
-   if (inode == NULL)
-   return ERR_PTR(-ENOMEM);
+   inode = efs_iget(sb, ino);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
 
-   if (is_bad_inode(inode) ||
-   (generation && inode->i_generation != generation)) {
+   if (generation && inode->i_generation != generation) {
iput(inode);
-   return ERR_PTR(-ESTALE);
+   return ERR_PTR(-ESTALE);
}
 
return inode;
@@ -116,7 +116,7 @@ struct dentry *efs_get_parent(struct dentry *child)
struct dentry *parent;
struct inode *inode;
efs_ino_t ino;
-   int error;
+   long error;
 
lock_kernel();
 
@@ -125,10 +125,11 @@ struct dentry *efs_get_parent(struct dentry *child)
if (!ino)
goto fail;
 
-   error = -EACCES;
-   inode = iget(child->d_inode->i_sb, ino);
-   if (!inode)
+   inode = efs_iget(child->d_inode->i_sb, ino);
+   if (IS_ERR(inode)) {
+   error = PTR_ERR(inode);
goto fail;
+   }
 
error = -ENOMEM;
parent = d_alloc_anon(inode);
diff --git a/fs/efs/super.c b/fs/efs/super.c
index c79bc62..240f77c 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -107,7 +107,6 @@ static int efs_remount(struct super_block *sb, int *flags, 
char *data)
 static const struct super_operations efs_superblock_operations = {
.alloc_inode= efs_alloc_inode,
.destroy_inode  = efs_destroy_inode,
-   .read_inode = efs_read_inode,
.put_super  = efs_put_super,
.statfs = efs_statfs,
.remount_fs = efs_remount,
@@ -247,6 +246,7 @@ static int efs_fill_super(struct super_block *s, void *d, 
int silent)
struct efs_sb_info *sb;
struct buffer_head *bh;
struct inode *root;
+   int ret = -EINVAL;
 
sb = kzalloc(sizeof(struct efs_sb_info), GFP_KERNEL);
if (!sb)
@@ -303,12 +303,18 @@ static int efs_fill_super(struct super_block *s, v

[PATCH 07/31] IGET: Stop autofs from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the autofs filesystem from using iget() and read_inode().  Replace
autofs_read_inode() with autofs_iget(), and call that instead of iget().
autofs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/autofs/autofs_i.h |1 +
 fs/autofs/inode.c|   27 ++-
 fs/autofs/root.c |   22 ++
 3 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/fs/autofs/autofs_i.h b/fs/autofs/autofs_i.h
index 8b4cca3..901a3e6 100644
--- a/fs/autofs/autofs_i.h
+++ b/fs/autofs/autofs_i.h
@@ -150,6 +150,7 @@ extern const struct file_operations autofs_root_operations;
 
 int autofs_fill_super(struct super_block *, void *, int);
 void autofs_kill_sb(struct super_block *sb);
+struct inode *autofs_iget(struct super_block *, unsigned long);
 
 /* Queue management functions */
 
diff --git a/fs/autofs/inode.c b/fs/autofs/inode.c
index 45f5992..708bdb8 100644
--- a/fs/autofs/inode.c
+++ b/fs/autofs/inode.c
@@ -52,10 +52,7 @@ out_kill_sb:
kill_anon_super(sb);
 }
 
-static void autofs_read_inode(struct inode *inode);
-
 static const struct super_operations autofs_sops = {
-   .read_inode = autofs_read_inode,
.statfs = simple_statfs,
 };
 
@@ -164,7 +161,9 @@ int autofs_fill_super(struct super_block *s, void *data, 
int silent)
s->s_time_gran = 1;
sbi->sb = s;
 
-   root_inode = iget(s, AUTOFS_ROOT_INO);
+   root_inode = autofs_iget(s, AUTOFS_ROOT_INO);
+   if (IS_ERR(root_inode))
+   goto fail_free;
root = d_alloc_root(root_inode);
pipe = NULL;
 
@@ -230,11 +229,17 @@ fail_unlock:
return -EINVAL;
 }
 
-static void autofs_read_inode(struct inode *inode)
+struct inode *autofs_iget(struct super_block *sb, unsigned long ino)
 {
-   ino_t ino = inode->i_ino;
unsigned int n;
-   struct autofs_sb_info *sbi = autofs_sbi(inode->i_sb);
+   struct autofs_sb_info *sbi = autofs_sbi(sb);
+   struct inode *inode;
+
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
 
/* Initialize to the default case (stub directory) */
 
@@ -250,7 +255,7 @@ static void autofs_read_inode(struct inode *inode)
inode->i_op = &autofs_root_inode_operations;
inode->i_fop = &autofs_root_operations;
inode->i_uid = inode->i_gid = 0; /* Changed in read_super */
-   return;
+   goto done;
} 

inode->i_uid = inode->i_sb->s_root->d_inode->i_uid;
@@ -263,7 +268,7 @@ static void autofs_read_inode(struct inode *inode)
n = ino - AUTOFS_FIRST_SYMLINK;
if (n >= AUTOFS_MAX_SYMLINKS || 
!test_bit(n,sbi->symlink_bitmap)) {
printk("autofs: Looking for bad symlink inode %u\n", 
(unsigned int) ino);
-   return;
+   goto done;
}

inode->i_op = &autofs_symlink_inode_operations;
@@ -275,4 +280,8 @@ static void autofs_read_inode(struct inode *inode)
inode->i_size = sl->len;
inode->i_nlink = 1;
}
+
+done:
+   unlock_new_inode(inode);
+   return inode;
 }
diff --git a/fs/autofs/root.c b/fs/autofs/root.c
index 5efff3c..8aacade 100644
--- a/fs/autofs/root.c
+++ b/fs/autofs/root.c
@@ -114,8 +114,8 @@ static int try_to_fill_dentry(struct dentry *dentry, struct 
super_block *sb, str
dentry->d_time = (unsigned long) ent;

if (!dentry->d_inode) {
-   inode = iget(sb, ent->ino);
-   if (!inode) {
+   inode = autofs_iget(sb, ent->ino);
+   if (IS_ERR(inode)) {
/* Failed, but leave pending for next time */
return 1;
}
@@ -274,6 +274,7 @@ static int autofs_root_symlink(struct inode *dir, struct 
dentry *dentry, const c
unsigned int n;
int slsize;
struct autofs_symlink *sl;
+   struct inode *inode;
 
DPRINTK(("autofs_root_symlink: %s <- ", symname));
autofs_say(dentry->d_name.name,dentry->d_name.len);
@@ -331,7 +332,12 @@ static int autofs_root_symlink(struct inode *dir, struct 
dentry *dentry, const c
ent->dentry = NULL; /* We don't keep the dentry for symlinks */
 
autofs_hash_insert(dh,ent);
-   d_instantiate(dentry, iget(dir->i_sb,ent->ino));
+
+   inode = autofs_iget(dir->i_sb, ent->ino);
+   if (IS_ERR(inode))
+   return PTR_ERR(inode);
+
+   d_instantiate(dentry, inode);
unlock_kernel();
return 0;
 }
@@ -428,6 +434,7 @@ static int autofs_root_mkdir(struct inode *dir, struct 
dentry *dentry, int mode)
struct autofs_sb_info *sbi = autofs_sbi(dir->i_sb)

[PATCH 08/31] IGET: Stop BEFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the BEFS filesystem from using iget() and read_inode().  Replace
befs_read_inode() with befs_iget(), and call that instead of iget().
befs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

befs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Acked-by: Will Dyson <[EMAIL PROTECTED]>
---

 fs/befs/linuxvfs.c |   39 +--
 1 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index b28a20e..403fe66 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -35,7 +35,7 @@ static int befs_get_block(struct inode *, sector_t, struct 
buffer_head *, int);
 static int befs_readpage(struct file *file, struct page *page);
 static sector_t befs_bmap(struct address_space *mapping, sector_t block);
 static struct dentry *befs_lookup(struct inode *, struct dentry *, struct 
nameidata *);
-static void befs_read_inode(struct inode *ino);
+static struct inode *befs_iget(struct super_block *, unsigned long);
 static struct inode *befs_alloc_inode(struct super_block *sb);
 static void befs_destroy_inode(struct inode *inode);
 static int befs_init_inodecache(void);
@@ -52,7 +52,6 @@ static int befs_statfs(struct dentry *, struct kstatfs *);
 static int parse_options(char *, befs_mount_options *);
 
 static const struct super_operations befs_sops = {
-   .read_inode = befs_read_inode,  /* initialize & read inode */
.alloc_inode= befs_alloc_inode, /* allocate a new inode */
.destroy_inode  = befs_destroy_inode, /* deallocate an inode */
.put_super  = befs_put_super,   /* uninit super */
@@ -198,9 +197,9 @@ befs_lookup(struct inode *dir, struct dentry *dentry, 
struct nameidata *nd)
return ERR_PTR(-ENODATA);
}
 
-   inode = iget(dir->i_sb, (ino_t) offset);
-   if (!inode)
-   return ERR_PTR(-EACCES);
+   inode = befs_iget(dir->i_sb, (ino_t) offset);
+   if (IS_ERR(inode))
+   return ERR_CAST(inode);
 
d_add(dentry, inode);
 
@@ -296,17 +295,23 @@ static void init_once(struct kmem_cache *cachep, void 
*foo)
inode_init_once(&bi->vfs_inode);
 }
 
-static void
-befs_read_inode(struct inode *inode)
+static struct inode *befs_iget(struct super_block *sb, unsigned long ino)
 {
struct buffer_head *bh = NULL;
befs_inode *raw_inode = NULL;
 
-   struct super_block *sb = inode->i_sb;
befs_sb_info *befs_sb = BEFS_SB(sb);
befs_inode_info *befs_ino = NULL;
+   struct inode *inode;
+   long ret = -EIO;
 
-   befs_debug(sb, "---> befs_read_inode() " "inode = %lu", inode->i_ino);
+   befs_debug(sb, "---> befs_read_inode() " "inode = %lu", ino);
+
+   inode = iget_locked(sb, ino);
+   if (IS_ERR(inode))
+   return inode;
+   if (!(inode->i_state & I_NEW))
+   return inode;
 
befs_ino = BEFS_I(inode);
 
@@ -402,15 +407,16 @@ befs_read_inode(struct inode *inode)
 
brelse(bh);
befs_debug(sb, "<--- befs_read_inode()");
-   return;
+   unlock_new_inode(inode);
+   return inode;
 
   unacquire_bh:
brelse(bh);
 
   unacquire_none:
-   make_bad_inode(inode);
+   iget_failed(inode);
befs_debug(sb, "<--- befs_read_inode() - Bad inode");
-   return;
+   return ERR_PTR(ret);
 }
 
 /* Initialize the inode cache. Called at fs setup.
@@ -752,6 +758,7 @@ befs_fill_super(struct super_block *sb, void *data, int 
silent)
befs_sb_info *befs_sb;
befs_super_block *disk_sb;
struct inode *root;
+   long ret = -EINVAL;
 
const unsigned long sb_block = 0;
const off_t x86_sb_off = 512;
@@ -833,7 +840,11 @@ befs_fill_super(struct super_block *sb, void *data, int 
silent)
/* Set real blocksize of fs */
sb_set_blocksize(sb, (ulong) befs_sb->block_size);
sb->s_op = (struct super_operations *) &befs_sops;
-   root = iget(sb, iaddr2blockno(sb, &(befs_sb->root_dir)));
+   root = befs_iget(sb, iaddr2blockno(sb, &(befs_sb->root_dir)));
+   if (IS_ERR(root)) {
+   ret = PTR_ERR(root);
+   goto unacquire_priv_sbp;
+   }
sb->s_root = d_alloc_root(root);
if (!sb->s_root) {
iput(root);
@@ -868,7 +879,7 @@ befs_fill_super(struct super_block *sb, void *data, int 
silent)
 
   unacquire_none:
sb->s_fs_info = NULL;
-   return -EINVAL;
+   return ret;
 }
 
 static int

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 06/31] IGET: Stop AFFS from using iget() and read_inode() [try #5]

2007-10-25 Thread David Howells

Stop the AFFS filesystem from using iget() and read_inode().  Replace
affs_read_inode() with affs_iget(), and call that instead of iget().
affs_iget() then uses iget_locked() directly and returns a proper error code
instead of an inode in the event of an error.

affs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/affs/affs.h |2 +-
 fs/affs/amigaffs.c |6 --
 fs/affs/inode.c|   20 +---
 fs/affs/namei.c|   10 --
 fs/affs/super.c|   12 +---
 5 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/fs/affs/affs.h b/fs/affs/affs.h
index 232c694..0bce4ff 100644
--- a/fs/affs/affs.h
+++ b/fs/affs/affs.h
@@ -174,7 +174,7 @@ extern void  affs_put_inode(struct inode 
*inode);
 extern void affs_drop_inode(struct inode *inode);
 extern void affs_delete_inode(struct inode *inode);
 extern void affs_clear_inode(struct inode *inode);
-extern void affs_read_inode(struct inode *inode);
+extern struct inode*affs_iget(struct super_block *sb, unsigned 
long ino);
 extern int  affs_write_inode(struct inode *inode, int);
 extern int  affs_add_entry(struct inode *dir, struct inode 
*inode, struct dentry *dentry, s32 type);
 
diff --git a/fs/affs/amigaffs.c b/fs/affs/amigaffs.c
index f4de4b9..8055730 100644
--- a/fs/affs/amigaffs.c
+++ b/fs/affs/amigaffs.c
@@ -170,9 +170,11 @@ affs_remove_link(struct dentry *dentry)
if (!link_bh)
goto done;
 
-   dir = iget(sb, be32_to_cpu(AFFS_TAIL(sb, link_bh)->parent));
-   if (!dir)
+   dir = affs_iget(sb, be32_to_cpu(AFFS_TAIL(sb, 
link_bh)->parent));
+   if (IS_ERR(dir)) {
+   retval = PTR_ERR(dir);
goto done;
+   }
 
affs_lock_dir(dir);
affs_fix_dcache(dentry, link_ino);
diff --git a/fs/affs/inode.c b/fs/affs/inode.c
index 4609a6c..edea76c 100644
--- a/fs/affs/inode.c
+++ b/fs/affs/inode.c
@@ -15,20 +15,25 @@
 extern const struct inode_operations affs_symlink_inode_operations;
 extern struct timezone sys_tz;
 
-void
-affs_read_inode(struct inode *inode)
+struct inode *affs_iget(struct super_block *sb, unsigned long ino)
 {
-   struct super_block  *sb = inode->i_sb;
struct affs_sb_info *sbi = AFFS_SB(sb);
struct buffer_head  *bh;
struct affs_head*head;
struct affs_tail*tail;
+   struct inode*inode;
u32  block;
u32  size;
u32  prot;
u16  id;
 
-   pr_debug("AFFS: read_inode(%lu)\n",inode->i_ino);
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   pr_debug("AFFS: affs_iget(%lu)\n",inode->i_ino);
 
block = inode->i_ino;
bh = affs_bread(sb, block);
@@ -154,12 +159,13 @@ affs_read_inode(struct inode *inode)
 sys_tz.tz_minuteswest * 60;
inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec = 
inode->i_atime.tv_nsec = 0;
affs_brelse(bh);
-   return;
+   unlock_new_inode(inode);
+   return inode;
 
 bad_inode:
-   make_bad_inode(inode);
affs_brelse(bh);
-   return;
+   iget_failed(inode);
+   return ERR_PTR(-EIO);
 }
 
 int
diff --git a/fs/affs/namei.c b/fs/affs/namei.c
index b407e9e..2218f1e 100644
--- a/fs/affs/namei.c
+++ b/fs/affs/namei.c
@@ -208,9 +208,8 @@ affs_lookup(struct inode *dir, struct dentry *dentry, 
struct nameidata *nd)
affs_lock_dir(dir);
bh = affs_find_entry(dir, dentry);
affs_unlock_dir(dir);
-   if (IS_ERR(bh)) {
+   if (IS_ERR(bh))
return ERR_CAST(bh);
-   }
if (bh) {
u32 ino = bh->b_blocknr;
 
@@ -223,10 +222,9 @@ affs_lookup(struct inode *dir, struct dentry *dentry, 
struct nameidata *nd)
ino = be32_to_cpu(AFFS_TAIL(sb, bh)->original);
}
affs_brelse(bh);
-   inode = iget(sb, ino);
-   if (!inode) {
-   return ERR_PTR(-EACCES);
-   }
+   inode = affs_iget(sb, ino);
+   if (IS_ERR(inode))
+   return ERR_PTR(PTR_ERR(inode));
}
dentry->d_op = AFFS_SB(sb)->s_flags & SF_INTL ? 
&affs_intl_dentry_operations : &affs_dentry_operations;
d_add(dentry, inode);
diff --git a/fs/affs/super.c b/fs/affs/super.c
index b53e5d0..3c45d49 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -113,7 +113,6 @@ static void destroy_inodecache(void)
 static const stru

[PATCH 04/31] IGET: Use iget_failed() in AFS [try #5]

2007-10-25 Thread David Howells

Use iget_failed() in AFS to kill a failed inode.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/afs/inode.c |5 +
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index d196840..ca9b02f 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -196,10 +196,7 @@ struct inode *afs_iget(struct super_block *sb, struct key 
*key,
 
/* failure */
 bad_inode:
-   make_bad_inode(inode);
-   unlock_new_inode(inode);
-   iput(inode);
-
+   iget_failed(inode);
_leave(" = %d [bad]", ret);
return ERR_PTR(ret);
 }

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 05/31] IGET: Use iget_failed() in GFS2 [try #5]

2007-10-25 Thread David Howells

Use iget_failed() in GFS2 to kill a failed inode.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/gfs2/inode.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 5f6dc32..c5285e2 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -235,7 +235,7 @@ fail_put:
ip->i_gl->gl_object = NULL;
gfs2_glock_put(ip->i_gl);
 fail:
-   iput(inode);
+   iget_failed(inode);
return ERR_PTR(error);
 }
 

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 02/31] Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) [try #5]

2007-10-25 Thread David Howells

Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using:

perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.*)[)][)]/ERR_CAST(\1)/' `grep -rl 
'ERR_PTR[(]*PTR_ERR' fs crypto net security`

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 crypto/cbc.c |2 +-
 crypto/cryptd.c  |4 ++--
 crypto/ecb.c |2 +-
 crypto/hmac.c|2 +-
 crypto/lrw.c |2 +-
 crypto/pcbc.c|2 +-
 crypto/xcbc.c|4 ++--
 fs/9p/vfs_inode.c|2 +-
 fs/affs/namei.c  |2 +-
 fs/afs/dir.c |4 ++--
 fs/afs/security.c|2 +-
 fs/fat/inode.c   |2 +-
 fs/fuse/dir.c|6 +++---
 fs/gfs2/dir.c|2 +-
 fs/gfs2/ops_export.c |2 +-
 fs/gfs2/ops_inode.c  |2 +-
 fs/jffs2/write.c |4 ++--
 fs/nfs/getroot.c |8 
 fs/nfsd/export.c |4 ++--
 fs/quota.c   |4 ++--
 fs/reiserfs/inode.c  |2 +-
 fs/reiserfs/xattr.c  |6 +++---
 fs/vfat/namei.c  |2 +-
 net/rxrpc/af_rxrpc.c |6 +++---
 security/keys/key.c  |2 +-
 security/keys/process_keys.c |2 +-
 security/keys/request_key.c  |2 +-
 security/keys/request_key_auth.c |2 +-
 28 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/crypto/cbc.c b/crypto/cbc.c
index 1f2649e..ea39ba2 100644
--- a/crypto/cbc.c
+++ b/crypto/cbc.c
@@ -288,7 +288,7 @@ static struct crypto_instance *crypto_cbc_alloc(struct 
rtattr **tb)
alg = crypto_get_attr_alg(tb, CRYPTO_ALG_TYPE_CIPHER,
  CRYPTO_ALG_TYPE_MASK);
if (IS_ERR(alg))
-   return ERR_PTR(PTR_ERR(alg));
+   return ERR_CAST(alg);
 
inst = crypto_alloc_instance("cbc", alg);
if (IS_ERR(inst))
diff --git a/crypto/cryptd.c b/crypto/cryptd.c
index 8bf2da8..60fad7b 100644
--- a/crypto/cryptd.c
+++ b/crypto/cryptd.c
@@ -230,7 +230,7 @@ static struct crypto_instance *cryptd_alloc_blkcipher(
alg = crypto_get_attr_alg(tb, CRYPTO_ALG_TYPE_BLKCIPHER,
  CRYPTO_ALG_TYPE_MASK | CRYPTO_ALG_ASYNC);
if (IS_ERR(alg))
-   return ERR_PTR(PTR_ERR(alg));
+   return ERR_CAST(alg);
 
inst = cryptd_alloc_instance(alg, state);
if (IS_ERR(inst))
@@ -265,7 +265,7 @@ static struct crypto_instance *cryptd_alloc(struct rtattr 
**tb)
 
algt = crypto_get_attr_type(tb);
if (IS_ERR(algt))
-   return ERR_PTR(PTR_ERR(algt));
+   return ERR_CAST(algt);
 
switch (algt->type & algt->mask & CRYPTO_ALG_TYPE_MASK) {
case CRYPTO_ALG_TYPE_BLKCIPHER:
diff --git a/crypto/ecb.c b/crypto/ecb.c
index 6310387..a46838e 100644
--- a/crypto/ecb.c
+++ b/crypto/ecb.c
@@ -128,7 +128,7 @@ static struct crypto_instance *crypto_ecb_alloc(struct 
rtattr **tb)
alg = crypto_get_attr_alg(tb, CRYPTO_ALG_TYPE_CIPHER,
  CRYPTO_ALG_TYPE_MASK);
if (IS_ERR(alg))
-   return ERR_PTR(PTR_ERR(alg));
+   return ERR_CAST(alg);
 
inst = crypto_alloc_instance("ecb", alg);
if (IS_ERR(inst))
diff --git a/crypto/hmac.c b/crypto/hmac.c
index e4eb6ac..c5c93da 100644
--- a/crypto/hmac.c
+++ b/crypto/hmac.c
@@ -211,7 +211,7 @@ static struct crypto_instance *hmac_alloc(struct rtattr 
**tb)
alg = crypto_get_attr_alg(tb, CRYPTO_ALG_TYPE_HASH,
  CRYPTO_ALG_TYPE_HASH_MASK);
if (IS_ERR(alg))
-   return ERR_PTR(PTR_ERR(alg));
+   return ERR_CAST(alg);
 
inst = crypto_alloc_instance("hmac", alg);
if (IS_ERR(inst))
diff --git a/crypto/lrw.c b/crypto/lrw.c
index 621095d..9d52e58 100644
--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -241,7 +241,7 @@ static struct crypto_instance *alloc(struct rtattr **tb)
alg = crypto_get_attr_alg(tb, CRYPTO_ALG_TYPE_CIPHER,
  CRYPTO_ALG_TYPE_MASK);
if (IS_ERR(alg))
-   return ERR_PTR(PTR_ERR(alg));
+   return ERR_CAST(alg);
 
inst = crypto_alloc_instance("lrw", alg);
if (IS_ERR(inst))
diff --git a/crypto/pcbc.c b/crypto/pcbc.c
index c3ed8a1..c04f52a 100644
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -292,7 +292,7 @@ static struct crypto_instance *crypto_pcbc_alloc(struct 
rtattr **tb)
alg = crypto_get_attr_alg(tb, CRYPTO_ALG_TYPE_CIPHER,
  CRYPTO_ALG_TYPE_MASK);
if (IS_ERR(alg))
-   return ERR_PTR(PTR_ERR(alg));
+   return ERR_CAST(alg);
 
inst = crypto_alloc_instance("pcbc", alg);
if (IS_ERR(inst))
diff --git a/crypto/xcbc.c b/crypto/xcbc.c
index ac68f3b..

[PATCH 03/31] IGET: Introduce a function to register iget failure [try #5]

2007-10-25 Thread David Howells

Introduce a function to register failure in an inode construction path.  This
includes marking the inode under construction as bad, unlocking it and
releasing it.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 Documentation/filesystems/porting |   18 +-
 fs/bad_inode.c|   14 ++
 include/linux/fs.h|1 +
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/porting 
b/Documentation/filesystems/porting
index dac45c9..3b0fb22 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -184,11 +184,19 @@ just takes the superblock and inode number as arguments 
and does the
 test and set for you.
 
 e.g.
-   inode = iget_locked(sb, ino);
-   if (inode->i_state & I_NEW) {
-   read_inode_from_disk(inode);
-   unlock_new_inode(inode);
-   }
+   inode = iget_locked(sb, ino);
+   if (inode->i_state & I_NEW) {
+   err = read_inode_from_disk(inode);
+   if (err < 0) {
+   iget_failed(inode);
+   return err;
+   }
+   unlock_new_inode(inode);
+   }
+
+Note that if the process of setting up a new inode fails, then iget_failed()
+should be called on the inode to render it dead, and an appropriate error
+should be passed back to the caller.
 
 ---
 [recommended]
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 521ff7c..f1c2ea8 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -359,3 +359,17 @@ int is_bad_inode(struct inode *inode)
 }
 
 EXPORT_SYMBOL(is_bad_inode);
+
+/**
+ * iget_failed - Mark an under-construction inode as dead and release it
+ * @inode: The inode to discard
+ *
+ * Mark an under-construction inode as dead and release it.
+ */
+void iget_failed(struct inode *inode)
+{
+   make_bad_inode(inode);
+   unlock_new_inode(inode);
+   iput(inode);
+}
+EXPORT_SYMBOL(iget_failed);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b3ec4a4..bc2671f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1763,6 +1763,7 @@ static inline struct inode *iget(struct super_block *sb, 
unsigned long ino)
 }
 
 extern void __iget(struct inode * inode);
+extern void iget_failed(struct inode *);
 extern void clear_inode(struct inode *);
 extern void destroy_inode(struct inode *);
 extern struct inode *new_inode(struct super_block *);

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 01/31] Add an ERR_CAST() macro to complement ERR_PTR and co. [try #5]

2007-10-25 Thread David Howells

Add an ERR_CAST() macro to complement ERR_PTR and co. for the purposes of
casting an error entyped as one pointer type to an error of another pointer
type whilst making it explicit as to what is going on.

This provides a replacement for the ERR_PTR(PTR_ERR(p)) construct.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 include/linux/err.h |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/linux/err.h b/include/linux/err.h
index 1ab1d44..08409cd 100644
--- a/include/linux/err.h
+++ b/include/linux/err.h
@@ -34,6 +34,18 @@ static inline long IS_ERR(const void *ptr)
return IS_ERR_VALUE((unsigned long)ptr);
 }
 
+/**
+ * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type
+ * @ptr: The pointer to cast.
+ *
+ * Explicitly cast an error-valued pointer to another pointer type in such a
+ * way as to make it clear that's what's going on.
+ */
+static inline void *ERR_CAST(const void *ptr)
+{
+   return (void *) ptr;
+}
+
 #endif
 
 #endif /* _LINUX_ERR_H */

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 00/31] Remove iget() and read_inode() [try #5]

2007-10-25 Thread David Howells



Here's a set of patches that remove all calls to iget() and all read_inode()
functions.  They should be removed for two reasons: firstly they don't lend
themselves to good error handling, and secondly their presence is a temptation
for code outside a filesystem to call iget() to access inodes within that
filesystem.

There are a few benefits to this:

 (1) Error handling gets simpler as you can return an error code rather than
 having to call is_bad_inode().

 (2) You can now tell the difference between ENOMEM and EIO occurring in the
 read_inode() path.

 (3) The code should get smaller.  iget() is an inline function that is
 typically called 2-3 times per filesystem that uses it.  By folding the
 iget code into the read_inode code for each filesystem, it eliminates
 some duplication.

A tarball of the patches can be retrieved from:

http://people.redhat.com/~dhowells/iget-remove.tar.bz2


Additionally, there are a couple of patches that introduce an ERR_CAST() macro
to be used instead of ERR_PTR(PTR_ERR(p)) constructs and apply it to all such
instances in the kernel.

Of the patches directly relevant to the subject:

The first patch adds a function, iget_failed() that is a canned piece of code
for killing an inode when the inode construction path fails.

The second and third patches makes AFS and GFS2 use iget_failed() rather than
interpolating the sequence directly.

The final patch removes iget() and read_inode().

Each of the other patches modify a filesystem that used iget() and read_inode()
to use iget_locked() instead.  The standard procedure was to convert:

void thingyfs_read_inode(struct inode *inode)
{
...
}

into:

struct inode *thingyfs_iget(struct super_block *sp, unsigned long ino)
{
struct inode *inode;
int ret;

inode = iget_locked(sb, ino);
if (!inode)
return ERR_PTR(-ENOMEM);
if (!(inode->i_state & I_NEW))
return inode;

...
unlock_new_inode(inode);
return inode;
error:
iget_failed(inode);
return ERR_PTR(ret);
}

and then call thingyfs_iget() rather than iget():

ret = -EINVAL;
inode = iget(sb, ino);
if (!inode || is_bad_inode(inode))
goto error;

becomes:

inode = thingyfs_iget(sb, ino);
if (IS_ERR(inode)) {
ret = PTR_ERR(inode);
goto error;
}

There were exceptions; most notably it appeared FAT should be calling ilookup()
not iget().

Additionally, HPPFS and HOSTFS (UM-specific filesystems) really need checking:

 hostfs_kern.c:

 (*) hostfs_iget() should perhaps subsume init_inode() and hostfs_read_inode().

 (*) It would appear that all hostfs inodes are the same inode because iget()
 was being called with inode number 0 - which forms the lookup key.

 hppfs_kern.c:

 (*) The HPPFS inode retains a pointer to the proc dentry it is shadowing, but
 whilst it does appear to retain a reference to it, it doesn't appear to
 destroy the reference if the inode goes away.

 (*) hppfs_iget() should perhaps subsume init_inode() and hppfs_read_inode().

 (*) It would appear that all hppfs inodes are the same inode because iget()
 was being called with inode number 0, which forms the lookup key.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH+comment] fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE

2007-10-25 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:
> On Thu, 25 Oct 2007, Pekka Enberg wrote:

> With unionfs also fixed, we don't know of an absolute need for this
> patch (and so, on that basis, the !wbc->for_reclaim case could indeed
> be removed very soon); but as I see it, the unionfs case has shown
> that it's time to future-proof this code against whatever stacking
> filesystems come along.  Hence I didn't mention the names of such
> filesystems in the source comment.

I think "future proof" for other stackable f/s is a good idea, esp. since
many of the stackable f/s we've developed and distributed over the past 10
years are in some use in various places: gzipfs, avfs, tracefs, replayfs,
ncryptfs, versionfs, wrapfs, i3fs, and more (see www.filesystems.org).

Cheers,
Erez.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Caching semi-digested credentials in struct cred

2007-10-25 Thread Trond Myklebust

On Thu, 2007-10-25 at 16:45 +0100, David Howells wrote:
> Trond Myklebust <[EMAIL PROTECTED]> wrote:
> > > If you'd rather, I can just drop the credentials into the NFS VFS methods
> > > and leave it for you to make work.
> > 
> > The latter sounds good for the moment. There are several people who are
> > working on the rpc auth code, and thus have an interest here, so it will
> > take some time to work out a consensus for what needs to be done.
> 
> The problem with that is that there are bits in the bottom of the sunrpc stuff
> that assume they can use current->fs[ug]id (rpcauth_lookupcred() for example),
> which is the reason for passing this pointer all the way down:-/

I don't think anybody has problems with the concept of replacing
current-> with cred-> in rpcauth_lookupcred(). That reflects the
(desirable) change to the way the VFS communicates authorisation
information to the lower level code.

What we don't want at this time is for any changes that go beyond that
minimal approach. It will take time to digest the consequences of the
existence of a generic credential, and it is not clear to us at this
time that changes to the rpc_cred caching mechanism (as opposed to just
the lookup key) is the right thing to do.

Trond
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Hugh Dickins

On Mon, 22 Oct 2007, Pekka Enberg wrote:
> On 10/22/07, Hugh Dickins <[EMAIL PROTECTED]> wrote:
> > Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE.
> > Both of those set BDI_CAP_NO_WRITEBACK.  ramdisk never returned it
> > if !wbc->for_reclaim.  I contend that shmem shouldn't either: it's
> > a special code to get the LRU rotation right, not useful elsewhere.
> > Though Documentation/filesystems/vfs.txt does imply wider use.
> >
> > I think this is where people use the phrase "go figure" ;)
> 
> Heh. As far as I can tell, the implication of "wider use" was added by
> Neil in commit "341546f5ad6fce584531f744853a5807a140f2a9 Update some
> VFS documentation", so perhaps he might know? Neil?

I take as gospel this extract from Andrew's original 2.5.52 comment:

  So the locking rules for writepage() are unchanged.  They are:

  - Called with the page locked
  - Returns with the page unlocked
  - Must redirty the page itself if it wasn't all written.

  But there is a new, special, hidden, undocumented, secret hack for
  tmpfs: writepage may return WRITEPAGE_ACTIVATE to tell the VM to move
  the page to the active list.  The page must be kept locked in this one
  case.

Special, hidden, undocumented, secret hack!  Then in 2.6.7 Andrew
stole his own secret and used it when concocting ramdisk_writepage.
Oh, and NFS made some kind of use of it in 2.6.6 only.  Then Neil
revealed the secret to the uninitiated in 2.6.17: now, what's the
appropriate punishment for that?

In the full 2.5.52 comment, Andrew explains how prior to this secret
code, we used fail_writepage, which in the memory pressure case did
an activate_page, with the intention of moving the page to the active
list - but that didn't actually work, because the page is off the
LRUs at this point, being passed around between pagevecs.

I've always preferred the way it was originally trying to do it, which
seems clearer and less error-prone than having a special code which
people then have to worry about.  Here's the patch I'd like to present
in due course (against 2.6.24-rc1, so unionfs absent): tmpfs and ramdisk
simply SetPageActive for this case (and go back to obeying the usual
unlocking rule for writepage), vmscan.c observe and act accordingly.

But I've not tested it at all (well, I've run with it in, but not
actually going down the paths in question): it may suffer from
something silly like the original fail_writepage.  Plus I might be
persuaded into making inc_zone_page_state(page, NR_VMSCAN_WRITE)
conditional on !PageActive(page), just to produce the same stats
as before (though they don't make a lot of sense, counting other
non-writes as writes).  And would it need a deprecation phase?

Hugh

 Documentation/filesystems/Locking |6 +-
 Documentation/filesystems/vfs.txt |4 +---
 drivers/block/rd.c|5 ++---
 include/linux/fs.h|   10 --
 mm/migrate.c  |5 ++---
 mm/page-writeback.c   |4 
 mm/shmem.c|   11 ---
 mm/vmscan.c   |   17 ++---
 8 files changed, 20 insertions(+), 42 deletions(-)

--- 2.6.24-rc1/Documentation/filesystems/Locking2007-10-24 
07:15:11.0 +0100
+++ linux/Documentation/filesystems/Locking 2007-10-24 08:42:07.0 
+0100
@@ -228,11 +228,7 @@ If the filesystem is called for sync the
 in-progress I/O and then start new I/O.

 The filesystem should unlock the page synchronously, before returning to the
-caller, unless ->writepage() returns special WRITEPAGE_ACTIVATE
-value. WRITEPAGE_ACTIVATE means that page cannot really be written out
-currently, and VM should stop calling ->writepage() on this page for some
-time. VM does this by moving page to the head of the active list, hence the
-name.
+caller.

 Unless the filesystem is going to redirty_page_for_writepage(), unlock the page
 and return zero, writepage *must* run set_page_writeback() against the page,
--- 2.6.24-rc1/Documentation/filesystems/vfs.txt2007-10-24 
07:15:11.0 +0100
+++ linux/Documentation/filesystems/vfs.txt 2007-10-24 08:42:07.0 
+0100
@@ -567,9 +567,7 @@ struct address_space_operations {
   If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
   try too hard if there are problems, and may choose to write out
   other pages from the mapping if that is easier (e.g. due to
-  internal dependencies).  If it chooses not to start writeout, it
-  should return AOP_WRITEPAGE_ACTIVATE so that the VM will not keep
-  calling ->writepage on that page.
+  internal dependencies).

   See the file "Locking" for more details.

--- 2.6.24-rc1/drivers/block/rd.c   2007-10-24 07:15:23.0 +0100
+++ linux/drivers/block/rd.c2007-10-24 08:42:07.0 +0100
@@ -152,8 +152,7 @@ static int ramdisk_commit_write(struct f

 /*
  * ->writepage to the blockdev's mapping has to redirty the page so th

Re: Caching semi-digested credentials in struct cred

2007-10-25 Thread David Howells

Trond Myklebust <[EMAIL PROTECTED]> wrote:

> You should be using the rpc_auth that is cached in the rpc_client:
> 
>   NFS_CLIENT(inode)->cl_auth
> 
> W.r.t. the nfs_find_open_context(), btw, there is no reason why we
> couldn't change that to take a generic cred as its argument.

I think I've been under a misapprehension as to what rpc_auth is for.  I
assumed it was the credential NFS was using for the operations by the user
that it was created for, but given it's a 'superblock' level thing, I guess
not.

> We have yet to see any numbers comparing the current vs. cred performance.

That's something that concerns me, as does the increased risk of overrunning
the stack.  I'm adding an extra argument to an awful lot of functions, and
it's going to degrade performance, no two ways about it.  The question is by
how much?

> > If you'd rather, I can just drop the credentials into the NFS VFS methods
> > and leave it for you to make work.
> 
> The latter sounds good for the moment. There are several people who are
> working on the rpc auth code, and thus have an interest here, so it will
> take some time to work out a consensus for what needs to be done.

The problem with that is that there are bits in the bottom of the sunrpc stuff
that assume they can use current->fs[ug]id (rpcauth_lookupcred() for example),
which is the reason for passing this pointer all the way down:-/

David
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2.6.24-rc1]EXPORT_SYMBOL(__set_page_dirty_no_writeback);

2007-10-25 Thread Tetsuo Handa

Hello.

Arjan van de Ven wrote:
> when will you post this filesystem for inclusion into kernel.org kernel?
> (and please really consider posting the patch together with that patch)
> (also, if you can give a pointer to the source code of this filesystem
> you might even get early code review)

I have proposed this filesystem at http://lkml.org/lkml/2004/11/1/48 .

In short, the filesystem I'm developing is a trivial device filesystem
that provides protection mechanism against tampering.

Reasons I don't use devfs/udev or fuse or LSM for /dev are:

  The devfs/udev don't provide protection mechanism against tampering.
  I don't know implementation that can enforce filename and it's attributes.
  Label based access control like SELinux doesn't distinguish
  /dev/sda1 and /dev/sda2, do they?
  If a process who is permitted to unlink and create /dev/sda1 and /dev/sda2 is 
cracked,
  who can ensure that /dev/sda1 is block-8-1 and /dev/sda2 is block-8-2?
  A situation /dev/sda1 is block-8-2 and /dev/sda2 is block-8-1 can happen.

  /dev has to be valid throughout the lifetime of system
  (i.e. from /sbin/init till power failure).
  Filesystems using fuse will freeze when a system starts /usr/bin/killall at 
shutdown script,
  where it is too early to stop working of /dev partition.

  LSM is used by SELinux, thus there is unlikely chance to call my module
  to validate a device file's filename and it's attributes.

The latest snapshot (which is not following codingstyle) is at
http://svn.sourceforge.jp/cgi-bin/viewcvs.cgi/*checkout*/trunk/1.5.x/ccs-patch/include/linux/syaoran.h?content-type=text%2Fplain&rev=588&root=tomoyo
http://svn.sourceforge.jp/cgi-bin/viewcvs.cgi/*checkout*/trunk/1.5.x/ccs-patch/fs/syaoran_2.6.c?content-type=text%2Fplain&rev=614&root=tomoyo

If there is a chance for inclusion into kernel.org kernel, I'm willing to fix 
codingstyle and submit immediately.

Thank you.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH+comment] fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE

2007-10-25 Thread Pekka Enberg

Hi Hugh,

On 10/25/07, Hugh Dickins <[EMAIL PROTECTED]> wrote:
> With unionfs also fixed, we don't know of an absolute need for this
> patch (and so, on that basis, the !wbc->for_reclaim case could indeed
> be removed very soon); but as I see it, the unionfs case has shown
> that it's time to future-proof this code against whatever stacking
> filesystems come along.

Heh, what can I say, after several readings, I still find your above
explanation (which I totally agree with) more to the point than the
actual comment :-).

In any case, the patch looks good to me.

Reviewed-by: Pekka Enberg <[EMAIL PROTECTED]>

  Pekka
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

61 matches

Mail list logo