from:"Erez Zadok"

Re: [2.6 patch] make vfs_ioctl() static

2008-02-17 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Christoph Hellwig writes:
> On Sun, Feb 17, 2008 at 10:18:42AM +0200, Adrian Bunk wrote:
> > This patch makes the needlessly global vfs_ioctl() static.
> 
> I think the point was to eventually export it for stackable filesystem
> use.  But until they start using it marking it static seems fine with
> me.

Right.  I'm not using it yet in unionfs, although I could; for now I'm just
calling very similar code myself.  This is only used in unionfs after I
process my own ioctls; IOW, I pass all unknown ioctls to the lower level and
let it handle them.

eCryptfs, however, doesn't pass unknown ioctls to the lower layer: it only
processes its own.

Honestly I'm not sure which is more appropriate: should a stackable f/s pass
unknown ioctls to the lower f/s or not?  If it doesn't, would any important
functionality be lost?
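
To make the two policies concrete, here is a minimal userspace sketch in C. It is
not unionfs or eCryptfs code: `struct layer`, `stack_ioctl()`, `lower_ioctl()`,
the `pass_unknown` flag, and the command numbers are all hypothetical names made
up for illustration. It only shows the dispatch order: own ioctls first, then
either forward or reject whatever is left.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { CMD_STACK_QUERY = 1, CMD_LOWER_FLAGS = 2 };

struct layer {
	/* ioctl handler of the lower file system; may be a null pointer */
	int (*lower_ioctl)(unsigned int cmd, unsigned long arg);
	/* policy knob: forward commands this layer does not recognize? */
	int pass_unknown;
};

/* Stand-in lower file system that understands exactly one command. */
static int lower_ioctl(unsigned int cmd, unsigned long arg)
{
	(void)arg;
	return cmd == CMD_LOWER_FLAGS ? 42 : -ENOTTY;
}

/* Handle our own commands first; forward or reject the rest. */
static int stack_ioctl(const struct layer *l, unsigned int cmd,
		       unsigned long arg)
{
	switch (cmd) {
	case CMD_STACK_QUERY:
		return 0;	/* handled by the stackable layer itself */
	default:
		if (l->pass_unknown && l->lower_ioctl)
			return l->lower_ioctl(cmd, arg);
		return -ENOTTY;	/* "own ioctls only" policy */
	}
}
```

With `pass_unknown` set, `CMD_LOWER_FLAGS` reaches the lower handler; with it
cleared, the same command fails with -ENOTTY, which is the functionality that
could be lost under the second policy.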

Erez.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: unionfs_copy_attr_times oopses

2008-02-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:

> Hi Erez,
> 
> Aside from the occasional "unionfs: new lower inode mtime" messages
> on directories (which I've got into the habit of ignoring now), the
> only problem I'm still suffering with unionfs over tmpfs (not tested
> any other fs's below it recently) is oops in unionfs_copy_attr_times.
> 
> I believe I'm working with your latest: 2.6.24-rc8-mm1 plus the four
> patches you posted to lkml on 26 Jan.  But this problem has been around
> for a while before that: I'd been hoping to debug it myself, but taken
> too long to make too little progress, so now handing over to you.
> 
> The oops occurs while doing repeated "make -j20" kernel builds in a
> unionfs mount of a tmpfs (though I doubt tmpfs is relevant): most of
> my testing was while swapping, but today I find that's irrelevant,
> and it should happen much quicker without.  SMP kernels (4 cpus),
> I haven't tried UP; happens with or without PREEMPT, may just be
> coincidence that it happens quicker on the machines with PREEMPT.
> 
> Most commonly it's unionfs_copy_attr_times called from unionfs_create,
> but that's probably just the most common route in this workload:
> I've seen it also when called from unionfs_rename or unionfs_open or
> unionfs_unlink.  It looks like there's a locking or refcounting bug,
> hence a race: the unionfs_inode_info which unionfs_copy_attr_times
> is working on gets changed underneath it, so it oopses on NULL
> lower_inodes.
[...]

Hugh,

Check out my latest set of patches (which correspond to release 2.2.4 of
Unionfs).  Thanks to your info and the patch, I was able to trigger several
races more frequently, and fix them.  I've tested my code with make -j N
(for N=4 and N=20), on a 4 cpu machine as well as a 2 cpu machine (w/
different amounts of memory and CPU speeds, also 32-bit vs 64-bit); I ran a
kernel compile for ~10-12 hours.  With the patches I just posted, I wasn't
able to trigger any of the WARN_ON's in unionfs_copy_attr_times.  I also
tried it while flushing caches via /proc, and/or performing branch-mgmt
commands in unionfs.

Give it a good shake and let me know what you find.

Thanks,
Erez.

[PATCH 14/17] Unionfs: stop using iget() and read_inode()

2008-02-16 Thread Erez Zadok

From: David Howells <[EMAIL PROTECTED]>

Replace unionfs_read_inode() with unionfs_iget(), and call that instead of
iget().  unionfs_iget() then uses iget_locked() directly and returns a
proper error code instead of an inode in the event of an error.

unionfs_fill_super() returns any error incurred when getting the root inode
instead of EINVAL.
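
For readers unfamiliar with the error-pointer convention this change relies on,
here is a userspace sketch. The three macros are re-implemented locally in the
spirit of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(); `struct toy_inode`,
`toy_iget()`, `toy_fill_super()`, and the `simulate_oom` parameter are
hypothetical stand-ins, not the unionfs functions in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer, and decode it again. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* error values occupy the top MAX_ERRNO addresses */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct toy_inode { unsigned long ino; };

/* Returns a valid object or an encoded errno, never a null pointer. */
static struct toy_inode *toy_iget(unsigned long ino, int simulate_oom)
{
	struct toy_inode *inode;

	if (simulate_oom)
		return ERR_PTR(-ENOMEM);
	inode = malloc(sizeof(*inode));
	if (!inode)
		return ERR_PTR(-ENOMEM);
	inode->ino = ino;
	return inode;
}

/* Caller side: propagate whatever error the getter reported. */
static int toy_fill_super(unsigned long root_ino, int simulate_oom)
{
	struct toy_inode *inode = toy_iget(root_ino, simulate_oom);

	if (IS_ERR(inode))
		return (int)PTR_ERR(inode);
	free(inode);
	return 0;
}
```

The caller no longer has to guess an errno (the old code hard-coded one); it
simply returns the value carried in the pointer.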

Signed-off-by: David Howells <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |   11 +--
 fs/unionfs/super.c |   19 ++-
 fs/unionfs/union.h |1 +
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index ba3471d..4bc2c66 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -104,9 +104,8 @@ struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
BUG_ON(is_negative_dentry);
 
/*
-* We allocate our new inode below, by calling iget.
-* iget will call our read_inode which will initialize some
-* of the new inode's fields
+* We allocate our new inode below by calling unionfs_iget,
+* which will initialize some of the new inode's fields
 */
 
/*
@@ -128,9 +127,9 @@ struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
}
} else {
/* get unique inode number for unionfs */
-   inode = iget(sb, iunique(sb, UNIONFS_ROOT_INO));
-   if (!inode) {
-   err = -EACCES;
+   inode = unionfs_iget(sb, iunique(sb, UNIONFS_ROOT_INO));
+   if (IS_ERR(inode)) {
+   err = PTR_ERR(inode);
goto out;
}
if (atomic_read(&inode->i_count) > 1)
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 175840f..b71fc2a 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -24,11 +24,19 @@
  */
 static struct kmem_cache *unionfs_inode_cachep;
 
-static void unionfs_read_inode(struct inode *inode)
+struct inode *unionfs_iget(struct super_block *sb, unsigned long ino)
 {
int size;
-   struct unionfs_inode_info *info = UNIONFS_I(inode);
+   struct unionfs_inode_info *info;
+   struct inode *inode;
 
+   inode = iget_locked(sb, ino);
+   if (!inode)
+   return ERR_PTR(-ENOMEM);
+   if (!(inode->i_state & I_NEW))
+   return inode;
+
+   info = UNIONFS_I(inode);
memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
info->bstart = -1;
info->bend = -1;
@@ -44,7 +52,8 @@ static void unionfs_read_inode(struct inode *inode)
if (unlikely(!info->lower_inodes)) {
printk(KERN_CRIT "unionfs: no kernel memory when allocating "
   "lower-pointer array!\n");
-   BUG();
+   iget_failed(inode);
+   return ERR_PTR(-ENOMEM);
}
 
inode->i_version++;
@@ -60,7 +69,8 @@ static void unionfs_read_inode(struct inode *inode)
inode->i_atime.tv_sec = inode->i_atime.tv_nsec = 0;
inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
-
+   unlock_new_inode(inode);
+   return inode;
 }
 
 /*
@@ -1025,7 +1035,6 @@ out:
 }
 
 struct super_operations unionfs_sops = {
-   .read_inode = unionfs_read_inode,
.delete_inode   = unionfs_delete_inode,
.put_super  = unionfs_put_super,
.statfs = unionfs_statfs,
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 1bf0c09..533806c 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -364,6 +364,7 @@ extern int unionfs_fsync(struct file *file, struct dentry *dentry,
 extern int unionfs_fasync(int fd, struct file *file, int flag);
 
 /* Inode operations */
+extern struct inode *unionfs_iget(struct super_block *sb, unsigned long ino);
 extern int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
  struct inode *new_dir, struct dentry *new_dentry);
 extern int unionfs_unlink(struct inode *dir, struct dentry *dentry);
-- 
1.5.2.2


[PATCH 11/17] Unionfs: lock parents' branch configuration fixes

2008-02-16 Thread Erez Zadok

Ensure that we lock the branch configuration of parent and child dentries in
operations which need it, and in the right order.
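
The split into a "_locked" core plus a locking wrapper (as done for
unionfs_file_revalidate in the diff below) can be sketched in userspace C as
follows. Everything here is a toy: `struct toy_lock` only records whether it is
held and is not a real mutex, and all `toy_*` names are hypothetical. The
child-then-parent order mirrors what the unionfs_open hunk does.

```c
#include <assert.h>

/* Toy lock that only records whether it is held; not a real mutex. */
struct toy_lock { int held; };

struct toy_dentry {
	struct toy_lock lock;
	struct toy_dentry *parent;	/* the root is its own parent */
	int revalidated;		/* how many times we were revalidated */
};

static void toy_lock_dentry(struct toy_dentry *d)
{
	assert(!d->lock.held);	/* a real mutex would self-deadlock here */
	d->lock.held = 1;
}

static void toy_unlock_dentry(struct toy_dentry *d)
{
	assert(d->lock.held);
	d->lock.held = 0;
}

/* Core routine: the caller must already hold d's lock. */
static int toy_revalidate_locked(struct toy_dentry *d)
{
	assert(d->lock.held);
	d->revalidated++;
	return 0;
}

/* Convenience wrapper for callers that hold no lock yet. */
static int toy_revalidate(struct toy_dentry *d)
{
	int err;

	toy_lock_dentry(d);
	err = toy_revalidate_locked(d);
	toy_unlock_dentry(d);
	return err;
}

/* Needs child and parent: lock child, then parent; unlock in reverse. */
static int toy_open(struct toy_dentry *d)
{
	int err;

	toy_lock_dentry(d);
	if (d != d->parent)
		toy_lock_dentry(d->parent);
	err = toy_revalidate_locked(d);	/* must not re-take d's lock */
	if (d != d->parent)
		toy_unlock_dentry(d->parent);
	toy_unlock_dentry(d);
	return err;
}
```

Calling the unlocked wrapper from `toy_open()` would trip the assertion in
`toy_lock_dentry()`, which is the toy equivalent of the deadlock the "_locked"
variant avoids.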

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |   31 +---
 fs/unionfs/dentry.c |   26 +---
 fs/unionfs/inode.c  |   58 +++---
 fs/unionfs/union.h  |5 ++-
 fs/unionfs/unlink.c |   11 -
 5 files changed, 91 insertions(+), 40 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 96473c4..491e2ff 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -300,8 +300,9 @@ out:
  * Revalidate the struct file
  * @file: file to revalidate
  * @willwrite: true if caller may cause changes to the file; false otherwise.
+ * Caller must lock/unlock dentry's branch configuration.
  */
-int unionfs_file_revalidate(struct file *file, bool willwrite)
+int unionfs_file_revalidate_locked(struct file *file, bool willwrite)
 {
struct super_block *sb;
struct dentry *dentry;
@@ -311,7 +312,6 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
int err = 0;
 
dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
sb = dentry->d_sb;
 
/*
@@ -416,7 +416,17 @@ out:
 out_nofree:
if (!err)
unionfs_check_file(file);
-   unionfs_unlock_dentry(dentry);
+   return err;
+}
+
+int unionfs_file_revalidate(struct file *file, bool willwrite)
+{
+   int err;
+
+   unionfs_lock_dentry(file->f_path.dentry, UNIONFS_DMUTEX_CHILD);
+   err = unionfs_file_revalidate_locked(file, willwrite);
+   unionfs_unlock_dentry(file->f_path.dentry);
+
return err;
 }
 
@@ -524,9 +534,18 @@ int unionfs_open(struct inode *inode, struct file *file)
struct dentry *dentry = file->f_path.dentry;
int bindex = 0, bstart = 0, bend = 0;
int size;
+   int valid = 0;
 
unionfs_read_lock(inode->i_sb, UNIONFS_SMUTEX_PARENT);
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+   if (dentry != dentry->d_parent)
+   unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
+
+   valid = __unionfs_d_revalidate_chain(dentry->d_parent, NULL, false);
+   if (unlikely(!valid)) {
+   err = -ESTALE;
+   goto out_nofree;
+   }
 
file->private_data =
kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
@@ -589,6 +608,8 @@ out_nofree:
unionfs_check_file(file);
unionfs_check_inode(inode);
}
+   if (dentry != dentry->d_parent)
+   unionfs_unlock_dentry(dentry->d_parent);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(inode->i_sb);
return err;
@@ -797,8 +818,9 @@ int unionfs_flush(struct file *file, fl_owner_t id)
int bindex, bstart, bend;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
-   err = unionfs_file_revalidate(file, true);
+   err = unionfs_file_revalidate_locked(file, true);
if (unlikely(err))
goto out;
unionfs_check_file(file);
@@ -824,6 +846,7 @@ int unionfs_flush(struct file *file, fl_owner_t id)
 
 out:
unionfs_check_file(file);
+   unionfs_unlock_dentry(file->f_path.dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
 }
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 3bd2dfb..17b297d 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -363,6 +363,7 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
chain_len = 0;
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
+   verify_locked(dentry);
if (dentry != dtmp)
unionfs_lock_dentry(dtmp, UNIONFS_DMUTEX_REVAL_PARENT);
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
@@ -453,7 +454,7 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
 
 out_this:
/* finally, lock this dentry and revalidate it */
-   verify_locked(dentry);
+   verify_locked(dentry);  /* verify child is locked */
if (dentry != dentry->d_parent)
unionfs_lock_dentry(dentry->d_parent,
UNIONFS_DMUTEX_REVAL_PARENT);
@@ -491,24 +492,20 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
return err;
 }
 
-/*
- * At this point no one can reference this dentry, so we don't have to be
- * careful about concurrent access.
- */
 static void unionfs_d_release(struct dentry *dentry)
 {
int bindex, bstart, bend;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHI

[PATCH 06/17] Unionfs: extend dentry branch configuration lock in open

2008-02-16 Thread Erez Zadok

Dentry branch configuration "info node" lock should extend to calls to
copy_attr_times.
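
The shape of the change, one lock taken at the top and one unlock at the single
exit so that the time-copying step is still covered, looks roughly like this in
a userspace sketch. `struct toy_node`, `toy_copy_times()`, and `toy_open()` are
hypothetical; a `size` of zero is used only to simulate an allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_node {
	int locked;
	int times_copied;	/* counts copies made under the lock */
};

/* Stand-in for copying attribute times; insists the lock is held. */
static void toy_copy_times(struct toy_node *n)
{
	assert(n->locked);
	n->times_copied++;
}

static int toy_open(struct toy_node *n, size_t size)
{
	int err = 0;
	void *priv;

	n->locked = 1;			/* taken once, at the top */

	/* size == 0 simulates an allocation failure */
	priv = size ? malloc(size) : NULL;
	if (!priv) {
		err = -ENOMEM;
		goto out;		/* error path still reaches the unlock */
	}
	free(priv);
out:
	if (!err)
		toy_copy_times(n);	/* still covered by the lock */
	n->locked = 0;			/* released once, at the bottom */
	return err;
}
```

Because every path funnels through `out:`, there is no early unlock left that
could let the node change before the times are copied.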

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |   14 --
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index f37192f..96473c4 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -521,11 +521,12 @@ int unionfs_open(struct inode *inode, struct file *file)
 {
int err = 0;
struct file *lower_file = NULL;
-   struct dentry *dentry = NULL;
+   struct dentry *dentry = file->f_path.dentry;
int bindex = 0, bstart = 0, bend = 0;
int size;
 
unionfs_read_lock(inode->i_sb, UNIONFS_SMUTEX_PARENT);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
file->private_data =
kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
@@ -551,9 +552,6 @@ int unionfs_open(struct inode *inode, struct file *file)
goto out;
}
 
-   dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
-
bstart = fbstart(file) = dbstart(dentry);
bend = fbend(file) = dbend(dentry);
 
@@ -573,15 +571,12 @@ int unionfs_open(struct inode *inode, struct file *file)
if (!lower_file)
continue;
 
-   branchput(file->f_path.dentry->d_sb, bindex);
+   branchput(dentry->d_sb, bindex);
/* fput calls dput for lower_dentry */
fput(lower_file);
}
}
 
-   /* XXX: should this unlock be moved to the function bottom? */
-   unionfs_unlock_dentry(dentry);
-
 out:
if (err) {
kfree(UNIONFS_F(file)->lower_files);
@@ -590,12 +585,11 @@ out:
}
 out_nofree:
if (!err) {
-   dentry = file->f_path.dentry;
-   unionfs_copy_attr_times(dentry->d_parent->d_inode);
unionfs_copy_attr_times(inode);
unionfs_check_file(file);
unionfs_check_inode(inode);
}
+   unionfs_unlock_dentry(dentry);
unionfs_read_unlock(inode->i_sb);
return err;
 }
-- 
1.5.2.2


[PATCH 15/17] Unionfs: embed a struct path into struct nameidata instead of nd->{dentry,mnt}

2008-02-16 Thread Erez Zadok

From: Andrew Morton <[EMAIL PROTECTED]>

Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 2e791fd..640969d 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -254,8 +254,8 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 
/* save the dentry & vfsmnt from namei */
if (nd) {
-   path_save.dentry = nd->dentry;
-   path_save.mnt = nd->mnt;
+   path_save.dentry = nd->path.dentry;
+   path_save.mnt = nd->path.mnt;
}
 
/*
@@ -266,8 +266,8 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 
/* restore the dentry & vfsmnt in namei */
if (nd) {
-   nd->dentry = path_save.dentry;
-   nd->mnt = path_save.mnt;
+   nd->path.dentry = path_save.dentry;
+   nd->path.mnt = path_save.mnt;
}
if (!IS_ERR(ret)) {
if (ret)
@@ -885,7 +885,7 @@ static int unionfs_permission(struct inode *inode, int mask,
const int write_mask = (mask & MAY_WRITE) && !(mask & MAY_READ);
 
if (nd)
-   unionfs_lock_dentry(nd->dentry, UNIONFS_DMUTEX_CHILD);
+   unionfs_lock_dentry(nd->path.dentry, UNIONFS_DMUTEX_CHILD);
 
if (!UNIONFS_I(inode)->lower_inodes) {
if (is_file)/* dirs can be unlinked but chdir'ed to */
@@ -960,7 +960,7 @@ out:
unionfs_check_inode(inode);
unionfs_check_nd(nd);
if (nd)
-   unionfs_unlock_dentry(nd->dentry);
+   unionfs_unlock_dentry(nd->path.dentry);
return err;
 }
 
-- 
1.5.2.2


[PATCH 05/17] Unionfs: initialize path_save variable

2008-02-16 Thread Erez Zadok

This is not strictly necessary, but it helps quiet a gcc-4.2 warning (a good
optimizer may optimize this initialization away).
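
A small userspace sketch of the pattern that tends to trigger this kind of
"may be used uninitialized" warning: a local is written and later read under the
same condition, which the compiler may not be able to correlate. The `toy_*`
types and functions are hypothetical stand-ins for the save/restore done in
unionfs_lookup().

```c
#include <assert.h>
#include <stddef.h>

struct toy_path { const void *dentry; const void *mnt; };
struct toy_nd { struct toy_path path; };

/* Hypothetical callee that may clobber the caller's nd. */
static void toy_lookup_backend(struct toy_nd *nd)
{
	if (nd) {
		nd->path.dentry = NULL;
		nd->path.mnt = NULL;
	}
}

static void toy_lookup(struct toy_nd *nd)
{
	/*
	 * Both uses of 'saved' are guarded by the same 'if (nd)' test, so
	 * it is never read before being written.  A compiler that cannot
	 * prove that may still warn, hence the explicit initializer.
	 */
	struct toy_path saved = { NULL, NULL };

	if (nd)
		saved = nd->path;	/* save */
	toy_lookup_backend(nd);
	if (nd)
		nd->path = saved;	/* restore */
}
```

The initializer costs at most two stores and does not change behavior on either
branch.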

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
Signed-off-by: Josef 'Jeff' Sipek <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 0b92da2..8d939dc 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -246,7 +246,7 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 struct dentry *dentry,
 struct nameidata *nd)
 {
-   struct path path_save;
+   struct path path_save = {NULL, NULL};
struct dentry *ret;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
-- 
1.5.2.2


[PATCH 08/17] Unionfs: improve debugging in copy_attr_times

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/subr.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 68a6280..1a40f63 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -247,8 +247,14 @@ void unionfs_copy_attr_times(struct inode *upper)
int bindex;
struct inode *lower;
 
-   if (!upper || ibstart(upper) < 0)
+   if (!upper)
return;
+   if (ibstart(upper) < 0) {
+#ifdef CONFIG_UNION_FS_DEBUG
+   WARN_ON(ibstart(upper) < 0);
+#endif /* CONFIG_UNION_FS_DEBUG */
+   return;
+   }
for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
lower = unionfs_lower_inode_idx(upper, bindex);
if (!lower)
-- 
1.5.2.2


[PATCH 13/17] Unionfs: use dget_parent in revalidation code

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   13 -
 1 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index a956b94..f8f65e1 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -410,15 +410,10 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
goto out;
}
 
-   /*
-* lock all dentries in chain, in child to parent order.
-* if failed, then sleep for a little, then retry.
-*/
-   dtmp = dentry->d_parent;
-   for (i = chain_len-1; i >= 0; i--) {
-   chain[i] = dget(dtmp);
-   dtmp = dtmp->d_parent;
-   }
+   /* grab all dentries in chain, in child to parent order */
+   dtmp = dentry;
+   for (i = chain_len-1; i >= 0; i--)
+   dtmp = chain[i] = dget_parent(dtmp);
 
/*
 * call __unionfs_d_revalidate_one() on each dentry, but in parent
-- 
1.5.2.2


[PATCH 12/17] Unionfs: branch management/configuration fixes

2008-02-16 Thread Erez Zadok

Remove unnecessary calls to update branch m/ctimes, and use them only when
needed.  Update branch vfsmounts after operations that could cause a copyup.
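
Some hunks below replace a call that takes only the upper inode with
fsstack_copy_attr_times(upper, lower), which names the lower inode explicitly.
A userspace sketch of the difference between the two styles follows. It assumes
the upper-only helper scans every branch and keeps the newest mtime/ctime; that
is an assumption about unionfs_copy_attr_times(), whose body is not part of this
patch. All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_inode { long atime, mtime, ctime; };

/* Copy all three timestamps from one known lower inode. */
static void toy_copy_times_from(struct toy_inode *upper,
				const struct toy_inode *lower)
{
	upper->atime = lower->atime;
	upper->mtime = lower->mtime;
	upper->ctime = lower->ctime;
}

/* Scan every branch, skip empty slots, keep the newest mtime/ctime. */
static void toy_copy_newest_times(struct toy_inode *upper,
				  struct toy_inode *const lower[],
				  int nbranches)
{
	int i;

	for (i = 0; i < nbranches; i++) {
		if (!lower[i])
			continue;
		if (upper->mtime < lower[i]->mtime)
			upper->mtime = lower[i]->mtime;
		if (upper->ctime < lower[i]->ctime)
			upper->ctime = lower[i]->ctime;
	}
}
```

The explicit form is the natural choice when the caller already holds the lower
inode it just operated on, as in the rename and copyup hunks.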

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |9 +++--
 fs/unionfs/copyup.c |3 ++-
 fs/unionfs/dentry.c |1 +
 fs/unionfs/mmap.c   |   17 -
 fs/unionfs/rename.c |7 +--
 5 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 491e2ff..2add167 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -604,6 +604,7 @@ out:
}
 out_nofree:
if (!err) {
+   unionfs_postcopyup_setmnt(dentry);
unionfs_copy_attr_times(inode);
unionfs_check_file(file);
unionfs_check_inode(inode);
@@ -839,13 +840,9 @@ int unionfs_flush(struct file *file, fl_owner_t id)
 
}
 
-   /* on success, update our times */
-   unionfs_copy_attr_times(dentry->d_inode);
-   /* parent time could have changed too (async) */
-   unionfs_copy_attr_times(dentry->d_parent->d_inode);
-
 out:
-   unionfs_check_file(file);
+   if (!err)
+   unionfs_check_file(file);
unionfs_unlock_dentry(file->f_path.dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 9beac01..f71bddf 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -819,7 +819,8 @@ begin:
 * update times of this dentry, but also the parent, because if
 * we changed, the parent may have changed too.
 */
-   unionfs_copy_attr_times(parent_dentry->d_inode);
+   fsstack_copy_attr_times(parent_dentry->d_inode,
+   lower_parent_dentry->d_inode);
unionfs_copy_attr_times(child_dentry->d_inode);
 
parent_dentry = child_dentry;
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 17b297d..a956b94 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -482,6 +482,7 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
err = __unionfs_d_revalidate_chain(dentry, nd, false);
if (likely(err > 0)) { /* true==1: dentry is valid */
+   unionfs_postcopyup_setmnt(dentry);
unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
}
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index ad770ac..d6ac61e 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -227,20 +227,11 @@ static int unionfs_prepare_write(struct file *file, struct page *page,
int err;
 
unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
-   /*
-* This is the only place where we unconditionally copy the lower
-* attribute times before calling unionfs_file_revalidate.  The
-* reason is that our ->write calls do_sync_write which in turn will
-* call our ->prepare_write and then ->commit_write.  Before our
-* ->write is called, the lower mtimes are in sync, but by the time
-* the VFS calls our ->commit_write, the lower mtimes have changed.
-* Therefore, the only reasonable time for us to sync up from the
-* changed lower mtimes, and avoid an invariant violation warning,
-* is here, in ->prepare_write.
-*/
-   unionfs_copy_attr_times(file->f_path.dentry->d_inode);
err = unionfs_file_revalidate(file, true);
-   unionfs_check_file(file);
+   if (!err) {
+   unionfs_copy_attr_times(file->f_path.dentry->d_inode);
+   unionfs_check_file(file);
+   }
unionfs_read_unlock(file->f_path.dentry->d_sb);
 
return err;
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 5ab13f9..cc16eb2 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -138,6 +138,11 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
err = vfs_rename(lower_old_dir_dentry->d_inode, lower_old_dentry,
 lower_new_dir_dentry->d_inode, lower_new_dentry);
 out_err_unlock:
+   if (!err) {
+   /* update parent dir times */
+   fsstack_copy_attr_times(old_dir, lower_old_dir_dentry->d_inode);
+   fsstack_copy_attr_times(new_dir, lower_new_dir_dentry->d_inode);
+   }
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
lockdep_on();
 
@@ -526,8 +531,6 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
}
}
/* if all of this renaming succeeded, update our times */
-   unionfs_copy_attr_times(old_dir);
-   unionfs_copy_attr_times(new_dir);
unionfs

[PATCH 10/17] Unionfs: factor out revalidation routine

2008-02-16 Thread Erez Zadok

To be used by rest of revalidation code, as well as callers who already
locked the child and parent dentry branch-configurations.
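
The generation check at the heart of the factored-out routine can be sketched in
userspace C. This is a simplified model, not the kernel code: the `toy_*` names,
the `lower_changed` flag (standing in for is_newer_lower()), and the `refreshed`
output are hypothetical, and no locking or mount refcounting is modelled.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_sb { int generation; };	/* bumped on every branch change */

struct toy_dentry {
	struct toy_sb *sb;
	int generation;		/* generation this dentry was last valid for */
	bool lower_changed;	/* a lower object was modified behind our back */
};

/* Bring the dentry up to date with the superblock's generation. */
static bool toy_revalidate_one(struct toy_dentry *d)
{
	d->generation = d->sb->generation;
	return true;
}

/*
 * Returns true if the dentry is valid.  *refreshed is set when the dentry
 * was stale on entry; the real code uses that case to drop the extra
 * vfsmount references taken during revalidation.
 */
static bool toy_revalidate_one_locked(struct toy_dentry *d, bool *refreshed)
{
	int sbgen = d->sb->generation;
	int dgen = d->generation;
	bool valid;

	if (d->lower_changed) {
		dgen = 0;		/* zero is guaranteed to be "old" */
		d->generation = dgen;
		d->lower_changed = false;
	}
	valid = toy_revalidate_one(d);
	*refreshed = valid && sbgen != dgen;
	return valid;
}
```

The sketch relies on superblock generations starting above zero, so that
resetting a dentry to zero always forces a refresh.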

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   87 +++---
 fs/unionfs/union.h  |3 ++
 2 files changed, 57 insertions(+), 33 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index afa2120..3bd2dfb 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -285,6 +285,59 @@ void purge_sb_data(struct super_block *sb)
 }
 
 /*
+ * Revalidate a single file/symlink/special dentry.  Assume that info nodes
+ * of the dentry and its parent are locked.  Assume that parent(s) are all
+ * valid already, but the child may not yet be valid.  Returns true if
+ * valid, false otherwise.
+ */
+bool __unionfs_d_revalidate_one_locked(struct dentry *dentry,
+  struct nameidata *nd,
+  bool willwrite)
+{
+   bool valid = false; /* default is invalid */
+   int sbgen, dgen, bindex;
+
+   verify_locked(dentry);
+   verify_locked(dentry->d_parent);
+
+   sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
+   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+
+   if (unlikely(is_newer_lower(dentry))) {
+   /* root dentry special case as aforementioned */
+   if (IS_ROOT(dentry)) {
+   unionfs_copy_attr_times(dentry->d_inode);
+   } else {
+   /*
+* reset generation number to zero, guaranteed to be
+* "old"
+*/
+   dgen = 0;
+   atomic_set(&UNIONFS_D(dentry)->generation, dgen);
+   }
+   if (!willwrite)
+   purge_inode_data(dentry->d_inode);
+   }
+   valid = __unionfs_d_revalidate_one(dentry, nd);
+
+   /*
+* If __unionfs_d_revalidate_one() succeeded above, then it will
+* have incremented the refcnt of the mnt's, but also the branch
+* indices of the dentry will have been updated (to take into
+* account any branch insertions/deletion.  So the current
+* dbstart/dbend match the current, and new, indices of the mnts
+* which __unionfs_d_revalidate_one has incremented.  Note: the "if"
+* test below does not depend on whether chain_len was 0 or greater.
+*/
+   if (!valid || sbgen == dgen)
+   goto out;
+   for (bindex = dbstart(dentry); bindex <= dbend(dentry); bindex++)
+   unionfs_mntput(dentry, bindex);
+out:
+   return valid;
+}
+
+/*
  * Revalidate a parent chain of dentries, then the actual node.
  * Assumes that dentry is locked, but will lock all parents if/when needed.
  *
@@ -404,42 +457,10 @@ out_this:
if (dentry != dentry->d_parent)
unionfs_lock_dentry(dentry->d_parent,
UNIONFS_DMUTEX_REVAL_PARENT);
-   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
-
-   if (unlikely(is_newer_lower(dentry))) {
-   /* root dentry special case as aforementioned */
-   if (IS_ROOT(dentry)) {
-   unionfs_copy_attr_times(dentry->d_inode);
-   } else {
-   /*
-* reset generation number to zero, guaranteed to be
-* "old"
-*/
-   dgen = 0;
-   atomic_set(&UNIONFS_D(dentry)->generation, dgen);
-   }
-   if (!willwrite)
-   purge_inode_data(dentry->d_inode);
-   }
-   valid = __unionfs_d_revalidate_one(dentry, nd);
+   valid = __unionfs_d_revalidate_one_locked(dentry, nd, willwrite);
if (dentry != dentry->d_parent)
unionfs_unlock_dentry(dentry->d_parent);
 
-   /*
-* If __unionfs_d_revalidate_one() succeeded above, then it will
-* have incremented the refcnt of the mnt's, but also the branch
-* indices of the dentry will have been updated (to take into
-* account any branch insertions/deletion.  So the current
-* dbstart/dbend match the current, and new, indices of the mnts
-* which __unionfs_d_revalidate_one has incremented.  Note: the "if"
-* test below does not depend on whether chain_len was 0 or greater.
-*/
-   if (valid && sbgen != dgen)
-   for (bindex = dbstart(dentry);
-bindex <= dbend(dentry);
-bindex++)
-   unionfs_mntput(dentry, bindex);
-
 out_free:
/* unlock/dput all dentries in chain and return status */
if (chain_len >

[PATCH 17/17] VFS/Unionfs: use generic path_get/path_put functions

2008-02-16 Thread Erez Zadok

Remove unionfs's versions thereof.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |   10 +-
 fs/unionfs/super.c|   27 ++-
 include/linux/namei.h |   12 
 3 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 3585b29..8f59fb5 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -230,11 +230,11 @@ void unionfs_reinterpose(struct dentry *dentry)
 int check_branch(struct nameidata *nd)
 {
/* XXX: remove in ODF code -- stacking unions allowed there */
-   if (!strcmp(nd->dentry->d_sb->s_type->name, UNIONFS_NAME))
+   if (!strcmp(nd->path.dentry->d_sb->s_type->name, UNIONFS_NAME))
return -EINVAL;
-   if (!nd->dentry->d_inode)
+   if (!nd->path.dentry->d_inode)
return -ENOENT;
-   if (!S_ISDIR(nd->dentry->d_inode->i_mode))
+   if (!S_ISDIR(nd->path.dentry->d_inode->i_mode))
return -ENOTDIR;
return 0;
 }
@@ -375,8 +375,8 @@ static int parse_dirs_option(struct super_block *sb, struct unionfs_dentry_info
goto out;
}
 
-   lower_root_info->lower_paths[bindex].dentry = nd.dentry;
-   lower_root_info->lower_paths[bindex].mnt = nd.mnt;
+   lower_root_info->lower_paths[bindex].dentry = nd.path.dentry;
+   lower_root_info->lower_paths[bindex].mnt = nd.path.mnt;
 
set_branchperms(sb, bindex, perms);
set_branch_count(sb, bindex, 0);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 773623e..fba1598 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -231,8 +231,8 @@ static noinline int do_remount_mode_option(char *optarg, int cur_branches,
goto out;
}
for (idx = 0; idx < cur_branches; idx++)
-   if (nd.mnt == new_lower_paths[idx].mnt &&
-   nd.dentry == new_lower_paths[idx].dentry)
+   if (nd.path.mnt == new_lower_paths[idx].mnt &&
+   nd.path.dentry == new_lower_paths[idx].dentry)
break;
path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
@@ -274,8 +274,8 @@ static noinline int do_remount_del_option(char *optarg, int cur_branches,
goto out;
}
for (idx = 0; idx < cur_branches; idx++)
-   if (nd.mnt == new_lower_paths[idx].mnt &&
-   nd.dentry == new_lower_paths[idx].dentry)
+   if (nd.path.mnt == new_lower_paths[idx].mnt &&
+   nd.path.dentry == new_lower_paths[idx].dentry)
break;
path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
@@ -358,8 +358,8 @@ static noinline int do_remount_add_option(char *optarg, int cur_branches,
goto out;
}
for (idx = 0; idx < cur_branches; idx++)
-   if (nd.mnt == new_lower_paths[idx].mnt &&
-   nd.dentry == new_lower_paths[idx].dentry)
+   if (nd.path.mnt == new_lower_paths[idx].mnt &&
+   nd.path.dentry == new_lower_paths[idx].dentry)
break;
path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
@@ -425,10 +425,10 @@ found_insertion_point:
memmove(&new_lower_paths[idx+1], &new_lower_paths[idx],
(cur_branches - idx) * sizeof(struct path));
}
-   new_lower_paths[idx].dentry = nd.dentry;
-   new_lower_paths[idx].mnt = nd.mnt;
+   new_lower_paths[idx].dentry = nd.path.dentry;
+   new_lower_paths[idx].mnt = nd.path.mnt;
 
-   new_data[idx].sb = nd.dentry->d_sb;
+   new_data[idx].sb = nd.path.dentry->d_sb;
atomic_set(&new_data[idx].open_files, 0);
new_data[idx].branchperms = perms;
new_data[idx].branch_id = ++*high_branch_id; /* assign new branch ID */
@@ -577,7 +577,7 @@ static int unionfs_remount_fs(struct super_block *sb, int *flags,
memcpy(tmp_lower_paths, UNIONFS_D(sb->s_root)->lower_paths,
   cur_branches * sizeof(struct path));
for (i = 0; i < cur_branches; i++)
-   pathget(&tmp_lower_paths[i]); /* drop refs at end of fxn */
+   path_get(&tmp_lower_paths[i]); /* drop refs at end of fxn */
 
/***
 * For each branch command, do path_lookup on the requested branch,
@@ -1008,9 +1008,10 @@ static int unionfs_show_options(struct seq_file *m, struct vfsmount *mnt)
 
seq_printf(m, ",dirs=");
for (bindex = bstart; bindex <= bend; bindex++) {
-   path = d_path

[PATCH 16/17] Unionfs: use the new path_put

2008-02-16 Thread Erez Zadok

From: Jan Blunck <[EMAIL PROTECTED]>

* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()
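
As a rough illustration of "the right order", here is a userspace model of a
get/put pair over a (mount, dentry) path. The refcounts are plain integers and
every `toy_*` name is hypothetical; the assertion in `toy_dput()` encodes the
assumption that a dentry reference may only be dropped while its mount is still
pinned.

```c
#include <assert.h>

struct toy_mnt { int count; };
struct toy_dentry { int count; struct toy_mnt *home; };
struct toy_path { struct toy_mnt *mnt; struct toy_dentry *dentry; };

static void toy_dput(struct toy_dentry *d)
{
	assert(d->count > 0);
	assert(d->home->count > 0);	/* mount must still be pinned */
	d->count--;
}

static void toy_mntput(struct toy_mnt *m)
{
	assert(m->count > 0);
	m->count--;
}

/* Take references: pin the mount first, then the dentry on it. */
static void toy_path_get(const struct toy_path *p)
{
	p->mnt->count++;
	p->dentry->count++;
}

/* Drop references in the reverse order: dentry first, then mount. */
static void toy_path_put(const struct toy_path *p)
{
	toy_dput(p->dentry);
	toy_mntput(p->mnt);
}
```

Swapping the two calls in `toy_path_put()` would trip the assertion on the last
put, which is the ordering mistake a single helper is meant to rule out.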

Signed-off-by: Jan Blunck <[EMAIL PROTECTED]>
Signed-off-by: Andreas Gruenbacher <[EMAIL PROTECTED]>
Acked-by: Christoph Hellwig <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |2 +-
 fs/unionfs/super.c |   12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 4bc2c66..3585b29 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -371,7 +371,7 @@ static int parse_dirs_option(struct super_block *sb, struct unionfs_dentry_info
if (err) {
printk(KERN_ERR "unionfs: lower directory "
   "'%s' is not a valid branch\n", name);
-   path_release(&nd);
+   path_put(&nd.path);
goto out;
}
 
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index b71fc2a..773623e 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -234,7 +234,7 @@ static noinline int do_remount_mode_option(char *optarg, int cur_branches,
if (nd.mnt == new_lower_paths[idx].mnt &&
nd.dentry == new_lower_paths[idx].dentry)
break;
-   path_release(&nd);  /* no longer needed */
+   path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
err = -ENOENT;  /* err may have been reset above */
printk(KERN_ERR "unionfs: branch \"%s\" "
@@ -277,7 +277,7 @@ static noinline int do_remount_del_option(char *optarg, int cur_branches,
if (nd.mnt == new_lower_paths[idx].mnt &&
nd.dentry == new_lower_paths[idx].dentry)
break;
-   path_release(&nd);  /* no longer needed */
+   path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
printk(KERN_ERR "unionfs: branch \"%s\" "
   "not found\n", optarg);
@@ -296,7 +296,7 @@ static noinline int do_remount_del_option(char *optarg, int cur_branches,
 * new_data and new_lower_paths one to the left.  Finally, adjust
 * cur_branches.
 */
-   pathput(&new_lower_paths[idx]);
+   path_put(&new_lower_paths[idx]);
 
if (idx < cur_branches - 1) {
/* if idx==cur_branches-1, we delete last branch: easy */
@@ -361,7 +361,7 @@ static noinline int do_remount_add_option(char *optarg, int cur_branches,
if (nd.mnt == new_lower_paths[idx].mnt &&
nd.dentry == new_lower_paths[idx].dentry)
break;
-   path_release(&nd);  /* no longer needed */
+   path_put(&nd.path); /* no longer needed */
if (idx == cur_branches) {
printk(KERN_ERR "unionfs: branch \"%s\" "
   "not found\n", optarg);
@@ -408,7 +408,7 @@ found_insertion_point:
if (err) {
printk(KERN_ERR "unionfs: lower directory "
   "\"%s\" is not a valid branch\n", optarg);
-   path_release(&nd);
+   path_put(&nd.path);
goto out;
}
 
@@ -818,7 +818,7 @@ out_release:
/* no need to cleanup/release anything in tmp_data */
if (tmp_lower_paths)
for (i = 0; i < new_branches; i++)
-   pathput(&tmp_lower_paths[i]);
+   path_put(&tmp_lower_paths[i]);
 out_free:
kfree(tmp_lower_paths);
kfree(tmp_data);
-- 
1.5.2.2
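
The "right order" mentioned in the description above can be illustrated in
userspace.  This is only a sketch under stated assumptions: the mock_* types
and functions are hypothetical stand-ins, not the kernel's, and the sketch
assumes that a put drops the dentry reference before the vfsmount reference
while a get takes them in the opposite order.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins; the kernel objects carry far more state. */
struct mock_vfsmount { int refs; };
struct mock_dentry { int refs; };

struct mock_path {
	struct mock_vfsmount *mnt;
	struct mock_dentry *dentry;
};

/* Records the order of releases: 'd' for dentry, 'm' for vfsmount. */
char release_log[8];
int release_len;

void mock_dput(struct mock_dentry *d)
{
	d->refs--;
	if (release_len < (int)sizeof(release_log))
		release_log[release_len++] = 'd';
}

void mock_mntput(struct mock_vfsmount *m)
{
	m->refs--;
	if (release_len < (int)sizeof(release_log))
		release_log[release_len++] = 'm';
}

/* Take one reference on each half of the pair: mount first, then dentry. */
void mock_path_get(struct mock_path *p)
{
	p->mnt->refs++;
	p->dentry->refs++;
}

/* Drop both references: dentry first, then the mount it lives on. */
void mock_path_put(struct mock_path *p)
{
	mock_dput(p->dentry);
	mock_mntput(p->mnt);
}
```

The point of a single helper is that callers such as the remount code no
longer have to remember the ordering themselves.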

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 04/17] Unionfs: uninline unionfs_copy_attr_times and unionfs_copy_attr_all

2008-02-16 Thread Erez Zadok

This reduces text size by about 6k.

Cc: Hugh Dickins <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/fanout.h |   50 --
 fs/unionfs/subr.c   |   50 ++
 fs/unionfs/union.h  |2 ++
 3 files changed, 52 insertions(+), 50 deletions(-)

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 4d9a45f..29d42fb 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -313,54 +313,4 @@ static inline void verify_locked(struct dentry *d)
BUG_ON(!mutex_is_locked(&UNIONFS_D(d)->lock));
 }
 
-/* copy a/m/ctime from the lower branch with the newest times */
-static inline void unionfs_copy_attr_times(struct inode *upper)
-{
-   int bindex;
-   struct inode *lower;
-
-   if (!upper || ibstart(upper) < 0)
-   return;
-   for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
-   lower = unionfs_lower_inode_idx(upper, bindex);
-   if (!lower)
-   continue; /* not all lower dir objects may exist */
-   if (unlikely(timespec_compare(&upper->i_mtime,
- &lower->i_mtime) < 0))
-   upper->i_mtime = lower->i_mtime;
-   if (unlikely(timespec_compare(&upper->i_ctime,
- &lower->i_ctime) < 0))
-   upper->i_ctime = lower->i_ctime;
-   if (unlikely(timespec_compare(&upper->i_atime,
- &lower->i_atime) < 0))
-   upper->i_atime = lower->i_atime;
-   }
-}
-
-/*
- * A unionfs/fanout version of fsstack_copy_attr_all.  Uses a
- * unionfs_get_nlinks to properly calcluate the number of links to a file.
- * Also, copies the max() of all a/m/ctimes for all lower inodes (which is
- * important if the lower inode is a directory type)
- */
-static inline void unionfs_copy_attr_all(struct inode *dest,
-const struct inode *src)
-{
-   dest->i_mode = src->i_mode;
-   dest->i_uid = src->i_uid;
-   dest->i_gid = src->i_gid;
-   dest->i_rdev = src->i_rdev;
-
-   unionfs_copy_attr_times(dest);
-
-   dest->i_blkbits = src->i_blkbits;
-   dest->i_flags = src->i_flags;
-
-   /*
-* Update the nlinks AFTER updating the above fields, because the
-* get_links callback may depend on them.
-*/
-   dest->i_nlink = unionfs_get_nlinks(dest);
-}
-
 #endif /* not _FANOUT_H */
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 0a0fce9..68a6280 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -240,3 +240,53 @@ char *alloc_whname(const char *name, int len)
 
return buf;
 }
+
+/* copy a/m/ctime from the lower branch with the newest times */
+void unionfs_copy_attr_times(struct inode *upper)
+{
+   int bindex;
+   struct inode *lower;
+
+   if (!upper || ibstart(upper) < 0)
+   return;
+   for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
+   lower = unionfs_lower_inode_idx(upper, bindex);
+   if (!lower)
+   continue; /* not all lower dir objects may exist */
+   if (unlikely(timespec_compare(&upper->i_mtime,
+ &lower->i_mtime) < 0))
+   upper->i_mtime = lower->i_mtime;
+   if (unlikely(timespec_compare(&upper->i_ctime,
+ &lower->i_ctime) < 0))
+   upper->i_ctime = lower->i_ctime;
+   if (unlikely(timespec_compare(&upper->i_atime,
+ &lower->i_atime) < 0))
+   upper->i_atime = lower->i_atime;
+   }
+}
+
+/*
+ * A unionfs/fanout version of fsstack_copy_attr_all.  Uses a
+ * unionfs_get_nlinks to properly calcluate the number of links to a file.
+ * Also, copies the max() of all a/m/ctimes for all lower inodes (which is
+ * important if the lower inode is a directory type)
+ */
+void unionfs_copy_attr_all(struct inode *dest,
+  const struct inode *src)
+{
+   dest->i_mode = src->i_mode;
+   dest->i_uid = src->i_uid;
+   dest->i_gid = src->i_gid;
+   dest->i_rdev = src->i_rdev;
+
+   unionfs_copy_attr_times(dest);
+
+   dest->i_blkbits = src->i_blkbits;
+   dest->i_flags = src->i_flags;
+
+   /*
+* Update the nlinks AFTER updating the above fields, because the
+* get_links callback may depend on them.
+*/
+   dest->i_nlink = unionfs_get_nlinks(d

[PATCH 02/17] Unionfs: ensure consistent lower inodes types

2008-02-16 Thread Erez Zadok

When looking up a lower object in multiple branches, especially for
directories, ignore any existing entries whose type is different than the
type of the first found object (otherwise we'll be trying to, say, call
readdir on a non-dir inode).

Signed-off-by: Himanshu Kanda <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/lookup.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index b9ee072..755158e 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -256,6 +256,19 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
continue;
}
 
+   /*
+* If we already found at least one positive dentry
+* (dentry_count is non-zero), then we skip all remaining
+* positive dentries if their type is a non-dir.  This is
+* because only directories are allowed to stack on multiple
+* branches, but we have to skip non-dirs (to avoid, say,
+* calling readdir on a regular file).
+*/
+   if (!S_ISDIR(lower_dentry->d_inode->i_mode) && dentry_count) {
+   dput(lower_dentry);
+   continue;
+   }
+
/* number of positive dentries */
dentry_count++;
 
-- 
1.5.2.2


[PATCH 07/17] Unionfs: follow_link locking fixes

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 8d939dc..6377533 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -820,7 +820,11 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
err = 0;
 
 out:
-   unionfs_check_dentry(dentry);
+   if (!err) {
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   }
unionfs_check_nd(nd);
return ERR_PTR(err);
 }
-- 
1.5.2.2


[PATCH 09/17] Unionfs: revalidation code cleanup and refactoring

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   55 --
 1 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index cd15243..afa2120 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -18,6 +18,39 @@
 
 #include "union.h"
 
+
+static inline void __dput_lowers(struct dentry *dentry, int start, int end)
+{
+   struct dentry *lower_dentry;
+   int bindex;
+
+   if (start < 0)
+   return;
+   for (bindex = start; bindex <= end; bindex++) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry)
+   continue;
+   unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+   dput(lower_dentry);
+   }
+}
+
+static inline void __iput_lowers(struct inode *inode, int start, int end)
+{
+   struct inode *lower_inode;
+   int bindex;
+
+   if (start < 0)
+   return;
+   for (bindex = start; bindex <= end; bindex++) {
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (!lower_inode)
+   continue;
+   unionfs_set_lower_inode_idx(inode, bindex, NULL);
+   iput(lower_inode);
+   }
+}
+
 /*
  * Revalidate a single dentry.
  * Assume that dentry's info node is locked.
@@ -72,15 +105,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
/* Free the pointers for our inodes and this dentry. */
bstart = dbstart(dentry);
bend = dbend(dentry);
-   if (bstart >= 0) {
-   struct dentry *lower_dentry;
-   for (bindex = bstart; bindex <= bend; bindex++) {
-   lower_dentry =
-   unionfs_lower_dentry_idx(dentry,
-bindex);
-   dput(lower_dentry);
-   }
-   }
+   __dput_lowers(dentry, bstart, bend);
set_dbstart(dentry, -1);
set_dbend(dentry, -1);
 
@@ -90,17 +115,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
 
bstart = ibstart(dentry->d_inode);
bend = ibend(dentry->d_inode);
-   if (bstart >= 0) {
-   struct inode *lower_inode;
-   for (bindex = bstart; bindex <= bend;
-bindex++) {
-   lower_inode =
-   unionfs_lower_inode_idx(
-   dentry->d_inode,
-   bindex);
-   iput(lower_inode);
-   }
-   }
+   __iput_lowers(dentry->d_inode, bstart, bend);
kfree(UNIONFS_I(dentry->d_inode)->lower_inodes);
UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
ibstart(dentry->d_inode) = -1;
-- 
1.5.2.2


[PATCH 01/17] Unionfs: grab lower super_block references

2008-02-16 Thread Erez Zadok

This prevents the lower super_block from being destroyed too early, when a
lower file system is being unmounted with MNT_FORCE or MNT_DETACH.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c  |3 +++
 fs/unionfs/super.c |   14 ++
 fs/unionfs/union.h |2 +-
 3 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 23c18f7..ba3471d 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -636,6 +636,7 @@ static int unionfs_read_super(struct super_block *sb, void *raw_data,
sbend(sb) = bend = lower_root_info->bend;
for (bindex = bstart; bindex <= bend; bindex++) {
struct dentry *d = lower_root_info->lower_paths[bindex].dentry;
+   atomic_inc(&d->d_sb->s_active);
unionfs_set_lower_super_idx(sb, bindex, d->d_sb);
}
 
@@ -711,6 +712,8 @@ out_dput:
dput(d);
/* initializing: can't use unionfs_mntput here */
mntput(m);
+   /* drop refs we took earlier */
+   atomic_dec(&d->d_sb->s_active);
}
kfree(lower_root_info->lower_paths);
kfree(lower_root_info);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 986c980..175840f 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -116,6 +116,14 @@ static void unionfs_put_super(struct super_block *sb)
}
BUG_ON(leaks != 0);
 
+   /* decrement lower super references */
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   struct super_block *s;
+   s = unionfs_lower_super_idx(sb, bindex);
+   unionfs_set_lower_super_idx(sb, bindex, NULL);
+   atomic_dec(&s->s_active);
+   }
+
kfree(spd->data);
kfree(spd);
sb->s_fs_info = NULL;
@@ -729,6 +737,12 @@ out_no_change:
 */
purge_sb_data(sb);
 
+   /* grab new lower super references; release old ones */
+   for (i = 0; i < new_branches; i++)
+   atomic_inc(&new_data[i].sb->s_active);
+   for (i = 0; i < new_branches; i++)
+   atomic_dec(&UNIONFS_SB(sb)->data[i].sb->s_active);
+
/* copy new vectors into their correct place */
tmp_data = UNIONFS_SB(sb)->data;
UNIONFS_SB(sb)->data = new_data;
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 4b4d6c9..786c4be 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -127,7 +127,7 @@ struct unionfs_dentry_info {
 
 /* These are the pointers to our various objects. */
 struct unionfs_data {
-   struct super_block *sb;
+   struct super_block *sb; /* lower super_block */
atomic_t open_files;/* number of open files on branch */
int branchperms;
int branch_id;  /* unique branch ID at re/mount time */
-- 
1.5.2.2
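
The remount hunk above takes the new references before releasing the old
ones.  A rough userspace sketch of that ordering follows; mock_sb and
swap_branch_refs() are hypothetical names, and the sketch assumes the old set
is walked by its own length.  The presumed benefit is that a super_block
present in both sets never transiently drops to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lower super_block; only the counter is modelled. */
struct mock_sb { int s_active; };

/*
 * Grab a reference on every branch in the new configuration first, then
 * release the references held through the old configuration.
 */
void swap_branch_refs(struct mock_sb **old_sbs, size_t old_n,
		      struct mock_sb **new_sbs, size_t new_n)
{
	size_t i;

	for (i = 0; i < new_n; i++)
		new_sbs[i]->s_active++;		/* grab new references */
	for (i = 0; i < old_n; i++)
		old_sbs[i]->s_active--;		/* release old references */
}
```

A branch that survives the remount ends with the same count it started with;
a removed branch loses one reference and an added branch gains one.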


[PATCH 03/17] Unionfs: document behavior when the lower topology changes

2008-02-16 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/concepts.txt |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/unionfs/concepts.txt b/Documentation/filesystems/unionfs/concepts.txt
index bed69bd..8d9a1c5 100644
--- a/Documentation/filesystems/unionfs/concepts.txt
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -210,4 +210,17 @@ there's a lot of concurrent activity on both the upper and lower objects,
 for the same file(s).  Lastly, this delayed time attribute detection is
 similar to how NFS clients operate (e.g., acregmin).
 
+Finally, there is no way currently in Linux to prevent lower directories
+from being moved around (i.e., topology changes); there's no way to prevent
+modifications to directory sub-trees of whole file systems which are mounted
+read-write.  It is therefore possible for in-flight operations in unionfs to
+take place, while a lower directory is being moved around.  Therefore, if
+you try to, say, create a new file in a directory through unionfs, while the
+directory is being moved around directly, then the new file may get created
+in the new location where that directory was moved to.  This is somewhat
+similar to the behaviour in NFS: an NFS client could be creating a new file
+while the NFS server is moving the directory around; the file will still be
+created successfully in the new location.  (The one exception in unionfs is
+that if the branch is marked read-only by unionfs, a copyup will take place.)
+
 For more information, see <http://unionfs.filesystems.org/>.
-- 
1.5.2.2


[GIT PULL -mm] 00/17 Unionfs updates/fixes/cleanups

2008-02-16 Thread Erez Zadok


The following is a series of patchsets related to Unionfs.  The most
significant changes are several fixes to races/locking.  This release also
supports newer APIs in 2.6.25-rc: not using iget/read_inode, using the
changed d_path, using the revised nameidata which embeds a struct path, and
using path_get/put.

These patches were tested (where appropriate) on 2.6.25, MM, as well as the
backports to 2.6.{24,23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs,
nfs2/3/4, jffs2, ramfs, tmpfs, cramfs, and squashfs (where available).  Also
tested with LTP-full and with a continuous parallel kernel compile (while
forcing cache flushing, manipulating lower branches, etc.).  See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Andrew Morton (1):
  Unionfs: embed a struct path into struct nameidata instead of nd->{dentry,mnt}

David Howells (1):
  Unionfs: stop using iget() and read_inode()

Erez Zadok (14):
  Unionfs: grab lower super_block references
  Unionfs: ensure consistent lower inodes types
  Unionfs: document behavior when the lower topology changes
  Unionfs: uninline unionfs_copy_attr_times and unionfs_copy_attr_all
  Unionfs: initialize path_save variable
  Unionfs: extend dentry branch configuration lock in open
  Unionfs: follow_link locking fixes
  Unionfs: improve debugging in copy_attr_times
  Unionfs: revalidation code cleanup and refactoring
  Unionfs: factor out revalidation routine
  Unionfs: lock parents' branch configuration fixes
  Unionfs: branch management/configuration fixes
  Unionfs: use dget_parent in revalidation code
  VFS/Unionfs: use generic path_get/path_put functions

Jan Blunck (1):
  Unionfs: use the new path_put

 Documentation/filesystems/unionfs/concepts.txt |   13 +
 fs/unionfs/commonfops.c|   54 ---
 fs/unionfs/copyup.c|3 
 fs/unionfs/dentry.c|  182 ++---
 fs/unionfs/fanout.h|   50 --
 fs/unionfs/inode.c |   78 +++---
 fs/unionfs/lookup.c|   13 +
 fs/unionfs/main.c  |   26 +--
 fs/unionfs/mmap.c  |   17 --
 fs/unionfs/rename.c|7 
 fs/unionfs/subr.c  |   56 +++
 fs/unionfs/super.c |   72 ++---
 fs/unionfs/union.h |   13 +
 fs/unionfs/unlink.c|   11 +
 include/linux/namei.h  |   12 -
 15 files changed, 366 insertions(+), 241 deletions(-)

Thanks.
---
Erez Zadok
[EMAIL PROTECTED]

jffs2 console printk storm

2008-02-14 Thread Erez Zadok

Hi David,

This has been a problem I've seen for a while.  I've generated a jffs2 image
of an empty directory (I don't recall the version of the jffs2 utils I've
used to generate it).  I mount the jffs2 image something like this:

# cp jffs2-empty.img /tmp/fs.0
# losetup /dev/loop0 /tmp/fs.0
# modprobe block2mtd block2mtd=/dev/loop0,128ki
# mount -t jffs2 mtd0 /n/lower/b0

Then I start running my tests in it (in /n/lower/b0).  I get a lot of these
two kinds of console printk messages (esp. the first one):

  CLEANMARKER node found at 0x0081, not first node in block (0x0080)
  Empty flash at 0x0080ff70 ends at 0x0081

I don't know how serious the messages are, but jffs2 seems to work fine for
me even with those many kernel messages scrolling off of my screen.  I could
not find a way to turn off these warnings: jffs2's debugging config options
allow me to turn on more messages, but not to turn off these.

Is there a way to prevent these messages from showing up?  Are they serious?
Is my jffs2 image bad/corrupt?  If so, how should I fix it?

In the meantime, I've been using the small patch below to turn the above
two messages into jffs2 debugging messages.
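
For readers unfamiliar with this kind of wrapper, here is a userspace sketch
of a compile-time debug macro.  SKETCH_D1 and SKETCH_DEBUG_LEVEL are
hypothetical names; the sketch assumes the real D1() behaves similarly
(expands to its argument when debugging is enabled, to nothing otherwise).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical knob standing in for a kernel debugging config option. */
#ifndef SKETCH_DEBUG_LEVEL
#define SKETCH_DEBUG_LEVEL 0
#endif

#if SKETCH_DEBUG_LEVEL > 0
#define SKETCH_D1(x) x		/* debugging on: keep the statement */
#else
#define SKETCH_D1(x)		/* debugging off: statement compiles away */
#endif

int messages_emitted;		/* counts messages that were really printed */

/* Returns the size of the empty region; reports it only when debugging. */
int scan_block(unsigned int empty_start, unsigned int ofs)
{
	SKETCH_D1(printf("Empty flash at 0x%08x ends at 0x%08x\n",
			 empty_start, ofs));
	SKETCH_D1(messages_emitted++);
	return (int)(ofs - empty_start);
}
```

Built with the default level of 0, scan_block() prints nothing; building with
-DSKETCH_DEBUG_LEVEL=1 brings the message back without touching the caller.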

Thanks,
Erez.


diff --git a/fs/jffs2/scan.c b/fs/jffs2/scan.c
index 272872d..16f428c 100644
--- a/fs/jffs2/scan.c
+++ b/fs/jffs2/scan.c
@@ -646,8 +646,8 @@ scan_more:
inbuf_ofs = ofs - buf_ofs;
while (inbuf_ofs < scan_end) {
if (unlikely(*(uint32_t *)(&buf[inbuf_ofs]) != 0x)) {
-   printk(KERN_WARNING "Empty flash at 0x%08x ends at 0x%08x\n",
-  empty_start, ofs);
+   D1(printk(KERN_WARNING "Empty flash at 0x%08x ends at 0x%08x\n",
+ empty_start, ofs));
if ((err = jffs2_scan_dirty_space(c, jeb, ofs-empty_start)))
return err;
goto scan_more;
@@ -840,7 +840,7 @@ scan_more:
return err;
ofs += PAD(sizeof(struct jffs2_unknown_node));
} else if (jeb->first_node) {
-   printk(KERN_NOTICE "CLEANMARKER node found at 0x%08x, not first node in block (0x%08x)\n", ofs, jeb->offset);
+   D1(printk(KERN_NOTICE "CLEANMARKER node found at 0x%08x, not first node in block (0x%08x)\n", ofs, jeb->offset));
if ((err = jffs2_scan_dirty_space(c, jeb, PAD(sizeof(struct jffs2_unknown_node)
return err;
ofs += PAD(sizeof(struct jffs2_unknown_node));

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-02-02 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Al Viro writes:
> On Sat, Jan 26, 2008 at 12:08:30AM -0500, Erez Zadok wrote:

[concerns about lower directories moving around...]

> You are thinking about non-interesting case.  _Files_ are not much
> of a problem.  Directory tree is.  The real problems with all unionfs and
> stacking implementations I've seen so far, all way back to Heidemann et.al.
> start when topology of the underlying layer changes.

OK, so if I understand you, your concerns center around the fact that lower
directories can be moved around (i.e., topology changes), what happens then
for operations that go through the stackable f/s, and what should users
expect to see.

> If you have clear
> semantics for unionfs behaviour in presence of such things, by all means,
> publish it - as far as I know *nobody* had done that; not even on the
> "what should we see when..." level, nevermind the implementation.

Since stacking and NFS have some similarities, I first checked w/ the NFS
people to see what their semantics are in a similar scenario: an NFS client
could be validating a directory, then issue, say, a ->create; but in
between, the server could have moved the directory that was validated.  In
NFS, the ->create operation succeeds, and creates the file in the new
location of the directory which was validated.

Unionfs's behavior is similar: the newly created file will be successfully
created in the moved directory.  The only exception is that if a lower
branch is marked readonly by unionfs, a copyup will take place.

This has not been a problem for unionfs users to date.  The main reason is
that when unionfs users modify lower files, they often do so while there's
little to no activity going through the union itself.  And while it doesn't
prevent directories from being moved around, this common usage mode does
reduce the frequency with which topology changes can be an issue for unionfs
users.

I'll submit a patch to document this behavior.

> > Perhaps this general topic is a good one to discuss at more length at LSF?
> > Suggestions are welcome.
> 
> It would; I honestly do not know if the problem is solvable with the
> (lack of) constraints you apparently want.  Again, the real PITA begins
> when you start dealing with pieces of underlying trees getting moved
> around, changing parents, etc.  Cross-directory rename() certainly rates
> very high on the list of "WTF had they been smoking in UCB?" misfeatures,
> but it's there and it has to be dealt with.

Well, it was UCB, UCLA, and Sun.  I don't think back in the early 90s they
were too concerned about topology changes; f/s stacking was a new idea and
they wanted to explore what can be done with it conceptually, not produce
commercial-grade software (still, remind me to tell you the first-hand story
I've learned of how full-blown stacking _almost_ made it into Solaris
2.0 :-)

The only known reference to try and address this coherency problem was
Heidemann's SOSP'95 paper titled "Performance of cache coherence in
stackable filing."  The paper advocated changing the whole VFS and the
caches (page cache + dnlc) to create a "unified cache manager" that was
aware of complex layering topologies (including fan-out and fan-in).  It was
even able to handle compression layers, where file data offsets change b/t
the layers (a nasty problem).  Code for this unified cache manager was never
released AFAIK.  I think Heidemann's approach was elegant, but I don't think
it was practical as it required radical VFS/VM surgery.  Ironically, MS
Windows has a single I/O cache manager that all storage and filesystem
modules talk to directly (they're not allowed to pass IRPs directly b/t
layers): so Windows can handle such coherency better than most Unix systems
can today.

I've always thought of a different way to allow users to write to lower
branches -- through the union.  This is similar to what an old AT&T
unioning-like file system named "3DFS" did.  3DFS introduced a new directory
called "..." so if you cd to /mntpt/... then you got to the next level down
the stack (as if you popped the top one and now you see what the union looks
like without the top layer).  And if you "cd /mntpt/.../..." then you see
the view without the top two layers, etc.

So my idea is similar: to introduce virtual directory views that restrict
access to a single lower branch within the union.  So if someone does a "cd
/mnt/unionfs/.1" then they get access to branch 1; "cd /mnt/unionfs/.2" gets
access to branch 2; etc.  While this technique will waste a few names, it's
probably worth the savings in terms of cache-coherency pains (plus, the
actual virtual directory names can be configurable at mount time to allow
users to choose a non-conflicting dir name).  W

Re: unionfs_copy_attr_times oopses

2008-02-01 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:
> Hi Erez,
> 
> Aside from the occasional "unionfs: new lower inode mtime" messages
> on directories (which I've got into the habit of ignoring now), the
> only problem I'm still suffering with unionfs over tmpfs (not tested
> any other fs's below it recently) is oops in unionfs_copy_attr_times.
[...]

Thanks for the report and patch(es).  I'll start looking at this over the
weekend.  Could you do me a favor and open a bugzilla report
(https://bugzilla.filesystems.org/), upload your patches, etc.?  It'd be
easier for us to exchange info/patches and track the problem that way.

Thanks,
Erez.

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-25 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Al Viro writes:
> After grep for locking-related things:
> 
>   * lock_parent(): who said that you won't get dentry moved
> before managing to grab i_mutex on parent?  While we are at it,
> who said that you won't get dentry moved between fetching d_parent
> and doing dget()?  In that case parent could've been _freed_ before
> you get to dget().

OK, so looks like I should use dget_parent() in my lock_parent(), as I've
done elsewhere.  I'll also take a look at all instances in which I get
dentry->d_parent and see if a d_lock is needed there.

>   * in create_parents():
> +   struct inode *inode = lower_dentry->d_inode;
> +   /*
> +* If we get here, it means that we created a new
> +* dentry+inode, but copying permissions failed.
> +* Therefore, we should delete this inode and dput
> +* the dentry so as not to leave cruft behind.
> +*/
> +   if (lower_dentry->d_op && lower_dentry->d_op->d_iput)
> +   lower_dentry->d_op->d_iput(lower_dentry,
> +  inode);
> +   else
> +   iput(inode);
> +   lower_dentry->d_inode = NULL;
> +   dput(lower_dentry);
> +   lower_dentry = ERR_PTR(err);
> +   goto out;
> Really?  So what happens if it had become positive after your test and
> somebody had looked it up in lower layer and just now happens to be
> in the middle of operations on it?  Will be thucking frilled by that...

Good catch.  That ->d_iput call was an old fix to a bug that has since been
fixed more cleanly and generically in our copyup_permission routine and our
unionfs_d_iput.  I've removed the above ->d_iput "if" and tested to verify
that it's indeed unnecessary.

>   * __unionfs_rename():
> +   lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
> +   err = vfs_rename(lower_old_dir_dentry->d_inode, lower_old_dentry,
> +lower_new_dir_dentry->d_inode, lower_new_dentry);
> +   unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
> 
> Uh-huh...  To start with, what guarantees that your lower_old_dentry
> is still a child of your lower_old_dir_dentry?

We dget/dget_parent the old/new dentry and parents a few lines above
(actually, it looked like I forgot to dget(lower_new_dentry) -- fixed).
This is a generic stackable f/s issue: ecryptfs does the same stuff before
calling vfs_rename() on the lower objects.

> What's more, you are
> not checking the result of lock_rename(), i.e. asking for serious trouble.

OK.  I'm now checking for the return from lock_rename for ancestor/rename
rules.  I'm CC'ing Mike Halcrow so he can do the same for ecryptfs.

>   * revalidation stuff: err...  how the devil can it work for
> directories, when there's nothing to prevent changes in underlying
> layers between ->d_revalidate() and operation itself?  For the upper
> layer (unionfs itself) everything's more or less fine, but the rest
> of that...

In a stacked f/s, we keep references to the lower dentries/inodes, so they
can't disappear on us (that happens in our interpose function, called from
our ->lookup).  On entry to every f/s method in unionfs, we first perform
lightweight revalidation of our dentry against the lower ones: we check if
m/ctime changed (users modifying lower files) or if the generation# b/t our
super and our dentries has changed (branch-management took place); if
needed, then we perform a full revalidation of all lower objects (while
holding a lock on the branch configuration).  If we have to do a full reval
upon entry to our ->op, and the reval failed, then we return an appropriate
error; o/w we proceed.  (In certain cases, the VFS re-issues a lookup if the
f/s says that its dentry is invalid.)
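
The lightweight check described above can be sketched roughly as follows.
This is a userspace illustration only: the mock_* types and
needs_full_reval() are hypothetical, and the sketch assumes the upper object
caches the lower times and the generation number seen at its last
revalidation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-ins for the real objects. */
struct mock_ts { long sec; long nsec; };

struct mock_times { struct mock_ts mtime_ts, ctime_ts; };

struct mock_upper {
	struct mock_times cached;	/* lower times seen at last revalidation */
	unsigned int generation;	/* generation seen at last revalidation */
};

/* Three-way comparison of two timestamps: <0, 0 or >0. */
static int ts_cmp(const struct mock_ts *a, const struct mock_ts *b)
{
	if (a->sec != b->sec)
		return a->sec < b->sec ? -1 : 1;
	if (a->nsec != b->nsec)
		return a->nsec < b->nsec ? -1 : 1;
	return 0;
}

/* Returns true when the cheap check says a full revalidation is needed. */
bool needs_full_reval(const struct mock_upper *up,
		      const struct mock_times *lower,
		      unsigned int sb_generation)
{
	if (up->generation != sb_generation)
		return true;	/* branch management took place */
	if (ts_cmp(&up->cached.mtime_ts, &lower->mtime_ts) < 0 ||
	    ts_cmp(&up->cached.ctime_ts, &lower->ctime_ts) < 0)
		return true;	/* lower object was modified directly */
	return false;		/* cheap path: nothing changed */
}
```

Only when this returns true would the expensive walk over all lower objects
be needed; the common case stays a few comparisons.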

Without changes to the VFS, I don't see how else I can ensure cache
coherency cleanly, while allowing users to modify lower files; this feature
is very useful to some unionfs users, who depend on it (so even if I could
"lock out" the lower directories from being modified, there will be users
who'd still want to be able to modify lower files).

BTW, my sense of the relationship b/t upper and lower objects and their
validity in a stackable f/s, is that it's similar to the relationship b/t
the NFS client and server -- the client can't be sure that a file on the
server doesn't change b/t ->revalidate and ->op (hence nfs's reliance on dir
mtime checks).

Perhaps this general topic is a good one to discuss at more length at LSF?
Suggestions are welcome.

Thanks,
Erez.

[PATCH 4/4] Unionfs: lock_rename related locking fixes

2008-01-25 Thread Erez Zadok

CC: Mike Halcrow <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/rename.c |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 9306a2b..5ab13f9 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -29,6 +29,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct dentry *lower_new_dir_dentry;
 	struct dentry *lower_wh_dentry;
 	struct dentry *lower_wh_dir_dentry;
+	struct dentry *trap;
 	char *wh_name = NULL;
 
 	lower_new_dentry = unionfs_lower_dentry_idx(new_dentry, bindex);
@@ -95,6 +96,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		goto out;
 
 	dget(lower_old_dentry);
+	dget(lower_new_dentry);
 	lower_old_dir_dentry = dget_parent(lower_old_dentry);
 	lower_new_dir_dentry = dget_parent(lower_new_dentry);
 
@@ -122,9 +124,20 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 
 	/* see Documentation/filesystems/unionfs/issues.txt */
 	lockdep_off();
-	lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
+	trap = lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
+	/* source should not be ancestor of target */
+	if (trap == lower_old_dentry) {
+		err = -EINVAL;
+		goto out_err_unlock;
+	}
+	/* target should not be ancestor of source */
+	if (trap == lower_new_dentry) {
+		err = -ENOTEMPTY;
+		goto out_err_unlock;
+	}
 	err = vfs_rename(lower_old_dir_dentry->d_inode, lower_old_dentry,
 			 lower_new_dir_dentry->d_inode, lower_new_dentry);
+out_err_unlock:
 	unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
 	lockdep_on();
 
@@ -132,6 +145,7 @@ out_dput:
 	dput(lower_old_dir_dentry);
 	dput(lower_new_dir_dentry);
 	dput(lower_old_dentry);
+	dput(lower_new_dentry);
 
 out:
 	if (!err) {
-- 
1.5.2.2


[PATCH 3/4] Unionfs: d_parent related locking fixes

2008-01-25 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/copyup.c |3 +--
 fs/unionfs/union.h  |4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 8663224..9beac01 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -716,8 +716,7 @@ struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
 		child_dentry = parent_dentry;
 
 		/* find the parent directory dentry in unionfs */
-		parent_dentry = child_dentry->d_parent;
-		dget(parent_dentry);
+		parent_dentry = dget_parent(child_dentry);
 
 		/* find out the lower_parent_dentry in the given branch */
 		lower_parent_dentry =
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index d324f83..4b4d6c9 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -487,13 +487,13 @@ extern int parse_branch_mode(const char *name, int *perms);
 /* locking helpers */
 static inline struct dentry *lock_parent(struct dentry *dentry)
 {
-	struct dentry *dir = dget(dentry->d_parent);
+	struct dentry *dir = dget_parent(dentry);
 	mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
 	return dir;
 }
 static inline struct dentry *lock_parent_wh(struct dentry *dentry)
 {
-	struct dentry *dir = dget(dentry->d_parent);
+	struct dentry *dir = dget_parent(dentry);
 
 	mutex_lock_nested(&dir->d_inode->i_mutex, UNIONFS_DMUTEX_WHITEOUT);
 	return dir;
-- 
1.5.2.2


[PATCH 2/4] Unionfs: remove unnecessary call to d_iput

2008-01-25 Thread Erez Zadok

This old code was to fix a bug which has long since been fixed in our
copyup_permission and unionfs_d_iput.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/copyup.c |   13 -
 1 files changed, 0 insertions(+), 13 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 16b2c7c..8663224 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -807,19 +807,6 @@ begin:
 lower_dentry);
 		unlock_dir(lower_parent_dentry);
 		if (err) {
-			struct inode *inode = lower_dentry->d_inode;
-			/*
-			 * If we get here, it means that we created a new
-			 * dentry+inode, but copying permissions failed.
-			 * Therefore, we should delete this inode and dput
-			 * the dentry so as not to leave cruft behind.
-			 */
-			if (lower_dentry->d_op && lower_dentry->d_op->d_iput)
-				lower_dentry->d_op->d_iput(lower_dentry,
-							   inode);
-			else
-				iput(inode);
-			lower_dentry->d_inode = NULL;
 			dput(lower_dentry);
 			lower_dentry = ERR_PTR(err);
 			goto out;
-- 
1.5.2.2


[PATCH 1/4] Unionfs: use first writable branch (fix/cleanup)

2008-01-25 Thread Erez Zadok

Cleanup code in ->create, ->symlink, and ->mknod: refactor common code into
helper functions.  Also, this allows writing to multiple branches again,
which was broken by an earlier patch.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |  395 +---
 1 files changed, 156 insertions(+), 239 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index e15ddb9..0b92da2 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -18,14 +18,159 @@
 
 #include "union.h"
 
+/*
+ * Helper function when creating new objects (create, symlink, and mknod).
+ * Checks to see if there's a whiteout in @lower_dentry's parent directory,
+ * whose name is taken from @dentry.  Then tries to remove that whiteout, if
+ * found.
+ *
+ * Return 0 if no whiteout was found, or if one was found and successfully
+ * removed (a zero tells the caller that @lower_dentry belongs to a good
+ * branch to create the new object in).  Return -ERRNO if an error occurred
+ * during whiteout lookup or in trying to unlink the whiteout.
+ */
+static int check_for_whiteout(struct dentry *dentry,
+ struct dentry *lower_dentry)
+{
+   int err = 0;
+   struct dentry *wh_dentry = NULL;
+   struct dentry *lower_dir_dentry;
+   char *name = NULL;
+
+   /*
+* check if whiteout exists in this branch, i.e. lookup .wh.foo
+* first.
+*/
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
+
+   wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(wh_dentry)) {
+   err = PTR_ERR(wh_dentry);
+   wh_dentry = NULL;
+   goto out;
+   }
+
+   if (!wh_dentry->d_inode) /* no whiteout exists */
+   goto out;
+
+   /* .wh.foo has been found, so let's unlink it */
+   lower_dir_dentry = lock_parent_wh(wh_dentry);
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
+   err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+   lockdep_on();
+   unlock_dir(lower_dir_dentry);
+
+   /*
+* Whiteouts are special files and should be deleted no matter what
+* (as if they never existed), in order to allow this create
+* operation to succeed.  This is especially important in sticky
+* directories: a whiteout may have been created by one user, but
+* the newly created file may be created by another user.
+* Therefore, in order to maintain Unix semantics, if the vfs_unlink
+* above failed, then we have to try to directly unlink the
+* whiteout.  Note: in the ODF version of unionfs, whiteouts are
+* handled much more cleanly.
+*/
+   if (err == -EPERM) {
+   struct inode *inode = lower_dir_dentry->d_inode;
+   err = inode->i_op->unlink(inode, wh_dentry);
+   }
+   if (err)
+   printk(KERN_ERR "unionfs: could not "
+  "unlink whiteout, err = %d\n", err);
+
+out:
+   dput(wh_dentry);
+   kfree(name);
+   return err;
+}
+
+/*
+ * Find a writeable branch to create new object in.  Checks all writeable
+ * branches of the parent inode, from istart to iend order; if none are
+ * suitable, also tries branch 0 (which may require a copyup).
+ *
+ * Return a lower_dentry we can use to create object in, or ERR_PTR.
+ */
+static struct dentry *find_writeable_branch(struct inode *parent,
+   struct dentry *dentry)
+{
+   int err = -EINVAL;
+   int bindex, istart, iend;
+   struct dentry *lower_dentry = NULL;
+
+   istart = ibstart(parent);
+   iend = ibend(parent);
+   if (istart < 0)
+   goto out;
+
+begin:
+   for (bindex = istart; bindex <= iend; bindex++) {
+   /* skip non-writeable branches */
+   err = is_robranch_super(dentry->d_sb, bindex);
+   if (err) {
+   err = -EROFS;
+   continue;
+   }
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry)
+   continue;
+   /*
+* check for whiteouts in writeable branch, and remove them
+* if necessary.
+*/
+   err = check_for_whiteout(dentry, lower_dentry);
+   if (err)
+   continue;
+   }
+   /*
+* If istart wasn't already branch 0, and we got any error, then try
+* branch 0 (which may require copyup)
+*/
+   if (err && ista

[GIT PULL -mm] 0/4 Unionfs updates/fixes/cleanups

2008-01-25 Thread Erez Zadok


The following is a series of patchsets related to Unionfs.  This is the
fifth set of patchsets resulting from an lkml review of the entire unionfs
code base, in preparation for a merge into mainline.  The most significant
changes here are a few locking related fixes, and a correction to broken
logic which didn't allow writing to the first available writable branch.
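
For illustration, the branch-selection rule that the first patch restores
can be modeled roughly as below.  This is a simplified userspace sketch, not
the code in fs/unionfs/inode.c: struct branch and first_writable_branch are
hypothetical names, whiteout handling is left out, and a "fallback to branch
0" here just stands for the case that may require a copyup.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one branch; not a unionfs structure. */
struct branch {
	bool read_only;		/* branch is mounted/marked read-only */
	bool has_lower;		/* a lower object exists in this branch */
};

/*
 * Scan branches start..end (leftmost has highest priority) and return the
 * index of the first writable branch that already has a lower object.
 * If none qualifies and start was not 0, fall back to branch 0 when it is
 * writable.  Returns -EROFS when nothing is usable.
 */
static int first_writable_branch(const struct branch *b, int start, int end)
{
	int i;

	assert(b != NULL && start >= 0 && start <= end);

	for (i = start; i <= end; i++) {
		if (b[i].read_only)
			continue;	/* skip non-writable branches */
		if (!b[i].has_lower)
			continue;	/* nothing to create under here */
		return i;
	}

	/* last resort: branch 0 (caller may have to copy up parents) */
	if (start != 0 && !b[0].read_only)
		return 0;

	return -EROFS;
}
```

The scan goes left to right, so a writable branch later in the list is still
used when earlier ones are read-only, which is the "writing to multiple
branches" behavior mentioned in the patch description.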

These patches were tested (where appropriate) on 2.6.24, MM, as well as the
backports to 2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4,
jffs2, ramfs, tmpfs, cramfs, and squashfs (where available).  Also tested
with LTP-full and with a continuous parallel kernel compile (while forcing
cache flushing, manipulating lower branches, etc.).  See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (4):
  Unionfs: use first writable branch (fix/cleanup)
  Unionfs: remove unnecessary call to d_iput
  Unionfs: d_parent related locking fixes
  Unionfs: lock_rename related locking fixes

 copyup.c |   16 --
 inode.c  |  395 ---
 rename.c |   16 ++
 union.h  |4 
 4 files changed, 174 insertions(+), 257 deletions(-)

---
Erez Zadok
[EMAIL PROTECTED]

Re: [patch 01/26] mount options: add documentation

2008-01-24 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Miklos Szeredi writes:
> From: Miklos Szeredi <[EMAIL PROTECTED]>
> 
> This series addresses the problem of showing mount options in
> /proc/mounts.
> 
> Several filesystems which use mount options, have not implemented a
> .show_options superblock operation.  Several others have implemented
> this callback, but have not kept it fully up to date with the parsed
> options.
[...]

> The following filesystems still need fixing: CIFS, NFS, XFS, Unionfs,
> Reiser4.  For CIFS, NFS and XFS I wasn't able to understand how some
> of the options are used.  The last two are not yet in mainline, so I
> leave fixing those to their respective maintainers out of pure
> laziness.
> 
> Table displaying status of all in-kernel filesystems:
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> legend:
> 
>   none - fs has options, but doesn't define ->show_options()
>   some - fs defines ->show_options(), but only some options are shown
>   most - fs defines ->show_options(), and shows most of them
>   good - fs shows all options
>   noopt - fs does not have options
>   patch - a patch will be posted
[...]

> in -mm:
> 
> reiser4 some
> unionfs none

Hi Miklos,

Where did you check for the existence of a ->show_options method for
unionfs?  Unionfs does implement ->show_options and supports all of the
mount/remount options.  See:

The unionfs ->remount code supports branch-management options which can
add/del/change a branch, but we don't show those directly in ->show_options;
it makes more sense to show the final (and thus most current) branch
configuration.
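
As a rough illustration of "show the final branch configuration", here's a
small userspace sketch in C.  The output format (a single dirs= option with
colon-separated path=mode entries) and the names branch_cfg and
format_branches are assumptions made for this example; the in-kernel code
would emit through the seq_file interface rather than snprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one branch as currently configured. */
struct branch_cfg {
	const char *path;
	int writable;
};

/*
 * Format the current branch list as "dirs=/a=rw:/b=ro".
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_branches(char *buf, size_t len,
			   const struct branch_cfg *b, int n)
{
	size_t used;
	int i, ret;

	assert(buf != NULL && b != NULL && n >= 0);

	ret = snprintf(buf, len, "dirs=");
	if (ret < 0 || (size_t)ret >= len)
		return -1;
	used = (size_t)ret;

	for (i = 0; i < n; i++) {
		ret = snprintf(buf + used, len - used, "%s%s=%s",
			       i ? ":" : "", b[i].path,
			       b[i].writable ? "rw" : "ro");
		if (ret < 0 || (size_t)ret >= len - used)
			return -1;
		used += (size_t)ret;
	}
	return (int)used;
}
```

Because the string is built from the live branch array, any add/del/mode
change done via remount shows up automatically, without having to replay
the individual remount options.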

Could you update your records please?

BTW, I should be able to use your save_mount_options().

Cheers,
Erez.

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Al Viro writes:
> After grep for locking-related things:
[...]

Thanks.  I'll start looking at these issues asap.

Erez.

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Michael Halcrow writes:
> On Thu, Jan 10, 2008 at 10:57:46AM -0500, Erez Zadok wrote:
[...]
> Would the inclusion of Unionfs in mainline really slow down or damage
> the union mount effort? If not, then I think the pragmatic approach
> would be to make it available in mainline for all of the users who are
> already successfully running it today. We can then focus future
> efforts on the VFS-level modifications that address the remaining
> issues, limiting Unionfs in the future to only those problems that are
> best solved in a stacked filesystem layer.

Mike, this is indeed the pragmatic approach I've advocated: as the VFS gains
more unioning-related functionality, I could easily make use of it in
unionfs, thus shrinking the code base in unionfs (while keeping the
user API unchanged).  In the end, what'll be left over is probably a smaller
standalone file system that offers the kind of features that aren't likely
to show up at the VFS level (e.g., a persistent cache of unified dir
contents, persistent inode numbers, whiteouts that work with any "obscure"
filesystem, and such).

> Mike

Cheers,
Erez.

Re: [UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-10 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Christoph Hellwig writes:
> On Thu, Jan 10, 2008 at 09:59:19AM -0500, Erez Zadok wrote:
> > 
> > Dear Linus, Al, Christoph, and Andrew,
> > 
> > As per your request, I'm posting for review the unionfs code (and related
> > code) that's in my korg tree against mainline (v2.6.24-rc7-71-gfd0b45d).
> > This is in preparation for merge in 2.6.25.
> 
> Huh?  There's still absolutely no fix to the underlying problems of
> the whole idea.   I think we made it pretty clear that unionfs is not
> the way to go, and that we'll get the union mount patches clear once
> the per-mountpoint r/o and unprivileged mount patches series are in
> and stable.

I'll reiterate what I've said before: unionfs is used today by many users,
it works, and is stable.  After years of working with unionfs, we've settled
on a set of features that users actually use.  This functionality can be in
mainline today.

Unioning at the VFS level will take a long time to reach the same level of
maturity and support the same set of features.  Based on my years of
practical experience with it, unioning directories seems like a simple idea,
but in practice it's quite hard no matter the approach taken to implement
it.

Existing users of unioning aren't likely to switch to Union Mounts unless it
supports the same set of features.  How long will it realistically take to
get whiteout support in every lower file system that's used by Unionfs
users?  How will Union Mounts support persistent inode numbers at the VFS
level?  Those are just a few of the questions.

I think a better approach would be to start with Unionfs (a standalone file
system that doesn't touch the rest of the kernel).  And as Linux gradually
starts supporting more and more features that help unioning/stacking in
general, to change Unionfs to use those features (e.g., native whiteout
support).  Eventually there could be basic unioning support at the VFS
level, and concurrently a file-system which offers the extra features (e.g.,
persistency).  This can be done w/o affecting user-visible APIs.

Cheers,
Erez.

[PATCH 20/29] Unionfs: super_block operations

2008-01-10 Thread Erez Zadok

Includes read_inode, delete_inode, put_super, statfs, remount_fs (which
supports branch-management ops), clear_inode, alloc_inode, destroy_inode,
write_inode, umount_begin, and show_options.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/super.c | 1025 
 1 files changed, 1025 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/super.c

diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
new file mode 100644
index 000..986c980
--- /dev/null
+++ b/fs/unionfs/super.c
@@ -0,0 +1,1025 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * The inode cache is used with alloc_inode for both our inode info and the
+ * vfs inode.
+ */
+static struct kmem_cache *unionfs_inode_cachep;
+
+static void unionfs_read_inode(struct inode *inode)
+{
+   int size;
+   struct unionfs_inode_info *info = UNIONFS_I(inode);
+
+   memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
+   info->bstart = -1;
+   info->bend = -1;
+   atomic_set(&info->generation,
+  atomic_read(&UNIONFS_SB(inode->i_sb)->generation));
+   spin_lock_init(&info->rdlock);
+   info->rdcount = 1;
+   info->hashsize = -1;
+   INIT_LIST_HEAD(&info->readdircache);
+
+   size = sbmax(inode->i_sb) * sizeof(struct inode *);
+   info->lower_inodes = kzalloc(size, GFP_KERNEL);
+   if (unlikely(!info->lower_inodes)) {
+   printk(KERN_CRIT "unionfs: no kernel memory when allocating "
+  "lower-pointer array!\n");
+   BUG();
+   }
+
+   inode->i_version++;
+   inode->i_op = &unionfs_main_iops;
+   inode->i_fop = &unionfs_main_fops;
+
+   inode->i_mapping->a_ops = &unionfs_aops;
+
+   /*
+* reset times so unionfs_copy_attr_all can keep our time invariants
+* right (upper inode time being the max of all lower ones).
+*/
+   inode->i_atime.tv_sec = inode->i_atime.tv_nsec = 0;
+   inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
+   inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
+
+}
+
+/*
+ * we now define delete_inode, because there are two VFS paths that may
+ * destroy an inode: one of them calls clear inode before doing everything
+ * else that's needed, and the other is fine.  This way we truncate the inode
+ * size (and its pages) and then clear our own inode, which will do an iput
+ * on our and the lower inode.
+ *
+ * No need to lock sb info's rwsem.
+ */
+static void unionfs_delete_inode(struct inode *inode)
+{
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_lock(&inode->i_lock);
+#endif
+   i_size_write(inode, 0); /* every f/s seems to do that */
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_unlock(&inode->i_lock);
+#endif
+
+   if (inode->i_data.nrpages)
+   truncate_inode_pages(&inode->i_data, 0);
+
+   clear_inode(inode);
+}
+
+/*
+ * final actions when unmounting a file system
+ *
+ * No need to lock rwsem.
+ */
+static void unionfs_put_super(struct super_block *sb)
+{
+   int bindex, bstart, bend;
+   struct unionfs_sb_info *spd;
+   int leaks = 0;
+
+   spd = UNIONFS_SB(sb);
+   if (!spd)
+   return;
+
+   bstart = sbstart(sb);
+   bend = sbend(sb);
+
+   /* Make sure we have no leaks of branchget/branchput. */
+   for (bindex = bstart; bindex <= bend; bindex++)
+   if (unlikely(branch_count(sb, bindex) != 0)) {
+   printk(KERN_CRIT
+  "unionfs: branch %d has %d references left!\n",
+  bindex, branch_count(sb, bindex));
+   leaks = 1;
+   }
+   BUG_ON(leaks != 0);
+
+   kfree(spd->data);
+   kfree(spd);
+   sb->s_fs_info = NULL;
+}
+
+/*
+ * Since people use this to answer the "How big of a file can I write?"
+ * question, we report the size of the highest priority branch as the size of
+ * the union.
+ */
+static int unionfs_sta

[PATCH 11/29] Unionfs: lower-level lookup routines

2008-01-10 Thread Erez Zadok

Includes lower nameidata support routines.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/lookup.c |  652 +++
 1 files changed, 652 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/lookup.c

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
new file mode 100644
index 000..b9ee072
--- /dev/null
+++ b/fs/unionfs/lookup.c
@@ -0,0 +1,652 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int realloc_dentry_private_data(struct dentry *dentry);
+
+/* is the filename valid == !(whiteout for a file or opaque dir marker) */
+static int is_validname(const char *name)
+{
+   if (!strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN))
+   return 0;
+   if (!strncmp(name, UNIONFS_DIR_OPAQUE_NAME,
+sizeof(UNIONFS_DIR_OPAQUE_NAME) - 1))
+   return 0;
+   return 1;
+}
+
+/* The rest of these are utility functions for lookup. */
+static noinline int is_opaque_dir(struct dentry *dentry, int bindex)
+{
+   int err = 0;
+   struct dentry *lower_dentry;
+   struct dentry *wh_lower_dentry;
+   struct inode *lower_inode;
+   struct sioq_args args;
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   lower_inode = lower_dentry->d_inode;
+
+   BUG_ON(!S_ISDIR(lower_inode->i_mode));
+
+   mutex_lock(&lower_inode->i_mutex);
+
+   if (!permission(lower_inode, MAY_EXEC, NULL)) {
+   wh_lower_dentry =
+   lookup_one_len(UNIONFS_DIR_OPAQUE, lower_dentry,
+  sizeof(UNIONFS_DIR_OPAQUE) - 1);
+   } else {
+   args.is_opaque.dentry = lower_dentry;
+   run_sioq(__is_opaque_dir, &args);
+   wh_lower_dentry = args.ret;
+   }
+
+   mutex_unlock(&lower_inode->i_mutex);
+
+   if (IS_ERR(wh_lower_dentry)) {
+   err = PTR_ERR(wh_lower_dentry);
+   goto out;
+   }
+
+   /* This is an opaque dir iff wh_lower_dentry is positive */
+   err = !!wh_lower_dentry->d_inode;
+
+   dput(wh_lower_dentry);
+out:
+   return err;
+}
+
+/*
+ * Main (and complex) driver function for Unionfs's lookup
+ *
+ * Returns: NULL (ok), ERR_PTR if an error occurred, or a non-null non-error
+ * PTR if d_splice returned a different dentry.
+ *
+ * If lookupmode is INTERPOSE_PARTIAL/REVAL/REVAL_NEG, the passed dentry's
+ * inode info must be locked.  If lookupmode is INTERPOSE_LOOKUP (i.e., a
+ * newly looked-up dentry), then unionfs_lookup_backend will return a locked
+ * dentry's info, which the caller must unlock.
+ */
+struct dentry *unionfs_lookup_backend(struct dentry *dentry,
+ struct nameidata *nd, int lookupmode)
+{
+   int err = 0;
+   struct dentry *lower_dentry = NULL;
+   struct dentry *wh_lower_dentry = NULL;
+   struct dentry *lower_dir_dentry = NULL;
+   struct dentry *parent_dentry = NULL;
+   struct dentry *d_interposed = NULL;
+   int bindex, bstart = -1, bend, bopaque;
+   int dentry_count = 0;   /* Number of positive dentries. */
+   int first_dentry_offset = -1; /* -1 is uninitialized */
+   struct dentry *first_dentry = NULL;
+   struct dentry *first_lower_dentry = NULL;
+   struct vfsmount *first_lower_mnt = NULL;
+   int opaque;
+   char *whname = NULL;
+   const char *name;
+   int namelen;
+
+   /*
+* We should already have a lock on this dentry in the case of a
+* partial lookup, or a revalidation. Otherwise it is returned from
+* new_dentry_private_data already locked.
+*/
+   if (lookupmode == INTERPOSE_PARTIAL || lookupmode == INTERPOSE_REVAL ||
+   lookupmode == INTERPOSE_REVAL_NEG)
+   verify_locked(dentry);
+   else /* this could only be INTERPOSE_LOOKUP */
+   BUG_ON(UNIONFS_D(dentry) != NULL);
+
+   switch (lookupmode) {
+   case INTERPOSE_PARTIAL:
+   break;
+   case INTERPOSE_LOOKUP:
+   err = new_dentry_private_data(dentry, UNIONFS_DMUTEX_CHILD);
+   if (unlikely(err))
+

[PATCH 06/29] Unionfs: main header file

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/union.h |  602 
 1 files changed, 602 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/union.h

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
new file mode 100644
index 000..d324f83
--- /dev/null
+++ b/fs/unionfs/union.h
@@ -0,0 +1,602 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _UNION_H_
+#define _UNION_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+
+/* the file system name */
+#define UNIONFS_NAME "unionfs"
+
+/* unionfs root inode number */
+#define UNIONFS_ROOT_INO 1
+
+/* number of times we try to get a unique temporary file name */
+#define GET_TMPNAM_MAX_RETRY   5
+
+/* maximum number of branches we support, to avoid memory blowup */
+#define UNIONFS_MAX_BRANCHES   128
+
+/* minimum time (seconds) required for time-based cache-coherency */
+#define UNIONFS_MIN_CC_TIME 3
+
+/* Operations vectors defined in specific files. */
+extern struct file_operations unionfs_main_fops;
+extern struct file_operations unionfs_dir_fops;
+extern struct inode_operations unionfs_main_iops;
+extern struct inode_operations unionfs_dir_iops;
+extern struct inode_operations unionfs_symlink_iops;
+extern struct super_operations unionfs_sops;
+extern struct dentry_operations unionfs_dops;
+extern struct address_space_operations unionfs_aops;
+
+/* How long should an entry be allowed to persist */
+#define RDCACHE_JIFFIES (5*HZ)
+
+/* file private data. */
+struct unionfs_file_info {
+   int bstart;
+   int bend;
+   atomic_t generation;
+
+   struct unionfs_dir_state *rdstate;
+   struct file **lower_files;
+   int *saved_branch_ids; /* IDs of branches when file was opened */
+};
+
+/* unionfs inode data in memory */
+struct unionfs_inode_info {
+   int bstart;
+   int bend;
+   atomic_t generation;
+   int stale;
+   /* Stuff for readdir over NFS. */
+   spinlock_t rdlock;
+   struct list_head readdircache;
+   int rdcount;
+   int hashsize;
+   int cookie;
+
+   /* The lower inodes */
+   struct inode **lower_inodes;
+
+   struct inode vfs_inode;
+};
+
+/* unionfs dentry data in memory */
+struct unionfs_dentry_info {
+   /*
+* The semaphore is used to lock the dentry as soon as we get into a
+* unionfs function from the VFS.  Our lock ordering is that children
+* go before their parents.
+*/
+   struct mutex lock;
+   int bstart;
+   int bend;
+   int bopaque;
+   int bcount;
+   atomic_t generation;
+   struct path *lower_paths;
+};
+
+/* These are the pointers to our various objects. */
+struct unionfs_data {
+   struct super_block *sb;
+   atomic_t open_files;    /* number of open files on branch */
+   int branchperms;
+   int branch_id;  /* unique branch ID at re/mount time */
+};
+
+/* unionfs super-block data in memory */
+struct unionfs_sb_info {
+   int bend;
+
+   atomic_t generation;
+
+   /*
+* This rwsem is used to make sure that a branch management
+* operation...
+*   1) will not begin before all currently in-flight operations
+*  complete.
+*   2) any new operations do not execute until the currently
+*  running branch management operation completes.
+*
+* The write_lock_owner records the PID of the task which grabbed
+* the rw_sem for writing.  If the same task also tries to grab the
+* read lock, we allow it.  This prevents a self-deadlock when
+* branch-management is used on a pivot_root'ed union, because we
+* have to ->lookup paths which belong to the same union.
+*/
+   struct rw_semaphore rwsem;
+   pid_t write_lock_owner; /* PID of rw_sem owner (write lock) */
+   int high_branch_id; /* last unique branch ID given */
+   struct unionfs_data *data;
+};
+
+/*
+ * structure for making the li

[PATCH 07/29] Unionfs: common file copyup/revalidation operations

2008-01-10 Thread Erez Zadok

Includes open, ioctl, and flush operations.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |  835 +++
 1 files changed, 835 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/commonfops.c

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
new file mode 100644
index 000..f37192f
--- /dev/null
+++ b/fs/unionfs/commonfops.c
@@ -0,0 +1,835 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * 1) Copyup the file
+ * 2) Rename the file to '.unionfs' - obviously
+ * stolen from NFS's silly rename
+ */
+static int copyup_deleted_file(struct file *file, struct dentry *dentry,
+  int bstart, int bindex)
+{
+   static unsigned int counter;
+   const int i_inosize = sizeof(dentry->d_inode->i_ino) * 2;
+   const int countersize = sizeof(counter) * 2;
+   const int nlen = sizeof(".unionfs") + i_inosize + countersize - 1;
+   char name[nlen + 1];
+   int err;
+   struct dentry *tmp_dentry = NULL;
+   struct dentry *lower_dentry;
+   struct dentry *lower_dir_dentry = NULL;
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bstart);
+
+   sprintf(name, ".unionfs%*.*lx",
+   i_inosize, i_inosize, lower_dentry->d_inode->i_ino);
+
+   /*
+* Loop, looking for an unused temp name to copyup to.
+*
+* It's somewhat silly that we look for a free temp name in the
+* source branch (bstart) instead of the dest branch (bindex), where
+* the final name will be created.  We _will_ catch it if somehow
+* the name exists in the dest branch, but it'd be nice to catch it
+* sooner than later.
+*/
+retry:
+   tmp_dentry = NULL;
+   do {
+   char *suffix = name + nlen - countersize;
+
+   dput(tmp_dentry);
+   counter++;
+   sprintf(suffix, "%*.*x", countersize, countersize, counter);
+
+   pr_debug("unionfs: trying to rename %s to %s\n",
+dentry->d_name.name, name);
+
+   tmp_dentry = lookup_one_len(name, lower_dentry->d_parent,
+   nlen);
+   if (IS_ERR(tmp_dentry)) {
+   err = PTR_ERR(tmp_dentry);
+   goto out;
+   }
+   } while (tmp_dentry->d_inode != NULL);  /* need negative dentry */
+   dput(tmp_dentry);
+
+   err = copyup_named_file(dentry->d_parent->d_inode, file, name, bstart,
+   bindex,
+   i_size_read(file->f_path.dentry->d_inode));
+   if (err) {
+   if (unlikely(err == -EEXIST))
+   goto retry;
+   goto out;
+   }
+
+   /* bring it to the same state as an unlinked file */
+   lower_dentry = unionfs_lower_dentry_idx(dentry, dbstart(dentry));
+   if (!unionfs_lower_inode_idx(dentry->d_inode, bindex)) {
+   atomic_inc(&lower_dentry->d_inode->i_count);
+   unionfs_set_lower_inode_idx(dentry->d_inode, bindex,
+   lower_dentry->d_inode);
+   }
+   lower_dir_dentry = lock_parent(lower_dentry);
+   err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+   unlock_dir(lower_dir_dentry);
+
+out:
+   if (!err)
+   unionfs_check_dentry(dentry);
+   return err;
+}
+
+/*
+ * put all references held by upper struct file and free lower file pointer
+ * array
+ */
+static void cleanup_file(struct file *file)
+{
+   int bindex, bstart, bend;
+   struct file **lower_files;
+   struct file *lower_file;
+   struct super_block *sb = file->f_path.dentry->d_sb;
+
+   lower_files = UNIONFS_F(file)->lower_files;
+   bstart = fbstart(file);
+   bend = fbend(file);
+
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   int i;  /* holds (possibly) updated branch index */
+   int old_bid;
+
+   lower_file = unionfs_lower_file_idx(file, bind

[PATCH 21/29] Unionfs: extended attributes operations

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/xattr.c |  153 
 1 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/xattr.c

diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
new file mode 100644
index 000..8001c65
--- /dev/null
+++ b/fs/unionfs/xattr.c
@@ -0,0 +1,153 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* This is lifted from fs/xattr.c */
+void *unionfs_xattr_alloc(size_t size, size_t limit)
+{
+   void *ptr;
+
+   if (size > limit)
+   return ERR_PTR(-E2BIG);
+
+   if (!size)  /* size request, no buffer is needed */
+   return NULL;
+
+   ptr = kmalloc(size, GFP_KERNEL);
+   if (unlikely(!ptr))
+   return ERR_PTR(-ENOMEM);
+   return ptr;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
+size_t size)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_getxattr(lower_dentry, (char *) name, value, size);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+int unionfs_setxattr(struct dentry *dentry, const char *name,
+const void *value, size_t size, int flags)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_setxattr(lower_dentry, (char *) name, (void *) value,
+  size, flags);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+int unionfs_removexattr(struct dentry *dentry, const char *name)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   err = vfs_removexattr(lower_dentry, (char *) name);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
+
+/*
+ * BKL held by caller.
+ * dentry->d_inode->i_mutex locked
+ */
+ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
+{
+   struct dentry *lower_dentry = NULL;
+   int err = -EOPNOTSUPP;
+   char *encoded_list = NULL;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+
+   lower_dentry = unionfs_lower_dentry(dentry);
+
+   encoded_list = list;
+   err = vfs_listxattr(lower_dentry, encoded_list, size);
+
+out:
+   unionfs_check_dentry(dentry);
+   unionfs_unlock_dentry(dentry);
+   unionfs_read_unlock(dentry->d_sb);
+   return err;
+}
-- 
1.5.2.2


[PATCH 13/29] Unionfs: directory reading file operations

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirfops.c |  290 ++
 1 files changed, 290 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dirfops.c

diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
new file mode 100644
index 000..a613862
--- /dev/null
+++ b/fs/unionfs/dirfops.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* Make sure our rdstate is playing by the rules. */
+static void verify_rdstate_offset(struct unionfs_dir_state *rdstate)
+{
+   BUG_ON(rdstate->offset >= DIREOF);
+   BUG_ON(rdstate->cookie >= MAXRDCOOKIE);
+}
+
+struct unionfs_getdents_callback {
+   struct unionfs_dir_state *rdstate;
+   void *dirent;
+   int entries_written;
+   int filldir_called;
+   int filldir_error;
+   filldir_t filldir;
+   struct super_block *sb;
+};
+
+/* based on generic filldir in fs/readdir.c */
+static int unionfs_filldir(void *dirent, const char *name, int namelen,
+  loff_t offset, u64 ino, unsigned int d_type)
+{
+   struct unionfs_getdents_callback *buf = dirent;
+   struct filldir_node *found = NULL;
+   int err = 0;
+   int is_wh_entry = 0;
+
+   buf->filldir_called++;
+
+   if ((namelen > UNIONFS_WHLEN) &&
+   !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
+   name += UNIONFS_WHLEN;
+   namelen -= UNIONFS_WHLEN;
+   is_wh_entry = 1;
+   }
+
+   found = find_filldir_node(buf->rdstate, name, namelen, is_wh_entry);
+
+   if (found) {
+   /*
+* If we had a non-whiteout entry in the dir cache, then mark it
+* as a whiteout but leave it in the dir cache.
+*/
+   if (is_wh_entry && !found->whiteout)
+   found->whiteout = is_wh_entry;
+   goto out;
+   }
+
+   /* if 'name' isn't a whiteout, filldir it. */
+   if (!is_wh_entry) {
+   off_t pos = rdstate2offset(buf->rdstate);
+   u64 unionfs_ino = ino;
+
+   err = buf->filldir(buf->dirent, name, namelen, pos,
+  unionfs_ino, d_type);
+   buf->rdstate->offset++;
+   verify_rdstate_offset(buf->rdstate);
+   }
+   /*
+* If we did fill it, stuff it in our hash, otherwise return an
+* error.
+*/
+   if (err) {
+   buf->filldir_error = err;
+   goto out;
+   }
+   buf->entries_written++;
+   err = add_filldir_node(buf->rdstate, name, namelen,
+  buf->rdstate->bindex, is_wh_entry);
+   if (err)
+   buf->filldir_error = err;
+
+out:
+   return err;
+}
+
+static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
+{
+   int err = 0;
+   struct file *lower_file = NULL;
+   struct inode *inode = NULL;
+   struct unionfs_getdents_callback buf;
+   struct unionfs_dir_state *uds;
+   int bend;
+   loff_t offset;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+
+   err = unionfs_file_revalidate(file, false);
+   if (unlikely(err))
+   goto out;
+
+   inode = file->f_path.dentry->d_inode;
+
+   uds = UNIONFS_F(file)->rdstate;
+   if (!uds) {
+   if (file->f_pos == DIREOF) {
+   goto out;
+   } else if (file->f_pos > 0) {
+   uds = find_rdstate(inode, file->f_pos);
+   if (unlikely(!uds)) {
+   err = -ESTALE;
+   goto out;
+   }
+   UNIONFS_F(file)->rdstate = uds;
+   } else {
+   init_rdstate(file);
+   uds = UNIONFS_F(file)->rdstate;
+   }
+   }
+   bend = fbend(file);
+
+   while (uds->bindex <= bend) {
+   lower_file = unionfs_lower_file_idx(file, ud

[PATCH 10/29] Unionfs: dentry revalidation

2008-01-10 Thread Erez Zadok

Includes d_release methods and cache-coherency support for dentries.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |  548 +++
 1 files changed, 548 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dentry.c

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
new file mode 100644
index 000..cd15243
--- /dev/null
+++ b/fs/unionfs/dentry.c
@@ -0,0 +1,548 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Revalidate a single dentry.
+ * Assume that dentry's info node is locked.
+ * Assume that parent(s) are all valid already, but
+ * the child may not yet be valid.
+ * Returns true if valid, false otherwise.
+ */
+static bool __unionfs_d_revalidate_one(struct dentry *dentry,
+  struct nameidata *nd)
+{
+   bool valid = true;  /* default is valid */
+   struct dentry *lower_dentry;
+   int bindex, bstart, bend;
+   int sbgen, dgen;
+   int positive = 0;
+   int interpose_flag;
+   struct nameidata lowernd; /* TODO: be gentler to the stack */
+
+   if (nd)
+   memcpy(&lowernd, nd, sizeof(struct nameidata));
+   else
+   memset(&lowernd, 0, sizeof(struct nameidata));
+
+   verify_locked(dentry);
+   verify_locked(dentry->d_parent);
+
+   /* if the dentry is unhashed, do NOT revalidate */
+   if (d_deleted(dentry))
+   goto out;
+
+   BUG_ON(dbstart(dentry) == -1);
+   if (dentry->d_inode)
+   positive = 1;
+   dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+   sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
+   /*
+* If we are working on an unconnected dentry, then there is no
+* revalidation to be done, because this file does not exist within
+* the namespace, and Unionfs operates on the namespace, not data.
+*/
+   if (unlikely(sbgen != dgen)) {
+   struct dentry *result;
+   int pdgen;
+
+   /* The root entry should always be valid */
+   BUG_ON(IS_ROOT(dentry));
+
+   /* We can't work correctly if our parent isn't valid. */
+   pdgen = atomic_read(&UNIONFS_D(dentry->d_parent)->generation);
+   BUG_ON(pdgen != sbgen); /* should never happen here */
+
+   /* Free the pointers for our inodes and this dentry. */
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+   if (bstart >= 0) {
+   struct dentry *lower_dentry;
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_dentry =
+   unionfs_lower_dentry_idx(dentry,
+bindex);
+   dput(lower_dentry);
+   }
+   }
+   set_dbstart(dentry, -1);
+   set_dbend(dentry, -1);
+
+   interpose_flag = INTERPOSE_REVAL_NEG;
+   if (positive) {
+   interpose_flag = INTERPOSE_REVAL;
+
+   bstart = ibstart(dentry->d_inode);
+   bend = ibend(dentry->d_inode);
+   if (bstart >= 0) {
+   struct inode *lower_inode;
+   for (bindex = bstart; bindex <= bend;
+bindex++) {
+   lower_inode =
+   unionfs_lower_inode_idx(
+   dentry->d_inode,
+   bindex);
+   iput(lower_inode);
+   }
+   }
+   kfree(UNIONFS_I(dentry->d_inode)->lower_inodes);
+   UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
+   ibstart(dentry->d_inode) = -1;
+

[PATCH 24/29] Unionfs: debugging infrastructure

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/debug.c |  533 
 1 files changed, 533 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/debug.c

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
new file mode 100644
index 000..d154c32
--- /dev/null
+++ b/fs/unionfs/debug.c
@@ -0,0 +1,533 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Helper debugging functions for maintainers (and for users to report
+ * useful information back to maintainers)
+ */
+
+/* it's always useful to know what part of the code called us */
+#define PRINT_CALLER(fname, fxn, line) \
+   do {\
+   if (!printed_caller) {  \
+   pr_debug("PC:%s:%s:%d\n", (fname), (fxn), (line)); \
+   printed_caller = 1; \
+   }   \
+   } while (0)
+
+/*
+ * __unionfs_check_{inode,dentry,file} perform exhaustive sanity checking on
+ * the fan-out of various Unionfs objects.  We check that no lower objects
+ * exist outside the start/end branch range; that all objects within are
+ * non-NULL (with some allowed exceptions); that for every lower file
+ * there's a lower dentry+inode; that the start/end ranges match for all
+ * corresponding lower objects; that open files/symlinks have only one lower
+ * object, but directories can have several; and more.
+ */
+void __unionfs_check_inode(const struct inode *inode,
+  const char *fname, const char *fxn, int line)
+{
+   int bindex;
+   int istart, iend;
+   struct inode *lower_inode;
+   struct super_block *sb;
+   int printed_caller = 0;
+   void *poison_ptr;
+
+   /* for inodes now */
+   BUG_ON(!inode);
+   sb = inode->i_sb;
+   istart = ibstart(inode);
+   iend = ibend(inode);
+   /* don't check inode if no lower branches */
+   if (istart < 0 && iend < 0)
+   return;
+   if (unlikely(istart > iend)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci0: inode=%p istart/end=%d:%d\n",
+inode, istart, iend);
+   }
+   if (unlikely((istart == -1 && iend != -1) ||
+(istart != -1 && iend == -1))) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci1: inode=%p istart/end=%d:%d\n",
+inode, istart, iend);
+   }
+   if (!S_ISDIR(inode->i_mode)) {
+   if (unlikely(iend != istart)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci2: inode=%p istart=%d iend=%d\n",
+inode, istart, iend);
+   }
+   }
+
+   for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+   if (unlikely(!UNIONFS_I(inode))) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci3: no inode_info %p\n", inode);
+   return;
+   }
+   if (unlikely(!UNIONFS_I(inode)->lower_inodes)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci4: no lower_inodes %p\n", inode);
+   return;
+   }
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (lower_inode) {
+   memset(&poison_ptr, POISON_INUSE, sizeof(void *));
+   if (unlikely(bindex < istart || bindex > iend)) {
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci5: inode/linode=%p:%p bindex=%d "
+"istart/end=%d:%d\n", inode,
+lower_inode, bindex, istart, iend);
+   } else if (unlikely(lower_inode == poison_ptr)) {
+   /* freed inode! */
+   PRINT_CALLER(fname, fxn, line);
+   pr_debug(" Ci6: inode/linode=%p:%p bindex=%d "
+"istart/end=%d:%d\n", inode,
+lower_inode, bin

[PATCH 26/29] Unionfs: common header file for user-land utilities and kernel

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/union_fs.h |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/union_fs.h

diff --git a/include/linux/union_fs.h b/include/linux/union_fs.h
new file mode 100644
index 000..9d601d2
--- /dev/null
+++ b/include/linux/union_fs.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _LINUX_UNION_FS_H
+#define _LINUX_UNION_FS_H
+
+#define UNIONFS_VERSION  "2.2-mm"
+
+/*
+ * DEFINITIONS FOR USER AND KERNEL CODE:
+ */
+# define UNIONFS_IOCTL_INCGEN  _IOR(0x15, 11, int)
+# define UNIONFS_IOCTL_QUERYFILE   _IOR(0x15, 15, int)
+
+#endif /* _LINUX_UNION_FS_H */
+
-- 
1.5.2.2


[PATCH 12/29] Unionfs: rename method and helpers

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/rename.c |  531 +++
 1 files changed, 531 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/rename.c

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
new file mode 100644
index 000..9306a2b
--- /dev/null
+++ b/fs/unionfs/rename.c
@@ -0,0 +1,531 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+   struct inode *new_dir, struct dentry *new_dentry,
+   int bindex, struct dentry **wh_old)
+{
+   int err = 0;
+   struct dentry *lower_old_dentry;
+   struct dentry *lower_new_dentry;
+   struct dentry *lower_old_dir_dentry;
+   struct dentry *lower_new_dir_dentry;
+   struct dentry *lower_wh_dentry;
+   struct dentry *lower_wh_dir_dentry;
+   char *wh_name = NULL;
+
+   lower_new_dentry = unionfs_lower_dentry_idx(new_dentry, bindex);
+   lower_old_dentry = unionfs_lower_dentry_idx(old_dentry, bindex);
+
+   if (!lower_new_dentry) {
+   lower_new_dentry =
+   create_parents(new_dentry->d_parent->d_inode,
+  new_dentry, new_dentry->d_name.name,
+  bindex);
+   if (IS_ERR(lower_new_dentry)) {
+   err = PTR_ERR(lower_new_dentry);
+   if (IS_COPYUP_ERR(err))
+   goto out;
+   printk(KERN_ERR "unionfs: error creating directory "
+  "tree for rename, bindex=%d err=%d\n",
+  bindex, err);
+   goto out;
+   }
+   }
+
+   wh_name = alloc_whname(new_dentry->d_name.name,
+  new_dentry->d_name.len);
+   if (unlikely(IS_ERR(wh_name))) {
+   err = PTR_ERR(wh_name);
+   goto out;
+   }
+
+   lower_wh_dentry = lookup_one_len(wh_name, lower_new_dentry->d_parent,
+new_dentry->d_name.len +
+UNIONFS_WHLEN);
+   if (IS_ERR(lower_wh_dentry)) {
+   err = PTR_ERR(lower_wh_dentry);
+   goto out;
+   }
+
+   if (lower_wh_dentry->d_inode) {
+   /* get rid of the existing whiteout */
+   if (lower_new_dentry->d_inode) {
+   printk(KERN_ERR "unionfs: both a whiteout and a "
+  "dentry exist when doing a rename!\n");
+   err = -EIO;
+
+   dput(lower_wh_dentry);
+   goto out;
+   }
+
+   lower_wh_dir_dentry = lock_parent_wh(lower_wh_dentry);
+   err = is_robranch_super(old_dentry->d_sb, bindex);
+   if (!err)
+   err = vfs_unlink(lower_wh_dir_dentry->d_inode,
+lower_wh_dentry);
+
+   dput(lower_wh_dentry);
+   unlock_dir(lower_wh_dir_dentry);
+   if (err)
+   goto out;
+   } else {
+   dput(lower_wh_dentry);
+   }
+
+   err = is_robranch_super(old_dentry->d_sb, bindex);
+   if (err)
+   goto out;
+
+   dget(lower_old_dentry);
+   lower_old_dir_dentry = dget_parent(lower_old_dentry);
+   lower_new_dir_dentry = dget_parent(lower_new_dentry);
+
+   /*
+* ready to whiteout for old_dentry. caller will create the actual
+* whiteout, and must dput(*wh_old)
+*/
+   if (wh_old) {
+   char *whname;
+   whname = alloc_whname(old_dentry->d_name.name,
+ old_dentry->d_name.len);
+   err = PTR_ERR(whname);
+   if (unlikely(IS_ERR(whname)))
+   goto out_dput;
+   *wh_old = lookup_one_len(whname, lower_old_dir_dentry,
+old_dentry->d_name.len +
+

[PATCH 28/29] VFS: export release_open_intent symbol

2008-01-10 Thread Erez Zadok

Needed to release the resources of the lower nameidata structures that we
create and pass to lower file systems (e.g., when calling vfs_create).

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/namei.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 3b993db..14f9861 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -389,6 +389,7 @@ void release_open_intent(struct nameidata *nd)
else
fput(nd->intent.open.file);
 }
+EXPORT_SYMBOL(release_open_intent);
 
 static inline struct dentry *
 do_revalidate(struct dentry *dentry, struct nameidata *nd)
-- 
1.5.2.2


[PATCH 19/29] Unionfs: mount-time and stacking-interposition functions

2008-01-10 Thread Erez Zadok

Includes read_super and module-linkage routines.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |  794 +
 1 files changed, 794 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/main.c

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
new file mode 100644
index 000..23c18f7
--- /dev/null
+++ b/fs/unionfs/main.c
@@ -0,0 +1,794 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+#include 
+#include 
+
+static void unionfs_fill_inode(struct dentry *dentry,
+  struct inode *inode)
+{
+   struct inode *lower_inode;
+   struct dentry *lower_dentry;
+   int bindex, bstart, bend;
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry) {
+   unionfs_set_lower_inode_idx(inode, bindex, NULL);
+   continue;
+   }
+
+   /* Initialize the lower inode to the new lower inode. */
+   if (!lower_dentry->d_inode)
+   continue;
+
+   unionfs_set_lower_inode_idx(inode, bindex,
+   igrab(lower_dentry->d_inode));
+   }
+
+   ibstart(inode) = dbstart(dentry);
+   ibend(inode) = dbend(dentry);
+
+   /* Use attributes from the first branch. */
+   lower_inode = unionfs_lower_inode(inode);
+
+   /* Use different set of inode ops for symlinks & directories */
+   if (S_ISLNK(lower_inode->i_mode))
+   inode->i_op = &unionfs_symlink_iops;
+   else if (S_ISDIR(lower_inode->i_mode))
+   inode->i_op = &unionfs_dir_iops;
+
+   /* Use different set of file ops for directories */
+   if (S_ISDIR(lower_inode->i_mode))
+   inode->i_fop = &unionfs_dir_fops;
+
+   /* properly initialize special inodes */
+   if (S_ISBLK(lower_inode->i_mode) || S_ISCHR(lower_inode->i_mode) ||
+   S_ISFIFO(lower_inode->i_mode) || S_ISSOCK(lower_inode->i_mode))
+   init_special_inode(inode, lower_inode->i_mode,
+  lower_inode->i_rdev);
+
+   /* all well, copy inode attributes */
+   unionfs_copy_attr_all(inode, lower_inode);
+   fsstack_copy_inode_size(inode, lower_inode);
+}
+
+/*
+ * Connect a unionfs dentry/inode with several lower ones.  This is
+ * the classic stackable file system "vnode interposition" action.
+ *
+ * @sb: unionfs's super_block
+ */
+struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
+int flag)
+{
+   int err = 0;
+   struct inode *inode;
+   int is_negative_dentry = 1;
+   int bindex, bstart, bend;
+   int need_fill_inode = 1;
+   struct dentry *spliced = NULL;
+
+   verify_locked(dentry);
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   /* Make sure that we didn't get a negative dentry. */
+   for (bindex = bstart; bindex <= bend; bindex++) {
+   if (unionfs_lower_dentry_idx(dentry, bindex) &&
+   unionfs_lower_dentry_idx(dentry, bindex)->d_inode) {
+   is_negative_dentry = 0;
+   break;
+   }
+   }
+   BUG_ON(is_negative_dentry);
+
+   /*
+* We allocate our new inode below, by calling iget.
+* iget will call our read_inode which will initialize some
+* of the new inode's fields
+*/
+
+   /*
+* On revalidate we've already got our own inode and just need
+* to fix it up.
+*/
+   if (flag == INTERPOSE_REVAL) {
+   inode = dentry->d_inode;
+   UNIONFS_I(inode)->bstart = -1;
+   UNIONFS_I(inode)->bend = -1;
+   atomic_set(&UNIONFS_I(inode)->generation,
+  atomic_read(&UNIONFS_SB(sb)->generation));
+
+   UNIONFS_I(inode)->lower_inodes =
+

[PATCH 25/29] Unionfs file system magic number

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/magic.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/magic.h b/include/linux/magic.h
index 1fa0c2c..67043ed 100644
--- a/include/linux/magic.h
+++ b/include/linux/magic.h
@@ -35,6 +35,8 @@
 #define REISER2FS_SUPER_MAGIC_STRING   "ReIsEr2Fs"
 #define REISER2FS_JR_SUPER_MAGIC_STRING"ReIsEr3Fs"
 
+#define UNIONFS_SUPER_MAGIC 0xf15f083d
+
 #define SMB_SUPER_MAGIC0x517B
 #define USBDEVICE_SUPER_MAGIC  0x9fa2
 #define CGROUP_SUPER_MAGIC 0x27e0eb
-- 
1.5.2.2


[PATCH 14/29] Unionfs: readdir helper functions

2008-01-10 Thread Erez Zadok

Includes whiteout handling for directories.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirhelper.c |  267 
 1 files changed, 267 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/dirhelper.c

diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
new file mode 100644
index 000..4b73bb6
--- /dev/null
+++ b/fs/unionfs/dirhelper.c
@@ -0,0 +1,267 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Delete all of the whiteouts in a given directory for rmdir.
+ *
+ * lower directory inode should be locked
+ */
+int do_delete_whiteouts(struct dentry *dentry, int bindex,
+   struct unionfs_dir_state *namelist)
+{
+   int err = 0;
+   struct dentry *lower_dir_dentry = NULL;
+   struct dentry *lower_dentry;
+   char *name = NULL, *p;
+   struct inode *lower_dir;
+   int i;
+   struct list_head *pos;
+   struct filldir_node *cursor;
+
+   /* Find out lower parent dentry */
+   lower_dir_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   BUG_ON(!S_ISDIR(lower_dir_dentry->d_inode->i_mode));
+   lower_dir = lower_dir_dentry->d_inode;
+   BUG_ON(!S_ISDIR(lower_dir->i_mode));
+
+   err = -ENOMEM;
+   name = __getname();
+   if (unlikely(!name))
+   goto out;
+   strcpy(name, UNIONFS_WHPFX);
+   p = name + UNIONFS_WHLEN;
+
+   err = 0;
+   for (i = 0; !err && i < namelist->size; i++) {
+   list_for_each(pos, &namelist->list[i]) {
+   cursor =
+   list_entry(pos, struct filldir_node,
+  file_list);
+   /* Only operate on whiteouts in this branch. */
+   if (cursor->bindex != bindex)
+   continue;
+   if (!cursor->whiteout)
+   continue;
+
+   strcpy(p, cursor->name);
+   lower_dentry =
+   lookup_one_len(name, lower_dir_dentry,
+  cursor->namelen +
+  UNIONFS_WHLEN);
+   if (IS_ERR(lower_dentry)) {
+   err = PTR_ERR(lower_dentry);
+   break;
+   }
+   if (lower_dentry->d_inode)
+   err = vfs_unlink(lower_dir, lower_dentry);
+   dput(lower_dentry);
+   if (err)
+   break;
+   }
+   }
+
+   __putname(name);
+
+   /* After all of the removals, we should copy the attributes once. */
+   fsstack_copy_attr_times(dentry->d_inode, lower_dir_dentry->d_inode);
+
+out:
+   return err;
+}
+
+/* delete whiteouts in a dir (for rmdir operation) using sioq if necessary */
+int delete_whiteouts(struct dentry *dentry, int bindex,
+struct unionfs_dir_state *namelist)
+{
+   int err;
+   struct super_block *sb;
+   struct dentry *lower_dir_dentry;
+   struct inode *lower_dir;
+   struct sioq_args args;
+
+   sb = dentry->d_sb;
+
+   BUG_ON(!S_ISDIR(dentry->d_inode->i_mode));
+   BUG_ON(bindex < dbstart(dentry));
+   BUG_ON(bindex > dbend(dentry));
+   err = is_robranch_super(sb, bindex);
+   if (err)
+   goto out;
+
+   lower_dir_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   BUG_ON(!S_ISDIR(lower_dir_dentry->d_inode->i_mode));
+   lower_dir = lower_dir_dentry->d_inode;
+   BUG_ON(!S_ISDIR(lower_dir->i_mode));
+
+   if (!permission(lower_dir, MAY_WRITE | MAY_EXEC, NULL)) {
+       err = do_delete_whiteouts(dentry, bindex, namelist);
+   } else {
+       args.deletewh.namelist = namelist;
+       args.deletewh.dentry = dentry;
+       args.deletewh.bindex = bindex;
+       run_sioq(__delete_whiteouts, &args);
+       err = args.err;
+   }
+
+out:
+   retur
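
Illustration (not part of the patch): do_delete_whiteouts() above builds each
whiteout name by copying the whiteout prefix into a buffer and appending the
entry name after it.  The following is a minimal userspace sketch of that
naming step only.  It assumes the prefix is ".wh." (as the ".wh.foo" comments
elsewhere in this series suggest); make_whiteout_name() and WH_PREFIX are
hypothetical names, not unionfs symbols.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed prefix; the patch uses UNIONFS_WHPFX / UNIONFS_WHLEN instead. */
#define WH_PREFIX ".wh."
#define WH_PREFIX_LEN (sizeof(WH_PREFIX) - 1)

/*
 * Hypothetical helper: return a newly allocated "<prefix><name>" string,
 * or NULL if the allocation fails.  The caller frees the result.
 */
static char *make_whiteout_name(const char *name, size_t namelen)
{
	char *buf;

	assert(name != NULL);
	buf = malloc(WH_PREFIX_LEN + namelen + 1);
	if (!buf)
		return NULL;
	memcpy(buf, WH_PREFIX, WH_PREFIX_LEN);		/* prefix first */
	memcpy(buf + WH_PREFIX_LEN, name, namelen);	/* then the name */
	buf[WH_PREFIX_LEN + namelen] = '\0';
	return buf;
}
```

The lookup length passed to the lower file system is then the name length
plus the prefix length, which is what the cursor->namelen + UNIONFS_WHLEN
argument in the patch expresses.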

[PATCH 18/29] Unionfs: address-space operations

2008-01-10 Thread Erez Zadok

Includes writepage, writepages, readpage, prepare_write, and commit_write.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/mmap.c |  343 +
 1 files changed, 343 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/mmap.c

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
new file mode 100644
index 000..ad770ac
--- /dev/null
+++ b/fs/unionfs/mmap.c
@@ -0,0 +1,343 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2006  Shaya Potter
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
+{
+   int err = -EIO;
+   struct inode *inode;
+   struct inode *lower_inode;
+   struct page *lower_page;
+   struct address_space *lower_mapping; /* lower inode mapping */
+   gfp_t mask;
+
+   BUG_ON(!PageUptodate(page));
+   inode = page->mapping->host;
+   /* if no lower inode, nothing to do */
+   if (!inode || !UNIONFS_I(inode) || !UNIONFS_I(inode)->lower_inodes) {
+       err = 0;
+       goto out;
+   }
+   lower_inode = unionfs_lower_inode(inode);
+   lower_mapping = lower_inode->i_mapping;
+
+   /*
+* find lower page (returns a locked page)
+*
+* We turn off __GFP_FS while we look for or create a new lower
+* page.  This prevents a recursion into the file system code, which
+* under memory pressure conditions could lead to a deadlock.  This
+* is similar to how the loop driver behaves (see loop_set_fd in
+* drivers/block/loop.c).  If we can't find the lower page, we
+* redirty our page and return "success" so that the VM will call us
+* again in the (hopefully near) future.
+*/
+   mask = mapping_gfp_mask(lower_mapping) & ~(__GFP_FS);
+   lower_page = find_or_create_page(lower_mapping, page->index, mask);
+   if (!lower_page) {
+   err = 0;
+   set_page_dirty(page);
+   goto out;
+   }
+
+   /* copy page data from our upper page to the lower page */
+   copy_highpage(lower_page, page);
+   flush_dcache_page(lower_page);
+   SetPageUptodate(lower_page);
+   set_page_dirty(lower_page);
+
+   /*
+* Call lower writepage (expects locked page).  However, if we are
+* called with wbc->for_reclaim, then the VFS/VM just wants to
+* reclaim our page.  Therefore, we don't need to call the lower
+* ->writepage: just copy our data to the lower page (already done
+* above), then mark the lower page dirty and unlock it, and return
+* success.
+*/
+   if (wbc->for_reclaim) {
+   unlock_page(lower_page);
+   goto out_release;
+   }
+
+   BUG_ON(!lower_mapping->a_ops->writepage);
+   wait_on_page_writeback(lower_page); /* prevent multiple writers */
+   clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
+   err = lower_mapping->a_ops->writepage(lower_page, wbc);
+   if (err < 0)
+   goto out_release;
+
+   /*
+* Lower file systems such as ramfs and tmpfs, may return
+* AOP_WRITEPAGE_ACTIVATE so that the VM won't try to (pointlessly)
+* write the page again for a while.  But those lower file systems
+* also set the page dirty bit back again.  Since we successfully
+* copied our page data to the lower page, then the VM will come
+* back to the lower page (directly) and try to flush it.  So we can
+* save the VM the hassle of coming back to our page and trying to
+* flush too.  Therefore, we don't re-dirty our own page, and we
+* never return AOP_WRITEPAGE_ACTIVATE back to the VM (we consider
+* this a success).
+*
+* We also unlock the lower page if the lower ->writepage returned
+* AOP_WRITEPAGE_ACTIVATE.  (This "anomalous" behaviour may be
+* addressed in future shmem/VM code.)
+*/
+   if (err == AOP_WRITEPAGE_ACTIVATE) {
+   err = 0;
+   unlock_page(lower_page);
+   }
+
+   /* all is well

[PATCH 16/29] Unionfs: inode operations

2008-01-10 Thread Erez Zadok

Includes create, lookup, link, symlink, mkdir, mknod, readlink, follow_link,
put_link, permission, and setattr.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c | 1174 
 1 files changed, 1174 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/inode.c

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
new file mode 100644
index 000..e15ddb9
--- /dev/null
+++ b/fs/unionfs/inode.c
@@ -0,0 +1,1174 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int unionfs_create(struct inode *parent, struct dentry *dentry,
+ int mode, struct nameidata *nd)
+{
+   int err = 0;
+   struct dentry *lower_dentry = NULL;
+   struct dentry *wh_dentry = NULL;
+   struct dentry *lower_parent_dentry = NULL;
+   char *name = NULL;
+   int valid = 0;
+   struct nameidata lower_nd;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
+   valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
+   unionfs_unlock_dentry(dentry->d_parent);
+   if (unlikely(!valid)) {
+   err = -ESTALE;  /* same as what real_lookup does */
+   goto out;
+   }
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   valid = __unionfs_d_revalidate_chain(dentry, nd, false);
+   /*
+* It's only a bug if this dentry was not negative and couldn't be
+* revalidated (shouldn't happen).
+*/
+   BUG_ON(!valid && dentry->d_inode);
+
+   /*
+* We shouldn't create things in a read-only branch; this check is a
+* bit redundant as we don't allow branch 0 to be read-only at the
+* moment
+*/
+   err = is_robranch_super(dentry->d_sb, 0);
+   if (err) {
+   err = -EROFS;
+   goto out;
+   }
+
+   /*
+* We _always_ create on branch 0
+*/
+   lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+   if (lower_dentry) {
+   /*
+* check if whiteout exists in this branch, i.e. lookup .wh.foo
+* first.
+*/
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
+
+   wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(wh_dentry)) {
+   err = PTR_ERR(wh_dentry);
+   wh_dentry = NULL;
+   goto out;
+   }
+
+   if (wh_dentry->d_inode) {
+   /*
+* .wh.foo has been found, so let's unlink it
+*/
+   struct dentry *lower_dir_dentry;
+
+   lower_dir_dentry = lock_parent_wh(wh_dentry);
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
+   err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+   lockdep_on();
+   unlock_dir(lower_dir_dentry);
+
+   /*
+* Whiteouts are special files and should be deleted
+* no matter what (as if they never existed), in
+* order to allow this create operation to succeed.
+* This is especially important in sticky
+* directories: a whiteout may have been created by
+* one user, but the newly created file may be
+* created by another user.  Therefore, in order to
+* maintain Unix semantics, if the vfs_unlink above
+* failed, then we have to try to directly unlink the
+* whiteout.  Note: in the ODF version of unionfs,
+*

[PATCH 05/29] Unionfs: fanout header definitions

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/fanout.h |  366 +++
 1 files changed, 366 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/fanout.h

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
new file mode 100644
index 000..4d9a45f
--- /dev/null
+++ b/fs/unionfs/fanout.h
@@ -0,0 +1,366 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _FANOUT_H_
+#define _FANOUT_H_
+
+/*
+ * Inode to private data
+ *
+ * Since we use containers and the struct inode is _inside_ the
+ * unionfs_inode_info structure, UNIONFS_I will always (given a non-NULL
+ * inode pointer), return a valid non-NULL pointer.
+ */
+static inline struct unionfs_inode_info *UNIONFS_I(const struct inode *inode)
+{
+   return container_of(inode, struct unionfs_inode_info, vfs_inode);
+}
+
+#define ibstart(ino) (UNIONFS_I(ino)->bstart)
+#define ibend(ino) (UNIONFS_I(ino)->bend)
+
+/* Superblock to private data */
+#define UNIONFS_SB(super) ((struct unionfs_sb_info *)(super)->s_fs_info)
+#define sbstart(sb) 0
+#define sbend(sb) (UNIONFS_SB(sb)->bend)
+#define sbmax(sb) (UNIONFS_SB(sb)->bend + 1)
+#define sbhbid(sb) (UNIONFS_SB(sb)->high_branch_id)
+
+/* File to private Data */
+#define UNIONFS_F(file) ((struct unionfs_file_info *)((file)->private_data))
+#define fbstart(file) (UNIONFS_F(file)->bstart)
+#define fbend(file) (UNIONFS_F(file)->bend)
+
+/* macros to manipulate branch IDs stored in our superblock */
+static inline int branch_id(struct super_block *sb, int index)
+{
+   BUG_ON(!sb || index < 0);
+   return UNIONFS_SB(sb)->data[index].branch_id;
+}
+
+static inline void set_branch_id(struct super_block *sb, int index, int val)
+{
+   BUG_ON(!sb || index < 0);
+   UNIONFS_SB(sb)->data[index].branch_id = val;
+}
+
+static inline void new_branch_id(struct super_block *sb, int index)
+{
+   BUG_ON(!sb || index < 0);
+   set_branch_id(sb, index, ++UNIONFS_SB(sb)->high_branch_id);
+}
+
+/*
+ * Find new index of matching branch with an existing superblock of a known
+ * (possibly old) id.  This is needed because branches could have been
+ * added/deleted causing the branches of any open files to shift.
+ *
+ * @sb: the new superblock which may have new/different branch IDs
+ * @id: the old/existing id we're looking for
+ * Returns index of newly found branch (0 or greater), -1 otherwise.
+ */
+static inline int branch_id_to_idx(struct super_block *sb, int id)
+{
+   int i;
+   for (i = 0; i < sbmax(sb); i++) {
+       if (branch_id(sb, i) == id)
+           return i;
+   }
+   /* in the non-ODF code, this should really never happen */
+   printk(KERN_WARNING "unionfs: cannot find branch with id %d\n", id);
+   return -1;
+}
+
+/* File to lower file. */
+static inline struct file *unionfs_lower_file(const struct file *f)
+{
+   BUG_ON(!f);
+   return UNIONFS_F(f)->lower_files[fbstart(f)];
+}
+
+static inline struct file *unionfs_lower_file_idx(const struct file *f,
+ int index)
+{
+   BUG_ON(!f || index < 0);
+   return UNIONFS_F(f)->lower_files[index];
+}
+
+static inline void unionfs_set_lower_file_idx(struct file *f, int index,
+ struct file *val)
+{
+   BUG_ON(!f || index < 0);
+   UNIONFS_F(f)->lower_files[index] = val;
+   /* save branch ID (may be redundant?) */
+   UNIONFS_F(f)->saved_branch_ids[index] =
+   branch_id((f)->f_path.dentry->d_sb, index);
+}
+
+static inline void unionfs_set_lower_file(struct file *f, struct file *val)
+{
+   BUG_ON(!f);
+   unionfs_set_lower_file_idx((f), fbstart(f), (val));
+}
+
+/* Inode to lower inode. */
+static inline struct inode *unionfs_lower_inode(const struct inode *i)
+{
+   BUG_ON(!i);
+   return UNIONFS_I(i)->lower_inodes[ibstart(i)];
+}
+
+static inline struct inode *unionfs_lower_inode_idx(const struct inode *i,
+   int index)
+{
+   BUG_ON(!i || index < 0);
+   return UNIONFS_I(i)->lower_inodes[index];
+}
+
+static inline void unionfs_set_lower_inode_idx(struct inode *i
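
Illustration (not part of the patch): branch_id_to_idx() above maps a stable
branch ID back to its current array index, because indices can shift when
branches are added or deleted.  A minimal userspace sketch of that lookup,
assuming a plain array in place of the superblock's per-branch data; the
struct and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-branch data kept in the superblock. */
struct branch_sketch {
	int branch_id;
};

/* Return the index of the branch whose ID matches, or -1 if none does. */
static int branch_id_to_idx_sketch(const struct branch_sketch *branches,
				   int nbranches, int id)
{
	int i;

	assert(branches != NULL && nbranches >= 0);
	for (i = 0; i < nbranches; i++) {
		if (branches[i].branch_id == id)
			return i;
	}
	return -1;
}
```

An open file that saved branch ID 3 while it lived at index 0 can find it
again at index 1 after a new branch is inserted in front of it.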

[PATCH 23/29] Unionfs: miscellaneous helper routines

2008-01-10 Thread Erez Zadok

Mostly related to whiteouts.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/subr.c |  242 +
 1 files changed, 242 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/subr.c

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
new file mode 100644
index 000..0a0fce9
--- /dev/null
+++ b/fs/unionfs/subr.c
@@ -0,0 +1,242 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Pass a unionfs dentry and a starting branch index.  It will try to create
+ * a whiteout for the filename in dentry in branch 'start'.  On error, it
+ * will proceed to a branch to the left.
+ */
+int create_whiteout(struct dentry *dentry, int start)
+{
+   int bstart, bend, bindex;
+   struct dentry *lower_dir_dentry;
+   struct dentry *lower_dentry;
+   struct dentry *lower_wh_dentry;
+   struct nameidata nd;
+   char *name = NULL;
+   int err = -EINVAL;
+
+   verify_locked(dentry);
+
+   bstart = dbstart(dentry);
+   bend = dbend(dentry);
+
+   /* create dentry's whiteout equivalent */
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
+
+   for (bindex = start; bindex >= 0; bindex--) {
+       lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+
+       if (!lower_dentry) {
+           /*
+            * if lower dentry is not present, create the
+            * entire lower dentry directory structure and go
+            * ahead.  Since we want to just create whiteout, we
+            * only want the parent dentry, and hence get rid of
+            * this dentry.
+            */
+           lower_dentry = create_parents(dentry->d_inode,
+                                         dentry,
+                                         dentry->d_name.name,
+                                         bindex);
+           if (!lower_dentry || IS_ERR(lower_dentry)) {
+               int ret = PTR_ERR(lower_dentry);
+               if (!IS_COPYUP_ERR(ret))
+                   printk(KERN_ERR
+                          "unionfs: create_parents for "
+                          "whiteout failed: bindex=%d "
+                          "err=%d\n", bindex, ret);
+               continue;
+           }
+       }
+
+       lower_wh_dentry =
+           lookup_one_len(name, lower_dentry->d_parent,
+                          dentry->d_name.len + UNIONFS_WHLEN);
+       if (IS_ERR(lower_wh_dentry))
+           continue;
+
+       /*
+        * The whiteout already exists. This used to be impossible,
+        * but now is possible because of opaqueness.
+        */
+       if (lower_wh_dentry->d_inode) {
+           dput(lower_wh_dentry);
+           err = 0;
+           goto out;
+       }
+
+       err = init_lower_nd(&nd, LOOKUP_CREATE);
+       if (unlikely(err < 0))
+           goto out;
+       lower_dir_dentry = lock_parent_wh(lower_wh_dentry);
+       err = is_robranch_super(dentry->d_sb, bindex);
+       if (!err)
+           err = vfs_create(lower_dir_dentry->d_inode,
+                            lower_wh_dentry,
+                            ~current->fs->umask & S_IRWXUGO,
+                            &nd);
+       unlock_dir(lower_dir_dentry);
+       dput(lower_wh_dentry);
+       release_lower_nd(&nd, err);
+
+       if (!err || !IS_COPYUP_ERR(err))
+           break;
+   }
+
+   /* set dbopaque so that lookup will not proceed after this branch */
+   if (!err)
+   set_dbopa
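
Illustration (not part of the patch): the loop in create_whiteout() starts at
branch 'start' and walks left toward branch 0, stopping on the first success
or on the first error that is not a copyup-style error.  The sketch below
models only that control flow in userspace.  It assumes that "copyup error"
means -EROFS, which the patch text shown here does not confirm, and it
replaces the real VFS calls with a table of per-branch outcomes; all names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: a read-only branch is the only error worth retrying. */
static int is_retryable_sketch(int err)
{
	return err == -EROFS;
}

/*
 * branch_err[i] is the result that creating the whiteout on branch i would
 * give (0 on success, negative errno otherwise).  On success *chosen is set
 * to the branch that accepted the whiteout.
 */
static int create_leftward_sketch(const int *branch_err, int start,
				  int *chosen)
{
	int bindex;
	int err = -EINVAL;

	assert(branch_err != NULL && chosen != NULL);
	for (bindex = start; bindex >= 0; bindex--) {
		err = branch_err[bindex];
		if (!err) {
			*chosen = bindex;
			break;
		}
		if (!is_retryable_sketch(err))
			break;	/* hard error: do not try further left */
	}
	return err;
}
```

With branches {0: writable, 1: read-only, 2: read-only} and start = 2, the
sketch ends up on branch 0; a hard error such as -EIO on branch 1 would stop
the walk there instead.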

[PATCH 22/29] Unionfs: async I/O queue

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/sioq.c |  119 +
 fs/unionfs/sioq.h |   92 +
 2 files changed, 211 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/sioq.c
 create mode 100644 fs/unionfs/sioq.h

diff --git a/fs/unionfs/sioq.c b/fs/unionfs/sioq.c
new file mode 100644
index 000..2a8c88e
--- /dev/null
+++ b/fs/unionfs/sioq.c
@@ -0,0 +1,119 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006  Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006  Junjiro Okajima
+ * Copyright (c) 2006  David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Super-user IO work Queue - sometimes we need to perform actions which
+ * would fail due to the unix permissions on the parent directory (e.g.,
+ * rmdir a directory which appears empty, but in reality contains
+ * whiteouts).
+ */
+
+static struct workqueue_struct *superio_workqueue;
+
+int __init init_sioq(void)
+{
+   int err;
+
+   superio_workqueue = create_workqueue("unionfs_siod");
+   if (!IS_ERR(superio_workqueue))
+   return 0;
+
+   err = PTR_ERR(superio_workqueue);
+   printk(KERN_ERR "unionfs: create_workqueue failed %d\n", err);
+   superio_workqueue = NULL;
+   return err;
+}
+
+void stop_sioq(void)
+{
+   if (superio_workqueue)
+   destroy_workqueue(superio_workqueue);
+}
+
+void run_sioq(work_func_t func, struct sioq_args *args)
+{
+   INIT_WORK(&args->work, func);
+
+   init_completion(&args->comp);
+   while (!queue_work(superio_workqueue, &args->work)) {
+   /* TODO: do accounting if needed */
+   schedule();
+   }
+   wait_for_completion(&args->comp);
+}
+
+void __unionfs_create(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct create_args *c = &args->create;
+
+   args->err = vfs_create(c->parent, c->dentry, c->mode, c->nd);
+   complete(&args->comp);
+}
+
+void __unionfs_mkdir(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct mkdir_args *m = &args->mkdir;
+
+   args->err = vfs_mkdir(m->parent, m->dentry, m->mode);
+   complete(&args->comp);
+}
+
+void __unionfs_mknod(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct mknod_args *m = &args->mknod;
+
+   args->err = vfs_mknod(m->parent, m->dentry, m->mode, m->dev);
+   complete(&args->comp);
+}
+
+void __unionfs_symlink(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct symlink_args *s = &args->symlink;
+
+   args->err = vfs_symlink(s->parent, s->dentry, s->symbuf, s->mode);
+   complete(&args->comp);
+}
+
+void __unionfs_unlink(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct unlink_args *u = &args->unlink;
+
+   args->err = vfs_unlink(u->parent, u->dentry);
+   complete(&args->comp);
+}
+
+void __delete_whiteouts(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+   struct deletewh_args *d = &args->deletewh;
+
+   args->err = do_delete_whiteouts(d->dentry, d->bindex, d->namelist);
+   complete(&args->comp);
+}
+
+void __is_opaque_dir(struct work_struct *work)
+{
+   struct sioq_args *args = container_of(work, struct sioq_args, work);
+
+   args->ret = lookup_one_len(UNIONFS_DIR_OPAQUE, args->is_opaque.dentry,
+  sizeof(UNIONFS_DIR_OPAQUE) - 1);
+   complete(&args->comp);
+}
diff --git a/fs/unionfs/sioq.h b/fs/unionfs/sioq.h
new file mode 100644
index 000..afb71ee
--- /dev/null
+++ b/fs/unionfs/sioq.h
@@ -0,0 +1,92 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006  Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006  Junjiro Okajima
+ * Copyright (c) 2006  David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU
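
Illustration (not part of the patch): each sioq worker in sioq.c receives
only a pointer to the embedded work item and recovers the enclosing argument
structure with container_of().  The following userspace sketch shows that
pattern with a synchronous call in place of the workqueue and completion; it
does not model the privilege change or the waiting.  All names ending in
_sketch, and job_add(), are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct work_sketch {
	void (*func)(struct work_sketch *work);
};

struct add_args_sketch {
	int a;
	int b;
};

/* One argument block per request: work item, status, per-operation args. */
struct job_args_sketch {
	struct work_sketch work;
	int done;
	int result;
	union {
		struct add_args_sketch add;
	} u;
};

/* Worker: gets only the work pointer and recovers its arguments from it. */
static void job_add(struct work_sketch *work)
{
	struct job_args_sketch *args =
		container_of_sketch(work, struct job_args_sketch, work);

	args->result = args->u.add.a + args->u.add.b;
	args->done = 1;		/* the kernel code calls complete() here */
}

/* Runs the job immediately; the kernel code queues it and waits instead. */
static void run_job_sketch(void (*func)(struct work_sketch *),
			   struct job_args_sketch *args)
{
	assert(func != NULL && args != NULL);
	args->work.func = func;
	args->done = 0;
	args->work.func(&args->work);
}
```

Keeping the work item inside the argument block means the caller can place
the whole request on its own stack, as delete_whiteouts() does with its
struct sioq_args.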

[PATCH 08/29] Unionfs: basic file operations

2008-01-10 Thread Erez Zadok

Includes read, write, mmap, fsync, and fasync.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/file.c |  184 +
 1 files changed, 184 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/file.c

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
new file mode 100644
index 000..0c424f6
--- /dev/null
+++ b/fs/unionfs/file.c
@@ -0,0 +1,184 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+static int unionfs_file_readdir(struct file *file, void *dirent,
+   filldir_t filldir)
+{
+   return -ENOTDIR;
+}
+
+static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   int err = 0;
+   bool willwrite;
+   struct file *lower_file;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+
+   /* This might be deferred to mmap's writepage */
+   willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
+   err = unionfs_file_revalidate(file, willwrite);
+   if (unlikely(err))
+   goto out;
+   unionfs_check_file(file);
+
+   /*
+* File systems which do not implement ->writepage may use
+* generic_file_readonly_mmap as their ->mmap op.  If you call
+* generic_file_readonly_mmap with VM_WRITE, you'd get an -EINVAL.
+* But we cannot call the lower ->mmap op, so we can't tell that
+* writeable mappings won't work.  Therefore, our only choice is to
+* check if the lower file system supports the ->writepage, and if
+* not, return EINVAL (the same error that
+* generic_file_readonly_mmap returns in that case).
+*/
+   lower_file = unionfs_lower_file(file);
+   if (willwrite && !lower_file->f_mapping->a_ops->writepage) {
+       err = -EINVAL;
+       printk(KERN_ERR "unionfs: branch %d file system does not "
+              "support writeable mmap\n", fbstart(file));
+   } else {
+       err = generic_file_mmap(file, vma);
+       if (err)
+           printk(KERN_ERR
+                  "unionfs: generic_file_mmap failed %d\n", err);
+   }
+
+out:
+   if (!err) {
+   /* copyup could cause parent dir times to change */
+   unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
+   unionfs_check_file(file);
+   }
+   unionfs_read_unlock(file->f_path.dentry->d_sb);
+   return err;
+}
+
+int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
+{
+   int bindex, bstart, bend;
+   struct file *lower_file;
+   struct dentry *lower_dentry;
+   struct inode *lower_inode, *inode;
+   int err = -EINVAL;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
+   err = unionfs_file_revalidate(file, true);
+   if (unlikely(err))
+   goto out;
+   unionfs_check_file(file);
+
+   bstart = fbstart(file);
+   bend = fbend(file);
+   if (bstart < 0 || bend < 0)
+   goto out;
+
+   inode = dentry->d_inode;
+   if (unlikely(!inode)) {
+   printk(KERN_ERR
+  "unionfs: null lower inode in unionfs_fsync\n");
+   goto out;
+   }
+   for (bindex = bstart; bindex <= bend; bindex++) {
+       lower_inode = unionfs_lower_inode_idx(inode, bindex);
+       if (!lower_inode || !lower_inode->i_fop->fsync)
+           continue;
+       lower_file = unionfs_lower_file_idx(file, bindex);
+       lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+       mutex_lock(&lower_inode->i_mutex);
+       err = lower_inode->i_fop->fsync(lower_file,
+                                       lower_dentry,
+                                       datasync);
+       mutex_unlock(&lower_inode->i_mutex);
+       if (err)
+           goto out;
+   }
+
+   unionfs_copy_attr_times(inode);
+
+out:
+   unionfs
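
Illustration (not part of the patch): unionfs_mmap() computes willwrite with
the expression (flags | A | B) == flags, which is true only when both bits
are already set in flags.  The userspace sketch below compares that form with
the more common mask-and-compare form.  The flag values are made up for the
example and are not the kernel's VM_SHARED / VM_WRITE values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define SK_SHARED 0x1UL
#define SK_WRITE  0x2UL

/* Form used in the patch: OR-ing the bits in must not change the value. */
static bool all_set_or_form(unsigned long flags)
{
	return (flags | SK_SHARED | SK_WRITE) == flags;
}

/* Equivalent form: mask out the bits and compare against the mask. */
static bool all_set_and_form(unsigned long flags)
{
	return (flags & (SK_SHARED | SK_WRITE)) == (SK_SHARED | SK_WRITE);
}

/* True only for a shared and writable mapping; checks both forms agree. */
static bool wants_shared_write(unsigned long flags)
{
	bool r = all_set_or_form(flags);

	assert(r == all_set_and_form(flags));
	return r;
}
```

A private writable mapping (only the write bit set) therefore does not count
as "will write" for the purposes of this check.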

[PATCH 04/29] Unionfs: main Makefile

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/Makefile |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/Makefile

diff --git a/fs/unionfs/Makefile b/fs/unionfs/Makefile
new file mode 100644
index 000..17ca4a7
--- /dev/null
+++ b/fs/unionfs/Makefile
@@ -0,0 +1,13 @@
+obj-$(CONFIG_UNION_FS) += unionfs.o
+
+unionfs-y := subr.o dentry.o file.o inode.o main.o super.o \
+   rdstate.o copyup.o dirhelper.o rename.o unlink.o \
+   lookup.o commonfops.o dirfops.o sioq.o mmap.o
+
+unionfs-$(CONFIG_UNION_FS_XATTR) += xattr.o
+
+unionfs-$(CONFIG_UNION_FS_DEBUG) += debug.o
+
+ifeq ($(CONFIG_UNION_FS_DEBUG),y)
+EXTRA_CFLAGS += -DDEBUG
+endif
-- 
1.5.2.2


[PATCH 15/29] Unionfs: readdir state helpers

2008-01-10 Thread Erez Zadok

Includes duplicate name elimination and whiteout-handling code.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/rdstate.c |  285 ++
 1 files changed, 285 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/rdstate.c

diff --git a/fs/unionfs/rdstate.c b/fs/unionfs/rdstate.c
new file mode 100644
index 000..7ba1e1a
--- /dev/null
+++ b/fs/unionfs/rdstate.c
@@ -0,0 +1,285 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* This file contains the routines for maintaining readdir state. */
+
+/*
+ * There are two structures here, rdstate which is a hash table
+ * of the second structure which is a filldir_node.
+ */
+
+/*
+ * This is a struct kmem_cache for filldir nodes, because we allocate a lot
+ * of them and they shouldn't waste memory.  If the node has a small name
+ * (as defined by the dentry structure), then we use an inline name to
+ * preserve kmalloc space.
+ */
+static struct kmem_cache *unionfs_filldir_cachep;
+
+int unionfs_init_filldir_cache(void)
+{
+   unionfs_filldir_cachep =
+   kmem_cache_create("unionfs_filldir",
+ sizeof(struct filldir_node), 0,
+ SLAB_RECLAIM_ACCOUNT, NULL);
+
+   return (unionfs_filldir_cachep ? 0 : -ENOMEM);
+}
+
+void unionfs_destroy_filldir_cache(void)
+{
+   if (unionfs_filldir_cachep)
+   kmem_cache_destroy(unionfs_filldir_cachep);
+}
+
+/*
+ * This is a tuning parameter that tells us roughly how big to make the
+ * hash table in directory entries per page.  This isn't perfect, but
+ * at least we get a hash table size that shouldn't be too overloaded.
+ * The following averages are based on my home directory.
+ * 14.44693  Overall
+ * 12.29   Single Page Directories
+ * 117.93  Multi-page directories
+ */
+#define DENTPAGE 4096
+#define DENTPERONEPAGE 12
+#define DENTPERPAGE 118
+#define MINHASHSIZE 1
+static int guesstimate_hash_size(struct inode *inode)
+{
+   struct inode *lower_inode;
+   int bindex;
+   int hashsize = MINHASHSIZE;
+
+   if (UNIONFS_I(inode)->hashsize > 0)
+   return UNIONFS_I(inode)->hashsize;
+
+   for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+       lower_inode = unionfs_lower_inode_idx(inode, bindex);
+       if (!lower_inode)
+           continue;
+
+       if (i_size_read(lower_inode) == DENTPAGE)
+           hashsize += DENTPERONEPAGE;
+       else
+           hashsize += (i_size_read(lower_inode) / DENTPAGE) *
+               DENTPERPAGE;
+   }
+
+   return hashsize;
+}
+
+int init_rdstate(struct file *file)
+{
+   BUG_ON(sizeof(loff_t) !=
+  (sizeof(unsigned int) + sizeof(unsigned int)));
+   BUG_ON(UNIONFS_F(file)->rdstate != NULL);
+
+   UNIONFS_F(file)->rdstate = alloc_rdstate(file->f_path.dentry->d_inode,
+fbstart(file));
+
+   return (UNIONFS_F(file)->rdstate ? 0 : -ENOMEM);
+}
+
+struct unionfs_dir_state *find_rdstate(struct inode *inode, loff_t fpos)
+{
+   struct unionfs_dir_state *rdstate = NULL;
+   struct list_head *pos;
+
+   spin_lock(&UNIONFS_I(inode)->rdlock);
+   list_for_each(pos, &UNIONFS_I(inode)->readdircache) {
+       struct unionfs_dir_state *r =
+           list_entry(pos, struct unionfs_dir_state, cache);
+       if (fpos == rdstate2offset(r)) {
+           UNIONFS_I(inode)->rdcount--;
+           list_del(&r->cache);
+           rdstate = r;
+           break;
+       }
+   }
+   spin_unlock(&UNIONFS_I(inode)->rdlock);
+   return rdstate;
+}
+
+struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex)
+{
+   int i = 0;
+   int hashsize;
+   unsigned long mallocsize = sizeof(struct unionfs_dir_state);
+   struct unionfs_dir_state *rdstate;
+
+   hashsize = guesstimate_hash_size(inode);
+   mallocsize += hashsize * sizeof(struct list_head);
+   mallocsize
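
Illustration (not part of the patch): guesstimate_hash_size() above sums a
per-branch estimate of directory entries, using 12 entries for a directory
that is exactly one page and 118 entries per page otherwise.  A minimal
userspace sketch of the same arithmetic, taking an array of lower directory
sizes in bytes in place of the lower inodes; guess_hash_size() is a
hypothetical name and the cached-hashsize shortcut is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Same tuning constants as in the patch. */
#define DENTPAGE 4096
#define DENTPERONEPAGE 12
#define DENTPERPAGE 118
#define MINHASHSIZE 1

/* dir_sizes[i] is the byte size of the directory on lower branch i. */
static long guess_hash_size(const long long *dir_sizes, size_t n)
{
	long hashsize = MINHASHSIZE;
	size_t i;

	assert(dir_sizes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (dir_sizes[i] == DENTPAGE)
			hashsize += DENTPERONEPAGE;
		else
			hashsize += (long)(dir_sizes[i] / DENTPAGE) *
				DENTPERPAGE;
	}
	return hashsize;
}
```

For example, one 4096-byte branch plus one 8192-byte branch gives
1 + 12 + 2 * 118 = 249 buckets, while a directory smaller than a page
contributes nothing beyond the minimum.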

[PATCH 29/29] Put Unionfs and eCryptfs under one layered filesystems menu

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/Kconfig |   53 +
 1 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 487236c..55a78b7 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1041,6 +1041,47 @@ config CONFIGFS_FS
 
 endmenu
 
+menu "Layered filesystems"
+
+config ECRYPT_FS
+   tristate "eCrypt filesystem layer support (EXPERIMENTAL)"
+   depends on EXPERIMENTAL && KEYS && CRYPTO && NET
+   help
+ Encrypted filesystem that operates on the VFS layer.  See
+  to learn more about
+ eCryptfs.  Userspace components are required and can be
+ obtained from <http://ecryptfs.sf.net>.
+
+ To compile this file system support as a module, choose M here: the
+ module will be called ecryptfs.
+
+config UNION_FS
+   tristate "Union file system (EXPERIMENTAL)"
+   depends on EXPERIMENTAL
+   help
+ Unionfs is a stackable unification file system, which appears to
+ merge the contents of several directories (branches), while keeping
+ their physical content separate.
+
+ See <http://unionfs.filesystems.org> for details
+
+config UNION_FS_XATTR
+   bool "Unionfs extended attributes"
+   depends on UNION_FS
+   help
+ Extended attributes are name:value pairs associated with inodes by
+ the kernel or by users (see the attr(5) manual page).
+
+ If unsure, say N.
+
+config UNION_FS_DEBUG
+   bool "Debug Unionfs"
+   depends on UNION_FS
+   help
+ If you say Y here, you can turn on debugging output from Unionfs.
+
+endmenu
+
 menu "Miscellaneous filesystems"
 
 config ADFS_FS
@@ -1093,18 +1134,6 @@ config AFFS_FS
  To compile this file system support as a module, choose M here: the
  module will be called affs.  If unsure, say N.
 
-config ECRYPT_FS
-   tristate "eCrypt filesystem layer support (EXPERIMENTAL)"
-   depends on EXPERIMENTAL && KEYS && CRYPTO && NET
-   help
- Encrypted filesystem that operates on the VFS layer.  See
-  to learn more about
- eCryptfs.  Userspace components are required and can be
- obtained from <http://ecryptfs.sf.net>.
-
- To compile this file system support as a module, choose M here: the
- module will be called ecryptfs.
-
 config HFS_FS
tristate "Apple Macintosh file system support (EXPERIMENTAL)"
depends on BLOCK && EXPERIMENTAL
-- 
1.5.2.2


[PATCH 01/29] Unionfs: documentation

2008-01-10 Thread Erez Zadok

Includes index files, MAINTAINERS, and documentation on general concepts,
usage, issues, and renaming operations.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/00-INDEX |2 +
 Documentation/filesystems/unionfs/00-INDEX |   10 +
 Documentation/filesystems/unionfs/concepts.txt |  213 
 Documentation/filesystems/unionfs/issues.txt   |   28 +++
 Documentation/filesystems/unionfs/rename.txt   |   31 
 Documentation/filesystems/unionfs/usage.txt|  134 +++
 MAINTAINERS|9 +
 7 files changed, 427 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/filesystems/unionfs/00-INDEX
 create mode 100644 Documentation/filesystems/unionfs/concepts.txt
 create mode 100644 Documentation/filesystems/unionfs/issues.txt
 create mode 100644 Documentation/filesystems/unionfs/rename.txt
 create mode 100644 Documentation/filesystems/unionfs/usage.txt

diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 1de155e..b168331 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -96,6 +96,8 @@ udf.txt
- info and mount options for the UDF filesystem.
 ufs.txt
- info on the ufs filesystem.
+unionfs/
+   - info on the unionfs filesystem
 vfat.txt
- info on using the VFAT filesystem used in Windows NT and Windows 95
 vfs.txt
diff --git a/Documentation/filesystems/unionfs/00-INDEX b/Documentation/filesystems/unionfs/00-INDEX
new file mode 100644
index 000..96fdf67
--- /dev/null
+++ b/Documentation/filesystems/unionfs/00-INDEX
@@ -0,0 +1,10 @@
+00-INDEX
+   - this file.
+concepts.txt
+   - A brief introduction of concepts.
+issues.txt
+   - A summary of known issues with unionfs.
+rename.txt
+   - Information regarding rename operations.
+usage.txt
+   - Usage information and examples.
diff --git a/Documentation/filesystems/unionfs/concepts.txt b/Documentation/filesystems/unionfs/concepts.txt
new file mode 100644
index 000..bed69bd
--- /dev/null
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -0,0 +1,213 @@
+Unionfs 2.x CONCEPTS:
+=====================
+
+This file describes the concepts needed by a namespace unification file
+system.
+
+
+Branch Priority:
+================
+
+Each branch is assigned a unique priority - starting from 0 (highest
+priority).  No two branches can have the same priority.
+
+
+Branch Mode:
+============
+
+Each branch is assigned a mode - read-write or read-only. This allows
+directories on media mounted read-write to be used in a read-only manner.
+
+
+Whiteouts:
+==========
+
+A whiteout removes a file name from the namespace. Whiteouts are needed when
+one attempts to remove a file on a read-only branch.
+
+Suppose we have a two-branch union, where branch 0 is read-write and branch
+1 is read-only, and a file 'foo' exists on branch 1:
+
+./b0/
+./b1/
+./b1/foo
+
+The unified view would simply be:
+
+./union/
+./union/foo
+
+Since 'foo' is stored on a read-only branch, it cannot be removed. A
+whiteout is used to remove the name 'foo' from the unified namespace. Again,
+since branch 1 is read-only, the whiteout cannot be created there. So, we
+try on a higher priority (lower numerically) branch and create the whiteout
+there.
+
+./b0/
+./b0/.wh.foo
+./b1/
+./b1/foo
+
+Later, when Unionfs traverses branches (due to lookup or readdir), it
+eliminates 'foo' from the namespace (as well as the whiteout itself).
+
+
+Duplicate Elimination:
+======================
+
+It is possible for files on different branches to have the same name.
+Unionfs then has to select which instance of the file to show to the user.
+Given the fact that each branch has a priority associated with it, the
+simplest solution is to take the instance from the highest priority
+(numerically lowest value) and "hide" the others.
+
+
+Copyup:
+=======
+
+When a change is made to a file's data or meta-data, the change has to be
+stored somewhere.  The best way is to create a copy of the original file on
+a branch that is writable, and then redirect the write through to this copy.
+The copy must be made on a higher priority branch so that lookup and readdir
+return this newer "version" of the file rather than the original (see
+duplicate elimination).
+
+An entire unionfs mount can be read-only or read-write.  If it's read-only,
+then none of the branches will be written to, even if some of the branches
+are physically writeable.  If the unionfs mount is read-write, then the
+leftmost (highest priority) branch must be writeable (for copyup to take
+place); the remaining branches can be any mix of read-write and read-only.
+
+In a writeable mount, unionfs will create new files/dir in the leftmost
+branch.  If one tries to modify a file in a read-only branch/media, unionfs
+will copyup the fi

[PATCH 03/29] Makefile: hook to compile unionfs

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/Makefile |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/Makefile b/fs/Makefile
index 500cf15..e202288 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -118,3 +118,4 @@ obj-$(CONFIG_HPPFS) += hppfs/
 obj-$(CONFIG_DEBUG_FS) += debugfs/
 obj-$(CONFIG_OCFS2_FS) += ocfs2/
 obj-$(CONFIG_GFS2_FS)   += gfs2/
+obj-$(CONFIG_UNION_FS) += unionfs/
-- 
1.5.2.2


[PATCH 17/29] Unionfs: unlink/rmdir operations

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/unlink.c |  251 +++
 1 files changed, 251 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/unlink.c

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
new file mode 100644
index 000..1e370a1
--- /dev/null
+++ b/fs/unionfs/unlink.c
@@ -0,0 +1,251 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/* unlink a file by creating a whiteout */
+static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
+{
+   struct dentry *lower_dentry;
+   struct dentry *lower_dir_dentry;
+   int bindex;
+   int err = 0;
+
+   err = unionfs_partial_lookup(dentry);
+   if (err)
+   goto out;
+
+   bindex = dbstart(dentry);
+
+   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+   if (!lower_dentry)
+   goto out;
+
+   lower_dir_dentry = lock_parent(lower_dentry);
+
+   /* avoid destroying the lower inode if the file is in use */
+   dget(lower_dentry);
+   err = is_robranch_super(dentry->d_sb, bindex);
+   if (!err) {
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
+   err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+   lockdep_on();
+   }
+   /* if vfs_unlink succeeded, update our inode's times */
+   if (!err)
+   unionfs_copy_attr_times(dentry->d_inode);
+   dput(lower_dentry);
+   fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
+   unlock_dir(lower_dir_dentry);
+
+   if (err && !IS_COPYUP_ERR(err))
+   goto out;
+
+   /*
+* We create whiteouts if (1) there was an error unlinking the main
+* file; (2) there is a lower priority file with the same name
+* (dbopaque); (3) the branch in which the file resides is not the
+* last (rightmost) branch.  The last rule is an optimization to avoid
+* creating all those whiteouts if there's no chance they'd be
+* masking any lower-priority branch, as well as when unionfs is used
+* with only one branch (using only one branch, while odd, is still
+* possible).
+*/
+   if (err) {
+   if (dbstart(dentry) == 0)
+   goto out;
+   err = create_whiteout(dentry, dbstart(dentry) - 1);
+   } else if (dbopaque(dentry) != -1) {
+   err = create_whiteout(dentry, dbopaque(dentry));
+   } else if (dbstart(dentry) < sbend(dentry->d_sb)) {
+   err = create_whiteout(dentry, dbstart(dentry));
+   }
+
+out:
+   if (!err)
+   inode_dec_link_count(dentry->d_inode);
+
+   /* We don't want to leave negative leftover dentries for revalidate. */
+   if (!err && (dbopaque(dentry) != -1))
+   update_bstart(dentry);
+
+   return err;
+}
+
+int unionfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+   int err = 0;
+   struct inode *inode = dentry->d_inode;
+
+   BUG_ON(S_ISDIR(inode->i_mode));
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
+   err = -ESTALE;
+   goto out;
+   }
+   unionfs_check_dentry(dentry);
+
+   err = unionfs_unlink_whiteout(dir, dentry);
+   /* call d_drop so the system "forgets" about us */
+   if (!err) {
+   unionfs_postcopyup_release(dentry);
+   if (inode->i_nlink == 0) {
+   /* drop lower inodes */
+   iput(unionfs_lower_inode(inode));
+   unionfs_set_lower_inode(inode, NULL);
+   ibstart(inode) = ibend(inode) = -1;
+   }
+   d_drop(dentry);
+   /*
+* if unlink/whiteout succeeded, parent dir mtime has
+* changed
+*/
+   unionfs_copy_attr_times(dir);
+   }
+
+out:
+   if (!err) {
+   unionfs_check

[PATCH 09/29] Unionfs: lower-level copyup routines

2008-01-10 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/copyup.c |  899 +++
 1 files changed, 899 insertions(+), 0 deletions(-)
 create mode 100644 fs/unionfs/copyup.c

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
new file mode 100644
index 000..16b2c7c
--- /dev/null
+++ b/fs/unionfs/copyup.c
@@ -0,0 +1,899 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005  Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003  Puja Gupta
+ * Copyright (c) 2003  Harikesavan Krishnan
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * For detailed explanation of copyup see:
+ * Documentation/filesystems/unionfs/concepts.txt
+ */
+
+#ifdef CONFIG_UNION_FS_XATTR
+/* copyup all extended attrs for a given dentry */
+static int copyup_xattrs(struct dentry *old_lower_dentry,
+struct dentry *new_lower_dentry)
+{
+   int err = 0;
+   ssize_t list_size = -1;
+   char *name_list = NULL;
+   char *attr_value = NULL;
+   char *name_list_buf = NULL;
+
+   /* query the actual size of the xattr list */
+   list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
+   if (list_size <= 0) {
+   err = list_size;
+   goto out;
+   }
+
+   /* allocate space for the actual list */
+   name_list = unionfs_xattr_alloc(list_size + 1, XATTR_LIST_MAX);
+   if (unlikely(!name_list || IS_ERR(name_list))) {
+   err = PTR_ERR(name_list);
+   goto out;
+   }
+
+   name_list_buf = name_list; /* save for kfree at end */
+
+   /* now get the actual xattr list of the source file */
+   list_size = vfs_listxattr(old_lower_dentry, name_list, list_size);
+   if (list_size <= 0) {
+   err = list_size;
+   goto out;
+   }
+
+   /* allocate space to hold each xattr's value */
+   attr_value = unionfs_xattr_alloc(XATTR_SIZE_MAX, XATTR_SIZE_MAX);
+   if (unlikely(!attr_value || IS_ERR(attr_value))) {
+   err = PTR_ERR(attr_value);
+   goto out;
+   }
+
+   /* in a loop, get and set each xattr from src to dst file */
+   while (*name_list) {
+   ssize_t size;
+
+   /* Lock here since vfs_getxattr doesn't lock for us */
+   mutex_lock(&old_lower_dentry->d_inode->i_mutex);
+   size = vfs_getxattr(old_lower_dentry, name_list,
+   attr_value, XATTR_SIZE_MAX);
+   mutex_unlock(&old_lower_dentry->d_inode->i_mutex);
+   if (size < 0) {
+   err = size;
+   goto out;
+   }
+   if (size > XATTR_SIZE_MAX) {
+   err = -E2BIG;
+   goto out;
+   }
+   /* Don't lock here since vfs_setxattr does it for us. */
+   err = vfs_setxattr(new_lower_dentry, name_list, attr_value,
+  size, 0);
+   /*
+* Selinux depends on "security.*" xattrs, so to maintain
+* the security of copied-up files, if Selinux is active,
+* then we must copy these xattrs as well.  So we need to
+* temporarily get FOWNER privileges.
+* XXX: move entire copyup code to SIOQ.
+*/
+   if (err == -EPERM && !capable(CAP_FOWNER)) {
+   cap_raise(current->cap_effective, CAP_FOWNER);
+   err = vfs_setxattr(new_lower_dentry, name_list,
+  attr_value, size, 0);
+   cap_lower(current->cap_effective, CAP_FOWNER);
+   }
+   if (err < 0)
+   goto out;
+   name_list += strlen(name_list) + 1;
+   }
+out:
+   unionfs_xattr_kfree(name_list_buf);
+   unionfs_xattr_kfree(attr_value);
+   /* Ignore if xattr isn't supported */
+   if (err == -ENOTSUPP || err == -EOPNOTSUPP)
+   err = 0;
+   return err;
+}
+#endif /* CONFIG_UNION_FS_XATTR */
+
+/*
+ * Determine the mode based on the copyup flags, and the existing dentry.
+ *
+ * Handle file systems which may not support certain options.  For example
+ * jffs2 doesn't allow one to chmod a sy

[PATCH 27/29] VFS path get/put ops used by Unionfs

2008-01-10 Thread Erez Zadok

Note: this will become obsolete once similar patches, now in -mm, make it to
mainline.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 include/linux/namei.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/namei.h b/include/linux/namei.h
index 4cb4f8d..63f16d9 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 struct vfsmount;
 
@@ -100,4 +101,16 @@ static inline char *nd_get_link(struct nameidata *nd)
return nd->saved_names[nd->depth];
 }
 
+static inline void pathget(struct path *path)
+{
+   mntget(path->mnt);
+   dget(path->dentry);
+}
+
+static inline void pathput(struct path *path)
+{
+   dput(path->dentry);
+   mntput(path->mnt);
+}
+
 #endif /* _LINUX_NAMEI_H */
-- 
1.5.2.2


[PATCH 02/29] VFS/eCryptfs: use simplified fs_stack API to fsstack_copy_attr_all

2008-01-10 Thread Erez Zadok

Acked-by: Mike Halcrow <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/ecryptfs/dentry.c |2 +-
 fs/ecryptfs/inode.c  |6 +++---
 fs/ecryptfs/main.c   |2 +-
 fs/stack.c   |   38 --
 include/linux/fs_stack.h |   21 -
 5 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/fs/ecryptfs/dentry.c b/fs/ecryptfs/dentry.c
index cb20b96..a8c1686 100644
--- a/fs/ecryptfs/dentry.c
+++ b/fs/ecryptfs/dentry.c
@@ -62,7 +62,7 @@ static int ecryptfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
struct inode *lower_inode =
ecryptfs_inode_to_lower(dentry->d_inode);
 
-   fsstack_copy_attr_all(dentry->d_inode, lower_inode, NULL);
+   fsstack_copy_attr_all(dentry->d_inode, lower_inode);
}
 out:
return rc;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 5a71918..89e8560 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -576,9 +576,9 @@ ecryptfs_rename(struct inode *old_dir, struct dentry *old_dentry,
lower_new_dir_dentry->d_inode, lower_new_dentry);
if (rc)
goto out_lock;
-   fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode, NULL);
+   fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode);
if (new_dir != old_dir)
-   fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode, NULL);
+   fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode);
 out_lock:
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
dput(lower_new_dentry->d_parent);
@@ -912,7 +912,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct iattr *ia)
 
rc = notify_change(lower_dentry, ia);
 out:
-   fsstack_copy_attr_all(inode, lower_inode, NULL);
+   fsstack_copy_attr_all(inode, lower_inode);
return rc;
 }
 
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index e5580bc..6276cdf 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -211,7 +211,7 @@ int ecryptfs_interpose(struct dentry *lower_dentry, struct dentry *dentry,
d_add(dentry, inode);
else
d_instantiate(dentry, inode);
-   fsstack_copy_attr_all(inode, lower_inode, NULL);
+   fsstack_copy_attr_all(inode, lower_inode);
/* This size will be overwritten for real files w/ headers and
 * other metadata */
fsstack_copy_inode_size(inode, lower_inode);
diff --git a/fs/stack.c b/fs/stack.c
index 67716f6..4336f2b 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -1,24 +1,42 @@
+/*
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
 #include 
 #include 
 #include 
 
-/* does _NOT_ require i_mutex to be held.
+/*
+ * does _NOT_ require i_mutex to be held.
  *
  * This function cannot be inlined since i_size_{read,write} is rather
  * heavy-weight on 32-bit systems
  */
 void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
 {
-   i_size_write(dst, i_size_read((struct inode *)src));
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_lock(&dst->i_lock);
+#endif
+   i_size_write(dst, i_size_read(src));
dst->i_blocks = src->i_blocks;
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_unlock(&dst->i_lock);
+#endif
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
 
-/* copy all attributes; get_nlinks is optional way to override the i_nlink
+/*
+ * copy all attributes; get_nlinks is optional way to override the i_nlink
  * copying
  */
-void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
-   int (*get_nlinks)(struct inode *))
+void fsstack_copy_attr_all(struct inode *dest, const struct inode *src)
 {
dest->i_mode = src->i_mode;
dest->i_uid = src->i_uid;
@@ -29,14 +47,6 @@ void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
dest->i_ctime = src->i_ctime;
dest->i_blkbits = src->i_blkbits;
dest->i_flags = src->i_flags;
-
-   /*
-* Update the nlinks AFTER updating the above fields, because the
-* get_links callback may depend on them.
-*/
-   if (!get_nlinks)
-   dest->i_nlink = src->i_nlink;
-   else
-   dest->i_nlink = (*get_nlinks)(dest);
+   dest->i_nlink = src->i_nlink;
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_attr_all);
diff --git a/include/linux

[UNIONFS] 00/29 Unionfs and related patches pre-merge review (v2)

2008-01-10 Thread Erez Zadok

X files to indicate that file XXX has been whited-out.
This works well on many file systems, but it tends to clutter lower
branches with these .wh.* files.  We recently optimized our whiteout
creation algorithm so it minimizes the number of conditions in which
whiteouts are created, and that helped some people a lot.  But still, if
you unify a readonly and writeable branch, and you try to delete a file
from the readonly branch/medium, there's no way to avoid creating some
sort of a whiteout.  BTW, of course, these whiteouts are completely
hidden from the view of the user who accesses files/dirs via the union.

In the long run, we really hope to see native whiteout support in Linux (a la
BSD).  Of course, this would require a change to the VFS and several native
file systems (possibly even a change to the on-disk format), so we realize
that this isn't likely to happen soon.  If/when native whiteout support becomes
available, unionfs could easily use it.  Until that time, we have lots of
users who want to use unionfs on top of numerous different file systems, and
so we have to do the next best thing wrt whiteouts.

This is a good point to mention that the version of unionfs in -mm is 2.2.x.
We have been working on a newer and still experimental version of Unionfs,
called "Unionfs with On-Disk Format" or Unionfs-ODF.  Unionfs-ODF uses a
small persistent store (e.g., a small ext2 partition) to store whiteouts in,
among other info; this moves the union-level meta-data (e.g., whiteouts),
outside the lower file systems, and thus eliminates the need to create .wh.*
files.  Unionfs-ODF has other useful benefits, and you can get more detail
about it here: <http://www.filesystems.org/unionfs-odf.txt>.  We recently
sync'ed up our unionfs 2.2 and unionfs-odf releases and we're tracking
Linus's tree for both.  IOW, every fix and user-visible feature that has
gone into unionfs in -mm, is now also in unionfs-odf.  Our intent is to
continue to develop both versions, and gradually move features from
unionfs-odf into unionfs 2.2; this would be possible even if/after
unionfs-2.2 gets merged, because the changes will all be internal to the
implementation, and users won't need to change the way they, say, mount a
union or manipulate its branches.

(4) branch management.  One of the most useful features of unioning is to be
able to add/remove branches from the union.  We used to do this via
ioctl's, which was considered racy, unclean, and non-atomic (only one
branch-manipulation operation at a time).  We now do that via the
remount interface, and allow users to pass multiple branch-manipulation
commands, which are handled as one action.
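
As a rough illustration of the whiteout and branch-priority rules discussed
in (3) above, here is a small userspace sketch in C.  It is not unionfs code:
struct branch, branch_has(), and union_lookup() are made-up names, branches
are modeled as plain arrays of names, and the sketch assumes that a
".wh.<name>" entry also hides a same-named file in its own branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define WH_PREFIX ".wh."	/* whiteout naming convention: .wh.<name> */

/* Hypothetical model of one branch: just the list of names it contains. */
struct branch {
	const char *const *names;
	size_t count;
};

/* Return 1 if the branch contains an entry called "name", else 0. */
static int branch_has(const struct branch *b, const char *name)
{
	size_t i;

	for (i = 0; i < b->count; i++)
		if (strcmp(b->names[i], name) == 0)
			return 1;
	return 0;
}

/*
 * Return the index of the branch that provides "name" in the unified view,
 * or -1 if the name is not visible.  Branches are scanned from index 0
 * (highest priority) upward, so the first match wins and lower-priority
 * duplicates are hidden.  A whiteout stops the scan at that branch.
 */
int union_lookup(const struct branch *branches, size_t nbranches,
		 const char *name)
{
	char wh[256];
	size_t i;
	int n;

	assert(branches != NULL && name != NULL);

	n = snprintf(wh, sizeof(wh), WH_PREFIX "%s", name);
	if (n < 0 || (size_t)n >= sizeof(wh))
		return -1;	/* name too long for this sketch */

	for (i = 0; i < nbranches; i++) {
		if (branch_has(&branches[i], wh))
			return -1;	/* whited out here and below */
		if (branch_has(&branches[i], name))
			return (int)i;	/* highest-priority instance */
	}
	return -1;
}
```

With branch 0 holding ".wh.foo" and "bar", and branch 1 holding "foo", "bar",
and "baz", a lookup of "foo" returns -1 (whited out), "bar" resolves to
branch 0 (the duplicate in branch 1 is hidden), and "baz" resolves to
branch 1.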


* GENERAL

I should note that my philosophy in developing any stackable file system has
been to minimize changes to the VFS, and to not change any lower file system
whatsoever: that ensures that unionfs can't affect the stability or
performance of the rest of the kernel.  Still, some of the things unionfs
does could possibly be done more cleanly and easily at the VFS level (e.g.,
better hooks for cache coherency).

Unionfs 2.2.x is currently maintained on 2.6.9 and all major kernels since
2.6.18, all the way to Linus's latest 2.6.24-rc tree and -mm.  We've got a
lot of users who use unionfs in more creative ways than even we could think of,
and this has helped us find the RIGHT set of features to please the users,
as well as stabilize the code.  Before every new release, we test the new
code on all versions using ltp-full, parallel compiles, and our own
unionfs-aware regression suite which exercises unionfs's unique features
(e.g., copy-up).

I therefore believe that unionfs is in a good enough shape now to be
considered for merging in 2.6.25.  (Heck, using Unionfs I've managed to
find, report, and in most cases fix bugs I found in nfsv2/3/4, xfs, jffs2,
and ext3/jbd :-) The user-visible unionfs behavior isn't likely to change;
and any changes to the VFS to better support stacking, could be handled
internally in subsequent kernels without affecting how users use unionfs.
Aside from greater exposure to stackable file systems and unionfs, I think
one of the other important benefits of a merge could be that we'd have more
than one stackable f/s in the kernel (i.e., ecryptfs and unionfs); this
would allow us to slowly and gradually generalize the VFS so it can better
support stackable file systems.


Lastly, shortlog and diffstats:

Erez Zadok (29):
  Unionfs: documentation
  VFS/eCryptfs: use simplified fs_stack API to fsstack_copy_attr_all
  Makefile: hook to compile unionfs
  Unionfs: main Makefile
  Unionfs: fanout header definitions
  Unionfs: main header file
  Unionfs: common file copyup/revalidation operations
  Unionfs: basic file operations
  Unionfs: lower-level copyup routines
  Unionfs: dentry revalidation
  Unionfs:

[PATCH 1/4] Unionfs: merged several printk KERN_CONT together into one pr_debug

2008-01-09 Thread Erez Zadok

CC: Joe Perches <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/debug.c |   51 +--
 1 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 5f1d887..d154c32 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -472,16 +472,16 @@ void __show_inode_times(const struct inode *inode,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (unlikely(!lower_inode))
continue;
-   pr_debug("IT(%lu:%d): ", inode->i_ino, bindex);
-   printk(KERN_CONT "%s:%s:%d ", file, fxn, line);
-   printk(KERN_CONT "um=%lu/%lu lm=%lu/%lu ",
-  inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-  lower_inode->i_mtime.tv_sec,
-  lower_inode->i_mtime.tv_nsec);
-   printk(KERN_CONT "uc=%lu/%lu lc=%lu/%lu\n",
-  inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-  lower_inode->i_ctime.tv_sec,
-  lower_inode->i_ctime.tv_nsec);
+   pr_debug("IT(%lu:%d): %s:%s:%d "
+"um=%lu/%lu lm=%lu/%lu uc=%lu/%lu lc=%lu/%lu\n",
+inode->i_ino, bindex,
+file, fxn, line,
+inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+lower_inode->i_mtime.tv_sec,
+lower_inode->i_mtime.tv_nsec,
+inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+lower_inode->i_ctime.tv_sec,
+lower_inode->i_ctime.tv_nsec);
}
 }
 
@@ -496,17 +496,16 @@ void __show_dinode_times(const struct dentry *dentry,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (!lower_inode)
continue;
-   pr_debug("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino,
-bindex);
-   printk(KERN_CONT "%s:%s:%d ", file, fxn, line);
-   printk(KERN_CONT "um=%lu/%lu lm=%lu/%lu ",
-  inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-  lower_inode->i_mtime.tv_sec,
-  lower_inode->i_mtime.tv_nsec);
-   printk(KERN_CONT "uc=%lu/%lu lc=%lu/%lu\n",
-  inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-  lower_inode->i_ctime.tv_sec,
-  lower_inode->i_ctime.tv_nsec);
+   pr_debug("DT(%s:%lu:%d): %s:%s:%d "
+"um=%lu/%lu lm=%lu/%lu uc=%lu/%lu lc=%lu/%lu\n",
+dentry->d_name.name, inode->i_ino, bindex,
+file, fxn, line,
+inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+lower_inode->i_mtime.tv_sec,
+lower_inode->i_mtime.tv_nsec,
+inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+lower_inode->i_ctime.tv_sec,
+lower_inode->i_ctime.tv_nsec);
}
 }
 
@@ -525,10 +524,10 @@ void __show_inode_counts(const struct inode *inode,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (unlikely(!lower_inode))
continue;
-   printk(KERN_CONT "SIC(%lu:%d:%d): ", inode->i_ino, bindex,
-  atomic_read(&(inode)->i_count));
-   printk(KERN_CONT "lc=%d ",
-  atomic_read(&(lower_inode)->i_count));
-   printk(KERN_CONT "%s:%s:%d\n", file, fxn, line);
+   pr_debug("SIC(%lu:%d:%d): lc=%d %s:%s:%d\n",
+inode->i_ino, bindex,
+atomic_read(&(inode)->i_count),
+atomic_read(&(lower_inode)->i_count),
+file, fxn, line);
}
 }
-- 
1.5.2.2


[PATCH 2/4] Unionfs: mmap fixes

2008-01-09 Thread Erez Zadok

Ensure we have lower inodes in prepare/commit_write.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/mmap.c |   26 +-
 1 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index a0e654b..ad770ac 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -224,13 +224,26 @@ out:
 static int unionfs_prepare_write(struct file *file, struct page *page,
 unsigned from, unsigned to)
 {
+   int err;
+
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
/*
-* Just copy lower inode attributes and return success.  Not much
-* else to do here.  No need to lock either (lockdep won't like it).
-* Let commit_write do all the hard work instead.
+* This is the only place where we unconditionally copy the lower
+* attribute times before calling unionfs_file_revalidate.  The
+* reason is that our ->write calls do_sync_write which in turn will
+* call our ->prepare_write and then ->commit_write.  Before our
+* ->write is called, the lower mtimes are in sync, but by the time
+* the VFS calls our ->commit_write, the lower mtimes have changed.
+* Therefore, the only reasonable time for us to sync up from the
+* changed lower mtimes, and avoid an invariant violation warning,
+* is here, in ->prepare_write.
 */
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
-   return 0;
+   err = unionfs_file_revalidate(file, true);
+   unionfs_check_file(file);
+   unionfs_read_unlock(file->f_path.dentry->d_sb);
+
+   return err;
 }
 
 static int unionfs_commit_write(struct file *file, struct page *page,
@@ -252,7 +265,6 @@ static int unionfs_commit_write(struct file *file, struct page *page,
unionfs_check_file(file);
 
inode = page->mapping->host;
-   lower_inode = unionfs_lower_inode(inode);
 
if (UNIONFS_F(file) != NULL)
lower_file = unionfs_lower_file(file);
@@ -282,6 +294,10 @@ static int unionfs_commit_write(struct file *file, struct page *page,
goto out;
 
/* if vfs_write succeeded above, sync up our times/sizes */
+   lower_inode = lower_file->f_path.dentry->d_inode;
+   if (!lower_inode)
+   lower_inode = unionfs_lower_inode(inode);
+   BUG_ON(!lower_inode);
fsstack_copy_inode_size(inode, lower_inode);
unionfs_copy_attr_times(inode);
mark_inode_dirty_sync(inode);
-- 
1.5.2.2


[PATCH 4/4] Unionfs: ensure we have lower dentries in d_iput

2008-01-09 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index d969640..cd15243 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -507,9 +507,10 @@ static void unionfs_d_iput(struct dentry *dentry, struct inode *inode)
 {
int bindex, rc;
 
+   BUG_ON(!dentry);
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
 
-   if (dbstart(dentry) < 0)
+   if (!UNIONFS_D(dentry) || dbstart(dentry) < 0)
goto drop_lower_inodes;
for (bindex = dbstart(dentry); bindex <= dbend(dentry); bindex++) {
if (unionfs_lower_mnt_idx(dentry, bindex)) {
-- 
1.5.2.2


[PATCH 3/4] Unionfs: branch-management related locking fixes

2008-01-09 Thread Erez Zadok

Add necessary locking to dentry/inode branch-configuration, so we get
consistent values during branch-management actions.  In d_revalidate_chain,
->permission, and ->create, also lock parent dentry.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |6 ++
 fs/unionfs/dentry.c |6 +-
 fs/unionfs/inode.c  |   17 +
 3 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 2c32ada..f37192f 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -318,6 +318,7 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
 * First revalidate the dentry inside struct file,
 * but not unhashed dentries.
 */
+reval_dentry:
if (unlikely(!d_deleted(dentry) &&
 !__unionfs_d_revalidate_chain(dentry, NULL, willwrite))) {
err = -ESTALE;
@@ -328,6 +329,11 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
fgen = atomic_read(&UNIONFS_F(file)->generation);
 
+   if (unlikely(sbgen > dgen)) {
+   pr_debug("unionfs: retry dentry revalidation\n");
+   schedule();
+   goto reval_dentry;
+   }
BUG_ON(sbgen > dgen);
 
/*
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 7646828..d969640 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -203,7 +203,7 @@ bool is_newer_lower(const struct dentry *dentry)
if (!dentry || !UNIONFS_D(dentry))
return false;
inode = dentry->d_inode;
-   if (!inode || !UNIONFS_I(inode) ||
+   if (!inode || !UNIONFS_I(inode)->lower_inodes ||
ibstart(inode) < 0 || ibend(inode) < 0)
return false;
 
@@ -295,6 +295,8 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
chain_len = 0;
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
+   if (dentry != dtmp)
+   unionfs_lock_dentry(dtmp, UNIONFS_DMUTEX_REVAL_PARENT);
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
/* XXX: should we check if is_newer_lower all the way up? */
if (unlikely(is_newer_lower(dtmp))) {
@@ -315,6 +317,8 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
}
purge_inode_data(dtmp->d_inode);
}
+   if (dentry != dtmp)
+   unionfs_unlock_dentry(dtmp);
while (sbgen != dgen) {
/* The root entry should always be valid */
BUG_ON(IS_ROOT(dtmp));
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 6095c4f..e15ddb9 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -30,6 +30,13 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
struct nameidata lower_nd;
 
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+   unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
+   valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
+   unionfs_unlock_dentry(dentry->d_parent);
+   if (unlikely(!valid)) {
+   err = -ESTALE;  /* same as what real_lookup does */
+   goto out;
+   }
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
valid = __unionfs_d_revalidate_chain(dentry, nd, false);
@@ -936,6 +943,14 @@ static int unionfs_permission(struct inode *inode, int mask,
const int is_file = !S_ISDIR(inode->i_mode);
const int write_mask = (mask & MAY_WRITE) && !(mask & MAY_READ);
 
+   if (nd)
+   unionfs_lock_dentry(nd->dentry, UNIONFS_DMUTEX_CHILD);
+
+   if (!UNIONFS_I(inode)->lower_inodes) {
+   if (is_file)/* dirs can be unlinked but chdir'ed to */
+   err = -ESTALE;  /* force revalidate */
+   goto out;
+   }
bstart = ibstart(inode);
bend = ibend(inode);
if (unlikely(bstart < 0 || bend < 0)) {
@@ -1003,6 +1018,8 @@ static int unionfs_permission(struct inode *inode, int mask,
 out:
unionfs_check_inode(inode);
unionfs_check_nd(nd);
+   if (nd)
+   unionfs_unlock_dentry(nd->dentry);
return err;
 }
 
-- 
1.5.2.2
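
A note on the retry added to unionfs_file_revalidate above: rather than
hitting BUG_ON when the superblock generation is still ahead of the dentry
generation, the code yields and revalidates again.  The following is only a
user-space sketch of that retry shape, not the unionfs code: the toy_* types
and functions are hypothetical stand-ins, sched_yield() stands in for
schedule(), and the revalidation step is assumed to simply catch the dentry
up to the current superblock generation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <sched.h>

/* Hypothetical stand-ins for the superblock and dentry generation counters. */
struct toy_sb { atomic_int generation; };
struct toy_dentry { atomic_int generation; };

/* Assumed revalidation step: catch the dentry up to the current sb generation. */
static bool toy_revalidate(struct toy_sb *sb, struct toy_dentry *d)
{
    atomic_store(&d->generation, atomic_load(&sb->generation));
    return true;
}

/*
 * Returns the number of revalidation passes needed, or -1 if revalidation
 * failed (the patch returns -ESTALE in that case).
 */
static int toy_file_revalidate(struct toy_sb *sb, struct toy_dentry *d)
{
    int passes = 0;

    for (;;) {
        passes++;
        if (!toy_revalidate(sb, d))
            return -1;

        int sbgen = atomic_load(&sb->generation);
        int dgen = atomic_load(&d->generation);

        /* Done once the dentry is no older than the superblock. */
        if (sbgen <= dgen)
            return passes;

        /* A branch-management action raced with us: yield and retry. */
        sched_yield();
    }
}
```

With no concurrent generation bump, one pass is enough; a second pass is
only needed when the superblock generation moves between the revalidation
and the comparison.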


[GIT PULL -mm] 0/4 Unionfs updates/fixes/cleanups

2008-01-09 Thread Erez Zadok


The following is a series of patchsets related to Unionfs.  This is the
fourth set of patchsets resulting from an lkml review of the entire unionfs
code base, in preparation for a merge into mainline.  The most significant
changes here are a few locking/race bugfixes related to branch-management.

These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc7-71-gfd0b45d), MM, as well as the backports to
2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4, jffs2,
ramfs, tmpfs, cramfs, and squashfs (where available).  Also tested with
LTP-full and with a continuous parallel kernel compile (while forcing cache
flushing, manipulating lower branches, etc.).  See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (4):
  Unionfs: merged several printk KERN_CONT together into one pr_debug
  Unionfs: mmap fixes
  Unionfs: branch-management related locking fixes
  Unionfs: ensure we have lower dentries in d_iput

 commonfops.c |6 ++
 debug.c  |   51 +--
 dentry.c |9 +++--
 inode.c  |   17 +
 mmap.c   |   26 +-
 5 files changed, 76 insertions(+), 33 deletions(-)

---
Erez Zadok
[EMAIL PROTECTED]

Re: [PATCH 1/3] Unionfs: use printk KERN_CONT for debugging messages

2008-01-03 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Joe Perches writes:
> On Thu, 2008-01-03 at 00:57 -0500, Erez Zadok wrote:

> I think printks should be single statements and
> KERN_CONT should be used as sparingly as possible.
[...]

KERN_CONT is documented as not being SMP safe, but I figured it was harmless
for a mere debugging message.  Still, I like your way better.  Thanks,
Joe.

> Perhaps:
>   pr_debug("IT(%lu:%d): %s:%s:%d "
>"um=%lu/%lu lm=%lu/%lu "
>"uc=%lu/%lu lc=%lu/%lu\n",
>inode->i_ino, bindex, file, fnx, line,
>inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
>lower_inode->i_mtime.tv_sec,
>lower_inode->i_mtime.tv_nsec
>inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
>lower_inode->i_ctime.tv_sec,
>lower_inode->i_ctime.tv_nsec);
[...]

Erez.
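
For readers following the thread, the point of the single-statement form is
that the whole line is formatted in one call, so output from another CPU
cannot land between its pieces.  Below is a user-space sketch only, using
snprintf in place of pr_debug; the toy_times struct and the function name
are hypothetical, and only the mtime half of the message is shown.

```c
#include <stdio.h>

/* Hypothetical stand-in for the tv_sec/tv_nsec pair of an inode timestamp. */
struct toy_times {
    unsigned long sec;
    unsigned long nsec;
};

/*
 * Format the whole debugging line with a single call instead of several
 * continuation fragments.  Returns what snprintf returns: the number of
 * characters that the complete line needs, excluding the trailing NUL.
 */
static int toy_format_inode_times(char *buf, size_t len,
                                  unsigned long ino, int bindex,
                                  const char *file, const char *fxn, int line,
                                  struct toy_times um, struct toy_times lm)
{
    return snprintf(buf, len,
                    "IT(%lu:%d): %s:%s:%d "
                    "um=%lu/%lu lm=%lu/%lu\n",
                    ino, bindex, file, fxn, line,
                    um.sec, um.nsec, lm.sec, lm.nsec);
}
```

Adjacent string literals are concatenated by the compiler, so the format can
still be split across source lines for readability without splitting the
output into separate calls.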

[PATCH 3/3] Unionfs: use VFS helpers to manipulate i_nlink

2008-01-02 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/unlink.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index a1c82b6..1e370a1 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -79,7 +79,7 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
 
 out:
if (!err)
-   dentry->d_inode->i_nlink--;
+   inode_dec_link_count(dentry->d_inode);
 
/* We don't want to leave negative leftover dentries for revalidate. */
if (!err && (dbopaque(dentry) != -1))
-- 
1.5.2.2


[PATCH 2/3] Unionfs: locking fixes

2008-01-02 Thread Erez Zadok

Lock parent dentries during revalidation.
Reduce total number of lockdep classes used.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   13 -
 fs/unionfs/fanout.h |3 ++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 0369d93..7646828 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -42,6 +42,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
memset(&lowernd, 0, sizeof(struct nameidata));
 
verify_locked(dentry);
+   verify_locked(dentry->d_parent);
 
/* if the dentry is unhashed, do NOT revalidate */
if (d_deleted(dentry))
@@ -351,7 +352,10 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
 * to child order.
 */
for (i = 0; i < chain_len; i++) {
-   unionfs_lock_dentry(chain[i], UNIONFS_DMUTEX_REVAL+i);
+   unionfs_lock_dentry(chain[i], UNIONFS_DMUTEX_REVAL_CHILD);
+   if (chain[i] != chain[i]->d_parent)
+   unionfs_lock_dentry(chain[i]->d_parent,
+   UNIONFS_DMUTEX_REVAL_PARENT);
saved_bstart = dbstart(chain[i]);
saved_bend = dbend(chain[i]);
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
@@ -366,6 +370,8 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
 bindex++)
unionfs_mntput(chain[i], bindex);
}
+   if (chain[i] != chain[i]->d_parent)
+   unionfs_unlock_dentry(chain[i]->d_parent);
unionfs_unlock_dentry(chain[i]);
 
if (unlikely(!valid))
@@ -376,6 +382,9 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
 out_this:
/* finally, lock this dentry and revalidate it */
verify_locked(dentry);
+   if (dentry != dentry->d_parent)
+   unionfs_lock_dentry(dentry->d_parent,
+   UNIONFS_DMUTEX_REVAL_PARENT);
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
 
if (unlikely(is_newer_lower(dentry))) {
@@ -394,6 +403,8 @@ out_this:
purge_inode_data(dentry->d_inode);
}
valid = __unionfs_d_revalidate_one(dentry, nd);
+   if (dentry != dentry->d_parent)
+   unionfs_unlock_dentry(dentry->d_parent);
 
/*
 * If __unionfs_d_revalidate_one() succeeded above, then it will
diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 5f31015..4d9a45f 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -290,7 +290,8 @@ enum unionfs_dentry_lock_class {
UNIONFS_DMUTEX_PARENT,
UNIONFS_DMUTEX_CHILD,
UNIONFS_DMUTEX_WHITEOUT,
-   UNIONFS_DMUTEX_REVAL,   /* for file/dentry revalidate */
+   UNIONFS_DMUTEX_REVAL_PARENT, /* for file/dentry revalidate */
+   UNIONFS_DMUTEX_REVAL_CHILD,   /* for file/dentry revalidate */
 };
 
 static inline void unionfs_lock_dentry(struct dentry *d,
-- 
1.5.2.2
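
The pattern repeated through this patch is: take the child's lock, then the
parent's lock, but skip the parent when the dentry is its own parent (the
root), since that would be the same lock taken twice.  Here is a user-space
sketch of that shape with pthread mutexes.  The toy_* names are
hypothetical, and lockdep classes (REVAL_CHILD, REVAL_PARENT) have no
equivalent here, so only the ordering and the root check are illustrated.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: the root points to itself as parent. */
struct toy_dentry {
    struct toy_dentry *parent;
    pthread_mutex_t lock;
};

/* Lock the child, then its parent unless the child is the root.
 * Returns how many locks were taken (1 for the root, 2 otherwise). */
static int toy_lock_for_reval(struct toy_dentry *d)
{
    int taken = 0;

    pthread_mutex_lock(&d->lock);
    taken++;
    if (d != d->parent) {
        pthread_mutex_lock(&d->parent->lock);
        taken++;
    }
    return taken;
}

/* Release in the reverse order: parent first (if taken), then the child. */
static void toy_unlock_after_reval(struct toy_dentry *d)
{
    if (d != d->parent)
        pthread_mutex_unlock(&d->parent->lock);
    pthread_mutex_unlock(&d->lock);
}
```

Without the root check, the root would try to take its own non-recursive
mutex twice and deadlock.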


[PATCH 1/3] Unionfs: use printk KERN_CONT for debugging messages

2008-01-02 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/debug.c |   50 ++
 1 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index c2b8b58..5f1d887 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -456,9 +456,10 @@ void __show_branch_counts(const struct super_block *sb,
mnt = UNIONFS_D(sb->s_root)->lower_paths[i].mnt;
else
mnt = NULL;
-   pr_debug("%d:", (mnt ? atomic_read(&mnt->mnt_count) : -99));
+   printk(KERN_CONT "%d:",
+  (mnt ? atomic_read(&mnt->mnt_count) : -99));
}
-   pr_debug("%s:%s:%d\n", file, fxn, line);
+   printk(KERN_CONT "%s:%s:%d\n", file, fxn, line);
 }
 
 void __show_inode_times(const struct inode *inode,
@@ -472,15 +473,15 @@ void __show_inode_times(const struct inode *inode,
if (unlikely(!lower_inode))
continue;
pr_debug("IT(%lu:%d): ", inode->i_ino, bindex);
-   pr_debug("%s:%s:%d ", file, fxn, line);
-   pr_debug("um=%lu/%lu lm=%lu/%lu ",
-inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-lower_inode->i_mtime.tv_sec,
-lower_inode->i_mtime.tv_nsec);
-   pr_debug("uc=%lu/%lu lc=%lu/%lu\n",
-inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-lower_inode->i_ctime.tv_sec,
-lower_inode->i_ctime.tv_nsec);
+   printk(KERN_CONT "%s:%s:%d ", file, fxn, line);
+   printk(KERN_CONT "um=%lu/%lu lm=%lu/%lu ",
+  inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+  lower_inode->i_mtime.tv_sec,
+  lower_inode->i_mtime.tv_nsec);
+   printk(KERN_CONT "uc=%lu/%lu lc=%lu/%lu\n",
+  inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+  lower_inode->i_ctime.tv_sec,
+  lower_inode->i_ctime.tv_nsec);
}
 }
 
@@ -497,15 +498,15 @@ void __show_dinode_times(const struct dentry *dentry,
continue;
pr_debug("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino,
 bindex);
-   pr_debug("%s:%s:%d ", file, fxn, line);
-   pr_debug("um=%lu/%lu lm=%lu/%lu ",
-inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
-lower_inode->i_mtime.tv_sec,
-lower_inode->i_mtime.tv_nsec);
-   pr_debug("uc=%lu/%lu lc=%lu/%lu\n",
-inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
-lower_inode->i_ctime.tv_sec,
-lower_inode->i_ctime.tv_nsec);
+   printk(KERN_CONT "%s:%s:%d ", file, fxn, line);
+   printk(KERN_CONT "um=%lu/%lu lm=%lu/%lu ",
+  inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+  lower_inode->i_mtime.tv_sec,
+  lower_inode->i_mtime.tv_nsec);
+   printk(KERN_CONT "uc=%lu/%lu lc=%lu/%lu\n",
+  inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+  lower_inode->i_ctime.tv_sec,
+  lower_inode->i_ctime.tv_nsec);
}
 }
 
@@ -524,9 +525,10 @@ void __show_inode_counts(const struct inode *inode,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (unlikely(!lower_inode))
continue;
-   pr_debug("SIC(%lu:%d:%d): ", inode->i_ino, bindex,
-atomic_read(&(inode)->i_count));
-   pr_debug("lc=%d ", atomic_read(&(lower_inode)->i_count));
-   pr_debug("%s:%s:%d\n", file, fxn, line);
+   printk(KERN_CONT "SIC(%lu:%d:%d): ", inode->i_ino, bindex,
+  atomic_read(&(inode)->i_count));
+   printk(KERN_CONT "lc=%d ",
+  atomic_read(&(lower_inode)->i_count));
+   printk(KERN_CONT "%s:%s:%d\n", file, fxn, line);
}
 }
-- 
1.5.2.2


[GIT PULL -mm] 0/3 Unionfs updates/fixes/cleanups

2008-01-02 Thread Erez Zadok


The following is a series of patchsets related to Unionfs.  This is the
third set of patchsets resulting from an lkml review of the entire unionfs
code base.  The most significant change here is a locking/race bugfix during
dentry revalidation.

These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc6-179-gb8c9a18), MM, as well as the backports to
2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4, jffs2,
ramfs, tmpfs, cramfs, and squashfs (where available).  Also tested with
LTP-full.  See http://unionfs.filesystems.org/ to download back-ported
unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (3):
  Unionfs: use printk KERN_CONT for debugging messages
  Unionfs: locking fixes
  Unionfs: use VFS helpers to manipulate i_nlink

 debug.c  |   50 ++
 dentry.c |   13 -
 fanout.h |3 ++-
 unlink.c |2 +-
 4 files changed, 41 insertions(+), 27 deletions(-)

---
Erez Zadok
[EMAIL PROTECTED]

[PATCH 01/30] VFS/fs_stack: drop cast on inode passed to i_size_read

2007-12-28 Thread Erez Zadok

i_size_read() takes 'const struct inode *' already, as of 2.6.20.

CC: Mike Halcrow <[EMAIL PROTECTED]>

Signed-off-by: Jan Engelhardt <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/stack.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/stack.c b/fs/stack.c
index a548aac..7913fe5 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -21,7 +21,7 @@
  */
 void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
 {
-   i_size_write(dst, i_size_read((struct inode *)src));
+   i_size_write(dst, i_size_read(src));
dst->i_blocks = src->i_blocks;
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
-- 
1.5.2.2


[PATCH 02/30] VFS/fs_stack: use locking around i_size_write in 32-bit systems

2007-12-28 Thread Erez Zadok

From: Hugh Dickins <[EMAIL PROTECTED]>

LTP's iogen01 doio tests hang nicely on 32-bit SMP when /tmp is a unionfs
mount of a tmpfs.  See the comment on i_size_write in linux/fs.h: it needs
to be locked, otherwise i_size_read can spin forever waiting for a lost
seqcount update.

Most filesystems are already holding i_mutex for this, but unionfs calls
fsstack_copy_inode_size from many places, not necessarily holding i_mutex.
Use the low-level i_lock within fsstack_copy_inode_size when 32-bit SMP.

Checked the entire unionfs code to ensure this is the right fix for
i_size_write().

Also compared to what other file systems do when they have to handle inodes,
esp. not their own inodes (e.g., network file systems have to access the
exported file system's inodes).  Found out that most such file systems not
only fail to lock around i_size_write, but do not even use i_size_read or
i_size_write to access the inode's size.

CC: Mike Halcrow <[EMAIL PROTECTED]>

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/stack.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/stack.c b/fs/stack.c
index 7913fe5..4336f2b 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -21,8 +21,14 @@
  */
 void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
 {
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_lock(&dst->i_lock);
+#endif
i_size_write(dst, i_size_read(src));
dst->i_blocks = src->i_blocks;
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+   spin_unlock(&dst->i_lock);
+#endif
 }
 EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
 
-- 
1.5.2.2
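
To illustrate why the writer side needs a lock: on 32-bit SMP the 64-bit
size is guarded by a sequence counter, and readers retry until they see an
even, unchanged count.  The sketch below is a user-space approximation under
stated assumptions, not the kernel's seqcount: the toy_* names are
hypothetical, the counter is a C11 atomic for portability, and a pthread
mutex plays the role that i_lock plays in the patch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical inode: a 64-bit size stored as two 32-bit halves. */
struct toy_inode {
    atomic_uint seq;          /* even: stable, odd: write in progress */
    uint32_t size_lo;
    uint32_t size_hi;
    pthread_mutex_t lock;     /* serializes writers, like i_lock in the patch */
};

static void toy_inode_init(struct toy_inode *i)
{
    atomic_init(&i->seq, 0);
    i->size_lo = 0;
    i->size_hi = 0;
    pthread_mutex_init(&i->lock, NULL);
}

/* Writers hold the lock so two updates can never interleave their halves
 * or their counter bumps. */
static void toy_size_write(struct toy_inode *i, uint64_t size)
{
    pthread_mutex_lock(&i->lock);
    atomic_fetch_add(&i->seq, 1);            /* count becomes odd */
    i->size_lo = (uint32_t)size;
    i->size_hi = (uint32_t)(size >> 32);
    atomic_fetch_add(&i->seq, 1);            /* count becomes even again */
    pthread_mutex_unlock(&i->lock);
}

/* Readers take no lock: they retry while a write is in progress or the
 * count changed underneath them. */
static uint64_t toy_size_read(struct toy_inode *i)
{
    unsigned int start;
    uint64_t value;

    do {
        start = atomic_load(&i->seq);
        value = ((uint64_t)i->size_hi << 32) | i->size_lo;
    } while ((start & 1u) || start != atomic_load(&i->seq));

    return value;
}
```

If two writers ran unserialized and one counter update were lost, the count
could be left odd, and the reader loop above would never terminate; that is
the hang described in the commit message.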


[PATCH 04/30] Unionfs: clarify usage.txt read/write behavior

2007-12-28 Thread Erez Zadok

CC: Michael Tokarev <[EMAIL PROTECTED]>

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/concepts.txt |   20 +---
 Documentation/filesystems/unionfs/issues.txt   |2 +-
 Documentation/filesystems/unionfs/usage.txt|   13 -
 3 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/Documentation/filesystems/unionfs/concepts.txt b/Documentation/filesystems/unionfs/concepts.txt
index 7654ccc..bed69bd 100644
--- a/Documentation/filesystems/unionfs/concepts.txt
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -1,4 +1,4 @@
-Unionfs 2.1 CONCEPTS:
+Unionfs 2.x CONCEPTS:
 =
 
 This file describes the concepts needed by a namespace unification file
@@ -66,12 +66,26 @@ Copyup:
 ===
 
 When a change is made to the contents of a file's data or meta-data, they
-have to be stored somewhere. The best way is to create a copy of the
+have to be stored somewhere.  The best way is to create a copy of the
 original file on a branch that is writable, and then redirect the write
-though to this copy. The copy must be made on a higher priority branch so
+though to this copy.  The copy must be made on a higher priority branch so
 that lookup and readdir return this newer "version" of the file rather than
 the original (see duplicate elimination).
 
+An entire unionfs mount can be read-only or read-write.  If it's read-only,
+then none of the branches will be written to, even if some of the branches
+are physically writeable.  If the unionfs mount is read-write, then the
+leftmost (highest priority) branch must be writeable (for copyup to take
+place); the remaining branches can be any mix of read-write and read-only.
+
+In a writeable mount, unionfs will create new files/dir in the leftmost
+branch.  If one tries to modify a file in a read-only branch/media, unionfs
+will copyup the file to the leftmost branch and modify it there.  If you try
+to modify a file from a writeable branch which is not the leftmost branch,
+then unionfs will modify it in that branch; this is useful if you, say,
+unify differnet packages (e.g., apache, sendmail, ftpd, etc.) and you want
+changes to specific package files to remain logically in the directory where
+they came from.
 
 Cache Coherency:
 
diff --git a/Documentation/filesystems/unionfs/issues.txt b/Documentation/filesystems/unionfs/issues.txt
index 9db1d70..bb6ab05 100644
--- a/Documentation/filesystems/unionfs/issues.txt
+++ b/Documentation/filesystems/unionfs/issues.txt
@@ -1,4 +1,4 @@
-KNOWN Unionfs 2.1 ISSUES:
+KNOWN Unionfs 2.x ISSUES:
 =
 
 1. Unionfs should not use lookup_one_len() on the underlying f/s as it
diff --git a/Documentation/filesystems/unionfs/usage.txt b/Documentation/filesystems/unionfs/usage.txt
index 59c4f28..1adde69 100644
--- a/Documentation/filesystems/unionfs/usage.txt
+++ b/Documentation/filesystems/unionfs/usage.txt
@@ -12,7 +12,7 @@ GENERAL SYNTAX
 
 # mount -t unionfs -o <OPTIONS>,<BRANCH-OPTIONS> none MOUNTPOINT
 
-OPTIONS can be any legal combination one of:
+OPTIONS can be any legal combination of:
 
 - ro   # mount file system read-only
 - rw   # mount file system read-write
@@ -20,8 +20,9 @@ OPTIONS can be any legal combination one of:
 - incgen   # increment generation no. (see Cache Consistency below)
 
 BRANCH-OPTIONS can be either (1) a list of branches given to the "dirs="
-option, or (2) a list of individual branch manipulation commands, described
-in the "Branch Management" section below.
+option, or (2) a list of individual branch manipulation commands, combined
+with the "remount" option, and is further described in the "Branch
+Management" section below.
 
 The syntax for the "dirs=" mount option is:
 
@@ -32,7 +33,9 @@ the union, with an optional branch mode for each of those directories.
 Directories that come earlier (specified first, on the left) in the list
 have a higher precedence than those which come later.  Additionally,
 read-only or read-write permissions of the branch can be specified by
-appending =ro or =rw (default) to each directory.
+appending =ro or =rw (default) to each directory.  See the Copyup section in
+concepts.txt, for a description of Unionfs's behavior when mixing read-only
+and read-write branches and mounts.
 
 Syntax:
 
@@ -112,7 +115,7 @@ CACHE CONSISTENCY
 =
 
 If you modify any file on any of the lower branches directly, while there is
-a Unionfs 2.1 mounted above any of those branches, you should tell Unionfs
+a Unionfs 2.x mounted above any of those branches, you should tell Unionfs
 to purge its caches and re-get the objects.  To do that, you have to
 increment the generation number of the superblock using the following
 command:
-- 
1.5.2.2
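
The read/write rules added to concepts.txt above can be summarized as a
small decision function.  This is a sketch of the documented policy only,
not code from unionfs: the toy_* names are hypothetical, branch 0 is the
leftmost (highest priority) branch, and branch 0 is assumed writeable
whenever the mount is read-write, as the text requires.

```c
#include <stdbool.h>

/* Hypothetical outcome of a request to modify an existing file. */
enum toy_action {
    TOY_MODIFY_IN_PLACE,   /* write to the branch where the file was found */
    TOY_COPYUP,            /* copy the file to branch 0 and modify it there */
    TOY_EROFS              /* the whole union is mounted read-only */
};

struct toy_target {
    enum toy_action action;
    int branch;            /* branch that receives the write, or -1 */
};

/*
 * mount_rw:     whether the unionfs mount itself is read-write
 * branch_rw:    per-branch writeable flags, index 0 being the leftmost branch
 * found_branch: index of the branch where the file currently lives
 */
static struct toy_target toy_write_target(bool mount_rw,
                                          const bool *branch_rw,
                                          int found_branch)
{
    struct toy_target t;

    if (!mount_rw) {
        /* Read-only mount: no branch is written, even writeable ones. */
        t.action = TOY_EROFS;
        t.branch = -1;
    } else if (branch_rw[found_branch]) {
        /* Writeable branch, leftmost or not: modify the file where it is. */
        t.action = TOY_MODIFY_IN_PLACE;
        t.branch = found_branch;
    } else {
        /* Read-only branch: copy up to the leftmost branch. */
        t.action = TOY_COPYUP;
        t.branch = 0;
    }
    return t;
}
```

For example, with branches rw, ro, rw (left to right) on a read-write mount,
a file found in branch 1 is copied up to branch 0, while a file found in
branch 2 is modified in branch 2.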


[PATCH 08/30] Unionfs: create new symlinks only in first branch

2007-12-28 Thread Erez Zadok

When creating a new symlink, always create it in the first branch, which is
always writeable, not in the branch which may have a whiteout in it.  This
makes the policy for the creation of new symlinks consistent with that of
new files/directories, and improves efficiency a bit.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |  210 +++
 1 files changed, 95 insertions(+), 115 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 8076d0b..78cdfa2 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -368,16 +368,16 @@ out:
return err;
 }
 
-static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
+static int unionfs_symlink(struct inode *parent, struct dentry *dentry,
   const char *symname)
 {
int err = 0;
struct dentry *lower_dentry = NULL;
-   struct dentry *whiteout_dentry = NULL;
-   struct dentry *lower_dir_dentry = NULL;
-   umode_t mode;
-   int bindex = 0, bstart;
+   struct dentry *wh_dentry = NULL;
+   struct dentry *lower_parent_dentry = NULL;
char *name = NULL;
+   int valid = 0;
+   umode_t mode;
 
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -388,147 +388,127 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
goto out;
}
 
-   /* We start out in the leftmost branch. */
-   bstart = dbstart(dentry);
-
-   lower_dentry = unionfs_lower_dentry(dentry);
-
/*
-* check if whiteout exists in this branch, i.e. lookup .wh.foo
-* first. If present, delete it
+* It's only a bug if this dentry was not negative and couldn't be
+* revalidated (shouldn't happen).
 */
-   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
-   if (unlikely(IS_ERR(name))) {
-   err = PTR_ERR(name);
-   goto out;
-   }
+   BUG_ON(!valid && dentry->d_inode);
 
-   whiteout_dentry =
-   lookup_one_len(name, lower_dentry->d_parent,
-  dentry->d_name.len + UNIONFS_WHLEN);
-   if (IS_ERR(whiteout_dentry)) {
-   err = PTR_ERR(whiteout_dentry);
+   /*
+* We shouldn't create things in a read-only branch; this check is a
+* bit redundant as we don't allow branch 0 to be read-only at the
+* moment
+*/
+   err = is_robranch_super(dentry->d_sb, 0);
+   if (err) {
+   err = -EROFS;
goto out;
}
 
-   if (!whiteout_dentry->d_inode) {
-   dput(whiteout_dentry);
-   whiteout_dentry = NULL;
-   } else {
+   /*
+* We _always_ create on branch 0
+*/
+   lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+   if (lower_dentry) {
/*
-* found a .wh.foo entry, unlink it and then call
-* vfs_symlink().
+* check if whiteout exists in this branch, i.e. lookup .wh.foo
+* first.
 */
-   lower_dir_dentry = lock_parent(whiteout_dentry);
-
-   err = is_robranch_super(dentry->d_sb, bstart);
-   if (!err)
-   err = vfs_unlink(lower_dir_dentry->d_inode,
-whiteout_dentry);
-   dput(whiteout_dentry);
-
-   fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
-   /* propagate number of hard-links */
-   dir->i_nlink = unionfs_get_nlinks(dir);
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
 
-   unlock_dir(lower_dir_dentry);
+   wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(wh_dentry)) {
+   err = PTR_ERR(wh_dentry);
+   wh_dentry = NULL;
+   goto out;
+   }
 
-   if (err) {
-   /* exit if the error returned was NOT -EROFS */
-   if (!IS_COPYUP_ERR(err))
-   goto out;
+   if (wh_dentry->d_inode) {
/*
-* should now try to create symlink in the another
-* branch.
+* .wh.foo has been found, so let's unlink it
 */
-   bstart--;
-   }
-   }
+   struct dentry *lower_dir_dentry;
+
+   lower

[PATCH 14/30] Unionfs: remove unnecessary conditional inode lock

2007-12-28 Thread Erez Zadok

This was intended to protect the inode during branch management, but that is
now done through our superblock rwsem.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   13 -
 1 files changed, 0 insertions(+), 13 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 68c07ba..dc1aa39 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -33,7 +33,6 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
int bindex, bstart, bend;
int sbgen, dgen;
int positive = 0;
-   int locked = 0;
int interpose_flag;
struct nameidata lowernd; /* TODO: be gentler to the stack */
 
@@ -87,16 +86,6 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
interpose_flag = INTERPOSE_REVAL_NEG;
if (positive) {
interpose_flag = INTERPOSE_REVAL;
-   /*
-* During BRM, the VFS could already hold a lock on
-* a file being read, so don't lock it again
-* (deadlock), but if you lock it in this function,
-* then release it here too.
-*/
-   if (!mutex_is_locked(&dentry->d_inode->i_mutex)) {
-   mutex_lock(&dentry->d_inode->i_mutex);
-   locked = 1;
-   }
 
bstart = ibstart(dentry->d_inode);
bend = ibend(dentry->d_inode);
@@ -115,8 +104,6 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
ibstart(dentry->d_inode) = -1;
ibend(dentry->d_inode) = -1;
-   if (locked)
-   mutex_unlock(&dentry->d_inode->i_mutex);
}
 
result = unionfs_lookup_backend(dentry, &lowernd,
-- 
1.5.2.2


[PATCH 18/30] Unionfs: remove unnecessary parent lock in create

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |7 ---
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 7ec9c1b..3df9b19 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -32,13 +32,6 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   unionfs_lock_dentry(dentry->d_parent);
-   valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
-   unionfs_unlock_dentry(dentry->d_parent);
-   if (unlikely(!valid)) {
-   err = -ESTALE;  /* same as what real_lookup does */
-   goto out;
-   }
valid = __unionfs_d_revalidate_chain(dentry, nd, false);
/*
 * It's only a bug if this dentry was not negative and couldn't be
-- 
1.5.2.2


[PATCH 28/30] Unionfs: don't check dentry on error

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index df6138a..740d364 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -158,9 +158,9 @@ out:
unionfs_check_inode(parent);
if (!err) {
unionfs_check_dentry(dentry->d_parent);
+   unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
}
-   unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
-- 
1.5.2.2


[PATCH 20/30] Unionfs: implement lockdep classes

2007-12-28 Thread Erez Zadok

Lockdep fixes.  Support locking order/classes (e.g., parent -> child ->
whiteout).  Remove locking from create_parents: it's enough to just dget the
dentries in question.  Move parent locking from lookup_backend to its caller,
unionfs_lookup.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |   16 
 fs/unionfs/copyup.c |6 +++---
 fs/unionfs/dentry.c |8 
 fs/unionfs/dirfops.c|4 ++--
 fs/unionfs/fanout.h |   14 --
 fs/unionfs/file.c   |6 +++---
 fs/unionfs/inode.c  |   47 ++-
 fs/unionfs/lookup.c |   15 +--
 fs/unionfs/main.c   |2 +-
 fs/unionfs/mmap.c   |4 ++--
 fs/unionfs/rename.c |8 
 fs/unionfs/subr.c   |4 ++--
 fs/unionfs/super.c  |   10 +-
 fs/unionfs/union.h  |   38 --
 fs/unionfs/unlink.c |8 
 fs/unionfs/xattr.c  |   16 
 16 files changed, 113 insertions(+), 93 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index f714e2f..4077907 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -311,7 +311,7 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
int err = 0;
 
dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
sb = dentry->d_sb;
 
/*
@@ -519,7 +519,7 @@ int unionfs_open(struct inode *inode, struct file *file)
int bindex = 0, bstart = 0, bend = 0;
int size;
 
-   unionfs_read_lock(inode->i_sb);
+   unionfs_read_lock(inode->i_sb, UNIONFS_SMUTEX_PARENT);
 
file->private_data =
kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
@@ -546,7 +546,7 @@ int unionfs_open(struct inode *inode, struct file *file)
}
 
dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
bstart = fbstart(file) = dbstart(dentry);
bend = fbend(file) = dbend(dentry);
@@ -607,7 +607,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
int bindex, bstart, bend;
int fgen, err = 0;
 
-   unionfs_read_lock(sb);
+   unionfs_read_lock(sb, UNIONFS_SMUTEX_PARENT);
/*
 * Yes, we have to revalidate this file even if it's being released.
 * This is important for open-but-unlinked files, as well as mmap
@@ -626,7 +626,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
bstart = fbstart(file);
bend = fbend(file);
 
-   unionfs_lock_dentry(dentry);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
for (bindex = bstart; bindex <= bend; bindex++) {
lower_file = unionfs_lower_file_idx(file, bindex);
 
@@ -705,7 +705,7 @@ static int unionfs_ioctl_queryfile(struct file *file, unsigned int cmd,
struct vfsmount *mnt;
 
dentry = file->f_path.dentry;
-   unionfs_lock_dentry(dentry);
+   unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
orig_bstart = dbstart(dentry);
orig_bend = dbend(dentry);
err = unionfs_partial_lookup(dentry);
@@ -755,7 +755,7 @@ long unionfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
long err;
 
-   unionfs_read_lock(file->f_path.dentry->d_sb);
+   unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
 
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
@@ -794,7 +794,7 @@ int unionfs_flush(struct file *file, fl_owner_t id)
struct dentry *dentry = file->f_path.dentry;
int bindex, bstart, bend;
 
-   unionfs_read_lock(dentry->d_sb);
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_PARENT);
 
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 0012caf..16b2c7c 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -717,7 +717,7 @@ struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
 
/* find the parent directory dentry in unionfs */
parent_dentry = child_dentry->d_parent;
-   unionfs_lock_dentry(parent_dentry);
+   dget(parent_dentry);
 
/* find out the lower_parent_dentry in the given branch */
lower_parent_dentry =
@@ -752,7 +752,7 @@ struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
 begin:
/* get lower parent dir in the current branch */
lower_parent_dentry = unionfs_lower_dentry_idx(parent_dentry, bindex);
-   unionfs_unlock_dentry(parent_dentry);
+   dput(parent_dentry);
 
/* init the values to lookup */

[PATCH 15/30] Unionfs: remove unnecessary lock when deleting whiteouts

2007-12-28 Thread Erez Zadok

Lockdep complained, because we eventually call vfs_unlink which'd grab the
necessary locks.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirhelper.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index 2e52fc3..b40090a 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -110,7 +110,6 @@ int delete_whiteouts(struct dentry *dentry, int bindex,
lower_dir = lower_dir_dentry->d_inode;
BUG_ON(!S_ISDIR(lower_dir->i_mode));
 
-   mutex_lock(&lower_dir->i_mutex);
if (!permission(lower_dir, MAY_WRITE | MAY_EXEC, NULL)) {
err = do_delete_whiteouts(dentry, bindex, namelist);
} else {
@@ -120,7 +119,6 @@ int delete_whiteouts(struct dentry *dentry, int bindex,
run_sioq(__delete_whiteouts, &args);
err = args.err;
}
-   mutex_unlock(&lower_dir->i_mutex);
 
 out:
return err;
-- 
1.5.2.2


[PATCH 11/30] Unionfs: restructure unionfs_setattr and fix truncation order

2007-12-28 Thread Erez Zadok

From: Hugh Dickins <[EMAIL PROTECTED]>

Restructure the code to move the lower notify_change out of the loop in
unionfs_setattr.  Cleanup and simplify the code.  Then fix the truncation
order which fsx-linux in a unionfs on tmpfs found.  Then handle copyup
properly.

When shrinking a file, unionfs_setattr needs to vmtruncate the upper level
before notifying change to the lower level, to eliminate those dirty pages
beyond new eof which otherwise drift down to the lower level's writepage,
writing beyond its eof (and later uncovered when the file is expanded).

Also truncate the upper level first when expanding, in the case when
the upper level's s_maxbytes is more limiting than the lower level's.
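
The shrinking case above can be sketched with a toy userspace model.  This is
not the unionfs code: the structs and helpers below are hypothetical and only
model "dirty bytes cached in the upper level" and "a writeback that happens
between the two steps", which is an assumption made to show why the order
matters.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy upper level: a cached size plus dirty bytes in [0, dirty_end). */
struct upper_file {
    long size;
    long dirty_end;
};

/* Toy lower level: only its size matters here. */
struct lower_file {
    long size;
};

/* Truncating the upper level also discards dirty data past the new EOF. */
static void upper_truncate(struct upper_file *u, long new_size)
{
    assert(new_size >= 0);
    u->size = new_size;
    if (u->dirty_end > new_size)
        u->dirty_end = new_size;
}

/* Writeback pushes dirty upper data down, extending the lower file if needed. */
static void writeback(const struct upper_file *u, struct lower_file *l)
{
    if (u->dirty_end > l->size)
        l->size = u->dirty_end;
}

/*
 * Shrink both levels to new_size with a writeback running between the two
 * steps.  Returns the lower file's final size.
 */
static long shrink(struct upper_file *u, struct lower_file *l,
                   long new_size, bool upper_first)
{
    if (upper_first) {
        upper_truncate(u, new_size);
        writeback(u, l);        /* nothing dirty past the new EOF anymore */
        l->size = new_size;
    } else {
        l->size = new_size;
        writeback(u, l);        /* stale dirty data lands beyond the new EOF */
        upper_truncate(u, new_size);
    }
    return l->size;
}
```

In this model, shrinking a 100-byte file to 40 bytes leaves the lower file at
40 bytes when the upper level is truncated first, but at 100 bytes when the
lower level is changed first.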

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |   97 ++-
 1 files changed, 49 insertions(+), 48 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 78cdfa2..37258c8 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -998,11 +998,10 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
 {
int err = 0;
struct dentry *lower_dentry;
-   struct inode *inode = NULL;
-   struct inode *lower_inode = NULL;
+   struct inode *inode;
+   struct inode *lower_inode;
int bstart, bend, bindex;
-   int i;
-   int copyup = 0;
+   loff_t size;
 
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -1023,62 +1022,64 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
if (ia->ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID))
ia->ia_valid &= ~ATTR_MODE;
 
-   for (bindex = bstart; (bindex <= bend) || (bindex == bstart);
-bindex++) {
-   lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
-   if (!lower_dentry)
-   continue;
-   BUG_ON(lower_dentry->d_inode == NULL);
-
-   /* If the file is on a read only branch */
-   if (is_robranch_super(dentry->d_sb, bindex)
-   || IS_RDONLY(lower_dentry->d_inode)) {
-   if (copyup || (bindex != bstart))
-   continue;
-   /* Only if its the leftmost file, copyup the file */
-   for (i = bstart - 1; i >= 0; i--) {
-   loff_t size = i_size_read(dentry->d_inode);
-   if (ia->ia_valid & ATTR_SIZE)
-   size = ia->ia_size;
-   err = copyup_dentry(dentry->d_parent->d_inode,
-   dentry, bstart, i,
-   dentry->d_name.name,
-   dentry->d_name.len,
-   NULL, size);
-
-   if (!err) {
-   copyup = 1;
-   lower_dentry =
-   unionfs_lower_dentry(dentry);
-   break;
-   }
-   /*
-* if error is in the leftmost branch, pass
-* it up.
-*/
-   if (i == 0)
-   goto out;
-   }
+   lower_dentry = unionfs_lower_dentry(dentry);
+   BUG_ON(!lower_dentry);  /* should never happen after above revalidate */
+
+   /* copyup if the file is on a read only branch */
+   if (is_robranch_super(dentry->d_sb, bstart)
+   || IS_RDONLY(lower_dentry->d_inode)) {
+   /* check if we have a branch to copy up to */
+   if (bstart <= 0) {
+   err = -EACCES;
+   goto out;
+   }
 
+   if (ia->ia_valid & ATTR_SIZE)
+   size = ia->ia_size;
+   else
+   size = i_size_read(inode);
+   /* copyup to next available branch */
+   for (bindex = bstart - 1; bindex >= 0; bindex--) {
+   err = copyup_dentry(dentry->d_parent->d_inode,
+   dentry, bstart, bindex,
+   dentry->d_name.name,
+   dentry->d_name.len,
+   NULL, size);
+   if (!err)
+   break;
}
-   err = notify_change(lower_dentry, ia);

[PATCH 27/30] Unionfs: cleanup lower inodes after successful unlink

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/unlink.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 1e81494..a1c82b6 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -91,7 +91,9 @@ out:
 int unionfs_unlink(struct inode *dir, struct dentry *dentry)
 {
int err = 0;
+   struct inode *inode = dentry->d_inode;
 
+   BUG_ON(S_ISDIR(inode->i_mode));
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
 
@@ -104,8 +106,13 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
if (!err) {
-   if (!S_ISDIR(dentry->d_inode->i_mode))
-   unionfs_postcopyup_release(dentry);
+   unionfs_postcopyup_release(dentry);
+   if (inode->i_nlink == 0) {
+   /* drop lower inodes */
+   iput(unionfs_lower_inode(inode));
+   unionfs_set_lower_inode(inode, NULL);
+   ibstart(inode) = ibend(inode) = -1;
+   }
d_drop(dentry);
/*
 * if unlink/whiteout succeeded, parent dir mtime has
-- 
1.5.2.2


[PATCH 29/30] Unionfs: implement d_iput method

2007-12-28 Thread Erez Zadok

This is needed to drop lower objects early enough, under certain conditions,
so the lower objects don't stay behind until umount(). [LTP testing]

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 0e89308..0369d93 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -484,7 +484,49 @@ out:
return;
 }
 
+/*
+ * Called when we're removing the last reference to our dentry.  So we
+ * should drop all lower references too.
+ */
+static void unionfs_d_iput(struct dentry *dentry, struct inode *inode)
+{
+   int bindex, rc;
+
+   unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+
+   if (dbstart(dentry) < 0)
+   goto drop_lower_inodes;
+   for (bindex = dbstart(dentry); bindex <= dbend(dentry); bindex++) {
+   if (unionfs_lower_mnt_idx(dentry, bindex)) {
+   unionfs_mntput(dentry, bindex);
+   unionfs_set_lower_mnt_idx(dentry, bindex, NULL);
+   }
+   if (unionfs_lower_dentry_idx(dentry, bindex)) {
+   dput(unionfs_lower_dentry_idx(dentry, bindex));
+   unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+   }
+   }
+   set_dbstart(dentry, -1);
+   set_dbend(dentry, -1);
+
+drop_lower_inodes:
+   rc = atomic_read(&inode->i_count);
+   if (rc == 1 && inode->i_nlink == 1 && ibstart(inode) >= 0) {
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
+   iput(unionfs_lower_inode(inode));
+   lockdep_on();
+   unionfs_set_lower_inode(inode, NULL);
+   /* XXX: may need to set start/end to -1? */
+   }
+
+   iput(inode);
+
+   unionfs_read_unlock(dentry->d_sb);
+}
+
 struct dentry_operations unionfs_dops = {
.d_revalidate   = unionfs_d_revalidate,
.d_release  = unionfs_d_release,
+   .d_iput = unionfs_d_iput,
 };
-- 
1.5.2.2


[PATCH 26/30] Unionfs: initialize namelist variable in rename

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/rename.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 1019d47..9306a2b 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -462,7 +462,7 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
}
 
if (S_ISDIR(new_dentry->d_inode->i_mode)) {
-   struct unionfs_dir_state *namelist;
+   struct unionfs_dir_state *namelist = NULL;
/* check if this unionfs directory is empty or not */
err = check_empty(new_dentry, &namelist);
if (err)
-- 
1.5.2.2


[PATCH 17/30] Unionfs: remove unnecessary locking in follow-link

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 37258c8..7ec9c1b 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -851,7 +851,8 @@ out:
  * nor do we need to revalidate it either.  It is safe to not lock our
  * dentry here, nor revalidate it, because unionfs_follow_link does not do
  * anything (prior to calling ->readlink) which could become inconsistent
- * due to branch management.
+ * due to branch management.  We also don't need to lock our super because
+ * this function isn't affected by branch-management.
  */
 static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
 {
@@ -859,8 +860,6 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
int len = PAGE_SIZE, err;
mm_segment_t old_fs;
 
-   unionfs_read_lock(dentry->d_sb);
-
/* This is freed by the put_link method assuming a successful call. */
buf = kmalloc(len, GFP_KERNEL);
if (unlikely(!buf)) {
@@ -885,7 +884,6 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
 out:
unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
-   unionfs_read_unlock(dentry->d_sb);
return ERR_PTR(err);
 }
 
-- 
1.5.2.2


[PATCH 06/30] Unionfs: initialize inode times for reused inodes

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/super.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index ed3eb04..c474c86 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -55,6 +55,14 @@ static void unionfs_read_inode(struct inode *inode)
 
inode->i_mapping->a_ops = &unionfs_aops;
 
+   /*
+* reset times so unionfs_copy_attr_all can keep our time invariants
+* right (upper inode time being the max of all lower ones).
+*/
+   inode->i_atime.tv_sec = inode->i_atime.tv_nsec = 0;
+   inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
+   inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
+
unionfs_read_unlock(inode->i_sb);
 }
 
-- 
1.5.2.2


[PATCH 10/30] Unionfs: mmap fixes

2007-12-28 Thread Erez Zadok

From: Hugh Dickins <[EMAIL PROTECTED]>

Remove !mapping_cap_writeback_dirty shortcircuit from unionfs_writepages.

It was introduced to avoid the stray AOP_WRITEPAGE_ACTIVATE coming from
shmem_writepage; but that has since been fixed in shmem_writepage and in
write_cache_pages.  It stayed because it looked like a good optimization,
not to waste time calling down to tmpfs when that would serve no purpose.

But in fact this optimization causes hangs when running LTP with unionfs
over tmpfs.  The problem is that the test comes at the wrong level: unionfs
has already declared in its default_backing_dev_info that it's playing by
cap_writeback_dirty rules.  If it does nothing here in its writepages, its
dirty pages accumulate and choke the system.  What's needed is to carry on
down and let its pages be cleaned while in turn they dirty the lower level.

And this now has an additional benefit for tmpfs, that a sync or pdflush
pushes these pages down to shmem_writepage, letting it match the filepage
coming from unionfs with the swap which may have been allocated earlier,
so it can free the duplication sooner than waiting for further pressure.

Remove unnecessary locking/code from prepare_write.  Handle if no lower
inodes in writepage.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/mmap.c |   29 +
 1 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 4d05352..aad2137 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -30,6 +30,11 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
 
BUG_ON(!PageUptodate(page));
inode = page->mapping->host;
+   /* if no lower inode, nothing to do */
+   if (!inode || !UNIONFS_I(inode) || !UNIONFS_I(inode)->lower_inodes) {
+   err = 0;
+   goto out;
+   }
lower_inode = unionfs_lower_inode(inode);
lower_mapping = lower_inode->i_mapping;
 
@@ -130,9 +135,6 @@ static int unionfs_writepages(struct address_space *mapping,
if (!lower_inode)
goto out;
 
-   if (!mapping_cap_writeback_dirty(lower_inode->i_mapping))
-   goto out;
-
err = generic_writepages(mapping, wbc);
if (!err)
unionfs_copy_attr_times(inode);
@@ -222,26 +224,13 @@ out:
 static int unionfs_prepare_write(struct file *file, struct page *page,
 unsigned from, unsigned to)
 {
-   int err;
-
-   unionfs_read_lock(file->f_path.dentry->d_sb);
/*
-* This is the only place where we unconditionally copy the lower
-* attribute times before calling unionfs_file_revalidate.  The
-* reason is that our ->write calls do_sync_write which in turn will
-* call our ->prepare_write and then ->commit_write.  Before our
-* ->write is called, the lower mtimes are in sync, but by the time
-* the VFS calls our ->commit_write, the lower mtimes have changed.
-* Therefore, the only reasonable time for us to sync up from the
-* changed lower mtimes, and avoid an invariant violation warning,
-* is here, in ->prepare_write.
+* Just copy lower inode attributes and return success.  Not much
+* else to do here.  No need to lock either (lockdep won't like it).
+* Let commit_write do all the hard work instead.
 */
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
-   err = unionfs_file_revalidate(file, true);
-   unionfs_check_file(file);
-   unionfs_read_unlock(file->f_path.dentry->d_sb);
-
-   return err;
+   return 0;
 }
 
 static int unionfs_commit_write(struct file *file, struct page *page,
-- 
1.5.2.2


[PATCH 24/30] Unionfs: update inode times after a successful open

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 4077907..b8357a7 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -583,10 +583,13 @@ out:
kfree(UNIONFS_F(file));
}
 out_nofree:
-   unionfs_check_inode(inode);
if (!err) {
+   dentry = file->f_path.dentry;
+   unionfs_copy_attr_times(dentry->d_parent->d_inode);
+   unionfs_copy_attr_times(inode);
unionfs_check_file(file);
-   unionfs_check_dentry(file->f_path.dentry->d_parent);
+   unionfs_check_dentry(dentry->d_parent);
+   unionfs_check_inode(inode);
}
unionfs_read_unlock(inode->i_sb);
return err;
-- 
1.5.2.2


[PATCH 16/30] Unionfs: remove unnecessary lock in read_inode

2007-12-28 Thread Erez Zadok

Our read_inode doesn't need to grab the superblock rwsem because there is no
chance it could be affected by branch management.  But our read_inode was
called from other places which did need to grab that rwsem, and lockdep
complained.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/super.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index c474c86..8b70aca 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -29,8 +29,6 @@ static void unionfs_read_inode(struct inode *inode)
int size;
struct unionfs_inode_info *info = UNIONFS_I(inode);
 
-   unionfs_read_lock(inode->i_sb);
-
memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
info->bstart = -1;
info->bend = -1;
@@ -63,7 +61,6 @@ static void unionfs_read_inode(struct inode *inode)
inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
 
-   unionfs_read_unlock(inode->i_sb);
 }
 
 /*
-- 
1.5.2.2


[PATCH 25/30] Unionfs: minor cleanup in check_empty

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dirhelper.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index b40090a..4b73bb6 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -182,6 +182,7 @@ int check_empty(struct dentry *dentry, struct unionfs_dir_state **namelist)
 {
int err = 0;
struct dentry *lower_dentry = NULL;
+   struct vfsmount *mnt;
struct super_block *sb;
struct file *lower_file;
struct unionfs_rdutil_callback *buf = NULL;
@@ -226,15 +227,11 @@ int check_empty(struct dentry *dentry, struct unionfs_dir_state **namelist)
continue;
 
dget(lower_dentry);
-   unionfs_mntget(dentry, bindex);
+   mnt = unionfs_mntget(dentry, bindex);
branchget(sb, bindex);
-   lower_file =
-   dentry_open(lower_dentry,
-   unionfs_lower_mnt_idx(dentry, bindex),
-   O_RDONLY);
+   lower_file = dentry_open(lower_dentry, mnt, O_RDONLY);
if (IS_ERR(lower_file)) {
err = PTR_ERR(lower_file);
-   dput(lower_dentry);
branchput(sb, bindex);
goto out;
}
-- 
1.5.2.2


[PATCH 05/30] Unionfs: interpose cleanup and fix for spliced dentries

2007-12-28 Thread Erez Zadok

Fix unionfs_interpose to fill lower inode info when d_splice_alias returns
NULL.  Also cleanup impossible case (d_splice_alias doesn't return ERR_PTR).

Signed-off-by: Rachita Kothiyal <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 22aa6e6..ea8976d 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -149,9 +149,7 @@ skip:
break;
case INTERPOSE_LOOKUP:
spliced = d_splice_alias(inode, dentry);
-   if (IS_ERR(spliced)) {
-   err = PTR_ERR(spliced);
-   } else if (spliced && spliced != dentry) {
+   if (spliced && spliced != dentry) {
/*
 * d_splice can return a dentry if it was
 * disconnected and had to be moved.  We must ensure
@@ -169,6 +167,12 @@ skip:
unionfs_fill_inode(dentry, inode);
}
goto out_spliced;
+   } else if (!spliced) {
+   if (need_fill_inode) {
+   need_fill_inode = 0;
+   unionfs_fill_inode(dentry, inode);
+   goto out_spliced;
+   }
}
break;
case INTERPOSE_REVAL:
-- 
1.5.2.2


[PATCH 30/30] Unionfs: don't check parent dentries

2007-12-28 Thread Erez Zadok

Parent dentries may not be locked and may change, so don't check them.  But
do check parent inodes if they are passed to the method.  Also, ensure the
checks are done only if no error occurred.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/commonfops.c |1 -
 fs/unionfs/file.c   |1 -
 fs/unionfs/inode.c  |   23 +++
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index b8357a7..2c32ada 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -588,7 +588,6 @@ out_nofree:
unionfs_copy_attr_times(dentry->d_parent->d_inode);
unionfs_copy_attr_times(inode);
unionfs_check_file(file);
-   unionfs_check_dentry(dentry->d_parent);
unionfs_check_inode(inode);
}
unionfs_read_unlock(inode->i_sb);
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index b632042..0c424f6 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -66,7 +66,6 @@ out:
/* copyup could cause parent dir times to change */
unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
unionfs_check_file(file);
-   unionfs_check_dentry(file->f_path.dentry->d_parent);
}
unionfs_read_unlock(file->f_path.dentry->d_sb);
return err;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 740d364..6095c4f 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -157,7 +157,6 @@ out:
 
unionfs_check_inode(parent);
if (!err) {
-   unionfs_check_dentry(dentry->d_parent);
unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
}
@@ -207,14 +206,16 @@ static struct dentry *unionfs_lookup(struct inode *parent,
}
 
unionfs_check_inode(parent);
-   unionfs_check_dentry(dentry);
-   unionfs_check_dentry(dentry->d_parent);
-   unionfs_check_nd(nd);
-   if (!IS_ERR(ret))
+   if (!IS_ERR(ret)) {
+   unionfs_check_dentry(dentry);
+   unionfs_check_nd(nd);
unionfs_unlock_dentry(dentry);
+   }
 
-   if (dentry != dentry->d_parent)
+   if (dentry != dentry->d_parent) {
+   unionfs_check_dentry(dentry->d_parent);
unionfs_unlock_dentry(dentry->d_parent);
+   }
unionfs_read_unlock(dentry->d_sb);
 
return ret;
@@ -520,8 +521,7 @@ out:
 
unionfs_check_inode(parent);
if (!err)
-   unionfs_check_dentry(dentry->d_parent);
-   unionfs_check_dentry(dentry);
+   unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
@@ -815,8 +815,7 @@ out:
 
unionfs_check_inode(parent);
if (!err)
-   unionfs_check_dentry(dentry->d_parent);
-   unionfs_check_dentry(dentry);
+   unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
@@ -1110,8 +1109,8 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
/* if setattr succeeded, then parent dir may have changed */
unionfs_copy_attr_times(dentry->d_parent->d_inode);
 out:
-   unionfs_check_dentry(dentry);
-   unionfs_check_dentry(dentry->d_parent);
+   if (!err)
+   unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
 
-- 
1.5.2.2


[PATCH 09/30] Unionfs: release special files on copyup

2007-12-28 Thread Erez Zadok

If we copyup a special file (char, block, etc.), then dput the source
object.
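
As a rough userspace illustration of what this change covers (a sketch, not
the kernel code), the two predicates below mirror the old and new conditions
in the diff, using the standard S_IS* macros from <sys/stat.h>.  The function
names old_needs_dput and new_needs_dput are made up for this example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for S_ISLNK, S_IFLNK, S_IFSOCK */
#endif
#include <stdbool.h>
#include <sys/stat.h>

/* Old condition: only directories and symlinks were dput explicitly. */
static bool old_needs_dput(mode_t mode)
{
    return S_ISDIR(mode) || S_ISLNK(mode);
}

/* New condition: everything that is not a regular file is dput. */
static bool new_needs_dput(mode_t mode)
{
    return !S_ISREG(mode);
}
```

The two predicates agree for directories, symlinks and regular files, and
differ for char devices, block devices, FIFOs and sockets, which are the
special files this patch is about.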

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/copyup.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 3fe4865..f48209f 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -505,13 +505,12 @@ out_unlock:
 
 out_free:
/*
-* If old_lower_dentry was a directory, we need to dput it.  If it
-* was a file, then it was already dput indirectly by other
+* If old_lower_dentry was not a file, then we need to dput it.  If
+* it was a file, then it was already dput indirectly by other
 * functions we call above which operate on regular files.
 */
if (old_lower_dentry && old_lower_dentry->d_inode &&
-   (S_ISDIR(old_lower_dentry->d_inode->i_mode) ||
-S_ISLNK(old_lower_dentry->d_inode->i_mode)))
+   !S_ISREG(old_lower_dentry->d_inode->i_mode))
dput(old_lower_dentry);
kfree(symbuf);
 
-- 
1.5.2.2


[PATCH 07/30] Unionfs: create new special files only in first branch

2007-12-28 Thread Erez Zadok

When creating a new special file, always create it in the first branch,
which is always writeable, not in the branch which may have a whiteout in
it.  This makes the policy for the creation of new special files consistent
with that of new files/directories, as well as improves efficiency a bit.

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/inode.c |  174 +++
 1 files changed, 92 insertions(+), 82 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 63ff3d3..8076d0b 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -686,15 +686,15 @@ out:
return err;
 }
 
-static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
+static int unionfs_mknod(struct inode *parent, struct dentry *dentry, int mode,
 dev_t dev)
 {
int err = 0;
-   struct dentry *lower_dentry = NULL, *whiteout_dentry = NULL;
+   struct dentry *lower_dentry = NULL;
+   struct dentry *wh_dentry = NULL;
struct dentry *lower_parent_dentry = NULL;
-   int bindex = 0, bstart;
char *name = NULL;
-   int whiteout_unlinked = 0;
+   int valid = 0;
 
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -705,115 +705,125 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
goto out;
}
 
-   bstart = dbstart(dentry);
-
-   lower_dentry = unionfs_lower_dentry(dentry);
+   /*
+* It's only a bug if this dentry was not negative and couldn't be
+* revalidated (shouldn't happen).
+*/
+   BUG_ON(!valid && dentry->d_inode);
 
/*
-* check if whiteout exists in this branch, i.e. lookup .wh.foo
-* first.
+* We shouldn't create things in a read-only branch; this check is a
+* bit redundant as we don't allow branch 0 to be read-only at the
+* moment
 */
-   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
-   if (unlikely(IS_ERR(name))) {
-   err = PTR_ERR(name);
+   err = is_robranch_super(dentry->d_sb, 0);
+   if (err) {
+   err = -EROFS;
goto out;
}
 
-   whiteout_dentry = lookup_one_len(name, lower_dentry->d_parent,
-dentry->d_name.len + UNIONFS_WHLEN);
-   if (IS_ERR(whiteout_dentry)) {
-   err = PTR_ERR(whiteout_dentry);
-   goto out;
-   }
+   /*
+* We _always_ create on branch 0
+*/
+   lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+   if (lower_dentry) {
+   /*
+* check if whiteout exists in this branch, i.e. lookup .wh.foo
+* first.
+*/
+   name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+   if (unlikely(IS_ERR(name))) {
+   err = PTR_ERR(name);
+   goto out;
+   }
 
-   if (!whiteout_dentry->d_inode) {
-   dput(whiteout_dentry);
-   whiteout_dentry = NULL;
-   } else {
-   /* found .wh.foo, unlink it */
-   lower_parent_dentry = lock_parent(whiteout_dentry);
+   wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+  dentry->d_name.len + UNIONFS_WHLEN);
+   if (IS_ERR(wh_dentry)) {
+   err = PTR_ERR(wh_dentry);
+   wh_dentry = NULL;
+   goto out;
+   }
 
-   /* found a.wh.foo entry, remove it then do vfs_mkdir */
-   err = is_robranch_super(dentry->d_sb, bstart);
-   if (!err)
-   err = vfs_unlink(lower_parent_dentry->d_inode,
-whiteout_dentry);
-   dput(whiteout_dentry);
+   if (wh_dentry->d_inode) {
+   /*
+* .wh.foo has been found, so let's unlink it
+*/
+   struct dentry *lower_dir_dentry;
 
-   unlock_dir(lower_parent_dentry);
+   lower_dir_dentry = lock_parent(wh_dentry);
+   err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+   unlock_dir(lower_dir_dentry);
 
-   if (err) {
-   if (!IS_COPYUP_ERR(err))
+   /*
+* Whiteouts are special files and should be deleted
+* no matter what (as if they never existed), in
+* order to allow this create operation to succeed.
+* This is especially important in sticky
+* directories: a whiteout

[PATCH 23/30] Unionfs: set our superblock a/m/ctime granularity

2007-12-28 Thread Erez Zadok

Set it to 1 ns, because we could be stacked on top of file systems with such
granularity.
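
To illustrate why the granularity matters when stacking, here is a minimal
userspace sketch; it is not kernel code.  The names toy_ts, trunc_to_gran
and upper_matches_lower are hypothetical, and the model simply assumes that
a timestamp stored by the upper layer is rounded down to a multiple of that
layer's granularity.  Under that assumption, a one-second granularity makes
the upper copy differ from a lower timestamp that has a nanosecond part,
while a granularity of 1 ns keeps the two equal.

```c
#include <assert.h>

#define TOY_NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for a kernel timestamp. */
struct toy_ts {
	long sec;
	long nsec;
};

/* Round a timestamp down to a multiple of gran nanoseconds. */
struct toy_ts trunc_to_gran(struct toy_ts t, long gran)
{
	assert(gran >= 1 && gran <= TOY_NSEC_PER_SEC);
	t.nsec -= t.nsec % gran;
	return t;
}

/*
 * Return 1 if the upper layer's copy of a lower timestamp, stored at
 * upper_gran, still compares equal to the lower timestamp.
 */
int upper_matches_lower(struct toy_ts lower, long upper_gran)
{
	struct toy_ts upper = trunc_to_gran(lower, upper_gran);

	return upper.sec == lower.sec && upper.nsec == lower.nsec;
}
```

In this model a time-based coherency check would see a spurious mismatch
whenever the upper granularity is coarser than the lower one.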

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/main.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 92f0e9d..23c18f7 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -642,6 +642,13 @@ static int unionfs_read_super(struct super_block *sb, void *raw_data,
/* max Bytes is the maximum bytes from highest priority branch */
sb->s_maxbytes = unionfs_lower_super_idx(sb, 0)->s_maxbytes;
 
+   /*
+* Our c/m/atime granularity is 1 ns because we may stack on file
+* systems whose granularity is as good.  This is important for our
+* time-based cache coherency.
+*/
+   sb->s_time_gran = 1;
+
sb->s_op = &unionfs_sops;
 
/* See comment next to the definition of unionfs_d_alloc_root */
-- 
1.5.2.2

[PATCH 22/30] Unionfs: handle on lower inodes in lookup

2007-12-28 Thread Erez Zadok

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/dentry.c |6 +-
 fs/unionfs/lookup.c |5 +
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index b207a6f..0e89308 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -151,8 +151,12 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
valid = false;
}
 
-   if (!dentry->d_inode)
+   if (!dentry->d_inode ||
+   ibstart(dentry->d_inode) < 0 ||
+   ibend(dentry->d_inode) < 0) {
valid = false;
+   goto out;
+   }
 
if (valid) {
/*
diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 85a85aa..b9ee072 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -225,6 +225,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
wh_lower_dentry = NULL;
 
/* Now do regular lookup; lookup foo */
+   BUG_ON(!lower_dir_dentry);
lower_dentry = lookup_one_len(name, lower_dir_dentry, namelen);
if (IS_ERR(lower_dentry)) {
dput(first_lower_dentry);
@@ -315,6 +316,10 @@ out_negative:
UNIONFS_I(dentry->d_inode)->stale = 1;
goto out;
}
+   if (!lower_dir_dentry) {
+   err = -ENOENT;
+   goto out;
+   }
/* This should only happen if we found a whiteout. */
if (first_dentry_offset == -1) {
first_lower_dentry = lookup_one_len(name, lower_dir_dentry,
-- 
1.5.2.2

[PATCH 19/30] Unionfs: prevent false lockdep warnings in stacking

2007-12-28 Thread Erez Zadok

A stackable file system like unionfs often performs an operation on a lower
file system, by calling a vfs_* method, having been called possibly by the
very same method from the VFS.  Both calls to the vfs_* method grab a lock
in the same lock class, and hence lockdep complains.  This warning is a
false positive in instances where unionfs only calls the vfs_* method on
lower objects; there's a strict lock ordering here: upper objects first,
then lower objects.

We want to prevent these false positives so that lockdep does not shut down
and can still warn us about real locking problems.
So, we temporarily turn off lockdep ONLY AROUND the calls to vfs methods to
which we pass lower objects, and only for those instances where lockdep
complained.  While this solution may seem unclean, it is not without
precedent: other places in the kernel also do similar temporary disabling,
of course after carefully having checked that it is the right thing to do.

In the long run, lockdep needs to be taught how to handle stacking.
Then this patch can be removed.  It is likely that such lockdep-stacking
support will do essentially the same as this patch: consider the same
ordering (upper then lower) and consider upper vs. lower locks to be in
different classes.
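
The pattern described above can be sketched with a toy userspace model.
This is not the kernel's lockdep: struct toy_lockdep and every toy_*
function are hypothetical, and the model assumes only that the validator
warns when a lock of an already-held class is taken, and that an off/on
pair suspends tracking through a depth counter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HELD 8
#define TOY_INODE_MUTEX_CLASS 1

/* Toy lock-class validator state. */
struct toy_lockdep {
	int held[TOY_MAX_HELD];	/* classes of currently held locks */
	int nheld;
	int off_depth;		/* > 0 while validation is suspended */
	int warnings;		/* same-class nesting reports */
};

void toy_lockdep_off(struct toy_lockdep *ld)
{
	ld->off_depth++;
}

void toy_lockdep_on(struct toy_lockdep *ld)
{
	assert(ld->off_depth > 0);
	ld->off_depth--;
}

void toy_acquire(struct toy_lockdep *ld, int lock_class)
{
	int i;

	if (ld->off_depth)
		return;		/* not tracked while validation is off */
	for (i = 0; i < ld->nheld; i++)
		if (ld->held[i] == lock_class)
			ld->warnings++;
	assert(ld->nheld < TOY_MAX_HELD);
	ld->held[ld->nheld++] = lock_class;
}

void toy_release(struct toy_lockdep *ld, int lock_class)
{
	if (ld->off_depth)
		return;
	assert(ld->nheld > 0 && ld->held[ld->nheld - 1] == lock_class);
	ld->nheld--;
}

/* Models a vfs_* method: take an inode lock, then call into the fs. */
void toy_vfs_unlink(struct toy_lockdep *ld,
		    void (*fs_unlink)(struct toy_lockdep *))
{
	toy_acquire(ld, TOY_INODE_MUTEX_CLASS);
	if (fs_unlink)
		fs_unlink(ld);
	toy_release(ld, TOY_INODE_MUTEX_CLASS);
}

/* Stacked fs calling the same method on a lower object, unguarded. */
void stacked_unlink_plain(struct toy_lockdep *ld)
{
	toy_vfs_unlink(ld, NULL);
}

/* Same call, with validation suspended only around the lower call. */
void stacked_unlink_guarded(struct toy_lockdep *ld)
{
	toy_lockdep_off(ld);
	toy_vfs_unlink(ld, NULL);
	toy_lockdep_on(ld);
}
```

Calling toy_vfs_unlink() with stacked_unlink_plain records one warning in
this model, while stacked_unlink_guarded records none and leaves the
validator enabled for everything outside the guarded call.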

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 Documentation/filesystems/unionfs/issues.txt |   10 +++---
 fs/unionfs/copyup.c  |3 +++
 fs/unionfs/inode.c   |   21 +++--
 fs/unionfs/rename.c  |   21 +++--
 fs/unionfs/super.c   |4 
 fs/unionfs/unlink.c  |   12 ++--
 6 files changed, 54 insertions(+), 17 deletions(-)

diff --git a/Documentation/filesystems/unionfs/issues.txt b/Documentation/filesystems/unionfs/issues.txt
index bb6ab05..f4b7e7e 100644
--- a/Documentation/filesystems/unionfs/issues.txt
+++ b/Documentation/filesystems/unionfs/issues.txt
@@ -17,8 +17,12 @@ KNOWN Unionfs 2.x ISSUES:
an upper object, and then a lower object, in a strict order to avoid
locking problems; in addition, Unionfs, as a fan-out file system, may
have to lock several lower inodes.  We are currently looking into Lockdep
-   to see how to make it aware of stackable file systems.  In the meantime,
-   if you get any warnings from Lockdep, you can safely ignore them (or feel
-   free to report them to the Unionfs maintainers, just to be sure).
+   to see how to make it aware of stackable file systems.  For now, we
+   temporarily disable lockdep when calling vfs methods on lower objects,
+   but only for those places where lockdep complained.  While this solution
+   may seem unclean, it is not without precedent: other places in the kernel
+   also do similar temporary disabling, of course after carefully having
+   checked that it is the right thing to do.  Anyway, if you get any warnings
+   from Lockdep, please report them to the Unionfs maintainers.
 
 For more information, see <http://unionfs.filesystems.org/>.
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index f48209f..0012caf 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -297,11 +297,14 @@ static int __copyup_reg_data(struct dentry *dentry,
break;
}
 
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
write_bytes =
output_file->f_op->write(output_file,
 (char __user *)buf,
 read_bytes,
 &output_file->f_pos);
+   lockdep_on();
if ((write_bytes < 0) || (write_bytes < read_bytes)) {
err = write_bytes;
break;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 3df9b19..4890f42 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -80,7 +80,10 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
struct dentry *lower_dir_dentry;
 
lower_dir_dentry = lock_parent(wh_dentry);
+   /* see Documentation/filesystems/unionfs/issues.txt */
+   lockdep_off();
err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+   lockdep_on();
unlock_dir(lower_dir_dentry);
 
/*
@@ -262,9 +265,13 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
/* found a .wh.foo entry, unlink it and then call vfs_link() */
lower_dir_dentry = lock_parent(whiteout_dentry);
err = is_robranch_super(new_dentry->d_sb, dbstart(new_dentry));
-

[PATCH 12/30] Unionfs: remove custom read/write methods

2007-12-28 Thread Erez Zadok

Having them results in lockdep warnings about holding locks and then
grabbing locks of the same class in do_sync_read/write, which were called
from unionfs_read/write.  All they did was revalidate our file object sooner,
which will now be deferred till a bit later.  Instead, use the generic
do_sync_read and do_sync_write.
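
As a rough userspace illustration of the change, the sketch below models an
operations table whose read slot first points at a thin wrapper and then
directly at the generic routine.  All toy_* names are hypothetical; the
wrapper's "revalidation" is reduced to a counter, and no locking is
modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical file object: a byte buffer plus a revalidation counter. */
struct toy_file {
	const char *data;
	size_t len;
	int revalidations;
};

typedef long (*toy_read_fn)(struct toy_file *f, char *buf,
			    size_t count, size_t *pos);

struct toy_fops {
	toy_read_fn read;
};

/* Generic read: copy up to count bytes starting at *pos. */
long toy_generic_read(struct toy_file *f, char *buf,
		      size_t count, size_t *pos)
{
	size_t left = *pos < f->len ? f->len - *pos : 0;

	if (count > left)
		count = left;
	memcpy(buf, f->data + *pos, count);
	*pos += count;
	return (long)count;
}

/* Custom wrapper: revalidate first, then defer to the generic read. */
long toy_custom_read(struct toy_file *f, char *buf,
		     size_t count, size_t *pos)
{
	f->revalidations++;	/* the only extra work in this model */
	return toy_generic_read(f, buf, count, pos);
}

/* Table before the change: read goes through the wrapper. */
const struct toy_fops toy_fops_before = { .read = toy_custom_read };

/* Table after the change: read points at the generic routine. */
const struct toy_fops toy_fops_after = { .read = toy_generic_read };
```

Both tables return the same data in this model; the only difference is
whether the extra revalidation step runs in the read path.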

Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
---
 fs/unionfs/file.c |   46 ++
 1 files changed, 2 insertions(+), 44 deletions(-)

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index c922173..94784c3 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -18,48 +18,6 @@
 
 #include "union.h"
 
-static ssize_t unionfs_read(struct file *file, char __user *buf,
-   size_t count, loff_t *ppos)
-{
-   int err;
-
-   unionfs_read_lock(file->f_path.dentry->d_sb);
-   err = unionfs_file_revalidate(file, false);
-   if (unlikely(err))
-   goto out;
-   unionfs_check_file(file);
-
-   err = do_sync_read(file, buf, count, ppos);
-
-out:
-   unionfs_check_file(file);
-   unionfs_read_unlock(file->f_path.dentry->d_sb);
-   return err;
-}
-
-static ssize_t unionfs_write(struct file *file, const char __user *buf,
-size_t count, loff_t *ppos)
-{
-   int err = 0;
-
-   unionfs_read_lock(file->f_path.dentry->d_sb);
-   err = unionfs_file_revalidate(file, true);
-   if (unlikely(err))
-   goto out;
-   unionfs_check_file(file);
-
-   err = do_sync_write(file, buf, count, ppos);
-   /* update our inode times upon a successful lower write */
-   if (err >= 0) {
-   unionfs_copy_attr_times(file->f_path.dentry->d_inode);
-   unionfs_check_file(file);
-   }
-
-out:
-   unionfs_read_unlock(file->f_path.dentry->d_sb);
-   return err;
-}
-
 static int unionfs_file_readdir(struct file *file, void *dirent,
filldir_t filldir)
 {
@@ -210,9 +168,9 @@ out:
 
 struct file_operations unionfs_main_fops = {
.llseek = generic_file_llseek,
-   .read   = unionfs_read,
+   .read   = do_sync_read,
.aio_read   = generic_file_aio_read,
-   .write  = unionfs_write,
+   .write  = do_sync_write,
.aio_write  = generic_file_aio_write,
.readdir= unionfs_file_readdir,
.unlocked_ioctl = unionfs_ioctl,
-- 
1.5.2.2
