[RFC 0/2] readdir() as an inode operation

2007-10-20 Thread Jan Blunck
This is a first attempt at making readdir() an inode operation. This is
necessary for a VFS implementation of "something like union-mounts", where a
readdir() needs to read the contents of multiple directories. In addition,
the new interface no longer passes the struct file to the filesystem
implementations.

Comments, please?
Jan

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 1/2] i_op->readdir: Change readdir() to be an inode operation

2007-10-20 Thread Jan Blunck
This patch adds a new readdir() inode operation. Its purpose is to enable
the VFS to support directory reading on a stack of directories. The new
interface no longer passes the struct file to the filesystem implementation.
Normally the filesystem implementation shouldn't depend on any information
in struct file except for the dentry, the cookie (f_pos) and the user's
credentials.

The new interface for the readdir inode operation is as follows:

int (*readdir) (struct dentry *dentry, loff_t *pos, void *private,
filldir_t filler, void *dirent);

@dentry: the dentry of the directory
@pos: pointer to the cookie
@private: the credentials (at the moment it is still filp->private_data)
@filler: the filldir to call
@dirent: the dirent buffer
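
To illustrate the calling convention described above, here is a minimal
userspace sketch, not kernel code. Every mock_* name is hypothetical, and the
filler callback is deliberately simpler than the kernel's filldir_t. It shows
a readdir that needs only the dentry, the cookie and the filler, and a caller
that walks a stack of directories with a separate cookie per layer. Duplicate
names across layers are not filtered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_BUF_MAX 8

/* Hypothetical stand-ins for loff_t, filldir_t and struct dentry. */
typedef long long mock_loff_t;
typedef int (*mock_filldir_t)(void *dirent, const char *name,
                              mock_loff_t offset);

struct mock_dentry {
	const char *const *names;	/* entries of this directory */
	size_t count;
};

/* The "dirent buffer": collects the names handed to the filler. */
struct mock_buf {
	const char *names[MOCK_BUF_MAX];
	size_t used;
};

/* Filler: returns nonzero once the buffer is full. */
static int mock_fill(void *dirent, const char *name, mock_loff_t offset)
{
	struct mock_buf *buf = dirent;

	(void)offset;
	if (buf->used == MOCK_BUF_MAX)
		return 1;
	buf->names[buf->used++] = name;
	return 0;
}

/* readdir in the proposed shape: no struct file, cookie passed by pointer. */
static int mock_readdir(struct mock_dentry *dentry, mock_loff_t *pos,
                        void *private, mock_filldir_t filler, void *dirent)
{
	(void)private;	/* unused in this sketch */
	assert(dentry && pos && filler);
	while (*pos >= 0 && (size_t)*pos < dentry->count) {
		if (filler(dirent, dentry->names[*pos], *pos))
			break;	/* buffer full: *pos stays put for a later call */
		(*pos)++;
	}
	return 0;
}

/* Read every layer of a directory stack, topmost first. */
static size_t mock_read_stack(struct mock_dentry *layers, size_t nlayers,
                              struct mock_buf *buf)
{
	for (size_t i = 0; i < nlayers; i++) {
		mock_loff_t pos = 0;	/* each layer gets its own cookie */

		mock_readdir(&layers[i], &pos, NULL, mock_fill, buf);
	}
	return buf->used;
}
```

Because the callback never sees a struct file, the caller is free to invoke
it once per layer, which is the property the stacking case relies on.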

Signed-off-by: Jan Blunck <[EMAIL PROTECTED]>
---
 fs/readdir.c       |   14 ++++++++++++--
 include/linux/fs.h |    2 ++
 2 files changed, 14 insertions(+), 2 deletions(-)

Index: b/fs/readdir.c
===
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -23,7 +24,8 @@ int vfs_readdir(struct file *file, filld
 {
struct inode *inode = file->f_path.dentry->d_inode;
int res = -ENOTDIR;
-   if (!file->f_op || !file->f_op->readdir)
+   if ((!file->f_op || !file->f_op->readdir) &&
+   (!inode->i_op || !inode->i_op->readdir))
goto out;
 
res = security_file_permission(file, MAY_READ);
@@ -33,7 +35,15 @@ int vfs_readdir(struct file *file, filld
mutex_lock(&inode->i_mutex);
res = -ENOENT;
if (!IS_DEADDIR(inode)) {
-   res = file->f_op->readdir(file, buf, filler);
+   if (inode->i_op->readdir) {
+   printk(KERN_DEBUG "i_op->readdir @ ");
+   print_ip_sym((unsigned long)inode->i_op);
+   res = inode->i_op->readdir(file->f_path.dentry,
+  &file->f_pos,
+  file->private_data,
+  filler, buf);
+   } else
+   res = file->f_op->readdir(file, buf, filler);
file_accessed(file);
}
mutex_unlock(&inode->i_mutex);
Index: b/include/linux/fs.h
===
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1214,6 +1214,8 @@ struct inode_operations {
int (*mkdir) (struct inode *,struct dentry *,int);
int (*rmdir) (struct inode *,struct dentry *);
int (*mknod) (struct inode *,struct dentry *,int,dev_t);
+   /* readdir(dentry, position, private/credential, filler, buffer) */
+   int (*readdir) (struct dentry *, loff_t *, void *, filldir_t, void *);
int (*rename) (struct inode *, struct dentry *,
struct inode *, struct dentry *);
int (*readlink) (struct dentry *, char __user *,int);

-- 



[RFC 2/2] i_op->readdir: Change libfs users to the new interface

2007-10-20 Thread Jan Blunck
This patch converts dcache_readdir() to the new readdir inode operation
interface. Consequently, all users of libfs.c are changed to use the new
interface too.

Signed-off-by: Jan Blunck <[EMAIL PROTECTED]>
---
 fs/autofs4/autofs_i.h |5 ++---
 fs/autofs4/root.c |   41 -
 fs/cifs/inode.c   |1 +
 fs/hugetlbfs/inode.c  |1 +
 fs/libfs.c|   27 ++-
 fs/ocfs2/dlm/dlmfs.c  |1 +
 fs/ramfs/inode.c  |1 +
 include/linux/fs.h|3 ++-
 mm/shmem.c|1 +
 9 files changed, 47 insertions(+), 34 deletions(-)

Index: b/fs/autofs4/autofs_i.h
===
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -168,10 +168,9 @@ static inline int autofs4_ispending(stru
return pending;
 }
 
-static inline void autofs4_copy_atime(struct file *src, struct file *dst)
+static inline void autofs4_copy_atime(struct inode *src, struct inode *dst)
 {
-   dst->f_path.dentry->d_inode->i_atime =
-   src->f_path.dentry->d_inode->i_atime;
+   dst->i_atime = src->i_atime;
return;
 }
 
Index: b/fs/autofs4/root.c
===
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -35,7 +35,6 @@ const struct file_operations autofs4_roo
.open   = dcache_dir_open,
.release= dcache_dir_close,
.read   = generic_read_dir,
-   .readdir= autofs4_root_readdir,
.ioctl  = autofs4_root_ioctl,
 };
 
@@ -43,7 +42,6 @@ const struct file_operations autofs4_dir
.open   = autofs4_dir_open,
.release= autofs4_dir_close,
.read   = generic_read_dir,
-   .readdir= autofs4_dir_readdir,
 };
 
 const struct inode_operations autofs4_indirect_root_inode_operations = {
@@ -52,6 +50,7 @@ const struct inode_operations autofs4_in
.symlink= autofs4_dir_symlink,
.mkdir  = autofs4_dir_mkdir,
.rmdir  = autofs4_dir_rmdir,
+   .readdir= autofs4_root_readdir,
 };
 
 const struct inode_operations autofs4_direct_root_inode_operations = {
@@ -59,6 +58,7 @@ const struct inode_operations autofs4_di
.unlink = autofs4_dir_unlink,
.mkdir  = autofs4_dir_mkdir,
.rmdir  = autofs4_dir_rmdir,
+   .readdir= autofs4_root_readdir,
.follow_link= autofs4_follow_link,
 };
 
@@ -68,15 +68,17 @@ const struct inode_operations autofs4_di
.symlink= autofs4_dir_symlink,
.mkdir  = autofs4_dir_mkdir,
.rmdir  = autofs4_dir_rmdir,
+   .readdir= autofs4_dir_readdir,
 };
 
-static int autofs4_root_readdir(struct file *file, void *dirent,
-   filldir_t filldir)
+static int autofs4_root_readdir(struct dentry *dentry, loff_t *pos,
+   void *private,
+   filldir_t filldir, void *dirent)
 {
-   struct autofs_sb_info *sbi = autofs4_sbi(file->f_path.dentry->d_sb);
+   struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
int oz_mode = autofs4_oz_mode(sbi);
 
-   DPRINTK("called, filp->f_pos = %lld", file->f_pos);
+   DPRINTK("called, filp->f_pos = %lld", *pos);
 
/*
 * Don't set reghost flag if:
@@ -84,12 +86,12 @@ static int autofs4_root_readdir(struct f
 * 2) we haven't even enabled reghosting in the 1st place.
 * 3) this is the daemon doing a readdir
 */
-   if (oz_mode && file->f_pos == 0 && sbi->reghost_enabled)
+   if (oz_mode && *pos == 0 && sbi->reghost_enabled)
sbi->needs_reghost = 1;
 
DPRINTK("needs_reghost = %d", sbi->needs_reghost);
 
-   return dcache_readdir(file, dirent, filldir);
+   return dcache_inode_readdir(dentry, pos, private, filldir, dirent);
 }
 
 static int autofs4_dir_open(struct inode *inode, struct file *file)
@@ -201,15 +203,16 @@ out:
return status;
 }
 
-static int autofs4_dir_readdir(struct file *file, void *dirent, filldir_t filldir)
+static int autofs4_dir_readdir(struct dentry *dentry, loff_t *pos,
+  void *private,
+  filldir_t filldir, void *dirent)
 {
-   struct dentry *dentry = file->f_path.dentry;
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
-   struct dentry *cursor = file->private_data;
+   struct dentry *cursor = private;
int status;
 
-   DPRINTK("file=%p dentry=%p %.*s",
-   file, dentry, dentry->d_name.len, dentry->d_name.name);
+   DPRINTK("dentry=%p %.*s", dentry, dentry->d_name.len,
+   dentry->d_name.name);
 
if (autofs4_oz_mode(sbi))
goto out;
@@ -221,21 +224,25 @@ static int autofs4_dir_readdir(struct fi
 
if (d_mountpoint(dentry)) 

Re: Does "32.1% non-contiguous" mean severely fragmented?

2007-10-20 Thread Theodore Tso
On Sat, Oct 20, 2007 at 12:39:33PM +0900, Tetsuo Handa wrote:
> Theodore Tso wrote:
> > beginning of every single block group.  You have a small number of
> > files on your system (349) occupying an average of 348 megabytes.  So
> > it's not at all surprising that the contiguous percentage is 32%.
> I see, thank you. Yes, there are many files, split into 2GB each.
> 
> But what is surprising for me is that I have to wait for more than
> five minutes to save/restore the virtual machine's 512MB-RAM image
> (usually it takes less than five seconds).
> Hdparm reports DMA is on and e2fsck reports no errors,
> so I thought it is severely fragmented.
> Maybe I should back up all the virtual machine's data,
> format the partition, and restore it.

Well, that's a little drastic if you're not sure that what is going on
is fragmentation.

5 minutes to save/restore a 512MB RAM image, assuming that you are
saving somewhere around 576 megs of data, means you are writing less
than 2 megs/second.  That seems to point to something fundamentally
wrong, far worse than can be explained by fragmentation.
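
The arithmetic behind that estimate can be checked with a small sketch. The
helper name throughput_mb_per_s is hypothetical, and the 576 MB figure is the
assumption stated above, not a measured value.

```c
#include <assert.h>

/* Average throughput in megabytes per second for a transfer of
 * `megabytes` that took `seconds` of wall-clock time. */
static double throughput_mb_per_s(double megabytes, double seconds)
{
	assert(seconds > 0.0);
	return megabytes / seconds;
}
```

With 576 MB in 300 seconds this gives 1.92 MB/s. The same amount in the
usual five seconds would be about 115 MB/s, so the slowdown is roughly a
factor of sixty.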

First of all, what does the "filefrag" program (shipped as part of
e2fsprogs, not included in some distributions) say if you run it as
root on your VM data file?

Secondly, what results do you get when you run the command "hdparm -tT
/dev/sda" (or /dev/hda if you are using an IDE disk)?

This kind of performance regression is the sort of thing I see on my
laptop when I compile the kernel with the wrong options, and/or disable
AHCI mode in favor of compatibility mode, such that my laptop SATA
performance (as measured using hdparm) drops from 50 megs/second to 2
megs/second.

Regards,

- Ted