Re: [PATCH] reiser4: fix readv

2006-09-18 Thread Christoph Hellwig
On Sun, Sep 17, 2006 at 10:39:07PM +0400, Vladimir V. Saveliev wrote:
 It seems the problem can be fixed a bit simpler. Currently, there is a 
 difference between read and readv: read calls f_op-read if it is defined,
 but readv calls f_op-read if f_op-aio_read is not defined. The latest is a 
 bit unlogical imho: 
 wouldn't it be more consistent if readv worked via f_op-read if it is 
 defined?
 If we fixed readv (do_readv_writev) that way - reiser4 would not need 
 anything from its aio_read but generic_file_aio_read.

This behaviour is historic baggae.  readv used to call -readv and fall
back to -read when it wasn't available.  We now merged -readv into
-aio_read and kept that behaviour.  I think it makes sense, though - if
the filesystem supports vator operations via -aio_read we should use
them and not fall back to the manual looping around -read.

I'd rather change read to do the same thing as readv so we have
consistent behaviour.

why does this matter for reiser4?



Re: [PATCH] reiser4: fix readv

2006-09-18 Thread Vladimir V. Saveliev
Hello

On Monday 18 September 2006 16:30, Christoph Hellwig wrote:
 On Sun, Sep 17, 2006 at 10:39:07PM +0400, Vladimir V. Saveliev wrote:
  It seems the problem can be fixed a bit simpler. Currently, there is a 
  difference between read and readv: read calls f_op-read if it is defined,
  but readv calls f_op-read if f_op-aio_read is not defined. The latest is 
  a bit unlogical imho: 
  wouldn't it be more consistent if readv worked via f_op-read if it is 
  defined?
  If we fixed readv (do_readv_writev) that way - reiser4 would not need 
  anything from its aio_read but generic_file_aio_read.
 
 This behaviour is historic baggae.  readv used to call -readv and fall
 back to -read when it wasn't available.  We now merged -readv into
 -aio_read and kept that behaviour.  I think it makes sense, though - if
 the filesystem supports vator operations via -aio_read we should use
 them and not fall back to the manual looping around -read.
 
 I'd rather change read to do the same thing as readv so we have
 consistent behaviour.
 

Can we put it that way: if a filesystem can use page_cache directly - it has to 
set f_op-aio_read to generic_file_aio_read
and set f_op-read to NULL. If the filesystem wants to try to screw things up a 
bit - 
it implements f_op-read and its f_op-read is called by sys_read and sys_readv
regardless to whether it has aio_read or not?

 why does this matter for reiser4?
 

reiser4 reads some files via generic page cache routines. In that case reiser4' 
read calls do_sync_read. 
Therefore, it has to define f_op-aio_read. 
OTOH, there are files for which reiser4' read does not call do_sync_read.

In case of readv, f_op-aio_read is called directly (if it is defined), which 
may result in that reiser4' aio_read is called for files for which 
reiser4' read would never call do_sync_read. 
To avoid the problem we have to implement reiser4_aio_read which either calls 
generic_file_aio_read or does something very similar to do_loop_readv_writev.


 


Re: [PATCH] reiser4: fix readv

2006-09-18 Thread Christoph Hellwig
On Mon, Sep 18, 2006 at 05:34:42PM +0400, Vladimir V. Saveliev wrote:
 Can we put it that way: if a filesystem can use page_cache directly - it has 
 to set f_op-aio_read to generic_file_aio_read
 and set f_op-read to NULL. If the filesystem wants to try to screw things up 
 a bit - 
 it implements f_op-read and its f_op-read is called by sys_read and 
 sys_readv
 regardless to whether it has aio_read or not?

Not entirely.  The idea I had in my mind was the following:

 - real filesystems should always implement the aio_ methods and must
   support async and vectored I/O
 - drivers or simple synthetic filesystems can implement just -read and
   -write and stay out of all the complexity.

In the end I would like to enforce and invariant where a fileesystem or
driver implements either the aio_ or normal methods, but we can't do
that yet as there are far too many places calling thos directly.  Thus
we have do_sync_read and do_sync_write that are used as read and write
methods for those more complex filesystems.  (Anyone's got an idea how
to enforce that people never set -read and -write to anything else
when they implement the aio methods?)

  why does this matter for reiser4?
  
 
 reiser4 reads some files via generic page cache routines. In that case 
 reiser4' read calls do_sync_read. 
 Therefore, it has to define f_op-aio_read. 
 OTOH, there are files for which reiser4' read does not call do_sync_read.
 
 In case of readv, f_op-aio_read is called directly (if it is defined), which 
 may result in that reiser4' aio_read is called for files for which 
 reiser4' read would never call do_sync_read. 
 To avoid the problem we have to implement reiser4_aio_read which either calls 
 generic_file_aio_read or does something very similar to do_loop_readv_writev.

In that case reiserfs should only implement aio_read and aio_write
methods and use do_loop_readv_writev which we should export for a
beginning.  Longer term you should try to implement full blown aio and
vector support even for those odd files (or find a way to migrate the
pagecache).  Do files change from odd to normal while they're
instanciated?  Otherwise you could just declare to sets of
file_operations, once that uses the pagecache and one that doesn't
and decide at inode instanciation time which one to use.


Re: [PATCH] reiser4: fix readv

2006-09-17 Thread Vladimir V. Saveliev
Hello

On Friday 15 September 2006 06:24, Andrew Morton wrote:
 On Wed, 13 Sep 2006 22:12:54 +0400
 Vladimir V. Saveliev [EMAIL PROTECTED] wrote:
 
  Hello, Andrew
  
  reiser4 in 2.6.18-rc6-mm2 has a bug. It can not do readv.
  
  The attached patch fixes it by implementing reiser4' aio_read file 
  operation.
  Unfortunately, it appeared to get a loop which is very similar to the one of
  fs/read_write.c:do_loop_readv_writev().
  Alternatively, if do_loop_readv_writev were EXPORT_SYMBOL-ed
  reiser4' aio_read could use it instead. But, there is a problem with 
  do_loop_readv_writev EXPORT_SYMBOL-ing:
  one if its arguments is io_fn_t, which is declared in fs/read_write.h.
  If it is ok to move io_fn_t and do_loop_readv_writev declarations to 
  include/linux/fs.h and to EXPORT_SYMBOL 
  do_loop_readv_writev the fix will be smaller. Please, let me know what 
  would you prefer.
  
 
 Yes, I'd say that do_loop_readv_writev() is suitable for exporting to
 filesystems, and that doing so is preferable to duplicating it.
 
 That'd be two patches, please: one to do the export and one to use it in
 reiser4.
 
 I assume there's a good reason why reiser4 cannot use
 generic_file_aio_read() or vfs_readv().  Please capture that discussion in
 the changelog for the first patch, thanks.
 

It seems the problem can be fixed a bit simpler. Currently, there is a 
difference between read and readv: read calls f_op-read if it is defined,
but readv calls f_op-read if f_op-aio_read is not defined. The latest is a 
bit unlogical imho: 
wouldn't it be more consistent if readv worked via f_op-read if it is defined?
If we fixed readv (do_readv_writev) that way - reiser4 would not need anything 
from its aio_read but generic_file_aio_read.


From: Vladimir Saveliev [EMAIL PROTECTED]

There is some asymmetry between read and readv:
read (vfs_read, namely) calls f_op-read when it is defined,
while readv (do_readv_writev) calls f_op-read when f_op-aio_read is not 
defined.
This patch makes do_readv_writev to call do_loop_readv_writev
(which calls f_op-read for each segment of i/o vector) if f_op-read is 
defined.

Signed-off-by: Vladimir Saveliev [EMAIL PROTECTED]




diff -puN fs/read_write.c~fix-do_readv_writev fs/read_write.c
--- linux-2.6.18-rc6-mm2/fs/read_write.c~fix-do_readv_writev2006-09-15 
17:46:03.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/read_write.c 2006-09-17 23:01:17.0 
+0400
@@ -608,7 +608,6 @@ static ssize_t do_readv_writev(int type,
if (ret)
goto out;
 
-   fnv = NULL;
if (type == READ) {
fn = file-f_op-read;
fnv = file-f_op-aio_read;
@@ -617,11 +616,11 @@ static ssize_t do_readv_writev(int type,
fnv = file-f_op-aio_write;
}
 
-   if (fnv)
+   if (fn)
+   ret = do_loop_readv_writev(file, iov, nr_segs, pos, fn);
+   else
ret = do_sync_readv_writev(file, iov, nr_segs, tot_len,
pos, fnv);
-   else
-   ret = do_loop_readv_writev(file, iov, nr_segs, pos, fn);
 
 out:
if (iov != iovstack)

_


Re: [PATCH] reiser4: fix readv

2006-09-17 Thread Andrew Morton
On Sun, 17 Sep 2006 22:39:07 +0400
Vladimir V. Saveliev [EMAIL PROTECTED] wrote:

 Hello
 
 On Friday 15 September 2006 06:24, Andrew Morton wrote:
  On Wed, 13 Sep 2006 22:12:54 +0400
  Vladimir V. Saveliev [EMAIL PROTECTED] wrote:
  
   Hello, Andrew
   
   reiser4 in 2.6.18-rc6-mm2 has a bug. It can not do readv.
   
   The attached patch fixes it by implementing reiser4' aio_read file 
   operation.
   Unfortunately, it appeared to get a loop which is very similar to the one 
   of
   fs/read_write.c:do_loop_readv_writev().
   Alternatively, if do_loop_readv_writev were EXPORT_SYMBOL-ed
   reiser4' aio_read could use it instead. But, there is a problem with 
   do_loop_readv_writev EXPORT_SYMBOL-ing:
   one if its arguments is io_fn_t, which is declared in fs/read_write.h.
   If it is ok to move io_fn_t and do_loop_readv_writev declarations to 
   include/linux/fs.h and to EXPORT_SYMBOL 
   do_loop_readv_writev the fix will be smaller. Please, let me know what 
   would you prefer.
   
  
  Yes, I'd say that do_loop_readv_writev() is suitable for exporting to
  filesystems, and that doing so is preferable to duplicating it.
  
  That'd be two patches, please: one to do the export and one to use it in
  reiser4.
  
  I assume there's a good reason why reiser4 cannot use
  generic_file_aio_read() or vfs_readv().  Please capture that discussion in
  the changelog for the first patch, thanks.
  
 
 It seems the problem can be fixed a bit simpler. Currently, there is a 
 difference between read and readv: read calls f_op-read if it is defined,
 but readv calls f_op-read if f_op-aio_read is not defined. The latest is a 
 bit unlogical imho: 
 wouldn't it be more consistent if readv worked via f_op-read if it is 
 defined?
 If we fixed readv (do_readv_writev) that way - reiser4 would not need 
 anything from its aio_read but generic_file_aio_read.
 
 
 From: Vladimir Saveliev [EMAIL PROTECTED]
 
 There is some asymmetry between read and readv:
 read (vfs_read, namely) calls f_op-read when it is defined,
 while readv (do_readv_writev) calls f_op-read when f_op-aio_read is not 
 defined.
 This patch makes do_readv_writev to call do_loop_readv_writev
 (which calls f_op-read for each segment of i/o vector) if f_op-read is 
 defined.
 
 Signed-off-by: Vladimir Saveliev [EMAIL PROTECTED]
 
 
 
 
 diff -puN fs/read_write.c~fix-do_readv_writev fs/read_write.c
 --- linux-2.6.18-rc6-mm2/fs/read_write.c~fix-do_readv_writev  2006-09-15 
 17:46:03.0 +0400
 +++ linux-2.6.18-rc6-mm2-vs/fs/read_write.c   2006-09-17 23:01:17.0 
 +0400
 @@ -608,7 +608,6 @@ static ssize_t do_readv_writev(int type,
   if (ret)
   goto out;
  
 - fnv = NULL;
   if (type == READ) {
   fn = file-f_op-read;
   fnv = file-f_op-aio_read;
 @@ -617,11 +616,11 @@ static ssize_t do_readv_writev(int type,
   fnv = file-f_op-aio_write;
   }
  
 - if (fnv)
 + if (fn)
 + ret = do_loop_readv_writev(file, iov, nr_segs, pos, fn);
 + else
   ret = do_sync_readv_writev(file, iov, nr_segs, tot_len,
   pos, fnv);
 - else
 - ret = do_loop_readv_writev(file, iov, nr_segs, pos, fn);
  
  out:
   if (iov != iovstack)
 
 _

OK, thanks.  I'll wait to hear from Badari and Christoph on that.




Re: [PATCH] reiser4: fix readv

2006-09-17 Thread Andrew Morton

btw, please be aware that the post-2.6.17 deadlock fixes which went into
generic_file_buffered_write() caused a few problems:

a) Significant NFS overwrite performance regression (due to running
   prepare_write/commit_write) against each writev segment.

b) Smaller but significant performance regression for all writev()
   usages (I expect - haven't measured this yet).

c) It's still theoretically deadlockable.

I am (slowly) working on addressing these issues - more info later.

Basically, we will again need to be able to traverse multiple iovec
segments within a single prepare_write/commit_write instance.  This will
impact your batch_write patch, because that patch somewhat codifies in the
function layout the concept that we do one-segment-pre-prepare_write().

So that patch will need significant rework, I suspect.

What I'm hoping to be able to do is to simply revert the two deadlock-fix
patches (so we go back to the 2.6.17 code) and to remove
fault_in_pages_readable() and to simply unlock the page after
-prepare_write() and lock it again prior to -commit_write().  But I
haven't tried that yet - there's quite a bit of preparatory work needed.


Re: [PATCH] reiser4: fix readv

2006-09-14 Thread Christian Trefzer
Hi Vladimir,

this fixes a bunch of random user space oddities for me. Just patched
2.6.18-rc6-mm2 with this, and some annoyances I could not trace back to
a change in user space just went bye-bye.

Thanks a lot!
Chris


pgpJBZDdCOUrz.pgp
Description: PGP signature


Re: [PATCH] reiser4: fix readv

2006-09-14 Thread Peter
On Thu, 14 Sep 2006 11:28:29 +0200, Christian Trefzer wrote:

 Hi Vladimir,
 
 this fixes a bunch of random user space oddities for me. Just patched
 2.6.18-rc6-mm2 with this, and some annoyances I could not trace back to
 a change in user space just went bye-bye.
 
 Thanks a lot!
 Chris

This does not patch against 2.6.17-3 patchset however. Any possibility
this may be related to some startup issues as noted on other threads here?

Thx

-- 
Peter
+
Do not reply to this email, it is a spam trap and not monitored.
I can be reached via this list, or via 
jabber: pete4abw at jabber.org
ICQ: 73676357



Re: [PATCH] reiser4: fix readv

2006-09-14 Thread Christian Trefzer
On Thu, Sep 14, 2006 at 10:35:02AM +, Peter wrote:
 This does not patch against 2.6.17-3 patchset however. Any possibility
 this may be related to some startup issues as noted on other threads here?

I have seen the issues at bootup you noticed some time ago, but only
once after the root fs had been mounted in a different way, i.e. laptop
hdd stuffed into a USB case and connected to my desktop, mounted to
/mnt/something.

The most annoying problems that have been fixed for me were related to
gentoo's postfix and spamd startup scripts. For some strange reason,
they did not detect properly that postfix had been started / spamd had
been stopped. Now everything seems kosher - I survived a few
suspend/resume cycles without the hickups that have become common during
the last few months. Sadly, I had more serious trouble to worry about,
otherwise I had bugged the list myself : P

Kind regards,
Chris


pgpgZzZU2mOck.pgp
Description: PGP signature


Re: [PATCH] reiser4: fix readv

2006-09-14 Thread Vladimir V. Saveliev
Hello

On Thursday 14 September 2006 14:35, Peter wrote:
 On Thu, 14 Sep 2006 11:28:29 +0200, Christian Trefzer wrote:
 
  Hi Vladimir,
  
  this fixes a bunch of random user space oddities for me. Just patched
  2.6.18-rc6-mm2 with this, and some annoyances I could not trace back to
  a change in user space just went bye-bye.
  
  Thanks a lot!
  Chris
 
 This does not patch against 2.6.17-3 patchset however. 

2.6.17-3 does not have the problem which that patch fixes. So you do not need 
it.

 Any possibility 
 this may be related to some startup issues as noted on other threads here?
 

no. 

 Thx
 


[PATCH] reiser4: fix readv

2006-09-13 Thread Vladimir V. Saveliev
Hello, Andrew

reiser4 in 2.6.18-rc6-mm2 has a bug. It can not do readv.

The attached patch fixes it by implementing reiser4' aio_read file operation.
Unfortunately, it appeared to get a loop which is very similar to the one of
fs/read_write.c:do_loop_readv_writev().
Alternatively, if do_loop_readv_writev were EXPORT_SYMBOL-ed
reiser4' aio_read could use it instead. But, there is a problem with 
do_loop_readv_writev EXPORT_SYMBOL-ing:
one if its arguments is io_fn_t, which is declared in fs/read_write.h.
If it is ok to move io_fn_t and do_loop_readv_writev declarations to 
include/linux/fs.h and to EXPORT_SYMBOL 
do_loop_readv_writev the fix will be smaller. Please, let me know what would 
you prefer.


From: Vladimir Saveliev [EMAIL PROTECTED]

This patch adds implementation of aio_read file operation for reiser4.
It is needed because in reiser4 there are files which can not be dealt
with via generic page cache routines.
In case of readv, reiser4 has no meaning to find out file type and to choose 
proper
way to read it. As result generic page cache read gets called for files which 
can not be 
read that way. Reiser4' aio_read method is to fix that problem. 

Signed-off-by: Vladimir Saveliev [EMAIL PROTECTED]




diff -puN fs/reiser4/plugin/object.c~reiser4-add-aio_read 
fs/reiser4/plugin/object.c
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/object.c~reiser4-add-aio_read
2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/object.c  2006-09-13 
20:18:23.0 +0400
@@ -101,7 +101,7 @@ file_plugin file_plugins[LAST_FILE_PLUGI
.llseek = generic_file_llseek,
.read = read_unix_file,
.write = do_sync_write,
-   .aio_read = generic_file_aio_read,
+   .aio_read = aio_read_unix_file,
.aio_write = generic_file_aio_write,
.ioctl = ioctl_unix_file,
.mmap = mmap_unix_file,
diff -puN fs/reiser4/plugin/file/file.c~reiser4-add-aio_read 
fs/reiser4/plugin/file/file.c
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/file/file.c~reiser4-add-aio_read 
2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/file/file.c   2006-09-13 
20:52:30.0 +0400
@@ -2011,6 +2011,54 @@ out:
return result;
 }
 
+/**
+ * aio_read_unix_file - aio_read of struct file_operations
+ * @iocb: i/o control block
+ * @iov: i/o vector
+ * @nr_segs: number of segments in the i/o vector
+ * @pos: file position to read from
+ *
+ * When it is called within reiser4 context (this happens when sys_read is
+ * reading a file built of extents) - just call generic_file_aio_read to
+ * perform read into page cache. When it is called without reiser4 context
+ * (sys_readv) - call read_unix_file for each segments of i/o vector, so that
+ * read_unix_file will be able to choose whether the file is to be read into
+ * page cache or the file is built of tail items and page cache read is not
+ * suitable for it.
+ */
+ssize_t aio_read_unix_file(struct kiocb *iocb, const struct iovec *iov,
+  unsigned long nr_segs, loff_t pos)
+{
+   ssize_t ret = 0;
+
+   if (is_in_reiser4_context())
+   return generic_file_aio_read(iocb, iov, nr_segs, pos);
+
+   while (nr_segs  0) {
+   void __user *base;
+   size_t len;
+   ssize_t nr;
+
+   base = iov-iov_base;
+   len = iov-iov_len;
+   iov++;
+   nr_segs--;
+
+   nr = read_unix_file(iocb-ki_filp, base, len, iocb-ki_pos);
+   if (nr  0) {
+   if (!ret)
+   ret = nr;
+   break;
+   }
+   ret += nr;
+   if (nr != len)
+   break;
+   }
+
+   return ret;
+
+}
+
 static ssize_t read_unix_file_container_tails(
struct file *file, char __user *buf, size_t read_amount, loff_t *off)
 {
diff -puN fs/reiser4/plugin/file/file.h~reiser4-add-aio_read 
fs/reiser4/plugin/file/file.h
--- linux-2.6.18-rc6-mm2/fs/reiser4/plugin/file/file.h~reiser4-add-aio_read 
2006-09-13 20:18:23.0 +0400
+++ linux-2.6.18-rc6-mm2-vs/fs/reiser4/plugin/file/file.h   2006-09-13 
20:18:23.0 +0400
@@ -15,6 +15,8 @@ int setattr_unix_file(struct dentry *, s
 /* file operations */
 ssize_t read_unix_file(struct file *, char __user *buf, size_t read_amount,
   loff_t *off);
+ssize_t aio_read_unix_file(struct kiocb *, const struct iovec *,
+  unsigned long nr_segs, loff_t pos);
 ssize_t write_unix_file(struct file *, const char __user *buf, size_t 
write_amount,
loff_t * off);
 int ioctl_unix_file(struct inode *, struct file *, unsigned int cmd,

_