Re: [PATCH] Fix queued SIGIO

2000-09-19 Thread Chuck Lever

On Tue, 19 Sep 2000, Julian Anastasov wrote:
> On Mon, 18 Sep 2000, Andi Kleen wrote:
> 
> > >   SI_SIGIO is not generated by the kernel. The same goes for the
> > > other SI_ consts < 0 not defined with __SI_CODE.
> >
> > Ok, then you have already broken binary compatibility between 2.2 and 2.4
> 
>   Looking at the old kernels, it seems binary compatibility
> was broken in 2.3.21, when si_code started returning POLL_xxx events
> as described in "The Single UNIX ® Specification, Version 2",
> xsh/signal.h.html, instead of SI_SIGIO.
> 
>   SI_SIGIO in si_code for 2.2 does not return any information
> about the events. I even see that Redhat maintains a patch against 2.2
> to backport the POLL_xxx events from 2.3. Not sure about it after the
> changes in 2.4.0-test1. Anyway, 2.2 looks unusable to me and I don't see
> any other way for this problem to be fixed. Binary compatibility is
> impossible to maintain. Applications can support both ways: the old
> SI_SIGIO and the new POLL_xxx events (recompiled after test1) in si_code.

because the 2.2 kernels are already "broken" in this regard, i can't see
how binary compatibility between 2.2 and 2.4 could even be an
issue.  applications can't use this stuff in 2.2 without at least the
RedHat patch.

unless there's a problem implementing a glibc that runs on both 2.2 and
2.4, perhaps this should be revisited.

also, the test at issue here (from line 363 of kernel/signal.c):

	/* If this was sent by a rt mechanism, try again.  */
	if (info->si_code != SI_USER) {
		ret = -EAGAIN;
		goto out;
	}

has always been unclear as to its intent.  it seems like there is
overloading going on here -- if the real intent is to prevent users
without credentials from sending "kernel-only" signals, then that should
be the logic here.

>   The next step is for somebody to implement event merging and to
> allow receiving many events with one call. For the next kernels.

we just published a paper in the ALS proceedings describing our
implementation of a new system call similar to sigtimedwait() that
collects many events at once.
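
for comparison, here is how batch collection can be approximated today with
the existing interface: block the signal, then call sigtimedwait() with a
zero timeout in a loop until the queue is empty. this is only a sketch of
the user-space workaround, not the system call described in the paper; the
name drain_signals() is made up for the example, and it still costs one
system call per event, which is the overhead a batching call would avoid.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: dequeue up to max pending signals from set
 * without blocking. The signals in set must already be blocked by the
 * caller. Returns the number of siginfo entries stored in out.
 */
static size_t drain_signals(const sigset_t *set, siginfo_t *out, size_t max)
{
	const struct timespec zero = { 0, 0 };	/* poll, do not wait */
	size_t n = 0;

	/* sigtimedwait() returns the signal number, or -1 when none is pending */
	while (n < max && sigtimedwait(set, &out[n], &zero) > 0)
		n++;

	return n;
}
```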

- Chuck Lever
--
corporate:  <[EMAIL PROTECTED]>
personal:   <[EMAIL PROTECTED]>

The Linux Scalability project:
http://www.citi.umich.edu/projects/linux-scalability/




RE: [NFS] [PATCH] New patch to flush out dirty mmap()ed NFS pages in 2.4.4

2001-05-14 Thread Chuck Lever

the default behavior is that close() waits for all write-backs to
be committed to the server's disk.  you might add support for the
"nocto" mount option so that waiting is skipped for shared mmap'd
files, but then what happens to data that is pinned on the client
because a write-back failed after close() returns to the application?

what's the application domain Linus is trying to optimize?

> Linus was not too keen on the patches that circulated last week. In
> his concept of shared mmap(), he wants it to ignore the usual
> requirement we have on normal files whereby we flush out the pages on
> file close.  The problem is, though, that we need at least to schedule
> the writes using the correct credentials, and thus a compromise was to
> do this when closing the file...
>
> The following patch attempts to implement a fix along Linus'
> specification. Features are:
>
>  - Remove inode operation force_delete(). The latter is toxic to all
>    mmap(), as it can cause the inode to get killed before the dirty
>    pages get scheduled.
>
>  - Schedule dirty pages upon last fput() of the file.
>
>  - Always write out all dirty pages to the server when
>    locking/unlocking.
>
>  - Add a write_inode() method in order to allow bdflush() and
>    friends to force out writes when the user calls sys_sync(), when
>    umounting, or when memory pressure is high.
>
>  - Since we in any case have to add the write_inode() method, scrap
>    the NFS special O_SYNC code, so we can just use the generic stuff
>    (which will be faster for large writes).
>
> Comments?
>
> Cheers,
>   Trond
>
> diff -u --recursive --new-file linux-2.4.4-fixes/fs/nfs/file.c linux-2.4.4-mmap/fs/nfs/file.c
> --- linux-2.4.4-fixes/fs/nfs/file.c   Fri Feb  9 20:29:44 2001
> +++ linux-2.4.4-mmap/fs/nfs/file.c   Sat May 12 21:31:39 2001
> @@ -39,6 +39,7 @@
>  static ssize_t nfs_file_write(struct file *, const char *, size_t, loff_t *);
>  static int  nfs_file_flush(struct file *);
>  static int  nfs_fsync(struct file *, struct dentry *dentry, int datasync);
> +static int  nfs_file_release(struct inode *, struct file *);
>
>  struct file_operations nfs_file_operations = {
>   read:   nfs_file_read,
> @@ -46,7 +47,7 @@
>   mmap:   nfs_file_mmap,
>   open:   nfs_open,
>   flush:  nfs_file_flush,
> - release:        nfs_release,
> + release:        nfs_file_release,
>   fsync:  nfs_fsync,
>   lock:   nfs_lock,
>  };
> @@ -87,6 +88,13 @@
>   return status;
>  }
>
> +int
> +nfs_file_release(struct inode *inode, struct file *file)
> +{
> + filemap_fdatasync(inode->i_mapping);
> + return nfs_release(inode,file);
> +}
> +
>  static ssize_t
>  nfs_file_read(struct file * file, char * buf, size_t count, loff_t *ppos)
>  {
> @@ -283,9 +291,11 @@
>    * Flush all pending writes before doing anything
>    * with locks..
>    */
> - down(&filp->f_dentry->d_inode->i_sem);
> + filemap_fdatasync(inode->i_mapping);
> + down(&inode->i_sem);
>   status = nfs_wb_all(inode);
> - up(&filp->f_dentry->d_inode->i_sem);
> + up(&inode->i_sem);
> + filemap_fdatawait(inode->i_mapping);
>   if (status < 0)
>   return status;
>
> @@ -300,10 +310,12 @@
>    */
>   out_ok:
>   if ((cmd == F_SETLK || cmd == F_SETLKW) && fl->fl_type != F_UNLCK) {
> - down(&filp->f_dentry->d_inode->i_sem);
> + filemap_fdatasync(inode->i_mapping);
> + down(&inode->i_sem);
>   nfs_wb_all(inode);  /* we may have slept */
> + up(&inode->i_sem);
> + filemap_fdatawait(inode->i_mapping);
>   nfs_zap_caches(inode);
> - up(&filp->f_dentry->d_inode->i_sem);
>   }
>   return status;
>  }
> diff -u --recursive --new-file linux-2.4.4-fixes/fs/nfs/inode.c linux-2.4.4-mmap/fs/nfs/inode.c
> --- linux-2.4.4-fixes/fs/nfs/inode.c  Wed Apr 25 23:58:17 2001
> +++ linux-2.4.4-mmap/fs/nfs/inode.c   Sat May 12 23:54:16 2001
> @@ -45,6 +45,7 @@
>  static void nfs_invalidate_inode(struct inode *);
>
>  static void nfs_read_inode(struct inode *);
> +static void nfs_write_inode(struct inode *,int);
>  static void nfs_delete_inode(struct inode *);
>  static void nfs_put_super(struct super_block *);
>  static void nfs_umount_begin(struct super_block *);
> @@ -52,7 +53,7 @@
>
>  static struct super_operations nfs_sops = {
>   read_inode: nfs_read_inode,
> - put_inode:  force_delete,
> + write_inode:    nfs_write_inode,
>   delete_inode:   nfs_delete_inode,
>   put_super:  nfs_put_super,
>   statfs: nfs_statfs,
> @@ -113,6 +114,14 @@
>   NFS_CACHEINV(inode);
>   NFS_ATTRTIMEO(inode) = NFS_MINATTRTIMEO(inode);
>   NFS_ATTRTIMEO_UPDATE(inode) = jiffies;
> +}
> +
> +static void
> +nfs_write_inode(struct inode *inode, int sync)
> +{
> + int flags = sync ? FLUSH_WAIT : 0;
> +
> + nfs_