On Thu, Sep 17, 2009 at 03:50:36PM -0700, Alfred Perlstein wrote:

> Please do not make the option have the same name but different
> semantics.
> 
> Strongly suggest adding the Darwin name as a toggle and a FreeBSD
> name as a specific size option.

Then it may be:

       case F_RDAHEAD:
           arg = arg ? 128 * 1024 : 0;
           /* FALLTHROUGH F_READAHEAD */

       case F_READAHEAD:
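
To make the intended semantics concrete, here is a minimal user-space
sketch of that dispatch, not kernel code. The model_* names, the
MODEL_* command numbers and struct model_file are hypothetical
stand-ins for fcntl's switch, f_flag/O_READAHEAD and f_seqcount; the
rounding expression is the one from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command numbers for this model only, not the kernel's. */
enum { MODEL_F_READAHEAD = 1, MODEL_F_RDAHEAD = 2 };

/* Size implied by the Darwin-style toggle, as proposed above. */
#define MODEL_RDAHEAD_DEFAULT (128 * 1024)

/* Stand-in for the two struct file fields the patch touches. */
struct model_file {
	int readahead_on;	/* models O_READAHEAD in f_flag */
	uint64_t seqcount;	/* models f_seqcount, in blocks */
};

/*
 * Model of the two commands: F_RDAHEAD is a plain on/off toggle that
 * maps to a fixed size, F_READAHEAD takes an exact size in bytes.
 * bsize models the filesystem I/O size (f_iosize) and must be non-zero.
 */
static int
model_fcntl(struct model_file *fp, int cmd, uint64_t arg, uint64_t bsize)
{
	assert(bsize != 0);

	switch (cmd) {
	case MODEL_F_RDAHEAD:
		arg = arg ? MODEL_RDAHEAD_DEFAULT : 0;
		/* FALLTHROUGH */
	case MODEL_F_READAHEAD:
		if (arg != 0) {
			/* Round the byte count up to whole blocks. */
			fp->seqcount = (arg + bsize - 1) / bsize;
			fp->readahead_on = 1;
		} else {
			fp->readahead_on = 0;
		}
		return (0);
	default:
		return (-1);
	}
}
```

With a 16K block size, the toggle form gives 8 blocks, while an
explicit request for 100000 bytes rounds up to 7 blocks.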


> -Alfred
> 
> * Xin LI <delp...@delphij.net> [090917 15:27] wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > Hi, Igor,
> > 
> > Igor Sysoev wrote:
> > > Hi,
> > > 
> > > nginx-0.8.15 can use completely non-blocking sendfile() using the
> > > SF_NODISKIO flag. When sendfile() returns EBUSY, nginx calls aio_read()
> > > to read a single byte. The first aio_read() preloads the first 128K of
> > > the file into the VM cache; however, all successive aio_read()s preload
> > > just 16K of the file. This makes non-blocking sendfile() ineffective for
> > > files larger than 128K.
> > > 
> > > I've created a small patch for a Darwin-compatible F_RDAHEAD fcntl:
> > > 
> > >    fcntl(fd, F_RDAHEAD, preload_size)
> > > 
> > > There is a small incompatibility: Darwin's fcntl only allows read ahead
> > > to be enabled or disabled, while the proposed patch allows setting the
> > > exact preload size.
> > > 
> > > Currently the preload size affects only the vn_read() code path and does
> > > not affect the sendfile() code path. However, it can easily be extended
> > > to the sendfile() part too. The preload size is still limited by the
> > > sysctl vfs.read_max.
> > > 
> > > The patch is against FreeBSD 7.2 and was tested on FreeBSD 7.2-STABLE only.
> > 
> > I have ported this as a patch against -HEAD (should apply on 8.0-R but
> > it's too late for us to add a new feature) plus a manual page entry
> > documenting the feature.
> > 
> > I've used F_READAHEAD as the name, but reading the manual page, it looks
> > like we can just use F_RDAHEAD, since Darwin seems to distinguish only the
> > 0 and != 0 cases, so that programmers won't have to use #ifdef or something
> > else to get code working on different platforms?
> > 
> > Cheers,
> > - --
> > Xin LI <delp...@delphij.net>        http://www.delphij.net/
> > FreeBSD - The Power to Serve!
> > -----BEGIN PGP SIGNATURE-----
> > Version: GnuPG v2.0.12 (FreeBSD)
> > 
> > iEYEARECAAYFAkqyt40ACgkQi+vbBBjt66AdKgCfXOo/Vn+zw0cCjS+gGJUgPo8t
> > WToAmgKIXaVKsKUcqVOqTwHl4eTFsbkM
> > =uP3m
> > -----END PGP SIGNATURE-----
> 
> > Index: lib/libc/sys/fcntl.2
> > ===================================================================
> > --- lib/libc/sys/fcntl.2    (revision 197297)
> > +++ lib/libc/sys/fcntl.2    (working copy)
> > @@ -28,7 +28,7 @@
> >  .\"     @(#)fcntl.2        8.2 (Berkeley) 1/12/94
> >  .\" $FreeBSD$
> >  .\"
> > -.Dd March 8, 2008
> > +.Dd September 19, 2009
> >  .Dt FCNTL 2
> >  .Os
> >  .Sh NAME
> > @@ -241,6 +241,14 @@
> >  .Dv SA_RESTART
> >  (see
> >  .Xr sigaction 2 ) .
> > +.It Dv F_READAHEAD
> > +Set or clear the read ahead amount for sequential access to the third
> > +argument,
> > +.Fa arg ,
> > +which is rounded up to the nearest block size.
> > +A zero value in
> > +.Fa arg
> > +turns off read ahead.
> >  .El
> >  .Pp
> >  When a shared lock has been set on a segment of a file,
> > Index: sys/kern/kern_descrip.c
> > ===================================================================
> > --- sys/kern/kern_descrip.c (revision 197297)
> > +++ sys/kern/kern_descrip.c (working copy)
> > @@ -421,6 +421,7 @@
> >     struct vnode *vp;
> >     int error, flg, tmp;
> >     int vfslocked;
> > +   uint64_t bsize;
> >  
> >     vfslocked = 0;
> >     error = 0;
> > @@ -686,6 +687,31 @@
> >             vfslocked = 0;
> >             fdrop(fp, td);
> >             break;
> > +
> > +   case F_READAHEAD:
> > +           FILEDESC_SLOCK(fdp);
> > +           if ((fp = fdtofp(fd, fdp)) == NULL) {
> > +                   FILEDESC_SUNLOCK(fdp);
> > +                   error = EBADF;
> > +                   break;
> > +           }
> > +           if (fp->f_type != DTYPE_VNODE) {
> > +                   FILEDESC_SUNLOCK(fdp);
> > +                   error = EBADF;
> > +                   break;
> > +           }
> > +           fhold(fp);
> > +           FILEDESC_SUNLOCK(fdp);
> > +           if (arg) {
> > +                   bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
> > +                   fp->f_seqcount = (arg + bsize - 1) / bsize;
> > +                   fp->f_flag |= O_READAHEAD;
> > +           } else {
> > +                   fp->f_flag &= ~O_READAHEAD;
> > +           }
> > +           fdrop(fp, td);
> > +           break;
> > +
> >     default:
> >             error = EINVAL;
> >             break;
> > Index: sys/kern/vfs_vnops.c
> > ===================================================================
> > --- sys/kern/vfs_vnops.c    (revision 197297)
> > +++ sys/kern/vfs_vnops.c    (working copy)
> > @@ -312,6 +312,9 @@
> >  sequential_heuristic(struct uio *uio, struct file *fp)
> >  {
> >  
> > +   if (fp->f_flag & O_READAHEAD)
> > +           return (fp->f_seqcount << IO_SEQSHIFT);
> > +
> >     /*
> >      * Offset 0 is handled specially.  open() sets f_seqcount to 1 so
> >      * that the first I/O is normally considered to be slightly
> > Index: sys/sys/fcntl.h
> > ===================================================================
> > --- sys/sys/fcntl.h (revision 197297)
> > +++ sys/sys/fcntl.h (working copy)
> > @@ -112,7 +112,11 @@
> >  #if __BSD_VISIBLE
> >  /* Attempt to bypass buffer cache */
> >  #define O_DIRECT   0x00010000
> > +#ifdef _KERNEL
> > +/* Read ahead */
> > +#define O_READAHEAD        0x00020000
> >  #endif
> > +#endif
> >  
> >  /* Defined by POSIX Extended API Set Part 2 */
> >  #if __BSD_VISIBLE
> > @@ -218,6 +222,7 @@
> >  #define    F_SETLK         12              /* set record locking information */
> >  #define    F_SETLKW        13              /* F_SETLK; wait if blocked */
> >  #define    F_SETLK_REMOTE  14              /* debugging support for remote locks */
> > +#define    F_READAHEAD     15              /* read ahead */
> >  
> >  /* file descriptor flags (F_GETFD, F_SETFD) */
> >  #define    FD_CLOEXEC      1               /* close-on-exec flag */
> 
> > _______________________________________________
> > freebsd-hackers@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
> 
> -- 
> - Alfred Perlstein
> .- AMA, VMOA #5191, 03 vmax, 92 gs500, 85 ch250
> .- FreeBSD committer
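
On the #ifdef question raised above: with two names, an application
can still hide the difference in one small helper. A minimal sketch,
assuming F_READAHEAD takes a byte count as in the quoted patch and
F_RDAHEAD is the Darwin-style on/off toggle; set_readahead() is a
hypothetical name, not an existing API.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: ask for read ahead of `bytes` on fd using
 * whichever command <fcntl.h> provides. A zero `bytes` turns read
 * ahead off. Returns 0 on success, -1 with errno set on failure.
 */
static int
set_readahead(int fd, int bytes)
{
#if defined(F_READAHEAD)
	/* Exact-size form, as proposed in the patch. */
	return (fcntl(fd, F_READAHEAD, bytes) == -1 ? -1 : 0);
#elif defined(F_RDAHEAD)
	/* Toggle form: only zero versus non-zero matters. */
	return (fcntl(fd, F_RDAHEAD, bytes != 0) == -1 ? -1 : 0);
#else
	/* Neither command is available on this platform. */
	(void)fd;
	(void)bytes;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

A caller would use set_readahead(fd, 512 * 1024) right after open()
and treat a failure as non-fatal, since read ahead is only a hint.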

-- 
Igor Sysoev
http://sysoev.ru/en/