On Wed, Nov 30, 2011 at 10:01:12PM +0100, Giovanni Trematerra wrote:
> On Wed, Oct 5, 2011 at 6:56 PM, Konstantin Belousov <k...@freebsd.org> wrote:
> > Author: kib
> > Date: Wed Oct  5 16:56:06 2011
> > New Revision: 226042
> > URL: http://svn.freebsd.org/changeset/base/226042
> >
> > Log:
> >  Supply unique (st_dev, st_ino) value pair for the fstat(2) done on the 
> > pipes.
> >
> >  Reviewed by:  jhb, Peter Jeremy <peterjeremy acm org>
> >  MFC after:    2 weeks
> >
> > Modified:
> >  head/sys/kern/sys_pipe.c
> >  head/sys/sys/pipe.h
> >
> 
> Hi Konstantin,
> unfortunately your commit introduces a performance penalty of about 3%
> documented below in a real workload.
> I guess that fstat(2) on the pipes is seldom used so we could just be lazy
> and alloc a new unr number inside pipe_stat instead of during pipe creation.
> In that case if an application doesn't need fstat(2) on the pipes it
> won't be charged
> the cost of alloc_unr/free_unr for the fake inode number.
> The patch I propose, furthermore, fix a panic in the case alloc_unr
> failed to allocate
> a new unr number and return -1. This is because ino_t is unsigned and the test
> pipe_ino > 0 into pipeclose would be true, calling then free_unr when
> it hasn't to.
> The proposed patch was tested with a regression test code that you can find 
> here
> 
> http://www.trematerra.net/patches/pipe-fstat.c
> 
> Feel free to add it under tools/regression/pipe/
> 
> Here the proposed patch:
> 
> http://www.trematerra.net/patches/lazy_inoalloc.diff
> 
> Here the report of the benchmark:
> 
> Configuration
> 10.0-CURRENT updated to r22807.
> kern.smp.disabled=1 in /boot/loader.conf
> kernel config GENERIC without debugging options.
> 
> The first result of any test was discarded and not reported here.
> 
> here the result of three executions of
> # make -s buildkernel
> note that I managed to compile the same identical source files
> for all the tests.
> 
> r22807 with r226042 reverted (time in seconds)
> 527, 527, 527
> 
> r22807 (time in seconds)
> 544, 544, 544
> 
> r22807M with the proposed patch (time in seconds)
> 527, 528, 528
> 
> ministat output:
> 
> x r22807 with r226042 reverted
> + r22807
> * r22807M with the proposed patch
> +------------------------------------------------------------------------------+
> |+    *                                                                       
> x|
> |*    *                                                                       
> x|
> ||__A_M|                                                                      
> A|
> +------------------------------------------------------------------------------+
>     N           Min           Max        Median           Avg        Stddev
> x   3           544           544           544           544             0
> +   3           527           527           527           527             0
> Difference at 95.0% confidence
>       -17 +/- 0
>       -3.125% +/- 0%
>       (Student's t, pooled s = 0)
> *   3           527           528           528     527.66667    0.57735027
> Difference at 95.0% confidence
>       -16.3333 +/- 0.925333
>       -3.00245% +/- 0.170098%
>       (Student's t, pooled s = 0.408248)
> 
> --
> Gianni
Thank you for looking at this.
I committed the test, and the fix for the call to free_unr(-1).

Regarding the lazy allocation of the inode number, I agree with the idea,
but have some reservations against the implementation. If several threads
call fstat(2) on the same pipe which inode is not yet initialized, then
I see a race in the patch. The easiest workaround is to cover the inode
allocation with the pipe lock.

Also, I find the return of ENOMEM from fstat(2) somewhat questionable. The
error code is not documented as allowed for the syscall. I prefer to
not fail the fstat(2) if lazy allocation failed, but return some fake
value for inode instead.

Updated patch is below.

diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
index 2b6eb66..19fc98f 100644
--- a/sys/kern/sys_pipe.c
+++ b/sys/kern/sys_pipe.c
@@ -569,12 +569,7 @@ pipe_create(pipe, backing)
                /* If we're not backing this pipe, no need to do anything. */
                error = 0;
        }
-       if (error == 0) {
-               pipe->pipe_ino = alloc_unr(pipeino_unr);
-               if (pipe->pipe_ino == -1)
-                       /* pipeclose will clear allocated kva */
-                       error = ENOMEM;
-       }
+       pipe->pipe_ino = -1;
        return (error);
 }
 
@@ -1398,16 +1393,40 @@ pipe_stat(fp, ub, active_cred, td)
        struct ucred *active_cred;
        struct thread *td;
 {
-       struct pipe *pipe = fp->f_data;
+       struct pipe *pipe;
+       int new_unr;
 #ifdef MAC
        int error;
+#endif
 
+       pipe = fp->f_data;
        PIPE_LOCK(pipe);
+#ifdef MAC
        error = mac_pipe_check_stat(active_cred, pipe->pipe_pair);
-       PIPE_UNLOCK(pipe);
-       if (error)
+       if (error) {
+               PIPE_UNLOCK(pipe);
                return (error);
+       }
 #endif
+       /*
+        * Lazily allocate an inode number for the pipe.  Most pipe
+        * users do not call stat(2) on the pipe, which means that
+        * postponing the inode allocation until it is must be returned to
+        * userland is useful.  If alloc_unr failed, assign st_ino
+        * zero instead of returning an error.
+        * Special pipe_ino values:
+        *  -1 - not yet initialized;
+        *  0  - alloc_unr failed, return 0 as st_ino forever.
+        */
+       if (pipe->pipe_ino == (ino_t)-1) {
+               new_unr = alloc_unr(pipeino_unr);
+               if (new_unr != -1)
+                       pipe->pipe_ino = new_unr;
+               else
+                       pipe->pipe_ino = 0;
+       }
+       PIPE_UNLOCK(pipe);
+
        bzero(ub, sizeof(*ub));
        ub->st_mode = S_IFIFO;
        ub->st_blksize = PAGE_SIZE;

Attachment: pgpKmiECnYFnv.pgp
Description: PGP signature

Reply via email to