On 13/09/16(Tue) 12:23, Martin Pieuchot wrote:
> Here's the big scary diff I've been using for some months now to stop
> grabbing the KERNEL_LOCK() in bpf_mtap(9).  This was originally
> written to prevent a lock ordering problem inside pf_test().  Now
> that we're heading toward using a rwlock, we won't have this problem,
> but fewer usages of the KERNEL_LOCK() are still interesting.
>
> I'm going to split this diff into small chunks to ease the review.
> But I'd appreciate it if people could try to break it, test & report
> back.
>
> Some notes:
>
>  - Now that selwakeup() is called in a thread context (task) we only
>    rely on the KERNEL_LOCK() to serialize access to kqueue(9) data.
>
>  - The reference counting is here to make sure a descriptor is not
>    freed during a sleep.  That's why the KERNEL_LOCK() is still
>    necessary in the slow path.  On the other hand bpf_catchpacket()
>    relies on the reference guaranteed by the SRP list.
>
>  - A mutex now protects the rotating buffers and their associated
>    fields.  It is dropped before calling ifpromisc() because USB
>    devices sleep.
>
>  - The dance around uiomove(9) is here to check that buffers aren't
>    rotated while data is copied to userland.  Setting ``d->bd_fbuf''
>    to NULL should be enough to let bpf_catchpacket() drop the
>    packet.  But I added ``__in_uiomove'' to be able to get a usable
>    panic if something weird happens.
>
> Comments?
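
To make that last note concrete, here is a rough sketch of the dance,
using the names from the quoted mail (``bd_mtx'' and the
``__in_uiomove'' flag come from the big diff, not from the chunk
below); this is my reading of the description, not the committed code:

	/*
	 * Read side: copy the hold buffer out with the mutex dropped,
	 * because uiomove(9) may sleep on a page fault.  While we are
	 * inside uiomove(), bd_fbuf stays NULL, so bpf_catchpacket()
	 * cannot rotate the buffers under us.
	 */
	d->__in_uiomove = 1;
	mtx_leave(&d->bd_mtx);
	error = uiomove(d->bd_hbuf, d->bd_hlen, uio);
	mtx_enter(&d->bd_mtx);
	d->__in_uiomove = 0;
	/* Recycle the hold buffer as the new free buffer. */
	d->bd_fbuf = d->bd_hbuf;
	d->bd_hbuf = NULL;
	d->bd_hlen = 0;

	/* Write side, in bpf_catchpacket(), when the store buffer fills: */
	if (d->bd_fbuf == NULL) {
		/* A read is still in progress, drop the packet. */
		d->bd_dcount++;
		return;
	}
	if (d->__in_uiomove)
		panic("bpf: buffer rotation during uiomove");
	ROTATE_BUFFERS(d);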
I've got many test reports but no actual review...  anyone?

Here's another extracted diff to move forward.  Let bpf_allocbufs()
fail when it cannot allocate memory; this way we can call it while
holding a mutex.

ok?

Index: net/bpf.c
===================================================================
RCS file: /cvs/src/sys/net/bpf.c,v
retrieving revision 1.150
diff -u -p -r1.150 bpf.c
--- net/bpf.c	16 Oct 2016 18:05:41 -0000	1.150
+++ net/bpf.c	14 Nov 2016 09:02:51 -0000
@@ -92,7 +92,7 @@ int bpf_maxbufsize = BPF_MAXBUFSIZE;
 struct bpf_if	*bpf_iflist;
 LIST_HEAD(, bpf_d) bpf_d_list;
 
-void	bpf_allocbufs(struct bpf_d *);
+int	bpf_allocbufs(struct bpf_d *);
 void	bpf_ifname(struct ifnet *, struct ifreq *);
 int	_bpf_mtap(caddr_t, const struct mbuf *, u_int,
 	    void (*)(const void *, void *, size_t));
@@ -996,6 +996,7 @@ int
 bpf_setif(struct bpf_d *d, struct ifreq *ifr)
 {
 	struct bpf_if *bp, *candidate = NULL;
+	int error = 0;
 	int s;
 
 	/*
@@ -1012,30 +1013,33 @@ bpf_setif(struct bpf_d *d, struct ifreq 
 			candidate = bp;
 	}
 
-	if (candidate != NULL) {
-		/*
-		 * Allocate the packet buffers if we need to.
-		 * If we're already attached to requested interface,
-		 * just flush the buffer.
-		 */
-		if (d->bd_sbuf == NULL)
-			bpf_allocbufs(d);
-		s = splnet();
-		if (candidate != d->bd_bif) {
-			if (d->bd_bif)
-				/*
-				 * Detach if attached to something else.
-				 */
-				bpf_detachd(d);
+	/* Not found. */
+	if (candidate == NULL)
+		return (ENXIO);
 
-			bpf_attachd(d, candidate);
-		}
-		bpf_reset_d(d);
-		splx(s);
-		return (0);
+	/*
+	 * Allocate the packet buffers if we need to.
+	 * If we're already attached to requested interface,
+	 * just flush the buffer.
+	 */
+	s = splnet();
+	if (d->bd_sbuf == NULL) {
+		if ((error = bpf_allocbufs(d)))
+			goto out;
 	}
-	/* Not found. */
-	return (ENXIO);
+	if (candidate != d->bd_bif) {
+		if (d->bd_bif)
+			/*
+			 * Detach if attached to something else.
+			 */
+			bpf_detachd(d);
+
+		bpf_attachd(d, candidate);
+	}
+	bpf_reset_d(d);
+out:
+	splx(s);
+	return (error);
 }
 
 /*
@@ -1434,13 +1438,23 @@ bpf_catchpacket(struct bpf_d *d, u_char 
 /*
  * Initialize all nonzero fields of a descriptor.
  */
-void
+int
bpf_allocbufs(struct bpf_d *d)
 {
-	d->bd_fbuf = malloc(d->bd_bufsize, M_DEVBUF, M_WAITOK);
-	d->bd_sbuf = malloc(d->bd_bufsize, M_DEVBUF, M_WAITOK);
+	d->bd_fbuf = malloc(d->bd_bufsize, M_DEVBUF, M_NOWAIT);
+	if (d->bd_fbuf == NULL)
+		return (ENOMEM);
+
+	d->bd_sbuf = malloc(d->bd_bufsize, M_DEVBUF, M_NOWAIT);
+	if (d->bd_sbuf == NULL) {
+		free(d->bd_fbuf, M_DEVBUF, d->bd_bufsize);
+		return (ENOMEM);
+	}
+
 	d->bd_slen = 0;
 	d->bd_hlen = 0;
+
+	return (0);
 }
 
 void
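
The reason for M_NOWAIT above: malloc(9) with M_WAITOK may sleep, and
sleeping while holding a mutex is forbidden.  With the error return in
place, a later chunk of the series can do something like the following
under the per-descriptor mutex (``bd_mtx'' is from the big diff, not
from this chunk; the sketch is mine, not the committed code):

	mtx_enter(&d->bd_mtx);
	if (d->bd_sbuf == NULL) {
		/* Cannot sleep here, hence M_NOWAIT in bpf_allocbufs(). */
		error = bpf_allocbufs(d);
		if (error) {
			mtx_leave(&d->bd_mtx);
			return (error);
		}
	}
	/* ... attach and reset with the mutex held ... */
	mtx_leave(&d->bd_mtx);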