On 07/12/16(Wed) 00:44, Alexander Bluhm wrote:
> On Mon, Nov 28, 2016 at 04:01:14PM +0100, Martin Pieuchot wrote:
> > @@ -717,7 +753,9 @@ bpfioctl(dev_t dev, u_long cmd, caddr_t
> > *(u_int *)addr = size = bpf_maxbufsize;
> > else if (size < BPF_MINBUFSIZE)
> > *(u_int *)addr = size = BPF_MINBUFSIZE;
> > + mtx_enter(&d->bd_mtx);
> > d->bd_bufsize = size;
> > + mtx_leave(&d->bd_mtx);
> > }
> > break;
> >
>
> bd_bufsize needs protection, but you only add it in BIOCSBLEN,
> you should also put a mutex in BIOCGBLEN.
I'm not sure to understand why. The value is only written in ioctl(2)
context. The mutex is here to make sure the value wont change while
bpf_catchpacket() is executed.
> > @@ -778,30 +815,36 @@ bpfioctl(dev_t dev, u_long cmd, caddr_t
> > * Get device parameters.
> > */
> > case BIOCGDLT:
> > + mtx_enter(&d->bd_mtx);
> > if (d->bd_bif == NULL)
> > error = EINVAL;
> > else
> > *(u_int *)addr = d->bd_bif->bif_dlt;
> > + mtx_leave(&d->bd_mtx);
> > break;
>
> bd_bif is not protected elsewhere. bif_dlt is only set during attach.
> So I think you need no mutex here.
Indeed.
> >
> > /*
> > * Set device parameters.
> > */
> > case BIOCSDLT:
> > + mtx_enter(&d->bd_mtx);
> > if (d->bd_bif == NULL)
> > error = EINVAL;
> > else
> > error = bpf_setdlt(d, *(u_int *)addr);
> > + mtx_leave(&d->bd_mtx);
> > break;
>
> bd_bif needs not protection, so put the mutex only in the else path.
> Then it is obvious why it is there.
Ack.
> >
> > /*
> > * Set interface name.
> > */
> > case BIOCGETIF:
> > + mtx_enter(&d->bd_mtx);
> > if (d->bd_bif == NULL)
> > error = EINVAL;
> > else
> > bpf_ifname(d->bd_bif->bif_ifp, (struct ifreq *)addr);
> > + mtx_leave(&d->bd_mtx);
> > break;
>
> The interface name is not used by bpf without kernel lock. You do
> not need a mutex here.
Ack.
> > @@ -1151,6 +1197,8 @@ filt_bpfread(struct knote *kn, long hint
> > {
> > struct bpf_d *d = kn->kn_hook;
> >
> > + KERNEL_ASSERT_LOCKED();
> > +
> > kn->kn_data = d->bd_hlen;
> > if (d->bd_immediate)
> > kn->kn_data += d->bd_slen;
>
> You need the mutex here to access bd_slen.
Ack.
> > @@ -1232,12 +1279,10 @@ _bpf_mtap(caddr_t arg, const struct mbuf
> > if (!gottime++)
> > microtime(&tv);
> >
> > - KERNEL_LOCK();
> > - s = splnet();
> > + mtx_enter(&d->bd_mtx);
> > bpf_catchpacket(d, (u_char *)m, pktlen, slen, cpfn,
> > &tv);
> > - splx(s);
> > - KERNEL_UNLOCK();
> > + mtx_leave(&d->bd_mtx);
> >
> > if (d->bd_fildrop)
> > drop = 1;
>
> bd_fildrop can be changed by ioctl. Can we expect to read the
> correct value without a memory barrier?
>
> Same question applies to bd_dirfilt.
We probably need some. I guess that the actual code works because these
events happen occasionally.
> bd_rcount is a long value incremented atomically in _bpf_mtap().
> It is read from an other CPU with kernel lock during an ioctl. Is
> this sufficient to get a consisten value? Do we also need an atomic
> operation when reading it?
Do we need a consistent value?
> > --- net/bpfdesc.h 22 Aug 2016 10:40:36 -0000 1.31
> > +++ net/bpfdesc.h 28 Nov 2016 14:51:06 -0000
> > @@ -56,15 +56,17 @@ struct bpf_d {
> > * fbuf (free) - When read is done, put cluster here.
> > * On receiving, if sbuf is full and fbuf is 0, packet is dropped.
> > */
> > + struct mutex bd_mtx; /* protect buffer slots below */
> > caddr_t bd_sbuf; /* store slot */
> > caddr_t bd_hbuf; /* hold slot */
> > caddr_t bd_fbuf; /* free slot */
> > int bd_slen; /* current length of store buffer */
> > int bd_hlen; /* current length of hold buffer */
> > -
> > int bd_bufsize; /* absolute length of buffers */
>
> So bd_mtx is protecting the fields above. Everything below must
> either be not accessed without kernel lock, or only be set during
> attach. Is that correct or do I have a wrong understanding of the
> SMP model?
I'm not sure I understood your sentence, but I think you got the model
right :)
> > - struct bpf_if * bd_bif; /* interface descriptor */
> > + int __in_uiomove;
>
> Why do you prefix it with __ ? bd_uiomove would be more consistent.
Because it's a debug trick, I'm not convinced we should keep it.