On Wed, Apr 28, 2021 at 03:31:41PM +0200, Claudio Jeker wrote:
> On Wed, Apr 28, 2021 at 02:56:27PM +0200, Alexandr Nedvedicky wrote:
> > Hello,
> > 
> > On Wed, Apr 28, 2021 at 02:25:26AM +0200, Alexander Bluhm wrote:
> > > On Wed, Apr 28, 2021 at 03:19:47AM +0300, Vitaliy Makkoveev wrote:
> > > > The code not only breaks loop but also cleans the queue. Should this be
> > > > kept?
> > > 
> > 
> >     this is a good point 
> > 
> > > In IPv6 nd6_cache_lladdr() we have neither queue nor loop, just a
> > > single mbuf on hold.  But we have that discard logic, too.
> > > 
> > >                         if (ln->ln_hold) {
> > >                                 struct mbuf *n = ln->ln_hold;
> > >                                 ln->ln_hold = NULL;
> > >                                 /*
> > >                                  * we assume ifp is not a p2p here, so 
> > > just
> > >                                  * set the 2nd argument as the 1st one.
> > >                                  */
> > >                                 ifp->if_output(ifp, n, rt_key(rt), rt);
> > >                                 if (ln->ln_hold == n) {
> > >                                         /* n is back in ln_hold. Discard. 
> > > */
> > >                                         m_freem(ln->ln_hold);
> > >                                         ln->ln_hold = NULL;
> > >                                 }
> > >                         }
> > > 
> > > So I guess there is a reason to clean up.
> > > 
> > 
> >     I would say the la->la_mq queue should be empty once we dispatch all
> >     packets to wire. If it is not empty, then something went wrong and 
> > packets
> >     got recirculated back to la->la_mq. I think dropping them is better
> >     plan than keeping them in queue. There is not much we can do in this
> >     case than to drop them.
> 
> Another option is to assert on the condition. It is an error case
> that should not exist.
> 

    I think assert won't hurt here as it will enable us to learn something.

    I suspect we can increase a chance to trigger error condition
    by doing 'arp -a -d'.

> There are a few more issues in the arp code once the concurrency goes up.
> The way the link-local address is updated looks not save to me.
> At least arpresolve() and arpcache() need to make sure that access to
> rt_gateway is either serialized or the updates happen so that the result
> is always correct. At the moment this is not the case.
> 
> I assume that arp_rtrequest() is accessed with an exclusive NET_LOCK if
> not then there is more trouble waiting (setting rt_llinfo early is one
> of them).
> 

    this is the case currently as far as I can tell.

regards
sashan

Reply via email to