On Sat, 25 Jan 2020, Claudio Jeker wrote:

> Adam Thompson reported a bgpd crash he sees in 6.6. Using
> kern.nosuidcoredump=3 he was able to get me a back trace of the crash.
> The RDE chokes on the TAILQ_REMOVE in nexthop_runner() which indicates
> that the nexthop_runners list gets corrupted.
> After staring at the code for a while I realized that it is possible that
> a nexthop is put on the runner list even though there are no objects to
> process. In this case if nexthop_unref() is called before the
> nexthop_runner() had a chance to run and remove that nexthop the queue
> will be corrupt because of a use-after-free of that element.
> Fix is simple, check before enqueuing a nexthop on the nexthop_runners
> queue that next_prefix is not NULL.
> 
> The 2nd hunk just adds a debug log in the case where a prefix removal
> actually completes the nexthop run.
> 
> OK?

If I understand right, the nexthop is vulnerable to being freed when its 
prefix_h list is empty; so don't queue these to the nexthop_runner(), 
which will be a no-op anyways (except for the debug message, which you've 
added).

ok procter@ 

> -- 
> :wq Claudio
> 
> Index: rde_rib.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde_rib.c,v
> retrieving revision 1.214
> diff -u -p -r1.214 rde_rib.c
> --- rde_rib.c 10 Jan 2020 14:52:57 -0000      1.214
> +++ rde_rib.c 23 Jan 2020 03:59:09 -0000
> @@ -1800,8 +1800,11 @@ nexthop_update(struct kroute_nexthop *ms
>       nh->nexthop_netlen = msg->netlen;
>  
>       nh->next_prefix = LIST_FIRST(&nh->prefix_h);
> -     TAILQ_INSERT_HEAD(&nexthop_runners, nh, runner_l);
> -     log_debug("nexthop %s update starting", log_addr(&nh->exit_nexthop));
> +     if (nh->next_prefix != NULL) {
> +             TAILQ_INSERT_HEAD(&nexthop_runners, nh, runner_l);
> +             log_debug("nexthop %s update starting",
> +                 log_addr(&nh->exit_nexthop));
> +     }
>  }
>  
>  void
> @@ -1860,8 +1863,11 @@ nexthop_unlink(struct prefix *p)
>       if (p == p->nexthop->next_prefix) {
>               p->nexthop->next_prefix = LIST_NEXT(p, entry.list.nexthop);
>               /* remove nexthop from list if no prefixes left to update */
> -             if (p->nexthop->next_prefix == NULL)
> +             if (p->nexthop->next_prefix == NULL) {
>                       TAILQ_REMOVE(&nexthop_runners, p->nexthop, runner_l);
> +                     log_debug("nexthop %s update finished",
> +                         log_addr(&p->nexthop->exit_nexthop));
> +             }
>       }
>  
>       p->flags &= ~PREFIX_NEXTHOP_LINKED;
> 
> 

Reply via email to