Hi,

On Wed, Apr 24, 2019 at 07:43:15AM -0700, Randy Bush wrote:
> > And while i agree naming confusion is not good, what i care about most
> > is that we understand this phenomenon better.
> 
> and i am still waiting.  we've seen them for 30 years.  and we are still
> no nearer understanding them than a conjecture that they are caused by
> vendor bugs.
> 
> on a sibling mess, duplicate announcements, folk did real experiments and
> found some root causes.  i am still waiting for that on stuck routes.

One of the issues Philip Smith and I found "back then" was indeed
router bugs.  The combination of "export policy is changed" with "an
update is queued for this neighbour right then" led to control-plane
confusion and missing withdraws.  That particular bug was fixed.

My conclusion then was that something along the following lines happens:

 - router R1 remembers where an UPDATE was sent to
 - export policy on R1 is changed, changing whether or not a given
   peer would receive an UPDATE for a given prefix
 - R1 receives a withdraw for its best (and only) path, so the prefix
   is gone
 - R1 sends a withdraw to "all peers it remembers"

 - and something goes wrong if that list of peers does not reflect the
   real set of peers that received the prefix, possibly because BGP
   internal state is not fully in sync between "export policy is
   changed" and "withdraw comes in", so R1 is no longer aware that one
   of its neighbours originally received the prefix.
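
To make the conjecture concrete, here is a toy sketch of that sequence.
It is not modelled on any vendor's code; the Router class, its method
names, the peer names "A" and "B" and the prefix are all made up for
illustration, and the "buggy" path is only one way the bookkeeping
could plausibly go out of sync.

```python
PREFIX = "192.0.2.0/24"  # documentation prefix, chosen for illustration


class Router:
    """Toy model of R1 remembering which peers it sent a prefix to."""

    def __init__(self, peers, export_policy):
        self.peers = list(peers)
        self.export_policy = export_policy  # callable(peer, prefix) -> bool
        self.adj_rib_out = {}  # prefix -> set of peers R1 believes have it

    def announce(self, prefix):
        # Send an UPDATE to every peer the policy allows, and remember them.
        sent = {p for p in self.peers if self.export_policy(p, prefix)}
        self.adj_rib_out[prefix] = sent
        return sent

    def change_policy_buggy(self, new_policy):
        # Hypothetical bug: the remembered set is trimmed to match the new
        # policy, but no withdraw goes to the peers that were trimmed.
        self.export_policy = new_policy
        for prefix, sent in self.adj_rib_out.items():
            self.adj_rib_out[prefix] = {p for p in sent if new_policy(p, prefix)}

    def change_policy_correct(self, new_policy):
        # Peers that no longer pass the policy get an explicit withdraw.
        self.export_policy = new_policy
        withdraws = []
        for prefix, sent in self.adj_rib_out.items():
            dropped = {p for p in sent if not new_policy(p, prefix)}
            withdraws.extend((p, prefix) for p in sorted(dropped))
            sent -= dropped
        return withdraws

    def withdraw(self, prefix):
        # The best (and only) path is gone: withdraw from remembered peers.
        return self.adj_rib_out.pop(prefix, set())


def simulate(buggy):
    """Return the peers still holding PREFIX after R1 has withdrawn it."""
    peer_tables = {"A": set(), "B": set()}
    r1 = Router(peer_tables, lambda peer, prefix: True)
    for peer in r1.announce(PREFIX):
        peer_tables[peer].add(PREFIX)

    def new_policy(peer, prefix):
        return peer != "B"  # B should no longer receive the prefix

    if buggy:
        r1.change_policy_buggy(new_policy)
    else:
        for peer, prefix in r1.change_policy_correct(new_policy):
            peer_tables[peer].discard(prefix)

    for peer in r1.withdraw(PREFIX):
        peer_tables[peer].discard(PREFIX)
    return sorted(p for p, table in peer_tables.items() if PREFIX in table)


print(simulate(buggy=True))   # ['B']  (B keeps a stuck route)
print(simulate(buggy=False))  # []
```

In the buggy run, B never hears a withdraw because R1 has already
forgotten that B ever got the prefix, which is the "stuck route" as
seen from B's side.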

Gert Doering
        -- NetMaster
-- 
have you enabled IPv6 on something today...?

SpaceNet AG                      Vorstand: Sebastian v. Bomhard, Michael Emmer
Joseph-Dollinger-Bogen 14        Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen                 HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444         USt-IdNr.: DE813185279
