This series tries to clean up the FSM/collision-detection logic somewhat,
and make it bit more robust.

The primary functional changes of the series are:

- Send Open immediately on newly accepted inbound connections. We should
  have been doing that, and not doing so increases the window for collisions
  to occur. Speed things up and hopefully make things more robust.

- Ensure Established sessions are favoured, in collision detection (except
  for GR which for some reason doesn't want to just rely on KEEPALIVE).  I'm
  not 100% sure, but it seems the old code would shut down the Established
  connection if a new connection sent an OPEN. Which seems unstable.

- Ensure the collision detection logic shuts down the correct 'inbound' or
  'outbound' connection. Previously, it assumed the 'new' argument was the
  'inbound' and the looked-up 'peer' the 'outbound' but that doesn't seem
  100% safe. If/when that isn't the case, bgpd could be inconsistent with
  other implementations and result in both shutting down the same
  connection. Should make things more robust.

For the rest, I've tried to tidy and untangle the logic a bit.

There is also a patch from Cumulus, from Dinesh, on the same subject:

  'bgpd, tests, vtysh: Fix FSM to handle active/passive connections better'

Which deletes the existing connection-transfer logic and recreates it
elsewhere.  It also adds peer state flags, some of which seem to overlap
with or duplicate existing peer states - which then need further code to
manage.  I found it hard to follow some parts of it, hence this more mimimal
attempt.  All the items above, when I went back to look, the Cumulus patch
seems to do too.

Testing: I've tested this in a ring topology with a mix of the new code and
previous Quagga release.  It works.  However, torture testing in a more
mixed implementation environment would be good.

In an ideal world, peer configuration and connection state would be cleanly
separated. I tried hacking at that, but it'd require huge changes across the
code - whether it's done by adding a 'connection state' object to (struct
peer), or done by moving 'configuration state' out of (struct peer)
somehow. Way too much churn to be worth it I think.


_______________________________________________
Quagga-dev mailing list
[email protected]
https://lists.quagga.net/mailman/listinfo/quagga-dev

Reply via email to