* bgpd.texi: Document the -l argument. Update the 'BGP decision process' table
to reflect what /actually/ is implemented. Add docs on 'compare-routerid' in
the bestpath section.
Add a section on MED, to highlight the issues it has by default, and to
highlight that it is terminally broken for its original purpose in many
modern iBGP topologies.
* routemap.texi: set an anchor on 'set metric' so bgpd.texi can reference it.
---
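Reviewer note (not part of the commit): the non-transitive preference
described in the new MED section can be reproduced with a small model. The
sketch below is hypothetical and heavily simplified. It compares only MED
(for routes from the same neighbouring AS) and IGP cost, and the names
Route, prefer, pairwise_best and grouped_best are made up for illustration;
none of this is bgpd code. It assumes three routes where A and C come from
one neighbour AS and B from another.

```python
from collections import namedtuple
from itertools import permutations

# Hypothetical, simplified route record; not a bgpd structure.
Route = namedtuple("Route", "name neighbour_as med igp_cost")

ROUTES = [
    Route("A", 65001, 100, 10),
    Route("B", 65002, 0, 20),
    Route("C", 65001, 50, 30),
]


def prefer(x, y):
    """Return True if x is preferred to y under the simplified rules."""
    # MED is only compared between routes from the same neighbouring AS.
    if x.neighbour_as == y.neighbour_as and x.med != y.med:
        return x.med < y.med
    # Otherwise fall through to the IGP cost check.
    return x.igp_cost < y.igp_cost


def pairwise_best(routes):
    """Pick a best route by comparing pair-wise, in the order given."""
    best = routes[0]
    for route in routes[1:]:
        if prefer(route, best):
            best = route
    return best


def grouped_best(routes):
    """Pick the best route per neighbouring AS first, then compare winners."""
    winners = {}
    for route in routes:
        current = winners.get(route.neighbour_as)
        if current is None or prefer(route, current):
            winners[route.neighbour_as] = route
    ordered = sorted(winners.values(), key=lambda r: r.neighbour_as)
    return pairwise_best(ordered)


if __name__ == "__main__":
    for order in permutations(ROUTES):
        names = "".join(r.name for r in order)
        print(names, pairwise_best(list(order)).name,
              grouped_best(list(order)).name)
```

Here A beats B and B beats C on IGP cost, yet C beats A on MED, so the
pair-wise result depends on arrival order, while grouping by neighbouring
AS first gives the same answer for every order. That only models the local
churn issue; it says nothing about oscillation between speakers.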
doc/bgpd.texi | 238 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
doc/routemap.texi | 1 +
2 files changed, 234 insertions(+), 5 deletions(-)
diff --git a/doc/bgpd.texi b/doc/bgpd.texi
index 7d92b5e..80f4888 100644
--- a/doc/bgpd.texi
+++ b/doc/bgpd.texi
@@ -53,6 +53,13 @@ Set the bgp protocol's port number.
@item -r
@itemx --retain
When program terminates, retain BGP routes added by zebra.
+
+@item -l
+@itemx --listenon
+Specify a single IP address for bgpd to listen on, rather than its
+default of INADDR_ANY / IN6ADDR_ANY. This can be useful to constrain bgpd
+to an internal address, or to run multiple bgpd processes on one host.
+
@end table
@node BGP router
@@ -104,18 +111,59 @@ This command set distance value to
@node BGP decision process
@subsection BGP decision process
+The decision process Quagga BGP uses to select routes is as follows:
+
@table @asis
@item 1. Weight check
+Prefer higher local weight routes to lower routes.
-@item 2. Local preference check.
+@item 2. Local preference check
+Prefer higher local preference routes to lower.
+
+@item 3. Local route check
+Prefer local routes (statics, aggregates, redistributed) to received routes.
+
+@item 4. AS path length check
+Prefer shortest hop-count AS_PATHs.
+
+@item 5. Origin check
+Prefer the lowest origin type route. That is, prefer IGP origin routes to
+EGP routes, and EGP routes to Incomplete routes.
+
+@item 6. MED check
+Where routes with a MED were received from the same AS,
+prefer the route with the lowest MED. See ...
+
+@item 7. External check
+Prefer the route received from an external, eBGP peer
+over routes received from other types of peers.
+
+@item 8. IGP cost check
+Prefer the route with the lower IGP cost.
+
+@item 9. Multi-path check
+If multi-pathing is enabled, then check whether
+the routes not yet distinguished in preference may be considered equal. If
+@ref{bgp bestpath as-path multipath-relax} is set, all such routes are
+considered equal, otherwise routes received via iBGP with identical AS_PATHs
+or routes received from eBGP neighbours in the same AS are considered equal.
+
-@item 3. Local route check.
+@item 10. Router-ID check
+Prefer the route with the lowest router-ID. If the
+route has an ORIGINATOR_ID attribute, through iBGP reflection, then that
+router ID is used, otherwise the router-ID of the peer the route was
+received from is used.
-@item 4. AS path length check.
+@item 11. Cluster-List length check
+The route with the shortest cluster-list
+length is used. The cluster-list reflects the iBGP reflection path the
+route has taken.
-@item 5. Origin check.
+@item 12. Peer address
+Prefer the route received from the peer with the higher
+transport layer address, as a last-resort tie-breaker.
-@item 6. MED check.
@end table
@deffn {BGP} {bgp bestpath as-path confed} {}
@@ -125,11 +173,31 @@ decision process.
@end deffn
@deffn {BGP} {bgp bestpath as-path multipath-relax} {}
+@anchor{bgp bestpath as-path multipath-relax}
This command specifies that BGP decision process should consider paths
of equal AS_PATH length candidates for multipath computation. Without
the knob, the entire AS_PATH must match for multipath computation.
@end deffn
+@deffn {BGP} {bgp bestpath compare-routerid} {}
+@anchor{bgp bestpath compare-routerid}
+
+Ensure that where iBGP routes are equal on most metrics, including
+local-pref, AS_PATH length, IGP cost, MED, the tie is broken based on
+router-ID. If a route has an ORIGINATOR_ID attribute, i.e. it has been
+reflected, that ID will be used. Otherwise, the router-ID of the peer the
+route was received from will be used.
+
+The advantage of this is that the route-selection (at this point) will be
+deterministic, across iBGP. The disadvantage is that such equal routes will
+tend to take the same exit out of the AS, via the lowest-ID router.
+
+If this option is enabled, then the external-age check, where already
+selected eBGP routes are preferred, is skipped.
+@end deffn
+
+
+
@node BGP route flap dampening
@subsection BGP route flap dampening
@@ -151,6 +219,166 @@ The route-flap damping algorithm is compatible with
@cite{RFC2439}. The use of t
is not recommended nowadays, see
@uref{http://www.ripe.net/ripe/docs/ripe-378,,RIPE-378}.
@end deffn
+@node BGP MED
+@section BGP MED
+
+The BGP @acronym{MED, Multi_Exit_Discriminator} attribute is a
+non-transitive attribute, intended to allow one AS to indicate preferences
+for ingress points to another AS. E.g., if AS X and AS Y have 2 different
+BGP peerings, then AS X might set a MED of 100 on routes advertised on one
+of those and a MED of 200 on another. AS Y then, when selecting between
+otherwise equal routes to or via AS X, should prefer to take the path via
+the lower MED peering of 100 with AS X. Setting the MED allows an AS to
+influence the routing taken to it within another, neighbouring AS.
+
+In this use of MED, it is not really meaningful to compare the MED value on
+routes where the next AS on the paths differs. E.g., if AS Y also had a
+route for some destination via AS Z in addition to the routes from AS X, and
+AS Z also set a MED, it wouldn't make sense for AS Y to compare AS Z's MED
+values to those of AS X. The MED values have been set by different
+administrators, with different frames of reference.
+
+The default behaviour of BGP therefore is to not compare MED values across
+routes received from different neighbouring ASes. In Quagga this is done by
+comparing the neighbouring, left-most AS in the received AS_PATHs of the
+routes and only comparing MED if those are the same.
+
+Unfortunately, this behaviour means MED can cause the order of preference
+over all the routes to become undefined. That is, given routes A, B, and C,
+if A is preferred to B, and B is preferred to C, a defined, transitive order
+of preference should mean that A is preferred to C.
+
+However, when MED is involved this need not be the case. With MED it is
+possible that C is actually preferred over A. This can be true even where
+BGP defines a deterministic "most preferred" route out of the full set of
+A,B,C. For any given set of routes there may be a deterministically
+preferred route, however with MED there may be no way to arrange them into
+any order of preference.
+
+That MED can induce non-transitive orders of preference over routes can
+cause issues. Firstly, in creating routing table churn locally at speakers;
+secondly in creating routing instability in non-full-mesh iBGP topologies,
+where sets of speakers continually oscillate between different paths.
+
+The first issue arises from how speakers often implement routing decisions.
+Though BGP defines a selection process that will deterministically select
+the same route as best, at any given speaker, even with MED, that process
+requires evaluating all routes together. For performance and ease of
+implementation reasons, many implementations evaluate route preferences in a
+pair-wise fashion instead. Given there is no well-defined order when MED is
+involved, the best route that will be chosen becomes subject to
+implementation details, such as the order the routes are stored in. That
+may be (locally) non-deterministic, e.g.@: it may be the order the routes
+were received in. This may be considered undesirable, though it need not
+cause problems.
+
+This first issue can be fixed with a more deterministic route selection that
+ensures routes are ordered by the neighbouring AS during selection.
+@xref{bgp deterministic-med}. This may (but need not) reduce the number of
+updates as routes are received, and may in some cases reduce routing churn.
+
+A deterministic comparison tends to imply an additional overhead of sorting
+over any set of n routes to a destination. The implementation of
+deterministic MED in Quagga scales significantly worse than most sorting
+algorithms at present however, and may be expensive in terms of CPU if there
+are many paths for a destination (with or without MED).
+
+Deterministic local evaluation can @emph{not} fix the second issue of MED,
+however. That issue is that the non-transitive preference of routes MED can
+induce may cause routing instability or oscillation across multiple
+speakers, when combined with non-full-mesh iBGP topologies that reduce
+routing information.
+This has primarily been documented with iBGP route-reflection topologies.
+However, any other route-hiding technologies potentially could also cause
+oscillation with MED.
+
+The second issue occurs where speakers each have only a subset of routes.
+E.g. speaker X might have routes A,B, and speaker Y might have route C. X
+selects A as its best, Y obviously can only choose C. They exchange routes
+and then X might choose C as best from A,B,C while Y might choose A as best
+from A,C - the non-transitive, non-defined order of preference of routes
+that MED may induce allows this. They then withdraw their routes and the
+cycle repeats. This can occur even if all speakers use a deterministic
+order in route selection.
+
+More complex and insidious cycles of oscillation have been documented in the
+literature. See, e.g., @cite{McPherson, D. and Gill, V. and Walton, D.,
+ "Border Gateway Protocol (BGP) Persistent Route Oscillation Condition",
+ IETF RFC3345}, and @cite{Flavel, A. and M. Roughan, "Stable and flexible
+ iBGP", ACM SIGCOMM 2009}, and @cite{Griffin, T. and G. Wilfong,
+"On the correctness of IBGP configuration", ACM SIGCOMM 2002} for concrete
+examples and further
+references.
+
+There is as of this writing @emph{no} known way to use MED for its original
+purpose; @emph{and} reduce routing information in non-full-mesh iBGP
+topologies (e.g.@: with reflectors); @emph{and} be sure to avoid the
+instability problems of MED due to the non-transitive routing preferences it
+can induce.
+
+The instability problems that MED can introduce on more complex,
+non-full-mesh, iBGP topologies may be avoided either by:
+
+@itemize
+@item
+Deleting MED from all routes received from neighbouring ASes,
+and/or by ignoring MED entirely in the decision process. There is no way to
+do this at this time in Quagga.
+@item
+Setting @ref{bgp always-compare-med}, however this allows MED to be compared
+across values set by different neighbour ASes, which may not produce
+desirable results.
+@item
+Setting MED to the same value (e.g. 0) using @ref{routemap set metric} on all
+received routes, in combination with setting @ref{bgp always-compare-med} on
+all speakers.
+@end itemize
+
+As MED is evaluated after the AS_PATH length check, another possible use for
+MED is for intra-AS steering of routes with equal AS_PATH length, as an
+extension of the last case above. As MED is evaluated before IGP metric,
+this can allow cold-potato routing to be implemented, sending traffic to
+preferred hand-offs with neighbours, rather than the closest IGP hand-off.
+This would be done with @ref{routemap set metric} and by setting @ref{bgp
+always-compare-med} on all speakers.
+
+Note that even if action is taken to address the MED non-transitivity
+issues, other oscillations may still be possible, e.g.@: on IGP cost if iBGP
+and IGP topologies are at cross-purposes with each other.
+
+@deffn {BGP} {bgp deterministic-med} {}
+@anchor{bgp deterministic-med}
+
+Carry out route-selection in a way that produces more deterministic answers
+locally, even in the face of MED and the lack of a well-defined order of
+preference it can induce on routes. Without this option the preferred route
+with MED may be determined largely by the order that routes were received
+in.
+
+Setting this option will have a performance cost that may be noticeable when
+there are many routes for each destination. Currently in Quagga it is
+implemented in a way that scales poorly as the number of routes per
+destination increases.
+
+The default is that this option is not set.
+@end deffn
+
+Note that there are other sources of indeterminism in the route selection
+process; see @ref{BGP decision process}.
+
+@deffn {BGP} {bgp always-compare-med} {}
+@anchor{bgp always-compare-med}
+
+Always compare the MED on routes, even when they were received from
+different neighbouring ASes. Setting this option makes the order of
+preference of routes more defined, and should eliminate MED-induced
+oscillations.
+
+This option can be used, together with @ref{routemap set metric} to use MED
+as an intra-AS metric to steer equal-length AS_PATH routes to, e.g., desired
+exit points.
+@end deffn
+
+
+
@node BGP network
@section BGP network
diff --git a/doc/routemap.texi b/doc/routemap.texi
index db3e72d..7938c96 100644
--- a/doc/routemap.texi
+++ b/doc/routemap.texi
@@ -171,6 +171,7 @@ Set the route's weight.
@end deffn
@deffn {Route-map Command} {set metric @var{metric}} {}
+@anchor{routemap set metric}
Set the BGP attribute MED.
@end deffn
--
2.5.0