This discussion (see the attached messages) really should have been
cc'ed to BESS and IDR.
--- Begin Message ---
Hello
I have been selected to do a routing directorate “early” review of this draft.
https://datatracker.ietf.org/doc/draft-ietf-mpls-rfc3107bis/
The routing directorate will, on request from the working group chair, perform
an “early” review of a draft before it is submitted for publication to the
IESG. The early review can be performed at any time during the draft’s
lifetime as a working group document. The purpose of the early review depends
on the stage that the document has reached. As this document is in working
group last call, my focus for the review was to determine whether the document
is ready to be published. Please consider my comments along with the other
working group last call comments.
For more information about the Routing Directorate, please see
http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir
Best regards
Jon
Document: draft-ietf-mpls-rfc3107-01
Reviewer: Jonathan Hardwick
Review Date: 27 April 2017
Intended Status: Standards Track
Summary
Thank you for writing this document. It provides much-needed clarity to the
original RFC 3107.
This document is very well written. I think that it is ready to be published,
but there are a few points below that I would like to discuss for clarification.
I also spotted a few nits that should be fixed at some point before publication.
Comments and Questions
1) In section 2.1 it says:
“ If the Multiple Labels Capability for a given AFI/SAFI had been
exchanged on the failed session, but is not exchanged on the
restarted session, then any prefixes advertised in that AFI/SAFI with
multiple labels MUST be explicitly withdrawn.”
If I have understood this correctly, it requires a speaker to withdraw NLRI
that it sent on the previous session but that it has not sent on the restarted
session (because the negotiated session capabilities changed).
(a) Why does it need to do that – isn’t the NLRI implicitly withdrawn when the
EOR marker is sent?
(b) This seems to contradict section 2.4 which says “Note that label/prefix
bindings that were not advertised on the given session cannot be withdrawn by
this method.”
2) In section 2.1 it says:
“A BGP speaker SHOULD NOT send an UPDATE that binds more labels to a given
prefix than its peer is capable of receiving” – why isn’t that MUST NOT?
3) In section 2.4 it says:
“To do so, it may send a BGP UPDATE message with an MP_UNREACH_NLRI attribute.”
Should that be “it MUST send”?
4) In section 5: although some implementations treat SAFI 1 and SAFI 4 routes
as comparable, I believe that they should always be treated as independent, in
the following sense:
Suppose a speaker S1 sends a SAFI 1 route and then a SAFI 4 route to the same
prefix P. The SAFI 4 route MUST NOT be treated by the receiving speaker as an
implicit withdraw of the SAFI 1 route. If S1 subsequently sends an explicit
withdraw of the SAFI 4 route, this MUST NOT implicitly withdraw the SAFI 1
route, and vice versa.
Am I correct? I have seen implementations that violate this so I think it is
worth spelling out somewhere in this section.
5) In section 7 it says:
“ If a BGP implementation, not conformant with the current document,
encodes multiple labels in the NLRI but has not sent and received the
"Multiple Labels" Capability, a BGP implementation that does conform
with the current document will likely reset the BGP session.”
Wouldn’t that prevent incremental deployment of this RFC into a network that is
initially composed of such implementations? Because it seems to require that
both ends of each BGP session must be upgraded simultaneously, or else the BGP
sessions will all reset.
Nits
Section 2: Missing close-parenthesis on “([RFC4760]” – this occurs twice in
this section
Section 2.1: s/ an prefixes advertised/ any prefixes advertised/
Section 2.1: Figure 1 appears quite a long way distant from the text that
references it. I suggest moving it to immediately after the paragraph it is
referenced from.
Section 2.1: s/MUST BE/MUST be/
Section 3.1: s/different path identifiers../different path identifiers./ (i.e.
remove stray extra period)
Section 3.2.1: “using the procedure of Figure 4” should be “using the procedure
of Section 2.4”.
Ditto in section 3.2.2.
Section 4: s/S a layer 2 encapsulation/a layer 2 encapsulation/ (i.e. delete
stray ‘S’)
--- End Message ---
--- Begin Message ---
Thanks for your review!
On 4/27/2017 9:03 AM, Jonathan Hardwick wrote:
I also spotted a few nits that should be fixed at some point before
publication.
I have fixed the nits.
Comments and Questions
1) In section 2.1 it says:
“ If the Multiple Labels Capability for a given AFI/SAFI had been
exchanged on the failed session, but is not exchanged on the
restarted session, then any prefixes advertised in that AFI/SAFI with
multiple labels MUST be explicitly withdrawn.”
If I have understood this correctly, it requires a speaker to withdraw
NLRI that it sent on the previous session but that it has not sent on
the restarted session (because the negotiated session capabilities
changed).
(a) Why does it need to do that – isn’t the NLRI implicitly withdrawn
when the EOR marker is sent?
The theory here is that the label stack in the stale routes is known to
be invalid, so you really don't want your peer to hold on to them until
EOR is received.
(b) This seems to contradict section 2.4 which says “Note that
label/prefix bindings that were not advertised on the given session
cannot be withdrawn by this method.”
I added the following text to section 2.4 right after the quoted sentence:
(However, if the bindings were advertised on a previous session
with the same peer, and the current session is the result of a
"graceful restart" ([RFC4724]) of the previous session, then this
withdrawal method may be used.)
2) In section 2.1 it says:
“A BGP speaker SHOULD NOT send an UPDATE that binds more labels to a
given prefix than its peer is capable of receiving” – why isn’t that
MUST NOT?
Section 2.1 also requires the receiving speaker to apply
"treat-as-withdraw" to such updates, which does imply that the sending
speaker must not send them. So I've changed "SHOULD NOT" to "MUST NOT".
3) In section 2.4 it says:
“To do so, it may send a BGP UPDATE message with an MP_UNREACH_NLRI
attribute.”
Should that be “it MUST send”?
I think the non-normative (non-RFC2119) language is fine here.
4) In section 5: although some implementations treat SAFI 1 and SAFI 4
routes as comparable, I believe that they should always be treated as
independent, in the following sense:
Suppose a speaker S1 sends a SAFI 1 route and then a SAFI 4 route to
the same prefix P. The SAFI 4 route MUST NOT be treated by the
receiving speaker as an implicit withdraw of the SAFI 1 route. If S1
subsequently sends an explicit withdraw of the SAFI 4 route, this MUST
NOT implicitly withdraw the SAFI 1 route, and vice versa.
Am I correct? I have seen implementations that violate this so I
think it is worth spelling out somewhere in this section.
From Section 1:
This document also addresses the issue of the how UPDATEs that bind
labels to a given prefix interact with UPDATEs that advertise paths
to that prefix but do not bind labels to it. However, for backwards
compatibility, it declares most of these interactions to be matters
of local policy.
Different deployed implementations have different behavior, and I think
it is better to advance the document as is rather than derail it with
the inevitable food fight that would occur if we wanted to try to get
the IETF to say which implementation is better than which other
implementation. The deployed implementations have been around for many
years, and people seem to have adapted to the differences.
5) In section 7 it says:
“ If a BGP implementation, not conformant with the current document,
encodes multiple labels in the NLRI but has not sent and received the
"Multiple Labels" Capability, a BGP implementation that does conform
with the current document will likely reset the BGP session.”
Wouldn’t that prevent incremental deployment of this RFC into a
network that is initially composed of such implementations? Because
it seems to require that both ends of each BGP session must be
upgraded simultaneously, or else the BGP sessions will all reset.
This issue was discussed at great length when the draft was first
submitted. The vast majority of deployments do not check the S bit.
That is, the de facto standard is to assume that a received update has
only one label. If any existing deployment were transmitting updates
with multiple labels encoded into the NLRI, it would already be causing
BGP session resets.
(Even iff this were a real problem, it wouldn't require both ends of a
session to be upgraded simultaneously. It would just require one end to
have a knob allowing it to accept both old and new behavior from its
peer, and a knob telling it whether to use old or new behavior when
sending to its peer. But since the defacto standard doesn't use
multiple labels, I don't think we have to worry much about this.)
--- End Message ---
--- Begin Message ---
Hi Eric
Thanks for the replies – please see [Jon] inline.
Cheers
Jon
From: Eric C Rosen [mailto:ero...@juniper.net]
Sent: 03 May 2017 19:23
To: Jonathan Hardwick <jonathan.hardw...@metaswitch.com>;
draft-ietf-mpls-rfc3107...@ietf.org; mpls-cha...@ietf.org; m...@ietf.org
Cc: rtg-...@ietf.org
Subject: Re: Routing directorate review of draft-ietf-mpls-rfc3107-bis
Thanks for your review!
On 4/27/2017 9:03 AM, Jonathan Hardwick wrote:
I also spotted a few nits that should be fixed at some point before publication.
I have fixed the nits.
Comments and Questions
1) In section 2.1 it says:
“ If the Multiple Labels Capability for a given AFI/SAFI had been
exchanged on the failed session, but is not exchanged on the
restarted session, then any prefixes advertised in that AFI/SAFI with
multiple labels MUST be explicitly withdrawn.”
If I have understood this correctly, it requires a speaker to withdraw NLRI
that it sent on the previous session but that it has not sent on the restarted
session (because the negotiated session capabilities changed).
(a) Why does it need to do that – isn’t the NLRI implicitly withdrawn when the
EOR marker is sent?
The theory here is that the label stack in the stale routes is known to be
invalid, so you really don't want your peer to hold on to them until EOR is
received.
[Jon] OK, that’s fine.
(b) This seems to contradict section 2.4 which says “Note that label/prefix
bindings that were not advertised on the given session cannot be withdrawn by
this method.”
I added the following text to section 2.4 right after the quoted sentence:
(However, if the bindings were advertised on a previous session with the same
peer, and the current session is the result of a "graceful restart" ([RFC4724])
of the previous session, then this withdrawal method may be used.)
2) In section 2.1 it says:
“A BGP speaker SHOULD NOT send an UPDATE that binds more labels to a given
prefix than its peer is capable of receiving” – why isn’t that MUST NOT?
Section 2.1 also requires the receiving speaker to apply "treat-as-withdraw" to
such updates, which does imply that the sending speaker must not send them. So
I've changed "SHOULD NOT" to "MUST NOT".
[Jon] Thanks. Note that the same change also applies in section 3.2.1
“Similarly, a given route SHOULD NOT be
propagated to a given peer if the route's NLRI has more labels than
the peer has announced”
…and also in section 3.2.2
“A BGP speaker MUST NOT send multiple
labels to a peer with which it has not exchanged the "Multiple
Labels" Capability, and SHOULD NOT send more labels to a given peer
than the peer has announced”
3) In section 2.4 it says:
“To do so, it may send a BGP UPDATE message with an MP_UNREACH_NLRI attribute.”
Should that be “it MUST send”?
I think the non-normative (non-RFC2119) language is fine here.
[Jon] It just jarred with me a little. I’m not going to dig in on this point,
but let me explain where I was coming from. When you say “it may send” the
word “may” makes it sound like it is not obliged to do it this way. I don’t
think there is any other way to withdraw the route (short of closing the
session) so I think that what you are describing is the obligatory way to
withdraw the route. For that reason I’d prefer either “it sends” or “it MUST
send”.
4) In section 5: although some implementations treat SAFI 1 and SAFI 4 routes
as comparable, I believe that they should always be treated as independent, in
the following sense:
Suppose a speaker S1 sends a SAFI 1 route and then a SAFI 4 route to the same
prefix P. The SAFI 4 route MUST NOT be treated by the receiving speaker as an
implicit withdraw of the SAFI 1 route. If S1 subsequently sends an explicit
withdraw of the SAFI 4 route, this MUST NOT implicitly withdraw the SAFI 1
route, and vice versa.
Am I correct? I have seen implementations that violate this so I think it is
worth spelling out somewhere in this section.
From Section 1:
This document also addresses the issue of the how UPDATEs that bind labels to a
given prefix interact with UPDATEs that advertise paths to that prefix but do
not bind labels to it. However, for backwards compatibility, it declares most
of these interactions to be matters of local policy.
Different deployed implementations have different behavior, and I think it is
better to advance the document as is rather than derail it with the inevitable
food fight that would occur if we wanted to try to get the IETF to say which
implementation is better than which other implementation. The deployed
implementations have been around for many years, and people seem to have
adapted to the differences.
[Jon] I agree that this document can’t retrospectively deprecate existing,
widely deployed behaviours. In section 5 you say
“Other implementations may treat the SAFI-1 and SAFI-4 routes for a given
prefix as comparable”
Imagine that the SAFI 1 and SAFI 4 routes were received from the same peer on
the same session. What do you mean by “comparable” in this context? It has
been interpreted in incompatible ways by different implementations.
- Either: the routes are independent in the Adj-RIB-In but comparable in
the LOC-RIB (a single best one is selected, the other is retained in memory)
- Or: The routes are comparable in the Adj-RIB-In i.e. one implicitly
withdraws the other.
Both of these behaviours are implemented and deployed. The issue is that they
do not interoperate if SAFI 1 and SAFI 4 are enabled on the same session. For
instance, if an implementation of the first type withdraws its SAFI 4 route
then an implementation of the second type is left with no SAFI 1 route with
which to forward traffic. You describe this as a matter of policy, which I can
live with, but without further discussion it gives no clue to the lack of
interoperability.
I understand that it is too late to say that one of these is wrong and one is
right, but the lack of interoperability is an important deployment
consideration and I don’t think this document should be silent on it. IMO this
is an important detail for implementers, who need to be aware that both
interpretations of “comparable” exist. New implementations may wish to have a
“compatibility setting” that emulates one or the other behaviour on a
particular session, so that they can interoperate.
My suggestion: can we expand section 5 slightly to clarify that there are two
ways in which implementations can consider SAFI 1 and SAFI 4 routes comparable
on a given session, as above, and that the resulting interoperability issue can
be avoided either by emulating the appropriate behaviour on that session or by
not running SAFI 1 and SAFI 4 on the same session?
5) In section 7 it says:
“ If a BGP implementation, not conformant with the current document,
encodes multiple labels in the NLRI but has not sent and received the
"Multiple Labels" Capability, a BGP implementation that does conform
with the current document will likely reset the BGP session.”
Wouldn’t that prevent incremental deployment of this RFC into a network that is
initially composed of such implementations? Because it seems to require that
both ends of each BGP session must be upgraded simultaneously, or else the BGP
sessions will all reset.
This issue was discussed at great length when the draft was first submitted.
The vast majority of deployments do not check the S bit. That is, the de facto
standard is to assume that a received update has only one label. If any
existing deployment were transmitting updates with multiple labels encoded into
the NLRI, it would already be causing BGP session resets.
[Jon] OK.
(Even iff this were a real problem, it wouldn't require both ends of a session
to be upgraded simultaneously. It would just require one end to have a knob
allowing it to accept both old and new behavior from its peer, and a knob
telling it whether to use old or new behavior when sending to its peer. But
since the defacto standard doesn't use multiple labels, I don't think we have
to worry much about this.)
--- End Message ---
--- Begin Message ---
Hi Jon,
I've made some modifications to section 5 in response to your comments.
While I don't want to get into a lot of details about the implementation
differences, I do think it is fair to ask that section 5 point out more
clearly that there can be interoperability problems, and that it might
not be possible to achieve a desired set of local policies with some
implementations or some combinations of implementation. See below for
the proposed new contents of Section 5.
Eric
----------------------
5. Relationship Between SAFI-4 and SAFI-1 Routes
It is possible that a BGP speaker will receive both a SAFI-1 route
for prefix P and a SAFI-4 route for prefix P. Different implementations
treat this situation in different ways.
For example, some implementations may regard SAFI-1 routes and
SAFI-4 routes as completely independent, and may treat them in a "ships
in the night" fashion. In this case, bestpath selection for the two
SAFIs is independent, and there will be a best SAFI-1 route to P as well
as a best SAFI-4 route to P. Which packets get forwarded according to
the routes of which SAFI is then a matter of local policy.
Other implementations may treat the SAFI-1 and SAFI-4 routes for a
given prefix as comparable, such that the best route to prefix P is
either a SAFI-1 route or a SAFI-4 route, but not both. In such
implementations, if load-balancing is done among a set of equal cost
routes, some of the equal cost routes may be SAFI-1 routes and some may
be SAFI-4 routes. Whether this is allowed is again a matter of local
policy.
Some implementations may allow a single BGP session to carry UPDATES
of both SAFI-1 and SAFI-4; other implementations may disallow this.
Some implementations that allow both SAFIs on the same session may treat
the receipt of a SAFI-1 route for prefix P on a given session as an
implicit withdrawal of a previous SAFI-4 route for prefix P on that
session, and vice versa. Other implementations may have different behavior.
A BGP speaker may receive a SAFI-4 route over a given BGP session,
but may have other BGP sessions for which SAFI-4 is not enabled. In
this case, the BGP speaker MAY convert the SAFI-4 route to a SAFI-1
route and then propagate the result over the session on which SAFI-4 is
not enabled. Whether this is done is a matter of local policy.
These differences in the behavior of different implementations may
result in unexpected behavior or lack of interoperability. In some
cases, it may be difficult or impossible to achieve the desired policies
with certain implementations or combinations of implementations.
--- End Message ---
_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess