Re: [bess] Rtgdir last call review of draft-ietf-bess-mvpn-fast-failover-11

2020-10-28 Thread Adrian Farrel
Hi again,

Thanks for rapid convergence.

All good.

Adrian

Section 3 notes that the procedure (presumably the procedure defined
in this section) is OPTIONAL. I didn't see anything similar in sections
4 and 5 stating that those procedures are optional. Presumably, since
this document is not updating any other RFCs, all of these procedures
are optional.

Actually it would be good to clarify how all these procedures fit in
with "legacy" deployments, and how they are all optional procedures. I
think that needs a short statement in the Introduction and a small
section of its own (maybe between 6 and 7).

GIM>> Thank you for the suggestion. I've updated the Introduction in this way:

OLD TEXT:

   Section 4 describes protocol extensions that can speed up failover by
   not requiring any multicast VPN routing message exchange at recovery
   time.

   Moreover, section 5 describes a "hot leaf standby" mechanism, that
   uses a combination of these two mechanisms.  This approach has
   similarities with the solution described in [RFC7431] to improve
   failover times when PIM routing is used in a network given some
   topology and metric constraints.

NEW TEXT:

   Section 4 describes optional protocol extensions that can speed up
   failover by not requiring any multicast VPN routing message exchange
   at recovery time.

   Moreover, Section 5 describes a "hot leaf standby" mechanism that can
   be used to improve failover time in MVPN.  The approach combines
   mechanisms defined in Section 3 and Section 4 and has similarities with
   the solution described in [RFC7431] to improve failover times when
   PIM routing is used in a network given some topology and metric
   constraints.

I think that Section 5 is intended to explain how the introduced BGP extensions,
and their use as described in Section 3 and Section 4, enable operators to
provide protection for multicast services. Would you suggest adding new text to
the section to highlight particular aspects of introducing protection in MVPN?

[af] OK I obviously wasn’t clear. What I’m looking for is something like…

The procedures described in this document are optional to enable an operator to 
provide protection for multicast services. An operator would enable these 
mechanisms using  and it is assumed that these mechanisms would be 
supported by all  in the network for the procedures to work. In the case 
that a BGP implementation does not recognise or is configured to not support 
the extensions defined in this document, it will respond  as described 
in . This would result in .

GIM2>> I think I've got the idea now. Would appending the following new
paragraph to the Introduction address your comment?

NEW TEXT:

   The procedures described in this document are optional to enable an
   operator to provide protection for multicast services in BGP/MPLS IP
   VPNs.  An operator would enable these mechanisms using a method
   discussed in Section 3 in combination with the redundancy provided by
   a standby PE connected to the source of the multicast flow, and it is
   assumed that all PEs in the network would support these mechanisms
   for the procedures to work.  In the case that a BGP implementation
   does not recognize or is configured to not support the extensions
   defined in this document, it will continue to provide the multicast
   service, as described in [RFC6513].

[af] Perfect

 It is curious (to me) that 3.1.1 describes a way to know that a P-tunnel
is up.  You don't say, however, if being unable to determine that the
P-tunnel is up using this method is equivalent to determining that the
P-tunnel is down. (Previously in 3.1 you have talked about the "tunnel's
state is not known to be down".)

GIM>> This method, as noted in the document, is similar to BGP next-hop
tracking, may be computationally intensive, and cannot be run frequently. So,
in the periods between checks of whether the root address in the x-PMSI Tunnel
attribute is reachable, the state is "not known to be down".

[af] Well, OK. Can you add to say that, “If it is not possible to determine 
whether the state of a tunnel is ‘up’, the state shall be considered as ‘not 
known to be down’, and it may be treated as if it is ‘up’ so that attempts to 
use the tunnel are acceptable.” This is probably “obvious to one skilled in the 
art,” but would help this reader.
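As an illustration of the rule proposed above, here is a minimal Python sketch. It assumes a hypothetical status check that returns True (Up), False (Down), or None when no determination is possible. The names PTunnelStatus, classify, and may_use are invented for this example and do not come from the draft.

```python
from enum import Enum
from typing import Optional


class PTunnelStatus(Enum):
    UP = "Up"
    DOWN = "Down"
    NOT_KNOWN_TO_BE_DOWN = "not known to be Down"


def classify(check_result: Optional[bool]) -> PTunnelStatus:
    """Map the outcome of a (hypothetical) status check to a P-tunnel status.

    check_result is True if the check found the tunnel Up, False if it
    found the tunnel Down, and None if no determination could be made
    (for example, between two reachability checks).
    """
    if check_result is None:
        return PTunnelStatus.NOT_KNOWN_TO_BE_DOWN
    return PTunnelStatus.UP if check_result else PTunnelStatus.DOWN


def may_use(status: PTunnelStatus) -> bool:
    """A tunnel may be used unless it is positively known to be Down."""
    return status is not PTunnelStatus.DOWN
```

With this reading, only a positive Down determination stops attempts to use the tunnel.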

GIM2>> Thank you for the contributed text. I've added it before "not known to
be Down" used in the text (with the yellowish background):

NEW TEXT:

   The procedure described here is an OPTIONAL procedure that is based
   on a downstream PE taking into account the status of P-tunnels rooted
   at each possible Upstream PE, for including or not including each
   given PE in the list of candidate UMHs for a given (C-S, C-G) state.
   If it is not possible to determine whether a P-tunnel's current
   status is Up, the state shall be considered "not known to be Down",
   and it may be treated as if it is Up so that attempts to 

Re: [bess] Rtgdir last call review of draft-ietf-bess-mvpn-fast-failover-11

2020-10-28 Thread Adrian Farrel
Hello Greg,

Thanks for this. I’m cutting down to places where we still need to interact.
Look for [af] and blue text.

Nothing alarming.

Best,

Adrian

Section 3 notes that the procedure (presumably the procedure defined
in this section) is OPTIONAL. I didn't see anything similar in sections
4 and 5 stating that those procedures are optional. Presumably, since
this document is not updating any other RFCs, all of these procedures
are optional.

Actually it would be good to clarify how all these procedures fit in
with "legacy" deployments, and how they are all optional procedures. I
think that needs a short statement in the Introduction and a small
section of its own (maybe between 6 and 7).

GIM>> Thank you for the suggestion. I've updated the Introduction in this way:

OLD TEXT:

   Section 4 describes protocol extensions that can speed up failover by
   not requiring any multicast VPN routing message exchange at recovery
   time.

   Moreover, section 5 describes a "hot leaf standby" mechanism, that
   uses a combination of these two mechanisms.  This approach has
   similarities with the solution described in [RFC7431] to improve
   failover times when PIM routing is used in a network given some
   topology and metric constraints.

NEW TEXT:

   Section 4 describes optional protocol extensions that can speed up
   failover by not requiring any multicast VPN routing message exchange
   at recovery time.

   Moreover, Section 5 describes a "hot leaf standby" mechanism that can
   be used to improve failover time in MVPN.  The approach combines
   mechanisms defined in Section 3 and Section 4 and has similarities with
   the solution described in [RFC7431] to improve failover times when
   PIM routing is used in a network given some topology and metric
   constraints.

I think that Section 5 is intended to explain how the introduced BGP extensions,
and their use as described in Section 3 and Section 4, enable operators to
provide protection for multicast services. Would you suggest adding new text to
the section to highlight particular aspects of introducing protection in MVPN?

[af] OK I obviously wasn’t clear. What I’m looking for is something like…

The procedures described in this document are optional to enable an operator to 
provide protection for multicast services. An operator would enable these 
mechanisms using  and it is assumed that these mechanisms would be 
supported by all  in the network for the procedures to work. In the case 
that a BGP implementation does not recognise or is configured to not support 
the extensions defined in this document, it will respond  as described 
in . This would result in .

It is curious (to me) that 3.1.1 describes a way to know that a P-tunnel
is up.  You don't say, however, if being unable to determine that the
P-tunnel is up using this method is equivalent to determining that the
P-tunnel is down. (Previously in 3.1 you have talked about the "tunnel's
state is not known to be down".)

GIM>> This method, as noted in the document, is similar to BGP next-hop
tracking, may be computationally intensive, and cannot be run frequently. So,
in the periods between checks of whether the root address in the x-PMSI Tunnel
attribute is reachable, the state is "not known to be down".

[af] Well, OK. Can you add to say that, “If it is not possible to determine 
whether the state of a tunnel is ‘up’, the state shall be considered as ‘not 
known to be down’, and it may be treated as if it is ‘up’ so that attempts to 
use the tunnel are acceptable.” This is probably “obvious to one skilled in the 
art,” but would help this reader.

By the way, do you ever say that a P-tunnel has just these two statuses
(up and down) because that could make a big difference?

GIM>> I think that the document then needs to discuss what impact detection 
time has on MVPN. For example, if the detection time is in single-digit 
seconds, a two-state model can be used. But would it be a useful model if the 
detection time is in tens of seconds? Should a "not known to be down" state be 
introduced?

[af] Yes, that *seems* to be the implication. But is there any different action 
between “up” and “not known to be down”? If you have three states then there is 
(possibly) an implication that tunnels are prioritised by state. I think, 
however, that it is OK to use “not known to be down” as if it was “up”.
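To illustrate the point about prioritisation, the following minimal sketch (hypothetical names, not taken from the draft) selects candidate UMHs by excluding only those Upstream PEs whose P-tunnel is known to be Down. "Up" and "not known to be Down" are handled identically, so the order of candidates does not depend on which of the two states a tunnel is in.

```python
from typing import Dict, List

DOWN = "Down"


def candidate_umhs(status_by_pe: Dict[str, str]) -> List[str]:
    """Return the Upstream PEs whose P-tunnel is not known to be Down.

    The input order is preserved: no preference is given to a tunnel
    that is "Up" over one that is merely "not known to be Down".
    """
    return [pe for pe, status in status_by_pe.items() if status != DOWN]


# Illustrative data only; the PE names are made up.
statuses = {"PE1": "not known to be Down", "PE2": "Down", "PE3": "Up"}
print(candidate_umhs(statuses))
```

If a three-state model were meant to rank tunnels, the filter above would need a sort key; the thread suggests that is not the intent.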

Note that 3.1.2 etc also establish ways to know that the tunnel is up,
but not ways to determine whether the tunnel is down.

GIM>> In this section the state of a P-tunnel is equated with the state of the 
last link of that tunnel. The document notes that if the link is Up, then the
P-tunnel is considered Up. It is implied that if it is determined that the
link is Down, then the state of the P-tunnel is considered Down. Would you 
recommend adding an explanation to the document? 
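A minimal sketch of the mapping described above, assuming the intended reading is that a last-hop link that is Up gives a P-tunnel that is Up, and a link that is Down gives a P-tunnel that is Down. The function name, tunnel names, and link names are hypothetical.

```python
from typing import Dict


def tunnel_statuses(last_hop_link: Dict[str, str],
                    link_up: Dict[str, bool]) -> Dict[str, str]:
    """Derive each P-tunnel's status from the state of its last-hop link.

    last_hop_link maps a tunnel name to the name of its last-hop link.
    link_up maps a link name to True (Up) or False (Down).
    """
    return {
        tunnel: ("Up" if link_up[link] else "Down")
        for tunnel, link in last_hop_link.items()
    }


# Illustrative data only.
links = {"tunnel-A": "link-1", "tunnel-B": "link-2"}
state = {"link-1": True, "link-2": False}
print(tunnel_statuses(links, state))
```

Unlike the reachability check in 3.1.1, this mapping always yields a definite answer as long as the link state itself is known.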

To reiterate, "I don't know if it is up" is not the same as "I know it
is down."


Re: [bess] Rtgdir last call review of draft-ietf-bess-mvpn-fast-failover-11

2020-10-20 Thread Greg Mirsky
Hi Adrian,
thank you for the review, detailed questions, and helpful suggestions. I'll
work through and respond within several days.

Regards,
Greg

[bess] Rtgdir last call review of draft-ietf-bess-mvpn-fast-failover-11

2020-10-19 Thread Adrian Farrel via Datatracker
Reviewer: Adrian Farrel
Review result: Has Issues

Hello,

I have been selected as the Routing Directorate reviewer for this draft. The
Routing Directorate seeks to review all routing or routing-related drafts as
they pass through IETF last call and IESG review, and sometimes on special
request. The purpose of the review is to provide assistance to the Routing ADs.
For more information about the Routing Directorate, please see
http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir

Although these comments are primarily for the use of the Routing ADs, it would
be helpful if you could consider them along with any other IETF Last Call
comments that you receive, and strive to resolve them through discussion or by
updating the draft.

Document: draft-ietf-bess-mvpn-fast-failover-11.txt
Reviewer: Adrian Farrel
Review Date: 2020-10-18
IETF LC End Date: 2020-10-19
Intended Status: Proposed Standard

==Summary:==

I have some minor concerns about this document that I think should be resolved
before publication.

==Comments:==

This document is fairly easy to read, but demands a thorough understanding of
RFCs 6513 and 6514. That is not unreasonable.

I also hope that the IDR working group has had a good opportunity to review
this work.

==Major Issues:==

None

==Minor Issues:==

Abstract

I think the Abstract should mention explicitly that this document
extends BGP (and how).

---

Section 3 notes that the procedure (presumably the procedure defined
in this section) is OPTIONAL. I didn't see anything similar in sections
4 and 5 stating that those procedures are optional. Presumably, since
this document is not updating any other RFCs, all of these procedures
are optional.

Actually it would be good to clarify how all these procedures fit in
with "legacy" deployments, and how they are all optional procedures. I
think that needs a short statement in the Introduction and a small
section of its own (maybe between 6 and 7).

---

It is curious (to me) that 3.1.1 describes a way to know that a P-tunnel
is up.  You don't say, however, if being unable to determine that the
P-tunnel is up using this method is equivalent to determining that the
P-tunnel is down. (Previously in 3.1 you have talked about the "tunnel's
state is not known to be down".)

By the way, do you ever say that a P-tunnel has just these two statuses
(up and down) because that could make a big difference?

Note that 3.1.2 etc also establish ways to know that the tunnel is up,
but not ways to determine whether the tunnel is down.

To reiterate, "I don't know if it is up" is not the same as "I know it
is down."

---

3.1.2

   Using this method when a fast restoration mechanism (such as MPLS FRR
   [RFC4090]) is in place for the link requires careful consideration
   and coordination of defect detection intervals for the link and the
   tunnel.  In many cases, it is not practical to use both protection
   methods at the same time.

OK, I considered them carefully. Now what? :-)

I think you have to give implementation guidance.

---

All of 3.1.x are timid about the use of the mechanisms they describe.

I think that the end of 3.1 should say that an implementation may choose
to use any of these mechanisms to determine the status of the P-tunnel.

This is quite stark, however, in 3.1.3 where you have...

   When signaling state for a P2MP TE LSP is removed (e.g., if the
   ingress of the P2MP TE LSP sends a PathTear message) or the P2MP TE
   LSP changes state from Up to Down as determined by procedures in
   [RFC4875], the status of the corresponding P-tunnel SHOULD be re-
   evaluated.  If the P-tunnel transitions from Up to Down state, the
   Upstream PE that is the ingress of the P-tunnel SHOULD NOT be
   considered a valid UMH.

The use of SHOULD and SHOULD NOT is puzzling. Is this "if this mechanism
is being used, the status SHOULD..." or is it "if a P2MP MPLS-TE tunnel
is being used, this mechanism SHOULD be used"? In the former case, the
SHOULD is presumably a MUST. In the latter case, why is this worthy of
BCP 14 language when:
- this whole document is optional
- the mechanisms in 3.1.x are all optional

But 3.1.4, 3.1.5, 3.1.6, 3.1.7 also use BCP 14 language. I'm pretty sure
you mean "if this mechanism is being used..."

In case you determine to keep any use of "SHOULD" you need to describe
under what circumstances an implementation might diverge from this
strong advice.
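One way to read the "if this mechanism is being used" interpretation is sketched below. This is an illustration only: the class, the event names path_tear and up_to_down, and the mechanism_enabled flag are hypothetical, and the sketch assumes that re-evaluation after either event finds the P-tunnel Down.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DownstreamPE:
    # Whether the optional 3.1.3 mechanism is in use on this PE.
    mechanism_enabled: bool
    # Upstream PEs currently considered valid UMH candidates.
    candidate_umhs: Set[str] = field(default_factory=set)

    def on_p2mp_te_lsp_event(self, upstream_pe: str, event: str) -> None:
        """Handle an event for the P2MP TE LSP rooted at upstream_pe."""
        if not self.mechanism_enabled:
            return  # optional mechanism not in use: nothing changes
        if event in ("path_tear", "up_to_down"):
            # Assumed outcome of re-evaluation: the P-tunnel is Down, so
            # its ingress is no longer a valid UMH.
            self.candidate_umhs.discard(upstream_pe)
```

Under this reading the mechanism itself stays optional, but once enabled the reaction to the event is unconditional, which is why a MUST would seem more natural than a SHOULD.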

---

3.1.6

What should I do if I don't recognise or support the setting of the BFD
Mode field?

---

4.1

   The normal and the standby C-multicast routes must have their Local
   Preference attribute adjusted

Should this be "MUST"?

---

7.1

   IANA is requested to allocate the BGP "Standby PE" community value
   (TBA1) from the Border Gateway Protocol (BGP) Well-known Communities
   registry.

There are three ranges. You need to tell IANA which range to use.
Presumably not Private Use (because they are not assigned). But do you
want an assignment from the FCFS range or the