Re: [Pce] draft-chen-idr-mbinding - BSID terminating node

Andrew Stone (Nokia) Wed, 23 Nov 2022 12:52:31 -0800

Hi Huaimo

Thanks for pointing me to the draft. Not sure how I missed it and the adoption 
call discussion. Indeed, it does go into more of the detail that I was looking 
for in the mbinding document. Recommend having mbinding reference this document 
since it does cover the various solution discussions.

Replied below with [Andrew2].

Thanks
Andrew

From: Huaimo Chen <huaimo.c...@futurewei.com>
Date: Monday, November 21, 2022 at 5:25 PM
To: "Andrew Stone (Nokia)" <andrew.st...@nokia.com>, "i...@ietf.org" 
<i...@ietf.org>
Cc: "pce@ietf.org" <pce@ietf.org>
Subject: Re: draft-chen-idr-mbinding - BSID terminating node

Hi Andrew,

    Thanks much for your further comments and suggestions.
    My responses are inline below with [HC2].

Best Regards,
Huaimo
________________________________
From: Stone, Andrew (Nokia - CA/Ottawa) <andrew.st...@nokia.com>
Sent: Sunday, November 13, 2022 4:01 PM
To: Huaimo Chen <huaimo.c...@futurewei.com>; i...@ietf.org <i...@ietf.org>
Cc: pce@ietf.org <pce@ietf.org>
Subject: Re: draft-chen-idr-mbinding - BSID terminating node

Hi Huaimo,

Thanks for the responses. I've added additional replies below with [Andrew]. 
They basically just capture what we discussed at the mic in the PCE WG Friday 
session (which took place after the email exchange), repeated below for this 
thread.

As discussed in the PCE WG session, it seems to me there should be a solution 
document discussing Node Protection for BSIDs and defining what kind of 
information needs to be sent down to the neighboring node, independent of 
protocol encoding. Either SPRING or RTGWG seems appropriate. If a doc discussing 
this is already in the works, I'd appreciate it if you could steer me to it.

[HC2]:  Draft "SR-TE Path Midpoint Restoration" below talks about the 
information about the Node Protection for BSIDs

https://datatracker.ietf.org/doc/draft-hu-spring-segment-routing-proxy-forwarding/
draft-hu-spring-segment-routing-proxy-forwarding-20 - SR-TE Path Midpoint 
Restoration




Thanks

Andrew

From: Huaimo Chen <huaimo.c...@futurewei.com>
Date: Thursday, November 10, 2022 at 10:41 PM
To: "Stone, Andrew (Nokia - CA/Ottawa)" <andrew.st...@nokia.com>, 
"i...@ietf.org" <i...@ietf.org>
Cc: "pce@ietf.org" <pce@ietf.org>
Subject: RE: draft-chen-idr-mbinding - BSID terminating node

Hi Andrew,

Thank you very much for your comments.

My responses are inline below with [HC].

Best Regards,

Huaimo

From: Idr <idr-boun...@ietf.org> On Behalf Of Stone, Andrew (Nokia - CA/Ottawa)
Sent: Monday, November 7, 2022 6:33 PM
To: i...@ietf.org
Subject: [Idr] draft-chen-idr-mbinding - BSID terminating node

Hi IDR, Authors

Following up with my comment at the mic in the IDR meeting earlier today. The 
below would also apply to the sibling draft in PCE-WG. During the meeting it 
was confirmed the BSID SL is not provided to the neighbor node, however:

[HC]: A BSID is associated with a list of SIDs for a path segment. When BGP 
sends the BSID with the list of SIDs to a node named N, BGP also sends the BSID 
with the list of SIDs plus the ID of node N to the neighbors of N.

[Andrew] ACK. Thanks for clarifying the intention. Recommend the document 
clarify that a SID list is indeed pushed down to the neighbor nodes and not 
just the node ID + BSID value.

[HC2]: Will add some text into the document accordingly.

1.       How does a node providing protection know where the BSID tunnel 
terminates?

[HC]: If the first SID in the list is a node SID, when node N fails, the 
neighbor node of N will replace the BSID with the list of SIDs in the packet 
after the SID for node N is removed/popped, and send the packet to the first 
SID in the path segment bound to the BSID.

[Andrew] ACK. Will discuss more below….

For example, given the following network path: 
[A]--100--[B]--200--[C]--300--[D]--400--[E]--500--[F]

Where A,B,C,D,E,F are nodes and 100, 200, 300, 400, 500 are local 
adjacency-sids.

BSID 1000 is deployed to node C with SL: 300, 400.

Therefore a headend tunnel is programmed on A with SL: 100, 200, 1000, 500.

As per the draft, we want node protection for node C thus protect BSID 1000. 
When programming its neighbor [B] the document says to only inform node B:  {C, 
1000}

[HC]: If the first SID in the list is an adjacency SID (e.g., adjacency SID 300 
for the adjacency from C to D), when node N (e.g., node C here) fails, the 
neighbor node (e.g., node B) of N will get the node SID of the next hop node 
(i.e., node D) using the adjacency SID 300, replace the BSID 1000 with the node 
SID of the next hop node (i.e., node D) and the rest of the SIDs in the list, 
and send the packet to the first SID (i.e., the node SID of node D) in the path 
segment bound to the BSID.
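
The replacement step described in [HC] can be sketched roughly as follows for 
the A..F example. This is only an illustration under stated assumptions: the 
ADJ table, the "N-D" node-SID notation, and the function name proxy_replace are 
hypothetical and do not come from the drafts, and the sketch assumes B already 
holds the SID list for BSID 1000 and can resolve adjacency SID 300 on C.

```python
# Hypothetical adjacency-SID table for the example path A-B-C-D-E-F.
# Key: (owning node, local adjacency SID) -> far-end node.
ADJ = {
    ("A", 100): "B",
    ("B", 200): "C",
    ("C", 300): "D",
    ("D", 400): "E",
    ("E", 500): "F",
}


def node_sid(node):
    """Illustrative node-SID notation (an assumption), e.g. 'N-D' for node D."""
    return f"N-{node}"


def proxy_replace(node, stack, failed, bsid, bsid_sl):
    """Return the SID stack `node` would forward once neighbor `failed` is down.

    stack   -- SIDs on the packet as it arrives at `node`, top first
    bsid    -- the binding SID installed on the failed node
    bsid_sl -- the SID list bound to that BSID
    """
    stack = list(stack)
    # Pop the SID that steers the packet to the failed node.
    if stack and ADJ.get((node, stack[0])) == failed:
        stack.pop(0)
    if not stack or stack[0] != bsid:
        return stack  # nothing to protect on this packet
    first, rest = bsid_sl[0], list(bsid_sl[1:])
    # An adjacency SID local to the failed node cannot be executed by anyone
    # else, so it is swapped for the node SID of the adjacency's far end.
    if (failed, first) in ADJ:
        first = node_sid(ADJ[(failed, first)])
    return [first] + rest + stack[1:]


# Packet from A arrives at B with 100 already consumed.
print(proxy_replace("B", [200, 1000, 500], "C", 1000, [300, 400]))
```

In this sketch B ends up sending the packet toward D's node SID, followed by 
400 and then 500, which matches the behaviour described for an adjacency SID at 
the top of the BSID segment list.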

[Andrew] I think this creates a few implied assumptions which may not always be 
feasible. The first is an assumption that the neighbor node can and must 
resolve the next node for the first SID within the BSID SL. If this is a simple 
same-instance, same-area neighbor, then yes, the node can peek into the IGP to 
resolve it as you describe. For SIDs that are not flooded and received via the 
IGP, it's not feasible. The second assumption is that the neighboring node must 
be able to push at least the same number of SIDs that were embedded within the 
BSID itself, which might not be feasible. Third, it assumes no micro loops may 
form during reconvergence by using the node SID of the next hop node. To work 
around that, we may need to push yet another SID to reach that next node loop 
free, thus requiring an MSD capability equal to the BSID SL length + 1. It's 
likely worth asking: Is the goal to follow 100% the same encoding beyond the 
first hop, or simply to replace the BSID with an instruction list that gets the 
traffic delivered to the BSID terminating node so that packet forwarding can 
continue? Given all of the above, since the PCE/Controller is programming the 
BSID and notifying the neighbors of the BSID, why not have the PCE/Controller 
compute the SID list for the neighboring node to replace the BSID with upon 
node failure?

[HC2]: There are a few ways in which a network is configured. When a network is 
configured to run IGP, the neighbor node can resolve the next-node for the 
first SID within the BSID segment list. When a network is configured to run BGP 
or PCE, BGP or PCE can help the neighbor node resolve the next-node for the 
first SID within the BSID segment list.

Regarding the MSD capability, the neighbor node should do as much as 
possible. If it can push the segment list for the BSID, it should push the 
list. If it can only push part of the segment list (i.e., the last few SIDs in 
the segment list), it should push them. Note that the top SID is the node SID. 
In the extreme or worst case, it should push one SID, which is the last SID in 
the segment list (if the last SID is an adjacency SID, the node SID of the next 
hop node for the adjacency needs to be found and pushed).
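
A minimal sketch of the "push as much as possible" rule above, reusing the 
A..F example. It assumes every SID in the BSID segment list is an adjacency SID 
that the neighbor can resolve, which is exactly the visibility assumption 
questioned elsewhere in this thread. The ADJ table, the "N-x" node-SID notation 
and the name best_effort_stack are hypothetical, not taken from the drafts.

```python
# Hypothetical adjacency-SID table: (owning node, local SID) -> far-end node.
ADJ = {("C", 300): "D", ("D", 400): "E", ("E", 500): "F"}


def best_effort_stack(failed, bsid_sl, msd):
    """Keep the last `msd` SIDs of the BSID segment list, node SID on top.

    failed  -- node that owned the BSID (start of the path segment)
    bsid_sl -- adjacency SIDs bound to the BSID, in path order
    msd     -- number of SIDs the protecting neighbor is able to push
    """
    if msd < 1:
        raise ValueError("the neighbor must be able to push at least one SID")
    # Walk the segment list to learn the far-end node of every SID.
    far_ends, here = [], failed
    for sid in bsid_sl:
        here = ADJ[(here, sid)]
        far_ends.append(here)
    keep = min(msd, len(bsid_sl))
    start = len(bsid_sl) - keep
    # The top kept SID becomes the node SID of its far end; the remaining
    # SIDs are pushed unchanged.
    return [f"N-{far_ends[start]}"] + list(bsid_sl[start + 1:])


print(best_effort_stack("C", [300, 400], msd=2))  # full list fits
print(best_effort_stack("C", [300, 400], msd=1))  # worst case: one SID
```

With msd=1 the sketch steers straight to the far end of the last adjacency, so 
the intermediate hops of the original BSID path are no longer enforced.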

It seems that there is no loop. The node protection we are talking about is for 
the period after a node failed and the IGP has converged. After the IGP has 
converged, the path (repairing path) getting around the failed node is the 
shortest path after the convergence, there should not be any loop if there is 
not another new failure.

[Andrew2]: Agreed, there may be different network designs, but I think my main 
concern remains that the current solution described will not work for an 
inter-area/inter-domain topology when the BSID originates on an ABR/ASBR or 
traverses across more than one, which is a sensible design when there's a 
requirement for BSIDs. Currently the documents do not scope it to intra-area 
only or describe inter-area as being out of scope. Regarding "PCE can help the 
neighbor node resolve": if PCE can aid, do you see PCE providing the full SID 
stack to swap with, or just the top SID of the next node? It seems reasonable 
to me to either a) have PCE just give the full new stack or b) allow both. Some 
new flags in mbinding can likely indicate this.

Regarding MSD, okay. So essentially you are saying the proxy node shall replace 
the BSID path with a new stack, filling in as many SIDs as feasible to re-steer 
the traffic back along the BSID path, and the number of instructions which can 
be replaced will be variable depending on topology, resolution visibility, and 
MSD. In the drafts the only indicator I see is that the proxy node shall just 
reroute to the first hop in the BSID stack, and that it may not go further down 
the path. If this is the case, I think more text is likely needed to describe 
the best-effort BSID protection steering, as right now it reads like the 
protection shall always steer to the first node after the failed node.

Regarding the micro loop, using Section 4.3 in the proxy forwarding document as 
an example: when RT2 pushes the RT4 NSID, it forwards to RT7, but there's no 
guarantee that RT7 has converged yet. Prior to failure the IGP shortest path to 
RT4 from RT2 may be via {RT2, RT3, RT4}, thus the traffic will bounce between 
RT2 and RT7 until RT7 converges. However, if RT2 pushed {NSID7, 70074} instead 
of {NSID4}, then no loop would occur (at the cost of computation + extra label).
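
The micro-loop concern can be illustrated with a toy forwarding walk. This is a 
sketch only: the FIB contents are assumptions chosen to reproduce the scenario 
described (RT2 has converged around failed RT3, RT7 still holds a stale route 
to RT4 via RT2), 70074 is assumed to be RT7's adjacency SID toward RT4, and 
node SIDs are written simply as node names. None of this is taken verbatim from 
the proxy forwarding draft.

```python
# Assumed per-node FIB right after RT3 fails: FIB[node][destination] -> next hop.
FIB = {
    "RT2": {"RT4": "RT7", "RT7": "RT7"},  # RT2 has converged
    "RT7": {"RT4": "RT2"},                # RT7 is stale (assumption)
}
# Assumed local adjacency SIDs: (owning node, SID) -> far-end node.
ADJ = {("RT7", 70074): "RT4"}


def forward(start, stack, max_hops=8):
    """Walk a packet through the toy network.

    Node SIDs are strings (the node name), adjacency SIDs are integers.
    Returns (nodes visited, SIDs left on the packet when the walk stops).
    """
    node, stack, path = start, list(stack), [start]
    while stack and len(path) <= max_hops:
        top = stack[0]
        if isinstance(top, int):
            # Adjacency SID: forwarded on that link regardless of IGP state.
            node = ADJ[(node, top)]
            stack.pop(0)
        elif top == node:
            # Our own node SID: pop it and process the next SID.
            stack.pop(0)
            continue
        else:
            # Node SID: follow whatever the local FIB currently says.
            node = FIB[node][top]
        path.append(node)
    return path, stack


print(forward("RT2", ["RT4"]))         # bounces RT2 <-> RT7, never delivered
print(forward("RT2", ["RT7", 70074]))  # RT2 -> RT7 -> RT4
```

In the first call the hop budget runs out with the RT4 node SID still on the 
packet. In the second, the adjacency SID pins the RT7-to-RT4 hop, so RT7's 
stale shortest-path state does not matter.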

How will B know BSID 1000 terminates on E? If B only peeks into the next SID 
in the stack (500), that value is of local significance to E only, thus B does 
not actually know it should send to E.

[HC]: When node D receives the packet, node D can process its adjacency SID 400 
(for the adjacency from D to E), and sends the packet to node E.

2.       It could be possible for the BSID 1000 SL to also contain BSIDs in a 
nested manner, thus further masking where the BSID actually terminates. As 
well, the next SID in the path (e.g., 500) could be a local BSID too.

[HC]: BSID 500 is for node E, right? If so, when node E fails, its neighbor 
node D will execute a procedure similar to that executed by node B for node C's 
failure.

3.       I did not mention this at the mic: the BSID terminating node may be 
in another domain/IGP instance not within view of the protecting node, as a 
valuable use case is BSIDs on borders.

[HC]: For multiple domains and a BSID on the border of the domains, this case 
is an egress protection case (the border node is the egress of the upstream 
domain) for protecting against the failure of the egress (i.e., the border 
node). A backup egress node is selected or configured to protect the (primary) 
egress node. The backup egress node can get the information about the path 
segment associated with the BSID, and sends the packet along the path segment 
associated with the BSID.

[Andrew] As discussed in PCE WG, I think the egress protection case is quite 
different from protecting a BSID which happens to be on a border node. In the 
egress protection case, the packet is being de-encapsulated from the end-to-end 
tunnel. In the BSID-on-border case, the packet is still encapsulated and in 
flight within the tunnel. While it's egressing the domain, it's not actually 
egressing the tunnel, thus there's not really a primary/backup node scenario.

[HC2]: One border node B, which is adjacent to the possibly failed border node 
N with the BSID and in the same domain or area as N, can provide protection for 
N regarding the BSID. This border node B can replace the BSID with the segment 
list after N fails.

[Andrew2] Not sure I follow this example, but I will try to apply it to the 
topology from figure 1 in the proxy forwarding draft. Imagine RT3 and RT7 are 
border nodes. RT3 is configured with a BSID100 {30034, 40045}. RT7 is the 
adjacent border node B. Yes, RT7 could also be programmed with the same BSID100 
{70074, 40045}. RT2 could learn RT3's and RT7's BSIDs and their SID values, but 
how does RT2 know they both terminate on RT5? RT2 won't understand that {30034, 
40045} and {70074, 40045} deliver to the same node. The BSID100 on RT7 could go 
to a different node and have the same value just as a fluke, since it's only 
locally significant, and the underlying SIDs are local only to those nodes as 
well.
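
The local-significance point can be made concrete with a small sketch. Working 
out where a BSID terminates means walking its SID list hop by hop, and each hop 
needs the adjacency-SID table of the node that owns that SID. The table below 
is an assumption inferred from the SID values in the example (30034 as RT3 to 
RT4, 70074 as RT7 to RT4, 40045 as RT4 to RT5), and terminating_node is a 
hypothetical helper, not something defined in either draft.

```python
# Assumed adjacency-SID database: (owning node, local SID) -> far-end node.
FULL_VIEW = {
    ("RT3", 30034): "RT4",
    ("RT7", 70074): "RT4",
    ("RT4", 40045): "RT5",
}


def terminating_node(owner, sid_list, visible_adj):
    """Walk sid_list starting at `owner`.

    Returns the node where the list ends, or None as soon as a SID cannot be
    resolved from the adjacency SIDs visible to the resolving node.
    """
    node = owner
    for sid in sid_list:
        node = visible_adj.get((node, sid))
        if node is None:
            return None
    return node


# A controller with the full view sees that both lists end on RT5.
print(terminating_node("RT3", [30034, 40045], FULL_VIEW))
print(terminating_node("RT7", [70074, 40045], FULL_VIEW))

# Assume RT2 only sees the adjacency SIDs of its own-domain border nodes.
RT2_VIEW = {k: v for k, v in FULL_VIEW.items() if k[0] in {"RT3", "RT7"}}
print(terminating_node("RT3", [30034, 40045], RT2_VIEW))  # unresolved
```

Under these assumptions only a node with visibility into RT4's local SIDs can 
tell that the two BSID100 entries deliver to the same place, which is the gap 
being pointed out for the border-node case.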

It seems to me at minimum: the node identifier of where the BSID terminates (E 
in example above) is required, but that would still not be sufficient to solve 
[3].

[HC]: As explained above, it seems that these are resolved.

Thanks

Andrew

_______________________________________________
Pce mailing list
Pce@ietf.org
https://www.ietf.org/mailman/listinfo/pce
