[speaking as individual contributor]

Hi Shraddha, all,

Please find below comments on section 5 (IGP hold timer for protecting traffic 
to failed node)

Following my previous comments on -02, thank you for the updated text in -03.

The current solution is based on a modification of the SPF computation (or on
FIB installation not being in sync with the regular SPF). As such, it relies on
all compliant routers having strictly the same behavior. Otherwise, the
inconsistency would translate into micro-loops for the duration of the hold
timer (easily in the range of 10s or higher IMHO).


  1.  In order to achieve this consistent behavior on all nodes, I would
welcome a more normative description of the behavior, using RFC 2119 key words.
Also, since the current algorithm relies on the SPF back-off delay in order for
all nodes to compute their "hold time" SPF on the same LSDB (same set of
events), I think that calls for the use of a standardized back-off delay, hence
a normative reference to RFC 8405.


  2.

"A small configurable
   delay called spf-delay can be enabled, which will schedule the SPF
   after the spf-delay time on receiving the first event.  In case of a
   node going down, the spf-delay time coupled with fast-flooding can
   help to accumulate link-down events reported by all neighbors in one
   single SPF.  This mechanism is on best effort basis and does not
   guarantee that all link-down events are accumulated before SPF is
   triggered.  If there are flooding delays, the SPF might get triggered
   before receiving all events related to node going down."

A few comments:
- Indeed, the above relies on (in fact requires) fast flooding. I would suggest
adding an informative reference to an extension improving the current flooding
speed (draft-ietf-lsr-isis-fast-flooding).
- I don't think that the proposed trick is good enough, nor that it really
works. Typically, spf-delay is configured with a short initial timer in order
to react quickly to link failures, which are the most common. This would not
allow multiple events to be accumulated, hence different nodes could run their
SPF on a different event/LSP before your proposed FIB hold-down, and this would
translate into loops for the duration of the "hold down".
- Even if the above worked, I agree that it would only be "best effort".
I don't think that this is good enough given the modest gain in good cases
(improved availability on the few SR-TE paths using the node going down) and
the cost in bad cases (micro-loops for a relatively long duration (e.g. 10s),
which can overload multiple links and reduce the availability of all traffic,
including the traffic prioritized by QoS).
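
To illustrate the second bullet, here is a minimal sketch (not taken from the
draft) of how a short initial SPF delay can make two routers run their first
SPF on different sets of events. It models only a single initial delay, not the
full RFC 8405 state machine, and the function name, LSP names and arrival times
are all hypothetical.

```python
from typing import Dict, FrozenSet


def first_spf_events(arrivals: Dict[str, float], initial_delay: float) -> FrozenSet[str]:
    """Return the events present in the LSDB when the first SPF runs.

    Simplified model (an assumption): the SPF is scheduled `initial_delay`
    after the first event is received, and only the events received by then
    are taken into account.
    """
    if not arrivals:
        return frozenset()
    spf_time = min(arrivals.values()) + initial_delay
    return frozenset(e for e, t in arrivals.items() if t <= spf_time)


# Hypothetical arrival times (ms) of the "adjacency loss" LSPs originated by
# the three neighbors of a failed node, as seen by two different routers.
router_a = {"lsp_from_n1": 0.0, "lsp_from_n2": 120.0, "lsp_from_n3": 300.0}
router_b = {"lsp_from_n2": 0.0, "lsp_from_n1": 20.0, "lsp_from_n3": 40.0}

SHORT_DELAY = 50.0   # typical "react quickly to a link failure" setting
LONG_DELAY = 500.0   # long enough to accumulate all three events here

a_short = first_spf_events(router_a, SHORT_DELAY)
b_short = first_spf_events(router_b, SHORT_DELAY)
a_long = first_spf_events(router_a, LONG_DELAY)
b_long = first_spf_events(router_b, LONG_DELAY)

# With the short delay the two routers would hold their FIB on different
# topologies; with the long delay they agree.
print("short delay, consistent:", a_short == b_short)
print("long delay, consistent:", a_long == b_long)
```

In this model, router A runs its first SPF having seen a single adjacency loss
while router B has seen all three, so a hold-down applied at that point freezes
them on different topologies.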


  3.  Given the two points above, I would rather propose a different
solution: instead of using a hold-timer, push a new node SID toward a neighbor
of the failed node, typically the closest one (à la near-side tunneling).
Cf. draft-hegde-rtgwg-microloop-avoidance-using-spring, which you wrote and
which says in the abstract: "Micro-looping is generally more harmful than simply
dropping traffic on failed links, because it can cause control traffic to be
dropped on an otherwise healthy link involved in micro-loop. This can lead to
cascading adjacency failures or network meltdown."
I think that this text is also relevant in the context of
draft-ietf-spring-segment-protection-sr-te-paths.

Note that instead of using the closest neighbor, one could also use the 
neighbor advertising the mirror SID for the failed node (as defined in RFC 8667 
and 8402). The use of tunneling/pushing a SID allows for this choice to be 
local and hence implementation dependent.
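
As a rough sketch of the tunneling alternative in point 3 (my own illustration,
not text from any draft): the node picks the closest reachable neighbor of the
failed node and pushes that neighbor's node SID on top of the existing segment
list. The function names, SID values and distances are hypothetical, the
mirror SID variant is not modeled, and what the chosen neighbor does with the
packet afterwards is outside this sketch.

```python
from typing import Dict, List, Optional, Set


def pick_tunnel_endpoint(dist_to: Dict[str, int],
                         neighbors_of_failed: Set[str]) -> Optional[str]:
    """Pick the closest still-reachable neighbor of the failed node.

    `dist_to` holds the local post-failure SPF distances; a node absent from
    it is treated as unreachable. Ties are broken on the node name so that
    the result is deterministic.
    """
    candidates = [n for n in neighbors_of_failed if n in dist_to]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (dist_to[n], n))


def push_node_sid(stack: List[int], node_sid: Dict[str, int],
                  endpoint: str) -> List[int]:
    """Return a new segment list with the endpoint's node SID pushed on top.

    Index 0 is the top of the stack (the active segment).
    """
    return [node_sid[endpoint]] + stack


# Hypothetical example: G has failed; its neighbors are F, F2 and H.
dist_to = {"F": 10, "F2": 30}          # H is no longer reachable from here
node_sid = {"F": 16006, "F2": 16016}   # hypothetical node SIDs
stack = [16007, 16008]                 # active segment is the SID of failed G

endpoint = pick_tunnel_endpoint(dist_to, {"F", "F2", "H"})
if endpoint is not None:
    new_stack = push_node_sid(stack, node_sid, endpoint)
    print(endpoint, new_stack)
```

Because the choice of endpoint is purely local, two nodes selecting different
neighbors does not by itself create a loop: each packet is simply tunneled to
whichever neighbor its ingress to the tunnel picked.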

Thank you,
Best regards,
--Bruno

> Bruno,
>
> Snipped...
>
>   1.  draft-ietf-spring-node-protection-for-sr-te-paths
>
> "If the Node-SID or Prefix-SID becomes
> unreachable, the event and resulting forwarding changes should not
> communicated to the forwarding planes on all configured routers
> (including PLRs for the failed node) until the hold-timer expires."
>
>
>   *   It's not crystal clear to me how it would work in reality, so I would
> welcome more prescriptive text. In particular:
>      *   "node failure" is not an IGP message. IGP nodes see multiple
> "adjacency loss" messages which are not atomic and could be handled in
> multiple SPFs. Hence different nodes will freeze their FIB based on a
> different topology (link1 for some, link2 for others), leading to
> inconsistent routing and forwarding loops.
> <SH> I will add text to capture this point and also add detailed text on 
> possible solutions.
>
>
>      *   How is the FIB modified in case of consecutive IGP events?
> (Freezing on the held topology may lead to drops; updating entries would need
> to be specified.)
> <SH> Agreed.
>
>   *   On a side note, this text requires a global behavior of all IGP nodes.
> That seems a bit out of scope for a non-normative sentence, in an
> informational document, describing a local behavior on the PLR.
> <SH> I don't think we need any protocol extension for the solutions described
> in this document, but if the WG thinks it should be a standard rather than
> informational, we should upgrade this document's status IMO.
>
> Rgds
> Shraddha
>
>
>
> From: spring <spring-boun...@ietf.org> On Behalf Of bruno.decra...@orange.com
> Sent: Wednesday, February 2, 2022 6:56 PM
> To: slitkows.i...@gmail.com; 'SPRING WG' <spring@ietf.org>; Huzhibo
> <huzh...@huawei.com>
> Subject: Re: [spring] WG adoption call - 
> draft-hu-spring-segment-routing-proxy-forwarding
>
> Hi authors of both documents, WG,
>
> [Speaking as individual contributor.]
>
> It's good to see technical discussions on the restoration of failed SIDs used 
> by SR policy.
>
>
>   1.  From a functional point of view, can we summarize the benefit to signal 
> the node proxy capability?
> e.g.
> - drop the traffic earlier if the PLR does not support proxy capability. 
> (helps with congestion)
> - use another proxy off the shortest path (increase congestion but reduce 
> loss)
> - possibly help identifying the proxy (nominal is not in the reachable 
> topology anymore)
> ...
> Or agree on the absence of significant benefits?
>
>
>   2.  draft-ietf-spring-node-protection-for-sr-te-paths
>
> "If the Node-SID or Prefix-SID becomes
> unreachable, the event and resulting forwarding changes should not
> communicated to the forwarding planes on all configured routers
> (including PLRs for the failed node) until the hold-timer expires."
>
>
>   *   It's not crystal clear to me how it would work in reality, so I would
> welcome more prescriptive text. In particular:
>
>      *   "node failure" is not an IGP message. IGP nodes see multiple
> "adjacency loss" messages which are not atomic and could be handled in
> multiple SPFs. Hence different nodes will freeze their FIB based on a
> different topology (link1 for some, link2 for others), leading to
> inconsistent routing and forwarding loops.
>      *   How is the FIB modified in case of consecutive IGP events?
> (Freezing on the held topology may lead to drops; updating entries would need
> to be specified.)
>
>   *   On a side note, this text requires a global behavior of all IGP nodes.
> That seems a bit out of scope for a non-normative sentence, in an
> informational document, describing a local behavior on the PLR.
>
>
>
>
>   3.  draft-hu-spring-segment-routing-proxy-forwarding
> Rather than defining a new "Proxy Forwarding" capability in IGP, why don't you
> use the existing Mirroring Segment (from RFC 8402,
> https://datatracker.ietf.org/doc/html/rfc8402#section-5.1),
> whose signaling is already standardized?
> https://datatracker.ietf.org/doc/html/rfc8667#section-2.4.1
>
>
>   4.  What about the following solution:
>
>   *   Use mirror SID
>   *   Tunnel to the "proxy-forwarding" advertising mirror SID
>
> I would see the following benefits:
>
>   *   No new protocol extensions (cf "3)")
>   *   Consistent routing in case of multiple SPFs (cf "2)")
>   *   Benefit from the signaling of the proxy (cf "1)")
>
> Thanks,
> Regards,
> --Bruno
>
>
>
> From: slitkows.i...@gmail.com
> Sent: Tuesday, January 25, 2022 6:13 PM
> To: DECRAENE Bruno INNOV/NET <bruno.decra...@orange.com>; 'SPRING WG'
> <spring@ietf.org>
> Subject: RE: [spring] WG adoption call - 
> draft-hu-spring-segment-routing-proxy-forwarding
>
> Hi,
>
> I'm NOT supporting this draft for the following reasons:
>
>
>   1.  The WG already has a WG document dealing with this problem. I don't
> think that the WG should come up with multiple documents/solutions for the
> same solution space, as it may just confuse the industry and create
> deployment issues, since different vendors may pick different solutions.
>
>
>
>   2.  Adding protocol extensions adds complexity to the solution without
> adding strong value.
>
>
>
> The document claims that "[I-D.ietf-spring-segment-protection-sr-te-paths] 
> ... may not work for some cases such as some of nodes in the network not 
> supporting this solution.". While this is true, the proposed solution in 
> draft-hu-spring-segment-routing-proxy-forwarding has exactly the same caveat 
> and requires all nodes in the network to support the solution.
>
>
>
> Consider the following straight-line network: A - B - C - D - E - F - G - H
> and an SR policy from A to H using SID_G. Routers A to F have to support the
> extension for the solution to work; if one of the routers doesn't support
> the extension, traffic will be dropped.
>
>
>
> Then, there is no value compared to the timer-based solution of 
> [I-D.ietf-spring-segment-protection-sr-te-paths]
>
>
>
> Authors of draft-hu-spring-segment-routing-proxy-forwarding argued that G may 
> have multiple upstream neighbors let's say F and F' and the solution allows 
> for F' to support the extension while F may not support, so the solution will 
> send the traffic to F'. Well yes, but this still requires all routers 
> upstream to F' to support this extension and maybe F is on the path to F'. 
> So, I don't think the argument is valid as it may possibly work tactically 
> depending on the network topology when we look at a small portion of the 
> network, but when we look at the whole network, operator will have to upgrade 
> all their nodes to support the extension to ensure the benefit is there.
>
>
>
> In addition, in terms of traffic, forwarding traffic to a neighbor of the 
> failed node which wasn't initially on the path, could lead to traffic 
> congestion or high traffic peaks on links that were not sized to carry this 
> traffic. We could easily expect some traffic tromboning, where traffic goes 
> to this non-natural neighbor of the failed node and then goes back over some 
> part of the same path before reaching the destination.
>
>
>
> So these protocol extensions are bringing complexity for no value here.
>
>
>
>
>   3.  Regarding BSID, I'm not a fan of advertising BSIDs in the IGP, as there
> may be hundreds or thousands of BSIDs on a node, which again will create a lot
> of burden in the IGP. The proposed way will have to be discussed in LSR, not
> in SPRING (see next comment).
>
>
> Note that [I-D.ietf-spring-segment-protection-sr-te-paths] could also work
> with BSIDs as long as the BSID information of the failed node is available in
> the control-plane of PLRs by whatever mechanism. I think this BSID handling is
> orthogonal to the proxy-forwarding control-plane behavior. The forwarding
> operations for BSID will have to be discussed in more detail; we cannot
> expect all HW to be able to do 3 or 4 lookups without any perf degradation.
>
>
>
>   4.  The document is currently a bit borderline between SPRING and LSR, as it
> talks in great detail about IGP protocol extensions. If it's a SPRING doc, it
> should detail reqs for protocols but nothing beyond.
>
>
>
> Brgds,
>
> Stephane
>
>
> From: spring <spring-boun...@ietf.org> On Behalf Of bruno.decra...@orange.com
> Sent: Thursday, January 13, 2022 11:19
> To: SPRING WG <spring@ietf.org>
> Subject: [spring] WG adoption call - 
> draft-hu-spring-segment-routing-proxy-forwarding
>
> Dear WG,
>
> This message starts a 2-week WG adoption call, ending 27/01/2022, for
> draft-hu-spring-segment-routing-proxy-forwarding
> https://datatracker.ietf.org/doc/draft-hu-spring-segment-routing-proxy-forwarding/
>
> After review of the document please indicate support (or not) for WG adoption 
> of the document to the mailing list.
>
> Please also provide comments/reasons for your support (or lack thereof) as 
> this is a stronger way to indicate your (non) support as this is not a vote.
>
> If you are willing to work on or review the document, please state this 
> explicitly. This gives the chairs an indication of the energy level of people 
> in the working group willing to work on the document.
>
> Thanks!
> Bruno, Jim, Joel
