Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Peter Psenak Wed, 15 Jun 2022 02:03:01 -0700

Hi Gunter,

please see inline:



On 15/06/2022 10:38, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi Peter, All,

From a BGP perspective (PE service nodes) the event detection when transport 
tunnel end-point suddenly becomes unreachable is an operational problem. I 
think we all agree.

This problem exists in any multi-domain network, and is not limited to a 
multi-area/level IGP with summarization. Hence my doubts that simple encodings 
using the IGP as API for unreachability signaling is an optimal solution.

we are solving the problem for inter-area and/or inter-domain IGP networks. There are plenty of them.


Churning the LSDB for these things doesn't seem right.  Would this mean that we 
hack the IGP implementation so we don't trigger SPFs on rx of these updates?

I would not call adding a UPA announcement for a very rare event churning the LSDB. I really do not see the problem there.

UPA is a prefix advertisement with an unreachable metric. Given that the prefix was never advertised with a valid metric before (due to summarization), even PRC is not required.

Another concern is how we hook into BGP sideways to update it. Typically a 
router just looks at RTM and tunnel-tables for reachability. Now it would have 
to check a separate bypass-list all the time.


that is a matter of implementation.

What about the pseudo-state. On startup I would imagine we would have to 
originate this PUA until a certain point?

UPA is only advertised if the component prefix of the summary that was reachable in its source area/domain becomes unreachable. Nothing is sent on startup.
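
To make that rule concrete, here is a minimal sketch of the ABR/ASBR-side bookkeeping. It is only an illustration: the class and method names are hypothetical and come from neither draft, and withdrawing the UPA when the prefix comes back is an assumption of this sketch, not something stated above.

```python
import ipaddress


class SummaryUpaTracker:
    """Hypothetical ABR/ASBR state for one summary (names are illustrative)."""

    def __init__(self, summary: str) -> None:
        self.summary = ipaddress.ip_network(summary)
        self.seen_reachable = set()  # components known reachable in the source area
        self.upas = set()            # components currently announced as unreachable

    def on_reachable(self, prefix: str) -> None:
        net = ipaddress.ip_network(prefix)
        # Only track prefixes that are covered by the summary.
        if net.version == self.summary.version and net.subnet_of(self.summary):
            self.seen_reachable.add(net)
            self.upas.discard(net)  # assumption: drop the UPA once the prefix is back

    def on_unreachable(self, prefix: str) -> bool:
        """Return True if a UPA is originated for this prefix."""
        net = ipaddress.ip_network(prefix)
        if net not in self.seen_reachable:
            return False  # never seen reachable (e.g. at startup): stay silent
        self.seen_reachable.remove(net)
        self.upas.add(net)
        return True
```

A freshly started tracker has an empty `upas` set, and `on_unreachable()` returns False for any prefix it has not previously seen as reachable, which is the "nothing is sent on startup" behaviour.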


Some consideration about installing the PUA route as a blackhole route, it does 
not seem an option because resolution of BGP next-hops with blackhole /32 
routes has to continue to mean “drop” matching traffic because of the 
widespread way this is used for DDOS protection. So there is a need for another 
“install” type for the “unreachable” IGP prefix which does not exist yet.

again, UPA processing is a matter of the implementation and is out of the scope of the draft. All you need to do is to trigger BGP PIC for destinations that use the UPA prefix as their NH. It isn't that hard.
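
As a rough sketch of what "trigger BGP PIC" could look like on the receiving PE, assuming BGP routes share a path list holding a primary and a pre-computed backup next hop. `PathList` and `on_upa` are hypothetical names used only for illustration; a real implementation will differ.

```python
from typing import Iterable, Optional


class PathList:
    """Hypothetical shared path list: many BGP prefixes point at one object."""

    def __init__(self, primary: str, backup: Optional[str] = None) -> None:
        self.primary = primary
        self.backup = backup
        self.active = primary


def on_upa(path_lists: Iterable[PathList], unreachable_nh: str) -> int:
    """Switch every path list using the unreachable next hop to its backup.

    Returns the number of path lists switched. The work is per path list,
    not per BGP prefix, which is the point of prefix-independent convergence.
    """
    switched = 0
    for pl in path_lists:
        if pl.active == unreachable_nh and pl.backup is not None:
            pl.active = pl.backup
            switched += 1
    return switched
```

For example, if two service prefixes share `PathList("192.0.2.1", "192.0.2.2")`, a single call `on_upa([shared], "192.0.2.1")` moves both to 192.0.2.2 without touching the RTM entry for the summary.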


To make IGP based Prefix-unreachability-signal successful seems not a trivial 
task pe-to-pe, and involves more than simplistic dumping of opaque link-state 
messages into IGP and to re-vector interior routing as an API. I'm a bit 
tormented regarding the potential evil caused to IGP for signaling 
prefix-unreachability. It may not be worth the effort. Especially when 
realizing that the problem space is not limited to multi-area/level 
summarization but instead exists in any multi-domain network.


once you implement it, you realize that it was not that hard at all.

thanks,
Peter


Maybe IETF should consider looking at the bigger picture, at service level, and 
document a full service level solution framework instead of looking only at IGP 
in atomic fashion.

G/

-----Original Message-----
From: Peter Psenak <ppse...@cisco.com>
Sent: Tuesday, June 14, 2022 5:46 PM
To: Van De Velde, Gunter (Nokia - BE/Antwerp) <gunter.van_de_ve...@nokia.com>; lsr 
<lsr@ietf.org>
Cc: draft-ppsenak-lsr-igp-ureach-prefix-annou...@ietf.org; 
draft-wang-lsr-prefix-unreachable-annoucement 
<draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>
Subject: Re: Thoughts about PUAs - are we not over-engineering?

Hi Gunter,

please see inline:

On 14/06/2022 10:59, Van De Velde, Gunter (Nokia - BE/Antwerp) wrote:

Hi All,

When reading both proposals about PUA's:
* draft-ppsenak-lsr-igp-ureach-prefix-announce-00
* draft-wang-lsr-prefix-unreachable-annoucement-09

The identified problem space seems a correct observation, and indeed summaries 
hide remote area network instabilities. It is one of the perceived benefits of 
using summaries. The place in the network where this hiding takes the most 
impact upon convergence is at service nodes (PE's for L3/L2/transport) where 
due to the summarization it's difficult to detect that the transport tunnel 
end-point suddenly becomes unreachable. My concern however is if it really is a 
problem that is worthy for LSR WG to solve.


the request to address the problem is coming from the field. The scale of the 
networks in the field is growing significantly and the summarization is being 
implemented to keep the prefix scale under control.


To me the "draft draft-wang-lsr-prefix-unreachable-annoucement-09" is
not a preferred solution due to the expectation that all nodes in an
area must be upgraded to support the IGP capability. From this
operational perspective the draft
"draft-ppsenak-lsr-igp-ureach-prefix-announce-00" is more elegant, as
only the A(S)BR's and particular PEs must be upgraded to support
PUA's. I do have concerns about the number of PUA advertisements in
hierarchically summarized networks (/24 (site) -> /20 (region) -> /16
(core)). More specifically, in the /16 backbone area, how many of these
PUAs will be floating around creating LSP LSDB update churns? How to
control the potentially exponential number of observed PUAs from
floating everywhere? (will this lead to OSPF type NSSA areas where
areas will be purged from these PUAs for scaling stability?)


Node going down is a rare event. The expected number of UPAs at any given time 
is very small. Implementations can limit the number of UPAs on ABR/ASBR in case 
of a catastrophic event, in which case the UPAs would hardly help anyway.
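
One possible way to apply such a limit is sketched below. The function name and the policy of suppressing all UPAs once the limit is exceeded are assumptions made for illustration; neither is specified in the draft or in this thread.

```python
from typing import Iterable, List


def upas_to_advertise(lost_prefixes: Iterable[str], max_upas: int) -> List[str]:
    """Return the component prefixes to announce as unreachable.

    Assumed policy: when more prefixes are lost than max_upas allows,
    treat it as a catastrophic event and advertise nothing.
    """
    lost = sorted(set(lost_prefixes))
    if len(lost) > max_upas:
        return []
    return lost
```

An implementation could just as well truncate the list or rate-limit origination instead; the only point here is that the ABR/ASBR keeps the number of UPAs bounded.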


Long story short, should we not take a step back and re-think this identified 
problem space? Is the proposed solution space not more evil than the problem 
space? We do summarization because it brings stability and reduces the number of 
link state updates within an area. And now with PUA we re-introduce additional 
link state updates (PUAs), we blow up the LSDB with information opaque to SPF 
best-path calculation. In addition there is suggestion of new state-machinery 
to track the igp reachability of 'protected' prefixes and there is maybe desire 
to contain or filter updates cross inter-area boundaries. And finally, how will 
we represent and track PUA in the RTM?


the problem space is valid, as confirmed by the field. As described
above, the number of UPAs will be low, so there is no danger of
defeating the purpose of the summarization.


What is wrong with simply not doing summaries and forgetting about these PUAs to 
punch holes in the summary prefixes? This worked very well during the last two 
decades. Are we not over-engineering with PUAs?


it's the scale of the current networks, which is growing exponentially,
which demands the use of the summarization.


thanks,
Peter

G/


_______________________________________________
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr
