Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-12 Thread Peter Psenak

Hi Shraddha,

On 11/03/2021 19:16, Shraddha Hegde wrote:

I agree the problem is valid for networks that use summarization and leaking
for inter-domain connectivity.
However, I don't think the solution space has to be in the IGP.
There are various ways the problem could be solved.
A network could deploy egress protection [RFC 8679] or anycast-based egress
protection [draft-hegde-rtgwg-egress-protection-sr-networks], which will ensure
packets aren't dropped due to remote PE node failure. This mechanism is faster
than other possible solutions because it addresses the failure at the PLR and
provides protection.


Egress protection is one way of solving the problem. It has its own
issues though: it requires the service SIDs to be in sync between
multiple PEs, which is not a trivial thing to do and maintain, especially
with the per-CE allocation scheme and large scale on top.


Ingress PE "PIC" like convergence has proved its value and if an 
efficient IGP solution can be found to provide a similar mechanism in 
combination with the summarization, I have no doubts users would like that.
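
As a rough illustration of why "PIC"-like convergence at the ingress PE is attractive, here is a minimal Python sketch. It assumes a FIB in which many service prefixes share one path-list object, so a single "egress PE down" signal repairs all of them at once. The class, the prefix names and PE1/PE2 are hypothetical and not taken from any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PathList:
    """Hypothetical shared forwarding object: egress PEs in preference order."""
    egress_pes: list
    down: set = field(default_factory=set)

    def active(self):
        # First egress PE that is not known to be down, or None.
        for pe in self.egress_pes:
            if pe not in self.down:
                return pe
        return None

    def mark_down(self, pe):
        # One update, independent of how many prefixes use this path-list.
        self.down.add(pe)


# 10,000 made-up service prefixes all point at the same path-list object.
shared = PathList(["PE1", "PE2"])
fib = {f"vpn-prefix-{i}": shared for i in range(10000)}


def lookup(prefix):
    return fib[prefix].active()
```

Calling `shared.mark_down("PE1")` once switches every prefix to PE2 without touching the per-prefix entries, which is the property a reachability-loss signal from the IGP would be feeding.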


A combination of both is possible as well.

thanks,
Peter



Rgds
Shraddha



-Original Message-
From: Lsr  On Behalf Of Peter Psenak
Sent: Tuesday, March 9, 2021 5:07 PM
To: Robert Raszuk 
Cc: Gyan Mishra ; Aijun Wang ; Aijun Wang 
; Tony Li ; lsr ; Acee Lindem (acee) 
; draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: [Lsr] 
https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05



Robert,

On 09/03/2021 12:20, Robert Raszuk wrote:


  > In addition you may have a hierarchical RR, which would still involve
  > BGP signalling.

Last time I measured, the time it takes to propagate a withdraw via a good RR was
single milliseconds.


  > because BGP signalling is prefix based and as a result slow.
+
  > that is the whole point, you need something that is prefix independent.

BGP can easily be set up in a prefix-independent way today.

Example 1:

If session to PE1 goes down, withdraw all RDs received from such PE.


That is still dependent on RDs and BGP specific. We want an app-independent way of
signaling the reachability loss. In the end, that's what IGPs do in the
absence of summarization.

Again, I'm not advocating the solution proposed in
draft-wang-lsr-prefix-unreachable-annoucement. I'm just saying the problem
seems valid, and an IGP-based solution is not an unreasonable thing to consider if
a reasonable one can be found.
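
To make the "Example 1" exchange above concrete, here is a small Python sketch of a route reflector that indexes service routes by the PE it learned them from. It is one possible reading of the idea, not a description of any BGP implementation; the class, the RD values and the prefixes are invented for illustration.

```python
from collections import defaultdict


class RouteReflector:
    """Toy RR state: originating peer -> set of (RD, prefix) service routes."""

    def __init__(self):
        self.routes_by_peer = defaultdict(set)

    def learn(self, peer, rd, prefix):
        self.routes_by_peer[peer].add((rd, prefix))

    def session_down(self, peer):
        # A single session-down event identifies everything learned from the peer.
        lost = self.routes_by_peer.pop(peer, set())
        rds = sorted({rd for rd, _ in lost})
        return rds, len(lost)


rr = RouteReflector()
for i in range(1000):
    rr.learn("PE1", "65000:1", f"10.1.{i // 250}.{i % 250}/32")
rr.learn("PE1", "65000:2", "10.2.0.0/24")
rr.learn("PE2", "65000:1", "10.3.0.0/24")

# The trigger is one event, but it still maps onto RDs and per-route state.
rds, route_count = rr.session_down("PE1")
```

In this toy run the one event covers two RDs and 1001 routes. That shows both sides of the argument: the trigger is prefix independent, while the signalled state stays tied to BGP and RDs.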



Example 2:

Use IGP recursion - Use RFC3107 to construct your interarea LSPs. If
PE


there is no LSP in SRv6.

Peter


goes down withdraw it. IGP can still signal summary no issue as no
inet.3 route.

Best,
R.


On Tue, Mar 9, 2021 at 12:12 PM Peter Psenak <mailto:ppse...@cisco.com> wrote:

 Hi Robert,

 On 09/03/2021 12:02, Robert Raszuk wrote:
  > Hey Peter,
  >
  > Well ok so let's forget about LDP - cool !
  >
  > So IGP sends summary around and that is all that is needed.
  >
  > So the question is why not propagate the information that a PE went down in
  > service signalling - today mainly BGP.

 because BGP signalling is prefix based and as a result slow.

  >
  >  >   And forget BFD, does not scale with 10k PEs.
  >
  > You missed the point. No one is proposing full mesh of BFD sessions
  > between all PEs. I hope so at least.
  >
  > PE is connected to RRs so you need as many BFD sessions as RR to PE BGP
  > sessions.

 that can be still too many.
 In addition you may have a hierarchical RR, which would still involve
 BGP signalling.

  > Once that session is brought down RR has all it needs to
  > trigger a message (withdraw or implicit withdraw) to remove the
  > broken service routes in a scalable way.

 that is the whole point, you need something that is prefix independent.

 thanks,
 Peter
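
The BFD scaling disagreement above comes down to simple arithmetic, sketched here in Python. The 10k PE figure is from the thread; the number of route reflectors (2) is an assumption chosen only for illustration.

```python
def full_mesh_sessions(num_pes):
    """BFD sessions if every PE monitored every other PE directly."""
    return num_pes * (num_pes - 1) // 2


def rr_sessions(num_pes, num_rrs):
    """BFD sessions if each PE only runs BFD towards each of its RRs."""
    return num_pes * num_rrs


mesh = full_mesh_sessions(10_000)      # 49,995,000
via_rr = rr_sessions(10_000, 2)        # 20,000 in total, 10,000 per RR
```

The total shrinks by more than three orders of magnitude, but each RR still terminates one session per PE, which is the load the "that can be still too many" reply refers to.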

  >
  > Thx,
  > R.
  >
  > PS. Yes we still need to start support signalling of unreachability in
  > BGP itself when BGP is used for underlay but this is a bit different use
  > case and outside of scope of LSR
  >
  >
  > On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak <mailto:ppse...@cisco.com> wrote:
  >
  > Robert,
  >
  > On 09/03/2021 11:47, Robert Raszuk wrote:
  >  >  > You’re trying to fix a problem in the overlay by morphing the
  >  >  > underlay.  How can that seem like a good idea?
  >  >
  >  > I think this really nails this discussion.
  >  >
  >  > We have discussed this before and while the concept 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-11 Thread Aijun Wang
Hi, Shraddha:

I think the anycast protection mechanism is valid, but it requires deploying an
anycast address for each multi-homed service pair ("The number of anycast
loopbacks on a given node will be equal to the number of such {primary,
protector} pairs a node belongs to."), and it applies only to SR-based
service networks.

Peter has mentioned some other scenarios (mainly tunnel services) at
https://mailarchive.ietf.org/arch/msg/lsr/lz0FeTvu8OsYIYAJ83eYspmH7B8/
In addition to egress node/link protection, PUA messages can be used to
trigger the tunnel switchover.

Best Regards

Aijun Wang
China Telecom



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-11 Thread Shraddha Hegde
I agree the problem is valid for networks that use summarization and leaking
for inter-domain connectivity.
However, I don't think the solution space has to be in the IGP.
There are various ways the problem could be solved.
A network could deploy egress protection [RFC 8679] or anycast-based egress
protection [draft-hegde-rtgwg-egress-protection-sr-networks], which will ensure
packets aren't dropped due to remote PE node failure. This mechanism is faster
than other possible solutions because it addresses the failure at the PLR and
provides protection.


Rgds
Shraddha



-Original Message-
From: Lsr  On Behalf Of Peter Psenak
Sent: Tuesday, March 9, 2021 5:07 PM
To: Robert Raszuk 
Cc: Gyan Mishra ; Aijun Wang 
; Aijun Wang ; Tony Li 
; lsr ; Acee Lindem (acee) ; 
draft-wang-lsr-prefix-unreachable-annoucement 

Subject: Re: [Lsr] 
https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05



Robert,

On 09/03/2021 12:20, Robert Raszuk wrote:
>
>  > In addition you may have a hierarchical RR, which would still involve
>  > BGP signalling.
>
> Last time I measured, the time it takes to propagate a withdraw via a good RR was
> single milliseconds.
>
>
>  > because BGP signalling is prefix based and as a result slow.
> +
>  > that is the whole point, you need something that is prefix independent.
>
> BGP can easily be set up in a prefix-independent way today.
>
> Example 1:
>
> If session to PE1 goes down, withdraw all RDs received from such PE.

That is still dependent on RDs and BGP specific. We want an app-independent way of
signaling the reachability loss. In the end, that's what IGPs do in the
absence of summarization.

Again, I'm not advocating the solution proposed in
draft-wang-lsr-prefix-unreachable-annoucement. I'm just saying the problem
seems valid, and an IGP-based solution is not an unreasonable thing to consider if
a reasonable one can be found.

>
> Example 2:
>
> Use IGP recursion - Use RFC3107 to construct your interarea LSPs. If 
> PE

there is no LSP in SRv6.

Peter

> goes down withdraw it. IGP can still signal summary no issue as no
> inet.3 route.
>
> Best,
> R.
>
>
> On Tue, Mar 9, 2021 at 12:12 PM Peter Psenak <mailto:ppse...@cisco.com> wrote:
>
> Hi Robert,
>
> On 09/03/2021 12:02, Robert Raszuk wrote:
>  > Hey Peter,
>  >
>  > Well ok so let's forget about LDP - cool !
>  >
>  > So IGP sends summary around and that is all that is needed.
>  >
>  > So the question is why not propagate the information that a PE went down in
>  > service signalling - today mainly BGP.
>
> because BGP signalling is prefix based and as a result slow.
>
>  >
>  >  >   And forget BFD, does not scale with 10k PEs.
>  >
>  > You missed the point. No one is proposing full mesh of BFD sessions
>  > between all PEs. I hope so at least.
>  >
>  > PE is connected to RRs so you need as many BFD sessions as RR to PE BGP
>  > sessions.
>
> that can be still too many.
> In addition you may have a hierarchical RR, which would still involve
> BGP signalling.
>
>  > Once that session is brought down RR has all it needs to
>  > trigger a message (withdraw or implicit withdraw) to remove the
>  > broken service routes in a scalable way.
>
> that is the whole point, you need something that is prefix independent.
>
> thanks,
> Peter
>
>  >
>  > Thx,
>  > R.
>  >
>  > PS. Yes we still need to start support signalling of unreachability in
>  > BGP itself when BGP is used for underlay but this is a bit different use
>  > case and outside of scope of LSR
>  >
>  >
>  > On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak <mailto:ppse...@cisco.com> wrote:
>  >
>  > Robert,
>  >
>  > On 09/03/2021 11:47, Robert Raszuk wrote:
>  >  >  > You’re trying to fix a problem in the overlay by morphing the
>  >  >  > underlay.  How can that seem like a good idea?
>  >  >
>  >  > I think this really nails this discussion.
>  >  >
>  >  > We have discussed this before and while the concept of signalling
>  >  > unreachability does seem useful such signalling should be done where it
>  >  > belongs.
>  >  >
>  >  >

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-11 Thread Acee Lindem (acee)
Hi Gyan,

I guess you didn’t understand my first PUA question. See inline.

From: Gyan Mishra
Date: Monday, March 8, 2021 at 8:11 PM
To: Acee Lindem
Cc: Aijun Wang, draft-wang-lsr-prefix-unreachable-annoucement, lsr
Subject: Re: https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05



On Mon, Mar 8, 2021 at 7:37 PM Acee Lindem (acee) <mailto:a...@cisco.com> wrote:
Speaking as a WG member:

Hi Gyan,

The first question is how do you know which prefixes within the summary range 
to protect? Are these configured? Is this half-assed best-effort protection 
where you protect prefixes within the range that you’ve installed recently? 
Just how does this work? It is clearly not specified in the draft.
Gyan> All prefixes within the summary range are protected; see Section 4.


   [RFC7794] and [I-D.ietf-lsr-ospf-prefix-originator] draft both define
   one sub-tlv to announce the originator information of the one prefix
   from a specified node.  This draft utilizes such TLV for both OSPF
   and ISIS to signal the negative prefix in the perspective PUA when a
   link or node goes down.

   ABR detects link or node down and floods PUA negative prefix
   advertisement along with the summary advertisement according to the
   prefix-originator specification.  The ABR or ISIS L1-L2 border node
   has the responsibility to add the prefix originator information when
   it receives the Router LSA from other routers in the same area or
   level.



Acee> So, the ABR will only know about missing prefixes that it has recently 
received? What if the prefix is already missing when the ABR establishes 
adjacencies on the path to the PE? What if the prefix is being permanently 
taken out of service – then this negative advertisement will persist 
permanently. What if there is an unintentional advertisement in the summary 
range and it is withdrawn? How do you decide whether or not to protect a prefix
within the range?
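
The questions above can be made concrete with a short Python sketch of one possible ABR behaviour. The draft does not specify this, so the logic is an assumption: the ABR remembers component prefixes it has seen inside the summary range and treats "seen but no longer reachable" as a PUA candidate. The class name, method names and addresses are hypothetical.

```python
import ipaddress


class Abr:
    """Toy ABR that tracks component prefixes inside one summary range."""

    def __init__(self, summary):
        self.summary = ipaddress.ip_network(summary)
        self.seen = set()        # every component prefix ever learned
        self.reachable = set()   # component prefixes currently reachable

    def prefix_up(self, prefix):
        net = ipaddress.ip_network(prefix)
        if net.subnet_of(self.summary):   # ignore prefixes outside the range
            self.seen.add(net)
            self.reachable.add(net)

    def prefix_down(self, prefix):
        self.reachable.discard(ipaddress.ip_network(prefix))

    def pua_candidates(self):
        # Only prefixes the ABR once knew about can be reported as missing.
        return sorted(self.seen - self.reachable)


abr = Abr("10.0.0.0/24")
abr.prefix_up("10.0.0.1/32")
abr.prefix_up("10.0.0.2/32")
abr.prefix_up("192.0.2.1/32")     # outside the summary, not tracked
abr.prefix_down("10.0.0.1/32")
```

Under this assumed behaviour, a prefix that was already missing when the ABR came up never appears in `pua_candidates()`, and a prefix retired on purpose stays there until something ages it out. These are two of the cases the questions point at.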









   When the ABR or ISIS L1-L2 border node generates the summary
   advertisement based on component prefixes, the ABR will announce one
   new summary LSA or LSP which includes the information about this down
   prefix, with the prefix originator set to NULL.  The number of PUAs
   is equivalent to the number of links down or nodes down.  The LSA or
   LSP will be propagated with standard flooding procedures.

   If the nodes in the area receive the PUA flood from all of its ABR
   routers, they will start BGP convergence process if there exist BGP
   session on this PUA prefix.  The PUA creates a forced fail over
   action to initiate immediate control plane convergence switchover to
   alternate egress PE.  Without the PUA forced convergence the down
   prefix will yield black hole routing resulting in loss of
   connectivity.
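
The receiver-side rule quoted above (act only once the PUA has arrived from all ABRs of the area) can be sketched in a few lines of Python. The function name, the data layout and the ABR names are assumptions made for illustration; the draft text is the only source for the rule itself.

```python
def should_switch_over(pua_from, area_abrs, prefix):
    """Return True only if every ABR of the area has flooded a PUA for prefix.

    pua_from:  dict mapping ABR name -> set of prefixes it announced as PUA
    area_abrs: list of ABR names the router knows for the area
    """
    if not area_abrs:
        return False  # no ABRs known: nothing to base a decision on
    return all(prefix in pua_from.get(abr, set()) for abr in area_abrs)


received = {"ABR1": {"10.0.0.1/32"}, "ABR2": set()}
partial = should_switch_over(received, ["ABR1", "ABR2"], "10.0.0.1/32")

received["ABR2"].add("10.0.0.1/32")
complete = should_switch_over(received, ["ABR1", "ABR2"], "10.0.0.1/32")
```

Here `partial` is False and `complete` is True. The partial case is the one the next quoted paragraph handles with a more specific route from the ABR that can still reach the prefix.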



   When only some of the ABRs can't reach the failure node/link, as that
   described in Section 3.2, the ABR that can reach the PUA prefix
   should advertise one specific route to this PUA prefix.  The internal
   routers within another area can then bypass the ABRs that can't reach
   the PUA prefix, to reach the PUA prefix.

The second comment is that using the prefix-originator TLV is a terrible choice 
of encoding. Note that if there is any router in the domain that doesn’t 
support the extension, you’ll actually attract traffic towards the ABR 
blackholing it.
Gyan> I will work with the authors to see if there is any alternative PUA
process to signal and detect the failure in case the prefix originator TLV is not
supported.
Acee> Note that in the case of OSPFv3, the prefix originator TLV is a Sub-TLV 
of the Inter-Area Prefix TLV advertised in the E-Inter-Area-Prefix-LSA. If 
there are any OSPFv3 routers in the domain that don’t support this 
functionality and receive traffic for the protected prefix, they will actually 
route it towards the blackhole.
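
The backward-compatibility concern can be illustrated with a plain longest-prefix-match lookup in Python. The sketch assumes a router that does not understand the extension and therefore installs the PUA as an ordinary, more specific route via the advertising ABR. The RIB contents and next-hop labels are invented for illustration.

```python
import ipaddress


def longest_match(rib, dst):
    """Return the next hop of the most specific RIB entry covering dst."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    return rib[max(matches, key=lambda net: net.prefixlen)]


# RIB of a router that misreads the PUA as a normal reachability advertisement.
legacy_rib = {
    ipaddress.ip_network("10.0.0.0/24"): "ABR1 (summary)",
    ipaddress.ip_network("10.0.0.1/32"): "ABR2 (PUA installed as a route)",
}

next_hop = longest_match(legacy_rib, "10.0.0.1")
```

Because the /32 wins over the /24 summary, `next_hop` points at the ABR that announced the unreachable prefix, so the traffic is drawn towards the black hole rather than away from it.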

Further, I think your example is a bit contrived. I’d hope that an OSPF area
with “thousands” of summarized PE addresses wouldn’t be partitioned by a single
failure as in figure 1 in the draft and your slides. I also note that the option of
a backbone tunnel between the ABRs was removed from the draft since it
diminished the requirement for this functionality.
Gyan> This is a real-world Metro access edge example: the impact is on
customers whose LSP to the down egress PE has not failed over.
In this scenario there is a Primary and Backup PE per Metro edge, which is
typical for an operator.

The workaround used today is to flood all /32 next hop prefixes and not take 
advantage of summarization.  This draft makes RFC 5283 inter area FEC binding 
now viable for operators.
Acee> Or add a reliable intra-area link between your ABRs. 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-10 Thread Tony Przygienda
Same thing. If you want to go multiple hops with ZeroMQ you need forwarding
already. And if you go one hop it really doesn't matter, it's just FOSE
(flooding over something else ;-)

-- tony

On Wed, Mar 10, 2021 at 12:52 PM Robert Raszuk  wrote:

> >  You think Kafka here?
>
> Nope ... I meant a ZeroMQ message bus as the underlying pub-sub transport for
> service-related info.
>
> Thx,
> R.,
>
>
> On Wed, Mar 10, 2021 at 11:41 AM Tony Przygienda 
> wrote:
>
>> ? Last time I looked @ it (and it's been a while) Open-R had nothing of
>> that sort, it was basically KV playing LSDB (innovative and clever in
>> itself). You think Kafka here? Which in turn needs underlying IGP however
>> and is nothing but BGP problems in new clothes having looked @ their
>> internal architecture and where it's going a while ago.
>>
>> -- tony
>>
>> On Wed, Mar 10, 2021 at 11:29 AM Robert Raszuk  wrote:
>>
>>> Peter,
>>>
>>> > But suddenly the DOWN event distribution is considered
>>> > problematic. Not sure I follow.
>>>
>>> In routing and IP reachability we use p2mp distribution and flooding as
>>> it is required to provide any to any connectivity.
>>>
>>> Such spray model no longer fits services where not every endpoint
>>> participates in all services.
>>>
>>> So my point is that just because you have transport ready we should not
>>> continue to announce either good or bad news in spray fashion for
>>> services.
>>>
>>> Sure it works, but it is hardly a good design and sound architecture.
>>>
>>> It happened to BGP as the convenience of already having TCP sessions
>>> between nodes was so great that we loaded loads of stuff to go along basic
>>> routing reachability.
>>>
>>> And now it seems the time has come to do the same with IGPs :).
>>>
>>> I think unless we stop and define a real pub-sub messaging protocol
>>> (like FB does with open-R)  we will continue this.
>>>
>>> And to me it is like building a tower from the cards ... the higher you
>>> go the more likely your entire tower is to collapse.
>>>
>>> Cheers,
>>> R.
>>>
>>> PS.
>>>
>>> > with MPLS loopback address of all PEs is advertised everywhere.
>>>
>>> Is this a feature or a day one design bug later fixed by RFC5283 ?
>>>
>>>
>>>
>>>
>>> On Wed, Mar 10, 2021 at 9:10 AM Peter Psenak  wrote:
>>>
 Robert,


 On 09/03/2021 19:30, Robert Raszuk wrote:
 > Hi Peter,
 >
 >  > Example 1:
 >  >
 >  > If session to PE1 goes down, withdraw all RDs received from such PE.
 >
 > still dependent on RDs and BGP specific.
 >
 >
 > To me this does sound like a feature ... to you I think it was rather
 > pejorative.

 not sure I understand your point with "pejorative"...

 There are other ways to provide services outside of BGP - think GRE,
 IPsec, etc. The solution should cover them all.

 >
 > We want app independent way of
 > signaling the reachability loss. At the end that's what IGPs do without
 > a presence of summarization.
 >
 >
 > Here you go. I suppose you just drafted the first use case for OSPF
 > Transport Instance.

 you said it, not me.


 >
 > I suppose you just run new ISIS or OSPF Instance and flood info about PE
 > down events to all other instance nodes (hopefully just PEs and no Ps as
 > such plane would be OTT one).  Still you will be flooding this to 100s
 > of PEs which may never need this information at all which I think is the
 > main issue here. Such bad news IMHO should be distributed on a pub/sub
 > basis only. First you subscribe then you get updates ... not get
 > everything then keep junk till it gets removed or expires.

 With MPLS, the loopback address of every PE is advertised everywhere. So you
 keep the state when the remote PE loopback is up and you get a state
 withdrawal when the remote PE loopback goes down.

 In SRv6, with summarization we can reduce the amount of UP state to a
 minimum. But suddenly the DOWN event distribution is considered
 problematic. Not sure I follow.
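
The UP-state versus DOWN-event asymmetry described above can be sketched with Python's standard `ipaddress` module. The addressing plan is an assumption for illustration only: 1000 PE locators as /48s carved from the documentation prefix 2001:db8::/32, which doubles as the summary.

```python
import ipaddress
from itertools import islice

# Hypothetical plan: one /48 locator per PE, all inside one /32 summary.
summary = ipaddress.ip_network("2001:db8::/32")
locators = list(islice(summary.subnets(new_prefix=48), 1000))


def advertised_outside_area(component_prefixes, summarize):
    """Prefixes an ABR would advertise beyond the area."""
    return [summary] if summarize else list(component_prefixes)


def covered_by_summary(dst):
    """True if a remote router still has a matching route via the summary."""
    return ipaddress.ip_address(dst) in summary


up_state_without = len(advertised_outside_area(locators, summarize=False))
up_state_with = len(advertised_outside_area(locators, summarize=True))

# One PE fails: its locator disappears inside the area, the summary stays.
failed = locators[7]
still_matches = covered_by_summary(str(failed.network_address))
```

The UP state seen outside the area drops from 1000 entries to 1, but `still_matches` stays True for the failed PE's locator. The summary gives remote routers nothing that signals the DOWN event, which is the gap this thread is debating how to fill.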

 thanks,
 Peter

 >
 > Many thx,
 > Robert
 >

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-10 Thread Robert Raszuk
>  You think Kafka here?

Nope ... I meant a ZeroMQ message bus as the underlying pub-sub transport for
service-related info.
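
For readers comparing the two distribution models in this exchange, here is a minimal in-process Python sketch. It deliberately does not use ZeroMQ or any real message bus; the `Bus` class, the topic layout (one topic per egress PE) and the node names are hypothetical.

```python
from collections import defaultdict


class Bus:
    """Toy pub-sub bus: one topic per egress PE."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # egress PE -> interested nodes

    def subscribe(self, node, egress_pe):
        self.subscribers[egress_pe].add(node)

    def publish_down(self, egress_pe):
        # Only nodes that asked about this PE are notified.
        return sorted(self.subscribers.get(egress_pe, set()))


def flood_down(all_nodes, egress_pe):
    """Flooding model: every other node receives the DOWN event."""
    return sorted(n for n in all_nodes if n != egress_pe)


nodes = ["PE1", "PE2", "PE3", "PE4", "PE5"]
bus = Bus()
bus.subscribe("PE2", "PE1")
bus.subscribe("PE3", "PE1")

notified_pubsub = bus.publish_down("PE1")     # ['PE2', 'PE3']
notified_flood = flood_down(nodes, "PE1")     # ['PE2', 'PE3', 'PE4', 'PE5']
```

The sketch only counts who gets notified. It says nothing about how the bus itself reaches its subscribers, which is where the multi-hop forwarding objection raised elsewhere in the thread applies.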

Thx,
R.,




Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-10 Thread Tony Przygienda
? Last time I looked @ it (and it's been a while) Open-R had nothing of
that sort, it was basically KV playing LSDB (innovative and clever in
itself). You think Kafka here? Which in turn needs underlying IGP however
and is nothing but BGP problems in new clothes having looked @ their
internal architecture and where it's going a while ago.

-- tony



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-10 Thread Peter Psenak

Robert,

On 10/03/2021 11:29, Robert Raszuk wrote:

Peter,

 > But suddenly the DOWN event distribution is considered
 > problematic. Not sure I follow.

In routing and IP reachability we use p2mp distribution and flooding as 
it is required to provide any to any connectivity.


Such spray model no longer fits services where not every endpoint 
participates in all services.


So my point is that just because you have transport ready we should not
continue to announce either good or bad news in spray fashion for
services.


Sure it works, but it is hardly a good design and sound architecture.

It happened to BGP as the convenience of already having TCP sessions 
between nodes was so great that we loaded loads of stuff to go along 
basic routing reachability.


And now it seems the time has come to do the same with IGPs :).

I think unless we stop and define a real pub-sub messaging protocol 
(like FB does with open-R)  we will continue this.


you are of course free to do that. Here we are at the LSR WG.

thanks,
Peter





And to me it is like building a tower from the cards ... the higher you 
go the more likely your entire tower is to collapse.


Cheers,
R.

PS.

 > with MPLS loopback address of all PEs is advertised everywhere.

Is this a feature or a day one design bug later fixed by RFC5283 ?




On Wed, Mar 10, 2021 at 9:10 AM Peter Psenak wrote:


Robert,


On 09/03/2021 19:30, Robert Raszuk wrote:
 > Hi Peter,
 >
 >      > Example 1:
 >      >
 >      > If session to PE1 goes down, withdraw all RDs received
 >      > from such PE.
 >
 >     still dependent on RDs and BGP specific.
 >
 >
 > To me this does sound like a feature ... to you I think it was
 > rather pejorative.

not sure I understand your point with "pejorative"...

There are other ways to provide services outside of BGP - think GRE,
IPsec, etc. The solution should cover them all.

 >
 >     We want app independent way of
 >     signaling the reachability loss. At the end that's what IGPs
 >     do without a presence of summarization.
 >
 >
 > Here you go. I suppose you just drafted the first use case for OSPF
 > Transport Instance.

you said it, not me.


 >
 > I suppose you just run new ISIS or OSPF Instance and flood info
 > about PE down events to all other instance nodes (hopefully just PEs
 > and no Ps as such plane would be OTT one).  Still you will be flooding
 > this to 100s of PEs which may never need this information at all which
 > I think is the main issue here. Such bad news IMHO should be
 > distributed on a pub/sub basis only. First you subscribe then you get
 > updates ... not get everything then keep junk till it gets removed or
 > expires.

with MPLS, the loopback address of all PEs is advertised everywhere. So you
keep the state when the remote PE loopback is up and you get a state
withdrawal when the remote PE loopback goes down.

In SRv6, with summarization we can reduce the amount of UP state to a
minimum. But suddenly the DOWN event distribution is considered
problematic. Not sure I follow.

thanks,
Peter

 >
 > Many thx,
 > Robert
 >





Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-10 Thread Robert Raszuk
Peter,

> But suddenly the DOWN event distribution is considered
> problematic. Not sure I follow.

In routing and IP reachability we use p2mp distribution and flooding as it
is required to provide any to any connectivity.

Such a spray model no longer fits services where not every endpoint
participates in all services.

So my point is that just because you have transport ready, we should not
continue to announce either good or bad news in spray fashion for
services.

Sure it works, but it is hardly a good design and sound architecture.

It happened to BGP as the convenience of already having TCP sessions
between nodes was so great that we loaded loads of stuff to go along with
basic routing reachability.

And now it seems the time has come to do the same with IGPs :).

I think unless we stop and define a real pub-sub messaging protocol (like
FB does with Open/R) we will continue this.

And to me it is like building a tower of cards ... the higher you go
the more likely your entire tower is to collapse.

Cheers,
R.

PS.

> with MPLS loopback address of all PEs is advertised everywhere.

Is this a feature or a day one design bug later fixed by RFC 5283?




On Wed, Mar 10, 2021 at 9:10 AM Peter Psenak  wrote:

> Robert,
>
>
> On 09/03/2021 19:30, Robert Raszuk wrote:
> > Hi Peter,
> >
> >  > Example 1:
> >  >
> >  > If session to PE1 goes down, withdraw all RDs received from such
> >  > PE.
> >
> > still dependent on RDs and BGP specific.
> >
> >
> > To me this does sound like a feature ... to you I think it was rather
> > pejorative.
>
> not sure I understand your point with "pejorative"...
>
> There are other ways to provide services outside of BGP - think GRE,
> IPsec, etc. The solution should cover them all.
>
> >
> > We want app independent way of
> > signaling the reachability loss. At the end that's what IGPs do
> > without a presence of summarization.
> >
> >
> > Here you go. I suppose you just drafted the first use case for OSPF
> > Transport Instance.
>
> you said it, not me.
>
>
> >
> > I suppose you just run new ISIS or OSPF Instance and flood info about PE
> > down events to all other instance nodes (hopefully just PEs and no Ps as
> > such plane would be OTT one).  Still you will be flooding this to 100s
> > of PEs which may never need this information at all which I think is the
> > main issue here. Such bad news IMHO should be distributed on a pub/sub
> > basis only. First you subscribe then you get updates ... not get
> > everything then keep junk till it gets removed or expires.
>
> with MPLS, the loopback address of all PEs is advertised everywhere. So you
> keep the state when the remote PE loopback is up and you get a state
> withdrawal when the remote PE loopback goes down.
>
> In SRv6, with summarization we can reduce the amount of UP state to a
> minimum. But suddenly the DOWN event distribution is considered
> problematic. Not sure I follow.
>
> thanks,
> Peter
>
> >
> > Many thx,
> > Robert
> >
>
>


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-10 Thread Peter Psenak

Robert,


On 09/03/2021 19:30, Robert Raszuk wrote:

> Hi Peter,
>
> > > Example 1:
> > >
> > > If session to PE1 goes down, withdraw all RDs received from such PE.
> >
> > still dependent on RDs and BGP specific.
>
> To me this does sound like a feature ... to you I think it was rather
> pejorative.

not sure I understand your point with "pejorative"...

There are other ways to provide services outside of BGP - think GRE,
IPsec, etc. The solution should cover them all.

> > We want app independent way of
> > signaling the reachability loss. At the end that's what IGPs do without
> > a presence of summarization.
>
> Here you go. I suppose you just drafted the first use case for OSPF
> Transport Instance.

you said it, not me.

> I suppose you just run new ISIS or OSPF Instance and flood info about PE
> down events to all other instance nodes (hopefully just PEs and no Ps as
> such plane would be OTT one).  Still you will be flooding this to 100s
> of PEs which may never need this information at all which I think is the
> main issue here. Such bad news IMHO should be distributed on a pub/sub
> basis only. First you subscribe then you get updates ... not get
> everything then keep junk till it gets removed or expires.

with MPLS, the loopback address of all PEs is advertised everywhere. So you
keep the state when the remote PE loopback is up and you get a state
withdrawal when the remote PE loopback goes down.

In SRv6, with summarization we can reduce the amount of UP state to a
minimum. But suddenly the DOWN event distribution is considered
problematic. Not sure I follow.

thanks,
Peter

> Many thx,
> Robert





Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Robert Raszuk
Hi Peter,


> > Example 1:
> >
> > If session to PE1 goes down, withdraw all RDs received from such PE.
>
> still dependent on RDs and BGP specific.


To me this does sound like a feature ... to you I think it was rather
pejorative.


> We want app independent way of
> signaling the reachability loss. At the end that's what IGPs do without
> a presence of summarization.
>

Here you go. I suppose you just drafted the first use case for OSPF
Transport Instance.

I suppose you just run a new IS-IS or OSPF instance and flood info about PE
down events to all other instance nodes (hopefully just PEs and no Ps, as
such a plane would be an OTT one). Still, you will be flooding this to 100s
of PEs which may never need this information at all, which I think is the
main issue here. Such bad news IMHO should be distributed on a pub/sub basis
only. First you subscribe, then you get updates ... rather than get
everything and then keep junk till it gets removed or expires.
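
A rough way to see the difference between the two delivery models is to count who receives a PE-down event in each. The following is a minimal Python sketch under stated assumptions: the PE names, the `subscriptions` map, and the `flood` and `publish` helpers are hypothetical and do not come from any draft or implementation.

```python
from collections import defaultdict


def flood(all_pes, failed_pe):
    """Flooding model: every other PE receives the PE-down event."""
    return {pe for pe in all_pes if pe != failed_pe}


def publish(subscriptions, failed_pe):
    """Pub/sub model: only PEs that subscribed to failed_pe receive it."""
    return set(subscriptions.get(failed_pe, set()))


# Hypothetical network: 100 PEs, only three of which share a service with pe0.
all_pes = [f"pe{i}" for i in range(100)]
subscriptions = defaultdict(set)
for subscriber in ("pe1", "pe2", "pe3"):
    subscriptions["pe0"].add(subscriber)

flooded = flood(all_pes, "pe0")
notified = publish(subscriptions, "pe0")
print(len(flooded), len(notified))  # 99 3
```

In this toy setup flooding delivers the event to 99 nodes, while the subscribe-first model delivers it to the 3 that asked for it.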

Many thx,
Robert


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Peter Psenak

Robert,

On 09/03/2021 12:20, Robert Raszuk wrote:


> > In addition you may have a hierarchical RR, which would still involve
> > BGP signalling.
>
> Last time I measured time it takes to propagate withdraw via good RR was
> single milliseconds.
>
> > because BGP signalling is prefix based and as a result slow.
> +
> > that is the whole point, you need something that is prefix independent.
>
> BGP can be easily setup in prefix independent way today.
>
> Example 1:
>
> If session to PE1 goes down, withdraw all RDs received from such PE.

still dependent on RDs and BGP specific. We want an app-independent way of
signaling the reachability loss. In the end, that's what IGPs do in the
absence of summarization.

Again, I'm not advocating the solution proposed in
draft-wang-lsr-prefix-unreachable-annoucement. I'm just saying the
problem seems valid and an IGP-based solution is not an unreasonable thing
to consider if a reasonable one can be found.

> Example 2:
>
> Use IGP recursion - Use RFC3107 to construct your interarea LSPs. If PE

there is no LSP in SRv6.

Peter

> goes down withdraw it. IGP can still signal summary no issue as no
> inet.3 route.
>
> Best,
> R.


On Tue, Mar 9, 2021 at 12:12 PM Peter Psenak wrote:


Hi Robert,

On 09/03/2021 12:02, Robert Raszuk wrote:
 > Hey Peter,
 >
 > Well ok so let's forget about LDP - cool !
 >
 > So IGP sends summary around and that is all what is needed.
 >
 > So the question why not propagate information that PE went down in
 > service signalling - today mainly BGP.

because BGP signalling is prefix based and as a result slow.

 >
 >  >   And forget BFD, does not scale with 10k PEs.
 >
 > You missed the point. No one is proposing full mesh of BFD sessions
 > between all PEs. I hope so at least.
 >
 > PE is connected to RRs so you need as many BFD sessions as RR to
 > PE BGP sessions.

that can be still too many.
In addition you may have a hierarchical RR, which would still involve
BGP signalling.

 > Once that session is brought down RR has all it needs to
 > trigger a message (withdraw or implicit withdraw) to remove the
 > broken service routes in a scalable way.

that is the whole point, you need something that is prefix independent.

thanks,
Peter

 >
 > Thx,
 > R.
 >
 > PS. Yes we still need to start support signalling of
 > unreachability in BGP itself when BGP is used for underlay but this
 > is a bit different use case and outside of scope of LSR
 >
 >
 > On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak wrote:
 >
 >     Robert,
 >
 >     On 09/03/2021 11:47, Robert Raszuk wrote:
 >      >  > You’re trying to fix a problem in the overlay by
 >      >  > morphing the underlay.  How can that seem like a good idea?
 >      >
 >      > I think this really nails this discussion.
 >      >
 >      > We have discussed this before and while the concept of
 >      > signalling unreachability does seem useful such signalling
 >      > should be done where it belongs.
 >      >
 >      > Here clearly we are talking about faster connectivity
 >      > restoration for overlay services so it naturally belongs in
 >      > overlay.
 >      >
 >      > It could be a bit misleading as this is today underlay which
 >      > propagates reachability of PEs and overlay relies on it. And to
 >      > scale, summarization is used hence in the underlay, failing
 >      > remote PEs remain reachable. That however in spite of many
 >      > efforts in lots of networks are really not the practical problem
 >      > as those networks still rely on exact match of IGP to LDP FEC
 >      > when MPLS is used. So removal of /32 can and does happen.
 >
 >     think SRv6, forget /32 or /128 removal. Think summarization.
 >
 >     I'm not necessarily advocating the solution proposed in this
 >     particular draft, but the problem is valid. We need fast
 >     detection of the PE loss.
 >
 >     And forget BFD, does not scale with 10k PEs.
 >
 >     thanks,
 >     Peter
 >
 >
 >
 >      >
 >      > In the same time BGP can pretty quickly (milliseconds)
 >      > remove affected service routes (or rather paths) hence
 >      > connectivity can be restored to redundantly connected endpoints
 >      > in sub second. Such removal can be in a form of atomic withdraw
 >      > (or readvertisement), removal of recursive routes (next hop
 >      > going down) or withdraw of few RD/64 prefixes.
 >      >
 >      > I am not convinced and I have not seen any evidence 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Robert Raszuk
> In addition you may have a hierarchical RR, which would still involve
> BGP signalling.

Last time I measured, the time it takes to propagate a withdraw via a good RR
was single-digit milliseconds.


> because BGP signalling is prefix based and as a result slow.
+
> that is the whole point, you need something that is prefix independent.

BGP can easily be set up in a prefix-independent way today.

Example 1:

If session to PE1 goes down, withdraw all RDs received from such PE.

Example 2:

Use IGP recursion: use RFC 3107 to construct your inter-area LSPs. If a PE
goes down, withdraw it. The IGP can still signal the summary without issue,
as there is no inet.3 route.
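
Example 1 can be sketched as a reflector-side index keyed by the originating PE, so that a session failure yields one withdraw per RD rather than one per service prefix. This is only an illustration in Python: the `RouteReflector` class, its methods, and the RD and prefix values are hypothetical, and the thread does not establish how any particular BGP implementation would signal this.

```python
from collections import defaultdict


class RouteReflector:
    """Toy model of an RR that indexes service routes by originating PE."""

    def __init__(self):
        # originating PE -> set of (route distinguisher, prefix) it advertised
        self.routes_by_pe = defaultdict(set)

    def learn(self, pe, rd, prefix):
        self.routes_by_pe[pe].add((rd, prefix))

    def session_down(self, pe):
        """Drop everything learned from `pe`; return the RDs to withdraw."""
        routes = self.routes_by_pe.pop(pe, set())
        return sorted({rd for rd, _prefix in routes})


rr = RouteReflector()
rr.learn("PE1", "65000:1", "198.51.100.0/25")
rr.learn("PE1", "65000:1", "198.51.100.128/25")
rr.learn("PE1", "65000:2", "203.0.113.0/24")
rr.learn("PE2", "65000:1", "198.51.100.0/25")

withdrawn = rr.session_down("PE1")
print(withdrawn)  # ['65000:1', '65000:2']
```

Three prefixes from PE1 collapse into two RD-level withdraws, and the routes learned from PE2 are left untouched.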

Best,
R.


On Tue, Mar 9, 2021 at 12:12 PM Peter Psenak  wrote:

> Hi Robert,
>
> On 09/03/2021 12:02, Robert Raszuk wrote:
> > Hey Peter,
> >
> > Well ok so let's forget about LDP - cool !
> >
> > So IGP sends summary around and that is all what is needed.
> >
> > So the question why not propagate information that PE went down in service
> > signalling - today mainly BGP.
>
> because BGP signalling is prefix based and as a result slow.
>
> >
> >  >   And forget BFD, does not scale with 10k PEs.
> >
> > You missed the point. No one is proposing full mesh of BFD sessions
> > between all PEs. I hope so at least.
> >
> > PE is connected to RRs so you need as many BFD sessions as RR to PE BGP
> > sessions.
>
> that can be still too many.
> In addition you may have a hierarchical RR, which would still involve
> BGP signalling.
>
> Once that session is brought down RR has all it needs to
> > trigger a message (withdraw or implicit withdraw) to remove the
> > broken service routes in a scalable way.
>
> that is the whole point, you need something that is prefix independent.
>
> thanks,
> Peter
>
> >
> > Thx,
> > R.
> >
> > PS. Yes we still need to start support signalling of unreachability in
> > BGP itself when BGP is used for underlay but this is a bit different use
> > case and outside of scope of LSR
> >
> >
> > On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak wrote:
> >
> > Robert,
> >
> > On 09/03/2021 11:47, Robert Raszuk wrote:
> >  >  > You’re trying to fix a problem in the overlay by morphing the
> >  >  > underlay.  How can that seem like a good idea?
> >  >
> >  > I think this really nails this discussion.
> >  >
> >  > We have discussed this before and while the concept of signalling
> >  > unreachability does seem useful such signalling should be done
> >  > where it belongs.
> >  >
> >  > Here clearly we are talking about faster connectivity restoration
> >  > for overlay services so it naturally belongs in overlay.
> >  >
> >  > It could be a bit misleading as this is today underlay which
> >  > propagates reachability of PEs and overlay relies on it. And to
> >  > scale, summarization is used hence in the underlay, failing remote
> >  > PEs remain reachable. That however in spite of many efforts in lots
> >  > of networks are really not the practical problem as those networks
> >  > still rely on exact match of IGP to LDP FEC when MPLS is used. So
> >  > removal of /32 can and does happen.
> >
> > think SRv6, forget /32 or /128 removal. Think summarization.
> >
> > I'm not necessarily advocating the solution proposed in this particular
> > draft, but the problem is valid. We need fast detection of the PE
> > loss.
> >
> > And forget BFD, does not scale with 10k PEs.
> >
> > thanks,
> > Peter
> >
> >
> >
> >  >
> >  > In the same time BGP can pretty quickly (milliseconds)
> >  > remove affected service routes (or rather paths) hence
> >  > connectivity can be restored to redundantly connected endpoints in
> >  > sub second. Such removal can be in a form of atomic withdraw (or
> >  > readvertisement), removal of recursive routes (next hop going down)
> >  > or withdraw of few RD/64 prefixes.
> >  >
> >  > I am not convinced and I have not seen any evidence that if we
> >  > put this into IGP it will be any faster across areas or domains
> >  > (case of redistribution over ASBRs to and from IGP to BGP). One
> >  > thing for sure - it will be much more complex to troubleshoot.
> >  >
> >  > Thx,
> >  > R.
> >  >
> >  > On Tue, Mar 9, 2021 at 5:39 AM Tony Li wrote:
> >  >
> >  >
> >  > Hi Gyan,
> >  >
> >  >  > Gyan> In previous threads BFD multi hop has been mentioned to
> >  >  > track IGP liveliness but that gets way overly complicated
> >  >  > especially with large domains and not viable.
> >  >
> >  >
> >  > This is not tracking IGP liveness, this is to track BGP endpoint
> >  > liveness.
> >  >
> >  > Here in 2021, we 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Tony Przygienda
I know it's not fashionable (yet) but put multipoint BFD on BIER & run 2
subdomains and you got 10K. Add subdomains to taste which will also allow
to partition it across chips easily. Yepp, needs silicon that will sustain
reasonable rates but you have a pretty good darn' solution. IGP gives you
reachability for BIER already, you got minimal replication and you can tune
your timers to heart's delight.

Sticking stuff in IGP (to second Tony #1) is very satisfying, especially
since some of that could work some time @ no load on IGP. All those "let's
add a million things to IGP" ideas only catch you when you realize that in a
serious outage your IGP is busy figuring out/flooding junk rather than
getting you basic connectivity. Yes, good implementation technique and
careful design of protocol can help (e.g. take ISIS extensions that allow
for a large LSP space where you know what priority things are that need to
go out/come in/compute, here I sympathize with the new idea of separate
instance for "junk hauling" BTW ;-) but mashing overlay into underlay of
your most time sensitive and delicate piece of network control to use it as
overlay signalling protocol does not have a promising history. Confounding
the whole thing on top with adding a route type as signalling means is a bit
of injury on top of insult, or vice versa.

-- tony

On Tue, Mar 9, 2021 at 12:03 PM Robert Raszuk  wrote:

> Hey Peter,
>
> Well ok so let's forget about LDP - cool !
>
> So IGP sends summary around and that is all what is needed.
>
> So the question why not propagate information that PE went down in service
> signalling - today mainly BGP.
>
> >   And forget BFD, does not scale with 10k PEs.
>
> You missed the point. No one is proposing full mesh of BFD sessions
> between all PEs. I hope so at least.
>
> PE is connected to RRs so you need as many BFD sessions as RR to PE BGP
> sessions. Once that session is brought down RR has all it needs to trigger
> a message (withdraw or implicit withdraw) to remove the broken service
> routes in a scalable way.
>
> Thx,
> R.
>
> PS. Yes we still need to start support signalling of unreachability in BGP
> itself when BGP is used for underlay but this is a bit different use case
> and outside of scope of LSR
>
>
> On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak  wrote:
>
>> Robert,
>>
>> On 09/03/2021 11:47, Robert Raszuk wrote:
>> >  > You’re trying to fix a problem in the overlay by morphing the
>> > underlay.  How can that seem like a good idea?
>> >
>> > I think this really nails this discussion.
>> >
>> > We have discussed this before and while the concept of signalling
>> > unreachability does seem useful such signalling should be done where it
>> > belongs.
>> >
>> > Here clearly we are talking about faster connectivity restoration for
>> > overlay services so it naturally belongs in overlay.
>> >
>> > It could be a bit misleading as this is today underlay which propagates
>> > reachability of PEs and overlay relies on it. And to scale,
>> > summarization is used hence in the underlay, failing remote PEs remain
>> > reachable. That however in spite of many efforts in lots of networks
>> > are really not the practical problem as those networks still rely on
>> > exact match of IGP to LDP FEC when MPLS is used. So removal of /32 can
>> > and does happen.
>>
>> think SRv6, forget /32 or /128 removal. Think summarization.
>>
>> I'm not necessarily advocating the solution proposed in this particular
>> draft, but the problem is valid. We need fast detection of the PE loss.
>>
>> And forget BFD, does not scale with 10k PEs.
>>
>> thanks,
>> Peter
>>
>>
>>
>> >
>> > In the same time BGP can pretty quickly (milliseconds) remove affected
>> > service routes (or rather paths) hence connectivity can be restored to
>> > redundantly connected endpoints in sub second. Such removal can be in a
>> > form of atomic withdraw (or readvertisement), removal of recursive
>> > routes (next hop going down) or withdraw of few RD/64 prefixes.
>> >
>> > I am not convinced and I have not seen any evidence that if we put this
>> > into IGP it will be any faster across areas or domains (case of
>> > redistribution over ASBRs to and from IGP to BGP). One thing for sure -
>> > it will be much more complex to troubleshoot.
>> >
>> > Thx,
>> > R.
>> >
>> > On Tue, Mar 9, 2021 at 5:39 AM Tony Li wrote:
>> >
>> >
>> > Hi Gyan,
>> >
>> >  > Gyan> In previous threads BFD multi hop has been mentioned to
>> > track IGP liveliness but that gets way overly complicated especially
>> > with large domains and not viable.
>> >
>> >
>> > This is not tracking IGP liveness, this is to track BGP endpoint
>> > liveness.
>> >
>> > Here in 2021, we seem to have (finally) discovered that we can
>> > automate our management plane. This ameliorates a great deal of
>> > complexity.
>> >
>> >
>> >  > Gyan> As we are trying to signal the IGP to trigger the
>> > control plane 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Peter Psenak

Hi Robert,

On 09/03/2021 12:02, Robert Raszuk wrote:

> Hey Peter,
>
> Well ok so let's forget about LDP - cool !
>
> So IGP sends summary around and that is all that is needed.
>
> So the question why not propagate information that PE went down in service
> signalling - today mainly BGP.

because BGP signalling is prefix based and as a result slow.

> >   And forget BFD, does not scale with 10k PEs.
>
> You missed the point. No one is proposing full mesh of BFD sessions
> between all PEs. I hope so at least.
>
> PE is connected to RRs so you need as many BFD sessions as RR to PE BGP
> sessions.

that can still be too many.
In addition you may have a hierarchical RR, which would still involve
BGP signalling.

> Once that session is brought down RR has all it needs to
> trigger a message (withdraw or implicit withdraw) to remove the
> broken service routes in a scalable way.

that is the whole point, you need something that is prefix independent.

thanks,
Peter

> Thx,
> R.
>
> PS. Yes we still need to start supporting signalling of unreachability in
> BGP itself when BGP is used for underlay but this is a bit different use
> case and outside of scope of LSR



On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak wrote:


Robert,

On 09/03/2021 11:47, Robert Raszuk wrote:
 >  > You’re trying to fix a problem in the overlay by morphing the
 >  > underlay.  How can that seem like a good idea?
 >
 > I think this really nails this discussion.
 >
 > We have discussed this before and while the concept of signalling
 > unreachability does seem useful such signalling should be done
 > where it belongs.
 >
 > Here clearly we are talking about faster connectivity restoration
 > for overlay services so it naturally belongs in overlay.
 >
 > It could be a bit misleading as this is today underlay which
 > propagates reachability of PEs and overlay relies on it. And to scale,
 > summarization is used hence in the underlay, failing remote PEs
 > remain reachable. That however in spite of many efforts in lots of
 > networks are really not the practical problem as those networks still
 > rely on exact match of IGP to LDP FEC when MPLS is used. So removal
 > of /32 can and does happen.

think SRv6, forget /32 or /128 removal. Think summarization.

I'm not necessarily advocating the solution proposed in this particular
draft, but the problem is valid. We need fast detection of the PE loss.

And forget BFD, does not scale with 10k PEs.

thanks,
Peter



 >
 > In the same time BGP can pretty quickly (milliseconds)
 > remove affected service routes (or rather paths) hence connectivity
 > can be restored to redundantly connected endpoints in sub second.
 > Such removal can be in a form of atomic withdraw (or readvertisement),
 > removal of recursive routes (next hop going down) or withdraw of few
 > RD/64 prefixes.
 >
 > I am not convinced and I have not seen any evidence that if we
 > put this into IGP it will be any faster across areas or domains (case
 > of redistribution over ASBRs to and from IGP to BGP). One thing for
 > sure - it will be much more complex to troubleshoot.
 >
 > Thx,
 > R.
 >
 > On Tue, Mar 9, 2021 at 5:39 AM Tony Li wrote:
 >
 >
 >     Hi Gyan,
 >
 >      >     Gyan> In previous threads BFD multi hop has been
 >      >     mentioned to track IGP liveliness but that gets way overly
 >      >     complicated especially with large domains and not viable.
 >
 >
 >     This is not tracking IGP liveness, this is to track BGP endpoint
 >     liveness.
 >
 >     Here in 2021, we seem to have (finally) discovered that we can
 >     automate our management plane. This ameliorates a great deal of
 >     complexity.
 >
 >
 >      >     Gyan> As we are trying to signal the IGP to trigger the
 >      >     control plane convergence, the flooding machinery in the IGP
 >      >     already exists as well as the prefix originator sub TLV from
 >      >     the link or node failure.  IGP seems to be the perfect
 >      >     mechanism for the control plane signaling switchover.
 >
 >
 >     You’re trying to fix a problem in the overlay by morphing the
 >     underlay.  How can that seem like a good idea?
 >
 >
 >      >       Gyan>As I mentioned advertising flooding of the longer
 >      >     prefix defeats the purpose of summarization.
 >
 >
 >     PUA also defeats summarization.  If you really insist on faster
 >     convergence and not building a sufficiently redundant
 >     topology, then yes, your area will partition and you will have to
 >     pay the price of additional state for your longer prefixes.
 >
 >
 > 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Robert Raszuk
Hey Peter,

Well ok so let's forget about LDP - cool !

So IGP sends summary around and that is all that is needed.

So the question: why not propagate the information that a PE went down in
service signalling - today mainly BGP.

>   And forget BFD, does not scale with 10k PEs.

You missed the point. No one is proposing full mesh of BFD sessions between
all PEs. I hope so at least.

PE is connected to RRs so you need as many BFD sessions as RR to PE BGP
sessions. Once that session is brought down RR has all it needs to trigger
a message (withdraw or implicit withdraw) to remove the broken service
routes in a scalable way.
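
The scaling argument here is simple arithmetic. As a sketch in Python, assuming 10k PEs and, purely for illustration, two route reflectors (the thread gives no RR count; both function names are hypothetical):

```python
def full_mesh_sessions(num_pes):
    """BFD sessions if every PE monitored every other PE directly."""
    return num_pes * (num_pes - 1) // 2


def rr_sessions(num_pes, num_rrs):
    """BFD sessions if each PE only runs BFD to each of its RRs."""
    return num_pes * num_rrs


print(full_mesh_sessions(10_000))  # 49995000
print(rr_sessions(10_000, 2))      # 20000
```

Under these assumptions each PE terminates only two sessions, while each reflector terminates one session per PE, that is 10,000.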

Thx,
R.

PS. Yes, we still need to start supporting signalling of unreachability in
BGP itself when BGP is used for the underlay, but this is a bit different
use case and outside the scope of LSR.


On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak  wrote:

> Robert,
>
> On 09/03/2021 11:47, Robert Raszuk wrote:
> >  > You’re trying to fix a problem in the overlay by morphing the
> > underlay.  How can that seem like a good idea?
> >
> > I think this really nails this discussion.
> >
> > We have discussed this before and while the concept of signalling
> > unreachability does seem useful such signalling should be done where it
> > belongs.
> >
> > Here clearly we are talking about faster connectivity restoration for
> > overlay services so it naturally belongs in overlay.
> >
> > It could be a bit misleading as this is today underlay which propagates
> > reachability of PEs and overlay relies on it. And to scale,
> > summarization is used hence in the underlay, failing remote PEs remain
> > reachable. That however in spite of many efforts in lots of networks are
> > really not the practical problem as those networks still rely on exact
> > match of IGP to LDP FEC when MPLS is used. So removal of /32 can and
> > does happen.
>
> think SRv6, forget /32 or /128 removal. Think summarization.
>
> I'm not necessarily advocating the solution proposed in this particular
> draft, but the problem is valid. We need fast detection of the PE loss.
>
> And forget BFD, does not scale with 10k PEs.
>
> thanks,
> Peter
>
>
>
> >
> > In the same time BGP can pretty quickly (milliseconds) remove affected
> > service routes (or rather paths) hence connectivity can be restored to
> > redundantly connected endpoints in sub second. Such removal can be in a
> > form of atomic withdraw (or readvertisement), removal of recursive
> > routes (next hop going down) or withdraw of few RD/64 prefixes.
> >
> > I am not convinced and I have not seen any evidence that if we put this
> > into IGP it will be any faster across areas or domains (case of
> > redistribution over ASBRs to and from IGP to BGP). One thing for sure -
> > it will be much more complex to troubleshoot.
> >
> > Thx,
> > R.
> >
> > On Tue, Mar 9, 2021 at 5:39 AM Tony Li wrote:
> >
> >
> > Hi Gyan,
> >
> >  > Gyan> In previous threads BFD multi hop has been mentioned to
> > track IGP liveliness but that gets way overly complicated especially
> > with large domains and not viable.
> >
> >
> > This is not tracking IGP liveness, this is to track BGP endpoint
> > liveness.
> >
> > Here in 2021, we seem to have (finally) discovered that we can
> > automate our management plane. This ameliorates a great deal of
> > complexity.
> >
> >
> >  > Gyan> As we are trying to signal the IGP to trigger the
> > control plane convergence, the flooding machinery in the IGP already
> > exists as well as the prefix originator sub TLV from the link or node
> > failure.  IGP seems to be the perfect mechanism for the control
> > plane signaling switchover.
> >
> >
> > You’re trying to fix a problem in the overlay by morphing the
> > underlay.  How can that seem like a good idea?
> >
> >
> >  >   Gyan>As I mentioned advertising flooding of the longer
> > prefix defeats the purpose of summarization.
> >
> >
> > PUA also defeats summarization.  If you really insist on faster
> > convergence and not building a sufficiently redundant topology, then
> > yes, your area will partition and you will have to pay the price of
> > additional state for your longer prefixes.
> >
> >
> >  > In order to do what you are stating you have to remove the
> > summarization and go back to domain wide flooding
> >
> >
> > No, I’m suggesting you maintain the summary and ALSO advertise the
> > longer prefix that you feel is essential to reroute immediately.
> >
> >
> >  > which completely defeats the goal of the draft which is to make
> > host route summarization viable for operators.  We know the prefix
> > that went down and that is why with the PUA negative advertisement
> > we can easily flood a null0 to block the control plane from
> > installing the route.
> >
> >
> > So you can also advertise the more specific from the connected ABR…
> >
> >
> >  > We 

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Robert Raszuk
> You’re trying to fix a problem in the overlay by morphing the underlay.
How can that seem like a good idea?

I think this really nails this discussion.

We have discussed this before and while the concept of signalling
unreachability does seem useful such signalling should be done where it
belongs.

Here clearly we are talking about faster connectivity restoration for
overlay services so it naturally belongs in overlay.

This can be a bit misleading, because today it is the underlay that propagates
reachability of PEs, and the overlay relies on it. To scale, summarization
is used, hence in the underlay failing remote PEs remain reachable. In spite
of many efforts, however, in lots of networks that is really not the
practical problem, as those networks still rely on exact match of IGP to
LDP FEC when MPLS is used. So removal of /32 can and does happen.

At the same time BGP can pretty quickly (milliseconds) remove affected
service routes (or rather paths), hence connectivity can be restored to
redundantly connected endpoints in under a second. Such removal can take the
form of an atomic withdraw (or readvertisement), removal of recursive routes
(next hop going down) or withdrawal of a few RD/64 prefixes.

I am not convinced and I have not seen any evidence that if we put this
into IGP it will be any faster across areas or domains (case of
redistribution over ASBRs to and from IGP to BGP). One thing for sure - it
will be much more complex to troubleshoot.

Thx,
R.

On Tue, Mar 9, 2021 at 5:39 AM Tony Li  wrote:

>
> Hi Gyan,
>
> > Gyan> In previous threads BFD multi hop has been mentioned to track
> IGP liveliness but that gets way overly complicated especially with large
> domains and not viable.
>
>
> This is not tracking IGP liveness, this is to track BGP endpoint liveness.
>
> Here in 2021, we seem to have (finally) discovered that we can automate
> our management plane. This ameliorates a great deal of complexity.
>
>
> > Gyan> As we are trying to signal the IGP to trigger the control
> plane convergence, the flooding machinery in the IGP already exists well as
> the prefix originator sub TLV from the link or node failure.  IGP seems to
> be the perfect mechanism for the control plane signaling switchover.
>
>
> You’re trying to fix a problem in the overlay by morphing the underlay.
> How can that seem like a good idea?
>
>
> >   Gyan>As I mentioned advertising flooding of the longer prefix
> defeats the purpose of summarization.
>
>
> PUA also defeats summarization.  If you really insist on faster
> convergence and not building a sufficiently redundant topology, then yes,
> your area will partition and you will have to pay the price of additional
> state for your longer prefixes.
>
>
> > In order to do what you are stating you have to remove the summarization
> and go back to domain wide flooding
>
>
> No, I’m suggesting you maintain the summary and ALSO advertise the longer
> prefix that you feel is essential to reroute immediately.
>
>
> > which completely defeats the goal of the draft which is to make host
> route summarization viable for operators.  We know the prefix that went
> down and that is why with the PUA negative advertisement we can easily
> flood a null0 to block the control plane from installing the route.
>
>
> So you can also advertise the more specific from the connected ABR…
>
>
> > We don’t have any prior knowledge of the alternate for the egress PE bgp
> next hop attribute for the customer VPN overlay.  So the only way to
> accomplish what you are asking is not do any summarization and flood al
> host routes.  Of course  as I stated defeats the purpose of the draft.
>
>
> Please read again.
>
> Tony
>
>
___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-09 Thread Peter Psenak

Tony,

On 09/03/2021 01:03, Tony Li wrote:


Gyan,

If I understand the purpose of this draft, the point is to punch a hole 
in a summary so that traffic is redirected via an alternate, working path.


Rather than punch a hole, why not rely on existing technology? Have the 
valid path advertise the more specific. This will attract the traffic.


that would defeat the purpose of the summarization.

thanks,
Peter



Tony


On Mar 8, 2021, at 3:57 PM, Gyan Mishra wrote:



Acee.

Please ask the two questions you raised about the PUA draft so we can 
address your concerns.


If anyone else has any other outstanding questions or concerns we 
would like to address as well and resolve.


Once all questions and  concerns are satisfied we would like to ask 
for WG adoption.


Kind Regards

Gyan
--



Gyan Mishra
Network Solutions Architect
M 301 502-1347
13101 Columbia Pike
Silver Spring, MD







Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Aijun Wang
Hi, Tony:

Aijun Wang
China Telecom

> On Mar 9, 2021, at 12:32, Tony Li  wrote:
> 
> 
> Hi Aijun,
> 
>> Stuffing things in the IGP seems like a poor way of determining that there’s 
>> a BGP failure.  Wouldn’t BFD be a more appropriate way of determining the 
>> loss of connectivity?  Or aggressive BGP keepalive timers?
>> [WAJ] For BFD, you need to configure between each pair of them to track the 
>> connectivity.  For BGP keepalive timers, the duration is too long.
> 
> 
> You have to configure BGP on both sides too.

[WAJ] The aim is for the BGP process to act on the PUA message automatically.

> 
> Most implementations do allow you to change the timers.

[WAJ] That is not as efficient as an event-triggered mechanism. It is the same 
comparison as between a poll mechanism and a push/notify mechanism.
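
To put rough numbers on the poll versus push comparison, here is a minimal
sketch. The 90 second hold time and the one-third keepalive ratio are the
values suggested in RFC 4271; the 3 second "aggressive" hold time and the
session count are illustrative assumptions, not figures from this thread.

```python
# Suggested BGP default from RFC 4271; everything else below is illustrative.
SUGGESTED_HOLD_TIME_S = 90.0
AGGRESSIVE_HOLD_TIME_S = 3.0  # hypothetical "aggressive" setting


def keepalive_interval_s(hold_time_s):
    """RFC 4271 suggests sending KEEPALIVEs at one third of the hold time."""
    return hold_time_s / 3.0


def worst_case_poll_detection_s(hold_time_s):
    """With timers alone, a silent peer is only declared down at hold-timer expiry."""
    return hold_time_s


def keepalives_per_hour(hold_time_s, sessions):
    """Messages one router sends per hour across the given number of sessions."""
    return sessions * 3600.0 / keepalive_interval_s(hold_time_s)


if __name__ == "__main__":
    for hold in (SUGGESTED_HOLD_TIME_S, AGGRESSIVE_HOLD_TIME_S):
        print(hold, worst_case_poll_detection_s(hold), keepalives_per_hour(hold, 10))
```

Shortening the timers lowers the worst-case detection time but raises the
steady-state message rate in proportion, which is the cost being argued
about here; an event-driven notification has no steady-state cost of this kind.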


> 
> 
>>> The other side of BGP peer can quickly remove the BGP session when it 
>>> receives such PUA message which tell it the other peer is down now. Other 
>>> BGP peer protection procedures can then take effects on.
>>> The immediate notification of the failure prefix can certainly accelerate 
>>> the switchover of BGP control plane and also the service traffic that such 
>>> BGP session carries.
>> 
>> 
>> The IGP is a very poor way of delivering service liveness information.
>> [WAJ] why? 
> 
> 
> Because it’s not a service (un)availability protocol. It’s a reachability and 
> path computation function.  That is its role in the architecture.  Asking it 
> to do something else is a violation of the architecture.
> 
> Routing protocols are not transport protocols. Or dump trucks. Or service 
> advertisement protocols.

[WAJ] Here I think you can consider that the ABR is advertising wrong summary 
information and should correct itself via the PUA message. Acting on the PUA 
message is a follow-on optimization.

> 
> 
>> That seems 100% unnecessary as the longer prefix will attract the traffic in 
>> the way that you want.
>> [WAJ] If the ABR advertises such detail prefix, it will certainly attract 
>> the traffic. But here the PUA message is just to trigger the ABR to 
>> advertise such detail prefix, or else, such ABR will still advertise the 
>> summary address.
> 
> 
> You don’t need the PUA message. The ABR can see the loss of topology and can 
> realize that it’s the best path to the prefix and can then advertise the 
> explicit longer prefix in addition to the normal summary.

[WAJ] The advertisement of the detailed prefixes only occurs when some of the 
ABRs can reach the prefixes but others can’t.
If there is no PUA message, how can one ABR know that other ABRs can’t reach the 
prefixes? It can only know whether it can reach them itself.
Or else, would you require the ABR to run SPF on behalf of the other ABRs?
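
A minimal sketch of the ABR-side reaction described above, assuming a
hypothetical function name and data model (none of this comes from the
draft): on hearing a PUA from another ABR, an ABR advertises the more
specific prefix only if the prefix falls inside its own summary and it can
still reach it.

```python
import ipaddress


def abr_react_to_pua(locally_reachable, summary, pua_prefix):
    """Return the extra prefixes this ABR should advertise after a peer ABR's PUA.

    locally_reachable: set of ip_network objects this ABR can still reach.
    summary:           the summary this ABR advertises, as a string.
    pua_prefix:        the prefix the other ABR reported unreachable.
    """
    prefix = ipaddress.ip_network(pua_prefix)
    summary_net = ipaddress.ip_network(summary)
    if not prefix.subnet_of(summary_net):
        return []  # not covered by our summary, nothing to correct
    if prefix in locally_reachable:
        return [prefix]  # we can reach it: attract traffic with the specific
    return []  # we cannot reach it either: no specific to offer


if __name__ == "__main__":
    reachable = {ipaddress.ip_network("10.1.0.7/32")}
    print(abr_react_to_pua(reachable, "10.1.0.0/16", "10.1.0.7/32"))
```

The addresses are made up for illustration. The sketch also shows the gap
being debated: without some signal from the other ABR, this function has no
input that tells it the specific is needed.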

> 
> Tony
> 
> 



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Gyan Mishra
Hi Tony,

On Mon, Mar 8, 2021 at 11:39 PM Tony Li  wrote:

>
> Hi Gyan,
>
> > Gyan> In previous threads BFD multi hop has been mentioned to track
> IGP liveliness but that gets way overly complicated especially with large
> domains and not viable.
>
>
> This is not tracking IGP liveness, this is to track BGP endpoint liveness.
>

   Gyan> Agreed BGP endpoint liveliness recursive next hop - BGP callback
to IGP - Next hop tracking ..

>
> Here in 2021, we seem to have (finally) discovered that we can automate
> our management plane. This ameliorates a great deal of complexity.
>

   Gyan> Ok.  How do you propose to automate?

>
>
> > Gyan> As we are trying to signal the IGP to trigger the control
> plane convergence, the flooding machinery in the IGP already exists well as
> the prefix originator sub TLV from the link or node failure.  IGP seems to
> be the perfect mechanism for the control plane signaling switchover.
>
>
> You’re trying to fix a problem in the overlay by morphing the underlay.
> How can that seem like a good idea?
>

Gyan> I mention overlay but it’s really the underlay as the BGP next
hop is in the global table recursive route to IGP.

>
>
> >   Gyan>As I mentioned advertising flooding of the longer prefix
> defeats the purpose of summarization.
>
>
> PUA also defeats summarization.  If you really insist on faster
> convergence and not building a sufficiently redundant topology, then yes,
> your area will partition and you will have to pay the price of additional
> state for your longer prefixes.
>

Gyan> We are talking about a typical Metro Edge use case with a redundant
pair of PEs.  The issue is not redundancy.  The issue is that when
summarization is used, the component prefixes (the BGP next hops that
recurse to the IGP) are hidden, and in the MPLS RFC 5283 inter-area LSP use
case the failover is broken.  It’s not just faster convergence, it’s any
convergence, as the traffic is black-holed at the ABR, which cannot build
the LSP to the egress PE.  Please see the diagram in the slide deck; it
details this special use case.
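
The hidden-next-hop effect can be sketched as follows. This is a simplified
model with made-up addresses: it checks whether a BGP next hop still
resolves in the IGP routes seen from a remote area, once with host routes
flooded and once with only the summary.

```python
import ipaddress


def next_hop_resolves(igp_routes, next_hop):
    """True if any IGP route (longest-prefix-match semantics) covers the next hop."""
    nh = ipaddress.ip_address(next_hop)
    return any(nh in net for net in igp_routes)


FAILED_PE = "10.1.0.7"  # hypothetical loopback of the egress PE that went down

# Host routes flooded domain-wide: the failed PE's /32 is withdrawn.
flooded_after_failure = {ipaddress.ip_network("10.1.0.8/32")}

# Summarized: the ABR keeps advertising the summary after the failure.
summarized_after_failure = {ipaddress.ip_network("10.1.0.0/16")}

if __name__ == "__main__":
    print(next_hop_resolves(flooded_after_failure, FAILED_PE))
    print(next_hop_resolves(summarized_after_failure, FAILED_PE))
```

In the flooded case the next hop stops resolving, so the ingress PE can
invalidate the paths and fail over. In the summarized case it keeps
resolving via the summary, so nothing tells the ingress PE to move.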

>
>
> > In order to do what you are stating you have to remove the summarization
> and go back to domain wide flooding
>
>
> No, I’m suggesting you maintain the summary and ALSO advertise the longer
> prefix that you feel is essential to reroute immediately.


Gyan> OK, understood.  How would the ABR know or work out which backup
egress PE next hop to advertise for the LSP?  This is a unique data plane
failure condition with the RFC 5283 inter-area LSP extension, which is where
this problem statement occurs.

>
>
>
> > which completely defeats the goal of the draft which is to make host
> route summarization viable for operators.  We know the prefix that went
> down and that is why with the PUA negative advertisement we can easily
> flood a null0 to block the control plane from installing the route.
>
>
> So you can also advertise the more specific from the connected ABR…


   Gyan> As this is an MPLS exact-match FEC host route, you would have to flood
the single next-hop loopback (let’s say there is one backup PE, for
simplicity) throughout the domain so that LDP downstream unsolicited can
allocate the labels hop by hop from the destination back towards the source.
The unidirectional LSP is then established in the forward direction, and the
traffic is label switched to the egress PE using the FEC binding for the
domain-wide flooded exact-match host route.  You cannot use the summary
route on the ABR, as that LPM FEC binding is only supported when using
RFC 5283.  You cannot go halfway: you either have to use domain-wide
flooding or use the RFC 5283 inter-area LSP LPM match.

>
>
> > We don’t have any prior knowledge of the alternate for the egress PE bgp
> next hop attribute for the customer VPN overlay.  So the only way to
> accomplish what you are asking is not do any summarization and flood al
> host routes.  Of course  as I stated defeats the purpose of the draft.
>
>
> Please read again.
>
> Tony
>
> --



Gyan Mishra
Network Solutions Architect
M 301 502-1347
13101 Columbia Pike
Silver Spring, MD


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Tony Li

Hi Gyan,

> Gyan> In previous threads BFD multi hop has been mentioned to track IGP 
> liveliness but that gets way overly complicated especially with large domains 
> and not viable.


This is not tracking IGP liveness, this is to track BGP endpoint liveness.

Here in 2021, we seem to have (finally) discovered that we can automate our 
management plane. This ameliorates a great deal of complexity.


> Gyan> As we are trying to signal the IGP to trigger the control plane 
> convergence, the flooding machinery in the IGP already exists well as the 
> prefix originator sub TLV from the link or node failure.  IGP seems to be the 
> perfect mechanism for the control plane signaling switchover.


You’re trying to fix a problem in the overlay by morphing the underlay.  How 
can that seem like a good idea?


>   Gyan>As I mentioned advertising flooding of the longer prefix defeats 
> the purpose of summarization.


PUA also defeats summarization.  If you really insist on faster convergence and 
not building a sufficiently redundant topology, then yes, your area will 
partition and you will have to pay the price of additional state for your 
longer prefixes.


> In order to do what you are stating you have to remove the summarization and 
> go back to domain wide flooding


No, I’m suggesting you maintain the summary and ALSO advertise the longer 
prefix that you feel is essential to reroute immediately.


> which completely defeats the goal of the draft which is to make host route 
> summarization viable for operators.  We know the prefix that went down and 
> that is why with the PUA negative advertisement we can easily flood a null0 
> to block the control plane from installing the route. 


So you can also advertise the more specific from the connected ABR…


> We don’t have any prior knowledge of the alternate for the egress PE bgp next 
> hop attribute for the customer VPN overlay.  So the only way to accomplish 
> what you are asking is not do any summarization and flood al host routes.  Of 
> course  as I stated defeats the purpose of the draft.


Please read again.

Tony



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Tony Li

Hi Aijun,

> Stuffing things in the IGP seems like a poor way of determining that there’s 
> a BGP failure.  Wouldn’t BFD be a more appropriate way of determining the 
> loss of connectivity?  Or aggressive BGP keepalive timers?
> [WAJ] For BFD, you need to configure between each pair of them to track the 
> connectivity.  For BGP keepalive timers, the duration is too long.


You have to configure BGP on both sides too.

Most implementations do allow you to change the timers.


>> The other side of BGP peer can quickly remove the BGP session when it 
>> receives such PUA message which tell it the other peer is down now. Other 
>> BGP peer protection procedures can then take effects on.
>> The immediate notification of the failure prefix can certainly accelerate 
>> the switchover of BGP control plane and also the service traffic that such 
>> BGP session carries.
> 
> 
> The IGP is a very poor way of delivering service liveness information.
> [WAJ] why? 


Because it’s not a service (un)availability protocol. It’s a reachability and 
path computation function.  That is its role in the architecture.  Asking it 
to do something else is a violation of the architecture.

Routing protocols are not transport protocols. Or dump trucks. Or service 
advertisement protocols.


> That seems 100% unnecessary as the longer prefix will attract the traffic in 
> the way that you want.
> [WAJ] If the ABR advertises such detail prefix, it will certainly attract the 
> traffic. But here the PUA message is just to trigger the ABR to advertise 
> such detail prefix, or else, such ABR will still advertise the summary 
> address.


You don’t need the PUA message. The ABR can see the loss of topology and can 
realize that it’s the best path to the prefix and can then advertise the 
explicit longer prefix in addition to the normal summary.
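
A minimal sketch of why the longer prefix attracts the traffic, using
longest-prefix match over a toy routing table. The addresses and the ABR
names are assumptions made for illustration only.

```python
import ipaddress


def longest_prefix_match(rib, destination):
    """Return (prefix, next_hop) for the most specific covering route, or None."""
    addr = ipaddress.ip_address(destination)
    candidates = [(net, nh) for net, nh in rib.items() if addr in net]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0].prefixlen)


# The summary is kept, and the ABR that still has a path ALSO advertises
# the explicit longer prefix.
rib = {
    ipaddress.ip_network("10.1.0.0/16"): "ABR1",
    ipaddress.ip_network("10.1.0.7/32"): "ABR2",
}

if __name__ == "__main__":
    print(longest_prefix_match(rib, "10.1.0.7"))
    print(longest_prefix_match(rib, "10.1.0.8"))
```

Traffic to the affected host follows the /32 via ABR2, while everything else
under the summary is unchanged. The extra state is one route per affected
prefix rather than a return to domain-wide flooding of all host routes.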

Tony



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Gyan Mishra
Hi Tony

Thank you for your comments.

Responses in-line

Kind Regards

On Mon, Mar 8, 2021 at 11:06 PM Aijun Wang 
wrote:

> Hi, Tony:
>
> -Original Message-
> From: lsr-boun...@ietf.org  On Behalf Of Tony Li
> Sent: Tuesday, March 9, 2021 11:12 AM
> To: Aijun Wang 
> Cc: lsr ; draft-wang-lsr-prefix-unreachable-annoucement <
> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>; Acee Lindem
> (acee) ; Aijun Wang ; Gyan
> Mishra 
> Subject: Re: [Lsr]
> https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05
>
> Hi Aijun,
>
> > [WAJ] We just want to avoid such silent discard behavior, especially for
> the scenario that there is BGP session run on such failure prefix.
>
> Stuffing things in the IGP seems like a poor way of determining that
> there’s a BGP failure.  Wouldn’t BFD be a more appropriate way of
> determining the loss of connectivity?  Or aggressive BGP keepalive timers?
> [WAJ] For BFD, you need to configure between each pair of them to track
> the connectivity.  For BGP keepalive timers, the duration is too long.


Gyan> In previous threads multi-hop BFD has been mentioned as a way to track
IGP liveness, but that gets overly complicated, especially with large
domains, and is not viable.

>
>
> > The other side of BGP peer can quickly remove the BGP session when it
> receives such PUA message which tell it the other peer is down now. Other
> BGP peer protection procedures can then take effects on.
> > The immediate notification of the failure prefix can certainly
> accelerate the switchover of BGP control plane and also the service traffic
> that such BGP session carries.
>
>
> The IGP is a very poor way of delivering service liveness information.
> [WAJ] why?
>

Gyan> As we are trying to signal the IGP to trigger the control plane
convergence, the flooding machinery in the IGP already exists, as well as the
prefix originator sub-TLV from the link or node failure.  The IGP seems to be
the perfect mechanism for signaling the control plane switchover.

>
>
> >>> For scenarios 2, because the specified prefixes can be accessed via
> another ABR, then we can let this ABR to advertise the details prefixes
> information for the specified address, which behavior is similar with RIFT,
> as also mentioned in the presentation materials.
> >>
> >>
> >> Agreed.
> >
> > [WAJ] Even for this scenario, the advertisement of the detail prefixes
> is trigger also via the PUA message from other ABR.
>
>
> That seems 100% unnecessary as the longer prefix will attract the traffic
> in the way that you want.
> [WAJ] If the ABR advertises such detail prefix, it will certainly attract
> the traffic. But here the PUA message is just to trigger the ABR to
> advertise such detail prefix, or else, such ABR will still advertise the
> summary address.


  Gyan> As I mentioned, flooding the longer prefix defeats
the purpose of summarization. In order to do what you are stating you have
to remove the summarization and go back to domain-wide flooding, which
completely defeats the goal of the draft, which is to make host route
summarization viable for operators.  We know the prefix that went down, and
that is why with the PUA negative advertisement we can easily flood a null0
to block the control plane from installing the route.  We don’t have any
prior knowledge of the alternate for the egress PE BGP next hop attribute
for the customer VPN overlay.  So the only way to accomplish what you are
asking is to not do any summarization and flood all host routes.  Of course,
as I stated, that defeats the purpose of the draft.

>
>
> Tony
>
>
> --


Gyan Mishra
Network Solutions Architect
M 301 502-1347
13101 Columbia Pike
Silver Spring, MD


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Aijun Wang
Hi, Tony:

-Original Message-
From: lsr-boun...@ietf.org  On Behalf Of Tony Li
Sent: Tuesday, March 9, 2021 11:12 AM
To: Aijun Wang 
Cc: lsr ; draft-wang-lsr-prefix-unreachable-annoucement 
; Acee Lindem (acee) 
; Aijun Wang ; Gyan Mishra 

Subject: Re: [Lsr] 
https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

Hi Aijun,

> [WAJ] We just want to avoid such silent discard behavior, especially for the 
> scenario that there is BGP session run on such failure prefix. 

Stuffing things in the IGP seems like a poor way of determining that there’s a 
BGP failure.  Wouldn’t BFD be a more appropriate way of determining the loss of 
connectivity?  Or aggressive BGP keepalive timers?
[WAJ] For BFD, you need to configure between each pair of them to track the 
connectivity.  For BGP keepalive timers, the duration is too long.

> The other side of BGP peer can quickly remove the BGP session when it 
> receives such PUA message which tell it the other peer is down now. Other BGP 
> peer protection procedures can then take effects on.
> The immediate notification of the failure prefix can certainly accelerate the 
> switchover of BGP control plane and also the service traffic that such BGP 
> session carries.


The IGP is a very poor way of delivering service liveness information.
[WAJ] why? 


>>> For scenarios 2, because the specified prefixes can be accessed via another 
>>> ABR, then we can let this ABR to advertise the details prefixes information 
>>> for the specified address, which behavior is similar with RIFT, as also 
>>> mentioned in the presentation materials.
>> 
>> 
>> Agreed.
> 
> [WAJ] Even for this scenario, the advertisement of the detail prefixes is 
> trigger also via the PUA message from other ABR.


That seems 100% unnecessary as the longer prefix will attract the traffic in 
the way that you want.
[WAJ] If the ABR advertises such detail prefix, it will certainly attract the 
traffic. But here the PUA message is just to trigger the ABR to advertise such 
detail prefix, or else, such ABR will still advertise the summary address.

Tony



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Tony Li


Hi Aijun,

> [WAJ] We just want to avoid such silent discard behavior, especially for the 
> scenario that there is BGP session run on such failure prefix. 


Stuffing things in the IGP seems like a poor way of determining that there’s a 
BGP failure.  Wouldn’t BFD be a more appropriate way of determining the loss of 
connectivity?  Or aggressive BGP keepalive timers?


> The other side of BGP peer can quickly remove the BGP session when it 
> receives such PUA message which tell it the other peer is down now. Other BGP 
> peer protection procedures can then take effects on.
> The immediate notification of the failure prefix can certainly accelerate the 
> switchover of BGP control plane and also the service traffic that such BGP 
> session carries.


The IGP is a very poor way of delivering service liveness information.


>>> For scenarios 2, because the specified prefixes can be accessed via another 
>>> ABR, then we can let this ABR to advertise the details prefixes information 
>>> for the specified address, which behavior is similar with RIFT, as also 
>>> mentioned in the presentation materials.
>> 
>> 
>> Agreed.
> 
> [WAJ] Even for this scenario, the advertisement of the detail prefixes is 
> trigger also via the PUA message from other ABR.


That seems 100% unnecessary as the longer prefix will attract the traffic in 
the way that you want.

Tony



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Aijun Wang
Hi, Acee:

Let me state my considerations for your questions.

Aijun Wang
China Telecom

> On Mar 9, 2021, at 08:37, Acee Lindem (acee)  wrote:
> 
> 
> Speaking as a WG member:
>  
> Hi Gyan,
>  
> The first question is how do you know which prefixes within the summary range 
> to protect? Are these configured? Is this half-assed best-effort protection 
> where you protect prefixes within the range that you’ve installed recently? 
> Just how does this work? It is clearly not specified in the draft.

[WAJ] Currently, we consider the PUA to be a generic notification for failed 
prefixes. The ABR can detect the failed prefixes within its attached areas.
Only two kinds of nodes will react to the PUA message:
one is the receiving side that runs a BGP session on such failed prefixes 
(it accelerates the BGP switchover procedures); the other is the other ABRs 
(they advertise the detailed prefixes if they can reach the prefixes notified 
by the received PUA).


>  
> The second comment is that using the prefix-originator TLV is a terrible 
> choice of encoding. Note that if there is any router in the domain that 
> doesn’t support the extension, you’ll actually attract traffic towards the 
> ABR blackholing it.

[WAJ] No. If such a router doesn’t support the extension, it will not advertise 
the PUA message, nor will it act on such a message.
>  
> Further, I think your example is a bit contrived. I’d hope that an OSPF area 
> with “thousands” of summarized PE addresses wouldn’t be partitioned by a single 
> failure as in figure 1 in the draft and your slides.
> I also see that the option of a backbone tunnel between the ABRs was removed from 
> the draft since it diminished the requirement for this functionality.
[WAJ]Tunnel should only be used in some extreme scenarios. For example, when 
all ABRs can’t advertise the PUA or detail reachable prefixes. We can discuss 
this later.
>  
> Thanks,
> Acee
>  
> From: Gyan Mishra 
> Date: Monday, March 8, 2021 at 6:57 PM
> To: Acee Lindem , Aijun Wang , 
> draft-wang-lsr-prefix-unreachable-annoucement 
> , lsr 
> Subject: 
> https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05
>  
>  
> Acee. 
>  
> Please ask the two questions you raised about the PUA draft so we can address 
> your concerns.
>  
> If anyone else has any other outstanding questions or concerns we would like 
> to address as well and resolve.
>  
> Once all questions and  concerns are satisfied we would like to ask for WG 
> adoption.
>  
> Kind Regards 
>  
> Gyan
> --
> 
> 
> Gyan Mishra
> Network Solutions Architect 
> M 301 502-1347
> 13101 Columbia Pike 
> Silver Spring, MD
>  


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Gyan Mishra
On Mon, Mar 8, 2021 at 7:37 PM Acee Lindem (acee)  wrote:

> Speaking as a WG member:
>
>
>
> Hi Gyan,
>
>
>
> The first question is how do you know which prefixes within the summary
> range to protect? Are these configured? Is this half-assed best-effort
> protection where you protect prefixes within the range that you’ve
> installed recently? Just how does this work? It is clearly not specified in
> the draft.
>
>  Gyan>  All prefixes within the summary range are protected see section 4.
>


   [RFC7794] and [I-D.ietf-lsr-ospf-prefix-originator] draft both define
   one sub-tlv to announce the originator information of the one prefix
   from a specified node.  This draft utilizes such TLV for both OSPF
   and ISIS to signal the negative prefix in the perspective PUA when a
   link or node goes down.

   ABR detects link or node down and floods PUA negative prefix
   advertisement along with the summary advertisement according to the
   prefix-originator specification.  The ABR or ISIS L1-L2 border node
   has the responsibility to add the prefix originator information when
   it receives the Router LSA from other routers in the same area or
   level.


When the ABR or ISIS L1-L2 border node generates the summary
   advertisement based on component prefixes, the ABR will announce one
   new summary LSA or LSP which includes the information about this down
   prefix, with the prefix originator set to NULL.  The number of PUAs
   is equivalent to the number of links down or nodes down.  The LSA or
   LSP will be propagated with standard flooding procedures.

   If the nodes in the area receive the PUA flood from all of its ABR
   routers, they will start BGP convergence process if there exist BGP
   session on this PUA prefix.  The PUA creates a forced fail over
   action to initiate immediate control plane convergence switchover to
   alternate egress PE.  Without the PUA forced convergence the down
   prefix will yield black hole routing resulting in loss of
   connectivity.

   When only some of the ABRs can't reach the failure node/link, as that
   described in Section 3.2, the ABR that can reach the PUA prefix
   should advertise one specific route to this PUA prefix.  The internal
   routers within another area can then bypass the ABRs that can't reach
   the PUA prefix, to reach the PUA prefix.
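
The receiver-side rule quoted above (start BGP convergence only once the PUA
has arrived from all ABRs) can be sketched as below. The class and method
names are hypothetical and the bookkeeping is a simplification, not the
draft's specified procedure.

```python
from collections import defaultdict


class PuaTracker:
    """Tracks, per prefix, which of the area's ABRs have sent a PUA."""

    def __init__(self, abrs):
        self.abrs = frozenset(abrs)
        self.seen = defaultdict(set)

    def on_pua(self, abr, prefix):
        """Record a PUA; return True once every known ABR has reported the prefix."""
        if abr in self.abrs:
            self.seen[prefix].add(abr)
        return self.seen[prefix] == self.abrs


if __name__ == "__main__":
    tracker = PuaTracker({"ABR1", "ABR2"})
    print(tracker.on_pua("ABR1", "10.1.0.7/32"))  # only one ABR so far
    print(tracker.on_pua("ABR2", "10.1.0.7/32"))  # all ABRs agree
```

While only some ABRs have reported the prefix, the node waits, which matches
the partial-reachability case where a remaining ABR is expected to advertise
a specific route instead.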


The second comment is that using the prefix-originator TLV is a terrible
> choice of encoding. Note that if there is any router in the domain that
> doesn’t support the extension, you’ll actually attract traffic towards the
> ABR blackholing it.
>
>  Gyan> I will work with the authors to see if there is any alternative PUA
> process to signal and detect the failure in case the prefix originator TLV is
> not supported.
>
> Further, I think your example is a bit contrived. I’d hope that an OSPF
> area with “thousands” of summarized PE addresses wouldn’t be partitioned by a
> single failure as in figure 1 in the draft and your slides. I also see that the
> option of a backbone tunnel between the ABRs was removed from the draft
> since it diminished the requirement for this functionality.
>
>  Gyan> This is a real world Metro access edge example, as the impact is on
> customers that have an LSP built to the down egress PE and have not failed
> over.  In this scenario there is a Primary and Backup PE per Metro edge,
> which is typical for an operator.
>

The workaround used today is to flood all /32 next hop prefixes and not
> take advantage of summarization.  This draft makes RFC 5283 inter area FEC
> binding now viable for operators.
>

>
> Thanks,
> Acee
>
>
>
> *From: *Gyan Mishra 
> *Date: *Monday, March 8, 2021 at 6:57 PM
> *To: *Acee Lindem , Aijun Wang ,
> draft-wang-lsr-prefix-unreachable-annoucement <
> draft-wang-lsr-prefix-unreachable-annoucem...@ietf.org>, lsr  >
> *Subject: *
> https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05
>
>
>
>
>
> Acee.
>
>
>
> Please ask the two questions you raised about the PUA draft so we can
> address your concerns.
>
>
>
> If anyone else has any other outstanding questions or concerns we would
> like to address as well and resolve.
>
>
>
> Once all questions and  concerns are satisfied we would like to ask for WG
> adoption.
>
>
>
> Kind Regards
>
>
>
> Gyan
>
> --
>
> Gyan Mishra
> Network Solutions Architect
> M 301 502-1347
> 13101 Columbia Pike
> Silver Spring, MD
>
>
>
-- 
Gyan Mishra
Network Solutions Architect
M 301 502-1347
13101 Columbia Pike
Silver Spring, MD
___
Lsr mailing list
Lsr@ietf.org

Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Aijun Wang
Hi, Tony:

Aijun Wang
China Telecom

> On Mar 9, 2021, at 08:22, Tony Li  wrote:
> 
> 
> Hi Aijun,
>> 
>> There are two scenarios, as introduced by Gyan: one is node failure
>> (Scenario 1), and the other is link failure (Scenario 2).
>> 
>> For Scenario 1, where no ABR can reach the specified address, it is not
>> efficient to advertise all of the other detailed prefixes when only one or
>> a few prefixes are missing. It is reasonable for the ABRs to identify
>> exactly the failed prefixes via a PUA message.
> 
> 
> If no ABR can reach the address, then there is no point in advertising 
> anything.  The traffic is going to black hole.

[WAJ] We just want to avoid such silent discard behavior, especially in the 
scenario where a BGP session runs over the failed prefix. 
The remote BGP peer can quickly tear down the BGP session when it receives 
a PUA message telling it that the other peer is now down. Other BGP peer 
protection procedures can then take effect.
The immediate notification of the failed prefix can certainly accelerate the 
switchover of the BGP control plane and also of the service traffic that the 
BGP session carries.
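
The session teardown described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the peer addresses, the PUA prefix, and the helper name `sessions_to_tear_down` are hypothetical and are not taken from the draft or from any BGP implementation.

```python
import ipaddress

def sessions_to_tear_down(bgp_peers, pua_prefix):
    """Return the configured peer addresses covered by an unreachable prefix."""
    net = ipaddress.ip_network(pua_prefix)
    return [peer for peer in bgp_peers if ipaddress.ip_address(peer) in net]

# Hypothetical peers of an ingress PE; two sit behind the same summary.
peers = ["10.1.1.1", "10.1.1.2", "10.2.0.1"]

# A PUA for 10.1.1.1/32 arrives; only the session to that peer is affected.
down = sessions_to_tear_down(peers, "10.1.1.1/32")
print(down)
```

Without such a notification, the session to 10.1.1.1 would stay up until its hold timer or another liveness mechanism detects the failure, because the covering summary remains in the routing table.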

> 
> 
>> For Scenario 2, because the specified prefixes can be reached via another 
>> ABR, we can let that ABR advertise the detailed prefix information for the 
>> specified address. This behavior is similar to RIFT, as also mentioned in 
>> the presentation materials.
> 
> 
> Agreed.

[WAJ] Even in this scenario, the advertisement of the detailed prefixes is 
also triggered by the PUA message from the other ABR.

> 
> So, why do we need to punch a hole?
> 
> Tony
> 
> 

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Gyan Mishra
Hi Tony

On Mon, Mar 8, 2021 at 7:22 PM Tony Li  wrote:

>
> Hi Aijun,
> >
> > There are two scenarios, as introduced by Gyan: one is node failure
> > (Scenario 1), and the other is link failure (Scenario 2).
> >
> > For Scenario 1, where no ABR can reach the specified address, it is not
> > efficient to advertise all of the other detailed prefixes when only one
> > or a few prefixes are missing. It is reasonable for the ABRs to identify
> > exactly the failed prefixes via a PUA message.
>
>
> If no ABR can reach the address, then there is no point in advertising
> anything.  The traffic is going to black hole.




>
>
>
> > For Scenario 2, because the specified prefixes can be reached via
> > another ABR, we can let that ABR advertise the detailed prefix
> > information for the specified address. This behavior is similar to RIFT,
> > as also mentioned in the presentation materials.
>
>
> Agreed.
>
> So, why do we need to punch a hole?



   Gyan> The goal of the solution is to take advantage of summarization,
using longest-prefix-match (LPM) FEC binding from the MPLS LDP inter-area
LSP extension for the /32 host routes that serve as VPN overlay next hops.
In larger domains with 1000s of PEs, this eliminates domain-wide flooding of
host routes and its impacts, per RFC 5302.

The prefix in question is the egress PE next hop for the VPN overlay, a
component prefix of the summary that becomes unreachable due to a link or
node failure.  Because all BGP next hops for the VPN overlay are summarized
in order to avoid domain-wide host route flooding, the LPM summary FEC
binding must be used to route the traffic, and the next-hop host routes
(the component prefixes) cannot be advertised individually.

That is the problem we are trying to solve with PUA: signal that the
component prefix is down, using the prefix originator sub-TLV (RFC 7794) set
to null0 and flooded using normal procedures, to force control plane
convergence to the alternate egress PE next hop so that the LSP is properly
re-routed.
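
A rough sketch of the problem statement above, from the ingress PE's point of view. Everything here is an assumption made for illustration: the addresses, the idea of keeping an `unreachable` list fed by PUA, and the `select_egress` helper are hypothetical and do not describe the draft's actual procedures.

```python
import ipaddress

def select_egress(candidates, igp_routes, unreachable):
    """Pick the first BGP next hop that an IGP route covers and that no
    unreachability announcement has reported as down."""
    for next_hop in candidates:
        addr = ipaddress.ip_address(next_hop)
        covered = any(addr in ipaddress.ip_network(r) for r in igp_routes)
        reported_down = any(addr in ipaddress.ip_network(u) for u in unreachable)
        if covered and not reported_down:
            return next_hop
    return None

candidates = ["10.1.1.1", "10.1.1.2"]   # primary and backup egress PE
igp_routes = ["10.1.0.0/16"]            # only the summary is flooded

# The primary PE has failed, but the summary still covers its address,
# so the ingress PE keeps using it and traffic is dropped.
summary_only = select_egress(candidates, igp_routes, unreachable=[])

# With an explicit unreachability signal, the backup is selected.
with_pua = select_egress(candidates, igp_routes, unreachable=["10.1.1.1/32"])
print(summary_only, with_pua)
```

The first call shows why summarization alone hides the failure; the second shows the convergence to the alternate egress PE that the signaling is meant to trigger.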

>
>
> Tony
>
-- 
Gyan Mishra
Network Solutions Architect
M 301 502-1347
13101 Columbia Pike
Silver Spring, MD


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Acee Lindem (acee)
Speaking as a WG member:

Hi Gyan,

The first question is how do you know which prefixes within the summary range 
to protect? Are these configured? Is this half-assed best-effort protection 
where you protect prefixes within the range that you’ve installed recently? 
Just how does this work? It is clearly not specified in the draft.

The second comment is that using the prefix-originator TLV is a terrible choice 
of encoding. Note that if there is any router in the domain that doesn’t 
support the extension, you’ll actually attract traffic towards the ABR 
blackholing it.
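
The concern above can be illustrated with a small sketch. It assumes, hypothetically, that the unreachability is encoded as an ordinary prefix advertisement whose originator field is set to a null value (written here as "0.0.0.0"); the dictionary layout and the `install` helper are made up for illustration and are not the draft's encoding.

```python
import ipaddress

NULL_ORIGINATOR = "0.0.0.0"  # assumed stand-in for the "null" originator value

def install(advert, supports_pua):
    """Return the (prefix, next_hop) a router installs, or None if nothing."""
    if supports_pua and advert["originator"] == NULL_ORIGINATOR:
        # A router that understands the extension treats this as unreachable.
        return None
    # A router that ignores the sub-TLV sees an ordinary reachable prefix.
    return (ipaddress.ip_network(advert["prefix"]), advert["advertised_by"])

pua = {
    "prefix": "10.1.1.1/32",
    "originator": NULL_ORIGINATOR,
    "advertised_by": "ABR1",
}

aware = install(pua, supports_pua=True)
legacy = install(pua, supports_pua=False)
print(aware, legacy)
```

The legacy router ends up with a /32 pointing at ABR1, which is more specific than the summary and therefore preferred, so it sends traffic for the failed address to exactly the ABR that cannot deliver it.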

Further, I think your example is a bit contrived. I’d hope that an OSPF area 
with “thousands” of summarized PE addresses wouldn’t be partitioned by a single 
failure as in figure 1 in the draft and your slides. I also note that the 
option of a backbone tunnel between the ABRs was removed from the draft since 
it diminished the requirement for this functionality.

Thanks,
Acee

From: Gyan Mishra 
Date: Monday, March 8, 2021 at 6:57 PM
To: Acee Lindem , Aijun Wang , 
draft-wang-lsr-prefix-unreachable-annoucement 
, lsr 
Subject: 
https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05


Acee.

Please ask the two questions you raised about the PUA draft so we can address 
your concerns.

If anyone else has any other outstanding questions or concerns we would like to 
address as well and resolve.

Once all questions and  concerns are satisfied we would like to ask for WG 
adoption.

Kind Regards

Gyan
--

Gyan Mishra

Network Solutions Architect

M 301 502-1347
13101 Columbia Pike
Silver Spring, MD



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Tony Li

Hi Aijun,
> 
> There are two scenarios, as introduced by Gyan: one is node failure 
> (Scenario 1), and the other is link failure (Scenario 2).
> 
> For Scenario 1, where no ABR can reach the specified address, it is not 
> efficient to advertise all of the other detailed prefixes when only one or 
> a few prefixes are missing. It is reasonable for the ABRs to identify 
> exactly the failed prefixes via a PUA message.


If no ABR can reach the address, then there is no point in advertising 
anything.  The traffic is going to black hole.


> For Scenario 2, because the specified prefixes can be reached via another 
> ABR, we can let that ABR advertise the detailed prefix information for the 
> specified address. This behavior is similar to RIFT, as also mentioned in 
> the presentation materials.


Agreed.

So, why do we need to punch a hole?

Tony



Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Aijun Wang
Hi, Tony:

There are two scenarios, as introduced by Gyan: one is node failure 
(Scenario 1), and the other is link failure (Scenario 2).

For Scenario 1, where no ABR can reach the specified address, it is not 
efficient to advertise all of the other detailed prefixes when only one or 
a few prefixes are missing. It is reasonable for the ABRs to identify 
exactly the failed prefixes via a PUA message.

For Scenario 2, because the specified prefixes can be reached via another 
ABR, we can let that ABR advertise the detailed prefix information for the 
specified address. This behavior is similar to RIFT, as also mentioned in 
the presentation materials.

Aijun Wang
China Telecom

> On Mar 9, 2021, at 08:03, Tony Li  wrote:
> 
> 
> 
> Gyan,
> 
> If I understand the purpose of this draft, the point is to punch a hole in a 
> summary so that traffic is redirected via an alternate, working path.
> 
> Rather than punch a hole, why not rely on existing technology? Have the valid 
> path advertise the more specific. This will attract the traffic.
> 
> Tony
> 
> 
>> On Mar 8, 2021, at 3:57 PM, Gyan Mishra  wrote:
>> 
>> 
>> Acee. 
>> 
>> Please ask the two questions you raised about the PUA draft so we can 
>> address your concerns.
>> 
>> If anyone else has any other outstanding questions or concerns we would like 
>> to address as well and resolve.
>> 
>> Once all questions and  concerns are satisfied we would like to ask for WG 
>> adoption.
>> 
>> Kind Regards 
>> 
>> Gyan
>> -- 
>> 
>> 
>> Gyan Mishra
>> Network Solutions Architect 
>> M 301 502-1347
>> 13101 Columbia Pike 
>> Silver Spring, MD
>> 


Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Tony Li

Gyan,

If I understand the purpose of this draft, the point is to punch a hole in a 
summary so that traffic is redirected via an alternate, working path.

Rather than punch a hole, why not rely on existing technology? Have the valid 
path advertise the more specific. This will attract the traffic.
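
The existing-technology alternative relies only on longest-prefix-match forwarding. A minimal sketch, assuming hypothetical addresses, hypothetical ABR names, and a toy `lookup` helper that stands in for a real routing table:

```python
import ipaddress

def lookup(rib, destination):
    """Return the next hop of the longest prefix covering destination, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

# Hypothetical state on a router outside the area: only the summary is known.
rib = {"10.1.0.0/16": "ABR1"}
before = lookup(rib, "10.1.1.1")

# ABR1 loses its path to PE 10.1.1.1; ABR2, which still has a working path,
# advertises the more specific prefix.
rib["10.1.1.1/32"] = "ABR2"
after = lookup(rib, "10.1.1.1")
other = lookup(rib, "10.1.1.2")
print(before, after, other)
```

Traffic for the affected PE moves to ABR2 while every other address under the summary keeps its original path, and no router needs to understand a new extension.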

Tony


> On Mar 8, 2021, at 3:57 PM, Gyan Mishra  wrote:
> 
> 
> Acee. 
> 
> Please ask the two questions you raised about the PUA draft so we can address 
> your concerns.
> 
> If anyone else has any other outstanding questions or concerns we would like 
> to address as well and resolve.
> 
> Once all questions and  concerns are satisfied we would like to ask for WG 
> adoption.
> 
> Kind Regards 
> 
> Gyan
> -- 
>  
> Gyan Mishra
> Network Solutions Architect 
> M 301 502-1347
> 13101 Columbia Pike 
> Silver Spring, MD
> 


[Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

2021-03-08 Thread Gyan Mishra
Acee.

Please ask the two questions you raised about the PUA draft so we can
address your concerns.

If anyone else has any other outstanding questions or concerns we would
like to address as well and resolve.

Once all questions and  concerns are satisfied we would like to ask for WG
adoption.

Kind Regards

Gyan
-- 
Gyan Mishra
Network Solutions Architect
M 301 502-1347
13101 Columbia Pike
Silver Spring, MD