Hi Aijun,

thank you for confirming that this is not a conclusion one can arrive at based on my discussion with Robert.

Secondly, I wouldn't characterize the problem you describe as a scaling issue with using multi-hop BFD to monitor path continuity in the underlay network. In my opinion, it is an operational overhead that can be addressed by an intelligent management plane or by a few extensions in the control plane that sets up the overlay. Since the management plane is usually a proprietary solution, I invite anyone interested to work on BFD auto-configuration extensions in the control plane. I would much appreciate references to use cases that can benefit from such extensions.

Regards,
Greg

On Mon, Nov 29, 2021 at 6:26 PM Aijun Wang <wangai...@tsinghua.org.cn> wrote:

> Hi, Greg:
>
> Firstly, regardless of which method is used for the multi-hop BFD approach, it is certainly a configuration overhead if you imagine there are 10,000 PEs, as Tony often raises as an example.
>
> Wouldn't you have to configure each pair of them to detect the PE-PE connection?
>
> That is obviously not scalable.
>
> Best Regards
>
> Aijun Wang
> China Telecom
>
> *From:* Greg Mirsky <gregimir...@gmail.com>
> *Sent:* Tuesday, November 30, 2021 10:18 AM
> *To:* Aijun Wang <wangai...@tsinghua.org.cn>
> *Cc:* Gyan Mishra <hayabusa...@gmail.com>; Robert Raszuk <rob...@raszuk.net>; lsr <lsr@ietf.org>
> *Subject:* Re: [Lsr] BFD aspects
>
> Hi Aijun,
>
> could you please elaborate on how you see that this discussion leads to the "BFD based detection for the mentioned problem is not [...] scalable(among PEs)" conclusion? I hope that nothing I've said or suggested led you to this conclusion. Personally, I believe that BFD-based PE-PE is the best technical solution. I understand that an operator may be dissatisfied with the additional configuration of the BFD session. As noted, I believe that can be addressed in the management plane or by minor extensions in the control plane (BGP or not). If a particular implementation (or a combination of the implementation and HW) has a scaling challenge with multi-hop BFD, then that may not be sufficient technical justification for a somewhat controversial proposal.
>
> Regards,
> Greg
>
> On Mon, Nov 29, 2021 at 5:17 PM Aijun Wang <wangai...@tsinghua.org.cn> wrote:
>
> From the discussion, I think we can get the conclusion that BFD based detection for the mentioned problem is not reliable (between PE/RR) and scalable(among PEs).
>
> The same goes for the BGP-based solution.
>
> So let’s focus on how to implement it within the IGP? Thanks for Greg’s analysis.
>
> One addition to Robert’s comments: the RR is not always located within the same area as the PEs, so it cannot immediately learn that a PE node is down when summarization is configured between areas.
>
> Best Regards
>
> Aijun Wang
> China Telecom
>
> *From:* lsr-boun...@ietf.org <lsr-boun...@ietf.org> *On Behalf Of* Gyan Mishra
> *Sent:* Tuesday, November 30, 2021 8:44 AM
> *To:* Robert Raszuk <rob...@raszuk.net>
> *Cc:* Greg Mirsky <gregimir...@gmail.com>; lsr <lsr@ietf.org>
> *Subject:* Re: [Lsr] BFD aspects
>
> Robert
>
> On Mon, Nov 29, 2021 at 7:35 PM Robert Raszuk <rob...@raszuk.net> wrote:
>
> Hi Greg,
>
> If BFD had autodiscovery built in, that would indeed be the ultimate solution. Of course folks will worry about scaling and the number of BFD sessions to be run PE-PE.
>
> GIM>> I sense that it is not "BFD autodiscovery" but an advertisement of BFD multi-hop system readiness to the particular PE. That, as I think of it, can be done in a control or management plane.
>
> Agreed.
>
> But if BFD between all PEs would be an option, why would RR to PE in the local area not be a viable solution?
>
> GIM>> Because, in the case of PE-PE, BFD control packets will be fate-sharing with data packets. But the path between RR and PE might not be used for carrying data packets at all.
>
> 100%. But that was accounted for. The reason being that you have at least two RRs in an area. The point of BFD was to detect that the PE went down.
>
> Gyan> What Greg is alluding to is a very good point to consider: the RR in many operator networks sits in the “control plane” path, which is separate from the data plane path. So the RR has no knowledge of the E2E forwarding plane path between the PEs, as it sits outside the forwarding plane path.
> That being said, the PE-to-RR path is disjoint from the PE-PE path, so from its PE-RR point of view the RR may think the PE is up or down, hence the false positive or negative. That would be the case regardless of how many RRs are deployed.
>
> You are absolutely right that it may report an RR disconnect from the network while the PE is up and the data plane from remote PEs can reach it. That is why we have more than one RR.
>
> As far as fate sharing of PE-PE BFD with real user data goes, I think it is not always the case. But this is a completely separate discussion :)
>
> Also please keep in mind that a PE going down can be learned by RRs by listening to the IGP. No BFD needed.
>
> Both would be multihop, both would be subject to all transit failures etc ...
>
> GIM>> I think that there's a difference in the impact a path failure has on the data traffic. Monitoring the PE-PE path in the underlay, using the same encapsulation as data traffic, is representative of the data experience. A failure of the PE-RR path, in my understanding, may not be representative at all. The BFD session between RR and PE may fail while the PE is absolutely functional from the service PoV.
>
> Please keep in mind that this entire discussion is not about data plane failure end to end :) Yes, it's pretty sad. This entire debate is about indicating domain-wide that the IGP component on a PE went down.
>
> No one is considering data plane liveness or even, as you observed, data plane encapsulation congruence. Clearly this is not a true OAM discussion.
>
> On the other hand, a PE might be disconnected from the service while the BFD session to the RR is in the Up state.
>
> Not likely if you keep in mind that, to trigger any remote action, such a failure would have to happen to all RRs.
>
> Thx a lot,
> R.
>
> _______________________________________________
> Lsr mailing list
> Lsr@ietf.org
> https://www.ietf.org/mailman/listinfo/lsr
>
> --
>
> *Gyan Mishra*
> *Network Solutions Architect*
> *Email gyan.s.mis...@verizon.com*
> *M 301 502-1347*