Dear PCE working group, I am working on a PCE for SR-TE and currently trying to implement dual-PCE redundancy. I prefer to follow the RFC and not implement any proprietary inter-PCE protocols unless absolutely necessary.
In my opinion, PCEP has so far not been a very standards-friendly protocol: every vendor implements things differently and there is not much interoperability. I managed to get the single-PCE scenario more or less working, although the logic for handling state sync and LSP-ID state differs between vendors.

So my question is how dual-PCE redundancy is supposed to work; let's say, for now, for PCE-initiated LSPs only. RFC 8231, Section 5.7 <https://datatracker.ietf.org/doc/html/rfc8231#section-5.7> describes the procedures, but again vendors implement it differently.

Cisco delegates the LSP to the PCE that initiated it and reports it to the other PCE (without the delegate flag). If this PCE fails, Cisco redelegates the LSP to the second PCE, and the LSP remains delegated to it forever.

Juniper also delegates to the primary PCE and reports to the other without the delegate flag, but if the primary PCE fails, Juniper does not redelegate; instead it waits a bit and then removes the LSP. My understanding is that it expects the secondary PCE to re-initiate the LSP at this point.

I haven't tested Nokia, Huawei and others, but I wouldn't be surprised if they also handle this in different ways.

So, how do I properly implement dual-PCE redundancy? Is there a standard way, or does it rely on vendor-proprietary mechanisms?

I am thinking about the following logic:

1. When the session comes up, wait for the end of state sync (the PCRpt with all zeros) plus some timer; then:
   1. If we received PCRpts with the delegate flag (and some of them match our LSPs) -> assume the PCC elected us as primary PCE; send PCUpd for these delegated LSPs and PCInitiate for our LSPs not seen in any PCRpt.
   2. If we received PCRpts matching our LSPs but without the delegate flag -> assume the PCC elected us as secondary PCE; do not send PCUpd or PCInitiate.
   3. If we did not receive any PCRpts matching our LSPs [this is the tricky part: we can't send PCInitiate, because if both PCEs send it simultaneously the PCC will complain] -> maybe start a random timer and try PCInitiate after it expires, so that the other PCE will receive reports with our LSPs?
2. When the PCC redelegates LSPs to us -> assume we are now elected as primary PCE; send PCUpd for the delegated LSPs and PCInitiate for LSPs not seen in any PCRpt.
3. When the PCC removes LSPs that are still active on the PCE side -> assume Juniper-style redundancy; send PCInitiate for those LSPs.

I think this will work based on my observations so far, but I want to ask the PCE experts just in case: what is the IETF way, and are there any implementations that interoperate with different vendors?

Maybe the more intelligent approach would be to implement draft-ietf-pce-state-sync-09, but from reading it I'm not sure it will work in all scenarios, including with a Juniper PCC. So it could add more problems with split brain while not solving the original issue.

Please let me know what you think.

Regards,
Dmytro
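
For discussion, the decision proposed in step 1 (after state sync and the extra timer) could be sketched roughly as below. This is only an illustration under assumptions: `Role`, `Report` and `decide_after_sync` are hypothetical names, LSPs are matched by name only, a report is reduced to a name plus the delegate flag, and no real PCEP library or vendor behaviour is modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"      # PCC delegated at least one of our LSPs to us
    SECONDARY = "secondary"  # PCC reports our LSPs, but without delegation
    UNKNOWN = "unknown"      # PCC reported none of our LSPs


@dataclass(frozen=True)
class Report:
    """Hypothetical, simplified view of one PCRpt: LSP name + delegate flag."""
    lsp_name: str
    delegated: bool


def decide_after_sync(own_lsps, reports):
    """Return (role, lsps_to_update, lsps_to_initiate).

    Meant to be called once the end-of-sync marker has been received
    and the additional hold timer has expired.
    """
    own = set(own_lsps)
    # Keep only reports about LSPs this PCE knows; the last report per LSP wins.
    seen = {r.lsp_name: r.delegated for r in reports if r.lsp_name in own}
    delegated = sorted(name for name, flag in seen.items() if flag)

    if delegated:
        # Step 1, first case: update delegated LSPs, initiate the unseen ones.
        missing = sorted(own - set(seen))
        return Role.PRIMARY, delegated, missing
    if seen:
        # Step 1, second case: stay passive, send nothing.
        return Role.SECONDARY, [], []
    # Step 1, third case: nothing reported; the caller would start a
    # randomized back-off timer before attempting to initiate anything.
    return Role.UNKNOWN, [], []


if __name__ == "__main__":
    role, to_update, to_initiate = decide_after_sync(
        ["lsp-a", "lsp-b", "lsp-c"],
        [Report("lsp-a", True), Report("lsp-b", False)],
    )
    print(role, to_update, to_initiate)
```

Steps 2 and 3 would reuse the same pieces: a redelegation event leads to the primary branch, and an LSP removed by the PCC while still wanted by the PCE is simply added to the list to initiate.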
