Hi authors,

I really appreciate the work done through interop to better understand protocol specs and revise the protocol. I hope that you are not all talking yourselves into an interop mode that changes the specs because that seems to interoperate, rather than fixing implementations to conform to the current specs (which were thought about quite a bit ;-)
In the end, of course, we should document what is implemented and interoperates (provided it works!).

I did a very light skim of the document and I found a number of issues that concern me.

Best,
Adrian

---

In section 3 you have:

   We introduce the concept of the LSP-DB, as a database of actual LSP state in the network

I don't think you do 😊 You might start with RFC 7399 and RFC 8051. Possibly you are introducing the concept of the PCC LSP-DB.

---

In section 3 you say

   The structure and format of the LSP-DB MUST be common among all dataplane types (i.e., RSVP-TE/SR-TE/SRv6), all instantiation methods (i.e., PCC-initiated/PCE-initiated), all destination types (i.e., point-to-point/point-to-multipoint).

This should be self-evidently wrong. The LSP-DB is internal to the PCE implementation, so while it is true that the PCE needs to be able to derive certain information from the LSP-DB, how it stores that information is completely its own business.

Now, you may want an abstract representation in order to define your state machines, and that is fine, but please don't tell implementations how to implement.

---

In 3.2 you have

   The PCC adds/removes entries to/from its LSP-DB based on what LSPs it creates/destroys in the network. There can be many trigger types for updating the PCC LSP-DB, some examples include PCUpd messages, local computation on the PCC, local configuration on the PCC, etc. The trigger type does not affect the content of the PCC LSP-DB, i.e., the content of the PCC LSP-DB is updated identically regardless of the trigger type.

But surely a PCUpd message does not immediately affect the state of the LSP in the network. Depending on the signaling process and the processing at the PCC, that may take some time. So, you need to be careful when you include PCUpd as a trigger and then say that all trigger types are to be treated equally.
Later (in 3.2) you say:

   The LSP-DB on both the PCC and the PCE only stores the actual state in the network, it does not store the desired state.

Which seems to say that PCUpd is not a trigger.

In fact, the PCE and PCC need to store both the "currently believed current state" and the "state that is being attempted". How this state is stored is a very moot point because the LSP-DB is a logical concept. [In previous work] it expresses the information stored by the PCE about the active LSPs, but there is no such limit placed on what information about the active LSPs may be stored. Why not store the shoe size of the network operator's mother, if the implementation chooses to do so?

I suspect that all PCCs have always stored the current/desired states (plural) because that is how head end devices work. That is why there was never any mention of PCC LSP-DBs in the previous literature: they were implicit in "this is what you do when you build a router." The LSP-DB was only introduced for the stateful work because of the (new) desire to know what the PCC already knew and to synchronise PCC-PCE and PCE-PCE.

It seems to me that you are trying to describe how to implement a PCE and a PCC: something that may be a fun task, but which surely belongs outside the IETF.

---

3.2

   Whenever a PCC modifies an entry in its PCC LSP-DB, it MUST send a PCRpt message to the PCE (or multiple PCEs), to synchronize this change.

This implies that this update must be sent "at once". Why? The network may have taken some time to reach this state - why, then, must the update be instant? Indeed, you may want to smooth flaps in the PCC, and also avoid swamping the PCE.

---

3.2

   Ensuring this synchronization is always in place allows one to define behavior as a function of LSP-DB state, instead of defining behavior as a function of what PCEP messages were sent or received.

Cough? The message sets the state (you have just said), so the state is exactly a function of what messages were sent or received.
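
To make the "believed current" versus "being attempted" distinction concrete, here is a minimal Python sketch. Every name in it (LspEntry, Pcc, on_pcupd, on_signalling_done) is hypothetical and comes from neither the draft nor any RFC, and the list of tuples merely stands in for PCRpt messages. It shows one possible ordering under those assumptions, not how a PCC must be built.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LspEntry:
    """Logical (not prescriptive) view of one LSP as held by a PCC."""
    name: str
    actual_path: Optional[List[str]] = None     # believed to be in the network now
    attempted_path: Optional[List[str]] = None  # being set up, if anything


@dataclass
class Pcc:
    lsps: Dict[str, LspEntry] = field(default_factory=dict)
    # Stand-in for PCRpt messages sent: (LSP name, reported actual path).
    reports: List[Tuple[str, Optional[List[str]]]] = field(default_factory=list)

    def on_pcupd(self, name: str, path: List[str]) -> None:
        # A PCUpd only records intent; the network has not changed yet,
        # so the believed-current state is left alone.
        entry = self.lsps.setdefault(name, LspEntry(name))
        entry.attempted_path = list(path)

    def on_signalling_done(self, name: str, success: bool) -> None:
        # The believed-current state changes only once signalling succeeds.
        # The outcome is reported either way in this sketch.
        entry = self.lsps[name]
        if success:
            entry.actual_path = entry.attempted_path
        entry.attempted_path = None
        self.reports.append((name, entry.actual_path))
```

In this sketch the same PCUpd produces no change to the "actual" state until the signalling result arrives, which is why treating PCUpd as just another trigger that updates the database "identically" needs care.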
---

3.2

   When the PCC acts on message, it would update its own PCC LSP DB and immediately send PCRpt to the PCE to synchronize the change.

As before:
- why immediately?
- and why not wait until the change is present in the network?

---

Section 1 has

   The current document serves to optimize the original procedure in [RFC8231] to drop the PCReq and PCReply exchange, which greatly simplifies implementation and optimizes the protocol.

But 3.3 says

   In this document, we would like to make it clear that sending PCReq is optional.

So, I think Section 1 should s/drop/optionally drop/

---

3.3 has

   In reality, stateless bringup introduces overhead and is not possible to enforce from the PCE, because the stateless PCE is not supposed to keep any per-LSP state about previous PCReq messages.

I think "not supposed to" is a bit strong. What stops a PCE keeping track of everything? It is an implementation and it can do what it wants. Indeed, the concept of "sticky resources" was introduced a long time ago (and so described rather late in the day in RFC 7399) to help with the need to understand which network resources might be in the process of being provisioned but have not been updated in the TED yet.

Perhaps s/not supposed/not required/

---

3.3

   It was found that many vendors choose to ignore this requirement and send the PCRpt directly, without going through PCReq.

This is the main point, and it is good to flag it up. But, be clear, this is a violation of 8231 (which uses a "MUST NOT" to cover this case). So this document is not "clarifying", it is "updating", and you need to make this very clear in the document.

---

3.3

   Therefore, PCReq messages are useful for many OAM ping/traceroute applications where the PCC wishes to probe the network without having any effect on the existing LSPs.

I don't think the PCC "probes the network" in this case. I think the PCC asks questions to the PCE.
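
As a rough illustration of "asking a question" rather than probing, the following Python sketch treats a path request as a pure query against a toy TED. The names (Ted, compute_path, handle_pcreq) and the dictionary-based TED are assumptions made for this example only; real PCEP message handling and constraint processing are omitted entirely.

```python
import heapq
from typing import Dict, List, Optional

# Toy TED, assumed for illustration: node -> {neighbour: link cost}.
Ted = Dict[str, Dict[str, int]]


def compute_path(ted: Ted, src: str, dst: str) -> Optional[List[str]]:
    """Plain Dijkstra over the toy TED; returns None if dst is unreachable."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr, link_cost in ted.get(node, {}).items():
            if nbr not in seen:
                heapq.heappush(queue, (cost + link_cost, nbr, path + [nbr]))
    return None


def handle_pcreq(ted: Ted, lsp_db: dict, src: str, dst: str) -> Optional[List[str]]:
    # lsp_db is accepted but deliberately never written: the request is a
    # question, and answering it leaves the stored LSP state untouched.
    return compute_path(ted, src, dst)
```

Nothing in the network, and nothing in the stored LSP state, is touched by the exchange; the PCC simply receives an answer.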
---

3.3

   The PCC MAY delegate an empty LSP to the PCE and then wait for the PCE to send PCUpd, without sending PCReq. We shall refer to this process as "stateful bringup". The PCE MUST support the original stateless bringup, for backward compatibility purposes. Supporting stateful bringup should not require introducing any new behavior on the PCE, because as mentioned earlier, the PCE MUST NOT modify LSP-DB state based on PCReq messages. So whether the PCE has received a PCReq or not, it MUST process the PCRpt all the same.

OK. This is your description of new behaviour. You need to do two things:
1. In this section, turn this into an abstract description of the function.
2. Move the new BCP 14 description into a new section and flag it as "new behaviour that updates 8231".

Missing from this description is how the PCE decides to:
- do a path computation
- set this LSP's oper status to up
- send a PCUpd

I *think* that you are using the "empty" ERO as the trigger. But I would note that, per RFC 3209, the ERO is optional. I don't know that optional and present-but-empty are easy to distinguish. Perhaps that doesn't matter because after delegation the PCE will determine what path it wants and will supply an ERO.

I would question why the PCC doesn't send "oper up" with a null ERO to be the trigger. This would be more consistent with the PCC's desire to have the LSP transition to active.

---

In 3.4 and 3.5 I am confused about delegation since you don't show the D-flag. This is crucial because if LSP 2 has been delegated, it is not the PCC's job to perform MBB, but if LSP 2 has not been delegated, why would it be reported to the PCE?

---

I'm unclear whether 4.1 etc. is defining any new behaviour (notwithstanding the "new" logical ASSO DB). Isn't it the case that when the state of an already-delegated LSP changes, the PCC must update the PCE? And surely the association is part of the state.

You are using BCP 14 language in an example, which is unhelpful.
If you are clarifying existing procedures then:
- you must reference them
- you must not use BCP 14 language

If you are defining new procedures or fixing existing procedures then:
- you must reference them
- you must update the relevant RFCs
- you must make it clear:
  - what the abstract behaviour is
  - what the new/changed procedures are

---

I think 4.2 is just an example for clarification (since everything works as I would expect it to). That's fine, but given the confusion in the document about what is new/changed and what is clarification, it would be helpful to scope this section as "an example for clarification, and not normative."

---

Section 5 is a big change to RFC 8697 section 6.3.1 that has:

   When an LSP is first reported to the PCE, the PCRpt message MUST include all the association groups that it belongs to. Any subsequent PCRpt message SHOULD include only the associations that are being modified or removed.

Thus, you are saying that an existing implementation of 8697 will accidentally remove an LSP from all of its associations if it doesn't list them in a PCRpt.

So...
- You need to reference 8697
- You need to explicitly update 8697
- You need to use BCP 14 language

...and this leads to the big one for the whole document...

---

...you need to describe how to interoperate with "legacy" implementations that adhere to the current specifications.

---

Section 6

Again, I'm not clear when you are restating and when you are changing specified behaviour. But...

   A PCE SHOULD interpret the RRO/SR-RRO/SRv6-RRO as the actual path the LSP is taking but MAY interpret only the ERO/SR-ERO/SRv6-ERO as the actual path.

No. If the RRO is present, it would be wrong to ever interpret the ERO as the actual path taken. That would lead to the LSP-DB being incorrect with the result that subsequent computations would potentially be in error. In fact, the LSP-DB needs to hold both the intended and actual paths to allow for more complex cases.
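
A minimal Python sketch of what I mean, with hypothetical names (ReportedLsp, actual_path, diverged) and hops reduced to plain strings. It assumes strict hops only, so that intended and recorded paths can be compared directly, which is a simplification for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReportedLsp:
    ero: Optional[List[str]] = None  # intended path, as carried in the ERO
    rro: Optional[List[str]] = None  # recorded path, as carried in the RRO


def actual_path(lsp: ReportedLsp) -> Optional[List[str]]:
    # A present RRO always wins; the ERO is used only as a fallback when
    # no RRO was reported. Both are kept so they can be compared later.
    return lsp.rro if lsp.rro is not None else lsp.ero


def diverged(lsp: ReportedLsp) -> bool:
    # True when both paths are known and the LSP is off its intended path.
    return lsp.rro is not None and lsp.ero is not None and lsp.rro != lsp.ero
```

Keeping both fields is what lets the PCE notice the cases where intended and actual differ, instead of silently recording the wrong one.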
You can either reduce this text to only cover SR, or you fix it for all paths.

Furthermore...

   In the absence of an RRO/SR-RRO/SRv6-RRO a PCE SHOULD interpret the ERO/SR-ERO/SRv6-ERO (respectively) as the actual path for the LSP.

Why "SHOULD" and not "MUST"? What other recourse does the PCE have? Should it update the LSP-DB to say "I've no idea, but there is an LSP somewhere in the network"?

---

-----Original Message-----
From: I-D-Announce <i-d-announce-boun...@ietf.org> On Behalf Of internet-dra...@ietf.org
Sent: 19 February 2022 19:22
To: i-d-annou...@ietf.org
Subject: I-D Action: draft-koldychev-pce-operational-05.txt

A New Internet-Draft is available from the on-line Internet-Drafts directories.

   Title    : PCEP Operational Clarification
   Authors  : Mike Koldychev
              Siva Sivabalan
              Shuping Peng
              Diego Achaval
              Hari Kotni
   Filename : draft-koldychev-pce-operational-05.txt
   Pages    : 15
   Date     : 2022-02-19

Abstract:
   This document proposes some important simplifications to the original PCEP protocol and also serves to clarify certain aspects of PCEP operation. The content of this document has been compiled based on the feedback from several multi-vendor interop exercises. Several constructs are introduced, such as the LSP-DB and the ASSO-DB.

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-koldychev-pce-operational/

There is also an htmlized version available at:
https://datatracker.ietf.org/doc/html/draft-koldychev-pce-operational-05

A diff from the previous version is available at:
https://www.ietf.org/rfcdiff?url2=draft-koldychev-pce-operational-05

Internet-Drafts are also available by rsync at rsync.ietf.org::internet-drafts