Hi Fung,

Thanks again for your review of this draft. We've posted a new revision which incorporated all your comments and suggestions on the security considerations:
https://datatracker.ietf.org/doc/html/draft-ietf-spring-resource-aware-segments-18

Please also find replies to some of the comments inline, and let us know if these comments have been addressed.

Best regards,
Jie (on behalf of coauthors)

-----Original Message-----
From: Fung Lim via Datatracker <[email protected]>
Sent: Sunday, May 10, 2026 12:31 PM
To: [email protected]
Cc: [email protected]; [email protected]; [email protected]
Subject: draft-ietf-spring-resource-aware-segments-17 ietf last call Opsdir review

Document: draft-ietf-spring-resource-aware-segments
Title: Introducing Resource Awareness to SR Segments
Reviewer: Fung Lim
Review result: Has Issues

1. Operational Considerations section needs expansion

No guidance on resource group sizing or planning. The document acknowledges scalability concerns (Section 2.1: "there can be scalability concerns when the number of resource groups is large") but the Operational Considerations section provides no guidance on practical limits or planning heuristics. Providing guidance would better align intended use cases.

[Jie] Some guidance on resource group sizing and planning has been provided in the operational considerations section.

The document should describe what happens when resource allocation fails or is inconsistent. Section 3 states resource group support "MUST be aligned among the network nodes," but what happens operationally when it is not? How does an operator detect misalignment? What are the failure symptoms, or failure modes?

[Jie] Some text about resource allocation failure has been added to the control plane considerations, and text about inconsistency in the binding has been added to the operational considerations section.

There is also no discussion of resource over-subscription. What happens when traffic exceeds the allocated resources for a resource-aware SID? Is traffic dropped, downgraded, or does it spill into other resource groups? This is a critical operational question left entirely unaddressed and guidance is necessary for consistent implementation.

[Jie] The behavior for over-subscription on a "virtual topology" built with a resource group is no different from over-subscription in existing networks. That said, some text about the operator's policy on the treatment of traffic exceeding the allocated resources has been added.

2. Missing configuration management guidance

The document describes two provisioning approaches (local configuration vs. centralized controller) but provides no guidance on:

- What configuration parameters exist and what are their defaults?

[Jie] Some text has been added to the control plane considerations section about the resource-group and resource-aware SID provisioning.

- Are there any considerations for validation before activation, or after device reboots?
- What state must be preserved across device reboots?
- How to handle configuration rollback if resource allocation partially fails across a multi-node resource group?

[Jie] IMO no consideration is needed for activation or device reboot. Text on rollback in case of partial resource allocation failure has been added to the control plane considerations.

3. No fault management or troubleshooting guidance

The document introduces several new failure modes but does not discuss how operators would detect or diagnose them:

- A node fails to allocate resources to a resource-aware SID
- Resource group alignment becomes inconsistent across nodes
- Resource exhaustion on a subset of links in a resource group
- SLA violations caused by insufficient resource allocation

[Jie] As mentioned above, text related to resource allocation failure and inconsistent binding has been added to this version. Resource exhaustion and SLA violation are discussed in the security considerations.

Nits

4. Section 3 mentions that "in network cases with SR and other TE mechanisms (such as RSVP-TE) co-existing," IGP advertisements "may need to be updated" and "it is suggested such updates would be rate-limited." It lacks specifics: what rate limiting is appropriate?
What are the consequences of not rate-limiting?

[Jie] Some references to control plane rate-limiting and suppression mechanisms have been added.

5. No discussion of management interoperability

The document references NETCONF/YANG and draft-ietf-spring-sr-policy-yang as controller interfaces, but does not discuss:

- Whether a YANG Data Model for resource-aware segments is needed
- How resource group state would be exposed through management interfaces
- Whether existing SR YANG models are sufficient or need extension

[Jie] Some more text about YANG model augmentation and a reference to a related YANG model have been added.

6. PHP recommendation needs operational justification

Section 2.1 states: "it is RECOMMENDED that Penultimate Hop Popping (PHP) be disabled." Disabling PHP is a significant operational change for many SR-MPLS deployments. The document should discuss:

- The operational impact of disabling PHP (e.g., increased label stack processing on egress)
- How to verify that PHP is correctly disabled across relevant paths
- What happens if PHP is not disabled? Is there a graceful degradation or a hard failure?

[Jie] The mechanism and benefits/costs of PHP are adequately covered in multiple RFCs and will not be repeated in this draft. We added some text about the cost of not disabling PHP in the case specified in this draft.

7. Typos in Operational Considerations section

"Althougth" → "Although" (line 578)
"paradigmn" → "paradigm" (line 578)

[Jie] These typos have been fixed, thanks.

8. Single-vendor implementation status

Section 5 lists only Huawei implementations. While this is noted as per SPRING WG policies, from an operational perspective, single-vendor implementation raises interoperability concerns for a Standards Track document.

[Jie] As the implementation status section says: "This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist."
AFAIK there could be implementations which have not yet been reported to the WG.

I hope these comments are useful and constructive! The core mechanism is well-conceived; strengthening the operational guidance will improve its deployability.

[Jie] Yes, these comments are very helpful. Thanks again for the review and suggestions.

Best regards,
Jie

Fung Lim
