Document: draft-ietf-spring-resource-aware-segments
Title: Introducing Resource Awareness to SR Segments
Reviewer: Fung Lim
Review result: Has Issues

Hi,

I have been selected as the Operational Directorate (opsdir) reviewer for this
Internet-Draft.

The Operational Directorate reviews all operational and management-related
Internet-Drafts to ensure alignment with operational best practices and that
adequate operational considerations are covered.

A complete set of _"Guidelines for Considering Operations and Management in
IETF Specifications"_ can be found at
https://datatracker.ietf.org/doc/draft-ietf-opsawg-rfc5706bis/.

While these comments are primarily for the Operations and Management Area
Directors (Ops ADs), the authors should consider them alongside other feedback
received.

- Document: draft-ietf-spring-resource-aware-segments-17
- Reviewer: Fung Lim
- Intended Status: Standards Track

---

## Summary

- Has Issues: I have some minor concerns about this document that I think
should be resolved before publication.

---

This document is well-scoped and defines a mechanism to associate network
resources with Segment Routing SIDs for both SR-MPLS and SRv6. The core concept
addresses a real operational need for resource isolation beyond DiffServ.
However, as a Standards Track document introducing new operational complexity
to SR networks, the Operational Considerations section (Section 6) is too thin.
Several important operational and manageability topics are absent or
insufficiently addressed.

1. Operational Considerations section needs expansion

The section gives no guidance on resource group sizing or planning. The
document acknowledges scalability concerns (Section 2.1: "there can be
scalability concerns when the number of resource groups is large"), but the
Operational Considerations section provides no guidance on practical limits or
planning heuristics. Such guidance would help align deployments with the
intended use cases.

The document should describe what happens when resource allocation fails or is
inconsistent. Section 3 states resource group support "MUST be aligned among
the network nodes," but what happens operationally when it is not? How does an
operator detect misalignment? What are the failure symptoms or failure modes?

There is also no discussion of resource over-subscription. What happens when
traffic exceeds the allocated resources for a resource-aware SID? Is traffic
dropped, downgraded, or does it spill into other resource groups? This is a
critical operational question that is left entirely unaddressed, and guidance
is necessary for consistent implementation.

2. Missing configuration management guidance

The document describes two provisioning approaches (local configuration vs.
centralized controller) but provides no guidance on:

- What configuration parameters exist, and what are their defaults?
- Are there any considerations for validation before activation or after
device reboots?
- What state must be preserved across device reboots?
- How should configuration rollback be handled if resource allocation
partially fails across a multi-node resource group?

3. No fault management or troubleshooting guidance

The document introduces several new failure modes but does not discuss how
operators would detect or diagnose them:

- A node fails to allocate resources to a resource-aware SID
- Resource group alignment becomes inconsistent across nodes
- Resource exhaustion on a subset of links in a resource group
- SLA violations caused by insufficient resource allocation
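
For illustration only, the second failure mode above could be detected by a
controller-side audit along the following lines. This is a minimal sketch and
not something the draft defines: the function name, the input layout (node name
mapped to the set of resource group IDs that node reports as provisioned), and
the assumption that every listed node is expected to participate in every group
are all hypothetical.

```python
from typing import Dict, Set


def find_misaligned_groups(reported: Dict[str, Set[int]]) -> Dict[int, Set[str]]:
    """Return, per resource group ID, the nodes that do not report it.

    `reported` is a hypothetical input: node name -> resource group IDs the
    node says it has provisioned. Assumption (not from the draft): every node
    in `reported` is expected to support every group seen on any node.
    """
    # Union of all group IDs reported anywhere in the network.
    all_groups: Set[int] = set().union(*reported.values())

    missing: Dict[int, Set[str]] = {}
    for group in sorted(all_groups):
        # Nodes whose reported state lacks this group.
        absent = {node for node, groups in reported.items() if group not in groups}
        if absent:
            missing[group] = absent
    return missing


if __name__ == "__main__":
    state = {"PE1": {1, 2}, "P1": {1, 2}, "PE2": {1}}
    # Group 2 is provisioned on PE1 and P1 but not on PE2.
    print(find_misaligned_groups(state))
```

Guidance in the document on where such state can be collected from, and on what
an operator should do with the result, would address this point.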

Nits

4. Section 3 mentions that "in network cases with SR and other TE mechanisms
(such as RSVP-TE) co-existing," IGP advertisements "may need to be updated" and
"it is suggested such updates would be rate-limited." It lacks specifics: what
rate limiting is appropriate? What are the consequences of not rate-limiting?
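
As one illustration of what "rate-limited" could mean in practice, a
minimum-interval damper suppresses a re-advertisement unless a hold-down time
has elapsed since the previous one. The sketch below is hypothetical: the class
name, the method name, and the 5-second interval are assumptions for the
example, not values taken from the draft or from any IGP specification.

```python
from typing import Optional


class AdvertisementDamper:
    """Allow at most one advertisement per `min_interval_s` seconds."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self._last_sent: Optional[float] = None  # time of last allowed update

    def should_advertise(self, now_s: float) -> bool:
        """Return True and record the time if an update may be sent now."""
        if self._last_sent is None or now_s - self._last_sent >= self.min_interval_s:
            self._last_sent = now_s
            return True
        # Still inside the hold-down window: suppress this update.
        return False


if __name__ == "__main__":
    damper = AdvertisementDamper(min_interval_s=5.0)  # hypothetical interval
    for t in (0.0, 2.0, 5.0, 9.9):
        print(t, damper.should_advertise(t))
```

Even a simple scheme like this raises the questions above: what interval is
reasonable, and how stale the advertised resource information may become while
updates are suppressed.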

5. No discussion of management interoperability

The document references NETCONF/YANG and draft-ietf-spring-sr-policy-yang as
controller interfaces, but does not discuss:

- Whether a YANG Data Model for resource-aware segments is needed
- How resource group state would be exposed through management interfaces
- Whether existing SR YANG models are sufficient or need extension

6. PHP recommendation needs operational justification

Section 2.1 states: "it is RECOMMENDED that Penultimate Hop Popping (PHP) be
disabled." Disabling PHP is a significant operational change for many SR-MPLS
deployments. The document should discuss:

- The operational impact of disabling PHP (e.g., increased label stack
processing on egress)
- How to verify that PHP is correctly disabled across relevant paths
- What happens if PHP is not disabled: is there a graceful degradation or a
hard failure?

7. Typos in Operational Considerations section

"Althougth" → "Although" (line 578)
"paradigmn" → "paradigm" (line 578)

8. Single-vendor implementation status

Section 5 lists only Huawei implementations. While this is noted as per SPRING
WG policies, from an operational perspective, a single-vendor implementation
raises interoperability concerns for a Standards Track document.

I hope these comments are useful and constructive! The core mechanism is
well-conceived; strengthening the operational guidance will improve its
deployability.

Fung Lim


