Mohamed Boucadair has entered the following ballot position for draft-ietf-lsr-igp-ureach-prefix-announce-09: Discuss
When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-lsr-igp-ureach-prefix-announce/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Hi Peter, Clarence, Daniel, Shraddha, and Gyan,

Thank you for the effort put into this specification.

I like the approach adopted here as it leverages existing features (which is good for backward compatibility) and thus eases incremental deployment.

# Meta comment

The flow of the document can be made better by introducing the explicit signaling part early in the document, as that is what actually demuxes the use defined in the draft vs. any other future uses of the specific metric. The current flow reveals that the signaling part was added as an afterthought, not as part of the design.

Please find below some DISCUSS points:

# Planned Maintenance

I expect that adequate reconfiguration will be put in place to isolate a node for PM purposes. The document does not explain why the signal defined here is needed or what issue it solves. I appreciate that the introduction includes a short mention of the use of the overload bit, though.

I’m also asking the question because there is no ** normative ** discussion (at least I missed it) about how that can be triggered at the originating ABR.

Also, unlike a random failure, a planned failure is associated with expected start and end times. Signaling the event without those characteristics calls into question the value of tagging a loss as an NP.

# Configuration Efficiency

Section 2

   Implementations MAY limit the UPA generation to specific prefixes, e.g. host prefixes, SRv6 locators, or similar. Such filtering is optional and MAY be controlled via configuration.

I’m afraid that for the limit to be useful, it should be tweaked based on local operator policy and not on some magic that is internal to the implementation. I suggest changing the second MAY to SHOULD when such a limit is supported.

The same applies to the following:

   Implementation MAY provide a configuration option to specify the UPA lifetime at the originating ABR or ASBR.

I suggest we update it to:

   UPA implementations SHOULD provide a configuration option to specify the UPA
   ^^^^^^^^^^^^^^^^^^^
   lifetime at the originating ABR or ASBR.

The reasoning for this change is that a failure may last long and an operator may want to maintain that loss state longer (than allowed by a default limit) to accommodate the specific application consuming the loss signal.

Please note that you say that the validity depends on the use case:

CURRENT (S.2):
   The time the UPA is kept in the network SHOULD also reflect the intended use-case for which the UPA was advertised.

# Withdrawal

Section 2 says:

   UPA advertisements SHOULD therefore be withdrawn after some amount of time, that would provides sufficient time for UPA to be flooded network-wide and acted upon by receiving nodes, but limits the presence of UPA in the network. The time the UPA is kept in the network SHOULD also reflect the intended use-case for which the UPA was advertised.

   …

   ABR or ASBR MUST withdraw the previously advertised UPA when the reason for which the UPA was generated ceases - e.g. prefix reachability was restored or its metric has changed such that it is below the configured threshold value.

The second MUST is not useful if that reason disappears before the timeout that triggers the SHOULD. I think a simple text reorder would fix this.
For example,

NEW:
   ABR or ASBR MUST withdraw the previously advertised UPA when the reason for which the UPA was generated ceases - e.g. prefix reachability was restored or its metric has changed such that it is below the configured threshold value.

   Even if the reasons persist, UPA advertisements SHOULD be withdrawn after some amount of time, that would provides sufficient time for UPA to be flooded network-wide and acted upon by receiving nodes, but limits the presence of UPA in the network. The time the UPA is kept in the network SHOULD also reflect the intended use-case for which the UPA was advertised.

# Scalability control

Section 2 says:

   It is also RECOMMENDED that implementations limit the number of UPA advertisements which can be originated at a given time. The benefits gained by summarization may be nullified if a large number of UPAs are injected.

This recommendation is really great, but there is a need to expose a configuration parameter here. Please add:

NEW:
   UPA implementations SHOULD provide a configuration option to limit the number of such UPAs.

# Backward compatibility Check

CURRENT (3.2/4.2):
   Such requirement of reachability MUST NOT be applied for UPAs, as they are propagating unreachability.

Does this mean that an ABR that doesn’t support UPA might discard it?

# Avoid redefining existing behaviors

CURRENT (Section 4):
   In addition, NU-bit is defined for OSPFv3 [RFC5340]. Prefixes having the NU-bit set in their PrefixOptions field SHOULD NOT be included in the routing calculation.

The SHOULD NOT part is already defined in RFC 5340. Why is this redefined again?

# Check on “UP-Flag” is useless

5.1.1

   The prefix that is advertised with U-Flag or UP-Flag MUST have the metric set to a value larger than 0xFE000000. If the prefix metric is less than or equal 0xFE000000, both of these flags MUST be ignored.

5.2.2

   The prefix that is advertised with U-Flag or UP-Flag MUST have the NU-bit set in the PrefixOptions of the parent TLV.

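To make the point concrete, here is how I read the 5.1.1 text quoted above, as a receiver-side check. This is only a sketch of my reading, not text or code from the draft: the function name, the constant name, and the representation of the flags as booleans are all hypothetical, and the only thing taken from the draft is the 0xFE000000 bound and the "both flags ignored" rule.

```python
from typing import Tuple

# Bound quoted from Section 5.1.1 of the draft; the constant name is hypothetical.
UPA_METRIC_BOUND = 0xFE000000


def honored_flags(metric: int, u_flag: bool, up_flag: bool) -> Tuple[bool, bool]:
    """Return the (U-Flag, UP-Flag) pair a receiver would honor.

    Hypothetical helper modeling only the quoted 5.1.1 rule: when the
    prefix metric is less than or equal to 0xFE000000, both flags are
    ignored; otherwise they are passed through unchanged.
    """
    if metric <= UPA_METRIC_BOUND:
        return (False, False)
    return (u_flag, up_flag)
```

With this reading, a prefix advertised with metric 0xFE000000 and both flags set is treated as if neither flag were present, while the same flags with metric 0xFE000001 are honored.
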
The “or UP-Flag” part of both MUSTs is useless, as the U-Flag will be set in such a case as well. The case where only the UP-Flag is set is invalid and will be ignored. Unless I missed a subtle thing here, please update these two.

# Service stability

The document declares the applications that consume the signal out of scope, which is fine. However, there might be some negative implications due to how the loss signal (or its withdrawal/absence) is interpreted. For example, add a warning to insist that withdrawal does not mean reverting to a nominal state. A few sentences asking these services to take appropriate measures before reverting would be helpful.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

# Topology-dependent matters

Abstract:

   This enables fast convergence by steering traffic away from the node which owns the prefix and is no longer reachable.

The “steering..” part of the text is only possible when there are alternate routes for that prefix. Otherwise, the traffic will be dropped anyway. Please reword. For example, saying “when applicable” or similar would be just fine.

# Why now?

Summarization has been there for decades. A mention of what exacerbates the need for this new feature now (and why it was not considered a major issue in the past) would be helpful.

# "Egress" depends on the traffic directionality

Section 1:

   Similarly, when an egress router needs to be taken out of service for maintenance, the traffic is drained from the node before taking it down.

A router may behave as ingress/egress as a function of traffic direction. I would delete “egress” here.

# (nit) There might be many ABRs per area, not a single one

Please make this change in Section 1 (and similar):

OLD:
   When prefixes from such node are summarized by the Area Border Router (ABR) or Autonomous System Boundary Router (ASBR),

NEW:
   When prefixes from such node are summarized by an Area Border Router (ABR) or Autonomous System Boundary Router (ASBR),

# (Operational Considerations) Many signals, distinct expected uses

Section 1 has the following:

   This document does not define how to advertise prefix that is not reachable for routing. That has been defined for IS-IS in [RFC5305] and [RFC5308], for OSPF in [RFC2328], and for OSPFv3 in [RFC5340].

I wonder whether we can list the signals available out there (explicit prefer advertisement, prefix with a specific metric, etc.) and recall the intended scope of use for each of them. This can be recorded in the Operational Considerations.

I’m raising this point especially because the text right after says the following without appropriate scoping:

CURRENT:
   This document defines two new flags in IS-IS, OSPF, and OSPFv3. These
   flags, together with the existing protocol mechanisms, provide the
   support for advertising prefix unreachability , together with the
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   reason for which the unreachability is advertised

# Unreachable metric?!

Section 1 has the following:

   This document defines a method to signal a specific reason for which the prefix with unreachable metric was advertised.

Unless I’m mistaken, there is no such thing as an “unreachable metric” in IS-IS.

# (Operational Considerations) Activation default

Section 2 has the following:

   UPA MAY be generated by the ABR or ASBR for a prefix that is summarized by the summary address originated by the ABR or ASBR in the following cases:

Can we say something about whether the use of UPA should be enabled or disabled by default?

# Threshold

Section 2 says:

   - the metric to reach the prefix from the ABR or ASBR crosses the configured threshold.
Can we make explicit which threshold we are referring to here, and add a reference to where it is defined?

# Explicit references in Section 3

OLD:
   [RFC5305] defines the encoding for advertising IPv4 prefixes using 4 octets of metric information. Section 4 specifies:
   ..
   Similarly, [RFC5308] defines the encoding for advertising IPv6 prefixes using 4 octets of metric information. Section 2 states:

NEW:
   [RFC5305] defines the encoding for advertising IPv4 prefixes using 4 octets of metric information. Section 4 of [RFC5305] specifies:
   ..
   Similarly, [RFC5308] defines the encoding for advertising IPv6 prefixes using 4 octets of metric information. Section 2 of [RFC5308] states:

# “Advertisement of UPA in IS-IS”

CURRENT (S3.1):
   Recognition of the advertisement as UPA is only required on routers which have a valid use case for this information.

I would delete this sentence, as this section is about the originator.

# Section 6

Consider moving that section right after current Section 7 as a subsection of an Operational Considerations section.

## Multiple ABRs

That section may also discuss whether there are specific considerations to take into account, e.g., the presence of multiple ABRs that announce UPAs for a set of prefixes in an area, and measures to prevent routing instability. If you don’t see any risk out there, saying that as well would be useful.

## Consider adding any implications (or absence thereof) of explicit withdrawal of a UPA.

Cheers,
Med

_______________________________________________
Lsr mailing list -- [email protected]
To unsubscribe send an email to [email protected]
