Hi Med,

thanks for the comments, please see inline (##PP):

On 21/09/2025 12:21, Mohamed Boucadair via Datatracker wrote:
Mohamed Boucadair has entered the following ballot position for
draft-ietf-lsr-igp-ureach-prefix-announce-09: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-lsr-igp-ureach-prefix-announce/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Hi Peter, Clarence, Daniel, Shraddha, and Gyan,

Thank you for the effort put into this specification.

I like the approach adopted here as it leverages existing features (which is
good for backward compatibility) and thus eases incremental deployment.

# Meta comment

The flow of the document can be made better by introducing the explicit
signaling part early in the document, as that is what actually demuxes the use
defined in the draft vs. any other future uses of the specific metric. The
current flow suggests that the signaling part was added as an afterthought, not
as part of the design.

##PP
the document has been re-ordered several times based on comments received during various stages of the review process. I'm a bit reluctant to change it again, as different people have different priorities, and it's not possible to make everyone happy. In the end, what we are specifying is rather simple, so the ordering does not play a significant role IMHO.



Please find below some DISCUSS points:

# Planned Maintenance

I expect that adequate reconfiguration will be put in place to isolate a node
for PM purposes. The document does not explain how the signal defined here is
needed or solves an issue. I appreciate that the introduction includes a short
mention of the use of overload bit, though.

I’m also asking the question because there is no ** normative ** discussion (at
least I missed it) about how that can be triggered at the originating ABR.

Also, unlike a random failure, a planned failure is associated with an expected
start and end times. Signaling the event without those characteristics questions
the value of tagging a loss as an NP.

##PP
I'm not sure what is being asked. This document does not specify a PM procedure of any kind. It specifies the UPA signaling, including the bit that signals that the UPA was originated on the ABR/ASBR as a result of PM in its source area/domain.

UPA helps in propagating the information about the connectivity loss caused by the beginning of the PM outside of the area/domain. That's all.



# Configuration Efficiency

Section 2
    Implementations MAY limit the UPA generation to specific prefixes,
    e.g.  host prefixes, SRv6 locators, or similar.  Such filtering is
    optional and MAY be controlled via configuration.

I’m afraid that for the limit to be useful, it should be tweaked based on local
operator policy and not on some magic that is internal to the implementation. I
suggest the second MAY to be changed to SHOULD when such limit is supported.
##PP
done

Idem for the following:
    Implementation MAY provide a configuration option to specify the UPA
    lifetime at the originating ABR or ASBR.

I suggest we update it to:

    UPA implementations SHOULD provide a configuration option to specify the UPA
    ^^^^^^^^^^^^^^^^^^^
    lifetime at the originating ABR or ASBR.
##PP
done

The reasoning for this change is that a failure may last long and that an
operator may want to maintain that loss state longer (than allowed by a
default limit) to accommodate the specific application consuming the loss
signal.

Please note that you say that the validity depends on the use case:

CURRENT (S.2):
    The
    time the UPA is kept in the network SHOULD also reflect the intended
    use-case for which the UPA was advertised.

# Withdrawal

Section 2 says:

    UPA advertisements SHOULD
    therefore be withdrawn after some amount of time, that would provides
    sufficient time for UPA to be flooded network-wide and acted upon by
    receiving nodes, but limits the presence of UPA in the network.  The
    time the UPA is kept in the network SHOULD also reflect the intended
    use-case for which the UPA was advertised.
    …
    ABR or ASBR MUST withdraw the previously advertised UPA when the
    reason for which the UPA was generated ceases - e.g. prefix
    reachability was restored or its metric has changed such that it is
    below the configured threshold value.

The second MUST is not useful if that reason disappears before the timeout that
triggers the SHOULD.

I think a simple text reorder would fix this. For example,

NEW:
    ABR or ASBR MUST withdraw the previously advertised UPA when the
    reason for which the UPA was generated ceases - e.g. prefix
    reachability was restored or its metric has changed such that it is
    below the configured threshold value.

    Even if the reasons persist, UPA advertisements SHOULD
    be withdrawn after some amount of time, that would provides
    sufficient time for UPA to be flooded network-wide and acted upon by
    receiving nodes, but limits the presence of UPA in the network.  The
    time the UPA is kept in the network SHOULD also reflect the intended
    use-case for which the UPA was advertised.

##PP
done
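
A minimal sketch of the reordered withdrawal rule, written from the point of view of the originating ABR/ASBR. The names `UpaState`, `should_withdraw`, `advertised_at` and `lifetime` are hypothetical and only serve the illustration; the draft does not define any data model or timer values.

```python
from dataclasses import dataclass


@dataclass
class UpaState:
    """Hypothetical per-prefix state kept by the originating ABR/ASBR."""
    advertised_at: float  # time the UPA was originated, in seconds
    lifetime: float       # operator-configured UPA lifetime, in seconds


def should_withdraw(state: UpaState, now: float, reason_persists: bool) -> bool:
    """Return True when the previously advertised UPA is to be withdrawn."""
    # MUST: the reason for which the UPA was generated has ceased
    # (reachability restored, or metric back below the threshold).
    if not reason_persists:
        return True
    # SHOULD: even if the reason persists, withdraw once the configured
    # lifetime has elapsed, to limit the presence of UPAs in the network.
    return now - state.advertised_at >= state.lifetime
```

With a 60 second lifetime, the UPA is withdrawn as soon as the reason ceases, or at 60 seconds if the reason is still present, whichever comes first.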

# Scalability control

Section 2 says:
    It is also RECOMMENDED that implementations limit the number of UPA
    advertisements which can be originated at a given time.

The benefits gained by summarization may be nullified if a large number of UPAs
are injected. This recommendation is really great, but there is a need to
expose a configuration parameter here. Please

NEW:
    UPA implementations SHOULD provide a configuration option to limit
    the number of such UPAs.

##PP
done.

# Backward compatibility Check

CURRENT (3.2/4.2):
    Such requirement of
    reachability MUST NOT be applied for UPAs, as they are propagating
    unreachability.

Does this mean that an ABR that doesn’t support UPA might discard it?

##PP
it will not discard it, but it will not propagate it between areas.



# Avoid redefining existing behaviors

CURRENT (Section 4):
    In addition, NU-bit is defined for OSPFv3 [RFC5340].  Prefixes having
    the NU-bit set in their PrefixOptions field SHOULD NOT be included in
    the routing calculation.

The SHOULD NOT part is already defined in 5340. Why is this redefined again?

##PP

I changed to:

"Prefixes having  the NU-bit set in their PrefixOptions field are not included in the routing
 calculation."

Is that ok? Without any text it's not clear what's the relevance of the NU bit here.



# Check on “UP-Flag” is useless

5.1.1
    The prefix that is advertised with U-Flag or UP-Flag  MUST have the
    metric set to a value larger than 0xFE000000.  If the prefix metric
    is less than or equal 0xFE000000, both of these flags MUST be
    ignored.

5.2.2
    The prefix that is advertised with U-Flag or UP-Flag MUST have the
    NU-bit set in the PrefixOptions of the parent TLV.

The “or UP-Flag” part of both MUSTs is useless as U-Flag will be set in such
case as well. The case where only UP-Flag is set is invalid and will be ignored.

Unless I missed a subtle thing here, please update these two.

##PP
good catch, fixed it.
We changed the UP flag from a standalone flag to a supplementary one, but we forgot to update the text.
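
A minimal sketch of how a receiver could apply the IS-IS check from 5.1.1 once the UP-Flag is treated as supplementary to the U-Flag. The function name and the constant name are hypothetical; the 0xFE000000 boundary comes from the quoted text, and ignoring a UP-Flag that arrives without the U-Flag follows the reading given in the comment above rather than any final draft text.

```python
# Metric values at or below this boundary make the U/UP flags meaningless
# (boundary value taken from the quoted 5.1.1 text).
METRIC_BOUNDARY = 0xFE000000


def effective_flags(metric: int, u_flag: bool, up_flag: bool) -> tuple[bool, bool]:
    """Return the (U, UP) flags a receiver would actually act on."""
    # Flags MUST be ignored unless the metric is larger than the boundary.
    if metric <= METRIC_BOUNDARY:
        return (False, False)
    # Assumption: UP-Flag is supplementary, so UP without U is ignored.
    if not u_flag:
        return (False, False)
    return (True, up_flag)
```

For example, `effective_flags(0xFE000001, True, True)` keeps both flags, while the same flags with a metric of exactly 0xFE000000 are ignored.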



# Service stability

The document declares the applications that consume the signal out of scope,
which is fine. However, there might be some negative implications due to how
the loss signal (or its withdrawal/absence) is interpreted. For example, add a
warning to insist that withdrawal does not mean reverting to a nominal state.
Please consider adding a few sentences so that these services take appropriate
measures before reverting.

##PP

The usage of the UPA depends on the application. As we may not know what applications are going to use it, I would be reluctant to specify the application behavior.

For some applications, the presence of the UPA may be required, before the application itself detects the loss later.

Section 2 says:
"ABR or ASBR MUST withdraw the previously advertised UPA when the reason for which the UPA was generated ceases"

In such case the withdrawal could mean that the prefix for which the UPA was generated became reachable again.

UPA is a one-time signal to applications that there might be some loss of connectivity. The application itself must use its own mechanism to verify/detect the loss eventually. In the meantime it may decide to take some proactive measures, but UPA does not dictate the application behavior.




----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

# Topology-dependent matters

Abstract:
    This enables fast
    convergence by steering traffic away from the node which owns the
    prefix and is no longer reachable.

The “steering..” part of the text is only possible when there are alternate routes for
that prefix. Otherwise, the traffic will be dropped anyway.

Please reword. For example, saying “when applicable” or similar would be just
fine.

##PP
done


# Why now?

Summarization has been around for decades. A mention of what exacerbates the
need for this new feature now (and why it was not considered a major issue in
the past) would be helpful.

##PP
is it really required? I was not planning to mention it in the text. But if you insist...

It's SRv6 that brings summarization back to life for the underlay (IGPs). With MPLS, summarization was not possible.

# "Egress" depends on the traffic directionality

Section 1:
    Similarly, when an egress router needs to be taken out of service for
    maintenance, the traffic is drained from the node before taking it
    down.

A router may behave as ingress/egress as a function of traffic direction. I
would delete “egress” here.
##PP
done


# (ni) There might be many ABRs per area, not single one

Please make this change in Section 1 (and similar):

OLD:
    When prefixes from such node are summarized by the
    Area Border Router (ABR) or Autonomous System Boundary Router (ASBR),

NEW:
    When prefixes from such node are summarized by an
    Area Border Router (ABR) or Autonomous System Boundary Router (ASBR),

##PP
done


# (Operational Considerations) Many signals, distinct expected uses

Section 1 has the following:

    This document does not define how to advertise prefix that is not
    reachable for routing.  That has been defined for IS-IS in [RFC5305]
    and [RFC5308], for OSPF in [RFC2328], and for OSPFv3 in [RFC5340].

I wonder whether we can list the signals available out there (explicit prefer
advertisement, prefix with a specific metric, etc.) and remind the intended use
scope for each of them. This can be recorded in the Operational Considerations.

##PP
that is what the beginnings of sections 3 and 4 are doing for IS-IS and OSPFv2/v3 respectively.



I’m raising this point, especially that the text right after says the following
without appropriate scoping:

CURRENT:
    This document defines two new flags in IS-IS, OSPF, and OSPFv3.
    These flags, together with the existing protocol mechanisms, provide
    the support for advertising prefix unreachability , together with the
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    reason for which the unreachability is advertised

##PP
what do you mean by "without appropriate scoping"?



# Unreachable metric?!

Section 1 has the following:
    This document defines a method to signal a specific reason for which
    the prefix with unreachable metric was advertised.

Unless I’m mistaken, there is no such “unreachable metric” as a thing in IS-IS.

##PP
Please see section 3 of this document, which mentions [RFC5305] (https://www.rfc-editor.org/info/rfc5305). That RFC defines a metric that makes the prefix "not reachable" for the purpose of routing.

We are reusing such metric value for UPA.


# (Operational Considerations) Activation default

Section 2 has the following:
    UPA MAY be generated by the ABR or ASBR for a prefix that is
    summarized by the summary address originated by the ABR or ASBR in
    the following cases:

Can we say something about whether the use of UPA should be enabled or disabled
by default?

##PP
I added the following text in response to one of Ketan's comments:

"Generation of the UPA at the ABR or ASBR is optional and SHOULD be controlled by
 a configuration knob."

I would leave the default behavior for the implementations to decide. I see no reason why an RFC should mandate any specific default behavior.


# Threshold

Section 2 says:
           - the metric to reach the prefix from the ABR or ASBR crosses
           the configured threshold.

Can we make explicit the threshold we are referring to here + add a reference
for where to look?

##PP
there is no explicit threshold here, it's any value of the metric that the operator defines. For example, if for the planned maintenance the cost of router's links is set to some high value X, the operator may set the threshold on the ABR to X + something.
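
A minimal sketch of the two triggers discussed here, assuming a single operator-configured threshold per ABR/ASBR. The function and parameter names are hypothetical, and treating "crosses" as "greater than or equal to" is an assumption chosen to mirror the withdrawal text ("below the configured threshold value"); the draft does not spell out the comparison.

```python
from typing import Optional


def upa_reason_present(reachable: bool, metric: int,
                       threshold: Optional[int]) -> bool:
    """Return True when the ABR/ASBR has a reason to originate a UPA."""
    # Trigger 1: the summarized prefix is no longer reachable.
    if not reachable:
        return True
    # Trigger 2: the metric to reach the prefix crosses the configured
    # threshold. No threshold configured means this trigger is disabled.
    if threshold is None:
        return False
    return metric >= threshold  # assumption: ">=" counts as crossing
```

With a threshold of 1000, a reachable prefix at metric 1500 gives a reason to originate a UPA, while the same prefix at metric 20 does not.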


# Explicit references in Section 3

OLD:
    [RFC5305] defines the encoding for advertising IPv4 prefixes using 4
    octets of metric information.  Section 4 specifies:
    ..
    Similarly, [RFC5308] defines the encoding for advertising IPv6
    prefixes using 4 octets of metric information. Section 2 states:

NEW:
    [RFC5305] defines the encoding for advertising IPv4 prefixes using 4
    octets of metric information.  Section 4 of [RFC5305]  specifies:
    ..
    Similarly, [RFC5308] defines the encoding for advertising IPv6
    prefixes using 4 octets of metric information.  Section 2 of [RFC5308]
    states:

##PP
Ketan made a similar comment and the text has been updated already.



# “Advertisement of UPA in IS-IS”

CURRENT (S3.1)
    Recognition of the advertisement as UPA is only required on routers
    which have a valid use case for this information.

I would delete this sentence as this section is about the originator.

##PP
done


# Section 6

Consider moving that section right after current Section 7 as a subsection of
an Operational Consideration section.

##PP
done



## Multiple ABRs

That section may also discuss whether there are specific considerations to take
into account, e.g., presence of multiple ABRs that announce UPAs for a set of
prefixes in an area and measures to prevent routing instability. If you don’t see
any risk out there, saying that as well would be useful.
##PP
there is no special risk with multiple ABRs - if the egress PE goes down they will all equally see it and generate the UPA. The problem is when one ABR sees the egress PE as reachable and another sees it as unreachable. This can only happen when the area partitions. That problem is described in section 6.

## Consider adding any implications (or absence of) explicit withdrawal of a
UPA.

##PP
I'm not sure what you mean by explicit withdrawal. UPA must be withdrawn by the originator in all cases, either based on the timeout or because the reason for which it was generated does not exist anymore, whichever comes first.

thanks,
Peter



Cheers,
Med



