[+rtgwg mailing list]
Adrian,
Thanks very much for working through the example. It was very interesting
to see how someone who isn't as close to the problem space reads the text,
and it helped pick up on imprecisions and lack of clarity in the definitions.
Not to speak for Stewart - who I'm sure will be responding quite
soon, but...
On Mon, Dec 29, 2014 at 11:09 AM, Adrian Farrel <[email protected]> wrote:
Adrian Farrel has entered the following ballot position for
draft-ietf-rtgwg-remote-lfa-09: Discuss
When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut
this introductory paragraph, however.)
Please refer to
http://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.
The document, along with other ballot positions, can be found here:
http://datatracker.ietf.org/doc/draft-ietf-rtgwg-remote-lfa/
----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------
I'm placing this Discuss because I found the description of the
algorithm in 4.2.1 and the worked example in Section 2 to be at odds
with the definitions of P-space, extended P-space, and Q-space.
I have been able to make things work by messing with the algorithm and
keeping the current definitions. You could probably do it by keeping
the algorithm and messing with the definitions.
Yes - we want to keep the algorithm. In particular, the pseudo-code
gets it right and the definitions are a little sloppy. Stewart clarified
a bit around the extended P-space not including paths through the failed
link in the text before I put it to the IESG.
My workings were as follows, based on the example in Section 2:
>      S---E
>     /     \
>    A       D
>     \     /
>      B---C
>
> In Figure 1 S can reach A, B, and C without going via E;
> these form S's extended P-space.
First, this should say "via S-E" and "extended P-space with respect to
S-E".
But...
> Extended P-space
> The union of the P-space of the neighbours of a
> specific router with respect to the protected link
(Noting that 4.2.1.2 changes this definition *significantly* by saying
that the neighbour at the far end of the failing link - i.e., E in
this case - must be excised from the list of neighbours whose P-spaces
are combined).
To be fair, the definition of P-space (below) includes that the paths
can't transit the protected link (S-E in the example).
I think that the definition needs to be updated to be "the neighbors of a
specific router that are reachable without going via the protected link".
When there's only a single link S-E, S has no direct way of forcing
traffic to E so E's P-space can't be included.
...and...
> P-space P-space is the set of routers reachable from a
> specific router using the normal FIB, without any path
> (including equal cost path splits) transiting the
> protected link.
Now, S's neighbours are A and E.
The P-space of A with respect to S-E is {B, C}
And the P-space of E with respect to S-E is {C, D}
So the extended P-Space of S with respect to S-E is {B, C, D}
Something is broken!
Yes, can't include the P-space of E when the failed link is S-E and
there's no way to reach E directly (one hop) from S.
{A, B, C} is not even the (not extended) P-space of S with respect to
S-E which is {A, B} since C is not in that set because of SEDC.
On the other hand {A, B, C} *is* the extended P-space of E wrt S-E.
Although, I would observe that the pseudocode in 4.2.1 does derive
A, B, C as the extended P-space of S wrt S-E, but I think that is
because it has an entirely different definition of an extended
P-space.
?? Because it omits E? Do you see anything else different that needs
to be better clarified?
Now...
> Q-space Q-space is the set of routers from which a specific
> router can be reached without any path (including
> equal cost path splits) transiting the protected link.
...so the Q-space of S wrt S-E is {A, B} since CDES.
And, for the record, the Q-space of E wrt S-E is {C, D}
Now, to compound the confusion, the example determines the PQ nodes for
S wrt S-E by taking the intersection of the extended P-space for S wrt
S-E and the Q-space of E wrt S-E. This is done notwithstanding the
definition...
> PQ node A node which is a member of both the P-space and the
> Q-space. Where extended P-space is in use it is a
> node which is a member of both the extended P-space
> and the Q-space. In remote LFA this is used as the
> repair tunnel endpoint.
Yup - so clearly it means "of both the extended P-space of the PLR (S) and
the Q-space of the far-end of the failed link (E)". Some improved
definitions are definitely needed.
This definition gives the PQ nodes of S wrt S-E as either
- the intersection of {A, B} and {A, B} if P-space is being discussed
or
- the intersection of {B, C, D} and {A, B} if extended P-space is
being used.
So the correct tunnel end point for your example is B.
But it clearly doesn't work since traffic to E that is tunneled to
B may still be ECMP routed back along BAS.
So I think in this whole example, you sit at S and you say "I want to
protect traffic to E". Then you work out the extended P-space of
*E* wrt S-E (which is {A, B, C}) and the Q-space of *E* wrt S-E (which
is {C, D}) giving you the correct PQ node for S to use to protect traffic
to E in the event of a failure of S-E as C.
Extended P-space is the space that the PLR S can send traffic to - that
is, the nodes it can reach without using the failed link.
Q-space is the set of nodes that can reach the far-end E of the failed
link S-E without using the failed link.
Obviously the definitions need improvement.
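To make the intended reading concrete, here is a minimal Python sketch (not taken from the draft) that computes the sets discussed above for the Figure 1 ring. It assumes all link costs are equal and symmetric, and it reads "extended P-space of S wrt S-E" as S's neighbours other than the far end E, together with their P-spaces. All function names are hypothetical helpers introduced only for illustration.

```python
from collections import deque

# Figure 1 ring; every link is assumed to have cost 1 in both directions.
LINKS = [("S", "E"), ("E", "D"), ("D", "C"), ("C", "B"), ("B", "A"), ("A", "S")]

def neighbours(node):
    """Nodes directly connected to `node`."""
    return [b if a == node else a for a, b in LINKS if node in (a, b)]

def distances(src):
    """Hop-count shortest distances from src (BFS is enough for unit costs)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in neighbours(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

DIST = {n: distances(n) for link in LINKS for n in link}

def uses_link(src, dst, link):
    """True if any shortest path (ECMP included) from src to dst transits link."""
    u, v = link
    return any(DIST[src][a] + 1 + DIST[b][dst] == DIST[src][dst]
               for a, b in ((u, v), (v, u)))

def p_space(router, link):
    """Routers that `router` reaches with no shortest path over `link`."""
    return {y for y in DIST if y != router and not uses_link(router, y, link)}

def q_space(router, link):
    """Routers that reach `router` with no shortest path over `link`."""
    return {x for x in DIST if x != router and not uses_link(x, router, link)}

def extended_p_space(plr, link):
    """Neighbours of plr (far end of `link` excluded) plus their P-spaces."""
    far_end = link[1] if link[0] == plr else link[0]
    result = set()
    for n in neighbours(plr):
        if n != far_end:  # single link: plr cannot force traffic to the far end
            result |= {n} | p_space(n, link)
    result.discard(plr)
    return result

link = ("S", "E")
pq_nodes = extended_p_space("S", link) & q_space("E", link)
```

Under those assumptions the sketch gives {A, B} as the P-space of S, {A, B, C} as the extended P-space of S, {C, D} as the Q-space of E, and C as the only PQ node, which matches the result the example in Section 2 is aiming for.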
It is simple! All you have to do is update the text to describe the
actual process and not the wrong one. Then the right result will pop
out :-)
The replacement is
OLD
In Figure 1 S can reach A, B, and C without going via E;
these form S's extended P-space.
NEW
In Figure 1 S can reach A and B without going via S-E, and
D can reach B and C without going via S-E. So E's extended P-space
with respect to S-E is the nodes A, B, and C.
END
BUT, given all of this, are you sure that Section 4.2.1 is right? I'm
not.
Yes, not concerned about that. You are just highlighting some
imprecisions and lack of clarity in the definitions.
------
Shouldn't the pseudocode in 4.3 be enclosed in code component
macros to match with the copyright TLP etc.?
----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------
This is not my area of expertise so please excuse the brevity of the
rest of this review.
---
Please s/draft/document/ throughout (except the boilerplate and
filename) so that it can be published as an RFC (which is not a
draft).
---
Although it causes some pain with abbreviations and a little more care
in explanation, you need to put the Introduction as the first
section in the document.
---
You are using RFC 5714 as a Normative Reference by making me go there
for the definition of terms. Please move it to the correct section.
---
IMHO your definition of FIB is rather loose. Fortunately (?) "FIB" is
barely used in this document, so it might not be important, but if you
wanted to fix it:
- you are talking about IP packets in this document
Mostly MPLS actually...
- the actions are, I think, limited to forwarding actions
---
This comment applies iff the resolution of the Discuss is not a
complete change to the terminology!
I think definitions need to be tighter or omitted from this part
of the document. The definitions in 4.2.1 are more verbose and probably
for good reason. If you feel you need to retain these definitions early
in the document and can't lift the text from 4.2.1 then you need to
address the concerns below.
P-space P-space is the set of routers reachable from a
specific router using the normal FIB, without any path
(including equal cost path splits) transiting the
protected link.
"the protected link"? There is only one protected link?
Since the example is worded as...
For example, the P-space of S with respect to link
S-E, is the set of routers that S can reach without
using the protected link S-E.
...I think you need...
P-space The P-space of a router with respect to a specific
protected link is the set of routers reachable from
the specific router using the normal FIB, without any
path (including equal cost path splits) transiting the
protected link.
Similarly, you need...
Extended P-space
The union of the P-spaces of all of the neighbours of
a specific router with respect to a single specific
protected link (see Section 4.2.1.2).
But note that 4.2.1.2 makes a significant change to this definition.
Q-space The Q-space for a specific router with respect to a
specific protected link is the set of routers from
which the specific router can be reached without any
path (including equal cost path splits) transiting the
protected link.
PQ node A PQ node is a node which is a member of both the
P-space and the Q-space for the same router and with
respect to the same protected link.
Where extended P-space is being discussed, a PQ node
is a node which is a member of both the extended
P-space and the Q-space for the same router and with
respect to the same protected link.
In remote LFA the repair tunnel endpoint is a PQ
node.
Throughout the text, however, the terms are used rather loosely. For
example, when discussing Figure 1 you say "S's extended P-space", but
this is really "S's extended P-space with respect to S-E". Someone
familiar with the work might say that it is obvious from the context
that we are discussing the link S-E, and it is, but the terminology
needs to be tight.
---
There is some difficult terminology in Section 2
If all link costs are equal, the link S-E cannot be fully protected
by LFAs. The destination C is an ECMP from S, and so can be
protected when S-E fails, but D and E are not protectable using
LFAs.
Is it the link or the node that is protected (or the traffic)? Perhaps
this could be rewritten to be less ambiguous.
---
Section 2
B
has equal-cost paths via B-A-S-E and B-C-D-E and so may go through
S-E.
I don't think B is going anywhere. Maybe...
B
has equal-cost paths to E via B-A-S-E and B-C-D-E and so may
reach E through S-E.
---
Section 2
In MPLS networks the targeted LDP
protocol needed to learn the label binding at the repair tunnel
endpoint is a well understood and widely deployed technology.
But it would still benefit from a citation or a forward reference to
section 7.
---
I enjoyed 3.2
relatively rare as is the incidence of failure in a well managed
network.
So, managing my network well is protection against back-hoes. Nice.
LOL - the argument is about the set-up time to be protected again and what
is the interval between failures. The editors and WG have decided
that this trade-off is acceptable - but I'd also prefer to see it more
clearly articulated.
---
In 3.2
Multiple
repairs MAY share a tunnel end point.
1. s/repairs/repair tunnels/
2. s/MAY/may/ since this is not an implementation or operational
choice, but a fact of life.
---
In 4.2 you have truncated...
The repair tunnel endpoint needs to be a node in the network
reachable from S without traversing S-E.
...and...
o The repair tunneled point MUST be reachable from the tunnel
source without traversing the failed link; and
You mean "reachable using the normal FIB" I think.
Not quite because if the repair tunnel endpoint is in the extended
P-space and not the P-space, then S has to force the first hop rather
than send it via the normal FIB.
---
Section 4.3
The preceding text has mostly described the computation of the
remote LFA repair target (PQ) in terms of the intersection of two
reachability graphs computed using SPFs.
"mostly"?
"reachability graphs"? Were they? Or were they reachability sets?
---
Your pseudocode in 4.3 invokes an unresolved (and undescribed) function
Compute_Forward_SPF().
Actually, I think this is a bogus line that can be deleted.
---
4.3 has
if ( D_opt(n, y) <
D_opt(n,self) + D_opt)(self, y)
Surely this is
if ( D_opt(n, y) <
D_opt(n,self) + D_opt(self, y) )
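For reference, the corrected condition can be exercised on the Figure 1 ring with a short Python sketch. It assumes equal link costs and assumes the test is applied with n a neighbour of the computing router (self) to decide whether y belongs to that neighbour's contribution to the extended P-space; the surrounding loop is not quoted in this message, so that framing and the helper names are illustrative only.

```python
# Figure 1 ring in cyclic order: S-E-D-C-B-A-S, all link costs assumed equal.
RING = ["S", "E", "D", "C", "B", "A"]

def d_opt(a, b):
    """Shortest hop count between two nodes on the ring (hypothetical helper)."""
    gap = abs(RING.index(a) - RING.index(b))
    return min(gap, len(RING) - gap)

def reachable_avoiding_self(n, self_node):
    """Nodes y that satisfy D_opt(n, y) < D_opt(n, self) + D_opt(self, y).

    The strict inequality means no shortest path from n to y, including
    equal-cost ones, passes back through self_node.
    """
    return {
        y for y in RING
        if y != self_node
        and d_opt(n, y) < d_opt(n, self_node) + d_opt(self_node, y)
    }

# S's only neighbour other than the far end of S-E is A.
from_a = reachable_avoiding_self("A", "S")
```

With n = A and self = S the inequality holds for A, B, and C but fails for D and E, where the two sides are equal, so the strict "<" is what excludes the ECMP cases.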
---
I think the introduction of "pseudonode" in 4.3 may be a little
lacking in context.
---
Section 7
If for any reason the TLDP session cannot
not be established
s/cannot not/cannot/
---
I think [RFC5424] and [RFC3411] are pretty poor references to give in
section 7. You appear to be saying that an implementation that cannot
establish a TLDP session should write a MIB module, standardise
it, and then report an error.
Can't you find an existing LDP MIB module that reports Session-up
failures?
Or maybe just delete "using any well known mechanism such as Syslog
[RFC5424] or SNMP [RFC3411]."
---
Why is the discussion of microloops when the network re-converges considered
to be a management consideration (by inclusion in Section 9)? Surely it
is a deployment or operational consideration.
I wanted that text somewhere in the doc. Adding an operational
considerations section to put it in would be fine.
---
I think you can strengthen the security considerations. You have:
To prevent their use as an attack vector IP repair tunnel endpoints
(where used) SHOULD be assigned from a set of addresses that are not
reachable from outside the routing domain.
1. "To prevent their use" is surely consistent with a "MUST".
The fact that you want to say "SHOULD" means that you need to turn
the text around...
IP repair tunnel endpoints (where used) SHOULD be assigned from a set
of addresses that are not reachable from outside the routing domain.
This would prevent their use as an attack vector.
2. You can add a note about what traffic can be placed into a repair
tunnel. You already have this earlier in the document, and it is
worth restating.
3. I think you should also make note of whether the repair tunnel is
advertised by the routing protocol as an available link.
I agree on the comments otherwise.
Regards,
Alia