[Lsr] Secdir last call review of draft-ietf-ospf-mpls-elc-13

2020-04-30 Thread Joseph Salowey via Datatracker
Reviewer: Joseph Salowey
Review result: Ready

I have reviewed this document as part of the security directorate's
ongoing effort to review all IETF documents being processed by the
IESG.  These comments were written primarily for the benefit of the
security area directors.  Document editors and WG chairs should treat
these comments just like any other last call comments.

The summary of the review is that the document is ready from a security
perspective.  The security considerations section is adequate.


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


[Lsr] Secdir last call review of draft-ietf-isis-mpls-elc-12

2020-04-30 Thread Rich Salz via Datatracker
Reviewer: Rich Salz
Review result: Ready

I am the SECDIR reviewer for this document; the security directorate tries to
review all documents before they go to the IESG. This content is intended
primarily for the security ADs; anyone else should treat it like other
last-call comments.

This document is READY.

This is a short document. It defines a way for MPLS routers to indicate that
they can handle additional metadata, "entropy labels", and the depth of them,
for traffic routing.

The security implications of this are well-discussed.





Re: [Lsr] Congestion (flow) control thoughts.

2020-04-30 Thread tony . li

Hi Xuesong,

> In congestion control at layer 4, it is assumed that there is a bottleneck in 
> the network, and the ideal rate of the transmitters equals a fair share of 
> the bandwidth at the bottleneck. The flows in the network change all the time, 
> and so does the ideal transmitting rate. There are some methods to detect 
> the bottleneck, for example detecting packet loss, setting ECN, RTT and so 
> on. Considering that the goal of cc is to maximize the throughput and 
> minimize the queuing delay, the throughput and delay could be used to compare 
> different cc algorithms.


I consider flow control and congestion control to be separate, but similar 
problems.

Flow control is about creating a single control loop between a single 
transmitter and single receiver.

Congestion control is about creating multiple interacting control loops between 
multiple transmitters and multiple receivers.


> My problem is that for CC of ISIS flooding, where is the 
> bottleneck? (maybe the receiver capability) What method of detection could be 
> used? (In my understanding, we have two options at this stage: one from the 
> receiver side and the other from the transmitter side) What are the criteria 
> for comparison? (still a little confusing for me)


In the case of modern IS-IS, we have gigabit links everywhere, the bandwidth 
greatly exceeds our needs, and flooding only happens between peers, so link 
resources are never the issue.  Thus, to my mind, congestion only happens when 
a node is engaged in flooding on multiple interfaces concurrently. Effectively, 
congestion control can be seen as flow control with shared resources, or as 
flow control with varying resources.

The resources of concern vary depending on the internals of both the sender and 
receiver.  Some that come to mind are:

- CPU. Both transmitter and receiver have finite cycles available. This is 
typically shared across multiple interfaces, so can congest.

- Process level buffer space. There may be a fixed number of packet buffers 
within the IS-IS process.  These can be depleted.

- Kernel buffers. 

- Internal control fabric. In some multi-chassis designs, the line card chassis 
is remote from the CPU running IS-IS. The network between the two can congest.

- Line card CPU. Again in the case of a multi-chassis system, control packets 
may flow through a CPU on the line card.  This can congest due to cycles or 
buffers.

- Forwarding plane buffers. Modern design for many systems has a forwarding 
plane ASIC doing a great deal of our packet handling. One of its 
responsibilities is to extract control plane traffic from incoming traffic 
and pass it upwards. As a single ASIC, it is finite and can only buffer so 
many packets before loss. The buffers are only cleared when the packets are 
read out by the line card or central CPU. IMHO, this is a very likely point 
of congestion.

All of these can be instrumented to determine whether they are congesting.  It 
seems very unlikely that the transmitter ever congests.  Bruno’s data as 
presented supports this: the transmitter can outstrip the receiver. Thus, we 
tend to focus on congestion at the receiver.

The other feedback mechanisms that we have available are the ones that we’ve 
discussed: PSNPs provide acknowledgment of received packets. From that, and 
their timing, we may be able to infer more useful data, but we are discussing 
changing their behavior to make things more useful.
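As an illustration of the kind of inference Tony describes, a transmitter could track send times per LSP and measure the delay until a PSNP acknowledges each one. This is only a sketch with invented names; as discussed later in this thread, real PSNP timing is jittered and batched, so these samples are noisy at best:

```python
# Sketch: infer per-LSP ack latency from PSNP arrivals.
# All names are illustrative, not from any draft.

class AckLatencyTracker:
    def __init__(self):
        self.sent = {}          # lsp_id -> send timestamp
        self.samples = []       # observed ack delays, in order

    def on_lsp_sent(self, lsp_id, now):
        self.sent[lsp_id] = now

    def on_psnp(self, acked_lsp_ids, now):
        # One PSNP may acknowledge many LSPs at once.
        for lsp_id in acked_lsp_ids:
            t0 = self.sent.pop(lsp_id, None)
            if t0 is not None:
                self.samples.append(now - t0)

    def smoothed_delay(self):
        # Simple EWMA over observed delays (TCP-style 7/8 weighting).
        est = None
        for s in self.samples:
            est = s if est is None else 0.875 * est + 0.125 * s
        return est

tracker = AckLatencyTracker()
tracker.on_lsp_sent("lsp-1", 0.0)
tracker.on_lsp_sent("lsp-2", 0.1)
tracker.on_psnp(["lsp-1", "lsp-2"], 1.0)   # one PSNP acks both
```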

In our draft, we are proposing other feedback mechanisms.

Tony




[Lsr] SRLG usage in the IGP Flexible Algorithm draft

2020-04-30 Thread Alexander Vainshtein
Hi all,
I have a question about the proposed usage of SRLG in the IGP Flexible 
Algorithm draft.

This usage is defined in Section 12 of the draft, with the SRLG exclude rule
reading as follows:



  2.  Check if any exclude SRLG rule is part of the Flex-Algorithm
      definition.  If such exclude rule exists, check if the link is
      part of any SRLG that is also part of the SRLG exclude rule.  If
      the link is part of such SRLG, the link MUST be pruned from the
      computation.

This looks effectively indistinguishable from the usage of the exclude Admin 
Groups rule as described in the same Section 12 of the draft:


  1.  Check if any exclude rule is part of the Flex-Algorithm
      definition.  If such exclude rule exists, check if any color that
      is part of the exclude rule is also set on the link.  If such a
      color is set, the link MUST be pruned from the computation.

From my POV, with such a definition, there is no need for a dedicated
"Exclude SRLG" rule in the specification of the Flexible Algorithm,
since the SRLG Exclude rule can be replaced with a matching Exclude
rule using Admin Groups.
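Sasha's observation can be seen by writing both prune checks down: with set-valued link attributes, the two rules reduce to the same intersection test. An illustrative sketch (names invented, not draft text):

```python
# Sketch: both exclude rules prune a link exactly when the link's
# attribute set intersects the Flex-Algorithm's exclude set.
# Names are illustrative only.

def prune_by_admin_group(link_colors, excluded_colors):
    # Rule 1: exclude Admin Groups
    return bool(link_colors & excluded_colors)

def prune_by_srlg(link_srlgs, excluded_srlgs):
    # Rule 2: exclude SRLGs -- structurally the same test
    return bool(link_srlgs & excluded_srlgs)

link = {"colors": {10, 20}, "srlgs": {100}}
assert prune_by_admin_group(link["colors"], {20}) is True   # pruned
assert prune_by_srlg(link["srlgs"], {999}) is False         # kept
```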

I also think that such a usage of SRLG does not fit the needs of the TI-LFA 
draft, which considers an SRLG a resource that fails when any of the 
links/nodes comprising it fails. E.g., it says in Section 2:


   The Point of Local Repair (PLR), S, needs to find a node Q (a repair
   node) that is capable of safely forwarding the traffic to a
   destination D affected by the failure of the protected link L, a set
   of links including L (SRLG), or the node F itself.  The PLR also
   needs to find a way to reach Q without being affected by the
   convergence state of the nodes over the paths it wants to use to
   reach Q: the PLR needs a loop-free path to reach Q.



To me this suggests that SRLGs are only relevant when computing backup paths 
for specific failures, e.g., an LFA for failure of a link that belongs to a 
specific SRLG must be computed in the topology from which all the links 
belonging to the same SRLG are pruned. This understanding matches RFC 4090, 
which states in Section 6.2, "Procedures for Backup Path Computation":



  - The backup LSP cannot traverse the downstream node and/or link
    whose failure is being protected against.  Note that if the PLR
    is the penultimate hop, node protection is not possible, and
    only the downstream link can be avoided.  The backup path may be
    computed to be SRLG disjoint from the downstream node and/or
    link being avoided.



If SRLGs are only relevant for computation of backup paths, it is not clear to 
me if they should be part of the definition of a specific Flexible Algorithm.



What, if anything, did I miss?



Regards, and lots of thanks in advance,
Sasha

Office: +972-39266302
Cell:  +972-549266302
Email:   alexander.vainsht...@ecitele.com




[Lsr] Why only a congestion-avoidance algorithm on the sender isn't enough

2020-04-30 Thread Henk Smit



Hello all,

Two years ago, Gunter Van de Velde and I published this draft:
https://tools.ietf.org/html/draft-hsmit-lsr-isis-flooding-over-tcp-00
That started this discussion about flow/congestion control and ISIS flooding.


My thoughts were that once we start implementing new algorithms to
optimize ISIS flooding speed, we'll end up with our own version of TCP.
I think most people here have a good general understanding of TCP.
But if not, this is a good overview how TCP does it:
https://en.wikipedia.org/wiki/TCP_congestion_control


What does TCP do:

TCP does 2 things: flow control and congestion control.

1) Flow control is: the receiver trying to prevent itself from being
overloaded. The receiver indicates, through the receiver-window-size
in the TCP acks, how much data it can or wants to receive.
2) Congestion control is: the sender trying to prevent the links between
sender and receiver from being overloaded. The sender makes an educated
guess at what speed it can send.
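The interplay of the two mechanisms can be summarized in one line: a TCP sender may have at most min(cwnd, rwnd) unacknowledged bytes in flight. A simplified sketch:

```python
# Simplified sketch of how a TCP sender combines the two windows.
# rwnd comes from the receiver (flow control); cwnd is the sender's
# own estimate (congestion control).

def sendable(bytes_in_flight, cwnd, rwnd):
    """How many more bytes the sender may put on the wire right now."""
    window = min(cwnd, rwnd)
    return max(0, window - bytes_in_flight)

assert sendable(bytes_in_flight=4000, cwnd=10000, rwnd=6000) == 2000
assert sendable(bytes_in_flight=6000, cwnd=10000, rwnd=6000) == 0
```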


The part we seem to be missing:

For the sender to make a guess at what speed it can send, it looks at
how the transmission is behaving. Are there drops ? What is the RTT ?
Do drop-percentage and RTT change ? Do acks come in at the same rate
as the sender sends segments ? Are there duplicate acks ? To be able
to do this, the sender must know what to expect. How acks behave.

If you want an ISIS sender to make a guess at what speed it can send,
without changing the protocol, the only thing the sender can do is look
at the PSNPs that come back from the receiver. But the RTT of PSNPs cannot
be predicted, because a good ISIS implementation does not immediately
send a PSNP when it receives an LSP: 1) the receiver should jitter the PSNP,
like it should jitter all packets, and 2) the receiver should wait a little
to see if it can combine multiple acks into a single PSNP packet.

In TCP, if a single segment gets lost, each new segment will cause the
receiver to send an ack with the seqnr of the last received byte. These
are called "duplicate acks", and they trigger the sender to do
fast retransmission. In ISIS, this can't be done. The information
a sender can get from looking at incoming PSNPs is a lot less than what
TCP can learn from incoming acks.
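For contrast, the TCP-side signal is easy to state in code: three acks repeating the same number mark a hole, and nothing equivalent falls out of PSNPs. A simplified sketch:

```python
# Simplified sketch of TCP duplicate-ack detection (the fast-retransmit
# trigger). ISIS PSNPs carry no comparable "same number repeated" signal.

def first_retransmit_trigger(ack_numbers, dup_threshold=3):
    """Return the ack number that accumulates dup_threshold duplicate
    acks (i.e. where TCP would fast-retransmit), or None."""
    counts = {}
    for ack in ack_numbers:
        counts[ack] = counts.get(ack, 0) + 1
        if counts[ack] == dup_threshold + 1:   # original + 3 duplicates
            return ack
    return None

# Segment starting at 1000 was lost: every later segment re-acks 1000.
assert first_retransmit_trigger([500, 1000, 1000, 1000, 1000]) == 1000
assert first_retransmit_trigger([500, 1000, 1500]) is None
```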


The problem with sender-side congestion control:

In ISIS, all we know is that the default retransmit-interval is 5 seconds,
and I think most implementations use that as the default. This means that
the receiver of an LSP has one requirement: send a PSNP within 5 seconds.
For the rest, implementations are free to send PSNPs however and whenever
they want. This means a sender cannot really draw conclusions about
flooding speed, dropped LSPs, capacity of the receiver, etc.
There is no ordering when flooding LSPs or sending PSNPs. This makes
a sender-side algorithm for ISIS a lot harder.

When you think about it, you realize that a sender should wait the
full 5 seconds before it can draw any real conclusions about dropped LSPs.
If a sender looks at PSNPs to determine its flooding speed, it will probably
not be able to react without a delay of a few seconds. A sender might send
hundreds or thousands of LSPs in those 5 seconds, which might all or
partially be dropped, complicating matters even further.


A sender-side algorithm should specify how to do PSNPs.

So imho a sender-side-only algorithm can't work just like that in a
multi-vendor environment. We must not only specify a congestion-control
algorithm for the sender; we must also specify for the receiver a more
specific algorithm for how and when to send PSNPs, at least how to do PSNPs
under load.

Note that this might result in the receiver sending more (and smaller) PSNPs.
More packets might mean more congestion (inside routers).
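A receiver-side PSNP rule of the kind described above might be sketched like this. Everything here is hypothetical, parameters included; the point is that the behavior would need to be pinned down in a spec, not that these particular numbers are right:

```python
import random

# Hypothetical receiver PSNP policy sketch: batch acks into one PSNP,
# but bound the delay so the sender can rely on PSNP timing.
# All parameters and names are invented for illustration.

class PsnpBatcher:
    def __init__(self, max_batch=90, max_delay=0.2, jitter=0.05):
        self.pending = []
        self.max_batch = max_batch     # LSP entries per PSNP
        self.max_delay = max_delay     # hard bound on ack delay (seconds)
        self.jitter = jitter           # fractional jitter on the deadline
        self.deadline = None

    def on_lsp(self, lsp_id, now):
        """Record an LSP to ack; returns a PSNP (list of ids) if one is due."""
        self.pending.append(lsp_id)
        if self.deadline is None:
            # Jitter the deadline slightly downward, never upward.
            self.deadline = now + self.max_delay * random.uniform(1 - self.jitter, 1)
        if len(self.pending) >= self.max_batch:
            return self.flush()
        return None

    def tick(self, now):
        """Called periodically; flushes if the bounded delay has expired."""
        if self.deadline is not None and now >= self.deadline:
            return self.flush()
        return None

    def flush(self):
        psnp, self.pending, self.deadline = self.pending, [], None
        return psnp
```

A batcher like this would make PSNP timing predictable enough for a sender-side algorithm to interpret, at the cost of the extra packets mentioned above.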


Will receiver-side flow-control work ?

I don't know if that's enough. It will certainly help.

I think to tackle this problem, we need 3 parts:
1) sender-side congestion-control algorithm
2) more detailed algorithm on receiver when and how to send PSNPs
3) receiver-side flow-control mechanism

As discussed at length, I don't know if the ISIS process on the receiving
router can actually know if it's running out of resources (buffers on
interfaces, linecards, etc.). That's implementation dependent. A receiver
can definitely advertise a fixed value, so the sender has an upper bound
to use when doing congestion control, just like TCP has both a flow-control
window and a congestion-control window, and a sender uses both. Maybe the
receiver can even advertise a dynamic value; maybe now, maybe only in the
future. An advertised upper limit seems useful to me today.
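Combining those pieces, a sender would clamp its flooding rate to the minimum of its own congestion estimate and whatever the receiver advertises, mirroring TCP's min(cwnd, rwnd). A hypothetical sketch; no such advertised limit is actually defined anywhere yet:

```python
# Hypothetical sketch: sender-side rate selection with a receiver-
# advertised upper bound. The advertised limit is invented for
# illustration; it does not exist in any ISIS TLV today.

def flooding_rate(congestion_estimate_lps, advertised_limit_lps=None):
    """LSPs per second the sender allows itself."""
    if advertised_limit_lps is None:
        return congestion_estimate_lps      # no receiver info: guess alone
    return min(congestion_estimate_lps, advertised_limit_lps)

assert flooding_rate(500, advertised_limit_lps=200) == 200  # receiver caps us
assert flooding_rate(150, advertised_limit_lps=200) == 150  # we cap ourselves
assert flooding_rate(150) == 150                            # legacy peer
```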


What I didn't like about our own proposal (flooding over TCP):

The problem I saw with flooding over TCP concerns multi-point networks (LANs).

When flooding over a multi-point network, setting up TCP connections
introduces serious challenges. Who are the endpoints of the TCP connections ?
Full mesh ? Or