On 22 Jun 2011, at 00:07, Shane Amante wrote:
> An example which routinely happens in today's networks would be link 
> restoration.  
> In that case, the network is restoring traffic from a much longer path to a
> shorter, more optimal path in the network.  Depending on the transmission rate
> of the transmitter, this can/will lead to temporary reordering of microflows
> at the receivers.  Do note that in well-operated networks this reordering is,
> hopefully, transient and of an extremely short duration.  

Note that, similar to the above, having the first packet in a flow
travel a slightly different path than subsequent fragments
due to ECMP/LAG differences is also predictably "transient" and
"of an extremely short duration" (to use Shane's excellent phrasing).

> Namely, deployed IPv6 routers *already* [should] identify fragmented
> vs. non-fragmented packets, presumably by inspecting the "Next Header
> field of the last header of the Unfragmentable Part" for value 44
> [RFC 2460, Section 4.5] for 'Fragment Header' for the purposes of
> deciding whether /or not/ they should attempt to identify Next Headers
> containing the Upper Layer protocol and, subsequently, the
> {protocol, src_port & dst_port} that will be fed, (along with
> {src_ip, dst_ip}), into LAG and/or ECMP hash algorithms for
> fine-grained load-balancing.

Exactly so.
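
To make the check Shane describes concrete, here is a minimal sketch in
Python of walking the extension header chain of a raw IPv6 packet and
reporting whether a Fragment Header (Next Header value 44) is present.
The function name and the choice to skip only Hop-by-Hop, Routing, and
Destination Options headers are assumptions made for illustration;
deployed routers do this in ASIC/FPGA/NP logic, not in software like this.

```python
# Next Header values from the IPv6 specification.
HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS = 0, 43, 44, 60

# Extension headers this sketch knows how to skip (an assumption; a real
# parser would handle more header types and bound the walk depth).
SKIPPABLE = {HOP_BY_HOP, ROUTING, DEST_OPTS}

IPV6_BASE_HEADER_LEN = 40


def has_fragment_header(packet: bytes) -> bool:
    """Return True if the extension header chain reaches a Fragment Header."""
    if len(packet) < IPV6_BASE_HEADER_LEN:
        raise ValueError("truncated IPv6 base header")

    next_header = packet[6]          # Next Header field of the base header
    offset = IPV6_BASE_HEADER_LEN

    while next_header in SKIPPABLE:
        if offset + 8 > len(packet):
            return False             # chain is truncated; give up
        following = packet[offset]   # Next Header field of this ext header
        # Hdr Ext Len counts 8-octet units, not including the first 8 octets.
        header_len = (packet[offset + 1] + 1) * 8
        next_header = following
        offset += header_len

    return next_header == FRAGMENT
```

A packet for which this returns True would be treated as fragmented, so
the transport-layer ports would not be used as hash input-keys.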

Side comment:   Because many users/operators require that routers implement
                wire-speed Access Control Lists, and most of those ACLs 
                include variables such as transport-layer protocol, 
                and transport-layer port numbers, ASIC/FPGA-based IPv6
                routers will continue to parse into the IPv6 packet
                beyond the IPv6 base header *regardless* of whether
                that information is needed for ECMP/LAG/load-balancing
                purposes.

        I have heard credible reports of an ISP that apparently deploys 
        hundreds of ACL rules into the exterior border routers of their AS, 
        and apparently that ISP deploys a smaller but still non-trivial 
        number into interior routers.  

        Of course, this also means they select their largest routers in part 
        upon the routers' ability to support large numbers of ACLs at full 
        wire-speed and also the ability to look past the occasional IPv6 
        extension header to examine the ICMP/TCP/UDP/etc headers behind it.

> "Conservative" router/switch implementations strive to reduce the risk of
> _persistent_ reordering of an individual microflow.  IOW, since non-first
> fragments will not contain Upper Layer protocol information, (specifically:
> {src_port, dst_port}), that can be fed as input-keys to LAG and/or ECMP hash
> algorithms, the "safe" thing they should do is to only use the 2-tuple of
> {src_ip, dst_ip} as input-keys for _all_ fragments within a microflow.  
> Obviously, this leads to 'coarse-grained' load-balancing for microflows
> containing fragmented packets.

Yes, and an important word above is "persistent", to echo Shane's emphasis.

As Shane observes, this implementation approach is common today.  I believe 
some implementations today already include the Flow Label, simply because 
it is another available differentiator (albeit most commonly zero just now), 
not because of this current set of specifications.

> If this draft is widely implemented & deployed and originating hosts are
> encoding a "uniformly distributed", non-zero flow-label in all packets
> (fragmented or not), then it would seem logical that routers would be
> adapted so that:
> 
> a)  If they encounter a Fragment Header they use: {src_ip, dst_ip +
>     flow_label} as input-keys to the LAG + ECMP hash algorithms; and/or,
> 
> b)  If they encounter a Next Header with, for example, an Upper Layer Protocol
>     that they have *not* (yet?) implemented a parsing routine to extract
>     appropriate input-keys (or, can't, because it's too deep in the packet's
>     headers), then they revert back to using {src_ip, dst_ip + flow_label}
>     as input-keys to the LAG + ECMP hash algorithms[1]; and/or,
> 
> c)  [Assuming widespread use of the flow-label], they no longer even bother
>     looking at any Next Headers in all packets and _always_ use {src_ip,
>      dst_ip + flow_label} for input-keys to LAG + ECMP hash algorithms.
> 
> Personally, I see (a) & (b) as being short- to medium-term "wins"
> that could be safely implemented, by default, in the next-spin of NP,
> FPGA SW and ASIC HW, given the existence of this, hopefully soon, RFC.  

Agreed.  In some cases, I think (a) and (b) are already deployed.
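
Options (a) and (b) can be sketched as a small key-selection routine.
This is only an illustration in Python under stated assumptions: the
function names, the use of CRC-32 as the hash, and the choice of the
plain 5-tuple for unfragmented, parseable packets are all hypothetical
and do not describe any particular router implementation.

```python
import zlib
from typing import Optional, Tuple

# (protocol, src_port, dst_port), or None when the router cannot parse
# the upper-layer header (case (b) in Shane's list).
Transport = Optional[Tuple[int, int, int]]


def select_hash_keys(src_ip: str, dst_ip: str, flow_label: int,
                     is_fragment: bool, transport: Transport) -> tuple:
    """Choose the input-keys for the LAG/ECMP hash."""
    # (a) fragments and (b) unparseable upper-layer protocols both fall
    # back to {src_ip, dst_ip, flow_label}, so every fragment of a
    # microflow yields the same keys.
    if is_fragment or transport is None:
        return (src_ip, dst_ip, flow_label)
    # Otherwise use the fine-grained 5-tuple (an assumption of this sketch).
    return (src_ip, dst_ip) + transport


def pick_link(keys: tuple, n_links: int) -> int:
    """Map the keys onto one of n_links member links (CRC-32 is a stand-in)."""
    return zlib.crc32(repr(keys).encode()) % n_links
```

With a zero Flow Label the fallback degenerates to the 2-tuple behaviour
Shane describes; with a non-zero, uniformly distributed Flow Label the
same fallback gives fine-grained balancing without reading the ports.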

> Obviously, (c) is going to be a little further out. 

I would apply the caveat from earlier to (c).  

Routers will still be looking beyond the IPv6 header *for ACL purposes*, 
even if those same routers use non-zero Flow Labels for ECMP/LAG, 
rather than using transport-layer information for ECMP/LAG.

> I would also point out a substantial additional advantage is [long-term]
> architectural flexibility in that the end-points (hosts) may freely use
> *new* transport protocols (SCTP, DCCP, UDP-lite, etc.) so long as they
> continue to label all packets with a "uniformly distributed",
> non-zero flow-label so that [Core] routers/switches have something
> they can safely use as input-keys for LAG and/or ECMP hash algorithms.  

With respect to ECMP/LAG narrowly, I agree with the above.

> At least, that's one part of the network that we don't need to worry
> about upgrading to support new transport-layer protocols.  Unfortunately,
> middleboxes (FW's or, more generally, "security GW's") might still have
> to be adapted depending on the applicability of the new transport-layer
> protocol to various network types, (e.g.: SOHO vs. Large-ish Enterprise).

Exactly so.   Router-based ACLs (which are widespread -- even in 
some deployed transit/backbone routers) will still need to support those 
new transport-layer protocols BEFORE those new transport-layer protocols 
will be practical to widely deploy.  

Yours,

Ran Atkinson


PS: For implementations prior to this current set of documents, and for 
    routers creating Flow Labels on the fly, a useful additional input 
    to an ECMP/LAG function would be the "SPI" value of an ESP or AH header. 
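
    A minimal sketch of pulling the SPI out of an ESP or AH header, in
    Python, for use as an extra hash input-key. The function name is
    hypothetical; the field offsets follow the ESP and AH header layouts
    (SPI is the first 32-bit word of ESP, and the second 32-bit word of AH).

```python
import struct
from typing import Optional

ESP, AH = 50, 51  # IP protocol / Next Header numbers


def extract_spi(protocol: int, header: bytes) -> Optional[int]:
    """Return the 32-bit SPI for ESP or AH, or None for anything else."""
    if protocol == ESP and len(header) >= 4:
        # ESP: SPI occupies octets 0..3.
        return struct.unpack_from("!I", header, 0)[0]
    if protocol == AH and len(header) >= 8:
        # AH: Next Header, Payload Len, Reserved (4 octets), then SPI.
        return struct.unpack_from("!I", header, 4)[0]
    return None
```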

--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
