On 22 Jun 2011, at 00:07, Shane Amante wrote:
> An example which routinely happens in today's networks would be link
> restoration.
> In that case, the network is restoring traffic from a much longer path to a
> shorter, more optimal path in the network. Depending on the transmission rate
> of the transmitter, this can/will lead to temporary reordering of microflows
> at the receivers. Do note that in well-operated networks this reordering is,
> hopefully, transient and of an extremely short duration.

Note that, similar to the above, having the first packet in a flow travel a
slightly different path than subsequent fragments due to ECMP/LAG differences
is also predictably "transient" and "of an extremely short duration" (to use
Shane's excellent phrasing).

> Namely, deployed IPv6 routers *already* [should] identify fragmented
> vs. non-fragmented packets, presumably by inspecting the "Next Header
> field of the last header of the Unfragmentable Part" for value 44
> [RFC 2460, Section 4.5] for 'Fragment Header' for the purposes of
> deciding whether /or not/ they should attempt to identify Next Headers
> containing the Upper Layer protocol and, subsequently, the
> {protocol, src_port & dst_port} that will be fed, (along with
> {src_ip, dst_ip}), into LAG and/or ECMP hash algorithms for
> fine-grained load-balancing.

Exactly so.

Side comment:
Because many users/operators require that routers implement wire-speed
Access Control Lists, and most of those ACLs include variables such as
transport-layer protocol and transport-layer port numbers, ASIC/FPGA-based
IPv6 routers will continue to parse into the IPv6 packet beyond the IPv6
base header *regardless* of whether that information is needed for
ECMP/LAG/load-balancing purposes.

I have heard credible reports of an ISP that apparently deploys hundreds
of ACL rules into the exterior border routers of their AS, and apparently
that ISP deploys a smaller but still non-trivial number into interior
routers.

Of course, this also means they select their largest routers in part upon
the routers' ability to support large numbers of ACLs at full wire-speed
and also the ability to look past the occasional IPv6 extension header to
examine the ICMP/TCP/UDP/etc. headers behind it.

> "Conservative" router/switch implementations strive to reduce the risk of
> _persistent_ reordering of an individual microflow.
> IOW, since non-first
> fragments will not contain Upper Layer protocol information, (specifically:
> {src_port, dst_port}), that can be fed as input-keys to LAG and/or ECMP hash
> algorithms, the "safe" thing they should do is to only use the 2-tuple of
> {src_ip, dst_ip} as input-keys for _all_ fragments within a microflow.
> Obviously, this leads to 'coarse-grained' load-balancing for microflows
> containing fragmented packets.

Yes, and an important word above is "persistent", to echo Shane's emphasis.

As Shane observes, this implementation approach is common today. I believe
some implementations today already include the Flow Label, simply because
it is another available differentiator (albeit most commonly zero just now),
not because of this current set of specifications.

> If this draft is widely implemented & deployed and originating hosts are
> encoding a "uniformly distributed", non-zero flow-label in all packets
> (fragmented or not), then it would seem logical that routers would be
> adapted so that:
>
> a) If they encounter a Fragment Header they use: {src_ip, dst_ip +
>    flow_label} as input-keys to the LAG + ECMP hash algorithms; and/or,
>
> b) If they encounter a Next Header with, for example, an Upper Layer Protocol
>    that they have *not* (yet?) implemented a parsing routine to extract
>    appropriate input-keys (or, can't, because it's too deep in the packet's
>    headers), then they revert back to using {src_ip, dst_ip + flow_label}
>    as input-keys to the LAG + ECMP hash algorithms[1]; and/or,
>
> c) [Assuming widespread use of the flow-label], they no longer even bother
>    looking at any Next Headers in all packets and _always_ use {src_ip,
>    dst_ip + flow_label} for input-keys to LAG + ECMP hash algorithms.
>
> Personally, I see (a) & (b) as being a short- to medium-term "wins"
> that could be safely implemented, by default, in the next-spin of NP,
> FPGA SW and ASIC HW, given the existence of this, hopefully soon, RFC.

Agreed.
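
As a rough illustration of options (a) and (b) quoted above, the key
selection might be sketched in Python as follows. This is only a sketch
under stated assumptions: the names PacketInfo, hash_keys, pick_link and
KNOWN_TRANSPORTS are hypothetical, the set of "parseable" transports is
arbitrary, and SHA-256 merely stands in for whatever hash real forwarding
hardware uses. It is not drawn from any actual router implementation.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

FRAGMENT_HEADER = 44        # Next Header value for the IPv6 Fragment Header (RFC 2460)
KNOWN_TRANSPORTS = {6, 17}  # assumption: this hypothetical router parses only TCP and UDP


@dataclass(frozen=True)
class PacketInfo:
    """Hypothetical summary of the fields a forwarding engine has extracted."""
    src_ip: str
    dst_ip: str
    flow_label: int                 # 20-bit Flow Label, 0 if the host did not set one
    next_header: int                # Next Header of the last unfragmentable header
    src_port: Optional[int] = None  # not available in non-first fragments
    dst_port: Optional[int] = None


def hash_keys(pkt: PacketInfo) -> tuple:
    """Choose the input keys for the LAG/ECMP hash."""
    if pkt.next_header == FRAGMENT_HEADER:
        # Option (a): every fragment, first or not, gets the same keys.
        return (pkt.src_ip, pkt.dst_ip, pkt.flow_label)
    if pkt.next_header not in KNOWN_TRANSPORTS:
        # Option (b): upper-layer protocol we cannot parse, so fall back.
        return (pkt.src_ip, pkt.dst_ip, pkt.flow_label)
    # Otherwise use the fine-grained 5-tuple.
    return (pkt.src_ip, pkt.dst_ip, pkt.next_header, pkt.src_port, pkt.dst_port)


def pick_link(pkt: PacketInfo, n_links: int) -> int:
    """Map a packet to one of n_links member links; same keys, same link."""
    digest = hashlib.sha256(repr(hash_keys(pkt)).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links
```

With a zero Flow Label the fallback keys degrade to the conservative
{src_ip, dst_ip} behaviour, so all fragments of a microflow still stay on
one link. Option (c) would amount to always returning the fallback keys.
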
In some cases, I think (a) and (b) are already deployed.

> Obviously, (c) is going to be a little further out.

I would apply the caveat from earlier to (c). Routers will still be
looking beyond the IPv6 header *for ACL purposes*, even if those same
routers use non-zero Flow Labels for ECMP/LAG, rather than using
transport-layer information for ECMP/LAG.

> I would also point out a substantial additional advantage is [long-term]
> architectural flexibility in that the end-points (hosts) may freely use
> *new* transport protocols (SCTP, DCCP, UDP-lite, etc.) so long as they
> continue to label all packets with a "uniformly distributed",
> non-zero flow-label so that [Core] routers/switches have something
> they can safely use as input-keys for LAG and/or ECMP hash algorithms.

With respect to ECMP/LAG narrowly, I agree with the above.

> At least, that's one part of the network that we don't need to worry
> about upgrading to support new transport-layer protocols. Unfortunately,
> middleboxes (FW's or, more generally, "security GW's") might still have
> to be adapted depending on the applicability of the new transport-layer
> protocol to various network types, (e.g.: SOHO vs. Large-ish Enterprise).

Exactly so. Router-based ACLs (which are widespread -- even in some
deployed transit/backbone routers) will still need to support those
new transport-layer protocols BEFORE those new transport-layer protocols
will be practical to widely deploy.

Yours,

Ran Atkinson

PS: For implementations prior to this current set of documents,
    and for routers creating Flow Labels on the fly, a useful
    additional input to an ECMP/LAG function would be the "SPI"
    value of an ESP or AH header.

--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------