Looking at the revised document, here are some additional comments.

> One lightweight approach to ECMP or LAG is this: if there are N
> equally good paths to choose from, then form a modulo(N) hash
> [RFC2991] from a consistent set of fields in each packet header that
> are certain to have the same values throughout the duration of a
> flow, and use the resulting output hash value to select a particular

Would be nice to have a term better than "consistent". The point is,
you want to use fields that stay constant for a given flow.

> distribution, due to the pseudo-random nature of ephemeral ports.
> Ephemeral port numbers are quite well distributed [Lee10] and will

Is "pseudo-random" right here? In fact, do we even need that last part
of the sentence?

> o The flow label in the outer packet SHOULD be set by the sending
>   TEP to a pseudo-random 20-bit value in accordance with [RFC3697]
>   or its replacement. The same flow label value MUST be used for

Don't like this pseudo-random requirement here. And, the TEP should be
setting the Flow Label in *exactly* the same way as 3697bis
recommends. Tunnels are no different...

> * Note that this rule is a recommendation, to permit individual
>   implementers to take an alternative approach if they wish to do
>   so. For example, a simpler solution than a pseudo-random value
>   might be adopted if it was known that the load balancer would
>
> Carpenter & Amante       Expires August 14, 2011            [Page 6]
> Internet-Draft    Flow Label for tunnel ECMP/LAG       February 2011
>
>   continue to provide uniform distribution of flows with it. Such
>   an alternative MUST conform to [RFC3697] or its replacement.

This is too wishy washy. It also suggests that the TEP setting the
Flow Label knows about the algorithm used by the load balancer. That
will rarely (never?) be the case and this document shouldn't suggest
this.

> the relevant flow label into the outer IPv6 header. A user flow
> could be identified by the ingress TEP most simply by its
> {destination, source} address 2-tuple (coarse) or by its 5-tuple
> {dest addr, source addr, protocol, dest port, source port} (fine).
> At present, ironically, there would be little advantage for IPv6
> packets in using the {dest addr, source addr, flow label} 3-tuple.

Ambiguous. Advantages compared to what? Also, the flow classification
should simply follow the recommendation in 3697bis, which says use the
5-tuple, or, at a minimum, the 3-tuple.

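To make the classification point concrete, here is a minimal sketch (Python, purely illustrative; it is not taken from the draft or from 3697bis). It shows one way a sending TEP could derive a 20-bit outer flow label statelessly from the inner packet's 5-tuple. The function name `tunnel_flow_label`, the field encoding, and the use of SHA-256 are all assumptions made for the example; any hash with reasonably uniform output would serve the same purpose.

```python
import hashlib
import ipaddress


def tunnel_flow_label(src, dst, proto, sport, dport):
    """Hypothetical helper: map an inner 5-tuple to a 20-bit flow label.

    Only fields that stay constant for the lifetime of a flow are hashed,
    so every packet of one user flow receives the same outer label.
    """
    key = (
        ipaddress.ip_address(dst).packed
        + ipaddress.ip_address(src).packed
        + bytes([proto])
        + dport.to_bytes(2, "big")
        + sport.to_bytes(2, "big")
    )
    digest = hashlib.sha256(key).digest()
    # Keep the low 20 bits: the IPv6 flow label field is 20 bits wide.
    label = int.from_bytes(digest[:4], "big") & 0xFFFFF
    # A zero label means "not labelled" in RFC 3697, so avoid emitting it
    # (assumption: remapping 0 to 1 is acceptable for this sketch).
    return label or 1


label = tunnel_flow_label("2001:db8::1", "2001:db8::2", 6, 49152, 80)
```

Note that nothing here depends on knowing how a downstream load balancer hashes; the TEP only has to be consistent per flow.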
> The choice of n-tuple is an implementation detail in the sending TEP.

No it's not. What may be a detail is the actual algorithm used. But
which fields to use should be a clear recommendation (e.g., taken from
3697bis).

> * This stateless method creates a small probability of two
>   different user flows hashing to the same flow label. Since RFC
>   3697 allows a source (the TEP in this case) to define any set of
>   packets that it wishes as a single flow, occasionally labeling two
>   user flows as a single flow through the tunnel is acceptable.

This should be fine. There is no problem with treating packets from 2
different flows the same way. The problem occurs if packets from
within one flow are treated differently.

> o At intermediate router(s) that perform load distribution, the hash
>   algorithm used to determine the outgoing component-link in an ECMP
>   and/or LAG toward the next-hop MUST minimally include the 3-tuple
>   {dest addr, source addr, flow label}. This applies whether the
>   traffic is tunneled traffic only, or a mixture of normal traffic
>   and tunneled traffic.

Be more clear: should be 5-tuple, next best is 3-tuple. Defer to
3697bis.

Thomas
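
For illustration only, the intermediate-router rule discussed above (hash at least the 3-tuple {dest addr, source addr, flow label}, then reduce modulo N) can be sketched as follows. The name `select_path` and the use of SHA-256 are assumptions for the example, not anything specified by the draft or by RFC 2991.

```python
import hashlib
import ipaddress


def select_path(src, dst, flow_label, n_paths):
    """Hypothetical helper: choose one of n_paths equal-cost links.

    Hashes the 3-tuple {dest addr, source addr, flow label} and reduces
    the result modulo N, so all packets carrying the same 3-tuple take
    the same link and stay in order.
    """
    key = (
        ipaddress.ip_address(dst).packed
        + ipaddress.ip_address(src).packed
        + flow_label.to_bytes(3, "big")  # 20-bit label fits in 3 bytes
    )
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return h % n_paths


link = select_path("2001:db8::a", "2001:db8::b", 0x12345, 4)
```

Two user flows that happen to share a label between the same pair of TEPs simply land on the same link, which is harmless; what matters is that packets within one flow never split across links.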