On 2011-04-06 05:45, Thomas Narten wrote:
> Looking at the revised document, here are some additional comments.
> 
>    One lightweight approach to ECMP or LAG is this: if there are N
>    equally good paths to choose from, then form a modulo(N) hash
>    [RFC2991] from a consistent set of fields in each packet header
>    that are certain to have the same values throughout the duration of
>    a flow, and use the resulting output hash value to select a
>    particular
> 
> would be nice to have a term better than "consistent". The point is,
> you want to use fields that stay constant for a given flow.

s/consistent/defined/

> 
>    distribution, due to the pseudo-random nature of ephemeral ports.
>    Ephemeral port numbers are quite well distributed [Lee10] and will
> 
> is "pseudo-random" right here? In fact, do we even need that last part
> of the sentence?

s/pseudo-random/variable/

> 
>    o The flow label in the outer packet SHOULD be set by the sending
>       TEP to a pseudo-random 20-bit value in accordance with [RFC3697]
>       or its replacement.  The same flow label value MUST be used for
> 
> Don't like this pseudo-random requirement here. And, the TEP should be
> setting the Flow Label in *exactly* the same way as 3697bis
> recommends. Tunnels are no different...

That's true in respect of 3697bis; 3697 was non-specific on this. Fixed by
updating the reference; now that the three drafts are bunched together, this
makes sense anyway.

> 
>       * Note that this rule is a recommendation, to permit individual
>          implementers to take an alternative approach if they wish to
>          do so.  For example, a simpler solution than a pseudo-random
>          value might be adopted if it was known that the load balancer
>          would
> 
>        Carpenter & Amante Expires August 14, 2011 [Page 6]
>        
>        Internet-Draft Flow Label for tunnel ECMP/LAG February 2011
> 
> 
>          continue to provide uniform distribution of flows with it.
>          Such an alternative MUST conform to [RFC3697] or its
>          replacement.
> 
> 
> This is too wishy washy. It also suggests that the TEP setting the
> Flow Label knows about the algorithm used by the load balancer. That
> will rarely (never?) be the case and this document shouldn't suggest
> this. 

Agree, the "For example..." sentence doesn't really add any value.

> 
>       the relevant flow label into the outer IPv6 header.  A user flow
>       could be identified by the ingress TEP most simply by its
>       {destination, source} address 2-tuple (coarse) or by its 5-tuple
>       {dest addr, source addr, protocol, dest port, source port} (fine).
>       At present, ironically, there would be little advantage for IPv6
>       packets in using the {dest addr, source addr, flow label} 3-tuple.
> 
> Ambiguous. Advantages compared to what?

s/advantage for IPv6 packets/point/

> 
> Also, the Flow classification should simply follow the recommendation
> in 3697bis, which says use  the 5 tuple, or, at a minimum, the 3
> tuple. The 
> 
>       The choice of n-tuple is an implementation detail in the sending
>       TEP.
> 
> No it's not. What may be a detail is the actual algorithm used. But
> which fields to use should be a clear recommendation (e.g., taken from
> 3697bis).

We maintain that whether it uses just the addresses (2-tuple) or up to
the whole 5-tuple is not something we can make a recommendation on. There
could be a major efficiency impact, depending on the design of the router
acting as TEP (and we might well be talking about line speeds of 10 Gbit/s
or more).

s/detail/choice/

However, re-reading the text made us realise that it doesn't flow quite
logically and that, now that we have clarified 3697bis somewhat, you are
correct that we can depend on it more. So the text has been shortened and
reorganised, and the above changes have been made.

> 
>       *  This stateless method creates a small probability of two
>          different user flows hashing to the same flow label.  Since RFC
>          3697 allows a source (the TEP in this case) to define any set
>          of packets that it wishes as a single flow, occasionally
>          labeling two user flows as a single flow through the tunnel is
>          acceptable.
> 
> This should be fine. There is no problem with treating packets from 2
> different flows the same way. The problem occurs if packets from
> within one flow are treated differently.

We agree.
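
For illustration only (this is not text from the draft), the stateless
method could be sketched roughly as below. The function name, the choice of
SHA-256 and the field encoding are all assumptions made for the sketch; a
real TEP would likely use a much cheaper hash in the forwarding path. The
point it shows is that every packet of one user flow yields the same 20-bit
label, while two different user flows may occasionally share one.

```python
import hashlib
import ipaddress

FLOW_LABEL_BITS = 20  # width of the IPv6 flow label field


def flow_label(src, dst, proto, sport, dport):
    """Hypothetical helper: derive a 20-bit label from a user flow's 5-tuple.

    No per-flow state is kept; the label depends only on the header fields.
    SHA-256 is an assumption chosen for clarity, not a recommendation.
    """
    key = b"".join([
        ipaddress.ip_address(src).packed,
        ipaddress.ip_address(dst).packed,
        proto.to_bytes(1, "big"),
        sport.to_bytes(2, "big"),
        dport.to_bytes(2, "big"),
    ])
    digest = hashlib.sha256(key).digest()
    # Keep only the low 20 bits so the value fits the flow label field.
    return int.from_bytes(digest[:4], "big") & ((1 << FLOW_LABEL_BITS) - 1)


# Same user flow, same label, for every packet of that flow.
label = flow_label("2001:db8::1", "2001:db8::2", 6, 49152, 80)
```

A coarser 2-tuple variant would simply leave the protocol and port fields
out of the key.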

> 
>    o  At intermediate router(s) that perform load distribution, the hash
>       algorithm used to determine the outgoing component-link in an ECMP
>       and/or LAG toward the next-hop MUST minimally include the 3-tuple
>       {dest addr, source addr, flow label}.  This applies whether the
>       traffic is tunneled traffic only, or a mixture of normal traffic
>       and tunneled traffic.
> 
> Be more clear: should be 5 tuple, next best is 3. Defer to 3697bis.

No, this is a different hash from the one in 3697bis - this is the actual
load balancing hash, which is not mentioned there. And again: it's an
implementation choice. Some vendors may actively prefer to limit it
to the 3-tuple and forget all about transport headers.

Added a MAY for components of the 5-tuple (which is slightly redundant
with the following sub-bullet).
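
To make the distinction concrete, here is a minimal sketch of the load
balancing hash at an intermediate router, as opposed to the flow label
computation at the TEP. It is an illustration under assumptions, not the
draft's algorithm: the function name, the "extra" parameter and the use of
SHA-256 are hypothetical. The 3-tuple is always hashed; anything else the
vendor chooses to include (for example transport ports) is optional input.

```python
import hashlib
import ipaddress


def select_link(dst, src, flow_label, n_links, extra=b""):
    """Hypothetical helper: pick one of n_links ECMP/LAG component links.

    The key always contains {dest addr, source addr, flow label}.
    'extra' stands in for any optional additional header fields.
    """
    key = (
        ipaddress.ip_address(dst).packed
        + ipaddress.ip_address(src).packed
        + flow_label.to_bytes(3, "big")  # 20-bit label fits in 3 bytes
        + extra
    )
    digest = hashlib.sha256(key).digest()
    # Modulo(N) hash: the same key always maps to the same link index.
    return int.from_bytes(digest[:8], "big") % n_links


# Tunnelled packets between the same two TEPs differ only in flow label,
# so the label is what lets them spread across the available links.
link = select_link("2001:db8::2", "2001:db8::1", 0x12345, n_links=4)
```

If the flow label were left out of the key, all traffic between one pair of
TEPs would hash to the same link.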

  Brian + Shane


--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
