-----Original Message-----
From: ipv6-boun...@ietf.org [mailto:ipv6-boun...@ietf.org] On Behalf Of Brian E Carpenter
Sent: Monday, June 20, 2011 6:21 PM
To: Jari Arkko
Cc: IETF IPv6 Mailing List; draft-ietf-6man-flow-3697...@tools.ietf.org
Subject: Re: AD review of draft-ietf-6man-flow-3697bis

> I think this recommendation is problematic. I agree that the first hop
> router should insert the flow label, but requiring it to do fragment
> reassembly in order to find the 5-tuple is a big burden, and I'm not
> sure it's even called for.

Yes, this is an embarrassingly good catch. I think we need to add some language
to the recommendation that source hosts SHOULD set their own flow labels,
pointing out that in a fragmented case this is the only way to do things
properly. Reverting to the 2-tuple is a serious penalty since a lot of
the entropy in headers is in one of the port numbers.

WEG> Given the fact that only end hosts can actually fragment IPv6 packets, and 
that's only when MTU less than 1280 is supported, I would expect fragmentation 
to be more the exception than the rule. I agree that for completeness it is 
worthwhile to discuss as Brian recommends, but I don't think it's a major 
concern, and it should be phrased that way in the draft.

> The RFC 2119 language above is fine. But I'd like to change the part
> about normal case being the 5-tuple. I think the normal case should be
> the 2-tuple under these circumstances. The source has access to the
> 5-tuple; a router is not guaranteed to have access to it.

As far as routers go, I think we have to say that an implementor has
to choose between a reassembly-based solution using the 5-tuple and
simply using the 2-tuple (maybe also the fragmentation ID - there is
some scope for ingenuity here).

WEG> I agree that this is a detail best left to implementers and operators, but 
I do not support an explicit recommendation to only do 2-tuple on fragmented 
packets. If the draft wants to put some language that it may need to be a 
consideration, I'm ok with that, but I do not want this to be a situation where 
the implementation that we get is only 2-tuple because of a perceived (and 
potentially overstated) scale risk for fragmented packet reassembly. I'd much 
rather router vendors implement 5-tuple, and then let operators decide how they 
want to manage this potential scale issue via configuration knobs. This also 
leaves it open for ingenuity as Brian says. Smart vendors could probably even 
use some of their existing rate-limiting machinery to do some amount of 
reassembly and then revert to 2-tuple if it exceeds a certain threshold in 
order to preserve box performance.
If I were to put this in normative language, I would say SHOULD use 5-tuple for 
fragments, and MAY revert to 2-tuple if reassembly at higher traffic levels 
becomes a concern in a given implementation.
However, to Brian's comment about fragmentation ID, is there any reason why the 
draft can't recommend (or even require) this as a possible improvement over 
straight 2-tuple in the revert case?
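
To make the revert case concrete, here is a minimal Python sketch of how a 
router might pick its hash inputs and derive a label. Everything in it is an 
assumption for illustration: the function names, the reassembly counters, and 
the use of SHA-256 are not taken from the draft. Note that hashing the 
fragment ID keeps the fragments of one packet together, but different packets 
of the same flow carry different IDs and so may get different labels.

```python
import hashlib
import ipaddress
from typing import Optional, Tuple

FLOW_LABEL_MASK = 0xFFFFF  # the IPv6 flow label field is 20 bits wide


def select_hash_inputs(is_fragment: bool, reassembly_in_use: int,
                       reassembly_limit: int) -> str:
    """Pick which fields to hash (hypothetical policy, not from the draft)."""
    # Unfragmented packets expose the ports directly; fragments need
    # reassembly state, so fall back once the (assumed) budget is used up.
    if not is_fragment or reassembly_in_use < reassembly_limit:
        return "5-tuple"
    return "2-tuple+fragment-id"


def flow_label(src: str, dst: str,
               transport: Optional[Tuple[int, int, int]] = None,
               fragment_id: Optional[int] = None) -> int:
    """Hash the available fields into a nonzero 20-bit flow label.

    transport is (protocol, source port, destination port) when known.
    fragment_id is the 32-bit Identification from the Fragment header.
    """
    key = ipaddress.IPv6Address(src).packed + ipaddress.IPv6Address(dst).packed
    if transport is not None:
        proto, sport, dport = transport
        key += bytes([proto]) + sport.to_bytes(2, "big") + dport.to_bytes(2, "big")
    elif fragment_id is not None:
        key += fragment_id.to_bytes(4, "big")
    label = int.from_bytes(hashlib.sha256(key).digest()[:3], "big") & FLOW_LABEL_MASK
    return label or 1  # zero means "no label set", so avoid emitting it
```

A caller would first ask select_hash_inputs() which mode applies, then pass 
either the transport tuple or the fragment ID to flow_label().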

> Are there situations
> where the multiple first hop routers are used from the same host?

WEG> I think that first-hop router in the context of this draft is more 
accurately defined as the first router in the path which is configured to 
re-mark zero-value flow labels, which probably makes "first-hop router" a poor 
choice of words here. Ingress router might be a better choice by itself, since 
that could more generically cover something like an ASBR as well.
Alternatively, you could make a distinction between first-hop router and 
ingress router in cases where that device is not the host, but is going to be 
encapsulating the data (tunnels, for example) in such a way that subsequent 
routers cannot see the true 5-tuple data without looking well past the packet 
headers. In this case, we may want to strongly recommend that it sets the flow 
label on the outer packet based on data from the inner packet, since like the 
host, it has the best visibility into the underlying flows and this will be 
gone on subsequent hops.
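
As a rough illustration of the tunnel case, the Python sketch below hashes the 
inner packet's 5-tuple and places the result in the first 32-bit word of the 
outer IPv6 header (4-bit version, 8-bit traffic class, 20-bit flow label). The 
function names are hypothetical, and CRC-32 is only a stand-in for whatever 
hash an implementation would actually choose.

```python
import struct
import zlib


def inner_flow_hash(src: bytes, dst: bytes, proto: int,
                    sport: int, dport: int) -> int:
    """Map the inner 5-tuple to a nonzero 20-bit value (CRC-32 as a stand-in)."""
    key = src + dst + struct.pack("!BHH", proto, sport, dport)
    return (zlib.crc32(key) & 0xFFFFF) or 1


def outer_first_word(flow_label: int, traffic_class: int = 0) -> bytes:
    """Build the first 4 bytes of the outer IPv6 header."""
    word = (6 << 28) | ((traffic_class & 0xFF) << 20) | (flow_label & 0xFFFFF)
    return struct.pack("!I", word)
```

For example, outer_first_word(0x12345) yields the bytes 60 01 23 45, so later 
routers can balance on the outer label without parsing the inner packet.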

Because this is strictly optional and will be incrementally deployed, there is 
no way to assume that the first-hop router is anywhere near the originating 
host, and I think it's overreaching to discuss such specific cases as above. 
We've already said that the host should be the one doing this because it has 
the best knowledge of its flows. If it doesn't, it's best effort based on what 
the network can infer from the available data. I'm not sure how the case of one 
host connected to multiple first-hop routers is different - it sends packets to 
all of them, either due to ECMP or other bundling or some other algorithm, but 
unless it's doing something relatively strange like per-packet round robin, 
each of the first-hop routers should be able to use the information it has to 
determine flows on the traffic it sees and mark accordingly. At most, this 
might be a good example for why per-packet distribution without explicitly 
setting the flow label is a bad idea.


> The document should provide some guidance about operational conditions where
> the recommendations for the first hop router can be applied. The
> document should state how such functionality is turned on (per
> configuration? automatically?) and provide assurances that problematic
> conditions can be avoided.

I'd like to hear from my co-authors and the WG before producing a new version.

WEG> I think I need some additional clarity on what you mean by the above 
before I can give much in the way of useful feedback. I'm not sure that much 
guidance on operational conditions is necessary, since the introduction already 
explains the types of networks that would see benefit to enhanced 
load-balancing, as well as the reasons why FL is a good addition to the 
existing data used for LB decisions. If your network meets those tests, and 
your router supports the feature at scale, enable it. Unless folks can come up 
with specific cases where enabling FL marking is explicitly NOT recommended...
Regarding avoiding problematic conditions, I can think of the following:
- Inconsistent balancing due to incorrect identification of flows or 
  collision of flow label values
- Out-of-order packet delivery due to the above
- Scaling considerations with managing reassembly of fragments for 5-tuple 
  flow marking
Most of these are either going to be discussed based on this review, or are 
side effects of the fact that this is a best-effort, incrementally deployed and 
optional implementation. What other problematic conditions are there that we 
need to avoid?

I think defaults are probably an implementation detail, but if the draft has to 
make a recommendation, I'd say host-level defaults to on, router-level defaults 
to off, and have enabling rewriting zero-value flow labels be a separate 
configuration item from paying attention to existing nonzero flow label values 
for load-balancing. Whether it's a global knob or a per-interface one is 
probably getting too far into the weeds, but I'd say that at least for the 
rewriting, it might make sense to have the granularity to enable/disable on a 
per-interface basis.
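
The defaults suggested above could be modeled roughly as follows. This is a 
sketch only: the class, field, and interface names are hypothetical, and the 
default for the load-balancing knob is my assumption, since the text above 
only says it should be separate from the rewriting knob.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FlowLabelConfig:
    # Global default for setting (host) or rewriting (router) zero-value labels.
    set_zero_labels: bool
    # Separate knob: use existing nonzero labels as a load-balancing input.
    # Defaulting this to off is an assumption, not something stated above.
    use_labels_for_lb: bool = False
    # Optional per-interface overrides of set_zero_labels.
    per_interface: Dict[str, bool] = field(default_factory=dict)

    def sets_labels_on(self, interface: str) -> bool:
        """Return the effective rewriting setting for one interface."""
        return self.per_interface.get(interface, self.set_zero_labels)


def host_defaults() -> FlowLabelConfig:
    return FlowLabelConfig(set_zero_labels=True)   # host-level defaults to on


def router_defaults() -> FlowLabelConfig:
    return FlowLabelConfig(set_zero_labels=False)  # router-level defaults to off
```

An operator could then enable rewriting on a single ingress interface with 
something like cfg.per_interface["ge-0/0/1"] = True while leaving the global 
default off.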

Wes George

--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
