I want to emphasize one aspect of what Wes said.
We see ECMP (and layer two link aggregates) used almost everywhere.
They are used for many reasons. For example, in some network designs I
have seen, there are at least two links between every pair of devices.
Those designs also usually include at least two L3 (ECMP) paths.
This is NOT a niche application.
Conversely, as far as I can tell, all of the other uses for the field
appear to be niche applications, with limited utility.
Yours,
Joel
George, Wes E IV [NTK] wrote:
-----Original Message-----
From: Mark Smith [mailto:i...@69706e6720323030352d30312d31340a.nosense.org]
Sent: Saturday, June 19, 2010 3:28 AM
To: George, Wes E IV [NTK]
Cc: Brian E Carpenter; 6man
Subject: Re: [Fwd: I-D Action:draft-carpenter-6man-flow-update-03.txt]
Sorry for the (very) late response, finally remembered to reply
Does there have to be a single use, or more specifically, a single
specific use?
[[WEG]] well I don't think that there has to be a single use, but I do think
that the requirement for immutability dramatically limits our options, and for
the reasons I detailed originally, I think that particular thing should go
away. I'm not saying that there's no possibility for other uses, just that
those other uses should not assume immutability.
It seems to me that one of the reasons why there have been quite a variety
of proposals on its use is that it has some attractive properties
that no other header field or extension header has, i.e.:
- it's always in the IPv6 header, rather than optional like extension
headers
- all IPv6 implementations in at least the past 10 years support it
- it is a constant and fixed size, avoiding the complexity and
performance impact of dealing with a variable length field
- it's mutable by the network
- being 20 bits in size, it can encode over a million (2^20) distinct values
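As a minimal illustration of the fixed-position point above: the flow label occupies the low 20 bits of the first 32-bit word of the IPv6 header (after the 4-bit version and 8-bit traffic class), so extracting it is a single mask. This is a sketch, not any particular implementation; the function name and example value are made up.

```python
def ipv6_flow_label(first_word: int) -> int:
    """Extract the 20-bit flow label from the first 32-bit word of an
    IPv6 header (version: 4 bits | traffic class: 8 bits | flow label: 20 bits)."""
    return first_word & 0x000FFFFF

# Example: version 6, traffic class 0, flow label 0xABCDE
word = (6 << 28) | (0 << 20) | 0xABCDE
label = ipv6_flow_label(word)
```

Because the field is always at the same offset and the same size, a forwarding engine never has to walk a chain of extension headers to find it.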
[[WEG]] All good points, and I think other uses can coexist, should they ever
get enough support to be widely implemented. I'm simply saying that I support
flow identification as the primary implementation, and that I support changing
the restrictions to make this as effective/efficient as possible, vs waiting on
unspecified and otherwise TBD other proposals that don't, in my mind, have
nearly as compelling a use case/requirement to use this specific field, nor
much in the way of support and wide-scale implementation to require us to
consider backward compatibility a major requirement.
One concern I have about changing the above properties to suit the
ECMP traffic load balancing use case is that I think ECMP is a fairly
niche use. It seems to me that only the largest of networks need to use
ECMP because individual link speeds such as 40Gbps aren't large enough
for them.
[[WEG]] response to this below
All other networks running IPv6 won't gain any benefit from
this use case, yet they're always paying the price of the flow field
because it is in each IPv6 packet.
[[WEG]] since the flow field is always in the packet whether it's all 0s, all 1s, or
contains useful data, I think the "paying the price for not using it" is a red
herring at best. And since most networks running IPv6 do eventually need their packets to
traverse the big ISP networks, they benefit from load balancing working properly on said
networks, even if they benefit indirectly because their TCP doesn't throttle to
compensate for a big UDP flow that is load balancing poorly and monopolizing one of the
pipes, etc.
I think with standardisation of 100Gbps link speeds, and talk of 1Tbps link
speeds in the relatively near future, the usefulness of ECMP traffic load
balancing for the largest of networks will also be reduced. I think making
the flow field more widely useful than this specific case would be better.
[[WEG]] Ok, first, how do you propose "making the flow field more widely useful"? Do you
have something specific in mind, or is this more a case of, "let's wait and see what people
come up with in the future"?
Second, I fundamentally disagree with the notion that ECMP is a niche use
limited to ISPs that need bigger pipes than whatever is currently the largest
available and will go away as soon as we have bigger pipes. There are many,
many enterprises that are using link bundling to make bigger than 10G pipes out
of cheap 10GEs (or even bundling GE ports together because that's what they
have on the boxes they're using), especially when they have datacenters full of
servers that can now hand off 10GE themselves. At least on the routers I've
worked with, they often use the same hash-based layer 3 flow determination to
do load balancing between the members of the bundle even if the bundling itself
is at layer 2.
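The hash-based flow determination described above can be sketched roughly as follows. This is an illustrative model only: `zlib.crc32` stands in for whatever hash a given router actually uses, and the function name and inputs are made up for the example.

```python
import zlib

def bundle_member(src: str, dst: str, flow_label: int, n_members: int) -> int:
    """Pick a link-bundle member by hashing L3 flow identifiers.

    All packets of one flow hash to the same member, which keeps the
    flow's packets in order while spreading distinct flows across links."""
    key = f"{src}|{dst}|{flow_label:05x}".encode()
    return zlib.crc32(key) % n_members
```

The design point is determinism: a given flow always lands on the same member, so balancing is only as good as the diversity of flows the hash inputs can distinguish.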
There are also implementations that carry traffic encapsulated, which obscures
the src/dst that the infrastructure would use for load balancing, even within
the enterprise space. That aside, if you have multiple servers that are capable of
generating fractions of 10G flows by themselves, you're still open to some risk
if you try to balance that across 10G pipes. It's not reasonable to assume that
just because there are multiple devices involved that simple src/dst hashing
will work reliably in all cases. Works ok if it's a web server and thousands of
unique requests are hitting it. More or less completely breaks if it's doing
database replication and it might talk to 2 or 3 other hosts at most, or if a
lot of the traffic is UDP and doesn't throttle, etc.
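The failure mode Wes describes is easy to simulate. In this sketch `crc32` is again just a stand-in hash and the host names are invented: with thousands of distinct clients the hash has plenty of inputs to spread, but a database talking to only three peers can never use more than three of four links, no matter how good the hash is.

```python
import zlib
from collections import Counter

def link_for(src: str, dst: str, n_links: int = 4) -> int:
    # Simple src/dst hash, standing in for a router's ECMP hash
    return zlib.crc32(f"{src}->{dst}".encode()) % n_links

# Web-server case: many distinct clients give the hash many inputs to spread
web = Counter(link_for(f"client{i}", "www") for i in range(1000))

# Replication case: one database talking to only three peers --
# at most three of the four links can ever carry any of this traffic
repl = Counter(link_for("db1", f"db{i}") for i in range(2, 5))
```

If one of those few replication flows is a large non-throttling UDP stream, whichever single link it hashes onto carries all of it.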
I used 10G as an example above because that's what is common today. But you can
just as easily substitute 40GE, 100GE, etc into the above and it still holds
true. The reality is that no matter what size the biggest pipes are, there is
always going to be a requirement to aggregate traffic from multiple sources
into things bigger than those pipes, and it's not limited to the largest ISPs.
There is certainly a delay between the latest and greatest (40G, 100G, 1T)
being available on routers and people building servers that can singlehandedly
fill them, but it's only a delay. Further, there is a cost associated with
early adoption that makes it overly optimistic to assume that just because 40GE
or 100GE exists, that people will immediately move away from the lower speeds
in order to replace their bundles with bigger pipes for all of their
aggregation needs. It's not just about buying new ports (and fabric) on
routers. It requires a significant investment in Metro and long-haul
DWDM to support the higher speeds on the WAN side, and there may be distance or
performance tradeoffs to consider.
Wes George
--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------