On 2010-06-22 06:41, Joel M. Halpern wrote:
> I want to emphasize one aspect of what Wes said.
> We see ECMP (and layer two link aggregates) used almost everywhere. They
> are used for many reasons.  For example, in some network designs I have
> seen there are at least two links between every pair of devices. Those
> designs also usually include at least 2 L3 level (ECMP) paths.
> This is NOT a niche application.
> 
> Conversely, as far as I can tell, all of the other uses for the field
> appear to be niche applications, with limited utility.

This is indeed why the draft aims to square the circle by allowing
local semantics but recommending ECMP/LAG compatible semantics
as the default.
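To make that concrete, here is a rough sketch of the kind of stateless,
ECMP/LAG-friendly label a source host could set. This is purely
illustrative; neither the draft nor the flow label spec mandates any
particular hash function, and the function and names below are my own:

```python
import hashlib

def flow_label(src, dst, proto, sport, dport):
    """Derive a 20-bit flow label as a stateless hash of the transport
    5-tuple. Illustrative only: any function giving a uniform 20-bit
    value per transport flow would serve the ECMP/LAG use case."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha1(key).digest()
    # Keep the low 20 bits to fit the IPv6 Flow Label field.
    return int.from_bytes(digest[:3], "big") & 0xFFFFF

label = flow_label("2001:db8::1", "2001:db8::2", 6, 49152, 80)
assert 0 <= label < 2**20
```

A router can then balance on {source, destination, flow label} without
parsing past the fixed header, which is the point of recommending this
style of semantics as the default.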

    Brian

> 
> Yours,
> Joel
> 
> George, Wes E IV [NTK] wrote:
>> -----Original Message-----
>> From: Mark Smith
>> [mailto:i...@69706e6720323030352d30312d31340a.nosense.org]
>> Sent: Saturday, June 19, 2010 3:28 AM
>> To: George, Wes E IV [NTK]
>> Cc: Brian E Carpenter; 6man
>> Subject: Re: [Fwd: I-D Action:draft-carpenter-6man-flow-update-03.txt]
>>
>> Sorry for the (very) late response, finally remembered to reply
>>
>> Does there have to be a single use, or more specifically, a single
>> specific use?
>> [[WEG]] well I don't think that there has to be a single use, but I do
>> think that the requirement for immutability dramatically limits our
>> options, and for the reasons I detailed originally, I think that
>> particular thing should go away. I'm not saying that there's no
>> possibility for other uses, just that those other uses should not
>> assume immutability.
>>
>>
>> It seems to me that one of the reasons there has been quite a variety
>> of proposals on its use is that it has some attractive properties
>> that no other header field or extension header has, i.e.:
>>
>> - it's always in the IPv6 header, rather than optional like extension
>>   headers
>>
>> - all IPv6 implementations in at least the past 10 years support it
>>
>> - it is a constant and fixed size, avoiding the complexity and
>>   performance impact of dealing with a variable length field
>>
>> - it's mutable by the network
>>
>> - being 20 bits in size, it can encode over a million distinct values
>>
>> [[WEG]] All good points, and I think other uses can coexist, should
>> they ever get enough support to be widely implemented. I'm simply
>> saying that I support flow identification as the primary
>> implementation, and that I support changing the restrictions to make
>> this as effective/efficient as possible, vs waiting on unspecified and
>> otherwise TBD other proposals that don't, in my mind, have nearly as
>> compelling a use case/requirement for this specific field, nor
>> enough support and wide-scale implementation to make backward
>> compatibility a major requirement.
>>
>>
>> One concern I have about changing the above properties to suit the
>> ECMP traffic load balancing use case is that I think ECMP is a fairly
>> niche use. It seems to me that only the largest of networks need to use
>> ECMP because individual link speeds such as 40Gbps aren't large enough
>> for them.
>> [[WEG]] response to this below
>>
>> All other networks running IPv6 won't gain any benefit from
>> this use case, yet they're always paying the price of the flow field
>> because it is in each IPv6 packet.
>> [[WEG]] since the flow field is always in the packet whether it's all
>> 0s, all 1s, or contains useful data, I think the "paying the price for
>> not using it" is a red herring at best. And since most networks
>> running IPv6 do eventually need their packets to traverse the big ISP
>> networks, they benefit from load balancing working properly on said
>> networks, even if they benefit indirectly because their TCP doesn't
>> throttle to compensate for a big UDP flow that is load balancing
>> poorly and monopolizing one of the pipes, etc.
>>
>>
>> I think with standardisation of 100Gbps link speeds, and talk of 1Tbps
>> link speeds in the relatively near future, the usefulness of ECMP
>> traffic load balancing for the largest networks will also be reduced.
>> I think making the flow field more
>> widely useful than this specific case would be better.
>> [[WEG]] Ok, first, how do you propose "making the flow field more
>> widely useful"? Do you have something specific in mind, or is this
>> more a case of, "let's wait and see what people come up with in the
>> future"?
>>
>> Second, I fundamentally disagree with the notion that ECMP is a niche
>> use limited to ISPs that need bigger pipes than whatever is currently
>> the largest available and will go away as soon as we have bigger
>> pipes. There are many, many enterprises that are using link bundling
>> to make bigger than 10G pipes out of cheap 10GEs (or even bundling GE
>> ports together because that's what they have on the boxes they're
>> using), especially when they have datacenters full of servers that can
>> now hand off 10GE themselves. At least on the routers I've worked
>> with, they often use the same hash-based layer 3 flow determination to
>> do load balancing between the members of the bundle even if the
>> bundling itself is at layer 2.
>> There are also implementations that carry traffic encapsulated and
>> obscure the src/dst for the infrastructure to use for load balancing,
>> even within the enterprise space. That aside, if you have multiple
>> servers that are capable of generating fractions of 10G flows by
>> themselves, you're still open to some risk if you try to balance that
>> across 10G pipes. It's not reasonable to assume that just because
>> there are multiple devices involved that simple src/dst hashing will
>> work reliably in all cases. Works ok if it's a web server and
>> thousands of unique requests are hitting it. More or less completely
>> breaks if it's doing database replication and it might talk to 2 or 3
>> other hosts at most, or if a lot of the traffic is UDP and doesn't
>> throttle, etc.
>>
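(Interjecting here: the replication scenario Wes describes is easy to
demonstrate with a toy hash. The function below is a stand-in of my own,
not any vendor's actual algorithm, but any deterministic src/dst hash
behaves the same way: three flows can land on at most three of four
links, while thousands of distinct clients spread across all of them.)

```python
import hashlib

def link_for(src, dst, n_links):
    # Toy stand-in for a router's src/dst hash; hypothetical, not any
    # vendor's algorithm. Deterministic per src/dst pair.
    h = hashlib.md5(f"{src}->{dst}".encode()).digest()
    return int.from_bytes(h[:4], "big") % n_links

n_links = 4

# Database replication: one server talking to only three peers, so at
# most three of the four links can ever carry traffic.
peers = [("db0", f"db{i}") for i in (1, 2, 3)]
print([link_for(s, d, n_links) for s, d in peers])

# Web server: thousands of distinct clients, so the flows spread out.
clients = [(f"client{i}", "www") for i in range(5000)]
print(sorted({link_for(s, d, n_links) for s, d in clients}))
```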
>> I used 10G as an example above because that's what is common today.
>> But you can just as easily substitute 40GE, 100GE, etc into the above
>> and it still holds true. The reality is that no matter what size the
>> biggest pipes are, there is always going to be a requirement to
>> aggregate traffic from multiple sources into things bigger than those
>> pipes, and it's not limited to the largest ISPs. There is certainly a
>> delay between the latest and greatest (40G, 100G, 1T) being available
>> on routers and people building servers that can singlehandedly fill
>> them, but it's only a delay. Further, there is a cost associated with
>> early adoption that makes it overly optimistic to assume that just
>> because 40GE or 100GE exists, people will immediately move away
>> from the lower speeds in order to replace their bundles with bigger
>> pipes for all of their aggregation needs. It's not just about buying
>> new ports (and fabric) on routers. It requires a significant
>> investment in Metro and long-haul DWDM to support the higher speeds
>> on the WAN side, and there may be distance or performance tradeoffs
>> to consider.
>>
>> Wes George
>>
>> --------------------------------------------------------------------
>> IETF IPv6 working group mailing list
>> ipv6@ietf.org
>> Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
>> --------------------------------------------------------------------
>>