Re: Comcast Network Peer Survey on DSCP/ECN for L4S

2022-06-13 Thread Jared Mauch
This seems to be missing some of the reasons why things are remarked. Perhaps 
it would be wise to bring some of the people interested in this to the various 
vendor-specific lists or such?

For example, for some hardware types, enabling any sort of rate shaping at all 
will rewrite the DSCP values, even for packets that do not traverse the shaped 
interfaces.

- Jared

> On Jun 10, 2022, at 9:31 AM, Livingood, Jason via NANOG wrote:
> 
> Hi – Comcast is working on the implementation of ultra-low latency 
> networking, leveraging the IETF’s upcoming L4S standard. This standard will 
> require passing ECN and DSCP markings across network boundaries. As a result, 
> we are interested in your perspective on this and in how you handle markings 
> today. We have a short survey that should only take a few minutes to 
> complete. Take the survey at https://forms.office.com/r/vGb0LUXfS1
>  
> While any network operator is welcome to take this, we are particularly 
> interested in any networks that are directly interconnected with us today.
>  
> Thank you!
> Jason Livingood
> Comcast – Technology Policy & Standards
> jason_living...@comcast.com
>  
> PS – Apologies if any of you get a duplicate of this request via other 
> channels.



Re: Comcast Network Peer Survey on DSCP/ECN for L4S

2022-06-10 Thread Dave Taht
I would argue that question 9 needs an option of "Both".

Secondly, two additional good questions to ask would be: are the ECN
values presently being treated as RFC3168?

Are the ECN values being modified by any AQM implementations (WRED,
FQ_CODEL, etc) on any switch or router in transit?


Comcast Network Peer Survey on DSCP/ECN for L4S

2022-06-10 Thread Livingood, Jason via NANOG
Hi – Comcast is working on the implementation of ultra-low latency networking, 
leveraging the IETF’s upcoming L4S standard. This standard will require passing 
ECN and DSCP markings across network boundaries. As a result, we are interested 
in your perspective on this and in how you handle markings today. We have a 
short survey that should only take a few minutes to complete. Take the survey 
at https://forms.office.com/r/vGb0LUXfS1

While any network operator is welcome to take this, we are particularly 
interested in any networks that are directly interconnected with us today.

Thank you!
Jason Livingood
Comcast – Technology Policy & Standards
jason_living...@comcast.com



PS – Apologies if any of you get a duplicate of this request via other channels.


Re: TCP and anycast (was Re: ECN)

2019-11-16 Thread Scott Weeks



--- ra...@psg.com wrote:
lots of good research lit on catchment topology of anycasted 
dns, which is very non-local.
---


For the others here who didn't know what that is and are 
curious.  I couldn't take it and just had to know... :)

https://tools.ietf.org/html/rfc4786

Catchment:  in physical geography, an area drained by a river, also
  known as a drainage basin.  By analogy, as used in this document,
  the topological region of a network within which packets directed
  at an Anycast Address are routed to one particular node.

scott
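
As a rough illustration of the RFC 4786 definition quoted above, the sketch below assigns each node of a toy network to the anycast node it would reach. It assumes that "nearest" means fewest hops, which real BGP routing does not guarantee, and the topology and node names (A, B, r1, r2, c1, c2) are made up for the example.

```python
from collections import deque

# Hypothetical topology: anycast nodes A and B, routers r1 and r2, clients c1 and c2.
TOPOLOGY = {
    "A": ["r1"],
    "r1": ["A", "r2", "c1"],
    "r2": ["r1", "B", "c2"],
    "B": ["r2"],
    "c1": ["r1"],
    "c2": ["r2"],
}


def catchments(graph, anycast_nodes):
    """Assign every node to the nearest anycast node by hop count (multi-source BFS)."""
    owner = {node: node for node in anycast_nodes}
    queue = deque(anycast_nodes)
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in owner:
                # First visit wins: this neighbour joins the catchment of `current`.
                owner[neighbour] = owner[current]
                queue.append(neighbour)
    return owner


owner = catchments(TOPOLOGY, ["A", "B"])
for node in sorted(owner):
    print(node, "is in the catchment of", owner[node])
```

In this toy network c1 lands in A's catchment and c2 in B's. In the real Internet the boundaries follow routing policy rather than hop count, which is why the catchment of an anycasted service can be very non-local.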


Re: TCP and anycast (was Re: ECN)

2019-11-14 Thread William Herrin
On Thu, Nov 14, 2019 at 1:10 AM Bill Woodcock wrote:
> > On Nov 14, 2019, at 7:39 AM, Anoop Ghanwani wrote:
> > RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls &
> > risks of using TCP with an anycast address.  It recognizes that there are
> > valid use cases for it, though.
> > Specifically, section 3.1 says this:
> >    Most stateful transport protocols (e.g., TCP), without modification,
> >    do not understand the properties of anycast; hence, they will fail
> >    probabilistically, but possibly catastrophically, when using anycast
> >    addresses in the presence of "normal" routing dynamics.
> >    This can lead to a protocol working fine in, say, a test lab but
> >    not in the global Internet.
> >
> > On Thu, Nov 14, 2019 at 12:25 AM Matt Corallo wrote:
> > > This sounds like a bug on Cloudflare’s end (cause trying to do
> > > anycast TCP is... out of spec to say the least),
>
> No. We have been doing anycast TCP for more than _thirty years_, most of
> that time on a global scale, without operational problems.

Hi Bill,

Not to put too fine a point on it, but Baldur and Toke's scenario in which
anycast TCP failed, the one which started this thread, should probably be
classed as an operational problem.

It is possible to build an anycast TCP that works. YOU have not done so.
And Cloudflare certainly has not.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: TCP and anycast (was Re: ECN)

2019-11-14 Thread Randy Bush
>>> RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls
>>> & risks of using TCP with an anycast address.
>>
>> and two decades of operational experience are that prudent deployments
>> just work.
> 
> I agree with Bill/Randy here... this does just work if you understand
> your local topology and manage change properly.

agree, but would extend ...

sometimes s/local//

i.e. casting from your edge dumps directly to peers, keeping it off
your backbone.  but the topo set you have to keep in mind can be
large.

lots of good research lit on catchment topology of anycasted dns,
which is very non-local.

randy


Re: TCP and anycast (was Re: ECN)

2019-11-14 Thread Christopher Morrow
On Fri, Nov 15, 2019 at 1:54 AM Randy Bush  wrote:
>
> > RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls
> > & risks of using TCP with an anycast address.
>
> and two decades of operational experience are that prudent deployments
> just work.

I agree with Bill/Randy here... this does just work if you understand
your local topology and manage change properly.


Re: TCP and anycast (was Re: ECN)

2019-11-14 Thread Randy Bush
> RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls
> & risks of using TCP with an anycast address.

and two decades of operational experience are that prudent deployments
just work.

randy


Re: ECN

2019-11-14 Thread Toke Høiland-Jørgensen via NANOG
Owen DeLong  writes:

> Like it or not (and I really don’t), the majority of modern CDNs are
> using TCP over Anycast.
>
> It’s ugly and it’s prone to problems like this. It’s nice to see a
> customer with know-how actually publicizing and digging into the
> problem.

Thanks. I do plan to write this whole story up as a blog post, BTW.
Apart from just being a nice "battle story" I also think it's important
to get more visibility into these kinds of issues. I've mostly been
interested in issues related to ECN in general, but its interaction with
anycast is certainly... interesting :)

> Until now, I believe an unknown number of customers have been
> suffering in silence or relegated to the ISP's “We can’t reproduce your
> problem” bin without resolution.
>
> I’ve had lots of discussions on the subject and the usual end result
> is “It’s too hard to measure or quantify and there’s no visible
> contingent of impacted users”.
>
> Now we at least have one visible impacted user.

As I said, happy to be an exponent if it can help others resolve these
kinds of problems.

Incidentally, in case you're not aware, there are currently two
competing schemes being discussed at the IETF to re-purpose the ECT(1)
code point in the IP header. One proposal[0] is to use it as an
additional high-fidelity congestion indicator, while the other[1] is to
use it as an identifier for a new type of traffic that should get
special treatment (which almost, but not quite, amounts to priority
queueing). So if either proposal gains traction, expect more ECN-marked
traffic coming to a network near you in the maybe-not-so-distant future;
with all the interesting issues that can bring with it.

If someone feels like introducing some operational considerations into
the IETF discussions, I do believe both drafts will be discussed at the
tsvwg working group meetings at the Singapore IETF next week.

-Toke

[0] https://datatracker.ietf.org/doc/draft-morton-tsvwg-sce/
[1] https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/


Re: ECN

2019-11-14 Thread Toke Høiland-Jørgensen via NANOG
Baldur Norddahl  writes:

> I am testing disabling our use of ECMP as it is not strictly necessary
> and we are moving to a new platform anyway. Waiting for feedback from
> the customer to hear if this fixes the issue.

Which I can confirm that it does. Thank you for the speedy resolution! :)

-Toke


Re: ECN

2019-11-14 Thread Saku Ytti
On Wed, 13 Nov 2019 at 22:57, Lukas Tribus  wrote:


> In fact I believe everything beyond the 5-tuple is just a bad idea to
> base your hash on. Here are some examples (not quite as straightforward
> as the TOS/ECN case here):

ACK.

> TTL:
> https://mailman.nanog.org/pipermail/nanog/2018-September/096871.html

> IPv6 flow label:
> https://blog.apnic.net/2018/01/11/ipv6-flow-label-misuse-hashing/
> https://pc.nanog.org/static/published/meetings/NANOG71/1531/20171003_Jaeggli_Lightning_Talk_Ipv6_v1.pdf
> https://www.youtube.com/watch?v=b0CRjOpnT7w

It is unfortunate that the IPv6 flow label is so poorly specified; had it been
specified clearly, it could have been very, very good for the Internet.
Crucially, the sender should be able to instruct transit HOW to hash: there
should be flags in the flow label, set by the sender, to indicate that the flow
label must be used for the hash exclusively, not at all, or inclusively with
whatever the host otherwise uses. This would give the sender control over
what is a discrete flow.

Something like this
https://ytti.github.io/flow-label/draft-ytti-v6ops-flow-label.html
would have been nice, but it is unclear whether it would be possible to deliver
after the fact.

-- 
  ++ytti
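
The three hashing modes described above (exclusive, not at all, inclusive) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `HashMode` names, the SHA-256 hash and the way the mode reaches the router are all hypothetical, and nothing here reflects the actual encoding in the linked draft.

```python
import hashlib
from enum import Enum


class HashMode(Enum):
    """Hypothetical sender-selected modes, named after the three cases in the message."""
    EXCLUSIVE = "flow label only"
    IGNORE = "flow label not used"
    INCLUSIVE = "flow label plus the usual fields"


def hash_fields(five_tuple: tuple, flow_label: int, mode: HashMode) -> tuple:
    """Return the fields a transit router would feed into its ECMP hash."""
    if mode is HashMode.EXCLUSIVE:
        return (flow_label,)
    if mode is HashMode.IGNORE:
        return five_tuple
    return five_tuple + (flow_label,)


def ecmp_bucket(fields: tuple, paths: int) -> int:
    """Map the selected fields to one of `paths` equal-cost paths."""
    digest = hashlib.sha256(repr(fields).encode()).digest()
    return int.from_bytes(digest[:4], "big") % paths


# Two sub-flows with different source ports but the same (made-up) flow label.
flow_a = ("2001:db8::1", "2001:db8::2", 6, 40000, 443)
flow_b = ("2001:db8::1", "2001:db8::2", 6, 40001, 443)
label = 0x12345

same_path = (
    ecmp_bucket(hash_fields(flow_a, label, HashMode.EXCLUSIVE), 4)
    == ecmp_bucket(hash_fields(flow_b, label, HashMode.EXCLUSIVE), 4)
)
print("EXCLUSIVE keeps both sub-flows on one path:", same_path)
```

With the exclusive mode the sender decides what counts as one flow: anything sharing a label stays on one path, regardless of ports.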


Re: TCP and anycast (was Re: ECN)

2019-11-14 Thread Bill Woodcock



> On Nov 14, 2019, at 7:39 AM, Anoop Ghanwani wrote:
> RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls & risks 
> of using TCP with an anycast address.  It recognizes that there are valid use 
> cases for it, though.
> Specifically, section 3.1 says this:
>    Most stateful transport protocols (e.g., TCP), without modification, do
>    not understand the properties of anycast; hence, they will fail
>    probabilistically, but possibly catastrophically, when using anycast
>    addresses in the presence of "normal" routing dynamics.
>    This can lead to a protocol working fine in, say, a test lab but not in
>    the global Internet.
> 
> On Thu, Nov 14, 2019 at 12:25 AM Matt Corallo wrote:
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> > is... out of spec to say the least),

No. We have been doing anycast TCP for more than _thirty years_, most of that 
time on a global scale, without operational problems.

There were people who seemed gray-bearded at the time, who were scared of 
anycast because it used IP addresses _non uniquely_ and that wasn’t how they’d 
intended them to be used, and these kids these days, etc.  What you’re seeing 
is residuum of their pronouncements on the matter, carrying over from the 
mid-1990s.

It’s very true that anycast can be misused and abused in a myriad of ways, 
leading to unexpected or unpleasant results, but no more so than other routing 
techniques.  We and others have published on many or most of the potential 
issues and their solutions over the years.  That RFC has never actually been a 
comprehensive source of information on the topic, and it contains a lot of 
scare-mongering. 

-Bill




Re: ECN

2019-11-13 Thread Tore Anderson
* Saku Ytti

> Not true. The hash result should identify a discrete flow; more importantly,
> a discrete flow should not produce two different hash values. Using the
> whole TOS byte breaks this promise and thus breaks ECMP.
> 
> Platforms allow you to configure which bytes are part of the hash
> calculation. The whole TOS byte should not be used, as a discrete flow can
> carry different ECN bits during congestion. Toke has diagnosed the problem
> correctly; the solution is to remove TOS from the ECMP hash calculation.

Agreed. This also goes for the other bits, so whole byte must be excluded.

For example, the OpenSSH client will by default change the code point from zero 
(during authentication) to af21/cs1 (when it enters an 
interactive/non-interactive session).

I have experienced this break IPv6 SSH sessions to an anycasted SSH server 
instance that was reached through old Juniper DPC cards with ECMP enabled. 
Symptom was that authentication went fine, only for the connection to be reset 
immediately after (unless default IPQoS config was changed). The «solution» was 
to simply disable ECMP for all IPv6 traffic, since I could not figure out how 
to make the Juniper exclude the DiffServ byte from the ECMP hash calculation.

Tore
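
To see why a DSCP change alone is enough to move a flow when the whole byte is hashed, here is a small sketch of how the DiffServ byte is put together. The DSCP numbers are the standard values for CS1 and AF21; the helper names `tos_byte` and `split_tos` are made up for this example.

```python
from typing import Tuple

# Standard DSCP values for the code points mentioned in the message above.
DSCP = {"default": 0, "cs1": 8, "af21": 18}


def tos_byte(dscp: int, ecn: int = 0) -> int:
    """Build the 8-bit DiffServ byte: DSCP in the upper 6 bits, ECN in the lower 2."""
    if not 0 <= dscp < 64 or not 0 <= ecn < 4:
        raise ValueError("dscp must fit in 6 bits and ecn in 2 bits")
    return (dscp << 2) | ecn


def split_tos(tos: int) -> Tuple[int, int]:
    """Split the DiffServ byte back into (dscp, ecn)."""
    return tos >> 2, tos & 0b11


for name, value in DSCP.items():
    print(f"{name:8s} dscp={value:2d} byte=0x{tos_byte(value):02x}")
```

A session that starts at the default code point and then switches to AF21 changes this byte from 0x00 to 0x48 mid-connection, so a hash that covers the byte can pick a different path even though the 5-tuple never changed.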


TCP and anycast (was Re: ECN)

2019-11-13 Thread Anoop Ghanwani
RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls &
risks of using TCP with an anycast address.  It recognizes that there are
valid use cases for it, though.

Specifically, section 3.1 says this:
>>>

   Most stateful transport protocols (e.g., TCP), without modification,
   do not understand the properties of anycast; hence, they will fail
   probabilistically, but possibly catastrophically, when using anycast
   addresses in the presence of "normal" routing dynamics.

...

   This can lead
   to a protocol working fine in, say, a test lab but not in the global
   Internet.

>>>

On Wed, Nov 13, 2019 at 3:33 PM Warren Kumari  wrote:

> On Thu, Nov 14, 2019 at 12:25 AM Matt Corallo  wrote:
> >
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>
> Err. I really don't think that there is any sort of spec that
> covers that :-P
>
> Using Anycast for TCP is incredibly common - the DNS root servers for
> one obvious example.
> More TCP centric well-known examples are Fastly and LinkedIn -
> LinkedIn in particular did a really good podcast on their experience
> with this.
>
> There is also a good NANOG talk from the ~2000s (?) on people using
> TCP anycast for long lived (serving ISO files, which were long-lived
> in those days) flows, and how reliable it is - perhaps that's the talk
> Todd mentioned?
>
> W

Re: ECN

2019-11-13 Thread Warren Kumari
On Thu, Nov 14, 2019 at 12:25 AM Matt Corallo  wrote:
>
> This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> is... out of spec to say the least), not a bug in ECN/ECMP.

Err. I really don't think that there is any sort of spec that
covers that :-P

Using Anycast for TCP is incredibly common - the DNS root servers for
one obvious example.
More TCP centric well-known examples are Fastly and LinkedIn -
LinkedIn in particular did a really good podcast on their experience
with this.

There is also a good NANOG talk from the ~2000s (?) on people using
TCP anycast for long-lived flows (serving ISO files, which were long-lived
in those days), and how reliable it is - perhaps that's the talk
Todd mentioned?

W

>
> > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG 
> >  wrote:
> >
> > 
> >>
> >> Hello
> >>
> >> I have a customer that believes my network has a ECN problem. We do
> >> not, we just move packets. But how do I prove it?
> >>
> >> Is there a tool that checks for ECN trouble? Ideally something I could
> >> run on the NLNOG Ring network.
> >>
> >> I believe it likely that it is the destination that has the problem.
> >
> > Hi Baldur
> >
> > I believe I may be that customer :)
> >
> > First of all, thank you for looking into the issue! We've been having
> > great fun over on the ecn-sane mailing list trying to figure out what's
> > going on. I'll summarise below, but see this thread for the discussion
> > and debugging details:
> > https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> >
> > The short version is that the problem appears to come from a combination
> > of the ECMP routing in your network, and Cloudflare's heavy use of
> > anycast. Specifically, a router in your network appears to be doing ECMP
> > by hashing on the packet header, *including the ECN bits*. This breaks
> > TCP connections with ECN because the TCP SYN (with no ECN bits set) ends
> > up taking a different path than the rest of the flow (which is marked as
> > ECT(0)). When the destination is anycasted, this means that the data
> > packets go to a different server than the SYN did. This second server
> > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > router's ECMP hash.
> >
> > For a longer exposition, see below. You should be able to verify this
> > from somewhere else in the network, but if there's anything else you
> > want me to test, do let me know. Also, would you mind sharing the router
> > make and model that does this? We're trying to collect real-world
> > examples of network problems caused by ECN and this is definitely an
> > interesting example.
> >
> > -Toke
> >
> >
> >
> > The long version:
> >
> > From my end I can see that I have two paths to Cloudflare; which is
> > taken appears to be based on a hash of the packet header, as can be seen
> > by varying the source port:
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.357 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> >
> > $ traceroute -q 1 --sport=10001 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.293 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> >
> >
> > This is fine in itself. However, the problem stems from the fact that
> > the ECN bits in the IP header are also included in the ECMP hash (-t
> > sets the TOS byte; -t 1 ends up as ECT(0) on the wire and -t 2 is
> > ECT(1)):
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.336 ms
> > 2  alber

Re: ECN

2019-11-13 Thread Lukas Tribus
Hello,

On Wed, Nov 13, 2019 at 8:35 PM Saku Ytti  wrote:
>
> On Wed, 13 Nov 2019 at 18:27, Matt Corallo  wrote:
>
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> > is... out of spec to say the least), not a bug in ECN/ECMP.
>
> Not true. The hash result should identify a discrete flow; more importantly,
> a discrete flow should not produce two different hash values. Using the
> whole TOS byte breaks this promise and thus breaks ECMP.
>
> Platforms allow you to configure which bytes are part of the hash
> calculation. The whole TOS byte should not be used, as a discrete flow can
> carry different ECN bits during congestion. Toke has diagnosed the problem
> correctly; the solution is to remove TOS from the ECMP hash calculation.

In fact I believe everything beyond the 5-tuple is just a bad idea to
base your hash on. Here are some examples (not quite as straightforward
as the TOS/ECN case here):

TTL:
https://mailman.nanog.org/pipermail/nanog/2018-September/096871.html

IPv6 flow label:
https://blog.apnic.net/2018/01/11/ipv6-flow-label-misuse-hashing/
https://pc.nanog.org/static/published/meetings/NANOG71/1531/20171003_Jaeggli_Lightning_Talk_Ipv6_v1.pdf
https://www.youtube.com/watch?v=b0CRjOpnT7w



Lukas


Re: ECN

2019-11-13 Thread William Herrin
On Wed, Nov 13, 2019 at 11:36 AM Saku Ytti  wrote:

> On Wed, 13 Nov 2019 at 18:27, Matt Corallo  wrote:
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>
> Not true. The hash result should identify a discrete flow; more importantly,
> a discrete flow should not produce two different hash values. Using the
> whole TOS byte breaks this promise and thus breaks ECMP.
>

Yes true.

Equal Cost MultiPath (ECMP) consistency over the life of a TCP connection
is not a promise. Anycasters would love it to be but it's not.

ECMP's only promise is that packets for a particular connection will tend
to prefer a particular path so that throughput doesn't suffer overly much
from the packet reordering you'd get by round-robining the packets on
different links. Choosing an alternate path during congestion is a
perfectly reasonable thing for ECMP to do.

Don't blame the network. This is Cloudflare choosing not to handle the
anycast spray corner case because it happens rarely enough with symptoms
obscure enough that they only occasionally get called to carpet. Their BGP
announcements make the claim they're ready for your packet at any of their
sites, but they're not.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: ECN

2019-11-13 Thread Owen DeLong
Like it or not (and I really don’t), the majority of modern CDNs are using TCP 
over Anycast.

It’s ugly and it’s prone to problems like this. It’s nice to see a customer 
with know-how actually publicizing and digging into the problem.

Until now, I believe an unknown number of customers have been suffering in 
silence or relegated to the ISP's “We can’t reproduce your problem” bin without 
resolution.

I’ve had lots of discussions on the subject and the usual end result is “It’s 
too hard to measure or quantify and there’s no visible contingent of impacted 
users”.

Now we at least have one visible impacted user.

Owen


> On Nov 13, 2019, at 09:19 , Anoop Ghanwani  wrote:
> 
> Not to condone what cloudflare is doing, but...
> 
> An ECN connection will have different bits on various packets for the 
> duration of the connection -- pure ACKs (ACKs not piggybacking on data) will 
> have the ECN bits as 00b, while all other packets will have either 01b, 10b 
> (when no congestion was experienced) or 11b (when congestion was 
> experienced).  So using the ECN bits as part of the hash would affect 
> performance throughout the life of the connection.
> 
> On Wed, Nov 13, 2019 at 9:00 AM Matt Corallo wrote:
> Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), 
> splitting the flow shouldn’t have material performance degradation? 
> 
> > On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen wrote:
> > 
> > 
> > 
> >> On 13 November 2019 17:20:18 CET, Matt Corallo wrote:
> >> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> >> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> > 
> > Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will 
> > split the flow over multiple paths; avoiding that is the whole point of 
> > doing the flow-based hashing in the first place.
> > 
> > Anycast "only" turns a potential degradation of TCP performance into a hard 
> > failure... :)
> > 
> > -Toke
> 



Re: ECN

2019-11-13 Thread Saku Ytti
On Wed, 13 Nov 2019 at 18:27, Matt Corallo  wrote:

> This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> is... out of spec to say the least), not a bug in ECN/ECMP.

Not true. The hash result should identify a discrete flow; more importantly,
a discrete flow should not produce two different hash values. Using the
whole TOS byte breaks this promise and thus breaks ECMP.

Platforms allow you to configure which bytes are part of the hash
calculation. The whole TOS byte should not be used, as a discrete flow can
carry different ECN bits during congestion. Toke has diagnosed the problem
correctly; the solution is to remove TOS from the ECMP hash calculation.

-- 
  ++ytti


Re: ECN

2019-11-13 Thread Toke Høiland-Jørgensen via NANOG



On 13 November 2019 17:20:18 CET, Matt Corallo  wrote:
>This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>TCP is... out of spec to say the least), not a bug in ECN/ECMP.

Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will 
split the flow over multiple paths; avoiding that is the whole point of doing 
the flow-based hashing in the first place.

Anycast "only" turns a potential degradation of TCP performance into a hard 
failure... :)

-Toke
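
A minimal sketch of the effect described above, for anyone who wants to play with it. Everything in it is an assumption for illustration: the SHA-256 hash, the two next hops, the site names and the documentation addresses are made up, and no real router computes its ECMP hash this way.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical next hops; each one leads to a different anycast site.
NEXT_HOPS = ["site-A", "site-B"]


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: int
    sport: int
    dport: int
    tos: int  # full TOS byte: DSCP in the top 6 bits, ECN in the low 2


def pick_next_hop(pkt: Packet, include_tos: bool) -> str:
    """Pick a next hop from a hash of the 5-tuple, optionally adding the TOS byte."""
    fields = [pkt.src, pkt.dst, pkt.proto, pkt.sport, pkt.dport]
    if include_tos:
        fields.append(pkt.tos)
    digest = hashlib.sha256(repr(fields).encode()).digest()
    return NEXT_HOPS[int.from_bytes(digest[:4], "big") % len(NEXT_HOPS)]


def count_split_flows(include_tos: bool, flows: int = 1000) -> int:
    """Count flows whose SYN (Not-ECT) and data packets (ECT(0)) take different next hops."""
    split = 0
    for sport in range(10000, 10000 + flows):
        syn = Packet("192.0.2.10", "198.51.100.1", 6, sport, 443, tos=0b00)
        data = Packet("192.0.2.10", "198.51.100.1", 6, sport, 443, tos=0b10)
        if pick_next_hop(syn, include_tos) != pick_next_hop(data, include_tos):
            split += 1
    return split


if __name__ == "__main__":
    print("5-tuple only: ", count_split_flows(include_tos=False), "flows split")
    print("5-tuple + TOS:", count_split_flows(include_tos=True), "flows split")
```

With the 5-tuple alone no flow is ever split. Once the TOS byte is included, a good share of the flows send their SYN one way and their data the other; without anycast that is reordering and lost performance, with anycast the data reaches a server that never saw the SYN.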


Re: ECN

2019-11-13 Thread Baldur Norddahl
ZTE M6000-S V3.00.20(3.40.1)

We are moving away from this platform so I can not be bothered with
requesting a fix. In the past they have made fixes for us, so I
believe they would also fix this issue if we asked them to do so.

Also I would like to state that I have not personally verified that the
equipment is doing hashing based on the ECN bits. I just turned off ECMP so
the customer can test. If it works we will either let ECMP stay off or move
the customer to the new platform.

Regards,

Baldur


On Wed, Nov 13, 2019 at 7:30 PM Mikael Abrahamsson  wrote:

> On Wed, 13 Nov 2019, Baldur Norddahl wrote:
>
> > In any case, is it not recommended that users of anycast proxy packets
> > that arrive at the wrong place? To avoid this kind of issue.
>
> In typical anycast deployments there is no feasible way to figure out
> where the "right place" is.
>
> It would be very interesting if you could share what equipment you're
> using that is doing ECMP hashing based on ECN bits. That vendor needs to
> fix that or people should avoid their devices.
>
> --
> Mikael Abrahamsson    email: swm...@swm.pp.se
>


Re: ECN

2019-11-13 Thread Mikael Abrahamsson via NANOG

On Wed, 13 Nov 2019, Baldur Norddahl wrote:

> In any case, is it not recommended that users of anycast proxy packets 
> that arrive at the wrong place? To avoid this kind of issue.

In typical anycast deployments there is no feasible way to figure out 
where the "right place" is.

It would be very interesting if you could share what equipment you're 
using that is doing ECMP hashing based on ECN bits. That vendor needs to 
fix that or people should avoid their devices.

-- 
Mikael Abrahamsson    email: swm...@swm.pp.se


Re: ECN

2019-11-13 Thread Baldur Norddahl
I am testing disabling our use of ECMP as it is not strictly necessary and
we are moving to a new platform anyway. Waiting for feedback from the
customer to hear if this fixes the issue.

In any case, is it not recommended that users of anycast proxy packets that
arrive at the wrong place? To avoid this kind of issue.

Regards,

Baldur


On Wed, Nov 13, 2019 at 6:35 PM Todd Underwood  wrote:

> as one of the authors of that talk, it definitely is "a thing", has been
> for years and years and years, and indeed, mostly works.
>
> t
>
> On Wed, Nov 13, 2019 at 12:18 PM Hunter Fuller 
> wrote:
>
>> It is certainly odd, but it's definitely a "thing."
>>
>> https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf
>>
>> On Wed, Nov 13, 2019 at 10:24 AM Matt Corallo  wrote:
>> >
>> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>> TCP is... out of spec to say the least), not a bug in ECN/ECMP.

Re: ECN

2019-11-13 Thread Jon Lewis
It does when the split flows land in different anycast origin POPs. 
Making a few assumptions from the traceroutes, the ECMP paths are sending 
some packets to Hamburg and some to Denmark.  Each POP may be getting 
parts of what should be a single TCP stream, and I doubt they have 
anything to cope with that (another assumption).


On Wed, 13 Nov 2019, Matt Corallo wrote:

> Not ideal, sure, but if it’s only for the SYN (as you seem to indicate),
> splitting the flow shouldn’t have material performance degradation?
>
>> On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen  wrote:
>>
>>> On 13 November 2019 17:20:18 CET, Matt Corallo  wrote:
>>> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>>> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>>
>> Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will
>> split the flow over multiple paths; avoiding that is the whole point of doing
>> the flow-based hashing in the first place.
>>
>> Anycast "only" turns a potential degradation of TCP performance into a hard
>> failure... :)
>>
>> -Toke





--
 Jon Lewis, MCP :)   |  I route
 StackPath, Sr. Neteng   |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: ECN

2019-11-13 Thread Todd Underwood
as one of the authors of that talk, it definitely is "a thing", has been
for years and years and years, and indeed, mostly works.

t

On Wed, Nov 13, 2019 at 12:18 PM Hunter Fuller  wrote:

> It is certainly odd, but it's definitely a "thing."
>
> https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf
>
> On Wed, Nov 13, 2019 at 10:24 AM Matt Corallo  wrote:
> >
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> >
> > > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG <
> nanog@nanog.org> wrote:
> > >
> > > 
> > >>
> > >> Hello
> > >>
> > >> I have a customer that believes my network has a ECN problem. We do
> > >> not, we just move packets. But how do I prove it?
> > >>
> > >> Is there a tool that checks for ECN trouble? Ideally something I could
> > >> run on the NLNOG Ring network.
> > >>
> > >> I believe it likely that it is the destination that has the problem.
> > >
> > > Hi Baldur
> > >
> > > I believe I may be that customer :)
> > >
> > > First of all, thank you for looking into the issue! We've been having
> > > great fun over on the ecn-sane mailing list trying to figure out what's
> > > going on. I'll summarise below, but see this thread for the discussion
> > > and debugging details:
> > >
> https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> > >
> > > The short version is that the problem appears to come from a
> combination
> > > of the ECMP routing in your network, and Cloudflare's heavy use of
> > > anycast. Specifically, a router in your network appears to be doing
> ECMP
> > > by hashing on the packet header, *including the ECN bits*. This breaks
> > > TCP connections with ECN because the TCP SYN (with no ECN bits set) end
> > > up taking a different path than the rest of the flow (which is marked
> as
> > > ECT(0)). When the destination is anycasted, this means that the data
> > > packets go to a different server than the SYN did. This second server
> > > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > > router's ECMP hash.
> > >
> > > For a longer exposition, see below. You should be able to verify this
> > > from somewhere else in the network, but if there's anything else you
> > > want me to test, do let me know. Also, would you mind sharing the
> router
> > > make and model that does this? We're trying to collect real-world
> > > examples of network problems caused by ECN and this is definitely an
> > > interesting example.
> > >
> > > -Toke
> > >
> > >
> > >
> > > The long version:
> > >
> > > From my end I can see that I have two paths to Cloudflare; which is
> > > taken appears to be based on a hash of the packet header, as can be
> seen
> > > by varying the source port:
> > >
> > > $ traceroute -q 1 --sport=1 104.24.125.13
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
> packets
> > > 1  _gateway (10.42.3.1)  0.357 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> > >
> > > $ traceroute -q 1 --sport=10001 104.24.125.13
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
> packets
> > > 1  _gateway (10.42.3.1)  0.293 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> > >
> > >
> > > This is fine in itself. However, the problem stems from the fact that
> > > the ECN bits in the IP header are also included in the ECMP hash (-t
> > > sets the TOS byte; -t 1 ends up as EC

Re: ECN

2019-11-13 Thread Anoop Ghanwani
Not to condone what cloudflare is doing, but...

An ECN connection will have different bits on various packets for the
duration of the connection -- pure ACKs (ACKs not piggybacking on data)
will have the ECN bits as 00b, while all other packets will have either
01b, 10b (when no congestion was experienced) or 11b (when congestion was
experienced).  So using the ECN bits as part of the hash would affect
performance throughout the life of the connection.
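
For reference, the two-bit ECN field described above can be decoded from
the TOS byte as in this minimal Python sketch. The codepoint names follow
RFC 3168; the function name split_tos is made up for illustration and is
not part of any tool mentioned in this thread.

```python
# ECN codepoints from RFC 3168: the two low-order bits of the old TOS byte.
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def split_tos(tos):
    """Split a TOS / Traffic Class byte into (DSCP value, ECN codepoint name)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    return tos >> 2, ECN_NAMES[tos & 0b11]


if __name__ == "__main__":
    # Walk through the four ECN codepoints with DSCP left at zero.
    for tos in (0x00, 0x01, 0x02, 0x03):
        dscp, ecn = split_tos(tos)
        print(f"tos=0x{tos:02x} dscp={dscp} ecn={ecn}")
```

A hash that covers the whole byte sees up to four different inputs for
one connection, which is why the effect is not limited to the SYN.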

On Wed, Nov 13, 2019 at 9:00 AM Matt Corallo  wrote:

> Not ideal, sure, but if it’s only for the SYN (as you seem to indicate),
> splitting the flow shouldn’t have material performance degradation?
>
> > On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen  wrote:
> >
> > 
> >
> >> On 13 November 2019 17:20:18 CET, Matt Corallo 
> wrote:
> >> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> >> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> >
> > Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so
> will split the flow over multiple paths; avoiding that is the whole point
> of doing the flow-based hashing in the first place.
> >
> > Anycast "only" turns a potential degradation of TCP performance into a
> hard failure... :)
> >
> > -Toke
>
>


Re: ECN

2019-11-13 Thread Hunter Fuller
It is certainly odd, but it's definitely a "thing."

https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf

On Wed, Nov 13, 2019 at 10:24 AM Matt Corallo  wrote:
>
> This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> is... out of spec to say the least), not a bug in ECN/ECMP.
>
> > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG 
> >  wrote:
> >
> > 
> >>
> >> Hello
> >>
> >> I have a customer that believes my network has a ECN problem. We do
> >> not, we just move packets. But how do I prove it?
> >>
> >> Is there a tool that checks for ECN trouble? Ideally something I could
> >> run on the NLNOG Ring network.
> >>
> >> I believe it likely that it is the destination that has the problem.
> >
> > Hi Baldur
> >
> > I believe I may be that customer :)
> >
> > First of all, thank you for looking into the issue! We've been having
> > great fun over on the ecn-sane mailing list trying to figure out what's
> > going on. I'll summarise below, but see this thread for the discussion
> > and debugging details:
> > https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> >
> > The short version is that the problem appears to come from a combination
> > of the ECMP routing in your network, and Cloudflare's heavy use of
> > anycast. Specifically, a router in your network appears to be doing ECMP
> > by hashing on the packet header, *including the ECN bits*. This breaks
> > TCP connections with ECN because the TCP SYN (with no ECN bits set) end
> > up taking a different path than the rest of the flow (which is marked as
> > ECT(0)). When the destination is anycasted, this means that the data
> > packets go to a different server than the SYN did. This second server
> > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > router's ECMP hash.
> >
> > For a longer exposition, see below. You should be able to verify this
> > from somewhere else in the network, but if there's anything else you
> > want me to test, do let me know. Also, would you mind sharing the router
> > make and model that does this? We're trying to collect real-world
> > examples of network problems caused by ECN and this is definitely an
> > interesting example.
> >
> > -Toke
> >
> >
> >
> > The long version:
> >
> > From my end I can see that I have two paths to Cloudflare; which is
> > taken appears to be based on a hash of the packet header, as can be seen
> > by varying the source port:
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.357 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> >
> > $ traceroute -q 1 --sport=10001 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.293 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> >
> >
> > This is fine in itself. However, the problem stems from the fact that
> > the ECN bits in the IP header are also included in the ECMP hash (-t
> > sets the TOS byte; -t 1 ends up as ECT(1) on the wire and -t 2 is
> > ECT(0)):
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.336 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
> > 6  104.24.125.13 (104.24.125.13)  1.210 ms
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 2

Re: ECN

2019-11-13 Thread Matt Corallo
Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), 
splitting the flow shouldn’t have material performance degradation? 

> On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen  wrote:
> 
> 
> 
>> On 13 November 2019 17:20:18 CET, Matt Corallo  wrote:
>> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> 
> Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will 
> split the flow over multiple paths; avoiding that is the whole point of doing 
> the flow-based hashing in the first place.
> 
> Anycast "only" turns a potential degradation of TCP performance into a hard 
> failure... :)
> 
> -Toke



Re: ECN

2019-11-13 Thread Matt Corallo
This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
is... out of spec to say the least), not a bug in ECN/ECMP.

> On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG  
> wrote:
> 
> 
>> 
>> Hello
>> 
>> I have a customer that believes my network has a ECN problem. We do
>> not, we just move packets. But how do I prove it?
>> 
>> Is there a tool that checks for ECN trouble? Ideally something I could
>> run on the NLNOG Ring network.
>> 
>> I believe it likely that it is the destination that has the problem.
> 
> Hi Baldur
> 
> I believe I may be that customer :)
> 
> First of all, thank you for looking into the issue! We've been having
> great fun over on the ecn-sane mailing list trying to figure out what's
> going on. I'll summarise below, but see this thread for the discussion
> and debugging details:
> https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> 
> The short version is that the problem appears to come from a combination
> of the ECMP routing in your network, and Cloudflare's heavy use of
> anycast. Specifically, a router in your network appears to be doing ECMP
> by hashing on the packet header, *including the ECN bits*. This breaks
> TCP connections with ECN because the TCP SYN (with no ECN bits set) end
> up taking a different path than the rest of the flow (which is marked as
> ECT(0)). When the destination is anycasted, this means that the data
> packets go to a different server than the SYN did. This second server
> doesn't recognise the connection, and so replies with a TCP RST. To fix
> this, simply exclude the ECN bits (or the whole TOS byte) from your
> router's ECMP hash.
> 
> For a longer exposition, see below. You should be able to verify this
> from somewhere else in the network, but if there's anything else you
> want me to test, do let me know. Also, would you mind sharing the router
> make and model that does this? We're trying to collect real-world
> examples of network problems caused by ECN and this is definitely an
> interesting example.
> 
> -Toke
> 
> 
> 
> The long version:
> 
> From my end I can see that I have two paths to Cloudflare; which is
> taken appears to be based on a hash of the packet header, as can be seen
> by varying the source port:
> 
> $ traceroute -q 1 --sport=1 104.24.125.13
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.357 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> 6  104.24.125.13 (104.24.125.13)  1.322 ms
> 
> $ traceroute -q 1 --sport=10001 104.24.125.13
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.293 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> 6  149.6.142.130 (149.6.142.130)  6.925 ms
> 7  104.24.125.13 (104.24.125.13)  1.501 ms
> 
> 
> This is fine in itself. However, the problem stems from the fact that
> the ECN bits in the IP header are also included in the ECMP hash (-t
> sets the TOS byte; -t 1 ends up as ECT(1) on the wire and -t 2 is
> ECT(0)):
> 
> $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.336 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
> 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
> 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
> 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
> 6  104.24.125.13 (104.24.125.13)  1.210 ms
> 
> $ traceroute -q 1 --sport=1 104.24.125.13 -t 2
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.339 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  2.565 ms
> 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.301 ms
> 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.339 ms
> 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.570 ms
> 6  149.6.142.130 (149.6.142.130)  6.888 ms
> 7  104.24.125.13 (104.24.125.13)  1.785 ms
> 
> 
> So why is this a problem? The TCP SYN packet first needs to negotiate
> ECN, so it is sent without any ECN bits set in the he

Re: ECN

2019-11-13 Thread Toke Høiland-Jørgensen via NANOG
> Hello
> 
> I have a customer that believes my network has a ECN problem. We do
> not, we just move packets. But how do I prove it?
> 
> Is there a tool that checks for ECN trouble? Ideally something I could
> run on the NLNOG Ring network.
> 
> I believe it likely that it is the destination that has the problem.

Hi Baldur

I believe I may be that customer :)

First of all, thank you for looking into the issue! We've been having
great fun over on the ecn-sane mailing list trying to figure out what's
going on. I'll summarise below, but see this thread for the discussion
and debugging details:
https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html

The short version is that the problem appears to come from a combination
of the ECMP routing in your network, and Cloudflare's heavy use of
anycast. Specifically, a router in your network appears to be doing ECMP
by hashing on the packet header, *including the ECN bits*. This breaks
TCP connections with ECN because the TCP SYN (with no ECN bits set) ends
up taking a different path than the rest of the flow (which is marked as
ECT(0)). When the destination is anycasted, this means that the data
packets go to a different server than the SYN did. This second server
doesn't recognise the connection, and so replies with a TCP RST. To fix
this, simply exclude the ECN bits (or the whole TOS byte) from your
router's ECMP hash.
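
To illustrate the failure mode described above, here is a toy Python model
of an ECMP next-hop choice. It is only a sketch: real routers use
vendor-specific hardware hashes rather than SHA-256, and pick_path, its
parameters and the tos_mask knob are hypothetical names, not any router's
API. The point is just that a hash input that includes the ECN bits can
send the SYN and the ECT(0)-marked data packets of one flow down different
paths, while a hash that masks them out cannot.

```python
import hashlib


def pick_path(src, dst, sport, dport, proto, tos, n_paths, tos_mask=0x00):
    """Toy ECMP: hash the 5-tuple plus the masked TOS byte, pick a path index.

    tos_mask=0xFF hashes the whole TOS byte (the problematic behaviour),
    tos_mask=0xFC hashes DSCP only, tos_mask=0x00 ignores the TOS byte.
    """
    key = f"{src}|{dst}|{sport}|{dport}|{proto}|{tos & tos_mask}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


if __name__ == "__main__":
    # Addresses and ports borrowed from the capture in this thread.
    flow = ("10.42.3.130", "104.24.125.13", 34420, 80, 6)
    for label, mask in (("whole TOS byte", 0xFF),
                        ("DSCP only", 0xFC),
                        ("TOS ignored", 0x00)):
        syn = pick_path(*flow, tos=0x00, n_paths=2, tos_mask=mask)   # Not-ECT SYN
        data = pick_path(*flow, tos=0x02, n_paths=2, tos_mask=mask)  # ECT(0) data
        print(f"{label:15s} SYN -> path {syn}, data -> path {data}")
```

With the 0xFC or 0x00 mask the two packets always agree; with 0xFF they
agree only by chance.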

For a longer exposition, see below. You should be able to verify this
from somewhere else in the network, but if there's anything else you
want me to test, do let me know. Also, would you mind sharing the router
make and model that does this? We're trying to collect real-world
examples of network problems caused by ECN and this is definitely an
interesting example.

-Toke



The long version:

From my end I can see that I have two paths to Cloudflare; which is
taken appears to be based on a hash of the packet header, as can be seen
by varying the source port:

$ traceroute -q 1 --sport=1 104.24.125.13
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.357 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
 6  104.24.125.13 (104.24.125.13)  1.322 ms

$ traceroute -q 1 --sport=10001 104.24.125.13
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.293 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
 6  149.6.142.130 (149.6.142.130)  6.925 ms
 7  104.24.125.13 (104.24.125.13)  1.501 ms


This is fine in itself. However, the problem stems from the fact that
the ECN bits in the IP header are also included in the ECMP hash (-t
sets the TOS byte; -t 1 ends up as ECT(1) on the wire and -t 2 is
ECT(0)):

$ traceroute -q 1 --sport=1 104.24.125.13 -t 1
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.336 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
 6  104.24.125.13 (104.24.125.13)  1.210 ms

$ traceroute -q 1 --sport=1 104.24.125.13 -t 2
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.339 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  2.565 ms
 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.301 ms
 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.339 ms
 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.570 ms
 6  149.6.142.130 (149.6.142.130)  6.888 ms
 7  104.24.125.13 (104.24.125.13)  1.785 ms


So why is this a problem? The TCP SYN packet first needs to negotiate
ECN, so it is sent without any ECN bits set in the header; after
negotiation succeeds, the data packets will be marked as ECT(0). But
because that becomes part of the ECMP hash, those packets will take
another path. And since the destination is anycasted, that means they
will also end up at a different endpoint. This second endpoint won't
recognise the connection, and reply with a TCP RST. This is clearly
visible in tcpdump; notice the different TOS values, and that the RST
packet has a different TTL than the SYN-ACK:

12:21:47.816359 IP (tos 0x0, ttl 64, id 25687, offset 0, flags [DF], proto TCP 
(6), length 60)
10.42.3.130.34420 > 104.24.125.13.80: Flags [SEW], cksum 0xf2ff (incorrect 
-> 0x0853), seq 3345293502, wi

Re: ECN

2019-11-11 Thread Owen DeLong



> On Nov 11, 2019, at 05:01 , Baldur Norddahl  wrote:
> 
> Hello
> 
> I have a customer that believes my network has a ECN problem. We do not, we 
> just move packets. But how do I prove it?

Are you saying that none of your routers support ECN or that you think ECN only 
applies to endpoints?

> Is there a tool that checks for ECN trouble? Ideally something I could run on 
> the NLNOG Ring network.
> 
> I believe it likely that it is the destination that has the problem.

I’d start by asking the reporter for a PCAP of the problem, then review the
packet trace for clues about tap points in your network where you can check
whether ECN is (or should be) occurring and the opposite is happening instead.

Owen



ECN

2019-11-11 Thread Baldur Norddahl

Hello

I have a customer that believes my network has an ECN problem. We do not, 
we just move packets. But how do I prove it?


Is there a tool that checks for ECN trouble? Ideally something I could 
run on the NLNOG Ring network.


I believe it likely that it is the destination that has the problem.

Regards,

Baldur



Re: ECN, DNS and Firewalls

2018-12-27 Thread Mark Andrews



> On 28 Dec 2018, at 2:49 pm, valdis.kletni...@vt.edu wrote:
> 
> On Fri, 28 Dec 2018 13:35:04 +1100, Mark Andrews said:
>> There are major operators that still have STUPID firewall settings
>> in front of DNS servers that drop SYN packets with ECE and CWR set
>> 17 years after ECN was specified.
> 
> Time to name-n-shame?

Not yet.  Let people test and fix their firewalls first.

A test machine should be sending [SEW] and, as seen in tcpdump, getting back
[S.E] or [S.] in the TCP flags, depending upon whether the DNS server’s TCP
stack supports ECN or not.

e.g.

11:35:50.335713 IP6 2001:470:a001:3:f1f2:b12d:4b18:d934.50670 > 
2001:7fe::53.53: Flags [SEW], seq 3764146938, win 65535, options [mss 
1220,nop,wscale 5,nop,nop,TS val 522561237 ecr 0,sackOK,eol], length 0
11:35:50.745472 IP6 2001:7fe::53.53 > 
2001:470:a001:3:f1f2:b12d:4b18:d934.50670: Flags [S.E], seq 1542147586, ack 
3764146939, win 14280, options [mss 1440,sackOK,TS val 1392826170 ecr 
522561237,nop,wscale 7], length 0

or

11:40:35.360655 IP6 2001:470:a001:3:f1f2:b12d:4b18:d934.50697 > 
2001:502:8cc::30.53: Flags [SEW], seq 81498720, win 65535, options [mss 
1220,nop,wscale 5,nop,nop,TS val 522845405 ecr 0,sackOK,eol], length 0
11:40:35.589420 IP6 2001:502:8cc::30.53 > 
2001:470:a001:3:f1f2:b12d:4b18:d934.50697: Flags [S.], seq 987294478, ack 
81498721, win 1220, options [mss 1220], length 0
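
The check above can be scripted. The following Python sketch classifies
the reply to an ECN-setup SYN from one line of tcpdump text output (with
the wrapped lines joined back together). It assumes tcpdump's usual flag
letters (S = SYN, . = ACK, E = ECE, W = CWR, R = RST); classify_synack and
its return strings are hypothetical names chosen for this example. A
firewall that silently drops the SYN produces no reply line at all, so
that case shows up as a missing line rather than as a return value.

```python
import re

# Matches the "Flags [...]" field in tcpdump's text output.
FLAGS_RE = re.compile(r"Flags \[([^\]]+)\]")


def classify_synack(line):
    """Classify the server's reply to a [SEW] SYN from one tcpdump line."""
    match = FLAGS_RE.search(line)
    if match is None:
        return "no-flags"
    flags = match.group(1)
    if "S" in flags and "." in flags:
        # RFC 3168: an ECN-setup SYN-ACK has ECE set and CWR clear.
        if "E" in flags and "W" not in flags:
            return "ecn-negotiated"
        return "ecn-declined"
    if "R" in flags:
        return "reset"
    return "not-a-synack"


if __name__ == "__main__":
    reply = ("11:35:50.745472 IP6 2001:7fe::53.53 > "
             "2001:470:a001:3:f1f2:b12d:4b18:d934.50670: Flags [S.E], "
             "seq 1542147586, ack 3764146939, win 14280, length 0")
    print(classify_synack(reply))
```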

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org



Re: ECN, DNS and Firewalls

2018-12-27 Thread valdis . kletnieks
On Fri, 28 Dec 2018 13:35:04 +1100, Mark Andrews said:
> There are major operators that still have STUPID firewall settings
> in front of DNS servers that drop SYN packets with ECE and CWR set
> 17 years after ECN was specified.

Time to name-n-shame?


ECN, DNS and Firewalls

2018-12-27 Thread Mark Andrews
There are major operators that still have STUPID firewall settings
in front of DNS servers that drop SYN packets with ECE and CWR set
17 years after ECN was specified.

Do you really want to add a second to EVERY DNS lookup that needs
to use TCP?  Modern OSes actually attempt to use ECN by default.  DNS
is time critical enough without introducing unnecessary delays.

If you have signed zones then TCP requests are almost certainly being
made to your servers.

EVERYONE TEST YOUR SERVERS FROM OUTSIDE YOUR NETWORK AND FIX THE BROKEN
FIREWALLS THAT ARE FOUND.

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org



Re: Apple ECN, Bufferbloat, CoDel (fwd)

2015-06-15 Thread Jared Mauch
On Sat, Jun 13, 2015 at 06:20:31PM +0200, Mikael Abrahamsson wrote:
>
> Hi,
>
> I just want to bring to your attention the below talk (I am too lazy to
> re-write the whole email for this slightly different audience).
>
> Takeaway:
>
> We'll see a lot of ECN enabled traffic in a few months. This shouldn't be a
> problem. I've been doing it to all my machines for 3-5 years without ill
> effects.

I recall when ECN first came out and firewalls would block it causing me
issues on my Linux boxes sending list mail out.  It was a small enough
percentage that I mostly ignored it, but this will cause trouble for people
who still haven't fixed their broken firewalls.

I encourage almost everyone on nanog to watch this talk.

- Jared

> -- Forwarded message --
> Date: Sat, 13 Jun 2015 18:07:57 +0200 (CEST)
> From: Mikael Abrahamsson swm...@swm.pp.se
> To: bl...@lists.bufferbloat.net
> Subject: Apple ECN, Bufferbloat, CoDel
>
> I highly encourage people to take a look at:
>
> https://developer.apple.com/videos/wwdc/2015/?id=719
>
> --
> Mikael Abrahamsson    email: swm...@swm.pp.se

-- 
Jared Mauch  | pgp key available via finger from ja...@puck.nether.net
clue++;  | http://puck.nether.net/~jared/  My statements are only mine.


Re: Apple ECN, Bufferbloat, CoDel (fwd)

2015-06-15 Thread Dave Taht
On Mon, Jun 15, 2015 at 9:13 AM, joel jaeggli joe...@bogus.com wrote:
> On 6/15/15 6:19 AM, Jared Mauch wrote:
>> On Sat, Jun 13, 2015 at 06:20:31PM +0200, Mikael Abrahamsson wrote:
>>>
>>> Hi,
>>>
>>> I just want to bring to your attention the below talk (I am too lazy to
>>> re-write the whole email for this slightly different audience).
>>>
>>> Takeaway:
>>>
>>> We'll see a lot of ECN enabled traffic in a few months. This shouldn't be a
>>> problem. I've been doing it to all my machines for 3-5 years without ill
>>> effects.
>
> you'll also find all the networks that use the entire tos field as part
> of the hash key... that's not exactly something you notice when you have
> a 1:1 host to ip correspondence unless it leads to reordering. but with
> stateless load balancing you can. fortunately those networks are
> observably rare.

I am aware of one such (very large) network that did, indeed, (and til
recently!) have devices that used the entire tos field in their ECMP
implementation. This led to re-ordering every time ECN CE was
exerted on ECN enabled flows. Testing for the existence of this
problem is not terribly hard (example, have a rule that periodically
exerts CE on a bunch of test tcp flows, count the reorders in
TCP_INFO), but the tools for it are kind of ad hoc as yet.
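
The check described above relies on the kernel's reordering counter in
TCP_INFO. As a rough stand-in that needs no socket access, the Python
sketch below counts late arrivals in a list of sequence numbers (for
example pulled from a capture at the receiver). It is an illustration
under stated assumptions, not the tooling referred to above:
count_reordered is a hypothetical helper, and it ignores sequence-number
wraparound and cannot tell retransmissions from genuine reordering.

```python
def count_reordered(seqs):
    """Count arrivals whose sequence number is below the highest seen so far."""
    highest = None
    reordered = 0
    for seq in seqs:
        if highest is not None and seq < highest:
            # This segment arrived after a later one: count it as reordered.
            reordered += 1
        else:
            highest = seq
    return reordered


if __name__ == "__main__":
    in_order = [1000, 2448, 3896, 5344]
    swapped = [1000, 3896, 2448, 5344]  # one segment overtaken on another path
    print(count_reordered(in_order), count_reordered(swapped))
```

Comparing the count for flows with and without CE marks applied would
show whether marking correlates with reordering on a given path.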

I am curious if there is an SNMP mib/cacti/mrtg/other support for
reporting CE events in addition to loss?

Although fq_codel and pie (as deployed in linux - sadly DOCSIS-PIE has
no ECN support in the spec) do do ecn markings (fq_codel *by
default*), deployment on bottleneck links is limited as yet. :)

My expectation is that this will make a difference first for apple
streaming video apps in the home, connecting to other devices in the
home (over wifi, ethernet, bluetooth, etc) that will start to make use
of this additional signalling information. And a billion new devices
with ecn on by default will probably expose all the other problems
rather rapidly. ;)

>> I recall when ECN first came out and firewalls would block it causing me
>> issues on my Linux boxes sending list mail out.  It was a small enough
>> percentage that I mostly ignored it, but this will cause trouble for people
>> who still haven't fixed their broken firewalls.

Better fallbacks exist now.

>> I encourage almost everyone on nanog to watch this talk.
>>
>> - Jared
>>
>>> -- Forwarded message --
>>> Date: Sat, 13 Jun 2015 18:07:57 +0200 (CEST)
>>> From: Mikael Abrahamsson swm...@swm.pp.se
>>> To: bl...@lists.bufferbloat.net
>>> Subject: Apple ECN, Bufferbloat, CoDel
>>>
>>> I highly encourage people to take a look at:
>>>
>>> https://developer.apple.com/videos/wwdc/2015/?id=719
>>>
>>> --
>>> Mikael Abrahamsson    email: swm...@swm.pp.se





-- 
Dave Täht
What will it take to vastly improve wifi for everyone?
https://plus.google.com/u/0/explore/makewififast


Re: Apple ECN, Bufferbloat, CoDel (fwd)

2015-06-15 Thread joel jaeggli
On 6/15/15 6:19 AM, Jared Mauch wrote:
> On Sat, Jun 13, 2015 at 06:20:31PM +0200, Mikael Abrahamsson wrote:
>>
>> Hi,
>>
>> I just want to bring to your attention the below talk (I am too lazy to
>> re-write the whole email for this slightly different audience).
>>
>> Takeaway:
>>
>> We'll see a lot of ECN enabled traffic in a few months. This shouldn't be a
>> problem. I've been doing it to all my machines for 3-5 years without ill
>> effects.

you'll also find all the networks that use the entire tos field as part
of the hash key... that's not exactly something you notice when you have
a 1:1 host to ip correspondence unless it leads to reordering. but with
stateless load balancing you can. fortunately those networks are
observably rare.

   I recall when ECN first came out and firewalls would block it causing me
 issues on my Linux boxes sending list mail out.  It was a small enough 
 percentage
 that I mostly ignored it, but this will cause trouble for people who still
 haven't fixed their broken firewalls.
 
   I encourage almost everyone on nanog to watch this talk.
 
   - Jared
 
 -- Forwarded message --
 Date: Sat, 13 Jun 2015 18:07:57 +0200 (CEST)
 From: Mikael Abrahamsson swm...@swm.pp.se
 To: bl...@lists.bufferbloat.net
 Subject: Apple ECN, Bufferbloat, CoDel

 I highly encourage people to take a look at:

 https://developer.apple.com/videos/wwdc/2015/?id=719
 
 -- 
 Mikael Abrahamsson    email: swm...@swm.pp.se
 






Apple ECN, Bufferbloat, CoDel (fwd)

2015-06-13 Thread Mikael Abrahamsson


Hi,

I just want to bring to your attention the below talk (I am too lazy to 
re-write the whole email for this slightly different audience).


Takeaway:

We'll see a lot of ECN enabled traffic in a few months. This shouldn't be 
a problem. I've been doing it to all my machines for 3-5 years without ill 
effects.


More people will become interested in how TCP works, from application, 
through the host stack, to the AQM (or lack thereof) in the router etc. If 
you don't do AQM towards your customers, be prepared that they're going to 
start complaining in a more informed manner in the not so distant future.


IPv6 only with NAT64+DNS64 will become a lot more feasible going forward. 
I am not a fan of breaking DNSSEC, but perhaps if we can do the DNS64 in 
the host (as it seems Apple is doing, at least for IPv4 literals), then 
that might be possible to work around.
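
As a rough illustration of what host-side synthesis for an IPv4 literal involves, here is a minimal sketch, assuming the NAT64 in use is on the RFC 6052 well-known prefix 64:ff9b::/96. The function name `synthesize` is made up for this example, and the talk does not say how Apple implements it. A real host would discover the network's prefix (for example via RFC 7050) rather than hardcode it.

```python
import ipaddress

# RFC 6052 well-known prefix; an operator may use a network-specific one instead.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize(ipv4_literal, prefix=WELL_KNOWN_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4_literal)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


print(synthesize("192.0.2.33"))  # 64:ff9b::c000:221
```

Because the address is built locally from a literal, no DNS answer is rewritten for that case, which is why it sidesteps the DNSSEC concern.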


-- Forwarded message --
Date: Sat, 13 Jun 2015 18:07:57 +0200 (CEST)
From: Mikael Abrahamsson swm...@swm.pp.se
To: bl...@lists.bufferbloat.net
Subject: Apple ECN, Bufferbloat, CoDel


I highly encourage people to take a look at:

https://developer.apple.com/videos/wwdc/2015/?id=719 (you might have to 
register as an Apple developer to watch it, I don't know)


Your App and Next Generation Networks
IPv6 is growing exponentially and carriers worldwide are moving to pure IPv6 
APNs. Learn about new tools to test your apps for compatibility and get expert 
advice on making sure your apps work in all network environments. iOS 9 and OS 
X 10.11 now support the latest TCP standards. Hear from the experts on TCP Fast 
Open and Explicit Congestion Notification, and find out how it benefits your 
apps.


Being on this list you might not learn much from the talk, but I really 
appreciate a talk aimed at a wider (developer) audience which so clearly 
outlines the benefits of ECN, CoDel and TCP host optimization to reduce 
the end-to-end communication latency experienced by applications. One of the 
major takeaways is that Apple is planning to enable ECN by default in iOS 9 
and OS X 10.11. This would mean hundreds of millions of devices will be using 
ECN in a few months.


You can skip to 16 minutes into the talk if you're not interested in the new 
requirement for applications to support an environment where their Internet 
access is IPv6 only behind NAT64+DNS64 (I'm myself super excited about this).


Let's hope this brings a lot of buzz and requests towards device manufacturers 
to start supporting ECN marking and AQM. Apple is usually a good megaphone to 
bring attention to these kinds of issues...


--
Mikael Abrahamsson    email: swm...@swm.pp.se


Re: ECN

2008-11-08 Thread Hank Nussbacher

On Fri, 7 Nov 2008, [EMAIL PROTECTED] wrote:


On Fri, 07 Nov 2008 08:27:58 +0100, Mikael Abrahamsson said:

for ECN to actually be useful, we (the ISPs) have to turn this option on
in the routers as well. Is anyone doing this today? What vendors support
it?


The only thing that's *required* for it to help is that the routers and
firewalls not actually *molest* the bits in the TCP SYN packet.  If you pass
them and *do nothing else*, it at least has the potential of being useful at
some other router along the path.  And let's face it - if *your* router is
congested enough for ECN to matter, there's a fairly good chance that the
router one hop up/downstream is *also* seeing some effects. Even if *you* don't
do anything else, your neighbor might - helping you out in the bargain.


See:
http://www.icir.org/floyd/ecn.html
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t8/feature/guide/ftwrdecn.html

-Hank



Re: ECN

2008-11-07 Thread David Freedman
 When I thought about it, the IP core (10G links etc) first came to mind,
 and there it's fairly easy to roll out (since I guess a lot of us do
 WRED already), but what about on slower links? Would it make sense to
 have our DSLAMs do this? What about DSL/cable modems (well, vendors
 should first realise that FIFO is not great to begin with :P) ?

Implementing this in an MPLS core is not an easy task; you can really
only do this on the edge. When the MPLS labelled packet arrives at an
LSR, we don't know if it contains a TCP segment or not (fancy deep h/w
implementations excluded). All we know is that, if there is congestion,
we can discard it based on the EXP bits in the shim.
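
For readers who have not looked at the shim lately, here is a minimal sketch of what an LSR can read from one label stack entry, assuming the standard 32-bit layout (20-bit label, 3-bit EXP, 1-bit bottom-of-stack, 8-bit TTL). The function name and sample values are made up for illustration. Nothing in the entry says whether the payload is TCP or ECN-capable, which is the problem described above.

```python
import struct


def parse_mpls_entry(data):
    """Decode one 4-byte MPLS label stack entry (illustrative helper)."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,                    # top 20 bits
        "exp": (word >> 9) & 0x7,               # 3 EXP bits
        "bottom_of_stack": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


# Sample entry: label 16, EXP 5, bottom of stack, TTL 64.
sample = struct.pack("!I", (16 << 12) | (5 << 9) | (1 << 8) | 64)
print(parse_mpls_entry(sample))
```

With only three EXP bits to work with, any congestion signal in the core has to share that space with whatever class marking is already carried there.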

Dave.




Re: ECN

2008-11-07 Thread Bjørn Mork
David Freedman [EMAIL PROTECTED] writes:

 Implementing this in an MPLS core is not an easy task, you can really
 only do this on the edge, when the MPLS labelled packet arrives at an
 LSR, we don't know if it contains a TCP segment or not (fancy deep h/w
 implementations excluded), all we know is that , if there is congestion,
 we can discard it based on the EXP bits in the shim.

Please see RFC 5129


Bjørn



Re: ECN

2008-11-07 Thread David Freedman
Interesting, I hadn't followed this since draft-ietf-mpls-ecn-00.
I eagerly await a vendor implementation :)

Dave.

Bjørn Mork wrote:
 David Freedman [EMAIL PROTECTED] writes:
 
 Implementing this in an MPLS core is not an easy task, you can really
 only do this on the edge, when the MPLS labelled packet arrives at an
 LSR, we don't know if it contains a TCP segment or not (fancy deep h/w
 implementations excluded), all we know is that , if there is congestion,
 we can discard it based on the EXP bits in the shim.
 
 Please see RFC 5129
 
 
 Bjørn
 
 




Re: ECN

2008-11-07 Thread Mikael Abrahamsson

On Fri, 7 Nov 2008, David Freedman wrote:

Implementing this in an MPLS core is not an easy task, you can really 
only do this on the edge, when the MPLS labelled packet arrives at an 
LSR, we don't know if it contains a TCP segment or not (fancy deep h/w 
implementations excluded), all we know is that , if there is congestion, 
we can discard it based on the EXP bits in the shim.


I did some more checking and neither 12000 (IOS) nor CRS-1 seems to 
support WRED with ECN (at least the command doesn't show when I create a 
policy-map), so I'm going to ping my Cisco SE and hear about what's going 
on.


--
Mikael Abrahamsson    email: [EMAIL PROTECTED]



Re: ECN

2008-11-07 Thread Valdis . Kletnieks
On Fri, 07 Nov 2008 08:27:58 +0100, Mikael Abrahamsson said:
 for ECN to actually be useful, we (the ISPs) have to turn this option on 
 in the routers as well. Is anyone doing this today? What vendors support 
 it?

The only thing that's *required* for it to help is that the routers and
firewalls not actually *molest* the bits in the TCP SYN packet.  If you pass
them and *do nothing else*, it at least has the potential of being useful at
some other router along the path.  And let's face it - if *your* router is
congested enough for ECN to matter, there's a fairly good chance that the
router one hop up/downstream is *also* seeing some effects. Even if *you* don't
do anything else, your neighbor might - helping you out in the bargain.
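
A minimal sketch of the bits in question, assuming a bare 20-byte TCP header with no options. Per RFC 3168 an ECN-setup SYN carries SYN together with ECE and CWR; the helper names and the `strip_ecn` "middlebox" below are hypothetical and only show what clearing those bits does to the negotiation.

```python
import struct

# TCP flag bits in byte 13 of the header, lowest bit first.
FIN, SYN, RST, PSH, ACK, URG, ECE, CWR = (1 << i for i in range(8))


def build_tcp_header(flags):
    """Build a bare 20-byte TCP header with made-up ports and sequence number."""
    return struct.pack("!HHIIBBHHH", 40000, 25, 1, 0, 5 << 4, flags, 65535, 0, 0)


def is_ecn_setup_syn(tcp_header):
    """True if the segment is an initial SYN with both ECE and CWR set."""
    flags = tcp_header[13]
    return flags & (SYN | ACK | ECE | CWR) == (SYN | ECE | CWR)


def strip_ecn(tcp_header):
    """Hypothetical middlebox that clears ECE and CWR on the way through."""
    cleaned = tcp_header[13] & ~(ECE | CWR) & 0xFF
    return tcp_header[:13] + bytes([cleaned]) + tcp_header[14:]


ecn_syn = build_tcp_header(SYN | ECE | CWR)
print(is_ecn_setup_syn(ecn_syn))             # True
print(is_ecn_setup_syn(strip_ecn(ecn_syn)))  # False
```

A path that passes the SYN untouched leaves the first result intact end to end, and that is all the paragraph above asks of a router or firewall.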





ECN

2008-11-06 Thread Mikael Abrahamsson


Hi,

On LKML (Linux Kernel Mailing List) there is talk 
http://lkml.org/lkml/2008/11/4/151 about shipping the Linux kernel with 
ECN turned on by default (it was on by default a few years back but that 
change was reverted due to too many sites dropping ECN enabled SYNs).
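
For anyone who wants to check what a given box is doing, here is a minimal sketch that reads the `net.ipv4.tcp_ecn` sysctl and prints its meaning. The value descriptions follow the kernel's ip-sysctl documentation for values 0 to 2; the helper names are made up, and other values that a particular kernel may define are deliberately not interpreted.

```python
from pathlib import Path

TCP_ECN_PATH = Path("/proc/sys/net/ipv4/tcp_ecn")

MEANINGS = {
    0: "ECN disabled: neither requested nor accepted",
    1: "ECN accepted on incoming and requested on outgoing connections",
    2: "ECN accepted on incoming connections, not requested on outgoing ones",
}


def describe_tcp_ecn(value):
    """Map a tcp_ecn value to a short description."""
    return MEANINGS.get(value, f"value {value}: not covered by this sketch")


def read_tcp_ecn(path=TCP_ECN_PATH):
    """Return the current setting, or None if it cannot be read (e.g. not Linux)."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


current = read_tcp_ecn()
if current is None:
    print("tcp_ecn not readable on this system")
else:
    print(f"tcp_ecn = {current}: {describe_tcp_ecn(current)}")
```

"On by default" in the discussion above corresponds to value 1, where the host sends ECN-setup SYNs itself.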


Recent investigations http://www.imperialviolet.org/binary/ecntest.pdf 
show that 0.5% of end hosts will drop SYN packets with ECN turned on. 
This is approximately the same rate I have seen for A/ adoption in 
this thread http://www.ops.ietf.org/lists/v6ops/v6ops.2008/msg01585.html.


Do we in the operational ISP community have an opinion on ECN adoption 
that we want to voice? As far as I can discern from 
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t8/feature/guide/ftwrdecn.html 
for ECN to actually be useful, we (the ISPs) have to turn this option on 
in the routers as well. Is anyone doing this today? What vendors support 
it?


When I thought about it, the IP core (10G links etc) first came to mind, 
and there it's fairly easy to roll out (since I guess a lot of us do WRED 
already), but what about on slower links? Would it make sense to have our 
DSLAMs do this? What about DSL/cable modems (well, vendors should first 
realise that FIFO is not great to begin with :P) ?


http://www.icir.org/tbit/ is a summary page I found on ECN that looks 
like a good resource for further reading. Is anyone looking into including 
ECN configuration into some BCP document?


--
Mikael Abrahamsson    email: [EMAIL PROTECTED]