Service provider story about tracking down TCP RSTs

2018-09-01 Thread frnkblk
I want to share a little bit of our journey in tracking down the TCP RSTs
that impacted some of our customers for almost ten weeks.

 

Almost immediately after we turned up two new Arista border routers in late
July we started receiving a trickle of complaints from customers regarding
their inability to access certain websites (mostly B2B). All the packet
captures showed the standard TCP SYN/SYN-ACK pair, then a TCP RST from the
website after the client sent a TLS/SSL Client Hello. As the reports
continued to come in, we built a Google Doc to keep track and it became
clear that most of the sites were hosted by Incapsula/Imperva, but there
were also a few by Sucuri and Fastly. Knowing that Incapsula provides DoS
protection, we attempted to work with them (providing websites,
source/destination IPs, traceroutes, and packet captures) to find out why
their hosts were issuing our customers a TCP RST, but we made little
progress. We moved some of the affected customers to different IP addresses
but that didn't resolve the issue. We also asked our customer to work with
the website to see if they would be willing to open a ticket with Incapsula.
In the meantime, customers were getting frustrated! They couldn't visit
Incapsula-hosted healthcare websites, financial firms, product dealers, etc.
Over the weeks, a few of those customers purchased/borrowed different
routers and some of those didn't have website issues anymore. And more than
a few of them discovered that the websites worked fine from home or their
mobile phone/hotspot, but not from their Internet connection with us. You
can guess where they were applying pressure! That said, we didn't know why a
small handful of companies, known for DoS protection, were issuing TCP RSTs
to just some of our customers. 

 

Earlier this week we received four or five more websites from yet another
affected customer, but most of those were with Fastly. By this time, we had
been able to replicate the issue in our lab. Feeling desperate to make some
tangible progress on this issue, I reached out to the Fastly NOC. In less
than 12 hours they provided some helpful feedback, pointing out that a
single traceroute to a Fastly site was hitting two of their POPs (they use
anycast) and because they don't sync state between POPs the second POP would
naturally issue a TCP RST (sidebar: fascinating blog article on Fastly's
infrastructure here:
https://www.fastly.com/blog/building-and-scaling-fastly-network-part-2-balancing-requests).
In subsequent email exchanges, the Fastly NOC suggested that
it appeared that we were "spraying flows" (that is, packets related to a
single client session were egressing our network via different paths).
Because Fastly is also present with us at an IX (though they weren't
advertising their anycast IPs at the time), they suggested that we look at
how our traffic egresses our network (IX versus transit) and our routers'
outbound load-balancing/hashing schemes.

 

The IX turned out to be a red herring, so I turned my attention to our
transit. Each of our border routers has two BGP sessions over two circuits
to transit provider POP A and two BGP sessions over two circuits to transit
provider POP B, for a total of four BGP sessions per border router and
eight altogether. Starting with our core router, I confirmed
that its ECMP hashing was consistent such that Fastly-bound traffic always
went to border router 1 or border router 2. Then I looked at the ECMP
hashing scheme on our border routers and noticed something unique - by
default Arista also uses TTL:

 

IPv4 hash fields:

   Source IPv4 Address is ON

   Protocol is ON

   Time-To-Live is ON

   Destination IPv4 Address is ON

 

Since the source and destination IPs and protocol weren't changing, perhaps
the TTL was not consistent? I opened the first packet trace in Wireshark and
jackpot - the TTL value was 128 on the SYN but 127 on the TLS/SSL Client
Hello. I adjusted the Arista's load-balancing profile not to use TTL and
immediately my MTR in the background changed and all the sites on the lab
machine that couldn't load before … were now loading.
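
Sidebar: to make the failure mode concrete, here is a minimal Python sketch
of field-based ECMP next-hop selection (illustrative only, not Arista's
actual hash). With TTL in the field set, the same TCP session can land on
different egress paths whenever the CPE emits inconsistent TTLs:

    import hashlib

    NEXT_HOPS = ["transit-a-1", "transit-a-2", "transit-b-1", "transit-b-2"]

    def ecmp_next_hop(src_ip, dst_ip, proto, ttl, include_ttl=True):
        # Hash only the fields shown in the router output above.
        fields = [src_ip, dst_ip, str(proto)]
        if include_ttl:
            fields.append(str(ttl))
        digest = hashlib.sha256("|".join(fields).encode()).digest()
        return NEXT_HOPS[digest[0] % len(NEXT_HOPS)]

    # Same flow, but the CPE sent the SYN with TTL 128 and the Client
    # Hello with TTL 127, so the two packets may pick different next hops:
    print(ecmp_next_hop("198.51.100.10", "203.0.113.5", 6, 128))
    print(ecmp_next_hop("198.51.100.10", "203.0.113.5", 6, 127))

    # With TTL excluded, the whole session hashes identically:
    assert (ecmp_next_hop("198.51.100.10", "203.0.113.5", 6, 128, False) ==
            ecmp_next_hop("198.51.100.10", "203.0.113.5", 6, 127, False))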

 

Fastly also pointed me to another article written by Joel Jaeggli
(https://blog.apnic.net/2018/01/11/ipv6-flow-label-misuse-hashing/) that
discusses IPv6 flow labels - we removed that from the border routers' IPv6
hash fields, too.

 

I reviewed the packet traces today and noticed that TTL values remained
consistent at 128 *behind* the router CPE. In packet captures on the WAN
interface of the router CPE I see that the SYN remains at 128, but the
TLS/Client Hello is properly decremented to 127. So, it appears that some
router CPE (and there were a variety of makes and models) are doing
something special to certain packets and not decrementing the TTL. 

This explains why:

*   our customers had issues with all their devices behind their router
CPE
*   the issue remained regardless of what public IP address their router
CPE obtained
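
Sidebar: a quick way to spot this behavior in captures is to flag any TCP
flow whose IPv4 TTL varies mid-session. A sketch, assuming scapy (the file
name is illustrative):

    from collections import defaultdict
    from scapy.all import rdpcap, IP, TCP

    ttls = defaultdict(set)
    for pkt in rdpcap("customer-wan.pcap"):
        if pkt.haslayer(IP) and pkt.haslayer(TCP):
            flow = (pkt[IP].src, pkt[IP].dst, pkt[TCP].sport, pkt[TCP].dport)
            ttls[flow].add(pkt[IP].ttl)

    for flow, seen in ttls.items():
        if len(seen) > 1:
            print(flow, "inconsistent TTLs:", sorted(seen))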

Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread William Herrin
On Sat, Sep 1, 2018 at 2:51 PM,   wrote:
> pointing out that a
> single traceroute to a Fastly site was hitting two of their POPs (they use
> anycast) and because they don’t sync state between POPs the second POP would
> naturally issue a TCP RST (sidebar: fascinating blog article on Fastly’s
> infrastructure here:
> https://www.fastly.com/blog/building-and-scaling-fastly-network-part-2-balancing-requests).

Oh for Pete's sake. If they're going to attempt Anycast TCP with a
unicast protocol stack they should at least have the sense to suppress
the RSTs.

Better yet, do the job right and build an anycast TCP stack as
described here: https://bill.herrin.us/network/anycasttcp.html

Regards,
Bill Herrin


-- 
William Herrin  her...@dirtside.com  b...@herrin.us
Dirtside Systems . Web: 


Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread Garrett Skjelstad
I would love this as a blog post to link folks that are not nanog members.

-Garrett

On Sat, Sep 1, 2018, 11:52  wrote:

> I want to share a little bit of our journey in tracking down the TCP RSTs
> that impacted some of our customers for almost ten weeks. [...]

Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread William Herrin
On Sat, Sep 1, 2018 at 4:00 PM, William Herrin  wrote:
> On Sat, Sep 1, 2018 at 2:51 PM,   wrote:
>> pointing out that a
>> single traceroute to a Fastly site was hitting two of their POPs (they use
>> anycast) and because they don’t sync state between POPs the second POP would
>> naturally issue a TCP RST (sidebar: fascinating blog article on Fastly’s
>> infrastructure here:
>> https://www.fastly.com/blog/building-and-scaling-fastly-network-part-2-balancing-requests).
>
> Better yet, do the job right and build an anycast TCP stack as
> described here: https://bill.herrin.us/network/anycasttcp.html

BTW, for anyone concerned about an explosion in state management
overhead, the TL;DR version is: the anycast node which first accepts
the TCP connection encodes its identity in the TCP sequence number
where all the other nodes can statelessly find it in the subsequent
packets. The exhaustive details for how that actually works are
covered in the paper at the URL above, which you'll have to read
despite its length if you want to understand.
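
For flavor only, here is a toy Python sketch of the trick; the paper's
actual encoding handles sequence-space consumption and other details this
ignores:

    import os

    NODE_BITS = 3                 # 8 anycast nodes, purely illustrative
    ID_SHIFT = 32 - NODE_BITS

    def make_isn(node_id):
        # Top bits carry the node ID; the rest is random. One bit of
        # headroom is left clear only so the demo below cannot wrap.
        rand = int.from_bytes(os.urandom(4), "big") & ((1 << (ID_SHIFT - 1)) - 1)
        return (node_id << ID_SHIFT) | rand

    def node_from_ack(ack):
        # Any node can read the accepting node's ID straight out of the
        # client's ACK number, with no shared state.
        return (ack & 0xFFFFFFFF) >> ID_SHIFT

    isn = make_isn(5)
    assert node_from_ack(isn + 100_000) == 5   # forward this packet to node 5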

Regards,
Bill Herrin



-- 
William Herrin  her...@dirtside.com  b...@herrin.us
Dirtside Systems . Web: 


Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread Ryan Landry
Glad we could help, Frank.

On Sat, Sep 1, 2018 at 11:54  wrote:

> I want to share a little bit of our journey in tracking down the TCP RSTs
> that impacted some of our customers for almost ten weeks. [...]

Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread Lee
On 9/1/18, William Herrin  wrote:
> On Sat, Sep 1, 2018 at 4:00 PM, William Herrin  wrote:
>> On Sat, Sep 1, 2018 at 2:51 PM,   wrote:
>>> pointing out that a
>>> single traceroute to a Fastly site was hitting two of their POPs (they
>>> use
>>> anycast) and because they don’t sync state between POPs the second POP
>>> would
>>> naturally issue a TCP RST (sidebar: fascinating blog article on Fastly’s
>>> infrastructure here:
>>> https://www.fastly.com/blog/building-and-scaling-fastly-network-part-2-balancing-requests).
>>
>> Better yet, do the job right and build an anycast TCP stack as
>> described here: https://bill.herrin.us/network/anycasttcp.html
>
> BTW, for anyone concerned about an explosion in state management
> overhead, the TL;DR version is: the anycast node which first accepts
> the TCP connection encodes its identity in the TCP sequence number
> where all the other nodes can statelessly find it in the subsequent
> packets. The exhaustive details for how that actually works are
> covered in the paper at the URL above, which you'll have to read
> despite its length if you want to understand.

An explosion in state management would be the least of my worries :)
I got as far as your "Third hook:" section and thought of this:
  https://www.jwz.org/doc/worse-is-better.html

I had it much easier with anycast in an enterprise setting.  With
anycast servers in data centers A & B, just make sure no site has an
equal cost path to A and B.  Any link/ router/ whatever failure & the
user can just re-try.

Lee


Re: TekSavvy (Canada) contact

2018-09-01 Thread Eric Kuhnke
Hey all,

It was not my intention to cause any unwarranted concern related to the
TekSavvy network. There are zero issues with their network. Every service I
have ever purchased from them is rock solid and reliable.

I'm in contact with Paul and others there directly. The topic of discussion
is related to a third party copper last mile access network in a specific
geographic region of Canada.



On Thu, Aug 30, 2018, 5:16 AM Paul Stewart  wrote:

> Folks – please do *not* request “clueful neteng point of contact” on
> the list if you are really looking to place an order for residential
> service.  Thanks …
>
>
>
> Paul
>
>
>
>
>
> From: NANOG  on behalf of "p...@paulstewart.org"
> Date: Wednesday, August 29, 2018 at 6:09 PM
> To: Mike Hammett 
> Cc: "nanog@nanog.org list" 
> Subject: Re: TekSavvy (Canada) contact
>
>
>
> Thnx all - already reached out
>
>
>
> Paul
>
>
>
>
>
>
> On Wed, Aug 29, 2018 at 6:05 PM -0400, "Mike Hammett" 
> wrote:
>
> "Paul Stewart" 
>
> He's on AFMUG too.
>
>
>
> -
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
> --
>
> *From: *"Eric Kuhnke" 
> *To: *"nanog@nanog.org list" 
> *Sent: *Wednesday, August 29, 2018 4:48:48 PM
> *Subject: *TekSavvy (Canada) contact
>
> I'm looking for a clueful neteng point of contact at TekSavvy. Please
> contact me off-list. Thanks!
>


Re: automatic rtbh trigger using flow data

2018-09-01 Thread Baldur Norddahl
On Fri, Aug 31, 2018 at 17:16, Hugo Slabbert wrote:

>
>
> I would love an upstream that accepts flowspec routes to get granular about
> drops and to basically push "stateless ACLs" upstream.
>
> _keeps dreaming_
>
>
>
We just need a signal to drop UDP for a prefix. The same as RTBH but only
for UDP. This would prevent all volumetric attacks without the end user
being cut off completely.
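
As a sketch of how little signalling that needs, assuming ExaBGP's process
API and its braced one-line flow syntax (verify against your deployment
before relying on it):

    import sys

    def drop_udp(prefix):
        # Announce a flowspec rule: discard UDP toward the victim prefix.
        cmd = ("announce flow route { match { destination %s; "
               "protocol udp; } then { discard; } }" % prefix)
        sys.stdout.write(cmd + "\n")
        sys.stdout.flush()   # ExaBGP reads API commands from our stdout

    drop_udp("192.0.2.10/32")   # illustrative victim prefix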

Aside from some games, VPN and VoIP, they would have an almost completely
normal internet experience. DNS would go through the ISP servers and only
be affected if the user is using a third party service.

Regards

Baldur


Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread William Herrin
On Sat, Sep 1, 2018 at 6:11 PM, Lee  wrote:
> On 9/1/18, William Herrin  wrote:
>> On Sat, Sep 1, 2018 at 4:00 PM, William Herrin  wrote:
>>> Better yet, do the job right and build an anycast TCP stack as
>>> described here: https://bill.herrin.us/network/anycasttcp.html
>
> An explosion in state management would be the least of my worries :)
> I got as far as your Third hook: and thought of this
>   https://www.jwz.org/doc/worse-is-better.html

Hi Lee,

On a brief tangent: Geographic routing would drastically simplify the
Internet core, reducing both cost and complexity. You'd need to carry
only nearby specific routes and a few broad aggregates for
destinations far away. It will never be implemented, never, because no
cross-ocean carriers are willing to have their bandwidth stolen when
the algorithm decides it likes their path better than a paid one. Even
though the algorithm gets the packets where they're going, and does so
simply, it does so in a way that's too often incorrect.

Then again, I don't really understand the MIT/New Jersey argument in
Richard's worse-is-better story. The MIT guy says that a routine
should handle a common non-fatal exception. The Jersey guy says that
it's ok for the routine to return a try-again error and expect the
caller to handle it. Since it's trivial to build another layer that
calls the routine in a loop until it returns success or a fatal error,
it's more a philosophical argument than a practical one. As long as a
correct result is consistently achieved in both cases, what's the
difference?

Richard characterized the Jersey argument as, "It is slightly better
to be simple than correct." I just don't see that in the Jersey
argument. Every component must be correct. The system of components as
a whole must be complete. It's slightly better for a component to be
simple than complete. That's the argument I read and it makes sense to
me.

Honestly, the idea that software is good enough even with known corner
cases that do something incorrect... I don't know how that survives in
a world where security-conscious programming is not optional.


> I had it much easier with anycast in an enterprise setting.  With
> anycast servers in data centers A & B, just make sure no site has an
> equal cost path to A and B.  Any link/ router/ whatever failure & the
> user can just re-try.

You've delicately balanced your network to achieve the principle that
even when routing around failures the anycast sites are not
equidistant from any other site. That isn't simplicity. It's
complexity hidden in the expert selection of magic numbers. Even were
that achievable in a network as chaotic as the Internet, is it simpler
than four trivial tweaks to the TCP stack plus a modestly complex but
fully automatic user-space program that correctly reroutes the small
percentage of packets that went astray?

Regards,
Bill Herrin



-- 
William Herrin  her...@dirtside.com  b...@herrin.us
Dirtside Systems . Web: 


RE: automatic rtbh trigger using flow data

2018-09-01 Thread Ryan Hamel
No ISP is in the business of filtering traffic unless the client pays the hefty 
fee since someone still has to tank the attack.

I also don’t think there is destination prefix IP filtering in flowspec, which 
could seriously cause problems.

From: NANOG  On Behalf Of Baldur Norddahl
Sent: Saturday, September 01, 2018 5:18 PM
To: nanog@nanog.org
Subject: Re: automatic rtbh trigger using flow data

> We just need a signal to drop UDP for a prefix. The same as RTBH but only
> for UDP. [...]



Re: automatic rtbh trigger using flow data

2018-09-01 Thread Roland Dobbins



On 1 Sep 2018, at 1:35, Aaron Gould wrote:

> I may mark internet-sourced-udp with a certain marking dscp/exp so
> that as it travels through my internet
> network, it will be the first to get dropped (? Wred ? work well for
> udp?) during congestion when an attack gets through


You can use flow telemetry analysis to look at the UDP non-initial 
fragments destined for any access networks under your control; you'll 
likely see that they comprise a tiny portion of the overall traffic mix, 
and they're most commonly associated with large DNS answers.
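
For example, a rough way to size that share from a packet sample; a sketch
assuming scapy, with an illustrative capture name:

    from scapy.all import rdpcap, IP

    total_bytes = frag_bytes = 0
    for pkt in rdpcap("access-net-sample.pcap"):
        if not pkt.haslayer(IP):
            continue
        total_bytes += len(pkt)
        # Non-initial fragment: fragment offset > 0. The UDP header is
        # absent, but the IP protocol field still reads UDP (17).
        if pkt[IP].proto == 17 and pkt[IP].frag > 0:
            frag_bytes += len(pkt)

    print("UDP non-initial fragments: %.3f%% of bytes"
          % (100.0 * frag_bytes / max(total_bytes, 1)))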


Once you've determined the numbers, you can police down the non-initial 
fragments destined for the access networks you control (don't do this on 
transit traffic!) to whatever small percentage makes the most sense, 
with a bit of extra headroom.  1% of link bandwidth works for several 
operators.


In that QoS policy, you exempt well-known/well-run open DNS recursor 
farms like Google DNS, OpenDNS, et. al. (and possibly your own, 
depending on your topology, etc.), and any other legitimate source CIDRs 
which generate appreciable amounts of non-initial fragments.


When a reflection/amplification attack which involves non-initial 
fragments hits, the QoS policy will sink a significant proportion of the 
attack.  It doesn't help with your peering links, but keeps the traffic 
off your core and off the large network(s).


Again, don't apply this across-the-board; only do it for access networks 
within your span of administrative control.


> * btw, what can you experts tell me about tcp-based volumetric
> attacks...


TCP reflection/amplification.

---
Roland Dobbins 


Re: automatic rtbh trigger using flow data

2018-09-01 Thread Roland Dobbins



On 1 Sep 2018, at 1:20, Lotia, Pratik M wrote:

> Arbor report mentions volumetric attacks using DNS, NTP form 75+% of
> the attacks.


I'm well aware of what's mentioned in the Arbor report, thanks!

;>


> Then QoSing certain ports and protocols is the best way to start with.


The point is that when applying broad policies of this nature, one must 
be very conservative, else one can cause larger problems on a macro 
scale.  Internet arteriosclerosis is a significant issue.


---
Roland Dobbins 


Re: automatic rtbh trigger using flow data

2018-09-01 Thread Roland Dobbins



On 1 Sep 2018, at 1:43, Hugo Slabbert wrote:

> Generally on the TCP side you can try SYN or ACK floods, but you're
> not going to get an amplified reflection.


Actually, TCP reflection/amplification has been on the increase; the 
attacker is guaranteed at least 4:1 amplification in most circumstances, 
the number of reflectors/amplifiers is for all practical purposes 
infinite, and they're mostly legitimate, non-broken 
services/applications.
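
Back-of-envelope on where an "at least 4:1" figure can come from, with
assumed sizes and retry counts:

    # A spoofed SYN never gets its SYN-ACK answered, so the reflector
    # retransmits the SYN-ACK several times (Linux defaults to 5 retries).
    SYN_BYTES = 60          # spoofed SYN on the wire (assumption)
    SYNACK_BYTES = 60       # each SYN-ACK (assumption)
    SYNACK_SENDS = 1 + 5    # initial transmission plus retries

    print("amplification ~ %.0f:1" % (SYNACK_SENDS * SYNACK_BYTES / SYN_BYTES))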


And as always, it's important to note that with all 
reflection/amplification attacks, the root of the issue is the lack of 
universal source-address validation (SAV).  Without the ability to 
spoof, there would be no reflection/amplification attacks.


---
Roland Dobbins 


Re: automatic rtbh trigger using flow data

2018-09-01 Thread Hugo Slabbert


On Sun 2018-Sep-02 10:09:32 +0700, Roland Dobbins  wrote:



> On 1 Sep 2018, at 1:43, Hugo Slabbert wrote:
>
>> Generally on the TCP side you can try SYN or ACK floods, but you're
>> not going to get an amplified reflection.
>
> Actually, TCP reflection/amplification has been on the increase; the
> attacker is guaranteed at least 4:1 amplification in most
> circumstances, the number of reflectors/amplifiers is for all
> practical purposes infinite, and they're mostly legitimate,
> non-broken services/applications.


Fair.  I guess in terms of common reflect/amp vector at $dayjob we just see 
UDP-based significantly more frequently on large volumetric attacks given 
the amp factor on some vectors is so huge.


Some relevant reading I need to revisit:
https://www.usenix.org/sites/default/files/conference/protected-files/woot14_slides_kuhrer.pdf
https://www.usenix.org/system/files/conference/woot14/woot14-kuhrer.pdf

> And as always, it's important to note that with all
> reflection/amplification attacks, the root of the issue is the lack of
> universal source-address validation (SAV).  Without the ability to spoof,
> there would be no reflection/amplification attacks.


ACK, pun intended.




--
Hugo Slabbert   | email, xmpp/jabber: h...@slabnet.com
pgp key: B178313E   | also on Signal




Re: automatic rtbh trigger using flow data

2018-09-01 Thread Hugo Slabbert


On Sun 2018-Sep-02 00:39:40 +, Ryan Hamel  wrote:

> No ISP is in the business of filtering traffic unless the client pays the
> hefty fee since someone still has to tank the attack.


If I can tag an RTBH community on a /32, what's the additional lost revenue 
in letting me be more granular and get down to the specific flows I want 
dropped?


"drop all traffic to x/32" would drop *more* traffic than "drop any traffic 
from address y to x/32, protocol TCP, port n".


> I also don’t think there is destination prefix IP filtering in flowspec,
> which could seriously cause problems.


What now?  Unless I'm misunderstanding what you're saying, it's right in 
the spec[1]:


   A flow specification NLRI must be validated such that it is
   considered feasible if and only if:

   a) The originator of the flow specification matches the originator of
  the best-match unicast route for the destination prefix embedded
  in the flow specification.

   b) There are no more specific unicast routes, when compared with the
  flow destination prefix, that have been received from a different
  neighboring AS than the best-match unicast route, which has been
  determined in step a).

   By originator of a BGP route, we mean either the BGP originator path
   attribute, as used by route reflection, or the transport address of
   the BGP peer, if this path attribute is not present.

   The underlying concept is that the neighboring AS that advertises the
   best unicast route for a destination is allowed to advertise flow-
   spec information that conveys a more or equally specific destination
   prefix.  Thus, as long as there are no more specific unicast routes,
   received from a different neighboring AS, which would be affected by
   that filtering rule.

   The neighboring AS is the immediate destination of the traffic
   described by the flow specification.  If it requests these flows to
   be dropped, that request can be honored without concern that it
   represents a denial of service in itself.  Supposedly, the traffic is
   being dropped by the downstream autonomous system, and there is no
   added value in carrying the traffic to it.
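
In pseudo-code, that feasibility check boils down to something like this
sketch (data structures invented for illustration, not any
implementation's code):

    import ipaddress
    from collections import namedtuple

    Route = namedtuple("Route", "prefix originator neighbor_as")

    def _net(p):
        return ipaddress.ip_network(p)

    def flowspec_feasible(flow_dst, flow_originator, rib):
        # rib: unicast Route entries; flow_dst like "192.0.2.0/24".
        covering = [r for r in rib if _net(flow_dst).subnet_of(_net(r.prefix))]
        if not covering:
            return False
        best = max(covering, key=lambda r: _net(r.prefix).prefixlen)
        if best.originator != flow_originator:
            return False                              # condition (a)
        return not any(                               # condition (b)
            _net(r.prefix).subnet_of(_net(flow_dst))
            and _net(r.prefix).prefixlen > _net(flow_dst).prefixlen
            and r.neighbor_as != best.neighbor_as
            for r in rib
        )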

--
Hugo Slabbert   | email, xmpp/jabber: h...@slabnet.com
pgp key: B178313E   | also on Signal

[1] https://tools.ietf.org/html/rfc5575





RE: automatic rtbh trigger using flow data

2018-09-01 Thread Michel Py
> Roland Dobbins wrote:
> I'm well aware of what's mentioned in the Arbor report, thanks!

I would not have guessed :P


> Ryan Hamel wrote:
> No ISP is in the business of filtering traffic unless the client pays the
> hefty fee since someone still has to tank the attack.

I agree. In the end, it tends to favor who has the biggest one.
I meant the biggest bandwidth, of course.

In the foreseeable future, no blacklist system is going to replace what Arbor
and its ilk can provide: a pipe big enough to either route the DDoS attack to
null0 or, even better, route it to somewhere it can be analyzed further.

Michel.



Re: Service provider story about tracking down TCP RSTs

2018-09-01 Thread Lee
On 9/1/18, William Herrin  wrote:
> On Sat, Sep 1, 2018 at 6:11 PM, Lee  wrote:
>> On 9/1/18, William Herrin  wrote:
>>> On Sat, Sep 1, 2018 at 4:00 PM, William Herrin  wrote:
 Better yet, do the job right and build an anycast TCP stack as
 described here: https://bill.herrin.us/network/anycasttcp.html
>>
>> An explosion in state management would be the least of my worries :)
>> I got as far as your Third hook: and thought of this
>>   https://www.jwz.org/doc/worse-is-better.html
>
> Hi Lee,
>
> On a brief tangent: Geographic routing would drastically simplify the
> Internet core, reducing both cost and complexity. You'd need to carry
> only nearby specific routes and a few broad aggregates for
> destinations far away. It will never be implemented, never, because no
> cross-ocean carriers are willing to have their bandwidth stolen when
> the algorithm decides it likes their path better than a paid one. Even
> though the algorithm gets the packets where they're going, and does so
> simply, it does so in a way that's too often incorrect.
>
> Then again, I don't really understand the MIT/New Jersey argument in
> Richard's worse-is-better story.

The "New Jersey" description is more of a caricature than a valid description:
  "I have intentionally caricatured the worse-is-better philosophy to
   convince you that it is obviously a bad philosophy and that the
   New Jersey approach is a bad approach."

I mentally did a 's/New Jersey/Microsoft/' and it made a lot more sense.

> The MIT guy says that a routine
> should handle a common non-fatal exception. The Jersey guy says that
> it's ok for the routine to return a try-again error and expect the
> caller to handle it. Since its trivial to build another layer that
> calls the routine in a loop until it returns success or a fatal error,
> it's more a philosophical argument than a practical one. As long as a
> correct result is consistently achieved in both cases, what's the
> difference?

That it's not always a trivial matter to build another layer.
That your retry layer needs at least a counter or timeout value so it
doesn't retry forever & those values need to be user configurable, so
the re-try layer isn't quite as trivial as it appears at first blush.

> Richard characterized the Jersey argument as, "It is slightly better
> to be simple than correct." I just don't see that in the Jersey
> argument. Every component must be correct. The system of components as
> a whole must be complete. It's slightly better for a component to be
> simple than complete. That's the argument I read and it makes sense to
> me.

Yes, I did a lot of interpreting also.  Then I hit on s/New
Jersey/Microsoft/ and it made a lot more sense to me.

> Honestly, the idea that software is good enough even with known corner
> cases that do something incorrect... I don't know how that survives in
> a world where security-conscious programming is not optional.

Agreed.  I substituted "soft-fail or fail-closed: user has to retry"
for doing something incorrect.

>> I had it much easier with anycast in an enterprise setting.  With
>> anycast servers in data centers A & B, just make sure no site has an
>> equal cost path to A and B.  Any link/ router/ whatever failure & the
>> user can just re-try.
>
> You've delicately balanced your network to achieve the principle that
> even when routing around failures the anycast sites are not
> equidistant from any other site. That isn't simplicity. It's
> complexity hidden in the expert selection of magic numbers.

^shrug^ it seemed simple to me.  And it was real easy to explain,
which is why I thought of that "worse is better" paper.  I took the
New Jersey approach & did what was basically a hack. You took the MIT
approach and created a general solution .. which is not so easy to
explain :)

> Even were that achievable in a network as chaotic as the Internet, is it 
> simpler
> than four trivial tweaks to the TCP stack plus a modestly complex but
> fully automatic user-space program that correctly reroutes the small
> percentage of packets that went astray?

Your four trivial tweaks to the TCP stack are kernel patches - right?
Which seems not at all trivial to me, but if you've got a group of
people that can support & maintain that - good for you!

Regards
Lee