> On Apr 29, 2020, at 7:59 PM, Kaiser, Erich <er...@gotfusion.net> wrote:
>
> So it has been 3 weeks of major ICMP packet loss to any google service over
> the Dallas Equinix IX, it is not affecting performance of service but is
> affecting us with customer complaints and service calls due to some software
> using it for monitoring purposes people using it for benchmark testing. I
> have been told from them that they know the cause now and know that a Large
> ISP on the IX is causing the issue(Hmm wonder who that is...), so why do they
> not shutdown the peer with them and force the ISP to fix the issue? This
> issue is affecting everyone on the IX not just us, very very frustrating.
> Hopefully this will reach someone over there that can do something about it….
Issues with the IXP ecosystem aren’t new in the US. This is why some providers
don’t appear at them. The original case of one member being able to hurt
everyone else was really the gigaswitch HOLB (head-of-line blocking) issue,
which was triggered by congested ports.
(Waits for others who were more involved in this to crawl out of the
woodwork :-)
This is why the majority of traffic volume for interconnection has generally
been over private peering links (paid, SFI, otherwise).
If you tried to force it all through an IXP ecosystem, the tens of Tbps wouldn’t
fit, even within each city. Things like CDNs, the Netflix OpenConnect and
similar have really shifted the demand off the interconnection points as much
as feasible. Sometimes an organization can’t handle that or tries to cling to
its old ways. Sometimes it takes organizational change or a change of people to
improve the situation.
I know it can sound like a broken record, but upgrading to match the capacity
demands really can make a difference to offload paths. It may also expose
other weak points. My personal goal is to stop thinking about things in the
95/5 model and think more in terms of a peak model. 95/5 gets you so far, but
the peaks are really where networks can shine or show their age.
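
To make the 95/5 versus peak point concrete, here is a rough sketch. The
sample data, the 10 Gbps port size, and the function names are all made up for
illustration, and the percentile uses one common convention (sort the samples,
drop the top 5%, take the highest one left); your own tooling may compute it
slightly differently.

```python
import math


def percentile_95(samples_mbps):
    """95/5 figure: sort the samples, drop the top 5%, return the highest remaining."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Index of the highest sample left once the top 5% are discarded.
    idx = math.ceil(0.95 * len(ordered)) - 1
    return ordered[idx]


def summarize(samples_mbps, capacity_mbps):
    """Compare port utilization under the 95/5 view and the peak view."""
    return {
        "p95_util": percentile_95(samples_mbps) / capacity_mbps,
        "peak_util": max(samples_mbps) / capacity_mbps,
    }


# Hypothetical day of 5-minute samples (288 total) on a 10 Gbps port:
# 23 quiet hours at 3 Gbps plus one busy hour (12 samples) at 9.8 Gbps.
samples = [3000] * 276 + [9800] * 12
report = summarize(samples, capacity_mbps=10000)

print(f"95/5 utilization: {report['p95_util']:.0%}")
print(f"peak utilization: {report['peak_util']:.0%}")
```

In this made-up case the busy hour is under 5% of the day, so it vanishes from
the 95/5 number entirely: the port looks 30% full on paper while it is running
at 98% exactly when people are using it.
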
I understand it’s not always possible to upgrade links, or sometimes one party
holds out on the other. That’s certainly not the case at $dayjob, and I try to
ensure the process works as well as it can here.
Sometimes it’s best to just de-peer a network. You may find it works out
better for all involved.
At $nightJob I want to peer off as much traffic as possible, but if the network
paths aren’t there or are low-speed it may not make sense.
Evaluate your peers periodically to ensure you are getting what you expect.
- jared