Re: Centurylink having a bad morning?

2020-08-31 Thread Tom Beecher
In this specific event, 3356 not withdrawing routes is certainly a head
scratcher, and I'm sure for many it is the thing we're most looking forward
to a definitive answer on.

However, if a network only has 3356 as their upstream, they are 100% at the
mercy of 3356 at all times. Having a redundant AND diverse connection to a
2nd upstream ASN at least provides you some options. In this case, for
example, let's say at all times you did a +2 prepend to both 3356 and Acme.
The 3356 event happens, and you shut down your session to them. Some
percentage of your traffic that would have been faceplanting in/through 3356
now works via Acme. Then you notice the non-withdrawal issue. You can then
remove 1 prepend toward Acme, or perhaps deagg strategically to try and get
more traffic away from the trouble.

A redundant path to a different upstream at least provides you some
options to work around the problem that you otherwise would not have.
It wouldn't be perfect, but options > no options.
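
To make the prepend scenario above concrete, here is a minimal sketch in Python. It assumes a remote network that compares only AS-path length, which is a simplification: real BGP best-path selection looks at local preference first, so networks that prefer 3356 by policy would not be swayed. The customer ASN 64500 and the Acme ASN 64510 are made up for illustration, and the helper names are hypothetical.

```python
# Hypothetical ASNs for illustration only (private-use / documentation range).
CUSTOMER = 64500
ACME = 64510
L3 = 3356


def announced_path(upstream, prepends):
    """AS path as seen just beyond the upstream: the upstream ASN,
    then the customer ASN once plus once more per prepend."""
    return [upstream] + [CUSTOMER] * (1 + prepends)


def preferred(paths):
    """Pick the shortest AS path. Ties keep the first entry, standing in
    for the later tie-breakers that real BGP would apply."""
    return min(paths, key=len)


# Steady state: +2 prepends toward both upstreams, so the paths tie on length.
stale_3356 = announced_path(L3, 2)
via_acme = announced_path(ACME, 2)

# After the event: the session to 3356 is shut down, but 3356 keeps
# announcing the stale path. Dropping one prepend toward Acme makes the
# Acme path shorter than the stale one.
via_acme_short = announced_path(ACME, 1)
best = preferred([stale_3356, via_acme_short])
print("preferred upstream:", best[0])
```

Starting with spare prepends on both upstreams is what leaves room to shorten one path later; with no prepends in place there would be nothing to remove.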

On Mon, Aug 31, 2020 at 5:08 PM Warren Kumari  wrote:

> On Mon, Aug 31, 2020 at 4:36 PM Tom Beecher  wrote:
> >
> > Hopefully those customers learned the difference between redundancy and
> diversity this weekend. :)
>
> I'm unclear how either solves things for many customers...
>
> If they had CenturyLink and AcmeNetworkWidgets, and announce the same
> network through both -- and their connection to CL went down, *but CL
> continues to announce / doesn't withdraw* they are still stuck, yes?
> (Unless they can deaggregate that is...)
> What am I missing?
>
> W
>
>
> >
> > On Mon, Aug 31, 2020 at 3:48 PM Eric Kuhnke 
> wrote:
> >>
> >> There's a number of enterprise end user type customers of 3356 that
> have on-premises server rooms/hosting for their stuff. And they spend a lot
> of money every month for a 'redundant' metro ethernet circuit that takes
> diverse fiber paths from their business park office building to the local
> clink/level3 POP. But all that last mile redundancy and fail over ability
> doesn't do much for them when 3356 breaks its network at the BGP level.
> >>
> >>
> >>
> >> On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver 
> wrote:
> >>>
> >>> I also found the part where they mention that a lot of hosting
> companies only have one uplink to be quizzical and also the fact that he
> goes pretty close to implying that its Centurylink’s customers fault for
> not having multiple paths to Cloudflare that don’t touch Centurylink a bit
> puzzling. It could have just been poorly written.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> From: NANOG  On
> Behalf Of Tom Beecher
> >>> Sent: Monday, August 31, 2020 9:26 AM
> >>> To: Hank Nussbacher 
> >>> Cc: NANOG 
> >>> Subject: Re: Centurylink having a bad morning?
> >>>
> >>>
> >>>
> >>>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
> >>>
> >>>
> >>>
> >>> I definitely found Mr. Prince's writing about yesterday's events
> fascinating.
> >>>
> >>>
> >>>
> >>> Verizon makes a mistake with BGP filters that allows a secondary
> mistake from leaked "optimizer" routes to propagate, and Mr. Prince takes
> every opportunity to lob large chunks of granite about how terrible they
> are.
> >>>
> >>>
> >>>
> >>> L3 allows an erroneous flowspec announcement to cause massive global
> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher 
> wrote:
> >>>
> >>> On 30/08/2020 20:08, Baldur Norddahl wrote:
> >>>
> >>>
> >>>
> >>>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
> >>>
> >>>
> >>>
> >>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
> >>>
> >>>
> >>>
> >>> But that is Cloudflare speculation.
> >>>
> >>>
> >>>
> >>> Regards,
> >>> Hank
> >>>
> >>> Caveat: The views expressed above are solely my own and do not express
> the views or opinions of my employer
> >>>
> >>>
> >>>
> >>> An outage is what it is. I am not worried about outages. We have
> multiple transits to deal with that.
> >>>
> >>>
> >>>
> >>> It is the keep announcing prefixes after withdrawal from peers and
> customers that is the huge problem here. That is killing all the effort and
> money I put into having redundancy. It is sabotage of my network after I
> cut the ties. I do not want to be a customer at an outlet who has a system
> that will do that. Luckily we do not currently have a contract and now they
> will have to convince me it is safe for me to make a contract with them. If
> that is impossible I guess I won't be getting a contract with them.
> >>>
> >>>
> >>>
> >>> But I disagree in that it would be impossible. They need to make a
> good report telling exactly what went wrong and how they changed the
> design, so something like this can not happen again. The basic design of
> BGP is such that this should not happen easily if at all. They did
> something unwise. Did they make a route reflector based on a database or
> 

Re: Centurylink having a bad morning?

2020-08-31 Thread Warren Kumari
On Mon, Aug 31, 2020 at 4:36 PM Tom Beecher  wrote:
>
> Hopefully those customers learned the difference between redundancy and 
> diversity this weekend. :)

I'm unclear how either solves things for many customers...

If they had CenturyLink and AcmeNetworkWidgets, and announce the same
network through both -- and their connection to CL went down, *but CL
continues to announce / doesn't withdraw* they are still stuck, yes?
(Unless they can deaggregate that is...)
What am I missing?
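
A rough sketch of why deaggregation helps here, assuming routers forward on the longest matching prefix regardless of which path the covering route prefers. The prefix 198.18.0.0/22 and the next-hop labels are placeholders, not anything from the actual event, and `best_route` is a hypothetical helper. More-specifics only help if they are short enough to pass other networks' prefix-length filters.

```python
import ipaddress


def best_route(destination, routes):
    """Return the (network, next_hop) pair with the longest prefix
    that contains the destination address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    return max(matches, key=lambda route: route[0].prefixlen)


# Placeholder aggregate still being announced by the broken upstream.
aggregate = ipaddress.ip_network("198.18.0.0/22")
routes = [(aggregate, "3356 (stale)")]

# Before deaggregating, the stale aggregate is the only match.
before = best_route("198.18.1.10", routes)

# Announce the two /23 halves via the working upstream.
routes += [(half, "Acme") for half in aggregate.subnets(prefixlen_diff=1)]
after = best_route("198.18.1.10", routes)

print("before:", before[1])
print("after: ", after[1], "via", after[0])
```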

W


>
> On Mon, Aug 31, 2020 at 3:48 PM Eric Kuhnke  wrote:
>>
>> There's a number of enterprise end user type customers of 3356 that have 
>> on-premises server rooms/hosting for their stuff. And they spend a lot of 
>> money every month for a 'redundant' metro ethernet circuit that takes 
>> diverse fiber paths from their business park office building to the local 
>> clink/level3 POP. But all that last mile redundancy and fail over ability 
>> doesn't do much for them when 3356 breaks its network at the BGP level.
>>
>>
>>
>> On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver  wrote:
>>>
>>> I also found the part where they mention that a lot of hosting companies 
>>> only have one uplink to be quizzical and also the fact that he goes pretty 
>>> close to implying that its Centurylink’s customers fault for not having 
>>> multiple paths to Cloudflare that don’t touch Centurylink a bit puzzling. 
>>> It could have just been poorly written.
>>>
>>>
>>>
>>>
>>>
>>> From: NANOG  On Behalf Of 
>>> Tom Beecher
>>> Sent: Monday, August 31, 2020 9:26 AM
>>> To: Hank Nussbacher 
>>> Cc: NANOG 
>>> Subject: Re: Centurylink having a bad morning?
>>>
>>>
>>>
>>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>>
>>>
>>>
>>> I definitely found Mr. Prince's writing about yesterday's events 
>>> fascinating.
>>>
>>>
>>>
>>> Verizon makes a mistake with BGP filters that allows a secondary mistake 
>>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every 
>>> opportunity to lob large chunks of granite about how terrible they are.
>>>
>>>
>>>
>>> L3 allows an erroneous flowspec announcement to cause massive global 
>>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher  wrote:
>>>
>>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>>>
>>>
>>>
>>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>>
>>>
>>>
>>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>>
>>>
>>>
>>> But that is Cloudflare speculation.
>>>
>>>
>>>
>>> Regards,
>>> Hank
>>>
>>> Caveat: The views expressed above are solely my own and do not express the 
>>> views or opinions of my employer
>>>
>>>
>>>
>>> An outage is what it is. I am not worried about outages. We have multiple 
>>> transits to deal with that.
>>>
>>>
>>>
>>> It is the keep announcing prefixes after withdrawal from peers and 
>>> customers that is the huge problem here. That is killing all the effort and 
>>> money I put into having redundancy. It is sabotage of my network after I 
>>> cut the ties. I do not want to be a customer at an outlet who has a system 
>>> that will do that. Luckily we do not currently have a contract and now they 
>>> will have to convince me it is safe for me to make a contract with them. If 
>>> that is impossible I guess I won't be getting a contract with them.
>>>
>>>
>>>
>>> But I disagree in that it would be impossible. They need to make a good 
>>> report telling exactly what went wrong and how they changed the design, so 
>>> something like this can not happen again. The basic design of BGP is such 
>>> that this should not happen easily if at all. They did something unwise. 
>>> Did they make a route reflector based on a database or something?
>>>
>>>
>>>
>>> Regards,
>>>
>>>
>>>
>>> Baldur
>>>
>>>
>>>
>>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho  wrote:
>>>
>>> Exactly. And asking that they somehow prove this won't happen again is 
>>> impossible.
>>>
>>> - Mike Bolitho
>>>
>>>
>>>
>>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver  wrote:
>>>
>>> I’m not defending them but I am sure it isn’t intentional.
>>>
>>>
>>>
>>> From: NANOG  On Behalf Of 
>>> Baldur Norddahl
>>> Sent: Sunday, August 30, 2020 9:28 AM
>>> To: nanog@nanog.org
>>> Subject: Re: Centurylink having a bad morning?
>>>
>>>
>>>
>>> How is that acceptable behaviour? I shall remember never to make a contract 
>>> with these guys until they can prove that they won't advertise my prefixes 
>>> after I pull them. Under any circumstances.
>>>
>>>
>>>
>>> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins wrote:
>>>
>>> Finally got through on their support line and spoke to level1. The only 
>>> thing the tech could say was it was an issue with BGP route reflectors and 
>>> it started about 3am(pacific). They were still trying to isolate the issue. 
>>> I've tried failing over my circuits and no go, the traffic just dies as 

Re: Centurylink having a bad morning?

2020-08-31 Thread Ben Cannon
We’re bailing out a customer in exactly this same boat as we speak.  There are 
so many.

Ms. Benjamin PD Cannon, ASCE
6x7 Networks & 6x7 Telecom, LLC 
CEO 
b...@6by7.net
"The only fully end-to-end encrypted global telecommunications company in the
world."

FCC License KJ6FJJ



> On Aug 31, 2020, at 12:52 PM, Eric Kuhnke  wrote:
> 
> 
> There's a number of enterprise end user type customers of 3356 that have 
> on-premises server rooms/hosting for their stuff. And they spend a lot of 
> money every month for a 'redundant' metro ethernet circuit that takes diverse 
> fiber paths from their business park office building to the local 
> clink/level3 POP. But all that last mile redundancy and fail over ability 
> doesn't do much for them when 3356 breaks its network at the BGP level.
> 
> 
> 
>> On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver  wrote:
>> I also found the part where they mention that a lot of hosting companies 
>> only have one uplink to be quizzical and also the fact that he goes pretty 
>> close to implying that its Centurylink’s customers fault for not having 
>> multiple paths to Cloudflare that don’t touch Centurylink a bit puzzling. It 
>> could have just been poorly written.
>> 
>>  
>> 
>>  
>> 
>> From: NANOG  On Behalf Of 
>> Tom Beecher
>> Sent: Monday, August 31, 2020 9:26 AM
>> To: Hank Nussbacher 
>> Cc: NANOG 
>> Subject: Re: Centurylink having a bad morning?
>> 
>>  
>> 
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>> 
>>  
>> 
>> I definitely found Mr. Prince's writing about yesterday's events fascinating.
>> 
>>  
>> 
>> Verizon makes a mistake with BGP filters that allows a secondary mistake 
>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every 
>> opportunity to lob large chunks of granite about how terrible they are. 
>> 
>>  
>> 
>> L3 allows an erroneous flowspec announcement to cause massive global 
>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen." 
>> 
>>  
>> 
>>  
>> 
>>  
>> 
>>  
>> 
>>  
>> 
>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher  wrote:
>> 
>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>> 
>>  
>> 
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>> 
>>  
>> 
>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>> 
>>  
>> 
>> But that is Cloudflare speculation.
>> 
>>  
>> 
>> Regards,
>> Hank
>> 
>> Caveat: The views expressed above are solely my own and do not express the 
>> views or opinions of my employer
>> 
>>  
>> 
>> An outage is what it is. I am not worried about outages. We have multiple 
>> transits to deal with that.
>> 
>>  
>> 
>> It is the keep announcing prefixes after withdrawal from peers and customers 
>> that is the huge problem here. That is killing all the effort and money I 
>> put into having redundancy. It is sabotage of my network after I cut the 
>> ties. I do not want to be a customer at an outlet who has a system that will 
>> do that. Luckily we do not currently have a contract and now they will have 
>> to convince me it is safe for me to make a contract with them. If that is 
>> impossible I guess I won't be getting a contract with them.
>> 
>>  
>> 
>> But I disagree in that it would be impossible. They need to make a good 
>> report telling exactly what went wrong and how they changed the design, so 
>> something like this can not happen again. The basic design of BGP is such 
>> that this should not happen easily if at all. They did something unwise. Did 
>> they make a route reflector based on a database or something?
>> 
>>  
>> 
>> Regards,
>> 
>>  
>> 
>> Baldur
>> 
>>  
>> 
>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho  wrote:
>> 
>> Exactly. And asking that they somehow prove this won't happen again is 
>> impossible.
>> 
>> - Mike Bolitho
>> 
>>  
>> 
>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver  wrote:
>> 
>> I’m not defending them but I am sure it isn’t intentional.
>> 
>>  
>> 
>> From: NANOG  On Behalf Of 
>> Baldur Norddahl
>> Sent: Sunday, August 30, 2020 9:28 AM
>> To: nanog@nanog.org
>> Subject: Re: Centurylink having a bad morning?
>> 
>>  
>> 
>> How is that acceptable behaviour? I shall remember never to make a contract 
>> with these guys until they can prove that they won't advertise my prefixes 
>> after I pull them. Under any circumstances. 
>> 
>>  
>> 
>> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins wrote:
>> 
>> Finally got through on their support line and spoke to level1. The only 
>> thing the tech could say was it was an issue with BGP route reflectors and 
>> it started about 3am(pacific). They were still trying to isolate the issue. 
>> I've tried failing over my circuits and no go, the traffic just dies as L3 
>> won't stop advertising my routes.
>> 
>>  
>> 
>> On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG  
>> wrote:
>> 
>> Hello,
>> 
>>  
>> 
>> Woke up this morning to a bunch of reports of issues with connectivity had 
>> to shut down some Level3/CTL 

Re: Centurylink having a bad morning?

2020-08-31 Thread Tom Beecher
Hopefully those customers learned the difference between redundancy and
diversity this weekend. :)

On Mon, Aug 31, 2020 at 3:48 PM Eric Kuhnke  wrote:

> There's a number of enterprise end user type customers of 3356 that have
> on-premises server rooms/hosting for their stuff. And they spend a lot of
> money every month for a 'redundant' metro ethernet circuit that takes
> diverse fiber paths from their business park office building to the local
> clink/level3 POP. But all that last mile redundancy and fail over ability
> doesn't do much for them when 3356 breaks its network at the BGP level.
>
>
>
> On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver 
> wrote:
>
>> I also found the part where they mention that a lot of hosting companies
>> only have one uplink to be quizzical and also the fact that he goes pretty
>> close to implying that its Centurylink’s customers fault for not having
>> multiple paths to Cloudflare that don’t touch Centurylink a bit puzzling.
>> It could have just been poorly written.
>>
>>
>>
>>
>>
>> *From:* NANOG  *On
>> Behalf Of *Tom Beecher
>> *Sent:* Monday, August 31, 2020 9:26 AM
>> *To:* Hank Nussbacher 
>> *Cc:* NANOG 
>> *Subject:* Re: Centurylink having a bad morning?
>>
>>
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>
>> I definitely found Mr. Prince's writing about yesterday's events
>> fascinating.
>>
>>
>>
>> Verizon makes a mistake with BGP filters that allows a secondary mistake
>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every
>> opportunity to lob large chunks of granite about how terrible they are.
>>
>>
>>
>> L3 allows an erroneous flowspec announcement to cause massive global
>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher 
>> wrote:
>>
>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>>
>>
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>
>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>
>>
>>
>> But that is Cloudflare speculation.
>>
>>
>>
>> Regards,
>> Hank
>>
>> Caveat: The views expressed above are solely my own and do not express
>> the views or opinions of my employer
>>
>>
>>
>> An outage is what it is. I am not worried about outages. We have multiple
>> transits to deal with that.
>>
>>
>>
>> It is the keep announcing prefixes after withdrawal from peers and
>> customers that is the huge problem here. That is killing all the effort and
>> money I put into having redundancy. It is sabotage of my network after I
>> cut the ties. I do not want to be a customer at an outlet who has a system
>> that will do that. Luckily we do not currently have a contract and now they
>> will have to convince me it is safe for me to make a contract with them. If
>> that is impossible I guess I won't be getting a contract with them.
>>
>>
>>
>> But I disagree in that it would be impossible. They need to make a good
>> report telling exactly what went wrong and how they changed the design, so
>> something like this can not happen again. The basic design of BGP is such
>> that this should not happen easily if at all. They did something unwise.
>> Did they make a route reflector based on a database or something?
>>
>>
>>
>> Regards,
>>
>>
>>
>> Baldur
>>
>>
>>
>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho 
>> wrote:
>>
>> Exactly. And asking that they somehow prove this won't happen again is
>> impossible.
>>
>> - Mike Bolitho
>>
>>
>>
>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver  wrote:
>>
>> I’m not defending them but I am sure it isn’t intentional.
>>
>>
>>
>> *From:* NANOG  *On
>> Behalf Of *Baldur Norddahl
>> *Sent:* Sunday, August 30, 2020 9:28 AM
>> *To:* nanog@nanog.org
>> *Subject:* Re: Centurylink having a bad morning?
>>
>>
>>
>> How is that acceptable behaviour? I shall remember never to make a
>> contract with these guys until they can prove that they won't advertise my
>> prefixes after I pull them. Under any circumstances.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins wrote:
>>
>> Finally got through on their support line and spoke to level1. The only
>> thing the tech could say was it was an issue with BGP route reflectors and
>> it started about 3am(pacific). They were still trying to isolate the issue.
>> I've tried failing over my circuits and no go, the traffic just dies as L3
>> won't stop advertising my routes.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG 
>> wrote:
>>
>> Hello,
>>
>>
>>
>> Woke up this morning to a bunch of reports of issues with connectivity
>> had to shut down some Level3/CTL connections to get it to return to normal.
>>
>>
>>
>> As of right now their support portal won’t load:
>> https://www.centurylink.com/business/login/
>>
>>
>>
>> Just wondering what others are seeing.
>>
>>
>>
>>
>>
>>


Re: Centurylink having a bad morning?

2020-08-31 Thread Warren Kumari
On Mon, Aug 31, 2020 at 3:52 PM Eric Kuhnke  wrote:
>
> There's a number of enterprise end user type customers of 3356 that have 
> on-premises server rooms/hosting for their stuff. And they spend a lot of 
> money every month for a 'redundant' metro ethernet circuit that takes diverse 
> fiber paths from their business park office building to the local 
> clink/level3 POP. But all that last mile redundancy and fail over ability 
> doesn't do much for them when 3356 breaks its network at the BGP level.

There is a lot of stuff that fails in an ugly way when a network
breaks and doesn't withdraw; in many (most?) ways it acts just like a
hijack...

W


>
>
>
> On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver  wrote:
>>
>> I also found the part where they mention that a lot of hosting companies 
>> only have one uplink to be quizzical and also the fact that he goes pretty 
>> close to implying that its Centurylink’s customers fault for not having 
>> multiple paths to Cloudflare that don’t touch Centurylink a bit puzzling. It 
>> could have just been poorly written.
>>
>>
>>
>>
>>
>> From: NANOG  On Behalf Of 
>> Tom Beecher
>> Sent: Monday, August 31, 2020 9:26 AM
>> To: Hank Nussbacher 
>> Cc: NANOG 
>> Subject: Re: Centurylink having a bad morning?
>>
>>
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>
>> I definitely found Mr. Prince's writing about yesterday's events fascinating.
>>
>>
>>
>> Verizon makes a mistake with BGP filters that allows a secondary mistake 
>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every 
>> opportunity to lob large chunks of granite about how terrible they are.
>>
>>
>>
>> L3 allows an erroneous flowspec announcement to cause massive global 
>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher  wrote:
>>
>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>>
>>
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>>
>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>
>>
>>
>> But that is Cloudflare speculation.
>>
>>
>>
>> Regards,
>> Hank
>>
>> Caveat: The views expressed above are solely my own and do not express the 
>> views or opinions of my employer
>>
>>
>>
>> An outage is what it is. I am not worried about outages. We have multiple 
>> transits to deal with that.
>>
>>
>>
>> It is the keep announcing prefixes after withdrawal from peers and customers 
>> that is the huge problem here. That is killing all the effort and money I 
>> put into having redundancy. It is sabotage of my network after I cut the 
>> ties. I do not want to be a customer at an outlet who has a system that will 
>> do that. Luckily we do not currently have a contract and now they will have 
>> to convince me it is safe for me to make a contract with them. If that is 
>> impossible I guess I won't be getting a contract with them.
>>
>>
>>
>> But I disagree in that it would be impossible. They need to make a good 
>> report telling exactly what went wrong and how they changed the design, so 
>> something like this can not happen again. The basic design of BGP is such 
>> that this should not happen easily if at all. They did something unwise. Did 
>> they make a route reflector based on a database or something?
>>
>>
>>
>> Regards,
>>
>>
>>
>> Baldur
>>
>>
>>
>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho  wrote:
>>
>> Exactly. And asking that they somehow prove this won't happen again is 
>> impossible.
>>
>> - Mike Bolitho
>>
>>
>>
>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver  wrote:
>>
>> I’m not defending them but I am sure it isn’t intentional.
>>
>>
>>
>> From: NANOG  On Behalf Of 
>> Baldur Norddahl
>> Sent: Sunday, August 30, 2020 9:28 AM
>> To: nanog@nanog.org
>> Subject: Re: Centurylink having a bad morning?
>>
>>
>>
>> How is that acceptable behaviour? I shall remember never to make a contract 
>> with these guys until they can prove that they won't advertise my prefixes 
>> after I pull them. Under any circumstances.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins wrote:
>>
>> Finally got through on their support line and spoke to level1. The only 
>> thing the tech could say was it was an issue with BGP route reflectors and 
>> it started about 3am(pacific). They were still trying to isolate the issue. 
>> I've tried failing over my circuits and no go, the traffic just dies as L3 
>> won't stop advertising my routes.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG  
>> wrote:
>>
>> Hello,
>>
>>
>>
>> Woke up this morning to a bunch of reports of issues with connectivity had 
>> to shut down some Level3/CTL connections to get it to return to normal.
>>
>>
>>
>> As of right now their support portal won’t load: 
>> https://www.centurylink.com/business/login/
>>
>>
>>
>> Just wondering what others are seeing.
>>
>>
>>
>>



-- 
I don't think the execution 

Re: Centurylink having a bad morning?

2020-08-31 Thread Eric Kuhnke
There are a number of enterprise end-user type customers of 3356 that have
on-premises server rooms/hosting for their stuff. And they spend a lot of
money every month for a 'redundant' metro ethernet circuit that takes
diverse fiber paths from their business park office building to the local
clink/level3 POP. But all that last-mile redundancy and failover ability
doesn't do much for them when 3356 breaks its network at the BGP level.



On Mon, Aug 31, 2020 at 9:36 AM Drew Weaver  wrote:

> I also found the part where they mention that a lot of hosting companies
> only have one uplink to be quizzical and also the fact that he goes pretty
> close to implying that its Centurylink’s customers fault for not having
> multiple paths to Cloudflare that don’t touch Centurylink a bit puzzling.
> It could have just been poorly written.
>
>
>
>
>
> *From:* NANOG  *On Behalf
> Of *Tom Beecher
> *Sent:* Monday, August 31, 2020 9:26 AM
> *To:* Hank Nussbacher 
> *Cc:* NANOG 
> *Subject:* Re: Centurylink having a bad morning?
>
>
>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>
>
>
> I definitely found Mr. Prince's writing about yesterday's events
> fascinating.
>
>
>
> Verizon makes a mistake with BGP filters that allows a secondary mistake
> from leaked "optimizer" routes to propagate, and Mr. Prince takes every
> opportunity to lob large chunks of granite about how terrible they are.
>
>
>
> L3 allows an erroneous flowspec announcement to cause massive global
> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>
>
>
>
>
>
>
>
>
>
>
> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher 
> wrote:
>
> On 30/08/2020 20:08, Baldur Norddahl wrote:
>
>
>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>
>
>
> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>
>
>
> But that is Cloudflare speculation.
>
>
>
> Regards,
> Hank
>
> Caveat: The views expressed above are solely my own and do not express the
> views or opinions of my employer
>
>
>
> An outage is what it is. I am not worried about outages. We have multiple
> transits to deal with that.
>
>
>
> It is the keep announcing prefixes after withdrawal from peers and
> customers that is the huge problem here. That is killing all the effort and
> money I put into having redundancy. It is sabotage of my network after I
> cut the ties. I do not want to be a customer at an outlet who has a system
> that will do that. Luckily we do not currently have a contract and now they
> will have to convince me it is safe for me to make a contract with them. If
> that is impossible I guess I won't be getting a contract with them.
>
>
>
> But I disagree in that it would be impossible. They need to make a good
> report telling exactly what went wrong and how they changed the design, so
> something like this can not happen again. The basic design of BGP is such
> that this should not happen easily if at all. They did something unwise.
> Did they make a route reflector based on a database or something?
>
>
>
> Regards,
>
>
>
> Baldur
>
>
>
> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho 
> wrote:
>
> Exactly. And asking that they somehow prove this won't happen again is
> impossible.
>
> - Mike Bolitho
>
>
>
> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver  wrote:
>
> I’m not defending them but I am sure it isn’t intentional.
>
>
>
> *From:* NANOG  *On Behalf
> Of *Baldur Norddahl
> *Sent:* Sunday, August 30, 2020 9:28 AM
> *To:* nanog@nanog.org
> *Subject:* Re: Centurylink having a bad morning?
>
>
>
> How is that acceptable behaviour? I shall remember never to make a
> contract with these guys until they can prove that they won't advertise my
> prefixes after I pull them. Under any circumstances.
>
>
>
> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins wrote:
>
> Finally got through on their support line and spoke to level1. The only
> thing the tech could say was it was an issue with BGP route reflectors and
> it started about 3am(pacific). They were still trying to isolate the issue.
> I've tried failing over my circuits and no go, the traffic just dies as L3
> won't stop advertising my routes.
>
>
>
> On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG 
> wrote:
>
> Hello,
>
>
>
> Woke up this morning to a bunch of reports of issues with connectivity had
> to shut down some Level3/CTL connections to get it to return to normal.
>
>
>
> As of right now their support portal won’t load:
> https://www.centurylink.com/business/login/
>
>
>
> Just wondering what others are seeing.
>
>
>
>
>
>


RE: Centurylink having a bad morning?

2020-08-31 Thread Drew Weaver
I also found the part where they mention that a lot of hosting companies only
have one uplink to be puzzling, as well as the fact that he comes pretty close
to implying that it's CenturyLink's customers' fault for not having multiple
paths to Cloudflare that don't touch CenturyLink. It could have just been
poorly written.


From: NANOG  On Behalf Of Tom 
Beecher
Sent: Monday, August 31, 2020 9:26 AM
To: Hank Nussbacher 
Cc: NANOG 
Subject: Re: Centurylink having a bad morning?

https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

I definitely found Mr. Prince's writing about yesterday's events fascinating.

Verizon makes a mistake with BGP filters that allows a secondary mistake from 
leaked "optimizer" routes to propagate, and Mr. Prince takes every opportunity 
to lob large chunks of granite about how terrible they are.

L3 allows an erroneous flowspec announcement to cause massive global 
connectivity issues, and Mr. Prince shrugs and says "Incidents happen."





On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher wrote:
On 30/08/2020 20:08, Baldur Norddahl wrote:

https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

Sounds like Flowspec possibly blocking tcp/179 might be the cause.

But that is Cloudflare speculation.

Regards,
Hank
Caveat: The views expressed above are solely my own and do not express the 
views or opinions of my employer

An outage is what it is. I am not worried about outages. We have multiple 
transits to deal with that.

It is the keep announcing prefixes after withdrawal from peers and customers 
that is the huge problem here. That is killing all the effort and money I put 
into having redundancy. It is sabotage of my network after I cut the ties. I do 
not want to be a customer at an outlet who has a system that will do that. 
Luckily we do not currently have a contract and now they will have to convince 
me it is safe for me to make a contract with them. If that is impossible I 
guess I won't be getting a contract with them.

But I disagree in that it would be impossible. They need to make a good report 
telling exactly what went wrong and how they changed the design, so something 
like this can not happen again. The basic design of BGP is such that this 
should not happen easily if at all. They did something unwise. Did they make a 
route reflector based on a database or something?

Regards,

Baldur

On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho <mikeboli...@gmail.com> wrote:
Exactly. And asking that they somehow prove this won't happen again is 
impossible.
- Mike Bolitho

On Sun, Aug 30, 2020, 8:10 AM Drew Weaver <drew.wea...@thenap.com> wrote:
I’m not defending them but I am sure it isn’t intentional.

From: NANOG 
mailto:thenap@nanog.org>> 
On Behalf Of Baldur Norddahl
Sent: Sunday, August 30, 2020 9:28 AM
To: nanog@nanog.org
Subject: Re: Centurylink having a bad morning?

How is that acceptable behaviour? I shall remember never to make a contract 
with these guys until they can prove that they won't advertise my prefixes 
after I pull them. Under any circumstances.

On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:
Finally got through on their support line and spoke to level1. The only thing 
the tech could say was it was an issue with BGP route reflectors and it started 
about 3am(pacific). They were still trying to isolate the issue. I've tried 
failing over my circuits and no go, the traffic just dies as L3 won't stop 
advertising my routes.

On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG <nanog@nanog.org> wrote:
Hello,

Woke up this morning to a bunch of reports of issues with connectivity had to 
shut down some Level3/CTL connections to get it to return to normal.

As of right now their support portal won’t load: 
https://www.centurylink.com/business/login/

Just wondering what others are seeing.




Re: Does anyone actually like CenturyLink?

2020-08-31 Thread Ross Tajvar
True, but I was including conversations with colleagues where we generally
*do* discuss carriers we like.

On Mon, Aug 31, 2020 at 9:28 AM Tom Beecher  wrote:

> I've never heard a single positive word about them
>>
>
> There is rarely much in the way of emails/messages sent about things when
> they work well.
>
> On Sun, Aug 30, 2020 at 11:03 AM Ross Tajvar  wrote:
>
>> I've never heard a single positive word about them, and I've had my fair
>> share of issues myself (as an indirect customer). But it seems that lots of
>> people put them in their transit blend. Other than lack of options, why
>> would anyone use them? To me, it just seems like asking for trouble...but
>> maybe I'm missing something?
>>
>


Re: Centurylink having a bad morning?

2020-08-31 Thread Bryan Holloway
Not everyone will peer with you, notably, AS3356 (unless you're big 
enough, which few can say.)


On 8/31/20 4:33 PM, Tomas Lynch wrote:
Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns, 
should treat them as what they really are: another AS. Accept that they 
are going to fail and do our best to mitigate the impact on our own 
networks, i.e. more peering.


On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG <nanog@nanog.org> wrote:


At this point you don't even know whether it's a human error
(example: generating a flowspec rule for port TCP/179), a filtering
issue (example: accepting a flowspec rule for port TCP/179), or a
software issue (example: certain flowspec update crashes the BGP
daemon). And in the third scenario I think that at least some
portion of the blame shifts from the carrier to its vendors,
assuming the thing that crashed was not a home-grown BGP implementation.

With the route optimizer incidents - because let's face it, Honest
Networker is on the money as usual
https://honestnetworker.net/2020/08/06/as10990-routing/ - there is
really no excuse for any tier-1 carrier, they should at the very
least have strict prefix-list based filtering in place for
customer-facing EBGP sessions. In those cases it's much easier to
state who's not taking care of their proverbial lawn.

Best regards,
Martijn

On 8/31/20 3:25 PM, Tom Beecher wrote:



https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/


I definitely found Mr. Prince's writing about yesterday's events
fascinating.

Verizon makes a mistake with BGP filters that allows a secondary
mistake from leaked "optimizer" routes to propagate, and Mr.
Prince takes every opportunity to lob large chunks of granite
about how terrible they are.

L3 allows an erroneous flowspec announcement to cause massive
global connectivity issues, and Mr. Prince shrugs and says
"Incidents happen."





On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher <h...@interall.co.il> wrote:

On 30/08/2020 20:08, Baldur Norddahl wrote:


https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

Sounds like Flowspec possibly blocking tcp/179 might be the cause.

But that is Cloudflare speculation.

Regards,
Hank
Caveat: The views expressed above are solely my own and do not
express the views or opinions of my employer


An outage is what it is. I am not worried about outages. We
have multiple transits to deal with that.

It is the keep announcing prefixes after withdrawal from
peers and customers that is the huge problem here. That is
killing all the effort and money I put into having
redundancy. It is sabotage of my network after I cut the
ties. I do not want to be a customer at an outlet who has a
system that will do that. Luckily we do not currently have a
contract and now they will have to convince me it is safe for
me to make a contract with them. If that is impossible I
guess I won't be getting a contract with them.

But I disagree in that it would be impossible. They need to
make a good report telling exactly what went wrong and how
they changed the design, so something like this can not
happen again. The basic design of BGP is such that this
should not happen easily if at all. They did something
unwise. Did they make a route reflector based on a database
or something?

Regards,

Baldur

On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho <mikeboli...@gmail.com> wrote:

Exactly. And asking that they somehow prove this won't
happen again is impossible.

- Mike Bolitho

On Sun, Aug 30, 2020, 8:10 AM Drew Weaver <drew.wea...@thenap.com> wrote:

I’m not defending them but I am sure it isn’t
intentional.

*From:* NANOG
mailto:thenap@nanog.org>> *On Behalf Of *Baldur
Norddahl
*Sent:* Sunday, August 30, 2020 9:28 AM
*To:* nanog@nanog.org 
*Subject:* Re: Centurylink having a bad morning?

How is that acceptable behaviour? I shall remember
never to make a contract with these guys until they
can prove that they won't advertise my prefixes after
I pull them. Under any circumstances.

On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:

Finally got through on their support line and
spoke to level1. The only thing the tech could
say 

Re: Centurylink having a bad morning?

2020-08-31 Thread Jason Kuehl
At the end of the day, the business needs to decide to take that cost. All
you can do is document and talk about the risks.

Save that email for that "I told you so" moment.

On Mon, Aug 31, 2020 at 10:50 AM Mike Bolitho  wrote:

> That's all we can do. Thankfully I work for an org that understands this
> and has *at least* two fully redundant circuits. Sometimes a third
> smaller carrier if we can prove that it is diverse, but that isn't the case
> very often.
>
> - Mike Bolitho
>
>
> On Mon, Aug 31, 2020 at 7:35 AM Tomas Lynch  wrote:
>
>> Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns,
>> should treat them as what they really are: another AS. Accept that they are
>> going to fail and do our best to mitigate the impact on our own networks,
>> i.e. more peering.
>>
>> On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG <
>> nanog@nanog.org> wrote:
>>
>>> At this point you don't even know whether it's a human error (example:
>>> generating a flowspec rule for port TCP/179), a filtering issue (example:
>>> accepting a flowspec rule for port TCP/179), or a software issue (example:
>>> certain flowspec update crashes the BGP daemon). And in the third scenario
>>> I think that at least some portion of the blame shifts from the carrier to
>>> its vendors, assuming the thing that crashed was not a home-grown BGP
>>> implementation.
>>>
>>> With the route optimizer incidents - because let's face it, Honest
>>> Networker is on the money as usual
>>> https://honestnetworker.net/2020/08/06/as10990-routing/ - there is
>>> really no excuse for any tier-1 carrier, they should at the very least have
>>> strict prefix-list based filtering in place for customer-facing EBGP
>>> sessions. In those cases it's much easier to state who's not taking care of
>>> their proverbial lawn.
>>>
>>> Best regards,
>>> Martijn
>>>
>>> On 8/31/20 3:25 PM, Tom Beecher wrote:
>>>
>>>
>>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>>
>>>
>>> I definitely found Mr. Prince's writing about yesterday's events
>>> fascinating.
>>>
>>> Verizon makes a mistake with BGP filters that allows a secondary mistake
>>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every
>>> opportunity to lob large chunks of granite about how terrible they are.
>>>
>>> L3 allows an erroneous flowspec announcement to cause massive global
>>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher 
>>> wrote:
>>>
 On 30/08/2020 20:08, Baldur Norddahl wrote:


 https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

 Sounds like Flowspec possibly blocking tcp/179 might be the cause.

 But that is Cloudflare speculation.

 Regards,
 Hank
 Caveat: The views expressed above are solely my own and do not express
 the views or opinions of my employer

 An outage is what it is. I am not worried about outages. We have
 multiple transits to deal with that.

 It is the keep announcing prefixes after withdrawal from peers and
 customers that is the huge problem here. That is killing all the effort and
 money I put into having redundancy. It is sabotage of my network after I
 cut the ties. I do not want to be a customer at an outlet who has a system
 that will do that. Luckily we do not currently have a contract and now they
 will have to convince me it is safe for me to make a contract with them. If
 that is impossible I guess I won't be getting a contract with them.

 But I disagree in that it would be impossible. They need to make a good
 report telling exactly what went wrong and how they changed the design, so
 something like this can not happen again. The basic design of BGP is such
 that this should not happen easily if at all. They did something unwise.
 Did they make a route reflector based on a database or something?

 Regards,

 Baldur

 On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho 
 wrote:

> Exactly. And asking that they somehow prove this won't happen again is
> impossible.
>
> - Mike Bolitho
>
> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver 
> wrote:
>
>> I’m not defending them but I am sure it isn’t intentional.
>>
>>
>>
>> *From:* NANOG  *On
>> Behalf Of *Baldur Norddahl
>> *Sent:* Sunday, August 30, 2020 9:28 AM
>> *To:* nanog@nanog.org
>> *Subject:* Re: Centurylink having a bad morning?
>>
>>
>>
>> How is that acceptable behaviour? I shall remember never to make a
>> contract with these guys until they can prove that they won't advertise 
>> my
>> prefixes after I pull them. Under any circumstances.
>>
>>
>>
>> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:
>>
>> Finally got 

Re: Help configuring Dell Server

2020-08-31 Thread J. Hellenthal via NANOG
Dell Support or some other group may be more appropriate for this…


Good luck. In IRC land you may want to visit Freenode #Ubuntu, #Linux, or 
#Centos. If you are searching for an IRC client after this message, check out 
mIRC for Windows or XChat for *nix variants and hang out on those channels for 
a while. They’ll be willing to help.


> On Aug 31, 2020, at 09:51, Brielle  wrote:
> 
> 
> Um, this is a list for North American network operators, not r/techsupport...
> 
> If you aren’t capable of doing even the most basic configuration of name 
> servers, you are in the wrong place.
> 
> 
> Sent from my iPhone
> 
>> On Aug 31, 2020, at 8:38 AM, peter agakpe  wrote:
>> 
>> 
>> Can I get some help deploy my server online. I have Ubuntu and centos 
>> installed but still having some problems. I keep getting; “server IP address 
>> could not be found. DNS_PROBE_FINISHED_NXDOMAIN.”
>> 
>> COULD USE, NO, NEED HELP.
>> 
>>  
>> 
>> THANKS
>> 
>>  
>>  
>> Sent from Mail for Windows 10


-- 

J. Hellenthal

The fact that there's a highway to Hell but only a stairway to Heaven says a 
lot about anticipated traffic volume.










Re: Help configuring Dell Server

2020-08-31 Thread Mel Beckman
It could be many things. I suggest you google DNS_PROBE_FINISHED_NXDOMAIN  and 
start reading articles. This group isn’t a good place to ask.
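
For readers who land here with the same error: DNS_PROBE_FINISHED_NXDOMAIN generally means the browser's lookup of the hostname returned no address, so a useful first step is to confirm whether the name resolves at all from the machine in question. The sketch below is only an illustration using Python's standard `socket.getaddrinfo`; the function name `resolve` and the injectable `resolver` parameter are assumptions made for this example, and it does not tell you why a name fails to resolve (missing record, wrong name servers, unregistered domain).

```python
import socket


def resolve(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses a hostname resolves to.

    Returns an empty list when the lookup fails, which is the situation a
    browser reports as NXDOMAIN. `resolver` is injectable only so the
    function can be exercised without touching the network (an assumption
    of this sketch, not a standard pattern you must follow).
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # Name lookup failed: no such host, or the resolver is unreachable.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in results})
```

Calling `resolve("www.example.com")` on the affected server and on an outside machine, then comparing the two results, narrows down whether the problem is local resolver configuration or the published DNS records.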

 -mel beckman

On Aug 31, 2020, at 7:40 AM, peter agakpe  wrote:



Can I get some help deploy my server online. I have Ubuntu and centos installed 
but still having some problems. I keep getting; “server IP address could not be 
found. DNS_PROBE_FINISHED_NXDOMAIN.”

Could use, no, need help.



thanks


Sent from Mail for Windows 10



Re: Help configuring Dell Server

2020-08-31 Thread Brielle

Um, this is a list for North American network operators, not r/techsupport...

If you aren’t capable of doing even the most basic configuration of name 
servers, you are in the wrong place.


Sent from my iPhone

> On Aug 31, 2020, at 8:38 AM, peter agakpe  wrote:
> 
> 
> Can I get some help deploy my server online. I have Ubuntu and centos 
> installed but still having some problems. I keep getting; “server IP address 
> could not be found. DNS_PROBE_FINISHED_NXDOMAIN.”
> 
> COULD USE, NO, NEED HELP.
> 
>  
> 
> THANKS
> 
>  
>  
> Sent from Mail for Windows 10
>  


Re: Centurylink having a bad morning?

2020-08-31 Thread Mike Bolitho
That's all we can do. Thankfully I work for an org that understands this
and has *at least* two fully redundant circuits. Sometimes a third smaller
carrier if we can prove that it is diverse, but that isn't the case very
often.

- Mike Bolitho


On Mon, Aug 31, 2020 at 7:35 AM Tomas Lynch  wrote:

> Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns,
> should treat them as what they really are: another AS. Accept that they are
> going to fail and do our best to mitigate the impact on our own networks,
> i.e. more peering.
>
> On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG 
> wrote:
>
>> At this point you don't even know whether it's a human error (example:
>> generating a flowspec rule for port TCP/179), a filtering issue (example:
>> accepting a flowspec rule for port TCP/179), or a software issue (example:
>> certain flowspec update crashes the BGP daemon). And in the third scenario
>> I think that at least some portion of the blame shifts from the carrier to
>> its vendors, assuming the thing that crashed was not a home-grown BGP
>> implementation.
>>
>> With the route optimizer incidents - because let's face it, Honest
>> Networker is on the money as usual
>> https://honestnetworker.net/2020/08/06/as10990-routing/ - there is
>> really no excuse for any tier-1 carrier, they should at the very least have
>> strict prefix-list based filtering in place for customer-facing EBGP
>> sessions. In those cases it's much easier to state who's not taking care of
>> their proverbial lawn.
>>
>> Best regards,
>> Martijn
>>
>> On 8/31/20 3:25 PM, Tom Beecher wrote:
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>>
>> I definitely found Mr. Prince's writing about yesterday's events
>> fascinating.
>>
>> Verizon makes a mistake with BGP filters that allows a secondary mistake
>> from leaked "optimizer" routes to propagate, and Mr. Prince takes every
>> opportunity to lob large chunks of granite about how terrible they are.
>>
>> L3 allows an erroneous flowspec announcement to cause massive global
>> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>>
>>
>>
>>
>>
>> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher 
>> wrote:
>>
>>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>>>
>>>
>>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>>
>>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>>
>>> But that is Cloudflare speculation.
>>>
>>> Regards,
>>> Hank
>>> Caveat: The views expressed above are solely my own and do not express
>>> the views or opinions of my employer
>>>
>>> An outage is what it is. I am not worried about outages. We have
>>> multiple transits to deal with that.
>>>
>>> It is the keep announcing prefixes after withdrawal from peers and
>>> customers that is the huge problem here. That is killing all the effort and
>>> money I put into having redundancy. It is sabotage of my network after I
>>> cut the ties. I do not want to be a customer at an outlet who has a system
>>> that will do that. Luckily we do not currently have a contract and now they
>>> will have to convince me it is safe for me to make a contract with them. If
>>> that is impossible I guess I won't be getting a contract with them.
>>>
>>> But I disagree in that it would be impossible. They need to make a good
>>> report telling exactly what went wrong and how they changed the design, so
>>> something like this can not happen again. The basic design of BGP is such
>>> that this should not happen easily if at all. They did something unwise.
>>> Did they make a route reflector based on a database or something?
>>>
>>> Regards,
>>>
>>> Baldur
>>>
>>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho 
>>> wrote:
>>>
 Exactly. And asking that they somehow prove this won't happen again is
 impossible.

 - Mike Bolitho

 On Sun, Aug 30, 2020, 8:10 AM Drew Weaver 
 wrote:

> I’m not defending them but I am sure it isn’t intentional.
>
>
>
> *From:* NANOG  *On
> Behalf Of *Baldur Norddahl
> *Sent:* Sunday, August 30, 2020 9:28 AM
> *To:* nanog@nanog.org
> *Subject:* Re: Centurylink having a bad morning?
>
>
>
> How is that acceptable behaviour? I shall remember never to make a
> contract with these guys until they can prove that they won't advertise my
> prefixes after I pull them. Under any circumstances.
>
>
>
> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:
>
> Finally got through on their support line and spoke to level1. The
> only thing the tech could say was it was an issue with BGP route 
> reflectors
> and it started about 3am(pacific). They were still trying to isolate the
> issue. I've tried failing over my circuits and no go, the traffic just 
> dies
> as L3 won't stop advertising my routes.
>
>
>
> On Sun, Aug 

Re: Centurylink having a bad morning?

2020-08-31 Thread Martijn Schmidt via NANOG
You're preaching to the choir here.. ;)

On 8/31/20 4:33 PM, Tomas Lynch wrote:
Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns, should 
treat them as what they really are: another AS. Accept that they are going to 
fail and do our best to mitigate the impact on our own networks, i.e. more 
peering.

On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG <nanog@nanog.org> wrote:
At this point you don't even know whether it's a human error (example: 
generating a flowspec rule for port TCP/179), a filtering issue (example: 
accepting a flowspec rule for port TCP/179), or a software issue (example: 
certain flowspec update crashes the BGP daemon). And in the third scenario I 
think that at least some portion of the blame shifts from the carrier to its 
vendors, assuming the thing that crashed was not a home-grown BGP 
implementation.

With the route optimizer incidents - because let's face it, Honest Networker is 
on the money as usual https://honestnetworker.net/2020/08/06/as10990-routing/ - 
there is really no excuse for any tier-1 carrier, they should at the very least 
have strict prefix-list based filtering in place for customer-facing EBGP 
sessions. In those cases it's much easier to state who's not taking care of 
their proverbial lawn.

Best regards,
Martijn

On 8/31/20 3:25 PM, Tom Beecher wrote:
https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

I definitely found Mr. Prince's writing about yesterday's events fascinating.

Verizon makes a mistake with BGP filters that allows a secondary mistake from 
leaked "optimizer" routes to propagate, and Mr. Prince takes every opportunity 
to lob large chunks of granite about how terrible they are.

L3 allows an erroneous flowspec announcement to cause massive global 
connectivity issues, and Mr. Prince shrugs and says "Incidents happen."





On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher <h...@interall.co.il> wrote:
On 30/08/2020 20:08, Baldur Norddahl wrote:

https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

Sounds like Flowspec possibly blocking tcp/179 might be the cause.

But that is Cloudflare speculation.

Regards,
Hank
Caveat: The views expressed above are solely my own and do not express the 
views or opinions of my employer

An outage is what it is. I am not worried about outages. We have multiple 
transits to deal with that.

It is the keep announcing prefixes after withdrawal from peers and customers 
that is the huge problem here. That is killing all the effort and money I put 
into having redundancy. It is sabotage of my network after I cut the ties. I do 
not want to be a customer at an outlet who has a system that will do that. 
Luckily we do not currently have a contract and now they will have to convince 
me it is safe for me to make a contract with them. If that is impossible I 
guess I won't be getting a contract with them.

But I disagree in that it would be impossible. They need to make a good report 
telling exactly what went wrong and how they changed the design, so something 
like this can not happen again. The basic design of BGP is such that this 
should not happen easily if at all. They did something unwise. Did they make a 
route reflector based on a database or something?

Regards,

Baldur

On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho <mikeboli...@gmail.com> wrote:
Exactly. And asking that they somehow prove this won't happen again is 
impossible.

- Mike Bolitho

On Sun, Aug 30, 2020, 8:10 AM Drew Weaver <drew.wea...@thenap.com> wrote:
I’m not defending them but I am sure it isn’t intentional.

From: NANOG 
mailto:thenap@nanog.org>> 
On Behalf Of Baldur Norddahl
Sent: Sunday, August 30, 2020 9:28 AM
To: nanog@nanog.org
Subject: Re: Centurylink having a bad morning?

How is that acceptable behaviour? I shall remember never to make a contract 
with these guys until they can prove that they won't advertise my prefixes 
after I pull them. Under any circumstances.

On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:
Finally got through on their support line and spoke to level1. The only thing 
the tech could say was it was an issue with BGP route reflectors and it started 
about 3am(pacific). They were still trying to isolate the issue. I've tried 
failing over my circuits and no go, the traffic just dies as L3 won't stop 
advertising my routes.

On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG <nanog@nanog.org> wrote:
Hello,

Woke up this morning to a bunch of reports of issues with connectivity had to 
shut down some Level3/CTL connections to get it to return to normal.

As of right now their support portal won’t load: 
https://www.centurylink.com/business/login/

Just wondering what others are seeing.






Help configuring Dell Server

2020-08-31 Thread peter agakpe
Can I get some help deploying my server online? I have Ubuntu and CentOS 
installed but am still having some problems. I keep getting: “server IP address 
could not be found. DNS_PROBE_FINISHED_NXDOMAIN.”

Could use, no, need help.



thanks


Sent from Mail for Windows 10



Re: Centurylink having a bad morning?

2020-08-31 Thread Tomas Lynch
Maybe we are idealizing these so-called tier-1 carriers and we, tier-ns,
should treat them as what they really are: another AS. Accept that they are
going to fail and do our best to mitigate the impact on our own networks,
i.e. more peering.

On Mon, Aug 31, 2020 at 9:54 AM Martijn Schmidt via NANOG 
wrote:

> At this point you don't even know whether it's a human error (example:
> generating a flowspec rule for port TCP/179), a filtering issue (example:
> accepting a flowspec rule for port TCP/179), or a software issue (example:
> certain flowspec update crashes the BGP daemon). And in the third scenario
> I think that at least some portion of the blame shifts from the carrier to
> its vendors, assuming the thing that crashed was not a home-grown BGP
> implementation.
>
> With the route optimizer incidents - because let's face it, Honest
> Networker is on the money as usual
> https://honestnetworker.net/2020/08/06/as10990-routing/ - there is really
> no excuse for any tier-1 carrier, they should at the very least have strict
> prefix-list based filtering in place for customer-facing EBGP sessions. In
> those cases it's much easier to state who's not taking care of their
> proverbial lawn.
>
> Best regards,
> Martijn
>
> On 8/31/20 3:25 PM, Tom Beecher wrote:
>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>
>
> I definitely found Mr. Prince's writing about yesterday's events
> fascinating.
>
> Verizon makes a mistake with BGP filters that allows a secondary mistake
> from leaked "optimizer" routes to propagate, and Mr. Prince takes every
> opportunity to lob large chunks of granite about how terrible they are.
>
> L3 allows an erroneous flowspec announcement to cause massive global
> connectivity issues, and Mr. Prince shrugs and says "Incidents happen."
>
>
>
>
>
> On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher 
> wrote:
>
>> On 30/08/2020 20:08, Baldur Norddahl wrote:
>>
>> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>>
>> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>>
>> But that is Cloudflare speculation.
>>
>> Regards,
>> Hank
>> Caveat: The views expressed above are solely my own and do not express
>> the views or opinions of my employer
>>
>> An outage is what it is. I am not worried about outages. We have multiple
>> transits to deal with that.
>>
>> It is the keep announcing prefixes after withdrawal from peers and
>> customers that is the huge problem here. That is killing all the effort and
>> money I put into having redundancy. It is sabotage of my network after I
>> cut the ties. I do not want to be a customer at an outlet who has a system
>> that will do that. Luckily we do not currently have a contract and now they
>> will have to convince me it is safe for me to make a contract with them. If
>> that is impossible I guess I won't be getting a contract with them.
>>
>> But I disagree in that it would be impossible. They need to make a good
>> report telling exactly what went wrong and how they changed the design, so
>> something like this can not happen again. The basic design of BGP is such
>> that this should not happen easily if at all. They did something unwise.
>> Did they make a route reflector based on a database or something?
>>
>> Regards,
>>
>> Baldur
>>
>> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho 
>> wrote:
>>
>>> Exactly. And asking that they somehow prove this won't happen again is
>>> impossible.
>>>
>>> - Mike Bolitho
>>>
>>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver 
>>> wrote:
>>>
 I’m not defending them but I am sure it isn’t intentional.



 *From:* NANOG  *On
 Behalf Of *Baldur Norddahl
 *Sent:* Sunday, August 30, 2020 9:28 AM
 *To:* nanog@nanog.org
 *Subject:* Re: Centurylink having a bad morning?



 How is that acceptable behaviour? I shall remember never to make a
 contract with these guys until they can prove that they won't advertise my
 prefixes after I pull them. Under any circumstances.



 On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:

 Finally got through on their support line and spoke to level1. The only
 thing the tech could say was it was an issue with BGP route reflectors and
 it started about 3am(pacific). They were still trying to isolate the issue.
 I've tried failing over my circuits and no go, the traffic just dies as L3
 won't stop advertising my routes.



 On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG 
 wrote:

 Hello,



 Woke up this morning to a bunch of reports of issues with connectivity
 had to shut down some Level3/CTL connections to get it to return to normal.



 As of right now their support portal won’t load:
 https://www.centurylink.com/business/login/



 Just wondering what others are seeing.




>>
>


Re: Does anyone actually like CenturyLink?

2020-08-31 Thread Töma Gavrichenkov
Peace,

On Mon, Aug 31, 2020, 4:42 PM Mike Bolitho  wrote:

> Maybe we should start an "Uptime mailing list" ha!
>

We already have outages@ which is a Boolean negation of what you're
proposing but works just the same :-)

--
Töma

>


Re: Centurylink having a bad morning?

2020-08-31 Thread Martijn Schmidt via NANOG
At this point you don't even know whether it's a human error (example: 
generating a flowspec rule for port TCP/179), a filtering issue (example: 
accepting a flowspec rule for port TCP/179), or a software issue (example: 
certain flowspec update crashes the BGP daemon). And in the third scenario I 
think that at least some portion of the blame shifts from the carrier to its 
vendors, assuming the thing that crashed was not a home-grown BGP 
implementation.
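
To make the first two scenarios concrete, here is a minimal sketch of the kind of sanity check that could sit in front of flowspec rule generation or acceptance: it flags any discard rule that a BGP session packet (TCP port 179 on either end) could match. Everything in it is hypothetical. The `FlowspecRule` class is a simplified stand-in invented for this example, not any vendor's data model, and nothing here is known to reflect how 3356 builds or validates its rules.

```python
from dataclasses import dataclass
from typing import Optional

BGP_PORT = 179   # well-known TCP port for BGP sessions
TCP_PROTO = 6    # IP protocol number for TCP


@dataclass(frozen=True)
class FlowspecRule:
    """Hypothetical, heavily simplified flowspec match (None means 'any')."""
    protocol: Optional[int]
    dst_port: Optional[int]
    src_port: Optional[int]
    action: str  # e.g. "discard", "rate-limit", "accept"


def would_block_bgp(rule: FlowspecRule) -> bool:
    """Return True if a discard rule could match BGP session traffic."""
    if rule.action != "discard":
        return False
    if rule.protocol not in (None, TCP_PROTO):
        return False
    # Packets towards a BGP speaker: dst port 179, arbitrary source port.
    to_bgp = rule.dst_port in (None, BGP_PORT) and rule.src_port is None
    # Replies from a BGP speaker: src port 179, arbitrary destination port.
    from_bgp = rule.src_port in (None, BGP_PORT) and rule.dst_port is None
    return to_bgp or from_bgp
```

A generator or an import policy could refuse, or at least require explicit confirmation for, any rule where `would_block_bgp` returns True. It does nothing for the third scenario, where the daemon itself crashes on an update.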

With the route optimizer incidents - because let's face it, Honest Networker is 
on the money as usual https://honestnetworker.net/2020/08/06/as10990-routing/ - 
there is really no excuse for any tier-1 carrier, they should at the very least 
have strict prefix-list based filtering in place for customer-facing EBGP 
sessions. In those cases it's much easier to state who's not taking care of 
their proverbial lawn.
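
As an illustration of what strict prefix-list based filtering means in practice, the sketch below accepts a customer announcement only if it falls inside a prefix registered for that customer and is not more specific than a maximum length. The function name, the /24 and /48 limits, and the example prefixes (all from documentation ranges) are assumptions for this example; real filters are expressed in router policy, usually generated from IRR or RPKI data.

```python
import ipaddress


def accept_announcement(prefix, allowed, max_v4_len=24, max_v6_len=48):
    """Return True if `prefix` is covered by an allowed customer prefix.

    `allowed` is a list of prefixes the customer is entitled to announce.
    Announcements more specific than the per-family limit are rejected.
    """
    net = ipaddress.ip_network(prefix, strict=True)
    limit = max_v4_len if net.version == 4 else max_v6_len
    if net.prefixlen > limit:
        return False
    for entry in allowed:
        allowed_net = ipaddress.ip_network(entry)
        # subnet_of() raises TypeError across address families, so compare
        # versions first.
        if allowed_net.version == net.version and net.subnet_of(allowed_net):
            return True
    return False


# Hypothetical customer registration, using documentation prefixes.
CUSTOMER_PREFIXES = ["203.0.113.0/24", "2001:db8::/32"]
```

With a filter like this on the customer-facing session, a leaked "optimizer" more-specific for somebody else's address space is dropped at the edge instead of propagating.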

Best regards,
Martijn

On 8/31/20 3:25 PM, Tom Beecher wrote:
https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

I definitely found Mr. Prince's writing about yesterday's events fascinating.

Verizon makes a mistake with BGP filters that allows a secondary mistake from 
leaked "optimizer" routes to propagate, and Mr. Prince takes every opportunity 
to lob large chunks of granite about how terrible they are.

L3 allows an erroneous flowspec announcement to cause massive global 
connectivity issues, and Mr. Prince shrugs and says "Incidents happen."





On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher <h...@interall.co.il> wrote:
On 30/08/2020 20:08, Baldur Norddahl wrote:

https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/

Sounds like Flowspec possibly blocking tcp/179 might be the cause.

But that is Cloudflare speculation.

Regards,
Hank
Caveat: The views expressed above are solely my own and do not express the 
views or opinions of my employer

An outage is what it is. I am not worried about outages. We have multiple 
transits to deal with that.

It is the continued announcing of prefixes after withdrawal by peers and 
customers that is the huge problem here. That is killing all the effort and 
money I put into having redundancy. It is sabotage of my network after I cut 
the ties. I do not want to be a customer at an outlet that has a system that 
will do that. Luckily we do not currently have a contract, and now they will 
have to convince me it is safe for me to make a contract with them. If that is 
impossible I guess I won't be getting a contract with them.

But I disagree that it would be impossible. They need to make a good report 
telling exactly what went wrong and how they changed the design, so something 
like this cannot happen again. The basic design of BGP is such that this 
should not happen easily if at all. They did something unwise. Did they make a 
route reflector based on a database or something?

Regards,

Baldur

On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho <mikeboli...@gmail.com> wrote:
Exactly. And asking that they somehow prove this won't happen again is 
impossible.

- Mike Bolitho

On Sun, Aug 30, 2020, 8:10 AM Drew Weaver <drew.wea...@thenap.com> wrote:
I’m not defending them but I am sure it isn’t intentional.

From: NANOG On Behalf Of Baldur Norddahl
Sent: Sunday, August 30, 2020 9:28 AM
To: nanog@nanog.org
Subject: Re: Centurylink having a bad morning?

How is that acceptable behaviour? I shall remember never to make a contract 
with these guys until they can prove that they won't advertise my prefixes 
after I pull them. Under any circumstances.

On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <j...@breathe-underwater.com> wrote:
Finally got through on their support line and spoke to level 1. The only thing 
the tech could say was that it was an issue with BGP route reflectors and it 
started about 3 am (Pacific). They were still trying to isolate the issue. I've 
tried failing over my circuits and no go; the traffic just dies as L3 won't 
stop advertising my routes.

On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG <nanog@nanog.org> wrote:
Hello,

Woke up this morning to a bunch of reports of issues with connectivity, and had 
to shut down some Level3/CTL connections to get it to return to normal.

As of right now their support portal won’t load: 
https://www.centurylink.com/business/login/

Just wondering what others are seeing.





Re: Does anyone actually like CenturyLink?

2020-08-31 Thread Mike Bolitho
Maybe we should start an "Uptime mailing list" ha! But yeah, when things
are working well nobody talks about it. The CTL network is very large.
However, it's clear their blast radius mentality isn't real great. We saw
this yesterday. We saw this Dec 2018. Global outages shouldn't be a thing.

- Mike Bolitho

On Mon, Aug 31, 2020, 6:31 AM Tom Beecher  wrote:

> I've never heard a single positive word about them
>>
>
> There is rarely much in the way of emails/messages sent about things when
> they work well.
>
> On Sun, Aug 30, 2020 at 11:03 AM Ross Tajvar  wrote:
>
>> I've never heard a single positive word about them, and I've had my fair
>> share of issues myself (as an indirect customer). But it seems that lots of
>> people put them in their transit blend. Other than lack of options, why
>> would anyone use them? To me, it just seems like asking for trouble...but
>> maybe I'm missing something?
>>
>


Re: Does anyone actually like CenturyLink?

2020-08-31 Thread Tom Beecher
>
> I've never heard a single positive word about them
>

There is rarely much in the way of emails/messages sent about things when
they work well.

On Sun, Aug 30, 2020 at 11:03 AM Ross Tajvar  wrote:

> I've never heard a single positive word about them, and I've had my fair
> share of issues myself (as an indirect customer). But it seems that lots of
> people put them in their transit blend. Other than lack of options, why
> would anyone use them? To me, it just seems like asking for trouble...but
> maybe I'm missing something?
>


Re: Centurylink having a bad morning?

2020-08-31 Thread Tom Beecher
>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/


I definitely found Mr. Prince's writing about yesterday's events
fascinating.

Verizon makes a mistake with BGP filters that allows a secondary mistake
from leaked "optimizer" routes to propagate, and Mr. Prince takes every
opportunity to lob large chunks of granite about how terrible they are.

L3 allows an erroneous flowspec announcement to cause massive global
connectivity issues, and Mr. Prince shrugs and says "Incidents happen."





On Mon, Aug 31, 2020 at 1:15 AM Hank Nussbacher  wrote:

> On 30/08/2020 20:08, Baldur Norddahl wrote:
>
> https://blog.cloudflare.com/analysis-of-todays-centurylink-level-3-outage/
>
> Sounds like Flowspec possibly blocking tcp/179 might be the cause.
>
> But that is Cloudflare speculation.
>
> Regards,
> Hank
> Caveat: The views expressed above are solely my own and do not express the
> views or opinions of my employer
>
> An outage is what it is. I am not worried about outages. We have multiple
> transits to deal with that.
>
> It is the continued announcing of prefixes after withdrawal by peers and
> customers that is the huge problem here. That is killing all the effort and
> money I put into having redundancy. It is sabotage of my network after I
> cut the ties. I do not want to be a customer at an outlet that has a system
> that will do that. Luckily we do not currently have a contract, and now they
> will have to convince me it is safe for me to make a contract with them. If
> that is impossible I guess I won't be getting a contract with them.
>
> But I disagree that it would be impossible. They need to make a good
> report telling exactly what went wrong and how they changed the design, so
> something like this cannot happen again. The basic design of BGP is such
> that this should not happen easily if at all. They did something unwise.
> Did they make a route reflector based on a database or something?
>
> Regards,
>
> Baldur
>
> On Sun, Aug 30, 2020 at 5:13 PM Mike Bolitho 
> wrote:
>
>> Exactly. And asking that they somehow prove this won't happen again is
>> impossible.
>>
>> - Mike Bolitho
>>
>> On Sun, Aug 30, 2020, 8:10 AM Drew Weaver  wrote:
>>
>>> I’m not defending them but I am sure it isn’t intentional.
>>>
>>>
>>>
>>> *From:* NANOG  *On
>>> Behalf Of *Baldur Norddahl
>>> *Sent:* Sunday, August 30, 2020 9:28 AM
>>> *To:* nanog@nanog.org
>>> *Subject:* Re: Centurylink having a bad morning?
>>>
>>>
>>>
>>> How is that acceptable behaviour? I shall remember never to make a
>>> contract with these guys until they can prove that they won't advertise my
>>> prefixes after I pull them. Under any circumstances.
>>>
>>>
>>>
>>> On Sun, Aug 30, 2020 at 15:14, Joseph Jenkins <
>>> j...@breathe-underwater.com> wrote:
>>>
>>> Finally got through on their support line and spoke to level 1. The only
>>> thing the tech could say was that it was an issue with BGP route
>>> reflectors and it started about 3 am (Pacific). They were still trying to
>>> isolate the issue. I've tried failing over my circuits and no go; the
>>> traffic just dies as L3 won't stop advertising my routes.
>>>
>>>
>>>
>>> On Sun, Aug 30, 2020 at 5:21 AM Drew Weaver via NANOG 
>>> wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> Woke up this morning to a bunch of reports of issues with connectivity,
>>> and had to shut down some Level3/CTL connections to get it to return to
>>> normal.
>>>
>>>
>>>
>>> As of right now their support portal won’t load:
>>> https://www.centurylink.com/business/login/
>>>
>>>
>>>
>>> Just wondering what others are seeing.
>>>
>>>
>>>
>>>
>


Re: Does anyone actually like CenturyLink?

2020-08-31 Thread Bill Woodcock


>> On Sun, Aug 30, 2020, 6:02 PM Ross Tajvar  wrote:
>> Other than lack of options, why would anyone use them?
>> 
> On Aug 30, 2020, at 6:41 PM, Töma Gavrichenkov  wrote:
> Connectivity and latency (of Level3 which was acquired).

Yeah.  What I think a lot of us liked was Global Crossing.  When Global 
Crossing was sucked into L3, L3 managed to retain a fair bit of what was good 
about Global Crossing.  Then L3 got sucked into CenturyLink, and CenturyLink 
managed to retain a fair bit of what was good about L3.  But.  There’s still 
some inefficiency there.  Aggregation isn’t the cleanest way to build a network.

-Bill


