Re: [j-nsp] BGP timer

2024-04-29 Thread Mark Tinka via juniper-nsp




On 4/29/24 17:42, Lee Starnes via juniper-nsp wrote:

> As for BFD and stability with aggressive settings, we don't run too
> aggressive on this, but certainly do require it because the physical links
> have not gone down in our cases when we have had issues, causing a larger
> delay in killing the routes for that path. Not being able to rely on link
> state failure leaves us with requiring the use of BFD.


Is this link carrying eBGP or iBGP?

If the latter, have you considered using BFD to track the IGP instead of 
BGP?


Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] BGP timer

2024-04-29 Thread Lee Starnes via juniper-nsp
Thank you everyone for the replies on this topic. For us, we would rather
keep a link down longer when it has an issue and goes down than to have it
come back up and then go down again. This is because the flapping is very
destructive to live video and VoIP. Having several diverse backbone
connections, we can tolerate having one down. This topic came up because we
have had one of our backbone carriers become problematic and the flapping
caused by their issues caused a lot of damage in terms of customer
relations. So we would certainly want to let a failed link sit failed for a
little while after it restores before bringing BGP back up.

As for BFD and stability with aggressive settings, we don't run too
aggressive on this, but certainly do require it because the physical links
have not gone down in our cases when we have had issues, causing a larger
delay in killing the routes for that path. Not being able to rely on link
state failure leaves us with requiring the use of BFD.

Again, thanks for all the replies everyone. I will check out the BFD
holddown.

-Lee



Re: [j-nsp] BGP timer

2024-04-29 Thread Jeff Haas via juniper-nsp


On 4/29/24, 02:41, "Saku Ytti" <s...@ytti.fi> wrote:
> On Sun, 28 Apr 2024 at 21:20, Jeff Haas via juniper-nsp wrote:
> > BFD holddown is the right feature for this.
>
> But why is this desirable? Why do I want to prioritise stability
> always, instead of prioritising convergence on well-behaved interfaces
> and stability on poorly behaved interfaces?

This feature is "don't bring up BGP on interfaces that aren't stable enough to
let BFD stay up".  The intended use case is when you have an interface noisy
enough that TCP can fight its way through keeping BGP up... enough, but not
stable enough that you'd really want to forward over it.  The assessment for
that is "BFD will go down in short order".
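
To make the holddown idea above concrete, here is a toy Python model. It is not Junos code and does not claim to match any vendor's implementation. It assumes BGP is only allowed up once BFD has stayed up continuously for a holddown period; the function name `bgp_allowed_at`, the event format and the `holddown` parameter (in seconds) are all hypothetical.

```python
from typing import List, Optional, Tuple


def bgp_allowed_at(bfd_events: List[Tuple[float, str]],
                   holddown: float) -> Optional[float]:
    """Return the earliest time BGP may come up under this toy model.

    bfd_events: time-ordered (timestamp_seconds, state) pairs, where state
    is "up" or "down". The last state is assumed to persist indefinitely.
    Returns None if BFD never stays up for `holddown` seconds.
    """
    up_since: Optional[float] = None
    for ts, state in bfd_events:
        if state == "up":
            # Start the holddown clock only on a down -> up transition.
            if up_since is None:
                up_since = ts
        else:
            # BFD went down: did it stay up long enough beforehand?
            if up_since is not None and ts - up_since >= holddown:
                return up_since + holddown
            up_since = None  # too short, restart the clock on the next "up"
    if up_since is not None:
        return up_since + holddown
    return None


# A noisy link: BFD bounces twice before settling at t=10 s.
noisy = [(0.0, "up"), (2.0, "down"), (3.0, "up"), (4.0, "down"), (10.0, "up")]
print(bgp_allowed_at(noisy, holddown=5.0))  # 15.0
```

In this model a clean link pays only the holddown once, while a link that keeps dropping BFD never gets BGP at all until it settles.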

> That is, if I cannot have exponential back-off, I won't kill
> convergence 'just in case', because it's not me who will feel the pain
> of my decisions, it's my customers. Netengs and particularly infosec
> people quite often are unnecessarily conservative in their policies,
> because they don't have skin in the game, they feel the upside, but
> not the downside.

People make decisions that are appropriate for their networks.  Using BFD on
your BGP sessions is probably overkill *for you*.  Don't do that then.

-- Jeff



Re: [j-nsp] BGP timer

2024-04-29 Thread Mark Tinka via juniper-nsp




On 4/29/24 09:13, Saku Ytti wrote:


> 100%, what Mark implied was not what I was trying to communicate.
> Sure, go ahead and damp flapping interfaces, but to penalise on first
> down event, when most of them are just that, one event, to me, is just
> bad policy made by people who don't feel the cost.


Yes, agree with this. Didn't mean to cause a mix-up.

As before, my perspective is from circuits where events can recur every few 
moments over, let's say, a 12-hour period. Yes, this is 
not the norm in most mature markets, but we have had to deal with this 
sort of thing several times a year, down here, and it can get complex 
especially if the route you are dealing with has no suitable alternative 
options other than going round the continent and back.


Mark.


Re: [j-nsp] BGP timer

2024-04-29 Thread Mark Tinka via juniper-nsp




On 4/29/24 09:15, Saku Ytti wrote:


> You are making this unnecessarily complicated.
>
> You could simply configure that first down event doesn't add enough
> points to damp, 2nd does. And you are wildly better off.
>
> Perfect is the enemy of done and kills all movement towards better.


Fair enough.

My perspective is from this side of the world where backbone is not the 
greatest experience in most of the inland markets. But I grant that such 
scenarios are not the norm in more mature regions.


Mark.


Re: [j-nsp] BGP timer

2024-04-29 Thread Saku Ytti via juniper-nsp
On Mon, 29 Apr 2024 at 10:13, Mark Tinka via juniper-nsp
 wrote:

> It comes down to how you classify stable (well-behaved) vs. unstable
> (misbehaving) interfaces.

You are making this unnecessarily complicated.

You could simply configure that first down event doesn't add enough
points to damp, 2nd does. And you are wildly better off.

Perfect is the enemy of done and kills all movement towards better.
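
The "first down event doesn't damp, the second does" idea can be sketched with a simple exponential-decay penalty model in Python. This is an illustration only: the function `damp` and the values for `per_flap`, `suppress`, `reuse` and `half_life` are hypothetical and are not taken from Junos defaults or from anyone's production configuration.

```python
from typing import Iterable, List, Tuple


def damp(flap_times: Iterable[float],
         per_flap: float = 1000.0,
         suppress: float = 1500.0,
         reuse: float = 750.0,
         half_life: float = 60.0) -> List[Tuple[float, float, bool]]:
    """Return (time, penalty_after_flap, suppressed) for each down event.

    The penalty halves every `half_life` seconds, each flap adds `per_flap`
    points, suppression starts at `suppress` and ends once the decayed
    penalty drops below `reuse`.
    """
    penalty = 0.0
    last = None
    suppressed = False
    history = []
    for t in flap_times:
        if last is not None:
            # Decay the accumulated penalty for the time since the last flap.
            penalty *= 0.5 ** ((t - last) / half_life)
            if suppressed and penalty < reuse:
                suppressed = False
        penalty += per_flap
        if penalty >= suppress:
            suppressed = True
        last = t
        history.append((t, penalty, suppressed))
    return history


print(damp([0.0])[-1][2])         # False: a single flap stays below suppress
print(damp([0.0, 10.0])[-1][2])   # True: a second flap 10 s later is damped
print(damp([0.0, 600.0])[-1][2])  # False: the first penalty has decayed away
```

Because `per_flap` is below `suppress`, an isolated event never delays convergence; only a repeat within roughly one half-life trips suppression.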

-- 
  ++ytti


Re: [j-nsp] BGP timer

2024-04-29 Thread Saku Ytti via juniper-nsp
On Mon, 29 Apr 2024 at 10:07, Gert Doering via juniper-nsp
 wrote:

> The interesting question is "how to react when underlay seems to be stable
> again"?  "bring up upper layers right away, with exponential decay flap
> dampening" or "always wait 15 minutes to be SURE it's stable!!!"...

100%, what Mark implied was not what I was trying to communicate.
Sure, go ahead and damp flapping interfaces, but to penalise on first
down event, when most of them are just that, one event, to me, is just
bad policy made by people who don't feel the cost.

-- 
  ++ytti


Re: [j-nsp] BGP timer

2024-04-29 Thread Mark Tinka via juniper-nsp




On 4/29/24 09:06, Gert Doering wrote:


> Yes, but that's a slightly different tangent.  If the underlay is unstable,
> I think we're all in agreement that higher layers should not send packets
> there.


It comes down to how you classify stable (well-behaved) vs. unstable 
(misbehaving) interfaces.


This will vary for networks, backbones, providers, etc.

In many cases, manual intervention will be required because even the 
most aggressive or the most conservative dampening settings will not be 
able to account for what stable and unstable interfaces means. I suppose 
one could "AI" it, but that's outside the realm of my abilities.


In other words, a one-size-fits-all is unlikely to work here. Plenty of 
tools exist, and I think it is up to the operator to educate themselves 
on all of them and make the best decision for a given scenario.


Mark.


Re: [j-nsp] BGP timer

2024-04-29 Thread Gert Doering via juniper-nsp
Hi,

On Mon, Apr 29, 2024 at 08:52:17AM +0200, Mark Tinka via juniper-nsp wrote:
> Protocols staying up despite the underlay being unstable means traffic dies
> and users are not happy. It's really that simple.

Yes, but that's a slightly different tangent.  If the underlay is unstable,
I think we're all in agreement that higher layers should not send packets
there.

The interesting question is "how to react when underlay seems to be stable
again"?  "bring up upper layers right away, with exponential decay flap
dampening" or "always wait 15 minutes to be SURE it's stable!!!"...

I go for flap dampening and taking ports into service quickly ("one of the
trainees might be on a rampage, killing the other uplink right next")  ;-)
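
As a rough comparison of the two approaches, the Python sketch below computes how long an exponentially decaying penalty takes to fall back to a reuse threshold, next to a fixed 15-minute wait. The function `reuse_delay`, the constant `FIXED_WAIT` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def reuse_delay(penalty: float, reuse: float, half_life: float) -> float:
    """Seconds until a penalty that halves every `half_life` seconds
    decays down to `reuse`. Zero if it is already at or below `reuse`."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)


FIXED_WAIT = 15 * 60  # the "always wait 15 minutes" policy, in seconds

# Accumulated penalty after one isolated event vs. a burst of flaps.
for label, penalty in [("one event", 1000.0), ("burst", 6000.0)]:
    delay = reuse_delay(penalty, reuse=750.0, half_life=60.0)
    print(f"{label}: decay wait {delay:.0f} s, fixed wait {FIXED_WAIT} s")
```

Under decay the wait scales with how badly the port has behaved, while the fixed timer charges every event the same 900 seconds.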

gert

-- 
"If was one thing all people took for granted, was conviction that if you 
 feed honest figures into a computer, honest figures come out. Never doubted 
 it myself till I met a computer with a sense of humor."
 Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany g...@greenie.muc.de




Re: [j-nsp] BGP timer

2024-04-29 Thread Mark Tinka via juniper-nsp




On 4/29/24 08:31, Saku Ytti via juniper-nsp wrote:


> But why is this desirable? Why do I want to prioritise stability
> always, instead of prioritising convergence on well-behaved interfaces
> and stability on poorly behaved interfaces?
>
> If I can pick just one, I'll prioritise convergence every time for both.
>
> That is, if I cannot have exponential back-off, I won't kill
> convergence 'just in case', because it's not me who will feel the pain
> of my decisions, it's my customers. Netengs and particularly infosec
> people quite often are unnecessarily conservative in their policies,
> because they don't have skin in the game, they feel the upside, but
> not the downside.


Over the decades, I've had a handful of customers that preferred uptime 
to convergence, because they were measured on that by their boss, 
organization or auditors.


You know - the kind of people that would refuse to reboot a router to 
implement new code, because "Last Reboot: 5y, 6w ago" looks far better 
than "Last Reboot: 15min ago" - those people.


Protocols staying up despite the underlay being unstable means traffic 
dies and users are not happy. It's really that simple.


Mark.


Re: [j-nsp] BGP timer

2024-04-29 Thread Saku Ytti via juniper-nsp
On Sun, 28 Apr 2024 at 21:20, Jeff Haas via juniper-nsp
 wrote:

> BFD holddown is the right feature for this.
> WARNING: BFD holddown is known to be problematic between Juniper and Cisco 
> implementations due to where each start their state machines for BFD vs. BGP.
>
> It was a partial motivation for BGP BFD strict:
> https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-bfd-strict-mode
>
> BGP BFD strict was added in 23.2R1.

But why is this desirable? Why do I want to prioritise stability
always, instead of prioritising convergence on well-behaved interfaces
and stability on poorly behaved interfaces?

If I can pick just one, I'll prioritise convergence every time for both.

That is, if I cannot have exponential back-off, I won't kill
convergence 'just in case', because it's not me who will feel the pain
of my decisions, it's my customers. Netengs and particularly infosec
people quite often are unnecessarily conservative in their policies,
because they don't have skin in the game, they feel the upside, but
not the downside.

-- 
  ++ytti