Re: BGP Graceful Restart

2021-04-17 Thread lobna gouda
Hello Graham,

I had a chance to analyze this topic (GR and the default GR helper mode) with
respect to the EoR message and the LLGR timer, and I corresponded by e-mail
with the RFC author.
I would say it depends on your environment, your topology, and the type of BGP
fault/error. Keep the default helper mode unless BFD is configured, even if you
do not have GR configured yourself. Where there is no BFD, there is a benefit
to helping your neighbor through the restart process, depending on the type of
peering relationship.

Now, to enable GR itself, you have to know what you are doing, why, and where
in the network it is actually enabled. Still, do not enable it where BFD is
configured.

Brgds,

LG



Re: BGP Graceful Restart

2021-04-17 Thread Mark Tinka




On 4/16/21 16:11, Graham Johnston wrote:

> I do believe that I understand the intended purpose of BGP
> graceful-restart. With that said, I was watching a video of a talk
> given by someone respected in the industry the other day on the use of
> graceful-shutdown and at the beginning of the talk there was a quick
> disclaimer that his topic had nothing to do with graceful-restart
> along with some text on the slide that provided me a clear indication
> that he was not a fan of graceful-restart.
>
> Largely, I suspect that his point was that if you otherwise do the
> right things during maintenance that graceful-restart has the
> potential of being really problematic if things go wrong, and thus he
> was discouraging the use of it. Is there consensus as to whether
> graceful-restart has any place in a service provider network?


When most of our hardware had a single control plane, we used GR on
those boxes (and only inside our AS).


But as nearly 100% of all our BGP-speaking hardware now has dual control 
planes, we just go for the vendor's NSR implementation. We've found that 
to be a lot more reliable because it is locally significant, 
predictable, and generally works well as it has matured a great deal in 
the last decade.


Mark.


Re: BGP Graceful Restart

2021-04-16 Thread Yang Yu
On Fri, Apr 16, 2021 at 11:09 AM Graham Johnston
 wrote:
> Largely, I suspect that his point was that if you otherwise do the
> right things during maintenance that graceful-restart has the
> potential of being really problematic if things go wrong, and thus he
> was discouraging the use of it. Is there consensus as to whether
> graceful-restart has any place in a service provider network?

RFC 4724 Graceful Restart is used to retain BGP routes where the forwarding
plane is NOT disrupted. It can be useful for things that don't have
any alternative path, to reduce exposure to control plane outages (e.g.
a process restart).
Also, sending the End-of-RIB marker (not necessarily enabling GR) can be
helpful for troubleshooting BGP route collection (it is a clear signal
that initial convergence has completed).

There is also LLGR https://tools.ietf.org/html/draft-ietf-idr-long-lived-gr-00
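
To make the End-of-RIB signal mentioned above concrete, here is a minimal
Python sketch of how a route collector might recognise it. It assumes the
IPv4 unicast case only, where RFC 4724 defines the marker as a minimum-length
(23-octet) UPDATE with no withdrawn routes and no path attributes. The
function name and constants are illustrative, not taken from any particular
BGP implementation.

```python
import struct

BGP_MARKER = b"\xff" * 16   # 16-octet all-ones marker in the BGP header
BGP_UPDATE = 2              # BGP message type code for UPDATE
MIN_UPDATE_LEN = 23         # 19-octet header + 2 (withdrawn len) + 2 (attr len)


def is_ipv4_unicast_end_of_rib(message: bytes) -> bool:
    """Return True if `message` is a minimum-length UPDATE (IPv4 unicast EoR)."""
    if len(message) != MIN_UPDATE_LEN or message[:16] != BGP_MARKER:
        return False
    length, msg_type = struct.unpack("!HB", message[16:19])
    withdrawn_len, attr_len = struct.unpack("!HH", message[19:23])
    return (
        length == MIN_UPDATE_LEN
        and msg_type == BGP_UPDATE
        and withdrawn_len == 0
        and attr_len == 0
    )


# Hypothetical sample messages for illustration.
end_of_rib = BGP_MARKER + struct.pack("!HBHH", 23, BGP_UPDATE, 0, 0)
keepalive = BGP_MARKER + struct.pack("!HB", 19, 4)
```

Other address families signal End-of-RIB with an UPDATE carrying only an
empty MP_UNREACH_NLRI attribute, which this sketch does not handle.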


Re: BGP Graceful Restart

2021-04-16 Thread Mel Beckman
I use BGP Graceful Restart in order to avoid route flapping penalties and 
undesired path selection when adding or removing prefixes on border routers 
(which entails ACL changes as well). However, when BGP is used as a data center 
fabric, I have heard it can cause complex failure modes lasting many minutes or 
even hours. I found this warning in the VMware Validated Design Document 5.0.1:

NSXT-VISDN-038 Do not enable Graceful Restart between BGP neighbors. Avoids 
loss of traffic. Graceful Restart maintains the forwarding table which in turn 
will forward packets to a down neighbor even after the BGP timers have expired 
causing loss of traffic

I don't run BGP as an east-west protocol, so I've never had cause to use this, 
but this might be one of the risks the speaker of the talk you heard was 
referring to.

 -mel




BGP Graceful Restart

2021-04-16 Thread Graham Johnston
I do believe that I understand the intended purpose of BGP
graceful-restart. With that said, I was watching a video of a talk
given by someone respected in the industry the other day on the use of
graceful-shutdown and at the beginning of the talk there was a quick
disclaimer that his topic had nothing to do with graceful-restart
along with some text on the slide that provided me a clear indication
that he was not a fan of graceful-restart.

Largely, I suspect that his point was that if you otherwise do the
right things during maintenance that graceful-restart has the
potential of being really problematic if things go wrong, and thus he
was discouraging the use of it. Is there consensus as to whether
graceful-restart has any place in a service provider network?

Thanks,
Graham


Cisco IOS BGP Graceful-Restart Implementation query

2018-11-15 Thread Florin Vlad Olariu
Hey there!

In our environment we generally have ASR-1000X-2s everywhere peering via
iBGP/eBGP. These routers have no redundant RPs, hence cannot keep
forwarding traffic while the router reboots or crashes. As such, this is a
clear example of a router that's only NSF-aware (or graceful-restart-aware)
but not capable.

The reason I enabled this is because, from RFC 4724:

In addition, even if the speaker does not have the ability to preserve its
forwarding state for any address family during BGP restart, it is still
recommended that the speaker advertise the Graceful Restart Capability to
its peer (as mentioned before *this is done by not including any <AFI, SAFI>
in the advertised capability*). There are two reasons for doing this.
The first is to indicate its intention of generating the End-of-RIB marker
upon the completion of its initial routing updates, as doing this would be
useful for routing convergence in general. The second is to indicate its
support for a peer which wishes to perform a graceful restart.
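
For reference, the capability described in the quoted text can be decoded
with a few lines of Python. This is a minimal sketch assuming the value
layout from RFC 4724 section 3: 4 bits of restart flags and a 12-bit restart
time, followed by zero or more 4-octet <AFI, SAFI, flags> tuples whose top
flag bit is the Forwarding State (F) bit. The class and function names are
made up for illustration.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GracefulRestartCapability:
    restarting: bool                       # Restart State (R) bit
    restart_time: int                      # seconds, 12-bit field
    families: List[Tuple[int, int, bool]]  # (AFI, SAFI, F bit set)


def parse_gr_capability(value: bytes) -> GracefulRestartCapability:
    """Decode the value field of a Graceful Restart capability (code 64)."""
    if len(value) < 2 or (len(value) - 2) % 4 != 0:
        raise ValueError("malformed Graceful Restart capability value")
    (head,) = struct.unpack("!H", value[:2])
    families = []
    for offset in range(2, len(value), 4):
        afi, safi, flags = struct.unpack("!HBB", value[offset:offset + 4])
        families.append((afi, safi, bool(flags & 0x80)))
    return GracefulRestartCapability(bool(head & 0x8000), head & 0x0FFF, families)


# Helper-only speaker: restart time 120 s, no <AFI, SAFI> tuples.
helper_only = parse_gr_capability(struct.pack("!H", 120))

# Speaker listing IPv4 unicast (1, 1) and VPNv4 unicast (1, 128), F bit clear.
with_families = parse_gr_capability(
    struct.pack("!HHBBHBB", 120, 1, 1, 0, 1, 128, 0)
)
```

In the first case `families` is empty, which is the "EoR and helper support
only" advertisement the RFC text recommends for a speaker that cannot
preserve forwarding state.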


So what I would expect to see in the "show ip bgp neighbor " command,
regarding Graceful Restart, would be something like the following:

BGP neighbor is ,  remote AS , internal link
[...]
  Neighbor capabilities:
[...]
Graceful Restart Capability: advertised and received
  Remote Restart timer is 120 seconds
  Address families advertised by peer:
*none*

Basically, GR is negotiated, but no address family is specified,
effectively only using the EOR marker for routing convergence improvements.
Instead, here's what the router specifies:

BGP neighbor is ,  remote AS , internal link
[...]
  Neighbor capabilities:
[...]
Graceful Restart Capability: advertised and received
  Remote Restart timer is 120 seconds
  Address families advertised by peer:
*IPv4 Unicast (was not preserved, VPNv4 Unicast (was not preserved*

My assumption is that the 'was not preserved' in parentheses refers to
the most recent restart of the neighbor, and means that when the
neighbor re-established the BGP connection, its GR Capability did not set
the "Forwarding State" bit for the IPv4 and VPNv4 AFIs, as specified by the
GR RFC:

Once the session is re-established, if the "Forwarding State" bit for a
specific address family is not set in the newly received Graceful Restart
Capability, or if a specific address family is not included in the newly
received Graceful Restart Capability, or if the Graceful Restart Capability
is not received in the re-established session at all, then the Receiving
Speaker MUST immediately remove all the stale routes from the peer that it
is retaining for that address family.
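
The rule quoted above can be written down as a small decision function. This
is only a sketch of the logic as stated in the RFC text, not of how IOS or
any other implementation behaves; the function name and the representation
of the capability (a dict from (AFI, SAFI) to the F bit, or None when the
capability is absent) are assumptions made for the example.

```python
from typing import Dict, Optional, Set, Tuple

Family = Tuple[int, int]  # (AFI, SAFI)


def families_to_flush(
    stale: Set[Family],
    new_capability: Optional[Dict[Family, bool]],
) -> Set[Family]:
    """Return the families whose stale routes must be removed immediately.

    `stale` holds the families for which routes from the peer are being
    retained. `new_capability` maps each family in the newly received
    Graceful Restart Capability to its Forwarding State bit, or is None if
    the capability was not received on the re-established session.
    """
    if new_capability is None:
        return set(stale)
    # Flush when the family is missing or its F bit is not set.
    return {fam for fam in stale if not new_capability.get(fam, False)}


IPV4_UNICAST = (1, 1)
VPNV4_UNICAST = (1, 128)

# Peer comes back listing both families with the F bit clear: flush both.
flush = families_to_flush(
    {IPV4_UNICAST, VPNV4_UNICAST},
    {IPV4_UNICAST: False, VPNV4_UNICAST: False},
)
```

Note that this check only runs once the session is re-established. Until
then, the stale routes stay in place for as long as the helper is willing to
retain them.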

Clearly the Forwarding State bit is never going to be set by this type of
router, due to hardware limitations. Here's my concern though: what happens
when the router reboots and the neighboring routers keep forwarding
packets to it because the GR capability did specify the IPv4 and VPNv4
AFI/SAFI? This would clearly cause impact, as traffic would be blackholed.

I will try to simulate this and see how it behaves, and I'll report back, but
any info you have would be greatly appreciated.
-- 
Florin Vlad Olariu