Hi,

I have a pair of OpenBSD 5.6 firewalls that have been running releases
happily for years (I think since 5.1). They are in CARP failover mode,
running BGP sessions with upstream providers and filtering traffic.

A few days ago I had an Internet outage (the first in years), which
appears to have been caused by a bgpd crash. I could ping the ISP's
interface, but then I noticed I had no routes at all (except connected
ones) in the routing table. Next, I discovered there was no bgpd process
running. Restarting bgpd gave me routes and Internet connectivity back.

Here's an excerpt from the messages log:

Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sync error
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
notification: Header error, synchronization error
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
restart of IPv4 unicast, keeping routes
Apr 17 18:29:18 bgp2 bgpd[24107]: neighbor 82.117.192.121 (sbb): bad nlri prefix
Apr 17 18:29:19 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
notification: error in UPDATE message, network unacceptable
Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
restart of IPv4 unicast, not restarted, flushing
Apr 17 18:29:52 bgp2 bgpd[24107]: fatal in RDE: peer_up: bad state
Apr 17 18:29:52 bgp2 bgpd[32268]: dispatch_imsg in main: pipe closed
Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
notification: Cease, administratively down
Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 178.253.194.253 (orion): sending 
notification: Cease, administratively down


And from the daemon log for the same period:

Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sync error
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
notification: Header error, synchronization error
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
restart of IPv4 unicast, keeping routes
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Established -> Idle, reason: Fatal error
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Idle -> Connect, reason: Start
Apr 17 18:29:18 bgp2 bgpd[32268]: incremented the demote state of group 'carp'
Apr 17 18:29:18 bgp2 bgpd[24107]: neighbor 82.117.192.121 (sbb): bad nlri prefix
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Connect -> OpenSent, reason: Connection opened
Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
OpenSent -> Active, reason: Connection closed
Apr 17 18:29:19 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
notification: error in UPDATE message, network unacceptable
Apr 17 18:29:19 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Active -> Idle, reason: Fatal error
Apr 17 18:29:49 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Idle -> Connect, reason: Start
Apr 17 18:29:49 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Connect -> OpenSent, reason: Connection opened
Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
restart of IPv4 unicast, not restarted, flushing
Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
OpenSent -> OpenConfirm, reason: OPEN message received
Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
OpenConfirm -> Established, reason: KEEPALIVE message received
Apr 17 18:29:52 bgp2 bgpd[24107]: fatal in RDE: peer_up: bad state
Apr 17 18:29:52 bgp2 bgpd[32268]: dispatch_imsg in main: pipe closed
Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
notification: Cease, administratively down
Apr 17 18:29:52 bgp2 bgpd[32268]: decremented the demote state of group 'carp'
Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
Established -> Idle, reason: Stop
Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 178.253.194.253 (orion): sending 
notification: Cease, administratively down
Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 178.253.194.253 (orion): state change 
Established -> Idle, reason: Stop
Apr 17 18:29:52 bgp2 bgpd[9759]: session engine exiting
Apr 17 18:29:54 bgp2 bgpd[32268]: kernel routing table 0 (Loc-RIB) decoupled
Apr 17 18:29:55 bgp2 bgpd[32268]: Terminating


I would be grateful if someone could explain to me what happened here,
and also what to do in order to avoid it in the future.

Thank you in advance,
-- 
Marko Cupać
https://www.mimar.rs
