Package: frr
Version: 10.5.1-1
Severity: normal

Dear Maintainer,

I was rebasing the Ubuntu changes on top of your latest uploads, first
10.5.0 and then 10.5.1, and noticed that with 10.5.1 this command, which is
used in the autopkgtests and also run interactively on a new install,
started failing:

root@r-dep8:~# systemctl reload frr
Job for frr.service canceled.
root@r-dep8:~# echo $?
1

That fails the test, and I noticed in debci that the tests are failing
there too. I'm still troubleshooting, but so far I think the cause is some
change in 10.5.1, because I haven't seen this error with 10.5.0.
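
For anyone who wants to reproduce this, here is a minimal sketch that
records the exit status and the unit state in one step. It assumes a root
shell on a system with frr installed. The helper name reload_and_capture
and the UNIT variable are made up for illustration; they are not part of
frr or its autopkgtests.

```shell
#!/bin/sh
# Hypothetical helper (not part of frr or its tests): run a command,
# report its exit status, then show how systemd sees the unit.
UNIT="${UNIT:-frr}"

reload_and_capture() {
    # "$@" is the command under test, e.g.: systemctl reload "$UNIT"
    rc=0
    "$@" || rc=$?
    echo "command: $*"
    echo "exit status: $rc"
    if command -v systemctl >/dev/null 2>&1; then
        # Unit state right after the command returns; never fail here.
        systemctl status "$UNIT" --no-pager 2>&1 || true
    fi
    return "$rc"
}
```

It could be called as "reload_and_capture systemctl reload frr", followed
by "journalctl -u frr -b --no-pager" once the stop timeout has expired, to
collect the journal messages for the unit.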

From systemd's point of view, it looks like systemd is waiting for
something. Status right after the failed reload:
● frr.service - FRRouting
     Loaded: loaded (/etc/systemd/system/frr.service; enabled; preset: enabled)
     Active: deactivating (stop-sigterm) since Sat 2026-01-10 19:56:19 UTC; 1s ago
 Invocation: 053036da3d584383beaf25711f724dd0
       Docs: https://frrouting.readthedocs.io/en/latest/setup.html
    Process: 6849 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
    Process: 6996 ExecReload=/usr/lib/frr/frrinit.sh reload (code=exited, status=0/SUCCESS)
   Main PID: 7015 (watchfrr)
     Status: "FRR Operational"
      Tasks: 15 (limit: 17968)
     Memory: 27.2M (peak: 58.4M)
        CPU: 694ms

And eventually in the journal logs we see that it gave up waiting:
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: State 'stop-sigterm' timed out. Killing.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Killing process 6798 (watchfrr) with signal SIGKILL.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Killing process 6681 (mgmtd) with signal SIGKILL.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Killing process 6683 (zebra) with signal SIGKILL.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Killing process 6688 (bgpd) with signal SIGKILL.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Killing process 6695 (staticd) with signal SIGKILL.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Main process exited, code=killed, status=9/KILL
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Failed with result 'timeout'.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Triggering OnFailure= dependencies.
Jan 10 19:51:02 r-dep8 systemd[1]: frr.service: Failed to enqueue [email protected] job, ignoring: Unit [email protected] not found.
Jan 10 19:51:07 r-dep8 systemd[1]: frr.service: Scheduled restart job, restart counter is at 1.
Jan 10 19:51:07 r-dep8 systemd[1]: Starting frr.service - FRRouting...

I suspect a bad interaction with watchfrr, which is the main process of
the systemd service unit.

I also started noticing the heartbeat-failed@ log message, which I had
never seen before, but that does not appear to be new in 10.5.1. I'll leave
troubleshooting that for another time.
  • Bug#1125203: frr: "systemctl reload frr" fails ... Andreas Hasenack
