Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-03-03 Thread David Wright
On Sun 03 Mar 2024 at 13:27:50 (+0100), Eduard Bloch wrote:
> * David Wright [Sun, Feb 11 2024, 10:20:16PM]:
> > On Sun 11 Feb 2024 at 20:41:51 (+), Darac Marjal wrote:
> > > On 11/02/2024 11:21, Rainer Dorsch wrote:
> >
> > > > - How do I set a timeout/limit for anacron, that it cannot block forever
> > > > during a reboot?
> > >
> > > It may be germane to point out that anacron.service already explicitly
> > > sets "TimeoutStopSec=Infinity". So, in the opinion of the developers,
> > > the service shouldn't be prematurely killed. Of course you, as the
> > > system administrator, always have the right to countermand that sort
> > > of decision, but it would be curious to find out why the developers
> > > thought they needed to override the systemd default in the first
> > > place?
> >
> > Bug #915379 explains all: long-running cron jobs, like backups, can
> > get killed, and there was also an issue with exim.
> 
> Yes, and?

If you don't like it, then

  # systemctl edit --full anacron.service

and remove or decrease the timeout, which means (for removal)
you'll be reverting to how things are in bullseye.

> The opposite is: you have some stupid (and UNKNOWN) task which
> hangs forever because of some programming bug. And then your whole system
> locks up, unable to reboot, and no way to recover it because the reboot
> is stuck because of this. I am observing this on my old hacking laptop
> right now, the system took many minutes (5? 7?) to continue, but even
> that was pure luck.

I would probably have force-powered-down by then.

> Sorry, no, that cannot be the proper way. I am reopening 915379 now, the
> maintainer should maybe come up with some sane solution.

#915379 has been archived, which AIUI means you need to open a new one.

Cheers,
David.



Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-03-03 Thread Eduard Bloch
reopen 915379
thanks

Hello,
* David Wright [Sun, Feb 11 2024, 10:20:16PM]:
> On Sun 11 Feb 2024 at 20:41:51 (+), Darac Marjal wrote:
> > On 11/02/2024 11:21, Rainer Dorsch wrote:
>
> > > - How do I set a timeout/limit for anacron, that it cannot block forever
> > > during a reboot?
> >
> > It may be germane to point out that anacron.service already explicitly
> > sets "TimeoutStopSec=Infinity". So, in the opinion of the developers,
> > the service shouldn't be prematurely killed. Of course you, as the
> > system administrator, always have the right to countermand that sort
> > of decision, but it would be curious to find out why the developers
> > thought they needed to override the systemd default in the first
> > place?
>
> Bug #915379 explains all: long-running cron jobs, like backups, can
> get killed, and there was also an issue with exim.

Yes, and? The opposite is: you have some stupid (and UNKNOWN) task which
hangs forever because of some programming bug. And then your whole system
locks up, unable to reboot, and no way to recover it because the reboot
is stuck because of this. I am observing this on my old hacking laptop
right now, the system took many minutes (5? 7?) to continue, but even
that was pure luck.

Sorry, no, that cannot be the proper way. I am reopening 915379 now, the
maintainer should maybe come up with some sane solution.

Best regards,
Eduard.



Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-12 Thread Max Nikulin

On 13/02/2024 03:16, Rainer Dorsch wrote:
> I will check the anacron
> status before the next reboot.

Try it now. There is a good chance that you already have a stuck job
(waiting for a read on a file descriptor leaked from the parent, etc.).





Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-12 Thread Rainer Dorsch
Hi David and Max,

many thanks for the precise and very helpful answers. I will check the anacron 
status before the next reboot.

Thanks again
Rainer

On Monday, 12 February 2024, 05:20:16 CET, David Wright wrote:
> On Sun 11 Feb 2024 at 20:41:51 (+), Darac Marjal wrote:
> > On 11/02/2024 11:21, Rainer Dorsch wrote:
> > > - How do I set a timeout/limit for anacron, that it cannot block forever
> > > during a reboot?
> > 
> > It may be germane to point out that anacron.service already explicitly
> > sets "TimeoutStopSec=Infinity". So, in the opinion of the developers,
> > the service shouldn't be prematurely killed. Of course you, as the
> > system administrator, always have the right to countermand that sort
> > of decision, but it would be curious to find out why the developers
> > thought they needed to override the systemd default in the first
> > place?
> 
> Bug #915379 explains all: long-running cron jobs, like backups, can
> get killed, and there was also an issue with exim.
> 
> There's mention there of an anacron replacement called cronie, but
> I don't know what the status of this is, besides being in trixie.
> 
> Cheers,
> David.


-- 
Rainer Dorsch
http://bokomoko.de/




Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-11 Thread David Wright
On Sun 11 Feb 2024 at 20:41:51 (+), Darac Marjal wrote:
> On 11/02/2024 11:21, Rainer Dorsch wrote:

> > - How do I set a timeout/limit for anacron, that it cannot block forever
> > during a reboot?
> 
> It may be germane to point out that anacron.service already explicitly
> sets "TimeoutStopSec=Infinity". So, in the opinion of the developers,
> the service shouldn't be prematurely killed. Of course you, as the
> system administrator, always have the right to countermand that sort
> of decision, but it would be curious to find out why the developers
> thought they needed to override the systemd default in the first
> place?

Bug #915379 explains all: long-running cron jobs, like backups, can
get killed, and there was also an issue with exim.

There's mention there of an anacron replacement called cronie, but
I don't know what the status of this is, besides being in trixie.

Cheers,
David.



Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-11 Thread Max Nikulin

On 12/02/2024 03:41, Darac Marjal wrote:
> On 11/02/2024 11:21, Rainer Dorsch wrote:
> > - How can I find out which process anacron is still running?
>
> I think that, once the shutdown has started this is basically
> impossible.

Likely some cron job requires a fix. Try

systemctl status anacron

*before* shutdown and inspect processes that are started by anacron.
Other variants are

systemd-cgls
ps xuwf
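
To narrow the process list down to what anacron itself started, something along these lines might help. It is only a sketch: descendants is a hypothetical helper, and the sample process table (PIDs, command lines, the example-job script) is made up for illustration.

```shell
#!/bin/sh
# Sketch: print every process that descends from a given PID.
# descendants is a hypothetical helper; input is "PID PPID COMMAND..." per line.
descendants() {
    awk -v root="$1" '
        { pid[NR] = $1; ppid[NR] = $2; line[NR] = $0 }
        END {
            keep[root] = 1
            # Sweep the table until no further child of a kept PID turns up
            do {
                changed = 0
                for (i = 1; i <= NR; i++)
                    if ((ppid[i] in keep) && !(pid[i] in keep)) {
                        keep[pid[i]] = 1
                        changed = 1
                    }
            } while (changed)
            for (i = 1; i <= NR; i++)
                if ((pid[i] in keep) && pid[i] != root) print line[i]
        }
    '
}

# Made-up process table, for illustration only
table=$(mktemp)
cat > "$table" <<'EOF'
1 0 /sbin/init
1822 1 /usr/sbin/anacron -d -q -s
38130 1822 /bin/sh -c run-parts --report /etc/cron.daily
38131 38130 run-parts --report /etc/cron.daily
38140 38131 /bin/sh /etc/cron.daily/example-job
900 1 /usr/sbin/cron -f
EOF

children=$(descendants 1822 < "$table")
printf '%s\n' "$children"
rm -f "$table"
```

On a live system the input would come from ps instead of the sample file, for example

  ps -e -o pid=,ppid=,args= | descendants "$(systemctl show -p MainPID --value anacron.service)"

assuming anacron is still running at that moment.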




Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-11 Thread Darac Marjal

On 11/02/2024 11:21, Rainer Dorsch wrote:
> Hello,
>
> I saw during a reboot
>
> [  *** ] Job anacron.service/stop running (15min 49s / no limit)
>
> Eventually I did a hard reset, since I was not sure whether the system had simply hung.
>
> I have two quick questions:
> - How can I find out which process anacron is still running?

I think that, once the shutdown has started, this is basically
impossible. User sessions have likely been killed off so your only
option would be to log in as root, but you'll probably also find that
getty has been killed off too, so I don't know how you'd be able to
enter any commands at this point.


However, one thing that you could look at is to inspect the journal from
that boot. You can run "journalctl --list-boots" to get a list of boot
ids, then run "journalctl -b <boot id> -u anacron". Anacron will print
lines like the following:


Feb 10 13:32:50 host.example.com systemd[1]: Started anacron.service - Run anacron jobs.
Feb 10 13:32:50 host.example.com anacron[1822]: Anacron 2.3 started on 2024-02-10
Feb 10 13:32:50 host.example.com anacron[1822]: Will run job `cron.daily' in 5 min.
Feb 10 13:32:50 host.example.com anacron[1822]: Jobs will be executed sequentially
Feb 10 13:37:50 host.example.com anacron[1822]: Job `cron.daily' started
Feb 10 13:37:50 host.example.com anacron[38129]: Updated timestamp for job `cron.daily' to 2024-02-10
Feb 10 13:37:51 host.example.com anacron[1822]: Job `cron.daily' terminated (mailing output)
Feb 10 13:37:51 host.example.com anacron[1822]: Normal exit (1 job run)
Feb 10 13:37:51 host.example.com systemd[1]: anacron.service: Deactivated successfully.


So, from that, you can see which set of cron scripts was running. If
you have multiple scripts, then yes, it's harder to tell which script
was the long-running one (perhaps it's something like locate updating
its database?)
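
As a rough illustration of reading those journal lines, the sketch below lists jobs that logged "started" but never "terminated". unfinished_jobs is a hypothetical helper, and the cron.weekly line in the sample is invented to show a job that never finished; only the message format is taken from the output above.

```shell
#!/bin/sh
# Sketch: report anacron jobs with a "started" line but no "terminated" line.
# unfinished_jobs is a hypothetical helper; it reads journal text on stdin.
unfinished_jobs() {
    awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "Job" && $(i + 2) == "started")    started[$(i + 1)] = 1
                if ($i == "Job" && $(i + 2) == "terminated") delete started[$(i + 1)]
            }
        }
        END { for (job in started) print job }
    ' | tr -d "\`'"    # strip the quoting anacron puts around job names
}

# Sample journal text; the cron.weekly entry is invented for illustration
log=$(mktemp)
cat > "$log" <<'EOF'
Feb 10 13:37:50 host.example.com anacron[1822]: Job `cron.daily' started
Feb 10 13:37:51 host.example.com anacron[1822]: Job `cron.daily' terminated (mailing output)
Feb 10 13:42:50 host.example.com anacron[1822]: Job `cron.weekly' started
EOF

result=$(unfinished_jobs < "$log")
echo "$result"
rm -f "$log"
```

Against the real journal this would be something like "journalctl -b -1 -u anacron | unfinished_jobs" for the previous boot, which assumes the journal is persistent. It only names the job set (cron.daily, cron.weekly, ...), not the individual script inside it.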



> - How do I set a timeout/limit for anacron, that it cannot block forever
> during a reboot?


It may be germane to point out that anacron.service already explicitly 
sets "TimeoutStopSec=Infinity". So, in the opinion of the developers, 
the service shouldn't be prematurely killed. Of course you, as the 
system administrator, always have the right to countermand that sort of 
decision, but it would be curious to find out why the developers thought 
they needed to override the systemd default in the first place?
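
To see which value actually wins once overrides are in play, a sketch like the following could be used. effective_setting is a hypothetical helper, and the unit text is a cut-down sample, not the real anacron.service; it relies on a later assignment of a setting such as TimeoutStopSec= replacing an earlier one.

```shell
#!/bin/sh
# Sketch: report the value a unit ends up with for one setting.
# effective_setting is a hypothetical helper, not a systemd tool.
effective_setting() {
    # $1: setting name; stdin: unit file text followed by any drop-ins
    awk -F= -v key="$1" '
        $1 == key { value = $2; found = 1 }   # a later assignment replaces an earlier one
        END { if (found) print value; else print "(not set)" }
    '
}

# Cut-down sample: main unit first, then a drop-in, in the order they apply
unit_text=$(mktemp)
cat > "$unit_text" <<'EOF'
[Service]
TimeoutStopSec=Infinity
# contents of a drop-in, read after the main unit file
[Service]
TimeoutStopSec=300
EOF

effective=$(effective_setting TimeoutStopSec < "$unit_text")
echo "TimeoutStopSec=$effective"
rm -f "$unit_text"
```

On a real system, "systemctl cat anacron.service" prints the unit file followed by its drop-ins and could be piped into the helper; "systemctl show -p TimeoutStopUSec anacron.service" asks systemd directly for the value it is using.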





> Thanks
> Rainer




Re: [ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-11 Thread Michael Kjörling
On 11 Feb 2024 12:21 +0100, from m...@bokomoko.de (Rainer Dorsch):
> - How do I set a timeout/limit for anacron, that it cannot block forever 
> during a reboot?

I believe you want something like:

# systemctl edit anacron.service

and

[Service]
TimeoutStopSec=300

(or whichever value you feel comfortable with)

This will create a drop-in file to customize the anacron.service unit
file with the contents you provide. You can also directly edit the
unit file, but that will cause conflicts when the package is upgraded.

See systemd.service(5) for details; search for TimeoutSec=,
TimeoutStartSec= and TimeoutStopSec=.
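
For reference, the drop-in that "systemctl edit" creates is a small file named override.conf in an anacron.service.d directory. The sketch below writes the same content by hand; write_dropin and DROPIN_ROOT are hypothetical names, and by default it writes into a scratch directory so that nothing on the system is touched.

```shell
#!/bin/sh
# Sketch: write the drop-in by hand instead of through "systemctl edit".
# write_dropin and DROPIN_ROOT are hypothetical names used for illustration.
write_dropin() {
    # $1: directory holding unit overrides; $2: stop timeout in seconds
    dir="$1/anacron.service.d"
    mkdir -p "$dir"
    printf '[Service]\nTimeoutStopSec=%s\n' "$2" > "$dir/override.conf"
    printf '%s\n' "$dir/override.conf"
}

# Default to a scratch directory; use /etc/systemd/system (as root) for real
DROPIN_ROOT=${DROPIN_ROOT:-$(mktemp -d)}
dropin=$(write_dropin "$DROPIN_ROOT" 300)
content=$(cat "$dropin")
printf '%s\n' "$content"
```

With DROPIN_ROOT=/etc/systemd/system, a "systemctl daemon-reload" is needed afterwards, and "systemctl show -p TimeoutStopUSec anacron.service" should then report the new limit.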

-- 
Michael Kjörling 🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”



[ *** ] Job anacron.service/stop running (15min 49s / no limit)

2024-02-11 Thread Rainer Dorsch
Hello,

I saw during a reboot

[  *** ] Job anacron.service/stop running (15min 49s / no limit)

Eventually I did a hard reset, since I was not sure whether the system had simply hung.

I have two quick questions: 
- How can I find out which process anacron is still running?
- How do I set a timeout/limit for anacron, that it cannot block forever 
during a reboot?

Thanks
Rainer
-- 
Rainer Dorsch
http://bokomoko.de/