Bug#801065: Section 6.4 - discourage failing install or upgrade when service fails to start

2015-10-05 Thread Marvin Renich
Package: developers-reference
Version: 3.4.16
Severity: wishlist

As discussed on debian-devel starting at [1], I would like a comment
added to Section 6.4 "Best practices for maintainer scripts" that
recommends preventing the postinst script from returning failure when a
service fails to start.

A general rule of thumb is that if the corrective action would not
otherwise require re-running dpkg, then the postinst should behave as if
the installation succeeded.

An example of a case where postinst should succeed is if the admin,
prior to an upgrade, modified a configuration file which resulted in the
service being unable to restart during the upgrade.

Another example is if the service requires another resource to be
available that might be on another machine (e.g. a database).

The rational is that the failure has nothing to do with the installation
process or the contents of the .deb.  The service might just as well
have failed on reboot even though it was correctly installed.

I will send a follow-up message that contains a condensed digest of the
thread beginning at [1].

...Marvin

[1] https://lists.debian.org/debian-devel/2015/09/msg00496.html



Bug#801065: Section 6.4 - discourage failing install or upgrade when service fails to start

2015-10-05 Thread Marvin Renich
* Russ Allbery  [151005 18:24]:
> I'm also in favor.  However, this is a very substantial change to Debian
> practice, and I'm not sure what process should be used for making this
> kind of decision.  This wasn't a gap in our specification; rather, the
> previous standard was explicitly chosen (by at least some folks).  Failure
> to install was intentional and believed better since it didn't hide
> service failures.

I understand that this was intentional, but it is not the only way to
make sure the user (admin) is informed of the failure.

It was suggested in the discussion on d-devel that some other
notification mechanism could be implemented instead of installation
failure.

I was thinking about a special dpkg trigger that would be handled
internally by dpkg.  Packages would, in their postinst, place a file
containing non-fatal but important failure information in a directory
owned by dpkg.  This info would be printed out by dpkg at the end of the
run, and would also be passed to front ends that asked.

Front ends such as aptitude could ask dpkg for these notifications.  If
a large installation needed to be split into multiple calls to dpkg, the
front end can aggregate the notifications and present them all at the
very end, in whatever way is native to the front end.

One of the other notifications that I, personally, would like to see
this used for is when a configuration file has changes that cannot be
handled automatically.  Currently you are asked in the middle of the
installation; this would not change at all.  But it would be nice to
have a summary at the end of all the config files that had incompatible
changes, regardless of how I answered the dpkg or ucf prompt.

> I feel like this would benefit from a broader discussion than just the
> Policy list, but I'm not sure how to go about that, or what teams in
> particular should weigh in.

The discussion started on d-devel; should it be moved back there?  The
overwhelming majority of opinion seems to be in favor of the change.

...Marvin



Bug#801065: Section 6.4 - discourage failing install or upgrade when service fails to start

2015-10-05 Thread Sam Hartman
> "Marvin" == Marvin Renich  writes:

Marvin> As discussed on debian-devel starting at [1], I would like a
Marvin> comment added to Section 6.4 "Best practices for maintainer
Marvin> scripts" that recommends preventing the postinst script from
Marvin> returning failure when a service fails to start.

I strongly support this practice.



Bug#801065: digest of debian-devel discussion leading to this bug

2015-10-05 Thread Marvin Renich
Here is a condensation of the discussion prior to filing this bug.  I
have removed all quotes of previous messages (e.g. msg [2] contained
quotes from msg [1] that have been removed).  I have tried to identify
when a message is a reply to parts of several messages or to a message
that is not the previous message in this digest.  When in doubt, refer
to the original message.

[1] https://lists.debian.org/debian-devel/2015/09/msg00496.html
* Marvin Renich  [150923 13:53]:
> 
> 
> From the first time I had dpkg mark a package as half-configured when
> everything was correct except that the service would not start for some
> reason that had nothing to do with package installation (exactly the
> situation here for virtualbox), I have felt that dpkg had no business
> failing just because the service would not start.  I think that is a
> wrong design decision.
> 
> In fact, one specific case that often hurts me is when I have xen
> installed on a machine where I only run the hypervisor occasionally.
> Upgrading the xen packages causes (or has caused in the past) the
> upgrade to fail.  This is ridiculous!
> 
> I think it should be documented in the developers reference that if you
> attempt to start or restart a service in postinst, you should guard it
> so that a failure in the service does not propagate to a failure of the
> postinst.
> 
> 

[2] https://lists.debian.org/debian-devel/2015/09/msg00508.html
* Jeroen Dekkers  [150924 07:23]:
> But then when something goes wrong when upgrading and the service
> doesn't (re)start apt/dpkg will report success but the service isn't
> running anymore. That also sounds wrong to me. Letting postinst fail
> might not be the best way to signal this, but to change that we need
> something else to let the user know that something went wrong. Just
> printing an error message isn't enough, because the user might not see
> that (for example when multiple packages are installed/upgraded and a
> later package asks some questions using dialog or when using
> unattended-upgrades).

[3] https://lists.debian.org/debian-devel/2015/09/msg00511.html
* Marvin Renich  [150924 08:12]:
> How does failing the upgrade solve anything?  The upgrade should only
> fail if the failure of the service to start was because something in the
> upgrade itself was broken; this is rarely the case.
> 
> There are two prominent reasons why a service fails to start after an
> upgrade:  the relationship between the application and its configuration
> has changed (e.g. different, incompatible defaults or incompatible file
> format) or some external influence that has nothing to do with the
> upgrade (e.g.  unavailable resource).
> 
> The first case requires the admin to sort out the changes and fix the
> configuration.  Being required to re-run the dpkg installation just to
> flip the 'half-configured' state to 'installed' when the result would
> have been the same if dpkg had not failed the first time is wrong.
> 
> In the second case, how is it a dpkg installation failure if the
> hypervisor is not running so xen won't start?  Everything is installed
> perfectly.  Or if a daemon fails to start because the ldap server on a
> different host is down?  Failing the installation is _really_, _really_
> _wrong_!
> 
> What makes this even worse is that when installing or upgrading a large
> number of packages, this kind of incorrect failure sometimes affects
> many completely unrelated packages.  For an unattended upgrade, this is
> so much worse than having one service that (for a correct reason)
> refused to restart after the upgrade.
> 
> What you are looking for is a more prominent notification that a service
> did not restart.  But the current situation is like the "check engine"
> light flashing when you are low on fuel; yes, it gets your attention,
> but it is telling you the wrong thing.

[4] https://lists.debian.org/debian-devel/2015/09/msg00518.html
* Henrique de Moraes Holschuh  [150924 12:21]:
> What we really want is a "do not fail upgrade, BUT report that some services
> *that were previously running* failed to restart after the upgrade run".
> 
> ESPECIALLY if you are going to take "unattended upgrades" seriously.
> 
> Still, that would need some proper design work, and a reasonable amount of
> code to be written and tested.  Some of it will hook into the package
> system, some of it needs to interface to the services subsystem (systemd,
> sysvinit, others).

[5] https://lists.debian.org/debian-devel/2015/09/msg00519.html
* Paul Gevers  [150924 14:12]:
> I would like to add there is more than just services. As the current
> maintainer of dbconfig-common, it is more than clear to me that updates
> of packages that require updates of (and even installs into) databases
> (tables and/or their contents) also fall into this category. If for
> whatever reason we can't connect to the database (which may even be on a
> 

Bug#801065: Section 6.4 - discourage failing install or upgrade when service fails to start

2015-10-05 Thread Bill Allombert
On Mon, Oct 05, 2015 at 03:20:34PM -0700, Russ Allbery wrote:
> Sam Hartman  writes:
> >> "Marvin" == Marvin Renich  writes:
> 
> > Marvin> As discussed on debian-devel starting at [1], I would like a
> > Marvin> comment added to Section 6.4 "Best practices for maintainer
> > Marvin> scripts" that recommends preventing the postinst script from
> > Marvin> returning failure when a service fails to start.
> 
> > I strongly support this practice.
> 
> I'm also in favor.  However, this is a very substantial change to Debian
> practice, and I'm not sure what process should be used for making this
> kind of decision.  This wasn't a gap in our specification; rather, the
> previous standard was explicitly chosen (by at least some folks).  Failure
> to install was intentional and believed better since it didn't hide
> service failures.

But it had the major drawback that you could not fix the issue when the fix
required to install more packages, which is common.

So I am also in favor.

Cheers,
-- 
Bill. 

Imagine a large red swirl here. 



Bug#801065: Section 6.4 - discourage failing install or upgrade when service fails to start

2015-10-05 Thread Russ Allbery
Sam Hartman  writes:
>> "Marvin" == Marvin Renich  writes:

> Marvin> As discussed on debian-devel starting at [1], I would like a
> Marvin> comment added to Section 6.4 "Best practices for maintainer
> Marvin> scripts" that recommends preventing the postinst script from
> Marvin> returning failure when a service fails to start.

> I strongly support this practice.

I'm also in favor.  However, this is a very substantial change to Debian
practice, and I'm not sure what process should be used for making this
kind of decision.  This wasn't a gap in our specification; rather, the
previous standard was explicitly chosen (by at least some folks).  Failure
to install was intentional and believed better since it didn't hide
service failures.

I feel like this would benefit from a broader discussion than just the
Policy list, but I'm not sure how to go about that, or what teams in
particular should weigh in.

-- 
Russ Allbery (r...@debian.org)   



Bug#801065: Section 6.4 - discourage failing install or upgrade when service fails to start

2015-10-05 Thread Sam Hartman
> "Russ" == Russ Allbery  writes:

Yeah, but there's a significant factor that reduces things somewhat.
In the past, /etc/init.d/foo failing would often cause postinst to
break.
However, in the past, there were a lot of failures that were not
detected by the init.d script.
We have two intentional decisions conflicting:

1) systemd tries to be a lot better about reporting service status than
our previous infrastructure

2) We had the decision on a number of people to not hide failures by
causing installation to fail.

I actually think the folks in category 2 weren't typically hiding a lot
of service failures, because  it was fairly common for the init script
to mask that, but it did tend to hide failures like missing
configuration files etc.

I certainly know that when deploying units for krb5 I had to mask a
bunch more failures to get behavior consistent with the previous
packages.

Given the above, I think a discussion on -devel (which has happened) and
a discussion on-policy is sufficient.

--Sam