Bug#1079967: should policy and dpkg agree on allowed versions?

2024-08-28 Thread Helmut Grohne
Package: dpkg-dev,debian-policy
Severity: wishlist
X-Debbugs-Cc: po...@debian.org

Hi Guillem and policy editors,

Emilio and I noticed that policy and dpkg have subtly different ideas
of what is a version. While man deb-version says

| The upstream-version may contain only alphanumerics (“A-Za-z0-9”)
| and the characters . + - : ~ (full stop, plus, hyphen, colon, tilde)
| and should start with a digit.

Debian policy section 5.6.1 says

| The upstream_version must contain only alphanumerics and the
| characters . + - ~ (full stop, plus, hyphen, tilde) and should start
| with a digit. If there is no debian_revision then hyphens are not
| allowed.

Technically speaking, it is fine for policy to forbid things that dpkg
allows. Other distributions based on dpkg may use a different policy and
allow using multiple colons. Still, it is an odd aspect and may cause
confusion. Is this difference intentional? If yes, would it make sense
to add a footnote to policy hinting that it is more restrictive than
dpkg? I also checked packages in unstable and found no packages with a
version containing two colons (i.e. all packages are policy-compliant in
this regard).
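
To make the difference concrete, here is a minimal sketch that checks an
upstream version string against the two character sets quoted above. The
function names are hypothetical and only the allowed characters are
tested; the "should start with a digit" and hyphen rules are deliberately
left out.

```shell
# Hypothetical helpers, not part of dpkg or policy tooling.
# Each returns 0 when the string uses only the permitted characters.

# Character set from deb-version(7): alphanumerics and . + - : ~
dpkg_upstream_chars_ok() {
    case "$1" in
        ''|*[!A-Za-z0-9.+:~-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Character set from policy 5.6.1: alphanumerics and . + - ~ (no colon)
policy_upstream_chars_ok() {
    case "$1" in
        ''|*[!A-Za-z0-9.+~-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

With these, an upstream part such as "1.2:3" passes the first check and
fails the second, which is exactly the gap described here.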

Thanks for considering

Helmut



Bug#1074014: Bug#1073622: Bug#1073608: mksh, pax: no move to /usr going to happen, because:

2024-08-08 Thread Helmut Grohne
Hi Russ,

On Thu, Aug 08, 2024 at 08:40:46AM -0700, Russ Allbery wrote:
> Just to be sure, though, I don't think this is the problem that Thorsten
> was worried about.  My understanding of the problem Thorsten was reporting
> was slightly different:

Thank you for bridging the gap in communication.

> 1. A user has pointed /bin/sh to mksh, via a local diversion, changed
>    symlink, or whatever other mechanism.  The current mksh package, which
>    from dpkg's perspective provides /bin/mksh, is installed.
> 
> 2. A new version of mksh is uploaded that no longer provides /bin/mksh and
>    does provide /usr/bin/mksh.
> 
> 3. dpkg unpacks and installs that package.  This involves some sequence of
>    operations that, from dpkg's perspective, remove /bin/mksh and install
>    /usr/bin/mksh.  However, these are both the same file.  My
>    understanding of Thorsten's concern is that there may exist a window
>    during which dpkg would delete /bin/mksh, since it is no longer
>    included in the package, before it installs /usr/bin/mksh, and thus
>    there could be a window where /bin/sh is a broken symlink.  If the
>    system crashes in that window, recovery could be annoying.

Such concern is unwarranted. When dpkg unpacks a .deb, it unpacks all
the files with a .dpkg-tmp suffix appended. Hence, we also get a file
/usr/bin/mksh.dpkg-tmp. Once all of these are synced, it issues a
sequence of renames, including rename(/usr/bin/mksh.dpkg-tmp,
/usr/bin/mksh). This will atomically replace mksh even though it was
formerly /bin/mksh (but the same file via aliasing). At no time will
looking up /bin/mksh yield -ENOENT.

If you want to verify this, you can do a very similar experiment by
upgrading dash from bookworm to trixie and observing its behaviour about
/usr/bin/dash using strace.

mmdebstrap bookworm /dev/null --variant=apt --include=strace \
  --customize-hook='sed -i -e s/bookworm/trixie/ "$1/etc/apt/sources.list"' \
  --chrooted-customize-hook='apt-get update && apt-get -y install libc6 && apt-get download dash' \
  --chrooted-customize-hook=bash

In there, strace dpkg -i dash_*.deb.
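
The rename step described above can also be imitated without dpkg. The
following is only a sketch in a scratch directory: the layout, the
function name and the file contents are made up, and it shows the order
of operations rather than proving atomicity.

```shell
# Hypothetical demo: replace a file by rename and read it via an alias.
demo_replace_via_alias() {
    root=$(mktemp -d) || return 1
    mkdir -p "$root/usr/bin"
    ln -s usr/bin "$root/bin"                  # aliasing symlink, as on merged-/usr
    echo old > "$root/usr/bin/mksh"            # stands in for the installed binary
    echo new > "$root/usr/bin/mksh.dpkg-tmp"   # new content placed beside the target
    # A mv inside one directory is a single rename(2): the target name
    # always refers to either the old or the new file.
    mv -f "$root/usr/bin/mksh.dpkg-tmp" "$root/usr/bin/mksh"
    cat "$root/bin/mksh"                       # look the file up via the alias
    rm -rf "$root"
}
```

Calling demo_replace_via_alias prints the new content, read through the
bin symlink.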

> I am unfortunately not very familiar with the ordering guarantees provided
> by dpkg and with the precise mechanism of dh_movetousr.  I did look over
> the source of the latter and didn't see anything that obviously seemed to
> handle this case, but I quite likely missed something.  This presumably
> has come up with other packages (libc6, for example), so maybe there's
> something that makes this safe that I'm not aware of?

dh_movetousr is rather boring as it runs at build time and just moves
files inside the data.tar from / to /usr. You can often achieve the same
effect by changing debian/*.install files and adding a couple of "usr/".
I'm not sure how it would be relevant to ordering guarantees of dpkg.

> Maybe the protective diversions also protect against this problem as well
> as the problem of moved files?  I unfortunately failed to spot where the
> protective diversions were added in dh_movetousr (if that even is the
> right place to be looking), so I'm fairly sure I'm missing something.

dh_movetousr has nothing to do with protective diversions. It does not
add nor remove diversions nor does it change any. All it changes is
locations of files in the data.tar of a .deb. All of the protective
diversions that we ever installed for DEP17 are managed in maintainer
scripts and dh_movetousr does not touch maintainer scripts at all.

> Do we document for users somewhere how to change /bin/sh, as a replacement
> for the debconf questions?  When I was investigating this, I tried to find
> documentation in the bash and dash packages and was unsuccessful.  That's
> not a Policy question, of course, just an aside, but this sounds somewhat
> complicated and I'm not sure a user would be able to figure out the new,
> correct way to change /bin/sh.

I also looked and could not locate such documentation.
dash/0.5.11+git20210903+057cd650a4ed-8 mentions:

|   * Remove the remnants of the debconf shell question
|     (Closes: #1007093, #1007241).

In the first bug, Andrew stated that selecting the shell via debconf
wasn't supported. The submitter of the second indicates that the debconf
choice didn't practically work.

Earlier, dash/0.5.11+git20210903+057cd650a4ed-4 mentions:
|   * Stop using debconf to select the default /bin/sh.

But no NEWS nor documentation seems to have been added at that time.

I also had my hands in this. I sent the patch for removing the diversion
of dash for itself (#989632). It was applied during the trixie cycle.
This work enables us to have no diversion in the common case of /bin/sh
pointing to dash.

> It sounds like the plan is to cover this in the release notes, which is
> great, but we probably also need ongoing documentation for the future.
> I'm not sure the best place to put that.  Maybe just in README.Debian for
> whatever package currently provides /bin/sh by default.

Your reasoning makes sense to me. 

Bug#1074014: Bug#1073608: Bug#1074014: Bug#1073622: Bug#1073608: mksh, pax: no move to /usr going to happen, because:

2024-08-08 Thread Helmut Grohne
Hi Sam,

I see this is getting a bit off-topic and recommend that you spin off a
discussion on d-devel if this really matters to you.

On Wed, Aug 07, 2024 at 04:27:01PM -0600, Sam Hartman wrote:
> >>>>> "Helmut" == Helmut Grohne  writes:
> 
> Helmut> In bullseye and earlier, I guess it works.
> 
> Helmut> If you start with bullseye or earlier, upgrade to bookworm
> Helmut> and then to trixie, it continues to work, because the dash
> Helmut> maintainer scripts preserve any diversion that is not owned
> Helmut> by dash.
> 
> This seems really broken.
> As a sysadmin I would definitely expect to be able to repoint the
> /bin/sh symlink and have that preserved.

I am surprised. This is not what I would expect for files below /usr
(and the locations that now point inside /usr). Much to the contrary, I
would expect that the next package upgrade overwrites my changes unless
I take special precautions (such as update-alternatives or dpkg-divert)
to persist my changes.

> I hope we choose to move back to a situation at some point where it does
> not require diversions to get that behavior.

There is no "back". What you picture never was the case. /bin/sh has
always been installed in some .deb's data.tar. What changed over time is
that we first added diversions for transitioning from bash to dash and
later removed that mechanism as the transition is complete and the
desire to choose your /bin/sh is not as prevalent as it used to be
(mainly because choice of /bin/sh no longer affects boot speed much).

> In my mind that is an even higher priority than say avoiding bootstrap
> tools needing to create a /bin/sh symlink.

I hear you, but I note that both of us arrive at significantly different
cost/benefit judgements leading us to arrive at opposite opinions. From
my current point of view, the status quo is reasonable and
you are the one who sees a need for change. I am happy to participate
constructively, but I am not the one driving such change.

Helmut



Bug#1074014: Bug#1073608: Bug#1074014: Bug#1073622: Bug#1073608: mksh, pax: no move to /usr going to happen, because:

2024-08-07 Thread Helmut Grohne
Hi Thorsten,

On Wed, Aug 07, 2024 at 09:59:09AM +, Thorsten Glaser wrote:
> >that the way people tend to use mksh is by adding a local diversion for
> 
> Unfortunately not.
> 
> The way we have to do it since squeeze, when dash unilaterally broke
> cross-package coordination, is:
> 
> dpkg-reconfigure dash ⇒ remove its owning of /bin/sh
>  (so it reverts to bash)
> ln -sf lksh /bin/sh
> 
> This cleanly persists across upgrades, bash was never problematic
> wrt this.

I fear this approach no longer works.

In bullseye and earlier, I guess it works.

If you start with bullseye or earlier, upgrade to bookworm and then to
trixie, it continues to work, because the dash maintainer scripts
preserve any diversion that is not owned by dash.

If you start with bookworm, the debconf stuff is gone and your
dpkg-reconfigure does nothing. In order to change your default shell,
you get to remove dash's diversion and do your own diversion. dash will
not touch this diversion.

Before you upgrade to trixie, you'll have to add a diversion for
/usr/bin/sh.

If you start with trixie, debconf and diversion management is gone. All
that remains is code for removing the previous default diversion of
dash. You get to do your own diversion and dash will not touch it.
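
As a rough sketch of what "do your own diversion" could look like, the
helper below prints dpkg-divert invocations for both the aliased and the
canonical path. The function name, the RUN variable and the ".distrib"
target names are assumptions chosen for illustration; by default nothing
is executed.

```shell
# Hypothetical helper: prints (or, with RUN=env, runs) local diversions
# for /bin/sh and /usr/bin/sh. The .distrib targets are an assumed naming.
divert_sh_locally() {
    run=${RUN:-echo}    # default is a dry run that only prints the commands
    for path in /bin/sh /usr/bin/sh; do
        $run dpkg-divert --local --no-rename \
            --divert "$path.distrib" --add "$path"
    done
}
```

After reviewing the printed commands, they could be run as root,
followed by pointing /bin/sh at the desired shell.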

I see how the changes to /bin/sh management feel annoying to you, but the
target state bears quite some benefits in my view:
 * In forky, we will remove all maintainer script snippets related to
   management of /bin/sh (i.e. the remaining diversion cleanup).
   These maintainer scripts posed significant maintenance costs earlier.
 * It is rare to want to change /bin/sh and local diversions are a
   suitable and established mechanism to change package-installed files.
 * Managing debconf stuff from scripting can be more difficult than
   creating a local diversion. For the scripting use case it actually
   becomes easier.

> There is absolutely no reason to force files to move, given they
> are now aliased already *anyway*.

The reason to force the file move is to entirely remove the problem
categories listed at https://subdivi.de/~helmut/dep17.html. Removing
entire classes of problems reduces the collaborative long-term
maintenance cost of the distribution, but it incurs the one-time cost of
performing the move. I see that you disagree with our judgement of the
cost-benefit trade-off, but we have wide agreement inside Debian with
this view. I had long arguments about this a year ago, listened to
concerns and raised my own. I suggest that the time for arguing on this
matter is over unless you bring new information to the table.

Helmut



Bug#1074014: Bug#1073622: Bug#1073608: mksh, pax: no move to /usr going to happen, because:

2024-08-07 Thread Helmut Grohne
Hi Thorsten and Russ,

thanks for dissecting the disagreement. Your reply helped me better
understand what Thorsten probably sees as a problem.

On Tue, Aug 06, 2024 at 05:23:35PM -0700, Russ Allbery wrote:
> Second, you believe the existing migration strategy will not work safely
> for mksh because of the potential /bin/sh symlink.  Helmut, do you
> disagree with this?  I'm not sure I'm clear on the precise point of
> disagreement: is the argument a factual disagreement about the behavior of
> the tools and the upgrade process, or an argument about acceptable risk?

I was looking at this too narrowly from a mksh-perspective only and I
still think that the addition of dh_movetousr to mksh does not worsen
the situation on the mksh side. What I didn't see as clearly earlier is
that the way people tend to use mksh is by adding a local diversion for
/bin/sh and such a diversion is subject to DEP17 problems and in
particular, it is rendered ineffective by dash/0.5.12-9, which moves
/bin/sh to /usr/bin/sh. Say I have a bookworm system with mksh and
/bin/sh locally diverted. Now I upgrade dash to trixie. In that process,
dpkg will honour the diversion during deletion and then see that no
diversion affects the new location /usr/bin/sh, happily overwriting
/usr/bin/sh (and via aliasing /bin/sh) and breaking the user's earlier
choice to link /bin/sh to lksh.

> Both bash and dash have already done this migration; how did they handle
> this problem?  Presumably they are at least as widely used as /bin/sh as
> mksh is, and I don't recall this breaking people's systems.  Perhaps I
> missed those problems?

bash and dash earlier had a mechanism based on package-issued diversions
and debconf. I managed to remove this mechanism before the release of
bookworm and now the only supported way of changing /bin/sh is local
diversions. Indeed, bash and dash did not handle this at all as we
deemed messing with local diversions to be too much risk of getting the
user's intention wrong. Rather, we will be extending the release-notes
and adding a section on local diversions asking users to duplicate them
before upgrading.

Given that diverting /bin/sh is a more common use case, I think it is
fair to add a check to dash.preinst for /bin/sh being diverted and
/usr/bin/sh not and in that case abort the upgrade giving users the
chance to fix their system before moving forward. I will be filing a bug
with a patch for dash later (hopefully today).
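
The check could look roughly like the sketch below. This is not the
actual patch: the function name is hypothetical, and it relies on
dpkg-divert --truename printing the path unchanged when no diversion
applies.

```shell
# Hypothetical check: succeeds when /bin/sh is diverted but /usr/bin/sh
# is not. $1 and $2 are the truenames of /bin/sh and /usr/bin/sh.
sh_diversion_incomplete() {
    [ "$1" != /bin/sh ] && [ "$2" = /usr/bin/sh ]
}

# Possible use in a preinst (commented out, needs a dpkg-based system):
# if sh_diversion_incomplete "$(dpkg-divert --truename /bin/sh)" \
#                            "$(dpkg-divert --truename /usr/bin/sh)"; then
#     echo "/bin/sh is diverted, but /usr/bin/sh is not" >&2
#     exit 1
# fi
```

Passing the truenames as arguments keeps the decision itself separate
from the dpkg-divert calls, so it can be tried out on any system.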

So thanks to your interaction, I now see how there is a problem for
mksh, but it is not introduced nor worsened by using dh_movetousr in
mksh.

> I don't think this is really open for discussion at the Policy Editor
> level since my understanding is that the CTTE has decided that this is how
> we're going to do the transition.  In the case where this approach risks
> harm to the user's system, obviously that is something that needs to be
> analyzed and appropriately addressed, but in the typical case, no, the
> files in the packages should move so that we get to the more predictable
> and easier-to-reason-about end state that was the goal of the migration
> fix adopted by the CTTE.

I don't think the CTTE has actually issued a ruling on DEP17 or
/usr-move short of repealing the moratorium in order to enable moving
forward. The initial DEP17 as I proposed it suggested leaving all files
in place and enabling dpkg to understand the aliasing. However, that is
not the solution that consensus emerged around. I then adopted and
pursued the /usr-move path that I perceived as reaching most agreement.
There are two occasions where this could be seen as having been vetted.
One is elaborate discussions on d-devel with consensus summaries that
have not been objected to. The other is a transition bug that has been
acknowledged by the release team. In any case, I do not think we can use
the CTTE to back up my proposed policy change.

Helmut



Bug#1074014: encode mandatory merged-/usr into policy

2024-07-26 Thread Helmut Grohne
Hi,

On Fri, Jun 21, 2024 at 08:27:57PM +0200, Helmut Grohne wrote:
> given the progress we have made with /usr-move and DEP17, I think it is
> time to consider encoding the changes into policy. As of this writing,
> there are 216 source packages in unstable that still install into
> aliased locations and their number has been dropping for a while. All
> but very few packages have bug reports of important severity and will
> have their severity upgraded to serious on August 6th.

118 source packages as of this writing.

I think we have quite positive feedback from both policy editors and
others. We have a proposed wording and we have seconds from
 * Chris Hofstaedtler
 * Holger Levsen
 * Jochen Sprickerhof
 * Luca Boccassi
 * Michael Biebl

I am happy to move forward with or without Russ' proposed addition,
however you see fit. I generally think it is good advice:

| Since paths either with or without /usr are supported on Debian
| systems, maintainers of non-native packages are encouraged to follow
| the same conventions as the upstream package when referencing absolute
| paths.  There is no need to change upstream code from, for example,
| /bin to /usr/bin (or from /usr/bin to /bin) when packaging for Debian.

Can a policy editor follow up with instructions on where we are (from a
policy procedures point of view) and what needs to be done to move this
proposal forward?

Thanks

Helmut



Bug#1075856: Clarify filename conflicts for programs

2024-07-06 Thread Helmut Grohne
Hi,

On Sat, Jul 06, 2024 at 06:29:20PM +0200, Chris Hofstaedtler wrote:
> every so often packages install different, unrelated programs into
> different directories on the PATH. This often goes unnoticed for a
> long time, thus changing it later becomes harder.
> 
> I think policy already forbids this with the existing wording in
> 10.1 - it says "filenames" and not "paths". I think this should be
> made more explicit.
> 
> Today this is *in Debian* often "only" a problem for the root user,
> which has /sbin on the default PATH. But some of our downstreams
> always have /sbin on the PATH, and it also seems adding /sbin is a
> popular customization, etc.

I welcome this change having been bitten by this myself. The current
behaviour is surprising in a bad way.

> diff --git a/policy/ch-files.rst b/policy/ch-files.rst
> index b34c183..40bfa42 100644
> --- a/policy/ch-files.rst
> +++ b/policy/ch-files.rst
> @@ -7,7 +7,9 @@ Binaries
>  
> 
>  Two different packages must not install programs with different
> -functionality but with the same filenames. (The case of two programs
> +functionality but with the same filenames. This also applies when they
> +are installed into different directories on the default (user or root)
> +``PATH``. (The case of two programs
>  having the same functionality but different implementations is handled
>  via "alternatives" or the "Conflicts" mechanism. See
>  :ref:`s-maintscripts` and

I second the change and the wording, but caution on the order and
timing. I recommend filing all relevant problems as a MBF prior to
changing policy. You may use dumat to gauge this problem:

SELECT *
  FROM content AS c1 JOIN content AS c2
       JOIN package AS p1 JOIN package AS p2
 WHERE c1.filename LIKE 'bin/%' AND c2.filename LIKE 'sbin/%'
   AND 's' || c1.filename = c2.filename
   AND c1.pid = p1.id AND c2.pid = p2.id
   AND p1.name != p2.name;

You may download a suitable DB from
https://subdivi.de/~helmut/dumat.sql.zst.removethis (link intentionally
broken to prevent crawlers) or generate one yourself using dumat. You
should see fewer than 700 occurrences with significant repetition, so I
expect fewer than 50 bug reports. Once these bugs are filed and have
their severity upgraded to at least important, I have no objection to
including the change in policy. I do not intend to perform this work.
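
For a quick look without the database, the same idea can be sketched on
a plain listing. The input format ("package path" per line) and the
function name are assumptions, this is not how dumat works, and when
several packages ship the same name only the last one seen is kept.

```shell
# Hypothetical helper. stdin: lines of "package path", for example
# "foo usr/bin/tool". Prints "name bin-package sbin-package" for every
# program name that one package ships in bin/ and another in sbin/.
find_bin_sbin_clashes() {
    awk '
        { n = split($2, parts, "/"); base = parts[n]; dir = parts[n - 1] }
        dir == "bin"  { bin[base] = $1 }
        dir == "sbin" { sbin[base] = $1 }
        END {
            for (b in bin)
                if ((b in sbin) && (bin[b] != sbin[b]))
                    print b, bin[b], sbin[b]
        }
    '
}
```

Feeding it the three lines "a usr/bin/tool", "b usr/sbin/tool" and
"c usr/bin/other" reports only tool, owned by a and b.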

Helmut



Bug#1074014: encode mandatory merged-/usr into policy

2024-06-22 Thread Helmut Grohne
Hi Russ,

On Fri, Jun 21, 2024 at 02:06:05PM -0700, Russ Allbery wrote:
> I spent some time thinking about this, since I personally still wish
> people wouldn't write /usr/bin/sh when they mean /bin/sh.  I don't think
> Policy should prohibit this since, among other reasons, we have no
> effective enforcement mechanism and the package will clearly work fine on
> Debian systems.  But it would be nice if people didn't gratuitously break
> portability, admittedly mostly to non-Linux systems at this point.

I agree with your point of view. Elsewhere, when people asked whether
they should move their file references, I argued for "no". I think my
wording does not explicitly discourage those changes but makes it clear
that they are not required. When we suggested changing the dynamic
loader path, there was significant opposition and similarly /usr/bin/sh
was not popular. Hence, there is no way of going without those
links in the foreseeable future. Is there any way we can further clarify
this in the proposed wording?

> That said, I think I convinced myself that this is just not something
> Policy can reasonably address.  We should state the assumption as you
> stated it since that's required for packages to use /bin/sh at all, and
> this probably is not the place to give people portability advice,
> particularly since it only applies to things that might be copied from
> Debian to a non-Debian system and most of those aren't under our control
> and will never be written to Policy anyway.

Portability is one angle and certainly an important one. Spending
collective project resources is another one. I argue that changing these
paths beyond what is technically necessary is not a good use of our
time. So how about having policy recommend not changing path references
compared to upstream? I don't think this should be a policy requirement
as there may be good reasons to deviate and we can rationalize this
recommendation with the portability and the effort arguments. I
recognize that it only partially addresses your portability concern as
native packages as well as packages where Debian is upstream can freely
change those references without violating the recommendation, but those
tend to be a minority and a significant portion of them tends to not
work on non-Debian systems anyway.

> > Questions:

I have another question. Thorsten Glaser was unhappy about my mksh
report as he believes that it should be /bin/mksh and not /usr/bin/mksh.
I argued that the biggest concern is the symlink vs directory conflict
and he came up with a crazy solution where mksh's data.tar contains
./bin/mksh but not ./bin on the grounds that ./bin is provided by an
essential package in all Debian releases. I think this approach
practically solves a significant chunk of the problems listed by DEP17,
but it still confuses QA tools and e.g. dpkg -S (maybe more). My
proposal here would make mksh's approach violate policy. Should policy
allow Thorsten's approach? It certainly is something that needs to be
forbidden for any transitively essential package, or bootstrapping tools
will fail.

Helmut



Bug#1074014: encode mandatory merged-/usr into policy

2024-06-21 Thread Helmut Grohne
Package: debian-policy
Version: 4.7.0.0
X-Debbugs-Cc: bl...@debian.org,m...@debian.org,mbi...@debian.org,z...@debian.org

Hi,

given the progress we have made with /usr-move and DEP17, I think it is
time to consider encoding the changes into policy. As of this writing,
there are 216 source packages in unstable that still install into
aliased locations and their number has been dropping for a while. All
but very few packages have bug reports of important severity and will
have their severity upgraded to serious on August 6th.

Generally speaking, DEP17 says that no package should install any files
below /bin/, /lib*/ and /sbin/. Doing so would amount to a symlink vs
directory conflict with base-files, which now installs symlinks at the
relevant locations. What happens with these locations depends on the
order of unpacks. In many cases, this is not a problem, because
base-files is essential and thus unpacked early. Other than that,
running dpkg-deb -x foo.deb / causes these symlinks to be overwritten
with actual directories possibly breaking the installation. We currently
have mitigations for these problems in place and plan to drop them after
trixie.

For these reasons, I propose changing section 10.1 and encoding the
avoidance of symlink vs directory conflicts into policy. To get a
discussion going, I suggest the following update.

- To support merged-/usr systems, packages must not install files in both
- /path and /usr/path. For example, a package must not install both
- /bin/example and /usr/bin/example.
+ Since base-files implements mandatory merged-/usr by installing the
+ aliasing symbolic links, other packages must not install files into
+ aliased paths such as /bin, /lib, /lib* or /sbin. The package manager is
+ not prepared to deal with such aliasing and in prohibiting the
+ installation into aliased locations, we avoid triggering undefined
+ behaviour. Conversely, packages may assume that /bin, /lib and /sbin are
+ symlinks at all times and that their files below /usr/bin, /usr/lib and
+ /usr/sbin are also accessible via their aliased locations.
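
A package could be checked against such a rule with something like the
sketch below. The function name is hypothetical and the patterns simply
mirror the "/bin/, /lib*/ and /sbin/" shorthand used above, so they are
broader than a precise list of aliased directories.

```shell
# Hypothetical helper: reads tar member names (such as "./bin/foo") on
# stdin and prints those that sit in an aliased top-level location.
list_aliased_members() {
    while IFS= read -r member; do
        case "$member" in
            ./bin/*|./sbin/*|./lib*/*) printf '%s\n' "$member" ;;
        esac
    done
}

# Possible use on a real package (commented out, needs dpkg-deb):
# dpkg-deb --fsys-tarfile foo.deb | tar -t | list_aliased_members
```

An empty result would mean that the data.tar installs nothing below the
aliased paths.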

I suspect that this is not perfect, but it is hopefully good enough for
entering the discussion.

Questions:
 1. Do you agree that policy should be changed?

 If yes:

 2. Do you agree that policy should prohibit installing into aliased
    paths?
 3. Do you agree that the current progress is sufficient for changing
    policy? If not, when can we change policy?
 4. Do you agree with the proposed wording? Can you suggest
    improvements?
 5. Given earlier disagreement on this matter, should we discuss this
    matter in a wider setting such as d-devel?

Thanks for considering

Helmut



Bug#1057199: debian-policy: express more clearly that Conflicts do not reliably prevent concurrent unpacks

2023-12-01 Thread Helmut Grohne
Package: debian-policy
Version: 4.6.2.0
X-Debbugs-Cc: debian-d...@lists.debian.org, de...@lists.debian.org

Hi,

first of all huge thanks to David, Guillem and Julian for all of their
explanations. In large parts, this bug report is yours and I'm just the
one writing it down.

§7.4 currently starts with:

| When one binary package declares a conflict with another using a
| Conflicts field, dpkg will refuse to allow them to be unpacked on
| the system at the same time.

I believe this is technically wrong. There are situations where dpkg
will allow such unpacks to temporarily co-exist. §6.6 goes into further
detail and is accurate.

Suppose we have two arch:all packages a version 1 and b version 1 both
of which are installed. Now we attempt to install a version 2, which
happens to declare "Conflicts: b (<< 2)". We may therefore mark b for
removal

echo "b:all deinstall" | dpkg --set-selections

and proceed to installing a:

dpkg --auto-deconfigure --unpack a_2.deb

When we do this, dpkg will unpack a version 2 before removing the files
of b version 1. I argue this is very briefly allowing these packages to
be unpacked at the same time as the next thing dpkg does is removing b's
files.

This situation can be forced if we add package b version 2, which
declares "Breaks: a (<< 2)" and attempt to install both. apt figures
that it has to temporarily remove b and hence issues the selection
above. Then it proceeds to unpacking both packages.

The difference actually is rather subtle. As dpkg is tracking ownership
of files, one should not be observing a difference. What one can see is
that a.preinst version 2 is run at a time where b version 1 is still
unpacked (and that's fine as the statement only talks about unpack). The
effects of concurrent unpack are theoretically not observable, due to
dpkg tracking files. However when you add aliasing to the mix, dpkg can
now delete files that are still needed via differences in aliasing. That
way - and I am fully aware that this violates fundamental assumptions of
dpkg - we can make the order of unpacks visible and demonstrate that
indeed a version 2 is unpacked before b version 1 has its files removed.
All of this is fully in line with the long description in §6.6. What I
take issue with is the executive summary at the start of §7.4.

In case you like some kind of test case to tinker with, I'm attaching a
script that demonstrates the situation.

Helmut


conflict-demo.sh
Description: Bourne shell script


Bug#1051801: document DEB_BUILD_OPTIONS value nopgo

2023-09-12 Thread Helmut Grohne
Package: debian-policy
Version: 4.6.2.0
Severity: wishlist
X-Debbugs-Cc: debian-cr...@lists.debian.org,rb-gene...@lists.reproducible-builds.org

Hi,

more and more packages implement a technique called profile guided
optimization. The general idea is that it performs a build that is
instrumented for profiling first. It then runs a reasonable workload to
collect profiling data, which in turn is used to guide the optimizer of
a second build which is not thus instrumented. The idea is that this
second build probably is faster than a regular build.

Quite obviously this approach completely breaks cross building. It also
is unclear how it affects reproducible builds since such builds depend
on the performance characteristics of the system performing the build.
This makes it very obvious that the pgo technique has downsides that
warrant disabling it in some situations.

A number of packages have agreed on disabling such optimization when
DEB_BUILD_OPTIONS contains nopgo. I'm aware of the following packages:
 * binutils
 * cross-toolchain-base
 * gcc-VER
 * halide
 * pythonVER

I'll also be filing a patch for foot to support this option.

Is this sufficient coverage to document the option already? If not, this
bug report can serve as a central point for discussing it and its
adoption.

Proposed wording:

| This tag requests that any optimization performed during the build
| should not rely on performance characteristics captured during the
| build. Such optimization is usually called profile guided
| optimization.

The proposed tag intentionally is fairly narrow. It does not cover link
time optimization. It also does not cover the case where profiling
information is recorded ahead of upload and included in the source
package[1]. In both cases, neither cross building nor reproducibility is
impacted.

As for cross builds, it is not clear to me where we want to put the
responsibility to disable pgo. At the time of this writing, most
packages automatically disable pgo when performing a cross build. On the
flip side, we could have any cross builder set nopgo like they set
nocheck already. Doing so would allow performing a pgo-enabled cross
build to i386 on amd64 while still benefiting from the larger address
space for instance. I consider this aspect to be a separate matter
though.
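
For illustration, a build script could honour the option along these
lines. This is a sketch only: the function name is hypothetical, it
assumes DEB_BUILD_OPTIONS is a space-separated list, and it leaves the
cross build question untouched.

```shell
# Hypothetical helper: succeeds unless DEB_BUILD_OPTIONS contains nopgo.
pgo_wanted() {
    case " ${DEB_BUILD_OPTIONS:-} " in
        *" nopgo "*) return 1 ;;   # the builder asked for no PGO
    esac
    return 0
}

# Possible use (commented out; the configure flag is a made-up example):
# if pgo_wanted; then ./configure --enable-pgo; else ./configure; fi
```

Padding the value with spaces on both sides keeps an unrelated option
that merely contains the substring "nopgo" from matching.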

Helmut

[1] https://lists.reproducible-builds.org/pipermail/rb-general/2022-June/002638.html



Bug#1051371: debian-policy: stop referring to legacy filesystem paths for script interpreters

2023-09-07 Thread Helmut Grohne
Hi Luca,

On Wed, Sep 06, 2023 at 10:50:14PM +0100, Luca Boccassi wrote:
> Package: debian-policy
> X-Debbugs-Cc: j...@debian.org hel...@subdivi.de
> 
> Debian only supports merged-usr since Bookworm. We should update policy
> to reference /usr/bin/sh and similar paths to describe recommended
> shebangs for scripts.

I disagree. The promise of merged-/usr has always been that both paths
are valid. /bin/sh remains the location recommended by external
standards and (like the dynamic loader path) should remain the way it
is.

> I heard many times the policy maintainers mention something along the
> lines of 'policy should not be a hammer to beat other maintainers
> with'. Today I saw policy being used to force a maintainer to re-add
> support for the deprecated and unsupported split-usr filesystem layout,
> as 'policy only mentions /bin/sh, not /usr/bin/sh'.

This can also be addressed by adding a note to policy that allows
maintainers to rely on the aliasing. If there was a need to refer to the
shell via /usr/bin/sh, we would aim for eventually removing the aliasing
symlinks. That's not what we're up to.

> So let's update the policy to refer to modern and supported filesystem
> paths as adopted by Debian de-facto and de-jure, and stop other
> maintainers from getting beaten with it.

I don't think this is right. We intend to finalize the /usr-merge
transition by moving files from / to /usr. This is an implementation
strategy that arises from the constraints set by the current
implementation of dpkg and other components. It is not a new filesystem
layout that we expect upstreams to support. Rather, we promised to
upstreams that both ways will work. The aspect that in a data.tar we'll
have to install files to /usr is a technical one and can be supported by
debhelper. Still, packages may assume that referencing files they
installed to /usr via aliased paths in / will continue to work.
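
The aliasing promise can be checked mechanically. Below is a minimal
sketch in Python, assuming a merged-/usr system where /bin is a symlink
to usr/bin. The helper name same_file_via_alias is made up for this
illustration and is not part of any Debian tool.

```python
import os


def same_file_via_alias(aliased, canonical):
    """Return True if both paths exist and resolve to the same file."""
    try:
        return os.path.samefile(aliased, canonical)
    except OSError:
        # At least one path does not exist, e.g. on a split-/usr layout
        # where the file was only ever installed in one location.
        return False


if __name__ == "__main__":
    # On a merged-/usr system both pairs are expected to compare equal.
    for aliased, canonical in (("/bin/sh", "/usr/bin/sh"),
                               ("/bin/ls", "/usr/bin/ls")):
        print(aliased, canonical, same_file_via_alias(aliased, canonical))
```

A consumer that only reads the file does not need to care which of the
two spellings it uses, which is the point of the weaker policy wording
proposed below.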

> Patch attached and also pushed to
> https://salsa.debian.org/bluca/policy/-/tree/bin_sh

Nack to this particular change, but I agree that it is worth considering
two changes to policy sooner and later:
 * Making it explicit that referring to files via either path for
   read-only consumption is ok.
 * DEP17 aims for not installing any files in aliased locations and we
   should encode that in policy once there is wide adoption of this rule
   in binary packages.

Would you agree to repurpose this bug to propose the former change?
While my variant is weaker, it still prevents people from using policy
to require supporting split-/usr.

Helmut



Bug#945269: debian-policy: packages should use tmpfiles.d(5) to create directories below /var

2023-06-05 Thread Helmut Grohne
On Sun, Jun 04, 2023 at 02:56:59PM +0100, Simon McVittie wrote:
> I think one way or another, if anyone is going to set a package-level
> dependency on systemd-tmpfiles, the first (preferred) dependency needs to
> be on either a concrete provider (systemd or systemd-tmpfiles-standalone
> in this case), or a default-systemd-tmpfiles virtual package
> that only has one provider per architecture (which is the way
> {default-,}dbus-{system,session}-bus are handled). Otherwise, you
> can get a non-deterministic choice of default implementation, which
> seems strictly worse than either depending on systemd or depending on
> systemd-tmpfiles-standalone - if you're unlucky, it can have all the
> disadvantages of either one of those.

Thank you for the elaborate writeup. There is little to add to what you
write except for one minor aspect.

>   - actual result: apt's heuristic might have difficulty realising that
> it needs to do that

I think we should be able to guide apt here. I recently had to look into
Replaces and in that process I also had to re-read policy section 7.6.2.
It details the "other" use of Replaces to guide a package manager (e.g.
apt) for changing implementations of an interface - which is exactly
what we are talking about here. In essence, it says that we should do:

Provides: systemd-tmpfiles
Conflicts: systemd-tmpfiles
Replaces: systemd-tmpfiles

And systemd-standalone-tmpfiles does that. :) But systemd does not. :(
systemd misses out on Conflicts and Replaces. I guess (but have not
verified) that once these are added, apt would be happier to "upgrade"
systemd-standalone-tmpfiles to systemd when needed.
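
Whether a package follows the Provides/Conflicts/Replaces pattern can be
checked with a few lines of code. The following Python sketch uses a
deliberately simplistic control stanza parser; the function names and
the two abbreviated stanzas in the usage note are illustrations based on
the description above, not copies of the real control files.

```python
def parse_stanza(text):
    """Parse one control stanza into a dict (continuation lines are folded)."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and key:
            fields[key] += " " + line.strip()
        else:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


def names(value):
    """Extract bare package names from a relationship field value."""
    result = set()
    for alternative in value.replace("|", ",").split(","):
        # Drop version constraints and architecture qualifiers.
        name = alternative.strip().split(" ")[0].split("(")[0].split(":")[0]
        if name:
            result.add(name)
    return result


def missing_fields(stanza, virtual):
    """Return the fields of the pattern that do not name `virtual`."""
    return [field for field in ("Provides", "Conflicts", "Replaces")
            if virtual not in names(stanza.get(field, ""))]
```

For a stanza that carries all three fields, missing_fields() returns an
empty list. For a stanza that only has "Provides: systemd-tmpfiles", it
returns ["Conflicts", "Replaces"], which matches the gap described
above.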

I've also experimented with a minimal chroot, installed the standalone
tools and then asked apt to install libbiometric0 (which happens to have
a dependency on systemd) and apt was quite happy with removing the
standalone variants. This is still missing the consumers of the provided
facilities though, so it might not be representative.

Is there any concrete evidence of apt having difficulties in a real
situation? Or maybe a constructed example demonstrating this? Thanks for
being cautious, but I'd also like to understand whether this is
hypothetical or real.

Helmut



Bug#1020323: debian-policy: document DPKG_ROOT

2022-10-05 Thread Helmut Grohne
Hi Johannes,

On Wed, Oct 05, 2022 at 02:35:30PM +0200, Johannes Schauer Marin Rodrigues wrote:
>To enable creating a foreign architecture Debian chroot during the early
>bootstrap of a new Debian architecture, maintainer scripts and utilities
>called by maintainer scripts of packages in the essential and
>build-essential set, should support operating on a custom chroot directory.
>This is to avoid running any of the foreign architecture utilities from the
>chroot, because those cannot be executed during the early bootstrapping
>phase of a new architecture.  Instead, by avoiding the chroot() call,
>utilities from the outside should operate on the chroot path given via the
>`DPKG_ROOT` environment variable.  This environment variable is set but
>empty during normal package installations.  If the `DPKG_ROOT` environment
>variable is not empty, then this indicates to the maintainer scripts and the
>tools it executes, that a chroot is being built as part of an early
>architecture bootstrap and all operations should be performed in the chroot
>path given by the contents of the `DPKG_ROOT` environment variable. In that
>case, the maintainer script should not modify anything outside the chroot
>directory.

Thank you for writing this.

> I refrained from using "must" because we promised maintainers that they would
> not need to do the work themselves but will get patches sent from us. We do not
> want to force work on maintainers by making it an RC bug if they do not support
> DPKG_ROOT.
> 
> Helmut, what do you think?

I think this text is already quite good. I am still wondering about the
scope of support that we mention here.

1. You write that we want essential + build-essential. In practice, we
   also want things such as apt or systemd. I am wondering whether we
   should rephrase this in a less specific way that leaves open some
   packages beyond the mentioned set. Vagueness can be avoided by
   explaining the purpose: We target packages that are relevant to
   setting up an initial build daemon.

2. We should likely mention that package upgrade and removal paths can
   freely ignore DPKG_ROOT. Maintainer scripts can assume that when
   DPKG_ROOT is in effect, it will be an initial installation. Something
   along these lines would have helped Michael in determining whether his
   recent changes to init script handling would affect DPKG_ROOT.

Let me try to extend your text:

To enable creating a foreign architecture Debian chroot during the early
bootstrap of a new Debian architecture, maintainer scripts and utilities
called by maintainer scripts of packages relevant to setting up a
build daemon, should support operating on a custom chroot directory.
[... keep rest of the text unchanged ...]
Support for `DPKG_ROOT` in code that handles package upgrades or
package removal is not needed.
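
To make the path handling concrete, here is a minimal sketch. Maintainer
scripts are normally shell, so this Python only illustrates the idea of
prefixing every path with the value of `DPKG_ROOT`; the helper names
rooted and ensure_directory are hypothetical.

```python
import os


def rooted(path, environ=None):
    """Prefix an absolute path with DPKG_ROOT (empty or unset means '/')."""
    environ = os.environ if environ is None else environ
    root = environ.get("DPKG_ROOT", "")
    # Plain concatenation, in the spirit of writing "$DPKG_ROOT/etc/foo"
    # in a shell script: an empty value leaves the path unchanged.
    return root + path


def ensure_directory(path, environ=None):
    """Create a directory inside the target root, never outside of it."""
    target = rooted(path, environ)
    os.makedirs(target, exist_ok=True)
    return target
```

During a normal installation the variable is empty and rooted() returns
the path as is. During a chrootless bootstrap all file operations land
below the chroot directory.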

Helmut



Bug#970234: consider dropping "No hard links in source packages"

2022-09-22 Thread Helmut Grohne
Hi Russ,

On Thu, Sep 22, 2022 at 07:20:00PM -0700, Russ Allbery wrote:
> From 12b014c4b930577a728dfb1254b64aac6a5eb1e0 Mon Sep 17 00:00:00 2001
> From: Russ Allbery 
> Date: Thu, 22 Sep 2022 19:15:52 -0700
> Subject: [PATCH] Allow hard links in source packages
> 
> It's not clear why this restriction was in place, and Debian
> included a package containing hard links without anyone noticing
> in the last release.
> ---
>  policy/ch-source.rst | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/policy/ch-source.rst b/policy/ch-source.rst
> index c7415fc..a7df539 100644
> --- a/policy/ch-source.rst
> +++ b/policy/ch-source.rst
> @@ -282,8 +282,8 @@ source files in a package, as far as is reasonably possible.  [#]_
>  Restrictions on objects in source packages
>  --
>  
> -The source package must not contain any hard links,  [#]_ device special
> -files, sockets or setuid or setgid files.. [#]_
> +The source package must not contain device special files, sockets, or
> +setuid or setgid files. [#]_
>  
>  .. _s-debianrules:
>  
> @@ -918,13 +918,6 @@ must not exist a file ``debian/patches/foo.series`` for any ``foo``.
> would be nice if the modification time of the upstream source would
> be preserved.
>  
> -.. [#]
> -   This is not currently detected when building source packages, but
> -   only when extracting them.
> -
> -   Hard links may be permitted at some point in the future, but would
> -   require a fair amount of work.
> -
>  .. [#]
> Setgid directories are allowed.
>  

Seconded.

Helmut



Bug#983657: debian-policy: weaken manual page requirement

2021-02-28 Thread Helmut Grohne
On Sun, Feb 28, 2021 at 10:53:20AM -0700, Sean Whitton wrote:
> Can you post a patch just doing the moving manpages to dependencies part
> and indicate that you are seeking seconds?  Then we can get that
> applied.

I call for seconds on:

--- a/policy/ch-docs.rst
+++ b/policy/ch-docs.rst
@@ -12,9 +12,9 @@
 "cat page".
 
 Each program, utility, and function should have an associated manual
-page included in the same package. It is suggested that all
-configuration files also have a manual page included as well. Manual
-pages for protocols and other auxiliary things are optional.
+page included in the same package or a dependency. It is suggested that
+all configuration files also have a manual page included as well.
+Manual pages for protocols and other auxiliary things are optional.
 
 If no manual page is available, this is considered as a bug and should
 be reported to the Debian Bug Tracking System (the maintainer of the

Helmut



Bug#983657: debian-policy: weaken manual page requirement

2021-02-28 Thread Helmut Grohne
On Sun, Feb 28, 2021 at 11:58:08AM +0100, Bill Allombert wrote:
> On Sun, Feb 28, 2021 at 08:29:21AM +0100, Helmut Grohne wrote:
> > So this is actually asking for two distinct things:
> >  * Allow moving manual pages to dependencies
> >  * Allow demoting such dependencies to recommends
> > 
> > A possible wording in ch-docs.rst could be:
> >  Each program, utility, and function should have an associated manual
> > -page included in the same package. It is suggested that all
> > +page included in the same package or one of its dependencies or
> > +recommended packages. It is suggested that all
> >  configuration files also have a manual page included as well. Manual
> >  pages for protocols and other auxiliary things are optional.
> > 
> > What do you think?
> 
> The goal is to avoid program to be installed but not their manpages,
> so generally I do not find Recommends to be enough.

If we cannot build consensus around that second part, so be it. But
maybe the other part (moving manual pages to dependencies) can reach
consensus?

Helmut



Bug#983657: debian-policy: weaken manual page requirement

2021-02-27 Thread Helmut Grohne
Package: debian-policy
Version: 4.5.1.0
Severity: wishlist

I think that the Debian policy is unreasonably strict in its manual page
requirement. While the common case is that manual pages are small and
should be included in the same package, occasionally they are numerous
and moving them to a separate package makes sense. Other times, there
already is a -common or -doc package and including them there would be
possible without increasing the package count. Doing so often allows
demoting dependencies to Build-Depends-Indep and thus reducing bootstrap
problems.

I therefore think that the policy should explicitly allow manual pages
to be shipped in a dependency. We can see that this already is
established practice from this non-exhaustive list:
 * aptitude -> aptitude-common
 * assaultcube -> assaultcube-data
 * aumix -> aumix-common
 * auto-multiple-choice -> auto-multiple-choice-common
 * binutils -> binutils-common
 * bitlbee -> bitlbee-common
 * bup -> bup-doc (recommends)
 * cpp-10 -> cpp-10-doc (no relation, license re
 * critterding -> critterding-common
 * grass-core -> grass-doc
 * x3270 -> 3270-common

Beyond this, I think that a manual page does not warrant a strong
dependency given that man-db is not essential. Rather a recommendation
should be strong enough. I'm not sure whether this view is universal
though.

So this is actually asking for two distinct things:
 * Allow moving manual pages to dependencies
 * Allow demoting such dependencies to recommends

A possible wording in ch-docs.rst could be:
 Each program, utility, and function should have an associated manual
-page included in the same package. It is suggested that all
+page included in the same package or one of its dependencies or
+recommended packages. It is suggested that all
 configuration files also have a manual page included as well. Manual
 pages for protocols and other auxiliary things are optional.

What do you think?

Helmut



Bug#924401: #924401 base-files fails postinst when base-passwd is unpacked

2021-02-22 Thread Helmut Grohne
On Mon, Feb 22, 2021 at 07:33:10AM +, Tim Woodall wrote:
> A. /etc/passwd is part of base-passwd's interface and base-files is
>right in relying on it working at all times. Then base-passwd is rc
>buggy for violating a policy must. Fixing this violation is
>technically impossible.
> 
> 
> I seem to have hit this same issue independently.
> 
> Could you explain why "Fixing this violation is technically impossible"

The requirement here is that base-passwd needs to work when unpacked.
The only way to make that work is making /etc/passwd a conffile. That
would technically be possible, but it would be very annoying, because
this file is different on virtually any Debian installation. So we
cannot make it a conffile in practice. The next bet would be ensuring
that base-passwd.postinst is run before other packages' postinst somehow.
Such an ordering mechanism does not exist at present and it would be
prone to dependency loops.

> As far as I can see, making base-passwd not essential, only required,
> and then making passwd and base-files pre-depend on base-passwd the
> system seems to bootstrap /etc/passwd and /etc/group OK.

What you write is almost certainly self-contradictory. base-files is
essential. Anything it depends on (including base-passwd in your
scenario) is pseudo-essential and thus inherits all the same
requirements except for actually being essential. You gained nothing.
And you didn't explain how you'd make base-passwd non-essential.

> That also seems to conform to the debian policy. The oddity is that
> base-files and passwd only actually need to depend on base-passwd, not
> pre-depend on it as they only use /etc/passwd and /etc/group in the
> postinst scripts but the debian policy doesn't seem to consider this
> case.

They don't have to depend on base-passwd at all, because dependencies on
essential packages should be omitted.

I suggest that you elaborate on the practical issue you have been hitting.
Doing so allows evaluating prospective solutions against all relevant
use cases.

Helmut



Bug#970234: consider dropping "No hard links in source packages"

2020-10-12 Thread Helmut Grohne
Hi cate,

On Mon, Oct 12, 2020 at 04:10:00PM +0200, Giacomo Catenazzi wrote:
> The rationale was probably similar to symlinks: they may fail across
> different filesystems, and we supported to have e.g. / /usr /usr/share
> /usr/local /var (and various /var/*) /home /tmp /boot etc on different file
> systems. Now we are more strict on where we can split filesystems (and disk
> are larger, and LVM simplified much of filesystem handling).

You appear to be talking about binary packages. This bug is about source
packages. When you unpack a source package, you are creating a directory
hierarchy rooted at the point where you start unpacking. There is no
reasonable way to split your source package across multiple file
systems. This is very different from binary packages where the
underlying hierarchy is shared with other packages and directories
frequently already exist.

> I think a hardlink on same directory should be fine, or within directories
> which must be on the same filesystem.

I argue that all files within a source package are always located on the
same filesystem, because the unpack step creates the source package root
directory on one file system and everything else resides on that very
filesystem.
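
The claim is easy to verify for any unpacked tree. A small Python
sketch, with made-up helper names, that compares the device numbers of
all entries below a directory:

```python
import os


def devices_in_tree(top):
    """Return the set of st_dev values seen at and below `top`."""
    devices = {os.lstat(top).st_dev}
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            devices.add(os.lstat(os.path.join(dirpath, name)).st_dev)
    return devices


def on_single_filesystem(top):
    """True if every entry of the tree lives on one filesystem."""
    return len(devices_in_tree(top)) == 1
```

os.walk() does not follow symlinks by default and lstat() does not
follow them either, so only the entries of the tree itself are compared.
Hard links can only be created between entries that share a device
number.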

For binary packages, restricting the use of symlinks makes a lot more
sense to me.

Helmut



Bug#970234: consider dropping "No hard links in source packages"

2020-09-13 Thread Helmut Grohne
Package: debian-policy
Version: 4.5.0.3
Severity: wishlist

Jakub stumbled into the "No hard links in source packages" requirement
added around 1996 and couldn't make sense of it. Neither could Christoph
nor myself. tar does support hard links just fine. lintian does not
check this property. sugar-log-activity/38 is an example package
violating the property. It is shipped in buster and technically
rc-buggy though no bug is filed about it.

I believe that the requirement needs a rationale. Failing that, it
should be dropped.
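
For anyone who wants to look for affected packages, detecting hard links
in a tarball is simple. The following is a minimal sketch using Python's
tarfile module; the function name hard_links is made up, and this is not
how lintian or dpkg-source work internally.

```python
import tarfile


def hard_links(tar_path):
    """Return (name, target) pairs for all hard-link members of a tarball."""
    with tarfile.open(tar_path) as tar:
        # islnk() is true for hard links only; symlinks report issym().
        return [(member.name, member.linkname)
                for member in tar if member.islnk()]
```

An empty result means that the archive satisfies the current policy
wording with respect to hard links.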

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-15 Thread Helmut Grohne
Hi Santiago,

On Fri, Mar 15, 2019 at 11:58:12AM +0100, Santiago Vila wrote:
> blame for such bug, is annoying me. (So, Helmut, please file a bug
> in the bootstrapping tool which does not work for you, and do not
> try to fix it here).

I reject the view that multistrap is buggy. You cite undocumented
behaviour as a reason to mark it buggy. However, multistrap relies on
semantics presently assured by policy. Given that policy talks about
unpacked packages, applying it to bootstrap (in its present wording) is
reasonable. There is a bug somewhere between policy, base-files and
base-passwd, which is exactly how I filed it.  Once this bug is fixed
(in any one of these components), additional bugs can result from that.

I think at least Guillem and Santiago were arguing that policy should
not be applied to bootstrap. While I don't like that view, I do find it
reasonable. It can be made explicit in section 3.8 quite easily:

 Since dpkg will not prevent upgrading of other packages while an
 ``essential`` package is in an unconfigured state, all ``essential``
 packages must supply all of their core functionality even when
-unconfigured. If the package cannot satisfy this requirement it must not
+unconfigured after being configured at least once.
+If the package cannot satisfy this requirement it must not
 be tagged as essential, and any packages depending on this package must
 instead have explicit dependency fields as appropriate.

After doing so, we'll likely need to do something about mmdebstrap and
multistrap as well as furthering our utopia about declarative
replacements for maintainer scripts.

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-14 Thread Helmut Grohne
On Thu, Mar 14, 2019 at 07:50:27AM +0100, Johannes Schauer wrote:
> > I would certainly consider a lot cleaner to add a new field to base-files in
> > the form "Bootstrap-Depends: base-passwd" than converting all chowns in
> > postinst to use integer numbers.
> 
> I agree that we should not expect maintainers to write numeric user and group
> ids into their maintainer scripts. This is not only hard to write but also 
> hard
> to read and maintain. In my opinion, using numeric ids should only be a
> temporary measure until we have a declarative method or other helper that does
> the correct translation instead. But since no such helper exists right now,
> numeric ids are probably the best way to fix this bug for buster.

I object to this view. It was never suggested to have anyone write
numeric ids. What Simon suggested was writing symbolic names and have
the package build use static allocations to translate these symbolic
names to numeric ids at build time. This is a whole different story than
having to write them.
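
A minimal sketch of such a build-time translation, written in Python
here purely for illustration. The @UID:name@ / @GID:name@ placeholder
syntax is an assumption of this sketch, and the tables are an
illustrative subset. A real implementation would read the static
allocations from the authoritative source instead of hard-coding them.

```python
import re

# Illustrative subset of static allocations (assumption of this sketch).
STATIC_UIDS = {"root": 0}
STATIC_GIDS = {"root": 0, "mail": 8, "staff": 50}

PLACEHOLDER = re.compile(r"@(UID|GID):([a-z][a-z0-9_-]*)@")


def interpolate(template):
    """Replace @UID:name@ and @GID:name@ placeholders with numeric ids."""
    def lookup(match):
        table = STATIC_UIDS if match.group(1) == "UID" else STATIC_GIDS
        # An unknown name raises KeyError, so the build fails instead of
        # shipping a maintainer script with an unresolved placeholder.
        return str(table[match.group(2)])
    return PLACEHOLDER.sub(lookup, template)
```

The maintainer keeps writing symbolic names in the template, while the
generated postinst contains only numbers and therefore no longer needs
a working /etc/passwd.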

If we agree that this would be the best fix for buster, I volunteer to
write a patch for base-files to implement that. Doing so would be easier
if using a more featureful template interpolation language than sed. Do
you (Santiago) have any preference here? I could think of
m4/sh/perl/python. sed will work, but might be ugly.

On Thu, 14 Mar 2019 10:21:30 +0100, Santiago Vila wrote:
> The way I see it, if base-files fails during bootstrapping it's not
> because it does not "help" the bootstrapping tool, but because the
> bootstrapping tool didn't bootstrap base-passwd in the first place.

I think this view is difficult. How is a bootstrap tool supposed to know
that it must configure base-passwd before base-files? Where should we
document that? Basically everyone in this thread except you argued that
requiring such out-of-band knowledge is bad. And if that really is
required, I think policy should be a little explicit about that. Even
though Guillem found a paragraph that supports this view, it is quite
implicit at present.

> Now the question would be if we really need to add a paragraph to
> Debian Policy, "Recommendations/guidelines for bootstrapping tools",
> clearly stating that bootstrapping tools should bootstrap base-passwd
> before trying to configure base-files. I think that would be quite
> clear by now, but I could be wrong.

I actually don't think that policy should document this dependency,
because it really should be an implementation detail. From my
perspective, making it explicit that policy only applies post-bootstrap
is sufficient (e.g. copying the "configured at least once" language from
section 6.5).

If on the other hand, we require the literal interpretation that
base-files must be able to configure while other essential packages are
only unpacked and never configured, it all becomes a lot easier to
reason about. Simon's proposal implements that easily and is a
maintainable solution.

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-12 Thread Helmut Grohne
Hi Santiago,

On Tue, Mar 12, 2019 at 06:17:50PM +0100, Santiago Vila wrote:
> To be precise: Who is unpacking (but not configuring) a buster or
> unstable essential package set, if not a bootstrapping tool?

multistrap is doing just that.

https://manpages.debian.org/testing/multistrap/multistrap.1.en.html
| Once installed, the packages themselves need to be configured using
| the package maintainer scripts and "dpkg --configure -a", unless this
| is a native multistrap.

> Do any of them still don't know that base-passwd should be configured
> first because otherwise any other package using root (be it base-files
> or any other) will fail? I think this was already settled in the last
> discussion we had about this several years ago.

multistrap doesn't take care of this and you can provoke a
base-files.postinst failure this way.

Then there is mmdebstrap. I looked into it and couldn't find any code
that orders base-passwd or base-files or creates an /etc/passwd. It
might not fail now.

> Can you provide at least a bug number for the bootstrapping tool that
> apparently still tries to configure all packages at once, or
> base-passwd and base-files in the same row?

#924401, but I'm not yet sure which part we need to fix.

I really like Simon's (thank you for that enlightening reply) view of
interpolating the uids. It removes a bunch of problems from the equation
and works well when bootstrapping from non-Debian or from ancient Debian
releases even in chrootless mode. At the same time, it is quite safe
(due to the static allocation) and easy to implement. I fail to see
downsides.

Just because debootstrap encodes a ton of hacks to make things barely
work (and break every so often) doesn't mean we have to maintain them
until eternity.

> In other words: Is the present bug report to be considered in a
> theoretical way, or it is the result of some problem that you actually
> found recently with a bootstrapping tool?

I don't have a minimal test case at hand, but I can reproduce it with
multistrap at least.

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-12 Thread Helmut Grohne
Package: base-passwd,base-files,debian-policy

Debian policy section 3.8 says:

| Essential is defined as the minimal set of functionality that must be
| available and usable on the system at all times, even when packages
| are in the “Unpacked” state.

When unpacking (but not configuring) a buster or unstable essential
package set, nothing creates /etc/passwd. Creation of that file is
performed by base-passwd.postinst. base-files.postinst relies on a
working /etc/passwd by using e.g. "chown root:root".

Now we can make a choice:
A. /etc/passwd is part of base-passwd's interface and base-files is
   right in relying on it working at all times. Then base-passwd is rc
   buggy for violating a policy must. Fixing this violation is
   technically impossible.
B. /etc/passwd is not part of base-passwd's interface and base-files
   wrongly relies on its presence rendering base-files rc buggy.
C. Guillem Jover hinted that policy expects every essential package to
   be configured at least once. The current text does not make this
   assumption clear. If it holds, policy would simply say nothing about
   how to bootstrap an essential system, which may be fine. Given that
   we have debootstrap, cdebootstrap, multistrap, and mmdebstrap, it
   seems like specifying the bootstrap interface would be a good idea.
   Unfortunately, I don't exactly understand the bootstrap interface at
   present. In practice, you cannot run postinsts of essential packages
   in arbitrary order.
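
The ordering constraint from option C can be illustrated with a
topological sort. This Python sketch assumes that the "must be
configured first" edges are known to the bootstrap tool. No such
metadata exists in the archive today, so the mapping passed in is
hypothetical out-of-band knowledge.

```python
from graphlib import TopologicalSorter


def configure_order(needs_configured_first):
    """Order packages so that each one is configured after its prerequisites.

    The argument maps a package name to the set of packages whose
    postinst has to run before its own.
    """
    return list(TopologicalSorter(needs_configured_first).static_order())
```

With the single edge from base-files to base-passwd, base-passwd is
configured first. If two packages require each other, graphlib raises
CycleError, which is the kind of dependency loop that makes an explicit
ordering mechanism fragile.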

I argue that something is buggy. I'm not sure what. I gave three
options. Can we gather consensus on one of these?

Helmut



Bug#515856: [debhelper-devel] Bug#515856: debhelper: please implement dh get-orig-source

2017-09-18 Thread Helmut Grohne
On Mon, Sep 18, 2017 at 11:28:42AM +0200, Bill Allombert wrote:
> get-orig-source and watch files serve a different purpose.
> 
> get-orig-source is used to build the .orig. tarball from the true
> upstream one. Most package do not need that.  Watch files could not do
> that until recently.
> 
> So the comparaison is unfair.
> 
> What need to be checked is how many get-orig-source rules has been
> reimplemented in term of watch files.

Challenge accepted. ticharich.d.o has an unpacked copy of the
debian/rules files. Most of them are world-readable. A small number
(~30) are inaccessible, so my analysis will have an error of around 0.2%.

A simple method is to just look at which of them contain the string
"get-orig-source" and which of them contain the string "uscan" assuming
that when both show up, get-orig-source is implemented using uscan.
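
The heuristic amounts to two substring tests per file. A minimal Python
sketch, with made-up function names and category labels, that operates
on rules file contents already read into memory:

```python
from collections import Counter


def classify_rules(text):
    """Classify a debian/rules file by the two substrings described above."""
    if "get-orig-source" not in text:
        return "no get-orig-source"
    # Both strings present: assume the target is implemented via uscan.
    return "uses uscan" if "uscan" in text else "custom get-orig-source"


def tally(rules_by_package):
    """Count classifications for a mapping of package name to rules text."""
    return Counter(classify_rules(text) for text in rules_by_package.values())
```

Like the manual analysis, this overcounts uscan users whenever a rules
file mentions uscan for an unrelated reason.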

The following packages do not implement get-orig-source with uscan:

biojava4-live
boinc-app-seti
cjk
edk2
fasttree
freeorion
freerdp
gr-air-modes
gr-fcdproplus
gr-iqbal
gr-osmosdr
htmlunit
ioquake3
iortcw
josm
libb64
libreoffice
libtgvoip
neobio
nvidia-graphics-drivers
nvidia-graphics-drivers-legacy-304xx
pencil2d
pixelmed
qemu
r-cran-rniftilib
sagemath
west-chamber
zsh

So we have around 22500 source packages with watch files, we have 3000
packages with get-orig-source, of those 28 don't use uscan. The fair
comparison is 22500 vs. 28. That's almost three orders of magnitude. If anything,
policy should document debian/watch, not get-orig-source. The perl
policy, python policy, elpa policy, ... each affect more packages than
get-orig-source. Keeping it is uneconomic.
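
The size of the gap can be checked with a short calculation, using only
the two counts given above:

```python
import math

watch_files = 22500     # source packages with a watch file
custom_targets = 28     # get-orig-source rules not built on uscan

ratio = watch_files / custom_targets
orders_of_magnitude = math.log10(ratio)
print(f"ratio: {ratio:.0f}, orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio is roughly 800 to 1, so the base-10 logarithm lands slightly
above 2.9.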

Helmut



Bug#749826: Documenting `Multi-Arch: foreign`

2017-09-04 Thread Helmut Grohne
On Sat, Sep 02, 2017 at 08:44:14AM -0700, Sean Whitton wrote:
> Rather than introduce the new terminology 'intended interface', which we
> would definitely have to define, how about something like this:
> 
> If all a package's architecture-dependent interfaces are listed in
> README.multiarch, the package is not considered to have any
> architecture-dependent interfaces for the purposes of determining
> whether it may be labelled Multi-Arch: foreign.

This is not how it works. It's not like you can just mark any package
Multi-Arch: foreign after saying that it is architecture-dependent. That
documentation must come with a contract saying that reverse dependencies
must not use those architecture-dependent interfaces.

> If libc6's use is legitimate then it seems we'd need to include this as
> an exception.

Well, it's not exactly legitimate. It's more like unavoidable as Simon
pointed out in his reply. Technically, libc6's behaviour is wrong and
causes unpack errors. The reasonable solution would be prohibiting
coinstallation of libc6:mips and libc6:mipsel, but package metadata does
not allow us to do that currently (#747261 -> self-conflicts are always
ignored). The other option of removing Multi-Arch: same from libc6 would
essentially render Multi-Arch useless. So all we can do now is pretend
the issue wasn't there.

> > * If you rebuild the source package with a very different
> > installation set (i.e. much newer Build-Depends), does it still
> > have to match with older instances? Example: #825146. What
> > divergence in installation sets is ok?
> 
> We could just say that it must match the instances in the target suite.

We could. That would render libgiac0 rc buggy for instance, because it
was built on mips64el three weeks later than on other architectures and
thus uses an incompatible gettext.

That definition is pretty annoying for bootstraps though as replicating
an ancient toolchain is kind of the opposite of what bootstrappers do.

> >(A simple way to satisfy this requirement is to use
> >architecture-dependent paths exclusively. That works except for
> >/usr/share/doc/$pkg.)
> >
> >  * The maintainer scripts must handle multiple configuration and
> >multiple deconfiguration correctly. In particular, a package can be
> >purged for one architecture while being installed for another.
> >Example: #682420.
> >
> >(A simple way to satisfy this requirement is to not ship maintainer
> >scripts.)
> >
> >  * Source packages carrying any binary package marked `Multi-Arch: same`
> >must always be binNMUed in lock-step. (Presently violated e.g. by
> >libselinux1)
> 
> Could you turn this into some commits against my branch, please?

I tried and ran into a new problem: I am now convinced that we cannot
just describe one Multi-Arch value after another as they do share some
common values. That "interface" aspect and architecture-constraints on
dependencies is a common theme and likely deserves an introductory text.

Yet, I am attaching what I have.

> It sounds like we need to just drop the whole bullet point.
> Architecture: all packages need to be checked carefully, just like
> Architecture: any packages.

Reworded.

> To my mind, the most important ways to achieve readability in this case
> are
> 
> - avoid repetition
> - avoid "probably", "likely" sentences.

The latter is particularly hard, because we violate the strict
definitions more often than is immediately apparent.

As Simon's mail demonstrates, we likely need more answers/consensus
before continuing. I'll reply in a separate mail.

Helmut
diff --git a/policy/ch-controlfields.rst b/policy/ch-controlfields.rst
index 509a96e..e6451d5 100644
--- a/policy/ch-controlfields.rst
+++ b/policy/ch-controlfields.rst
@@ -1028,6 +1028,18 @@ control file.
 We consider the meaning of each possible value of this field
 separately.
 
+``Multi-Arch: no``
+++
+
+This value is the default. When satisfying a dependency on a package
+(implicitly) marked ``Multi-Arch: no``, the depender and the dependee
+must have the same architecture. For the purpose of this matching,
+``Architecture: all`` packages are treated as if they had the
+architecture value of ``dpkg``.
+
+The value ``no`` cannot currently be used in binary packages due to
+limitations of the archive processing.
+
 ``Multi-Arch: foreign``
 +++
 
@@ -1037,12 +1049,15 @@ architecture.
 In order to determine whether this holds, you should consider
 
 the files installed by the package
-``Architecture: all`` packages always provide
-architecture-independent interfaces.  Shared and static libraries
-provide architecture-dependent ABIs.  Binary executables may
-provide architecture-independent interfaces: could software
-interacting with the executable determine the architecture for
-which it was built without reading the executable file?
+``Architecture: all`` packages tend to provide
+

Bug#749826: Documenting `Multi-Arch: foreign`

2017-09-04 Thread Helmut Grohne
Hi Simon,

On Sat, Sep 02, 2017 at 05:26:57PM +0100, Simon McVittie wrote:
> That seems like it might be a bug (or design flaw if you prefer). If a
> package (build-)depends on foo:any, it is saying "I am only using the
> arch-indep parts of foo's interface", whatever those are.

You may call it a feature. The idea here was that :any should not be used
mindlessly. Thus it is only allowed on packages properly marked for that
use with ``Multi-Arch: allowed``. In Build-Depends, you can mostly
achieve the same effect with :native, which is essentially :any on any
package except Architecture: all packages (though our dependency
resolvers don't agree here).

> Perhaps a dependency on foo:any by (for example) bar:mips should
> always be satisfiable by foo:mips (as though the :any had been omitted),
> regardless of foo's multi-arch status? This would bring it back to the
> same meaning as omitting the :any, in the trivial case where only one
> architecture is enabled.

That proposal may ease meta data changes indeed. I suspect that it would
also cause a lot of useless :any annotations. It's a double-edged sword.

> Perhaps a dependency on foo:any should be satisfiable by any instance
> of foo that is Multi-Arch: foreign? (In this case the :any is completely
> redundant, because foreign sets up a similar situation from the other end)

After studying Multi-Arch for many years now, I recognize that a core
idea is to almost always flag the architecture constraint on the target
of an edge. To understand this wicked sentence, consider a dependency
graph and label each node (package) with an architecture. Now Multi-Arch
says that by default every edge (dependency) must enforce equal
architecture on both ends. Most of the header's job is relaxing this
restriction. The designers of Multi-Arch decided that this relaxing
should not be a property of the edges (e.g. :any), but a property of the
dependee.

Thus the current implementation ensures that :any cannot be used in
situations where it is inappropriate. As you point out, that design is
annoying for meta data transitions.
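
To make the graph picture concrete, here is a minimal Python sketch of the
edge check as it is described in this thread. The names `Package`,
`effective_arch` and `satisfies`, as well as the fixed `native_arch`
default, are made up for illustration. Real resolvers (dpkg, apt) handle
many more cases, so take this as a reading aid only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    architecture: str        # e.g. "amd64", "i386" or "all"
    multi_arch: str = "no"   # "no", "same", "foreign" or "allowed"


def effective_arch(pkg, native_arch):
    # Architecture: all packages are matched as if they had dpkg's architecture.
    return native_arch if pkg.architecture == "all" else pkg.architecture


def satisfies(depender, dependee, any_qualifier=False, native_arch="amd64"):
    """Can `dependee` satisfy a dependency declared by `depender`?"""
    if any_qualifier:
        # foo:any is only satisfiable by packages marked Multi-Arch: allowed.
        return dependee.multi_arch == "allowed"
    if dependee.multi_arch == "foreign":
        # The relaxation is a property of the dependee, not of the edge.
        return True
    # Default: both ends of the edge must have the same architecture.
    return effective_arch(depender, native_arch) == effective_arch(dependee, native_arch)
```

With this model, bar:i386 depending on a plain foo:amd64 fails, while the
same dependency succeeds once foo is marked foreign, or once foo is marked
allowed and the dependency is written as foo:any.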

> > > I think "the files installed by ``Architecture: all`` packages always
> > > provide architecture-independent interfaces." is too broad. The counter
> > > example is haskell-devscripts-minimal. This needs to be weakened
> > > somehow.
> 
> I would argue that these interfaces are architecture-independent from
> the perspective of the package's (lack of) architecture. What they
> are not independent of is the *build machine* architecture, just like
> running uname -m or inspecting /proc/cpuinfo aren't independent of the
> build machine architecture. This is certainly a problem for
> cross-compilation, but it isn't the same issue as in dpkg or pkg-config,
> where the architecture for which dpkg or pkg-config was built gets
> hard-coded into its installed files (as the output of --print-architecture
> or part of the default search path, respectively).

That's a nice view, but it is not the view expressed by Multi-Arch. The
meaning of the header considers the whole installation set as a unit.
Whether you view this in a package building context or runtime context
does not matter; what matters is whether the tools behave differently
when you swap the architecture of underlying parts.

As a side note, we marked pkg-config Multi-Arch: foreign, but that is
technically wrong on another level. The marking would imply that it
doesn't matter which architecture you use to supply the package. A
prospective README.multiarch would need to say that you must not use
plain pkg-config (without a triplet prefix). Yet that is what most
packages do. If you perform an archive rebuild of pkg-config build-rdeps
on amd64 in a chroot with preinstalled pkg-config:i386, the majority of
builds will fail even though their Build-Depends are installable.

This is another place where we bend the rules just to make it barely
useful. For performing useful cross builds, one needs to discard host
architecture instances of ``Multi-Arch: foreign`` packages.

> > > For instance, the policy should make it
> > > clear that marking libmdds-dev `Multi-Arch: foreign` (fictional, see
> > > #843023) would be a policy violation.
> 
> It is not clear to me that doing so *should* be a policy violation. If
> libmdds-dev contains only headers (no shared or static library), and it
> exposes architecture-independent libboost-dev headers (but no Boost
> shared or static library), is there really anything wrong with having
> libboost-dev from "the wrong architecture"?

As long as everything is header-only, you can use ``Multi-Arch:
foreign``.  The thing is, even if libboost-dev was
architecture-independent, it would expose libstdc++-7-dev. Since
exposure is transitive, that carries over to libmdds-dev.

Boost's dependency on libstdc++-4.8-dev | libstdc++-dev looks a bit
strange though. Since libc++-dev provides libstdc++-dev (and no compiler
will just use libc++-dev when it is installed without further options)

Bug#872808: [debian-policy] nocheck DEB_BUILD_OPTIONS DEB_BUILD_PROFILES

2017-08-23 Thread Helmut Grohne
On Wed, Aug 23, 2017 at 07:23:14PM +0100, Ghislain Vaillant wrote:
> I also suspect that given DEB_BUILD_PROFILES=nocheck implies
> DEB_BUILD_OPTIONS=nocheck, the same should be true for nodoc?

Just as DEB_BUILD_PROFILES=nocheck does *not* imply
DEB_BUILD_OPTIONS=nocheck (you must set the latter explicitly),
DEB_BUILD_PROFILES=nodoc does *not* imply DEB_BUILD_OPTIONS=nodoc.
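
Since nothing sets the option for you, a caller may want to notice when a
profile is active without its matching option. A rough Python sketch,
assuming both variables are whitespace-separated lists; the function name
`unmatched_profiles` and the idea of passing the environment as a dict are
mine, not part of any existing tool:

```python
def unmatched_profiles(env, names=("nocheck", "nodoc")):
    """Return the profiles from `names` that are active in
    DEB_BUILD_PROFILES but missing from DEB_BUILD_OPTIONS."""
    profiles = env.get("DEB_BUILD_PROFILES", "").split()
    # Options may carry values (e.g. parallel=4); compare the bare name only.
    options = {opt.split("=", 1)[0]
               for opt in env.get("DEB_BUILD_OPTIONS", "").split()}
    return [name for name in names if name in profiles and name not in options]
```

For example, an environment with DEB_BUILD_PROFILES="nocheck cross" and
DEB_BUILD_OPTIONS="parallel=4" would be reported as missing nocheck.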

In general, I think that this historic split into options and profiles
is unfortunate. If we were to restart now, we'd likely remove nocheck,
nodoc and maybe also nostrip from DEB_BUILD_OPTIONS and use
DEB_BUILD_PROFILES exclusively. That's not where we are unfortunately.

Arguably, the same responsibility we require for nocheck should be
applied to nodoc. Given that the nodoc option has a much lower adoption,
I am in favour of simply deprecating it. We should also remove the
"nodocs" option from the archive while at it.

Furthermore, I question the usefulness of nodoc. Since -doc packages are
generally arch:all, most often you can skip them by doing an arch-only
build. In the cases where documentation is stuffed into arch:any
packages, the option modifies package contents. As such, you can no
longer tell whether your modified package correctly satisfies its
reverse dependencies (that may use parts of the documentation other than
/usr/share/doc/). As such the nodoc option/profile is generally
considered "unsafe". Given that you cannot simply rebuild the world with
nodoc active, I have yet to encounter a practical use of nodoc.  It
seems to be a futile exercise in increasing complexity at present.

Whatever the outcome to the relevant questions is, consensus is not what
we have now.

Helmut



Bug#749826: Documenting `Multi-Arch: foreign`

2017-08-20 Thread Helmut Grohne
Hi Sean,

Thanks for picking up multiarch!

On Sat, Aug 19, 2017 at 09:50:21PM -0700, Sean Whitton wrote:
> I spoke to Russ and we're both of the view that we should document
> multiarch piecemeal.  Let's begin by getting a definition of the
> Multi-Arch: field into ch. 5 of policy.

I'm glad you agree with my proposal.

> I have pushed a new branch to the Debian policy repo named
> bug749826-spwhitton.  On that branch I've committed a slightly reworked
> form of your draft text.[1]  Please review the diff.  Here are some
> comments/issues:

Very welcome.

> - I substantially shortened your text.  Let me know if you think I went
>   too far.

I fear that some important aspects got lost indeed. More on that later.

> - Previously I was worried about defining 'interface' but I've found
>   another place where policy uses this word without defining it, and I
>   don't think it needs to be changed in either place.

I'm not a friend of vagueness, but I do recognize the difficulty in
expressing the requirements precisely.

> - I couldn't figure out how to include this text, because I didn't
>   understand it:
> 
> For instance, using dpkg --print-architecture can be used to emit the
> native architecture even though dpkg is marked Multi-Arch:
> foreign. Similarly, calling pkg-config (without a prefix) will behave
> differently on different architectures as its search path is
> architecture-dependent even though pkg-config is marked Multi-Arch:
> foreign.
> 
>   Are you saying that packages that depend or implicitly depend on dpkg
>   or pkg-config cannot be Multi-arch: foreign, although dpkg and
>   pkg-config themselves are Multi-arch: foreign?  Why are dpkg and
>   pkg-config Multi-arch: foreign, if they provide these
>   architecture-dependent interfaces?

Those are very good questions and clarifying them will lead to a better
understanding of what we have to put into policy. You do understand that
"dpkg --print-architecture" is part of dpkg's interface. Yet its output
varies with its architecture. Taking this strictly would indeed imply
that dpkg is wrongly marked. Similarly, running pkg-config may result in
architecture-dependent paths and thus our strict interpretation would
result in rejecting the foreign marking.

A common theme with such cases is to resort to `Multi-Arch: allowed`
(e.g. make), but that has the downside of requiring most consumers to
attach the :any annotation and that it can never be switched back
(because :any dependencies on packages not marked M-A:allowed are
unsatisfiable).

This is where I thought about README.multiarch:

> - I didn't include your TODO about README.multiarch; let me know whether
>   you have a more concrete idea about the purpose of that file

It can document assumptions one makes about users of a package. For
instance, we expect dpkg users to use `dpkg --print-architecture`
diagnostically only. Similarly, we expect that package builds call
pkg-config if they mean the build architecture and they need to call
$(DEB_HOST_GNU_TYPE)-pkg-config if they mean the host architecture.
Indeed that happens automatically for autotools projects that happen to
use PKG_CHECK_MODULES or PKG_PROG_PKG_CONFIG (i.e. most). It also
happens for cmake when built with dh_auto_build.
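
The pkg-config convention can be sketched in a few lines of Python. This
assumes DEB_HOST_GNU_TYPE is available in the environment (as it normally
is in a package build); the helper name `pkg_config_tool` and passing the
environment as a dict are illustrative choices only.

```python
def pkg_config_tool(env, for_host=True):
    """Pick the pkg-config command name for the build or host architecture."""
    if not for_host:
        # Plain pkg-config means "the build architecture".
        return "pkg-config"
    # The triplet-prefixed tool means "the host architecture".
    return env["DEB_HOST_GNU_TYPE"] + "-pkg-config"
```

A cross build for i386 would thus end up calling
i686-linux-gnu-pkg-config for host libraries and plain pkg-config for
build tools.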

Let me give a counter example to illustrate more of the point.
haskell-devscripts-minimal is an `Architecture: all` package with some
shell scripts. Sounds like a good candidate for `Multi-Arch: foreign`.
When you look at /usr/share/haskell-devscripts/Dh_Haskell.sh though, you
see that functions such as cpu(), os(), etc. specifically introspect the
build architecture by using the build architecture ghc. Such usage is
not ok for `Multi-Arch: foreign` (#769377).

I believe that policy should encourage some uniform way to document the
intended interface as we have several cases where this is not obvious.
README.multiarch may be that way. In particular, using a package in a
way not permitted by such README.multiarch would need to be a policy
violation on its own. For instance, one could depend on a shared library
and declare it an implementation detail. Relying on the transitive
dependency would then be considered a policy violation.

> - after we've got text documenting the other possible values of the
>   Multi-Arch: field, we might want to promote the list of things to
>   consider out of the Multi-Arch: foreign subsubsection.  It should
>   become clear once we've got that other text together.

Indeed, documenting `Multi-Arch: same` may be easier (or not). For the
purpose of defining it, we shall call Debian binary packages for
different architectures with equal binary package name and version
"instances" of a package. I currently see the following requirements:

 * It must not be used on `Architecture: all` packages (though I wish
   you could ;).

 * Given any two instances of a package and any filename, that filename
   must be non-existent in at least one package or the type (directory /

Bug#757760: debian-policy: please document build profiles

2017-07-21 Thread Helmut Grohne
On Tue, Jul 18, 2017 at 10:33:06AM +0100, Simon McVittie wrote:
> I suspect stage1 might also still be useful for (possibly pre-emptively)
> breaking cycles involving build-time vs. runtime dependencies, like the one
> that historically existed between glib2.0 and dbus: it seems more
> straightforward to have one profile name than to invent a series of
> nodbus, noglib, noqt, etc. if each one will only be used in practice in
> a small number of places to break cycles.

That may all be true, but simply using "stage1" is harmful in the long
run, because it renders "stage1" meaningless. We have a pile of "stage1"
profiles in the archive already and we essentially lost structure on
them. Some modify binary packages. Others drop binary packages. You
cannot tell what "stage1" does without looking at a particular source
package. The only meaning left is that it is often used with
bootstrapping in unspecified ways, but it doesn't even tell whether
that's native or cross and whether it actually is legacy.

> I'm not sure I agree with this. A maintainer can't reliably know all the
> cycles their package might be involved in over time, sure, but they *can*
> know what is the bare minimum of functionality that their package can
> have and still be useful for build-dependencies, and that's how I used
> stage1 in src:dbus.

It seems that the essence of dbus' stage1 is to reduce as much
functionality as possible. Why not call it pkg.dbus.minimal instead?
That name would directly carry the intention. (See Josch's mail for
rationale.)

We should actually go one step further: We should require that any
profile being used is documented in a "canonical" place. For "standard"
profiles (nocheck, nopython, ...) that'll be some central document (e.g.
policy). For extension profiles (pkg.$sourcepackage.$anything) that'll
be debian/README.source or something similar. Without such documentation
we'll only enlarge the mess we have with "stage1" now.
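
To illustrate how the pkg.$sourcepackage.$anything namespace tells you
where to look for documentation, here is a small Python sketch. The
function name `extension_owner` is hypothetical, and the set of source
package names has to be supplied by the caller, because source package
names may themselves contain dots and the profile name alone is therefore
ambiguous.

```python
def extension_owner(profile, source_packages):
    """Return the source package whose namespace `profile` lives in,
    or None if it is not an extension profile of a known package."""
    # Try longer names first so that "glib2.0" wins over a shorter match.
    for source in sorted(source_packages, key=len, reverse=True):
        prefix = f"pkg.{source}."
        if profile.startswith(prefix) and len(profile) > len(prefix):
            return source
    return None
```

A profile for which this returns None would have to be documented in the
central place instead, or it is one of the unstructured names such as
"stage1".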

Let me give a recent example for comparison. I recently had to add a
profile to unbound to break a cycle. unbound builds itself four times
with varying ./configure flags. Reducing it to the one relevant build
pass building the shared library was the obvious thing to do. I could
have called the profile "stage1", but I think "pkg.unbound.libonly" much
better tells what it does. The very same logic applies to dbus, no?

While at it, I'd like to emphasize that it is not forbidden or even
discouraged to use pkg.foo.someprofile in source package bar. The only
requirement is that foo's maintainer agrees with that use (e.g. by
documenting how it is supposed to be used).

I see no urgency in removing "stage1" profiles now. They're a mess, but
a working mess. What I'd like to avoid is furthering the mess. So let's
not add more "stage1" than we have now please. Use descriptive profiles
for new stuff.

Helmut



Re: Bug#650077: dpkg: The Installed-Size estimate can be wrong by a factor of 8 or a difference of 100MB

2015-01-07 Thread Helmut Grohne
On Wed, Jan 07, 2015 at 12:22:47PM +0100, Johannes Schauer wrote:
> It is also worth asking what functionality the Installed-Size field is
> supposed to have when looking for a solution. Its primary purpose is
> probably to give apt a clue of whether or not there is enough free space
> to install a certain package.

This was/is a recurring question. The policy expends 4 on the field in
section 5.6.20. It fails however to clarify the purpose and thus the
preferred way of computing or using it.

> I think that an over approximation would be the right way to go because it is
> better to wrongly warn the user that a binary package might not be installable
> due to not sufficient remaining disk space, than to install a package without
> sufficient remaining disk space and only fail once there actually is no more
> space.

Consider Alice. She wants to install foo, which has a good approximation
for her filesystem. Unfortunately, it is too big to be installed. Thus
she looks at other packages and determines that she no longer needs bar.
Duly she issues "apt-get install foo bar-". Unfortunately, this command
fails unpacking foo as bar's approximation was bad and thus it does not
free the space advertised in Installed-Size.

>   ( find mathjax-2.4 -type f -print0 \
>   | du --files0-from=- -b; \
>   find mathjax-2.4 \! -type f -printf "1\n" ) \
>   | awk '{total = total + int($1/4096) + 4096}END{print total}'

Slight improvement:

find ... \( -type f -printf "%s\n" \) -o \
 \( ! -type f -printf "1\n" \) | ...
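
For readers who prefer it spelled out, the same traversal could look like
this in Python. It is only a sketch of the find expression above (byte
size for regular files, 1 for everything else); the function name
`entry_sizes` is made up, and what happens to the numbers afterwards is
left open just like the elided part of the pipeline.

```python
import os
import stat


def entry_sizes(root):
    """Yield one number per filesystem entry under `root`: the byte size
    for regular files and 1 for everything else."""
    def size_of(path):
        st = os.lstat(path)  # do not follow symlinks, like find without -L
        return st.st_size if stat.S_ISREG(st.st_mode) else 1

    yield size_of(root)  # find also reports the starting point itself
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield size_of(os.path.join(dirpath, name))
```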

Helmut


-- 
To UNSUBSCRIBE, email to debian-policy-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/20150107185302.ga16...@alf.mars



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-14 Thread Helmut Grohne
On Sun, Apr 14, 2013 at 02:22:47PM +0200, Bill Allombert wrote:
> Why are files in ca-certificates configuration files in the first place?
> I doubt users are expected to edit PEM certificates.

Correction of what I said before: ca-certificates does not ship them as
conffiles, but as configuration files.

Actually they are symbolic links to the actual certificates shipped
within /usr/share. The purpose of the links is to allow the user to
remove particular certificates that she does not trust. As such, those
symbolic links express configuration choices.

As it stands I see ca-certificates as a valid use case of UTF-8
characters in configuration file names. I strongly suggest talking to
the ca-certificates maintainers before changing the policy in this
way.

The reason for reporting this bug was to get a way to interpret
filenames *now*. The proposed wording (by Charles Plessy) enables us to
do so. I would like to see further restrictions on filenames deferred to
another issue, because it has less of a perceived benefit and there is
not the broad consensus and support for further restrictions. Clearly
further discussion is required for these.

Helmut


-- 
Archive: http://lists.debian.org/20130414124726.GA26069@localhost.localdomain



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-14 Thread Helmut Grohne
On Sun, Apr 14, 2013 at 11:58:03AM +0200, Bill Allombert wrote:
> I think configuration files should also be included in the first list,
> because the user is supposed to be able to interact directly with them.

I object to this extension of the proposal, because use of UTF-8
characters in conffile names is a current use case of ca-certificates.
If anything it could be treated as a "should" and turned into "must"
after working with the ca-certificates maintainers on a solution.

Helmut


-- 
Archive: http://lists.debian.org/20130414115529.GA25265@localhost.localdomain



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-08 Thread Helmut Grohne
On Sat, Apr 06, 2013 at 08:20:15PM +0900, Charles Plessy wrote:
> File names
>
>   The name of the files installed by binary packages in the system PATH
>   (namely /bin, /sbin, /usr/bin, /usr/sbin and /usr/games/) must be
>   encoded in ASCII.
>
>   The name of the files and directories installed by binary packages
>   outside the system PATH must be encoded in UTF-8 and should be
>   restricted to ASCII when they can be represented in that character
>   set.
>
> What do you think ?

Thanks to all involved parties for your work on this issue. I am very
much satisfied with the result and happy that it is met with consensus.
The suggestions of Julian Gilbey appear sensible, but do not touch the
general direction.
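
The two "must" rules of the quoted wording are easy to check mechanically.
A minimal Python sketch, with a made-up function name `filename_violation`
and absolute paths given as bytes; the "should be restricted to ASCII when
representable" part is a judgment call and is deliberately not checked:

```python
PATH_DIRS = ("/bin/", "/sbin/", "/usr/bin/", "/usr/sbin/", "/usr/games/")


def filename_violation(path: bytes):
    """Return a short reason if `path` breaks the proposed rule, else None."""
    in_path = any(path.startswith(d.encode("ascii")) for d in PATH_DIRS)
    if in_path:
        # Files in the system PATH must be plain ASCII.
        return None if path.isascii() else "not ASCII in system PATH"
    try:
        # Everything else must at least decode as UTF-8.
        path.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    return None
```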

Helmut


-- 
Archive: http://lists.debian.org/20130408090430.gb21...@alf.mars



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-01 Thread Helmut Grohne
On Sun, Mar 24, 2013 at 08:01:03PM +0900, Charles Plessy wrote:
> after more than one month of discussion, we have not reached a conclusion.

Thanks for the ping.

> In the current situation there is no policy, which means that everything is
> allowed.  Indeed, there is at least one package with filenames using more than
> one set of non-ASCII characters, so no user can see correctly the names of
> every file in this package at the same time.

Some more data here. I checked sid main amd64 binary packages. The only
ones containing invalid UTF-8 sequences (and thus violating the current
proposal) would be aspell-is and jpilot. This suggests that UTF-8 is a
de facto standard already. Fixing two packages shouldn't be that hard. I
have filed a wishlist bug #704446 against lintian to check for this
regardless of the outcome of this bug.

> On my side, I made a proposal with actionable items: fix the few packages that
> are not using UTF-8, and modify the Policy to reflect the current practice
> of using ASCII most of the time and other UTF-8 characters parsimoniously.

I am in favour of this solution.

 * Requiring any subset of UTF-8 has the direct benefit of being able to
   interpret all filenames used without guesswork.
 * This is in line with Fedora's policy.
 * I saw very little disagreement about whether to permit non-UTF-8
   sequences. Discussion seemed mostly to be around which subset to
   require.

> I understand very well the arguments against having any UTF-8 character at
> all, but we currently have such packages in our archive, so if there is no
> plan to modify these packages, then we can not plan to solve this bug.

I see little benefit in restricting to ASCII compared to the benefit
of restricting to UTF-8. Remember that the goal of this bug was to
make filenames machine readable. I think that further restrictions
should happen in the context of #99933. I asked for not merging these
issues, because I would like to keep the scope of this issue limited and
thus implementable.

> Can others comment how they would like to see this bug solved ?

Any proposal that limits to a subset of UTF-8 and a superset of
printable ASCII is fine with me. My preferred choice would be just
UTF-8. I have no objections to recommending the use of a subset of
printable ASCII either.

To me it appears to be a matter of wording right now. Consensus is
basically there. Implementing it would cause two policy violations
(aspell-is and jpilot), which imo is little impact.

Helmut


-- 
Archive: http://lists.debian.org/20130401093755.ga16...@alf.mars



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-02-22 Thread Helmut Grohne
Thanks for your comments.

On Sat, Feb 23, 2013 at 01:31:32PM +0900, Charles Plessy wrote:
>  - There are here and there discussions raising possible corner cases
>    where distributing files with a name not representable in UTF-8 might
>    be justified, for instance in test suites.

Even though the general argument is correct, the particular example
probably applies to source packages in most cases. We don't control
source packages (unless we repack them), so I think they should not be
covered by a filename encoding policy.

>  - Similar discussion also took place in #99933.  I wonder about merging this
>    bug (#701081) and #99933.

I stumbled upon this bug before reporting this one and decided that the
issues were sufficiently separate from each other to warrant a new bug
number. I did not read the full bug log and therefore did not discover
that its scope widened to filenames as well. The discussion found
therein clearly is valuable. I still think that separating bugs for
filename encoding and file content encoding is a good idea, because
those issues can be solved independently. That said, merging also makes
sense to point to the rest of the discussion. In the latter case, please
select a better summary message.

I have to admit that I am slightly in favour of just copying Fedora's
approach. Making distributions more compatible with each other seems
like a worthwhile thing to do.

Helmut


-- 
Archive: http://lists.debian.org/20130223070209.ga18...@alf.mars



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-02-21 Thread Helmut Grohne
Package: debian-policy
Severity: wishlist

Apparently the debian-policy currently says nothing about the characters
used in filenames contained in binary packages. Most packages use common
sense and only use a small subset of US-ASCII. In Debian sid main most
filenames can be represented using the following subset of US-ASCII
characters (written as a regular expression):

[][a-zA-Z0-9{}<>() ^/,=:&!*%#$~@+._-]

The number of exceptions is about 200 contained in about 50 binary
packages. In those packages some filenames are not representable as
UTF-8 (for example aspell-is) and others don't make any sense in
ISO-8859-15 (for example ca-certificates).
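
For anyone who wants to repeat the count, the character class above
translates to Python roughly as follows. This is only a sketch: the names
`COMMON_NAME` and `is_common_name` are made up, the brackets need escaping
in Python's syntax, and how the file lists are obtained is not shown.

```python
import re

# Same set of characters as the bracket expression above, in Python syntax.
COMMON_NAME = re.compile(rb"[\]\[a-zA-Z0-9{}<>() ^/,=:&!*%#$~@+._-]*")


def is_common_name(path: bytes) -> bool:
    """True if every byte of `path` is in the common US-ASCII subset."""
    return COMMON_NAME.fullmatch(path) is not None
```

Filenames for which this returns False are the exceptions counted above.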

It would be nice if some common ground concerning filename encoding
could be reached. The options range from a rather restrictive definition
of acceptable characters via requiring filenames to be representable in
US-ASCII to mandating a particular encoding (such as UTF-8). This could
be first introduced as a SHOULD and later turned into a MUST.

Personally I do not really care about what the precise restriction is as
long as it permits a mechanical transformation to unicode.

Helmut


-- 
Archive: http://lists.debian.org/20130221114327.ga19...@alf.mars



Bug#650077: dpkg: The Installed-Size estimate can be wrong by a factor of 8 or a difference of 100MB

2011-11-26 Thread Helmut Grohne
Package: dpkg
Version: 1.16.1.2
Severity: wishlist

Symptom
~~~~~~~
I just installed libjs-mathjax. According to its Installed-Size this
would just consume 16512KB. Now according to policy this is just an
estimate of course. But how accurate is it actually? So I installed said
package on ext3. Turns out /usr/share/javascript/mathjax takes up
127296KB and /usr/share/doc/mathjax takes another 1200KB. So our
estimate is wrong by a factor of 8 or a difference of 100MB. This
estimate is also used to determine whether the disk has enough space, so
if my disk just had 50MB left, aptitude would have tried to install this
package and failed.

The actual problem
~~~~~~~~~~~~~~~~~~
Problems with Installed-Size are not exactly new, as the discussions in
http://bugs.debian.org/534408 (unit for Installed-Size) and
http://bugs.debian.org/630533 (usage of du --apparent-size) have shown.
So what is different this time? Installing the very same package on a
btrfs yields a size that is much closer to the listed Installed-Size. (I
don't have any numbers on this.) So whatever dpkg puts into this field,
it *will* be wrong somewhere. The policy already mentions that this
estimate cannot be accurate everywhere, but in fact it will be wrong by
a factor of at least 2.8 (=sqrt(8)) or a difference of at least 50MB
(=100MB/2) somewhere. Any attempt to change the computation of this
value thus cannot fix this bug.
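
The sqrt(8) and 100MB/2 bounds can be recomputed from the numbers in the
Symptom section. The sketch below assumes that the smallest real footprint
equals the listed Installed-Size (I have no measurement for that, so this
is an assumption) and that the largest is the ext3 footprint. The function
name is made up.

```python
import math


def best_possible_worst_case(small_kb, large_kb):
    """For one estimate that must serve both footprints, return the
    smallest achievable worst-case factor and worst-case difference."""
    # The geometric mean balances over- and underestimation as a ratio.
    factor = math.sqrt(large_kb / small_kb)
    # The midpoint balances over- and underestimation as a difference.
    difference = (large_kb - small_kb) / 2
    return factor, difference


factor, difference_kb = best_possible_worst_case(16512, 127296 + 1200)
```

With these inputs the factor comes out a bit below sqrt(8) and the
difference at roughly 55MB, in line with the rounded figures above.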

Discussion
~~~~~~~~~~
In the example of libjs-mathjax the reason for the huge difference is
the inclusion of a large number of very small files. Some filesystems
allocate a block for each of these files and others are able to store
multiple files in a block. A simple approach could be to include an
additional field ("Installed-Files"?) that returns the number of files
in the package. A second estimate for the Installed-Size would then be
given by the number of files times the block size. The maximum of both
estimates could be used. It would solve the immediate symptoms with
libjs-mathjax. It is not without problems though. For instance I
did not explain what block size to use. An administrator may have
different file systems set up for / and /usr. Also the question remains
whether this feature is worth the associated effort.
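
As a sketch of how a frontend might combine the two estimates: the
"Installed-Files" field does not exist, the function name is made up, and
the 4096 byte default block size is just an assumption for the example.

```python
def combined_estimate_kib(installed_size_kib, installed_files, block_size=4096):
    """Maximum of the declared size and a one-block-per-file estimate."""
    # Ceiling division: bytes needed if every file occupies a full block.
    per_file_kib = -(-installed_files * block_size // 1024)
    return max(installed_size_kib, per_file_kib)
```

For a package declaring 16512KB with a hypothetical 30000 files, the
second estimate (120000KB) would win on a filesystem with 4KB blocks.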

To get discussion going I pull in debian-policy@l.d.o.

Helmut



-- 
Archive: http://lists.debian.org/2026110639.ga30...@alf.mars



Re: Bug#553135: sendmail-base: maintainer-script-calls-init-script-directly prerm:67 than using invoke-rc.d. The use of invoke-rc.d to invoke the /etc/init.d/* initscripts instead of calling them dire

2010-01-22 Thread Helmut Grohne
severity 553135 normal
thanks

On Fri, Jan 22, 2010 at 01:50:40PM -0800, Russ Allbery wrote:
> That being said, this is clearly not the problem that either Policy or the
> Lintian tag were designed to catch, and you should feel free to decrease
> the severity and add an override.  Also, please feel free to report a bug

Thanks for your input. I am just downgrading the severity for now, so others
don't try to fix it as an rc bug.

Helmut





Re: Bug#553135: sendmail-base: maintainer-script-calls-init-script-directly prerm:67 than using invoke-rc.d. The use of invoke-rc.d to invoke the /etc/init.d/* initscripts instead of calling them dire

2010-01-22 Thread Helmut Grohne
Hi,

thanks to Manoj for pointing this out and Richard for explaining it.
Unfortunately this rc bug is still open after two months.

Short summary:

sendmail-base.prerm invokes an init script without invoke-rc.d which
technically is forbidden by the Debian policy. (report from Manoj)

The part that is invoked is not a standard command (clean) and would
that way produce a warning. (pointed out by Richard)

Let me outline possible solutions:
1) Tag it as wontfix and decrease severity.
   The reason for using invoke-rc.d is that it can prevent starting and
   stopping daemons when this is not desired. Cleaning the queue does
   not interfere with this.

2) Use invoke-rc.d --force. (suggested by Richard)

3) Move the queue cleaning script somewhere else and call it from the
   init script.

Please decide about a solution and solve this issue.

Helmut





Bug#443902: debian-policy: (C.3) subdirectories in debian/ not allowed?

2007-09-24 Thread Helmut Grohne
Package: debian-policy
Version: 3.7.2.2
Severity: wishlist
Tags: patch

Appendix C.3 says:
"All the directories in the diff must exist, except the debian
subdirectory of the top of the source tree, which will be created by
dpkg-source if necessary when unpacking."

This is exactly one exception, namely `debian/'. Creating directories
like `debian/patches' (dpatch) would violate the policy when read
strictly. I therefore suggest that `subdirectory' is replaced by
`subtree', `subdirectory and subdirectories thereof' or something
similar.
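
The difference between the two readings can be shown with a small Python
sketch. Paths are relative to the top of the source tree; the function
name, the `allow_subtree` switch and the example file names are invented
for illustration and do not describe what dpkg-source actually does.

```python
import posixpath


def missing_directories(diff_paths, existing_dirs, allow_subtree):
    """Directories that files in the diff need but that neither exist
    nor fall under the debian exception."""
    missing = set()
    for path in diff_paths:
        parent = posixpath.dirname(path)
        if not parent or parent in existing_dirs:
            continue
        # Strict reading: only the debian directory itself is exempt.
        # Suggested reading: everything below debian/ is exempt as well.
        if parent == "debian" or (allow_subtree and parent.startswith("debian/")):
            continue
        missing.add(parent)
    return sorted(missing)
```

With a diff touching debian/control, debian/patches/01_fix.dpatch and
src/main.c, the strict reading reports debian/patches as a violation while
the `subtree' wording reports nothing.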

Helmut

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.20.1 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=de_DE (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/dash

-- no debconf information


