Bug#787816: Replace FHS 2.3 by FHS 3.0 in the Policy.

2017-06-25 Thread Niels Thykier
On Fri, 5 Jun 2015 20:43:10 +0900 Charles Plessy  wrote:
> Package: debian-policy
> Severity: normal
> 
> Le Thu, Jun 04, 2015 at 07:09:00PM -0700, Russ Allbery a écrit :
> > I have not looked at this at all, but this list should be aware that it
> > exists.
> 
> > Date: Wed, 03 Jun 2015 09:19:04 -0400
> > Subject: [fhs-discuss] FHS 3.0
> > 
> > The LSB workgroup is happy to announce the release of FHS 3.0.
> ... 
> > Release notes can be found here:
> > 
> > https://wiki.linuxfoundation.org/en/FHSReleaseNotes30
> 
> Thanks Russ for the heads-up.
> 
> Judging from the release notes, it would not be too hard to update the
> Policy's description of how Debian follows and deviates from the FHS.
> 
> By the way, I wonder if the debian-policy package is the best place for
> shipping a copy of the FHS.  I just checked out the bzr repository that
> contains its sources, and it builds out of the box (build-depends on xmlto and
> fop).  Perhaps it would deserve its own package ?
> 
> Have a nice day,
> 
> Charles
> 
> -- 
> Charles Plessy
> Tsurumi, Kanagawa, Japan
> 
> 

Hi,

I would like to see FHS 3.0 adopted as well.  Or a 2.3 exception to
allow the use of /usr/libexec.

Thanks,
~Niels



Bug#865713: Please Start UTF-8 debian-policy Text Files with UTF-8 Signature

2017-06-25 Thread Henrique de Moraes Holschuh
On Sat, 24 Jun 2017, Russ Allbery wrote:
> Russ Allbery  writes:
> > I did a bit more research, and apparently this approach has become more
> > blessed again.  I'm glad I looked it up!  As of Unicode 5.0, the

...

> Okay, I experimented with this, but unfortunately less displays the BOM at
> the start of the file as a very ugly reverse-video <U+FEFF> at the top of
> the screen.

An alternative would be to just use .8txt, .u8txt or some other
extension for UTF-8 text files that is not ".txt".

This also addresses the mix of UTF-8 and unknown charset .txt files in
our web trees, the difficulty of configuring charsets out-of-band across
a mirror network, etc.

An imperfect world asks for imperfect solutions :-(  All of them will
have drawbacks.

-- 
  Henrique Holschuh



Bug#865713: Please Start UTF-8 debian-policy Text Files with UTF-8 Signature

2017-06-25 Thread Paul Hardy
[On the use of the UTF-8 signature, aka the BOM, at the start of a UTF-8 file]

On Sat, Jun 24, 2017 at 1:59 PM, Russ Allbery  wrote:
> Russ Allbery  writes:
>
>> I did a bit more research, and apparently this approach has become more
>> blessed again..
>
> Okay, I experimented with this, but unfortunately less displays the BOM at
> the start of the file as a very ugly reverse-video <U+FEFF> at the top of
> the screen.
>
> I think this is arguably a bug in less; this is a control character in a
> sense, but the whole point is for it to be invisible, particularly when
> it's the first character of the file.  Nonetheless, that's how less
> currently behaves.  My feeling is that good display in less is a more
> important use case for us than enabling this autorecognition in web
> browsers (which will normally be viewing the HTML versions).
>
> Given that, I think the right fix here is to fix the declared charset on
> www.debian.org for these files.

I hadn't looked at less output on the file.  After doing that, I agree
that this is a bug in less.  I just emailed the following to
bug-l...@gnu.org:


If using less on a text file that contains embedded UTF-8 characters,
less seems to properly interpret and display the UTF-8.  However, if
that UTF-8 file begins with the UTF-8 signature (aka the UTF-8 version
of the Byte Order Mark, U+FEFF), less displays it with inverted video
at the start of the first line as "<U+FEFF>".

Please have less detect this sequence and not display it, given that
other UTF-8-encoded data in a text file is not displayed like that.

The latest Unicode Standard was released on 20 June 2017.  You can
download it at

http://www.unicode.org/versions/Unicode10.0.0/UnicodeStandard-10.0.pdf

At the bottom of p. 67, there is a section called "Unicode Signature".
That section shows that it is acceptable to use the UTF-8 version of
the Byte Order Mark.  In the past, that was not the case.

Part of the impetus behind the change is likely the World Wide Web.
HTML5 browsers are required to recognize the UTF-8 signature at the
start of a plain text file and if present, then to interpret the
remainder of the file as a UTF-8 file.  You can see mention of this at

https://www.w3.org/International/questions/qa-byte-order-mark

which contains this paragraph: "In HTML5 browsers are required to
recognize the UTF-8 BOM and use it to detect the encoding of the page,
and recent versions of major browsers handle the BOM as expected when
used for UTF-8 encoded pages."

Thank you for your consideration,


Paul Hardy
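
A minimal Python sketch of the detection requested above: check whether
raw file bytes begin with the UTF-8 signature and, if so, drop it before
display.  The function name strip_utf8_signature is hypothetical; only
the standard library's codecs.BOM_UTF8 constant and the "utf-8-sig"
codec are relied on.

```python
import codecs
from typing import Tuple


def strip_utf8_signature(data: bytes) -> Tuple[bool, bytes]:
    """Return (had_signature, payload) for the raw bytes of a text file."""
    # codecs.BOM_UTF8 is the three-byte UTF-8 encoding of U+FEFF.
    if data.startswith(codecs.BOM_UTF8):
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


# Illustrative input: a signature followed by UTF-8 text.
sample = codecs.BOM_UTF8 + "Debian Policy é".encode("utf-8")
had_sig, payload = strip_utf8_signature(sample)
print(had_sig, payload.decode("utf-8"))
```

Python's built-in "utf-8-sig" codec performs the same stripping while
decoding, so sample.decode("utf-8-sig") yields the text without the
signature.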



Bug#845255: Inclusion of best practices for packaging database applications in Debian policy

2017-06-25 Thread Paul Gevers
Hi Debian developers,

This e-mail is meant for maintainers of applications that use databases
and for those of you that are interested in how packages should handle
those.

In bug 845255¹ I started the discussion for inclusion of the "best
practices for packaging database applications" in the Debian policy.
These practices were written down by Sean Finney more than 12 years ago
after discussion that started on this list (see links in the
documentation). These best practices have always been part of the
dbconfig-common package and are available on debian.org/doc² for a year
now. After consulting the audience of my talk at Debconf, I think these
practices (or an altered form if required after discussion) should be
part of the Debian policy. In the bug it is said I should ask for
seconds of this proposal from database application maintainers, which I
am hereby seeking.

Please follow-up in the policy bug¹ (reply-to set).

For the readers' convenience, the full text is given below.

Paul

¹ https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=845255
² https://www.debian.org/doc/manuals/dbapp-policy/ch-dbapps.html


===

Best practices for packaging database applications
Chapter 1 - Database Applications

1.1 Scope

In this document a "database application" is any program that relies on
some form of data storage outside the scope of the program's execution.
This is primarily intended to encompass applications which rely on a
relational database server or their own persistent storage mechanism,
though it effectively covers a much larger set of applications. In the future
this scope may have to be narrowed to avoid ambiguity and be more
effective as a policy.

1.2 Terminology

For the purposes of this document, there are two types of databases:
persistent and cached.

Persistent databases contain data that cannot be entirely reconstituted
if the database is removed. Also included are databases
that if removed would cause serious denial of service (making a system
unstable/unusable) or security concerns. Applications using this
category of databases are the primary focus of this document. Examples:

relational database servers providing storage to other applications.

web applications using a relational database

openldap's slapd databases

rrd files containing accumulated/historical data.

Cached databases are a specific group of databases which upon their
removal could be sufficiently regenerated, and could be removed without
causing serious denial of service or security concerns. Examples include:

debconf responses

locate database

caching nameserver data

apt's list of available packages

1.3 Placement of databases

Both persistent and cached databases fall under already defined
guidelines in the FHS; persistent data must be placed under
/var/lib/packagename, and cached data under /var/cache/packagename,
respectively [1]. The remainder of this document primarily addresses the
former.

1.4 Installation related issues

The following descriptions are divided into different parts, based on
the action being performed. For each process, the acceptable behavior of
database application packages is outlined.

1.4.1 Automatic Database Configuration/Prompting

It must always be assumed that the local admin knows more than any
automated system. He/She must be given the ability to opt out of any
"assistance" on the part of the package maintainer. Packages providing
any such automated assistance may do so by default if and only if the
opt-out debconf prompt is equal to or greater than priority high. With
this in mind, directions for manually installing (and upgrading if
relevant) the database must be included in the documentation for the
package.

1.4.2 Database Installation

For packages providing automated assistance, database
installation/configuration should be considered as part of the package
installation process. A failure to install a database should be
considered a failure to install the package and should result in an
error value returned by the relevant maintainer script. Packages may
provide a "try again" option to re-attempt configuration. Any such "try
again" features here or elsewhere mentioned in this document must have a
default negative response value, otherwise infinite loops could occur
for noninteractive installs.
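
The "try again" rule can be sketched in Python as follows.  This is only
an illustration of the control flow, not dbconfig-common's
implementation: install_database, attempt and ask_retry are hypothetical
names, and a missing ask_retry stands in for a noninteractive install.

```python
from typing import Callable, Optional


def install_database(attempt: Callable[[], bool],
                     ask_retry: Optional[Callable[[], bool]] = None) -> int:
    """Return 0 on success and 1 on failure, like a maintainer script status."""
    while True:
        if attempt():
            return 0
        # The retry answer defaults to "no": with nobody to answer the
        # prompt, the loop ends after the first failed attempt.
        retry = ask_retry() if ask_retry is not None else False
        if not retry:
            return 1


# A noninteractive run with a failing database setup stops immediately
# and reports failure, so the package installation fails as well.
status = install_database(lambda: False)
print(status)
```

Had the default been "yes", the same noninteractive run would never
leave the loop, which is the infinite-loop case the text warns about.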

To properly handle package reinstallation and reconfiguration, any
automated assistance must allow for a package to be reinstalled at the
same version without removing or overwriting existing application data.
Package reconfiguration may do so.

1.4.3 Database Upgrading

Occasionally a new upstream version of an application will require
modifications to be made to the application's underlying database. If an
automated system is to assist in such an upgrade, it should be
considered as a part of the package upgrade process; failure to upgrade
the database should be considered a failure to upgrade 

Bug#864615: please update version of posix standard for scripts (section 10.4)

2017-06-25 Thread Guillem Jover
On Sun, 2017-06-11 at 20:46:23 +0200, Bill Allombert wrote:
> On Sun, Jun 11, 2017 at 06:51:49PM +0200, Ralf Treinen wrote:
> > Package: debian-policy
> > Version: 4.0.0.0
> > Severity: normal

> > section 10.4 says:
> > 
> >   Scripts may assume that /bin/sh implements the SUSv3 Shell Command
> >   Language ...
> > 
> > This version of the standard is so outdated that it isn't even any
> > longer available on the opengroup web site. The latest version of the
> > standard is 4.2 (published in 2016), earlier versions currently 
> > available on the opengroup site are 4 (from 2008) and 4.1 (from 2013).
> > Please consider updating the policy.

> Before doing that, we have to check whether all the relevant packages
> are compliant with this update.

Well, I don't think all of our current shell packages are even compliant
with the current version specified in policy.

dash is one such package, with several non-compliance bugs. And given
that it is our default shell…

Thanks,
Guillem



Bug#865769: Second data package including some machine-readable data

2017-06-25 Thread Guillem Jover
Hi!

On Sat, 2017-06-24 at 09:57:33 -0700, Russ Allbery wrote:
> Package: debian-policy
> Version: 4.0.0.2
> Severity: wishlist

> A discussion in #865720 got me thinking that there is some data maintained
> in Policy that would be useful to have in a machine-readable format.  The
> things that have occurred to me so far are:
> 
> - The list of registered virtual packages

This one definitely makes sense, because policy is the canonical place
where this is defined.

> - The list of archive sections and their descriptions

I think this belongs on each archive providing those, alongside the
other archive metadata. And I'd rather see the involved parties
define an appropriate file to provide, so that any downloader which
has to fetch the metadata anyway could use it instead of hardcoding
the list.

Using a file from policy does not seem useful to me, because it would
mean software would need to depend on such policy provided package,
and if you are going to mix and match repos, you really need the
metadata from the archive you are pulling from.

In addition the text in policy states that the canonical list is
maintained by the archive anyway. :)

> - The list of valid Debian control field names (by type of control file)

This one, I'm uncertain, but I'd tend to think it is partly in a similar
situation to the previous one.

For example, dpkg already contains such a list (probably more
exhaustive) in Dpkg::Control::Fields, and I don't see making dpkg
depend on an external list, because dpkg is used beyond Debian.

The "list" in dpkg has currently some problems though:

 - in a perl module; not that easily accessible to other languages.
 - tracks on which control file the fields are available, but cannot
   currently distinguish the differing semantics (field separator) for
   fields with the same name, f.ex. Files in .changes and .dsc.
 - lacks information whether a field is folded, simple, multiline, etc.
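
To illustrate the second point, here is a small Python sketch of why the
same field name needs per-file semantics.  It assumes the usual layouts
of the Files field (checksum, size, name in a .dsc; checksum, size,
section, priority, name in a .changes); the function name and the sample
values are made up for the example.

```python
from typing import Dict, List

# Column layout of each line of the Files field, keyed by control file type.
FILES_COLUMNS = {
    "dsc": ("md5sum", "size", "name"),
    "changes": ("md5sum", "size", "section", "priority", "name"),
}


def parse_files_field(value: str, control_type: str) -> List[Dict[str, str]]:
    """Split a multiline Files value according to the control file type."""
    columns = FILES_COLUMNS[control_type]
    entries = []
    for line in value.strip().splitlines():
        parts = line.split()
        if len(parts) != len(columns):
            raise ValueError(f"unexpected Files line for .{control_type}: {line!r}")
        entries.append(dict(zip(columns, parts)))
    return entries


# Made-up sample values, for illustration only.
dsc_value = " d41d8cd98f00b204e9800998ecf8427e 0 hello_1.0.tar.xz"
changes_value = " d41d8cd98f00b204e9800998ecf8427e 0 devel optional hello_1.0.tar.xz"
print(parse_files_field(dsc_value, "dsc"))
print(parse_files_field(changes_value, "changes"))
```

A parser that only knows the field name, and not which kind of control
file it came from, cannot pick the right column layout.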

My plan is to remedy at least the last two points with a new perl
module hierarchy. I'm not sure if the first is worth "fixing" in dpkg
though?

For the equivalent in policy I think I see where you are coming from,
and I think it would be nice to have most of policy in a declarative
format that could be used by linters, or some parsers, but if that
means it's going to make those somewhat Debian-specific it might not
take off. I guess to avoid that the path and names to get to that
information would need to be somewhat neutral and allow for other
derivatives with their own policies. :)

> These are things that either we already maintain or that have no other
> obvious place to live.  This data could then be consumed by packages like
> lintian (although that's a bit tricky for lintian.debian.org),
> libconfig-model-dpkg-perl, etc.

The list of common licenses perhaps. Other things that come to mind
could be perhaps a file with common regexes to validate things that
policy specifies, say package names, version strings etc. Precisely
because those can and do diverge from what dpkg accepts for example.
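
As a sketch of what one such shared regex could look like, the Python
below encodes the package name rule as Policy states it for the Package
and Source fields: lower case letters, digits, plus, minus and period,
at least two characters, starting with an alphanumeric character.  The
constant and function names are hypothetical, and this is not what dpkg
itself accepts.

```python
import re

# First character alphanumeric, then at least one more allowed character.
PACKAGE_NAME_RE = re.compile(r"[a-z0-9][a-z0-9+.-]+")


def is_valid_package_name(name: str) -> bool:
    """Check a name against the Policy-style package name rule."""
    return PACKAGE_NAME_RE.fullmatch(name) is not None


for candidate in ("libc6", "g++", "a", "Foo", "-foo"):
    print(candidate, is_valid_package_name(candidate))
```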

Valid pathnames, etc, and as I've mentioned above ideally all of
policy would be available in a declarative format, but that'd be a
pretty huge undertaking. But then it might make sense to do a quick
poll and ask whether people would use any of this, because otherwise
it seems perhaps a bit like a waste.

> The idea would be to provide these in some machine-readable form (probably
> JSON unless someone has objections) in files under /usr/share/debian-policy
> or some similar path (so that software can consume them) in a separate
> binary package built from the debian-policy package (debian-policy-data,
> perhaps) so that other packages can depend on that package without pulling
> in the larger human-focused Policy documentation.

I don't think I have a direct use for any of the above anyway, but I
also think I'd prefer YAML, because it is more human readable. But not
a strong objection in any case.
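
For a sense of how a consumer might use such data, here is a Python
sketch that reads a list of virtual packages from JSON.  The document
layout (a "virtual-packages" array of objects with "name" and
"description") is entirely an assumption made for this example; no such
file format has been defined in the thread.

```python
import json
from typing import Set

# Hypothetical document layout, for illustration only.
SAMPLE_DOCUMENT = """
{
  "virtual-packages": [
    {"name": "mail-transport-agent", "description": "a mail transport agent"},
    {"name": "www-browser", "description": "a web browser"}
  ]
}
"""


def virtual_package_names(document: str) -> Set[str]:
    """Extract the registered virtual package names from the JSON text."""
    data = json.loads(document)
    return {entry["name"] for entry in data["virtual-packages"]}


names = virtual_package_names(SAMPLE_DOCUMENT)
print(sorted(names))
```

A linter could then flag a Provides entry that looks like a shared
virtual package but is absent from the set.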

Thanks,
Guillem



Bug#787816: Replace FHS 2.3 by FHS 3.0 in the Policy.

2017-06-25 Thread Bill Allombert
On Sun, Jun 25, 2017 at 08:26:00AM +, Niels Thykier wrote:
> On Fri, 5 Jun 2015 20:43:10 +0900 Charles Plessy  wrote:
> > Package: debian-policy
> > Severity: normal
> > 
> > Le Thu, Jun 04, 2015 at 07:09:00PM -0700, Russ Allbery a écrit :
> > > I have not looked at this at all, but this list should be aware that it
> > > exists.
> > 
> > > Date: Wed, 03 Jun 2015 09:19:04 -0400
> > > Subject: [fhs-discuss] FHS 3.0
> > > 
> > > The LSB workgroup is happy to announce the release of FHS 3.0.
> > ... 
> > > Release notes can be found here:
> > > 
> > > https://wiki.linuxfoundation.org/en/FHSReleaseNotes30
> > 
> > Thanks Russ for the heads-up.
> > 
> > Judging from the release notes, it would not be too hard to update the
> > Policy's description of how Debian follows and deviates from the FHS.
> > 
> > By the way, I wonder if the debian-policy package is the best place for
> > shipping a copy of the FHS.  I just checked out the bzr repository that
> > contains its sources, and it builds out of the box (build-depends on
> > xmlto and fop).  Perhaps it would deserve its own package ?
> 
> I would like to see FHS 3.0 adopted as well.  Or a 2.3 exception to
> allow the use of /usr/libexec.

I assume if we allow /usr/libexec, we also need to support
/usr/libexec/x86_64-linux-gnu/ etc. ?

Cheers,
-- 
Bill. 

Imagine a large red swirl here. 



Bug#542288: debian-policy: Version numbering: native packages, NMU's, and binary only uploads

2017-06-25 Thread Russ Allbery
It's been a while since the last update to this thread and proposed
wording about the special version numbering conventions in use in Debian,
and in the meantime things have settled out a bit more and we have a
pretty firm consensus on how to handle special versions.  I'd therefore
like to resurrect this thread and see if we can agree on some wording.

The patch below adds a definition of native packages to our definitions
section and documents the following version number conventions:

- Native packages
- NMUs of native and non-native packages
- binNMUs
- Stable updates
- Backports

I think these are all fairly consistent and widely agreed-on at this
point.

Concerns, objections, seconds?

diff --git a/policy.xml b/policy.xml
index cf9a589..fdf50b6 100644
--- a/policy.xml
+++ b/policy.xml
@@ -357,6 +357,21 @@
   
 
 
+  native package
+  
+
+  A native package is software written specifically for Debian
+  whose canonical distribution format is as a Debian package.
+  Native packages have no separate upstream source in their
+  source package representation and no separate Debian
+  revision in their version numbers.  Native packages are an
+  exception: most Debian packages are "non-native" and have
+  source packages composed of an upstream software release and
+  separate Debian-specific modifications.
+
+  
+
+
   UTF-8
   
 
@@ -3722,8 +3737,8 @@ Package: libc6
   
 It is optional; if it isn't present then the
 upstream_version may not
-contain a hyphen.  This format represents the case where a
-piece of software was written specifically to be a Debian
+contain a hyphen.  This format represents a native package:
+a piece of software written specifically to be a Debian
 package, where the Debian package source must always be
 identical to the pristine source and therefore no revision
 indication is required.
@@ -3811,6 +3826,110 @@ Package: libc6
 
   
 
+
+
+  Special Version Conventions
+
+  
+The following special version numbering conventions are used in
+the Debian archive:
+  
+  
+
+  
+The absence of debian_revision,
+and therefore of a hyphen in the version number, indicates
+that the package is native.
+  
+
+
+  
+debian_revision components
+ending in . followed by a number
+indicate this version of the non-native package was
+uploaded by someone other than the maintainer (an NMU or
+non-maintainer upload).  This is used for a source package
+upload; for uploads of only binary packages without source
+changes, see the binary NMU convention below.
+  
+
+
+  
+upstream_version components in
+native packages ending in +nmu followed
+by a number indicate an NMU of a native package.  As with
+the convention for non-native packages, this is used for a
+source package upload, not for uploads of only binary
+packages without source changes.
+  
+
+
+  
+upstream_version components in
+native packages or
+debian_revision components in
+non-native packages ending in +b
+followed by a number indicate a binary NMU: an upload of a
+binary package without any source changes and hence
+without any corresponding source package upload or version
+change.
+  
+
+
+  
+upstream_version components in
+native packages or
+debian_revision components in
+non-native packages ending in +debNuX
+indicate a stable update.  This is a version of the
+package uploaded directly to a stable release, and the
+version is chosen to sort before any later version of the
+package uploaded to Debian's unstable distribution.  The
+N is the major version number
+of the Debian stable release to which the package was
+package uploaded directly to a stable release, and the
+version is chosen to sort before any later version of the
+package uploaded to Debian's unstable distribution.  The
+

Bug#587279: Clarify restrictions on main to non-free dependencies

2017-06-25 Thread Russ Allbery
Raphael Geissert  writes:

> After five years of letting the discussion settle down, perhaps there's
> a way to move things forward now?

> Other than the discussion about foo2zjs I think that only Bill believes
> that the new wording proposed in message #56 differs from the current
> practice.

> Moreover, as demonstrated by follow ups, the issue raised by Bill
> regarding the possibility of an accidental installation of non-free
> software appears to be a system configuration problem.  That is, the
> granularity of what should or should not be taken into consideration
> when resolving a package's dependencies can and should be handled on the
> package manager's side.

> As such, I believe that the proposed wording is appropriate and open for
> seconding.

Here is an updated version of the patch from earlier in this (now very
long) thread for discussion.  I still think this is consistent with
previous practice and reasonable documentation of what we're currently
doing.

diff --git a/policy.xml b/policy.xml
index 7ba5fc0..daf4c3c 100644
--- a/policy.xml
+++ b/policy.xml
@@ -595,7 +595,9 @@
   Build-Depends,
   Build-Depends-Indep, or
   Build-Depends-Arch relationship on a
-  non-main package),
+  non-main package) unless that package
+  is only listed as a non-default alternative for a package in
+  main,
 
   
   

If we still can't reach consensus on this, we should probably bump it to
the Technical Committee for resolution so that this doesn't just sit
around unresolved forever.  (I feel like that happened at some point in
the past, but it's been so long that my memory is very hazy.)
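
One possible reading of the proposed wording, sketched in Python: a
dependency field of a package in main is acceptable as long as the
default (first) alternative of every clause is itself in main.  The
function names and package names are hypothetical, and the parsing is
deliberately simplified (version constraints and architecture
qualifiers are just stripped).

```python
from typing import List, Set


def default_alternatives(field: str) -> List[str]:
    """Return the first alternative of each comma-separated clause."""
    names = []
    for clause in field.split(","):
        clause = clause.strip()
        if not clause:
            continue
        first = clause.split("|")[0].strip()
        # Drop any version constraint or architecture qualifier.
        names.append(first.split()[0].split("(")[0].split(":")[0])
    return names


def violates_rule(field: str, non_main: Set[str]) -> bool:
    """True if a non-main package is the default alternative of a clause."""
    return any(name in non_main for name in default_alternatives(field))


non_main = {"viewer-nonfree"}  # hypothetical package outside main
print(violates_rule("libc6 (>= 2.24), viewer | viewer-nonfree", non_main))
print(violates_rule("viewer-nonfree | viewer", non_main))
```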

-- 
Russ Allbery (r...@debian.org)   



Bug#640263: debian-policy: Clarify policy section 9.9 - Environment variables

2017-06-25 Thread Russ Allbery
Russ Allbery  writes:

> Looking at this section, there are several issues.  One is the issue
> addressed above, and I like Jonathan's wording for that.  Another is the
> one Colin mentioned earlier: this only applies to programs installed in
> the system path.  (I considered saying programs intended to be directly
> invoked by users, but I can imagine pointless arguments about /usr/sbin
> programs, so let's just go with that.)  A third issue is that parts of
> that section are now out of date, since /etc/profile.d exists (but still
> shouldn't be used for this purpose).

> I propose the attached patch to address all of those issues.  Seconds or
> further discussion?

Hi folks,

Everyone seemed generally happy with this text, but it never clearly got
enough seconds to apply.  Here's an updated patch so that we can take
another run at getting enough seconds and getting it merged.

diff --git a/policy.xml b/policy.xml
index 7ba5fc0..ace6a3b 100644
--- a/policy.xml
+++ b/policy.xml
@@ -9352,11 +9352,14 @@ Reloading description configuration...done.
   Environment variables
 
   
-A program must not depend on environment variables to get
-reasonable defaults.  (That's because these environment variables
-would have to be set in a system-wide configuration file like
-/etc/profile, which is not supported by all
-shells.)
+Programs installed on the system PATH (/bin,
+/usr/bin, /sbin,
+/usr/sbin, or similar directories) must not
+depend on custom environment variable settings to get reasonable
+defaults.  This is because such environment variables would have
+to be set in a system-wide configuration file such as a file in
+/etc/profile.d, which is not supported by all
+shells.
   
   
 If a program usually depends on environment variables for its
@@ -9364,7 +9367,7 @@ Reloading description configuration...done.
 reasonable default configuration if these environment variables
 are not present.  If this cannot be done easily (e.g., if the
 source code of a non-free program is not available), the program
-must be replaced by a small "wrapper" shell script which sets the
+must be replaced by a small "wrapper" shell script that sets the
 environment variables if they are not already defined, and calls
 the original program.
   
@@ -9377,12 +9380,6 @@ BAR=${BAR:-/var/lib/fubar}
 export BAR
 exec /usr/lib/foo/foo "$@"
   
-  
-Furthermore, as /etc/profile is a
-configuration file of the base-files package,
-other packages must not put any environment variables or other
-commands into that file.
-  
 
 
 

-- 
Russ Allbery (r...@debian.org)   
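
The wrapper logic shown in the patch (set BAR only if it is not already
defined, then call the original program) can be restated in Python for
illustration.  Policy asks for a shell script, so this is only a sketch
of the same behaviour; the helper names are hypothetical, while the
variable name and paths are the ones from the patch's example.

```python
import os
import sys
from typing import Dict, List, Mapping


def with_defaults(environ: Mapping[str, str],
                  defaults: Mapping[str, str]) -> Dict[str, str]:
    """Copy environ, filling in defaults for unset or empty variables."""
    env = dict(environ)
    for name, value in defaults.items():
        # Mirrors the shell's ${BAR:-default}: unset and empty both count.
        if not env.get(name):
            env[name] = value
    return env


def main(argv: List[str]) -> None:
    """Replace this process with the real program, passing arguments through."""
    env = with_defaults(os.environ, {"BAR": "/var/lib/fubar"})
    os.execve("/usr/lib/foo/foo", ["/usr/lib/foo/foo", *argv[1:]], env)


# main(sys.argv) would be the wrapper's entry point; it is not called here
# because /usr/lib/foo/foo is only the placeholder path from the example.
print(with_defaults({}, {"BAR": "/var/lib/fubar"}))
```

A value the user has already set is left untouched, so the wrapper only
supplies the reasonable default the text requires.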



Cleaned up branches in Policy Git repository

2017-06-25 Thread Russ Allbery
Since we've invalidated old patches with the DocBook conversion, I've gone
through all the old branches in the Policy Git repository and gotten rid
of most of them.

I refreshed three of mine (and sent updated patches to the relevant bugs)
that still seemed to be current and fairly uncontroversial -- basically,
the things I could easily update without a lot of reading.  For the rest
that weren't merged, we can recreate any patches as needed from the bug
discussion.  Frequently there were lots of objections and substantial work
required anyway.

Bill, I also deleted your branches that were already merged.  Left are
bug218893-bill (Build-Features, which I wasn't sure was fully reflected in
the thread) and bug779506-bill (for virtual packages, so it would still
apply).

-- 
Russ Allbery (r...@debian.org)   



Bug#787816: Replace FHS 2.3 by FHS 3.0 in the Policy.

2017-06-25 Thread Simon McVittie
On Sun, 25 Jun 2017 at 22:37:04 +0200, Bill Allombert wrote:
> I assume if we allow /usr/libexec, we also need to support
> /usr/libexec/x86_64-linux-gnu/ etc. ?

I'm not sure I see why we would? Platforms with the "multilib" lib/lib64
duality (Red Hat derivatives, etc.) only have one /usr/libexec, just like
they only have one /usr/bin; so anything that expects a per-architecture
libexecdir is already broken on more or less everything other than Debian.

I'd expect a future Debian with FHS 3.0 in Policy to have
libdir=/usr/lib/TUPLE and libexecdir=/usr/libexec as the normal settings
for Autoconf.

(On the other hand, there's probably no reason why we'd specifically
forbid /usr/libexec/TUPLE either.)
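
A short Python sketch of the settings described above, assuming TUPLE is
the multiarch tuple of the target architecture.  The function name is
hypothetical; --prefix, --libdir and --libexecdir are the standard
Autoconf configure options.

```python
from typing import List


def configure_args(multiarch_tuple: str, prefix: str = "/usr") -> List[str]:
    """Build configure arguments with a per-architecture libdir only."""
    return [
        f"--prefix={prefix}",
        f"--libdir={prefix}/lib/{multiarch_tuple}",
        # libexecdir is shared across architectures, like /usr/bin.
        f"--libexecdir={prefix}/libexec",
    ]


for tuple_name in ("x86_64-linux-gnu", "i386-linux-gnu"):
    print(" ".join(configure_args(tuple_name)))
```

Only libdir varies between the two architectures; libexecdir is the same
single directory for both.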

More background: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=859724

S



Re: Bug#542288: debian-policy: Version numbering: native packages, NMU's, and binary only uploads

2017-06-25 Thread Simon McVittie
On Sun, 25 Jun 2017 at 14:08:05 -0700, Russ Allbery wrote:
> +upstream_version components in
> +native packages ending in +nmu followed
> +by a number indicate an NMU of a native package.

I thought 1.2.3-4+nmu1 was also allowed as an alternative to 1.2.3-4.1?
But perhaps that's non-standard (it's certainly redundant).

> +N is the major version number
> +of the Debian stable release to which the package was
> +package uploaded directly to a stable release, and the

You have some duplicated lines here I think.

One rarer case is missing here:

1.2.3-4~deb9u1
Everything in 1.2.3-4 from unstable was in fact needed in Debian
9, so it was simply rebuilt for Debian 9 and uploaded there
(prominent examples: firefox-esr, intel-microcode)

Regards,
S
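
To show how these suffixes sort, here is a simplified Python sketch of
Debian version ordering, in which "~" sorts before everything, including
the end of the string.  It is written from the ordering rules as
described in Policy and is not dpkg's code; it assumes well-formed ASCII
version strings, and all function names are hypothetical.

```python
from functools import cmp_to_key
from typing import Tuple


def _order(ch: str) -> int:
    """Sort weight of one non-digit position; "" stands for end of string."""
    if ch == "" or ch.isdigit():
        return 0
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a: str, b: str) -> int:
    """Compare two upstream_version or debian_revision strings."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit prefixes character by character.
        while (i < len(a) and not a[i].isdigit()) or \
              (j < len(b) and not b[j].isdigit()):
            wa = _order(a[i] if i < len(a) else "")
            wb = _order(b[j] if j < len(b) else "")
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Compare the following digit runs numerically (empty run is 0).
        start_i, start_j = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        na, nb = int(a[start_i:i] or "0"), int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version: str) -> Tuple[int, str, str]:
    """Split into (epoch, upstream_version, debian_revision)."""
    epoch, rest = version.split(":", 1) if ":" in version else ("0", version)
    upstream, revision = rest.rsplit("-", 1) if "-" in rest else (rest, "")
    return int(epoch), upstream, revision


def compare_versions(a: str, b: str) -> int:
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


versions = ["1.2.3-4+deb9u1", "1.2.3-5", "1.2.3-4", "1.2.3-4~deb9u1"]
ordered = sorted(versions, key=cmp_to_key(compare_versions))
print(ordered)
```

Under this ordering 1.2.3-4~deb9u1 sorts before 1.2.3-4, while
1.2.3-4+deb9u1 sorts after 1.2.3-4 but before 1.2.3-5.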



Bug#758234: debian-policy: allow packages to depend on packages of lower priority

2017-06-25 Thread Russ Allbery
Ansgar Burchardt  writes:

> I discussed this a bit on IRC with the other ftp-masters and we came to
> this summary:

> 0) We would like to drop the requirement for packages to not depend on
>packages of lower priority: it is better to declare only what we
>actually want included in the installation (that is at priority >=
>standard) rather than also the dependency closure.

> 1) We agree that the 'extra' priority can be dropped.

> 2) We wonder if the 'standard' priority can also be dropped: as far as
>we know, it is used only by the "standard" task and it might make
>sense to treat it the same as other tasks.
>(Depending on what works better for the installer team.)

Given KiBi's reply, I'll leave 2 out for now.

Given the necessary wording changes, I don't think we can separate 0 and 1
very easily, so I'll just propose wording for both (even though we forked
the Policy bugs into two).  Here's a wording proposal based on Adam
Borowski's wording with a bit of (hopefully correct) tightening.

Note that this also says that no two packages that both have a priority of
standard or higher may conflict.  I think that's a logical consequence of
the use of priorities, and didn't want to lose that completely when that
requirement was dropped from optional.
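
Item 0 relies on installation tools computing the dependency closure
themselves, so that only the directly wanted packages need a raised
priority.  A minimal Python sketch of that computation follows; the
package names and the depends mapping are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def dependency_closure(selected: Iterable[str],
                       depends: Dict[str, List[str]]) -> Set[str]:
    """Return the selected packages plus everything they depend on."""
    closure: Set[str] = set()
    stack = list(selected)
    while stack:
        package = stack.pop()
        if package in closure:
            continue
        closure.add(package)
        stack.extend(depends.get(package, ()))
    return closure


# Only "editor" would carry priority standard; its libraries stay optional.
depends = {"editor": ["libfoo"], "libfoo": ["libbar"], "game": ["libbar"]}
print(sorted(dependency_closure(["editor"], depends)))
```

The libraries are pulled in through the closure, so raising their
priority adds no information.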

diff --git a/policy.xml b/policy.xml
index ace6a3b..be458cd 100644
--- a/policy.xml
+++ b/policy.xml
@@ -837,11 +837,33 @@
   Priorities
 
   
-Each package should have a priority value,
-which is included in the package's control
-record (see ).  This
-information is used by the Debian package management tools to
-separate high-priority packages from less-important packages.
+Each package must have a priority value,
+which is set in the metadata for the Debian archive and is also
+included in the package's control files (see ).  This information is used to control
+which packages are included in standard or minimal Debian
+installations.
+  
+  
+Most Debian packages will have a priority of
+optional.  Priority levels other than
+optional are only used for packages that should
+be included by default in a standard installation of Debian.
+  
+  
+The priority of a package is determined solely by the
+functionality it provides directly to the user.  The priority of a
+package should not be increased merely because another
+higher-priority package depends on it; instead, the tools used to
+construct Debian installations will correctly handle package
+dependencies.  In particular, this means that C-like libraries
+will almost never have a priority above
+optional, since they do not provide
+functionality directly to users.  However, as an exception, the
+maintainers of Debian installers may request an increase of the
+priority of a package to resolve installation issues and ensure
+that the correct set of packages is included in a standard or
+minimal install.
   
   
 The following priority levels are recognized
@@ -896,19 +922,22 @@
   installed by default if the user doesn't select anything
   else.  It doesn't include many large applications.
 
+
+  No two packages that both have a priority of
+  standard or higher may conflict with each
+  other.
+
   
 
 
   optional
   
 
-  (In a sense everything that isn't required is optional, but
-  that's not what is meant here.) This is all the software
-  that you might reasonably want to install if you didn't know
-  what it was and don't have specialized requirements.  This
-  is a much larger system and includes the X Window System, a
-  full TeX distribution, and many applications.  Note that
-  optional packages should not conflict with each other.
+  This is the default priority for the majority of the
+  archive.  Unless a package should be installed by default on
+  standard Debian systems, it should have a priority of
+  optional.  Packages with a priority of
+  optional may conflict with each other.
 
   
 
@@ -916,22 +945,21 @@
   extra
   
 
-  This contains all packages that conflict with others with
-  required, important, standard or optional priorities, or are
-  only likely to be useful if you already know what they are
-  or have specialized requirements (such as packages
-  containing only detached debugging symbols).
+  This priority is deprecated.  Use the
+  optional priority instead.
+  

Bug#542288: debian-policy: Version numbering: native packages, NMU's, and binary only uploads

2017-06-25 Thread Russ Allbery
Simon McVittie  writes:
> On Sun, 25 Jun 2017 at 14:08:05 -0700, Russ Allbery wrote:

>> +upstream_version components in
>> +native packages ending in +nmu followed
>> +by a number indicate an NMU of a native package.

> I thought 1.2.3-4+nmu1 was also allowed as an alternative to 1.2.3-4.1?
> But perhaps that's non-standard (it's certainly redundant).

There's some previous discussion in the bug.  The summary is that this was
proposed and is sometimes used, but pretty much everyone uses the 4.1
syntax still, so it doesn't seem to really have consensus.

Note that this list is not exclusive; there may be version numbering
conventions that aren't documented.  I just wanted to get down the most
likely ones that people will encounter.

>> +N is the major version number
>> +of the Debian stable release to which the package was
>> +package uploaded directly to a stable release, and the

> You have some duplicated lines here I think.

Argh.  For some reason, less constantly messes with me when I cut and
paste diffs instead of saving them to a file and including them directly.
I could have sworn I configured it to never use partial pages.

Included is the correct version of the patch.

> One rarer case is missing here:

> 1.2.3-4~deb9u1
>   Everything in 1.2.3-4 from unstable was in fact needed in Debian
>   9, so it was simply rebuilt for Debian 9 and uploaded there
>   (prominent examples: firefox-esr, intel-microcode)

Is this widespread enough to be worth describing?  It's kind of hard to
describe.
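
To make the conventions discussed in this message concrete, here is a minimal sketch that classifies a version string using only the patterns mentioned in the thread. The function name and the returned labels are invented for illustration, and, as noted above, the list of conventions is not exhaustive.

```python
import re


def classify_version(version):
    """Guess which convention a Debian version string follows (sketch)."""
    # Drop an epoch such as "1:" if present.
    if ":" in version:
        version = version.split(":", 1)[1]
    # The Debian revision is whatever follows the last hyphen.
    upstream, hyphen, revision = version.rpartition("-")
    if not hyphen:
        # No hyphen means no debian_revision: a native package.
        if re.search(r"\+nmu\d+$", version):
            return "native NMU"
        return "native"
    if re.search(r"~deb\d+u\d+$", revision):
        return "rebuild for a stable release"
    if re.search(r"\+nmu\d+$", revision):
        return "non-native NMU (+nmu form)"
    if re.search(r"\.\d+$", revision):
        return "non-native NMU"
    return "non-native maintainer upload"


for v in ["1.2.3", "1.2.3+nmu1", "1.2.3-4", "1.2.3-4.1",
          "1.2.3-4+nmu1", "1.2.3-4~deb9u1"]:
    print(v, "->", classify_version(v))
```

The `+nmu` form on a non-native package gets its own label because, as discussed above, it is sometimes used but does not appear to have consensus.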

diff --git a/policy.xml b/policy.xml
index 7ba5fc0..fbc53c9 100644
--- a/policy.xml
+++ b/policy.xml
@@ -357,6 +357,21 @@
   
 
 
+  native package
+  
+
+  A native package is software written specifically for Debian
+  whose canonical distribution format is as a Debian package.
+  Native packages have no separate upstream source in their
+  source package representation and no separate Debian
+  revision in their version numbers.  Native packages are an
+  exception: most Debian packages are "non-native" and have
+  source packages composed of an upstream software release and
+  separate Debian-specific modifications.
+
+  
+
+
   UTF-8
   
 
@@ -589,13 +604,10 @@
 
   must not require or recommend a package outside of
   main for compilation or execution
-  (thus, the package must not declare a
-  Pre-Depends, Depends,
-  Recommends,
-  Build-Depends,
-  Build-Depends-Indep, or
-  Build-Depends-Arch relationship on a
-  non-main package),
+  (thus, the package must not declare a "Pre-Depends",
+  "Depends", "Recommends", "Build-Depends",
+  "Build-Depends-Indep", or "Build-Depends-Arch" relationship
+  on a non-main package),
 
   
   
@@ -3725,11 +3737,11 @@ Package: libc6
   
 It is optional; if it isn't present then the
 upstream_version may not
-contain a hyphen.  This format represents the case where a
-piece of software was written specifically to be a Debian
-package, where the Debian package source must always be
-identical to the pristine source and therefore no revision
-indication is required.
+contain a hyphen.  This format represents a native
+package: a piece of software written specifically to be a
+Debian package, where the Debian package source must
+always be identical to the pristine source and therefore
+no revision indication is required.
   
   
 It is conventional to restart the
@@ -3814,6 +3826,110 @@ Package: libc6
 
   
 
+
+
+  Special version conventions
+
+  
+The following special version numbering conventions are used in
+the Debian archive:
+  
+  
+
+  
+The absence of debian_revision,
+and therefore of a hyphen in the version number, indicates
+that the package is native.
+  
+
+
+  
+debian_revision components
+ending in . followed by a number
+indicate this version of the non-native package was
+uploaded by someone other than the maintainer (an NMU or
+non-maintainer upload).  This is used for a source package
+upload; for uploa

Bug#640263: debian-policy: Clarify policy section 9.9 - Environment variables

2017-06-25 Thread Simon McVittie
On Sun, 25 Jun 2017 at 14:58:06 -0700, Russ Allbery wrote:
> Everyone seemed generally happy with this text, but it never clearly got
> enough seconds to apply.  Here's an updated patch so that we can take
> another run at getting enough seconds and getting it merged.

I second the patch quoted below.

> diff --git a/policy.xml b/policy.xml
> index 7ba5fc0..ace6a3b 100644
> --- a/policy.xml
> +++ b/policy.xml
> @@ -9352,11 +9352,14 @@ Reloading description configuration...done.
>Environment variables
>  
>
> -A program must not depend on environment variables to get
> -reasonable defaults.  (That's because these environment variables
> -would have to be set in a system-wide configuration file like
> -/etc/profile, which is not supported by all
> -shells.)
> +Programs installed on the system PATH (/bin,
> +/usr/bin, /sbin,
> +/usr/sbin, or similar directories) must not
> +depend on custom environment variable settings to get reasonable
> +defaults.  This is because such environment variables would have
> +to be set in a system-wide configuration file such as a file in
> +/etc/profile.d, which is not supported by all
> +shells.
>
>
>  If a program usually depends on environment variables for its
> @@ -9364,7 +9367,7 @@ Reloading description configuration...done.
>  reasonable default configuration if these environment variables
>  are not present.  If this cannot be done easily (e.g., if the
>  source code of a non-free program is not available), the program
> -must be replaced by a small "wrapper" shell script which sets the
> +must be replaced by a small "wrapper" shell script that sets the
>  environment variables if they are not already defined, and calls
>  the original program.
>
> @@ -9377,12 +9380,6 @@ BAR=${BAR:-/var/lib/fubar}
>  export BAR
>  exec /usr/lib/foo/foo "$@"
>
> -  
> -Furthermore, as /etc/profile is a
> -configuration file of the base-files package,
> -other packages must not put any environment variables or other
> -commands into that file.
> -  
>  
>  
>  

Regards,
S
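
For programs where the source is available, the "reasonable default" behaviour described in the patch can live in the program itself rather than in a wrapper. A minimal sketch, assuming a hypothetical setting named BAR with the same default as the shell example in the quoted text; the helper name is invented:

```python
import os


def setting(name, default, environ=None):
    """Return the variable's value, or `default` if it is unset or empty.

    This mirrors the shell idiom ${NAME:-default} used by the wrapper
    script in the Policy example.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    return value if value else default


# The program works without any system-wide environment configuration.
bar = setting("BAR", "/var/lib/fubar")
print(bar)
```

Passing an explicit `environ` mapping is only there to make the helper easy to test; a real program would normally read `os.environ` directly.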



Bug#587279: Clarify restrictions on main to non-free dependencies

2017-06-25 Thread Simon McVittie
On Sun, 25 Jun 2017 at 14:43:36 -0700, Russ Allbery wrote:
> Here is an updated version of the patch from earlier in this (now very
> long) thread for discussion.  I still think this is consistent with
> previous practice and reasonable documentation of what we're currently
> doing.
> 
> diff --git a/policy.xml b/policy.xml
> index 7ba5fc0..daf4c3c 100644
> --- a/policy.xml
> +++ b/policy.xml
> @@ -595,7 +595,9 @@
>Build-Depends,
>Build-Depends-Indep, or
>Build-Depends-Arch relationship on a
> -  non-main package),
> +  non-main package) unless that package
> +  is only listed as a non-default alternative for a package in
> +  main,
>  
>
>
> 
> If we still can't reach consensus on this, we should probably bump it to
> the Technical Committee for resolution so that this doesn't just sit
> around unresolved forever.  (I feel like that happened at some point in
> the past, but it's been so long that my memory is very hazy.)

A TC resolution in 2014 said that
"Depends: package-in-main | package-in-non-free" is acceptable for main,
and not a Policy §2.2.1 violation. What you're doing here is editing
Policy §2.2.1 to make the 2014 TC's interpretation more obviously the
correct one.

References: ,


This is certainly not unanimous (the TC vote in 2014 wasn't unanimous
either); but I think there's rough consensus, it matches current
practice, and it's better for Policy to be clear and specific as a
self-contained document, rather than leaving ambiguity in place and
requiring past TC decisions to be consulted for disambiguation. So
I second this patch.
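
One possible reading of the amended wording, sketched as a check on a single relationship field: a non-main package is tolerated only when it is not the first (default) alternative in its group. The function name is invented, the set of main packages is supplied by the caller, and the parsing is deliberately simplified; this is not how any archive tool implements the rule.

```python
import re

# A package name at the start of an alternative, ignoring any version
# restriction or architecture qualifier that follows it.
NAME = re.compile(r"[a-z0-9][a-z0-9+.\-]*")


def non_main_defaults(field, main_packages):
    """Return default alternatives in `field` that are not in main."""
    offenders = []
    for group in field.split(","):
        alternatives = [a.strip() for a in group.split("|") if a.strip()]
        if not alternatives:
            continue
        match = NAME.match(alternatives[0])
        if match and match.group(0) not in main_packages:
            offenders.append(match.group(0))
    return offenders


main = {"libc6", "foo"}
print(non_main_defaults("libc6 (>= 2.24), foo | foo-nonfree", main))
print(non_main_defaults("foo-nonfree | foo", main))
```

The first call reports nothing, because the non-free package only appears as a non-default alternative to a package in main; the second reports `foo-nonfree`, because it is listed first.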

Regards,
S



Bug#865713: Please Start UTF-8 debian-policy Text Files with UTF-8 Signature

2017-06-25 Thread Paul Hardy
On Sat, Jun 24, 2017 at 1:59 PM, Russ Allbery  wrote:
> Russ Allbery  writes:
>
>> I did a bit more research, and apparently this approach has become more
>> blessed again...
>
> Okay, I experimented with this, but unfortunately less displays the BOM at
> the start of the file as a very ugly reverse-video <U+FEFF> at the top of
> the screen.
>
> I think this is arguably a bug in less...

I just noticed that this less bug was reported in Debian in 2008 and is still open:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=473227

Earlier today, I sent the GNU less maintainer a two-line patch to the
"charset.c" file after my original email to him.  The patch prevents
the BOM from printing, but less still echoes the BOM correctly to an
output device that is not a terminal (output file, pipe, etc.).

I attached the patch file to bug report number 473227 so that the
Debian less maintainer could have access to it before the GNU less
maintainer has a chance to incorporate it (if he decides to).


Paul Hardy



Bug#865769: Second data package including some machine-readable data

2017-06-25 Thread Russ Allbery
Guillem Jover  writes:
> On Sat, 2017-06-24 at 09:57:33 -0700, Russ Allbery wrote:

>> - The list of archive sections and their descriptions

> I think this belongs on each archive providing those, alongside the
> other archive metadata. And I'd rather see the involved parties
> defining an appropriate file to provide so that any downloader which
> has to fetch the metadata anyway would use instead of hardcoding it.

> Using a file from policy does not seem useful to me, because it would
> mean software would need to depend on such policy provided package,
> and if you are going to mix and match repos, you really need the
> metadata from the archive you are pulling from.

> In addition the text in policy states that the canonical list is
> maintained by the archive anyway. :)

I don't see how this would work.  The program would dynamically retrieve
the list of sections every time it ran?  This seems like a bad idea, and
even impossible in a lot of situations (off-line development work, for
instance).

We maintain a list of archive sections in Policy anyway, so it's easy for
us to provide this list in a machine-readable format as well.  (Well, we
don't have the descriptions, but that's not hard to add and doesn't really
add much additional maintenance work.)

I think it's fine that a debian-policy-data package only provide
information for the Debian archive.  The same is also true of the virtual
package names, of course; some other archive may have different virtual
packages too.  Programs that want to work with various different package
archives will need to know how to obtain this data from multiple sources.
The intent is to provide a tiny package that others can easily depend on
without much overhead.
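
To give an idea of how small the consumer side could be, here is a sketch that reads a section list from a JSON document. The file layout, the key name "sections", and the sample descriptions are all invented for illustration; no such file format has been agreed on.

```python
import json

# Invented sample content; a real package would ship this as a file.
SAMPLE = """
{
  "sections": {
    "admin": "Administration utilities (sample description)",
    "doc": "Documentation (sample description)"
  }
}
"""


def load_sections(text):
    """Parse the hypothetical section list and validate its shape."""
    data = json.loads(text)
    sections = data["sections"]
    for name, description in sections.items():
        if not isinstance(name, str) or not isinstance(description, str):
            raise ValueError("section names and descriptions must be strings")
    return sections


sections = load_sections(SAMPLE)
print(sorted(sections))
```

A program that works with several archives would still need to obtain the equivalent data from each of them, as described above.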

>> - The list of valid Debian control field names (by type of control file)

> This one, I'm uncertain, but I'd tend to think it is partly in a similar
> situation to the previous one.

> For example dpkg contains already such a list (probably more
> exhaustive) in Dpkg::Control::Fields, and I don't see making dpkg
> depend on an external list, because dpkg is being used beyond Debian.

This was just an idle thought of mine, and maybe it doesn't solve any real
problems.

> For the equivalent in policy I think I see where you are coming from,
> and I think it would be nice to have most of policy in a declarative
> format that could be used by linters, or some parsers, but if that means
> it's going to make those somewhat Debian-specific it might not take
> off.

I'm in general fine with the things provided by Debian Policy being
Debian-specific.  That, in my opinion, is the point of the package.  If
some other distribution wants something equivalent, they can certainly
fork Debian Policy or write their own separate document that supplements
Debian Policy, and maintain corresponding data files.

> The list of common licenses perhaps. Other things that come to mind
> could be perhaps a file with common regexes to validate things that
> policy specifies, say package names, version strings etc. Precisely
> because those can and do diverge from what dpkg accepts for example.

Yes, those would also be interesting.

> Valid pathnames, etc, and as I've mentioned above ideally all of policy
> would be available in a declarative format, but that'd be a pretty huge
> undertaking. But then it might make sense to do a quick poll and ask
> whether people would use any of this, because otherwise it seems perhaps
> a bit like a waste.

Indeed.  The virtual package name list has a specific use case already,
and people are suggesting using sed scripts to parse files from the
debian-policy package to generate it right now, so maybe we should just
start there and see if uses of the other data actually materialize.

Lintian is a large possible use case, but Lintian already has mechanisms
for gathering and maintaining this data internally, and Lintian may not
want to depend on a debian-policy-data package for various reasons (it
makes lintian.debian.org a bit harder).

> I don't think I have a direct use for any of the above anyway, but I
> also think I'd prefer YAML, because it is more human readable. But not a
> strong objection in any case.

I have a professional aversion to YAML because the security properties of
YAML are so awful.

I wish everyone would just use TOML, but unfortunately it's not at a 1.0
version yet and is not as widely supported by default as JSON is.

-- 
Russ Allbery (r...@debian.org)   



Bug#587279: Clarify restrictions on main to non-free dependencies

2017-06-25 Thread Russ Allbery
Simon McVittie  writes:
> On Sun, 25 Jun 2017 at 14:43:36 -0700, Russ Allbery wrote:

>> Here is an updated version of the patch from earlier in this (now very
>> long) thread for discussion.  I still think this is consistent with
>> previous practice and reasonable documentation of what we're currently
>> doing.

>> diff --git a/policy.xml b/policy.xml
>> index 7ba5fc0..daf4c3c 100644
>> --- a/policy.xml
>> +++ b/policy.xml
>> @@ -595,7 +595,9 @@
>>Build-Depends,
>>Build-Depends-Indep, or
>>Build-Depends-Arch relationship on a
>> -  non-main package),
>> +  non-main package) unless that package
>> +  is only listed as a non-default alternative for a package in
>> +  main,
>>  
>>
>>

>> If we still can't reach consensus on this, we should probably bump it
>> to the Technical Committee for resolution so that this doesn't just sit
>> around unresolved forever.  (I feel like that happened at some point in
>> the past, but it's been so long that my memory is very hazy.)

> A TC resolution in 2014 said that
> "Depends: package-in-main | package-in-non-free" is acceptable for main,
> and not a Policy §2.2.1 violation. What you're doing here is editing
> Policy §2.2.1 to make the 2014 TC's interpretation more obviously the
> correct one.

Ah, thank you!  I did remember correctly that the Technical Committee took
this up.

> This is certainly not unanimous (the TC vote in 2014 wasn't unanimous
> either); but I think there's rough consensus, it matches current
> practice, and it's better for Policy to be clear and specific as a
> self-contained document, rather than leaving ambiguity in place and
> requiring past TC decisions to be consulted for disambiguation. So I
> second this patch.

Thanks!  Yeah, given that it was a TC decision, we definitely should
document it.

-- 
Russ Allbery (r...@debian.org)   



Bug#865769: Second data package including some machine-readable data

2017-06-25 Thread Guillem Jover
On Sun, 2017-06-25 at 16:13:39 -0700, Russ Allbery wrote:
> Guillem Jover  writes:
> > On Sat, 2017-06-24 at 09:57:33 -0700, Russ Allbery wrote:
> >> - The list of archive sections and their descriptions
> 
> > I think this belongs on each archive providing those, alongside the
> > other archive metadata. And I'd rather see the involved parties
> > defining an appropriate file to provide so that any downloader which
> > has to fetch the metadata anyway would use instead of hardcoding it.
> 
> > Using a file from policy does not seem useful to me, because it would
> > mean software would need to depend on such policy provided package,
> > and if you are going to mix and match repos, you really need the
> > metadata from the archive you are pulling from.
> 
> > In addition the text in policy states that the canonical list is
> > maintained by the archive anyway. :)
> 
> I don't see how this would work.  The program would dynamically retrieve
> the list of sections every time it ran?  This seems like a bad idea, and
> even impossible in a lot of situations (off-line development work, for
> instance).

When I researched this at the time, there were two clear groups of
users for this information [U] (now summarized in [W]).

The first were package manager frontends and similar, which need to
fetch archive meta-data anyway, and they do not need to do that all
the time, as they tend to cache that. For this group using an out-of-band
file provided by a non-canonical package seems suboptimal, when the
information can be there alongside the rest of the metadata to
download. For example dselect is a prominent omission from that list,
one for which I'd rather not introduce the hardcoding and wait for
proper meta-data from the archives themselves, or make it Debian-specific
by having it depend on a Debian Policy specific file. :)

  [U] 
  [W] 

For off-line tools such as linters, syntax highlighters, and similar it
certainly seems like a problem to require fetching the data from the
archive. Although, in some cases relying on an external package that
might update the data outside of the control of the tool might be
undesirable, and it might be better to do like lintian is doing, and
refresh it as part of the release process.

Then I suppose there's a third group composed of services. But those
I guess kind of fall somehow under the package manager frontends case,
as they need to fetch metadata information from the archive anyway(?).

> We maintain a list of archive sections in Policy anyway, so it's easy for
> us to provide this list in a machine-readable format as well.  (Well, we
> don't have the descriptions, but that's not hard to add and doesn't really
> add much additional maintenance work.)
> 
> I think it's fine that a debian-policy-data package only provide
> information for the Debian archive.  The same is also true of the virtual
> package names, of course; some other archive may have different virtual
> packages too.  Programs that want to work with various different package
> archives will need to know how to obtain this data from multiple sources.
> The intent is to provide a tiny package that others can easily depend on
> without much overhead.

Oh, I didn't mean to imply that Debian Policy should provide data or
support for other non-Debian archives.

My point is that perhaps it is not the best way to provide some of
this data in the first place, because:

  - it's not the canonical origin of the data,
  - having to fork the policy package just to amend the sections seems
burdensome, when the latter change way less often than the former,
  - might make code having to support this data Debian-specific.

If we need an off-line replica of the data, it might perhaps make more
sense for the archive admins (ftp-masters in this case) to provide it,
in a similar way as we have a debian-archive-keyring. Of course they'd
need to agree to that first. :)

Barring that, having a single place to include this kind of information
in a uniform way, similar to what distro-info does, might be the second
best option.

But even then, if the least bad solution is to have debian-policy
provide the data, what I was trying to have at least taken into
account is that it would be nice to try to specify a somewhat neutral
hierarchical structure in the filesystem, and ideally a common file
format, so that ideally programs can just check for the vendor and
do the equivalent of something like:

   load /usr/share/distro-metadata/-archive-metadata.

instead of say, having debian-policy hardcoded therein or similar, so
you could just key on the vendor and be somewhat neutral.

> >> - The list of valid Debian control field names (by type of control file)
> 
> > This one, I'm uncertain, but I'd tend to think it is partly in a similar
> > situation to the previous one.
> 
> > For example dpkg contains already such a list (provab

Bug#865713: Please Start UTF-8 debian-policy Text Files with UTF-8 Signature

2017-06-25 Thread Paul Wise
On Sun, 2017-06-25 at 16:07 -0700, Paul Hardy wrote:

> Earlier today, I sent the GNU less maintainer a two-line patch to the
> "charset.c" file after my original email to him.

I'm no expert on the less source code, but it seems to me that it will
also hide U+FEFF characters after the first one. I would suggest
updating it so that U+FEFF is only hidden when it is the first UTF-8
character in the file.
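
The difference between the two behaviours can be shown on a decoded string. This is only an illustration in Python with invented function names; it is not the less source code or the patch under discussion.

```python
BOM = "\ufeff"  # U+FEFF


def hide_leading_bom(text):
    """Hide U+FEFF only when it is the very first character."""
    return text[1:] if text.startswith(BOM) else text


def hide_all_bom(text):
    """Hide every U+FEFF, wherever it occurs."""
    return text.replace(BOM, "")


sample = BOM + "first" + BOM + "second"
print(repr(hide_leading_bom(sample)))
print(repr(hide_all_bom(sample)))
```

With the first function, a U+FEFF in the middle of the text survives and would still be displayed; with the second, it disappears along with the leading one.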

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


signature.asc
Description: This is a digitally signed message part


Processing of debian-policy_4.0.0.4_amd64.changes

2017-06-25 Thread Debian FTP Masters
debian-policy_4.0.0.4_amd64.changes uploaded successfully to localhost
along with the files:
  debian-policy_4.0.0.4.dsc
  debian-policy_4.0.0.4.tar.xz
  debian-policy_4.0.0.4_all.deb
  debian-policy_4.0.0.4_amd64.buildinfo

Greetings,

Your Debian queue daemon (running on host usper.debian.org)



debian-policy_4.0.0.4_amd64.changes ACCEPTED into unstable

2017-06-25 Thread Debian FTP Masters


Accepted:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Format: 1.8
Date: Sun, 25 Jun 2017 20:34:27 -0700
Source: debian-policy
Binary: debian-policy
Architecture: source all
Version: 4.0.0.4
Distribution: unstable
Urgency: medium
Maintainer: Debian Policy List 
Changed-By: Russ Allbery 
Description:
 debian-policy - Debian Policy Manual and related documents
Changes:
 debian-policy (4.0.0.4) unstable; urgency=medium
 .
   * Fix URLs to Policy documents.  Policy previously used the full URL as
 the link but a partial URL as anchor text, and then sometimes added
 the full URL in parentheses.  The result was very ugly in the text
 version.  Replace that style with just the full URL as anchor text for
 a link to that URL, which is not ideal for HTML output but produces
 reasonable results for both HTML and text.
   * Extensive reformatting of the maintainer script section.
 - Use  in the summary of how maintainer scripts are
   called, with a single .  This avoids gluing all the command
   summaries together with commas.  The output in text is radically
   better; HTML has some font issues, but isn't awful.
 - Avoid trailing newlines in  examples, which produce extra
   whitespace in text and HTML output.
 - Use different numeration methods at different levels of the nested
   ordered lists and ensure there is some introductory text for each
   list element to avoid awkward formatting.
   * Convert many of the  tags in Policy to  since
 the semantics are slightly more correct.
   * Remove the newline before  or  end tags,
 since it produces an extraneous and distracting blank line in both
 text and HTML output.
   * Change the mapping of Perl module names to Debian package names in
 the Perl Policy from a verbatim block to a table.
   * Change the list of cron directories from a verbatim block to a list.
Checksums-Sha1:
 a2fded68889766c33379995dbe717a3cd3e9d230 1608 debian-policy_4.0.0.4.dsc
 8f0f86bd8dfacc6aaa781fd15a8a7d9cf1103bd2 660604 debian-policy_4.0.0.4.tar.xz
 29d917c9493aefca26a1e1b9fba697d4336e5f74 1971562 debian-policy_4.0.0.4_all.deb
 1d069fcfcd30935e88d17180087e59d4b52d17d0 12006 debian-policy_4.0.0.4_amd64.buildinfo
Checksums-Sha256:
 b3c47301f0617e3707df521f177a1fa7dc435cedfa968605cf62a659d44be7f6 1608 debian-policy_4.0.0.4.dsc
 d8c72ea98d1739c7defe71d9e3e45fb197c96b60aa96961a57266b0bd5036beb 660604 debian-policy_4.0.0.4.tar.xz
 b76eff34e95d68d5bf3befc20da532a31c5f66d46e72d9b11134addaa865448e 1971562 debian-policy_4.0.0.4_all.deb
 063e84f80c0b2f92755a3b4580b2d05bff93abf8c2ecc9297f0706541f135d81 12006 debian-policy_4.0.0.4_amd64.buildinfo
Files:
 7a41c7ca5ee853ac6145206876657004 1608 doc optional debian-policy_4.0.0.4.dsc
 f06aa5ba93098022334d3c5172795cfc 660604 doc optional debian-policy_4.0.0.4.tar.xz
 d863fa2a3210e9cc022fbfb3a8b3ad6a 1971562 doc optional debian-policy_4.0.0.4_all.deb
 4cf58a5d557265589d6137adbc577f42 12006 doc optional debian-policy_4.0.0.4_amd64.buildinfo

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEE1zk0tJZ0z1zNmsJ4fYAxXFc23nUFAllQgUcACgkQfYAxXFc2
3nXJ7gf+N7H/wMGhqEG1EZt/Qf0KfgIpNXZ3RMzjtHC0hR9PGbNGTWFMeKP+VPJ3
F+hYmtwZfFO8VAfd0/8TSxPTZJF9e4QkZx8EVIEp2RIAHoH5IkzIIugF6AA+i/Tq
zL7fiSLRo4oEiXrCynKM5KWfRYcFwxpGHxp+nZBOBxDDCipJ7XZXWZhEY0Cs76WY
uQb2x1SKL7+DKHXzFJASdd8l2BJamfEfjFhlSEQ7BUr1kOhhcuPSYWllPp5LA8pJ
mWt68wQ7WAVX/2SaYWbmeJAbaiXCY4Irz05WBckc87eQU5kAdV+cKbki4tpiVKCe
xMUwvsDgiVrvjmIYtEkFRGInA4VmJQ==
=Pp31
-----END PGP SIGNATURE-----


Thank you for your contribution to Debian.



Bug#865713: Please Start UTF-8 debian-policy Text Files with UTF-8 Signature

2017-06-25 Thread Paul Hardy
Paul,

On Sun, Jun 25, 2017 at 8:24 PM, Paul Wise  wrote:
> On Sun, 2017-06-25 at 16:07 -0700, Paul Hardy wrote:
>
>> Earlier today, I sent the GNU less maintainer a two-line patch to the
>> "charset.c" file after my original email to him.
>
> I'm no expert on the less source code, but it seems to me that it will
> also hide U+FEFF characters after the first one.

You are correct.

> I would suggest
> updating it so that U+FEFF is only hidden when it is the first UTF-8
> character in the file.

Well, U+FEFF has a dual personality.  It began its life as "ZERO WIDTH
NO-BREAK SPACE (ZWNBSP)".  Then that use became deprecated; new
documents were supposed to use U+2060 ("WORD JOINER") instead.  Of
course, there might still be legacy documents around, and less might
have to display them.

The alias for U+FEFF is "BYTE ORDER MARK (BOM)", and U+FEFF can go by
either of its names.

If used as a BOM, then U+FEFF is supposed to appear at the beginning
of a document according to The Unicode Standard, and discarding all
instances of U+FEFF does not accommodate that.

But the proper handling of U+FEFF as ZWNBSP is to print zero width,
and not cause a break between the surrounding characters.  You get
that alternate effect by not printing the character, which is what the
patch does.


However, in the HTML5 link I posted earlier, it mentions that a
compliant HTML5 web browser must detect the BOM anywhere within an
HTML document and if present, treat the web page as having UTF-8
encoding (or UTF-16, depending on the BOM format encountered).  They
mention the reason for this is to allow web servers to claim that they
are serving one type of content in a generic fashion, but individual
Unicode documents that are embedded in the HTML response still should
correctly display.  The presence of the BOM anywhere in the web page
is supposed to override the HTTP header charset and any META charset
tags, if present.  The latter requires interpreting U+FEFF anywhere in
the web page as a BOM.

This is a quote from p. 866 of the Unicode Standard 10.0.0 that goes
into some of the context-sensitive nature of U+FEFF (but note that
nobody is supposed to be using U+FEFF as a ZWNBSP, as of over a decade
ago):

"Where the byte order is explicitly specified, such as in UTF-16BE or
UTF-16LE, then all U+FEFF characters—even at the very beginning of the
text—are to be interpreted as zero width no-break spaces. Similarly,
where Unicode text has known byte order, initial U+FEFF characters are
not required, but for backward compatibility are to be interpreted as
zero width no-break spaces. For example, for strings in an API, the
memory architecture of the processor provides the explicit byte order.
For databases and similar structures, it is much more efficient and
robust to use a uniform byte order for the same field (if not the
entire database), thereby avoiding use of the byte order mark.

"Systems that use the byte order mark must recognize when an initial
U+FEFF signals the byte order. In those cases, it is not part of the
textual content and should be removed before processing, because
otherwise it may be mistaken for a legitimate zero width no-break
space. To represent an initial U+FEFF zero width no-break space in a
UTF-16 file, use U+FEFF twice in a row. The first one is a byte order
mark; the second one is the initial zero width no-break space. See
Table 23-6 for a summary of encoding scheme signatures."
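
As a small illustration of the "recognize and remove the initial signature" step in that passage, here is a sketch in Python that works on raw bytes. The function name is invented, and the table only covers the signatures exposed as constants by Python's standard `codecs` module.

```python
import codecs

# UTF-32LE must be tested before UTF-16LE, because the UTF-32LE
# signature (FF FE 00 00) begins with the UTF-16LE one (FF FE).
SIGNATURES = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def split_signature(data):
    """Return (encoding name or None, data without the signature)."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name, data[len(signature):]
    return None, data


# U+FEFF twice in a row in a big-endian UTF-16 file: only the first
# one is treated as the signature, the second stays in the content.
name, rest = split_signature(b"\xfe\xff\xfe\xff\x00A")
print(name, repr(rest.decode("utf-16-be")))
```

Only the initial signature is stripped here, which matches the doubled U+FEFF case described in the quoted text.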

Yet the only "processing" that less should be doing is outputting one
line at a time.  It does not figure out line breaks dynamically the
way a WYSIWYG word processor program would, for example.  If it did,
then this dual-personality of U+FEFF would probably require
introducing an additional state variable into less.

So I think recognizing and discarding all occurrences of the BOM in
less produces the desired effect in all cases.


Paul