Re: DEP-5: general file syntax

2010-08-25 Thread Lars Wirzenius
On Mon, 2010-08-23 at 14:50 +1200, Lars Wirzenius wrote:
 On Sun, 2010-08-22 at 15:24 -0700, Russ Allbery wrote:
  It's... okay.  It's a little strange, but I don't think it would be
  confusing since it is a summary of the license text in a machine-readable
  format, in essence.
 
 ACK, you and Ben have assured me that it is acceptable, and I've changed
 the spec draft. Latest diff attached.

There haven't been any further suggestions for this diff, so I'll apply it
to the bzr trunk and we can move to the next topic.


-- 
To UNSUBSCRIBE, email to debian-project-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1282767798.2242.22.ca...@havelock



Re: DEP-5: general file syntax

2010-08-22 Thread Charles Plessy
On Sat, Aug 21, 2010 at 09:09:28PM +1200, Lars Wirzenius wrote:
  
 +There are four kinds values for fields. Each field specifies which
 +kind is allowed.
 +
 +* Single-line values.
 +* White space separated lists.
 +* Line based lists.
 +* Free-form text formatted like package long descriptions.

Hi Lars,

I have mixed feelings about adding an extra level of complexity and introducing
a syntax for lists. I think that apart from the Files field, the DEP could
use mostly free-form values in the fields.

In particular for the Copyright field, I am of the opinion that it should be
free form and verbatim, preserving the newlines as they are in
debian/copyright. One minor problem is that Policy §5.1 specifies that if a
field value may not be wrapped, then this field is a single line of white space
separated data, and indeed there is no field in Policy's chapter 5 that is
purely free-form while preserving newlines characters. I have opened #593909 to
disambiguate this.

I also feel it is a contradiction to call ‘free-form’ some text that is formatted
according to some markup rules, even if they are simple. I propose to replace
instances like:

  Free-form text formatted like package long descriptions

by:

  Formatted text like package long descriptions


Here are additional comments between quotations of your patch.


 +`Copyright` can list many copyright statements, one per line.

For fine-grained descriptions, I would rather recommend writing an SPDX file in
cooperation with upstream, and using it to generate a DEP-5 template.


   * **`Upstream-Name`**
 * Optional
 * Single occurrence
 +   * Value: single line
 * Syntax: Single line (in most cases a single word),
   containing the name upstream uses for the software.

   * **`Upstream-Contact`**
 * Optional
 * Single occurrence
 +   * Value: line based list
 * Syntax: Line(s) containing the preferred address(es) to reach
   the upstream project. May be free-form text, but by convention
   will usually be written as a list of RFC2822 addresses or URIs.

The syntax of the Upstream-Contact field does not reflect the use intended by
the Perl packaging team, which is to match a Debian package with a CPAN
maintainer. The CPAN maintainer's email address is not necessarily the preferred
address to reach the upstream authors (for instance, a mailing list).

Since this thread is not about the Upstream-* fields, let's not go too much into
the details, except that in my opinion, ‘line based list’ is not the most
appropriate format for the Upstream-Contact field's value.

Another potential problem for both fields is that a Debian source package can
be composed of multiple unrelated upstream works.

All in all, I would recommend making these fields free-form. Packaging teams
that would like to use a more specialised syntax can add their own local
policies on top of the DEP.


 @@ -99,13 +132,15 @@
   * **`Source`**
 * Optional
 * Single occurrence
 +   * Value: single line
 * Syntax: One or more URIs, one per line, indicating the primary
   point of distribution of the software.

Since the syntax allows multiple URIs, and since the URIs may be long, I think
that allowing newlines in the field will make it more readable, for instance by
making it free-form (not formatted, see below).


   * **`License`**
 * Licensing terms for the files listed in **`Files`** field for this 
 section
 * Required
 +   * Value: free-form text, with special first line
 * Syntax:
   * First line: an abbreviated name for the license (see *Short names*
 section for a list of standard abbreviations). If empty, it is

If the extended description finally requires double space for verbatim display,
then how about calling the ‘special first line’ a synopsis, to be closer to the
vocabulary used in the specification of the Description field?


Have a nice day,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan


-- 
Archive: http://lists.debian.org/20100822071229.gb32...@merveille.plessy.net



Re: DEP-5: general file syntax

2010-08-22 Thread Lars Wirzenius
On Sat, 2010-08-21 at 22:30 -0700, Manoj Srivastava wrote:
 Can't we just fold long copyright header fields similarly?

The issue is that one Copyright field (or header) will contain many
copyright statements, and if we want to automatically parse those, we
need a way to see where a new one starts.

However, since there seem to be no current plans to parse copyright
statements out of the Copyright field, I think we can forget this issue,
at least for now, and leave it for later generations to solve, if they
start caring.


-- 
Archive: http://lists.debian.org/1282514128.12989.386.ca...@havelock



Re: DEP-5: general file syntax

2010-08-22 Thread Lars Wirzenius
On Sun, 2010-08-22 at 16:12 +0900, Charles Plessy wrote:
 I also feel it is a contradiction to call ‘free-form’ some text that is formatted
 according to some markup rules, even if they are simple. I propose to replace
 instances like:
 
   Free-form text formatted like package long descriptions
 
 by:
 
   Formatted text like package long descriptions

ACK, done.

 All in all, I would recommend making these fields free-form. Packaging teams
 that would like to use a more specialised syntax can add their own local
 policies on top of the DEP.

I disagree with this: I think a line-based list is perfectly fine for
Upstream-Contact. Does anyone else have an opinion?

  @@ -99,13 +132,15 @@
* **`Source`**
  * Optional
  * Single occurrence
  +   * Value: single line
  * Syntax: One or more URIs, one per line, indicating the primary
point of distribution of the software.
 
 Since the syntax allows multiple URIs, and since the URIs may be long, I think
 that allowing newlines in the field will make it more readable, for instance by
 making it free-form (not formatted, see below).

Actually, I think I made a mistake: I think Source should be a
line-based list. This will make it easier for parsers to extract the
URIs.

Splitting a URI to two physical lines seems to me a bad idea, and messes
up URI parsing in too many contexts. (The real fix is to get upstream to
not have excessively long URIs, but that's hard to fix.)

 If the extended description finally requires double space for verbatim display,
 then how about calling the ‘special first line’ a synopsis, to be closer to the
 vocabulary used in the specification of the Description field?

Could some English experts weigh in on whether the word synopsis is a good
way to describe the list of license short names?


-- 
Archive: http://lists.debian.org/1282514846.12989.396.ca...@havelock



Re: DEP-5: general file syntax

2010-08-22 Thread Lars Wirzenius
I've attached the current diff for the general file syntax changes.
=== modified file 'dep5.mdwn'
--- dep5.mdwn	2010-08-21 09:05:12 +
+++ dep5.mdwn	2010-08-22 22:08:51 +
@@ -76,7 +76,7 @@
 * Single-line values.
 * White space separated lists.
 * Line based lists.
-* Free-form text formatted like package long descriptions.
+* Text formatted like package long descriptions.
 
 A single-line value means that the whole value of a field must fit on
 a single line. For example, the `Format` field has a single line value
@@ -90,7 +90,7 @@
 Another kind of list value has one value per line. For example,
 `Copyright` can list many copyright statements, one per line.
 
-Free-form text is formatted the same as the long description in
+Formatted text fields use the same rules as the long description in
 a package's `Description` field, possibly also using the first
 field in a special way, like `Description` uses it for the
 short description.
@@ -132,14 +132,14 @@
  * **`Source`**
* Optional
* Single occurrence
-   * Value: single line
-   * Syntax: One or more URIs, one per line, indicating the primary
- point of distribution of the software.
+   * Value: line based list
+   * Syntax: One or more URIs, indicating the primary
+ points of distribution of the software.
 
  * **`Disclaimer`**
* Optional
* Single occurrence
-   * Value: free-form text, no special first line
+   * Value: formatted text, no special first line
* Syntax: On Debian systems, this field can be
  used in the case of non-free and contrib packages (see [Policy
  12.5](
@@ -183,7 +183,7 @@
  * **`License`**
* Licensing terms for the files listed in **`Files`** field for this section
* Required
-   * Value: free-form text, with special first line
+   * Value: formatted text, with special first line
* Syntax:
  * First line: an abbreviated name for the license (see *Short names*
section for a list of standard abbreviations). If empty, it is



Re: DEP-5: general file syntax

2010-08-22 Thread Russ Allbery
Lars Wirzenius l...@liw.fi writes:
 On Sun, 2010-08-22 at 16:12 +0900, Charles Plessy wrote:

 If the extended description finally requires double space for verbatim
 display, then how about calling the ‘special first line’ a synopsis, to
 be closer to the vocabulary used in the specification of the
 Description field?

 Could some English experts weigh in on whether the word synopsis is a
 good way to describe the list of license short names?

It's... okay.  It's a little strange, but I don't think it would be
confusing since it is a summary of the license text in a machine-readable
format, in essence.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


--
Archive: http://lists.debian.org/87mxsei8v1@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-22 Thread Lars Wirzenius
On Sun, 2010-08-22 at 15:24 -0700, Russ Allbery wrote:
 It's... okay.  It's a little strange, but I don't think it would be
 confusing since it is a summary of the license text in a machine-readable
 format, in essence.

ACK, you and Ben have assured me that it is acceptable, and I've changed
the spec draft. Latest diff attached.
=== modified file 'dep5.mdwn'
--- dep5.mdwn	2010-08-17 20:47:26 +
+++ dep5.mdwn	2010-08-23 02:47:59 +
@@ -3,7 +3,7 @@
 	Title: Machine-readable debian/copyright
 	DEP: 5
 	State: DRAFT
-	Date: 2010-08-18
+	Date: 2010-08-23
 	Drivers: Steve Langasek vor...@debian.org,
 	 Lars Wirzenius l...@liw.fi
 	URL: http://dep.debian.net/deps/dep5
@@ -70,6 +70,36 @@
 http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-controlsyntax
 for details.
 
+There are four kinds values for fields. Each field specifies which
+kind is allowed.
+
+* Single-line values.
+* White space separated lists.
+* Line based lists.
+* Text formatted like package long descriptions.
+
+A single-line value means that the whole value of a field must fit on
+a single line. For example, the `Format` field has a single line value
+specifying the version of the machine-readable format that is used.
+
+A white space separated list means that the field value may be on one
+line or many, but values in the list are separated by one or more
+white space characters (including space, TAB, and newline). For
+example, the `Files` field has a list of filename patterns.
+
+Another kind of list value has one value per line. For example,
+`Copyright` can list many copyright statements, one per line.
+
+Formatted text fields use the same rules as the long description in
+a package's `Description` field, possibly also using the first
+line as a synopsis, like `Description` uses it for the
+short description.
+See section 5.6.13, Description, at
+http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description
+for details.
+For example, `Disclaimer` has no special first line, whereas
+`License` does.
+
 # Implementation
 ## Sections
 ### Header Section (Once)
@@ -77,6 +107,7 @@
  * **`Format`**
* Required
* Single occurrence
+   * Value: single line
* Syntax: URI of the format specification, such as:
  * http://svn.debian.org/wsvn/dep/web/deps/dep5.mdwn?op=filerev=REVISION
  * Note that the unwieldy length of the URL should be solved in
@@ -86,12 +117,14 @@
  * **`Upstream-Name`**
* Optional
* Single occurrence
+   * Value: single line
* Syntax: Single line (in most cases a single word),
  containing the name upstream uses for the software.
 
  * **`Upstream-Contact`**
* Optional
* Single occurrence
+   * Value: line based list
* Syntax: Line(s) containing the preferred address(es) to reach 
  the upstream project. May be free-form text, but by convention
  will usually be written as a list of RFC2822 addresses or URIs.
@@ -99,13 +132,15 @@
  * **`Source`**
* Optional
* Single occurrence
-   * Syntax: One or more URIs, one per line, indicating the primary
- point of distribution of the software.
+   * Value: line based list
+   * Syntax: One or more URIs, indicating the primary
+ points of distribution of the software.
 
  * **`Disclaimer`**
* Optional
* Single occurrence
-   * Syntax: Free-form text. On Debian systems, this field can be
+   * Value: formatted text, no synopsis
+   * Syntax: On Debian systems, this field can be
  used in the case of non-free and contrib packages (see [Policy
  12.5](
  http://www.debian.org/doc/debian-policy/ch-docs.html#s-copyrightfile))
@@ -132,13 +167,15 @@
* Required for all but the first paragraph.
  If omitted from the first paragraph,
  this is equivalent to a value of '*'.
+   * Value: white space separated list
* Syntax: List of patterns indicating files covered by the license
  and copyright specified in this paragraph.  See File patterns below.
 
  * **`Copyright`**
* Required
-   * Syntax: one or more free-form copyright statement(s) that apply to
- the files matched by the above pattern.
+   * Value: line based list
+   * Syntax: one or more free-form copyright statement(s), one per line,
+ that apply to the files matched by the above pattern.
  * Example value: 2008, John Q. Holder john.hol...@example.org
  * If a work has no copyright holder (i.e., it is in the public
domain), that information should be recorded here.
@@ -146,6 +183,7 @@
  * **`License`**
* Licensing terms for the files listed in **`Files`** field for this section
* Required
+   * Value: formatted text, with synopsis
* Syntax:
  * First line: an abbreviated name for the license (see *Short names*
section for a list of standard abbreviations). If empty, it is



Re: DEP-5: general file syntax

2010-08-22 Thread Russ Allbery
Lars Wirzenius l...@liw.fi writes:
 On Sat, 2010-08-21 at 01:58 -0700, Russ Allbery wrote:

 I was assuming that's how we'd get to a 1.1 version.  I haven't read
 DEP-0 recently, though, so I guess I have a poor grasp of how this is
 supposed to work.  I'll go review it.  If we pick up the files in
 debian-policy, then wherever we publish them from should really publish
 the versions from the debian-policy package.

I've reviewed it and understand better now.  DEP material is supposed to be
incorporated into other documents where appropriate, rather than being
maintained as a DEP.  That was the bit that I was missing.

 I was assuming we'd have the current official version be in the
 debian-policy package, and published at http://www.debian.org/doc/ or
 http://www.debian.org/doc/debian-policy/ rather than on dep.debian.net.
 The final version of DEP-5 would have a pointer to the version in
 debian-policy. That's why I'm having such a bad time figuring out how
 to put the version in the URL.

Yeah, that makes more sense.

 However, it now strikes me that the filename in debian-policy can just
 have the version number. So the filename would start out as
 copyright-format-1.0.txt, and when it changes, the filename changes
 to copyright-format-1.1.txt. Does that sound reasonable?

 The URL for Format would then be something like

 http://www.debian.org/doc/packaging-manuals/copyright-format-1.0.html

 That's a bit long, perhaps.

We might want to talk to debian-www about possible alternatives at some
point (packaging-manuals is really long), but I think that's basically the
right idea.

Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

would fit easily in 80 columns, and I think we can probably generate the
right directory structure for that.  Something like:

Format: http://www.debian.org/doc/standards/copyright-format/1.0/

would be even shorter, of course, but I don't know if it's worth the
disruption.
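
To illustrate the versioned-URL idea, here is a minimal Python sketch of how a tool might read the format version back out of a Format value. It assumes the directory-style layout suggested above, with the version as the last path component. The helper name format_version is hypothetical, and no URL layout had actually been agreed on at this point in the thread.

```python
from urllib.parse import urlparse


def format_version(url: str) -> str:
    """Return the last path component of a Format URL.

    Assumption: the version is the final path component, as in the
    directory-style URLs floated in this message.
    """
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


if __name__ == "__main__":
    url = "http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"
    print(format_version(url))
```

With the older filename-style proposal (copyright-format-1.0.html) the same function would return the whole filename, so a real tool would need to know which layout is in use.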

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/877hjic81b@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-22 Thread Russ Allbery
Charles Plessy ple...@debian.org writes:

 I have mixed feelings about adding an extra level of complexity and
 introducing a syntax for lists. I think that apart from the Files field,
 the DEP could use mostly free-form values in the fields.

 In particular for the Copyright field, I am of the opinion that it
 should be free form and verbatim, preserving the newlines as they are in
 debian/copyright.

If we use the same format everywhere, as proposed elsewhere, people can
just add two spaces before the copyright statements to preserve the
formatting.
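
As a rough illustration of the Description rules being referred to (Policy section 5.6.13), the following Python sketch classifies the continuation lines of a field: a line holding a space and a full stop is a blank line, a line starting with two or more spaces is shown verbatim, and a line starting with a single space is ordinary paragraph text. The function name classify and the sample copyright holders are invented for this example.

```python
def classify(line: str) -> str:
    """Classify one continuation line using the Description long-format rules."""
    if line == " .":
        return "blank"
    if line.startswith("  "):
        return "verbatim"
    if line.startswith(" "):
        return "paragraph"
    raise ValueError("continuation lines must start with a space")


if __name__ == "__main__":
    # Hypothetical field body; the holders below are made-up examples.
    body = [
        "  Copyright 2008, Jane Example",
        "  Copyright 2009, John Example",
        " .",
        " Some wrapped explanatory text.",
    ]
    for line in body:
        print(classify(line), repr(line))
```

Under this reading, the two-space prefix keeps each copyright statement on its own displayed line without needing a separate list syntax.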

 One minor problem is that Policy §5.1 specifies that if a field value
 may not be wrapped, then this field is a single line of white space
 separated data, and indeed there is no field in Policy's chapter 5 that
 is purely free-form while preserving newlines characters. I have opened
 #593909 to disambiguate this.

The word wrapped in that context means folded, not wrapped in the
sense that you're thinking of.  But we'll talk about that more in that
bug.  Policy didn't use language very consistently.

   * **`Upstream-Name`**
 * Optional
 * Single occurrence
 +   * Value: single line
 * Syntax: Single line (in most cases a single word),
   containing the name upstream uses for the software.

   * **`Upstream-Contact`**
 * Optional
 * Single occurrence
 +   * Value: line based list
 * Syntax: Line(s) containing the preferred address(es) to reach
   the upstream project. May be free-form text, but by convention
   will usually be written as a list of RFC2822 addresses or URIs.

 The syntax of the Upstream-Contact field does not reflect the use
 intended by the Perl packaging team, which is to match a Debian package
 with a CPAN maintainer. The CPAN maintainer's email address is not
 necessarily the preferred address to reach the upstream authors (for
 instance, a mailing list).

I don't understand what you're concerned with here.  It seems to match
what they're doing now to me.

 Since this thread is not about the Upstream-* fields, let's not go too
 much into the details, except that in my opinion, ‘line based list’ is not
 the most appropriate format for the Upstream-Contact field's value.

I like line-based list for this.

 @@ -99,13 +132,15 @@
   * **`Source`**
 * Optional
 * Single occurrence
 +   * Value: single line
 * Syntax: One or more URIs, one per line, indicating the primary
   point of distribution of the software.

 Since the syntax allows multiple URIs, and since the URIs may be long, I
 think that allowing newlines in the field will make it more
 readable, for instance by making it free-form (not formatted, see
 below).

I agree with Lars on this.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


--
Archive: http://lists.debian.org/8739u6c7up@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-21 Thread Lars Wirzenius
On Fri, 2010-08-20 at 17:05 -0700, Russ Allbery wrote:
 I think a better approach would be to, once the document has settled down,
 publish it with a version number and give that version of the document a
 permanent URL.  So, for instance, we would publish DEP-5 1.0 and give it a
 URL something like http://dep.debian.net/DEP-5/1.0 at which it would
 always be found.  If we publish a new version of the document, the new
 version would be put at http://dep.debian.net/DEP-5/1.1, but the old
 version wouldn't be changed.

DEPs are not supposed to change after they're approved, so it should be
a new DEP rather than DEP-5/1.1, but that's a trivial detail.

How would that tie in with updating it via the normal policy process? I
thought we'd keep the file in the debian-policy package for future
updates.

 Note that you should say that explicitly, since in the control file format
 not every field is multi-line (the default is that a field may not be
 multi-line).

ACK.

 I think we could merge all three of these into the same case by using the
 Description syntax, with the note that blank lines don't really make sense
 in some fields.  (So, I guess, merge them into two cases.)

I'm OK with saying that multiline fields should use the Description
markup, especially noting Charles's point about only using the long
description part, when appropriate. This simplifies things quite a lot.
I'll word a concrete patch to propose.


-- 
Archive: http://lists.debian.org/1282379544.12989.309.ca...@havelock



Re: DEP-5: general file syntax

2010-08-21 Thread Russ Allbery
Lars Wirzenius l...@liw.fi writes:
 On Fri, 2010-08-20 at 17:05 -0700, Russ Allbery wrote:

 I think a better approach would be to, once the document has settled
 down, publish it with a version number and give that version of the
 document a permanent URL.  So, for instance, we would publish DEP-5 1.0
 and give it a URL something like http://dep.debian.net/DEP-5/1.0 at
 which it would always be found.  If we publish a new version of the
 document, the new version would be put at
 http://dep.debian.net/DEP-5/1.1, but the old version wouldn't be
 changed.

 DEPs are not supposed to change after they're approved, so it should be
 a new DEP rather than DEP-5/1.1, but that's a trivial detail.

 How would that tie in with updating it via the normal policy process? I
 thought we'd keep the file in the debian-policy package for future
 updates.

I was assuming that's how we'd get to a 1.1 version.  I haven't read DEP-0
recently, though, so I guess I have a poor grasp of how this is supposed
to work.  I'll go review it.  If we pick up the files in debian-policy,
then wherever we publish them from should really publish the versions from
the debian-policy package.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/8739u8wdei@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-21 Thread Lars Wirzenius
On Sat, 2010-08-21 at 20:32 +1200, Lars Wirzenius wrote:
 I'm OK with saying that multiline fields should use the Description
 markup, especially noting Charles's point about only using the long
 description part, when appropriate. This simplifies things quite a lot.
 I'll word a concrete patch to propose.

While wording this, I realized that we have more cases: Files has a list
of values (currently comma-separated, but I propose to make it
white-space separated), and Copyright and maybe other fields have a list
of values one per line. I took the liberty of taking this into account. 

The relevant new text in the file syntax section:

There are four kinds values for fields. Each field specifies which
kind is allowed.

* Single-line values.
* White space separated lists.
* Line based lists.
* Free-form text formatted like package long descriptions.

A single-line value means that the whole value of a field must fit on
a single line. For example, the `Format` field has a single line value
specifying the version of the machine-readable format that is used.

A white space separated list means that the field value may be on one
line or many, but values in the list are separated by one or more
white space characters (including space, TAB, and newline). For
example, the `Files` field has a list of filename patterns.

Another kind of list value has one value per line. For example,
`Copyright` can list many copyright statements, one per line.

Free-form text is formatted the same as the long description in
a package's `Description` field, possibly also using the first
field in a special way, like `Description` uses it for the
short description.
See section 5.6.13, Description, at
http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description
for details.
For example, `Disclaimer` has no special first line, whereas
`License` does.
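
As a minimal sketch of how a parser might treat the four kinds of values described above, the following Python functions handle one kind each. All function names are hypothetical, the handling of the special first line is an assumption about how `License` would be read, and none of this is part of the draft itself.

```python
from __future__ import annotations


def parse_single_line(value: str) -> str:
    """Single-line value: the whole value must fit on one line."""
    stripped = value.strip()
    if "\n" in stripped:
        raise ValueError("single-line value spans several lines")
    return stripped


def parse_whitespace_list(value: str) -> list[str]:
    """White space separated list: split on runs of space, TAB, or newline."""
    return value.split()


def parse_line_list(value: str) -> list[str]:
    """Line based list: one item per non-empty line."""
    return [line.strip() for line in value.splitlines() if line.strip()]


def parse_formatted_text(value: str, special_first_line: bool) -> tuple[str, str]:
    """Formatted text: split off the first line when the field treats it specially."""
    if not special_first_line:
        return "", value
    first, _, rest = value.partition("\n")
    return first.strip(), rest


if __name__ == "__main__":
    print(parse_whitespace_list("*.c\n src/*.h\tdebian/*"))
    print(parse_line_list("2008, First Holder\n 2009, Second Holder\n"))
    print(parse_formatted_text("GPL-2+\n License text follows.", True))
```

Each field definition would then name one of these parsers, which is roughly what the "Value:" lines in the attached diff do.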

I'm attaching the exact diff, which lists the type of value for each
field.

Comments?
=== modified file 'dep5.mdwn'
--- dep5.mdwn	2010-08-17 20:47:26 +
+++ dep5.mdwn	2010-08-21 09:04:06 +
@@ -70,6 +70,36 @@
 http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-controlsyntax
 for details.
 
+There are four kinds values for fields. Each field specifies which
+kind is allowed.
+
+* Single-line values.
+* White space separated lists.
+* Line based lists.
+* Free-form text formatted like package long descriptions.
+
+A single-line value means that the whole value of a field must fit on
+a single line. For example, the `Format` field has a single line value
+specifying the version of the machine-readable format that is used.
+
+A white space separated list means that the field value may be on one
+line or many, but values in the list are separated by one or more
+white space characters (including space, TAB, and newline). For
+example, the `Files` field has a list of filename patterns.
+
+Another kind of list value has one value per line. For example,
+`Copyright` can list many copyright statements, one per line.
+
+Free-form text is formatted the same as the long description in
+a package's `Description` field, possibly also using the first
+field in a special way, like `Description` uses it for the
+short description.
+See section 5.6.13, Description, at
+http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description
+for details.
+For example, `Disclaimer` has no special first line, whereas
+`License` does.
+
 # Implementation
 ## Sections
 ### Header Section (Once)
@@ -77,6 +107,7 @@
  * **`Format`**
* Required
* Single occurrence
+   * Value: single line
* Syntax: URI of the format specification, such as:
  * http://svn.debian.org/wsvn/dep/web/deps/dep5.mdwn?op=filerev=REVISION
  * Note that the unwieldy length of the URL should be solved in
@@ -86,12 +117,14 @@
  * **`Upstream-Name`**
* Optional
* Single occurrence
+   * Value: single line
* Syntax: Single line (in most cases a single word),
  containing the name upstream uses for the software.
 
  * **`Upstream-Contact`**
* Optional
* Single occurrence
+   * Value: line based list
* Syntax: Line(s) containing the preferred address(es) to reach 
  the upstream project. May be free-form text, but by convention
  will usually be written as a list of RFC2822 addresses or URIs.
@@ -99,13 +132,15 @@
  * **`Source`**
* Optional
* Single occurrence
+   * Value: single line
* Syntax: One or more URIs, one per line, indicating the primary
  point of distribution of the software.
 
  * **`Disclaimer`**
* Optional
* Single occurrence
-   * Syntax: Free-form text. On Debian systems, this field can be
+   * Value: free-form text, no special first line
+   * Syntax: On Debian systems, this field 

Re: DEP-5: general file syntax

2010-08-21 Thread Russ Allbery
Lars Wirzenius l...@liw.fi writes:

 While wording this, I realized that we have more cases: Files has a list
 of values (currently comma-separated, but I propose to make it
 white-space separated), and Copyright and maybe other fields have a list
 of values one per line. I took the liberty of taking this into account. 

This generally looks good to me, but:

 Another kind of list value has one value per line. For example,
 `Copyright` can list many copyright statements, one per line.

What happens when the copyright statement is longer than a line?  I have a
bunch of those, such as:

  Copyright 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
  Board of Trustees, Leland Stanford Jr. University
  Copyright 1991, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
  2002, 2003 by The Internet Software Consortium and Rich Salz
  Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
  2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
  Free Software Foundation, Inc.

Note that the FSF lawyer told them not to use ranges in copyright dates,
so most GNU projects don't.  From standards:

 Do not abbreviate the year list using a range; for instance, do not
  write `1996--1998'; instead, write `1996, 1997, 1998'.
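
The problem can be shown with a short Python sketch: splitting the example above strictly one item per line yields seven "statements" instead of three. The second function is only a guess at a workaround, assuming every statement begins with the word "Copyright"; nothing in the draft requires that, and the function names are hypothetical.

```python
VALUE = """\
Copyright 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
Board of Trustees, Leland Stanford Jr. University
Copyright 1991, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
2002, 2003 by The Internet Software Consortium and Rich Salz
Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
Free Software Foundation, Inc.
"""


def split_per_line(value: str) -> list:
    """Line based list as drafted: every non-empty line is one item."""
    return [line.strip() for line in value.splitlines() if line.strip()]


def group_by_keyword(lines: list) -> list:
    """Heuristic (an assumption): a new statement starts at 'Copyright'."""
    statements = []
    for line in lines:
        if line.startswith("Copyright") or not statements:
            statements.append(line)
        else:
            statements[-1] += " " + line
    return statements


if __name__ == "__main__":
    lines = split_per_line(VALUE)
    print(len(lines), "items when split per line")
    print(len(group_by_keyword(lines)), "statements when grouped by keyword")
```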

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/87tymouy1k@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-21 Thread Lars Wirzenius
On Sat, 2010-08-21 at 01:58 -0700, Russ Allbery wrote:
  How would that tie in with updating it via the normal policy process? I
  thought we'd keep the file in the debian-policy package for future
  updates.
 
 I was assuming that's how we'd get to a 1.1 version.  I haven't read DEP-0
 recently, though, so I guess I have a poor grasp of how this is supposed
 to work.  I'll go review it.  If we pick up the files in debian-policy,
 then wherever we publish them from should really publish the versions from
 the debian-policy package.

I was assuming we'd have the current official version be in the
debian-policy package, and published at http://www.debian.org/doc/ or
http://www.debian.org/doc/debian-policy/ rather than on dep.debian.net.
The final version of DEP-5 would have a pointer to the version in
debian-policy. That's why I'm having such a bad time figuring out how
to put the version in the URL. 

However, it now strikes me that the filename in debian-policy can just
have the version number. So the filename would start out as
copyright-format-1.0.txt, and when it changes, the filename changes
to copyright-format-1.1.txt. Does that sound reasonable?

The URL for Format would then be something like

http://www.debian.org/doc/packaging-manuals/copyright-format-1.0.html

That's a bit long, perhaps.

Having an updated DEP-5 be generated from debian-policy on
dep.debian.net, when DEP is not used to update it, seems unpleasantly
complicated to me.


-- 
Archive: http://lists.debian.org/1282382283.12989.320.ca...@havelock



Re: DEP-5: general file syntax

2010-08-21 Thread Lars Wirzenius
On la, 2010-08-21 at 02:15 -0700, Russ Allbery wrote:
 What happens when the copyright statement is longer than a line?  I have a
 bunch of those, such as:

Good point. I see at least the following possible solutions:

* Keep one line per copyright statement, but make the lines be long.
(This is what we have now.)
* Have one copyright statement per Copyright field, and have multiple
instances of the field.
* Just make it all be free-form text, disabling any automatic parsing of
the Copyright field.

 Note that the FSF lawyer told them not to use ranges in copyright 
 dates,

For actively maintained software, this is going to get really hard to
read in a millennium or two.


-- 
Archive: http://lists.debian.org/1282382607.12989.324.ca...@havelock



Re: DEP-5: general file syntax

2010-08-21 Thread Stephen Leake
Lars Wirzenius l...@liw.fi writes:

 Files has a list
 of values (currently comma-separated, but I propose to make it
 white-space separated), 

File names can have spaces. Not common, but possible. I guess such file
names would need to be quoted?
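
One way to picture the quoting question is a minimal Python sketch. It assumes, purely for illustration, that a white-space separated Files value would follow shell-style quoting; the draft discussed in this thread does not define any quoting rule, and `split_files_value` is a hypothetical helper name.

```python
import shlex


def split_files_value(value):
    """Split a white-space separated Files value, honouring shell-style quotes.

    Assumption: shell-style quoting is only one possible convention and is
    not taken from the DEP-5 draft.
    """
    return shlex.split(value)


print(split_files_value('src/*.c "docs/user guide.txt" debian/*'))
```

Without some rule of this kind, a name such as `docs/user guide.txt` would be read as two separate patterns.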

-- 
-- Stephe


-- 
Archive: http://lists.debian.org/8262z4404s@stephe-leake.org



Re: DEP-5: general file syntax

2010-08-21 Thread Ben Finney
Lars Wirzenius l...@liw.fi writes:

 On la, 2010-08-21 at 02:15 -0700, Russ Allbery wrote:
  What happens when the copyright statement is longer than a line?
[…]

 Good point. I see at least the following possible solutions:
[…]

 * Have one copyright statement per Copyright field, and have multiple
 instances of the field.

This is my preference, and what I've been doing in my packages.

 For actively maintained software, this is going to get really hard to
 read in a millennium or two.

Let's solve that before the millennium is out, by reforming
international copyright law to drastically reduce copyright duration :-)

-- 
 \  “If society were bound to invent technologies which could only |
  `\   be used entirely within the law, then we would still be sitting |
_o__)   in caves sucking our feet.” —Gene Kan, creator of Gnutella |
Ben Finney


-- 
Archive: http://lists.debian.org/87wrrkqg0q@benfinney.id.au



Re: DEP-5: general file syntax

2010-08-21 Thread Russ Allbery
Ben Finney ben+deb...@benfinney.id.au writes:
 Lars Wirzenius l...@liw.fi writes:

 * Have one copyright statement per Copyright field, and have multiple
 instances of the field.

 This is my preference, and what I've been doing in my packages.

Unfortunately, this creates real challenges for parsers.  I've written a
few RFC 5322 parsers, particularly for Usenet, and allowing repetition of
headers always causes headaches in representation.  You end up having to
add another layer of data structure, with corresponding changes to
everything that consumes information from the parser, if you don't want to
throw away information.  It's also a divergence from the Debian control
file format, which allows only one instance of a field per stanza,
probably for much the same reason.
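
The representation problem described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (one field per line, no continuation lines); `parse_single` and `parse_repeated` are hypothetical names, not part of any existing parser.

```python
from collections import defaultdict


def parse_single(lines):
    """One value per field name: a later duplicate replaces the earlier one."""
    fields = {}
    for line in lines:
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def parse_repeated(lines):
    """Keep every instance: each field name maps to a list of values."""
    fields = defaultdict(list)
    for line in lines:
        name, _, value = line.partition(":")
        fields[name.strip()].append(value.strip())
    return dict(fields)


stanza = [
    "Files: *",
    "Copyright: 2009 Frank Foo",
    "Copyright: 2008, 2010 Barry Baz",
]

# A plain string: the first Copyright statement has been thrown away.
print(parse_single(stanza)["Copyright"])
# A list: nothing is lost, but every consumer now has to handle lists.
print(parse_repeated(stanza)["Copyright"])
```

The second form keeps all the information, at the cost of the extra layer of data structure that every consumer of the parser then has to deal with.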

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/878w3zyfxe@windlord.stanford.edu



DEP-5: Structure for multiple copyright statements (was: DEP-5: general file syntax)

2010-08-21 Thread Ben Finney
Russ Allbery r...@debian.org writes:

 Ben Finney ben+deb...@benfinney.id.au writes:
  Lars Wirzenius l...@liw.fi writes:

  * Have one copyright statement per Copyright field, and have multiple
  instances of the field.

  This is my preference, and what I've been doing in my packages.

 Unfortunately, this creates real challenges for parsers. […]

Okay, I can see that. Thanks for explaining.

Lars Wirzenius l...@liw.fi writes:

 On la, 2010-08-21 at 02:15 -0700, Russ Allbery wrote:
  What happens when the copyright statement is longer than a line? I
  have a bunch of those, such as:

 Good point. I see at least the following possible solutions:

 * Keep one line per copyright statement, but make the lines be long.
 (This is what we have now.)

Could we take advantage of the natural “©” marker to indicate each
copyright statement? Specification off the top of my head, that
hopefully shows what I mean:

The ‘Copyright’ field must contain one or more copyright statements.
Each virtual line of the field value is a single copyright
statement. Each copyright statement must begin with the “©”
character (U+00A9 COPYRIGHT SIGN) at the start of a virtual line.
Each physical line which does not begin with “©” is a continuation
of the previous virtual line in the same field.

Contrived examples:

Files: *
Copyright: © 2009 Frank Foo fr...@example.com

Files: doc/*
Copyright:
© 1995, 1997, 1998, 1999, 2002, 2003, 2004, 2006, 2009 Beatrice
Bar beatr...@example.org
© 2008, 2010 Barry Baz ba...@example.com
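
The "virtual line" rule proposed above could be read by a parser roughly as follows. This is a minimal Python sketch of the proposal as worded in this message, not an agreed format; `split_copyright_statements` is a hypothetical name, and raising an error for a value that does not start with the sign is an assumption.

```python
COPYRIGHT_SIGN = "\u00a9"  # U+00A9 COPYRIGHT SIGN


def split_copyright_statements(value):
    """Group the physical lines of a Copyright value into virtual lines.

    A physical line starting with the copyright sign opens a new statement;
    any other non-empty line continues the previous statement.
    """
    statements = []
    for physical in value.splitlines():
        text = physical.strip()
        if not text:
            continue
        if text.startswith(COPYRIGHT_SIGN):
            statements.append(text)
        elif statements:
            statements[-1] += " " + text
        else:
            # Assumption: a value that does not begin with the sign is an error.
            raise ValueError("first statement must start with the copyright sign")
    return statements


value = """
© 1995, 1997, 1998, 1999, 2002, 2003, 2004, 2006, 2009 Beatrice
Bar
© 2008, 2010 Barry Baz
"""
for statement in split_copyright_statements(value):
    print(statement)
```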

-- 
 \   “The long-term solution to mountains of waste is not more |
  `\  landfill sites but fewer shopping centres.” —Clive Hamilton, |
_o__)_Affluenza_, 2005 |
Ben Finney


-- 
Archive: http://lists.debian.org/87sk27r5gs.fsf...@benfinney.id.au



Re: DEP-5: Structure for multiple copyright statements (was: DEP-5: general file syntax)

2010-08-21 Thread Lars Wirzenius
On su, 2010-08-22 at 08:00 +1000, Ben Finney wrote:
 Could we take advantage of the natural “©” marker to indicate each
 copyright statement?

That's an interesting idea, but would people in general find it easy or
difficult to write that character? (I'd have to copy-paste it, for
instance, since my keymap does not seem to have a binding for it.)

The word Copyright or the ASCII-art (C) might be substituted.

 Copyright:
 © 1995, 1997, 1998, 1999, 2002, 2003, 2004, 2006, 2009 Beatrice
 Bar beatr...@example.org
 © 2008, 2010 Barry Baz ba...@example.com

What do others think?

If I was writing a parser, I'd rather have the simplicity of long lines,
but then I'm lazy. If I was writing DEP-5 files, I am not sure what I
would prefer, but I know I would hate filling out the copyright fields
in any case, since it's boring, repetitive work.


-- 
Archive: http://lists.debian.org/1282433814.12989.361.ca...@havelock



Re: DEP-5: Structure for multiple copyright statements (was: DEP-5: general file syntax)

2010-08-21 Thread Roger Leigh
On Sun, Aug 22, 2010 at 11:36:54AM +1200, Lars Wirzenius wrote:
 On su, 2010-08-22 at 08:00 +1000, Ben Finney wrote:
  Could we take advantage of the natural “©” marker to indicate each
  copyright statement?
 
 That's an interesting idea, but would people in general find it easy or
 difficult to write that character? (I'd have to copy-paste it, for
 instance, since my keymap does not seem to have a binding for it.)

AltGr-c-o or RightAlt-c-0 or some other similar compose sequence?

For most common characters not on my keyboard man groff_char is my
usual copy-paste source :)

 The word Copyright or the ASCII-art (C) might be substituted.

The ASCII-art has no legal meaning AFAIK, it's only the word
copyright or the symbol © that can really be used.


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.




Re: DEP-5: Structure for multiple copyright statements (was: DEP-5: general file syntax)

2010-08-21 Thread Steve Langasek
On Sun, Aug 22, 2010 at 11:36:54AM +1200, Lars Wirzenius wrote:
 On su, 2010-08-22 at 08:00 +1000, Ben Finney wrote:
  Could we take advantage of the natural “©” marker to indicate each
  copyright statement?

 That's an interesting idea, but would people in general find it easy or
 difficult to write that character? (I'd have to copy-paste it, for
 instance, since my keymap does not seem to have a binding for it.)

I think requiring symbols not on the standard keyboard is an undue burden.

 The word Copyright or the ASCII-art (C) might be substituted.

So the copyright field will include the word copyright?  That seems
annoyingly redundant.

TTBOMK the 'Copyright' as the field name already fulfills the legal
requirements and I don't think we should encode any redundancy with this.

I also agree with Russ that we want this to be as simple as possible to
write, and there isn't really a strong use case for imposing additional
structure on this field.  We might recommend best practices, but if no one
is actually going to be parsing it, let's not impose the overhead on
authoring.

(Extracting dates should work without any additional structure, btw; just
look for the sequence of 4 digits, possibly followed by a dash and a
subsequent 2-4 digits?  :)
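
The date extraction suggested above fits in one regular expression. The following Python sketch is an illustration only; `extract_years` is a hypothetical name, and the pattern will also match any other stand-alone four-digit number in the field, so it is a heuristic rather than a real parser.

```python
import re

# Four digits, optionally followed by a dash and two to four more digits.
YEAR_OR_RANGE = re.compile(r"\b\d{4}(?:-\d{2,4})?\b")


def extract_years(text):
    """Return every year or year range found in free-form copyright text."""
    return YEAR_OR_RANGE.findall(text)


print(extract_years("1995, 1997-99, 2002-2004 Beatrice Bar"))
```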

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: DEP-5: general file syntax

2010-08-21 Thread Manoj Srivastava
On Sat, Aug 21 2010, Russ Allbery wrote:

 Ben Finney ben+deb...@benfinney.id.au writes:
 Lars Wirzenius l...@liw.fi writes:

 * Have one copyright statement per Copyright field, and have multiple
 instances of the field.

 This is my preference, and what I've been doing in my packages.

 Unfortunately, this creates real challenges for parsers.  I've written a
 few RFC 5322 parsers, particularly for Usenet, and allowing repetition of
 headers always causes headaches in representation.  You end up having to
 add another layer of data structure, with corresponding changes to
 everything that consumes information from the parser, if you don't want to
 throw away information.  It's also a divergence from the Debian control
 file format, which allows only one instance of a field per stanza,
 probably for much the same reason.

If I recall correctly, 2822 allows for header field folding:
--8---cut here---start-8---
2.2. Header Fields

   Header fields are lines composed of a field name, followed by a colon
   (:), followed by a field body, and terminated by CRLF.  A field
   name MUST be composed of printable US-ASCII characters (i.e.,
   characters that have values between 33 and 126, inclusive), except
   colon.  A field body may be composed of any US-ASCII characters,
   except for CR and LF.  However, a field body may contain CRLF when
   used in header folding and unfolding as described in section
   2.2.3.  All field bodies MUST conform to the syntax described in
   sections 3 and 4 of this standard.
2.2.3. Long Header Fields

   Each header field is logically a single line of characters comprising
   the field name, the colon, and the field body.  For convenience
   however, and to deal with the 998/78 character limitations per line,
   the field body portion of a header field can be split into a multiple
   line representation; this is called folding.  The general rule is
   that wherever this standard allows for folding white space (not
   simply WSP characters), a CRLF may be inserted before any WSP.  For
   example, the header field:

   Subject: This is a test

   can be represented as:

   Subject: This
    is a test

   Note: Though structured field bodies are defined in such a way that
   folding can take place between many of the lexical tokens (and even
   within some of the lexical tokens), folding SHOULD be limited to
   placing the CRLF at higher-level syntactic breaks.  For instance, if
   a field body is defined as comma-separated values, it is recommended
   that folding occur after the comma separating the structured items in
   preference to other places where the field could be folded, even if
   it is allowed elsewhere.

   The process of moving from this folded multiple-line representation
   of a header field to its single line representation is called
   unfolding. Unfolding is accomplished by simply removing any CRLF
   that is immediately followed by WSP.  Each header field should be
   treated in its unfolded form for further syntactic and semantic
   evaluation.
--8---cut here---end---8---

Can't we just fold long copyright header fields similarly?
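
The unfolding rule quoted above ("removing any CRLF that is immediately followed by WSP") is short enough to sketch in Python. The sketch also accepts a bare LF, which is an assumption made here because Debian control files do not use CRLF line endings; `unfold` is an illustrative name.

```python
import re


def unfold(header_text):
    """Remove every line break that is immediately followed by a space or tab."""
    return re.sub(r"\r?\n(?=[ \t])", "", header_text)


print(repr(unfold("Subject: This\r\n is a test\r\n")))
print(repr(unfold("Copyright: 2009 Frank\n Foo\n")))
```

Only the line break is removed; the white space that follows it stays in the value, so the words on either side of the fold remain separated.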

manoj

-- 
Houston, Tranquillity Base here.  The Eagle has landed. Neil Armstrong
Manoj Srivastava sriva...@acm.org http://www.golden-gryphon.com/  
4096R/C5779A1C E37E 5EC5 2A01 DA25 AD20  05B6 CF48 9438 C577 9A1C


-- 
Archive: http://lists.debian.org/871v9rz02a@anzu.internal.golden-gryphon.com



Re: DEP-5: general file syntax

2010-08-20 Thread Russ Allbery
Lars Wirzenius l...@liw.fi writes:

 * We refer to Policy 5.1 by section number, section title, and URL. I
 don't think the policy version is necessary: if they make incompatible
 changes, then all Debian control files will potentially break, and DEP-5
 copyright files are no exception. Including the 5.1 section verbatim in
 DEP-5, on the other hand, results in duplication, which is likely to
 result in divergence between the policy and DEP-5.

 * We add to DEP-5 details of how to handle values of multiline fields.
 We can discuss exact wording of that later (see below), if we can get
 consensus on the overall topic of file syntax.

 * Once DEP-5 is accepted, we move it into the debian-policy package; it
 will then be maintained via the normal policy amendment process on the
 debian-policy mailing list. If section 5.1 changes (including just
 number), the DEP-5 spec shall be changed at the same time.

I agree with all of this.

 * We specify the debian/control Format: field to include an identifier
 that is not dependent on the DEP-5 URL. Currently, the spec includes a
 URL to the specific version of itself; this is obviously problematic. I
 suggest we change it by having two words in the Format: value: an
 unversioned URL to the spec (currently to the DEP site, but later to the
 debian-policy site), and a date.

In the XML standards world, and everything touched by it, all standards
documents are strongly encouraged to have a URI that embeds the version of
the standard.  I don't like the idea of separating the version from the
URL for a few reasons: it adds additional steps to the process of getting
exactly the corresponding version of the specification, two pieces of data
tend to get disconnected or not stored together, and it doesn't match how
standards are usually handled these days.

I think a better approach would be to, once the document has settled down,
publish it with a version number and give that version of the document a
permanent URL.  So, for instance, we would publish DEP-5 1.0 and give it a
URL something like http://dep.debian.net/DEP-5/1.0 at which it would
always be found.  If we publish a new version of the document, the new
version would be put at http://dep.debian.net/DEP-5/1.1, but the old
version wouldn't be changed.
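
With the version embedded in the URL, a consumer can recover both the specification and its version from a single string. A minimal Python sketch, assuming the URL layout used as an example above (that layout is only a suggestion in this thread, and `split_format_url` is a hypothetical name):

```python
import re

# Assumed layout: <base>/<major>.<minor>, with an optional trailing slash.
VERSIONED_URL = re.compile(r"^(?P<base>.+)/(?P<version>\d+\.\d+)/?$")


def split_format_url(url):
    """Split a versioned specification URL into (base URL, version)."""
    match = VERSIONED_URL.match(url)
    if match is None:
        raise ValueError("no version component in %r" % url)
    return match.group("base"), match.group("version")


print(split_format_url("http://dep.debian.net/DEP-5/1.0"))
```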

See, for example, how the desktop entry specification is handled at:

http://standards.freedesktop.org/desktop-entry-spec/

I definitely agree with not using the VCS revision number of the document
as its version, since that number can increase frequently for changes that
aren't normative and that no consumer of the standard needs to care about.

 On to fields with multiline values. Well, every field can have
 multi-line values, but the generic rules suffice for most of them.

Note that you should say that explicitly, since in the control file format
not every field is multi-line (the default is that a field may not be
multi-line).

 For License, the text in the field (except the first line) should
 probably not be word-wrapped, newlines are significant, and definitely
 empty lines need to be handled in some way. The reason word-wrapping
 shouldn't happen is that in many cases upstream licenses use ad hoc
 plain text formatting conventions, such as bulleted lists, and any word
 wrapping will mess that up. There is already rough consensus on how to
 handle empty line markup (read: same as Description in debian/control).

I think there's something to be said for letting the License body text
wrap by default and requiring an extra space if it's intended to not wrap.
It means something else for people to be aware of and be sure to add two
spaces for many licenses, but it means that the various free-form text
that is used by many licenses can still be nicely wrapped in various
presentations.  For example, the typical MIT/X Consortium license doesn't
need to have significant newlines.

That also lets the rule with License be consistent with the rule for other
fields, by requiring two leading spaces for any literal text.  It also
means that we would be using essentially the same formatting conventions
as Description (Policy 5.6.13).

 So there are three cases:

 * License: newlines are significant, no word-wrapping, desc-escape is
 used.
 * Disclaimer (and Comment in the future): newlines are not significant,
 word-wrapping is OK, desc-escape is used.
 * Everything else: newlines are not significant, word-wrapping is OK,
 desc-escape is not used. Normal RFC822-style handling of line
 continuations applies.

I think we could merge all three of these into the same case by using the
Description syntax, with the note that blank lines don't really make sense
in some fields.  (So, I guess, merge them into two cases.)

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/



Re: DEP-5: general file syntax

2010-08-20 Thread gregor herrmann
On Fri, 20 Aug 2010 17:05:23 -0700, Russ Allbery wrote:

 That also lets the rule with License be consistent with the rule for other
 fields, by requiring two leading spaces for any literal text.  It also
 means that we would be using essentially the same formatting conventions
 as Description (Policy 5.6.13).

I agree, saying same formatting as in Description in debian/control
makes it easier than having to remember different syntaxes.
 
Cheers,
gregor
 
-- 
 .''`.   http://info.comodo.priv.at/ -- GPG key IDs: 0x8649AA06, 0x00F3CFE4
 : :' :  Debian GNU/Linux user, admin,  developer - http://www.debian.org/
 `. `'   Member of VIBE!AT  SPI, fellow of Free Software Foundation Europe
   `-NP: Bruce Springsteen  The E Street Band: Thunder Road




Re: DEP-5: general file syntax

2010-08-20 Thread Charles Plessy
Le Sat, Aug 21, 2010 at 02:30:40AM +0200, gregor herrmann a écrit :
 On Fri, 20 Aug 2010 17:05:23 -0700, Russ Allbery wrote:
 
  That also lets the rule with License be consistent with the rule for other
  fields, by requiring two leading spaces for any literal text.  It also
  means that we would be using essentially the same formatting conventions
  as Description (Policy 5.6.13).
 
 I agree, saying same formatting as in Description in debian/control
 makes it easier than having to remember different syntaxes.

I am a little bit worried that it may be unwelcome to have to take extra care
of indentation when using DEP-5. Nevertheless, if we go that way, I think that
we need to make the explanation about the format more precise, since it may be
ambiguous whether the above means that the Comment and Disclaimer fields will
have a single line synopsis or not. So I would propose either ‘same syntax as
the Description field of Debian control files’ if they will have a synopsis or
‘same formatting as the long description of the Description field in Debian
control files’ if only the Files field will contain a synopsis.

I have another comment on details of the DEP's syntax, about the order of
paragraphs. Policy's §5.1 does not specify that the order of paragraphs is
important, while this is crucial information in DEP-5. If this is not an
omission in §5.1, I recommend that this additional requirement be mentioned in
the DEP.

Have a nice week-end,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan


-- 
Archive: http://lists.debian.org/20100821014420.gb...@merveille.plessy.net



Re: DEP-5: general file syntax

2010-08-20 Thread Russ Allbery
Charles Plessy ple...@debian.org writes:

 I have another comment on details of the DEP's syntax, about the order
 of paragraphs. Policy's §5.1 does not specify that the order of
 paragraphs is important, while this is crucial information in
 DEP-5. If this is not an omission in §5.1, I recommend that this
 additional requirement be mentioned in the DEP.

Order of paragraphs is significant in debian/control files.  It never
occurred to me that anyone would think that the order of paragraphs wasn't
significant.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


--
Archive: http://lists.debian.org/87lj80vhhx@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-18 Thread Philip Hands
On Wed, 18 Aug 2010 09:29:33 +1200, Lars Wirzenius l...@liw.fi wrote:
 For simplicity, I will introduce a new term, desc-escape. This refers
 to the escaping of content similar to the way Description does it in
 debian/control: each line is prefixed with a space, except empty lines
 are replaced with a space and period. The Policy's specification is not
 usable for this, I think, because it goes much further than what DEP-5
 needs.
 
 Note that I've dropped the possibility of prefixing escaped lines with a
 TAB character. It is a needless difference from Description, and would
 complicate parsers.
 
 So there are three cases:
 
 * License: newlines are significant, no word-wrapping, desc-escape is
 used.

We could always use the same convention as in Description: 

  http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description

where a single space prefix indicates wrappable text, and two spaces
indicates verbatim.

That also deals with the case of the original text containing
a line with a single full-stop, as that could be included by prefixing it
with two spaces.

Mechanical conversions could just add two spaces by default, and if
anyone can be bothered, paragraphs that would be fine word-wrapped could
then be back-indented one space by hand.
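
The mechanical conversion described above can be sketched in Python as follows. It assumes the Description convention as summarised in this message (one leading space for wrappable text, two for verbatim text, and a space plus full stop for a blank line); `to_verbatim_field_body` is a hypothetical name.

```python
def to_verbatim_field_body(text):
    """Convert plain text into Description-style lines, all marked verbatim."""
    out = []
    for line in text.splitlines():
        if line.strip() == "":
            out.append(" .")         # blank-line marker: space plus full stop
        else:
            out.append("  " + line)  # two spaces: display verbatim, do not wrap
    return "\n".join(out)


license_text = "Permission is granted.\n\n.\n"
print(to_verbatim_field_body(license_text))
```

A line consisting of a single full stop comes out as two spaces plus the full stop, so it stays distinguishable from the blank-line marker.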

Cheers, Phil.
-- 
|)|  Philip Hands [+44 (0)20 8530 9560]http://www.hands.com/
|-|  HANDS.COM Ltd.http://www.uk.debian.org/
|(|  10 Onslow Gardens, South Woodford, London  E18 1NE  ENGLAND




DEP-5: general file syntax

2010-08-17 Thread Lars Wirzenius
There would seem to be at least a rough consensus that DEP-5 should
follow Policy 5.1 on control file syntax. The open question is how to
specify that: it is my understanding that most people favor just
referring to the relevant Policy section and not duplicate things in
DEP-5, but since that is also my strong preference, I want to be
careful.

Here's my current suggestion:

* We refer to Policy 5.1 by section number, section title, and URL. I
don't think the policy version is necessary: if they make incompatible
changes, then all Debian control files will potentially break, and DEP-5
copyright files are no exception. Including the 5.1 section verbatim in
DEP-5, on the other hand, results in duplication, which is likely to
result in divergence between the policy and DEP-5.

* We add to DEP-5 details of how to handle values of multiline fields.
We can discuss exact wording of that later (see below), if we can get
consensus on the overall topic of file syntax.

* Once DEP-5 is accepted, we move it into the debian-policy package; it
will then be maintained via the normal policy amendment process on the
debian-policy mailing list. If section 5.1 changes (including just
number), the DEP-5 spec shall be changed at the same time.

* We specify the debian/control Format: field to include an identifier
that is not dependent on the DEP-5 URL. Currently, the spec includes a
URL to the specific version of itself; this is obviously problematic. I
suggest we change it by having two words in the Format: value: an
unversioned URL to the spec (currently to the DEP site, but later to the
debian-policy site), and a date.

Comments on the above? The rest of this e-mail proposes a specific way
of handling multiline values.

 - - -

On to fields with multiline values. Well, every field can have
multi-line values, but the generic rules suffice for most of them. There
are three important details here: for specific fields, are newlines
significant, can word-wrapping happen, and how empty lines are handled.

For License, the text in the field (except the first line) should
probably not be word-wrapped, newlines are significant, and definitely
empty lines need to be handled in some way. The reason word-wrapping
shouldn't happen is that in many cases upstream licenses use ad hoc
plain text formatting conventions, such as bulleted lists, and any word
wrapping will mess that up. There is already rough consensus on how to
handle empty line markup (read: same as Description in debian/control).

For Disclaimer, and Comment if we add that, it might be helpful to have
empty lines, but word-wrapping is definitely needed. Newlines are not
significant.

For simplicity, I will introduce a new term, desc-escape. This refers
to the escaping of content similar to the way Description does it in
debian/control: each line is prefixed with a space, except empty lines
are replaced with a space and period. The Policy's specification is not
usable for this, I think, because it goes much further than what DEP-5
needs.

Note that I've dropped the possibility of prefixing escaped lines with a
TAB character. It is a needless difference from Description, and would
complicate parsers.

So there are three cases:

* License: newlines are significant, no word-wrapping, desc-escape is
used.
* Disclaimer (and Comment in the future): newlines are not significant,
word-wrapping is OK, desc-escape is used.
* Everything else: newlines are not significant, word-wrapping is OK,
desc-escape is not used. Normal RFC822-style handling of line
continuations applies.

In other words, for Disclaimer, a formatter would un-escape (remove
leading space, replace lines with just period with empty ones), then
split the resulting text into paragraphs at empty lines, and format
those paragraphs in whatever manner it sees fit.
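
The desc-escape rule and the formatter steps just described can be sketched in Python. This is an illustration of the proposal as worded in this message, not an agreed specification; the function names and the wrapping width are assumptions.

```python
import textwrap


def desc_escape(text):
    """Prefix each line with a space; replace empty lines with ' .'."""
    return "\n".join(" ." if line == "" else " " + line
                     for line in text.splitlines())


def desc_unescape(body):
    """Remove one leading space; turn lines holding only '.' into empty lines."""
    lines = []
    for line in body.splitlines():
        if line.startswith(" "):
            line = line[1:]
        lines.append("" if line == "." else line)
    return "\n".join(lines)


def format_disclaimer(body, width=60):
    """Un-escape, split into paragraphs at empty lines, and re-wrap each one."""
    text = desc_unescape(body)
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return "\n\n".join(textwrap.fill(" ".join(p.split()), width)
                       for p in paragraphs)


body = " First paragraph,\n wrapped by hand.\n .\n Second paragraph."
print(format_disclaimer(body))
```

Under this rule an original line that contains only a full stop cannot be told apart from an escaped empty line once the text has been un-escaped.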

I echo Charles's suggestion that we specify the way escaping is done in
the section that describes the overall syntax, and then specify for each
field if they use desc-escape or not, whether newlines are significant
or not, whether the content can be word-wrapped or not.

Comments on this part? I haven't got specific wording changes to suggest
yet, I want to know if this approach is acceptable first, before we
spend time on wording details.


-- 
Archive: http://lists.debian.org/1282080573.12989.179.ca...@havelock



Re: DEP-5: general file syntax

2010-08-17 Thread Charles Plessy
Le Wed, Aug 18, 2010 at 09:29:33AM +1200, Lars Wirzenius a écrit :
 
 For Disclaimer, and Comment if we add that, it might be helpful to have
 empty lines, but word-wrapping is definitely needed. Newlines are not
 significant.

Hi Lars,

Some debian/copyright files contain extracts of correspondence between the
maintainer and an upstream person, for instance when the status of some files
needs to be clarified.

Would they be removed, transferred to a non-parsable section of the file (with
a mechanism to be determined, for instance similar to DEP-3), or would they be
suitable for comment fields (if we introduce them)?

In the case they are put in a comment field, ignoring newlines is likely to
make them difficult to read.

Have a nice day,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan


-- 
Archive: http://lists.debian.org/20100818012041.gb5...@merveille.plessy.net



Re: DEP-5: general file syntax

2010-08-17 Thread Russ Allbery
Charles Plessy ple...@debian.org writes:
 Le Wed, Aug 18, 2010 at 09:29:33AM +1200, Lars Wirzenius a écrit :

 For Disclaimer, and Comment if we add that, it might be helpful to have
 empty lines, but word-wrapping is definitely needed. Newlines are not
 significant.

 Some debian/copyright files contain extracts of correspondence between
 the maintainer and an upstream person, for instance when the status of
 some files needs to be clarified.

 Would they be removed, transferred to a non-parsable section of the file
 (with a mechanism to be determined, for instance similar to DEP-3), or
 would they be suitable for comment fields (if we introduce them)?

 In the case they are put in a comment field, ignoring newlines is likely
 to make them difficult to read.

I wonder if we should have some terminator for the machine-readable
portion of debian/copyright, below which is free-form supporting material
like complete e-mail exchanges and whatnot.  That seems to me like the
best way of handling the problem of attaching a complete e-mail exchange.

Those exchanges aren't the actual license or copyright information, which
can still be stated in a structured form.  They're usually just defenses
of why the claimed license information is what it is (when it may, for
example, contradict or supplement information included in the source
files).

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


--
Archive: http://lists.debian.org/87iq38btm0@windlord.stanford.edu



Re: DEP-5: general file syntax

2010-08-17 Thread Lars Wirzenius
On ti, 2010-08-17 at 18:24 -0700, Russ Allbery wrote:
 Those exchanges aren't the actual license or copyright information, which
 can still be stated in a structured form.  They're usually just defenses
 of why the claimed license information is what it is (when it may, for
 example, contradict or supplement information included in the source
 files).

Hmm. If the e-mails (or whatever) modify or clarify the license, should
not the e-mails be considered part of the license information?

License: other
 This software is released under the GPLv2 blahblah.
 .
 From: Upstream Author aut...@upstream.example.com
 Message-Id: loof.li...@upstream.example.com
 Date: Mon, Apr 01 2010 04:01:00 +0401
 Subject: License clarification
 .
 When I say GPL I actually mean LGPL, sorry about that.

If the e-mail is just a clarification to the license and does not modify
it, then I guess License is not the right place. Rather than munge it
into Comment, I guess we need a new field. However, how often do these
things happen? If it is very rarely, we could just live with appending
them to License.

Having part of the file be non-machine-readable might be an option, but
I have the feeling that for large debian/copyright files, it'd be easier
to have these e-mails near the paragraphs that concern them, otherwise
it'll get too difficult to keep track of things. So a structured
approach would be my preference here.
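
A minimal sketch of how a parser might read a paragraph like the License example above, assuming the usual control-file conventions: continuation lines start with one space, and a continuation line holding only "." stands for an empty line. The function name parse_paragraph and the SAMPLE text are illustrative only and are not part of any DEP-5 draft.

```python
def parse_paragraph(text):
    """Parse one control-file style paragraph into a dict of field values."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            # Blank lines separate paragraphs; this sketch simply skips them.
            continue
        if line.startswith(" "):
            if current is None:
                raise ValueError("continuation line before any field")
            body = line[1:]
            # A continuation line holding only "." encodes an empty line.
            fields[current].append("" if body == "." else body)
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = [value.strip()]
    return {name: "\n".join(parts) for name, parts in fields.items()}


# Shortened version of the example above (illustrative data only).
SAMPLE = """\
License: other
 This software is released under the GPLv2 blahblah.
 .
 Subject: License clarification
 .
 When I say GPL I actually mean LGPL, sorry about that.
"""

parsed = parse_paragraph(SAMPLE)
print(parsed["License"])
```

With this reading the embedded e-mail stays attached to the paragraph it concerns, which is the property argued for above.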


Archive: http://lists.debian.org/1282108302.12989.199.ca...@havelock



Re: DEP-5: general file syntax

2010-08-17 Thread Craig Small
On Tue, Aug 17, 2010 at 06:24:39PM -0700, Russ Allbery wrote:
 I wonder if we should have some terminator for the machine-readable
 portion of debian/copyright, below which is free-form supporting material

That would be the simplest way, a 'stop reading here' line for the
parsers.  That way anything that is supplementary can go there.
It probably needs to be documented that nothing that places extra
restrictions or conditions can go there though.
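
A rough sketch of what such a 'stop reading here' line could look like to a parser. The marker string below is purely hypothetical, since the thread does not settle on any syntax, and split_copyright is an illustrative name rather than an existing tool.

```python
# Hypothetical marker; no terminator syntax has been agreed on in this thread.
TERMINATOR = "--- end of machine-readable part ---"


def split_copyright(text, terminator=TERMINATOR):
    """Return (machine_readable, supplementary), split at the first terminator line."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == terminator:
            return "\n".join(lines[:index]), "\n".join(lines[index + 1:])
    # No terminator found: treat the whole file as machine-readable.
    return "\n".join(lines), ""


example = (
    "Files: *\n"
    "License: LGPL\n"
    + TERMINATOR + "\n"
    "Full e-mail exchange goes here.\n"
)
structured, extra = split_copyright(example)
print(structured)
print(extra)
```

Everything after the marker is handed back untouched, so a parser never has to interpret the supplementary material.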

 - Craig
-- 
Craig Small  GnuPG:1C1B D893 1418 2AF4 45EE  95CB C76C E5AC 12CA DFA5
http://www.enc.com.au/ csmall at : enc.com.au
http://www.debian.org/  Debian GNU/Linux, software should be Free 


Archive: http://lists.debian.org/20100818051302.ga11...@enc.com.au



Re: DEP-5: general file syntax

2010-08-17 Thread Don Armstrong
On Wed, 18 Aug 2010, Lars Wirzenius wrote:
 If the e-mail is just a clarification to the license and does not
 modify it, then I guess License is not the right place. Rather than
 munge it into Comment, I guess we need a new field. However, how
 often do these things happen? If it is very rarely, we could just
 live with appending them to License.

In this case, I suspect that you have a change to the License (it's
really LGPL) and you have an e-mail as evidence. So something like:

 Files: *
 License: LGPL
 Evidence: 
  From: Upstream Author aut...@upstream.example.com
  Message-Id: loof.li...@upstream.example.com
  Date: Mon, Apr 01 2010 04:01:00 +0401
  Subject: License clarification
  .
  When I say GPL I actually mean LGPL, sorry about that.

may be an option. [I'm thinking that in the normal case, Evidence
would be assumed to be headers in the files themselves or a COPYING,
COPYRIGHT, LICENSE, or similar file in the source repository, so you
wouldn't include it.]

It may also be important to be able to later verify PGP signatures or
similar, so perhaps some simple transform to e-mail messages would be
acceptable? Maybe something like:

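s/^(\.+)$/.$1/;   # lines consisting only of dots gain one more dot
s/^$/./;          # empty lines become a single dot
s/^/ /;           # every line is indented by one space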

with the obvious reversal of:

s/^ //;            # drop the one-space indent
s/^\.(\.*)$/$1/;   # dots-only lines lose one dot, so "." becomes empty

with non-important header removal allowed. (We probably only need
From, Message-Id, Date, Subject, Content-Type?)
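
The substitutions above can be sketched in Python to check that they round-trip. This is only an illustration: it assumes the second substitution is meant to turn an empty line into a single dot, and the function names are made up for this example.

```python
import re


def encode_line(line):
    """Escape one message line for use as a field continuation line."""
    # Lines consisting only of dots gain one extra dot.
    line = re.sub(r"^(\.+)$", r".\1", line)
    # Empty lines become a single dot.
    if line == "":
        line = "."
    # Every line is indented by one space.
    return " " + line


def decode_line(line):
    """Reverse encode_line."""
    # Drop the one-space indent.
    line = re.sub(r"^ ", "", line)
    # Dots-only lines lose one dot, so "." turns back into an empty line.
    return re.sub(r"^\.(\.*)$", r"\1", line)


def encode_message(text):
    return "\n".join(encode_line(line) for line in text.split("\n"))


def decode_message(text):
    return "\n".join(decode_line(line) for line in text.split("\n"))


message = "Subject: License clarification\n\nWhen I say GPL I actually mean LGPL.\n.\n.."
encoded = encode_message(message)
print(encoded)
print(decode_message(encoded) == message)
```

Because the escaping is reversible line by line, the original message body can be recovered exactly, which is what later verification of a signature would need.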


Don Armstrong

-- 
Do not handicap your children by making their lives easy.
 -- Robert Heinlein _Time Enough For Love_ p251

http://www.donarmstrong.com  http://rzlab.ucr.edu


Archive: http://lists.debian.org/20100818054628.gt17...@teltox.donarmstrong.com