Re: DEP-5 and files with white spaces

2012-02-11 Thread Jakub Wilk

* Charles Plessy ple...@debian.org, 2012-02-11, 12:06:
 For the encoding, this is not a problem limited to the machine-readable
 format. If the Debian copyright file is in an encoding A, and one file
 has a name or is in a directory that has a name in an encoding B that
 cannot be represented in A, and there is no way to escape this problem
 with wildcards, then the file or directory cannot be described by its
 name regardless of the syntax followed by the copyright file.


Not true. You can say “all the files are…” in a plain English copyright 
file. You can't say that in a DEP-5 copyright file.


--
Jakub Wilk


--
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120211102324.ga2...@jwilk.net



Re: DEP-5 and files with white spaces

2012-02-11 Thread Adam Borowski
On Fri, Feb 10, 2012 at 11:50:10AM +0100, Adam Borowski wrote:
 On Thu, Feb 09, 2012 at 11:05:25PM -0800, Russ Allbery wrote:
  Note that another case that I don't think has been discussed, but which is
  probably more common than embedded quote marks, is a filename that's
  invalid UTF-8 (straight ISO 8859-1, for example).
 
 Do these even happen anymore?  Looking at binary packages, I see just one[1]
 violation: lletters-media, a package not updated in 6 years.

It looks like there is not a single such filename in all sources, anywhere
in unstable (for x in *.tar.*z*; do tar tf "$x"; done).  Even lletters-media
ships its data with English names and links them at build.
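
A rough Python equivalent of that check is sketched below. It is only an illustration, not the command that was actually run: the helper names is_valid_utf8 and non_utf8_members are hypothetical, and the archive name in the usage comment is a placeholder. It relies on tarfile's "surrogateescape" error handler, which keeps undecodable bytes recoverable in the member names.

```python
import tarfile


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_members(tar) -> list:
    """List the raw member names of an open tar archive that are not valid UTF-8.

    Assumes the archive was opened with encoding="utf-8" and
    errors="surrogateescape", so invalid bytes survive as lone surrogates
    and can be turned back into the original bytes here.
    """
    bad = []
    for name in tar.getnames():
        raw = name.encode("utf-8", "surrogateescape")
        if not is_valid_utf8(raw):
            bad.append(raw)
    return bad


# Usage sketch (the file name is a placeholder):
# with tarfile.open("package_1.0.orig.tar.gz",
#                   encoding="utf-8", errors="surrogateescape") as tar:
#     print(non_utf8_members(tar))
```

A name in straight ISO 8859-1 such as b"caf\xe9.txt" would be reported, while its UTF-8 form would not.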


-- 
// If you believe in so-called intellectual property, please immediately
// cease using counterfeit alphabets.  Instead, contact the nearest temple
// of Amon, whose priests will provide you with scribal services for all
// your writing needs, for Reasonable and Non-Discriminatory prices.




Re: DEP-5 and files with white spaces

2012-02-11 Thread Russ Allbery
Adam Borowski kilob...@angband.pl writes:

 It looks like there is not a single such filename in all sources, anywhere
 in unstable (for x in *.tar.*z*; do tar tf "$x"; done).  Even lletters-media
 ships its data with English names and links them at build.

Oh, cool, thank you for checking!  I think we can safely not care about
this case, then.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-10 Thread Adam Borowski
On Thu, Feb 09, 2012 at 11:05:25PM -0800, Russ Allbery wrote:
 Note that another case that I don't think has been discussed, but which is
 probably more common than embedded quote marks, is a filename that's
 invalid UTF-8 (straight ISO 8859-1, for example).

Do these even happen anymore?  Looking at binary packages, I see just one[1]
violation: lletters-media, a package not updated in 6 years.

Of course, it's source packages that matter, can't check them that easily.
Could someone who has all the sources downloaded and unpacked check?  My box
that has them decided to not heed wake-on-lan.

It might be easier to work around the issue by forbidding such filenames.
You can expect build failures when trying to access those files, and they
can't even be unpacked on some filesystems.


[1]. Using dists/unstable/Contents-amd64 for a list of candidates to check,
I might have missed something; I can check .debs in the evening.

-- 
// If you believe in so-called intellectual property, please immediately
// cease using counterfeit alphabets.  Instead, contact the nearest temple
// of Amon, whose priests will provide you with scribal services for all
// your writing needs, for Reasonable and Non-Discriminatory prices.




Re: DEP-5 and files with white spaces

2012-02-10 Thread Wouter Verhelst
On Thu, Feb 09, 2012 at 11:05:25PM -0800, Russ Allbery wrote:
 Wouter Verhelst wou...@debian.org writes:
  On Thu, Feb 09, 2012 at 11:01:00AM +0100, Goswin von Brederlow wrote:
 
  Not a solution on its own.
 
  Actually, I think it's a perfectly workable solution.
 
  What about a file named foo bar' baz?
  
  For a worst case what about files with newlines?
 
  Unless these are part of a test suite on filenames, slap upstream and
  tell them to use sane filenames?
 
 We're basically retracing the previous discussion, and rediscovering why
 we left the spec alone.
 
 Formal correctness says that any possible file name should be
 representable, at which point filenames with newlines or embedded quote
 characters are a theoretical possibility and we would want some sort of
 robust solution for all those cases.

Right.

 If we *aren't* going to try to represent absolutely any possible legal
 filename exactly, then we're debating over how much of a technical
 correctness hole we want to leave, not over whether we're going to have one.
 At that point, I think it's reasonable to ask if we care about going to the
 work of expanding the spec to handle filenames with spaces in them without
 wildcards, as even that is not a horribly common case.  (I realize it's more
 common for upstreams who develop on Windows or Mac OS.)

Indeed, so the question is how far will we go in this.

I think having filenames with spaces in them is common enough that it
warrants extending the spec. I do not think that filenames containing
weird characters that have special meaning to a shell are common enough
to warrant extending the spec.

On a personal note, one of my upstreams (beid) has a fairly complex
licensing situation and has files in the tarball with spaces in the
names... I suppose it would be to my benefit that this were allowed, but
I guess it's also fair to say I may be biased.

[...]
-- 
The volume of a pizza of thickness a and radius z can be described by
the following formula:

pi zz a





Re: DEP-5 and files with white spaces

2012-02-10 Thread Jakub Wilk

* Russ Allbery r...@debian.org, 2012-02-09, 23:05:
 Note that another case that I don't think has been discussed, but which
 is probably more common than embedded quote marks, is a filename that's
 invalid UTF-8 (straight ISO 8859-1, for example). That's also not
 representable in our typical debian/copyright file,


The specification currently reads: “Only the wildcards * and ? apply; 
the former matches any number of characters (including none), the latter 
a single character.”


But characters of which encoding? If UTF-8, then for some filenames, no
wildcard exists that would match them.
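
The question can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not a reading of the specification: the function names matches_as_utf8 and matches_as_bytes are hypothetical, and fnmatch is used only as a stand-in matcher (it also accepts [...] classes, which the specification does not).

```python
import fnmatch


def matches_as_utf8(pattern: str, raw_name: bytes) -> bool:
    """Match on characters: a name that is not valid UTF-8 cannot be matched at all."""
    try:
        name = raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return fnmatch.fnmatchcase(name, pattern)


def matches_as_bytes(pattern: str, raw_name: bytes) -> bool:
    """Match on bytes: '?' then consumes exactly one byte, not one character."""
    return fnmatch.fnmatchcase(raw_name, pattern.encode("utf-8"))


latin1_name = "caf\u00e9.txt".encode("iso-8859-1")  # b'caf\xe9.txt', invalid as UTF-8
utf8_name = "caf\u00e9.txt".encode("utf-8")         # b'caf\xc3\xa9.txt'

print(matches_as_utf8("caf?.txt", utf8_name))     # character-based, valid name
print(matches_as_utf8("*", latin1_name))          # character-based, invalid name
print(matches_as_bytes("caf?.txt", latin1_name))  # byte-based, one byte for the e-acute
print(matches_as_bytes("caf?.txt", utf8_name))    # byte-based, two bytes for the e-acute
```

Under the character interpretation not even a bare * reaches the ISO 8859-1 name; under the byte interpretation it can be matched, but ? no longer means one character for multi-byte names.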


--
Jakub Wilk





Re: DEP-5 and files with white spaces

2012-02-10 Thread Russ Allbery
Jakub Wilk jw...@debian.org writes:
 * Russ Allbery r...@debian.org, 2012-02-09, 23:05:

 Note that another case that I don't think has been discussed, but which
 is probably more common than embedded quote marks, is a filename that's
 invalid UTF-8 (straight ISO 8859-1, for example). That's also not
 representable in our typical debian/copyright file,

 The specification currently reads: “Only the wildcards * and ? apply; the
 former matches any number of characters (including none), the latter a
 single character.”

 But characters of which encoding? If UTF-8, then for some filenames, no
 wildcard exists that would match them.

Indeed.  That's arguably a worse hole in the specification than whitespace
handling, since it may not be possible to use wildcards to work around it.
I'm not sure if we need to say something about that explicitly, or if it's
rare enough that we don't have to care.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-10 Thread Paul Wise
On Fri, Feb 10, 2012 at 6:50 PM, Adam Borowski wrote:

 Of course, it's source packages that matter, can't check them that easily.
 Could someone who has all the sources downloaded and unpacked check?  My box
 that has them decided to not heed wake-on-lan.

Just look at the Contents-source files:

ftp://ftp.debian.org/debian/dists/sid/main/Contents-source.gz

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Re: DEP-5 and files with white spaces

2012-02-10 Thread Jakub Wilk

* Paul Wise p...@debian.org, 2012-02-11, 08:35:
  Of course, it's source packages that matter, can't check them that
  easily. Could someone who has all the sources downloaded and unpacked
  check?  My box that has them decided to not heed wake-on-lan.

 Just look at the Contents-source files:

 ftp://ftp.debian.org/debian/dists/sid/main/Contents-source.gz


This file is entirely UTF-8. I believe that the ftp-masters recode non-UTF-8
filenames to UTF-8 in some unspecified way.


--
Jakub Wilk





Re: DEP-5 and files with white spaces

2012-02-10 Thread Charles Plessy
Le Fri, Feb 10, 2012 at 10:05:55AM -0800, Russ Allbery a écrit :
 Jakub Wilk jw...@debian.org writes:
  * Russ Allbery r...@debian.org, 2012-02-09, 23:05:
 
  Note that another case that I don't think has been discussed, but which
  is probably more common than embedded quote marks, is a filename that's
  invalid UTF-8 (straight ISO 8859-1, for example). That's also not
  representable in our typical debian/copyright file,
 
  The specification currently reads: “Only the wildcards * and ? apply; the
  former matches any number of characters (including none), the latter a
  single character.”
 
  But characters of which encoding? If UTF-8, then for some filenames, no
  wildcard exists that would match them.
 
 Indeed.  That's arguably a worse hole in the specification than whitespace
 handling, since it may not be possible to use wildcards to work around it.
 I'm not sure if we need to say something about that explicitly, or if it's
 rare enough that we don't have to care.

Dear all,

How about documenting these facts in the DEP and going ahead with the current
syntax?

+  <section id="limitations">
+    <title>Limitations</title>
+    <para>
+      The pattern syntax can not distinguish files whose names differ only by
+      whitespaces, nor files that have the same name but are in paths that only
+      differ by whitespaces.
+    </para>
+    <para>
+      It is not possible to represent a file name or a path using an encoding
+      that is not compatible with Unicode.
+    </para>
+  </section>

For the white spaces, we have been saying for a year that we will not make
normative changes unless necessary, and the possibilities discussed are all
theoretical.  I think that extensions are welcome for next versions of the
format, but the possibility of breaking existing files with a normative change
is no less likely than the possibility of encountering a package where two
files have different licenses and names that differ only by whitespace, and
where the upstream author would either refuse or not be available to correct
that problem.

For the encoding, this is not a problem limited to the machine-readable format.
If the Debian copyright file is in an encoding A, and one file has a name or is
in a directory that has a name in an encoding B that cannot be represented in
A, and there is no way to escape this problem with wildcards, then the file or
directory cannot be described by its name regardless of the syntax followed by
the copyright file.

It is good to care about these cases, and I propose to do so by documenting
them in version 1.0 and keeping bugs open that may be solved in a future
version if there is a solution that satisfies both the developers who write the
files and the developers who write the parsers.

Have a nice week-end,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan





Re: DEP-5 and files with white spaces

2012-02-09 Thread Goswin von Brederlow
Benjamin Drung bdr...@debian.org writes:

 Am Mittwoch, den 01.02.2012, 14:20 -0800 schrieb Russ Allbery:
 Benjamin Drung bdr...@debian.org writes:
 
  DEP-5 is nice, but how can I specify a license for a file with white
  spaces? For example you want to specify that the file foo/file one.bar
  is licensed under ISC, but foo/file_one.bar is licensed under GPL. How
  can you do that?
 
 No, that distinction isn't representable.  There was some earlier
 discussion about that, and the conclusion reached was that it was a rare
 case that wasn't worth making the syntax more complicated (after various
 more complicated syntaxes were tossed around without making anyone very
 happy).

 Is it too complex to have a syntax that is similar to what the shell
 does? Two solutions pop into my mind. Please let me know why these are
 not used. You can point me to previous discussions.

 Idea 1: Use an escape sequence for specifying a whitespace (e.g. "\ " for
 a space).

 Idea 2: Allow quotation marks.

Not a solution on its own. What about a file named foo bar' baz?

For a worst case what about files with newlines?

MfG
Goswin





Re: DEP-5 and files with white spaces

2012-02-09 Thread Andrew Shadura
Hello,

On Thu, 09 Feb 2012 11:01:00 +0100
Goswin von Brederlow goswin-...@web.de wrote:

  Idea 2: Allow quotation marks.

 Not a solution on its own. What about a file named foo bar' baz?

 For a worst case what about files with newlines?

You can double the delimiter to embed it into a string, like this:
"foo bar' baz" or 'foo bar'' baz'.
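
As a sketch of how a parser could read such a field, the Python function below splits a value on whitespace and treats single-quoted tokens specially, with a doubled '' standing for one literal quote. This quoting syntax is hypothetical and is not part of DEP-5; the function name split_files_field is made up for the illustration, and only the single-quote form is handled.

```python
def split_files_field(value: str) -> list:
    """Split a space-separated list, honouring 'single quotes' with '' as an embedded quote."""
    tokens = []
    i, n = 0, len(value)
    while i < n:
        if value[i].isspace():
            i += 1
            continue
        if value[i] == "'":
            # Quoted token: read up to the closing quote.
            i += 1
            chars = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted name")
                if value[i] == "'":
                    if i + 1 < n and value[i + 1] == "'":
                        chars.append("'")  # doubled delimiter is a literal quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                chars.append(value[i])
                i += 1
            tokens.append("".join(chars))
        else:
            # Bare token: runs to the next whitespace.
            start = i
            while i < n and not value[i].isspace():
                i += 1
            tokens.append(value[start:i])
    return tokens


print(split_files_field("src/*.c 'foo bar'' baz' docs/?.txt"))
```

Even this small tokenizer needs an error path for unterminated quotes, which gives a feel for the extra parser complexity being weighed in this thread.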

-- 
WBR, Andrew




Re: DEP-5 and files with white spaces

2012-02-09 Thread Jon Dowland
On Thu, Feb 09, 2012 at 07:38:29PM +0300, Andrew Shadura wrote:
 Hello,
 
 On Thu, 09 Feb 2012 11:01:00 +0100
 Goswin von Brederlow goswin-...@web.de wrote:
 
   Idea 2: Allow quotation marks.
 
  Not a solution on its own. What about a file named foo bar' baz?
 
  For a worst case what about files with newlines?
 
 You can double the delimiter to embed it into a string, like this:
 "foo bar' baz" or 'foo bar'' baz'.

Urgh. Or do 1. as well as 2. and have escape sequences. Also urgh.

It's a theoretical problem and Jakub has shown that there is a workable
solution with the current syntax.

He's also shown that we couldn't handle distinguishing "foo bar" from
"foo\tbar".  That is, surely, also entirely theoretical.


-- 
Jon Dowland





Re: DEP-5 and files with white spaces

2012-02-09 Thread Wouter Verhelst
On Thu, Feb 09, 2012 at 11:01:00AM +0100, Goswin von Brederlow wrote:
  Idea 2: Allow quotation marks.
 
 Not a solution on its own.

Actually, I think it's a perfectly workable solution.

 What about a file named foo bar' baz?
 
 For a worst case what about files with newlines?

Unless these are part of a test suite on filenames, slap upstream and
tell them to use sane filenames?

(and if they *are* part of a test suite on file names, they need not
have content, therefore need not appear in a copyright file, and can be
trivially created at run time with 'touch')

-- 
The volume of a pizza of thickness a and radius z can be described by
the following formula:

pi zz a





Re: DEP-5 and files with white spaces

2012-02-09 Thread Russ Allbery
Wouter Verhelst wou...@debian.org writes:
 On Thu, Feb 09, 2012 at 11:01:00AM +0100, Goswin von Brederlow wrote:

 Not a solution on its own.

 Actually, I think it's a perfectly workable solution.

 What about a file named foo bar' baz?
 
 For a worst case what about files with newlines?

 Unless these are part of a test suite on filenames, slap upstream and
 tell them to use sane filenames?

We're basically retracing the previous discussion, and rediscovering why
we left the spec alone.

Formal correctness says that any possible file name should be
representable, at which point filenames with newlines or embedded quote
characters are a theoretical possibility and we would want some sort of
robust solution for all those cases.  If we *aren't* going to try to
represent absolutely any possible legal filename exactly, then we're
debating over how much of a technical correctness hole we want to leave,
not over whether we're going to have one.  At that point, I think it's
reasonable to ask if we care about going to the work of expanding the spec
to handle filenames with spaces in them without wildcards, as even that is
not a horribly common case.  (I realize it's more common for upstreams who
develop on Windows or Mac OS.)

That's how we ended up where we are now.

Note that another case that I don't think has been discussed, but which is
probably more common than embedded quote marks, is a filename that's
invalid UTF-8 (straight ISO 8859-1, for example).  That's also not
representable in our typical debian/copyright file, and is likely to cause
significant practical problems (such as having the encoding format change
every time the maintainer edits the file, since some editors will try to
fix such problems).

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-03 Thread Charles Plessy
Le Thu, Feb 02, 2012 at 09:50:09AM +0900, Charles Plessy a écrit :
 
 1) DEP 5 and directory/file names with spaces
(http://lists.debian.org/debian-devel/2009/06/msg00155.html)
 
 My summary is that the participants were quite divided on whether to separate
 the list of files by spaces or by commas.  Space separation won out, as
 the resulting list can be pasted directly in a shell.  The escaping syntax was
 glob(7) at the time, but it allows patterns that the shell will not expand, so
 the two wildcards * and ? were proposed.  My personal feeling is that a more
 complete syntax, like allowing shell quotes, did not make it because no
 participant had patience or energy left for moving this forward.  But ‘shell
 pastability’ is I think the conclusion.

While reading the DEP again, I realised that our current format is not always
directly pastable to the shell, as the wildcards are allowed to match directory
separators, so that ‘*/Makefile.in’ can match at any depth.  There is a small
number of packages using that feature, with Makefile.in in most of the cases.
It looks like we cannot have both conveniences at the same time.  I think
that it is one more argument to consider revisiting the current syntax in a
later evolution of the format, but I think that we should accumulate more
experience before that.
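
The difference can be sketched in Python as follows. This is an illustration only: glob_to_regex, dep5_match and shell_like_match are hypothetical names, and the shell side is simplified (real shells also treat leading dots and bracket expressions specially).

```python
import re


def glob_to_regex(pattern, star_crosses_slash):
    """Translate a pattern with only * and ? into a compiled regular expression."""
    any_char = "." if star_crosses_slash else "[^/]"
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(any_char + "*")
        elif ch == "?":
            parts.append(any_char)
        else:
            parts.append(re.escape(ch))
    # DOTALL so that "." really means any character.
    return re.compile("".join(parts), re.DOTALL)


def dep5_match(pattern, path):
    """Wildcards may match directory separators, as described above."""
    return glob_to_regex(pattern, True).fullmatch(path) is not None


def shell_like_match(pattern, path):
    """Wildcards stop at '/', roughly as in shell pathname expansion."""
    return glob_to_regex(pattern, False).fullmatch(path) is not None


for path in ("src/Makefile.in", "src/lib/Makefile.in"):
    print(path, dep5_match("*/Makefile.in", path), shell_like_match("*/Makefile.in", path))
```

Both interpretations agree on src/Makefile.in, but only the first one matches src/lib/Makefile.in, which is why a pattern that works in the copyright file may expand differently when pasted into a shell.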

Have a nice week-end,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan





Re: DEP-5 and files with white spaces

2012-02-01 Thread Russ Allbery
Benjamin Drung bdr...@debian.org writes:

 DEP-5 is nice, but how can I specify a license for a file with white
 spaces? For example you want to specify that the file foo/file one.bar
 is licensed under ISC, but foo/file_one.bar is licensed under GPL. How
 can you do that?

No, that distinction isn't representable.  There was some earlier
discussion about that, and the conclusion reached was that it was a rare
case that wasn't worth making the syntax more complicated (after various
more complicated syntaxes were tossed around without making anyone very
happy).

The general way to specify information for a file name that contains
whitespace is to use wildcards to match the whitespace, which means that
you can't disambiguate from other files that only differ in the places
where whitespace is present.

Out of curiosity, have you run across a case where this matters, or were
you asking because it's a theoretical hole?  It's definitely a theoretical
hole, but one of the reasons why we didn't spend more time on it was that
everyone was dubious that the case would arise in a real-world situation.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-01 Thread Jakub Wilk

* Russ Allbery r...@debian.org, 2012-02-01, 14:20:
  DEP-5 is nice, but how can I specify a license for a file with white
  spaces? For example you want to specify that the file foo/file
  one.bar is licensed under ISC, but foo/file_one.bar is licensed
  under GPL. How can you do that?

 No, that distinction isn't representable.


This one is representable. You can take advantage of the fact that the
last paragraph that matches a particular file applies to it:


| Files: foo/file?one.bar
| License: ISC
|
| Files: foo/file_one.bar
| License: GPL

That said, you _can_ construct even more contrived examples which are 
unrepresentable, e.g. by replacing _ with a tab.
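
The override trick, and the tab case where it breaks down, can be sketched in Python. This is a minimal illustration, assuming that * and ? are the only wildcards, that they may match any character, and that the last matching paragraph wins; pattern_to_regex and license_for are hypothetical names, not part of any existing parser.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Files pattern: * matches any run of characters, ? exactly one."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    # DOTALL lets the wildcards match every character, including '/'.
    return re.compile("".join(parts), re.DOTALL)


def license_for(path, paragraphs):
    """Return the license of the last paragraph whose patterns match path, or None."""
    result = None
    for patterns, license_name in paragraphs:
        if any(pattern_to_regex(p).fullmatch(path) for p in patterns):
            result = license_name  # later paragraphs override earlier ones
    return result


paragraphs = [
    (["foo/file?one.bar"], "ISC"),
    (["foo/file_one.bar"], "GPL"),
]

print(license_for("foo/file one.bar", paragraphs))   # name with a space
print(license_for("foo/file_one.bar", paragraphs))   # name with an underscore
print(license_for("foo/file\tone.bar", paragraphs))  # name with a tab
```

The space and underscore names come out with different licenses, but the tab name lands in the same paragraph as the space name, since ? is the only way to reach either of them.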


 The general way to specify information for a file name that contains
 whitespace is to use wildcards to match the whitespace,

That works only if you can stand the ugliness of such Files fields.

--
Jakub Wilk





Re: DEP-5 and files with white spaces

2012-02-01 Thread Russ Allbery
Jakub Wilk jw...@debian.org writes:

 This one is representable. You can take advantage of the fact that the
 last paragraph that matches a particular file applies to it:

 | Files: foo/file?one.bar
 | License: ISC
 |
 | Files: foo/file_one.bar
 | License: GPL

Oh, hey, yes, good point.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-01 Thread Benjamin Drung
Am Mittwoch, den 01.02.2012, 14:20 -0800 schrieb Russ Allbery:
 Benjamin Drung bdr...@debian.org writes:
 
  DEP-5 is nice, but how can I specify a license for a file with white
  spaces? For example you want to specify that the file foo/file one.bar
  is licensed under ISC, but foo/file_one.bar is licensed under GPL. How
  can you do that?
 
 No, that distinction isn't representable.  There was some earlier
 discussion about that, and the conclusion reached was that it was a rare
 case that wasn't worth making the syntax more complicated (after various
 more complicated syntaxes were tossed around without making anyone very
 happy).

Is it too complex to have a syntax that is similar to what the shell
does? Two solutions pop into my mind. Please let me know why these are
not used. You can point me to previous discussions.

Idea 1: Use an escape sequence for specifying a whitespace (e.g. "\ " for
a space).

Idea 2: Allow quotation marks.
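
To illustrate Idea 1, here is a minimal Python sketch of a splitter that honours backslash escapes. The syntax is hypothetical and not part of DEP-5, the name split_with_escapes is invented for the example, and the sketch does not address whether a backslash should also be able to escape the wildcards.

```python
def split_with_escapes(value: str) -> list:
    """Split on whitespace, except where a character is preceded by a backslash."""
    tokens = []
    current = []
    in_token = False
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            escaped = next(chars, None)
            if escaped is None:
                raise ValueError("dangling backslash at end of field")
            current.append(escaped)  # keep the escaped character literally
            in_token = True
        elif ch.isspace():
            if in_token:
                tokens.append("".join(current))
                current = []
                in_token = False
        else:
            current.append(ch)
            in_token = True
    if in_token:
        tokens.append("".join(current))
    return tokens


field = r"foo/file\ one.bar foo/file_one.bar"
print(split_with_escapes(field))  # escape-aware splitting
print(field.split())              # naive whitespace splitting, for comparison
```

The naive split breaks the first name in two and leaves a stray backslash behind, which is the cost to very simple parsers that such a syntax would introduce.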

 The general way to specify information for a file name that contains
 whitespace is to use wildcards to match the whitespace, which means that
 you can't disambiguate from other files that only differ in the places
 where whitespace is present.

I don't like the idea of abusing a wildcard if the files could be
specified more precisely.

 Out of curiosity, have you run across a case where this matters, or were
 you asking because it's a theoretical hole?  It's definitely a theoretical
 hole, but one of the reasons why we didn't spend more time on it was that
 everyone was dubious that the case would arise in a real-world situation.

I haven't run across an actual case. This case just popped into my mind
and I wondered how to cover this case.

-- 
Benjamin Drung
Debian & Ubuntu Developer




Re: DEP-5 and files with white spaces

2012-02-01 Thread Benjamin Drung
Am Mittwoch, den 01.02.2012, 23:31 +0100 schrieb Jakub Wilk:
 * Russ Allbery r...@debian.org, 2012-02-01, 14:20:
 DEP-5 is nice, but how can I specify a license for a file with white 
 spaces? For example you want to specify that the file foo/file 
 one.bar is licensed under ISC, but foo/file_one.bar is licensed 
 under GPL. How can you do that?
 
 No, that distinction isn't representable.
 
 This one is representable. You can take advantage of the fact that the
 last paragraph that matches a particular file applies to it:
 
 | Files: foo/file?one.bar
 | License: ISC
 |
 | Files: foo/file_one.bar
 | License: GPL
 
 That said, you _can_ construct even more contrived examples which are 
 unrepresentable, e.g. by replacing _ with a tab.
 
 The general way to specify information for a file name that contains 
 whitespace is to use wildcards to match the whitespace,
 
 That works only if you can stand the ugliness of such Files fields.

True words.

For example, the eclipse source package has files with spaces in their
names, and using ? instead of spaces does look ugly.

-- 
Benjamin Drung
Debian & Ubuntu Developer




Re: DEP-5 and files with white spaces

2012-02-01 Thread Russ Allbery
Benjamin Drung bdr...@debian.org writes:

 Is it too complex to have a syntax that is similar to what the shell
 does? Two solutions pop into my mind. Please let me know why these are
 not used. You can point me to previous discussions.

 Idea 1: Use an escape sequence for specifying a whitespace (e.g. "\ " for
 a space).

 Idea 2: Allow quotation marks.

Yeah, both of those were among the other syntax proposals that were
suggested, and I think one of them was in the document at one point.
Using backslash is probably the easiest, although it does make parsing the
files harder.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-01 Thread Benjamin Drung
Am Mittwoch, den 01.02.2012, 14:49 -0800 schrieb Russ Allbery:
 Benjamin Drung bdr...@debian.org writes:
 
  Is it too complex to have a syntax that is similar to what the shell
  does? Two solutions pop into my mind. Please let me know why these are
  not used. You can point me to previous discussions.
 
  Idea 1: Use an escape sequence for specifying a whitespace (e.g. "\ " for
  a space).
 
  Idea 2: Allow quotation marks.
 
 Yeah, both of those were among the other syntax proposals that were
 suggested, and I think one of them was in the document at one point.
 Using backslash is probably the easiest, although it does make parsing the
 files harder.

IMHO allowing both would be the optimum. A real parser would have no
problems with either, but a simplistic parser that just splits the string
by spaces would have a problem.

-- 
Benjamin Drung
Debian & Ubuntu Developer




Re: DEP-5 and files with white spaces

2012-02-01 Thread Russ Allbery
Benjamin Drung bdr...@debian.org writes:
 Am Mittwoch, den 01.02.2012, 14:49 -0800 schrieb Russ Allbery:

 Yeah, both of those were among the other syntax proposals that were
 suggested, and I think one of them was in the document at one point.
 Using backslash is probably the easiest, although it does make parsing
 the files harder.

 IMHO allowing both would be the optimum. A real parser would have no
 problems with either, but a simplistic parser that just splits the string
 by spaces would have a problem.

Yeah, that was, as I understand it, the motivation (to allow really simple
parsers).

I don't know if it's worth revisiting this.  I can't say that I
particularly liked the outcome we arrived at, but theoretical holes in
standards bother me a lot (possibly more than they should).

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/





Re: DEP-5 and files with white spaces

2012-02-01 Thread Benjamin Drung
Am Mittwoch, den 01.02.2012, 14:56 -0800 schrieb Russ Allbery:
 Benjamin Drung bdr...@debian.org writes:
  Am Mittwoch, den 01.02.2012, 14:49 -0800 schrieb Russ Allbery:
 
  Yeah, both of those were among the other syntax proposals that were
  suggested, and I think one of them was in the document at one point.
  Using backslash is probably the easiest, although it does make parsing
  the files harder.
 
  IMHO allowing both would be the optimum. A real parser would have no
  problems with either, but a simplistic parser that just splits the string
  by spaces would have a problem.
 
 Yeah, that was, as I understand it, the motivation (to allow really simple
 parsers).

What is more important: A good looking copyright file or being
parsable by a dead simple, stupid parser? The proposed changes would
make the parser overly complex. 

 I don't know if it's worth revisiting this.  I can't say that I
 particularly liked the outcome we arrived at, but theoretical holes in
 standards bother me a lot (possibly more than they should).

I would call a theoretical hole a design bug.

-- 
Benjamin Drung
Debian & Ubuntu Developer




Re: DEP-5 and files with white spaces

2012-02-01 Thread Charles Plessy
Le Wed, Feb 01, 2012 at 11:44:36PM +0100, Benjamin Drung a écrit :
 
 Is it too complex to have a syntax that is similar to what the shell
 does? Two solutions pop into my mind. Please let me know why these are
 not used. You can point me to previous discussions.

Hi Benjamin,

You can refer to the following threads


1) DEP 5 and directory/file names with spaces
   (http://lists.debian.org/debian-devel/2009/06/msg00155.html)

My summary is that the participants were quite divided on whether to separate
the list of files by spaces or by commas.  Space separation won out, as
the resulting list can be pasted directly in a shell.  The escaping syntax was
glob(7) at the time, but it allows patterns that the shell will not expand, so
the two wildcards * and ? were proposed.  My personal feeling is that a more
complete syntax, like allowing shell quotes, did not make it because no
participant had patience or energy left for moving this forward.  But ‘shell
pastability’ is I think the conclusion.


2) DEP-5: an example parser, choice of syntax for Files:
   (http://lists.debian.org/debian-devel/2009/09/msg00558.html)

Discussion on the original syntax based on the find command, where I pointed
back to the thread above; no objection.


3) DEP-5: file globbing
   (http://lists.debian.org/debian-project/2010/08/msg00154.html)

Discussion about exclusion patterns.


4) DEP-5: Files field and filename patterns
   (http://lists.debian.org/debian-project/2010/08/msg00289.html)
   (http://lists.debian.org/debian-project/2010/09/msg00029.html)

The simple globbing with * and ? was finally chosen.  It was noted that because
it is a lowest common denominator, it leaves room for expansion later.


5)  Re: DEP5: CANDIDATE and ready for use in squeeze+1
   (http://lists.debian.org/debian-devel/2011/01/msg00235.html)

In this thread, you asked how to escape files with
a space in their names, and did not object to the answer from Lars.


The current syntax has been used for years, and while it can be perfected, I do
not think that such an extension is in the scope of the version 1.0 that we are
preparing.  What I propose, if you think it is worthwhile, is to open a bug to
track that request for the next revision.

Have a nice day,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan

