Re: new Contents generator on ftp-master

2011-03-17 Thread Adam D. Barratt
On Tue, March 15, 2011 21:40, Joerg Jaspert wrote:

 The new implementation is currently only used for suites that are not
 marked as untouchable. Oldstable and stable will switch during the next
 point release.
 Have you (or anyone else) verified that any tools in {old,}stable
 parsing contents files are compatible with the new structure (and
 filenames in the case of udebs)?

 As far as I remember, there is currently no user for the udeb contents
 files.
 Or that was the case last we had a discussion about it. :)

That wouldn't surprise me.  The filenames were a bit of an afterthought,
to be honest; my main concern was the tool compatibility.

Regards,

Adam


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: 
http://lists.debian.org/c84c74dad8297a327440c4cd27cd48f8.squir...@adsl.funky-badger.org



Re: new Contents generator on ftp-master

2011-03-15 Thread Torsten Werner
On Sat, Mar 12, 2011 at 11:10 PM, Kurt Roeckx k...@roeckx.be wrote:
 I mean, I really don't understand why you can't at least list the
 other files from the package.

I've added a de-duplication mechanism.

Torsten





Re: new Contents generator on ftp-master

2011-03-15 Thread Adam D. Barratt
On Sat, 2011-03-12 at 11:01 +0100, Torsten Werner wrote:
 we have disabled the contents generator of apt-ftparchive and replaced
 it by a new implementation in dak. There are some visible changes:
[...]
 The new implementation is currently only used for suites that are not
 marked as untouchable. Oldstable and stable will switch during the next
 point release.

Have you (or anyone else) verified that any tools in {old,}stable
parsing contents files are compatible with the new structure (and
filenames in the case of udebs)?

Apologies if this was answered elsewhere in the thread and I simply
missed it, in which case a pointer would be appreciated.

Regards,

Adam





Re: new Contents generator on ftp-master

2011-03-15 Thread Joerg Jaspert

 The new implementation is currently only used for suites that are not
 marked as untouchable. Oldstable and stable will switch during the next
 point release.
 Have you (or anyone else) verified that any tools in {old,}stable
 parsing contents files are compatible with the new structure (and
 filenames in the case of udebs)?

As far as I remember, there is currently no user for the udeb contents files.
Or that was the case last we had a discussion about it. :)

-- 
bye, Joerg
exa yes, I'm annoying.





Re: new Contents generator on ftp-master

2011-03-12 Thread Holger Levsen
Hi,

On Samstag, 12. März 2011, Torsten Werner wrote:
 The new implementation is currently only used for suites that are not
 marked as untouchable. Oldstable and stable will switch during the next
 point release.

Why switch stable and oldstable at all?


cheers,
Holger




Re: new Contents generator on ftp-master

2011-03-12 Thread Adam Borowski
On Sat, Mar 12, 2011 at 11:01:21AM +0100, Torsten Werner wrote:
 we have disabled the contents generator of apt-ftparchive and replaced
 it by a new implementation in dak. There are some visible changes:
 
[Contents.gz]
 
 2) The encoding is proper UTF-8. ISO8859-1 filenames are re-coded
 automatically. To find out what happens to other encodings is left as an
 exercise to the reader. :)

Can we get RC bugs for every file name in packages that is not proper UTF-8? 
These packages will be uninstallable on some filesystems.

On the other hand, for ancient charsets, UTF-8 filenames will be mangled but
accessible.

-- 
1KB // Microsoft corollary to Hanlon's razor:
//  Never attribute to stupidity what can be
//  adequately explained by malice.





Re: new Contents generator on ftp-master

2011-03-12 Thread Torsten Werner
On Sat, Mar 12, 2011 at 11:25 AM, Holger Levsen hol...@layer-acht.org wrote:
 Why switch stable and oldstable at all?

Why not? Should we maintain two different configurations for several
years for no obvious reason?

Torsten





Re: new Contents generator on ftp-master

2011-03-12 Thread Julien Cristau
On Sat, Mar 12, 2011 at 11:25:32 +0100, Holger Levsen wrote:

 Hi,
 
 On Samstag, 12. März 2011, Torsten Werner wrote:
  The new implementation is currently only used for suites that are not
  marked as untouchable. Oldstable and stable will switch during the next
  point release.
 
 Why switch stable and oldstable at all?
 
Yeah, that seems like a rather bad plan.

Cheers,
Julien





Re: new Contents generator on ftp-master

2011-03-12 Thread Holger Levsen
Hi Torsten,

On Samstag, 12. März 2011, Torsten Werner wrote:
  Why switch stable and oldstable at all?
 Why not? Should we maintain two different configurations for several
 years for no obvious reason?

Well, the obvious reason is to not break Debian stable (and oldstable), as
happened when md5sums were removed. Are you absolutely 100% sure this will
not happen again?

If you need to do development on dak, great, use a test setup for that. I'm 
totally fine with testing on the testing and unstable parts of the archive, 
but not so much with stable.

(Obviously, "you" is a plural "you" here.)


cheers,
Holger




Re: new Contents generator on ftp-master

2011-03-12 Thread Kurt Roeckx
On Sat, Mar 12, 2011 at 11:01:21AM +0100, Torsten Werner wrote:
 
 5) Packages with duplicate filenames are marked just as such and no
 contents is recorded, e.g.
 DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian

So basically apt-file search will fail to find any file in
inorwegian and wnorwegian?  Or just the duplicate ones?
Why?  What's wrong with the old way of doing it?


Kurt





Re: new Contents generator on ftp-master

2011-03-12 Thread Jakub Wilk

* Torsten Werner twer...@debian.org, 2011-03-12, 11:01:

2) The encoding is proper UTF-8. ISO8859-1 filenames are re-coded
automatically. To find out what happens to other encodings is left as an
exercise to the reader. :)


What's the point of messing with encodings?


5) Packages with duplicate filenames are marked just as such and no
contents is recorded, e.g.
DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian


Shouldn't dak reject debs with duplicate filenames in the first place?

Anyway, both packages are just fine (AFAICT). Reporting them as having 
duplicate filenames looks like a side effect of encoding mangling:


$ dpkg -c inorwegian_2.0.10-3.2_i386.deb | grep -v /$ | tr -s ' ' | cut -d' ' -f 6 | sort | uniq -c
   1 ./usr/lib/ispell/bokm\303\245l.aff
   1 ./usr/lib/ispell/bokm\303\245l.hash
   1 ./usr/lib/ispell/bokm\345l.aff
   1 ./usr/lib/ispell/bokm\345l.hash
   1 ./usr/lib/ispell/bokmaal.aff
   1 ./usr/lib/ispell/bokmaal.hash
   1 ./usr/lib/ispell/nb.aff
   1 ./usr/lib/ispell/nb.hash
   1 ./usr/lib/ispell/nn.aff
   1 ./usr/lib/ispell/nn.hash
   1 ./usr/lib/ispell/norsk.aff
   1 ./usr/lib/ispell/norsk.hash
   1 ./usr/lib/ispell/nynorsk.aff
   1 ./usr/lib/ispell/nynorsk.hash
   1 ./usr/share/doc/inorwegian/README.Debian
   1 ./usr/share/doc/inorwegian/changelog.Debian.gz
   1 ./usr/share/doc/inorwegian/copyright
   1 ./var/lib/dictionaries-common/ispell/inorwegian

$ dpkg -c wnorwegian_2.0.10-3.2_all.deb | grep -v /$ | tr -s ' ' | cut -d' ' -f 6 | sort | uniq -c
   1 ./usr/share/dict/bokm\303\245l
   1 ./usr/share/dict/bokm\345l
   1 ./usr/share/dict/bokmaal
   1 ./usr/share/dict/norsk
   1 ./usr/share/dict/nynorsk
   1 ./usr/share/doc/wnorwegian/changelog.Debian.gz
   1 ./usr/share/doc/wnorwegian/copyright
   1 ./var/lib/dictionaries-common/wordlist/wnorwegian
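A minimal Python sketch of how such a collision can arise, assuming the generator decodes each name as UTF-8 and falls back to ISO8859-1 when that fails. The thread does not show dak's actual code, so `recode` and the sample list are illustrative only:

```python
from collections import Counter


def recode(raw: bytes) -> str:
    """Decode a raw filename as UTF-8, falling back to ISO8859-1 (assumed behaviour)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso8859-1")


# Three of the names from the inorwegian listing, written as raw bytes.
names = [
    b"./usr/lib/ispell/bokm\xc3\xa5l.aff",  # UTF-8 encoded
    b"./usr/lib/ispell/bokm\xe5l.aff",      # ISO8859-1 encoded
    b"./usr/lib/ispell/bokmaal.aff",        # plain ASCII
]

# Count how often each recoded name occurs; more than once means a collision.
counts = Counter(recode(name) for name in names)
collisions = sorted(name for name, seen in counts.items() if seen > 1)
print(collisions)
```

Under that assumption the two differently encoded symlinks map to the same string, while the ASCII name stays distinct.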

--
Jakub Wilk





Re: new Contents generator on ftp-master

2011-03-12 Thread Tollef Fog Heen
]] Jakub Wilk 

| * Torsten Werner twer...@debian.org, 2011-03-12, 11:01:
| 2) The encoding is proper UTF-8. ISO8859-1 filenames are re-coded
| automatically. To find out what happens to other encodings is left as an
| exercise to the reader. :)
| 
| What's the point of messing with encodings?

Probably make it predictable for tools trying to consume the file.

| 5) Packages with duplicate filenames are marked just as such and no
| contents is recorded, e.g.
| DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian
| 
| Shouldn't dak reject debs with duplicate filenames in the first place?

No, packages might very well ship duplicate files (think all mtas
shipping /usr/sbin/sendmail) but they then have to conflict + replace.

| Anyway, both packages are just fine (AFAICT). Reporting them as having
| duplicate filenames looks like a side effect of encoding mangling:

Yeah, I should probably stop shipping the latin1 version, now, though.

-- 
Tollef Fog Heen
UNIX is user friendly, it's just picky about who its friends are





Re: new Contents generator on ftp-master

2011-03-12 Thread Philipp Kern
On 2011-03-12, Tollef Fog Heen tfh...@err.no wrote:
 ]] Jakub Wilk 
| * Torsten Werner twer...@debian.org, 2011-03-12, 11:01:
| 2) The encoding is proper UTF-8. ISO8859-1 filenames are re-coded
| automatically. To find out what happens to other encodings is left as an
| exercise to the reader. :)
| What's the point of messing with encodings?
 Probably make it predictable for tools trying to consume the file.

But then it's not at all predictable if the file you want can actually be found
automatically.  You're throwing away information.

I guess UTF8 encoded filenames everywhere could well be a release goal.

Kind regards
Philipp Kern





Re: new Contents generator on ftp-master

2011-03-12 Thread Ralf Treinen
On Sat, Mar 12, 2011 at 03:30:03PM +0100, Tollef Fog Heen wrote:
 ]] Jakub Wilk 

 | Shouldn't dak reject debs with duplicate filenames in the first place?
 
 No, packages might very well ship duplicate files (think all mtas
 shipping /usr/sbin/sendmail) but they then have to conflict + replace.

Some statistics on that, since I am testing for possible package
installation errors due to file hijacking: in sid (main+contrib+
nonfree) we have about 1000 pairs of packages that share at least one
path name. Only about 65 of them can be installed together, the remaining
ones are directly or indirectly in conflict. And among the 65 which
can be installed together, almost all of them use a diversion or Replaces
(otherwise they'll get a Serious bug report from me when I catch them :-)

http://edos.debian.net/file-overwrites/
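For illustration, a rough Python sketch of how pairs of packages sharing a path name could be collected from Contents-style lines (a path, whitespace, then a comma-separated list of section/package entries). This is not the method behind the statistics above; `shared_path_pairs` is a hypothetical helper, the sample sections are made up, and the sketch says nothing about whether the packages conflict or use Replaces:

```python
from collections import defaultdict
from itertools import combinations


def shared_path_pairs(lines):
    """Map each pair of package names to the set of paths both packages ship."""
    pairs = defaultdict(set)
    for line in lines:
        parts = line.rsplit(None, 1)  # the path may contain spaces, the package list may not
        if len(parts) != 2:
            continue  # skip blank lines
        path, qualified = parts
        # Drop the section prefix ("mail/postfix" -> "postfix").
        names = sorted({entry.rsplit("/", 1)[-1] for entry in qualified.split(",")})
        for pair in combinations(names, 2):
            pairs[pair].add(path)
    return dict(pairs)


# Illustrative input only; not taken from a real Contents file.
sample = [
    "usr/sbin/sendmail  mail/postfix,mail/ssmtp,mail/nullmailer",
    "usr/share/doc/postfix/copyright  mail/postfix",
]
pairs = shared_path_pairs(sample)
for pair in sorted(pairs):
    print(pair, sorted(pairs[pair]))
```

Deciding which of those pairs are actually co-installable would additionally need the packages' Conflicts, Replaces and diversion data, which the Contents file does not carry.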

-Ralf.





Re: new Contents generator on ftp-master

2011-03-12 Thread Torsten Werner
On Sat, Mar 12, 2011 at 2:02 PM, Kurt Roeckx k...@roeckx.be wrote:
 So basically apt-file search will fail to find any file in
 inorwegian and wnorwegian?

It won't find any files currently for both packages.

 Why?  What's wrong with the old way of doing it?

Both packages ship symlinks with the same name but different
encodings. Some weeks ago there was some discussion about
Unicode correctness in Debian. I think that we should accept only
properly UTF-8 encoded filenames in the future, but I am open to
discussion. We can use a manual workaround for (old)stable if we want
that.

Torsten





Re: new Contents generator on ftp-master

2011-03-12 Thread Kurt Roeckx
On Sat, Mar 12, 2011 at 07:53:34PM +0100, Torsten Werner wrote:
 On Sat, Mar 12, 2011 at 2:02 PM, Kurt Roeckx k...@roeckx.be wrote:
  So basically apt-file search will fail to find any file in
  inorwegian and wnorwegian?
 
 It won't find any files currently for both packages.

$ apt-file search bokmaal.aff
inorwegian: /usr/lib/ispell/bokmaal.aff
$ apt-file search bokmål.aff
inorwegian: /usr/lib/ispell/bokmål.aff

Seems to work fine?

  Why?  What's wrong with the old way of doing it?
 
 Both packages ship symlinks with the same name but different
 encodings. Some weeks ago there was some discussion about
 Unicode correctness in Debian. I think that we should accept only
 properly UTF-8 encoded filenames in the future but I am open for
 discussion. We can use a manual workaround for (old)stable if we want
 that.

I think there might be some misunderstanding of what you mean.
My understanding from the original mail was:
some packages ship /usr/sbin/sendmail, currently that seems to be:
citadel-mta: /usr/sbin/sendmail
courier-mta: /usr/sbin/sendmail
dma: /usr/sbin/sendmail
esmtp-run: /usr/sbin/sendmail
exim4-daemon-heavy: /usr/sbin/sendmail
exim4-daemon-light: /usr/sbin/sendmail
masqmail: /usr/sbin/sendmail
msmtp-mta: /usr/sbin/sendmail
nullmailer: /usr/sbin/sendmail
postfix: /usr/sbin/sendmail
sendmail-base: /usr/sbin/sendmailconfig
ssmtp: /usr/sbin/sendmail
xmail: /usr/sbin/sendmail

And as a result for all those packages you would list no file at all.

But then my understanding changed so that a single package
shipping a file which has different encodings for the filename
would not have that (or any) file mentioned.  If you already
convert them all to utf-8, I see no problem in only mentioning
that file once instead of not at all.
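Mentioning a recoded name once could look like the following Python sketch. It assumes names are decoded as UTF-8 with an ISO8859-1 fallback; `unique_paths` is a hypothetical helper and not dak's implementation:

```python
def unique_paths(raw_names):
    """Recode each raw name and keep only the first occurrence of every result."""
    seen = set()
    result = []
    for raw in raw_names:
        try:
            name = raw.decode("utf-8")
        except UnicodeDecodeError:
            name = raw.decode("iso8859-1")  # assumed fallback for legacy names
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


# Raw names in the style of the wnorwegian package (bytes, as stored in the .deb).
raw = [
    b"usr/share/dict/bokm\xc3\xa5l",  # UTF-8
    b"usr/share/dict/bokm\xe5l",      # ISO8859-1
    b"usr/share/dict/bokmaal",
    b"usr/share/dict/norsk",
]
paths = unique_paths(raw)
print(len(raw), "raw names ->", len(paths), "listed paths")
```

The rest of the package's files stay listed; only the second spelling of the same recoded name is dropped.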


Kurt





Re: new Contents generator on ftp-master

2011-03-12 Thread Torsten Werner
On Sat, Mar 12, 2011 at 8:28 PM, Kurt Roeckx k...@roeckx.be wrote:
 $ apt-file search bokmaal.aff
 inorwegian: /usr/lib/ispell/bokmaal.aff
 $ apt-file search bokmål.aff
 inorwegian: /usr/lib/ispell/bokmål.aff

$ zgrep inorwegian Contents-amd64.gz
DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian

Are you using a current contents file from sid?
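A tool consuming the file would need to treat that marker row differently from a real path. The Python sketch below is one hedged way to check for it; `lines_for_package` is a hypothetical helper, the sample data is built in memory rather than read from a real Contents-amd64.gz, and free-text header lines are not handled:

```python
import gzip
import io


def lines_for_package(fileobj, package):
    """Return (path, package names) for every Contents line naming `package`."""
    hits = []
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            parts = line.rsplit(None, 1)  # path, then the comma-separated package list
            if len(parts) != 2:
                continue  # skip blank lines
            path, qualified = parts
            names = [entry.rsplit("/", 1)[-1] for entry in qualified.split(",")]
            if package in names:
                hits.append((path, names))
    return hits


# In-memory stand-in for a gzipped Contents file; the second line is made up.
sample = (
    "DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian\n"
    "usr/share/dict/norsk  text/hypothetical-wordlist\n"
)
archive = io.BytesIO(gzip.compress(sample.encode("utf-8")))

hits = lines_for_package(archive, "inorwegian")
# True when the package appears only on the marker row, i.e. no real paths are recorded.
only_marker = all(path == "DUPLICATE_FILENAMES" for path, _ in hits)
print(hits, only_marker)
```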

 But then my understanding changed so that a single package
 shipping a file which has different encodings for the filename
 would not have that (or any) file mentioned.  If you already
 convert them all to utf-8, I see no problem in only mentioning
 that file once instead of not at all.

It is not that easy due to database constraints. But if that is needed...

Torsten





Re: new Contents generator on ftp-master

2011-03-12 Thread Kurt Roeckx
On Sat, Mar 12, 2011 at 09:44:51PM +0100, Torsten Werner wrote:
 On Sat, Mar 12, 2011 at 8:28 PM, Kurt Roeckx k...@roeckx.be wrote:
  $ apt-file search bokmaal.aff
  inorwegian: /usr/lib/ispell/bokmaal.aff
  $ apt-file search bokmål.aff
  inorwegian: /usr/lib/ispell/bokmål.aff
 
 $ zgrep inorwegian Contents-amd64.gz
 DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian
 
 Are you using a current contents file from sid?

No, I used an older file.  It used to work, so I see no reason
why that shouldn't work anymore.

  But then my understanding changed so that a single package
  shipping a file which has different encodings for the filename
  would not have that (or any) file mentioned.  If you already
  convert them all to utf-8, I see no problem in only mentioning
  that file once instead of not at all.
 
 It is not that easy due to database constraints. But if that is needed...

I currently don't see why.


Kurt





Re: new Contents generator on ftp-master

2011-03-12 Thread Kurt Roeckx
On Sat, Mar 12, 2011 at 11:06:41PM +0100, Kurt Roeckx wrote:
 On Sat, Mar 12, 2011 at 09:44:51PM +0100, Torsten Werner wrote:
  On Sat, Mar 12, 2011 at 8:28 PM, Kurt Roeckx k...@roeckx.be wrote:
   $ apt-file search bokmaal.aff
   inorwegian: /usr/lib/ispell/bokmaal.aff
   $ apt-file search bokmål.aff
   inorwegian: /usr/lib/ispell/bokmål.aff
  
  $ zgrep inorwegian Contents-amd64.gz
  DUPLICATE_FILENAMES  text/inorwegian,text/wnorwegian
  
  Are you using a current contents file from sid?
 
 No, I used an older file.  It used to work, so I see no reason
 why that shouldn't work anymore.

I mean, I really don't understand why you can't at least list the
other files from the package.

   But then my understanding changed so that a single package
   shipping a file which has different encodings for the filename
   would not have that (or any) file mentioned.  If you already
   convert them all to utf-8, I see no problem in only mentioning
   that file once instead of not at all.
  
  It is not that easy due to database constraints. But if that is needed...
 
 I currently don't see why.

But I think it would also be a good idea to only have utf-8 files
in packages.  If only all the software could deal with them.


Kurt

