Bug#708705: publican: ships many copies of common resources
Helmut Grohne hel...@subdivi.de writes:

> Wow. People now reporting bugs based on dedup.d.n. That's what I wrote it for! \o/

Thanks for establishing it! :-)

> Next time, please include a link to http://wiki.debian.org/dedup.debian.net, because it includes useful information for the maintainer.

OK, thanks. Sorry for missing it earlier.

> * Hard links that cross directories should be ok, if the hierarchy is completely owned by the package in question. This includes [...]

FWIW, I've dealt with a filesystem (OpenAFS) that supports no cross-directory hard links whatsoever, perhaps because its ACLs are per-directory rather than per-file. However, it's a network filesystem and generally somewhat idiosyncratic, so supporting package installation into OpenAFS isn't necessarily critical.

--
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?a...@monk.mit.edu

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#708705: publican: ships many copies of common resources
On Sun, May 19, 2013 at 02:24:30PM +0200, Raphael Hertzog wrote:
> On Fri, 17 May 2013, Aaron M. Ucko wrote:
> > Per http://dedup.debian.net/compare/publican/publican, publican ships many copies of common resources (images, CSS files, etc.) under /usr/share/publican/Common_Content and /usr/share/publican/doc, accounting for most of its massive size increase from 2.8-3. (It's gone up from 6.4 MiB to 50.6 MiB.) Could you please arrange to ship only one copy of each duplicated file, at least within /usr/share/publican/Common_Content?

Wow. People now reporting bugs based on dedup.d.n. That's what I wrote it for! \o/

Next time, please include a link to http://wiki.debian.org/dedup.debian.net, because it includes useful information for the maintainer.

> It doesn't look trivial. Each set of language files ought to be self-contained so that any generated document is independent. So replacing with symlinks is not satisfactory (unless we modify the publican build logic to replace symlinks with the corresponding file).

I would advise against any manual solution. It just causes work for little benefit.

> Replacing with hardlinks is better but is quite uncommon in Debian packages (there's a lintian warning suggesting it's a bad idea).

I have discussed the question about hard link usage a number of times now. Conclusions so far:

 * Hard links to files in the same directory (not subdirectory) are always ok. (Example: bzip2)
 * When you have many small files, hard links reduce the installation size compared to symlinks due to savings in inodes.
 * Hard links that cross directories should be ok, if the hierarchy is completely owned by the package in question. This includes /usr/lib/$package and /usr/share/$package. Of course this does not cover hard links from /usr/lib/$package/foo to /usr/share/$package/bar.

As a rule of thumb: If a package is the only package to create a directory, you can use hard links therein.
> Last but not least, I'm not going to manually deduplicate all those files so someone should really create a helper script that would deduplicate a sub-directory.

The wiki page above gives some explanations on how to achieve this using rdfind and symlinks. A helper utility does not exist. In your case I'd suggest the following line as part of the build process.

    rdfind -makehardlinks true -outputname /dev/null debian/publican/usr/share/publican

Should you have any questions, just ask. In any case feedback on the usability and documentation of dedup.d.n is welcome.

Helmut
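Raphael asks above for a helper script that deduplicates a sub-directory, and Helmut suggests rdfind for the job. Purely as an illustration (this is not part of the thread and not how rdfind is implemented), a minimal Python sketch of hard-link deduplication inside a single package-owned tree might look like the following. The names `file_digest` and `dedup_with_hardlinks` are hypothetical. The sketch assumes the whole tree sits on one filesystem that supports cross-directory hard links, and that identical contents alone are enough to merge two files (it ignores differing permissions or timestamps).

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_with_hardlinks(root):
    """Hard-link byte-identical regular files under root.

    Returns the number of bytes that no longer need a separate copy.
    Hypothetical helper: symlinks are skipped, metadata is not compared.
    """
    first_seen = {}  # (size, digest) -> path of the copy that is kept
    saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.stat(path).st_size
            key = (size, file_digest(path))
            keeper = first_seen.setdefault(key, path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy, or already linked to the keeper
            # Create the link under a temporary name, then rename it over
            # the duplicate so the path never disappears in between.
            # Assumes no file named "<path>.dedup-tmp" already exists.
            tmp = path + ".dedup-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, path)
            saved += size
    return saved


if __name__ == "__main__":
    # Tiny demonstration on a throwaway tree; the file names are made up.
    with tempfile.TemporaryDirectory() as root:
        for sub in ("en-US", "de-DE"):
            os.makedirs(os.path.join(root, sub))
            with open(os.path.join(root, sub, "common.css"), "wb") as f:
                f.write(b"body { margin: 0 }\n")
        print("bytes saved:", dedup_with_hardlinks(root))
```

In a packaging context, such a function would presumably be pointed at the staged tree from the rdfind command above (debian/publican/usr/share/publican), which stays within the rule of thumb that hard links are fine inside a directory hierarchy owned by a single package.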
Bug#708705: publican: ships many copies of common resources
Control: forwarded -1 https://bugzilla.redhat.com/show_bug.cgi?id=966143

On Mon, 20 May 2013, Aaron M. Ucko wrote:
> Understood, but the extra usage isn't so trivial either. I had envisioned shipping (and taking care to follow) symlinks, or perhaps automatically taking contents from some new common directory when no locale-specific variant shadows them.

Yes, it definitely makes sense. I opened an upstream bug report asking for this.

Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Get the Debian Administrator's Handbook: → http://debian-handbook.info/get/
Bug#708705: publican: ships many copies of common resources
Raphael Hertzog hert...@debian.org writes:

> It doesn't look trivial.

Understood, but the extra usage isn't so trivial either. I had envisioned shipping (and taking care to follow) symlinks, or perhaps automatically taking contents from some new common directory when no locale-specific variant shadows them.

Thanks for considering this suggestion!

--
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?a...@monk.mit.edu
Bug#708705: publican: ships many copies of common resources
Hi,

On Fri, 17 May 2013, Aaron M. Ucko wrote:
> Per http://dedup.debian.net/compare/publican/publican, publican ships many copies of common resources (images, CSS files, etc.) under /usr/share/publican/Common_Content and /usr/share/publican/doc, accounting for most of its massive size increase from 2.8-3. (It's gone up from 6.4 MiB to 50.6 MiB.) Could you please arrange to ship only one copy of each duplicated file, at least within /usr/share/publican/Common_Content?

It doesn't look trivial. Each set of language files ought to be self-contained so that any generated document is independent. So replacing with symlinks is not satisfactory (unless we modify the publican build logic to replace symlinks with the corresponding file).

Replacing with hardlinks is better but is quite uncommon in Debian packages (there's a lintian warning suggesting it's a bad idea).

Last but not least, I'm not going to manually deduplicate all those files so someone should really create a helper script that would deduplicate a sub-directory.

Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Get the Debian Administrator's Handbook: → http://debian-handbook.info/get/
Bug#708705: publican: ships many copies of common resources
Package: publican
Version: 3.1.5-2
Severity: minor

Per http://dedup.debian.net/compare/publican/publican, publican ships many copies of common resources (images, CSS files, etc.) under /usr/share/publican/Common_Content and /usr/share/publican/doc, accounting for most of its massive size increase from 2.8-3. (It's gone up from 6.4 MiB to 50.6 MiB.) Could you please arrange to ship only one copy of each duplicated file, at least within /usr/share/publican/Common_Content?

Thanks!