Bug#708705: publican: ships many copies of common resources

2013-06-14 Thread Aaron M. Ucko
Helmut Grohne hel...@subdivi.de writes:

 Wow. People now reporting bugs based on dedup.d.n. That's what I wrote
 it for! \o/

Thanks for establishing it! :-)

 Next time, please include a link to
 http://wiki.debian.org/dedup.debian.net, because it includes useful
 information for the maintainer.

OK, thanks.  Sorry for missing it earlier.

  * Hard links that cross directories should be ok if the hierarchy is
completely owned by the package in question. This includes

FWIW, I've dealt with a filesystem (OpenAFS) that supports no
cross-directory hard links whatsoever, perhaps because its ACLs are
per-directory rather than per-file.  However, it's a network filesystem
and generally somewhat idiosyncratic, so supporting package installation
into OpenAFS isn't necessarily critical.

-- 
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?a...@monk.mit.edu


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org




2013-06-12 Thread Helmut Grohne
On Sun, May 19, 2013 at 02:24:30PM +0200, Raphael Hertzog wrote:
 On Fri, 17 May 2013, Aaron M. Ucko wrote:
  Per http://dedup.debian.net/compare/publican/publican, publican ships
  many copies of common resources (images, CSS files, etc.) under
  /usr/share/publican/Common_Content and /usr/share/publican/doc,
  accounting for most of its massive size increase from 2.8-3.  (It's
  gone up from 6.4 MiB to 50.6 MiB.)
  
  Could you please arrange to ship only one copy of each duplicated
  file, at least within /usr/share/publican/Common_Content?

Wow. People now reporting bugs based on dedup.d.n. That's what I wrote
it for! \o/ Next time, please include a link to
http://wiki.debian.org/dedup.debian.net, because it includes useful
information for the maintainer.

 It doesn't look trivial. Each set of language files ought to be
 self-contained so that any generated document is independent. So replacing
 with symlinks is not satisfactory (unless we modify the publican build
 logic to replace symlinks with the corresponding file).

I would advise against any manual solution. It just causes work at
little benefit.

 Replacing with hardlinks is better but is quite uncommon in Debian
 packages (there's a lintian warning suggesting it's a bad idea).

I have discussed the question about hard link usage a number of times
now. Conclusions so far:

 * Hard links to files in the same directory (not subdirectory) are
   always ok. (Example: bzip2)
 * When you have many small files, hard links reduce the installation
   size compared to symlinks due to savings in inodes.
 * Hard links that cross directories should be ok if the hierarchy is
   completely owned by the package in question. This includes
   /usr/lib/$package and /usr/share/$package. Of course this does not
   cover hard links from /usr/lib/$package/foo to
   /usr/share/$package/bar. As a rule of thumb: If a package is the only
   package to create a directory, you can use hard links therein.
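
To check a staged package tree against the rules above, something along
these lines might help. It is only a sketch: the function name
cross_dir_hardlinks is made up for illustration, and it assumes GNU find
(for -printf) and directory names without newlines. It prints the inode
number of every file whose hard links sit in more than one directory,
which are the cases the third point is about.

```shell
# Hypothetical check (not an existing Debian tool): print the inode
# numbers under a directory whose hard links live in more than one
# directory. Assumes GNU find and a tree on a single filesystem.
cross_dir_hardlinks() {
    # One "inode directory" line per multiply-linked file; sort -u
    # collapses links that share both inode and directory.
    find "$1" -type f -links +1 -printf '%i %h\n' | sort -u |
    # Count the distinct directories seen for each inode.
    awk '{ n[$1]++ } END { for (i in n) if (n[i] > 1) print i }'
}
```

Empty output means every hard link stays within a single directory.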

 Last but not least, I'm not going to manually deduplicate all those
 files so someone should really create a helper script that would
 deduplicate a sub-directory.

The wiki page above gives some explanations on how to achieve this using
rdfind and symlinks. A helper utility does not exist.

In your case I'd suggest the following line as part of the build
process.

rdfind -makehardlinks true -outputname /dev/null \
    debian/publican/usr/share/publican
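
For illustration, here is roughly what that step does, as a sketch in
plain shell. This is not how rdfind works internally; dedup_hardlinks is
a hypothetical name, and the sketch assumes GNU coreutils and findutils
and file names without newlines or backslashes. It also ignores
differences in permissions and ownership, which a real tool should
consider.

```shell
# Hypothetical helper: replace byte-identical regular files under a
# directory with hard links to the first copy in sort order.
dedup_hardlinks() {
    prev_sum=
    prev_file=
    # Checksum every file, then sort so equal checksums are adjacent.
    find "$1" -type f -exec sha256sum {} + | LC_ALL=C sort |
    while IFS= read -r line; do
        sum=${line%% *}     # checksum: everything before the first space
        file=${line#*  }    # name: everything after the two-space separator
        if [ "$sum" = "$prev_sum" ] && cmp -s "$prev_file" "$file"; then
            # Same content: link to the kept copy unless already linked.
            [ "$prev_file" -ef "$file" ] || ln -f "$prev_file" "$file"
        else
            # New content: this file becomes the copy to link against.
            prev_sum=$sum
            prev_file=$file
        fi
    done
}
```

It would be called as dedup_hardlinks debian/publican/usr/share/publican
once the files have been installed into the staging directory.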

Should you have any questions, just ask. In any case feedback on the
usability and documentation of dedup.d.n is welcome.

Helmut






2013-05-22 Thread Raphael Hertzog
Control: forwarded -1 https://bugzilla.redhat.com/show_bug.cgi?id=966143

On Mon, 20 May 2013, Aaron M. Ucko wrote:
 Understood, but the extra usage isn't so trivial either.  I had
 envisioned shipping (and taking care to follow) symlinks, or perhaps
 automatically taking contents from some new common directory when no
 locale-specific variant shadows them.

Yes, it definitely makes sense. I opened an upstream bug report asking for
this.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Get the Debian Administrator's Handbook:
→ http://debian-handbook.info/get/






2013-05-20 Thread Aaron M. Ucko
Raphael Hertzog hert...@debian.org writes:

 It doesn't look trivial.

Understood, but the extra usage isn't so trivial either.  I had
envisioned shipping (and taking care to follow) symlinks, or perhaps
automatically taking contents from some new common directory when no
locale-specific variant shadows them.

Thanks for considering this suggestion!

-- 
Aaron M. Ucko, KB1CJC (amu at alum.mit.edu, ucko at debian.org)
http://www.mit.edu/~amu/ | http://stuff.mit.edu/cgi/finger/?a...@monk.mit.edu






2013-05-19 Thread Raphael Hertzog
Hi,

On Fri, 17 May 2013, Aaron M. Ucko wrote:
 Per http://dedup.debian.net/compare/publican/publican, publican ships
 many copies of common resources (images, CSS files, etc.) under
 /usr/share/publican/Common_Content and /usr/share/publican/doc,
 accounting for most of its massive size increase from 2.8-3.  (It's
 gone up from 6.4 MiB to 50.6 MiB.)
 
 Could you please arrange to ship only one copy of each duplicated
 file, at least within /usr/share/publican/Common_Content?

It doesn't look trivial. Each set of language files ought to be
self-contained so that any generated document is independent. So replacing
with symlinks is not satisfactory (unless we modify the publican build
logic to replace symlinks with the corresponding file).

Replacing with hardlinks is better but is quite uncommon in Debian
packages (there's a lintian warning suggesting it's a bad idea).

Last but not least, I'm not going to manually deduplicate all those
files so someone should really create a helper script that would
deduplicate a sub-directory.

Cheers,
-- 
Raphaël Hertzog ◈ Debian Developer

Get the Debian Administrator's Handbook:
→ http://debian-handbook.info/get/






2013-05-17 Thread Aaron M. Ucko
Package: publican
Version: 3.1.5-2
Severity: minor

Per http://dedup.debian.net/compare/publican/publican, publican ships
many copies of common resources (images, CSS files, etc.) under
/usr/share/publican/Common_Content and /usr/share/publican/doc,
accounting for most of its massive size increase from 2.8-3.  (It's
gone up from 6.4 MiB to 50.6 MiB.)

Could you please arrange to ship only one copy of each duplicated
file, at least within /usr/share/publican/Common_Content?

Thanks!

