Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Wed, 2012-02-15 at 16:41:21 +, Ian Jackson wrote:
> Guillem Jover writes (Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)):
> > [...] But trying to workaround this by coming up with stacks of hacked up solutions [...]
>
> I disagree with your tendentious phrasing. The refcnt feature is not a hacked up solution (nor a stack of them). It is entirely normal in Debian core tools (as in any substantial piece of software serving a lot of diverse needs) to have extra code to make it easier to deploy or use in common cases simpler.

All along this thread, when referring to the additional complexity and the additional hacks, I've not been talking about the refcnt'ing at all, but to all the other fixes needed to make it a workable solution.

regards,
guillem

--
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120229195152.ga4...@gaara.hadrons.org
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
Guillem Jover writes (Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)): On Tue, 2012-02-14 at 14:28:58 +, Ian Jackson wrote: I think the refcounting approach is very worthwhile because it eliminates unnecessary work (by human maintainers) in many simple cases. Aside from what I said on my other reply, I just wanted to note that this seems to be a recurring point of tension in the project when it comes to archive wide source package changes, where supposed short term convenience (with its usually long term harmful effects) appears to initially seduce people over what seems to be the cleaner although slightly a bit more laborious solution. The refcnt doesn't just eliminate unnecessary multiarch conversion work. It also eliminates unnecessary maintenance effort. Maintaining a split package will be more work than without. I think that over the lifetime of the multiarch deployment this extra packaging work will far outweigh the extra maintenance and documentation burden of the refcnt feature. [...] But trying to workaround this by coming up with stacks of hacked up solutions [...] I disagree with your tendentious phrasing. The refcnt feature is not a hacked up solution (nor a stack of them). It is entirely normal in Debian core tools (as in any substantial piece of software serving a lot of diverse needs) to have extra code to make it easier to deploy or use in common cases simpler. Ian. -- To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20283.57393.237949.649...@chiark.greenend.org.uk
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Tue, 14 Feb 2012, Philipp Kern wrote:
> On 2012-02-14, Raphael Hertzog hert...@debian.org wrote:
> > Somehow my suggestion is then to extend dpkg-parsechangelog to provide the required logic to split the changelog in its bin-nmu part and its usual content.
> >
> >     dpkg-parsechangelog --split-binnmu binnmu-part-file remaining-part-file
> >
> > Then dh_installchangelogs could try to use this (and if it fails, fallback to the standard changelog installation). Does that sound sane? If yes, I can have a look at implementing this.
>
> In theory sbuild could also offload this to dpkg-buildpackage by passing something like --binnmu-version 2 --binnmu-changelog 'Rebuild for libfoo transition'. The only thing that would be annoying is checking if the old style or the new style must be used. (I.e. there must be some sort of feature query first.)

Yes, but that doesn't change the fact that dpkg-dev should not install files in the generated .deb. So we still need some interaction with dh_installchangelogs... but your suggestion led me to another proposal.

    dpkg-buildpackage --binary-version ver --binary-changelog 'foo'

could create debian/changelog.build with the given changelog version and changelog entry. dpkg-parsechangelog could be taught to read debian/changelog.build before debian/changelog so that dpkg-parsechangelog continues to do the right thing (when called from debian/rules). And dh_installchangelogs can be taught to install debian/changelog.build as /usr/share/doc/foo/changelog.Debian.build-$arch.

dpkg-buildpackage would clean up debian/changelog.build if it wasn't passed the proper option. dpkg-source would learn to not include it in generated source packages, too.

This looks rather appealing to me. What do you think?

Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Pre-order a copy of the Debian Administrator's Handbook and help liberate it: http://debian-handbook.info/liberation/
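
A minimal sketch of the lookup order proposed above, assuming a build-specific file named debian/changelog.build that, when present, supplies the topmost entry. The helper names (effective_changelog, top_version, entry) and the sample maintainer are hypothetical; this is not how dpkg-parsechangelog is implemented, only an illustration of "read changelog.build before changelog".

```python
import re
import tempfile
from pathlib import Path

# First line of a changelog entry: "source (version) distribution; urgency=..."
HEADER = re.compile(r"^(\S+) \(([^)]+)\) ")


def effective_changelog(debian_dir: Path) -> str:
    """Return the changelog text with the build-specific entry (if any) on top."""
    build = debian_dir / "changelog.build"  # file name taken from the proposal
    main = debian_dir / "changelog"
    parts = [build.read_text()] if build.exists() else []
    parts.append(main.read_text())
    return "\n".join(parts)


def top_version(text: str) -> str:
    """Version of the first entry header found in the text."""
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            return match.group(2)
    raise ValueError("no changelog entry header found")


def entry(version: str, message: str) -> str:
    """Build a tiny, made-up changelog entry for the demonstration."""
    return (
        f"foo ({version}) unstable; urgency=low\n\n"
        f"  * {message}\n\n"
        " -- Jane Doe <jane@example.org>  Tue, 14 Feb 2012 14:17:20 +0100\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    debian = Path(tmp)
    (debian / "changelog").write_text(entry("1.0-1", "Initial release."))
    plain_version = top_version(effective_changelog(debian))

    # Simulate what a binNMU-aware build step might drop into place.
    (debian / "changelog.build").write_text(
        entry("1.0-1+b2", "Rebuild for libfoo transition.")
    )
    build_version = top_version(effective_changelog(debian))
```

With only debian/changelog present the reported version is the source version; once changelog.build exists, callers in debian/rules would see the binNMU version without debian/changelog itself being touched.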
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Mon, 2012-02-13 at 22:43:04 -0800, Russ Allbery wrote:
> If this is comprehensive, then I propose the following path forward, which is a mix of the various solutions that have been discussed:
>
> * dpkg re-adds the refcounting implementation for multiarch, but along with a Policy requirement that packages that are multiarch must only contain files in classes 1 and 2 above.
>
> * All packages that want to be multiarch: same have to move all generated documentation into a separate package unless the maintainer has very carefully checked that the generated documentation will be byte-for-byte identical even across minor updates of the documentation generation tools and when run at different times.

If packages have to be split anyway to cope with the other cases, then the number of new packages which might not be needed otherwise will be even smaller than the predicted amount, at which point it makes even less sense to support refcnt'ing.

It also requires maintainers to carefully consider if the (doc, etc) toolchains will generate predictable output.

Your proposal still requires papering over the other corner-cases.

> * Policy prohibits arch-varying data files in multiarch: same packages except in arch-qualified paths.

Well, there's no escape from this any way you look at it, regardless of refcnt'ing or not.

> * The binNMU process is changed to add the binNMU changelog entry to an arch-qualified file (changelog.Debian.arch, probably). We need to figure out what this means if the package being binNMU'd has a /usr/share/doc/package symlink to another package, though; it's not obvious what to do here.

This requires IMO a multitude of hacks when the simplest and obvious arch-qualified pkgname solves this cleanly, and allows debhelper to automatically deal with it. And for tools to just change where they always look for those files in the M-A:same case regardless of the package being binNMUed or not.

This still does not solve the other issues I listed, namely binNMUs have to be performed in lock-step, more complicated transitions / upgrades. And introduces different solutions for different problems, while my proposal is generic for all cases.

So this is still pretty much unconvincing, and seems like clinging to the refcnt'ing “solution” while it makes things overall more complicated, will introduce inconsistency and uncertainty to maintainers, needs way more global changes to keep it going, etc.

What I'd change in my proposal in the summary mail is that arch-indep files might be considered for splitting at the maintainer's discretion, when it actually seems worth it, in the same way we've handled splitting arch-indep files from arch:any up to now. So for example a couple of headers could be kept on the -dev package, or Ian's case on essential and data files could also be kept on the same lib package, as long as their paths are arch-qualified either through a pkgname:arch or the multiarch triplet. This would reduce even more the amount of newly split packages.

regards,
guillem
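
For readers following the refcnt'ing debate, here is a rough sketch of what "reference counted shared files" means in this thread: several M-A: same instances of a package may own the same path as long as the contents are byte-identical, and the file only goes away when the last owner is removed. The class and method names are hypothetical, and comparing stored SHA-256 digests is an assumption made for brevity; it is not a description of how dpkg does or did implement this.

```python
import hashlib


class FileConflict(Exception):
    """Raised when two owners ship different contents for the same path."""


class SharedFileDB:
    """Toy model of refcounted files shared between package instances."""

    def __init__(self):
        self._digest = {}  # path -> sha256 hex digest of the installed content
        self._owners = {}  # path -> set of owners such as "libfoo:amd64"

    def install(self, owner: str, path: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        if path in self._owners and self._digest[path] != digest:
            raise FileConflict(f"{path}: content from {owner} differs")
        self._digest[path] = digest
        self._owners.setdefault(path, set()).add(owner)

    def remove(self, owner: str, path: str) -> bool:
        """Drop one reference; return True if the file itself would be deleted."""
        owners = self._owners[path]
        owners.discard(owner)
        if owners:
            return False
        del self._owners[path]
        del self._digest[path]
        return True

    def refcount(self, path: str) -> int:
        return len(self._owners.get(path, ()))


db = SharedFileDB()
readme = "/usr/share/doc/libfoo/README"
db.install("libfoo:amd64", readme, b"identical text\n")
db.install("libfoo:i386", readme, b"identical text\n")
count_after_two = db.refcount(readme)

try:
    db.install("libfoo:armel", readme, b"text generated differently\n")
    conflict_detected = False
except FileConflict:
    conflict_detected = True

deleted_first = db.remove("libfoo:amd64", readme)
deleted_last = db.remove("libfoo:i386", readme)
```

The conflict branch is where generated documentation, compressed files and binNMU changelogs cause trouble in the discussion above: any byte difference between architectures turns a shared file into an installation error.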
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Monday 13 February 2012 at 22:43 -0800, Russ Allbery wrote:
> There's been a lot of discussion of this, but it seems to have been fairly inconclusive. We need to decide what we're doing, if anything, for wheezy fairly soon, so I think we need to try to drive this discussion to some concrete conclusions.

Thank you very much for your constructive work.

> 3. Generated documentation. Here's where I think refcounting starts failing.

So we need to move a lot of documentation generated with gtk-doc or doxygen from -dev packages to -doc packages. But it really seems an acceptable tradeoff between the amount of work required and the cleanness of the solution.

> Does this seem comprehensive to everyone? Am I missing any cases?

Are there any cases of configuration files in /etc that vary across architectures? Think of stuff like ld.so.conf, where some plugins or library path is coded in a configuration file.

--
Josselin Mouette
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
* Raphael Hertzog hert...@debian.org, 2012-02-14, 14:17:
> dpkg-buildpackage --binary-version ver --binary-changelog 'foo' could create debian/changelog.build with the given changelog version and changelog entry. dpkg-parsechangelog could be taught to read debian/changelog.build before debian/changelog so that dpkg-parsechangelog continues to do the right thing (when called from debian/rules). And dh_installchangelogs can be taught to install debian/changelog.build as /usr/share/doc/foo/changelog.Debian.build-$arch.
>
> dpkg-buildpackage would clean up debian/changelog.build if it wasn't passed the proper option. dpkg-source would learn to not include it in generated source packages, too.
>
> This looks rather appealing to me. What do you think?

Yes, it does look appealing. But...

Are we sure that no existing package uses debian/changelog.build for its own purposes?

Are we sure that all existing packages (and helpers) that parse debian/changelog use dpkg-parsechangelog?

--
Jakub Wilk
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
Hi,

On Tue, 14 Feb 2012, Guillem Jover wrote:
> > * All packages that want to be multiarch: same have to move all generated documentation into a separate package unless the maintainer has very carefully checked that the generated documentation will be byte-for-byte identical even across minor updates of the documentation generation tools and when run at different times.
>
> If packages have to be split anyway to cope with the other cases, then the number of new packages which might not be needed otherwise will be even smaller than the predicted amount, at which point it makes even less sense to support refcnt'ing.

Why are you so opposed to the refcnt'ing? It's not such a big deal to maintain this feature in dpkg. And even if the current implementation is not perfect, it can be improved later when dpkg stores checksums of the provided files by itself.

To me it looks like you don't like refcnt'ing and you're trying to find some reasons to make it unacceptable.

> It also requires maintainers to carefully consider if the (doc, etc) toolchains will generate predictable output.

If the maintainer has to install files in a non-standard path (because of the need to arch-qualify it), maintainers will also need to carefully consider how to ensure that this move doesn't break anything.

It's not a black-and-white situation. You're trading one potential problem for another. And the differing files are likely to be much easier to spot than other behaviour changes that might be implied by the move of some files to arch-qualified paths.

> Your proposal still requires papering over the other corner-cases.

Can you be explicit about which corner cases you're referring to?

> This still does not solve the other issues I listed, namely binNMUs have to be performed in lock-step

Can you explain why? If the binNMU changelog is in an arch-specific file, then we're free to binNMU packages separately. dpkg must just ensure that all M-A: same packages have the same source version (instead of the binary version as currently).

> , more complicated transitions / upgrades.

We have no experience on this. It's a bit early to say whether those constraints are going to be problematic or not.

> And introduces different solutions for different problems, while my proposal is generic for all cases.

There's no such thing as a generic solution. You still have to decide whether you move files to a -common package or if you arch-qualify them and keep them in the M-A: same package. And in both cases, you have to evaluate the implications, in terms of package installation ordering in one case, in terms of modifications to do to properly support the arch-qualified files in the other one.

While it may sound cleaner from a theoretical point of view, I'm not convinced that it's better than the approach outlined by Russ.

Also you completely ignore the fact that what you're proposing is an important change for multi-arch packages that have already been converted both in Debian and in Ubuntu. You're pushing back the work to package maintainers when there's no reason not to deal with this at the build infrastructure level.

To reduce some of the downsides associated with compressed files in M-A: same packages, we could/should investigate how to not compress files in such packages instead of duplicating them needlessly.

> So this is still pretty much unconvincing, and seems like clinging to the refcnt'ing “solution” while it makes things overall more complicated, will introduce inconsistency and uncertainty to maintainers, needs way more global changes to keep it going, etc.

This is not a fair characterization of the situation. IMO global changes are better than lots of maintainers having to do busy-work splitting their packages. You see inconsistency in Russ's proposal but you don't see inconsistency/uncertainty when you change the standard location of changelog files.

As for it being more complicated, that might be true at the dpkg level, but I don't believe that it's true from the maintainers' point of view.

Cheers,
--
Raphaël Hertzog ◈ Debian Developer
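
The point about matching on the source version rather than the binary version can be sketched as follows, assuming the usual convention that a binNMU appends "+bN" to the version. The function names source_version and same_source are hypothetical, and real dpkg version handling (epochs, other suffix conventions) is deliberately ignored here.

```python
import re

# A binNMU conventionally appends "+b<number>" to the version string.
BINNMU_SUFFIX = re.compile(r"\+b\d+$")


def source_version(binary_version: str) -> str:
    """Strip a trailing binNMU suffix, leaving the source version."""
    return BINNMU_SUFFIX.sub("", binary_version)


def same_source(installed: dict) -> bool:
    """True if every architecture's instance was built from one source version.

    `installed` maps an architecture name to the binary version installed
    for that architecture of an M-A: same package.
    """
    return len({source_version(v) for v in installed.values()}) <= 1


# amd64 was binNMUed once, i386 was not: binary versions differ,
# but both come from source version 1.2-3.
separately_binnmued = same_source({"amd64": "1.2-3+b1", "i386": "1.2-3"})

# A real source upload on one architecture only is still a mismatch.
mixed_sources = same_source({"amd64": "1.2-4", "i386": "1.2-3+b1"})
```

Under a check of this shape, architectures can be binNMUed independently, which is the alternative to the lock-step rebuilds objected to earlier in the thread.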
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Tue, 14 Feb 2012, Jakub Wilk wrote:
> Are we sure that no existing package uses debian/changelog.build for its own purposes?

No, but with debian/changelog.dpkg-build we should be safe.

> Are we sure that all existing packages (and helpers) that parse debian/changelog use dpkg-parsechangelog?

No, but I would consider anything else a bug, and we would notice relatively quickly (we could even do a full rebuild to try to verify this proactively).

Cheers,
--
Raphaël Hertzog ◈ Debian Developer
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Tue, 2012-02-14 at 14:28:58 +, Ian Jackson wrote:
> Guillem Jover writes (Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)):
> > On Mon, 2012-02-13 at 22:43:04 -0800, Russ Allbery wrote:
> > > * The binNMU process is changed to add the binNMU changelog entry to an arch-qualified file (changelog.Debian.arch, probably). We need to figure out what this means if the package being binNMU'd has a /usr/share/doc/package symlink to another package, though; it's not obvious what to do here.
> >
> > This requires IMO a multitude of hacks when the simplest and obvious arch-qualified pkgname solves this cleanly, and allows debhelper to automatically deal with it. And for tools to just change where they always look for those files in the M-A:same case regardless of the package being binNMUed or not.
>
> I agree that it would be nice to always arch-qualify the changelog filename. But that would involve a lot of changes to changelog-reading tools which we perhaps don't want to do right now.

I've never proposed to arch-qualify the filename for the stuff under /usr/share/doc/pkgname/; I've proposed to arch-qualify the pkgname in the path (/usr/share/doc/pkgname:arch/), but only for M-A:same packages, which are the only ones needing the disambiguation. This is how dpkg handles pkgname output, and how it stores their data in the db too. And it should be easy to ask a multiarch-enabled dpkg-query, for example, to normalize the pkgname output to be used on those paths, or otherwise do it by hand:

    if M-A == same
        pkgname:arch
    else
        pkgname

> Note that even if we decide to always arch-qualify, we will still have lots of old packages so all changelog-reading tools will need to look in both places. For most changelog-reading tools it won't be very troublesome if they accidentally don't spot a binNMU entry. So Russ's proposal is a good step towards your proposal. And if we decide we don't need to go all the way then it's good enough for now.

How many tools are there that actually read the binary package changelog file anyway? I only know of packages.d.o. Any other tool reading from the installed path cannot really rely on it being present at all anyway, per policy.

And in addition, binNMU split changelogs are going to be there forever, and so are their possible double locations. The possible double location for M-A:same packages using pkgname:arch-qualified pathnames, by contrast, would only be temporary and would disappear once the packages have been rebuilt with a new debhelper which automatically installs them in the correct place.

> > So this is still pretty much unconvincing, and seems like clinging to the refcnt'ing “solution” while it makes things overall more complicated, will introduce inconsistency and uncertainty to maintainers, needs way more global changes to keep it going, etc.
>
> I think the refcounting approach is very worthwhile because it eliminates unnecessary work (by human maintainers) in many simple cases.

As I mentioned in my reply to Riku, the number of packages that would need splitting that would otherwise not be needed should be even lower than before (which was predicted at around 700). Also, as I mentioned there too, nothing prevents us from arch-qualifying paths (with the Debian arch or the multiarch triplet, depending on the case) if that's more convenient or safer (as per your essential data example), which is what we've been doing all along for arch-indep data shipped in arch:any packages.

Given the amount of hacks or special casing piling up to make refcnt'ing workable, when all that's really needed is a one-time handling (or a possible additional change for already converted packages, for things that debhelper might not be able to handle) of moving to arch-qualified paths or splitting into new packages, it really does not seem worth it, no.

regards,
guillem
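
The "do it by hand" rule above, together with the observation that readers would have to check both locations during a transition, can be written out as a short sketch. The function names doc_dir and changelog_candidates are hypothetical, and the changelog.Debian.gz file name is assumed purely for the example.

```python
def doc_dir(pkgname: str, arch: str, multi_arch: str) -> str:
    """Documentation directory under the arch-qualified pkgname proposal.

    Only Multi-Arch: same packages get the ":arch" qualifier; everything
    else keeps the plain package name.
    """
    qualified = f"{pkgname}:{arch}" if multi_arch == "same" else pkgname
    return f"/usr/share/doc/{qualified}"


def changelog_candidates(pkgname: str, arch: str, multi_arch: str) -> list:
    """Paths a changelog reader might try, new location first.

    Packages built before such a change would still use the unqualified
    directory, so both locations are listed for M-A: same packages.
    """
    name = "changelog.Debian.gz"  # assumed file name for illustration
    paths = [f"{doc_dir(pkgname, arch, multi_arch)}/{name}"]
    legacy = f"/usr/share/doc/{pkgname}/{name}"
    if legacy not in paths:
        paths.append(legacy)
    return paths


same_dir = doc_dir("libfoo1", "i386", "same")
plain_dir = doc_dir("foo-utils", "amd64", "foreign")
candidates = changelog_candidates("libfoo1", "i386", "same")
```

For a package that is not M-A: same the candidate list collapses to the single traditional path, which matches the claim that only M-A: same packages need the disambiguation.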
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Tue, 2012-02-14 at 14:28:58 +, Ian Jackson wrote: I think the refcounting approach is very worthwhile because it eliminates unnecessary work (by human maintainers) in many simple cases. Aside from what I said on my other reply, I just wanted to note that this seems to be a recurring point of tension in the project when it comes to archive wide source package changes, where supposed short term convenience (with its usually long term harmful effects) appears to initially seduce people over what seems to be the cleaner although slightly a bit more laborious solution. Other recent-ish incarnations of this tension could be the build-arch build-indep targets, or the build flag settings; where the former got recently resolved so that the right thing to do is for *all* packages needing to eventually support those targets, or for the latter which got switched from the seemingly more convenient to the more laborious but correct solution, that is, *all* packages need to set those build flags by themselves. This is a fundamental issue with how our source packages are handled, and the freedom and power it gives to experiment and implement them whatever way the maintainer wants, has the price that doing some archive wide changes is sometimes more costly, than changing something centrally and be done with it. But trying to workaround this by coming up with stacks of hacked up solutions will not solve that fundamental issue, and this kind of tension will keep coming up again and again, as long as the foundation is not reworked. Either that, or the project needs to accept that fact and learn to live with this kind of changes, with patience. regards, guillem -- To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120215011510.ga15...@gaara.hadrons.org
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
Guillem Jover wrote:
> Aside from what I said on my other reply, I just wanted to note that this seems to be a recurring point of tension in the project when it comes to archive wide source package changes, where supposed short term convenience (with its usually long term harmful effects) appears to initially seduce people over what seems to be the cleaner although slightly a bit more laborious solution.
>
> Other recent-ish incarnations of this tension could be the build-arch build-indep targets, or the build flag settings; where the former got recently resolved so that the right thing to do is for *all* packages needing to eventually support those targets, or for the latter which got switched from the seemingly more convenient to the more laborious but correct solution, that is, *all* packages need to set those build flags by themselves.
>
> This is a fundamental issue with how our source packages are handled, and the freedom and power it gives to experiment and implement them whatever way the maintainer wants, has the price that doing some archive wide changes is sometimes more costly, than changing something centrally and be done with it. But trying to workaround this by coming up with stacks of hacked up solutions will not solve that fundamental issue, and this kind of tension will keep coming up again and again, as long as the foundation is not reworked. Either that, or the project needs to accept that fact and learn to live with this kind of changes, with patience.

Very interesting mail. While I certainly agree with your examples, it's worth remembering the counterexample of the /usr/doc transition which took approximately 5 years to complete[1], and probably could have been accomplished quickly and without pain with a simple hack to dpkg.

Anyway, my worry about the refcounting approach (or perhaps M-A: same in general) is not the details of the implementation in dpkg, but the added mental complexity of dpkg now being able to have multiple distinct packages installed under the same name. I had a brief exposure to rpm, which can install multiple versions of the same package, and that was the main cause of much confusing behavior in rpm.

While dpkg's invariant that all co-installable package names be unique (and have unique files) has certainly led to lots of ugly package names, it's kept the users' and developers' mental models quite simple. I worry that we have barely begun to scratch the surface of the added complexity of losing this invariant.

--
see shy jo

[1] To the extent it was ever completed.. master.debian.org still has a vestigial /usr/doc/
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Tue, 14 Feb 2012, Guillem Jover wrote: I've never proposed to arch-qualify the filename for the stuff under /usr/share/doc/pkgname/, I've proposed to arch-qualify the pkgname in the path (/usr/share/doc/pkgname:arch/), but only for M-A:same packages, which are the only ones needing the disambiguation. This is how dpkg handles pkgname output, or how it stores their data in the db too. [...] How many tools are there that actually read the binary package changelog file anyway? There's apt-listchanges surely. And probably a bunch of other that are less known. I don't know if it's worth it, but if we go down that route, and if we want to keep /usr/share/doc/pkgname on user's systems we could create a new command in dpkg-maintscript-helper to manage that path as a symlink to the native M-A: same package (if possible, otherwise to any installed arch). That dpkg-maintscript-helper call could be auto-enabled by debhelper for M-A: same packages. Cheers, -- Raphaël Hertzog ◈ Debian Developer Pre-order a copy of the Debian Administrator's Handbook and help liberate it: http://debian-handbook.info/liberation/ -- To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120215074019.gc24...@rivendell.home.ouaza.com
Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
There's been a lot of discussion of this, but it seems to have been fairly inconclusive. We need to decide what we're doing, if anything, for wheezy fairly soon, so I think we need to try to drive this discussion to some concrete conclusions. First, Steve's point here is very good: Steve Langasek vor...@debian.org writes: I guess we're looking at the same data, yet we seem to have reached opposite conclusions. - Riku reports that 33 out of 82k files have different compression when using current gzip vs. 10-year-old gzip. I'd be surprised if any of those binary packages hadn't been superseded long ago. It's not a guarantee, but I think the risks, and ultimate cost, of relying on gzip output to not change often and to just do sourceful rebuilds when it isn't are a lot smaller than if we go about manually splitting our packages further. - The cases where gzip output has been reported to not be reproducible seem to all boil down to a single issue with gzip being passed different arguments due to the unreproducible nature of *find*'s output. A patch has been made available already on the bug, and this patch seems to address the instances of the problem that we've hit so far in the Ubuntu archive. Now, it's worth following up with gzip upstream about our concerns, but even without that, I just don't see this being problematic. It isn't the end of the world if we have some conflicts provided that we can detect them and can do something consistent to fix them. I'm rather nervous about relying on reproducibility of gzip because of Joey's experience with pristine-tar, where he does find a lot of variation in practice, but it is true that, for the purposes of multiarch, Debian *can* possibly construct things such that we only need to worry about our own gzip, which does simplify the situation. However, as we've subsequently discussed, those are not the only issues with file overlaps between packages. 
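
To make the gzip concern concrete: refcounting compares the compressed files byte for byte, so two builds can ship the same text and still conflict if anything in the gzip stream differs. The sketch below uses Python's gzip module rather than the gzip command line tool, and varies only the timestamp stored in the gzip header; it illustrates the distinction between "same bytes" and "same content", and does not reproduce the specific find/argument issue mentioned above.

```python
import gzip

readme = b"libfoo README\n" * 100

# Two "builds" that pin the header timestamp produce identical bytes.
build_a = gzip.compress(readme, mtime=0)
build_b = gzip.compress(readme, mtime=0)

# A build that records a different timestamp in the gzip header.
build_c = gzip.compress(readme, mtime=1329201784)


def same_bytes(x: bytes, y: bytes) -> bool:
    """What a byte-for-byte refcounting comparison would check."""
    return x == y


def same_content(x: bytes, y: bytes) -> bool:
    """What a maintainer usually cares about: identical uncompressed data."""
    return gzip.decompress(x) == gzip.decompress(y)


reproducible = same_bytes(build_a, build_b)
header_differs = not same_bytes(build_a, build_c)
content_matches = same_content(build_a, build_c)
```

Any input to the compressor that is not fixed across architectures (timestamps, options, compressor version) can move a file from the "identical" classes into the conflicting ones, even though the uncompressed content is unchanged.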
So I'm going to try to summarize and propose some possible solutions for the different issues. I'm going to discuss these issues in order from the most consistent with a refcounting solution to the least consistent. 1. Uncompressed files that we know are absolutely identical between different architectures. These include arch-independent header files that are just copied verbatim from the upstream source and data files in textual formats or arch-independent binary formats that aren't compressed and whose generation doesn't vary. (Symlinks are a special case of this.) Reference counting works great for these. These also resolve most of the file overlaps between -dev packages, and many of the harder cases for interpackage dependencies if we split everything out. I think it makes a lot of sense to use refcounting for these files. 2. Files like the above but that are compressed. This is most common in the doc directory for things like README or the upstream changelog. Upstream man pages written directly in *roff fall into this category as well, for -dev packages. With Steve's point above about gzip, I think we're probably okay using refcounting for this as well. 3. Generated documentation. Here's where I think refcounting starts failing. Man pages generated from POD may change if the version of Perl used to generate them changes, if Pod::Simple or Pod::Man have had a new release. Doxygen-generated HTML documentation is even more likely to change. Many documentation generation systems will include timestamps or other information that changes, or (even more likely) will have minor changes in their output and formatting even if there is nothing as obvious as a version number or timestamp. I don't think we can use refcounting for generated documentation produced as part of the package build process. If there is Doxygen-generated documentation, generated man pages, or the like, I think those have to be split into a separate arch: all package. 
Even if it's just a couple of man pages. This is rather annoying, but I think trying to use refcounting here is just too fragile. 4. Lintian overrides. I believe these should be qualified with the architecture on any multiarch: same package so that the overrides can vary by architecture, since this is a semi-frequent use case for Lintian. 5. Data files that vary by architecture. This includes big-endian vs. little-endian issues. These are simply incompatible with multiarch as currently designed, and incompatible with the obvious variations that I can think of, and will have to either be moved into arch-qualified directories (with corresponding patches to the paths from which the libraries load the data) or these packages can't be made multiarch. 6. Debian changelogs. The actual content of these files change with binNMUs, so these obviously can't be refcounted at all right now. We have to do
Re: Multiarch file overlap summary and proposal (was: Summary: dpkg shared / reference counted files and version match)
On Mon, 13 Feb 2012, Russ Allbery wrote: There's been a lot of discussion of this, but it seems to have been fairly inconclusive. We need to decide what we're doing, if anything, for wheezy fairly soon, so I think we need to try to drive this discussion to some concrete conclusions. Thanks for this. 2. Files like the above but that are compressed. This is most common in the doc directory for things like README or the upstream changelog. Upstream man pages written directly in *roff fall into this category as well, for -dev packages. With Steve's point above about gzip, I think we're probably okay using refcounting for this as well. Yes, but I would still document at the policy level that, when feasible without downsides, it's best to move compressed files in a shared package. Also it might be wise to relax the policy rules on compression for multi-arch: same and to let dh_compress not compress (some) files in such packages. Does this seem comprehensive to everyone? Am I missing any cases? It's a good summary, yes. If this is comprehensive, then I propose the following path forward, which is a mix of the various solutions that have been discussed: I agree with this plan. * The binNMU process is changed to add the binNMU changelog entry to an arch-qualified file (changelog.Debian.arch, probably). We need to figure out what this means if the package being binNMU'd has a /usr/share/doc/package symlink to another package, though; it's not obvious what to do here. I wonder what's the proper way to handle this. In theory, it would be nice to deal with that at the dpkg-dev level but dpkg-dev is not at all involved in installing the changelog. And I believe that the bin-nmu process just adds a top-level entry to debian/changelog. So the code should go to dh_installchangelogs... but it doesn't seem to be a good idea to put the bin-nmu logic there in particular since we might extend it (see #440094). 
Somehow my suggestion is then to extend dpkg-parsechangelog to provide the required logic to split the changelog in its bin-nmu part and its usual content. dpkg-parsechangelog --split-binnmu binnmu-part-file remaining-part-file Then dh_installchangelogs could try to use this (and if it fails, fallback to the standard changelog installation). Does that sound sane? If yes, I can have a look at implementing this. Cheers, -- Raphaël Hertzog ◈ Debian Developer Pre-order a copy of the Debian Administrator's Handbook and help liberate it: http://debian-handbook.info/liberation/ -- To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120214073923.ga...@rivendell.home.ouaza.com
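
A rough sketch of what the suggested split could do, assuming that a binNMU shows up as a topmost changelog entry whose version ends in "+bN". The --split-binnmu option is only a proposal in this thread; split_binnmu below is a hypothetical stand-in and does not handle every detail of the changelog format.

```python
import re

# Entry header at column 0: "source (version) distribution; urgency=..."
HEADER = re.compile(r"^\S+ \(([^)]+)\) ", re.MULTILINE)
BINNMU = re.compile(r"\+b\d+$")


def split_binnmu(text: str):
    """Split changelog text into (binNMU part, remaining part).

    If the topmost entry is not a binNMU, the first element is empty and
    the second is the unchanged text.
    """
    headers = list(HEADER.finditer(text))
    if not headers or not BINNMU.search(headers[0].group(1)):
        return "", text
    cut = headers[1].start() if len(headers) > 1 else len(text)
    return text[:cut], text[cut:]


regular = (
    "foo (1.0-1) unstable; urgency=low\n\n"
    "  * Initial release.\n\n"
    " -- Jane Doe <jane@example.org>  Mon, 13 Feb 2012 22:43:04 -0800\n"
)
binnmu = (
    "foo (1.0-1+b1) unstable; urgency=low\n\n"
    "  * Binary-only non-maintainer upload for amd64; no source changes.\n"
    "  * Rebuild for libfoo transition\n\n"
    " -- Build Daemon <buildd@example.org>  Tue, 14 Feb 2012 07:39:23 +0100\n\n"
)

binnmu_part, remaining_part = split_binnmu(binnmu + regular)
untouched = split_binnmu(regular)
```

A helper such as dh_installchangelogs could then install the first part under an arch-qualified name and the second part as the shared changelog, falling back to the current behaviour when the first part is empty.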