Hi!

On Sat, 2023-04-22 at 10:27:26 +0200, Helmut Grohne wrote:
> On Sat, Apr 08, 2023 at 04:35:25AM +0200, Guillem Jover wrote:
> > Let's also get back to the very basics. dpkg manages objects shipped
> > in binary packages, on the filesystem. It assumes this managing role in
> > exclusivity, it will for example overwrite unmanaged files. It preserves
> > admin changes with interfaces specifically provided for that (diversions,
> > statoverrides, conffile changes) or the unfortunate symlink redirects.
> > These shipped objects define the filesystem layout (not the other way
> > around). Due to the missing fsys metadata, where it does not have all
> > such metadata at hand when necessary (it might only have the one for
> > the currently unpacked .deb), it might use heuristics or check the
> > filesystem for such metadata, because it does not have anything else,
> > but that should not be taken to mean that the filesystem is the source
> > of truth, as most of those will be unnecessary once it has such
> > metadata at hand.
>
> This captures an insight I previously didn't have in that clarity and
> that I find agreeable conceptually.
>
> > So the reason this proposal is still conceptually wrong is manifold:
> >
> >  * dpkg cannot safely and atomically perform such switches (and I don't
> >    see it ever being able to portably do so, so I don't see ever
> >    supporting that).
>
> I agree, but the proposal also does not ask dpkg to perform such
> switches, so I kinda fail to see how this is a relevant argument.
It is relevant because it affects the end state, and what solutions
are going to be appropriate then. See below. This might perhaps also
have been a source of misunderstanding: my thinking is not focused
solely on this particular instance but on how this interacts with other
current or long-term behavior and upcoming features, and on what this
all would look like in the end.

> >  * No packages ships those symlinks (and none should! as that would
> >    currently imply having the same pathname contain different file types
> >    on the same system, introducing ordering issues and file type
> >    conflicts).
>
> I disagree with this argument on two levels. For one thing, I think that
> the transition only is complete once these symlinks are shipped in a
> package. In particular, that notion of complete likely encompasses that
> no aliasing occurs anymore as all aliased files have been moved to their
> canonical location somehow (<- and this likely will be a quite difficult
> thing to do). For another, no package actually ships those symlinks now.
> They are created behind dpkg's back in some postinst. This is
> unfortunate and I agree with Simon Richter that this kinda is a policy
> violation, but at this time, it is an aspect we have to deal with
> whether we want to or not.
>
> I suspect that you disagree with the notion the we have to deal with
> this situation, which I consider to be our fundamental disagreement.

I don't think we disagree (?), I probably didn't express myself
clearly. The fact that no package ships those symlinks *is* and *has*
been a problem, which is what I've been saying all along; shipping
them will be the only correct way to let dpkg know whether there will
be aliasing in play.
At the same time, what I was trying to say is that we cannot ship those
symlinks because, even though dpkg does not yet track fsys metadata
(it should, and that is one requirement for being aliasing-aware), it
would be an implicit file type conflict, which dpkg (currently) would
not know about or be able to do anything meaningful with, and which
might make unpacks fail in the future (depending on the ordering of
the packages being unpacked).

Coming now back to the atomic and safe switches, and the ordering, as
I think I've mentioned elsewhere, dpkg should eventually be made
aliasing-aware, in that it should know about all fsys file types and
be able to detect these cases during unpack (once these symlinks are
properly shipped in a package). But given these mentioned constraints
it cannot be made to support (as in accept) unpacking files inside
aliased directories (it should be able to unpack the symlinks creating
those aliased directories though!). There are several reasons for that:

 * One is that the expected behavior for file types tracked by dpkg is
   to switch their file type if this is data.tar initiated and the
   operation can be done when the dirs are empty (so as to get rid of
   these dpkg-maintscript-helper parts), and otherwise abort; the
   symlink←→dir preservation behavior should only be applied (if at
   all) for admin initiated changes on the fsys.
 * Another is that dpkg would need to allow those pathnames to have at
   the same time two sets of metadata attributes (mode, perms, xattrs,
   file type, one a symlink target), which is a terrible interface.
 * But more importantly this causes ordering issues and unpredictability.
   If there is a package A shipping a directory and package B shipping
   an aliasing symlink on the same pathname, and package C also shipping
   contents within that directory, and we have established that dpkg
   cannot always safely perform such a file type switch, then depending
   on the unpack order and whether the "directory" is empty or not,
   dpkg would be able to perform the file type switch or not, and you
   might end up with files appearing in two "directories" and with an
   aliased directory or not. This is also terrible behavior.

And that's why I say dpkg should simply refuse that, and that it is
something that should not be supported.

> >  * This introduces a series of commands to let dpkg know that a
> >    filesystem change that was not shipped in any .deb (even though that
> >    should have been the way to do it), has been done, which:
> >    - Switches the source of truth from the .deb to the fsys.
>
> While this is correct on some level, the aim of this change is to put
> that truth back into dpkg of course.

Sure, the problem is the price that will need to be paid to get there,
in terms of problematic interfaces or behavior and what kind of
workarounds or hacks that will entail, and for how long.

> >    - Confuses admin initiated changes from distro initiated ones.
>
> I think we already do this with dpkg-divert, dpkg-statoverride and other
> tools. While this may not be nice, it certain has prior art and is
> consistent with how we have been doing things in the past.

dpkg-divert distinguishes between local and package level changes. It
is true that dpkg-statoverride does not (currently) have that
distinction, although it is primarily an admin tool, and I don't think
it makes much sense to support something like declarative package
statoverrides TBH once we can ship fsys metadata (perhaps conditional
ones though).

> >  * Wants to be a generic change but it is really targeted to this
> >    specific mess.
> >    We have been doing similar aliasing transitions for
> >    many doc dirs, by stopping shipping files within, shipping that
> >    pathname as a symlink and then switching the directories to symlinks
> >    to match (via the dpkg-maintscript-helper hack because we miss fsys
> >    metadata). This means we'd need to then register all these directories
> >    too? Meh.
>
> I would love to agree with this, but I believe that this ship has
> sailed. This likely is part of our fundamental disagreement.

The comment was not about how this could have been done, but about the
fact that this is a common operation we do, which would then need to
get the same treatment, and that seems bad.

> >  * This information can get out of sync with reality, as it adds an
> >    additional and unconnected with anything source of truth, that dpkg
> >    cannot do anything about if it diverges (in contrast to diversions
> >    or statoverrides f.ex.). This can never happen when that information
> >    comes from the real source of truth (the fsys metadata via the .deb).
>
> I have difficulties accurately capturing the argument. The problem of
> information getting out of sync with reality should affect every aspect
> of dpkg and indeed, that kinda is the status quo where upgrades can
> loose files, because dpkg has an incomplete picture of reality. The aim
> of this change is to allow us to re-sync the status quo into dpkg. My
> view is that dpkg's information presently is out of sync with reality
> and the proposed change partially fixes that.

The current problem stems from both dpkg lacking fsys metadata and
Debian holding dpkg wrong in an unsupported way, both of which will
ideally go away eventually (?). My objection was that the proposal
introduces a mechanism which makes things worse because it adds more
information sources that can/will get out of sync.
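
To make the ordering issue mentioned earlier more concrete (package A
shipping a directory, B the aliasing symlink, C contents within that
directory), here is a toy sketch in Python. It is not dpkg code: the
pathnames, the package contents and the "only switch the file type when
the directory is empty" rule are simplifying assumptions made purely
for illustration.

```python
from itertools import permutations

# Toy model only: package contents, paths and rules are hypothetical
# simplifications, not dpkg's actual implementation.
PACKAGES = {
    "A": [("/lib", "dir", None)],
    "B": [("/lib", "symlink", "/usr/lib")],
    "C": [("/lib", "dir", None), ("/lib/foo", "file", None)],
}


def resolve(fsys, path):
    """Redirect path through its parent if the parent is a symlink."""
    parent, _, name = path.rpartition("/")
    entry = fsys.get(parent)
    if entry is not None and entry[0] == "symlink":
        return entry[1] + "/" + name
    return path


def unpack(fsys, objects):
    """Unpack one package's objects into fsys (path -> (type, target))."""
    for path, ftype, target in objects:
        if ftype == "symlink":
            current = fsys.get(path)
            populated = any(p.startswith(path + "/") for p in fsys)
            if current is not None and current[0] == "dir" and populated:
                continue  # assumed rule: cannot switch a non-empty directory
            fsys[path] = ("symlink", target)
        elif ftype == "dir":
            fsys.setdefault(path, ("dir", None))  # keeps an existing symlink
        else:
            fsys[resolve(fsys, path)] = ("file", None)


def final_state(order):
    """Return the sorted (path, type) pairs after unpacking in this order."""
    fsys = {}
    for name in order:
        unpack(fsys, PACKAGES[name])
    return tuple(sorted((path, ftype) for path, (ftype, _) in fsys.items()))


if __name__ == "__main__":
    for order in permutations("ABC"):
        print("".join(order), final_state(order))
```

Under these assumptions the six possible unpack orders give two
different end states, depending only on whether B gets unpacked before
C: either /lib ends up as a symlink with the file under /usr/lib, or
/lib stays a real directory containing the file.
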

> > [ As an aside, I think ideally eventually nothing distro provided should
> >   be allowed to be installed within an aliased dir, and dpkg should
> >   eventually just error out in those cases, which eventually would get
> >   rid of the aliasing problems and any such complexity (I'm not sure how
> >   or when that would be feasible though, but obviously in Debian at
> >   least not until nothing ships files there). ]
>
> It seems to me that this is something everyone agrees on. So our
> disagreement resides in the way to get there rather than where to get
> to.

If that's the case, then great. My impression though is that some
people expect dpkg will be able to unpack content within aliased
directories (?), which I don't see happening for the reasons I
mentioned above. This will imply that you cannot install any old
package that ships content there, which might be unexpected, but I
don't see any other sane way to handle this. :/

> > So this still looks like a terrible interface, like it did at the time
> > it was discarded; founded on a hack, an interface that seems wants to
> > be kind of a file-type override but it cannot be, and cannot even
> > properly act as record tracker, etc…
>
> I agree that in a perfect world, we would not need this. Let me circle
> back to our fundamental disagreement.
>
> My impression is that at this time basically everyone except you agrees
> that we have to deal with the aliasing problems that have been rolled
> out to users and will be forced in bookworm. I believe that this is the
> state that we have to consider as starting point and that we cannot
> magically turn this transition back to perform it in a better way. And
> indeed, I believe that there would have been a better way[1] that no
> longer is available to us.

I think I've mentioned before multiple times, that dpkg should
eventually be able to be aliasing-aware.
I think I've also mentioned that to get there we need to move all
files out of aliased directories, otherwise several of the changes
required for that "support" might not even be deployable.

> On the other hand, my impression is that you continue to see the
> transition as fundamentally broken and in a state that we cannot work
> from. You appear to believe that if we want to do it, we must start over
> in a better way. That better way must not cause aliasing problems to
> dpkg.

Well, it should be obvious by now that this so-called transition is
fundamentally broken, and I also see that there is no magic simple and
clean way to get out of it. Every way out is through further
complexity, workarounds or badness. Of course given the corner Debian
has painted itself into, there needs to be a way out; my objection is
about what kind of price to pay for that.

> > I thought it would be clear that if there is stuff that depends on
> > any of this kind of changes to dpkg, relying on those changes in
> > Debian would not be possible until after trixie+1. Of course there is
> > always the route to further pile up over the Jenga tower of hacks,
> > by for example adding huge amounts of Pre-Depends…
>
> I agree that we probably will deal with this until at least trixie+1.
> This is precisely why I would like to have a plan to finish it sooner
> rather than later.

Also, to note, even if the way out was through some dpkg workaround
that would even get backported to bookworm, AIUI upgrades are never
guaranteed to start from the last point release, so that would not
seem to help much anyway.

So coming back to workarounds and hacks, I'm finding the diversions
stuff to be rather bad, as it requires bypassing an explicit dpkg
refusal to deal with diverted directories, so it's going into further
unsupported territory.
:/ My other concern is that this might end up leaving unsupported
directory diversions around, which could break dpkg if it starts
refusing to work on them during unpack, not just during diversion
additions.

I did a PoC (untested) implementation for the partial upgrade deletion
prevention workaround to see how bad that might look, and in
comparison to the diversions stuff it is bad but not as bad. As I
mentioned in our talks, this needs to imply emitting a warning, because
otherwise this might end up as relied-on behavior that should not be
supported, and it would be a temporary hack for Debian and derivatives
until things have moved out.

  https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=pu/aliasing-workaround

Also, in case there is any confusion, this is a _partial_ workaround
that does not cover much of the other badness, such as file overwrites
and disappearances in other stages of the package life-cycle or in
other tools from the dpkg suite, from local packages, or from admin
initiated changes via supported interfaces.

I still think all the proposed workarounds are pretty terrible, TBH.

Thanks,
Guillem