On Fri, 10.10.14 13:52, Alexander Larsson ([email protected]) wrote:

> So, I've got some kind of initial runtime going, and its now time to
> look at how we want to package these runtimes/apps. There are a few
> requirements, and a bunch of nice to have.
>
> This is what we absolutely require:
>
> * Some kind of format for an application that is delivered over the
>   network. This will contain metadata + content (a set of files).
>
> * A format for the application when installed on a system. This has to
>   be done in such a way that we can access content via the normal
>   kernel fs syscalls.

I am pretty sure these two formats need to be very close to each other,
otherwise all the stuff like signatures that are checked on access is
really hard to do.

Also note that I want to keep an eye on the big picture. I want the
same delivery for the OS itself, as well as OS containers. To me the
delivery of apps and their runtimes/frameworks is just one use case of
the scheme.

> * Install that does not require root. It would be nice if a user could just
>   download an app and not require root to be able to run it.

I am not really convinced that this is necessary, nor even desirable. I
think installation of apps by normal users should be permitted, but I
am also very sure we should not completely open this up. More
specifically, I think having some PolicyKit-style check when an app is
installed or removed is a *good* thing. Moreover, we need to think
about updating schemes as well, and I think those are better done in a
single system service than individually by unprivileged user code
(which would be really nasty on multi-user systems with many users).

Hence: app installation/uninstallation/update under privileged control
is a good thing, not a bad thing.

> * Minimal need for setuid helpers.

Well, true. But I think having a polkit-enabled privileged service is a
much better design than setuid helpers anyway, at least for the
installation. For execution I fear some minimal setuid code is not
avoidable though.

> * Don't pass untrusted data to the kernel. For instance, it is risky
>   to download raw filesystem data and then mount that, or mount a
>   loopback file that the user can modify. The raw filesystem data is
>   directly parsed by the kernel and weird data there can cause kernel
>   panics.

Well, this is unavoidable if we ever want to allow fully signed
systems. I mean, again, I would not isolate the problem of app images
so much from the problem of OS images.
I want to solve this at the same time, as the problems with
verification, distribution and so on are pretty much the same. I also
really don't believe that the kernel would be any worse at verifying
structural integrity of images than userspace code...

> * Regular directory
>
>   We require an install phase that explodes the app bundle into
>   separate files.
>
>   For multi-version storage we can use hardlinks which results in
>   sharing both disk and page cache between versions at a file-granular
>   level.
>
>   Install and mounting is doable as non-root, doesn't pass untrusted
>   data to the kernel and once done allows easy access to exported files.
>
>   However, installation is not atomic, and there are no lazy checking
>   of checksums or signatures.

Also, the hardlink farms are certainly not pretty.

> * Download filesystem images and loopback mount
>
>   In this model the app is a single file containing both metadata and
>   a filesystem image. The filesystem image can be mounted as loopback
>   directly from the app file, given just the offset and the length.
>
>   Installation/Removal is atomic, so the app is never in a partially
>   installed mode, and removal/replacement of the file doesn't bother
>   actively running instances as the inode will not be removed until
>   the final mount is removed.
>
>   However, you have to be root to do the loopback mount (or a setuid
>   helper), and loading an untrusted fs image into a kernel is pretty
>   risky.
>
>   In a naive approach there is no sharing of data between different
>   installed versions of the same thing, but there are approaches
>   like devicemapper or btrfs loopback images with snapshots that
>   can give you disk-space sharing (but not page cache sharing).

Oh god, devicemapper!

> * btrfs volumes
>
>   If the filesystem where we're installing the app is btrfs (either natively
>   or via a loopback mounted file) we can install the apps in subvolumes.
>   If the root is btrfs this is easy, but the loopback mounted case is pretty
>   tricky, as it requires resizing the loopback when needed, etc.
>
>   This is similar to exploding the files, but we can use the subvolume
>   to share data between different versions of an app. This will share
>   disk space, but not page cache.
>
>   Removal of apps is atomic, although you can't remove a btrfs volume
>   until its not mounted anymore (i.e. the app is not in use anymore).
>
>   Also, btrfs volume removal requires root rights, as do mounting a
>   loopback btrfs image so some level of setuid helper is needed.
>
>   btrfs also has an interesting feature where you can btrfs-send a
>   subvolume, which creates a file describing the diff from the parent
>   volume and the subvolume. This can then be applied with
>   btrfs-recieve which is a userspace app that applies a set of file
>   ops to convert the parent to the new child state. This is imho, not
>   super interesting for our usecase. Btrfs-send is rarely what you
>   want anyway as a newly built version of an app is built from scratch
>   anyway and not based on the previous version. One can use rsync to
>   create a new subvolume based on the old one, but then you're using
>   rsync, not btrfs-send to generate the diffs.

I absolutely disagree. Kay and I have been discussing this stuff with
the btrfs folks. The thing is that we want the signatures for the files
to be transferred in-line. While the signature stuff doesn't exist
right now for btrfs, the guys working on it are ensuring that the
signatures can be serialized from btrfs as part of the btrfs send/recv
image, and then deserialized again on the destination, while staying
fully valid. Harald has been playing around with some build logic that
makes sure that rebuilt app updates are efficiently shipped as btrfs
send/recv, with stable inode numbers and stuff.

You know, this is explicitly something where we shouldn't reinvent the
wheel.
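To make the send/recv idea concrete, here is a minimal sketch of how a
full or incremental subvolume transfer could be driven from Python by
piping `btrfs send` into `btrfs receive`. The function names and any
paths passed in are hypothetical, the snapshots are assumed to be
read-only, the commands generally need root, and nothing here covers
the per-file signatures discussed above, since that part does not
exist yet.

```python
import subprocess
from typing import List, Optional, Tuple


def build_transfer_commands(
    snapshot: str, dest_dir: str, parent: Optional[str] = None
) -> Tuple[List[str], List[str]]:
    """Return (send_argv, receive_argv) for a full or incremental transfer.

    All paths are illustrative; `snapshot` and `parent` are assumed to be
    read-only btrfs snapshots.
    """
    send_cmd = ["btrfs", "send"]
    if parent is not None:
        # -p turns the stream into a diff against a snapshot that the
        # receiving side is assumed to already have.
        send_cmd += ["-p", parent]
    send_cmd.append(snapshot)
    # `btrfs receive` reads the stream from stdin and recreates the
    # subvolume below dest_dir.
    recv_cmd = ["btrfs", "receive", dest_dir]
    return send_cmd, recv_cmd


def transfer(snapshot: str, dest_dir: str, parent: Optional[str] = None) -> None:
    """Pipe `btrfs send` into `btrfs receive` on the local machine."""
    send_cmd, recv_cmd = build_transfer_commands(snapshot, dest_dir, parent)
    sender = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(recv_cmd, stdin=sender.stdout)
    # Drop our copy of the pipe so the sender sees a broken pipe if the
    # receiver exits early.
    sender.stdout.close()
    recv_rc = receiver.wait()
    send_rc = sender.wait()
    if send_rc != 0 or recv_rc != 0:
        raise RuntimeError(
            f"transfer failed (send={send_rc}, receive={recv_rc})"
        )
```

In a real delivery scheme the stream would presumably travel over the
network or sit in a file rather than a local pipe; the local pipe just
keeps the sketch short.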
It's quite frankly crazy to come up with a new serialization format,
that contains per-file verification data, that then somehow can be
deserialized on some destination system again back into the fs layer...

> What do people think of the various approaches here? Did i miss any
> interestion option? I will probably start looking into a more detailed
> proposal for how an "explode-files-on-install" approach could work,
> including how it looks when delivered as a file (full and
> incremental).

Keeping the big picture in mind I don't think any but the btrfs
approach (including btrfs send/recv) even comes close to what we want.
btrfs will not deliver from day #1 what we want (the signature stuff is
currently vaporware), but the path towards it is clear and somewhat
clean, and the guys hacking on it are friendly and helpful.

I know that the Red Hat fs crew hates btrfs like it was the devil, and
loves LVM/DM like it was a healthy project. But yuck, just yuck!

Lennart

--
Lennart Poettering, Red Hat
_______________________________________________
gnome-os-list mailing list
[email protected]
https://mail.gnome.org/mailman/listinfo/gnome-os-list
