So, I've got some kind of initial runtime going, and it's now time to
look at how we want to package these runtimes/apps. There are a few
requirements, and a bunch of nice-to-haves.

This is what we absolutely require:

* Some kind of format for an application that is delivered over the
  network. This will contain metadata + content (a set of files).
* A format for the application when installed on a system. This has to
  be done in such a way that we can access content via the normal
  kernel fs syscalls.

There are also various things that would be nice to have. Depending on
the implementation we may be able to fulfil a subset of these.

* An efficient delta format for updates that share lots of data.
* Efficient multi-version storage in installed form. e.g. multiple
  installed versions of the same app would share disk/ram for shared
  files.
* Atomic install/uninstall. If the installed form of the app is a
  single file, then installation is a single write and remove is a
  single rm, and additionally if the app is in use then removing the
  file will keep in-use versions of the app running (since the inode is
  not freed until last use). This is a very useful feature because we
  never end up with half-installed apps.
* Install that does not require root. It would be nice if a user could
  just download an app and not require root to be able to run it.
* Minimal need for setuid helpers.
* Don't pass untrusted data to the kernel. For instance, it is risky to
  download raw filesystem data and then mount that, or mount a loopback
  file that the user can modify. The raw filesystem data is directly
  parsed by the kernel and weird data there can cause kernel panics.
* Lazy integrity checks. If we download a file we can always run a
  checksum on it to verify that the download is ok, and that nothing
  modified the data. However, this is costly to do up-front. Some
  filesystems allow checksum to happen as each file is read, which
  avoids a large check initially.
* Trusted signature checks. This is similar to the integrity checks,
  but even more powerful, as it verifies not only integrity, but also
  trust.
  If we enroll some kind of key in the BIOS, we can then securely
  inherit it all the way down to the app and have the kernel verify
  that the file is trusted. This is not only more efficient than
  signature verification up-front, but also more secure, as it detects
  changes to the files post-install.
* Easy to export files to the host for integration. For instance, if an
  installed app includes a desktop file and icon we need to be able to
  read those files from the desktop. If this is an easy operation that
  doesn't require mounting a filesystem or parsing a specialized file
  format, that is a plus.

For the network format we're not very constrained, so I think the
design here revolves mostly around how an app looks in installed form,
and the network form will follow naturally from that.

So, what are the alternatives here?

* Regular directory

  We require an install phase that explodes the app bundle into
  separate files. For multi-version storage we can use hardlinks, which
  results in sharing both disk and page cache between versions at a
  file-granular level. Installing and mounting are doable as non-root,
  don't pass untrusted data to the kernel, and once done allow easy
  access to exported files. However, installation is not atomic, and
  there is no lazy checking of checksums or signatures.

* Download filesystem images and loopback mount

  In this model the app is a single file containing both metadata and a
  filesystem image. The filesystem image can be mounted as loopback
  directly from the app file, given just the offset and the length.
  Installation/removal is atomic, so the app is never in a partially
  installed state, and removal/replacement of the file doesn't bother
  actively running instances, as the inode will not be removed until
  the final mount is removed. However, you have to be root to do the
  loopback mount (or use a setuid helper), and loading an untrusted fs
  image into a kernel is pretty risky.
  In a naive approach there is no sharing of data between different
  installed versions of the same thing, but there are approaches like
  devicemapper or btrfs loopback images with snapshots that can give
  you disk-space sharing (but not page cache sharing). If the
  filesystem used supports integrity checking (like btrfs) that can be
  used. Exporting files to the host requires either that all installed
  apps are mounted, or that we explode files from the filesystem at
  install time.

* Create filesystem images locally

  This is similar to the above approach, except we create the
  filesystem from the data files at install time rather than using a
  pre-created filesystem image. This can be done easily in userspace
  with e.g. the squashfs tools to create a filesystem. This requires an
  extra step, but otoh it lowers the risk of passing untrusted data to
  the kernel. That said, it still requires trust in the filesystem
  creation tool, and that the user doesn't modify the filesystem image
  once created.

* btrfs volumes

  If the filesystem where we're installing the app is btrfs (either
  natively or via a loopback mounted file) we can install the apps in
  subvolumes. If the root is btrfs this is easy, but the loopback
  mounted case is pretty tricky, as it requires resizing the loopback
  when needed, etc. This is similar to exploding the files, but we can
  use the subvolume to share data between different versions of an app.
  This will share disk space, but not page cache. Removal of apps is
  atomic, although you can't remove a btrfs volume until it's no longer
  mounted (i.e. the app is not in use anymore). Also, btrfs volume
  removal requires root rights, as does mounting a loopback btrfs
  image, so some level of setuid helper is needed. btrfs also has an
  interesting feature where you can btrfs-send a subvolume, which
  creates a file describing the diff between the parent volume and the
  subvolume.
  This can then be applied with btrfs-receive, which is a userspace app
  that applies a set of file ops to convert the parent to the new child
  state. This is, imho, not super interesting for our use case.
  Btrfs-send is rarely what you want, as a newly built version of an
  app is built from scratch and not based on the previous version. One
  can use rsync to create a new subvolume based on the old one, but
  then you're using rsync, not btrfs-send, to generate the diffs.

I personally very much appreciate the atomicity of the loopback mounted
single-file app-bundle, but given what I've written above, especially
the risks of pushing non-trustworthy data to the kernel, I feel that
the simple approach of just exploding the files is probably the best.

Even that approach has several options. For instance, one could have a
common repository with files that have filenames based on (say) the
sha1 hash of the content, and then each app could hardlink from those.
Or one could have completely separate trees for each app which are only
hardlinked when we do an incremental update and keep the old version.
I'm probably favouring the latter, as there is the remote risk of hash
collisions that could let you attack another app, and because there is
unlikely to be much sharing between non-related apps anyway, so it
won't give you much.

What do people think of the various approaches here? Did I miss any
interesting option?

I will probably start looking into a more detailed proposal for how an
"explode-files-on-install" approach could work, including how it looks
when delivered as a file (full and incremental).
_______________________________________________
gnome-os-list mailing list
[email protected]
https://mail.gnome.org/mailman/listinfo/gnome-os-list
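
To make the "separate trees, hardlinked only on incremental update"
option from the message above a bit more concrete, here is a minimal
Python sketch. It is only an illustration under my own assumptions, not
a proposed implementation: the function name install_version and its
arguments are hypothetical, unchanged files are detected by comparing
contents byte-for-byte against the previous version (so no hashes are
involved), and symlinks, ownership and error recovery are ignored.

```python
import filecmp
import os
import shutil


def install_version(src_dir, dest_dir, prev_dir=None):
    """Explode src_dir into dest_dir (hypothetical helper).

    Files whose content is identical to the same path in prev_dir are
    hardlinked instead of copied. Returns the number of hardlinked files.
    """
    linked = 0
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_dir = os.path.normpath(os.path.join(dest_dir, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dest = os.path.join(target_dir, name)
            prev = None
            if prev_dir is not None:
                prev = os.path.normpath(os.path.join(prev_dir, rel, name))
            # Only link regular files with byte-identical content.
            unchanged = (
                prev is not None
                and os.path.isfile(prev)
                and not os.path.islink(prev)
                and filecmp.cmp(src, prev, shallow=False)
            )
            if unchanged:
                # Same inode as the old version: disk blocks and page
                # cache are shared. Requires the same filesystem.
                os.link(prev, dest)
                linked += 1
            else:
                shutil.copy2(src, dest)
    return linked
```

Installing version 2 with prev_dir pointing at the installed version 1
leaves two complete trees on disk, and the old tree can be removed later
without affecting the new one, since a hardlinked file stays around
until its last link is gone. Note that hardlinked files also share
metadata (mode, timestamps), so this only works if both versions agree
on those.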
