Fedora Modularity: What's the Problem?

Stephen Gallagher Mon, 28 Oct 2019 07:16:10 -0700

One of the recurring themes in the ongoing Modularity threads has been that
we've made references to the problems we're trying to solve, but we haven't
done a good job of gathering those requirements and use-cases into a single
place. To resolve this, I've written a (fairly long) blog post describing
the set of problems that we are trying to solve.

You can read it with nice formatting over at
https://communityblog.fedoraproject.org/fedora-modularity-whats-the-problem/
or the mediocre copy-and-paste I'm including in this message.

(Apologies for the HTML mail; I want to preserve the hyperlinks and
formatting from the blog post)

---

Fedora Modularity: What’s the Problem?
<https://communityblog.fedoraproject.org/fedora-modularity-whats-the-problem/>

Much has been said
<https://lists.fedoraproject.org/archives/list/de...@lists.fedoraproject.org/thread/W4BCWO5ID2VIWZVUGWP7OMC7JCNOE5AZ/>
about Fedora Modularity over the past couple weeks. Much of it has been
constructive; some of it the expected resistance to change that all large
features encounter. Some, however, is the result of our not having painted
a good picture of the problems that Modularity aims to solve. Numerous
suggestions have been made on the Fedora Development mailing list that
sound good on the surface but that ultimately fail to address some
important use-cases. This blog post will attempt to enumerate these cases
in detail so as to serve as a common reference point for the ongoing
discussions.

Please note as well that these are goals. There are numerous places where
the implementation of Modularity at the time of this writing is not yet
fully adherent to them.

It’s all about the apps!

Though many of the readers of this blog might be of a different mind, it’s
important to remember that very few people install a Linux distribution for
its own sake. Ultimately, the goal is to “scratch a particular itch” that
the user is experiencing. The solutions may take many forms, but ultimately
this user wants to deploy some software that solves a problem for them.

This leads us to a classic problem that Linux distributions have faced: the
“Too Fast/Too Slow” problem. Linux distributions are traditionally quite
monolithic. The package collections they ship are generally
self-consistent, providing generally whatever the latest stable major
release of the software at the time of the distribution release. As the
release ages, it will receive bugfixes and enhancements, but usually will
remain on the same major version.

This is excellent for the maintainers of the distribution, because it
allows them to test that everything works together as a cohesive whole. It
means that there’s one authoritative version to align to.

Users, on the other hand, are most concerned about solving their problem.
It matters less to them that the distribution is cohesive and more that the
tools they need are available to them.

The “Too Fast/Too Slow” problem is basically this: users want a solid,
stable, reliable, *unchanging* system. They want it to stay that way for
the life of their application. However, they also want their application to
run using the set of dependencies it was designed for. If that doesn’t
happen to be the same version (newer or older) as the one selected for the
monolithic distribution, the user will now have to resort to alternative
means to get up and running. This may be as simple as bundling a dependency
or as drastic as selecting an entirely different distribution that better
fits their specific need.

A little background

One of the precursors to Fedora Modularity was Software Collections
<https://www.softwarecollections.org/> (SCLs). This was a first try at
solving the “Too Fast/Too Slow” problem in the Fedora/Red Hat ecosystem.
SCLs provide two basic advantages: *Parallel Availability* and *Parallel
Installability*.

*Parallel Availability* means that more than one major release of a popular
software project is available for installation. For example, the “Developer
Toolset” SCLs provide access to newer versions of GCC and its related
toolchain for building software. There are Python and Ruby SCLs that
provide assorted runtimes for those languages and so on.

*Parallel Installability* means that more than one major release of a
software project can be installed in the same userspace.

A few years back, the Product Management team inside Red Hat performed a
large-scale survey of customers and potential customers about the user
experience of Red Hat Enterprise Linux. In particular, they asked about
their level of satisfaction with the software available from the enterprise
distribution and their opinion on these Software Collections.

Perhaps unsurprisingly, the overwhelming majority of respondents were
thrilled to have supported versions of software beyond what had shipped
with the base operating system. What did surprise the survey team was that
the respondents generally did not care about the parallel installability of
the SCLs. For the most part, they maintained individual userspaces (using
bare metal, traditional virtualization or containers) for each of the
applications they cared about.

The most common problem reported for Software Collections was that using
them required changes to the applications they wanted to run. SCLs install
to a separate filesystem location from more traditional RPMs and
applications that rely on them need to know where to look for them. (In SCL
parlance, this is called “activating” the collection.)

The consequence of this relocation on disk is that users were unable to
take existing applications (either FOSS or proprietary) and simply use
them. Instead, they had to modify the projects to first activate the
collections. This was a consistent pain point.

Given this feedback, Red Hat came to the conclusion that parallel
installability, while nice to have, was not a critical user requirement.
Instead, the focus would be on the parallel *availability*. By dropping
this requirement, it became possible to create a solution that allowed the
different versions to be swapped in and take over the standard locations on
the disk.

Meanwhile in Fedora

Of course, it’s not just Red Hat: people in Fedora are also concerned with
solving this “Too Fast/Too Slow” problem for our users. Efforts around this
kicked off in earnest with the Fedora.next initiative
<https://fedoramagazine.org/fedora-present-and-future-a-fedora-next-2014-update-part-ii-whats-happening/>
and Fedora Project Leader Matthew Miller’s “Rings” talk
<https://lwn.net/Articles/563395/> at the first Flock conference in 2013.

This led to the proposal and approval by the Fedora Council of the Modularity
Prototype Fedora Objective
<https://fedoraproject.org/wiki/Objectives/Fedora_Modularization,_Prototype_Phase>
and its follow-up, the Modularity Release Fedora Objective
<https://fedoraproject.org/wiki/Objectives/Fedora_Modularization_%E2%80%94_The_Release>.

Critical use cases for consumers

First and foremost, our primary driving goal is to make it easy for our
users to understand and interact with alternative software versions. In any
instance where the packager experience and the user experience are in
conflict, we elect to improve things for the user.

Standard Locations

In order to make deployment of users’ applications simpler, we need to make
sure that software can be installed into the common, expected locations on
the system. This includes (but is not limited to):

- Libraries must be installed to /usr/lib[64].
- Headers must be installed to /usr/include.
- Executables must be installed to a location in the default system $PATH.
- Other -devel functionality such as pkgconfig files must be installed
  in their standard lookup locations.
- Installed services may own a well-known D-Bus address.
- Services may own the appropriate standard TCP/UDP ports or local
  socket paths.

*Requirement*: Installation must occur in the same locations as traditional
RPM software delivery.

Don’t break the app!

It is very common for Fedora to update to the latest major version of
packages at each new semiannual release. This ensures that Fedora remains
at the leading edge of software development, but it can wreak havoc on
anyone trying to maintain a deployment on Fedora. If they are running an
app that is built for PostgreSQL 9.6 and Fedora switches to carrying
PostgreSQL 10 in the next major release, upgrading to that release may
break their app (possibly in ways undetectable by the upgrade process).

However, staying on an old version of Fedora forever has its own problems.
Not least of these is the problem of security updates: Once a release has
been out for about 13 months, it stops receiving errata. Moreover, new
releases of the Fedora platform may have other useful enhancements (better
security defaults, increased performance thanks to compiler improvements,
etc.).

*Requirement*: We need to allow users to “lock” themselves onto certain
dependencies as long as the packager is maintaining them. These
dependencies must continue to receive updates.

*Requirement*: There must be appropriate and helpful UX for handling
those dependencies when they go EOL.

Support the developers

Developers often want to build their applications using the
latest-and-greatest version of their dependencies. However, that may not
have been released until after the most recent Fedora release. In
non-Modular Fedora, that means waiting up to six months to be able to work
on it there.

*Requirement*: It must be possible to gain access to newer software than
was available at the Fedora release GA.

Additionally, Dev/Ops people are rapidly switching to a new paradigm of
development and deployment (containers) to solve the above issue. However,
most containers today are retrieved from public repositories. The public
repositories are generally user-managed and have not been verified and
validated for security.

*Requirement*: Provide a mechanism for building *trusted* container base
and application images with content alternatives.

Keep it updated

It’s not enough that other versions of software are available to install.
They also need to be kept up to date with bug fixes and security updates.
In non-Modular Fedora, users had the ability to force DNF to lock to a
specific RPM NEVRA, but they wouldn’t get updates from it.
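The difference between pinning a NEVRA and following a maintained series can be sketched in a few lines of Python. This is only an illustration: `parse_nevra`, `allowed_by_pin` and `allowed_by_stream` are hypothetical helpers (not DNF's API), the package strings are made-up examples, and a stream is modelled very loosely as a version prefix.

```python
import re
from typing import NamedTuple


class Nevra(NamedTuple):
    name: str
    epoch: int
    version: str
    release: str
    arch: str


# Layout: name-[epoch:]version-release.arch
# Hypothetical helper for illustration; this is not DNF's own parser.
_NEVRA_RE = re.compile(
    r"^(?P<name>.+)-(?:(?P<epoch>\d+):)?(?P<version>[^-:]+)"
    r"-(?P<release>[^-]+)\.(?P<arch>[^.-]+)$"
)


def parse_nevra(text: str) -> Nevra:
    """Split a NEVRA string into its five fields (epoch defaults to 0)."""
    match = _NEVRA_RE.match(text)
    if match is None:
        raise ValueError(f"not a NEVRA: {text!r}")
    return Nevra(
        name=match["name"],
        epoch=int(match["epoch"] or 0),
        version=match["version"],
        release=match["release"],
        arch=match["arch"],
    )


def allowed_by_pin(pinned: Nevra, candidate: Nevra) -> bool:
    """An exact NEVRA pin accepts only the identical build, so no updates."""
    return pinned == candidate


def allowed_by_stream(stream: str, candidate: Nevra) -> bool:
    """Assumption: a stream is modelled as a version prefix such as '9.6'."""
    return candidate.version == stream or candidate.version.startswith(stream + ".")


installed = parse_nevra("postgresql-9.6.10-1.fc28.x86_64")
bugfix = parse_nevra("postgresql-9.6.12-1.fc28.x86_64")
next_major = parse_nevra("postgresql-10.4-1.fc28.x86_64")
```

With these made-up builds, the pin rejects `bugfix` (so the user misses the fix), while the stream check accepts `bugfix` and still rejects `next_major`.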

*Requirement*: Alternative software must be able to receive and apply
updates.

Make it discoverable

Having alternative versions available is important but not sufficient. It
is also necessary for users to be able to locate these alternatives. Some
of our early explorations into this area failed this ease-of-use test
because they required the user to have knowledge of external sites and then
to search those sites for what they think they want.

*Requirement*: Users must be able to discover what alternative software
versions are available with tools that are shipped with the OS by default.
Ideally, these should be the same tools that they are already comfortable
with.

Don’t break existing package management workflows

Users are slow to adapt to changes in the way they need to behave.
Requiring them to learn a new set of commands to interact with their system
will likely result in frustration and possibly exodus to other
distributions.

*Requirement*: It must remain possible to continue to operate with only the
package management commands used in traditional Fedora. We may provide
additional commands to support new functionality, but we must not break the
previous workflow.

*Requirement*: Existing automation tools such as anaconda’s kickstart and
Ansible must continue to work.

Critical use-cases for packagers

Dependencies

Because very little software today is wholly self-contained, it must be
possible for Modules to depend on each other.

*Requirement*: There must be a mechanism for packagers to explicitly list
dependencies on other software, including alternative versions. This
mechanism must support both build-time and run-time dependencies.

Alternative dependencies

Some software is very restrictive about which dependencies it can work
with. Other software may work with several different major releases of a
dependency. For example, a user may ship two Ruby-based web applications,
one that runs only on Ruby 2.5 and another that runs on either Ruby 2.5 or
Ruby 2.6. In non-modular Fedora, only one version of Ruby would be
available. If the system version was 2.5, then both applications could run
fine. But if in the next release of Fedora the Ruby 2.6 release becomes the
system copy, the first application will have to be dropped (or patched) to
work with it.

*Requirement*: It must be possible to build software that can be run
against multiple versions of its dependencies.
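The Ruby example can be written down as a small compatibility table. The following Python sketch is purely illustrative: the application names `app-a` and `app-b` are hypothetical, and the functions are not part of any Fedora tooling.

```python
# Which Ruby versions each application can run on. The application names
# are hypothetical; the version numbers come from the example in the text.
COMPATIBLE_RUBY = {
    "app-a": {"2.5"},
    "app-b": {"2.5", "2.6"},
}


def runnable_apps(available_ruby: str) -> list[str]:
    """Applications that work when only this Ruby version is available."""
    return sorted(
        app for app, versions in COMPATIBLE_RUBY.items() if available_ruby in versions
    )


def common_versions() -> set[str]:
    """Ruby versions on which every application can run side by side."""
    return set.intersection(*COMPATIBLE_RUBY.values())
```

Here `runnable_apps("2.5")` lists both applications, `runnable_apps("2.6")` lists only `app-b`, and `common_versions()` is `{"2.5"}`. A packager who can declare both versions for `app-b` avoids maintaining two separate builds of it.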

*Requirement*: The packaging process for creating software that supports
multiple versions of its dependencies must not be significantly more
difficult than packaging for a single dependency.

As more and more things become modules, there is concern that such things
will grow into an unbounded matrix. For this, we need to establish policies
on when the use of alternative dependencies is preferable or when it is
better to constrain it to a single version or small set.

*Requirement*: Packaging guidelines need to provide advice on when to use
multiple alternative dependencies or to select a single one.

Managing private dependencies

When a person decides that they want Fedora to carry a particular package
and decides to do the work to accomplish this, it is not uncommon to
discover that the package they care about has additional dependencies that
are not yet packaged in Fedora. Traditionally, this has meant that the
packager has needed to package up those dependencies and then continue to
maintain them for anyone who may be using them for other purposes. This can
sometimes be a significant investment in time and energy, all to support a
package they don’t necessarily care about except for how it supports the
primary package.

Build-time Dependencies

Sometimes, a package is needed only to build the software and is not
required at run-time. In such cases, Modularity should offer the ability to
keep those build-time dependencies entirely private and not exposed to the
Fedora Package Collection at large.

*Requirement*: Build-time only dependencies for an alternative version may
be excluded from the installable output artifacts. These excluded artifacts
may be preserved by the build-system for other purposes.

*Requirement*: All sources used for generating alternative versions,
regardless of final visibility, must be available to the community for
purposes of modification and reproducibility.

Defining the public API

Similarly, there are times when an application the packager cares about
depends on another package that is required at runtime, but sufficiently
complex that the packager would not want to maintain it for general use.
(For example, an application that links to a complicated library but only
uses a few functions.)

In this case, we want there to be a standard mechanism for the packager to
be able to indicate that some of the output artifacts are not supported for
use outside this module. If they are needed by others, they should package
it themselves and/or help maintain it in a shared place.

*Requirement*: Packagers must be able to encode whether their output
artifacts are intended for use by other projects or if they are effectively
private to the alternative version. Packagers must also have a way of
finding this information out so they understand what they can and cannot
rely on as a dependency.

Use-case-based installation

Since the earliest days of Linux, the “package” has been the fundamental
unit of installable software. If you want to have some functionality on the
system, you need to learn the name of the individual packages that provide
that functionality (not all of which are named obviously). As we build
modules, one of the goals is to try to focus installation around use-cases
rather than around upstream projects. A big piece of this is that we want
to have a way to install a subset of a module that supports specific
use-cases. A common example being “server” and “client” cases.

*Requirement*: It must be possible to install a subset of artifacts from an
alternative version. These installation groups should be easily
discoverable.
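One way to picture such installation groups is as named subsets of the packages that an alternative version ships. The Python sketch below is a toy model under that assumption: the `ModuleStream` class, its field names and the package lists are illustrative, not the actual module metadata format.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleStream:
    """Toy model of one alternative version and its installation groups."""

    name: str
    stream: str
    # Every package the alternative version ships.
    packages: set[str] = field(default_factory=set)
    # Installation groups keyed by use-case; each is a subset of `packages`.
    profiles: dict[str, set[str]] = field(default_factory=dict)

    def list_profiles(self) -> list[str]:
        """Discoverability: the use-cases this version can be installed for."""
        return sorted(self.profiles)

    def packages_for(self, profile: str) -> set[str]:
        """Return the subset of artifacts to install for one use-case."""
        if profile not in self.profiles:
            raise ValueError(f"{self.name}:{self.stream} has no profile {profile!r}")
        selected = self.profiles[profile]
        missing = selected - self.packages
        if missing:
            raise ValueError(
                f"profile {profile!r} names unknown packages: {sorted(missing)}"
            )
        return set(selected)


# Illustrative contents only; not the real Fedora module definition.
postgresql_10 = ModuleStream(
    name="postgresql",
    stream="10",
    packages={"postgresql", "postgresql-server", "postgresql-docs"},
    profiles={
        "client": {"postgresql"},
        "server": {"postgresql", "postgresql-server"},
    },
)
```

In this model, a user who only needs to connect to a remote database asks for the `client` group and never has to learn which individual packages make it up.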

*Recommendation*: Installation groups should be named based on the use-case
they are intended to solve. This will provide a better user experience.

Lifecycle isolation

Another of the major issues faced by Fedora is maintaining a release
schedule when all of the components within it follow vastly differing
schedules. There are three main aspects to this problem:

- A major version of a popular piece of software is released just after
  a Fedora release, so it doesn’t land in Fedora for six months.
- Some software does frequent major revisions (Django, Node.js, etc.)
  and swapping them out every six months for the latest one means that
  dependent projects are constantly needing to adapt to the new breakage or
  find alternative mechanisms for retaining the older, working version.
- Some software does not handle multiple-version upgrades (Nextcloud,
  for example). Attempting to go from version 15 to version 19 requires first
  upgrading through 16, 17, and 18.
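The stepwise-upgrade constraint from the last item is easy to state in code. This Python sketch assumes integer major versions and a strict one-step-at-a-time rule; `upgrade_path` is a hypothetical helper, not a Nextcloud or Fedora tool.

```python
def upgrade_path(current: int, target: int) -> list[int]:
    """Major versions to install, in order, when each upgrade may only
    advance by a single major version."""
    if target < current:
        raise ValueError("downgrades are not supported in this sketch")
    return list(range(current + 1, target + 1))


def skipped_versions(current: int, target: int) -> list[int]:
    """Intermediate versions that must be available for the upgrade to work."""
    return upgrade_path(current, target)[:-1]
```

For the example above, `skipped_versions(15, 19)` is `[16, 17, 18]`: all three intermediate versions must remain installable somewhere, even if only briefly, for the user to reach version 19.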

*Requirement*: It must be possible for new alternative versions of software
to become available to the Fedora Package Collection between release dates.

*Requirement*: It must be possible for alternative versions of software to
go end-of-life during a Fedora release. This does not mean that the
software must disappear from the repositories, merely that an assertion
exists somewhere that after a certain date, the package will not receive
updates.
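Such an assertion could be as simple as a date attached to the alternative version, which tools can then check and report on. The Python sketch below makes that assumption; the `StreamLifecycle` class, the 90-day warning window and the example date are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class StreamLifecycle:
    module: str
    stream: str
    eol: date  # last day on which updates are promised


def receives_updates(lifecycle: StreamLifecycle, today: date) -> bool:
    """True while the maintainer still promises updates."""
    return today <= lifecycle.eol


def eol_notice(
    lifecycle: StreamLifecycle, today: date, warn_days: int = 90
) -> Optional[str]:
    """Message to show the user, or None if end-of-life is not yet near."""
    label = f"{lifecycle.module}:{lifecycle.stream}"
    remaining = (lifecycle.eol - today).days
    if remaining < 0:
        return (
            f"{label} reached end-of-life on {lifecycle.eol.isoformat()}; "
            "it remains installable but receives no updates."
        )
    if remaining <= warn_days:
        return f"{label} reaches end-of-life in {remaining} days."
    return None


# Made-up lifecycle data, for illustration only.
example = StreamLifecycle(module="nodejs", stream="10", eol=date(2021, 4, 30))
```

Note that nothing here removes the package after the date passes. The check only changes what the user is told, which matches the requirement that the software need not disappear from the repositories.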

*Requirement*: For alternative versions whose lifecycle will continue
through at least part of the next Fedora release, it must be possible to
upgrade from one release to the next and remain with the fully-compatible
version.

Third-party additions

Some third-party add-on repositories (particularly EPEL) have been limited
in the past by relying on the system copies of packages in the base
distribution of the release. In the particular case of EPEL, little can be
done to upgrade these system copies. In order to be able to package much of
the available FOSS software out there, it may be necessary to override some
of the content shipped in the base system with packages known to work
properly.

*Requirement*: It must be possible for third party repositories to create
alternative versions that override base distribution content at the user’s
explicit choice.

*Requirement*: It must be possible for third party repositories to create
alternative versions of software that exist in the base distribution.

Reduce duplication in packaging work

There is plenty of software out in the wild that maintains compatibility
over time and is therefore useful to carry in multiple releases of Fedora.
With traditional packaging, this means carrying and building separate
branches of the packages for each release of Fedora. In the case of
software “stacks” which are tightly bound, this means also manually
building each of its dependencies in each release of Fedora.

*Requirement*: It must be possible to build multiple component software
packages in the same build process.

*Requirement*: It must be possible for the packager to specify the order in
which packages must be built (and to indicate which ones can be built in
parallel).
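Ordering builds while exposing what can run in parallel is a topological sort over build-time dependencies. Here is a minimal Python sketch using the standard library's `graphlib` (Python 3.9+); the package names in `stack` are hypothetical, and this is not how the Fedora build system is implemented.

```python
from graphlib import TopologicalSorter


def build_batches(build_requires: dict[str, set[str]]) -> list[list[str]]:
    """Group packages into batches: batches are built in order, and every
    package inside one batch can be built in parallel."""
    sorter = TopologicalSorter(build_requires)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical stack: each package maps to the packages it needs at build time.
stack = {
    "libfoo": set(),
    "libbar": set(),
    "foo-tools": {"libfoo"},
    "app": {"foo-tools", "libbar"},
}
```

For this stack, `build_batches(stack)` returns `[["libbar", "libfoo"], ["foo-tools"], ["app"]]`: the two libraries can build in parallel, then the tools, then the application.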

*Requirement*: It must be possible to build for all supported platforms
using the same specification and with a single build command.

Non-Goals

Parallel installability

As mentioned in the Background section, Modularity specifically does *not*
aim to implement parallel installability. We recommend
instead that users should rely on other mechanisms such as virtualization
or containerization to accomplish this. If parallel-installation is
unavoidable, then Modularity is not the correct tool for this job.

Arbitrary stream switching

Module streams are intended to be compatible update streams. That means
they must follow the same rules regarding RPM package-level updates within
the stream. By definition, two streams of the same module exist because
upgrades (or downgrades or cross-grades…) are not capable of being done in
a safe, automated fashion.

That does not mean that stream switching should be impossible, but it does
mean that we will not build any tools intended to handle such switching in
a generic manner. Stream switches should be handled on a module-by-module
basis and detailed instructions and/or tools written for each such case.

