On 2/23/21, Random832 <random...@fastmail.com> wrote:
> I was reading a discussion thread
> <https://gist.github.com/tiran/2dec9e03c6f901814f6d1e8dad09528e> about
> various issues with the Debian packaged version of Python, and the following
> statement stood out for me as shocking:
> Christian Heimes wrote:
>> Core dev and PyPA has spent a lot of effort in promoting venv because we
>> don't want users to break their operating system with sudo pip install.
> I don't think sudo pip install should break the operating system. And I
> think if it does, that problem should be solved rather than merely advising
> users against using it. And why is it, anyway, that distributions whose
> package managers can't coexist with pip-installed packages don't ever seem
> to get the same amount of flak for "damaging python's brand" as Debian is
> getting from some of the people in the discussion thread? Why is it that
> this community is resigned to recommending a workaround when distributions
> decide the site-packages directory belongs to their package manager rather
> than pip, instead of bringing the same amount of fiery condemnation of that
> practice as we apparently have for *checks notes* splitting parts of the
> stdlib into optional packages? Why demand that pip be present if we're not
> going to demand that it works properly?
> I think that installing packages into the actual python installation, both
> via distribution packaging tools and pip [and using both simultaneously -
> the Debian model of separated dist-packages and site-packages folders seems
> like a reasonable solution to this problem] can and should be a supported
> paradigm, and that virtual environments [or more extreme measures such as
> shipping an entire python installation as part of an application's
> deployment] should ideally be reserved for the rare corner cases where that
> doesn't work for some reason.
> How is it that virtual environments have become so indispensable, that
> no-one considers installing libraries centrally to be a viable model
> anymore? Are library maintainers making breaking changes too frequently,
> reasoning that if someone needs the old version they can just venv it? Is
> there some other cause?

First, pip+venv is not sufficient for secure software deployment:
something must set appropriate permissions so that the application
cannot overwrite itself and other core libraries, in order to
eliminate W^X violations. (Android, for example, solves this by
requiring all installed binaries to come from an APK; otherwise they
won't and can't be labeled with the SELinux extended file attributes
necessary for a binary to execute. But we don't have binaries: we have
an interpreter and arbitrary, hopefully signed-somewhere source code.)

Believe it or not, this is wrong:

# python -m venv httpbin || virtualenv httpbin
# source httpbin/bin/activate
mkvirtualenv httpbin

pip install httpbin gunicorn
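# -b takes a bind address; the app module is a separate argument
gunicorn -b 127.0.0.1:8000 httpbin:app

# python -m webbrowser http://127.0.0.1:8000/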

It's wrong - it's insecure - because the user executing the Python
interpreter (through gunicorn, in this case) can overwrite the app.
That is a W^X violation: the same user has both write and execute
permissions on the code. What would be better?

This would be better because pip isn't running setup.py as root (with
non-wheels) and httpbin_exec can't modify the app interpreter or the
code it loads at runtime:

useradd httpbin # also creates a group named 'httpbin'
# set the umask first so that it applies to the files venv creates too
sudo -u httpbin sh -c ' \
    umask 0022; \
    python -m venv httpbin; \
    ./httpbin/bin/python -m pip install httpbin gunicorn'

useradd httpbin_exec -G httpbin
sudo -u httpbin_exec ./httpbin/bin/gunicorn -b 127.0.0.1:8000 httpbin:app
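
One way to check that the split above actually holds is to walk the
venv and list everything the executing user could modify. This is only
a sketch: the function names are made up for illustration, and it only
looks at classic owner/group/other mode bits (no ACLs, capabilities,
SELinux labels, or read-only mounts). Symlinks are skipped, so the
interpreter that httpbin/bin/python points at has to be checked
separately.

```python
import os
import stat


def can_write(st, uid, gids):
    """Classic POSIX mode-bit check for one stat result (hypothetical helper)."""
    if uid == 0:
        return True  # root bypasses mode bits entirely
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)


def writable_paths(root, uid, gids=()):
    """Return every file or directory under root that uid could modify."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    found = []
    for path in candidates:
        st = os.lstat(path)
        if stat.S_ISLNK(st.st_mode):
            continue  # symlink modes say nothing useful; check targets separately
        if can_write(st, uid, gids):
            found.append(path)
    return found
```

For the example above, the uid and group list for httpbin_exec could
come from pwd.getpwnam() and os.getgrouplist(); an empty result from
writable_paths("httpbin", uid, gids) is what we're hoping for. A
writable directory counts as a finding too, since whoever can write to
a directory can replace the files in it.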

This would be better if it worked, though there are a few caveats:

sudo apt-get install python-gunicorn python-httpbin
sudo -u nobody /usr/bin/gunicorn -b 127.0.0.1:8000 httpbin:app

1. Development is impossible:
- You can't edit the code in /usr/lib/python3.n/site-packages/ without
root permissions.
- You should not be running an editor as root.
- You can edit distro-package files individually with e.g. sudoedit
(and then the GPG-signed package file checksums will fail when you run
`debsums` or `rpm -Va` because you've edited the file and that's
changed the hash).

- Non-root users cannot install Python packages without having someone
repack (and sign) them for them.

- What do I need to do in order to patch the distro's signed repack of
the Python package released to PyPI?
  - I like how Fedora pkgs and conda-forge have per-package git repos now.
  - Conda-forge has a bot that watches PyPI for new releases and tries
sending an automated PR.
  - If I send a PR to the main branch of the source repo and it gets
merged, how long will it be before there's a distro repack built and
uploaded to the distro package index?

2. It should be installed in a chroot/jail/zone/container/context/vm
so that it cannot read other data on the machine.
The httpbin app does not need read access to /etc/shadow, for example.
Distro package installs are not sandboxed either.

To pick on httpbin a bit more, the httpbin docs specify that httpbin
should be run as a docker container:

docker run -p 80:80 kennethreitz/httpbin

Is that good enough? We don't know; we haven't reviewed:

- the Dockerfile
  - it says `FROM ubuntu:18.04`, which is fortunately an LTS release.
But if the image hasn't been rebuilt this month, it probably has the
sudo bug (CVE-2021-3156) that enabled escalation to root, which is bad
even in a container because an attacker could just overwrite libc, for
example, and unless the container is rebuilt or something runs
`debsums`, nothing will detect that data integrity error.
- the requirements.txt / setup.py:install_requires / Pipfile[.lock] dependencies
  - does it depend upon outdated pinned exact versions?
    - Is there an SBOM (Software Bill of Materials) that we can review
against known vulnerability databases?
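
As a first pass at the pinned-versions question, something like this
can at least list what is pinned exactly in a requirements.txt. It's a
rough sketch: `exact_pins` is a made-up name, the regex only handles
simple one-line requirements (no `--hash` continuation lines, no URLs),
and it can't tell you whether a pin is outdated; that still needs a
lookup against an index or a vulnerability database.

```python
import re

# name, optional [extras], a single '==' specifier, optional ';' marker
_PIN = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?"
    r"\s*==\s*([^\s;,*]+)\s*(?:;.*)?$"
)


def exact_pins(requirements_text):
    """Return (name, version) for each requirement pinned with a lone '=='."""
    pins = []
    for line in requirements_text.splitlines():
        line = line.split(" #", 1)[0].strip()  # drop inline comments
        if not line or line.startswith(("#", "-")):
            continue  # blank lines, comments, and options such as -r / -e
        match = _PIN.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins
```

Wildcard pins such as `six==1.*` and ranges such as `flask>=1.0` are
deliberately not reported, since they still float.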

How do I know that:

- The packages I have installed are not outdated and unpatched against
known vulnerabilities
- The files on disk are exactly what should be in the package
- The app_exec user can't overwrite the binary interpreter or the
source files it loads at runtime
- There won't be unreviewed code running as root (including at install time)
- All Python package dependencies are available as wheels (that
basically only need to be unzipped)
- The ensemble of dependencies which I've miraculously assembled is
available on the target platform(s)
- The integration tests for my app pass with each combination of
dependencies which satisfy the specified dependency constraints
- I can edit things and quickly re-test
- Each dependency is signed by a key that's valid for that dependency

So, if pip is insufficient for secure software deployment, what are
pro teams using to build signed, deployable artifacts with fresh,
upgraded dependencies either bundled in or loosely-referenced?

- Bazel (from Google's internal Blaze) builds from BUILD files.
  - https://github.com/dropbox/dbx_build_tools
- Pantsbuild, Buck
- zipapps
- FPM can apparently package up an entire virtualenv; though IDK how
good it is at permissions?

As an open source maintainer, there are very many potential
environments to release builds for.
Manylinux docker images (and auditwheel, delocate, and *cibuildwheel*)
are a response to extreme and somewhat-avoidable complexity.

Distro packagers can and do build upon e.g. pip; which is great for
development but not sufficient for production deployment due to lack
of support for file permissions, extended file attributes, checksums,
cryptographic signatures, and due to running setup.py as the install
user for non-wheel packages.

There are many deployment stories now: pull/push, configuration
management systems, venvs within containers within VMs. For your
favorite distro, how do I get from cibuildwheel to a signed release
artifact in your package index; and which keys can sign for what?
