Bug#1068588: redesign of how autopkgtest talks to the testbed

Paul Gevers Sun, 07 Apr 2024 07:45:30 -0700

Package: autopkgtest
Severity: wishlist

Hi all,

The following issues have come up several times over the years. I propose to discuss them in one place (this bug report) to define the solution strategy. I haven't gone through all the details myself, so I might be thinking in the wrong direction; please correct me if you think so. Please also voice agreement, if not on the details, then on the general concept.


Problem statements
==================

* runner/autopkgtest talks to the backend with a simple text protocol. While this lets users trivially add another backend without changes to the src:autopkgtest code, the drawback is losing all nuance of what is really going on at both ends of the communication. That is particularly bad when unexpected events happen: every event, expected or not, needs handling on both sides.

* every backend has its own virt server that does the real communication with the testbed. As a result, there are subtle differences in test results between backends when they don't do exactly the same thing (the code easily goes out of sync).

* most backends don't automatically provide a testbed that resembles what a user would see when working on a real system. I recall smcv saying words like "user session", "dbus something-something" and the like.

* [mostly orthogonal] currently the autopkgtest code keeps a lot of state in a non-Pythonic way. Reasoning about what goes on and debugging the autopkgtest code flow are non-trivial.


Solution direction
==================

* unify the communication with testbeds via ssh. This makes it much more likely that the environment is the same across the different backends and also ensures the right session.

* each virt server would only need to ensure an ssh server is set up and running in the testbed, leaving the rest of the communication to a common driver. (Maybe with the exception of the null, chroot and schroot virt servers, to be investigated.) Obviously it's still responsible for tearing down the testbed.

* handle communication between runner/autopkgtest, the virt servers and the ssh driver via Python classes instead of the text-based protocol. Do this in a "plugin" friendly way such that backends can still easily be used without changes to src:autopkgtest.
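
To make the three points above a bit more concrete, here is a rough sketch of how the split could look. It is only an illustration under assumptions, not a proposal for the final API: every name in it (BackendError, SshEndpoint, VirtBackend, register_backend, get_backend, SshDriver, ExampleBackend) is hypothetical and does not exist in src:autopkgtest today. The backend only brings the testbed up and tears it down, a shared driver builds the ssh invocation, and failures travel as structured exceptions rather than as a line of text.

```python
import shlex
import subprocess
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Type


class BackendError(Exception):
    """Hypothetical structured failure; keeps which phase went wrong."""

    def __init__(self, phase: str, detail: str) -> None:
        super().__init__(f"{phase}: {detail}")
        self.phase = phase
        self.detail = detail


@dataclass(frozen=True)
class SshEndpoint:
    """Hypothetical description of how to reach the testbed over ssh."""
    host: str
    port: int = 22
    user: str = "root"
    identity_file: str = ""


class VirtBackend(ABC):
    """Hypothetical backend interface: testbed lifecycle only."""

    @abstractmethod
    def open(self) -> SshEndpoint:
        """Start the testbed, ensure sshd runs, return how to reach it."""

    @abstractmethod
    def close(self) -> None:
        """Tear the testbed down."""


# Simple in-process registry, so a backend can be added without
# touching the runner (assumption: one of several possible mechanisms).
_BACKENDS: Dict[str, Type[VirtBackend]] = {}


def register_backend(name: str):
    def decorator(cls: Type[VirtBackend]) -> Type[VirtBackend]:
        _BACKENDS[name] = cls
        return cls
    return decorator


def get_backend(name: str) -> Type[VirtBackend]:
    try:
        return _BACKENDS[name]
    except KeyError:
        raise BackendError("lookup", f"no backend named {name!r}") from None


class SshDriver:
    """Hypothetical common driver shared by all backends."""

    def __init__(self, endpoint: SshEndpoint) -> None:
        self.endpoint = endpoint

    def build_argv(self, command: List[str]) -> List[str]:
        ep = self.endpoint
        argv = ["ssh", "-p", str(ep.port)]
        if ep.identity_file:
            argv += ["-i", ep.identity_file]
        argv.append(f"{ep.user}@{ep.host}")
        # ssh hands the command to the remote shell, so quote it once here.
        argv.append(shlex.join(command))
        return argv

    def run(self, command: List[str]) -> subprocess.CompletedProcess:
        return subprocess.run(self.build_argv(command),
                              capture_output=True, text=True, check=False)


@register_backend("example")
class ExampleBackend(VirtBackend):
    """Toy backend with a made-up endpoint; starts nothing for real."""

    def __init__(self) -> None:
        self._opened = False

    def open(self) -> SshEndpoint:
        self._opened = True
        return SshEndpoint(host="testbed.example", port=2222, user="tester")

    def close(self) -> None:
        if not self._opened:
            raise BackendError("close", "testbed was never opened")
        self._opened = False
```

With something like this, the runner would do get_backend("example")(), call open() to obtain the endpoint, hand it to SshDriver, and call close() at the end. Out-of-tree backends would need a discovery mechanism on top of the registry (Python entry points might be one option), which is left open here.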


Alternatives
============

* make the change to use ssh for communication, without a change of the virt server protocol.


Open Questions
==============

* we could consider supporting the current protocol in parallel, which would enable us to migrate one backend at a time and enable our users to migrate their own backends at their own pace. However, it means we'd need to support two code paths. So the open question is: do we want to maintain the current protocol, and if so, for how long? I wonder how many other backends are out there.

* although I don't know where it hooks in, sbuild uses autopkgtest's backends for some of its functionality. We don't want to break sbuild, so the question is how the connection works.

* we already have an ssh virtual server. Is that good enough to be the ssh driver, or is it missing functionality and/or does it deserve a rewrite of its own? On the last question, the answer is probably yes if we want to move away from the current protocol.


Tasks
=====

[ ] discuss this idea and get consensus on the way forward
[ ] create working branch and generate a PoC with one of the backends
[ ] figure out how sbuild hooks into our backends
[ ] while changing code, add Python typing where applicable
[ ] ...

Paul

PS: would it be worth enabling dashboards for autopkgtest on salsa to manage this project? I assume issues on salsa are disabled on purpose to avoid bug reports in multiple places. Could we restrict adding issues to project members only?
