Add the document describing a new design for the OS installation process for
new instances.

Signed-off-by: Michele Tartara <[email protected]>
---
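Illustration only, not part of the patch: a minimal Python sketch of how a
guest-side tool could build the Ganeti-specific metadata URLs proposed in
doc/design-os.rst. The host address, the <version> rule (a YYYY-MM-DD date or
"latest") and the path layout are taken from the design text. The function
and constant names are hypothetical and exist only for this example, and no
such helper is claimed to exist in Ganeti.

```python
import re

# Values taken from the design text; the names are hypothetical.
METADATA_HOST = "169.254.169.254"
VISIBILITIES = ("public", "private", "secret")
_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def _check_version(version):
    """Accept 'latest' or a date in YYYY-MM-DD format."""
    if version != "latest" and not _DATE_RE.fullmatch(version):
        raise ValueError("version must be 'latest' or YYYY-MM-DD: %r" % version)
    return version


def ganeti_url(version, *parts):
    """Return http://169.254.169.254/<version>/ganeti/<parts...>."""
    path = "/".join((_check_version(version), "ganeti") + parts)
    return "http://%s/%s" % (METADATA_HOST, path)


def os_parameters_url(version, visibility):
    """URL of the JSON dictionary holding OS parameters of one visibility."""
    if visibility not in VISIBILITIES:
        raise ValueError("unknown visibility: %r" % visibility)
    return ganeti_url(version, "os", "parameters", visibility + ".json")


def os_script_url(version, script_name):
    """URL of an installation script used in ``install`` mode."""
    return ganeti_url(version, "os", "scripts", script_name)
```

For example, os_parameters_url("latest", "private") yields the address a guest
would fetch with a GET request to read its private OS parameters, and
ganeti_url("latest", "pipe_in") yields the read side of the pipe-like channel.
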
 doc/design-draft.rst |    1 +
 doc/design-os.rst    |  318 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 319 insertions(+)
 create mode 100644 doc/design-os.rst

diff --git a/doc/design-draft.rst b/doc/design-draft.rst
index c821292..3ed3852 100644
--- a/doc/design-draft.rst
+++ b/doc/design-draft.rst
@@ -20,6 +20,7 @@ Design document drafts
    design-daemons.rst
    design-hsqueeze.rst
    design-ssh-ports.rst
+   design-os.rst
 
 .. vim: set textwidth=72 :
 .. Local Variables:
diff --git a/doc/design-os.rst b/doc/design-os.rst
new file mode 100644
index 0000000..7a42a7f
--- /dev/null
+++ b/doc/design-os.rst
@@ -0,0 +1,318 @@
+===============================
+Ganeti OS installation redesign
+===============================
+
+.. contents:: :depth: 3
+
+This design document details a new OS installation procedure that is more
+secure, provides more features, and is easier to use for many common tasks
+than the current one.
+
+Current state and shortcomings
+==============================
+
+As of Ganeti 2.10, each instance is associated with an OS definition. An OS
+definition is a set of scripts (``create``, ``export``, ``import``, ``rename``)
+that are executed with root privileges on the primary host of the instance to
+perform all the OS-related functionality (setting up an operating system inside
+the disks of the instance being created, exporting/importing the instance,
+renaming it).
+
+These scripts receive, as environment variables, a fixed set of parameters
+describing the instance (such as the hypervisor, the name of the instance, the
+number of disks, and their location) and a set of user defined parameters. Each
+of these parameters is also written into the configuration file of Ganeti, to
+allow for future reinstalls of the instance, and in various log files, namely:
+
+* node daemon log file: contains DEBUG strings of the ``/os_validate``,
+  ``/instance_os_add`` and ``/instance_start`` RPC calls.
+
+* master daemon log file: DEBUG strings related to the same RPC calls are
+  stored here as well.
+
+* commands log: the CLI commands that create a new instance, including their
+  parameters, are logged here.
+
+* RAPI log: the RAPI commands that create a new instance, including their
+  parameters, are logged here.
+
+* job logs: the job files stored in the job queue or in its archive contain the
+  parameters.
+
+The current situation presents a number of shortcomings:
+
+* Having the installation scripts run with root privileges on the nodes is a
+  serious security risk.
+
+* Ganeti cannot be used to create instances starting from user provided disk
+  images: even in the (hypothetical) case where the scripts are completely
+  secure and run not by root but by an unprivileged user with only the power to
+  mount arbitrary files as disk images, this is a security issue. It has been
+  proven that a carefully crafted file system might exploit kernel
+  vulnerabilities to gain control of the system. Therefore, directly mounting
+  images on the Ganeti nodes is not an option.
+
+* There is no way to inject files into an existing disk image. A common use
+  case is for the system administrator to provide a standard image of the
+  system, to be later personalized with the network configuration, private
+  keys identifying the machine, ssh keys of the users and so on. A possible
+  workaround would be for the scripts to mount the image (only if this is
+  trusted!) and to receive the configurations and ssh keys as user defined OS
+  parameters. Unfortunately, this is also not an option for security sensitive
+  material (such as the ssh keys) because the OS parameters are stored in many
+  places on the system, as already described above.
+
+* Most other virtualization software simply works with instance images, not
+  with installation scripts. This difference makes it difficult for Ganeti to
+  interact with other software.
+
+Proposed changes
+================
+
+In order to fix the shortcomings of the current state, we plan to introduce the
+following changes:
+
+* Change the OS parameters to have three categories:
+
+  * ``public``: the current behavior. The parameter is logged and stored
+    freely.
+
+  * ``private``: the parameter is saved inside the Ganeti configuration (to
+    allow for instance reinstall) but it is not shown in logs or job logs, nor
+    passed back via RAPI.
+
+  * ``secret``: the parameter is not saved inside the Ganeti configuration.
+    Reinstalls are impossible unless the data is passed again. The parameter
+    will not appear in any log file. In order to preserve the functionality of
+    Ganeti, the parameters will still need to be stored in the job files, but
+    they will be removed from there when the job has finished running (either
+    successfully or not).
+
+* A new OS installation procedure, based on a safe virtualized environment.
+  This virtualized environment will run with the same hardware parameters as
+  the actual instance being installed, as much as possible. This will also
+  make it possible to reduce the memory usage in the host (specifically, in
+  Dom0 for Xen installations). Each instance will have these possible
+  execution modes:
+
+  * ``run``: the default mode, used when the machine is running normally.
+
+  * ``self_install``: Ganeti will start the instance with a different set of
+    user-specified parameters, making it possible to attach an installation
+    floppy/cdrom/network, change the boot device order, or specify an OS image
+    to be used. The instance will then be responsible for getting the
+    parameters for configuring itself (its network interfaces, IP address,
+    hostname, etc.) from a set of metadata provided to it by Ganeti (e.g.:
+    using an approach comparable to the one of the ``cloud-init`` tool). When
+    this installation mode is used, no OS installation script is required.
+    In order for installation of an OS from an image to be possible, a new
+    parameter ``--os-image`` will be added, allowing the user to specify where
+    to take the image from. It will have to be mutually exclusive with
+    ``--os-type``. If ``--os-image`` is specified, ``--os-parameters`` can
+    still be used, as it will be passed to the instance as part of the
+    metadata.
+    The set of ``self_install`` parameters will be stored as part of the
+    instance configuration, so that they can be used to reinstall the instance.
+    It will be the user's responsibility to ensure that the OS image or any
+    installation media is still available in the proper position when a
+    reinstall happens.
+
+  * ``install``: Ganeti will start the instance using a virtual appliance
+    specifically made for installing Ganeti instances. Scripts analogous to the
+    current ones will run inside this instance. The disks of the instance being
+    installed will be connected to this virtual appliance, so that the scripts
+    can mount them and modify them as needed, as currently happens, but with
+    the additional protection given by this happening in a VM. The virtual
+    appliance will be started in a clean state every time a new instance needs
+    to be created, to further increase security. Metadata will also be
+    provided to this virtual appliance, which will take care of converting
+    them to environment variables for the installation scripts.
+
+In order to allow for the metadata to be sent inside the instance, a
+communication mechanism between the instance and the host will be created. This
+mechanism will be bidirectional (e.g.: to allow the setup process going on
+inside the instance to communicate its progress to the host). Each instance
+will have access exclusively to its own metadata, and it will only be able to
+communicate with its host over this channel.
+
+As part of the instance creation command it will be possible to indicate a URL
+for a "personalization package", that is an archive containing a set of files
+meant to be overlaid on top of the operating system file system at the end of
+the setup process, before the VM is started for the first time in ``run`` mode.
+Ganeti will provide a mechanism for receiving and unpacking this archive as
+part of the ``install`` execution mode, whereas in ``self_install`` mode it
+will only be provided as metadata for the instance to use.
+The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or ``.tgz``)
+and will contain the files according to the directory structure that will be
+recreated on the installation disk. Files contained in this archive will
+overwrite files with the same path created during the install procedure (if
+any).
+The URL of the "personalization package" will have to specify an extension to
+identify the file format (in order to allow for more formats to be supported in
+the future).
+The URL will be stored as part of the configuration of the instance (therefore,
+the URL should not contain confidential information, but the file available
+there can). It is up to the system administrator to ensure that a package
+is actually available at that URL at install and reinstall time.
+The content of the package is allowed to change. E.g.: a system administrator
+might create a package containing the private keys of the instance being
+created. When the instance is reinstalled, a new package with new keys can be
+made available there, therefore allowing instance reinstall without the need to
+store keys.
+
+Implementation
+==============
+
+The implementation of this design will happen as an ordered sequence of steps,
+of increasing impact on the system and, in some cases, dependent on each other:
+
+#. Private and secret instance parameters
+#. Communication mechanism between host and instance
+#. Metadata service
+#. Personalization package
+#. ``self_install`` mode
+#. ``install`` mode (with virtualization environment)
+
+Some of these steps need to be more deeply specified w.r.t. what is already
+written in the `Proposed changes`_ Section. Extra details will be provided in
+the following Subsections.
+
+Communication mechanism and metadata service
+++++++++++++++++++++++++++++++++++++++++++++
+
+The communication mechanism and the metadata service are described together
+because they are deeply tied. On the other hand, the communication mechanism
+will need to be more generic because it can be used for other reasons in the
+future (like allowing instances to explicitly send commands to Ganeti, or to
+let Ganeti control a helper instance, like the one introduced here for
+performing OS installs inside a safe environment).
+
+The communication mechanism will be enabled automatically when the instance is
+in ``self_install`` or ``install`` mode, but for backwards compatibility it
+will be disabled when the instance is in ``run`` mode unless it is explicitly
+requested at instance startup by using a new, ad-hoc, parameter
+(``--communication``).
+
+When the communication mechanism is enabled, Ganeti will create a new network
+interface inside the instance. This extra network interface will be the last
+one of the instance, after all the user defined ones. On the host side, this
+interface will only be accessible to the host itself, and will not be routed
+outside the machine.
+On this network interface, the instance will connect using the IP:
+169.254.169.1 and netmask 255.255.255.0.
+The host will be on the same network, with the IP address: 169.254.169.254.
+The instance will be able to connect to 169.254.169.254:80, and issue GET
+requests to an HTTP server that will provide the instance metadata.
+
+This IP address and port were chosen for compatibility with OpenStack's and
+Amazon EC2's ways of providing metadata to the instance.
+
+Where possible, the metadata will be provided in a way compatible with
+OpenStack at::
+
+  http://169.254.169.254/openstack/<version>/meta_data.json
+
+or with Amazon EC2, at::
+
+  http://169.254.169.254/<version>/meta-data/*
+
+If some metadata are Ganeti-specific and don't fit this structure, they will be
+provided at::
+
+  http://169.254.169.254/<version>/ganeti/meta_data.json
+
+``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to indicate
+the most recent available protocol version.
+
+A bi-directional, pipe-like communication channel will be provided. The
+instance will be able to receive data from the host by a GET request at::
+
+  http://169.254.169.254/<version>/ganeti/pipe_in
+
+and to send data to the host by a POST request at::
+
+  http://169.254.169.254/<version>/ganeti/pipe_out
+
+As in a pipe, once the data are read, they will not be in the buffer anymore,
+so subsequent GET requests to ``pipe_in`` will not return the same data twice.
+Unlike a pipe, though, it will not be possible to perform blocking I/O
+operations.
+
+The OS parameters will be accessible through a GET
+request at::
+
+  http://169.254.169.254/<version>/ganeti/os/parameters/<visibility>.json
+
+as a JSON serialized dictionary. ``<visibility>`` will be one of ``public``,
+``private`` or ``secret``.
+
+The installation scripts to be run inside the virtualized environment while the
+instance is run in ``install`` mode will be available at::
+
+  http://169.254.169.254/<version>/ganeti/os/scripts/<script_name>
+
+where ``<script_name>`` is the name of the script.
+
+The host and the instances (as detailed in `Installation process in a
+virtualized environment`_) will be able to create other communication channels
+on the other ports of the same IP address.
+
+
+Rationale
+---------
+
+The choice of using a network interface for instance-host communication, as
+opposed to VirtIO, XenBus or other methods, is motivated by the desire for a
+generic, hypervisor-independent way of creating a communication channel that
+doesn't require unusual (para)virtualization drivers.
+At the same time, a network interface was preferred over solutions involving
+virtual floppy or USB devices because the latter tend to be detected and
+configured by the guest operating systems, sometimes even in prominent
+positions in the user interface, whereas it is fairly common to have an
+unconfigured network interface in a system, usually without any negative side
+effects.
+
+
+Installation process in a virtualized environment
++++++++++++++++++++++++++++++++++++++++++++++++++
+
+In the new OS installation scenario, we distinguish between trusted and
+untrusted code.
+
+The trusted installation code maintains the behavior of the current one, with
+the scripts running on the node the instance is being created on. The untrusted
+code is stored in a subdirectory of the OS definition called ``untrusted``.
+This directory contains scripts that are equivalent to the already existing
+ones (``create``, ``export``, ``import``, ``rename``) but that will be run
+inside a virtualized environment, to protect the host from malicious
+tampering.
+
+The ``untrusted`` code is meant to either be untrusted itself, or to be trusted
+code running operations that might be dangerous (such as mounting a
+user-provided image).
+
+In order to allow for the highest flexibility, if both a trusted and an
+untrusted script are provided for the same operation (e.g. ``create``), both of
+them will be executed at the same time, one on the host, and one inside the
+installation appliance. They will be allowed to communicate with each other
+through the already described communication mechanism, in order to orchestrate
+their execution (e.g.: the untrusted code might execute the installation, while
+the trusted one receives status updates from it and delivers them to a user
+interface).
+
+Ganeti will provide a script to be run at install time that can be used to
+create the virtualized environment that will perform the OS installation of new
+instances.
+This script will build a basic debootstrapped Debian system including
+software that will read the metadata, set up the environment variables and
+launch the installation scripts inside the virtualized environment. The script
+will also provide hooks for personalization.
+
+It will also be possible to use other self-made virtualized environments, as
+long as they connect to Ganeti over the described communication mechanism and
+they know how to read and use the provided metadata to create a new instance.
+
+While performing an installation in the virtualized environment, a
+configurable timeout will be used to detect possible problems with the
+installation process, and to kill the virtualized environment.
+
+.. vim: set textwidth=72 :
+.. Local Variables:
+.. mode: rst
+.. fill-column: 72
+.. End:
-- 
1.7.10.4
