On Tue, Nov 12, 2013 at 12:41 PM, Michele Tartara <[email protected]> wrote:
> Add the document describing a new design for the OS installation process for
> new instances.
>
> Signed-off-by: Michele Tartara <[email protected]>
> ---
>  doc/design-draft.rst |    1 +
>  doc/design-os.rst    |  318 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 319 insertions(+)
>  create mode 100644 doc/design-os.rst
>
> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> index c821292..3ed3852 100644
> --- a/doc/design-draft.rst
> +++ b/doc/design-draft.rst
> @@ -20,6 +20,7 @@ Design document drafts
>     design-daemons.rst
>     design-hsqueeze.rst
>     design-ssh-ports.rst
> +   design-os.rst
>
>  .. vim: set textwidth=72 :
>  .. Local Variables:
> diff --git a/doc/design-os.rst b/doc/design-os.rst
> new file mode 100644
> index 0000000..7a42a7f
> --- /dev/null
> +++ b/doc/design-os.rst
> @@ -0,0 +1,318 @@
> +===============================
> +Ganeti OS installation redesign
> +===============================
> +
> +.. contents:: :depth: 3
> +
> +This is a design document detailing a new OS installation procedure, more
> +secure, able to provide more features and easier to use for many common tasks
> +w.r.t. the current one.
> +
> +Current state and shortcomings
> +==============================
> +
> +As of Ganeti 2.10, each instance is associated with an OS definition. An OS
> +definition is a set of scripts (``create``, ``export``, ``import``, ``rename``)
> +that are executed with root privileges on the primary host of the instance to
> +perform all the OS-related functionality (setting up an operating system inside
> +the disks of the instance being created, exporting/importing the instance,
> +renaming it).
> +
> +These scripts receive, as environment variables, a fixed set of parameters
> +describing the instance (such as the hypervisor, the name of the instance, the
> +number of disks, and their location) and a set of user defined parameters. Each
> +of these parameters is also written into the configuration file of Ganeti, to
> +allow for future reinstalls of the instance, and in various log files, namely:
> +
> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
> +
> +* master daemon log file: DEBUG strings related to the same RPC calls are stored
> +  here as well.
> +
> +* commands log: the CLI commands that create a new instance, including their
> +  parameters, are logged here.
> +
> +* RAPI log: the RAPI commands that create a new instance, including their
> +  parameters, are logged here.
> +
> +* job logs: the job files stored in the job queue or in its archive contain the
> +  parameters.
> +
> +The current situation presents a number of shortcomings:
> +
> +* Having the installation scripts run with root power on the nodes is a huge
> +  security issue.
> +

s/is a huge security issue/doesn't allow user-defined os scripts, as
they would pose a huge security issue/

Note that there's no security issue *per se* in the current situation,
if the OS scripts are trusted.
(except perhaps for export, if the os script mounts the instance disk,
which is also not necessarily the case)

That said, it could be a safety issue, in the sense that a bug or error
in the OS script could disrupt the node.

> +* Ganeti cannot be used to create instances starting from user provided disk
> +  images: even in the (hypothetical) case where the scripts are completely
> +  secure and run not by root but by an unprivileged user with only the power to
> +  mount arbitrary files as disk images, this is a security issue. It has been
> +  proven that a carefully crafted file system might exploit kernel
> +  vulnerabilities to gain control of the system. Therefore, directly mounting
> +  images on the Ganeti nodes is not an option.
> +
> +* There is no way to inject files into an existing disk image. A common use case
> +  is for the system administrator to provide a standard image of the system, to
> +  be later personalized with the network configuration, private keys identifying
> +  the machine, ssh keys of the users and so on. A possible workaround would be
> +  for the scripts to mount the image (only if this is trusted!) and to receive
> +  the configurations and ssh keys as user defined OS parameters. Unfortunately,
> +  this is also not an option for security sensitive material (such as the ssh
> +  keys) because the OS parameters are stored in many places on the system, as
> +  already described above.
> +
> +* Most other virtualization software simply work with instance images, not with
> +  installation scripts. This difference makes the interaction of Ganeti with
> +  other softwares difficult.

s/softwares/software/

> +
> +Proposed changes
> +================
> +
> +In order to fix the shortcomings of the current state, we plan to introduce the
> +following changes:
> +
> +* Change the OS parameters to have three categories:
> +
> + * ``public``: the current behavior. The parameter is logged and stored freely.
> +
> + * ``private``: the parameter is saved inside the Ganeti configuration (to allow
> +   for instance reinstall) but it is not shown in logs, job logs, or passed back
> +   via RAPI.
> +
> + * ``secret``: the parameter is not saved inside the Ganeti configuration.
> +   Reinstalls are impossible unless the data is passed again. The parameter will
> +   not appear in any log file. In order to preserve the functionality of Ganeti,
> +   the parameters will still need to be stored in the job files, but they will
> +   be removed from there when the job has finished running (either successfully
> +   or not).
> +

Do we actually need to save them in the job files?
The job files could be saved (to disk) without, and in case the master
is failed over the job can be failed.
(this should make it a lot harder to access)
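
To make the three categories concrete, here is a minimal Python sketch of
how the visibility classes could be kept apart. The names (OsParams,
for_log, for_config, for_job_file) are hypothetical and are not existing
Ganeti code; the job-file variant assumes that secrets are simply never
written to disk, as suggested above.

```python
import json
from dataclasses import dataclass, field
from typing import Dict

REDACTED = "<redacted>"  # hypothetical placeholder shown instead of a value


@dataclass
class OsParams:
    """Hypothetical container for the three proposed visibility classes."""
    public: Dict[str, str] = field(default_factory=dict)
    private: Dict[str, str] = field(default_factory=dict)
    secret: Dict[str, str] = field(default_factory=dict)

    def for_log(self) -> Dict[str, str]:
        """View that is safe to log: only public values are readable."""
        out = dict(self.public)
        out.update({name: REDACTED for name in self.private})
        out.update({name: REDACTED for name in self.secret})
        return out

    def for_config(self) -> Dict[str, Dict[str, str]]:
        """What would be persisted in the configuration: no secrets."""
        return {"public": dict(self.public), "private": dict(self.private)}

    def for_job_file(self) -> str:
        """On-disk job file (assumption): secrets are kept in memory only."""
        return json.dumps(self.for_config(), sort_keys=True)
```

With this split, a job that loses its in-memory secrets (e.g. after a
master failover) has to be failed rather than resumed.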

> +* A new OS installation procedure, based on a safe virtualized environment.
> +  This virtualized environment will run with the same hardware parameters as the
> +  actual instance being installed, as much as possible. This will also allow to
> +  reduce the memory usage in the host (specifically, in Dom0 for Xen
> +  installations).
> Each instance will have these possible execution modes:
> +
> +  * ``run``: the default mode, used when the machine is running normally.
> +
> +  * ``self_install``: Ganeti will start the instance with a different set of
> +    user-specified parameters, therefore allowing to attach an installation
> +    floppy/cdrom/network, change the boot device order, or specify an OS image
> +    to be used. The instance will then be responsible to get the parameters for
> +    configuring itself (its network interfaces, IP address, hostname, etc.) from
> +    a set of metadata provided to it by Ganeti (e.g.: using an approach
> +    comparable to the one of the ``cloud-init`` tool). When this installation
> +    mode is used, no OS installation script is required.
> +    In order for installation of an OS from an image to be possible, a new
> +    parameter ``--os-image`` will be added, allowing to specify where to take
> +    the image from. It will have to be mutually exclusive with ``--os-type``. If
> +    ``--os-image`` is specified, ``--os-parameters`` can still be used, as it
> +    will be passed to the instance as part of the metadata.
> +    The set of ``self_install`` parameters will be stored as part of the
> +    instance configuration, so that they can be used to reinstall the instance.
> +    It will be the user's responsibility to ensure that the OS image or any
> +    installation media is still available in the proper position when a
> +    reinstall happens.
> +

Should we use --os-type image:<name> and/or have an image os provider
that defines:
1) the actual parameters needed for installation
2) the image (eg. the verify script could double check that the image
is available from the node or accessible via the network...)

I think in particular it would be useful to still have the concept of
an OS "provider" that tells ganeti how to install itself (which
parameters to use). This of course could be overridable, but at least
there would be a sane default without relying on the user to "get it
right".
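
As an illustration of the ``--os-type image:<name>`` idea, a small Python
sketch of how the value could be split into a provider and an image name.
This is only one possible syntax; OsSpec and parse_os_type are
hypothetical names, not part of Ganeti.

```python
from typing import NamedTuple, Optional


class OsSpec(NamedTuple):
    """Hypothetical parsed form of an --os-type value."""
    provider: str          # name of the OS provider/definition
    image: Optional[str]   # image name, only set for the "image" provider


def parse_os_type(value: str) -> OsSpec:
    """Split "image:<name>" into provider and image; pass others through."""
    provider, sep, rest = value.partition(":")
    if provider == "image" and sep:
        if not rest:
            raise ValueError("image name missing in %r" % value)
        return OsSpec("image", rest)
    # Any other value is treated as a plain OS definition name.
    return OsSpec(value, None)
```

The "image" provider would then be the place where the default
installation parameters and the image availability check live.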

> +  * ``install``: Ganeti will start the instance using a virtual appliance
> +    specifically made for installing Ganeti instances. Scripts analogous to the
> +    current ones will run inside this instance. The disks of the instance being
> +    installed will be connected to this virtual appliance, so that the scripts
> +    can mount them and modify them as needed, as currently happens, but with the
> +    additional protection given by this happening in a VM. The virtual appliance
> +    will be started in a clean state every time a new instance needs to be
> +    created, to further increase security. Metadata will be provided also to
> +    this virtual appliance, that will take care of converting them to
> +    environment variables for the installation scripts.
> +

Please specify better that by "will be started in a clean state" you
actually mean "the disk will be reset to its pristine state and not
reused between reinstallation" because it might be construed to mean
just the "booting" (runtime info) which is sort of less strict.

> +In order to allow for the metadata to be sent inside the instance, a
> +communication mechanism between the instance and the host will be created. This
> +mechanism will be bidirectional (e.g.: to allow the setup process going on
> +inside the instance to communicate its progress to the host). Each instance will
> +have access exclusively to its own metadata, and it will be only able to
> +communicate with its host over this channel.
> +

Too vague :)


> +As part of the instance creation command it will be possible to indicate a URL
> +for a "personalization package", that is an archive containing a set of files
> +meant to be overlaid on top of the operating system file system at the end of
> +the setup process, before the VM is started for the first time in ``run`` mode.
> +Ganeti will provide a mechanism for receiving and unpacking this archive as part
> +of the ``install`` execution mode, whereas in ``self_install`` mode it will only
> +be provided as a metadata for the instance to use.
> +The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or ``.tgz``)
> +and will contain the files according to the directory structure that will be
> +recreated on the installation disk. Files contained in this archive will
> +overwrite files with the same path created during the install procedure (if
> +any).
> +The URL of the "personalization package" will have to specify an extension to
> +identify the file format (in order to allow for more formats to be supported in
> +the future).
> +The URL will be stored as part of the configuration of the instance (therefore,
> +the URL should not contain confidential information, but the file there
> +available can). It is up to the system administrator to ensure that a package
> +is actually available at that URL at install and reinstall time.
> +The content of the package is allowed to change. E.g.: a system administrator
> +might create a package containing the private keys of the instance being
> +created. When the instance is reinstalled, a new package with new keys can be
> +made available there, therefore allowing instance reinstall without the need to
> +store keys.
> +

Perhaps add something about authentication (so that an admin can make
a file available only to the Ganeti installer, and only for the duration
of the installation), and also state that we won't cache or keep the
file on the node OS.

> +Implementation
> +==============
> +
> +The implementation of this design will happen as an ordered sequence of steps,
> +of increasing impact on the system and, in some cases, dependent on each other:
> +
> +#. Private and secret instance parameters
> +#. Communication mechanism between host and instance
> +#. Metadata service
> +#. Personalization package
> +#. ``self_install`` mode
> +#. ``install`` mode (with virtualization environment)
> +
> +Some of these steps need to be more deeply specified w.r.t. what is already
> +written in the `Proposed changes`_ Section. Extra details will be provided in
> +the following Subsections.
> +
> +Communication mechanism and metadata service
> +++++++++++++++++++++++++++++++++++++++++++++
> +
> +The communication mechanism and the metadata service are described together
> +because they are deeply tied. On the other hand, the communication mechanism
> +will need to be more generic because it can be used for other reasons in the
> +future (like allowing instances to esplicitly send commands to Ganeti, or to let

explicitly

> +Ganeti control a helper instance, like the one hereby introduced for performing
> +OS installs inside a safe environment).
> +
> +The communication mechanism will be enabled automatically when the instance is
> +in ``self_install`` or ``install`` mode, but for backwards compatibility it will
> +be disabled when the instance is in ``run`` mode unless it is esplicitly

^ see above

> +requested at instance startup by using a new, ad-hoc, parameter
> +(``--communication``).

Which parameter is this? An instance, hypervisor or backend parameter? And why?
Also, -C could do as well (if we go for instance level). Remember to
specify it here: it has to be clear that an instance, once configured
that way, will always be started that way.

> +
> +When the communication mechanism is enabled, Ganeti will create a new network
> +interface inside the instance. This extra network interface will be the last one
> +of the instance, after all the user defined ones. On the host side, this
> +interface will be only accessible to the host itself, and not be routed outside
> +the machine.

Actually it would be great if we didn't even have to create the tap.

> +On this network interface, the instance will connect using the IP:
> +169.254.169.1 and netmask 255.255.255.0.
> +The host will be on the same network, with the IP address: 169.254.169.254.
> +The instance will be able to connect to 169.254.169.254:80, and issue GET
> +requests to an HTTP server that will provide the instance metadata.
> +
> +The choice of this IP address and port is done for compatibility reasons with
> +OpenStack's and Amazon EC2's ways of providing metadata to the instance.
> +
> +Where possible, the metadata will be provided in a way compatible with OpenStack
> +at::
> +
> +  http://169.254.169.254/openstack/<version>/meta_data.json
> +
> +or with Amazon EC2, at::
> +
> +  http://169.254.169.254/<version>/meta-data/*
> +
> +If some metadata are Ganeti-specific and don't fit this structure, they will be
> +provided at::
> +
> +  http://169.254.169.254/<version>/ganeti/meta_data.json
> +

Not quite clear! :) How does the OS choose between those? How are they
expected to differ?
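
To show what I mean, this is roughly what a guest-side client would have
to do if it is left to pick an endpoint by itself. The probing order
(Ganeti-specific first, then the OpenStack-compatible one) is purely an
assumption of this sketch, as are the function names; the design should
state the intended behaviour explicitly.

```python
import json
from urllib.request import urlopen

METADATA_HOST = "169.254.169.254"  # address given in the design


def metadata_urls(version="latest"):
    """Candidate JSON endpoints, in the (assumed) order they are probed."""
    base = "http://%s" % METADATA_HOST
    return [
        "%s/%s/ganeti/meta_data.json" % (base, version),
        "%s/openstack/%s/meta_data.json" % (base, version),
    ]


def fetch_metadata(version="latest", fetch=None):
    """Return (url, parsed JSON) for the first endpoint that answers."""
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=5) as resp:
                return resp.read()
    for url in metadata_urls(version):
        try:
            return url, json.loads(fetch(url))
        except (OSError, ValueError):
            # Endpoint missing or not valid JSON: try the next one.
            continue
    raise LookupError("no metadata endpoint answered")
```

The ``fetch`` argument only exists so that the lookup logic can be
exercised without a metadata server.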

> +``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to indicate
> +the most recent available protocol version.
> +

Is this what openstack and EC2 do?

> +A bi-directional, pipe-like communication channel will be provided. The instance
> +will be able to receive data from the host by a GET request at::
> +
> +  http://169.254.169.254/<version>/ganeti/pipe_in
> +
> +and to send data to the host by a POST request at::
> +
> +  http://169.254.169.254/<version>/ganeti/pipe_out
> +

Why is it /openstack/<version>
but <version>/meta-data
and <version>/ganeti ?
Can we have it a bit more logical?

> +As in a pipe, once the data are read, they will not be in the buffer anymore, so
> +subsequent GET requests to ``pipe_in`` will not return the same data twice.
> +Unlike a pipe, though, it will not be possible to perform blocking I/O
> +operations.
> +

So maybe we should just call it read and write? :)
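
In other words, the semantics described above are those of a read-once,
non-blocking buffer, which can be sketched in a few lines of Python
(ReadOnceBuffer is a hypothetical name, and locking is left out):

```python
from collections import deque


class ReadOnceBuffer:
    """Data can be read exactly once; reads never block."""

    def __init__(self):
        self._chunks = deque()

    def write(self, data: bytes) -> None:
        """Queue data for the other side (what a POST would do)."""
        if data:
            self._chunks.append(data)

    def read(self) -> bytes:
        """Return everything queued so far and drop it (what a GET would do).

        Returns b"" immediately when nothing is pending, instead of
        blocking as a real pipe would.
        """
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data
```

One such buffer per direction and per instance would be enough to back
the two URLs.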


> +The OS parameters will be accessible through a GET
> +request at::
> +
> +  http://169.254.169.254/<version>/ganeti/os/parameters/<visibility>.json
> +
> +as a JSON serialized dictionary. ``<visibility>`` will be either ``public`` or
> +``private`` or ``secret``.
> +

Why does the instance care about the visibility, and why is this
provided at the file level? Couldn't a single json contain all info,
with also ancillary data to specify the level of confidentiality?
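
Something along these lines, as a sketch in Python: one document in which
each parameter carries its value together with its visibility. The field
names ("value", "visibility") and the function name are made up for the
example.

```python
import json


def os_parameters_document(public, private, secret):
    """Serialize all OS parameters into one JSON dictionary.

    Each entry maps the parameter name to its value plus a visibility
    annotation, instead of having one file per visibility level.
    """
    doc = {}
    groups = (("public", public), ("private", private), ("secret", secret))
    for visibility, params in groups:
        for name, value in params.items():
            if name in doc:
                raise ValueError("parameter defined twice: %s" % name)
            doc[name] = {"value": value, "visibility": visibility}
    return json.dumps(doc, sort_keys=True)
```

The instance can then read everything with a single request and still
knows which values it should avoid writing to its own logs.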

> +The installation scripts to be run inside the virtualized environment while the
> +instance is run in ``install`` mode will be available at::
> +
> +  http://169.254.169.254/<version>/ganeti/os/scripts/<script_name>
> +
> +where ``<script_name>`` is the name of the script.
> +
> +The host and the instances (as detailed in `Installation process in a
> +virtualized environment`_) will be able to create other communication channels
> +on the other ports of the same IP address.
> +

Why not at other URLs?

> +
> +Rationale
> +---------
> +
> +The choice of using a network interface for instance-host communication, as
> +opposed to VirtIO, XenBus or other methods, is due to the will of having a
> +generic, hypervisor-independent way of creating a communication channel, that
> +doesn't require unusual (para)virtualization drivers.
> +At the same time, a network interface was preferred over solutions involving
> +virtual floppy or USB devices because the latter tend to be detected and
> +configured by the guest operating systems, sometimes even in prominent positions
> +in the user interface, whereas it is fairly common to have an unconfigured
> +network interface in a system, usually without any negative side effects.
> +
> +
> +Installation process in a virtualized environment
> ++++++++++++++++++++++++++++++++++++++++++++++++++
> +
> +In the new OS installation scenario, we distinguish between trusted and
> +untrusted code.
> +
> +The trusted installation code maintains the behavior of the current one, with
> +the scripts running on the node the instance is being created on. The untrusted
> +code is stored in a subdirectory of the OS definition called ``untrusted``.
> +This directory contains scripts that are equivalent to the already existing
> +ones (``create``, ``export``, ``import``, ``rename``) but that will be run
> +inside a virtualized environment, to protect the host from malicious tampering.
> +
> +The ``untrusted`` code is meant to either be untrusted itself, or to be trusted
> +code running operations that might be dangerous (such as mounting a
> +user-provided image).
> +
> +In order to allow for the highest flexibility, if both a trusted and an
> +untrusted script are provided for the same operation (e.g. ``create``), both of
> +them will be executed at the same time, one on the host, and one inside the
> +installation appliance. They will be allowed to communicate with each other
> +through the already described communication mechanism, in order to orchestrate
> +their execution (e.g.: the untrusted code might execute the installation, while
> +the trusted one receives status updates from it and delivers them to a user
> +interface).
> +

Sounds a bit clunky, and makes it hard to accept OS definitions provided
by the user (as an admin I have to "open" them and check that the trusted
scripts are empty or allowed...). Maybe this should be a new version that
disallows the old way altogether.

> +Ganeti will provide a script to be run at install time that can be used to
> +create the virtualized environment that will perform the OS installation of new
> +instances.
> +This script will build a debootstrapped basic debian system including including

s/including including/including/

> +a software that will read the metadata, set up the environment variables and
> +launch the installation scripts inside the virtualized environment. The script
> +will also provide hooks for personalization.
> +



> +It will also be possible to use other self-made virtualized environments, as long
> +as they connect to Ganeti over the described communication mechanism and they
> +know how to read and use the provided metadata to create a new instance.
> +
> +While performing an installation in the virtualized environment, a
> +personalizable timeout will be used to detect possible problems with the
> +installation process, and to kill the virtualized environment.
> +

Will the timeout be reset upon communication? Will there be a way to reset it?
How will it be customizable? Who specifies where to customize it?
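
For example, if the intent is "kill the appliance after N seconds without
any sign of life", the logic would be something like the following Python
sketch. The reset-on-communication behaviour is my assumption, not
something the text states, and InstallWatchdog is a hypothetical name.

```python
import time


class InstallWatchdog:
    """Deadline that is pushed forward whenever the installer reports in."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout      # seconds of silence that are tolerated
        self._clock = clock          # injectable for testing
        self._deadline = clock() + timeout

    def touch(self):
        """Call on any communication from the appliance: resets the timer."""
        self._deadline = self._clock() + self._timeout

    def expired(self):
        """True once the appliance has been silent for the whole timeout."""
        return self._clock() >= self._deadline
```

Whether the timeout is a cluster, OS or per-installation setting still
needs to be written down.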

Thanks,

Guido
