On Wed, Dec 11, 2013 at 4:10 PM, Jose A. Lopes <[email protected]> wrote:
> On Mon, Dec 09, 2013 at 10:30:17AM +0100, Michele Tartara wrote:
>> Add the document describing a new design for the OS installation process for
>> new instances.
>>
>> Signed-off-by: Michele Tartara <[email protected]>
>> ---
>>  doc/design-draft.rst |   1 +
>>  doc/design-os.rst    | 399 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 400 insertions(+)
>>  create mode 100644 doc/design-os.rst
>>
>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> index c821292..3ed3852 100644
>> --- a/doc/design-draft.rst
>> +++ b/doc/design-draft.rst
>> @@ -20,6 +20,7 @@ Design document drafts
>>     design-daemons.rst
>>     design-hsqueeze.rst
>>     design-ssh-ports.rst
>> +   design-os.rst
>>
>>  .. vim: set textwidth=72 :
>>  .. Local Variables:
>> diff --git a/doc/design-os.rst b/doc/design-os.rst
>> new file mode 100644
>> index 0000000..a26801a
>> --- /dev/null
>> +++ b/doc/design-os.rst
>> @@ -0,0 +1,399 @@
>> +===============================
>> +Ganeti OS installation redesign
>> +===============================
>> +
>> +.. contents:: :depth: 3
>> +
>> +This is a design document detailing a new OS installation procedure, more
>> +secure, able to provide more features and easier to use for many common tasks
>> +w.r.t. the current one.
>> +
>> +Current state and shortcomings
>> +==============================
>> +
>> +As of Ganeti 2.10, each instance is associated with an OS definition. An OS
>> +definition is a set of scripts (``create``, ``export``, ``import``, ``rename``)
>> +that are executed with root privileges on the primary host of the instance to
>> +perform all the OS-related functionality (setting up an operating system inside
>> +the disks of the instance being created, exporting/importing the instance,
>> +renaming it).
>> +
>> +These scripts receive, as environment variables, a fixed set of parameters
>> +describing the instance (such as the hypervisor, the name of the instance, the
>> +number of disks, and their location) and a set of user defined parameters. Each
>> +of these parameters is also written into the configuration file of Ganeti, to
>> +allow for future reinstalls of the instance, and in various log files, namely:
>> +
>> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
>> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
>> +
>> +* master daemon log file: DEBUG strings related to the same RPC calls are stored
>> +  here as well.
>> +
>> +* commands log: the CLI commands that create a new instance, including their
>> +  parameters, are logged here.
>> +
>> +* RAPI log: the RAPI commands that create a new instance, including their
>> +  parameters, are logged here.
>> +
>> +* job logs: the job files stored in the job queue or in its archive contain the
>> +  parameters.
>> +
>> +The current situation presents a number of shortcomings:
>> +
>> +* Having the installation scripts run with root power on the nodes doesn't allow
>> +  user-defined OS scripts, as they would pose a huge security issue.
>> +  Furthermore, even a script without malicious intentions might end up
>> +  disrupting a node because of a bug in it.
>> +
>> +* Ganeti cannot be used to create instances starting from user provided disk
>> +  images: even in the (hypothetical) case where the scripts are completely
>> +  secure and run not by root but by an unprivileged user with only the power to
>> +  mount arbitrary files as disk images, this is a security issue. It has been
>> +  proven that a carefully crafted file system might exploit kernel
>> +  vulnerabilities to gain control of the system. Therefore, directly mounting
>> +  images on the Ganeti nodes is not an option.
>> +
>> +* There is no way to inject files into an existing disk image. A common use case
>> +  is for the system administrator to provide a standard image of the system, to
>> +  be later personalized with the network configuration, private keys identifying
>> +  the machine, ssh keys of the users and so on. A possible workaround would be
>> +  for the scripts to mount the image (only if this is trusted!) and to receive
>> +  the configurations and ssh keys as user defined OS parameters. Unfortunately,
>> +  this is also not an option for security sensitive material (such as the ssh
>> +  keys) because the OS parameters are stored in many places on the system, as
>> +  already described above.
>> +
>> +* Most other virtualization software simply works with instance images, not with
>> +  installation scripts. This difference makes the interaction of Ganeti with
>> +  other software difficult.
>> +
>> +Proposed changes
>> +================
>> +
>> +In order to fix the shortcomings of the current state, we plan to introduce the
>> +following changes:
>> +
>> +* Change the OS parameters to have three categories:
>> +
>> + * ``public``: the current behavior. The parameter is logged and stored freely.
>> +
>> + * ``private``: the parameter is saved inside the Ganeti configuration (to allow
>> +   for instance reinstall) but it is not shown in logs, job logs, or passed back
>> +   via RAPI.
>> +
>> + * ``secret``: the parameter is not saved inside the Ganeti configuration.
>> +   Reinstalls are impossible unless the data is passed again. The parameter will
>> +   not appear in any log file. When a functionality is performed jointly by
>> +   multiple daemons (such as MasterD and LuxiD), currently Ganeti sometimes
>> +   serializes jobs on disk and later reloads them. Secret parameters will not be
>> +   serialized on disk. They will be passed around as part of the LUXI calls
>> +   exchanged by the daemons, and only kept in memory, in order to reduce their
>> +   accessibility as much as possible. In case of a failure of the master node,
>> +   these parameters will be lost and cannot be recovered because they are not
>> +   serialized on file, therefore the job cannot be taken over by the new master.
>> +   This is an expected and accepted side effect of jobs with secret parameters:
>> +   if they fail, they'll have to be restarted manually.
>> +
>> +* A new OS installation procedure, based on a safe virtualized environment.
>> +  This virtualized environment will run with the same hardware parameters as the
>> +  actual instance being installed, as much as possible. This will also make it
>> +  possible to reduce the memory usage in the host (specifically, in Dom0 for Xen
>> +  installations). Each instance will have these possible execution modes:
>> +
>> +  * ``default``: the default mode, used when the machine is running normally and
>> +    the OS installation procedure is run before starting the instance for the
>> +    first time.
>
> Is this supposed to be ``run`` instead of ``default``?  Here it says
> default, but the rest of the document keeps mentioning the ``run`` mode,
> which doesn't seem to be anywhere.
>
> Thanks,
> Jose
>

Of course it is. Thanks for spotting it, I'll change it to "run".

Cheers,
Michele

-- 
Google Germany GmbH
Dienerstr. 12
80331 München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
