On Thu, Jan 16, 2014 at 2:38 PM, Jose A. Lopes <[email protected]> wrote:
> Interdiff
>   * address discussion about general communication mechanism, hooks and 
> iptables
>   * address other comments
>   * spell-checking :)
>
> Interdiff follows and full doc in attachment.
>
> Thanks,
> Jose
>
> diff --git a/doc/design-os.rst b/doc/design-os.rst
> index cc1b13c..8b96df7 100644
> --- a/doc/design-os.rst
> +++ b/doc/design-os.rst
> @@ -12,17 +12,18 @@ Current state and shortcomings
>  ==============================
>
>  As of Ganeti 2.10, each instance is associated with an OS definition. An OS
> -definition is a set of scripts (``create``, ``export``, ``import``, 
> ``rename``)
> -that are executed with root privileges on the primary host of the instance to
> -perform all the OS-related functionality (setting up an operating system 
> inside
> -the disks of the instance being created, exporting/importing the instance,
> -renaming it).
> -
> -These scripts receive through environment variables a fixed set of parameters
> -related to the instance (such as the hypervisor, the name of the instance, 
> the
> -number of disks, and their location) and a set of user defined parameters.
> -These parameters are also written in the configuration file of Ganeti, to 
> allow
> -future reinstalls of the instance, and in various log files, namely:
> +definition is a set of scripts (i.e., ``create``, ``export``, ``import``,
> +``rename``) that are executed with root privileges on the primary host of the
> +instance.  These scripts are responsible perform all the OS-related tasks,

responsible *to* perform

> +namely, create an instance, setup an operating system on the instance's 
> disks,
> +export/import the instance, and rename the instance.
> +
> +These scripts receive, through environment variables, a fixed set of instance
> +parameters (such as, the hypervisor, the name of the instance, the number of
> +disks and their location) and a set of user defined parameters.  Both the
> +instance and user defined parameters are written in the configuration file of
> +Ganeti, to allow future reinstalls of the instance, and in various log files,
> +namely:
>
>  * node daemon log file: contains DEBUG strings of the ``/os_validate``,
>    ``/instance_os_add`` and ``/instance_start`` RPC calls.
> @@ -41,31 +42,31 @@ future reinstalls of the instance, and in various log 
> files, namely:
>
>  The current situation presents a number of shortcomings:
>
> -* Having the installation scripts run as root on the nodes doesn't allow
> -  user-defined OS scripts, as they would pose a huge security issue.
> +* Having the installation scripts run as root on the nodes does not allow
> +  user-defined OS scripts, as they would pose a huge security risk.
>    Furthermore, even a script without malicious intentions might end up
> -  distrupting a node because of a bug in it.
> +  disrupting a node because of due to a bug.
>
>  * Ganeti cannot be used to create instances starting from user provided disk
> -  images: even in the (hypothetical) case where the scripts are completely
> +  images: even in the (hypothetical) case in which the scripts are completely
>    secure and run not by root but by an unprivileged user with only the power 
> to
> -  mount arbitrary files as disk images, this is a security issue. It has been
> -  proven that a carefully crafted file system might exploit kernel
> +  mount arbitrary files as disk images, this is still a security issue. It 
> has
> +  been proven that a carefully crafted file system might exploit kernel
>    vulnerabilities to gain control of the system. Therefore, directly mounting
>    images on the Ganeti nodes is not an option.
>
>  * There is no way to inject files into an existing disk image. A common use 
> case
>    is for the system administrator to provide a standard image of the system, 
> to
>    be later personalized with the network configuration, private keys 
> identifying
> -  the machine, ssh keys of the users and so on. A possible workaround would 
> be
> +  the machine, ssh keys of the users, and so on. A possible workaround would 
> be
>    for the scripts to mount the image (only if this is trusted!) and to 
> receive
>    the configurations and ssh keys as user defined OS parameters. 
> Unfortunately,
>    this is also not an option for security sensitive material (such as the ssh
>    keys) because the OS parameters are stored in many places on the system, as
>    already described above.
>
> -* Most other virtualization software simply work with instance images, not 
> with
> -  installation scripts. This difference makes the interaction of Ganeti with
> +* Most other virtualization software allow only instance images, but no
> +  installation scripts. This difference makes the interaction between Ganeti 
> and
>    other software difficult.
>
>  Proposed changes
> @@ -74,7 +75,6 @@ Proposed changes
>  In order to fix the shortcomings of the current state, we plan to introduce 
> the
>  following changes.
>
> -
>  OS parameter categories
>  +++++++++++++++++++++++
>
> @@ -99,7 +99,6 @@ Change the OS parameters to have three categories:
>    is an expected and accepted side effect of jobs with secret parameters: if
>    they fail, they'll have to be restarted manually.
>
> -
>  Metadata
>  ++++++++
>
> @@ -112,40 +111,42 @@ with its host over this channel.  This is the approach 
> followed the
>  ``cloud-init`` tool and more details will be provided in the `Communication
>  mechanism`_ and `Metadata service`_ sections.
>
> -
>  Installation procedure
>  ++++++++++++++++++++++
>
> -A new installation procedure will be introduced, with which it will be 
> possible
> -to use an installation medium and run the OS scripts in an optional 
> virtualized
> -environment and with an optional personalization package.  There will be two
> -sets of parameters, namely, installation parameters, which are used mainly 
> for
> -installs and reinstalls, and execution parameters, which are used in all the
> -other runs that are not part of an installation procedure.
> -
> -This set of installation parameters will allow, e.g., to attach an 
> installation
> -floppy/cdrom/network, change the boot device order, or specify a disk image 
> to
> -be used.  Through this set of parameters, the administrator will have to 
> provide
> -the hypervisor a location for an installation medium for the instance (e.g., 
> a
> -boot disk, a network image, etc).  This medium will carry out the 
> installation
> -of the instance onto the instance's disks and will then be responsible for
> -getting the parameters for configuring the instance, such as, network
> -interfaces, IP address, and hostname.  These parameters are taken from the
> -metadata.  The installation parameters will be stored in the configuration of
> -Ganeti and used in future reinstalls, but not during normal execution.
> +A new installation procedure will be introduced.  There will be two sets of
> +parameters, namely, installation parameters, which are used mainly for 
> installs
> +and reinstalls, and execution parameters, which are used in all the other 
> runs
> +that are not part of an installation procedure.  Also, it will be possible to
> +use an installation medium and/or run the OS scripts in an optional 
> virtualized
> +environment, and optionally use a personalization package.  This section 
> details
> +all of these options.
> +
> +The set of installation parameters will allow, for example, to attach an
> +installation floppy/cdrom/network, change the boot device order, or specify a
> +disk image to be used.  Through this set of parameters, the administrator 
> will
> +have to provide the hypervisor a location for an installation medium for the
> +instance (e.g., a boot disk, a network image, etc).  This medium will carry 
> out
> +the installation of the instance onto the instance's disks and will then be
> +responsible for getting the parameters for configuring the instance, such as,
> +network interfaces, IP address, and hostname.  These parameters are taken 
> from
> +the metadata.  The installation parameters will be stored in the 
> configuration
> +of Ganeti and used in future reinstalls, but not during normal execution.
>
>  The instance is reinstalled using the same installation parameters from the
>  first installation.  However, it will be the administrator's responsibility 
> to
> -ensure that the any installation media is still available at the proper 
> location
> +ensure that the installation media is still available at the proper location
>  when a reinstall occurs.
>
>  The parameter ``--os-parameters`` can still be used to specify the OS
>  parameters.  However, without OS scripts, Ganeti cannot do more than a 
> syntactic
> -check to validate the supplied OS parameters string.  As a result, this 
> string
> -will be directly passed to the instance as part of the metadata.  If the
> -installation procedure is running inside a virtualized environment, then 
> Ganeti
> -will take these parameters from the metadata and pass them to the OS scripts 
> as
> -environment variables.
> +check to validate the supplied OS parameter string.  As a result, this string
> +will be passed directly to the instance as part of the metadata.  If OS 
> scripts
> +are used and the installation procedure is running inside a virtualized
> +environment, Ganeti will take these parameters from the metadata and pass 
> them
> +to the OS scripts as environment variables.
> +
> +Ganeti allows the following installation options:
>
>  * Use a disk image:
>
> @@ -159,13 +160,13 @@ environment variables.
>
>    The parameter ``--os-type`` (short version: ``-o``), is currently used to
>    specify the OS scripts.  This parameter will still be used to specify the 
> OS
> -  scripts with the difference that these OS scripts may optionally run 
> inside a
> +  scripts with the difference that these scripts may optionally run inside a
>    virtualized environment for safety reasons, depending on whether they are
>    trusted or not.  For more details on trusted and untrusted OS scripts, 
> refer
>    to the `Installation process in a virtualized environment`_ section.  Note
> -  also that this parameter will become optional thus allowing a user to 
> create
> -  an instance specifying only, for example, a disk image or a cdrom image to
> -  boot from.
> +  that this parameter will become optional thus allowing a user to create an
> +  instance specifying only, for example, a disk image or a cdrom image to 
> boot
> +  from.
>
>  * Personalization package
>
> @@ -173,9 +174,9 @@ environment variables.
>    URL for a "personalization package", which is an archive containing a set 
> of
>    files meant to be overlayed on top of the OS file system at the end of the
>    setup process and before the VM is started for the first time in normal 
> mode.
> -  Ganeti will provide a mechanism for receiving and unpacking this archive
> -  whether the installation is being performed inside the virtualized 
> environment
> -  or not.
> +  Ganeti will provide a mechanism for receiving and unpacking this archive,
> +  independently of whether the installation is being performed inside the
> +  virtualized environment or not.
>
>    The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or
>    ``.tgz``) and contain the files according to the directory structure that 
> will
> @@ -210,9 +211,8 @@ environment variables.
>  * Combine a disk image, OS scripts, and a personalization package
>
>    It will possible to combine a disk image, OS scripts, and a personalization
> -  package, both with or without a virtualized environment.  There is one
> -  exception which is if there are untrusted OS scripts.  At least, an
> -  installation medium or OS scripts should be specified.
> +  package, both with or without a virtualized environment (see the exception
> +  below). At least, an installation medium or OS scripts should be specified.
>
>    The disk image of the actual virtual appliance, which bootstraps the 
> virtual
>    environment used in the installation procedure, will be read only, so that 
> a
> @@ -248,12 +248,34 @@ the following subsections.
>  Communication mechanism
>  +++++++++++++++++++++++
>
> -The communication mechanism will be a generic communication channel between
> -Ganeti and the instances, not only to provide access to the metadata service,
> -but also to allow instances to send commands directly to Ganeti or request
> -changes to parameters, such as, those related to the distribution upgrades, 
> or
> -even let Ganeti control a helper instance, such as, the one for performing OS
> -installs inside a safe environment, as introduced in this document.
> +The communication mechanism will be an exclusive, generic, bidirectional
> +communication channel between Ganeti hosts and guests.
> +
> +exclusive
> +  The communication mechanism allows communication between a guest and its 
> host,
> +  but it does not allow a guest to communicate with other guests or reach the
> +  outside world.
> +
> +generic
> +  The communication mechanism allows a guest to reach any service on the 
> host,
> +  not just the metadata service.  Examples of valid communication include, 
> but
> +  are not limited to, access to the metadata service, send commands to 
> Ganeti,
> +  request changes to parameters, such as, those related to the distribution
> +  upgrades, and let Ganeti control a helper instance, such as, the one for
> +  performing OS installs inside a safe environment.
> +
> +bidirectional
> +  The communication mechanism allows communication to be initiated from 
> either
> +  party, namely, from a host to a guest or guest to host.
> +
> +Note that Ganeti will allow communication with any service (e.g., daemon) 
> running
> +on the host and, as a result, Ganeti will not be responsible for ensuring 
> that
> +only the metadata service is reachable.  It is the responsibility of each 
> system
> +administrator to ensure that the extra firewalling and routing rules 
> specified
> +on the host provide the necessary protection on a given Ganeti installation 
> and,
> +at the same time, do not accidentally override the behaviour hereby described
> +which makes the communication between the host and the guest exclusive, 
> generic,
> +and bidirectional, unless intended.
>
>  The communication mechanism will be enabled automatically during an 
> installation
>  procedure that requires a virtualized environment, but, for backwards
> @@ -265,29 +287,28 @@ enabled for a particular instance.  The value of this 
> parameter will be saved as
>  part of the instance's configuration.
>
>  The communication mechanism will be implemented through network interfaces on
> -the host and the guest, and Ganeti will be responsible for creating and
> -configure the interfaces on the host side.  The host will create a TAP 
> network
> -interface for each guest.  This network interface will be connected to the
> -guest's last network interface, which is meant to be used exclusively for the
> -communication mechanism and is defined after all the used-defined interfaces.
> -Moreover, the network interfaces provide a communication channel that is 
> solely
> -used by the host and each guest, therefore, a guest cannot use this network
> -interface to reach the outside world or other guests.  It is the system
> -administrator's responsibility to ensure that the extra firewalling and 
> routing
> -rules specified on the host do not override this behaviour accidentally.
> -
> -On the host side, Ganeti will create a TAP network interface for each guest 
> and
> -configure it to have IP address ``169.254.169.254`` and netmask
> -``255.255.255.255``.  On the guest side, each instance will have its own MAC
> -address and IP address.  Both MAC address and the IP address must be unique
> -within a single host.
> -
> +the host and the guest, and Ganeti will be responsible for the host side,
> +namely, creating a TAP interface for each guest and configuring these 
> interfaces
> +to have IP address ``169.254.169.254`` and netmask ``255.255.255.255``.  This
> +network interface will be connected to the guest's last network interface, 
> which
> +is meant to be used exclusively for the communication mechanism and is 
> defined
> +after all the used-defined interfaces.  The last interface was chosen (as
> +opposed to the first one, for example) because the first interface is 
> generally
> +understood and the main gateway out, and also because it minimizes the 
> impact on
> +existing systems, for example, in a scenario where the system administrator 
> has
> +a running cluster and wants to enable the communication mechanism for already
> +existing instances, which might have been created with older versions of 
> Ganeti.
> +Further, DBus should assist in keeping the guest network interfaces more 
> stable.
> +
> +On the guest side, each instance will have its own MAC address and IP 
> address.
> +Both the guest's MAC address and IP address must be unique within a single 
> host.
>  The guest will use the DHCP protocol on its last network interface to 
> contact a
> -DHCP server running on the host and thus determine its IP address.  The DHCP 
> is
> -configured, started, and stopped, by Ganeti and it will be listening 
> exclusively
> -on the TAP network interfaces of the guests in order not to interfere with a
> -potential DHCP server running on the same host.  Furthermore, the DHCP server
> -will only recognize MAC and IP address pairs that have been approved by 
> Ganeti.
> +DHCP server running on the host and thus determine its IP address.  The DHCP
> +server is configured, started, and stopped, by Ganeti and it will be 
> listening
> +exclusively on the TAP network interfaces of the guests in order not to
> +interfere with a potential DHCP server running on the same host.  
> Furthermore,
> +the DHCP server will only recognize MAC and IP address pairs that have been
> +approved by Ganeti.
>
>  The TAP network interfaces created for each guest share the same IP address.
>  Therefore, it will be necessary to extend the routing table with rules 
> specific
> @@ -303,22 +324,30 @@ and try to steal another guest's IP address, however, 
> this routing rule will
>  block traffic (i.e., IP packets carrying the wrong IP) from the DHCP server 
> to
>  the malicious guest.  Similarly, the guest could lie about its IP address 
> (i.e.,
>  simply assign a predefined IP address, perhaps from another guest), however,
> -once again this routing rule will block all traffic from the guest that does 
> not
> -come from the IP address that has been assigned by Ganeti.
> +replies from the host will not be routed to the malicious guest.
> +
> +This routing rule ensures that the communication channel is exclusive but, as
> +mentioned before, it will not prevent guests from accessing any service on 
> the
> +host.  It is the system administrator's responsibility to employ the 
> necessary
> +``iptables`` rules.  In order to achieve this, Ganeti will provide ``ifup``
> +hooks associated with the guest network interfaces which will give system
> +administrator's the opportunity to customize their own ``iptables``, if
> +necessary.  Ganeti will also provide examples of such hooks.  However, these 
> are
> +meant to personalized to each Ganeti installation and not to be taken as
> +production ready scripts.
>
>  For KVM, an instance will be started with a unique MAC address and the file
>  descriptor for the TAP network interface meant to be used by the 
> communication
>  mechanism.  Ganeti will be responsible for generating a unique MAC address 
> for
> -the guest, opening the TAP interface and passing to KVM its file descriptor::
> +the guest, opening the TAP interface, and passing to KVM its file 
> descriptor::
>
> -  kvm -net nic,macaddr=<mac> -net tap,fd=<tap-fd>,script=no,downscript=no ...
> +  kvm -net nic,macaddr=<mac> -net tap,fd=<tap-fd> ...
>
>  For Xen, a network interface will be created on the host (using the ``vif``
>  parameter of the Xen configuration file).  Each instance will have its
>  corresponding ``vif`` network interface on the host.  The ``vif-route`` 
> script
>  of Xen might be helpful in implementing this.
>
> -
>  Metadata service
>  ++++++++++++++++
>
> @@ -376,7 +405,6 @@ available at::
>
>  where ``<script_name>`` is the name of the script.
>
> -
>  Rationale
>  ---------
>
> @@ -390,7 +418,6 @@ configured by the guest operating systems, sometimes even 
> in prominent positions
>  in the user interface, whereas it is fairly common to have an unconfigured
>  network interface in a system, usually without any negative side effects.
>
> -
>  Installation process in a virtualized environment
>  +++++++++++++++++++++++++++++++++++++++++++++++++
>
> @@ -431,7 +458,7 @@ running on the host, leaving only the ones running in the 
> VM.
>  Ganeti will provide a script to be run at install time that can be used to
>  create the virtualized environment that will perform the OS installation of 
> new
>  instances.
> -This script will build a debootstrapped basic debian system including a 
> software
> +This script will build a debootstrapped basic Debian system including a 
> software
>  that will read the metadata, setup the environment variables and launch the
>  installation scripts inside the virtualized environment. The script will also
>  provide hooks for personalization.
> @@ -440,14 +467,20 @@ It will also be possible to use other self-made 
> virtualized environments, as
>  long as they connect to Ganeti over the described communication mechanism and
>  they know how to read and use the provided metadata to create a new instance.
>
> -While performing an installation in the virtualized environment, a
> -personalizable timeout will be used to detect possible problems with the
> -installation process, and to kill the virtualized environment. The timeout 
> will
> -be optional and set on a cluster basis by the administrator. If set, it will 
> be
> -the total time allowed to setup an instance inside the appliance. It is 
> mainly
> -meant as a safety measure to prevent an instance taken over by malicious 
> scripts
> -to be available for a long time.
> +While performing an installation in the virtualized environment, a 
> customizable
> +timeout will be used to detect possible problems with the installation 
> process,
> +and to kill the virtualized environment. The timeout will be optional and 
> set on
> +a cluster basis by the administrator. If set, it will be the total time 
> allowed
> +to setup an instance inside the appliance. It is mainly meant as a safety
> +measure to prevent an instance taken over by malicious scripts to be 
> available
> +for a long time.
>
> +Alternatives to design and implementation
> +=========================================
> +
> +This section lists alternatives to design and implementation which are left 
> to
> +the user as an option.

I don't think we should write "left to the user as an option" or it
might seem that they will be implemented but not enabled by default,
which is not going to be true.

> Please read carefully through the limitations and
> +security concerns of each of these alternatives.
>
>  Port forwarding in KVM
>  ++++++++++++++++++++++
> @@ -464,15 +497,17 @@ A TCP/IP forwarding device can be created through the 
> following KVM invocation::
>      user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,
>      guestfwd=tcp:169.254.169.254:80-tcp:127.0.0.1:8080 ...
>
> -This invocation even has advantage that it can remap ports, which would have
> -allowed the metadata service daemon to run in port 8080 instead of 80.  
> However,
> -in this scheme, KVM opens the TCP connection only once, when it is started, 
> and,
> -if the connection breaks, KVM will not reconnect.  Furthermore, this also
> +This invocation even has the advantage that it can block undesired traffic
> +(i.e., traffic that is not explicitly specified in the arguments) and it can
> +remap ports, which would have allowed the metadata service daemon to run in 
> port
> +8080 instead of 80.  However, in this scheme, KVM opens the TCP connection 
> only
> +once, when it is started, and, if the connection breaks, KVM will not
> +reestablish the connection.  Furthermore, opening the TCP connection only 
> once
>  interferes with the HTTP protocol, which needs to dynamically establish and
>  close connections.
>
> -The alternative to opening a single TCP/IP connection is to execute a 
> command.
> -The KVM invocation for this is, for example, the following::
> +The alternative to the TCP/IP forwarding device is to execute a command.  The
> +KVM invocation for this is, for example, the following::
>
>    kvm -net nic -net \
>      "user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,
> @@ -484,12 +519,11 @@ supported in KVM 1.2 and above, and, therefore, not 
> viable because we want to
>  provide support for at least KVM version 1.0, which is the version provided 
> by
>  Ubuntu LTS.
>
> -
>  Alternatives to the DHCP server
>  +++++++++++++++++++++++++++++++
>
> -There are alternatives to using the DHCP server, for example, by assigning
> -identical IP addresses to guests, such as, the IP address 
> ``169.254.169.253``.
> +There are alternatives to using the DHCP server, for example, by assigning a
> +fixed IP address to guests, such as, the IP address ``169.254.169.253``.
>  However, this introduces a routing problem, namely, how to route incoming
>  packets from the same source IP to the host.  This problem can be overcome 
> in a
>  number of ways.
> @@ -499,9 +533,9 @@ example, ``169.254.169.253``, to an IP address unique 
> within a single host, for
>  example, ``169.254.0.1``.  Given that NAT through ``ip rule`` is deprecated,
>  users can resort to ``iptables``.  Note that this has not yet been tested.
>
> -Another option, which has indeed been tested in a prototype, is to connect 
> the
> -TAP network interfaces of the guests to a bridge.  The bridge takes the
> -configuration for the TAP network interfaces, namely, IP address
> +Another option, which has been tested in a prototype, is to connect the TAP
> +network interfaces of the guests to a bridge.  The bridge takes the
> +configuration from the TAP network interfaces, namely, IP address
>  ``169.254.169.254`` and netmask ``255.255.255.255``, thus leaving those
>  interfaces without an IP address.  Note that in this setting, guests will be
>  able to reach each other, therefore, if necessary, additional ``iptables`` 
> rules
>
>
>
> On Mon, Dec 09, 2013 at 10:30:17AM +0100, Michele Tartara wrote:
>> Add the document describing a new design for the OS installation process for
>> new instances.
>>
>> Signed-off-by: Michele Tartara <[email protected]>
>> ---
>>  doc/design-draft.rst |   1 +
>>  doc/design-os.rst    | 399 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 400 insertions(+)
>>  create mode 100644 doc/design-os.rst
>>
>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> index c821292..3ed3852 100644
>> --- a/doc/design-draft.rst
>> +++ b/doc/design-draft.rst
>> @@ -20,6 +20,7 @@ Design document drafts
>>     design-daemons.rst
>>     design-hsqueeze.rst
>>     design-ssh-ports.rst
>> +   design-os.rst
>>
>>  .. vim: set textwidth=72 :
>>  .. Local Variables:
>> diff --git a/doc/design-os.rst b/doc/design-os.rst
>> new file mode 100644
>> index 0000000..a26801a
>> --- /dev/null
>> +++ b/doc/design-os.rst
>> @@ -0,0 +1,399 @@
>> +===============================
>> +Ganeti OS installation redesign
>> +===============================
>> +
>> +.. contents:: :depth: 3
>> +
>> +This is a design document detailing a new OS installation procedure, more
>> +secure, able to provide more features and easier to use for many common 
>> tasks
>> +w.r.t. the current one.
>> +
>> +Current state and shortcomings
>> +==============================
>> +
>> +As of Ganeti 2.10, each instance is associated with an OS definition. An OS
>> +definition is a set of scripts (``create``, ``export``, ``import``, 
>> ``rename``)
>> +that are executed with root privileges on the primary host of the instance 
>> to
>> +perform all the OS-related functionality (setting up an operating system 
>> inside
>> +the disks of the instance being created, exporting/importing the instance,
>> +renaming it).
>> +
>> +These scripts receive, as environment variables, a fixed set of parameters
>> +describing the instance (such as the hypervisor, the name of the instance, 
>> the
>> +number of disks, and their location) and a set of user defined parameters. 
>> Each
>> +of these parameters is also written into the configuration file of Ganeti, 
>> to
>> +allow for future reinstalls of the instance, and in various log files, 
>> namely:
>> +
>> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
>> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
>> +
>> +* master daemon log file: DEBUG strings related to the same RPC calls are 
>> stored
>> +  here as well.
>> +
>> +* commands log: the CLI commands that create a new instance, including their
>> +  parameters, are logged here.
>> +
>> +* RAPI log: the RAPI commands that create a new instances, including their
>> +  parameters, are logged here.
>> +
>> +* job logs: the job files stored in the job queue or in its archive contain 
>> the
>> +  parameters.
>> +
>> +The current situation presents a number of shortcomings:
>> +
>> +* Having the installation scripts run with root power on the nodes doesn't 
>> allow
>> +  user-defined OS scripts, as they would pose a huge security issue.
>> +  Furthermore, even a script without malicious intentions might end up
>> +  distrupting a node because of a bug in it.
>> +
>> +* Ganeti cannot be used to create instances starting from user provided disk
>> +  images: even in the (hypothetical) case where the scripts are completely
>> +  secure and run not by root but by an unprivileged user with only the 
>> power to
>> +  mount arbitrary files as disk images, this is a security issue. It has 
>> been
>> +  proven that a carefully crafted file system might exploit kernel
>> +  vulnerabilities to gain control of the system. Therefore, directly 
>> mounting
>> +  images on the Ganeti nodes is not an option.
>> +
>> +* There is no way to inject files into an existing disk image. A common use 
>> case
>> +  is for the system administrator to provide a standard image of the 
>> system, to
>> +  be later personalized with the network configuration, private keys 
>> identifying
>> +  the machine, ssh keys of the users and so on. A possible workaround would 
>> be
>> +  for the scripts to mount the image (only if this is trusted!) and to 
>> receive
>> +  the configurations and ssh keys as user defined OS parameters. 
>> Unfortunately,
>> +  this is also not an option for security sensitive material (such as the 
>> ssh
>> +  keys) because the OS parameters are stored in many places on the system, 
>> as
>> +  already described above.
>> +
>> +* Most other virtualization software simply work with instance images, not 
>> with
>> +  installation scripts. This difference makes the interaction of Ganeti with
>> +  other software difficult.
>> +
>> +Proposed changes
>> +================
>> +
>> +In order to fix the shortcomings of the current state, we plan to introduce 
>> the
>> +following changes:
>> +
>> +* Change the OS parameters to have three categories:
>> +
>> + * ``public``: the current behavior. The parameter is logged and stored 
>> freely.
>> +
>> + * ``private``: the parameter is saved inside the Ganeti configuration (to 
>> allow
>> +   for instance reinstall) but it is not shown in logs, job logs, or passed 
>> back
>> +   via RAPI.
>> +
>> + * ``secret``: the parameter is not saved inside the Ganeti configuration.
>> +   Reinstall are impossible unless the data is passed again. The parameter 
>> will
>> +   not appear in any log file. When a functionality is performed jointly by
>> +   multiple daemons (such as MasterD and LuxiD), currently Ganeti sometimes
>> +   serializes jobs on disk and later reloads them. Secret parameters will 
>> not be
>> +   serialized on disk. They will be passed around as part of the LUXI calls
>> +   exchanged by the daemons, and only kept in memory, in order to reduce 
>> their
>> +   accessibility as much as possible. In case of a failure of the master 
>> node,
>> +   these parameters will be lost and cannot be recovered because they are 
>> not
>> +   serialized on file, therefore the job cannot taken over by the new 
>> master.
>> +   This is an expected and accepted side effect of jobs with secret 
>> parameters:
>> +   if they fail, they'll have to be restarted manually.
>> +
>> +* A new OS installation procedure, based on a safe virtualized environment.
>> +  This virtualized environment will run with the same hardware parameter as 
>> the
>> +  actual instance being installed, as much as possible. This will also 
>> allow to
>> +  reduce the memory usage in the host (specifically, in Dom0 for Xen
>> +  installations). Each instance will have these possible execution modes:
>> +
>> +  * ``default``: the default mode, used when the machine is running 
>> normally and
>> +    the OS installation procedure is run before starting the instance for 
>> the
>> +    first time.
>> +
>> +  * ``self_install``: the first run of the instance will be with a 
>> different set
>> +    of parameters w.r.t. all the successive runs. This set of "install
>> +    parameters" will allow, e.g., to attach an installation
>> +    floppy/cdrom/network, change the boot device order, or specify an OS 
>> image
>> +    to be used. Through this set of parameters, the administrator will have 
>> to
>> +    provide the hypervisor a way to find an installation medium for the 
>> instance
>> +    (e.g., a boot disk, a network image, etc). This medium will then 
>> install the
>> +    instance itself on the disks and will then be responsible to get the
>> +    parameters for configuring it (its network interfaces, IP address, 
>> hostname,
>> +    etc.) from a set of metadata provided by Ganeti (e.g.: using an approach
>> +    comparable to the one of the ``cloud-init`` tool). When this 
>> installation
>> +    mode is used, no OS installation script is required.  In order for
>> +    installation of an OS from an image to be possible, the ``--os-type``
>> +    parameter will be extended to support a new additional format: 
>> ``--os-type
>> +    image:<URL>`` will instruct ganeti to take an image from the specified
>> +    position. For the initial implementation, URL can be either a filename 
>> or a
>> +    publically accessible http or ftp resource. Once the instance image is
>> +    received, it will be dd-ed on the first disk of the instance.
>> +    When an image is specified, ``--os-parameters`` can still be used,
>> +    and its content will be passed to the instance as part of the metadata. 
>> Nota
>> +    that as part of the OS scripts there is a file specifying what 
>> parameters
>> +    are expected. With OS images, though, none of the traditional structure 
>> of
>> +    OS scripts is in place, so there will be no check regarding what 
>> parameters
>> +    can be specified: they will all be passed, as long as the
>> +    ``--os-parameters`` string is syntactically valid.
>> +    The set of ``self_install`` parameters will be stored as part of the
>> +    instance configuration, so that they can be used to reinstall the 
>> instance.
>> +    It will be the user's responsibility to ensure that the OS image or any
>> +    installation media is still available in the proper position when a
>> +    reinstall happens. After the first run, the instance will revert to
>> +    ``default`` mode.
>> +
>> +  * ``install``: Ganeti will start the instance using a virtual appliance
>> +    specifically made for installing Ganeti instances. Scripts analogous to 
>> the
>> +    current ones will run inside this instance. The disks of the instance 
>> being
>> +    installed will be connected to this virtual appliance, so that the 
>> scripts
>> +    can mount them and modify them as needed, as currently happens, but 
>> with the
>> +    additional protection given by this happening in a VM. The disk of the
>> +    virtual appliance will be read only, so that a pristine copy of the
>> +    appliance can be started every time a new instance needs to be created, 
>> to
>> +    further increase security. The data the instance needs to write at 
>> runtime
>> +    will only be stored in RAM, and disappear as soon as the instance is
>> +    stopped. Metadata will be provided also to this virtual applicance, that
>> +    will take care of converting them to environment variables for the
>> +    installation scripts. After the first run, the instance will revert to
>> +    ``default`` mode.
>> +
>> +* In order to allow for the metadata to be sent inside the instance, a
>> +  communication mechanism between the instance and the host will be created.
>> +  This mechanism will be bidirectional (e.g.: to allow the setup process 
>> going
>> +  on inside the instance to communicate its progress to the host). Each 
>> instance
>> +  will have access exclusively to its own metadata, and it will be only 
>> able to
>> +  communicate with its host over this channel. More details will be 
>> provided in
>> +  the `Communication mechanism and metadata service`_ section.
>> +
>> +* As part of the instance creation command it will be possible to indicate 
>> a URL
>> +  for a "personalization package", that is an archive containing a set of 
>> files
>> +  meant to be overlayed on top of the operating system file system at the 
>> end of
>> +  the setup process, before the VM is started for the first time in ``run``
>> +  mode.  Ganeti will provide a mechanism for receiving and unpacking this
>> +  archive as part of the ``install`` execution mode, whereas in 
>> ``self_install``
>> +  mode it will only be provided as a metadata for the instance to use.  The
>> +  archive will be in TAR-GZIP format (with extension ``.tar.gz`` or 
>> ``.tgz``)
>> +  and will contain the files according to the directory structure that will 
>> be
>> +  recreated on the installation disk. Files contained in this archive will
>> +  overwrite files with the same path created during the install procedure 
>> (if
>> +  any).  The URL of the "personalization package" will have to specify an
>> +  extesion to identify the file format (in order to allow for more formats 
>> to be
>> +  supported in the future).  The URL will be stored as part of the 
>> configuration
>> +  of the instance (therefore, the URL should not contain confidential
>> +  information, but the file there available can). It is up to the system
>> +  administrator to ensure that a package is actually available at that URL 
>> at
>> +  install and reinstall time.  The content of the package is allowed to 
>> change.
>> +  E.g.: a system administrator might create a package containing the private
>> +  keys of the instance being created. When the instance is reinstalled, a 
>> new
>> +  package with new keys can be made available there, therefore allowing 
>> instance
>> +  reinstall without the need to store keys.  Together with the URL, a 
>> username
>> +  and a password can be specified to. If the URL is a http(s) URL, they 
>> will be
>> +  used as basic access authentication credentials to access that URL. The
>> +  username and password will not be saved in the config, and will have to be
>> +  provided again in case a reinstall is requested.  The downloaded
>> +  personalization package will not be stored locally on the node for longer 
>> than
>> +  it is needed while unpacking it and adding its files to the instance being
>> +  created.  The personalization package will be overlayed on top of the 
>> instance
>> +  filesystem after the scripts that created it have been executed.  In 
>> order for
>> +  the files in the package to be automatically overlayed on top of the 
>> instance
>> +  filesystem it is required that the appliance is actually able to mount the
>> +  instance disks, therefore this will not work for every filesystem.
>> +
>> +Implementation
>> +==============
>> +
>> +The implementation of this design will happen as an ordered sequence of 
>> steps,
>> +of increasing impact on the system and, in some cases, dependent on each 
>> other:
>> +
>> +#. Private and secret instance parameters
>> +#. Communication mechanism between host and instance
>> +#. Metadata service
>> +#. Personalization package (inside a virtualization environment)
>> +#. ``self_install`` mode
>> +#. ``install`` mode (inside a virtualization environment)
>> +
>> +Some of these steps need to be more deeply specified w.r.t. what is already
>> +written in the `Proposed changes`_ Section. Extra details will be provided 
>> in
>> +the following Subsections.
>> +
>> +Communication mechanism and metadata service
>> +++++++++++++++++++++++++++++++++++++++++++++
>> +
>> +The communication mechanism and the metadata service are described together
>> +because they are deeply tied. On the other hand, the communication mechanism
>> +will need to be more generic because it can be used for other reasons in the
>> +future (like allowing instances to explicitly send commands to Ganeti, or 
>> to let
>> +Ganeti control a helper instance, like the one hereby introduced for 
>> performing
>> +OS installs inside a safe environment).
>> +
>> +The communication mechanism will be enabled automatically when the instance 
>> is
>> +in ``self_install`` or ``install`` mode, but for backwards compatibility it 
>> will
>> +be disabled when the instance is in ``run`` mode unless it is explicitly
>> +requested. Specifically, a new parameter ``--communication`` (short version:
>> +``-C``), with possible values ``true`` or ``false`` will be added to
>> +``gnt-instance add`` and ``gnt-instance modify``. It will determine whether 
>> the
>> +instance will have a communication channel set up to interact with the host 
>> and
>> +to receive metadata. The value of this parameter will be saved as part of 
>> the
>> +configuration of the instance.
>> +
>> +When the communication mechanism is enabled, Ganeti will create a new 
>> network
>> +interface inside the instance. This extra network interface will be the 
>> last one
>> +of the instance, after all the user defined ones. On the host side, this
>> +interface will be only accessible to the host itself, and not be routed 
>> outside
>> +the machine.
>> +On this network interface, the instance will connect using the IP:
>> +169.254.169.1 and netmask 255.255.255.0.
>> +The host will be on the same network, with the IP address: 169.254.169.254.
>> +
>> +The way to create this interface depends on the specific hypervisor being 
>> used.
>> +In KVM, it is possible to create a network interface inside the instance 
>> without
>> +having a corresponding interface created on the host. Using a command like::
>> +
>> +  kvm -net nic -net \
>> +    user,restrict=on,net=169.254.169.0/24,host=169.254.169.253,
>> +    guestfwd=tcp:169.254.169.254:80-tcp:127.0.0.1:8080
>> +
>> +a network interface will be created inside the VM, part of the 
>> 169.254.169.0/24
>> +network, where the VM will have IP address .253 and the host port 8080 will 
>> be
>> +reachable on port 80.
>> +
>> +In Xen, unfortunately, such a capability is not present, and an actual 
>> network
>> +interface has to be created on the host (using the ``vif`` parameter of the 
>> Xen
>> +configuration file). Each instance will have its corresponding ``vif`` 
>> network
>> +interface on the host. These interface will not be connected to each other 
>> in
>> +any way, and Ganeti will not configure them to allow traffic to be forwarded
>> +beyond the host machine. The ``vif-route`` script of xen might be helpful in
>> +implementing this.
>> +It will be the system administrator to ensure that extra firewalling and 
>> routing
>> +rules specified on the host don't allow this accidentally.
>> +
>> +The instance will be able to connect to 169.254.169.254:80, and issue GET
>> +requests to an HTTP server that will provide the instance metadata.
>> +
>> +The choice of this IP address and port for accessing the metadata is done 
>> for
>> +compatibility reasons with OpenStack's and Amazon EC2's ways of providing
>> +metadata to the instance. The metadata will be provided by a single daemon,
>> +which will determine what instance the request comes from and reply with the
>> +metadata specific for that instance.
>> +
>> +Where possible, the metadata will be provided in a way compatible with 
>> Amazon
>> +EC2, at::
>> +
>> +  http://169.254.169.254/<version>/meta-data/*
>> +
>> +If some metadata are Ganeti-specific and don't fit this structure, they 
>> will be
>> +provided at::
>> +
>> +  http://169.254.169.254/ganeti/<version>/meta_data.json
>> +
>> +``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to 
>> indicate
>> +the most recent available protocol version.
>> +
>> +If needed in the future, this structure also allows us to support 
>> OpenStack's
>> +metadata at::
>> +
>> +  http://169.254.169.254/openstack/<version>/meta_data.json
>> +
>> +A bi-directional, pipe-like communication channel will be provided. The 
>> instance
>> +will be able to receive data from the host by a GET request at::
>> +
>> +  http://169.254.169.254/ganeti/<version>/read
>> +
>> +and to send data to the host by a POST request at::
>> +
>> +  http://169.254.169.254/ganeti/<version>/write
>> +
>> +As in a pipe, once the data are read, they will not be in the buffer 
>> anymore, so
>> +subsequent get request to ``read`` will not return the same data twice.
>> +Unlike a pipe, though, it will not be possible to perform blocking I/O
>> +operations.
>> +
>> +The OS parameters will be accessible through a GET
>> +request at::
>> +
>> +  http://169.254.169.254/ganeti/<version>/os/parameters.json
>> +
>> +as a JSON serialized dictionary having the parameter name as the key, and 
>> the
>> +pair ``(<value>, <visibility>)`` as the value, where ``<value>`` is the
>> +user-provided value of the parameter, and ``<visibility>`` is either 
>> ``public``,
>> +``private`` or ``secret``.
>> +
>> +The installation scripts to be run inside the virtualized environment while 
>> the
>> +instance is run in ``install`` mode will be available at::
>> +
>> +  http://169.254.169.254/<version>/ganeti/os/scripts/<script_name>
>> +
>> +where ``<script_name>`` is the name of the script.
>> +
>> +
>> +Rationale
>> +---------
>> +
>> +The choice of using a network interface for instance-host communication, as
>> +opposed to VirtIO, XenBus or other methods, is due to the will of having a
>> +generic, hypervisor-independent way of creating a communication channel, 
>> that
>> +doesn't require unusual (para)virtualization drivers.
>> +At the same time, a network interface was preferred over solutions involving
>> +virtual floppy or USB devices because the latter tend to be detected and
>> +configured by the guest operating systems, sometimes even in prominent 
>> positions
>> +in the user interface, whereas it is fairly common to have an unconfigured
>> +network interface in a system, usually without any negative side effects.
>> +
>> +
>> +Installation process in a virtualized environment
>> ++++++++++++++++++++++++++++++++++++++++++++++++++
>> +
>> +In the new OS installation scenario, we distinguish between trusted and
>> +untrusted code.
>> +
>> +The trusted installation code maintains the behavior of the current one and
>> +requires no modifications, with the scripts running on the node the 
>> instance is
>> +being created on. The untrusted code is stored in a subdirectory of the OS
>> +definition called ``untrusted``.  This directory contains scripts that are
>> +equivalent to the already existing ones (``create``, ``export``, ``import``,
>> +``rename``) but that will be run inside an virtualized environment, to 
>> protect
>> +the host from malicious tampering.
>> +
>> +The ``untrusted`` code is meant to either be untrusted itself, or to be 
>> trusted
>> +code running operations that might be dangerous (such as mounting a
>> +user-provided image).
>> +
>> +By default, all new OS definitions will have to be explicitly marked as 
>> trusted
>> +by the cluster administrator (with a new ``gnt-os modify`` command) before 
>> they
>> +can run code on the host. Otherwise, only the untrusted part of the code 
>> will be
>> +allowed to run, inside the virtual appliance. For backwards compatibility
>> +reasons, when upgrading an existing cluster, all the installed OSes will be
>> +marked as trusted, so that they can keep running with no changes.
>> +
>> +In order to allow for the highest flexibility, if both a trusted and an
>> +untrusted script are provided for the same operation (i.e. ``create``), 
>> both of
>> +them will be executed at the same time, one on the host, and one inside the
>> +installation appliance. They will be allowed to communicate with each other
>> +through the already described communication mechanism, in order to 
>> orchestrate
>> +their execution (e.g.: the untrusted code might execute the installation, 
>> while
>> +the trusted one receives status updates from it and delivers them to a user
>> +interface).
>> +
>> +The cluster administrator will have an option to completely disable scripts
>> +running on the host, leaving only the ones running in the VM.
>> +
>> +Ganeti will provide a script to be run at install time that can be used to
>> +create the virtualized environment that will perform the OS installation of 
>> new
>> +instances.
>> +This script will build a debootstrapped basic debian system including 
>> including
>> +a software that will read the metadata, setup the environment variables and
>> +launch the installation scripts inside the virtualized environment. The 
>> script
>> +will also provide hooks for personalization.
>> +
>> +It will also be possible to use other self-made virtualized environment, as 
>> long
>> +as they connect to ganeti over the described communication mechanism and 
>> they
>> +know how to read and use the provided metadata to create a new instance.
>> +
>> +While performing an installation in the virtualized environment, a
>> +personalizable timeout will be used to detect possible problems with the
>> +installation process, and to kill the virtualized environment. The timeout 
>> will
>> +be optional and set on a cluster basis by the administrator. If set, it 
>> will be
>> +the total time allowed to setup an instance inside the appliance. It is 
>> mainly
>> +meant as a safety measure to prevent an instance taken over by malicious 
>> scripts
>> +to be available for a long time.
>> +
>> +.. vim: set textwidth=72 :
>> +.. Local Variables:
>> +.. mode: rst
>> +.. fill-column: 72
>> +.. End:
>> --
>> 1.8.5.1
>>
>
> --
> Jose Antonio Lopes
> Ganeti Engineering
> Google Germany GmbH
> Dienerstr. 12, 80331, München
>
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Geschäftsführer: Graham Law, Christine Elizabeth Flores
> Steuernummer: 48/725/00206
> Umsatzsteueridentifikationsnummer: DE813741370

Thanks,
Michele

-- 
Google Germany GmbH
Dienerstr. 12
80331 München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores

Reply via email to