On Thu, Nov 14, 2013 at 2:55 PM, Jose A. Lopes <[email protected]> wrote:
> On Thu, Nov 14, 2013 at 11:31:11AM +0100, Guido Trotter wrote:
>> On Wed, Nov 13, 2013 at 9:57 AM, Michele Tartara <[email protected]> wrote:
>> > On Tue, Nov 12, 2013 at 2:13 PM, Guido Trotter <[email protected]> 
>> > wrote:
>> >> On Tue, Nov 12, 2013 at 12:41 PM, Michele Tartara <[email protected]> 
>> >> wrote:
>> >>> Add the document describing a new design for the OS installation process 
>> >>> for
>> >>> new instances.
>> >>>
>> >>> Signed-off-by: Michele Tartara <[email protected]>
>> >>> ---
>> >>>  doc/design-draft.rst |    1 +
>> >>>  doc/design-os.rst    |  318 
>> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>>  2 files changed, 319 insertions(+)
>> >>>  create mode 100644 doc/design-os.rst
>> >>>
>> >>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> >>> index c821292..3ed3852 100644
>> >>> --- a/doc/design-draft.rst
>> >>> +++ b/doc/design-draft.rst
>> >>> @@ -20,6 +20,7 @@ Design document drafts
>> >>>     design-daemons.rst
>> >>>     design-hsqueeze.rst
>> >>>     design-ssh-ports.rst
>> >>> +   design-os.rst
>> >>>
>> >>>  .. vim: set textwidth=72 :
>> >>>  .. Local Variables:
>> >>> diff --git a/doc/design-os.rst b/doc/design-os.rst
>> >>> new file mode 100644
>> >>> index 0000000..7a42a7f
>> >>> --- /dev/null
>> >>> +++ b/doc/design-os.rst
>> >>> @@ -0,0 +1,318 @@
>> >>> +===============================
>> >>> +Ganeti OS installation redesign
>> >>> +===============================
>> >>> +
>> >>> +.. contents:: :depth: 3
>> >>> +
>> >>> +This is a design document detailing a new OS installation procedure, 
>> >>> more
>> >>> +secure, able to provide more features and easier to use for many common 
>> >>> tasks
>> >>> +w.r.t. the current one.
>> >>> +
>> >>> +Current state and shortcomings
>> >>> +==============================
>> >>> +
>> >>> +As of Ganeti 2.10, each instance is associated with an OS definition. 
>> >>> An OS
>> >>> +definition is a set of scripts (``create``, ``export``, ``import``, 
>> >>> ``rename``)
>> >>> +that are executed with root privileges on the primary host of the 
>> >>> instance to
>> >>> +perform all the OS-related functionality (setting up an operating 
>> >>> system inside
>> >>> +the disks of the instance being created, exporting/importing the 
>> >>> instance,
>> >>> +renaming it).
>> >>> +
>> >>> +These scripts receive, as environment variables, a fixed set of 
>> >>> parameters
>> >>> +describing the instance (such as the hypervisor, the name of the 
>> >>> instance, the
>> >>> +number of disks, and their location) and a set of user defined 
>> >>> parameters. Each
>> >>> +of these parameters is also written into the configuration file of 
>> >>> Ganeti, to
>> >>> +allow for future reinstalls of the instance, and in various log files, 
>> >>> namely:
>> >>> +
>> >>> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
>> >>> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
>> >>> +
>> >>> +* master daemon log file: DEBUG strings related to the same RPC calls 
>> >>> are stored
>> >>> +  here as well.
>> >>> +
>> >>> +* commands log: the CLI commands that create a new instance, including 
>> >>> their
>> >>> +  parameters, are logged here.
>> >>> +
>> >>> +* RAPI log: the RAPI commands that create a new instances, including 
>> >>> their
>> >>> +  parameters, are logged here.
>> >>> +
>> >>> +* job logs: the job files stored in the job queue or in its archive 
>> >>> contain the
>> >>> +  parameters.
>> >>> +
>> >>> +The current situation presents a number of shortcomings:
>> >>> +
>> >>> +* Having the installation scripts run with root power on the nodes is a 
>> >>> huge
>> >>> +  security issue.
>> >>> +
>> >>
>> >> s/is a huge security issue/doesn't allow user-defined os scripts, as
>> >> they would pose a huge security issue/
>> >>
>> >> Note that there's no security issue *per se* in the current situation,
>> >> if the OS scripts are trusted.
>> >> (except perhaps for export, if the os script mounts the instance disk,
>> >> which is also not necessarily the case)
>> >
>> > Yes, that's what I meant. I'll reword it as you suggest.
>> >
>> >>
>> >> That said it could be a safety issue in the sense that an eventual
>> >> bug/error in the os script could risk disrupting the node.
>> >
>> > ACK
>> >
>> >>
>> >>> +* Ganeti cannot be used to create instances starting from user provided 
>> >>> disk
>> >>> +  images: even in the (hypothetical) case where the scripts are 
>> >>> completely
>> >>> +  secure and run not by root but by an unprivileged user with only the 
>> >>> power to
>> >>> +  mount arbitrary files as disk images, this is a security issue. It 
>> >>> has been
>> >>> +  proven that a carefully crafted file system might exploit kernel
>> >>> +  vulnerabilities to gain control of the system. Therefore, directly 
>> >>> mounting
>> >>> +  images on the Ganeti nodes is not an option.
>> >>> +
>> >>> +* There is no way to inject files into an existing disk image. A common 
>> >>> use case
>> >>> +  is for the system administrator to provide a standard image of the 
>> >>> system, to
>> >>> +  be later personalized with the network configuration, private keys 
>> >>> identifying
>> >>> +  the machine, ssh keys of the users and so on. A possible workaround 
>> >>> would be
>> >>> +  for the scripts to mount the image (only if this is trusted!) and to 
>> >>> receive
>> >>> +  the configurations and ssh keys as user defined OS parameters. 
>> >>> Unfortunately,
>> >>> +  this is also not an option for security sensitive material (such as 
>> >>> the ssh
>> >>> +  keys) because the OS parameters are stored in many places on the 
>> >>> system, as
>> >>> +  already described above.
>> >>> +
>> >>> +* Most other virtualization software simply work with instance images, 
>> >>> not with
>> >>> +  installation scripts. This difference makes the interaction of Ganeti 
>> >>> with
>> >>> +  other softwares difficult.
>> >>
>> >> s/softwares/software/
>> >
>> > ACK
>> >
>> >>
>> >>> +
>> >>> +Proposed changes
>> >>> +================
>> >>> +
>> >>> +In order to fix the shortcomings of the current state, we plan to 
>> >>> introduce the
>> >>> +following changes:
>> >>> +
>> >>> +* Change the OS parameters to have three categories:
>> >>> +
>> >>> + * ``public``: the current behavior. The parameter is logged and stored 
>> >>> freely.
>> >>> +
>> >>> + * ``private``: the parameter is saved inside the Ganeti configuration 
>> >>> (to allow
>> >>> +   for instance reinstall) but it is not shown in logs, job logs, or 
>> >>> passed back
>> >>> +   via RAPI.
>> >>> +
>> >>> + * ``secret``: the parameter is not saved inside the Ganeti 
>> >>> configuration.
>> >>> +   Reinstall are impossible unless the data is passed again. The 
>> >>> parameter will
>> >>> +   not appear in any log file. In order to preserve the functionality 
>> >>> of Ganeti,
>> >>> +   the parameters will still need to be stored in the job files, but 
>> >>> they will
>> >>> +   be removed from there when the job has finished running (either 
>> >>> successfully
>> >>> +   or not).
>> >>> +
>> >>
>> >> Do we actually need to save them in the job files?
>> >> The job files could be saved (to disk) without, and in case the master
>> >> is failed over the job can be failed.
>> >> (this should make it a lot harder to access)
>> >
>> > Unfortunately, I think we need to save them. Currently the job is
>> > created by luxid, serialized, and then read from file and executed by
>> > masterd, as part of the ongoing migration of the job queue from
>> > masterd to luxid.
>> >
>>
>> Ack, but this is hopefully temporary, and the job data can perhaps in
>> the future be passed via socket between the two...
>> So OK temporarily during development, but not by design, let's rather
>> fix the underlying problem.
>>
>> >>> +* A new OS installation procedure, based on a safe virtualized 
>> >>> environment.
>> >>> +  This virtualized environment will run with the same hardware 
>> >>> parameter as the
>> >>> +  actual instance being installed, as much as possible. This will also 
>> >>> allow to
>> >>> +  reduce the memory usage in the host (specifically, in Dom0 for Xen
>> >>> +  installations).
>> >>> Each instance will have these possible execution modes:
>> >>> +
>> >>> +  * ``run``: the default mode, used when the machine is running 
>> >>> normally.
>> >>> +
>> >>> +  * ``self_install``: Ganeti will start the instance with a different 
>> >>> set of
>> >>> +    user-specified parameters, therefore allowing to attach an 
>> >>> installation
>> >>> +    floppy/cdrom/network, change the boot device order, or specify an 
>> >>> OS image
>> >>> +    to be used. The instance will then be responsible to get the 
>> >>> parameters for
>> >>> +    configuring itself (its network interfaces, IP address, hostname, 
>> >>> etc.) from
>> >>> +    a set of metadata provided to it by Ganeti (e.g.: using an approach
>> >>> +    comparable to the one of the ``cloud-init`` tool). When this 
>> >>> installation
>> >>> +    mode is used, no OS installation script is required.
>> >>> +    In order for installation of an OS from an image to be possible, a 
>> >>> new
>> >>> +    parameter ``--os-image`` will be added, allowing to specify where 
>> >>> to take
>> >>> +    the image from. It will have to be mutually exclusive with 
>> >>> ``--os-type``. If
>> >>> +    ``--os-image`` is specified, ``--os-parameters`` can still be used, 
>> >>> as it
>> >>> +    will be passed to the instance as part of the metadata.
>> >>> +    The set of ``self_install`` parameters will be stored as part of the
>> >>> +    instance configuration, so that they can be used to reinstall the 
>> >>> instance.
>> >>> +    It will be the user's responsibility to ensure that the OS image or 
>> >>> any
>> >>> +    installation media is still available in the proper position when a
>> >>> +    reinstall happens.
>> >>> +
>> >>
>> >> Should we use --os-type image:<name> and/or have an image os provider
>> >> that defines:
>> >> 1) the actual parameters needed for installation
>> >> 2) the image (eg. the verify script could double check that the image
>> >> is available from the node or accessible via the network...)
>> >>
>> >> I think in particular it would be useful to still have the concept of
>> >> an OS "provider" that tells ganeti how to install itself (which
>> >> parameters to use). This of course could be overridable, but at least
>> >> there would be a sane default without relying on the user to "get it
>> >> right".
>> >
>> > Regarding using --os-type image:<name>:
>> > That was my initial thought too, and also my favorite choice. Still,
>> > given that we usually want to keep backwards compatibility, this would
>> > cause problems if somebody has an OS definition called "image".
>> > Furthermore, that name would become reserved in the future.
>> > If you think it is a small enough risk, and listing this in the
>> > "incompatible changes" section of the NEWS file is enough, then I'm
>> > absolutely in favor of doing it.
>> >
>>
>> I think it would be OK as it's not conflicting with an OS definition
>> called "image" but one called image:<something>, no?
>>
>> > Regarding the os provider: my idea here was to have a possibility of
>> > using Ganeti without having to provide a provider, but just an OS
>> > image plus some "gnt-instance add" parameters, therefore having a more
>> > standard approach, similar to what other solutions are doing. Having
>> > an OS provider for this as well, would defeat this purpose. Moreover,
>> > providing an installation script would still be an option, so whoever
>> > wants to have an OS provider can have it.
>> >
>>
>> Ack.
>>
>> >>
>> >>> +  * ``install``: Ganeti will start the instance using a virtual 
>> >>> appliance
>> >>> +    specifically made for installing Ganeti instances. Scripts 
>> >>> analogous to the
>> >>> +    current ones will run inside this instance. The disks of the 
>> >>> instance being
>> >>> +    installed will be connected to this virtual appliance, so that the 
>> >>> scripts
>> >>> +    can mount them and modify them as needed, as currently happens, but 
>> >>> with the
>> >>> +    additional protection given by this happening in a VM. The virtual 
>> >>> appliance
>> >>> +    will be started in a clean state every time a new instance need to 
>> >>> be
>> >>> +    created, to further increase security. Metadata will be provided 
>> >>> also to
>> >>> +    this virtual appliance, that will take care of converting them to
>> >>> +    environment variables for the installation scripts.
>> >>> +
>> >>
>> >> Please specify better that by "will be started in a clean state" you
>> >> actually mean "the disk will be reset to its pristine state and not
>> >> reused between reinstallation" because it might be construed to mean
>> >> just the "booting" (runtime info) which is sort of less strict.
>> >
>> > ACK
>> >
>> >>
>> >>> +In order to allow for the metadata to be sent inside the instance, a
>> >>> +communication mechanism between the instance and the host will be 
>> >>> created. This
>> >>> +mechanism will be bidirectional (e.g.: to allow the setup process going 
>> >>> on
>> >>> +inside the instance to communicate its progress to the host). Each 
>> >>> instance will
>> >>> +have access exclusively to its own metadata, and it will be only able to
>> >>> +communicate with its host over this channel.
>> >>> +
>> >>
>> >> Too vague :)
>> >
>> > It's intentionally vague: here it's just meant to state the problem.
>> > The actual description of the metadata and the communication mechanism
>> > is in the implementation section. I'll add a reference to that from
>> > here.
>> >
>>
>> Thanks.
>>
>> >>
>> >>
>> >>> +As part of the instance creation command it will be possible to 
>> >>> indicate a URL
>> >>> +for a "personalization package", that is an archive containing a set of 
>> >>> files
>> >>> +meant to be overlayed on top of the operating system file system at the 
>> >>> end of
>> >>> +the setup process, before the VM is started for the first time in 
>> >>> ``run`` mode.
>> >>> +Ganeti will provide a mechanism for receiving and unpacking this 
>> >>> archive as part
>> >>> +of the ``install`` execution mode, whereas in ``self_install`` mode it 
>> >>> will only
>> >>> +be provided as a metadata for the instance to use.
>> >>> +The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or 
>> >>> ``.tgz``)
>> >>> +and will contain the files according to the directory structure that 
>> >>> will be
>> >>> +recreated on the installation disk. Files contained in this archive will
>> >>> +overwrite files with the same path created during the install procedure 
>> >>> (if
>> >>> +any).
>> >>> +The URL of the "personalization package" will have to specify an 
>> >>> extension to
>> >>> +identify the file format (in order to allow for more formats to be 
>> >>> supported in
>> >>> +the future).
>> >>> +The URL will be stored as part of the configuration of the instance 
>> >>> (therefore,
>> >>> +the URL should not contain confidential information, but the file there
>> >>> +available can). It is up to the system administrator to ensure that a 
>> >>> package
>> >>> +is actually available at that URL at install and reinstall time.
>> >>> +The content of the package is allowed to change. E.g.: a system 
>> >>> administrator
>> >>> +might create a package containing the private keys of the instance being
>> >>> +created. When the instance is reinstalled, a new package with new keys 
>> >>> can be
>> >>> +made available there, therefore allowing instance reinstall without the 
>> >>> need to
>> >>> +store keys.
>> >>> +
>> >>
>> >> Add something about authentication perhaps (so that an admin can have
>> >> a file available only to the ganeti installer only for the time of the
>> >> installation) and also about the fact that we won't cache/keep the
>> >> file on the node OS.
>> >
>> > ACK
>> >
>> >>
>> >>> +Implementation
>> >>> +==============
>> >>> +
>> >>> +The implementation of this design will happen as an ordered sequence of 
>> >>> steps,
>> >>> +of increasing impact on the system and, in some cases, dependent on 
>> >>> each other:
>> >>> +
>> >>> +#. Private and secret instance parameters
>> >>> +#. Communication mechanism between host and instance
>> >>> +#. Metadata service
>> >>> +#. Personalization package
>> >>> +#. ``self_install`` mode
>> >>> +#. ``install`` mode (with virtualization environment)
>> >>> +
>> >>> +Some of these steps need to be more deeply specified w.r.t. what is 
>> >>> already
>> >>> +written in the `Proposed changes`_ Section. Extra details will be 
>> >>> provided in
>> >>> +the following Subsections.
>> >>> +
>> >>> +Communication mechanism and metadata service
>> >>> +++++++++++++++++++++++++++++++++++++++++++++
>> >>> +
>> >>> +The communication mechanism and the metadata service are described 
>> >>> together
>> >>> +because they are deeply tied. On the other hand, the communication 
>> >>> mechanism
>> >>> +will need to be more generic because it can be used for other reasons 
>> >>> in the
>> >>> +future (like allowing instances to esplicitly send commands to Ganeti, 
>> >>> or to let
>> >>
>> >> explicitly
>> >
>> > ACK
>> >
>> >>
>> >>> +Ganeti control a helper instance, like the one hereby introduced for 
>> >>> performing
>> >>> +OS installs inside a safe environment).
>> >>> +
>> >>> +The communication mechanism will be enabled automatically when the 
>> >>> instance is
>> >>> +in ``self_install`` or ``install`` mode, but for backwards 
>> >>> compatibility it will
>> >>> +be disabled when the instance is in ``run`` mode unless it is esplicitly
>> >>
>> >> ^ see above
>> >
>> > ACK
>> >
>> >>
>> >>> +requested at instance startup by using a new, ad-hoc, parameter
>> >>> +(``--communication``).
>> >>
>> >> Which parameter is this? An instance, hypervisor or backend parameter? 
>> >> And why?
>> >> Also -C could do as well (if we go for instance level). Remember to
>> >> specify here as it has to be clear that an instance once configured
>> >> that way will be always started that way.
>> >>
>> >
>> > Yes, it's intended to be an instance level parameter. I'll specify
>> > that it is set at creation time, or modifiable with "gnt-instance
>> > modify", and then is automatically read from the config and used every
>> > time the instance is started.
>> >
>> >>> +
>> >>> +When the communication mechanism is enabled, Ganeti will create a new 
>> >>> network
>> >>> +interface inside the instance. This extra network interface will be the 
>> >>> last one
>> >>> +of the instance, after all the user defined ones. On the host side, this
>> >>> +interface will be only accessible to the host itself, and not be routed 
>> >>> outside
>> >>> +the machine.
>> >>
>> >> Actually it would be great if we didn't even have to create the tap.
>> >
>> > Do you mean something like (for kvm):
>> >   -net user,net=169.254.169.0/24,host=169.254.169.254
>> > that starts a user network showing the host as reachable with address
>> > 169.254.169.254?
>> >
>>
>> Yes, that would be a secure way to do it. Or perhaps using a
>> VDE-compatible connection?
>> But it doesn't have to be. Otherwise let's discuss which rules will
>> there be by default so that we assure that traffic can't get to the
>> wrong place.
>>
>> >>> +On this network interface, the instance will connect using the IP:
>> >>> +169.254.169.1 and netmask 255.255.255.0.
>> >>> +The host will be on the same network, with the IP address: 
>> >>> 169.254.169.254.
>> >>> +The instance will be able to connect to 169.254.169.254:80, and issue 
>> >>> GET
>> >>> +requests to an HTTP server that will provide the instance metadata.
>> >>> +
>> >>> +The choice of this IP address and port is done for compatibility 
>> >>> reasons with
>> >>> +OpenStack's and Amazon EC2's ways of providing metadata to the instance.
>> >>> +
>> >>> +Where possible, the metadata will be provided in a way compatible with 
>> >>> OpenStack
>> >>> +at::
>> >>> +
>> >>> +  http://169.254.169.254/openstack/<version>/meta_data.json
>> >>> +
>> >>> +or with Amazon EC2, at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/meta-data/*
>> >>> +
>> >>> +If some metadata are Ganeti-specific and don't fit this structure, they 
>> >>> will be
>> >>> +provided at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/meta_data.json
>> >>> +
>> >>
>> >> Not quite clear! :) How does the OS choose between those? How are they
>> >> expected to differ?
>> >
>> > The idea is to provide the data in both formats, so the OS can choose
>> > based on its own preferences (there are some tools already getting the
>> > data from those positions, such as cloud-init).
>> >
>> >>
>> >>> +``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to 
>> >>> indicate
>> >>> +the most recent available protocol version.
>> >>> +
>> >>
>> >> Is this what openstack and EC2 do?
>> >
>> > Yes, I'm writing this here just as a clarification, but it's exactly
>> > their format.
>> >
>> >>
>> >>> +A bi-directional, pipe-like communication channel will be provided. The 
>> >>> instance
>> >>> +will be able to receive data from the host by a GET request at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/pipe_in
>> >>> +
>> >>> +and to send data to the host by a POST request at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/pipe_out
>> >>> +
>> >>
>> >> Why is it /openstack/<version>
>> >> but <version>/meta-data
>> >> and <version>/ganeti ?
>> >> Can we have it a bit more logical?
>> >
>> > EC2 is:
>> > /<version>/meta-data/*
>> >
>> > OpenStack came later but wanted to keep compatibility, so they created
>> > their own directory, including their own API version number:
>> >
>> > /openstack/<version>/meta_data.json
>> >
>> > And Ganeti is supposed to follow the same style as openstack, but I
>> > wrote it wrong, sorry for the mistake:
>> > /ganeti/<version>/*
>> >
>>
>> Ack then.
>>
>> >>
>> >>> +As in a pipe, once the data are read, they will not be in the buffer 
>> >>> anymore, so
>> >>> +subsequent get request to ``pipe_in`` will not return the same data 
>> >>> twice.
>> >>> +Unlike a pipe, though, it will not be possible to perform blocking I/O
>> >>> +operations.
>> >>> +
>> >>
>> >> So maybe we should just call it read and write? :)
>> >
>> > Perfectly fine for me.
>> >
>> >>> +The OS parameters will be accessible through a GET
>> >>> +request at::
>> >>> +
>> >>> +  
>> >>> http://169.254.169.254/<version>/ganeti/os/parameters/<visibility>.json
>> >>> +
>> >>> +as a JSON serialized dictionary. ``<visibility>`` will be either 
>> >>> ``public`` or
>> >>> +``private`` or ``secret``.
>> >>> +
>
> Instead of having 'os/parameters/<visibility>', why not just have one
> endpoint that returns a JSON object with keys 'public', 'private', and
> 'secret'? Something like os/parameters.json. It gives us more
> flexibility in case we want to change the data structure instead of
> having to maintain several endpoints.

As I already replied to Guido in a previous email, that's perfectly
fine, and I'll do it.
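
A minimal sketch of how an instance-side tool might consume such a
single endpoint, only to make the idea concrete. The function names are
hypothetical, and the URL layout is an assumption that combines the
/ganeti/<version>/* correction discussed earlier in the thread with the
os/parameters.json name suggested above; nothing here is an agreed
interface.

```python
import json

# Assumption: one JSON object keyed by visibility, as suggested in the
# review. These names are illustrative, not part of any agreed API.
VISIBILITIES = ("public", "private", "secret")
METADATA_HOST = "169.254.169.254"


def parameters_url(version="latest"):
    """Build the URL of the single OS-parameters endpoint (assumed layout)."""
    return "http://%s/ganeti/%s/os/parameters.json" % (METADATA_HOST, version)


def split_parameters(document):
    """Parse the JSON body and return one dict per visibility level.

    Missing levels default to an empty dict. Unknown top-level keys are
    rejected so that a change in the data structure is noticed early.
    """
    data = json.loads(document)
    unknown = set(data) - set(VISIBILITIES)
    if unknown:
        raise ValueError("unexpected visibility keys: %s" % sorted(unknown))
    return {level: dict(data.get(level, {})) for level in VISIBILITIES}
```

The body returned by a GET on parameters_url() would be handed to
split_parameters(); keeping the three levels in one document means a
later change to the structure touches one parser instead of three
endpoints.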

Thanks for the suggestion,
Michele
-- 
Google Germany GmbH
Dienerstr. 12
80331 München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
