On Thu, Nov 14, 2013 at 2:05 PM, Jose A. Lopes <[email protected]> wrote:
> On Thu, Nov 14, 2013 at 11:31:11AM +0100, Guido Trotter wrote:
>> On Wed, Nov 13, 2013 at 9:57 AM, Michele Tartara <[email protected]> wrote:
>> > On Tue, Nov 12, 2013 at 2:13 PM, Guido Trotter <[email protected]> 
>> > wrote:
>> >> On Tue, Nov 12, 2013 at 12:41 PM, Michele Tartara <[email protected]> 
>> >> wrote:
>> >>> Add the document describing a new design for the OS installation process 
>> >>> for
>> >>> new instances.
>> >>>
>> >>> Signed-off-by: Michele Tartara <[email protected]>
>> >>> ---
>> >>>  doc/design-draft.rst |    1 +
>> >>>  doc/design-os.rst    |  318 
>> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>>  2 files changed, 319 insertions(+)
>> >>>  create mode 100644 doc/design-os.rst
>> >>>
>> >>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> >>> index c821292..3ed3852 100644
>> >>> --- a/doc/design-draft.rst
>> >>> +++ b/doc/design-draft.rst
>> >>> @@ -20,6 +20,7 @@ Design document drafts
>> >>>     design-daemons.rst
>> >>>     design-hsqueeze.rst
>> >>>     design-ssh-ports.rst
>> >>> +   design-os.rst
>> >>>
>> >>>  .. vim: set textwidth=72 :
>> >>>  .. Local Variables:
>> >>> diff --git a/doc/design-os.rst b/doc/design-os.rst
>> >>> new file mode 100644
>> >>> index 0000000..7a42a7f
>> >>> --- /dev/null
>> >>> +++ b/doc/design-os.rst
>> >>> @@ -0,0 +1,318 @@
>> >>> +===============================
>> >>> +Ganeti OS installation redesign
>> >>> +===============================
>> >>> +
>> >>> +.. contents:: :depth: 3
>> >>> +
>> >>> +This is a design document detailing a new OS installation procedure, 
>> >>> more
>> >>> +secure, able to provide more features and easier to use for many common 
>> >>> tasks
>> >>> +w.r.t. the current one.
>> >>> +
>> >>> +Current state and shortcomings
>> >>> +==============================
>> >>> +
>> >>> +As of Ganeti 2.10, each instance is associated with an OS definition. 
>> >>> An OS
>> >>> +definition is a set of scripts (``create``, ``export``, ``import``, 
>> >>> ``rename``)
>> >>> +that are executed with root privileges on the primary host of the 
>> >>> instance to
>> >>> +perform all the OS-related functionality (setting up an operating 
>> >>> system inside
>> >>> +the disks of the instance being created, exporting/importing the 
>> >>> instance,
>> >>> +renaming it).
>> >>> +
>> >>> +These scripts receive, as environment variables, a fixed set of 
>> >>> parameters
>> >>> +describing the instance (such as the hypervisor, the name of the 
>> >>> instance, the
>> >>> +number of disks, and their location) and a set of user defined 
>> >>> parameters. Each
>> >>> +of these parameters is also written into the configuration file of 
>> >>> Ganeti, to
>> >>> +allow for future reinstalls of the instance, and in various log files, 
>> >>> namely:
>> >>> +
>> >>> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
>> >>> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
>> >>> +
>> >>> +* master daemon log file: DEBUG strings related to the same RPC calls 
>> >>> are stored
>> >>> +  here as well.
>> >>> +
>> >>> +* commands log: the CLI commands that create a new instance, including 
>> >>> their
>> >>> +  parameters, are logged here.
>> >>> +
>> >>> +* RAPI log: the RAPI commands that create new instances, including 
>> >>> their
>> >>> +  parameters, are logged here.
>> >>> +
>> >>> +* job logs: the job files stored in the job queue or in its archive 
>> >>> contain the
>> >>> +  parameters.
>> >>> +
>> >>> +The current situation presents a number of shortcomings:
>> >>> +
>> >>> +* Having the installation scripts run with root power on the nodes is a 
>> >>> huge
>> >>> +  security issue.
>> >>> +
>> >>
>> >> s/is a huge security issue/doesn't allow user-defined os scripts, as
>> >> they would pose a huge security issue/
>> >>
>> >> Note that there's no security issue *per se* in the current situation,
>> >> if the OS scripts are trusted.
>> >> (except perhaps for export, if the os script mounts the instance disk,
>> >> which is also not necessarily the case)
>> >
>> > Yes, that's what I meant. I'll reword it as you suggest.
>> >
>> >>
>> >> That said, it could be a safety issue in the sense that a possible
>> >> bug/error in the os script could risk disrupting the node.
>> >
>> > ACK
>> >
>> >>
>> >>> +* Ganeti cannot be used to create instances starting from user provided 
>> >>> disk
>> >>> +  images: even in the (hypothetical) case where the scripts are 
>> >>> completely
>> >>> +  secure and run not by root but by an unprivileged user with only the 
>> >>> power to
>> >>> +  mount arbitrary files as disk images, this is a security issue. It 
>> >>> has been
>> >>> +  proven that a carefully crafted file system might exploit kernel
>> >>> +  vulnerabilities to gain control of the system. Therefore, directly 
>> >>> mounting
>> >>> +  images on the Ganeti nodes is not an option.
>> >>> +
>> >>> +* There is no way to inject files into an existing disk image. A common 
>> >>> use case
>> >>> +  is for the system administrator to provide a standard image of the 
>> >>> system, to
>> >>> +  be later personalized with the network configuration, private keys 
>> >>> identifying
>> >>> +  the machine, ssh keys of the users and so on. A possible workaround 
>> >>> would be
>> >>> +  for the scripts to mount the image (only if this is trusted!) and to 
>> >>> receive
>> >>> +  the configurations and ssh keys as user defined OS parameters. 
>> >>> Unfortunately,
>> >>> +  this is also not an option for security sensitive material (such as 
>> >>> the ssh
>> >>> +  keys) because the OS parameters are stored in many places on the 
>> >>> system, as
>> >>> +  already described above.
>> >>> +
>> >>> +* Most other virtualization software simply work with instance images, 
>> >>> not with
>> >>> +  installation scripts. This difference makes the interaction of Ganeti 
>> >>> with
>> >>> +  other softwares difficult.
>> >>
>> >> s/softwares/software/
>> >
>> > ACK
>> >
>> >>
>> >>> +
>> >>> +Proposed changes
>> >>> +================
>> >>> +
>> >>> +In order to fix the shortcomings of the current state, we plan to 
>> >>> introduce the
>> >>> +following changes:
>> >>> +
>> >>> +* Change the OS parameters to have three categories:
>> >>> +
>> >>> + * ``public``: the current behavior. The parameter is logged and stored 
>> >>> freely.
>> >>> +
>> >>> + * ``private``: the parameter is saved inside the Ganeti configuration 
>> >>> (to allow
>> >>> +   for instance reinstall) but it is not shown in logs, job logs, or 
>> >>> passed back
>> >>> +   via RAPI.
>> >>> +
>> >>> + * ``secret``: the parameter is not saved inside the Ganeti 
>> >>> configuration.
>> >>> +   Reinstalls are impossible unless the data is passed again. The 
>> >>> parameter will
>> >>> +   not appear in any log file. In order to preserve the functionality 
>> >>> of Ganeti,
>> >>> +   the parameters will still need to be stored in the job files, but 
>> >>> they will
>> >>> +   be removed from there when the job has finished running (either 
>> >>> successfully
>> >>> +   or not).
>> >>> +
>> >>
>> >> Do we actually need to save them in the job files?
>> >> The job files could be saved (to disk) without, and in case the master
>> >> is failed over the job can be failed.
>> >> (this should make it a lot harder to access)
>> >
>> > Unfortunately, I think we need to save them. Currently the job is
>> > created by luxid, serialized, and then read from file and executed by
>> > masterd, as part of the ongoing migration of the job queue from
>> > masterd to luxid.
>> >
>>
>> Ack, but this is hopefully temporary, and the job data can perhaps in
>> the future be passed via socket between the two...
>> So OK temporarily during development, but not by design, let's rather
>> fix the underlying problem.
>>
>> >>> +* A new OS installation procedure, based on a safe virtualized 
>> >>> environment.
>> >>> +  This virtualized environment will run with the same hardware 
>> >>> parameter as the
>> >>> +  actual instance being installed, as much as possible. This will also 
>> >>> allow to
>> >>> +  reduce the memory usage in the host (specifically, in Dom0 for Xen
>> >>> +  installations).
>> >>> Each instance will have these possible execution modes:
>> >>> +
>> >>> +  * ``run``: the default mode, used when the machine is running 
>> >>> normally.
>> >>> +
>> >>> +  * ``self_install``: Ganeti will start the instance with a different 
>> >>> set of
>> >>> +    user-specified parameters, therefore allowing to attach an 
>> >>> installation
>> >>> +    floppy/cdrom/network, change the boot device order, or specify an 
>> >>> OS image
>> >>> +    to be used. The instance will then be responsible to get the 
>> >>> parameters for
>> >>> +    configuring itself (its network interfaces, IP address, hostname, 
>> >>> etc.) from
>> >>> +    a set of metadata provided to it by Ganeti (e.g.: using an approach
>> >>> +    comparable to the one of the ``cloud-init`` tool). When this 
>> >>> installation
>> >>> +    mode is used, no OS installation script is required.
>> >>> +    In order for installation of an OS from an image to be possible, a 
>> >>> new
>> >>> +    parameter ``--os-image`` will be added, allowing to specify where 
>> >>> to take
>> >>> +    the image from. It will have to be mutually exclusive with 
>> >>> ``--os-type``. If
>> >>> +    ``--os-image`` is specified, ``--os-parameters`` can still be used, 
>> >>> as it
>> >>> +    will be passed to the instance as part of the metadata.
>> >>> +    The set of ``self_install`` parameters will be stored as part of the
>> >>> +    instance configuration, so that they can be used to reinstall the 
>> >>> instance.
>> >>> +    It will be the user's responsibility to ensure that the OS image or 
>> >>> any
>> >>> +    installation media is still available in the proper position when a
>> >>> +    reinstall happens.
>> >>> +
>> >>
>> >> Should we use --os-type image:<name> and/or have an image os provider
>> >> that defines:
>> >> 1) the actual parameters needed for installation
>> >> 2) the image (eg. the verify script could double check that the image
>> >> is available from the node or accessible via the network...)
>> >>
>> >> I think in particular it would be useful to still have the concept of
>> >> an OS "provider" that tells ganeti how to install itself (which
>> >> parameters to use). This of course could be overridable, but at least
>> >> there would be a sane default without relying on the user to "get it
>> >> right".
>> >
>> > Regarding using --os-type image:<name>:
>> > That was my initial thought too, and also my favorite choice. Still,
>> > given that we usually want to keep backwards compatibility, this would
>> > cause problems if somebody has an OS definition called "image".
>> > Furthermore, that name would become reserved in the future.
>> > If you think it is a small enough risk, and listing this in the
>> > "incompatible changes" section of the NEWS file is enough, then I'm
>> > absolutely in favor of doing it.
>> >
>>
>> I think it would be OK as it's not conflicting with an OS definition
>> called "image" but one called image:<something>, no?
>>
>> > Regarding the os provider: my idea here was to have a possibility of
>> > using Ganeti without having to provide a provider, but just an OS
>> > image plus some "gnt-instance add" parameters, therefore having a more
>> > standard approach, similar to what other solutions are doing. Having
>> > an OS provider for this as well, would defeat this purpose. Moreover,
>> > providing an installation script would still be an option, so whoever
>> > wants to have an OS provider can have one.
>> >
>>
>> Ack.
>>
>> >>
>> >>> +  * ``install``: Ganeti will start the instance using a virtual 
>> >>> appliance
>> >>> +    specifically made for installing Ganeti instances. Scripts 
>> >>> analogous to the
>> >>> +    current ones will run inside this instance. The disks of the 
>> >>> instance being
>> >>> +    installed will be connected to this virtual appliance, so that the 
>> >>> scripts
>> >>> +    can mount them and modify them as needed, as currently happens, but 
>> >>> with the
>> >>> +    additional protection given by this happening in a VM. The virtual 
>> >>> appliance
>> >>> +    will be started in a clean state every time a new instance needs to 
>> >>> be
>> >>> +    created, to further increase security. Metadata will be provided 
>> >>> also to
>> >>> +    this virtual appliance, that will take care of converting them to
>> >>> +    environment variables for the installation scripts.
>> >>> +
>> >>
>> >> Please specify better that by "will be started in a clean state" you
>> >> actually mean "the disk will be reset to its pristine state and not
>> >> reused between reinstallation" because it might be construed to mean
>> >> just the "booting" (runtime info) which is sort of less strict.
>> >
>> > ACK
>> >
>> >>
>> >>> +In order to allow for the metadata to be sent inside the instance, a
>> >>> +communication mechanism between the instance and the host will be 
>> >>> created. This
>> >>> +mechanism will be bidirectional (e.g.: to allow the setup process going 
>> >>> on
>> >>> +inside the instance to communicate its progress to the host). Each 
>> >>> instance will
>> >>> +have access exclusively to its own metadata, and it will be only able to
>> >>> +communicate with its host over this channel.
>> >>> +
>> >>
>> >> Too vague :)
>> >
>> > It's intentionally vague: here it's just meant to state the problem.
>> > The actual description of the metadata and the communication mechanism
>> > is in the implementation section. I'll add a reference to that from
>> > here.
>> >
>>
>> Thanks.
>>
>> >>
>> >>
>> >>> +As part of the instance creation command it will be possible to 
>> >>> indicate a URL
>> >>> +for a "personalization package", that is an archive containing a set of 
>> >>> files
>> >>> +meant to be overlayed on top of the operating system file system at the 
>> >>> end of
>> >>> +the setup process, before the VM is started for the first time in 
>> >>> ``run`` mode.
>> >>> +Ganeti will provide a mechanism for receiving and unpacking this 
>> >>> archive as part
>> >>> +of the ``install`` execution mode, whereas in ``self_install`` mode it 
>> >>> will only
>> >>> +be provided as a metadata for the instance to use.
>> >>> +The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or 
>> >>> ``.tgz``)
>> >>> +and will contain the files according to the directory structure that 
>> >>> will be
>> >>> +recreated on the installation disk. Files contained in this archive will
>> >>> +overwrite files with the same path created during the install procedure 
>> >>> (if
>> >>> +any).
>> >>> +The URL of the "personalization package" will have to specify an 
>> >>> extension to
>> >>> +identify the file format (in order to allow for more formats to be 
>> >>> supported in
>> >>> +the future).
>> >>> +The URL will be stored as part of the configuration of the instance 
>> >>> (therefore,
>> >>> +the URL should not contain confidential information, but the file there
>> >>> +available can). It is up to the system administrator to ensure that a 
>> >>> package
>> >>> +is actually available at that URL at install and reinstall time.
>> >>> +The content of the package is allowed to change. E.g.: a system 
>> >>> administrator
>> >>> +might create a package containing the private keys of the instance being
>> >>> +created. When the instance is reinstalled, a new package with new keys 
>> >>> can be
>> >>> +made available there, therefore allowing instance reinstall without the 
>> >>> need to
>> >>> +store keys.
>> >>> +
>> >>
>> >> Add something about authentication perhaps (so that an admin can have
>> >> a file available only to the ganeti installer only for the time of the
>> >> installation) and also about the fact that we won't cache/keep the
>> >> file on the node OS.
>> >
>> > ACK
>> >
>> >>
>> >>> +Implementation
>> >>> +==============
>> >>> +
>> >>> +The implementation of this design will happen as an ordered sequence of 
>> >>> steps,
>> >>> +of increasing impact on the system and, in some cases, dependent on 
>> >>> each other:
>> >>> +
>> >>> +#. Private and secret instance parameters
>> >>> +#. Communication mechanism between host and instance
>> >>> +#. Metadata service
>> >>> +#. Personalization package
>> >>> +#. ``self_install`` mode
>> >>> +#. ``install`` mode (with virtualization environment)
>> >>> +
>> >>> +Some of these steps need to be more deeply specified w.r.t. what is 
>> >>> already
>> >>> +written in the `Proposed changes`_ Section. Extra details will be 
>> >>> provided in
>> >>> +the following Subsections.
>> >>> +
>> >>> +Communication mechanism and metadata service
>> >>> +++++++++++++++++++++++++++++++++++++++++++++
>> >>> +
>> >>> +The communication mechanism and the metadata service are described 
>> >>> together
>> >>> +because they are deeply tied. On the other hand, the communication 
>> >>> mechanism
>> >>> +will need to be more generic because it can be used for other reasons 
>> >>> in the
>> >>> +future (like allowing instances to esplicitly send commands to Ganeti, 
>> >>> or to let
>> >>
>> >> explicitly
>> >
>> > ACK
>> >
>> >>
>> >>> +Ganeti control a helper instance, like the one hereby introduced for 
>> >>> performing
>> >>> +OS installs inside a safe environment).
>> >>> +
>> >>> +The communication mechanism will be enabled automatically when the 
>> >>> instance is
>> >>> +in ``self_install`` or ``install`` mode, but for backwards 
>> >>> compatibility it will
>> >>> +be disabled when the instance is in ``run`` mode unless it is esplicitly
>> >>
>> >> ^ see above
>> >
>> > ACK
>> >
>> >>
>> >>> +requested at instance startup by using a new, ad-hoc, parameter
>> >>> +(``--communication``).
>> >>
>> >> Which parameter is this? An instance, hypervisor or backend parameter? 
>> >> And why?
>> >> Also -C could do as well (if we go for instance level). Remember to
>> >> specify here as it has to be clear that an instance once configured
>> >> that way will be always started that way.
>> >>
>> >
>> > Yes, it's intended to be an instance level parameter. I'll specify
>> > that it is set at creation time, or modifiable with "gnt-instance
>> > modify", and then is automatically read from the config and used every
>> > time the instance is started.
>> >
>> >>> +
>> >>> +When the communication mechanism is enabled, Ganeti will create a new 
>> >>> network
>> >>> +interface inside the instance. This extra network interface will be the 
>> >>> last one
>> >>> +of the instance, after all the user defined ones. On the host side, this
>> >>> +interface will be only accessible to the host itself, and not be routed 
>> >>> outside
>> >>> +the machine.
>> >>
>> >> Actually it would be great if we didn't even have to create the tap.
>> >
>> > Do you mean something like (for kvm):
>> >   -net user,net=169.254.169.0/24,host=169.254.169.254
>> > that starts a user network showing the host as reachable with address
>> > 169.254.169.254?
>> >
>>
>> Yes, that would be a secure way to do it. Or perhaps using a
>> VDE-compatible connection?
>> But it doesn't have to be. Otherwise let's discuss which rules there
>> will be by default, so that we ensure that traffic can't get to the
>> wrong place.
>>
>> >>> +On this network interface, the instance will connect using the IP:
>> >>> +169.254.169.1 and netmask 255.255.255.0.
>> >>> +The host will be on the same network, with the IP address: 
>> >>> 169.254.169.254.
>> >>> +The instance will be able to connect to 169.254.169.254:80, and issue 
>> >>> GET
>> >>> +requests to an HTTP server that will provide the instance metadata.
>> >>> +
>> >>> +The choice of this IP address and port is done for compatibility 
>> >>> reasons with
>> >>> +OpenStack's and Amazon EC2's ways of providing metadata to the instance.
>> >>> +
>> >>> +Where possible, the metadata will be provided in a way compatible with 
>> >>> OpenStack
>> >>> +at::
>> >>> +
>> >>> +  http://169.254.169.254/openstack/<version>/meta_data.json
>> >>> +
>> >>> +or with Amazon EC2, at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/meta-data/*
>> >>> +
>> >>> +If some metadata are Ganeti-specific and don't fit this structure, they 
>> >>> will be
>> >>> +provided at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/meta_data.json
>> >>> +
>> >>
>> >> Not quite clear! :) How does the OS choose between those? How are they
>> >> expected to differ?
>> >
>> > The idea is to provide the data in both formats, so the OS can choose
>> > based on its own preferences (there are some tools already getting the
>> > data from those positions, such as cloud-init).
>
> cloud-init seems to be using the Amazon EC2 URL.
>
> Why doesn't Ganeti use the Amazon EC2 URL? It seems the Amazon EC2 URL
> already has some of the fields Ganeti requires in this design doc,
> such as, public keys.  We can add other fields that do not collide
> with existing ones, for example, private keys.
>
> And why do we have the OpenStack URL as well?  If there are any tools
> that we plan to use that require the OpenStack URL, maybe they should
> be listed in this design document, as well.
>
> It just seems strange that we have an excellent opportunity here to
> design something clean from scratch and it already looks so complicated...


The idea is exactly to use the Amazon EC2 URL for everything that fits
in there, so that existing tools, like cloud-init, can access those data.
Unfortunately there are some things that don't fit in EC2's metadata
format, and for those I'm planning a separate "ganeti/" directory.
I suggested also supporting OpenStack because it is quickly becoming a
standard, but it can definitely be removed if we don't have an
immediate reason for supporting it. It can always be added later.

Regarding Ganeti's own URLs: some things are not present in Amazon's
API. For those, we cannot just add random pieces to that API, because
that would be an incompatible "embrace and extend" approach.

Adding a new ganeti/ directory next to the amazon one, as openstack
did, makes it clear what is new and what is compatible, and it seems
to me that is not complicated.
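
As a rough sketch of that split, in Python (the key set and the function
name below are only illustrative assumptions, not something the design
defines):

```python
# Illustrative only: this key set and the function name are assumptions,
# not part of the design document.
EC2_COMPATIBLE_KEYS = {"hostname", "instance-id", "local-ipv4", "public-keys"}


def split_metadata(metadata):
    """Split metadata into an EC2-compatible part and a Ganeti-specific part."""
    ec2_part = {}
    ganeti_part = {}
    for key, value in metadata.items():
        if key in EC2_COMPATIBLE_KEYS:
            # Fits the existing EC2 layout, so existing tools can read it.
            ec2_part[key] = value
        else:
            # Anything else is kept apart, under the separate ganeti/ tree.
            ganeti_part[key] = value
    return ec2_part, ganeti_part
```

Nothing that is not already in the EC2 format ends up in the EC2 part, so
that part stays compatible with existing tools.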

The reason why it looks complicated, probably, is that (as I already
wrote replying to Guido) unfortunately there is a mistake in the
design I sent. The actual directory structure which I meant is not:

169.254.169.254/<version>/*
169.254.169.254/<version>/ganeti/*
169.254.169.254/openstack/<version>/*

but:

169.254.169.254/<version>/*
169.254.169.254/ganeti/<version>/*
169.254.169.254/openstack/<version>/*

which separates things much more elegantly.
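
A minimal sketch, in Python, of how a client could build the three roots
of the corrected layout (the function name and the "v1" version string in
the usage line are placeholders for illustration only):

```python
METADATA_HOST = "169.254.169.254"


def metadata_roots(version):
    """Return the base URL of each metadata tree for the given version."""
    return {
        # EC2-compatible tree, directly under the version.
        "ec2": "http://%s/%s/" % (METADATA_HOST, version),
        # Ganeti-specific tree, next to the EC2 one.
        "ganeti": "http://%s/ganeti/%s/" % (METADATA_HOST, version),
        # OpenStack-compatible tree (may or may not be kept).
        "openstack": "http://%s/openstack/%s/" % (METADATA_HOST, version),
    }


# "v1" is a hypothetical version string.
roots = metadata_roots("v1")
print(roots["ganeti"])
```

The version now sits below the "ganeti" and "openstack" prefixes in the
same way, and only the EC2 tree lives directly under the version.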

Sorry for the mistake, and I hope this structure (which might or might
not contain openstack, at this point) looks better.

Thanks,
Michele
-- 
Google Germany GmbH
Dienerstr. 12
80331 München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
