On Thu, Jul 25, 2013 at 1:41 AM, Dimitris Aragiorgis <[email protected]> wrote:

This is a very big patch, would it make sense to split it, introducing
the new methods, and using them, perhaps?

> Hotplug will not be supported for:
>  - KVM < 1.0
>  - existing devices in the cluster
>  - python-fdsend module is not installed (NIC hotplug)
>  - chroot (Disk hotplug)
>  - security mode other than None (Disk hotplug)
> If no hotplug takes place modifications will take place after reboot.
>
> New methods:
> Introduce new method HotplugDevice() and HotAddNic/HotDelNic,
> HotAddDisk/HotDelDisk helper methods that eventually make use of
> QEMU monitor interface for hotpluging.
>
> Device naming:
> QEMU monitor expects devices to be uniquely named. Device ids derive
> from the following function:
> kvm_devid = <device_type>-<part of uuid>-pci-<pci_slot>
> Device ids must be reproduce-able when we want to remove them.
> For that reason we store the pci slot inside the runtime file and
> in case we want to remove a device we obtain its pci slot by
> parsing the corresponding runtime enrty and matching the device
> by its uuid.
>
> Finding the PCI slot:
> For newly added devices Hypervisor parses existing PCI allocations
> (via _AnnotateFreePCISlot() and eventually ``info pci`` monitor
> command) and decides the PCI slot to plug in the device. During
> instance startup hypervisor invokes _UpdatePCISlots() for every
> device of the instance.  Initial PCI reservations derive from KVM
> default setup, that allocates 4 slots for devices other than disks
> and NICs.
>
> NIC hotplug:
>  - open a tap and get its file descriptor.
>  - pass fd with SCM rights (using python-fdsend) via monitor socket
>  - create netdev and device with id=kvm_devid and proper pci info
>
> Disk hotplug:
>  - create drive with id=kvm_devid
>  - create device with id=kvm_devid and corresponding pci info
>
> In order to migrate a VM, an identical VM should be booted with
> exactly the same pci configuration (and with -incoming option).  PCI
> info is passed via runtime file. To this end every time a hotplug
> takes place runtime file must be updated.
>
> Introduce _GenerateKVMBlockDevicesOptions():
> The runtime file contains one more field: block_devices. kvm_cmd is extended
> with block device options during _ExecuteKVMRuntime().
>
> Handle old style format of runtime files:
> In case block_devices are already encapsulated inside kvm_cmd
> and runtime files have only 3 entries, set block_devices to [].
> This way migration will not fail and hotplug will succeed for
> new disks only.
>
> Signed-off-by: Dimitris Aragiorgis <[email protected]>
> ---
>  lib/hypervisor/hv_kvm.py |  356 
> ++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 312 insertions(+), 44 deletions(-)
>
> diff --git a/lib/hypervisor/hv_kvm.py b/lib/hypervisor/hv_kvm.py
> index 4bd8a9e..5bf4255 100644
> --- a/lib/hypervisor/hv_kvm.py
> +++ b/lib/hypervisor/hv_kvm.py
> @@ -37,10 +37,16 @@ import shutil
>  import socket
>  import stat
>  import StringIO
> +import copy
> +from bitarray import bitarray
>  try:
>    import affinity   # pylint: disable=F0401
>  except ImportError:
>    affinity = None
> +try:
> +  import fdsend   # pylint: disable=F0401
> +except ImportError:
> +  fdsend = None
>
>  from ganeti import utils
>  from ganeti import constants
> @@ -79,6 +85,40 @@ _SPICE_ADDITIONAL_PARAMS = frozenset([
>    constants.HV_KVM_SPICE_USE_TLS,
>    ])
>
> +FREE = bitarray("0")
> +

Could you perhaps give this a better name, and/or add a comment about its use?
A global constant called "FREE" as a bitarray doesn't say much per se.

> +def _GenerateDeviceKVMId(dev_type, dev):
> +

Missing docstring.

> +  if not dev or not dev.pci:
> +    return None
> +
> +  return "%s-%s-pci-%d" % (dev_type.lower(), dev.uuid.split("-")[0], dev.pci)

does it make more sense to return "None" or an exception here?
(just want to avoid uncaught None being around because we haven't checked)

> +
> +
> +def _UpdatePCISlots(dev, pci_reservations):
> +  """Update pci configuration for a stopped instance
> +
> +  If dev has a pci slot the reserve it, else find first available.
> +
> +  """

Please document the types here

> +  if dev.pci:
> +    free = dev.pci
> +  else:
> +    [free] = pci_reservations.search(FREE, 1) # pylint: disable=E1103
> +    if not free:
> +      raise errors.HypervisorError("All PCI slots occupied")
> +    dev.pci = int(free)
> +
> +  pci_reservations[free] = True
> +
> +
> +def _RemoveFromRuntimeEntry(devices, device, fn):

Missing docstring

> +  try:
> +    [rem] = [x for x in fn(devices) if x.uuid == device.uuid]
> +    devices.remove(rem)
> +  except (ValueError, IndexError):
> +    logging.info("No device with uuid %s in runtime file", device.uuid)
> +
>
>  def _GetTunFeatures(fd, _ioctl=fcntl.ioctl):
>    """Retrieves supported TUN features from file descriptor.
> @@ -569,6 +609,14 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>    _BOOT_RE = re.compile(r"^-drive\s([^-]|(?<!^)-)*,boot=on\|off", re.M | 
> re.S)
>    _UUID_RE = re.compile(r"^-uuid\s", re.M)
>
> +  _INFO_PCI_RE = re.compile(r'Bus.*device[ ]*(\d+).*')
> +  _INFO_PCI_CMD = "info pci"
> +  _INFO_VERSION_RE = \
> +    re.compile(r'^QEMU (\d+)\.(\d+)(\.(\d+))?.*monitor.*', re.M)
> +  _INFO_VERSION_CMD = "info version"
> +
> +  _DEFAULT_PCI_RESERVATIONS = "11110000000000000000000000000000"
> +
>    ANCILLARY_FILES = [
>      _KVM_NETWORK_SCRIPT,
>      ]
> @@ -1021,6 +1069,81 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>          data.append(info)
>      return data
>
> +  def _GetExistingDeviceKVMId(self, instance, dev_type, dev):

Missing docstring

> +    (_, kvm_nics, __, block_devices) = self._LoadKVMRuntime(instance)
> +    if dev_type == constants.HOTPLUG_NIC:
> +      found = [n for n in kvm_nics
> +               if n.uuid == dev.uuid]
> +    elif dev_type == constants.HOTPLUG_DISK:
> +      found = [d for d, _ in block_devices
> +               if d.uuid == dev.uuid]
> +    dev_info = None
> +    if found:
> +      dev_info = found[0]
> +    return _GenerateDeviceKVMId(dev_type, dev_info)
> +
> +  def _GenerateKVMBlockDevicesOptions(self, instance, kvm_cmd, block_devices,
> +                                      pci_reservations, kvmhelp):
> +

Missing docstring.

> +    hvp = instance.hvparams
> +    boot_disk = hvp[constants.HV_BOOT_ORDER] == constants.HT_BO_DISK
> +
> +    # whether this is an older KVM version that uses the boot=on flag
> +    # on devices
> +    needs_boot_flag = self._BOOT_RE.search(kvmhelp)
> +
> +    disk_type = hvp[constants.HV_DISK_TYPE]
> +    if disk_type == constants.HT_DISK_PARAVIRTUAL:
> +      #TODO: parse kvm -device ? output

Space before "TODO"

> +      disk_model = "virtio-blk-pci"
> +      if_val = ",if=virtio"
> +    else:
> +      if_val = ",if=%s" % disk_type
> +    # Cache mode
> +    disk_cache = hvp[constants.HV_DISK_CACHE]
> +    if instance.disk_template in constants.DTS_EXT_MIRROR:
> +      if disk_cache != "none":
> +        # TODO: make this a hard error, instead of a silent overwrite
> +        logging.warning("KVM: overriding disk_cache setting '%s' with 'none'"
> +                        " to prevent shared storage corruption on migration",
> +                        disk_cache)
> +      cache_val = ",cache=none"
> +    elif disk_cache != constants.HT_CACHE_DEFAULT:
> +      cache_val = ",cache=%s" % disk_cache
> +    else:
> +      cache_val = ""
> +    for cfdev, dev_path in block_devices:
> +      if cfdev.mode != constants.DISK_RDWR:
> +        raise errors.HypervisorError("Instance has read-only disks which"
> +                                     " are not supported by KVM")
> +      # TODO: handle FD_LOOP and FD_BLKTAP (?)
> +      boot_val = ""
> +      if boot_disk:
> +        kvm_cmd.extend(["-boot", "c"])
> +        boot_disk = False
> +        if needs_boot_flag and disk_type != constants.HT_DISK_IDE:
> +          boot_val = ",boot=on"
> +      drive_val = "file=%s,format=raw%s%s" % \
> +                  (dev_path, boot_val, cache_val)
> +      _UpdatePCISlots(cfdev, pci_reservations)
> +      kvm_devid = _GenerateDeviceKVMId("DISK", cfdev)
> +      if kvm_devid:
> +        #TODO: name id after model

Same

> +        drive_val += (",if=none,id=%s" % kvm_devid)
> +        drive_val += (",bus=0,unit=%d" % cfdev.pci)
> +      else:
> +        drive_val += if_val
> +
> +      kvm_cmd.extend(["-drive", drive_val])
> +
> +      if kvm_devid:
> +        dev_val = ("%s,drive=%s,id=%s" %
> +                    (disk_model, kvm_devid, kvm_devid))
> +        dev_val += ",bus=pci.0,addr=%s" % hex(cfdev.pci)
> +        kvm_cmd.extend(["-device", dev_val])
> +
> +    return kvm_cmd
> +

Rather than taking kvm_cmd as input and extending it shouldn't you
just return the -drive values, and let the caller act on kvm_cmd as
appropriate?

>    def _GenerateKVMRuntime(self, instance, block_devices, startup_paused,
>                            kvmhelp):
>      """Generate KVM information to start an instance.
> @@ -1089,9 +1212,8 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>
>      kernel_path = hvp[constants.HV_KERNEL_PATH]
>      if kernel_path:
> -      boot_disk = boot_cdrom = boot_floppy = boot_network = False
> +      boot_cdrom = boot_floppy = boot_network = False
>      else:
> -      boot_disk = hvp[constants.HV_BOOT_ORDER] == constants.HT_BO_DISK

What's this change? Can it be extracted out/explained separately?

>        boot_cdrom = hvp[constants.HV_BOOT_ORDER] == constants.HT_BO_CDROM
>        boot_floppy = hvp[constants.HV_BOOT_ORDER] == constants.HT_BO_FLOPPY
>        boot_network = hvp[constants.HV_BOOT_ORDER] == constants.HT_BO_NETWORK
> @@ -1107,38 +1229,6 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      needs_boot_flag = self._BOOT_RE.search(kvmhelp)
>
>      disk_type = hvp[constants.HV_DISK_TYPE]
> -    if disk_type == constants.HT_DISK_PARAVIRTUAL:
> -      if_val = ",if=virtio"
> -    else:
> -      if_val = ",if=%s" % disk_type
> -    # Cache mode
> -    disk_cache = hvp[constants.HV_DISK_CACHE]
> -    if instance.disk_template in constants.DTS_EXT_MIRROR:
> -      if disk_cache != "none":
> -        # TODO: make this a hard error, instead of a silent overwrite
> -        logging.warning("KVM: overriding disk_cache setting '%s' with 'none'"
> -                        " to prevent shared storage corruption on migration",
> -                        disk_cache)
> -      cache_val = ",cache=none"
> -    elif disk_cache != constants.HT_CACHE_DEFAULT:
> -      cache_val = ",cache=%s" % disk_cache
> -    else:
> -      cache_val = ""
> -    for cfdev, dev_path in block_devices:
> -      if cfdev.mode != constants.DISK_RDWR:
> -        raise errors.HypervisorError("Instance has read-only disks which"
> -                                     " are not supported by KVM")
> -      # TODO: handle FD_LOOP and FD_BLKTAP (?)
> -      boot_val = ""
> -      if boot_disk:
> -        kvm_cmd.extend(["-boot", "c"])
> -        boot_disk = False
> -        if needs_boot_flag and disk_type != constants.HT_DISK_IDE:
> -          boot_val = ",boot=on"
> -
> -      drive_val = "file=%s,format=raw%s%s%s" % (dev_path, if_val, boot_val,
> -                                                cache_val)
> -      kvm_cmd.extend(["-drive", drive_val])
>
>      #Now we can specify a different device type for CDROM devices.
>      cdrom_disk_type = hvp[constants.HV_KVM_CDROM_DISK_TYPE]

Please move disk_type = hvp[constants.HV_DISK_TYPE] just above this
line, as now it's extracted and used only in this section.
(ore remove it completely and use the explicit hvp... below, since
it's used only once, your choice, but don't live it as a lonely line
above).

> @@ -1407,7 +1497,7 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      kvm_nics = instance.nics
>      hvparams = hvp
>
> -    return (kvm_cmd, kvm_nics, hvparams)
> +    return (kvm_cmd, kvm_nics, hvparams, block_devices)
>
>    def _WriteKVMRuntime(self, instance_name, data):
>      """Write an instance's KVM runtime
> @@ -1433,9 +1523,13 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      """Save an instance's KVM runtime
>
>      """
> -    kvm_cmd, kvm_nics, hvparams = kvm_runtime
> +    kvm_cmd, kvm_nics, hvparams, block_devices = kvm_runtime
> +
>      serialized_nics = [nic.ToDict() for nic in kvm_nics]
> -    serialized_form = serializer.Dump((kvm_cmd, serialized_nics, hvparams))
> +    serialized_blockdevs = [(blk.ToDict(), link) for blk, link in 
> block_devices]
> +    serialized_form = serializer.Dump((kvm_cmd, serialized_nics, hvparams,
> +                                      serialized_blockdevs))
> +
>      self._WriteKVMRuntime(instance.name, serialized_form)
>
>    def _LoadKVMRuntime(self, instance, serialized_runtime=None):
> @@ -1444,10 +1538,19 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      """
>      if not serialized_runtime:
>        serialized_runtime = self._ReadKVMRuntime(instance.name)
> +
>      loaded_runtime = serializer.Load(serialized_runtime)
> -    kvm_cmd, serialized_nics, hvparams = loaded_runtime
> +    if len(loaded_runtime)==3:
> +      serialized_blockdevs = []
> +      kvm_cmd, serialized_nics, hvparams = loaded_runtime
> +    else:
> +      kvm_cmd, serialized_nics, hvparams, serialized_blockdevs = 
> loaded_runtime
> +

Just wondering if you shouldn't add a runtime "version", and assume 0
if not present, instead of desuming features from the found fields.

>      kvm_nics = [objects.NIC.FromDict(snic) for snic in serialized_nics]
> -    return (kvm_cmd, kvm_nics, hvparams)
> +    block_devices = [(objects.Disk.FromDict(sdisk), link)
> +                     for sdisk, link in serialized_blockdevs]
> +
> +    return (kvm_cmd, kvm_nics, hvparams, block_devices)
>
>    def _RunKVMCmd(self, name, kvm_cmd, tap_fds=None):
>      """Run the KVM cmd and check for errors
> @@ -1472,6 +1575,7 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      if not self._InstancePidAlive(name)[2]:
>        raise errors.HypervisorError("Failed to start instance %s" % name)
>
> +  # pylint: disable=R0914

What are you disabling? And why? Can you add a comment here explaining?

>    def _ExecuteKVMRuntime(self, instance, kvm_runtime, kvmhelp, 
> incoming=None):
>      """Execute a KVM cmd, after completing it with some last minute data.
>
> @@ -1495,9 +1599,12 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>
>      temp_files = []
>
> -    kvm_cmd, kvm_nics, up_hvp = kvm_runtime
> +    kvm_cmd, kvm_nics, up_hvp, block_devices = kvm_runtime
>      # the first element of kvm_cmd is always the path to the kvm binary
>      kvm_path = kvm_cmd[0]
> +
> +    kvm_cmd_runtime = copy.deepcopy(kvm_cmd)
> +

copy.deepcopy tends to be slower than needed. can we avoid it?
sometimes serializing/deserializing in memory is faster. of course it
might not matter here, perhaps?

>      up_hvp = objects.FillDict(conf_hvp, up_hvp)
>
>      # We know it's safe to run as a different user upon migration, so we'll 
> use
> @@ -1516,6 +1623,13 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>        utils.WriteFile(keymap_path, data="include en-us\ninclude %s\n" % 
> keymap)
>        kvm_cmd.extend(["-k", keymap_path])
>
> +    pci_reservations = bitarray(self._DEFAULT_PCI_RESERVATIONS)
> +
> +    kvm_cmd = self._GenerateKVMBlockDevicesOptions(instance, kvm_cmd,
> +                                                   block_devices,
> +                                                   pci_reservations,
> +                                                   kvmhelp)
> +

ew! in the function above you both update the passed kvm_cmd, return
it, and reassign it here to the same.
let's avoid this please, as mentioned!! :)

>      # We have reasons to believe changing something like the nic driver/type
>      # upon migration won't exactly fly with the instance kernel, so for nic
>      # related parameters we'll use up_hvp
> @@ -1556,8 +1670,15 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>          tapfds.append(tapfd)
>          taps.append(tapname)
>          if kvm_supports_netdev:
> -          nic_val = "%s,mac=%s,netdev=netdev%s" % (nic_model, nic.mac, 
> nic_seq)
> -          tap_val = "type=tap,id=netdev%s,fd=%d%s" % (nic_seq, tapfd, 
> tap_extra)
> +          nic_val = "%s,mac=%s" % (nic_model, nic.mac)
> +          _UpdatePCISlots(nic, pci_reservations)
> +          kvm_devid = _GenerateDeviceKVMId("NIC", nic)
> +          netdev = kvm_devid or "netdev%d" % nic_seq

kvm_devid or ... ? maybe better explicitly putting these in the if
below, and restructure the code so there's one only explicit if...
(or try/catch if we move _GenerateDevice... to exceptions).

> +          nic_val += (",netdev=%s" % netdev)
> +          if kvm_devid:
> +            nic_val += (",id=%s,bus=pci.0,addr=%s" % (kvm_devid, 
> hex(nic.pci)))
> +          tap_val = ("type=tap,id=%s,fd=%d%s" %
> +                     (netdev, tapfd, tap_extra))
>            kvm_cmd.extend(["-netdev", tap_val, "-device", nic_val])
>          else:
>            nic_val = "nic,vlan=%s,macaddr=%s,model=%s" % (nic_seq,
> @@ -1680,6 +1801,10 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>        # explicitly requested resume the vm status.
>        self._CallMonitorCommand(instance.name, self._CONT_CMD)
>
> +    kvm_runtime_with_pci_info = (kvm_cmd_runtime, kvm_nics,
> +                                 up_hvp, block_devices)
> +    return kvm_runtime_with_pci_info
> +

And next time we add something here we call it
_with_pci_info_and_something_else?
Also you're changing _ExecuteKVMRuntime's from not returning to
returning without updating the docstring.
And perhaps we could avoid it and keep the generate/save separate from
the execute part?

>    def StartInstance(self, instance, block_devices, startup_paused):
>      """Start an instance.
>
> @@ -1690,7 +1815,9 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      kvm_runtime = self._GenerateKVMRuntime(instance, block_devices,
>                                             startup_paused, kvmhelp)
>      self._SaveKVMRuntime(instance, kvm_runtime)
> -    self._ExecuteKVMRuntime(instance, kvm_runtime, kvmhelp)
> +    kvm_runtime_with_pci_info = self._ExecuteKVMRuntime(instance, 
> kvm_runtime,
> +                                                        kvmhelp)
> +    self._SaveKVMRuntime(instance, kvm_runtime_with_pci_info)
>

Save then execute then save? No....

>    def _CallMonitorCommand(self, instance_name, command):
>      """Invoke a command on the instance monitor.
> @@ -1716,6 +1843,145 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>
>      return result
>
> +  def _AnnotateFreePCISlot(self, instance, dev):
> +    """Get the first available pci slot of a runnung instance.
> +
> +    """

@type? @param? @return?
Also _Annotate ... "Get the ..." Can you explain better what are you
annotating versus getting?

> +    slots = bitarray(32)
> +    slots.setall(False) # pylint: disable=E1101

What error is that and why it's not a problem here?

> +    output = self._CallMonitorCommand(instance.name, self._INFO_PCI_CMD)
> +    for line in output.stdout.splitlines():
> +      match = self._INFO_PCI_RE.search(line)
> +      if match:
> +        slot = int(match.group(1))
> +        slots[slot] = True
> +

Could we get them all with a multiline re, and iterate just on the
result, perhaps?

> +    [free] = slots.search(FREE, 1) # pylint: disable=E1101
> +    if not free:
> +      raise errors.HypervisorError("All PCI slots occupied")
> +
> +    dev.pci = int(free)
> +

Ah, so you're just finding the slot, not annotating anything... or am I wrong?

> +  def _TryHotplug(self, instance, dev_type):

_TryHotplug....

> +    """Get QEMU version from the instance's monitor.

... gets the qemu version ??

> +
> +    Hotplug is supported for running instances and for versions >= 1.0.

Empty line here. Also @return, @param, etc.

> +    """
> +    if dev_type == constants.HOTPLUG_DISK:
> +      hvp = instance.hvparams
> +      security_model = hvp[constants.HV_SECURITY_MODEL]
> +      use_chroot = hvp[constants.HV_KVM_USE_CHROOT]
> +      if use_chroot or security_model != constants.HT_SM_NONE:
> +        return False

and returns False (clearly not a version)

> +    output = self._CallMonitorCommand(instance.name, self._INFO_VERSION_CMD)
> +    #TODO: search for netdev_add, drive_add, device_add.....
> +    match = self._INFO_VERSION_RE.search(output.stdout)
> +    if not match:
> +      return False
> +    v_major, v_min, _, _ = match.groups()
> +    return (v_major, v_min) >= (1, 0)

Or a tuple. No. Please streamline this code.
Also please try to see if you can share code between here and the
"normal" version getting.
Finally keep separate the "get the version" part from the "decide
whether X is supported" part, even if you depend on the version.
And, if you can, try to see if something is supported via other help
commands rather than the version itself (but if impossible the version
is ok).

> +
> +  def _CallMonitorHotplugCommand(self, name, cmd):
> +    output = self._CallMonitorCommand(name, cmd)
> +    #TODO: parse output and check if succeeded

Indeed. But not in the distant future. :) Also:
- missing docstring
- this is not a hotplug specific function, it seems to be "call+log,
or similar".
- maybe it's time to change _CallMonitorCommand not to use socat (in a
previous patch), especially as we use it more?

> +    for line in output.stdout.splitlines():
> +      logging.info("%s", line)
> +
> +  def HotplugDevice(self, instance, action, dev_type, device, extra, seq):
> +    """ Generic method to hotplug device
> +
> +    Depending on action and dev_type invoke the coresponding method that

corresponding.

> +    does the actual hotplug (via KVM monitor commands).
> +    Before and after, do all hotplug related steps (e.g. check if hotplug
> +    is possible, annotate pci slot to device and generate/get existing
> +    KVM device ids, read/write runtime file)
> +

Is this meant as a Hypervisor base function? If so please add the one
to the base class first (in a separate patch), and document there and
here properly with the complete docstring.

> +    """
> +    if self._TryHotplug(instance, dev_type):

else? pretend nothing happened and don't return anything, nor raise an
unsupported exception? No.

> +      (kvm_cmd, kvm_nics, hvparams, \
> +        block_devices) = self._LoadKVMRuntime(instance)
> +      if action == constants.HOTPLUG_ADD:

This looks like will be standard among all hypervisors. Maybe even if
the rpc call is only one you want here to have at least:
AddHotplugDevice
and
DelHotplugDevice
and perhaps directly Disk and Nic also, so the switch can happen only
in backend.

> +        self._AnnotateFreePCISlot(instance, device)
> +        kvm_devid = _GenerateDeviceKVMId(dev_type, device)
> +        if dev_type == constants.HOTPLUG_DISK:
> +          self._HotAddDisk(instance,
> +                           device, extra, seq, kvm_devid, block_devices)
> +        elif dev_type == constants.HOTPLUG_NIC and fdsend:

and fdsend is not None. What if it's None? Again, just moving on and
pretending nothing happened is not a solution.

> +          self._HotAddNic(instance, device, extra, seq, kvm_devid, kvm_nics)
> +      elif action == constants.HOTPLUG_REMOVE:
> +        kvm_devid = self._GetExistingDeviceKVMId(instance, dev_type, device)
> +        if dev_type == constants.HOTPLUG_DISK:
> +          self._HotDelDisk(instance,
> +                           device, extra, seq, kvm_devid, block_devices)
> +        elif dev_type == constants.HOTPLUG_NIC and fdsend:
> +          self._HotDelNic(instance, device, extra, seq, kvm_devid, kvm_nics)

You're actually not using fdsend for removing nics. But well, if you
want to keep the dependency specular that's fair too.
Please add a comment in that sense.

> +      self._SaveKVMRuntime(instance,
> +                          (kvm_cmd, kvm_nics, hvparams, block_devices))
> +      time.sleep(2)

For what? Just so we hope the operation whose output we leave as a
TODO to check has more chances to succeed?
What are we waiting for? Is there a way to properly way and perhaps report?

> +
> +  def _HotAddDisk(self, instance, disk, dev_path, _, kvm_devid, 
> block_devices):
> +    """Hotplug/add new disk to and instance
> +

Usual docstring complaints.

> +    """
> +    command = ("drive_add dummy file=%s,if=none,id=%s,format=raw" %
> +               (dev_path, kvm_devid))
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    command = ("device_add virtio-blk-pci,bus=pci.0,addr=%s,"
> +               "drive=%s,id=%s"
> +               % (hex(disk.pci), kvm_devid, kvm_devid))
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    block_devices.append((disk, dev_path))

Can you explain the two commands, either in the docstring or somewhere else?
Also perhaps if you change the _CallMonitorCommand not to use socat
you can even send the two over the same socket without opening and
closing and forking a couple of times.

> +
> +  def _HotAddNic(self, instance, nic, _, seq, kvm_devid, kvm_nics):
> +    """Hotplug/add nic to an instance
> +

Same.

> +    """
> +    (tap, fd) = _OpenTap()
> +    self._PassTapFd(instance, fd, nic)
> +    command = ("netdev_add tap,id=%s,fd=%s" % (kvm_devid, kvm_devid))
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    command = ("device_add virtio-net-pci,bus=pci.0,addr=%s,mac=%s,"
> +               "netdev=%s,id=%s"
> +               % (hex(nic.pci), nic.mac, kvm_devid, kvm_devid))
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    self._ConfigureNIC(instance, seq, nic, tap)
> +    utils.WriteFile(self._InstanceNICFile(instance.name, seq), data=tap)
> +    kvm_nics.append(nic)
> +

Same. Also wouldn't it make sense to configure it and then plug it?

> +  def _HotDelDisk(self, instance, disk, _, __, kvm_devid, block_devices):
> +    """Hotplug/remove disk from an instance
> +
> +    """
> +    command = "device_del %s" % kvm_devid
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    #command = "drive_del %s" % uuid
> +    #self._CallMonitorHotplugCommand(instance.name, command)

Mmmm... fishy commented out code with no explanation... ;)

> +    _RemoveFromRuntimeEntry(block_devices, disk, lambda x: [d for d, l in x])
> +
> +  def _HotDelNic(self, instance, nic, _, __, kvm_devid, kvm_nics):
> +    """Hotplug/remove existing nic from an instance
> +
> +    """
> +    command = "device_del %s" % kvm_devid
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    command = "netdev_del %s" % kvm_devid
> +    self._CallMonitorHotplugCommand(instance.name, command)
> +    _RemoveFromRuntimeEntry(kvm_nics, nic, lambda x: x)
> +
> +  def _PassTapFd(self, instance, fd, nic):
> +    """Pass file descriptor to kvm process via monitor socket using 
> SCM_RIGHTS
> +
> +    """
> +    monsock = utils.ShellQuote(self._InstanceMonitor(instance.name))
> +    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
> +    s.connect(monsock)
> +    kvm_devid = _GenerateDeviceKVMId("NIC", nic)
> +    command = "getfd %s\n" % kvm_devid
> +    fds = [fd]
> +    logging.info("%s", fds)
> +    fdsend.sendfds(s, command, fds = fds)
> +    s.close()
> +

You're not creating a new way to connect the the kvm monitor and then
using it just for your own purpose here in a forgotten function
ignoring all other cases and a more generic implementation, right? ...
:) Of course you wouldn't do that?


>    @classmethod
>    def _ParseKVMVersion(cls, text):
>      """Parse the KVM version from the --help output.
> @@ -1831,7 +2097,9 @@ class KVMHypervisor(hv_base.BaseHypervisor):
>      self._SaveKVMRuntime(instance, kvm_runtime)
>      kvmpath = instance.hvparams[constants.HV_KVM_PATH]
>      kvmhelp = self._GetKVMOutput(kvmpath, self._KVMOPT_HELP)
> -    self._ExecuteKVMRuntime(instance, kvm_runtime, kvmhelp)
> +    kvm_runtime_with_pci_info = \
> +      self._ExecuteKVMRuntime(instance, kvm_runtime, kvmhelp)
> +    self._SaveKVMRuntime(instance, kvm_runtime_with_pci_info)
>

Again Save + ... + Save is not nice.

Thanks, and sorry for the looong review.

Guido

Reply via email to