On Tue, Jul 30, 2013 at 9:40 PM, Dimitris Aragiorgis <[email protected]> wrote:
> * Guido Trotter <[email protected]> [2013-07-30 13:48:10 +0200]:
>
>> On Thu, Jul 25, 2013 at 1:41 AM, Dimitris Aragiorgis <[email protected]> wrote:
>> > Hotpluging is done by functions invoked by ApplyContainerMods(). In
>> > order hotpluging to take place the --hotplug option must be passed
>> > otherwise the modifications will take place after reboot.
>> >
>> > NIC hotplug supports add, remove and modify. The modify is done by removing
>> > the existing NIC and adding a new one in the same pci slot.
>> >
>> > Disk hotplug support add and remove.  Before hotpluging a Disk it
>> > must be assembled. During LUInstanceSetParams() newly created disks
>> > are not added to the instance so ExpandCheckDisks() in
>> > AssembleInstanceDisks() will fail. So we make this check optional
>> > only for this case.
>> >
>> > In order to remove a disk (with blockdev_remove) it must be shutdown. So
>> > after unpluging the disk ShutdownDiskInstanceDisks() must be invoked.
>> >
>> > For both device types we use the generic RPC call_hotplug_device.
>> >
>> > Signed-off-by: Dimitris Aragiorgis <[email protected]>
>> > ---
>> >  lib/cmdlib/instance.py         |   52 
>> > ++++++++++++++++++++++++++++++++++++----
>> >  lib/cmdlib/instance_storage.py |    5 ++--
>> >  2 files changed, 50 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/lib/cmdlib/instance.py b/lib/cmdlib/instance.py
>> > index 22812ea..a602512 100644
>> > --- a/lib/cmdlib/instance.py
>> > +++ b/lib/cmdlib/instance.py
>> > @@ -76,6 +76,11 @@ _TApplyContModsCbChanges = \
>> >      ])))
>> >
>> >
>> > +def _DeviceHotplugable(dev):
>> > +
>>
>> Guess what I'm going to say? :)
>>
>
> I am feeling lucky: docstring?
>

Yep.

>> > +  return dev.uuid is not None
>> > +
>>
>> Can it possibly be None ? Wouldn't that be a ProgrammerError?
>>
>>
>
> No it actually cannot. I asked Christos recently. I'll remove that
> helperfunction.  With current implementation there is no limitation from
> the server side regarding hotplug-able devices.
>

Ack.

>> > +
>> >  def _CheckHostnameSane(lu, name):
>> >    """Ensures that a given hostname resolves to a 'sane' name.
>> >
>> > @@ -3028,7 +3033,8 @@ class LUInstanceSetParams(LogicalUnit):
>> >        # Operate on copies as this is still in prereq
>> >        nics = [nic.Copy() for nic in self.instance.nics]
>> >        _ApplyContainerMods("NIC", nics, self._nic_chgdesc, self.nicmod,
>> > -                          self._CreateNewNic, self._ApplyNicMods, None)
>> > +                          self._CreateNewNic, self._ApplyNicMods,
>> > +                          self._RemoveNic)
>> >        # Verify that NIC names are unique and valid
>> >        utils.ValidateDeviceNames("NIC", nics)
>> >        self._new_nics = nics
>> > @@ -3203,6 +3209,14 @@ class LUInstanceSetParams(LogicalUnit):
>> >                          " continuing anyway: %s", idx,
>> >                          self.cfg.GetNodeName(pnode_uuid), msg)
>> >
>> > +  def _HotplugDevice(self, action, dev_type, device, extra, seq):
>> > +    self.LogInfo("Trying to hotplug device...")
>>
>> ... and not quite telling you whether we succeeded or not? :)
>>
>
> Again here is it OK to use result.payload (that can be the return value of
> the hypervisor method) to print messages and warnings?
>

Yes. Or even catch an HotplugException exception and transform it into warnings.
(after all the RPC failed, it's the whole opcode that didn't)

>> > +    result = self.rpc.call_hotplug_device(self.instance.primary_node,
>> > +                                          self.instance, action, dev_type,
>> > +                                          device, extra, seq)
>> > +    result.Raise("Could not hotplug device.")
>>
>> Maybe you should raise an internal exception here...
>>
>> > +    self.LogInfo("Hotplug done.")
>> > +
>> >    def _CreateNewDisk(self, idx, params, _):
>> >      """Creates a new disk.
>> >
>> > @@ -3228,6 +3242,13 @@ class LUInstanceSetParams(LogicalUnit):
>> >                           disks=[(idx, disk, 0)],
>> >                           cleanup=new_disks)
>> >
>> > +    if self.op.hotplug:
>> > +      _, device_info = AssembleInstanceDisks(self, self.instance,
>> > +                                             [disk], check=False)
>> > +      _, _, dev_path = device_info[0]
>> > +      self._HotplugDevice(constants.HOTPLUG_ADD, constants.HOTPLUG_DISK,
>> > +                          disk, dev_path, idx)
>> > +
>>
>> ... and catch it here, transform it into a warning, and move on?
>>
>> >      return (disk, [
>> >        ("disk/%d" % idx, "add:size=%s,mode=%s" % (disk.size, disk.mode)),
>> >        ])
>> > @@ -3253,6 +3274,11 @@ class LUInstanceSetParams(LogicalUnit):
>> >      """Removes a disk.
>> >
>> >      """
>> > +    if self.op.hotplug and _DeviceHotplugable(root):
>> > +      self._HotplugDevice(constants.HOTPLUG_REMOVE, 
>> > constants.HOTPLUG_DISK,
>> > +                          root, None, idx)
>> > +      ShutdownInstanceDisks(self, self.instance, [root])
>> > +
>> >      (anno_disk,) = AnnotateDiskParams(self.instance, [root], self.cfg)
>> >      for node_uuid, disk in anno_disk.ComputeNodeTree(
>> >                               self.instance.primary_node):
>> > @@ -3282,13 +3308,18 @@ class LUInstanceSetParams(LogicalUnit):
>> >                         nicparams=nicparams)
>> >      nobj.uuid = self.cfg.GenerateUniqueID(self.proc.GetECId())
>> >
>> > -    return (nobj, [
>> > +    if self.op.hotplug:
>> > +      self._HotplugDevice(constants.HOTPLUG_ADD, constants.HOTPLUG_NIC,
>> > +                          nobj, None, idx)
>> > +
>> > +    desc =  [
>> >        ("nic.%d" % idx,
>> >         "add:mac=%s,ip=%s,mode=%s,link=%s,network=%s" %
>> >         (mac, ip, private.filled[constants.NIC_MODE],
>> > -       private.filled[constants.NIC_LINK],
>> > -       net)),
>> > -      ])
>> > +       private.filled[constants.NIC_LINK], net)),
>> > +      ]
>> > +
>> > +    return (nobj, desc)
>> >
>> >    def _ApplyNicMods(self, idx, nic, params, private):
>> >      """Modifies a network interface.
>> > @@ -3313,8 +3344,19 @@ class LUInstanceSetParams(LogicalUnit):
>> >        for (key, val) in nic.nicparams.items():
>> >          changes.append(("nic.%s/%d" % (key, idx), val))
>> >
>> > +    if self.op.hotplug and _DeviceHotplugable(nic):
>>
>> Should we warn here about the possible breakage?
>>
>
> This is done in client side.
>

Ack. But then it wouldn't appear in the rapi response, I guess.
Although that's less important there.

>> Also perhaps should we be more sneaky here? Only a few of the possible
>> NIC modification require actual unplug and replug.
>> MAC does for example. But network could  be done completely externally
>
> Not really. Network configuration requires kvm_if up scripts to run. 
> _ConfigureNIC()
> sould run in any case. Putting logic here whether to create a new tap or
> obtain it from /var/run/.. will make code ugly and we won't benefit much.
>

Sure, but I'm wonder if we shouldn't (now or in the future) change
kvm_ifup to be able to do changes on an interface that exists already,
and not remove/replug for changes that are just "node-side". Maybe the
logic could be on the kvm side, not here, and here we could just call
replug.

>> from the hypervisor without the unplugging and replugging and possibly
>> breaking. Name needs no unplug/replug at all, also? (the guest doesn't
>> know the name right? so unplugging it for changing it doesn't help).
>>
>> > +      self._HotplugDevice(constants.HOTPLUG_REMOVE, constants.HOTPLUG_NIC,
>> > +                          nic, None, idx)
>> > +      self._HotplugDevice(constants.HOTPLUG_ADD, constants.HOTPLUG_NIC,
>> > +                          nic, None, idx)
>> > +
>>
>> Maybe there should be a REPLUG mode to avoid the RTT and doing
>> everything in one op?
>
> Yes this is doable. This will change hypervisor commands as well but will
> ensure the "plug it in again in the same slot" (by obtaining the pci slot
> from the runtime file, save it locally, remove the entry and re-add it
> with the same pci slot)
>

Perfect, thanks. And will allow hypervisors to implement as "change
without unplug/replug" when it makes sense for them. :)
Which also means they'll need to return an explicit information on
whether an unplug/replug was necessary as part of the change.
So we need to call the constant HOTPLUG_CHANGE, which might or might
not imply an unplug/plug, depending on the changes and the
implementation.

>> Also we can't guarantee we won't fail to remove and then add, and
>> other such fun situations, like this, if we keep ignoring failures.
>>
>> >      return changes
>> >
>> > +  def _RemoveNic(self, idx, nic, _):
>> > +    if self.op.hotplug and _DeviceHotplugable(nic):
>> > +      self._HotplugDevice(constants.HOTPLUG_REMOVE, constants.HOTPLUG_NIC,
>> > +                          nic, None, idx)
>> > +
>> >    def Exec(self, feedback_fn):
>> >      """Modifies an instance.
>> >
>> > diff --git a/lib/cmdlib/instance_storage.py 
>> > b/lib/cmdlib/instance_storage.py
>> > index 793cfa6..1516826 100644
>> > --- a/lib/cmdlib/instance_storage.py
>> > +++ b/lib/cmdlib/instance_storage.py
>> > @@ -1259,7 +1259,7 @@ def _SafeShutdownInstanceDisks(lu, instance, 
>> > disks=None):
>> >
>> >
>> >  def AssembleInstanceDisks(lu, instance, disks=None, 
>> > ignore_secondaries=False,
>> > -                           ignore_size=False):
>> > +                          ignore_size=False, check=True):
>> >    """Prepare the block devices for an instance.
>> >
>> >    This sets up the block devices on all nodes.
>> > @@ -1284,7 +1284,8 @@ def AssembleInstanceDisks(lu, instance, disks=None, 
>> > ignore_secondaries=False,
>> >    """
>> >    device_info = []
>> >    disks_ok = True
>> > -  disks = ExpandCheckDisks(instance, disks)
>> > +  if check:
>> > +    disks = ExpandCheckDisks(instance, disks)
>> >
>>
>> Mmm... what if we added it before to the instance and left the check?
>> Skipping it might be again wrong... We can just add it to the instance
>> object and *then* try to hotplug.
>>
>
> Got lost here too. Can you provide more details? What I know is the following:
> Before hotplugging a disk it must be created (blockdev_create) and
> assembled (blockdev_assemble) which will return its actual dev_path.
> ExpandCheckDisk just compares a list of disks (the newly created one)
> with the already existing iobj.disks (not including the newly created).
> ApplyContainerMods invokes CreateNewDisk and then inserts the newly
> created item in the corresponding container (instance.disks).  This to
> change is a lot of effort and will break things. I think for the time
> being this workaround is not that wrong.
> In order to avoid this workaround hotplug must be dragged out of 
> CreateNewDisk()
> which is not the easiest thing to do..

Why not? It seems creating a new disk and hotplugging a disk into an
instance are two quite different tasks.
See it from the future "disks as an entity" point of view: what if you
want to hotplug a disk that already exists: then you don't create it,
you just load it and hotplug it. Or you unplug it but not "remove it".
:)

Thanks,

Guido

Reply via email to