On Tue, Jul 30, 2013 at 9:40 PM, Dimitris Aragiorgis <[email protected]> wrote:
> * Guido Trotter <[email protected]> [2013-07-30 13:48:10 +0200]:
>
>> On Thu, Jul 25, 2013 at 1:41 AM, Dimitris Aragiorgis <[email protected]> wrote:
>> > Hotpluging is done by functions invoked by ApplyContainerMods(). In
>> > order hotpluging to take place the --hotplug option must be passed
>> > otherwise the modifications will take place after reboot.
>> >
>> > NIC hotplug supports add, remove and modify. The modify is done by removing
>> > the existing NIC and adding a new one in the same pci slot.
>> >
>> > Disk hotplug support add and remove. Before hotpluging a Disk it
>> > must be assembled. During LUInstanceSetParams() newly created disks
>> > are not added to the instance so ExpandCheckDisks() in
>> > AssembleInstanceDisks() will fail. So we make this check optional
>> > only for this case.
>> >
>> > In order to remove a disk (with blockdev_remove) it must be shutdown. So
>> > after unpluging the disk ShutdownDiskInstanceDisks() must be invoked.
>> >
>> > For both device types we use the generic RPC call_hotplug_device.
>> >
>> > Signed-off-by: Dimitris Aragiorgis <[email protected]>
>> > ---
>> > lib/cmdlib/instance.py | 52
>> > ++++++++++++++++++++++++++++++++++++----
>> > lib/cmdlib/instance_storage.py | 5 ++--
>> > 2 files changed, 50 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/lib/cmdlib/instance.py b/lib/cmdlib/instance.py
>> > index 22812ea..a602512 100644
>> > --- a/lib/cmdlib/instance.py
>> > +++ b/lib/cmdlib/instance.py
>> > @@ -76,6 +76,11 @@ _TApplyContModsCbChanges = \
>> > ])))
>> >
>> >
>> > +def _DeviceHotplugable(dev):
>> > +
>>
>> Guess what I'm going to say? :)
>>
>
> I am feeling lucky: docstring?
>
Yep.
>> > + return dev.uuid is not None
>> > +
>>
>> Can it possibly be None ? Wouldn't that be a ProgrammerError?
>>
>>
>
> No it actually cannot. I asked Christos recently. I'll remove that
> helperfunction. With current implementation there is no limitation from
> the server side regarding hotplug-able devices.
>
Ack.
>> > +
>> > def _CheckHostnameSane(lu, name):
>> > """Ensures that a given hostname resolves to a 'sane' name.
>> >
>> > @@ -3028,7 +3033,8 @@ class LUInstanceSetParams(LogicalUnit):
>> > # Operate on copies as this is still in prereq
>> > nics = [nic.Copy() for nic in self.instance.nics]
>> > _ApplyContainerMods("NIC", nics, self._nic_chgdesc, self.nicmod,
>> > - self._CreateNewNic, self._ApplyNicMods, None)
>> > + self._CreateNewNic, self._ApplyNicMods,
>> > + self._RemoveNic)
>> > # Verify that NIC names are unique and valid
>> > utils.ValidateDeviceNames("NIC", nics)
>> > self._new_nics = nics
>> > @@ -3203,6 +3209,14 @@ class LUInstanceSetParams(LogicalUnit):
>> > " continuing anyway: %s", idx,
>> > self.cfg.GetNodeName(pnode_uuid), msg)
>> >
>> > + def _HotplugDevice(self, action, dev_type, device, extra, seq):
>> > + self.LogInfo("Trying to hotplug device...")
>>
>> ... and not quite telling you whether we succeeded or not? :)
>>
>
> Again here is it OK to use result.payload (that can be the return value of
> the hypervisor method) to print messages and warnings?
>
Yes. Or even catch an HotplugException exception and transform it into warnings.
(after all the RPC failed, it's the whole opcode that didn't)
>> > + result = self.rpc.call_hotplug_device(self.instance.primary_node,
>> > + self.instance, action, dev_type,
>> > + device, extra, seq)
>> > + result.Raise("Could not hotplug device.")
>>
>> Maybe you should raise an internal exception here...
>>
>> > + self.LogInfo("Hotplug done.")
>> > +
>> > def _CreateNewDisk(self, idx, params, _):
>> > """Creates a new disk.
>> >
>> > @@ -3228,6 +3242,13 @@ class LUInstanceSetParams(LogicalUnit):
>> > disks=[(idx, disk, 0)],
>> > cleanup=new_disks)
>> >
>> > + if self.op.hotplug:
>> > + _, device_info = AssembleInstanceDisks(self, self.instance,
>> > + [disk], check=False)
>> > + _, _, dev_path = device_info[0]
>> > + self._HotplugDevice(constants.HOTPLUG_ADD, constants.HOTPLUG_DISK,
>> > + disk, dev_path, idx)
>> > +
>>
>> ... and catch it here, transform it into a warning, and move on?
>>
>> > return (disk, [
>> > ("disk/%d" % idx, "add:size=%s,mode=%s" % (disk.size, disk.mode)),
>> > ])
>> > @@ -3253,6 +3274,11 @@ class LUInstanceSetParams(LogicalUnit):
>> > """Removes a disk.
>> >
>> > """
>> > + if self.op.hotplug and _DeviceHotplugable(root):
>> > + self._HotplugDevice(constants.HOTPLUG_REMOVE,
>> > constants.HOTPLUG_DISK,
>> > + root, None, idx)
>> > + ShutdownInstanceDisks(self, self.instance, [root])
>> > +
>> > (anno_disk,) = AnnotateDiskParams(self.instance, [root], self.cfg)
>> > for node_uuid, disk in anno_disk.ComputeNodeTree(
>> > self.instance.primary_node):
>> > @@ -3282,13 +3308,18 @@ class LUInstanceSetParams(LogicalUnit):
>> > nicparams=nicparams)
>> > nobj.uuid = self.cfg.GenerateUniqueID(self.proc.GetECId())
>> >
>> > - return (nobj, [
>> > + if self.op.hotplug:
>> > + self._HotplugDevice(constants.HOTPLUG_ADD, constants.HOTPLUG_NIC,
>> > + nobj, None, idx)
>> > +
>> > + desc = [
>> > ("nic.%d" % idx,
>> > "add:mac=%s,ip=%s,mode=%s,link=%s,network=%s" %
>> > (mac, ip, private.filled[constants.NIC_MODE],
>> > - private.filled[constants.NIC_LINK],
>> > - net)),
>> > - ])
>> > + private.filled[constants.NIC_LINK], net)),
>> > + ]
>> > +
>> > + return (nobj, desc)
>> >
>> > def _ApplyNicMods(self, idx, nic, params, private):
>> > """Modifies a network interface.
>> > @@ -3313,8 +3344,19 @@ class LUInstanceSetParams(LogicalUnit):
>> > for (key, val) in nic.nicparams.items():
>> > changes.append(("nic.%s/%d" % (key, idx), val))
>> >
>> > + if self.op.hotplug and _DeviceHotplugable(nic):
>>
>> Should we warn here about the possible breakage?
>>
>
> This is done in client side.
>
Ack. But then it wouldn't appear in the rapi response, I guess.
Although that's less important there.
>> Also perhaps should we be more sneaky here? Only a few of the possible
>> NIC modification require actual unplug and replug.
>> MAC does for example. But network could be done completely externally
>
> Not really. Network configuration requires kvm_if up scripts to run.
> _ConfigureNIC()
> sould run in any case. Putting logic here whether to create a new tap or
> obtain it from /var/run/.. will make code ugly and we won't benefit much.
>
Sure, but I'm wonder if we shouldn't (now or in the future) change
kvm_ifup to be able to do changes on an interface that exists already,
and not remove/replug for changes that are just "node-side". Maybe the
logic could be on the kvm side, not here, and here we could just call
replug.
>> from the hypervisor without the unplugging and replugging and possibly
>> breaking. Name needs no unplug/replug at all, also? (the guest doesn't
>> know the name right? so unplugging it for changing it doesn't help).
>>
>> > + self._HotplugDevice(constants.HOTPLUG_REMOVE, constants.HOTPLUG_NIC,
>> > + nic, None, idx)
>> > + self._HotplugDevice(constants.HOTPLUG_ADD, constants.HOTPLUG_NIC,
>> > + nic, None, idx)
>> > +
>>
>> Maybe there should be a REPLUG mode to avoid the RTT and doing
>> everything in one op?
>
> Yes this is doable. This will change hypervisor commands as well but will
> ensure the "plug it in again in the same slot" (by obtaining the pci slot
> from the runtime file, save it locally, remove the entry and re-add it
> with the same pci slot)
>
Perfect, thanks. And will allow hypervisors to implement as "change
without unplug/replug" when it makes sense for them. :)
Which also means they'll need to return an explicit information on
whether an unplug/replug was necessary as part of the change.
So we need to call the constant HOTPLUG_CHANGE, which might or might
not imply an unplug/plug, depending on the changes and the
implementation.
>> Also we can't guarantee we won't fail to remove and then add, and
>> other such fun situations, like this, if we keep ignoring failures.
>>
>> > return changes
>> >
>> > + def _RemoveNic(self, idx, nic, _):
>> > + if self.op.hotplug and _DeviceHotplugable(nic):
>> > + self._HotplugDevice(constants.HOTPLUG_REMOVE, constants.HOTPLUG_NIC,
>> > + nic, None, idx)
>> > +
>> > def Exec(self, feedback_fn):
>> > """Modifies an instance.
>> >
>> > diff --git a/lib/cmdlib/instance_storage.py
>> > b/lib/cmdlib/instance_storage.py
>> > index 793cfa6..1516826 100644
>> > --- a/lib/cmdlib/instance_storage.py
>> > +++ b/lib/cmdlib/instance_storage.py
>> > @@ -1259,7 +1259,7 @@ def _SafeShutdownInstanceDisks(lu, instance,
>> > disks=None):
>> >
>> >
>> > def AssembleInstanceDisks(lu, instance, disks=None,
>> > ignore_secondaries=False,
>> > - ignore_size=False):
>> > + ignore_size=False, check=True):
>> > """Prepare the block devices for an instance.
>> >
>> > This sets up the block devices on all nodes.
>> > @@ -1284,7 +1284,8 @@ def AssembleInstanceDisks(lu, instance, disks=None,
>> > ignore_secondaries=False,
>> > """
>> > device_info = []
>> > disks_ok = True
>> > - disks = ExpandCheckDisks(instance, disks)
>> > + if check:
>> > + disks = ExpandCheckDisks(instance, disks)
>> >
>>
>> Mmm... what if we added it before to the instance and left the check?
>> Skipping it might be again wrong... We can just add it to the instance
>> object and *then* try to hotplug.
>>
>
> Got lost here too. Can you provide more details? What I know is the following:
> Before hotplugging a disk it must be created (blockdev_create) and
> assembled (blockdev_assemble) which will return its actual dev_path.
> ExpandCheckDisk just compares a list of disks (the newly created one)
> with the already existing iobj.disks (not including the newly created).
> ApplyContainerMods invokes CreateNewDisk and then inserts the newly
> created item in the corresponding container (instance.disks). This to
> change is a lot of effort and will break things. I think for the time
> being this workaround is not that wrong.
> In order to avoid this workaround hotplug must be dragged out of
> CreateNewDisk()
> which is not the easiest thing to do..
Why not? It seems creating a new disk and hotplugging a disk into an
instance are two quite different tasks.
See it from the future "disks as an entity" point of view: what if you
want to hotplug a disk that already exists: then you don't create it,
you just load it and hotplug it. Or you unplug it but not "remove it".
:)
Thanks,
Guido