[+ lists]

Hi Guido,

* Guido Trotter <[email protected]> [2013-06-18 13:40:04 +0200]:

> On Sat, May 25, 2013 at 6:41 PM, Dimitris Aragiorgis <[email protected]> wrote:
> > * Dimitris Aragiorgis <[email protected]> [2013-05-25 12:06:51 +0300]:
> >
> >> > > >
> >> > > > How do we guarantee that no two devices will have the same name? We
> >> > > should
> >> > > > double check this, to be sure there are no conflicts.
> >> > > >
> >> > > >
> >> > >
> >> > > Well we actually don't. Trimming system's generated uuid rises our 
> >> > > chances
> >> > > for duplicate IDs. If there is a way to generate a uuid of max 32 chars
> >> > > starting with a letter based on the uuids of type
> >> > > /proc/sys/kernel/random/uuid
> >> > > then it would be great. Until we find one,  we could disable hotplug 
> >> > > for
> >> > > devices
> >> > > that have the uuid prefix.
> >> > >
> >> >
> >> > That's going to be a problem, as it will generate errors for no reason.
> >> > (sorry, you can't hotplug your device, because you were unlucky in the 
> >> > uuid
> >> > roulette. please try again being luckier)
> >> >
> >> > How about using some information that is specific to the instance and 
> >> > can't
> >> > change? eg.
> >> >
> >> > x<pci-id>_<start-of-uuid>
> >> > or
> >>
> >> Well it seems that if we obtain the device pci slot before naming it then 
> >> we
> >> could definitely use it as an identifier. To this end, UUIDs are not a 
> >> prereq.
> >> And it seems to be doable. Good point.
> >>
> >
> >
> > Well after second thoughts and code digging, I think the device naming
> > should derive from uuid only. That is because when we remove a device
> > via an rpc the object that is passed to the hypervisor does not have any
> > pci details (because master is unaware of hypervisor specific info). It
> > must be a deterministic way to produce a unique name that remains
> > unchanged (the uuid, for example, does not change during device life)...
> >
> 
> But the hypervisor then knows (and can dig up) the pci id, no?
> I am a bit worried with the uuid only, due to the duplication problem.
> 

The thing is that ganeti keeps track a list of devices. With hotplug
support the order of that list may not be identical with the one inside
the VM.  E.g.:
We have an instance with [nic0, nic1, nic2].
It gets booted with the same device order [pci0, pci1, pci2].
He hot-remove nic1 so we have [nic0, nic2] and [pci0, pci2].
Then we hot-add another device and get [nic0, nic2, nic3] and
[pci0, pci3, pci2] because nic3 gets placed in nic1's pci slot
(the first available).

The hypervisor can only find a device based (a) on its name (new style
hotplug command like netdev_add, etc) or (b) based on its pci slot (old
style commands like pci_add, etc.).

I have gone with (a) so we have to find a way to get a unique device name
that will be used by kvm. I have used the first part of UUID which is not
enough with your sayings.

Runtime files are used to keep track the configuration of a running
instance.  It means that the device lists that master knows are not the
same with the ones kept in runtime files.

In current design runtime file updates takes place after a successful
hotplug action.  In case of hot-add, we ask the hypervisor for the first
available pci slot, we name the device after its uuid, we hot-add the device,
and we append the newly added device to the existing list in runtime file.

In case of hot-remove we get the name of the device (from its uuid),
hot-remove it, and then parse the existing list in runtime file and
remove the device with the corresponding uuid.

In my opinion this scenario is more elegant but we have the duplication
corner-case, which one must be sooooo unlucky to trigger....


This can change slightly and we can use the device's pci slot for its name!
So I propose the following:

In case of hot-add, we ask the hypervisor for the first
available pci slot, we name the device (something like nic-pci-<slot>), we 
hot-add
it based on that name and then we append it to the corresponding list inside
runtime file.

In case of hot-remove, we parse the runtime list, find the device based on
it uuid, we get its pci slot and hot-remove it based on its name 
(nic-pci-<slot>).
Then we parse the list in runtime file and find the device the entry based on
its pci and remove it.

With other words we save the pci slot of each device in runtime files and 
retrieve
it from there when needed.

In case you do not want to add another slot in NIC and Disk objects this
info could be stored as a tupple of (NIC.ToDict, <pci_slot>) but I think
the code will be more complicated.. Additionally because we would
refer to devices based on their uuids, the lists should be changed to dicts
with uuids as keys.

What do you think about all that?

> >> > x<arbitrary-integer>_<start-of-uuid> (where the arbitrary integer is just
> >> > used for deduplication)
> >> >
> >> >
> >
> > Arbitrary integer cannot be used because device name should be anytime
> > reproduce-able.
> >
> 
> Yes, unless you have them saved or readable.
> 
> >>
> >> OK. Sasha's feedback is more than helpful. I'll try to address the issues 
> >> that
> >> are raised, but still I think that those could extend the initial 
> >> implementation
> >> discussed here and do not affect the basic design.
> >>
> >>
> >> To sum up:
> >>
> >> 1) pci slot of each device must be saved in backend for the sake of
> >> migrations.  Should this be a slot in NIC, Disk objects or should it be
> >> saved as a tupple in serialized data?
> >>
> 
> Again, are you sure it can't just be taken from kvm on the source node
> before migration, instead of saving it?
> 

No you cannot. See above. It must be saved somewhere..

> >> 2) if we want disk hotplug, the format of runtime files should change. 
> >> block_devices
> >> should be treated separately, just like NICs. Migrating old instances 
> >> without first
> >> rebooting them still remains an issue.
> >>
> 
> Still not entirely sure here. But let's discuss the above point first.
> 

We definitely have to split disks from the cmd part. See above.

> >> 3) merge all rpc calls in one hotplug_device(device, action)
> 
> Ack.
> 
> >>
> >> 4) do not include disk resizing in first implementation.
> >>
> 
> Ack.
> 
> >> 5) add hv param for enabling hotplug or not
> 
> Ack.
> 
> >> 6) warnings in case of disk/nic live removal
> >>
> 
> Ack.
> 
> >> 7) warnings in case hotplug is not supported but --hotplug option is passed
> >>
> 
> Ack.
> 

OK with the ACKs

> >> 8) address Sasha's concerns:
> >>    * Parse kvm help output to see if hotplug is supported.
> >>    * Dig into the case where an instance does not run as a root.
> >>
> 
> Definitely.
> 
> >>
> > 9) address the uuid(or something)-to-kvm_device_name deterministic function.
> >
> 
> Yes, this one worries me a bit. I would be OK with "start-of-uuid" as
> long as the start of uuid is guaranteed (elsewhere) to be unique.
> 

The new design (see above) address it adequately. The deterministic function
can be <device type>-pci-<device slot>.


Looking forward to your comments.

Thanks,
dimara

Attachment: signature.asc
Description: Digital signature

Reply via email to