Status: Accepted
Owner: [email protected]
Labels: Type-Defect Priority-High Milestone-Release2.8 Component-instances

New issue 633 by [email protected]: Migration of ext type instances fails due to missing physical_id
http://code.google.com/p/ganeti/issues/detail?id=633

What steps will reproduce the problem?
1. Add a second disk to a test instance using the ext disk template
2. Migrate the instance
3. The accept_instance step on the target node fails

Target node's noded-daemon log:

2013-12-03 14:25:02,853: ganeti-noded pid=1799 ERROR Error in RPC call
Traceback (most recent call last):
  File "/usr/share/ganeti/ganeti/server/noded.py", line 181, in HandleRequest
    result = (True, method(serializer.LoadJson(req.request_body)))
  File "/usr/share/ganeti/ganeti/server/noded.py", line 645, in perspective_accept_instance
    return backend.AcceptInstance(instance, info, target)
  File "/usr/share/ganeti/ganeti/backend.py", line 1537, in AcceptInstance
    _GatherAndLinkBlockDevs(instance)
  File "/usr/share/ganeti/ganeti/backend.py", line 1315, in _GatherAndLinkBlockDevs
    device = _RecursiveFindBD(disk)
  File "/usr/share/ganeti/ganeti/backend.py", line 2112, in _RecursiveFindBD
    return bdev.FindDevice(disk, children)
  File "/usr/share/ganeti/ganeti/bdev.py", line 3419, in FindDevice
    disk.params)
  File "/usr/share/ganeti/ganeti/bdev.py", line 2994, in __init__
    raise ValueError("Invalid configuration data %s" % str(unique_id))
  File "/usr/share/ganeti/ganeti/bdev.py", line 2994, in __init__
    raise ValueError("Invalid configuration data %s" % str(unique_id))
ValueError: Invalid configuration data None
2013-12-03 14:25:02,863: ganeti-noded pid=1799 INFO 192.168.2.2:51978 POST /accept_instance HTTP/1.1 200



What is the expected output? What do you see instead?

This happens because, when a disk is added in 2.8, SetDiskID() is not
invoked, so the corresponding disk entries inside config.data have no
physical_id. In the EXT case, before accepting the instance we invoke
_GatherAndLinkBlockDevs(). This takes `instance` as an argument, whose
disks lack a physical_id, and FindDevice() therefore fails with the
ValueError shown above.
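To illustrate the failure mode, here is a minimal, self-contained sketch (the class name ExtStorageDeviceSketch is hypothetical, not Ganeti's actual class): the ext block-device constructor validates unique_id, which is built from the disk's physical_id, and a disk that never went through SetDiskID() hands it None.

```python
class ExtStorageDeviceSketch:
  """Illustrative stand-in for the ext device class in bdev.py."""

  def __init__(self, unique_id):
    # Sketch of the validation that raises in the traceback above:
    # unique_id is expected to be a two-element tuple/list, but for a
    # disk without a physical_id it is None.
    if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
      raise ValueError("Invalid configuration data %s" % str(unique_id))
    self.unique_id = unique_id


# A disk added under 2.8 without SetDiskID() has physical_id = None:
try:
  ExtStorageDeviceSketch(None)
except ValueError as err:
  print(err)  # Invalid configuration data None
```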

As far as I know, physical_id is meant to be deprecated and will be
removed from the configuration file in future releases. So I propose
the following fix:

--- a/lib/cmdlib/instance_migration.py
+++ b/lib/cmdlib/instance_migration.py
@@ -709,6 +709,8 @@ class TLMigrateInstance(Tasklet):
       self._WaitUntilSync()

     self.feedback_fn("* preparing %s to accept the instance" % target_node)
+    for disk in instance.disks:
+      self.cfg.SetDiskID(disk, target_node)
     result = self.rpc.call_accept_instance(target_node,
                                            instance,
                                            migration_info,

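The loop in the patch relies on SetDiskID() filling in disk.physical_id for the target node before the accept_instance RPC is made. As a rough sketch of that behaviour (DiskSketch and set_disk_id are illustrative names, not Ganeti's implementation, which for DRBD also rewrites node references in the logical_id):

```python
class DiskSketch:
  """Illustrative disk object with the two relevant id fields."""

  def __init__(self, logical_id, children=None):
    self.logical_id = logical_id
    self.physical_id = None  # unset for disks added under 2.8
    self.children = children or []


def set_disk_id(disk, node):
  # Recurse into child disks first, then derive physical_id from
  # logical_id; the real SetDiskID() does extra work for DRBD.
  for child in disk.children:
    set_disk_id(child, node)
  disk.physical_id = disk.logical_id


disk = DiskSketch(("ext", "vol-0"))
set_disk_id(disk, "node2.example.com")
print(disk.physical_id)  # ('ext', 'vol-0')
```

With the loop in place, every disk handed to call_accept_instance carries a physical_id, so FindDevice() on the target node no longer receives None.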

Any comments on that?

