[Yahoo-eng-team] [Bug 1680918] Re: Nova upgrade fails if PCI devices of type-PF or type-PCI are present in the database
Reviewed:  https://review.openstack.org/456397
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7f3f0ef1fbb51f6f17d2c13840e0f98d17fa9093
Submitter: Jenkins
Branch:    master

commit 7f3f0ef1fbb51f6f17d2c13840e0f98d17fa9093
Author: Steven Webster
Date:   Wed Apr 5 09:05:07 2017 -0400

    Fix mitaka online migration for PCI devices

    Currently, a validation error is thrown if we find any PCI device
    records which have not populated the parent_addr column on a nova
    upgrade. However, the only PCI records for which a parent_addr
    makes sense for are those with a device type of 'type-VF' (ie. an
    SRIOV virtual function). PCI records with a device type of 'type-PF'
    or 'type-PCI' will not have a parent_addr. If any of those records
    are present on upgrade, the validation will fail.

    This change checks that the device type of the PCI record is
    'type-VF' when making sure the parent_addr has been correctly
    populated

    Closes-Bug: #1680918
    Change-Id: Ia7e773674a4976fc03deee3f08a6ddb45568ec11

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1680918

Title:
  Nova upgrade fails if PCI devices of type-PF or type-PCI are present
  in the database

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) newton series:
  New
Status in OpenStack Compute (nova) ocata series:
  New

Bug description:
  Description
  ===========
  If a Nova DB is upgraded (migrated) while containing PCI devices with
  device type 'type-PF' or 'type-PCI', a validation error similar to
  this will be thrown:

  "ValidationError: There are still 2 unmigrated records in the
  pci_devices table. Migration cannot continue until all records have
  been migrated."

  The error is generated by the
  330_enforce_mitaka_online_migrations.py upgrade script.

  The PCI device migration validation will fail if any PCI device
  entries without a populated parent_addr are found. However, the
  parent_addr really only applies to PCI device entries of 'type-VF'
  (ie. SRIOV virtual functions)

  This is an example of what the pci_devices table looks like with SRIOV
  enabled PCI devices if the appropriate entries are whitelisted in
  nova.conf:

  MariaDB [nova]> select * from pci_devices;
  +---------------------+---------------------+------------+---------+----+-----------------+----------+------------+-----------+----------+--------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
  | created_at          | updated_at          | deleted_at | deleted | id | compute_node_id | address  | product_id | vendor_id | dev_type | dev_id       | label           | status    | extra_info | instance_uuid | request_id | numa_node | parent_addr |
  +---------------------+---------------------+------------+---------+----+-----------------+----------+------------+-----------+----------+--------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  1 |               1 | :05:10.1 | 10ed       | 8086      | type-VF  | pci__05_10_1 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  2 |               1 | :05:10.3 | 10ed       | 8086      | type-VF  | pci__05_10_3 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  3 |               1 | :05:10.5 | 10ed       | 8086      | type-VF  | pci__05_10_5 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  4 |               1 | :05:10.7 | 10ed       | 8086      | type-VF  | pci__05_10_7 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:53:13 | NULL                | NULL       |       0 |  5 |               1 | :05:00.0 | 10fb       | 8086      | type-PF  | pci__05_00_0 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL        |
  | 2017-04-06 21:53:13 | NULL                | NULL       |       0 |  6 |               1 | :05:00.1 | 10fb       | 8086      | type-PF  | pci__05_00_1 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL        |
  +---------------------+---------------------+------------+---------+----+-----------------+----------+------------+-----------+----------+--------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
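
The difference between the old and the fixed validation can be illustrated with a small Python sketch. This is not the code from 330_enforce_mitaka_online_migrations.py: the function name count_unmigrated, the vf_only flag, and the use of plain dicts are hypothetical, and the assumption that soft-deleted rows are skipped is mine. The records reuse the id, dev_type, parent_addr and deleted values from the table above, with addresses copied as they appear in this archive.

```python
# Illustrative sketch only; not nova's actual migration code.
# Each record is a plain dict using column names from pci_devices.

def count_unmigrated(records, vf_only):
    """Count live records whose parent_addr has not been populated.

    vf_only=False models the check described in the bug report.
    vf_only=True models the fix: only 'type-VF' records are required
    to carry a parent_addr.
    """
    count = 0
    for rec in records:
        if rec["deleted"] != 0:
            continue  # assumption: soft-deleted rows are ignored
        if rec["parent_addr"] is not None:
            continue  # already migrated
        if vf_only and rec["dev_type"] != "type-VF":
            continue  # PFs and plain PCI devices have no parent
        count += 1
    return count


# Rows 1-4 are virtual functions, rows 5-6 are physical functions.
records = [
    {"id": 1, "dev_type": "type-VF", "parent_addr": ":05:00.1", "deleted": 0},
    {"id": 2, "dev_type": "type-VF", "parent_addr": ":05:00.1", "deleted": 0},
    {"id": 3, "dev_type": "type-VF", "parent_addr": ":05:00.1", "deleted": 0},
    {"id": 4, "dev_type": "type-VF", "parent_addr": ":05:00.1", "deleted": 0},
    {"id": 5, "dev_type": "type-PF", "parent_addr": None, "deleted": 0},
    {"id": 6, "dev_type": "type-PF", "parent_addr": None, "deleted": 0},
]

print(count_unmigrated(records, vf_only=False))  # 2
print(count_unmigrated(records, vf_only=True))   # 0
```

Without the device-type filter the two type-PF rows are counted, which matches the "2 unmigrated records" in the reported error; with the filter nothing is left to block the upgrade.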
[Yahoo-eng-team] [Bug 1680918] Re: Nova upgrade fails if PCI devices of type-PF or type-PCI are present in the database
** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Also affects: nova/newton
   Importance: Undecided
       Status: New

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova/newton
   Importance: Undecided => High

** Changed in: nova/ocata
   Importance: Undecided => High

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1680918

Title:
  Nova upgrade fails if PCI devices of type-PF or type-PCI are present
  in the database

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) newton series:
  New
Status in OpenStack Compute (nova) ocata series:
  New

Bug description:
  Description
  ===========
  If a Nova DB is upgraded (migrated) while containing PCI devices with
  device type 'type-PF' or 'type-PCI', a validation error similar to
  this will be thrown:

  "ValidationError: There are still 2 unmigrated records in the
  pci_devices table. Migration cannot continue until all records have
  been migrated."

  The error is generated by the
  330_enforce_mitaka_online_migrations.py upgrade script.

  The PCI device migration validation will fail if any PCI device
  entries without a populated parent_addr are found. However, the
  parent_addr really only applies to PCI device entries of 'type-VF'
  (ie. SRIOV virtual functions)

  This is an example of what the pci_devices table looks like with SRIOV
  enabled PCI devices if the appropriate entries are whitelisted in
  nova.conf:

  MariaDB [nova]> select * from pci_devices;
  +---------------------+---------------------+------------+---------+----+-----------------+----------+------------+-----------+----------+--------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
  | created_at          | updated_at          | deleted_at | deleted | id | compute_node_id | address  | product_id | vendor_id | dev_type | dev_id       | label           | status    | extra_info | instance_uuid | request_id | numa_node | parent_addr |
  +---------------------+---------------------+------------+---------+----+-----------------+----------+------------+-----------+----------+--------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  1 |               1 | :05:10.1 | 10ed       | 8086      | type-VF  | pci__05_10_1 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  2 |               1 | :05:10.3 | 10ed       | 8086      | type-VF  | pci__05_10_3 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  3 |               1 | :05:10.5 | 10ed       | 8086      | type-VF  | pci__05_10_5 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:01:21 | 2017-04-06 21:53:13 | NULL       |       0 |  4 |               1 | :05:10.7 | 10ed       | 8086      | type-VF  | pci__05_10_7 | label_8086_10ed | available | {}         | NULL          | NULL       |         0 | :05:00.1    |
  | 2017-04-06 21:53:13 | NULL                | NULL       |       0 |  5 |               1 | :05:00.0 | 10fb       | 8086      | type-PF  | pci__05_00_0 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL        |
  | 2017-04-06 21:53:13 | NULL                | NULL       |       0 |  6 |               1 | :05:00.1 | 10fb       | 8086      | type-PF  | pci__05_00_1 | label_8086_10fb | available | {}         | NULL          | NULL       |         0 | NULL        |
  +---------------------+---------------------+------------+---------+----+-----------------+----------+------------+-----------+----------+--------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+
  6 rows in set (0.00 sec)

  I think the upgrade script should be checking the PciDevice dev_type
  field for 'type-VF' when validating the parent_addr.

  Steps to reproduce
  ==================
  1. Install a Mitaka control node and edit the nova.conf file to include
     1 or more PCI devices in the pci_passthrough_whitelist. ie:

     pci_passthrough_whitelist = {"vendor_id": "8086", "product_id":"10fb"}

  2. Install a second Newton or newer control node and edit the nova.conf
     to point to the SQL database of the Mitaka node. ie:

     [database]
     connection = mysql+pymysql://root:supersecret@/nova?charset=utf8

     [api_database]
     connection =