Re: [openstack-dev] [Openstack] [nova] [os-vif] [vif_plug_ovs] Support for OVS DB tcp socket communication.
On Wed, Jul 25, 2018 at 03:22:27PM +0530, pranab boruah wrote:
> Hello folks,
>
> I have filed a bug in os-vif: https://bugs.launchpad.net/os-vif/+bug/1778724 and am working on a patch. Any feedback/comments from you guys would be extremely helpful.
>
> Bug details:
>
> The OVS DB server has the feature of listening for connections on a TCP socket rather than just on the unix domain socket. [0]
>
> If the OVS DB server is listening on a TCP socket, then the ovs-vsctl commands should include the ovsdb_connection parameter:
> # ovs-vsctl --db=tcp:IP:PORT ...
> e.g.:
> # ovs-vsctl --db=tcp:169.254.1.1:6640 add-port br-int eth0
>
> Neutron supports running the ovs-vsctl commands with the ovsdb_connection parameter, which is configured in the openvswitch_agent.ini file. [1]
>
> While adding a vif to the OVS bridge (br-int), Nova (os-vif) invokes the ovs-vsctl command. Today, there is no support for passing the ovsdb_connection parameter in that invocation, and that support should be added. This would enhance the functionality of os-vif, since it would support the scenario where the OVS DB server is listening on a TCP socket, and it would put os-vif on functional parity with Neutron.
>
> [0] http://www.openvswitch.org/support/dist-docs/ovsdb-server.1.html
> [1] https://docs.openstack.org/neutron/pike/configuration/openvswitch-agent.html
>
> TIA,
> Pranab

Hello Pranab,

Makes sense to me. This is really related to the OVS plugin that we are maintaining. I guess you will have to add a new config option for it, as we have with 'network_device_mtu' and 'ovs_vsctl_timeout'. Don't hesitate to add me as a reviewer when the patch is ready.

Thanks,
s.
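For illustration, the change Pranab describes could be sketched roughly as follows. This is a hypothetical helper, not the actual os-vif code; the function name and the default timeout are made up, only the `--db=tcp:IP:PORT` form comes from the thread:

```python
# Hypothetical sketch (not the actual os-vif implementation): build the
# ovs-vsctl argument list, prepending --db=... when an ovsdb_connection
# such as "tcp:169.254.1.1:6640" is configured, as the bug report suggests.
def build_ovs_vsctl_cmd(args, ovsdb_connection=None, timeout=120):
    cmd = ['ovs-vsctl', '--timeout=%d' % timeout]
    if ovsdb_connection:
        # Same form as: ovs-vsctl --db=tcp:IP:PORT add-port br-int eth0
        cmd.append('--db=' + ovsdb_connection)
    return cmd + list(args)

print(build_ovs_vsctl_cmd(['add-port', 'br-int', 'eth0'],
                          ovsdb_connection='tcp:169.254.1.1:6640'))
```

When no ovsdb_connection is configured, the command falls back to the default unix domain socket, which preserves today's behaviour.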
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] about filter the flavor
On Mon, Jul 02, 2018 at 11:08:51AM +0800, Rambo wrote:
> Hi all,
>
> I have an idea. Currently we can't filter flavors according to their properties. Can we achieve this? If we did, we could filter flavors by a property's key and value. What do you think of the idea? Can you tell me more about this? Thank you very much.

Is that not the aim of AggregateTypeAffinityFilter and/or AggregateInstanceExtraSpecsFilter? Based on the flavor or the flavor's properties, instances can only be scheduled on a specific set of hosts.

https://git.openstack.org/cgit/openstack/nova/tree/nova/scheduler/filters/type_filter.py
https://git.openstack.org/cgit/openstack/nova/tree/nova/scheduler/filters/aggregate_instance_extra_specs.py

Thanks,
s.

> Best Regards
> Rambo
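The idea behind the extra-specs-based filter can be sketched in a few lines. This is a deliberately simplified illustration, not the real filter (the real AggregateInstanceExtraSpecsFilter also handles the scope prefix and comparison operators in the spec values):

```python
# Simplified sketch of the matching idea behind
# AggregateInstanceExtraSpecsFilter: a host passes only if every flavor
# extra spec matches the metadata of the host's aggregate.
def host_passes(flavor_extra_specs, aggregate_metadata):
    return all(
        aggregate_metadata.get(key) == value
        for key, value in flavor_extra_specs.items()
    )

specs = {'ssd': 'true'}
print(host_passes(specs, {'ssd': 'true'}))   # True: host is in an ssd aggregate
print(host_passes(specs, {'ssd': 'false'}))  # False: host is filtered out
```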
Re: [openstack-dev] [nova] NUMA-aware live migration: easy but incomplete vs complete but hard
On Thu, Jun 21, 2018 at 09:36:58AM -0400, Jay Pipes wrote:
> On 06/18/2018 10:16 AM, Artom Lifshitz wrote:
> > Hey all,
> >
> > For Rocky I'm trying to get live migration to work properly for instances that have a NUMA topology [1].
> >
> > A question that came up on one of the patches [2] is how to handle resource claims on the destination, or indeed whether to handle that at all.
> >
> > The previous attempt's approach [3] (call it A) was to use the resource tracker. This is race-free and the "correct" way to do it, but the code is pretty opaque and not easily reviewable, as evidenced by [3] sitting in review purgatory for literally years.
> >
> > A simpler approach (call it B) is to ignore resource claims entirely for now and wait for NUMA in placement to land in order to handle it that way. This is obviously race-prone and not the "correct" way of doing it, but the code would be relatively easy to review.
> >
> > For the longest time, live migration did not keep track of resources (until it started updating placement allocations). The message to operators was essentially "we're giving you this massive hammer, don't break your fingers." Continuing to ignore resource claims for now is just maintaining the status quo. In addition, there is value in improving NUMA live migration *now*, even if the improvement is incomplete because it's missing resource claims. "Best is the enemy of good" and all that. Finally, making use of the resource tracker is just work that we know will get thrown out once we start using placement for NUMA resources.
> >
> > For all those reasons, I would favor approach B, but I wanted to ask the community for their thoughts.
>
> Side question... does either approach touch PCI device management during live migration?
> I ask because the only workloads I've ever seen that pin guest vCPU threads to specific host processors -- or make use of huge pages consumed from a specific host NUMA node -- have also made use of SR-IOV and/or PCI passthrough. [1]

Not really. There are a lot of virtual switches that we support, like OVS-DPDK, Contrail Virtual Router..., that provide vhostuser interfaces, which is one such use-case. (We do support live-migration of vhostuser interfaces.)

> If workloads that use PCI passthrough or SR-IOV VFs cannot be live migrated (due to existing complications in the lower-level virt layers) I don't see much of a point spending lots of developer resources trying to "fix" this situation when in the real world, only a mythical workload that uses CPU pinning or huge pages but *doesn't* use PCI passthrough or SR-IOV VFs would be helped by it.
>
> Best,
> -jay
>
> [1] I know I'm only one person, but every workload I've seen that requires pinned CPUs and/or huge pages is a VNF that is essentially an ASIC that a telco OEM/vendor has converted into software, and it requires the same guarantees that the ASIC and custom hardware gave the original hardware-based workload. These VNFs, every single one of them, used either PCI passthrough or SR-IOV VFs to handle latency-sensitive network I/O.
Re: [openstack-dev] [nova] NUMA-aware live migration: easy but incomplete vs complete but hard
On Mon, Jun 18, 2018 at 10:16:05AM -0400, Artom Lifshitz wrote:
> Hey all,
>
> For Rocky I'm trying to get live migration to work properly for instances that have a NUMA topology [1].
>
> A question that came up on one of the patches [2] is how to handle resource claims on the destination, or indeed whether to handle that at all.
>
> The previous attempt's approach [3] (call it A) was to use the resource tracker. This is race-free and the "correct" way to do it, but the code is pretty opaque and not easily reviewable, as evidenced by [3] sitting in review purgatory for literally years.
>
> A simpler approach (call it B) is to ignore resource claims entirely for now and wait for NUMA in placement to land in order to handle it that way. This is obviously race-prone and not the "correct" way of doing it, but the code would be relatively easy to review.

Hello Artom,

The problem I have with approach B is that it is based on something that was not designed for this, and it will end up with the same bugs that you are trying to solve (1417667, 1289064). Live migration is a sensitive operation that operators need to be able to trust; if we take the case of a host evacuation, the result would be terrible, no?

If you want to continue with B, I think you will at least have to find a mechanism to update the host NUMA topology resources of the destination during ongoing migrations. But again, that should be done early, to avoid too big a window during which another instance can be scheduled and assigned the same CPU topology. Also, does this really make sense when we know that at some point placement will take care of such things for NUMA resources?

The A approach already handles what you need:

- Test whether the destination host can accept the guest CPU policy
- Build the new instance NUMA topology based on the destination host
- Hold and update the NUMA topology resources of the destination host
- Store the destination host NUMA topology so it can be used by the source
...
My preference is A because it reuses something that is used for every guest scheduled today (not only for PCI or NUMA things), we have trust in it, it is also used for some move operations, it limits the race window to one we already have, and finally it limits the amount of code introduced.

Thanks,
s.

> For the longest time, live migration did not keep track of resources (until it started updating placement allocations). The message to operators was essentially "we're giving you this massive hammer, don't break your fingers." Continuing to ignore resource claims for now is just maintaining the status quo. In addition, there is value in improving NUMA live migration *now*, even if the improvement is incomplete because it's missing resource claims. "Best is the enemy of good" and all that. Finally, making use of the resource tracker is just work that we know will get thrown out once we start using placement for NUMA resources.
>
> For all those reasons, I would favor approach B, but I wanted to ask the community for their thoughts.
>
> Thanks!
>
> [1] https://review.openstack.org/#/q/topic:bp/numa-aware-live-migration+(status:open+OR+status:merged)
> [2] https://review.openstack.org/#/c/567242/
> [3] https://review.openstack.org/#/c/244489/
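The resource-tracker style of approach A, with its "hold and update" step, can be thought of as a claim that reserves destination pCPUs up front and gives them back on failure. The sketch below is purely illustrative (nothing here is Nova code; the class name and host model are invented), but it shows why claiming early closes the race window:

```python
# Illustrative sketch only (not Nova code): an approach-A style claim
# reserves destination pCPUs immediately, so no other instance can be
# scheduled onto them during the migration, and releases them on abort.
class NUMAClaim:
    def __init__(self, free_pcpus, wanted):
        if wanted > len(free_pcpus):
            raise ValueError('destination cannot fit the guest topology')
        # Reserve immediately, closing the race window early.
        self.pinned = [free_pcpus.pop() for _ in range(wanted)]
        self.free_pcpus = free_pcpus

    def abort(self):
        # Migration failed: give the pCPUs back to the destination host.
        self.free_pcpus.extend(self.pinned)
        self.pinned = []

host_free = [0, 1, 2, 3]
claim = NUMAClaim(host_free, wanted=2)
print(sorted(claim.pinned), sorted(host_free))  # [2, 3] [0, 1]
claim.abort()
print(sorted(host_free))  # [0, 1, 2, 3]
```

Approach B, by contrast, would leave `host_free` untouched until the migration completes, which is exactly the window in which a second guest can be pinned to the same pCPUs.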
Re: [openstack-dev] [nova] increasing the number of allowed volumes attached per instance > 26
On Fri, Jun 08, 2018 at 11:35:45AM +0200, Kashyap Chamarthy wrote:
> On Thu, Jun 07, 2018 at 01:07:48PM -0500, Matt Riedemann wrote:
> > On 6/7/2018 12:56 PM, melanie witt wrote:
> > > Recently, we've received interest about increasing the maximum number of allowed volumes to attach to a single instance > 26. The limit of 26 is because of a historical limitation in libvirt (if I remember correctly) and is no longer limited at the libvirt level in the present day. So, we're looking at providing a way to attach more than 26 volumes to a single instance and we want your feedback.
> >
> > The 26 volumes thing is a libvirt driver restriction.
>
> The original limitation of 26 disks was because at that time there was no 'virtio-scsi'.
>
> (With 'virtio-scsi', each controller allows up to 256 targets, and each target can use any LUN (Logical Unit Number) from 0 to 16383 (inclusive). Therefore, the maximum allowable number of disks on a single 'virtio-scsi' controller is 256 * 16384 == 4194304.) Source[1].

Not totally true for Nova. Nova configures one virtio-scsi controller per guest and plugs all the volumes into one target, so in theory that would be 16384 LUNs (only). But you made a good point: the 26-volume thing is not a libvirt driver restriction. For example, the native QEMU SCSI implementation handles 256 disks.

About the virtio-blk limitation, I made the same finding, but Tsuyoshi Nagata shared an interesting point: virtio-blk is no longer limited by the number of available PCI slots, at least with recent kernel and QEMU versions [0]. I could join what you are suggesting at the bottom and fix the limit at 256 disks.

[0] https://review.openstack.org/#/c/567472/16/nova/virt/libvirt/blockinfo.py@162

> [...]
>
> > > Some ideas that have been discussed so far include:
> > >
> > > A) Selecting a new, higher maximum that still yields reasonable performance on a single compute host (64 or 128, for example).
> > > Pros: helps prevent the potential for poor performance on a compute host from attaching too many volumes. Cons: doesn't let anyone opt in to a higher maximum if their environment can handle it.
>
> Option (A) can still be considered: we can limit it to 256 disks. Why?
>
> FWIW, I did some digging here:
>
> The upstream libguestfs project, after some thorough testing, arrived at a limit of 256 disks, and suggests the same for Nova. And if anyone wants to increase that limit, the proposer should come up with a fully worked-through test plan. :-) (Try doing any meaningful I/O to so many disks at once, and see how well that works out.)
>
> What's more, libguestfs upstream tests 256 disks, and even _that_ fails sometimes:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1478201 -- "kernel runs out of memory with 256 virtio-scsi disks"
>
> The above bug is fixed now in kernel-4.17.0-0.rc3.git1.2. (It also required a corresponding fix in QEMU[2], which is available from version v2.11.0 onwards.)
>
> [...]
>
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg02823.html -- virtio-scsi limits
> [2] https://git.qemu.org/?p=qemu.git;a=commit;h=5c0919d
>
> --
> /kashyap
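The virtio-scsi arithmetic quoted in this thread can be checked directly:

```python
# Quick check of the virtio-scsi limits quoted above: 256 targets per
# controller, each target addressing LUNs 0..16383 inclusive.
targets_per_controller = 256
luns_per_target = 16384
print(targets_per_controller * luns_per_target)  # 4194304

# Nova's case as described in the reply: one controller with all volumes
# plugged into a single target gives only one target's worth of LUNs.
print(luns_per_target)  # 16384
```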
Re: [openstack-dev] openstack-dev] [nova] Cannot live migrattion, because error:libvirtError: the CPU is incompatible with host CPU: Host CPU does not provide required features: cmt, mbm_total, mbm_lo
On Mon, May 14, 2018 at 11:23:51AM +0800, 何健乐 wrote:
> Hi, all
> When I did live-migration, I met the following error:
> result = proxy_call(self._autowrap, f, *args, **kwargs)
> May 14 10:33:11 nova-compute[981335]: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
> May 14 10:33:11 nova-compute[981335]: rv = execute(f, *args, **kwargs)
> May 14 10:33:11 nova-compute[981335]: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
> May 14 10:33:11 nova-compute[981335]: six.reraise(c, e, tb)
> May 14 10:33:11 nova-compute[981335]: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
> May 14 10:33:11 nova-compute[981335]: rv = meth(*args, **kwargs)
> May 14 10:33:11 nova-compute[981335]: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
> May 14 10:33:11 nova-compute[981335]: if ret == -1: raise libvirtError('virDomainMigrateToURI3() failed', dom=self)
> May 14 10:33:11 nova-compute[981335]: libvirtError: the CPU is incompatible with host CPU: Host CPU does not provide required features: cmt, mbm_total, mbm_local
> Is there anyone that has a solution for this problem?
>
> Thanks

This could be because you are running an older libvirt version on the destination node, one which does not know anything about the cache or memory bandwidth monitoring features from Intel. Upgrading your libvirt version should resolve the issue.

Or you are effectively trying to live-migrate a host-model domain to a destination node that does not support those features. To resolve that, you should update your nova.conf to use a CPU model for your guests that is compatible with both of your hosts. In nova.conf, under the [libvirt] section:

cpu_mode=custom
cpu_model=Haswell

Then you should restart the nova-compute service and reboot --force the instance so it takes the new CPU configuration into account.

s.
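For context, the cpu_mode/cpu_model settings above end up as a `<cpu>` element in the guest domain XML. The sketch below renders a plausible element to show the difference between the two modes; the exact attributes (`match='exact'`, `fallback='allow'`) are an assumption for illustration rather than a verbatim copy of Nova's output:

```python
# Rough sketch (assumed XML details, not Nova's actual generator) of the
# libvirt <cpu> element that the cpu_mode/cpu_model settings map to.
def cpu_element(cpu_mode, cpu_model=None):
    if cpu_mode == 'custom':
        # A fixed named model is what makes both hosts compatible.
        return ("<cpu mode='custom' match='exact'>"
                "<model fallback='allow'>%s</model></cpu>" % cpu_model)
    # host-model mirrors the source host's CPU, which is what made the
    # migration fail when the destination lacked cmt/mbm features.
    return "<cpu mode='%s'/>" % cpu_mode

print(cpu_element('custom', 'Haswell'))
```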
[openstack-dev] [neutron][nova] live-migration broken after update of OVS/DPDK
We have an issue with live-migration when operators update OVS from a version that does not support dpdkvhostuserclient to a version that supports it, basically from OVS 2.6 to OVS 2.7 or later.

The problem is that, for the libvirt driver, all the instances created with vhu interfaces in server mode (OVS 2.6) won't be able to live-migrate anymore. That is because Neutron selects which vhu mode to use by looking at the OVS capabilities [0]. During the live-migration the port details are going to be updated, but Nova, and in particular the libvirt driver, does not update the guest's domain XML to reflect the changes.

- We can fix Neutron by making it always use the same vhu mode if the port already exists.
- We can enhance Nova, and in particular the libvirt driver, to update the guest's domain XML during live-migration. The benefit is that the instances get updated for free to use vhu in client mode, which is better overall, but it's probably not so trivial to implement.
- We can avoid fixing it, meaning that operators will have to update their instances to use vhu client mode by some means such as snapshot/rebuild; then live-migration will be possible.

[0] https://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2/drivers/openvswitch/mech_driver/mech_openvswitch.py#n94
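The first option could be sketched as below. The function and its semantics are hypothetical (this is not the actual Neutron mech driver code); the point is simply that an existing port's recorded mode wins over what the OVS capabilities would suggest:

```python
# Hypothetical sketch of the first proposed fix (names invented): choose
# the vhu mode from OVS capabilities for brand-new ports, but keep the
# recorded mode for ports that already exist, so a guest created against
# OVS 2.6 keeps its server-mode interface after the OVS upgrade.
def select_vhu_mode(ovs_capabilities, existing_port_mode=None):
    if existing_port_mode is not None:
        return existing_port_mode
    if 'dpdkvhostuserclient' in ovs_capabilities:
        return 'client'
    return 'server'

caps_ovs27 = {'dpdkvhostuser', 'dpdkvhostuserclient'}
print(select_vhu_mode(caps_ovs27))                               # client
print(select_vhu_mode(caps_ovs27, existing_port_mode='server'))  # server
```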
Re: [openstack-dev] [os-vif] [nova] Changes to os-vif cores
On Tue, Oct 24, 2017 at 03:32:15PM +0100, Stephen Finucane wrote:
> Hey,
>
> I'm not actually sure what the protocol is for adding/removing cores on a library project without a PTL, so I'm just going to put this out there: I'd like to propose the following changes to the os-vif core team.
>
> - Add 'nova-core'
>
> os-vif makes extensive use of objects and we've had a few hiccups around versioning and the likes recently [1][2]. I'd like the expertise of some of the other nova cores here as we roll this out to projects other than nova, and I trust those not interested/knowledgeable in this area to stay away :)
>
> - Remove Russell Bryant, Maxime Leroy
>
> These folks haven't been active on os-vif [3][4] for a long time and I think they can be safely removed.

Indeed, they are not active. Seems reasonable to me.

> To the existing core team members, please respond with a yay/nay and we'll wait a week before doing anything.
>
> Cheers,
> Stephen
>
> [1] https://review.openstack.org/#/c/508498/
> [2] https://review.openstack.org/#/c/509107/
> [3] https://review.openstack.org/#/q/reviewedby:%22Russell+Bryant+%253Crbryant%2540redhat.com%253E%22+project:openstack/os-vif
> [4] https://review.openstack.org/#/q/reviewedby:%22Maxime+Leroy+%253Cmaxime.leroy%25406wind.com%253E%22+project:openstack/os-vif
[openstack-dev] [nova] overhead_pin_set option
Some workloads require the hypervisor overhead to be isolated from the set of pCPUs running guest vCPU threads. For the libvirt driver we introduced emulator-thread placement, which provides an option to reserve an additional host CPU per guest to pin the emulator threads on [0].

To extend that flexibility and address use-cases where host resources are limited, we are introducing an 'overhead_pin_set' option on compute nodes. Operators will have the ability to reserve host CPUs for hypervisor overhead.

For the libvirt driver, we are extending the flavor property hw:emulator_threads_policy to accept a 'host' value, meaning that guests configured with hw:emulator_threads_policy=host will have their emulator threads running on the set of pCPUs configured with the 'overhead_pin_set' option.

The blueprint is at [1], the patches at [2], and the spec update at [3].

[0] https://specs.openstack.org/openstack/nova-specs/specs/pike/implemented/libvirt-emulator-threads-policy.html
[1] https://blueprints.launchpad.net/nova/+spec/overhead-pin-set
[2] https://review.openstack.org/#/c/510897/
[3] https://review.openstack.org/#/c/511188/
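An option like 'overhead_pin_set' would take a CPU-set string. A minimal sketch of parsing such a value (simplified; Nova's real CPU-set parser for options like vcpu_pin_set also supports a `^N` exclusion syntax, which is omitted here):

```python
# Sketch of parsing a CPU-set string such as overhead_pin_set = "0-1,8"
# into the set of host CPU ids reserved for hypervisor overhead.
def parse_cpu_set(spec):
    cpus = set()
    for chunk in spec.split(','):
        chunk = chunk.strip()
        if '-' in chunk:
            lo, hi = chunk.split('-')
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(chunk))
    return cpus

print(sorted(parse_cpu_set('0-1,8')))  # [0, 1, 8]
```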
Re: [openstack-dev] vGPUs support for Nova - Implementation
On Fri, Sep 29, 2017 at 04:51:10PM +, Bob Ball wrote:
> Hi Sahid,
>
> > > a second device emulator along-side QEMU. There is no mdev integration. I'm concerned about how much mdev-specific functionality would have to be faked up in the XenServer-specific driver for vGPU to be used in this way.
> >
> > What you are referring to with your DEMU is what QEMU/KVM has with its vfio-pci. XenServer is reading through MDEV since the vendors provide drivers on *Linux* using the MDEV framework. MDEV is a kernel layer, used to expose hardware; it's not hypervisor-specific.
>
> It is possible that the vendor's userspace libraries use mdev, however DEMU has no concept of mdev at all. If the vendor's userspace libraries do use mdev then this is entirely abstracted from XenServer's integration. While I don't have access to the vendor's source for the userspace libraries or the kernel module, my understanding was that the kernel module in XenServer's integration is there for the userspace libraries to talk to via IOCTLs. My reading of mdev implies that /sys/class/mdev_bus should exist for it to be used? It does not exist in XenServer, which to me implies that the vendor's drivers for XenServer do not use mdev?

I shared our discussion with Alex Williamson; his response:

> Hi Sahid,
>
> XenServer does not use mdev for vGPU support. The mdev/vfio infrastructure was developed in response to DEMU used on XenServer, which we felt was not an upstream-acceptable solution. There has been cursory interest in porting vfio to Xen, so it's possible that they might use the same mechanism some day, but for now they are different solutions, the vfio/mdev solution being the only one accepted upstream so far. Thanks,
>
> Alex

It's my mistake. It seems clear now that XenServer can't take advantage of the mdev support I have added in the /pci module.
The support of vGPUs for Xen will have to wait for the generic device management, I guess.

> Bob
Re: [openstack-dev] vGPUs support for Nova - Implementation
On Fri, Sep 29, 2017 at 11:16:43AM -0400, Jay Pipes wrote:
> Hi Sahid, comments inline. :)
>
> On 09/29/2017 04:53 AM, Sahid Orentino Ferdjaoui wrote:
> > On Thu, Sep 28, 2017 at 05:06:16PM -0400, Jay Pipes wrote:
> > > On 09/28/2017 11:37 AM, Sahid Orentino Ferdjaoui wrote:
> > > > Please consider the support of MDEV for the /pci framework which provides support for vGPUs [0].
> > > >
> > > > Accordingly to the discussion [1]
> > > >
> > > > With this first implementation which could be used as a skeleton for implementing PCI Devices in Resource Tracker
> > >
> > > I'm not entirely sure what you're referring to above as "implementing PCI devices in Resource Tracker". Could you elaborate? The resource tracker already embeds a PciManager object that manages PCI devices, as you know. Perhaps you meant "implement PCI devices as Resource Providers"?
> >
> > A PciManager? I know that we have a field PCI_DEVICE :) - I guess a virt driver can return an inventory with the total of PCI devices. Talking about a manager, I'm not sure.
>
> I'm referring to this:
>
> https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L33
>
> [SNIP]
>
> It is that piece that Eric and myself have been talking about standardizing into a "generic device management" interface that would have an update_inventory() method that accepts a ProviderTree object [1]

Jay, all of that looks perfectly sane to me, even if it's not clear what you want to make so generic. That part of the code is for the virt layers, and you can't just treat GPU or NET devices as generic pieces; they have characteristics which are requirements for the virt layers.

In that 'update_inventory(provider_tree)' method which you are going to introduce for /pci/PciManager, a first step would be to convert the objects into a dict understandable by the whole logic, right, or do you have another plan? In any case, from my POV I don't see any blocker; both pieces of work can co-exist without any pain.
And adding features to the current /pci module is not going to add heavy work, but it is going to give us a clear view of what is needed.

> [1] https://github.com/openstack/nova/blob/master/nova/compute/provider_tree.py
>
> and would add resource providers corresponding to devices that are made available to guests for use.
>
> > You still have to define "traits": basically, for physical network devices, the users want to select a device according to the physical network, to select a device according to its placement on the host (NUMA), to select a device according to its bandwidth capability... For GPUs it's the same story. *And I have not mentioned devices which support virtual functions.*
>
> Yes, the generic device manager would be responsible for associating traits to the resource providers it adds to the ProviderTree provided to it in the update_inventory() call.
>
> > So that is what you plan to do for this release :) - Reasonably, I don't think we are close to having something ready for production.
>
> I don't disagree with you that this is a huge amount of refactoring to undertake over the next couple releases. :)

Yes, and that is the point. We are going to block the work on the /pci module during a period where we can see large interest in such support.

> > Jay, I have a question: why don't you start by exposing NUMA?
>
> I believe you're asking here why we don't start by modeling NUMA nodes as child resource providers of the compute node? Instead of starting by modeling PCI devices as child providers of the compute node? If that's not what you're asking, please do clarify...
>
> We're starting with modeling PCI devices as child providers of the compute node because they are easier to deal with as a whole than NUMA nodes and we have the potential of being able to remove the PciPassthroughFilter from the scheduler in Queens.
> I don't see us being able to remove the NUMATopologyFilter from the scheduler in Queens because of the complexity involved in how coupled the NUMA topology resource handling is to CPU pinning, huge page support, and IO emulation thread pinning.
>
> Hope that answers that question; again, lemme know if that's not the question you were asking! :)

Yes, that was the question, and you responded to it perfectly, thanks. I will try to be clearer in the future :)

As you have noticed, the support of NUMA will be quite difficult and it is not in the TODO right now, which leads me to think that we are going to b
Re: [openstack-dev] vGPUs support for Nova - Implementation
On Fri, Sep 29, 2017 at 12:26:07PM +, Bob Ball wrote:
> Hi Sahid,
>
> > Please consider the support of MDEV for the /pci framework which provides support for vGPUs [0].
>
> From my understanding, this MDEV implementation for vGPU would be entirely specific to libvirt, is that correct?

No, but Linux-specific, yes. Windows supports SR-IOV.

> XenServer's implementation for vGPU is based on a pooled device model (as described in http://lists.openstack.org/pipermail/openstack-dev/2017-September/122702.html)

That thread refers to something which I guess everyone understands now; it's basically why I added support for MDEV in /pci: to make it work however the virtual devices are exposed, SR-IOV or MDEV.

> a second device emulator along-side QEMU. There is no mdev integration. I'm concerned about how much mdev-specific functionality would have to be faked up in the XenServer-specific driver for vGPU to be used in this way.

What you are referring to with your DEMU is what QEMU/KVM has with its vfio-pci. XenServer is reading through MDEV since the vendors provide drivers on *Linux* using the MDEV framework. MDEV is a kernel layer, used to expose hardware; it's not hypervisor-specific.

> I'm not familiar with mdev, but it looks Linux-specific, so would not be usable by Hyper-V?
> I've also not been able to find suggestions that VMWare can make use of mdev, although I don't know the architecture of VMWare's integration.
>
> The concepts of PCI and SR-IOV are, of course, generic, but I think out of principle we should avoid a hypervisor-specific integration for vGPU (indeed Citrix has been clear from the beginning that the vGPU integration we are proposing is intentionally hypervisor-agnostic).
> I also think there is value in exposing vGPU in a generic way, irrespective of the underlying implementation (whether it is DEMU, mdev, SR-IOV or whatever approach Hyper-V/VMWare use).
> It's quite difficult for me to see how this will work for other hypervisors. Do you also have a draft alternate spec where more details can be discussed?

I would expect XenServer to provide the MDEV UUID; then it's easy to ask sysfs if you need to get the NUMA node of the physical device or the mdev_type.

> Bob
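The sysfs lookup suggested above could look roughly like this. The file layout used here (a flat `devices/<uuid>/mdev_type` and `numa_node` file) is a simplifying assumption for illustration; on real systems mdevs live under /sys/bus/mdev/devices/ with mdev_type and the parent PCI device exposed as symlinks. The sysfs root is parameterized so the sketch can run anywhere:

```python
# Hedged sketch: given an mdev UUID, read the mdev_type and NUMA node
# from a sysfs-like tree. Path layout is an assumption for illustration,
# not the exact kernel sysfs ABI.
import os

def mdev_info(uuid, sysfs_root='/sys/bus/mdev'):
    dev = os.path.join(sysfs_root, 'devices', uuid)
    with open(os.path.join(dev, 'mdev_type')) as f:
        mdev_type = f.read().strip()
    with open(os.path.join(dev, 'numa_node')) as f:
        numa_node = int(f.read().strip())
    return mdev_type, numa_node
```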
Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM
On Thu, Sep 28, 2017 at 11:10:38PM +0200, Premysl Kouril wrote:
> > Only the memory mapped for the guest is strictly allocated from the selected NUMA node. The QEMU overhead should float on the host NUMA nodes. So it seems that "reserved_host_memory_mb" is enough.
>
> Even if that were true and the overhead memory could float across NUMA nodes, it generally doesn't prevent us from running into OOM troubles. No matter where (in which NUMA node) the overhead memory gets allocated, it is not included in the available-memory calculation for that NUMA node when provisioning a new instance, and thus it can cause OOM (once the guest operating system of the newly provisioned instance actually starts allocating memory, which can only be allocated from its assigned NUMA node).

That is why you need to use Huge Pages: the memory will be reserved and locked for the guest.

> Prema
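Prema's scenario can be illustrated with toy numbers (the figures below are invented for illustration, not measurements): the scheduler's per-node accounting sees only the guest memory, so unaccounted QEMU overhead landing on the same node can push it past capacity:

```python
# Toy numbers illustrating the OOM scenario described in the thread: a
# NUMA node's free memory is computed from guest allocations only, so
# QEMU overhead that lands on the same node can overcommit it.
node_capacity_mb = 16384
guest_mem_mb = [8192, 8192]   # what the scheduler accounts on this node
qemu_overhead_mb = 512        # unaccounted, may land on this node too

accounted = sum(guest_mem_mb)
print(accounted <= node_capacity_mb)                     # True: scheduler is happy
print(accounted + qemu_overhead_mb <= node_capacity_mb)  # False: node can OOM
```

With huge pages, the guest memory is reserved and locked up front, so the overhead cannot silently eat into the pages backing the guest.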
Re: [openstack-dev] vGPUs support for Nova - Implementation
On Thu, Sep 28, 2017 at 05:06:16PM -0400, Jay Pipes wrote: > On 09/28/2017 11:37 AM, Sahid Orentino Ferdjaoui wrote: > > Please consider the support of MDEV for the /pci framework which > > provides support for vGPUs [0]. > > > > According to the discussion [1] > > > > With this first implementation which could be used as a skeleton for > > implementing PCI Devices in Resource Tracker > > I'm not entirely sure what you're referring to above as "implementing PCI > devices in Resource Tracker". Could you elaborate? The resource tracker > already embeds a PciManager object that manages PCI devices, as you know. > Perhaps you meant "implement PCI devices as Resource Providers"? A PciManager? I know that we have a field PCI_DEVICE :) - I guess a virt driver can return an inventory with the total of PCI devices. As for a manager, I'm not sure. You still have to define "traits": basically, for physical network devices, users want to select a device according to its physical network, according to its placement on the host (NUMA), according to its bandwidth capability... For GPUs it's the same story. *And I have not even mentioned devices which support virtual functions.* So that is what you plan to do for this release :) - Realistically, I don't think we are close to having something ready for production. Jay, I have a question: why don't you start by exposing NUMA? > > we provide support for > > attaching vGPUs to guests. And also to provide affinity per NUMA > > node. Another important point is that this implementation can take > > advantage of the ongoing specs like PCI NUMA policies.
> > > > * The Implementation [0] > > > > [PATCH 01/13] pci: update PciDevice object field 'address' to accept > > [PATCH 02/13] pci: add for PciDevice object new field mdev > > [PATCH 03/13] pci: generalize object unit-tests for different > > [PATCH 04/13] pci: add support for mdev device type request > > [PATCH 05/13] pci: generalize stats unit-tests for different > > [PATCH 06/13] pci: add support for mdev devices type devspec > > [PATCH 07/13] pci: add support for resource pool stats of mdev > > [PATCH 08/13] pci: make manager to accept handling mdev devices > > > > In this series of patches we generalize the PCI framework to > > handle MDEV devices. We admit it's a lot of patches, but most of them > > are small and the logic behind them is basically to make the framework understand two > > new fields, MDEV_PF and MDEV_VF. > > That's not really "generalizing the PCI framework to handle MDEV devices" :) > More like it's just changing the /pci module to understand a different > device management API, but ok. If you prefer to call it that :) - The point is that /pci manages physical devices; it can pass through the whole device or its virtual functions exposed through SR-IOV or MDEV. > > [PATCH 09/13] libvirt: update PCI node device to report mdev devices > > [PATCH 10/13] libvirt: report mdev resources > > [PATCH 11/13] libvirt: add support to start vm with using mdev (vGPU) > > > > In this series of patches we make the libvirt driver, as usual, > > return resources and attach the devices returned by the pci manager. This > > part can be reused for Resource Provider. > > Perhaps, but the idea behind the resource providers framework is to treat > devices as generic things. Placement doesn't need to know about the > particular device attachment status. > > > [PATCH 12/13] functional: rework fakelibvirt host pci devices > > [PATCH 13/13] libvirt: resuse SRIOV funtional tests for MDEV devices > > > > Here we reuse 100% of the functional tests used for SR-IOV > > devices.
Again here, this part can be reused for Resource Provider. > > Probably not, but I'll take a look :) > > For the record, I have zero confidence in any existing "functional" tests > for NUMA, SR-IOV, CPU pinning, huge pages, and the like. Unfortunately, this is due > to the fact that these features often require hardware that either the > upstream community CI lacks or that depends on libraries, drivers and kernel > versions that really aren't available to non-bleeding-edge users (or users > with very deep pockets). That's a good point; if you are not confident, don't you think it's premature to move forward on implementing something new without having well-trusted functional tests? > > * The Usage > > > > There is no difference between SR-IOV and MDEV from the operator's point > > of view: anyone who knows how to expose SR-IOV devices in Nova already > > knows how to expose MDEV devices (vGPUs). >
[openstack-dev] vGPUs support for Nova - Implementation
Please consider the support of MDEV for the /pci framework which provides support for vGPUs [0]. According to the discussion [1] With this first implementation, which could be used as a skeleton for implementing PCI Devices in Resource Tracker, we provide support for attaching vGPUs to guests. And also to provide affinity per NUMA node. Another important point is that this implementation can take advantage of the ongoing specs like PCI NUMA policies. * The Implementation [0] [PATCH 01/13] pci: update PciDevice object field 'address' to accept [PATCH 02/13] pci: add for PciDevice object new field mdev [PATCH 03/13] pci: generalize object unit-tests for different [PATCH 04/13] pci: add support for mdev device type request [PATCH 05/13] pci: generalize stats unit-tests for different [PATCH 06/13] pci: add support for mdev devices type devspec [PATCH 07/13] pci: add support for resource pool stats of mdev [PATCH 08/13] pci: make manager to accept handling mdev devices In this series of patches we generalize the PCI framework to handle MDEV devices. We admit it's a lot of patches, but most of them are small and the logic behind them is basically to make the framework understand two new fields, MDEV_PF and MDEV_VF. [PATCH 09/13] libvirt: update PCI node device to report mdev devices [PATCH 10/13] libvirt: report mdev resources [PATCH 11/13] libvirt: add support to start vm with using mdev (vGPU) In this series of patches we make the libvirt driver, as usual, return resources and attach the devices returned by the pci manager. This part can be reused for Resource Provider. [PATCH 12/13] functional: rework fakelibvirt host pci devices [PATCH 13/13] libvirt: resuse SRIOV funtional tests for MDEV devices Here we reuse 100% of the functional tests used for SR-IOV devices. Again here, this part can be reused for Resource Provider.
* The Usage There is no difference between SR-IOV and MDEV from the operator's point of view: anyone who knows how to expose SR-IOV devices in Nova already knows how to expose MDEV devices (vGPUs). Operators will be able to expose MDEV devices in the same manner as they expose SR-IOV: 1/ Configure whitelist devices ['{"vendor_id":"10de"}'] 2/ Create aliases [{"vendor_id":"10de", "name":"vGPU"}] 3/ Configure the flavor openstack flavor set --property "pci_passthrough:alias"="vGPU:1" * Limitations mdev does not provide a 'product_id' but an 'mdev_type', which should be considered to exactly identify which resource users can request, e.g. nvidia-10. To provide that support we have to add a new field 'mdev_type', so aliases could be something like: {"vendor_id":"10de", "mdev_type":"nvidia-10", "name":"alias-nvidia-10"} {"vendor_id":"10de", "mdev_type":"nvidia-11", "name":"alias-nvidia-11"} I do plan to add that, but first I need support from upstream to continue this work. [0] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:pci-mdev-support [1] http://lists.openstack.org/pipermail/openstack-dev/2017-September/122591.html
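To illustrate the proposed 'mdev_type' alias field, here is a minimal sketch in plain Python (the `alias_matches` helper is hypothetical, not Nova's actual matching code; the field names mirror the JSON above):

```python
def alias_matches(device, alias):
    """Return True if the device satisfies every property the alias
    specifies. 'mdev_type' is the proposed new field; 'vendor_id' already
    exists in Nova's alias format. 'name' only labels the alias, so it is
    not matched against the device.
    """
    return all(device.get(key) == value
               for key, value in alias.items()
               if key != "name")

alias = {"vendor_id": "10de", "mdev_type": "nvidia-10",
         "name": "alias-nvidia-10"}
print(alias_matches({"vendor_id": "10de", "mdev_type": "nvidia-10"}, alias))
# True: vendor and mdev_type both match
print(alias_matches({"vendor_id": "10de", "mdev_type": "nvidia-11"}, alias))
# False: without mdev_type in the alias, both devices would wrongly match
```

The second call is the point of the limitation described above: with only 'vendor_id', an nvidia-11 instance would be indistinguishable from an nvidia-10 one.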
Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM
On Wed, Sep 27, 2017 at 11:10:40PM +0200, Premysl Kouril wrote: > > Lastly, qemu has overhead that varies depending on what you're doing in the > > guest. In particular, there are various IO queues that can consume > > significant amounts of memory. The company that I work for put in a good > > bit of effort engineering things so that they work more reliably, and part > > of that was determining how much memory to reserve for the host. > > > > Chris > > Hi, I work with Jakub (the OP of this thread) and here are my two > cents: I think what is critical to realize is that KVM virtual > machines can have a substantial memory overhead of up to 25% of the memory > allocated to the KVM virtual machine itself. This overhead memory is not > considered in the Nova code when calculating whether the instance being > provisioned actually fits into the host's available resources (only the > memory configured in the instance's flavor is considered). And this is > especially a problem when CPU pinning is used, as the memory > allocation is bounded by the limits of a specific NUMA node (due to the > strict memory allocation mode). This renders the global reservation > parameter reserved_host_memory_mb useless, as it doesn't take NUMA into > account. Only the memory mapped for the guest is strictly allocated from the NUMA node selected. The QEMU overhead should float on the host NUMA nodes. So it seems that "reserved_host_memory_mb" is enough. > This KVM virtual machine overhead is what is causing the OOMs in our > infrastructure and that's what we need to fix. > > Regards, > Prema
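The accounting gap described in this exchange can be sketched in a few lines of plain Python (the 10% overhead figure is the estimate quoted in this thread, not a measured constant, and `fits_numa_node` is an illustrative helper, not Nova code):

```python
def fits_numa_node(node_free_mb, instance_mb, overhead_ratio=0.10):
    """Check whether an instance pinned to one NUMA node fits once a
    per-instance QEMU/KVM overhead estimate is included.

    Nova's calculation today considers only instance_mb (the flavor RAM);
    the overhead term is exactly what this thread says is missing.
    """
    return instance_mb * (1 + overhead_ratio) <= node_free_mb

# Two 30 GB guests on a 64 GB node: each fits by flavor size alone, but
# with 10% overhead the pair needs ~66 GB, so the second must be refused.
node_free = 64 * 1024  # MB
guest = 30 * 1024      # MB
print(fits_numa_node(node_free, guest))                      # True
print(fits_numa_node(node_free - int(guest * 1.1), guest))   # False
```

A scheduler using only `instance_mb <= node_free_mb` would admit both guests and leave the OOM killer to resolve the difference, which is the failure mode reported here.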
Re: [openstack-dev] [nova] Running large instances with CPU pinning and OOM
On Mon, Sep 25, 2017 at 05:36:44PM +0200, Jakub Jursa wrote: > Hello everyone, > > We're experiencing issues with running large instances (~60GB RAM) on > fairly large NUMA nodes (4 CPUs, 256GB RAM) while using CPU pinning. The > problem is that it seems that in some extreme cases qemu/KVM can have > significant memory overhead (10-15%?) which the nova-compute service doesn't > take into account when launching VMs. Using our configuration as an > example - imagine running two VMs with 30GB RAM on one NUMA node > (because we use CPU pinning) - therefore using 60GB out of 64GB for the > given NUMA domain. When both VMs consume their entire memory > (given 10% KVM overhead) the OOM killer takes action (despite there being > plenty of free RAM in other NUMA nodes). (The numbers are just > arbitrary; the point is that nova-scheduler schedules the instance to > run on the node because the memory seems 'free enough', but the specific > NUMA node can be lacking the memory reserve.) In Nova, when using NUMA, we do pin the memory on the host NUMA nodes selected during scheduling. In your case it seems that you have specifically requested a guest with 1 NUMA node. It will not be possible for the process to grab memory on another host NUMA node, but some other processes could be running in that host NUMA node and consume memory. What you need is to use Huge Pages; in that case the memory will be locked for the guest. > Our initial solution was to use ram_allocation_ratio < 1 to ensure > having some reserved memory - this didn't work. Upon studying the source of > nova, it turns out that ram_allocation_ratio is ignored when using CPU > pinning. (see > https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L859 > and > https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L821 > ). We're running Mitaka, but this piece of code is implemented in Ocata > in the same way. > We're considering creating a patch to take ram_allocation_ratio > into account.
> > My question is - is ram_allocation_ratio ignored on purpose when using > cpu pinning? If yes, what is the reasoning behind it? And what would be > the right solution to ensure having reserved RAM on the NUMA nodes? > > Thanks. > > Regards, > > Jakub Jursa
Re: [openstack-dev] vGPUs support for Nova
On Mon, Sep 25, 2017 at 04:59:04PM +0000, Jianghua Wang wrote: > Sahid, > > Just to share some background: XenServer doesn't expose vGPUs as mdev > or pci devices. That does not make any sense. There is a physical device (PCI) which provides functions (vGPUs). These functions are exposed through the mdev framework. What you need is the mdev UUID related to a specific vGPU, and I'm sure that XenServer is going to expose it. Something which XenServer may not expose is the NUMA node where the physical device is plugged in, but in that situation you could still use sysfs. > I proposed a spec about one year ago to make fake pci devices so > that we can use the existing PCI mechanism to cover vGPUs. But > that's not a good design and it got strong objections. After that, we > switched to using the resource providers by following the advice from > the core team. > > Regards, > Jianghua > > -Original Message- > From: Sahid Orentino Ferdjaoui [mailto:sferd...@redhat.com] > Sent: Monday, September 25, 2017 11:01 PM > To: OpenStack Development Mailing List (not for usage questions) > > Subject: Re: [openstack-dev] vGPUs support for Nova > > On Mon, Sep 25, 2017 at 09:29:25AM -0500, Matt Riedemann wrote: > > On 9/25/2017 5:40 AM, Jay Pipes wrote: > > > On 09/25/2017 05:39 AM, Sahid Orentino Ferdjaoui wrote: > > > > There is a desire to expose the vGPUs resources on top of Resource > > > > Provider, which is probably the path we should be going in the long > > > > term. I was not there for the last PTG and you probably already > > > > made a decision about moving in that direction anyway. My personal > > > > feeling is that it is premature. > > > > > > > > The nested Resource Provider work is not yet feature-complete and > > > > requires more reviewer attention.
If we continue in the direction > > > > of Resource Provider, it will need at least 2 more releases to > > > > expose the vGPUs feature, and that without the support of NUMA, and > > > > with the feeling of pushing something which is not > > > > stable/production-ready. > > > > > > > > It seems safer to first have the Resource Provider work > > > > finalized/stabilized to be production-ready. Then on top of > > > > something stable we could start to migrate our current virt-specific > > > > features like NUMA, CPU pinning, huge pages and finally PCI > > > > devices. > > > > > > > > I'm talking about PCI devices in general because I think we should > > > > implement the vGPU on top of our /pci framework, which is > > > > production-ready and provides the support of NUMA. > > > > > > > > The hardware vendors are building their drivers using mdev, and the > > > > /pci framework currently understands only SR-IOV, but on a quick > > > > glance it does not seem complicated to make it support mdev. > > > > > > > > In the /pci framework we will have to: > > > > > > > > * Update the PciDevice object fields to accept NULL value for > > > > 'address' and add a new field 'uuid' > > > > * Update PciRequest to handle a new tag like 'vgpu_types' > > > > * Update PciDeviceStats to also maintain a pool of vGPUs > > > > > > > > The operators will have to create alias(es) and configure > > > > flavors. Basically most of the logic is already implemented and > > > > the method 'consume_request' is going to select the right vGPUs > > > > according to the request. > > > > > > > > In /virt we will have to: > > > > > > > > * Update the field 'pci_passthrough_devices' to also include GPU > > > > devices. > > > > * Update attach/detach PCI device to handle vGPUs > > > > > > > > We have a few people interested in working on it, so we could > > > > certainly make this feature available for Queens.
> > > > > > > > I can take the lead updating/implementing the PCI and libvirt > > > > driver part, and I'm sure Jianghua Wang will be happy to take the lead > > > > for the virt XenServer part. > > > > > > > > And I trust Jay, Stephen and Sylvain to follow the developments. > > > I understand the desire to get something into Nova to support > > > vGPUs, and I understand that the existing /pci modules represent the > > > fastest/cheapest way to get there.
Re: [openstack-dev] vGPUs support for Nova
On Mon, Sep 25, 2017 at 09:29:25AM -0500, Matt Riedemann wrote: > On 9/25/2017 5:40 AM, Jay Pipes wrote: > > On 09/25/2017 05:39 AM, Sahid Orentino Ferdjaoui wrote: > > > There is a desire to expose the vGPUs resources on top of Resource > > > Provider which is probably the path we should be going in the long > > > term. I was not there for the last PTG and you probably already made a > > > decision about moving in that direction anyway. My personal feeling is > > > that it is premature. > > > > > > The nested Resource Provider work is not yet feature-complete and > > > requires more reviewer attention. If we continue in the direction of > > > Resource Provider, it will need at least 2 more releases to expose the > > > vGPUs feature and that without the support of NUMA, and with the > > > feeling of pushing something which is not stable/production-ready. > > > > > > It's seems safer to first have the Resource Provider work well > > > finalized/stabilized to be production-ready. Then on top of something > > > stable we could start to migrate our current virt specific features > > > like NUMA, CPU Pinning, Huge Pages and finally PCI devices. > > > > > > I'm talking about PCI devices in general because I think we should > > > implement the vGPU on top of our /pci framework which is production > > > ready and provides the support of NUMA. > > > > > > The hardware vendors building their drivers using mdev and the /pci > > > framework currently understand only SRIOV but on a quick glance it > > > does not seem complicated to make it support mdev. > > > > > > In the /pci framework we will have to: > > > > > > * Update the PciDevice object fields to accept NULL value for > > > 'address' and add new field 'uuid' > > > * Update PciRequest to handle a new tag like 'vgpu_types' > > > * Update PciDeviceStats to also maintain pool of vGPUs > > > > > > The operators will have to create alias(-es) and configure > > > flavors. 
Basically most of the logic is already implemented and the > > > method 'consume_request' is going to select the right vGPUs according to > > > the request. > > > > > > In /virt we will have to: > > > > > > * Update the field 'pci_passthrough_devices' to also include GPU > > > devices. > > > * Update attach/detach PCI device to handle vGPUs > > > > > > We have a few people interested in working on it, so we could > > > certainly make this feature available for Queens. > > > > > > I can take the lead updating/implementing the PCI and libvirt driver > > > part, I'm sure Jianghua Wang will be happy to take the lead for the > > > virt XenServer part. > > > > > > And I trust Jay, Stephen and Sylvain to follow the developments. > > > > I understand the desire to get something into Nova to support vGPUs, > > and I understand that the existing /pci modules represent the > > fastest/cheapest way to get there. > > > > I won't block you from making any of the above changes, Sahid. I'll even > > do my best to review them. However, I will be primarily focusing this > > cycle on getting the nested resource providers work feature-complete for > > (at least) SR-IOV PF/VF devices. > > > > The decision of whether to allow an approach that adds more to the > > existing /pci module is ultimately Matt's. > > > > Best, > > -jay > > Nested resource providers are not merged or production-ready because we > haven't made it a priority. We've certainly talked about it and Jay has had > patches proposed for several releases now though.
> > Building vGPU support into the existing framework, which only a couple of > people understand (certainly not me), might be a short-term gain but is just > more technical debt we have to pay off later, and it delays any focus on nested > resource providers for the wider team. > > At the Queens PTG it was abundantly clear that many features are dependent > on nested resource providers, including several networking-related features > like bandwidth-based scheduling. > >
[openstack-dev] vGPUs support for Nova
There is a desire to expose the vGPUs resources on top of Resource Provider, which is probably the path we should be going in the long term. I was not there for the last PTG and you probably already made a decision about moving in that direction anyway. My personal feeling is that it is premature. The nested Resource Provider work is not yet feature-complete and requires more reviewer attention. If we continue in the direction of Resource Provider, it will need at least 2 more releases to expose the vGPUs feature, and that without the support of NUMA, and with the feeling of pushing something which is not stable/production-ready. It seems safer to first have the Resource Provider work finalized/stabilized to be production-ready. Then on top of something stable we could start to migrate our current virt-specific features like NUMA, CPU pinning, huge pages and finally PCI devices. I'm talking about PCI devices in general because I think we should implement the vGPU on top of our /pci framework, which is production-ready and provides the support of NUMA. The hardware vendors are building their drivers using mdev, and the /pci framework currently understands only SR-IOV, but on a quick glance it does not seem complicated to make it support mdev. In the /pci framework we will have to: * Update the PciDevice object fields to accept NULL value for 'address' and add a new field 'uuid' * Update PciRequest to handle a new tag like 'vgpu_types' * Update PciDeviceStats to also maintain a pool of vGPUs The operators will have to create alias(es) and configure flavors. Basically most of the logic is already implemented and the method 'consume_request' is going to select the right vGPUs according to the request. In /virt we will have to: * Update the field 'pci_passthrough_devices' to also include GPU devices. * Update attach/detach PCI device to handle vGPUs We have a few people interested in working on it, so we could certainly make this feature available for Queens.
I can take the lead updating/implementing the PCI and libvirt driver part, I'm sure Jianghua Wang will be happy to take the lead for the virt XenServer part. And I trust Jay, Stephen and Sylvain to follow the developments. s.
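As a rough illustration of the pooling idea behind PciDeviceStats and 'consume_request' in this proposal, here is a toy sketch in plain Python (Nova's actual data structures and signatures differ; 'vgpu_type' is the proposed tag, and the class below is purely hypothetical):

```python
from collections import defaultdict

class VgpuPools:
    """Toy pool tracker: devices are grouped by (vendor_id, vgpu_type),
    and a request consumes one free device from a matching pool, mirroring
    the role 'consume_request' plays in the proposal."""

    def __init__(self):
        self.pools = defaultdict(list)

    def add_device(self, dev_id, vendor_id, vgpu_type):
        # Reported by the virt driver at startup / resource audit time.
        self.pools[(vendor_id, vgpu_type)].append(dev_id)

    def consume_request(self, vendor_id, vgpu_type):
        # Return a device id from the matching pool, or None if exhausted.
        pool = self.pools.get((vendor_id, vgpu_type))
        return pool.pop() if pool else None

pools = VgpuPools()
pools.add_device("mdev-0", "10de", "nvidia-10")
print(pools.consume_request("10de", "nvidia-10"))  # mdev-0
print(pools.consume_request("10de", "nvidia-11"))  # None (no such pool)
```

The point of pooling by type rather than tracking raw addresses is that a vGPU request only needs to name a (vendor, type) pair; which concrete mdev instance backs it is an allocation detail.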
Re: [openstack-dev] realtime kvm cpu affinities
On Tue, Jun 27, 2017 at 04:00:35PM +0200, Henning Schild wrote: > Am Tue, 27 Jun 2017 09:44:22 +0200 > schrieb Sahid Orentino Ferdjaoui : > > > On Mon, Jun 26, 2017 at 10:19:12AM +0200, Henning Schild wrote: > > > Am Sun, 25 Jun 2017 10:09:10 +0200 > > > schrieb Sahid Orentino Ferdjaoui : > > > > > > > On Fri, Jun 23, 2017 at 10:34:26AM -0600, Chris Friesen wrote: > > > > > On 06/23/2017 09:35 AM, Henning Schild wrote: > > > > > > Am Fri, 23 Jun 2017 11:11:10 +0200 > > > > > > schrieb Sahid Orentino Ferdjaoui : > > > > > > > > > > > > In Linux RT context, and as you mentioned, the non-RT vCPU > > > > > > > can acquire some guest kernel lock, then be pre-empted by > > > > > > > emulator thread while holding this lock. This situation > > > > > > > blocks RT vCPUs from doing its work. So that is why we have > > > > > > > implemented [2]. For DPDK I don't think we have such > > > > > > > problems because it's running in userland. > > > > > > > > > > > > > > So for DPDK context I think we could have a mask like we > > > > > > > have for RT and basically considering vCPU0 to handle best > > > > > > > effort works (emulator threads, SSH...). I think it's the > > > > > > > current pattern used by DPDK users. > > > > > > > > > > > > DPDK is just a library and one can imagine an application > > > > > > that has cross-core communication/synchronisation needs where > > > > > > the emulator slowing down vpu0 will also slow down vcpu1. You > > > > > > DPDK application would have to know which of its cores did > > > > > > not get a full pcpu. > > > > > > > > > > > > I am not sure what the DPDK-example is doing in this > > > > > > discussion, would that not just be cpu_policy=dedicated? I > > > > > > guess normal behaviour of dedicated is that emulators and io > > > > > > happily share pCPUs with vCPUs and you are looking for a way > > > > > > to restrict emulators/io to a subset of pCPUs because you can > > > > > > live with some of them beeing not 100%. > > > > > > > > > > Yes. 
A typical DPDK-using VM might look something like this: > > > > > > > > > > vCPU0: non-realtime, housekeeping and I/O, handles all virtual > > > > > interrupts and "normal" linux stuff, emulator runs on same pCPU > > > > > vCPU1: realtime, runs in tight loop in userspace processing > > > > > packets vCPU2: realtime, runs in tight loop in userspace > > > > > processing packets vCPU3: realtime, runs in tight loop in > > > > > userspace processing packets > > > > > > > > > > In this context, vCPUs 1-3 don't really ever enter the kernel, > > > > > and we've offloaded as much kernel work as possible from them > > > > > onto vCPU0. This works pretty well with the current system. > > > > > > > > > > > > For RT we have to isolate the emulator threads to an > > > > > > > additional pCPU per guests or as your are suggesting to a > > > > > > > set of pCPUs for all the guests running. > > > > > > > > > > > > > > I think we should introduce a new option: > > > > > > > > > > > > > >- hw:cpu_emulator_threads_mask=^1 > > > > > > > > > > > > > > If on 'nova.conf' - that mask will be applied to the set of > > > > > > > all host CPUs (vcpu_pin_set) to basically pack the emulator > > > > > > > threads of all VMs running here (useful for RT context). > > > > > > > > > > > > That would allow modelling exactly what we need. > > > > > > In nova.conf we are talking absolute known values, no need > > > > > > for a mask and a set is much easier to read. Also using the > > > > > > same name does not sound like a good idea. > > > > > > And the name vcpu_pin_set clearly suggest what kind of load > > > > > > runs here, if using a mask it should be called pin_set. > > > > > > > > > > I agree with Henning. >
Re: [openstack-dev] realtime kvm cpu affinities
On Mon, Jun 26, 2017 at 12:12:49PM -0600, Chris Friesen wrote: > On 06/25/2017 02:09 AM, Sahid Orentino Ferdjaoui wrote: > > On Fri, Jun 23, 2017 at 10:34:26AM -0600, Chris Friesen wrote: > > > On 06/23/2017 09:35 AM, Henning Schild wrote: > > > > Am Fri, 23 Jun 2017 11:11:10 +0200 > > > > schrieb Sahid Orentino Ferdjaoui : > > > > > > > > In Linux RT context, and as you mentioned, the non-RT vCPU can acquire > > > > > some guest kernel lock, then be pre-empted by emulator thread while > > > > > holding this lock. This situation blocks RT vCPUs from doing its > > > > > work. So that is why we have implemented [2]. For DPDK I don't think > > > > > we have such problems because it's running in userland. > > > > > > > > > > So for DPDK context I think we could have a mask like we have for RT > > > > > and basically considering vCPU0 to handle best effort works (emulator > > > > > threads, SSH...). I think it's the current pattern used by DPDK users. > > > > > > > > DPDK is just a library and one can imagine an application that has > > > > cross-core communication/synchronisation needs where the emulator > > > > slowing down vpu0 will also slow down vcpu1. You DPDK application would > > > > have to know which of its cores did not get a full pcpu. > > > > > > > > I am not sure what the DPDK-example is doing in this discussion, would > > > > that not just be cpu_policy=dedicated? I guess normal behaviour of > > > > dedicated is that emulators and io happily share pCPUs with vCPUs and > > > > you are looking for a way to restrict emulators/io to a subset of pCPUs > > > > because you can live with some of them beeing not 100%. > > > > > > Yes. 
A typical DPDK-using VM might look something like this: > > > > > > vCPU0: non-realtime, housekeeping and I/O, handles all virtual interrupts > > > and "normal" linux stuff, emulator runs on same pCPU > > > vCPU1: realtime, runs in tight loop in userspace processing packets > > > vCPU2: realtime, runs in tight loop in userspace processing packets > > > vCPU3: realtime, runs in tight loop in userspace processing packets > > > > > > In this context, vCPUs 1-3 don't really ever enter the kernel, and we've > > > offloaded as much kernel work as possible from them onto vCPU0. This > > > works > > > pretty well with the current system. > > > > > For RT we have to isolate the emulator threads to an additional pCPU > > > > per guest or, as you are suggesting, to a set of pCPUs for all the > > > > guests running. > > > > > > > > I think we should introduce a new option: > > > > > > > > - hw:cpu_emulator_threads_mask=^1 > > > > > > > > If on 'nova.conf' - that mask will be applied to the set of all host > > > > CPUs (vcpu_pin_set) to basically pack the emulator threads of all VMs > > > > running here (useful for RT context). > > > > > > That would allow modelling exactly what we need. > > > > In nova.conf we are talking absolute known values, no need for a mask > > > > and a set is much easier to read. Also using the same name does not > > > > sound like a good idea. > > > > And the name vcpu_pin_set clearly suggests what kind of load runs here, > > > > if using a mask it should be called pin_set. > > > > > > I agree with Henning. > > > > > > In nova.conf we should just use a set, something like > > > "rt_emulator_vcpu_pin_set" which would be used for running the emulator/io > > > threads of *only* realtime instances. > > > > I don't agree with you: we have a set of pCPUs and we want to > > subtract some of them for the emulator threads. We need a mask. The > > only set we need is to select which pCPUs Nova can use > > (vcpus_pin_set).
> > > > > We may also want to have "rt_emulator_overcommit_ratio" to control how > > > many > > > threads/instances we allow per pCPU. > > > > I'm not really sure I understand this point. If it is to indicate > > that for an isolated pCPU we want X guest emulator threads, the same > > behavior is achieved by the mask. A host for realtime is dedicated to > > realtime, no overcommitment, and the operator
Re: [openstack-dev] realtime kvm cpu affinities
On Mon, Jun 26, 2017 at 10:19:12AM +0200, Henning Schild wrote: > Am Sun, 25 Jun 2017 10:09:10 +0200 > schrieb Sahid Orentino Ferdjaoui : > > > On Fri, Jun 23, 2017 at 10:34:26AM -0600, Chris Friesen wrote: > > > On 06/23/2017 09:35 AM, Henning Schild wrote: > > > > Am Fri, 23 Jun 2017 11:11:10 +0200 > > > > schrieb Sahid Orentino Ferdjaoui : > > > > > > > > In Linux RT context, and as you mentioned, the non-RT vCPU can > > > > > acquire some guest kernel lock, then be pre-empted by emulator > > > > > thread while holding this lock. This situation blocks RT vCPUs > > > > > from doing its work. So that is why we have implemented [2]. > > > > > For DPDK I don't think we have such problems because it's > > > > > running in userland. > > > > > > > > > > So for DPDK context I think we could have a mask like we have > > > > > for RT and basically considering vCPU0 to handle best effort > > > > > works (emulator threads, SSH...). I think it's the current > > > > > pattern used by DPDK users. > > > > > > > > DPDK is just a library and one can imagine an application that has > > > > cross-core communication/synchronisation needs where the emulator > > > > slowing down vpu0 will also slow down vcpu1. You DPDK application > > > > would have to know which of its cores did not get a full pcpu. > > > > > > > > I am not sure what the DPDK-example is doing in this discussion, > > > > would that not just be cpu_policy=dedicated? I guess normal > > > > behaviour of dedicated is that emulators and io happily share > > > > pCPUs with vCPUs and you are looking for a way to restrict > > > > emulators/io to a subset of pCPUs because you can live with some > > > > of them beeing not 100%. > > > > > > Yes. 
A typical DPDK-using VM might look something like this: > > > > > > vCPU0: non-realtime, housekeeping and I/O, handles all virtual > > > interrupts and "normal" linux stuff, emulator runs on same pCPU > > > vCPU1: realtime, runs in tight loop in userspace processing packets > > > vCPU2: realtime, runs in tight loop in userspace processing packets > > > vCPU3: realtime, runs in tight loop in userspace processing packets > > > > > > In this context, vCPUs 1-3 don't really ever enter the kernel, and > > > we've offloaded as much kernel work as possible from them onto > > > vCPU0. This works pretty well with the current system. > > > > > > > > For RT we have to isolate the emulator threads to an additional > > > > > pCPU per guest or, as you are suggesting, to a set of pCPUs for > > > > > all the guests running. > > > > > > > > > > I think we should introduce a new option: > > > > > > > > > > - hw:cpu_emulator_threads_mask=^1 > > > > > > > > > > If on 'nova.conf' - that mask will be applied to the set of all > > > > > host CPUs (vcpu_pin_set) to basically pack the emulator threads > > > > > of all VMs running here (useful for the RT context). > > > > > > > > That would allow modelling exactly what we need. > > > > In nova.conf we are talking absolute known values, no need for a > > > > mask, and a set is much easier to read. Also using the same name > > > > does not sound like a good idea. > > > > And the name vcpu_pin_set clearly suggests what kind of load runs > > > > here; if using a mask it should be called pin_set. > > > > > > I agree with Henning. > > > > > > In nova.conf we should just use a set, something like > > > "rt_emulator_vcpu_pin_set" which would be used for running the > > > emulator/io threads of *only* realtime instances. > > > > I don't agree with you: we have a set of pCPUs and we want to > > subtract some of them for the emulator threads. We need a mask. The > > only set we need is the one selecting which pCPUs Nova can use > > (vcpu_pin_set).
> > At that point it does not really matter whether it is a set or a mask. > They can both express the same thing, though a set is easier to read/configure. > With the same argument you could say that vcpu_pin_set should be a mask > over the host's pcpus. > > As I said before: vcpu_pin_set should be renamed because all sorts of > threads
Re: [openstack-dev] realtime kvm cpu affinities
On Fri, Jun 23, 2017 at 10:34:26AM -0600, Chris Friesen wrote: > On 06/23/2017 09:35 AM, Henning Schild wrote: > > Am Fri, 23 Jun 2017 11:11:10 +0200 > > schrieb Sahid Orentino Ferdjaoui : > > > > In Linux RT context, and as you mentioned, the non-RT vCPU can acquire > > > some guest kernel lock, then be pre-empted by the emulator thread while > > > holding this lock. This situation blocks RT vCPUs from doing their > > > work. So that is why we have implemented [2]. For DPDK I don't think > > > we have such problems because it's running in userland. > > > > > > So for the DPDK context I think we could have a mask like we have for RT > > > and basically consider vCPU0 to handle best-effort work (emulator > > > threads, SSH...). I think it's the current pattern used by DPDK users. > > > > DPDK is just a library and one can imagine an application that has > > cross-core communication/synchronisation needs where the emulator > > slowing down vcpu0 will also slow down vcpu1. Your DPDK application would > > have to know which of its cores did not get a full pcpu. > > > > I am not sure what the DPDK example is doing in this discussion, would > > that not just be cpu_policy=dedicated? I guess the normal behaviour of > > dedicated is that emulators and io happily share pCPUs with vCPUs and > > you are looking for a way to restrict emulators/io to a subset of pCPUs > > because you can live with some of them being not 100%. > > Yes. A typical DPDK-using VM might look something like this: > > vCPU0: non-realtime, housekeeping and I/O, handles all virtual interrupts > and "normal" linux stuff, emulator runs on same pCPU > vCPU1: realtime, runs in tight loop in userspace processing packets > vCPU2: realtime, runs in tight loop in userspace processing packets > vCPU3: realtime, runs in tight loop in userspace processing packets > > In this context, vCPUs 1-3 don't really ever enter the kernel, and we've > offloaded as much kernel work as possible from them onto vCPU0.
This works > pretty well with the current system. > > > > For RT we have to isolate the emulator threads to an additional pCPU > > > per guest or, as you are suggesting, to a set of pCPUs for all the > > > guests running. > > > > > > I think we should introduce a new option: > > > > > > - hw:cpu_emulator_threads_mask=^1 > > > > > > If on 'nova.conf' - that mask will be applied to the set of all host > > > CPUs (vcpu_pin_set) to basically pack the emulator threads of all VMs > > > running here (useful for the RT context). > > > > That would allow modelling exactly what we need. > > In nova.conf we are talking absolute known values, no need for a mask, > > and a set is much easier to read. Also using the same name does not > > sound like a good idea. > > And the name vcpu_pin_set clearly suggests what kind of load runs here; > > if using a mask it should be called pin_set. > > I agree with Henning. > > In nova.conf we should just use a set, something like > "rt_emulator_vcpu_pin_set" which would be used for running the emulator/io > threads of *only* realtime instances. I don't agree with you: we have a set of pCPUs and we want to subtract some of them for the emulator threads. We need a mask. The only set we need is the one selecting which pCPUs Nova can use (vcpu_pin_set). > We may also want to have "rt_emulator_overcommit_ratio" to control how many > threads/instances we allow per pCPU. I'm not really sure I understand this point. If it is to indicate that for an isolated pCPU we want X guest emulator threads, the same behavior is achieved by the mask. A host for realtime is dedicated to realtime, with no overcommitment, and the operators know the number of host CPUs, so they can easily deduce a ratio and thus the corresponding mask. > > > If on flavor extra-specs it will be applied to the vCPUs dedicated for > > > the guest (useful for the DPDK context). > > > > And if both are present the flavor wins and nova.conf is ignored?
> > In the flavor I'd like to see it be a full bitmask, not an exclusion mask > with an implicit full set. Thus the end-user could specify > "hw:cpu_emulator_threads_mask=0" and get the emulator threads to run > alongside vCPU0. Same here, I don't agree: the only set is the vCPUs of the guest, and we want a mask to subtract some of them. > Henning, there is no conflict, the nova.conf setting and the flavor setting > are used for two different things. > > Chris __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
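To make the mask-versus-set debate concrete, here is a small illustrative Python sketch of one possible reading of the proposed mask (the helper name and the exact semantics of "^N" are assumptions for illustration, not the nova implementation): CPUs named with "^" are carved out of vcpu_pin_set and dedicated to emulator threads, while the remaining CPUs stay available for vCPUs.

```python
def split_emulator_mask(vcpu_pin_set, mask_spec):
    """Split a set of pCPU ids according to an exclusion mask like '^1'.

    Hypothetical helper illustrating the thread's proposal: tokens
    prefixed with '^' name pCPUs reserved for emulator/io threads;
    everything else in vcpu_pin_set remains usable for vCPUs.
    """
    excluded = set()
    for token in mask_spec.split(','):
        token = token.strip()
        if token.startswith('^'):
            excluded.add(int(token[1:]))
    # Emulator threads are packed onto the excluded pCPUs.
    emulator_cpus = excluded & set(vcpu_pin_set)
    vcpu_cpus = set(vcpu_pin_set) - emulator_cpus
    return vcpu_cpus, emulator_cpus
```

With vcpu_pin_set = [2, 3, 4, 5] and mask "^2", the emulator threads of all realtime guests land on pCPU 2 and the vCPUs get 3-5; the same arithmetic is why a set and a mask can express the same placement.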
Re: [openstack-dev] realtime kvm cpu affinities
On Wed, Jun 21, 2017 at 12:47:27PM +0200, Henning Schild wrote: > Am Tue, 20 Jun 2017 10:04:30 -0400 > schrieb Luiz Capitulino : > > > On Tue, 20 Jun 2017 09:48:23 +0200 > > Henning Schild wrote: > > > > > Hi, > > > > > > We are using OpenStack for managing realtime guests. We modified > > > it and contributed to discussions on how to model the realtime > > > feature. More recent versions of OpenStack have support for > > > realtime, and there are a few proposals on how to improve that > > > further. > > > > > > But there is still no full answer on how to distribute threads > > > across host-cores. The vcpus are easy but for the emulation and > > > io-threads there are multiple options. I would like to collect the > > > constraints from a qemu/kvm perspective first, and then possibly > > > influence the OpenStack development. > > > > > > I will put the summary/questions first, the text below provides more > > > context to where the questions come from. > > > - How do you distribute your threads when reaching the really low > > > cyclictest results in the guests? In [3] Rik talked about problems > > > like lock holder preemption, starvation etc. but not where/how to > > > schedule emulators and io > > > > We put emulator threads and io-threads in housekeeping cores in > > the host. I think housekeeping cores is what you're calling > > best-effort cores, those are non-isolated cores that will run host > > load. > > As expected, any best-effort/housekeeping core will do but overlap with > the vcpu-cores is a bad idea. > > > > - Is it ok to put a vcpu and emulator thread on the same core as > > > long as the guest knows about it? Any funny behaving guest, not > > > just Linux. > > > > We can't do this for KVM-RT because we run all vcpu threads with > > FIFO priority. > > Same point as above, meaning the "hw:cpu_realtime_mask" approach is > wrong for realtime. > > > However, we have another project with DPDK whose goal is to achieve > > zero-loss networking.
The configuration required by this project is > > very similar to the one required by KVM-RT. One difference though is > > that we don't use RT and hence don't use FIFO priority. > > > > In this project we've been running with the emulator thread and a > > vcpu sharing the same core. As long as the guest housekeeping CPUs > > are idle, we don't get any packet drops (most of the time, what > > causes packet drops in this test-case would cause spikes in > > cyclictest). However, we're seeing some packet drops for certain > > guest workloads which we are still debugging. > > Ok but that seems to be a different scenario where hw:cpu_policy > dedicated should be sufficient. However if the placement of the io and > emulators has to be on a subset of the dedicated cpus something like > hw:cpu_realtime_mask would be required. > > > > - Is it ok to make the emulators potentially slow by running them on > > > busy best-effort cores, or will they quickly be on the critical > > > path if you do more than just cyclictest? - our experience says we > > > don't need them reactive even with rt-networking involved > > > > I believe it is ok. > > Ok. > > > > Our goal is to reach a high packing density of realtime VMs. Our > > > pragmatic first choice was to run all non-vcpu-threads on a shared > > > set of pcpus where we also run best-effort VMs and host load. > > > Now the OpenStack guys are not too happy with that because that is > > > load outside the assigned resources, which leads to quota and > > > accounting problems. > > > > > > So the current OpenStack model is to run those threads next to one > > > or more vcpu-threads. [1] You will need to remember that the vcpus > > > in question should not be your rt-cpus in the guest. I.e. if vcpu0 > > > shares its pcpu with the hypervisor noise your preemptrt-guest > > > would use isolcpus=1. > > > > > > Is that kind of sharing a pcpu really a good idea? I could imagine > > > things like smp housekeeping (cache invalidation etc.) 
to eventually > > > cause vcpu1 having to wait for the emulator stuck in IO. > > > > Agreed. IIRC, in the beginning of KVM-RT we saw a problem where > > running vcpu0 on a non-isolated core and without FIFO priority > > caused spikes in vcpu1. I guess we debugged this down to vcpu1 > > waiting a few dozen microseconds for vcpu0 for some reason. Running > > vcpu0 on an isolated core with FIFO priority fixed this (again, this > > was years ago, I won't remember all the details). > > > > > Or maybe a busy polling vcpu0 starving its own emulator causing high > > > latency or even deadlocks. > > > > This will probably happen if you run vcpu0 with FIFO priority. > > Two more points that indicate that hw:cpu_realtime_mask (putting > emulators/io next to any vcpu) does not work for general rt. > > > > Even if it happens to work for Linux guests it seems like a strong > > > assumption that an rt-guest that has noise cores can deal with even > > > more noise one schedul
Re: [openstack-dev] [nova] allow vfs to be trusted
On Wed, Jun 07, 2017 at 11:23:23AM -0500, Matt Riedemann wrote: > On 6/7/2017 8:28 AM, Sahid Orentino Ferdjaoui wrote: > > I still have a question: do > > I need to provide a spec for this? > > There is a spec for it: > > https://review.openstack.org/#/c/397932/ > > So why not just revive that for Queens? It's because when we wrote the spec we made wrong assumptions; everything is already implemented, the PCI framework already handles tags, and that's essentially what we need. > Specs also serve as documentation of a feature. Release notes are > not a substitute for documenting how to use a feature. Specs aren't > really either, or shouldn't be, but sometimes that's the only thing > we have since we don't get things into the manuals or in-tree > devref. > > That's my way of saying I think a spec is a good idea. > > -- > > Thanks, > > Matt
[openstack-dev] [nova] bug/1686116 - using more than 6 scsi disks with virtio-scsi
Hello, We have an issue in Nova which makes it impossible to configure more than 6 SCSI disks with virtio-scsi, even though the controller can handle up to 256 disks. The fix has been reviewed by Stephen Finucane (thanks to him) and some other contributors (thanks to them); any chance of getting one core to do the last reviews so we can consider it for Pike? https://review.openstack.org/#/q/topic:bug/1686116 Thanks, s.
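For context, the 256-disk figure comes from the virtio-scsi controller's unit addressing: each disk gets a controller/bus/target/unit address, and a single controller can address up to 256 units. A rough illustrative sketch (not the actual nova code; the function name and dict layout are made up here) of spreading disks across controllers by unit number:

```python
def scsi_disk_addresses(num_disks, units_per_controller=256):
    """Assign SCSI addresses to disks on virtio-scsi controllers.

    Illustrative only: each controller addresses up to
    units_per_controller disks; overflow disks spill onto the
    next controller index.
    """
    addresses = []
    for i in range(num_disks):
        addresses.append({
            'controller': i // units_per_controller,  # which controller
            'bus': 0,
            'target': 0,
            'unit': i % units_per_controller,         # slot on it
        })
    return addresses
```

With this scheme, the seventh disk simply becomes unit 6 on controller 0 rather than failing, which is the behavior the fix aims for.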
Re: [openstack-dev] [nova] allow vfs to be trusted
On Fri, Apr 28, 2017 at 10:52:47AM +0200, Sahid Orentino Ferdjaoui wrote: > Hello Matt, > > There is a series of patches pushed upstream [0] to configure virtual > functions of an SRIOV device to be "trusted". > > That is to fix an issue when bonding two SRIOV NICs in failover mode: > basically, without that capability set on the assigned VFs, the guest > would not have the privilege to update the MAC address of the slave. > > I do think that this is spec-less but needs a reno note that explains > well how to configure it. > > Can the blueprint attached to it be considered for Pike? > > The 3 patches are relatively small; > - network: add command to configure trusted mode for VFs > - network: update pci request spec to handle trusted tags > - libvirt: configure trust mode for vfs > > [0] > https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/sriov-trusted-vfs > Unfortunately that was not accepted for Pike. I still have a question: do I need to provide a spec for this? Thanks, s.
[openstack-dev] [nova] allow vfs to be trusted
Hello Matt, There is a series of patches pushed upstream [0] to configure virtual functions of an SRIOV device to be "trusted". That is to fix an issue when bonding two SRIOV NICs in failover mode: basically, without that capability set on the assigned VFs, the guest would not have the privilege to update the MAC address of the slave. I do think that this is spec-less but needs a reno note that explains well how to configure it. Can the blueprint attached to it be considered for Pike? The 3 patches are relatively small; - network: add command to configure trusted mode for VFs - network: update pci request spec to handle trusted tags - libvirt: configure trust mode for vfs [0] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/sriov-trusted-vfs Thanks, s.
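For reference, the VF "trusted" capability is the one iproute2 exposes as `ip link set dev <PF> vf <N> trust on`; a trusted VF is allowed privileged operations such as changing its MAC, which is what the bonding failover path needs. A small illustrative Python helper (hypothetical, not the patch itself) that builds that command:

```python
def vf_trust_command(pf_dev, vf_num, trusted=True):
    """Build the iproute2 command that toggles the 'trusted'
    capability on a virtual function of the given physical function.

    Illustrative sketch of what the driver layer ends up invoking;
    the helper name is made up for this example.
    """
    state = 'on' if trusted else 'off'
    return ['ip', 'link', 'set', 'dev', pf_dev,
            'vf', str(vf_num), 'trust', state]
```

In a real driver this list would be handed to a rootwrapped process executor; building it as a list (rather than a shell string) avoids quoting issues.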
[openstack-dev] [Nova] FFE Request "libvirt-emulator-threads-policy"
I'm requesting an FFE for the libvirt driver blueprint/spec to isolate emulator threads [1]. The code has been up and ready since mid-November 2016. [1] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/libvirt-emulator-threads-policy s.
Re: [openstack-dev] [nova] focused review pipeline of bug fix changes?
On Tue, Jul 12, 2016 at 06:45:03PM +0200, Markus Zoeller wrote: > On 12.07.2016 17:39, Sahid Orentino Ferdjaoui wrote: > > On Tue, Jul 12, 2016 at 09:59:12AM +0200, Markus Zoeller wrote: > >> After closing, a few days ago [1], the old (>18 months) bug reports nobody > >> is working on, it became clear that the "in progress" reports are the > >> majority [2]. After asking Gerrit how long it usually takes to get a fix > >> merged [3], this is the result: > >> > >> number of merged bug fixes within the last 365 days: 689 > >> merged within ~1m : 409 (~59%) > >> merged within ~2m : 102 (~14%) > >> merged within ~3m : 57 (~ 8%) > >> merged > 3month : 121 (~17%) > >> > >> Note: This doesn't reflect the time a patch might be marked as > >> "WIP". It also doesn't add up to 100% as I rounded down the > >> percentages. > >> > >> This made me think about ways to increase the review throughput of > >> bug fix changes, especially the bug fixes in the "~2m" and "~3m" area. I > >> *assume* that the fixes in the ">3m" area had inherent problems or > >> waited for basic structural changes, but that's just guesswork. > >> > >> The proposal is this: > >> * have a TBD list with max 10 items on it (see list possibilities below) > >> * add new items during nova meeting if slots are free > >> * Change owners must propose their changes as meeting agenda items > >> * drop change from list in nova meeting if progress is not satisfying > > > > Considering a raw list of patches would be difficult to maintain, it's > > time consuming and Nova has contributors working on different areas > > which are sometimes really different. How do we order this list, and how > > do we decide whether a patch satisfies the progress policy? > > I'm not sure if I understand the concerns correctly. The list > possibilities below seem to be easy to maintain. My assumption is, that > minimizing the "wait time" (reviewers wait for new patchsets OR change > owners wait for new reviews) can increase the throughput.
It makes the > commitment of change owners and reviewers necessary though. > I don't think that the list needs specific ordering rules. As I want to > target bug fixes with this proposal, the ordering is given by the impact > of the bugs. As I see it, you want to concentrate reviewers on 10 patches every week, but wouldn't the bottleneck here be the authors? I think we have a pretty good flow; most of the bug fixes which need attention are reviewed quickly. > > To make things work we should find some automation and have one > > list per area, which is probably the purpose of tags in Gerrit. > This could result in silo mentality and discussions why list A gets > preferred over list B. That's why I want to avoid having multiple lists. > It's about fixing issues in Nova, not fixing issues in > . I don't think that could result in silo mentality, and I remember an initiative along the same lines but with an etherpad, which was difficult to maintain. > > Since we don't have such a feature on our Gerrit yet, a possible > > solution would be to introduce a tag in commit messages which reflects > > the related subteam or area. Then a simple script could parse reviews > > in progress to print them somewhere. > Changing the commit message creates a new patchset which triggers the > gate checks again, that's why I thought making comments with a keyword > which can be queried is less trouble. Gerrit comments are lost when the patch gets merged; we may want to produce some stats from these tags in the future, which is why I think the commit message is better. Also, if we only change commit messages, is all of the gate re-executed? > > So each subteam can have a view of the work in progress, which could > > make things move faster. > > > > The point would be to order the lists by importance but we can expect > > the lists to remain relatively small. > >> List possibilities: > >> 1) etherpad of doom?
maintained by (?|me) > >> + easy to add/remove from everyone > >> - hard to query > >> 2) gerrit: starred by (?|me) > >> + easy to add/remove from the list maintainer > >> + easy to query > >> - No additions/removals when the list maintainer is on vacation > >> 3) gerrit: add a comment flag TOP10BUGFIX and DROPTOP10BUGFIX > >> + easy to add/remove from everyone > >> + easy to query (comment:TOP10BUG
Re: [openstack-dev] [nova] focused review pipeline of bug fix changes?
On Tue, Jul 12, 2016 at 09:59:12AM +0200, Markus Zoeller wrote: > After closing, a few days ago [1], the old (>18 months) bug reports nobody > is working on, it became clear that the "in progress" reports are the > majority [2]. After asking Gerrit how long it usually takes to get a fix > merged [3], this is the result: > > number of merged bug fixes within the last 365 days: 689 > merged within ~1m : 409 (~59%) > merged within ~2m : 102 (~14%) > merged within ~3m : 57 (~ 8%) > merged > 3month : 121 (~17%) > > Note: This doesn't reflect the time a patch might be marked as > "WIP". It also doesn't add up to 100% as I rounded down the > percentages. > > This made me think about ways to increase the review throughput of > bug fix changes, especially the bug fixes in the "~2m" and "~3m" area. I > *assume* that the fixes in the ">3m" area had inherent problems or > waited for basic structural changes, but that's just guesswork. > > The proposal is this: > * have a TBD list with max 10 items on it (see list possibilities below) > * add new items during nova meeting if slots are free > * Change owners must propose their changes as meeting agenda items > * drop change from list in nova meeting if progress is not satisfying Considering a raw list of patches would be difficult to maintain, it's time consuming, and Nova has contributors working on different areas which are sometimes really different. How do we order this list, and how do we decide whether a patch satisfies the progress policy? To make things work we should find some automation and have one list per area, which is probably the purpose of tags in Gerrit. Since we don't have such a feature on our Gerrit yet, a possible solution would be to introduce a tag in commit messages which reflects the related subteam or area. Then a simple script could parse reviews in progress to print them somewhere. So each subteam can have a view of the work in progress, which could make things move faster.
The point would be to order the lists by importance, but we can expect the lists to remain relatively small. > List possibilities: > 1) etherpad of doom? maintained by (?|me) > + easy to add/remove from everyone > - hard to query > 2) gerrit: starred by (?|me) > + easy to add/remove from the list maintainer > + easy to query > - No additions/removals when the list maintainer is on vacation > 3) gerrit: add a comment flag TOP10BUGFIX and DROPTOP10BUGFIX > + easy to add/remove from everyone > + easy to query (comment:TOP10BUGFIX not comment:DROPTOP10BUGFIX) > - once removed with a comment "DROPTOP10BUGFIX", a repeated > addition is not practical anymore. > 4) gerrit: tag a change > + easy to add/remove from everyone > + easy to query > - not yet available in our setup > > Personally I prefer 3, as it doesn't rely on a single person and the > tooling is ready for that. It could be sufficient until one of the next > infra Gerrit updates brings us 4. I'd like to avoid 1+2. > > My hope is, that a focused list helps us to get (few) things done faster > and increase the overall velocity. Is this a feasible proposal from your > point of view? Which concerns do you have? > > References: > [1] http://lists.openstack.org/pipermail/openstack-dev/2016-July/098792.html > [2] http://45.55.105.55:3000/dashboard/db/openstack-bugs > [3] > https://github.com/markuszoeller/openstack/blob/master/scripts/gerrit/bug_fix_histogram.py > > -- > Regards, Markus Zoeller (markus_z)
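Option 3 above maps onto a single Gerrit change query. As an illustrative sketch, the following builds the REST query URL for "flagged TOP10BUGFIX and not yet dropped" (the `/changes/?q=` endpoint is standard Gerrit REST API; the function name is made up for this example):

```python
from urllib.parse import urlencode


def top10_query_url(base='https://review.openstack.org'):
    """Build the Gerrit REST query for option 3 in the proposal:
    open changes flagged with a TOP10BUGFIX comment that have not
    been removed with a DROPTOP10BUGFIX comment."""
    query = 'status:open comment:TOP10BUGFIX -comment:DROPTOP10BUGFIX'
    return base + '/changes/?' + urlencode({'q': query})
```

The same query string also works directly in the Gerrit web UI search box, which is what makes the comment-flag approach cheap to adopt.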
Re: [openstack-dev] [nova] Do we need a config option for use_usb_tablet/pointer_model?
On Fri, Jun 17, 2016 at 03:36:53PM -0500, Matt Riedemann wrote: > I was reviewing the last change in this blueprint series today: > > https://review.openstack.org/#/c/174854/ > > And started to question why we even have a config option for this anymore. > The blueprint didn't have a spec but there are some details in the > description about the use cases: > > https://blueprints.launchpad.net/nova/+spec/virt-configure-usb-tablet-from-image > > From the code and help text for the option I realize that there is some > per-compute enablement required for this to work (VNC or SPICE enabled and > the SPICE agent disabled). But otherwise it seems totally image-specific, > which is why the blueprint is adding support for calculating whether or not > to enable USB support in the guest based on the image metadata properties. > > But do we still need the configuration option then? > > The tricky scenario I have in mind is I create my Windows instance on a host > that has use_usb_tablet=True and I can use my USB pointer mouse and I'm > happy, yay! Then that host goes under maintenance, the admin migrates it to > another host that has use_usb_tablet=False and now I can't use my USB mouse > anymore. I guess the chances of this happening are pretty slim given > use_usb_tablet defaults to True. > > However, in https://review.openstack.org/#/c/176242/ use_usb_tablet is > deprecated in favor of the new 'pointer_model' config option which defaults > to None, so it's not backward compatible with use_usb_tablet and when we > remove use_usb_tablet as an option in Ocata, the default behavior has > changed. We did not want to set a default value for pointer_model; as you indicated, it's something more related to the guest images, and we only want to give operators the ability to set a default behavior. Also, operators who use the 'use_usb_tablet' option will be warned for an entire release to update to the pointer_model option.
> Anyway, my point is, why do we even need a config option for this at all if > the image metadata can tell us what to do now? I don't see any problem with letting operators decide what the host's default behavior is. s. > -- > > Thanks, > > Matt Riedemann
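For reference, a hedged sketch of how the old and new options could look side by side in nova.conf. The option and value names ('pointer_model', 'usbtablet', the image property 'hw_pointer_model') follow the reviews linked above, but the exact sections and defaults may differ in the merged code:

```ini
[DEFAULT]
# New-style option discussed in this thread: the host-wide default
# pointer device, which an image can override via its
# hw_pointer_model metadata property.
pointer_model = usbtablet

[libvirt]
# Deprecated predecessor, kept for one release to give operators
# a migration path before removal in Ocata.
# use_usb_tablet = True
```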
Re: [openstack-dev] [nova][neutron] What to do about booting into port_security_enabled=False networks?
On Wed, Mar 30, 2016 at 09:46:45PM -0500, Matt Riedemann wrote: > On 3/30/2016 5:55 PM, Armando M. wrote: > > On 29 March 2016 at 18:55, Matt Riedemann <mrie...@linux.vnet.ibm.com> wrote: > > > On 3/29/2016 4:44 PM, Armando M. wrote: > > > > On 29 March 2016 at 08:08, Matt Riedemann <mrie...@linux.vnet.ibm.com> wrote: > > > > > Nova has had some long-standing bugs that Sahid is trying to fix here [1]. > > > > > > > > > > You can create a network in neutron with port_security_enabled=False. However, the bug is that since Nova adds the 'default' security group to the request (if none are specified) when allocating networks, neutron raises an error when you try to create a port on that network with a 'default' security group. > > > > > > > > > > Sahid's patch simply checks if the network that we're going to use has port_security_enabled and if it does not, no security groups are applied when creating the port (regardless of what's requested for security groups, which in nova is always at least 'default'). > > > > > > > > > > There has been a similar attempt at fixing this [2]. That change simply only added the 'default' security group when allocating networks with nova-network. It omitted the default security group if using neutron since: > > > > > > > > > > a) If the network does not have port security enabled, we'll blow up trying to add a port on it with the default security group. > > > > > > > > > > b) If the network does have port security enabled, neutron will automatically apply a 'default' security group to the port, nova doesn't need to specify one. > > > > > > > > > > The problem both Feodor's and Sahid's patches ran into was that the nova REST API adds a 'default' security group to the server create response when using neutron if specific security groups weren't on the server create request [3].
> > > > This is clearly wrong in the case of > > network.port_security_enabled=False. When listing security > >groups > > for an instance, they are correctly not listed, but the server > > create response is still wrong. > > > > So the question is, how to resolve this? A few options > >come to mind: > > > > a) Don't return any security groups in the server create > >response > > when using neutron as the backend. Given by this point > >we've cast > > off to the compute which actually does the work of network > > allocation, we can't call back into the network API to see what > > security groups are being used. Since we can't be sure, don't > > provide what could be false info. > > > > b) Add a new method to the network API which takes the > >requested > > networks from the server create request and returns a best > >guess if > > security groups are going to be applied or not. In the case of > > network.port_security_enabled=False, we know a security > >group won't > > be applied so the method returns False. If there is > > port_security_enabled, we return whatever security group was > > requested (or 'default'). If there are multiple networks on the > > request, we return the security groups that will be applied > >to any > > networks that have port security enabled. > > > > Option (b) is obviously more intensive and requires hitting the > > neutron API from nova API before we respond, which we'd like to > > avoid if pos
Re: [openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options
On Mon, Jan 04, 2016 at 09:12:06PM +, Mark McLoughlin wrote: > Hi > > commit 8ecf93e[1] got me thinking - the live_migration_flag config > option unnecessarily allows operators choose arbitrary behavior of the > migrateToURI() libvirt call, to the extent that we allow the operator > to configure a behavior that can result in data loss[1]. > > I see that danpb recently said something similar: > > https://review.openstack.org/171098 > > "Honestly, I wish we'd just kill off 'live_migration_flag' and > 'block_migration_flag' as config options. We really should not be > exposing low level libvirt API flags as admin tunable settings. > > Nova should really be in charge of picking the correct set of flags > for the current libvirt version, and the operation it needs to > perform. We might need to add other more sensible config options in > their place [..]" Nova should really handle internal flags itself, and this series is going in the right direction. > ... > 4) Add a new config option for tunneled versus native: > > [libvirt] > live_migration_tunneled = true > > This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have > historically defaulted to tunneled mode because it requires the > least configuration and is currently the only way to have a > secure migration channel. > > danpb's quote above continues with: > > "perhaps a "live_migration_secure_channel" to indicate that > migration must use encryption, which would imply use of > TUNNELLED flag" > > So we need to discuss whether the config option should express the > choice of tunneled vs native, or whether it should express another > choice which implies tunneled vs native. > > https://review.openstack.org/263434 We probably have to consider that operators do not know much about internal libvirt flags, so the options we expose to them should reflect the benefit of using them. I commented on your review that we should at least explain the benefit of using this option, whatever its name is.
> 5) Add a new config option for additional migration flags: > > [libvirt] > live_migration_extra_flags = VIR_MIGRATE_COMPRESSED > > This allows operators to continue to experiment with libvirt behaviors > in safe ways without each use case having to be accounted for. > >https://review.openstack.org/263435 > > We would disallow setting the following flags via this option: > > VIR_MIGRATE_LIVE > VIR_MIGRATE_PEER2PEER > VIR_MIGRATE_TUNNELLED > VIR_MIGRATE_PERSIST_DEST > VIR_MIGRATE_UNDEFINE_SOURCE > VIR_MIGRATE_NON_SHARED_INC > VIR_MIGRATE_NON_SHARED_DISK > > which would allow the following currently available flags to be set: > VIR_MIGRATE_PAUSED > VIR_MIGRATE_CHANGE_PROTECTION > VIR_MIGRATE_UNSAFE > VIR_MIGRATE_OFFLINE > VIR_MIGRATE_COMPRESSED > VIR_MIGRATE_ABORT_ON_ERROR > VIR_MIGRATE_AUTO_CONVERGE > VIR_MIGRATE_RDMA_PIN_ALL Should we also consider providing VIR_MIGRATE_PAUSED and VIR_MIGRATE_COMPRESSED as dedicated options? __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
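The allow/deny split above could be enforced with a simple check when parsing the option; a sketch with illustrative names, not the actual Nova code:

```python
# Sketch: validate operator-supplied extra migration flags against the set
# of flags Nova must stay in control of. Option and helper names are
# illustrative, not the actual Nova implementation.

BLOCKED_FLAGS = {
    'VIR_MIGRATE_LIVE', 'VIR_MIGRATE_PEER2PEER', 'VIR_MIGRATE_TUNNELLED',
    'VIR_MIGRATE_PERSIST_DEST', 'VIR_MIGRATE_UNDEFINE_SOURCE',
    'VIR_MIGRATE_NON_SHARED_INC', 'VIR_MIGRATE_NON_SHARED_DISK',
}

def parse_extra_flags(value):
    """Split a comma/pipe separated flag string and reject blocked flags."""
    flags = {f.strip() for f in value.replace('|', ',').split(',') if f.strip()}
    blocked = flags & BLOCKED_FLAGS
    if blocked:
        raise ValueError('flags not allowed in live_migration_extra_flags: %s'
                         % ', '.join(sorted(blocked)))
    return flags

print(parse_extra_flags('VIR_MIGRATE_COMPRESSED'))
```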
Re: [openstack-dev] [nova] Thoughts on things that don't make freeze cutoffs
On Tue, Aug 04, 2015 at 12:54:34PM +0200, Thierry Carrez wrote: > John Garbutt wrote: > > [...] > > Personally I find a mix of coding and reviewing good to keep a decent > > level of empathy and sanity. I don't have time for any coding this > > release (only a bit of documenting), and its not something I can > > honestly recommend as a best practice. If people don't maintain a good > > level of reviews, we do tend to drop those folks from nova-core. > > > > I know ttx has been pushing for dedicated reviewers. It would be nice > > to find folks that can do that, but we just haven't found any of those > > people to date. > > Hell no! I'd hate dedicated reviewers. > > [...] > This is why I advocate dividing code / reviewers / expertise along > smaller areas within Nova, so that new people can focus and become a > master again. What I'm pushing for is creating Nova subteams with their > own core reviewers, which would be experts and trusted to +2 on a > defined subset of code. Yep, this makes a lot of sense; unfortunately we bring up this idea every cycle, but nothing seems to move in that direction, and contributors working on Nova feel more and more frustrated. Specs got approved June 23-25, which left about 15 working days to push all of the code and 12 working days to get it merged. From my experience working on Nova, that is not possible. For instance, on the libvirt driver we have one core who does most of the reviews, but we struggle to find the second +2/+W, not to mention the problem when he is the author of the fix [1]. We delay good features (with code pushed and waiting for reviews) that we could bring to users. I guess users are happy to hear that we are working hard to improve our code base, but perhaps they also want features without waiting a year (3.1, 95, 98, me, xp...). I know this because I have been working on Nova every day for more than 2 years - we have really skilled people who can help. [1] https://review.openstack.org/#/c/176360/ s. 
Re: [openstack-dev] [nova] Create a network filter
On Thu, Jul 23, 2015 at 04:44:01PM +0200, Silvia Fichera wrote: > Hi all, > > I'm using OpenStack together with OpenDaylight to add a network awareness > feature to the scheduler. > I have 3 compute nodes (one of these is also the Openstack Controller) > connected by a openvswitch controlled by OpenDaylight. > What I would like to do is to write a filter to check if a link is up and > then assign weight acconding to the available bw (I think I will collect > this data by ODL and update an entry in a db). So you would like to check whether a link is up on the compute nodes and order the compute nodes by bandwidth, right? I do not think you can use OpenDaylight for something like that; it would be too specific. One solution could be to create a new monitor - monitors run on compute nodes and are used to collect any kind of data: nova/compute/monitors. Then you may want to create a new "weight" to order the eligible hosts by the data you have collected from the monitor: nova/scheduler/weights. > For each host I have a management interface (eth0) and an interface > connected to the OVS switch to build the physical network (eth1). > Have you got any suggestion to check the link status? > I thought I can be inspired by the second script in this link > http://stackoverflow.com/questions/17434079/python-check-to-see-if-host-is-connected-to-network > to verify if the iface is up and then check the connectivity but It has to > be run in the compute node and I don't know which IP address I could point > at. > > > Thank you > > -- > Silvia Fichera > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
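The monitor + weigher split suggested above can be sketched in plain Python; the class and function names here are illustrative stand-ins, not the actual nova.compute.monitors or nova.scheduler.weights interfaces:

```python
# Sketch of the filter + weigher idea: drop hosts whose data-plane link is
# down, then order the rest by available bandwidth collected by a monitor.
# Standalone illustration, not the actual Nova scheduler classes.

class HostState:
    def __init__(self, name, link_up, available_bw_mbps):
        self.name = name
        self.link_up = link_up
        self.available_bw_mbps = available_bw_mbps

def link_up_filter(hosts):
    """Analogue of a scheduler filter: keep only hosts with the link up."""
    return [h for h in hosts if h.link_up]

def weigh_by_bandwidth(hosts):
    """Analogue of a weigher: highest available bandwidth first."""
    return sorted(hosts, key=lambda h: h.available_bw_mbps, reverse=True)

hosts = [HostState('node1', True, 100), HostState('node2', False, 1000),
         HostState('node3', True, 400)]
ordered = weigh_by_bandwidth(link_up_filter(hosts))
print([h.name for h in ordered])  # node3 first, node2 filtered out
```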
Re: [openstack-dev] [Nova] The unbearable lightness of specs
On Wed, Jun 24, 2015 at 11:28:59AM +0100, Nikola Đipanov wrote: > Hey Nova, > > I'll cut to the chase and keep this email short for brevity and clarity: > > Specs don't work! They do nothing to facilitate good design happening, > if anything they prevent it. The process layered on top with only a > minority (!) of cores being able to approve them, yet they are a prereq > of getting any work done, makes sure that the absolute minimum that > people can get away with will be proposed. This in turn goes and > guarantees that no good design collaboration will happen. To add insult > to injury, Gerrit and our spec template are a horrible tool for > discussing design. Also the spec format itself works for only a small > subset of design problems Nova development is faced with. I do not agree that specs don't work; personally, I refer to this relatively good documentation [1] instead of digging into the code to remember how a previously introduced feature works. I guess we have some effort to make on the level of detail we require before a spec is approved. We should just consider the general idea/design, the options introduced and the API changes, and keep in mind that the contributors who will implement the feature can/have to update the spec during the development phase. [1] http://specs.openstack.org/openstack/nova-specs/specs/kilo/ s.
Re: [openstack-dev] Proposal of nova-hyper driver
On Sun, Jun 21, 2015 at 07:18:10PM +0300, Joe Gordon wrote: > On Fri, Jun 19, 2015 at 12:55 PM, Peng Zhao wrote: > > >Hi, all, > > > > I would like to propose nova-hyper driver: > > https://blueprints.launchpad.net/nova/+spec/nova-hyper. > > > >- What is Hyper? > >Put simply, Hyper is a hypervisor-agnostic Docker runtime. It is > >similar to Intel’s ClearContainer, allowing to run a Docker image with > > any > >hypervisor. > > > > > >- Why Hyper driver? > >Given its hypervisor nature, Hyper makes it easy to integrate with > >OpenStack ecosystem, e.g. Nova, Cinder, Neutron > > > >- How to implement? > >Similar to nova-docker driver. Hyper has a daemon “hyperd” running on > >each physical box. hyperd exposed a set of REST APIs. Integrating Nova > > with > >the APIs would do the job. > > > >- Roadmap > >Integrate with Magnum & Ironic. > > > > > This sounds like a better fit for something on top of Nova such as Magnum > then as a Nova driver. > > Nova only supports things that look like 'VMs'. That includes bare metal, > and containers, but it only includes a subset of container features. > > Looking at the hyper CLI [0], there are many commands that nova would not > suppprt, such as: > > * The pod notion > * exec > * pull Then I guess you need to see if Hyper can implement mandatory features for Nova [1], [2]. [1] http://docs.openstack.org/developer/nova/support-matrix.html [2] https://wiki.openstack.org/wiki/HypervisorSupportMatrix > [0] https://docs.hyper.sh/reference/cli.html > > > > > Appreciate for comments and inputs! 
> > Thanks,Peng > > > > - > > Hyper - Make VM run like Container > > > > > > __ > > OpenStack Development Mailing List (not for usage questions) > > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > > > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][infra][tc][ptl] Scaling up code review process (subdir cores)
On Wed, Jun 03, 2015 at 10:22:59AM +0200, Julien Danjou wrote: > On Wed, Jun 03 2015, Robert Collins wrote: > > > We *really* don't need a technical solution to a social problem. > > I totally agree. The trust issues is not going to be solve with a tool. +1. I cannot believe people would commit something in an area they do not understand. > -- > Julien Danjou > ;; Free Software hacker > ;; http://julien.danjou.info > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][api] New micro-version needed for api bug fix or not?
On Fri, May 29, 2015 at 08:47:01AM +0200, Jens Rosenboom wrote: > As the discussion in https://review.openstack.org/179569 still > continues about whether this is "just" a bug fix, or an API change > that will need a new micro version, maybe it makes sense to take this > issue over here to the ML. Bumping the API version probably makes sense even for a bug fix if it changes the behavior of a command/option in a backward-incompatible way. I do not believe that is the case for your change. > My personal opinion is undecided, I can see either option as being > valid, but maybe after having this open bug for four weeks now we can > come to some conclusion either way. > > Yours, > Jens > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
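For context on why a backward-incompatible fix wants a new microversion: clients can then opt in to the changed behavior, and gating on the requested version is just a tuple comparison. A standalone sketch; the version numbers and names are illustrative, not from the review under discussion:

```python
# Sketch: gate behavior on the requested API microversion. Versions must
# compare as (major, minor) tuples - a plain string compare would wrongly
# order '2.3' after '2.12'. Names are illustrative, not Nova's objects.

def parse_version(v):
    major, minor = v.split('.')
    return int(major), int(minor)

def supports_fix(requested, introduced_in='2.12'):
    """True if the client asked for a microversion that includes the change."""
    return parse_version(requested) >= parse_version(introduced_in)

print(supports_fix('2.15'))  # True
print(supports_fix('2.3'))   # False
```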
Re: [openstack-dev] [nova] "correct" API for getting image metadata for an instance ?
On Wed, May 27, 2015 at 02:59:10PM +0100, Daniel P. Berrange wrote: > As part of the work to object-ify the image metadata dicts, I'm looking > at the current way the libvirt driver fetches image metadata for an > instance, in cases where the compute manager hasn't already passed it > into the virt driver API. I see 2 methods that libvirt uses to get the > image metadata > > - nova.utils.get_image_from_system_metadata(instance.system_metadata) > > It takes the system metadata stored against the instance > and turns it into image metadata. > > > - nova.compute.utils.get_image_metadata(context, > image_api, > instance.image_ref, >instance) > > This tries to get metadata from the image api and turns > this into system metadata > > It then gets system metadata from the instance and merges > it from the data from the image > > It then calls nova.utils.get_image_from_system_metadata() > > IIUC, any changes against the image will override what > is stored against the instance > > > > IIUC, when an instance is booted, the image metadata should be > saved against the instance. So I'm wondering why we need to have > code in compute.utils that merges back in the image metadata each > time ? As you said, at boot time we store the image properties in the instance's system_metadata. Except for some special cases, I don't see any reason to use the 'get_image_metadata' method, and we should probably clean up the code in libvirt. > Is this intentional so that we pull in latest changes from the > image, to override what's previously saved on the instance ? If > so, then it seems that we should have been consistent in using > the compute_utils get_image_metadata() API everywhere. > > It seems wrong though to pull in the latest metadata from the > image. The libvirt driver makes various decisions at boot time > about how to configure the guest based on the metadata. 
When we > later do changes to that guest (snapshot, hotplug, etc, etc) > we *must* use exactly the same image metadata we had at boot > time, otherwise decisions we make will be inconsistent with how > the guest is currently configured. > > eg if you set hw_disk_bus=virtio at boot time, and then later > change the image to use hw_disk_bus=scsi, and then try to hotplug > a new drive on the guest, we *must* operate wrt hw_disk_bus=virtio > because the guest will not have any scsi bus present. > > This says to me we should /never/ use the compute_utils > get_image_metadata() API once the guest is running, and so we > should convert libvirt to use nova.utils.get_image_from_system_metadata() > exclusively. > > It also makes me wonder how nova/compute/manager.py is obtaining image > meta in cases where it passes it into the API and whether that needs > changing at all. +1 > > Regards, > Daniel > -- > |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| > |: http://libvirt.org -o- http://virt-manager.org :| > |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| > |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
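The conversion being discussed amounts to pulling the 'image_'-prefixed keys back out of system_metadata. A simplified sketch of the idea, assuming the 'image_' prefix convention; this is not the actual nova.utils.get_image_from_system_metadata code:

```python
# Sketch: rebuild image metadata from an instance's system_metadata, where
# image properties were stored at boot time under an 'image_' key prefix.
# Simplified illustration of the idea, not the actual nova.utils code.

def image_meta_from_system_metadata(system_meta):
    prefix = 'image_'
    properties = {k[len(prefix):]: v
                  for k, v in system_meta.items() if k.startswith(prefix)}
    # Attributes like min_ram/min_disk live at the top level, not under
    # 'properties', so hoist them out.
    image_meta = {'properties': properties}
    for key in ('min_ram', 'min_disk'):
        if key in properties:
            image_meta[key] = properties.pop(key)
    return image_meta

sysmeta = {'image_hw_disk_bus': 'virtio', 'image_min_ram': '512',
           'instance_type_id': '5'}
print(image_meta_from_system_metadata(sysmeta))
```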
Re: [openstack-dev] [Nova][libvirt] Understand why we lookup libvirt domains by instance name
On Thu, May 21, 2015 at 09:11:57AM -0700, Michael Still wrote: > On Thu, May 21, 2015 at 7:49 AM, Sahid Orentino Ferdjaoui > wrote: > > On Thu, May 21, 2015 at 10:23:35AM +0100, Daniel P. Berrange wrote: > >> On Wed, May 20, 2015 at 03:01:50PM -0700, Michael Still wrote: > >> > I note that we use instance.name to lookup the libvirt domain a bunch > >> > in the driver. I'm wondering why we don't just use instance.uuid all > >> > the time -- the code for that already exists. Is there a good reason > >> > to not move to always using the uuid? > >> > > >> > I ask because instance.name is not guaranteed to be unique depending > >> > on how weird the nova deployment is. > >> > >> Agreed, there's no benefit to using name - internally libvirt will always > >> prefer to use the UUID itself too. > >> > >> These days though, there is only a single place in nova libvirt driver > >> that needs updating - the nova.virt.libvirt.host.Host class get_domain() > >> method just needs to be switched to use uuid. > > > > I'm currently working on the libvirt driver I can add this in my TODO > > > > https://review.openstack.org/#/c/181969/ > > Well, I am playing in this code to do some qemu stuff, so I will throw > something out in the next day or so anyways. If you beat me to it then > that's fine as well. No hurry on my side; I'll let you play with this part of the code :) s.
Re: [openstack-dev] [nova][oslo] RPC Asynchronous Communication
On Fri, May 08, 2015 at 09:13:59AM -0400, Doug Hellmann wrote: > Excerpts from Joe Gordon's message of 2015-05-07 17:43:06 -0700: > > On May 7, 2015 2:37 AM, "Sahid Orentino Ferdjaoui" < > > sahid.ferdja...@redhat.com> wrote: > > > > > > Hi, > > > > > > The primary point of this expected discussion around asynchronous > > > communication is to optimize performance by reducing latency. > > > > > > For instance the design used in Nova and probably other projects let > > > able to operate ascynchronous operations from two way. > > > > > > 1. When communicate between inter-services > > > 2. When communicate to the database > > > > > > 1 and 2 are close since they use the same API but I prefer to keep a > > > difference here since the high level layer is not the same. > > > > > > From Oslo Messaging point of view we currently have two methods to > > > invoke an RPC: > > > > > > Cast and Call: The first one is not bloking and will invoke a RPC > > > without to wait any response while the second will block the > > > process and wait for the response. > > > > > > The aim is to add new method which will return without to block the > > > process an object let's call it "Future" which will provide some basic > > > methods to wait and get a response at any time. > > > > > > The benefice from Nova will comes on a higher level: > > > > > > 1. When communicate between services it will be not necessary to block > > >the process and use this free time to execute some other > > >computations. > > > > Isn't this what the use of green threads (and eventlet) is supposed to > > solve. Assuming my understanding is correct, and we can fix any issues > > without adding async oslo.messaging, then adding yet another async pattern > > seems like a bad thing. The aim is to avoid being library-specific and to avoid adding different custom patterns to the code base each time we want something non-blocking. 
We can let the experienced community working on oslo.messaging maintain that part, and as Doug said, oslo can use different "executors", so we can avoid requiring a specific library. > Yes, this is what the various executors in the messaging library do, > including the eventlet-based executor we use by default. > > Where are you seeing nova block on RPC calls? In Nova we use the indirection API to make calls to the database via the conductor through RPCs. With the solution presented we could make the read and write operations to the database asynchronous. Olekssi asked me to give an example; I will reply to him. s. > Doug > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][libvirt] Understand why we lookup libvirt domains by instance name
On Thu, May 21, 2015 at 10:23:35AM +0100, Daniel P. Berrange wrote: > On Wed, May 20, 2015 at 03:01:50PM -0700, Michael Still wrote: > > I note that we use instance.name to lookup the libvirt domain a bunch > > in the driver. I'm wondering why we don't just use instance.uuid all > > the time -- the code for that already exists. Is there a good reason > > to not move to always using the uuid? > > > > I ask because instance.name is not guaranteed to be unique depending > > on how weird the nova deployment is. > > Agreed, there's no benefit to using name - internally libvirt will always > prefer to use the UUID itself too. > > These days though, there is only a single place in nova libvirt driver > that needs updating - the nova.virt.libvirt.host.Host class get_domain() > method just needs to be switched to use uuid. I'm currently working on the libvirt driver; I can add this to my TODO list: https://review.openstack.org/#/c/181969/ s.
[openstack-dev] [nova][oslo] RPC Asynchronous Communication
Hi, The primary point of this discussion around asynchronous communication is to optimize performance by reducing latency. For instance, the design used in Nova (and probably other projects) allows asynchronous operations in two ways: 1. When communicating between services 2. When communicating with the database 1 and 2 are close since they use the same API, but I prefer to keep the distinction here since the high-level layer is not the same. From the Oslo Messaging point of view we currently have two methods to invoke an RPC, cast and call: the first one is non-blocking and will invoke an RPC without waiting for any response, while the second will block the process and wait for the response. The aim is to add a new method which will return, without blocking the process, an object - let's call it a "Future" - which will provide some basic methods to wait for and get the response at any time. The benefit for Nova comes at a higher level: 1. When communicating between services it will not be necessary to block the process, and we can use this free time to execute some other computations. future = rpcapi.invoke_long_process() ... do something else here ... result = future.get_response() 2. We can build on all of the work previously done with the Conductor: by updating the Objects framework and the Indirection API we should be able to take advantage of async operations to the database. MyObject = MyClassObject.get_async() ... do something else here ... MyObject.wait() MyObject.foo = "bar" MyObject.save_async() ... do something else here ... MyObject.wait() All of this is to illustrate the idea and has to be discussed. I guess the first job needs to come from Oslo Messaging, so the question is to get a feeling here first, and then from Nova, since it will be the primary consumer of this feature. https://blueprints.launchpad.net/nova/+spec/asynchronous-communication Thanks, s. 
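A minimal standalone sketch of the proposed pattern using Python's stdlib concurrent.futures; rpcapi.invoke_long_process above is hypothetical, and this only illustrates the cast/call/future semantics, not oslo.messaging's API:

```python
# Sketch of the proposed "future" RPC pattern using stdlib primitives.
# invoke_long_process stands in for a hypothetical non-blocking RPC method;
# it is not an actual oslo.messaging API.
from concurrent.futures import ThreadPoolExecutor
import time

def long_rpc_call(x):
    time.sleep(0.1)  # simulate network + remote processing latency
    return x * 2

executor = ThreadPoolExecutor(max_workers=2)

def invoke_long_process(x):
    """Return immediately with a Future instead of blocking like 'call'."""
    return executor.submit(long_rpc_call, x)

future = invoke_long_process(21)
# ... do something else here while the RPC is in flight ...
result = future.result()  # block only when the response is actually needed
print(result)  # 42
```

The key point is that the caller chooses when to pay the latency cost: a bare `cast` never waits, a `call` always waits, and a future defers the wait until the response is needed.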
Re: [openstack-dev] [stable] freeze exception
On Fri, Apr 03, 2015 at 09:37:30AM +0200, Sahid Orentino Ferdjaoui wrote: > Hello, > > A request to get an exception for a fix related to PCI-Passthough. The > backport seems to be reasonable and not invincible and the code is > well covered by tests. s/invincible/invasive/ > > The impact of this fix is to make compatible the config option > 'pci_passthrough_whitelist' from icehouse to juno. > > https://review.openstack.org/#/c/170089/ > > Thanks, > s > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [stable] freeze exception
Hello, A request to get an exception for a fix related to PCI-Passthough. The backport seems to be reasonable and not invincible and the code is well covered by tests. The impact of this fix is to make compatible the config option 'pci_passthrough_whitelist' from icehouse to juno. https://review.openstack.org/#/c/170089/ Thanks, s __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] bp serial-ports *partly* implemented?
On Tue, Feb 24, 2015 at 04:19:39PM +0100, Markus Zoeller wrote: > Sahid Orentino Ferdjaoui wrote on 02/23/2015 > 11:13:12 AM: > > > From: Sahid Orentino Ferdjaoui > > To: "OpenStack Development Mailing List (not for usage questions)" > > > > Date: 02/23/2015 11:17 AM > > Subject: Re: [openstack-dev] [nova] bp serial-ports *partly* > implemented? > > > > On Fri, Feb 20, 2015 at 06:03:46PM +0100, Markus Zoeller wrote: > > > It seems to me that the blueprint serial-ports[1] didn't implement > > > everything which was described in its spec. If one of you could have a > > > > look at the following examples and help me to understand if these > > > observations are right/wrong that would be great. > > > > > > Example 1: > > > The flavor provides the extra_spec "hw:serial_port_count" and the > image > > > the property "hw_serial_port_count". This is used to decide how many > > > serial devices (with different ports) should be defined for an > instance. > > > But the libvirt driver returns always only the *first* defined port > > > (see method "get_serial_console" [2]). I didn't find anything in the > > > code which uses the other defined ports. > > > > The method you are referencing [2] is used to return the first well > > binded and not connected port in the domain. > > Is that the intention behind the code ``mode='bind'`` in said method? > In my test I created an instance with 2 ports with the default cirros > image with a flavor which has the hw:serial_port_count=2 property. > The domain XML has this snippet: > > > > > > > My expectation was to be able to connect to the same instance via both > ports at the same time. But the second connection is blocked as long > as the first connection is established. A debug trace in the code shows > that both times the first port is returned. IOW I was not able to create > a scenario where the *second* port was returned and that confuses me > a little. Any thoughts about this? 
So we probably have a bug here; can you at least report it in Launchpad? We need to see whether the problem comes from the code in Nova, from a misinterpretation of libvirt's behavior, or from a bug in libvirt. On the report, can you also paste the domain XML captured while a session is connected on the first port? > > When defining the domain '{hw_|hw:}serial_port_count' are well take > > into account as you can see: > > > >https://github.com/openstack/nova/blob/master/nova/virt/libvirt/ > > driver.py#L3702 > > > > (The method looks to have been refactored and include several parts > > not related to serial-console) > > > > > Example 2: > > > "If a user is already connected, then reject the attempt of a > second > > > user to access the console, but have an API to forceably > disconnect > > > an existing session. This would be particularly important to cope > > > with hung sessions where the client network went away before the > > > console was cleanly closed." [1] > > > I couldn't find the described API. If there is a hung session one > cannot > > > gracefully recover from that. This could lead to a bad UX in horizons > > > serial console client implementation[3]. > > > > This API is not implemented, I will see what I can do on that > > part. Thanks for this. > > Sounds great, thanks for that! Please keep me in the loop when > reviews or help with coding are needed. 
> > > > [1] nova bp serial-ports; > > > > > > https://github.com/openstack/nova-specs/blob/master/specs/juno/ > > implemented/serial-ports.rst > > > [2] libvirt driver; return only first port; > > > > > > https://github.com/openstack/nova/blob/master/nova/virt/libvirt/ > > driver.py#L2518 > > > [3] horizon bp serial-console; > > > https://blueprints.launchpad.net/horizon/+spec/serial-console > > > > > > > > > > __ > > > OpenStack Development Mailing List (not for usage questions) > > > Unsubscribe: > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > > > > __ > > OpenStack Development Mailing List (not for usage questions) > > Unsubscribe: > openstack
Re: [openstack-dev] [nova] bp serial-ports *partly* implemented?
On Fri, Feb 20, 2015 at 06:03:46PM +0100, Markus Zoeller wrote: > It seems to me that the blueprint serial-ports[1] didn't implement > everything which was described in its spec. If one of you could have a > look at the following examples and help me to understand if these > observations are right/wrong that would be great. > > Example 1: > The flavor provides the extra_spec "hw:serial_port_count" and the image > the property "hw_serial_port_count". This is used to decide how many > serial devices (with different ports) should be defined for an instance. > But the libvirt driver returns always only the *first* defined port > (see method "get_serial_console" [2]). I didn't find anything in the > code which uses the other defined ports. The method you are referencing [2] is used to return the first port in the domain that is bound and not yet connected. When defining the domain, '{hw_|hw:}serial_port_count' is taken into account, as you can see: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3702 (The method looks to have been refactored and includes several parts not related to the serial console.) > Example 2: > "If a user is already connected, then reject the attempt of a second > user to access the console, but have an API to forceably disconnect > an existing session. This would be particularly important to cope > with hung sessions where the client network went away before the > console was cleanly closed." [1] > I couldn't find the described API. If there is a hung session one cannot > gracefully recover from that. This could lead to a bad UX in horizons > serial console client implementation[3]. This API is not implemented; I will see what I can do about that part. Thanks for this. 
> [1] nova bp serial-ports; > > https://github.com/openstack/nova-specs/blob/master/specs/juno/implemented/serial-ports.rst > [2] libvirt driver; return only first port; > > https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2518 > [3] horizon bp serial-console; > https://blueprints.launchpad.net/horizon/+spec/serial-console > > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
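To illustrate what "the first bound and not yet connected port" means, here is a standalone sketch that scans libvirt-style domain XML for TCP serial devices in bind mode; the connection-tracking set is a stand-in, and Nova's real check in get_serial_console is more involved:

```python
# Sketch: pick the first TCP serial device in 'bind' mode that no client is
# currently connected to, from libvirt-style domain XML. The 'connected'
# set is a stand-in for Nova's real connection tracking.
import xml.etree.ElementTree as ET

DOMAIN_XML = """
<domain>
  <devices>
    <serial type="tcp">
      <source mode="bind" host="127.0.0.1" service="10000"/>
    </serial>
    <serial type="tcp">
      <source mode="bind" host="127.0.0.1" service="10001"/>
    </serial>
  </devices>
</domain>
"""

def first_free_console(xml, connected):
    for source in ET.fromstring(xml).findall(".//serial[@type='tcp']/source"):
        if source.get('mode') != 'bind':
            continue
        host, port = source.get('host'), int(source.get('service'))
        if (host, port) not in connected:
            return host, port
    return None

# With the first port busy, the second one should be returned.
print(first_free_console(DOMAIN_XML, {('127.0.0.1', 10000)}))
```

With hw:serial_port_count=2 the domain carries two such devices, so a second caller should be handed the second port; always returning the first one is the symptom described in the bug discussion above.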
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On Thu, Feb 05, 2015 at 09:56:21AM +, Matthew Booth wrote: > On 04/02/15 19:04, Jay Pipes wrote: > > On 02/04/2015 12:05 PM, Sahid Orentino Ferdjaoui wrote: > >> On Wed, Feb 04, 2015 at 04:30:32PM +, Matthew Booth wrote: > >>> I've spent a few hours today reading about Galera, a clustering solution > >>> for MySQL. Galera provides multi-master 'virtually synchronous' > >>> replication between multiple mysql nodes. i.e. I can create a cluster of > >>> 3 mysql dbs and read and write from any of them with certain consistency > >>> guarantees. > >>> > >>> I am no expert[1], but this is a TL;DR of a couple of things which I > >>> didn't know, but feel I should have done. The semantics are important to > >>> application design, which is why we should all be aware of them. > >>> > >>> > >>> * Commit will fail if there is a replication conflict > >>> > >>> foo is a table with a single field, which is its primary key. > >>> > >>> A: start transaction; > >>> B: start transaction; > >>> A: insert into foo values(1); > >>> B: insert into foo values(1); <-- 'regular' DB would block here, and > >>>report an error on A's commit > >>> A: commit; <-- success > >>> B: commit; <-- KABOOM > >>> > >>> Confusingly, Galera will report a 'deadlock' to node B, despite this not > >>> being a deadlock by any definition I'm familiar with. > > > > It is a failure to certify the writeset, which bubbles up as an InnoDB > > deadlock error. See my article here: > > > > http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/ > > > > > > Which explains this. > > > >> Yes ! and if I can add more information and I hope I do not make > >> mistake I think it's a know issue which comes from MySQL, that is why > >> we have a decorator to do a retry and so handle this case here: > >> > >> > >> http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177 > >> > > > > It's not an issue with MySQL. 
It's an issue with any database code that > > is highly contentious. I wanted to speak about the term "deadlock" used (which also seems to surprise Matthew); I thought it came from MySQL. In our situation it's not really a deadlock, just a locked session from A, and so B needs to retry? I believe a deadlock would be when session A tries to read something on table x.foo to update y.bar while B tries to read something on y.bar to update x.foo - so A acquires a lock to read x.foo, B acquires a lock to read y.bar, then when A needs to acquire a lock to update y.bar it cannot, and the same happens for B with x.foo. > > Almost all highly distributed or concurrent applications need to handle > > deadlock issues, and the most common way to handle deadlock issues on > > database records is using a retry technique. There's nothing new about > > that with Galera. > > > > The issue with our use of the @_retry_on_deadlock decorator is *not* > > that the retry decorator is not needed, but rather it is used too > > frequently. The compare-and-swap technique I describe in the article > > above dramatically* reduces the number of deadlocks that occur (and need > > to be handled by the @_retry_on_deadlock decorator) and dramatically > > reduces the contention over critical database sections. Thanks for this information. > I'm still confused as to how this code got there, though. We shouldn't > be hitting Galera lock contention (reported as deadlocks) if we're using > a single master, which I thought we were. Does this mean either: I guess we can hit lock contention even with a single master. > A. There are deployments using multi-master? > B. These are really deadlocks? > > If A, is this something we need to continue to support?
> > Thanks, > > Matt > -- > Matthew Booth > Red Hat Engineering, Virtualisation Team > > Phone: +442070094448 (UK) > GPG ID: D33C3490 > GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
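The retry approach discussed in this thread can be sketched in a few lines of Python. This is a simplified, hypothetical illustration of what a @_retry_on_deadlock-style decorator does, not the actual nova code; `DBDeadlock` here is a stand-in for the real DB-layer exception:

```python
import functools
import time


class DBDeadlock(Exception):
    """Stand-in for the 'deadlock' error bubbled up by the DB layer."""


def retry_on_deadlock(max_retries=3, delay=0.01):
    """Re-run the decorated function when a certification conflict is
    reported as a deadlock, instead of failing the whole operation."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(delay)  # brief back-off before retrying
        return wrapper
    return decorator


calls = {"n": 0}


@retry_on_deadlock()
def insert_row():
    # Simulate a replication conflict on the first attempt only.
    calls["n"] += 1
    if calls["n"] == 1:
        raise DBDeadlock()
    return "committed"


result = insert_row()  # succeeds on the second attempt
```

The whole transaction body has to live inside the retried function, matching the point above that the transaction and all its associated code must be re-executed together.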
Re: [openstack-dev] [nova] memory reporting for huge pages
On Wed, Feb 04, 2015 at 05:35:55PM -0600, Chris Friesen wrote: > As part of > "https://blueprints.launchpad.net/nova/+spec/virt-driver-large-pages"; we > have introduced the ability to specify based on flavor/image that we want to > use huge pages. Yes, to add more information: when using image properties, a large-pages request is only honoured if the flavor already has a policy of 'large' or 'any'. > Is there a way to query the number of huge pages available on each NUMA node > of each compute node? Nova does not provide any API to query the number of large pages available per NUMA node on the compute nodes. On Linux you can use tools such as numastat and numactl to get specific information about pages and NUMA topology. libvirt also provides some information about available free pages:

[stack@localhost ~]$ virsh freepages --all
Node 0:
4KiB: 466511
2048KiB: 128

If you want to see what the guests consume on a host:

[root@localhost ~]# numastat -p qemu
Per-node process memory usage (in MBs) for PID 12863 (qemu-system-x86)
                 Node 0    Total
                -------  -------
Huge             128.00   128.00
Heap               3.85     3.85
Stack              0.11     0.11
Private           89.98    89.98
                -------  -------
Total            221.94   221.94

But you still have to query each of your compute nodes with such a tool. > I haven't been able to find one (short of querying the database directly) > and it's proven somewhat frustrating. > > Currently we report the total amount of memory available, but when that can > be broken up into several page sizes and multiple NUMA nodes per compute > node it can be very difficult to determine whether a given flavor/image is > bootable within the network, or to debug any issues that occur. Sorry, I'm not sure I completely understand your point. The scheduler is responsible for finding the best host to handle a request (but I may be off-topic from your question). Also, if you need to track more information about memory per NUMA node, there is probably something to do with the extensible resource tracking spec here:
http://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/extensible-resource-tracking.html s. > Chris > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
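Short of a Nova API, the per-NUMA-node huge page counts can also be read straight from sysfs on Linux. A minimal sketch, assuming the standard kernel sysfs layout under /sys/devices/system/node (the base path is a parameter so the function can be pointed at a test directory):

```python
import os
import re


def free_hugepages(base="/sys/devices/system/node"):
    """Return {node_id: {page_size_kb: free_pages}} read from sysfs."""
    result = {}
    for entry in sorted(os.listdir(base)):
        node = re.match(r"node(\d+)$", entry)
        hp_dir = os.path.join(base, entry, "hugepages")
        if not node or not os.path.isdir(hp_dir):
            continue  # skip non-node entries such as 'online'
        pages = {}
        for hp in os.listdir(hp_dir):
            m = re.match(r"hugepages-(\d+)kB$", hp)
            if not m:
                continue
            with open(os.path.join(hp_dir, hp, "free_hugepages")) as f:
                pages[int(m.group(1))] = int(f.read())
        result[int(node.group(1))] = pages
    return result
```

Like the CLI tools above, this only answers for one host, so it would still have to be run on (or fetched from) every compute node.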
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On Wed, Feb 04, 2015 at 04:30:32PM +, Matthew Booth wrote: > I've spent a few hours today reading about Galera, a clustering solution > for MySQL. Galera provides multi-master 'virtually synchronous' > replication between multiple mysql nodes. i.e. I can create a cluster of > 3 mysql dbs and read and write from any of them with certain consistency > guarantees. > > I am no expert[1], but this is a TL;DR of a couple of things which I > didn't know, but feel I should have done. The semantics are important to > application design, which is why we should all be aware of them. > > > * Commit will fail if there is a replication conflict > > foo is a table with a single field, which is its primary key. > > A: start transaction; > B: start transaction; > A: insert into foo values(1); > B: insert into foo values(1); <-- 'regular' DB would block here, and > report an error on A's commit > A: commit; <-- success > B: commit; <-- KABOOM > > Confusingly, Galera will report a 'deadlock' to node B, despite this not > being a deadlock by any definition I'm familiar with. Yes! If I can add more information (and I hope I am not mistaken), I think it's a known issue which comes from MySQL; that is why we have a decorator to do a retry and so handle this case here: http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177 > Essentially, anywhere that a regular DB would block, Galera will not > block transactions on different nodes. Instead, it will cause one of the > transactions to fail on commit. This is still ACID, but the semantics > are quite different. > > The impact of this is that code which makes correct use of locking may > still fail with a 'deadlock'. The solution to this is to either fail the > entire operation, or to re-execute the transaction and all its > associated code in the expectation that it won't fail next time.
> > As I understand it, these can be eliminated by sending all writes to a > single node, although that obviously makes less efficient use of your > cluster. > > > * Write followed by read on a different node can return stale data > > During a commit, Galera replicates a transaction out to all other db > nodes. Due to its design, Galera knows these transactions will be > successfully committed to the remote node eventually[2], but it doesn't > commit them straight away. The remote node will check these outstanding > replication transactions for write conflicts on commit, but not for > read. This means that you can do: > > A: start transaction; > A: insert into foo values(1) > A: commit; > B: select * from foo; <-- May not contain the value we inserted above[3] > > This means that even for 'synchronous' slaves, if a client makes an RPC > call which writes a row to write master A, then another RPC call which > expects to read that row from synchronous slave node B, there's no > default guarantee that it'll be there. > > Galera exposes a session variable which will fix this: wsrep_sync_wait > (or wsrep_causal_reads on older mysql). However, this isn't the default. > It presumably has a performance cost, but I don't know what it is, or > how it scales with various workloads. > > > Because these are semantic issues, they aren't things which can be > easily guarded with an if statement. We can't say: > > if galera: > try: > commit > except: > rewind time > > If we are to support this DB at all, we have to structure code in the > first place to allow for its semantics. > > Matt > > [1] No, really: I just read a bunch of docs and blogs today. If anybody > who is an expert would like to validate/correct that would be great. 
> > [2] > http://www.percona.com/blog/2012/11/20/understanding-multi-node-writing-conflict-metrics-in-percona-xtradb-cluster-and-galera/ > > [3] > http://www.percona.com/blog/2013/03/03/investigating-replication-latency-in-percona-xtradb-cluster/ > -- > Matthew Booth > Red Hat Engineering, Virtualisation Team > > Phone: +442070094448 (UK) > GPG ID: D33C3490 > GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
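The write-then-stale-read semantics described above can be illustrated with a toy two-node model (purely an illustration of the semantics, not real Galera code): replicated writesets sit in a pending queue until applied, and a sync_wait-style read forces them to be applied first, mirroring what `wsrep_sync_wait` does:

```python
class ToyGaleraNode:
    """Toy model of one node: replicated writesets are queued, not
    applied immediately, until a causal read forces them through."""

    def __init__(self):
        self.data = set()   # rows already applied locally
        self.pending = []   # writesets received but not yet applied

    def replicate(self, value):
        self.pending.append(value)

    def apply_pending(self):
        while self.pending:
            self.data.add(self.pending.pop(0))

    def select(self, value, sync_wait=False):
        # sync_wait=True mimics wsrep_sync_wait: wait for (here, apply)
        # outstanding replication before serving the read.
        if sync_wait:
            self.apply_pending()
        return value in self.data


node_a, node_b = ToyGaleraNode(), ToyGaleraNode()

# A: insert into foo values(1); commit;  -- applied locally on A,
# shipped to B but not yet applied there.
node_a.data.add(1)
node_b.replicate(1)

stale_read = node_b.select(1)                   # plain read on B misses it
causal_read = node_b.select(1, sync_wait=True)  # causal read sees it
```

On a real cluster the equivalent of the `sync_wait=True` read is `SET SESSION wsrep_sync_wait = 1;` before the SELECT, at some performance cost.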
Re: [openstack-dev] problems with huge pages and libvirt
On Tue, Feb 03, 2015 at 03:05:24PM +0100, Sahid Orentino Ferdjaoui wrote: > On Mon, Feb 02, 2015 at 11:44:37AM -0600, Chris Friesen wrote: > > On 02/02/2015 11:00 AM, Sahid Orentino Ferdjaoui wrote: > > >On Mon, Feb 02, 2015 at 10:44:09AM -0600, Chris Friesen wrote: > > >>Hi, > > >> > > >>I'm trying to make use of huge pages as described in > > >>"http://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/virt-driver-large-pages.html";. > > >>I'm running kilo as of Jan 27th. > > >>I've allocated 1 2MB pages on a compute node. "virsh capabilities" > > >>on that node contains: > > >> > > >> > > >> > > >> > > >> 67028244 > > >> 16032069 > > >> 5000 > > >> 1 > > >>... > > >> > > >> 67108864 > > >> 16052224 > > >> 5000 > > >> 1 > > >> > > >> > > >>I then restarted nova-compute, I set "hw:mem_page_size=large" on a > > >>flavor, and then tried to boot up an instance with that flavor. I > > >>got the error logs below in nova-scheduler. Is this a bug? > > > > > >Hello, > > > > > >Launchpad.net could be more appropriate to > > >discuss on something which looks like a bug. > > > > > > https://bugs.launchpad.net/nova/+filebug > > > > Just wanted to make sure I wasn't missing something. Bug has been opened at > > https://bugs.launchpad.net/nova/+bug/1417201 > > > > I added some additional logs to the bug report of what the numa topology > > looks like on the compute node and in NUMATopologyFilter.host_passes(). > > > > >According to your trace I would say you are running different versions > > >of Nova services. > > > > nova should all be the same version. I'm running juno versions of other > > openstack components though. > > Hum if I understand well and according your issue reported to > launchpad.net > > https://bugs.launchpad.net/nova/+bug/1417201 > > You are trying to test hugepages under kilo which it is not possible > since it has been implemented in this release (Juno, not yet > published) Please ignore this point. 
> I have tried to reproduce your issue with trunk but I have not been > able to do it. Please reopen the bug with more information of your env > if still present. I should received any notification from it. > > Thanks, > s. > > > >BTW please verify your version of libvirt. Hugepages is supported > > >start to 1.2.8 (but this should difinitly not failed so badly like > > >that) > > > > Libvirt is 1.2.8. > > Chris > > > > __ > > OpenStack Development Mailing List (not for usage questions) > > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] problems with huge pages and libvirt
On Mon, Feb 02, 2015 at 11:44:37AM -0600, Chris Friesen wrote: > On 02/02/2015 11:00 AM, Sahid Orentino Ferdjaoui wrote: > >On Mon, Feb 02, 2015 at 10:44:09AM -0600, Chris Friesen wrote: > >>Hi, > >> > >>I'm trying to make use of huge pages as described in > >>"http://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/virt-driver-large-pages.html";. > >>I'm running kilo as of Jan 27th. > >>I've allocated 1 2MB pages on a compute node. "virsh capabilities" on > >>that node contains: > >> > >> > >> > >> > >> 67028244 > >> 16032069 > >> 5000 > >> 1 > >>... > >> > >> 67108864 > >> 16052224 > >> 5000 > >> 1 > >> > >> > >>I then restarted nova-compute, I set "hw:mem_page_size=large" on a > >>flavor, and then tried to boot up an instance with that flavor. I > >>got the error logs below in nova-scheduler. Is this a bug? > > > >Hello, > > > >Launchpad.net could be more appropriate to > >discuss on something which looks like a bug. > > > > https://bugs.launchpad.net/nova/+filebug > > Just wanted to make sure I wasn't missing something. Bug has been opened at > https://bugs.launchpad.net/nova/+bug/1417201 > > I added some additional logs to the bug report of what the numa topology > looks like on the compute node and in NUMATopologyFilter.host_passes(). > > >According to your trace I would say you are running different versions > >of Nova services. > > nova should all be the same version. I'm running juno versions of other > openstack components though. Hum if I understand well and according your issue reported to launchpad.net https://bugs.launchpad.net/nova/+bug/1417201 You are trying to test hugepages under kilo which it is not possible since it has been implemented in this release (Juno, not yet published) I have tried to reproduce your issue with trunk but I have not been able to do it. Please reopen the bug with more information of your env if still present. I should received any notification from it. Thanks, s. > >BTW please verify your version of libvirt. 
Hugepages is supported > >start to 1.2.8 (but this should difinitly not failed so badly like > >that) > > Libvirt is 1.2.8. > Chris > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] problems with huge pages and libvirt
On Mon, Feb 02, 2015 at 11:51:47AM -0500, Jay Pipes wrote: > This is a bug that I discovered when fixing some of the NUMA related nova > objects. I have a patch that should fix it up shortly. I have never seen this issue; it would be great to have a bug reported. > This is what happens when we don't have any functional testing of stuff that > is merged into master... > Best, > -jay Thanks, s. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] problems with huge pages and libvirt
On Mon, Feb 02, 2015 at 10:44:09AM -0600, Chris Friesen wrote: > Hi, > > I'm trying to make use of huge pages as described in > "http://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/virt-driver-large-pages.html";. > I'm running kilo as of Jan 27th. > I've allocated 1 2MB pages on a compute node. "virsh capabilities" on > that node contains: > > > > > 67028244 > 16032069 > 5000 > 1 > ... > > 67108864 > 16052224 > 5000 > 1 > > > I then restarted nova-compute, I set "hw:mem_page_size=large" on a > flavor, and then tried to boot up an instance with that flavor. I > got the error logs below in nova-scheduler. Is this a bug? Hello, Launchpad would be a more appropriate place to discuss something which looks like a bug: https://bugs.launchpad.net/nova/+filebug According to your trace, I would say you are running different versions of the Nova services. BTW, please verify your version of libvirt. Huge pages are supported starting with 1.2.8 (but it should definitely not fail as badly as that). s. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] How to debug test cases in Openstack
On Fri, Jan 16, 2015 at 11:07:07AM +0530, Abhishek Talwar/HYD/TCS wrote: > Hi, > > I have been trying to debug the test cases in OpenStack, but I am not getting > successful with it. So if someone can help me with that. The last response > from the dev-list was to use " $ ./run_tests.sh -d [test module path] > " but this gives bDb quit error. There are several ways to execute the unit tests: you can use tox or run_tests.sh, or, for a more specific test, I prefer to enter the venv and use nose. s. > So kindly help me this. > -- > Thanks and Regards > Abhishek Talwar > =-=-= > Notice: The information contained in this e-mail > message and/or attachments to it may contain > confidential or privileged information. If you are > not the intended recipient, any dissemination, use, > review, distribution, printing or copying of the > information contained in this e-mail message > and/or attachments to it are strictly prohibited. If > you have received this communication in error, > please notify us by reply e-mail or telephone and > immediately and permanently delete the message > and any attachments. Thank you > > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
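As a sketch of the "more specific test" idea above, the stdlib unittest machinery can load and run a single named test method directly (hypothetical example; in a real tree the test case would come from the project's own test modules):

```python
import io
import unittest


class FooTest(unittest.TestCase):
    """A stand-in for a real test case in the tree."""

    def test_bar(self):
        self.assertEqual(1 + 1, 2)

    def test_baz(self):
        self.assertTrue(True)


# Run only FooTest.test_bar -- this is essentially what runners do
# under the hood when you hand them a dotted test path.
suite = unittest.TestSuite([FooTest("test_bar")])
runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
result = runner.run(suite)
```

This runs exactly one test out of the two defined, which is handy when iterating on a single failing case without waiting for the whole suite.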
Re: [openstack-dev] [nova] serial-console *replaces* console-log file?
On Fri, Jan 09, 2015 at 09:15:39AM +0800, Lingxian Kong wrote: > There is an excellent post describing this, for your information: > http://blog.oddbit.com/2014/12/22/accessing-the-serial-console-of-your-nova-servers/ Good reference; you can also get some information here: https://review.openstack.org/#/c/132269/ > 2015-01-07 22:38 GMT+08:00 Markus Zoeller : > > The blueprint "serial-ports" introduced a serial console connection > > to an instance via websocket. I'm wondering > > * why enabling the serial console *replaces* writing into log file [1]? > > * how one is supposed to retrieve the boot messages *before* one connects? A good point of using the serial console is that, with a few lines of Python, you can create an interactive console to debug your virtual machine. s. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
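As a sketch of that idea: nova hands back a ws://host:port/?token=... URL for the serial console, and any websocket client can then talk to the guest's serial port. The helper below is stdlib-only and the host, port, and token are made-up values; a real interactive client would pair it with a websocket library:

```python
from urllib.parse import urlencode, urlsplit


def build_console_url(base, token):
    """Attach the one-time auth token to the serial console ws:// URL."""
    return "%s?%s" % (base, urlencode({"token": token}))


# Hypothetical values: the real URL and token come from the API call
# behind `nova get-serial-console <server>`.
url = build_console_url("ws://127.0.0.1:6083/", "b4f9c3a1")
scheme = urlsplit(url).scheme

# A real interactive console would now connect, e.g. with the
# third-party websocket-client package (not executed here):
#   ws = websocket.create_connection(url)
#   ws.send("root\n"); print(ws.recv())
```

The token is single-use and short-lived, so the URL has to be fetched fresh each time you want to attach.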
Re: [openstack-dev] [nova] Volunteer for BP 'Improve Nova KVM IO support'
On Fri, Dec 19, 2014 at 11:36:03AM +0800, Rui Chen wrote: > Hi, > > Is Anybody still working on this nova BP 'Improve Nova KVM IO support'? > https://blueprints.launchpad.net/nova/+spec/improve-nova-kvm-io-support This feature is already in review; since it only adds an option to libvirt, I guess we can consider not requiring a spec, but I may be wrong. https://review.openstack.org/#/c/117442/ s. > I willing to complement nova-spec and implement this feature in kilo or > subsequent versions. > > Feel free to assign this BP to me, thanks:) > > Best Regards. > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] anyway to pep8 check on a specified file
On Mon, Dec 15, 2014 at 09:37:23AM +, Daniel P. Berrange wrote: > On Mon, Dec 15, 2014 at 05:04:59PM +0800, Chen CH Ji wrote: > > > > tox -e pep8 usually takes several minutes on my test server, actually I > > only changes one file and I knew something might wrong there > > anyway to only check that file? Thanks a lot > > Use > > ./run_tests.sh -8 > > > That will only check pep8 against the files listed in the current > commit. If you want to check an entire branch patch series then > > git rebase -i master -x './run_tests.sh -8' Really useful point! s. > Regards, > Daniel > -- > |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| > |: http://libvirt.org -o- http://virt-manager.org :| > |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| > |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Kilo specs review day
On Thu, Dec 11, 2014 at 08:41:49AM +1100, Michael Still wrote: > Hi, > > at the design summit we said that we would not approve specifications > after the kilo-1 deadline, which is 18 December. Unfortunately, we’ve > had a lot of specifications proposed this cycle (166 to my count), and > haven’t kept up with the review workload. > > Therefore, I propose that Friday this week be a specs review day. We > need to burn down the queue of specs needing review, as well as > abandoning those which aren’t getting regular updates based on our > review comments. > > I’d appreciate nova-specs-core doing reviews on Friday, but its always > super helpful when non-cores review as well. Sure, it could be *super* useful :) - I will try to help in this way. > A +1 for a developer or > operator gives nova-specs-core a good signal of what might be ready to > approve, and that helps us optimize our review time. > > For reference, the specs to review may be found at: > > > https://review.openstack.org/#/q/project:openstack/nova-specs+status:open,n,z > > Thanks heaps, > Michael > > -- > Rackspace Australia > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [hacking] proposed rules drop for 1.0
On Tue, Dec 09, 2014 at 08:16:34AM -0500, Sean Dague wrote: > On 12/09/2014 07:32 AM, Sahid Orentino Ferdjaoui wrote: > > On Tue, Dec 09, 2014 at 06:39:43AM -0500, Sean Dague wrote: > >> I'd like to propose that for hacking 1.0 we drop 2 groups of rules > >> entirely. > >> > >> 1 - the entire H8* group. This doesn't function on python code, it > >> functions on git commit message, which makes it tough to run locally. It > >> also would be a reason to prevent us from not rerunning tests on commit > >> message changes (something we could do after the next gerrit update). > > > > -1, We probably want to recommend a git commit message more stronger > > formatted mainly about the first line which is the most important. It > > should reflect which part of the code the commit is attended to update > > that gives the ability for contributors to quickly see on what the > > submission is related; > > > > An example with Nova which is quite big: api, compute, > > doc, scheduler, virt, vmware, libvirt, objects... > > > > We should to use a prefix in the first line of commit message. There > > is a large number of commits waiting for reviews, that can help > > contributors with a knowledge in a particular domain to identify > > quickly which one to pick. > > And how exactly do you expect a machine to decide if that's done correctly? Let's keep what we already have, then let the community move forward on how to make those rules better. Is removing machine validations to turn them into human validations really something we want? Contributors already have a ton of work, and I guess we agree the aim is not to remove validations just so everything shows green on the dashboard. s. > -Sean > > > > >> 2 - the entire H3* group - because of this - > >> https://review.openstack.org/#/c/140168/2/nova/tests/fixtures.py,cm > >> > >> A look at the H3* code shows that it's terribly complicated, and is > >> often full of bugs (a few bit us last week). I'd rather just delete it > >> and move on.
> >> > >>-Sean > >> > >> -- > >> Sean Dague > >> http://dague.net > >> > >> ___ > >> OpenStack-dev mailing list > >> OpenStack-dev@lists.openstack.org > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > > > ___ > > OpenStack-dev mailing list > > OpenStack-dev@lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > > > > -- > Sean Dague > http://dague.net > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] People of OpenStack (and their IRC nicks)
On Tue, Dec 09, 2014 at 10:46:12AM +, Matthew Gilliard wrote: > Sometimes, I want to ask the author of a patch about it on IRC. > However, there doesn't seem to be a reliable way to find out someone's > IRC handle. The potential for useful conversation is sometimes > missed. Unless there's a better alternative which I didn't find, > https://wiki.openstack.org/wiki/People seems to fulfill that purpose, > but is neither complete nor accurate. > > What do people think about this? Should we put more effort into > keeping the People wiki up-to-date? That's a(nother) manual process > though - can we autogenerate it somehow? We probably don't want to maintain another wiki page. We could instead recommend, in the how-to-contribute guide, that people fill in the IRC field on Launchpad correctly, since it is closely tied to OpenStack. https://wiki.openstack.org/wiki/How_To_Contribute s. > Matthew > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [hacking] proposed rules drop for 1.0
On Tue, Dec 09, 2014 at 06:39:43AM -0500, Sean Dague wrote: > I'd like to propose that for hacking 1.0 we drop 2 groups of rules entirely. > > 1 - the entire H8* group. This doesn't function on python code, it > functions on git commit message, which makes it tough to run locally. It > also would be a reason to prevent us from not rerunning tests on commit > message changes (something we could do after the next gerrit update). -1. We probably want to recommend a more strongly formatted git commit message, mainly regarding the first line, which is the most important. It should reflect which part of the code the commit is intended to update; that gives contributors the ability to quickly see what the submission relates to. An example with Nova, which is quite big: api, compute, doc, scheduler, virt, vmware, libvirt, objects... We should use a prefix in the first line of the commit message. There is a large number of commits waiting for review; a prefix can help contributors with knowledge of a particular domain to quickly identify which ones to pick. > 2 - the entire H3* group - because of this - > https://review.openstack.org/#/c/140168/2/nova/tests/fixtures.py,cm > > A look at the H3* code shows that it's terribly complicated, and is > often full of bugs (a few bit us last week). I'd rather just delete it > and move on. > > -Sean > > -- > Sean Dague > http://dague.net > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
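A machine check for the prefix convention proposed above could be sketched as follows (hypothetical code, not an actual hacking rule; the prefix list is illustrative):

```python
# Illustrative subsystem prefixes for a project like nova.
KNOWN_PREFIXES = {"api", "compute", "doc", "scheduler", "virt",
                  "vmware", "libvirt", "objects"}


def check_commit_title(title):
    """True if the first line looks like '<subsystem>: <summary>'."""
    prefix, sep, summary = title.partition(":")
    return bool(sep) and prefix.strip() in KNOWN_PREFIXES and bool(summary.strip())


ok = check_commit_title("libvirt: report huge pages in NUMA topology")
bad = check_commit_title("Fix a bug")
```

Answering Sean's objection: the machine can at least verify the mechanical part (a known prefix followed by a non-empty summary), leaving the judgement of whether the prefix is the *right* one to human reviewers.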
Re: [openstack-dev] [Nova] Spring cleaning nova-core
On Fri, Dec 05, 2014 at 01:41:59PM +, Daniel P. Berrange wrote: > On Fri, Dec 05, 2014 at 11:05:28AM +1100, Michael Still wrote: > > One of the things that happens over time is that some of our core > > reviewers move on to other projects. This is a normal and healthy > > thing, especially as nova continues to spin out projects into other > > parts of OpenStack. > > > > However, it is important that our core reviewers be active, as it > > keeps them up to date with the current ways we approach development in > > Nova. I am therefore removing some no longer sufficiently active cores > > from the nova-core group. > > > > I’d like to thank the following people for their contributions over the > > years: > > > > * cbehrens: Chris Behrens > > * vishvananda: Vishvananda Ishaya > > * dan-prince: Dan Prince > > * belliott: Brian Elliott > > * p-draigbrady: Padraig Brady > > > > I’d love to see any of these cores return if they find their available > > time for code reviews increases. > > What stats did you use to decide whether to cull these reviewers ? Looking > at the stats over a 6 month period, I think Padraig Brady is still having > a significant positive impact on Nova - on a par with both cerberus and > alaski who you've not proposing for cut. 
I think we should keep Padraig > on the team, but probably suggest cutting Markmc instead > > http://russellbryant.net/openstack-stats/nova-reviewers-180.txt > >
> +------------------+------------------------------------------+----------------+
> | Reviewer         | Reviews   -2   -1   +1   +2   +A   +/- % | Disagreements* |
> +------------------+------------------------------------------+----------------+
> |   berrange **    |    1766   26  435   12 1293  357   73.9% |  157 (  8.9%)  |
> |   jaypipes **    |    1359   11  378  436  534  133   71.4% |  109 (  8.0%)  |
> |       jogo **    |    1053  131  326    7  589  353   56.6% |   47 (  4.5%)  |
> |      danms **    |     921   67  381   23  450  167   51.4% |   32 (  3.5%)  |
> |    oomichi **    |     889    4  306   55  524  182   65.1% |   40 (  4.5%)  |
> | johngarbutt **   |     808  319  227   10  252  145   32.4% |   37 (  4.6%)  |
> |    mriedem **    |     642   27  279   25  311  136   52.3% |   17 (  2.6%)  |
> |    klmitch **    |     606    1   90    2  513   70   85.0% |   67 ( 11.1%)  |
> |   ndipanov **    |     588   19  179   10  380  113   66.3% |   62 ( 10.5%)  |
> | mikalstill **    |     564   31   34    3  496  207   88.5% |   20 (  3.5%)  |
> |    cyeoh-0 **    |     546   12  207   30  297  103   59.9% |   35 (  6.4%)  |
> |     sdague **    |     511   23   89    6  393  229   78.1% |   25 (  4.9%)  |
> |   russellb **    |     465    6   83    0  376  158   80.9% |   23 (  4.9%)  |
> |     alaski **    |     415    1   65   21  328  149   84.1% |   24 (  5.8%)  |
> |   cerberus **    |     405    6   25   48  326  102   92.3% |   33 (  8.1%)  |
> | p-draigbrady **  |     376    2   40    9  325   64   88.8% |   49 ( 13.0%)  |
> |     markmc **    |     243    2   54    3  184   69   77.0% |   14 (  5.8%)  |
> |   belliott **    |     231    1   68    5  157   35   70.1% |   19 (  8.2%)  |
> | dan-prince **    |     178    2   48    9  119   29   71.9% |   11 (  6.2%)  |
> |   cbehrens **    |     132    2   49    2   79   19   61.4% |    6 (  4.5%)  |
> | vishvananda **   |      54    0    5    3   46   15   90.7% |    5 (  9.3%)  |
> +------------------+------------------------------------------+----------------+

+1. Padraig has given us several robust reviews on important topics; losing him would make the work on Nova more difficult.

s.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] [libvirt] enabling per node filtering of mempage sizes
On Tue, Dec 02, 2014 at 07:44:23PM +, Mooney, Sean K wrote: > Hi all > > I have submitted a small blueprint to allow filtering of available memory > pages > Reported by libvirt. Can you address this with host aggregates? That would also avoid doing something specific in the libvirt driver, which would have to be extended to the other drivers in the end. > https://blueprints.launchpad.net/nova/+spec/libvirt-allowed-mempage-sizes > > I believe that this change is small enough to not require a spec as per > http://docs.openstack.org/developer/nova/devref/kilo.blueprints.html > > if a core (and others are welcome too :)) has time to review my blueprint and > confirm > that a spec is not required I would be grateful as the spd is rapidly > approaching > > I have wip code developed which I hope to make available for review once > I add unit tests. > > All relevant detail (copied below) are included in the whiteboard for the > blueprint. > > Regards > Sean > > Problem description > === > > In the Kilo cycle, the virt drivers large pages feature[1] was introduced > to allow a guests to request the type of memory backing that they desire > via a flavor or image metadata. > > In certain configurations, it may be desired or required to filter the > memory pages available to vms booted on a node. At present no mechanism > exists to allow filtering of reported memory pages. > > Use Cases > -- > > On a host that only supports vhost-user or ivshmem, > all VMs are required to use large page memory. > If a vm is booted with standard pages with these interfaces, > network connectivity will not available. > > In this case it is desirable to filter out small/4k pages when reporting > available memory to the scheduler. > > Proposed change > === > > This blueprint proposes adding a new config variable (allowed_memory_pagesize) > to the libvirt section of the nova.conf.
> > cfg.ListOpt('allowed_memory_pagesize', > default=['any'], > help='List of allowed memory page sizes' > 'Syntax is SizeA,SizeB e.g. small,large' > 'valid sizes are: small,large,any,4,2048,1048576') > > The _get_host_capabilities function in nova/nova/virt/libvirt/driver.py > will be modified to filter the mempages reported for each cell based on the > value of CONF.libvirt.allowed_memory_pagesize > > If small is set then only 4k pages will be reported. > If large is set 2MB and 1GB will be reported. > If any is set no filtering will be applied. > > The default value of "any" was chosen to ensure that this change has no > effect on > existing deployment. > > References > == > [1] - https://blueprints.launchpad.net/nova/+spec/virt-driver-large-pages > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
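The filtering described in the blueprint is simple enough to sketch. This is a hypothetical illustration, not merged Nova code: the helper name `filter_mempages` is made up, and the accepted tokens ('small', 'large', 'any', or explicit sizes in KiB) follow the option help text quoted above.

```python
# Hypothetical sketch of the proposed mempage filtering; names and
# constants follow the blueprint text, not actual Nova code.
SMALL_PAGE_KB = 4
LARGE_PAGES_KB = (2048, 1048576)  # 2 MB and 1 GB pages

def filter_mempages(reported_sizes_kb, allowed):
    """Filter the page sizes libvirt reports for a NUMA cell.

    reported_sizes_kb: iterable of page sizes in KiB, e.g. [4, 2048, 1048576]
    allowed: value of CONF.libvirt.allowed_memory_pagesize, e.g.
             ['any'], ['small'], ['large'], or explicit sizes ['4', '2048']
    """
    if 'any' in allowed:
        # No filtering applied, the documented default.
        return list(reported_sizes_kb)
    permitted = set()
    for token in allowed:
        if token == 'small':
            permitted.add(SMALL_PAGE_KB)
        elif token == 'large':
            permitted.update(LARGE_PAGES_KB)
        else:
            permitted.add(int(token))
    return [size for size in reported_sizes_kb if size in permitted]
```

With this shape, the hook in `_get_host_capabilities` would just pass each cell's reported sizes through the helper before handing them to the scheduler.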
Re: [openstack-dev] [nova] Proposal new hacking rules
On Fri, Nov 21, 2014 at 10:30:59AM -0800, Joe Gordon wrote:
> On Fri, Nov 21, 2014 at 8:57 AM, Sahid Orentino Ferdjaoui <
> sahid.ferdja...@redhat.com> wrote:
>
> > On Thu, Nov 20, 2014 at 02:00:11PM -0800, Joe Gordon wrote:
> > > On Thu, Nov 20, 2014 at 9:49 AM, Sahid Orentino Ferdjaoui <
> > > sahid.ferdja...@redhat.com> wrote:
> > >
> > > > This is something we can call nitpiking or low priority.
> > >
> > > This all seems like nitpicking for very little value. I think there are
> > > better things we can be focusing on instead of thinking of new ways to
> > > nit pick. So I am -1 on all of these.
> >
> > Yes as written this is low priority but something necessary for a
> > project like Nova it is.
>
> Why do you think this is necessary?

Your question makes sense: you/Nova want to focus engineering time on more
important development. I would like to help with my humble experience.

* Give developers a chance to know what value was expected when they break
  a test.
* Let developers know whether to use warn or warning, instead of losing
  time looking through the module for what was used, or flipping a coin.

Contributors are expected to read HACKING.rst, and some of these rules can
be tested by the gate.

> > Considered that I feel sad to take your time. Can I suggest you to
> > take no notice of this and let's others developers working on Nova too
> > do this job ?
>
> As the maintainer of openstack-dev/hacking and as a nova core, I don't
> think this is worth doing at all. Nova already has enough on its plate and
> doesn't need extra code to review.

My point was not to discredit your opinion (my phrasing may be off; I am
not a native English speaker). I believe that you, and contributors in
general like me, are working to make Nova better. Usually in open source
software contributors are welcome to help, even if it is just to fix a
typo, and I was hoping to help.
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Proposal new hacking rules
On Fri, Nov 21, 2014 at 05:23:28PM -0500, Matthew Treinish wrote: > On Fri, Nov 21, 2014 at 04:15:07PM -0500, Sean Dague wrote: > > On 11/21/2014 01:52 PM, Matthew Treinish wrote: > > > On Fri, Nov 21, 2014 at 07:15:49PM +0100, jordan pittier wrote: > > >> Hey, > > >> I am not a Nova developer but I still have an opinion. > > >> > > >>> Using boolean assertions > > >> I like what you propose. We should use and enforce the assert* that best > > >> matches the intention. It's about semantic and the more precise we are, > > >> the better. > > >> > > >>> Using same order of arguments in equality assertions > > >> Why not. But I don't know how we can write a Hacking rule for this. So > > >> you may fix all the occurrences for this now, but it might get back in > > >> the future. > > > > > > Ok I'll bite, besides the enforceability issue which you pointed out, it > > > just > > > doesn't make any sense, you're asserting 2 things are equal: (A == B) == > > > (B == A) > > > and I honestly feel that it goes beyond nitpicking because of that. > > > > > > It's also a fallacy that there will always be an observed value and an > > > expected value. For example: > > > > > > self.assertEqual(method_a(), method_b()) > > > > > > Which one is observed and which one is expected? I think this proposal is > > > just > > > reading into the parameter names a bit too much. > > > > If you are using assertEqual with 2 variable values... you are doing > > your test wrong. > > > > I was originally in your camp. But honestly, the error message provided > > to the user does say expected and observed, and teaching everyone that > > you have to ignore the error message is a much harder thing to do than > > flip the code to conform to it, creating less confusion. > > > > Uhm, no it doesn't, the default error message is "A != B". 
> [1][2][3] (both with unittest and testtools). If the error message was
> like that, then sure, saying one way was right over the other would be
> fine (assuming you didn't specify a different error message), but that's
> not what it does.
>
> [1] https://github.com/testing-cabal/testtools/blob/master/testtools/testcase.py#L340
> [2] https://github.com/testing-cabal/testtools/blob/master/testtools/matchers/_basic.py#L85
> [3] https://hg.python.org/cpython/file/301d62ef5c0b/Lib/unittest/case.py#l508

...
  File "/opt/stack/nova/.venv/lib/python2.7/site-packages/testtools/testcase.py", line 348, in assertEqual
    self.assertThat(observed, matcher, message)
  File "/opt/stack/nova/.venv/lib/python2.7/site-packages/testtools/testcase.py", line 433, in assertThat
    raise mismatch_error
MismatchError: !=:
reference = {'nova_object.changes': ['cells'], 'nova_object.data': {...
actual    = {'nova_object.changes': ['cells'], 'nova_object.data': {...
...

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
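Matthew's point about the default message can be checked directly against the stdlib: plain `assertEqual` produces a symmetric "A != B" message that never labels either side as expected or observed. A small probe (the `_Probe` class is only scaffolding for this demo):

```python
import unittest

class _Probe(unittest.TestCase):
    """Throwaway TestCase so we can call assertEqual outside a test run."""
    def runTest(self):
        pass

def default_message(first, second):
    """Return the message assertEqual raises by default, or None on success."""
    try:
        _Probe().assertEqual(first, second)
    except AssertionError as exc:
        return str(exc)
    return None

# The default failure message is symmetric; swapping the arguments only
# swaps the operands, it never says which one was "expected".
print(default_message(1, 2))  # 1 != 2
print(default_message(2, 1))  # 2 != 1
```

This is the stdlib behaviour; testtools' richer MismatchError with reference/actual labels (shown in the traceback above) is a separate code path.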
Re: [openstack-dev] [nova] Proposal new hacking rules
On Fri, Nov 21, 2014 at 01:52:50PM -0500, Matthew Treinish wrote:
> On Fri, Nov 21, 2014 at 07:15:49PM +0100, jordan pittier wrote:
> > Hey,
> > I am not a Nova developer but I still have an opinion.
> >
> > > Using boolean assertions
> > I like what you propose. We should use and enforce the assert* that best
> > matches the intention. It's about semantics, and the more precise we
> > are, the better.
> >
> > > Using same order of arguments in equality assertions
> > Why not. But I don't know how we can write a hacking rule for this. So
> > you may fix all the occurrences now, but it might come back in the
> > future.

Let's write these rules in HACKING.rst; developers and reviewers are
expected to read it.

> Ok I'll bite, besides the enforceability issue which you pointed out, it
> just doesn't make any sense, you're asserting 2 things are equal:
> (A == B) == (B == A), and I honestly feel that it goes beyond nitpicking
> because of that.
>
> It's also a fallacy that there will always be an observed value and an
> expected value. For example:
>
>     self.assertEqual(method_a(), method_b())
>
> Which one is observed and which one is expected? I think this proposal is
> just reading into the parameter names a bit too much.

It lets a developer know what value was expected when a test breaks during
development, without looking into the test-case code. Operators may also
want to know which values were expected/observed when running tests,
without reading the code.

> > > Using LOG.warn instead of LOG.warning
> > I am -1 on this. The part that comes after LOG. (LOG.warning, LOG.error,
> > LOG.debug, etc.) is the log level, it's not a verb. In syslog, the
> > well-known log level is "warning", so the correct method to use here is,
> > imo, log.warning().

Well, I chose 'warn' because it means fewer changes, but I do not have any
preference; I just want something clear from Nova.
> > Have you considered submitting these hacking rules to the hacking
> > project here: https://github.com/openstack-dev/hacking ? I am sure
> > these new rules make sense on other OpenStack projects.

Let's get them accepted by the Nova community first, before thinking about
other projects ;)

> > Jordan
> >
> > - Original Message -
> > From: "Sahid Orentino Ferdjaoui"
> > To: "OpenStack Development Mailing List (not for usage questions)"
> > Sent: Friday, November 21, 2014 5:57:14 PM
> > Subject: Re: [openstack-dev] [nova] Proposal new hacking rules
> >
> > On Thu, Nov 20, 2014 at 02:00:11PM -0800, Joe Gordon wrote:
> > > On Thu, Nov 20, 2014 at 9:49 AM, Sahid Orentino Ferdjaoui <
> > > sahid.ferdja...@redhat.com> wrote:
> > >
> > > > This is something we can call nitpiking or low priority.
> > >
> > > This all seems like nitpicking for very little value. I think there
> > > are better things we can be focusing on instead of thinking of new
> > > ways to nit pick. So I am -1 on all of these.
> >
> > Yes as written this is low priority but something necessary for a
> > project like Nova it is.
> >
> > Considered that I feel sad to take your time. Can I suggest you to
> > take no notice of this and let's others developers working on Nova too
> > do this job ?
>
> ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
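For reference on Jordan's suggestion, a hacking rule is just a flake8-style plugin: a function taking a logical line and yielding (offset, message) pairs. A minimal sketch for the LOG.warn/LOG.warning rule discussed in this thread, parameterized on whichever spelling is finally chosen; the rule number N3xx and factory name are placeholders, not an actual hacking check:

```python
import re

def make_log_warn_check(preferred="warn"):
    """Build a hacking-style check enforcing one spelling of the
    warning logger call. 'N3xx' is a made-up rule number."""
    banned = "warning" if preferred == "warn" else "warn"
    pattern = re.compile(r"\bLOG\.%s\(" % banned)

    def check(logical_line):
        # hacking/flake8 checks yield (column offset, message) per violation.
        match = pattern.search(logical_line)
        if match:
            yield (match.start(),
                   "N3xx: use LOG.%s instead of LOG.%s" % (preferred, banned))

    return check
```

In a real hacking plugin the `check` function would be registered through the package's flake8 entry points so the gate runs it on every logical line.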
Re: [openstack-dev] [nova] Proposal new hacking rules
On Thu, Nov 20, 2014 at 02:00:11PM -0800, Joe Gordon wrote:
> On Thu, Nov 20, 2014 at 9:49 AM, Sahid Orentino Ferdjaoui <
> sahid.ferdja...@redhat.com> wrote:
>
> > This is something we can call nitpiking or low priority.
>
> This all seems like nitpicking for very little value. I think there are
> better things we can be focusing on instead of thinking of new ways to
> nit pick. So I am -1 on all of these.

Yes, as written this is low priority, but it is something necessary for a
project like Nova. Given that, I feel bad about taking your time. May I
suggest you take no notice of this and let the other developers working on
Nova do this job?

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] Proposal new hacking rules
This is something we could call nitpicking, or low priority. I would like
us to introduce 3 new hacking rules to enforce cohesion and consistency in
the code base.

Using boolean assertions
------------------------
Some tests are written with equality assertions to validate boolean
conditions, which is not clean:

    assertFalse([])         asserts an empty list
    assertEqual(False, [])  asserts an empty list is equal to the boolean
                            value False, which is not correct.

Some changes have been started here but still need to be appreciated by
the community:
* https://review.openstack.org/#/c/133441/
* https://review.openstack.org/#/c/119366/

Using the same order of arguments in equality assertions
--------------------------------------------------------
Most of the code is written with assertEqual(Expected, Observed), but some
parts still use the opposite. Even if neither order provides any real
optimisation, using the same convention helps reviewing and keeps better
consistency in the code.

    assertEqual(Expected, Observed)  OK
    assertEqual(Observed, Expected)  KO

A change has been started here but still needs to be appreciated by the
community:
* https://review.openstack.org/#/c/119366/

Using LOG.warn instead of LOG.warning
-------------------------------------
We have seen reviewers -1 a patch many times to ask the developer to use
'warn' instead of 'warning'. This will provide no optimisation, but let's
finally have something clear about what we have to use.

    LOG.warning:  74
    LOG.warn:    319

We probably want to use 'warn'.

Nothing has been started on this, as far as I know.

Thanks,
s.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
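The boolean-assertion point above can be demonstrated with the stdlib directly: `assertFalse([])` passes because it checks falsiness, while `assertEqual(False, [])` fails because `False == []` is not true.

```python
import unittest

class BooleanAssertionExamples(unittest.TestCase):
    def test_assert_false_on_empty_list(self):
        # assertFalse checks falsiness, so an empty list passes.
        self.assertFalse([])

    def test_equality_with_false_fails(self):
        # assertEqual(False, []) fails because False == [] evaluates to
        # False; this is exactly the pattern the proposed rule flags.
        with self.assertRaises(AssertionError):
            self.assertEqual(False, [])
```

Both tests pass, which is the whole argument: the two assertions are not interchangeable, so the rule would force the one matching the author's intent.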
Re: [openstack-dev] questions on result of diagnostics on libvirt
On Tue, Nov 11, 2014 at 01:59:12PM +0800, Chen CH Ji wrote:
>
> see the error value of diagnostics is huge, but I don't think my disk is
> that bad ... is this wrong info or wrong usage of libvirt?
> Also, all the disks having the same error number makes me curious, any
> guide?

Considering you are using libvirt/KVM, libvirt asks qemu for the block
stats. You can begin by executing the equivalent command through libvirt
and comparing the results:

    virsh domstats <domain>

If the results are equivalent, we can probably rule out Nova as the source
of the problem. You could also connect to the VM, or use libguestfs, to
execute some commands and get information about the errors.

s.

> jichen@cloudcontroller:/opt/stack/nova/nova$ nova diagnostics jieph1
> +---------------------------+----------------------+
> | Property                  | Value                |
> +---------------------------+----------------------+
> | cpu0_time                 | 1071000              |
> | hdd_errors                | 18446744073709551615 |
> ...
> | tapf1ce9c02-01_tx_packets | 24                   |
> | vda_errors                | 18446744073709551615 |
> | vda_read                  | 2848768              |
> | vda_read_req              | 453                  |
> | vda_write                 | 348160               |
> | vda_write_req             | 105                  |
> | vdb_errors                | 18446744073709551615 |
> | vdb_read                  | 829440               |
> | vdb_read_req              | 30                   |
> | vdb_write                 | 4096                 |
>
> Best Regards!
>
> Kevin (Chen) Ji 纪 晨
>
> Engineer, zVM Development, CSTL
> Notes: Chen CH Ji/China/IBM@IBMCN Internet: jiche...@cn.ibm.com
> Phone: +86-10-82454158
> Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District,
> Beijing 100193, PRC
> ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
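One observation worth adding about the suspicious error counts: 18446744073709551615 is exactly 2**64 - 1, i.e. the bit pattern of -1 stored in an unsigned 64-bit field. Libvirt's block-stats API fills fields the hypervisor does not report with -1, so this value most likely means "error counter not provided by qemu" rather than a failing disk, which also explains why every disk shows the same number.

```python
REPORTED = 18446744073709551615

# The reported value is exactly the all-ones 64-bit pattern.
assert REPORTED == 2**64 - 1

def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as signed two's complement."""
    return value - 2**64 if value >= 2**63 else value

# Reinterpreted as signed, it is libvirt's "not supported" marker.
print(as_signed64(REPORTED))  # -1
```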
Re: [openstack-dev] [nova] How to connect to a serial port of an instance via websocket?
On Tue, Oct 28, 2014 at 03:09:44PM +0100, Markus Zoeller wrote:
> The API provides an endpoint for querying the serial console of an
> instance ('os-getSerialConsole'). The nova-client interacts with this
> API endpoint via the command `get-serial-console`.
>
>     nova get-serial-console myInstance
>
> It returns a string like:
>
>     ws://127.0.0.1:6083/?token=e2b42240-375d-41fe-a166-367e4bbdce35
>
> Q: How is one supposed to connect to such a websocket?

The aim of the feature is to expose interactive web-based serial consoles
through a websocket proxy. The API returns a URL with a valid token that
should be used with a websocket client to read/write on the stream.

Assuming the nova-serialproxy service is running and well configured, you
can use this simple test-purpose client to connect to the URL returned by
the API: https://gist.github.com/sahid/894c31f306bebacb2207

The general idea behind this service is, for example, to help debug VMs
when something is wrong with the network configuration.

s.

> [1] https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/contrib/consoles.py#L111
> [2] https://ask.openstack.org/en/question/50671/how-to-connect-to-a-serial-port-of-an-instance-via-websocket/
>
> Regards,
> Markus Zoeller
> IRC: markus_z
>
> ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
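Whatever websocket client is used, the first step is pulling the proxy endpoint and the one-time token out of the returned URL, since the proxy validates the token on connect. A small sketch using only the stdlib (the helper name is ours, for illustration); any websocket library can then be pointed at the full URL, as the test client linked above does:

```python
from urllib.parse import urlsplit, parse_qs

def parse_serial_console_url(url):
    """Split the ws:// URL returned by os-getSerialConsole into the
    proxy host, port and the token the proxy validates on connect."""
    parts = urlsplit(url)
    token = parse_qs(parts.query)["token"][0]
    return parts.hostname, parts.port, token

host, port, token = parse_serial_console_url(
    "ws://127.0.0.1:6083/?token=e2b42240-375d-41fe-a166-367e4bbdce35")
print(host, port, token)
```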
[openstack-dev] [nova] FFE request serial-ports
Hello, I would like to request a FFE for 4 changesets to complete the blueprint serial-ports. Topic on gerrit: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/serial-ports,n,z Blueprint on launchpad.net: https://blueprints.launchpad.net/nova/+spec/serial-ports They have already been approved but didn't get enough time to be merged by the gate. Sponsored by: Daniel Berrange Nikola Dipanov s. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat]Heat use as a standalone component for Cloud Managment over multi IAAS
warm is just another client, like the ones we have for the CLI. It does
not claim to do what Heat can. It should be useful for preparing templates
that can be reused in different OpenStack environments without writing
shell scripts or Python. When I said standalone client, I meant there is
no need to install services in your OpenStack cloud to use it.

Regards,
s.

- Original Message -
From: "Thomas Spatzier"
To: "OpenStack Development Mailing List (not for usage questions)"
Sent: Thursday, March 20, 2014 9:44:22 AM
Subject: Re: [openstack-dev] [Heat]Heat use as a standalone component for Cloud Managment over multi IAAS

Just out of curiosity: what is the purpose of project "warm"? From the
wiki page and the sample it looks pretty much like what Heat is doing.
And "warm" is almost "HOT", so could you imagine your use cases being
addressed by Heat using HOT templates?

Regards,
Thomas

sahid wrote on 18/03/2014 12:56:47:
> From: sahid
> To: "OpenStack Development Mailing List (not for usage questions)"
> Date: 18/03/2014 12:59
> Subject: Re: [openstack-dev] [Heat]Heat use as a standalone
> component for Cloud Managment over multi IAAS
>
> Sorry for the late response,
>
> I'm currently working on a project called Warm.
> https://wiki.openstack.org/wiki/Warm
>
> It is used as a standalone client and tries to deploy small OpenStack
> environments from YAML templates. You can find some samples here:
> https://github.com/sahid/warm-templates
>
> s.
>
> - Original Message -
> From: "Charles Walker"
> To: openstack-dev@lists.openstack.org
> Sent: Wednesday, February 26, 2014 2:47:44 PM
> Subject: [openstack-dev] [Heat]Heat use as a standalone component
> for Cloud Managment over multi IAAS
>
> Hi,
>
> I am trying to deploy the proprietary application made in my company on
> the cloud. The prerequisite for this is to have an IAAS, which can be
> either a public cloud or a private cloud (OpenStack is an option for a
> private IAAS).
> > > The first prototype I made was based on a homemade python orchestrator and > apache libCloud to interact with IAAS (AWS and Rackspace and GCE). > > The orchestrator part is a python code reading a template file which > contains the info needed to deploy my application. This template file > indicates the number of VM and the scripts associated to each VM type to > install it. > > > Now I was trying to have a look on existing open source tool to do the > orchestration part. I find JUJU (https://juju.ubuntu.com/) or HEAT ( > https://wiki.openstack.org/wiki/Heat). > > I am investigating deeper HEAT and also had a look on > https://wiki.openstack.org/wiki/Heat/DSL which mentioned: > > *"Cloud Service Provider* - A service entity offering hosted cloud services > on OpenStack or another cloud technology. Also known as a Vendor." > > > I think HEAT as its actual version will not match my requirement but I have > the feeling that it is going to evolve and could cover my needs. > > > I would like to know if it would be possible to use HEAT as a standalone > component in the future (without Nova and other Ostack modules)? The goal > would be to deploy an application from a template file on multiple cloud > service (like AWS, GCE). > > > Any feedback from people working on HEAT could help me. > > > Thanks, Charles. > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat]Heat use as a standalone component for Cloud Managment over multi IAAS
Sorry for the late response,

I'm currently working on a project called Warm.
https://wiki.openstack.org/wiki/Warm

It is used as a standalone client and tries to deploy small OpenStack
environments from YAML templates. You can find some samples here:
https://github.com/sahid/warm-templates

s.

- Original Message -
From: "Charles Walker"
To: openstack-dev@lists.openstack.org
Sent: Wednesday, February 26, 2014 2:47:44 PM
Subject: [openstack-dev] [Heat]Heat use as a standalone component for Cloud Managment over multi IAAS

Hi,

I am trying to deploy the proprietary application made in my company on
the cloud. The prerequisite for this is to have an IAAS, which can be
either a public cloud or a private cloud (OpenStack is an option for a
private IAAS).

The first prototype I made was based on a homemade Python orchestrator and
Apache libcloud to interact with the IAAS (AWS, Rackspace and GCE).

The orchestrator part is Python code reading a template file which
contains the info needed to deploy my application. This template file
indicates the number of VMs and the scripts associated with each VM type
to install it.

Now I was trying to have a look at existing open source tools to do the
orchestration part. I found JUJU (https://juju.ubuntu.com/) and HEAT
(https://wiki.openstack.org/wiki/Heat).

I am investigating HEAT more deeply and also had a look at
https://wiki.openstack.org/wiki/Heat/DSL which mentioned:

"*Cloud Service Provider* - A service entity offering hosted cloud
services on OpenStack or another cloud technology. Also known as a
Vendor."

I think HEAT in its current version will not match my requirements, but I
have the feeling that it is going to evolve and could cover my needs.

I would like to know if it would be possible to use HEAT as a standalone
component in the future (without Nova and other OpenStack modules)? The
goal would be to deploy an application from a template file on multiple
cloud services (like AWS, GCE).

Any feedback from people working on HEAT could help me.
Thanks, Charles. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] bugs that needs to be reviewed
Greetings,

There are two bug fixes that need to be reviewed: one for the shelve
instance feature and the other for the API returning the list of
migrations in progress. These two bugs are marked high and medium because
they break features. The code was pushed several months ago; it would be
great if some cores could take a look.

Fix: Unshelving an instance uses original image
https://review.openstack.org/#/c/72407/

Fix: Fix unicode error in os-migrations
https://review.openstack.org/#/c/61717/

Thanks,
s.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] bp proposal: New per-aggregate filters targeted for icehouse-3
Greetings,

I wanted to ask if some cores could take a look at these reviews. The code
was pushed two months ago and hasn't received many reviews. All of these
blueprints are approved for icehouse-3.

https://review.openstack.org/#/c/65452/
https://review.openstack.org/#/c/65108/
https://review.openstack.org/#/c/65474/

Thanks a lot,
s.

- Original Message -
From: "sahid"
To: "OpenStack Development Mailing List (not for usage questions)"
Sent: Tuesday, January 28, 2014 5:14:49 PM
Subject: [openstack-dev] [nova] bp proposal: New per-aggregate filters targeted for icehouse-3

Hi there,

The blueprint approval deadline is coming really quickly, and I understand
that there is a lot of work to do, but I would like to get your attention
on 3 new filters targeted for icehouse-3, with code already in review.

- https://blueprints.launchpad.net/nova/+spec/per-aggregate-disk-allocation-ratio
- https://blueprints.launchpad.net/nova/+spec/per-aggregate-max-instances-per-host
- https://blueprints.launchpad.net/nova/+spec/per-aggregate-max-io-ops-per-host

The main aim of these blueprints is to make these configurations
per-aggregate. These features are interesting when building a large cloud.

Can you take a small amount of time to let me know how to move forward
with them?

Thanks a lot,
s.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host
I have implemented a new monitor based on the system load averages https://review.openstack.org/#/c/74014/1 What do you think? s. - Original Message - From: "sahid" To: "OpenStack Development Mailing List (not for usage questions)" Sent: Monday, February 17, 2014 10:32:19 AM Subject: Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host yes I want to use the load average because it is based on more information than the cpu utilization. May be instead of to add a new field to hypervisor status I can create a new monitor or update the exiting one. - Original Message - From: "yunhong jiang" To: "OpenStack Development Mailing List (not for usage questions)" Sent: Saturday, February 15, 2014 12:54:52 AM Subject: Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host On Fri, 2014-02-14 at 15:29 +, sahid wrote: > Greetings, > > I would like to add a new filter based on the load averages. > > This filter will use the command uptime and will provides an option to choice > a > period between 1, 5, and 15 minutes and an option to choice the max load > average (a float between 0 and 1). > > Why: > During a scheduling it could be useful to exclude a host that have a too > heavy load and the command uptime (available in all linux system) > can return a load average of the system in different periods. > > About the implementation: > Currently 'all' drivers (libvirt, xenapi, vmware) supports a method > get_host_uptime that returns the output of the command 'uptime'. We have to > add > in compute/stats.py a new method calculate_loadavg() that returns based on the > output of driver.get_host_uptime() from compute/ressource_tracker.py a well > formatted tuple of load averages for each periods. We also need to update > api/openstack/compute/contrib/hypervisors.py to take care of this new > field. 
> > The implementation will be divided in several parts: > * Add to host_manager the possibility to get the loads_averages > * Implement the filter based on this new property > * Implement the filter with a per-aggregate configuration > > The blueprint: https://blueprints.launchpad.net/nova/+spec/filter-based-uptime > > I will be happy to get any comments about this filter, perharps it is not > implemented > yet because of something I didn't see or my thinking of the implementation is > wrong. > > PS: I have checked metrics and cpu_resource but It does not get an averages > of the > system load or perhaps I have not understand all. > > Thanks a lot, > s. > > ___ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev I think load average has more than CPU, you need consider like I/O usage, or even other metrics. Maybe you can have a look at the https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling ? Also IMHO the policy of "exclude a host that have a too heavy load" is not so clean, would it be better to keep the usage as a scheduler weight? Thanks --jyh ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host
Yes, I want to use the load average because it is based on more
information than CPU utilization alone. Maybe instead of adding a new
field to the hypervisor status I can create a new monitor or update the
existing one.

- Original Message -
From: "yunhong jiang"
To: "OpenStack Development Mailing List (not for usage questions)"
Sent: Saturday, February 15, 2014 12:54:52 AM
Subject: Re: [openstack-dev] [Nova] bp proposal: filter based on the load averages of the host

On Fri, 2014-02-14 at 15:29 +, sahid wrote:
> Greetings,
>
> I would like to add a new filter based on the load averages.
>
> This filter will use the command uptime and will provide an option to
> choose a period between 1, 5, and 15 minutes and an option to choose the
> max load average (a float between 0 and 1).
>
> Why:
> During scheduling it could be useful to exclude a host that has too
> heavy a load, and the command uptime (available on all Linux systems)
> can return a load average of the system over different periods.
>
> About the implementation:
> Currently 'all' drivers (libvirt, xenapi, vmware) support a method
> get_host_uptime that returns the output of the command 'uptime'. We have
> to add to compute/stats.py a new method calculate_loadavg() that returns,
> based on the output of driver.get_host_uptime() from
> compute/resource_tracker.py, a well-formatted tuple of load averages for
> each period. We also need to update
> api/openstack/compute/contrib/hypervisors.py to take care of this new
> field.
>
> The implementation will be divided into several parts:
> * Add to host_manager the possibility to get the load averages
> * Implement the filter based on this new property
> * Implement the filter with a per-aggregate configuration
>
> The blueprint: https://blueprints.launchpad.net/nova/+spec/filter-based-uptime
>
> I will be happy to get any comments about this filter; perhaps it is not
> implemented yet because of something I didn't see, or my thinking of the
> implementation is wrong.
>
> PS: I have checked metrics and cpu_resource but they do not provide an
> average of the system load, or perhaps I have not understood everything.
>
> Thanks a lot,
> s.
>
> ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

I think load average covers more than CPU; you need to consider things
like I/O usage, or even other metrics. Maybe you can have a look at
https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling ?

Also IMHO the policy of "exclude a host that has too heavy a load" is not
so clean; would it be better to keep the usage as a scheduler weight?

Thanks
--jyh

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Nova] bp proposal: filter based on the load averages of the host
Greetings,

I would like to add a new filter based on the load averages.

This filter will use the command uptime and will provide an option to
choose a period between 1, 5, and 15 minutes and an option to choose the
max load average (a float between 0 and 1).

Why:
During scheduling it could be useful to exclude a host that has too heavy
a load, and the command uptime (available on all Linux systems) can return
a load average of the system over different periods.

About the implementation:
Currently 'all' drivers (libvirt, xenapi, vmware) support a method
get_host_uptime that returns the output of the command 'uptime'. We have
to add to compute/stats.py a new method calculate_loadavg() that returns,
based on the output of driver.get_host_uptime() from
compute/resource_tracker.py, a well-formatted tuple of load averages for
each period. We also need to update
api/openstack/compute/contrib/hypervisors.py to take care of this new
field.

The implementation will be divided into several parts:
* Add to host_manager the possibility to get the load averages
* Implement the filter based on this new property
* Implement the filter with a per-aggregate configuration

The blueprint: https://blueprints.launchpad.net/nova/+spec/filter-based-uptime

I will be happy to get any comments about this filter; perhaps it is not
implemented yet because of something I didn't see, or my thinking of the
implementation is wrong.

PS: I have checked metrics and cpu_resource but they do not provide an
average of the system load, or perhaps I have not understood everything.

Thanks a lot,
s.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
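As a local illustration of the proposed data: the same three load averages that `uptime` prints are available via `os.getloadavg()` on Unix. Note that for the proposal's "float between 0 and 1" threshold to make sense, the raw load (which can exceed 1 on multi-CPU hosts) has to be normalized by the CPU count; this helper is a hypothetical sketch, not the blueprint's code:

```python
import os

def normalized_load_averages():
    """Return the 1-, 5- and 15-minute load averages divided by the
    number of CPUs, so a 0..1 threshold is host-size agnostic.

    Hypothetical sketch: the blueprint proposes parsing the output of
    driver.get_host_uptime() instead, since not all hypervisor hosts
    run the Python code locally.
    """
    cpus = os.cpu_count() or 1
    return tuple(load / cpus for load in os.getloadavg())

one_min, five_min, fifteen_min = normalized_load_averages()
```

A filter would then compare, say, the 5-minute value against the configured maximum and exclude the host when it is above the threshold.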
Re: [openstack-dev] [nova] Should we limit the disk IO bandwidth in copy_image while creating new instance?
It could be a good idea, but as Sylvain said, how do we configure this? Also, what about using scp instead of rsync for a local copy? - Original Message - From: "Wangpan" To: "OpenStack Development Mailing List" Sent: Friday, February 14, 2014 4:52:20 AM Subject: [openstack-dev] [nova] Should we limit the disk IO bandwidth in copy_image while creating new instance?

Currently nova doesn't limit the disk IO bandwidth in the copy_image() method while creating a new instance, so the other instances on this host may be affected by this IO-heavy operation, and some time-sensitive workloads (e.g. an RDS instance with a heartbeat) may be switched between master and slave. So can we use the `rsync --bwlimit=${bandwidth} src dst` command instead of `cp src dst` in copy_image() of the libvirt driver? The remote image copy operation can also be limited with `rsync --bwlimit=${bandwidth}` or `scp -l ${bandwidth}`. This parameter ${bandwidth} can be a new configuration option in nova.conf that the cloud admin can set; its default value is 0, which means no limitation. The instances on the host will then not be affected while a new instance with an uncached image is being created.

Example code, nova/virt/libvirt/utils.py (note the parentheses around the unit conversion; without them, `%` binds tighter than `*` and the format string itself would be repeated 1024 times):

diff --git a/nova/virt/libvirt/utils.py b/nova/virt/libvirt/utils.py
index e926d3d..5d7c935 100644
--- a/nova/virt/libvirt/utils.py
+++ b/nova/virt/libvirt/utils.py
@@ -473,7 +473,10 @@ def copy_image(src, dest, host=None):
         # sparse files. I.E. holes will not be written to DEST,
         # rather recreated efficiently. In addition, since
         # coreutils 8.11, holes can be read efficiently too.
-        execute('cp', src, dest)
+        if CONF.mbps_in_copy_image > 0:
+            execute('rsync', '--bwlimit=%s' % (CONF.mbps_in_copy_image * 1024), src, dest)
+        else:
+            execute('cp', src, dest)
     else:
         dest = "%s:%s" % (host, dest)
         # Try rsync first as that can compress and create sparse dest files.
@@ -484,11 +487,22 @@ def copy_image(src, dest, host=None):
             # Do a relatively light weight test first, so that we
             # can fall back to scp, without having run out of space
             # on the destination for example.
-            execute('rsync', '--sparse', '--compress', '--dry-run', src, dest)
+            if CONF.mbps_in_copy_image > 0:
+                execute('rsync', '--sparse', '--compress', '--dry-run',
+                        '--bwlimit=%s' % (CONF.mbps_in_copy_image * 1024), src, dest)
+            else:
+                execute('rsync', '--sparse', '--compress', '--dry-run', src, dest)
         except processutils.ProcessExecutionError:
-            execute('scp', src, dest)
+            if CONF.mbps_in_copy_image > 0:
+                execute('scp', '-l', '%s' % (CONF.mbps_in_copy_image * 1024 * 8), src, dest)
+            else:
+                execute('scp', src, dest)
         else:
-            execute('rsync', '--sparse', '--compress', src, dest)
+            if CONF.mbps_in_copy_image > 0:
+                execute('rsync', '--sparse', '--compress',
+                        '--bwlimit=%s' % (CONF.mbps_in_copy_image * 1024), src, dest)
+            else:
+                execute('rsync', '--sparse', '--compress', src, dest)

2014-02-14 Wangpan
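The unit conversions in the patch are easy to get wrong, since rsync's `--bwlimit` takes KB/s while scp's `-l` takes Kbit/s. A small helper could centralize them; this is a hedged sketch, and `mbps_in_copy_image` (MB/s, 0 = unlimited) is the config option proposed in the thread, not an existing nova option:

```python
def bandwidth_args(tool, mbps):
    """Translate a MB/s limit into extra CLI arguments for each copy tool."""
    if mbps <= 0:
        return []  # 0 (the default) means no limit
    if tool == 'rsync':
        # rsync --bwlimit expects KB/s
        return ['--bwlimit=%d' % (mbps * 1024)]
    if tool == 'scp':
        # scp -l expects Kbit/s
        return ['-l', '%d' % (mbps * 1024 * 8)]
    raise ValueError('unknown copy tool: %s' % tool)
```

The copy_image() branches could then build their command lines as, e.g., `['rsync', '--sparse', '--compress'] + bandwidth_args('rsync', CONF.mbps_in_copy_image) + [src, dest]`.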
[openstack-dev] [Nova][Gate] qemu: linux kernel too old to load a ram disk
Hello, It looks like for the last 12 hours the gate has been failing 100% of the time because of an error with libvirt (logs/libvirtd.txt): qemu: linux kernel too old to load a ram disk Bug reported on openstack-ci: https://bugs.launchpad.net/openstack-ci/+bug/1280142 Fingerprint: http://logstash.openstack.org/#eyJzZWFyY2giOiIgbWVzc2FnZTpcInFlbXU6IGxpbnV4IGtlcm5lbCB0b28gb2xkIHRvIGxvYWQgYSByYW0gZGlza1wiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI0MzIwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIiLCJzdGFtcCI6MTM5MjM2NzU5MTY1MX0= s.
Re: [openstack-dev] [Nova] Bug Triage Day Proposal - Friday 7th February
+1 - Original Message - From: "John Garbutt" To: "OpenStack Development Mailing List (not for usage questions)" Sent: Friday, February 7, 2014 9:12:42 AM Subject: Re: [openstack-dev] [Nova] Bug Triage Day Proposal - Friday 7th February

Just a quick reminder, it's bug day! Let's collaborate in #openstack-nova. We can track progress here: http://webnumbr.com/untouched-nova-bugs And later progress: http://status.openstack.org/bugday Get those bugs tagged: https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW Tag owners, and others, let's set the priorities: https://wiki.openstack.org/wiki/Nova/BugTriage But don't forget:
* Critical if the bug prevents a key feature from working properly (regression) for all users (or without a simple workaround) or results in data loss
* High if the bug prevents a key feature from working properly for some users (or with a workaround)
* Medium if the bug prevents a secondary feature from working properly
* Low if the bug is mostly cosmetic
* Wishlist if the bug is not really a bug, but rather a welcome change in behavior
Let's also watch out for stale bugs: https://bugs.launchpad.net/nova/+bugs?orderby=date_last_updated&field.status%3Alist=INPROGRESS&assignee_option=any John PS I am having to be an emergency taxi service first thing this morning, but should be joining you this afternoon. On 5 February 2014 01:01, Russell Bryant wrote: > On 02/04/2014 05:10 PM, John Garbutt wrote: >> Hi, >> >> Now that we are getting close to the end of Icehouse, it seems a good >> time to make sure we tame the un-triaged bug backlog (try saying that >> really quickly a few times over), and look at what really needs fixing >> before Icehouse is released. >> >> I propose that we have a bug triage day this Friday, February 7th. >> That way, things should be in a more reasonable state by the Utah >> mid-cycle meet up, on Monday.
>> >> If you have some bugs you keep meaning to raise, but haven't quite got >> around to it yet, please do that before Friday, rather than after >> Friday. >> >> The usual process applies for Bug Triage. Applying official nova tags, etc: >> https://wiki.openstack.org/wiki/Nova/BugTriage >> https://wiki.openstack.org/wiki/BugTriage >> >> To see how we are doing, take a look at: >> http://webnumbr.com/untouched-nova-bugs >> http://status.openstack.org/bugday >> >> Let's also not forget about fixing bugs, particularly ones that show up >> here: >> http://status.openstack.org/elastic-recheck/ >> >> Hopefully you can join us on #openstack-nova for some bug triage "fun" >> on Friday. >> >> If there are horrid clashes, or other issues or ideas, do speak up. > > Sounds great. We're due for a bug day. An improved bug queue as we > head toward the freeze would be very helpful. Thanks! > > -- > Russell Bryant
[openstack-dev] [nova] about the bp cpu-entitlement
Greetings, I saw a really interesting blueprint about CPU entitlement; it is targeted for icehouse-3 and I would like to get some details about its progress. Does the developer need help? I can give part of my time to it. https://blueprints.launchpad.net/nova/+spec/cpu-entitlement Thanks a lot, s.
Re: [openstack-dev] [nova] bp proposal: libvirt-resize-disk-down
OK Rich, I'm going to test your tools and add them to the workflow if they are available on the host. s. - Original Message - From: "Richard W.M. Jones" To: "OpenStack Development Mailing List (not for usage questions)" Sent: Friday, January 31, 2014 8:13:57 AM Subject: Re: [openstack-dev] [nova] bp proposal: libvirt-resize-disk-down On Thu, Jan 30, 2014 at 02:59:45PM +, sahid wrote: > Greetings, > > A blueprint is being discussed about the disk resize down feature of libvirt > driver. > https://blueprints.launchpad.net/nova/+spec/libvirt-resize-disk-down > > The current implementation does not handle disk resize down and just skips the > step during a resize down of the instance. I'm really convinced we can > implement > this feature by using the good job of disk resize down of the driver xenapi. resize2fs -M is problematic, as another reply mentions. virt-sparsify is designed to handle this case properly. It currently works by copying the disk image, but it should soon work in-place too (waiting on some qemu command line changes). And incidentally, virt-resize can handle the offline growing case well too. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming blog: http://rwmj.wordpress.com Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
Re: [openstack-dev] [nova] bp proposal: libvirt-resize-disk-down
> In case it hasn't been considered yet, shrinking a filesystem can result > in terrible fragmentation. The block allocator in resize2fs does not do > a great job of handling this case. The result will be a very > non-optimal file layout and measurably worse performance, especially for > drives with a relatively high average seek time. This is an interesting point and I really want to get more information about it; I did some searching in the resize2fs manual but found nothing. Also, what do you think about using "freezero" after the resize, if it is available on the host? Could it fix this kind of problem? Best, s.
Re: [openstack-dev] [nova] bp proposal: libvirt-resize-disk-down
> For metering/usage purposes, does the old size of ephemeral disk > continue to be shown in usage records, or does the size of the disk in > the newly-selected instance type (flavor) get used? If the former, then > this would be an avenue for users to get more disk space than they are > paying for. Something to look into... Actually yes, the instance is reported with the new flavor's disk space while the real space allocated to the instance stays the same. We probably need to raise a ResizeError exception; also, to keep good backward compatibility, we could add a config option like libvirt.use_strong_resize=True or something similar. Regards, s.
[openstack-dev] [nova] bp proposal: libvirt-resize-disk-down
Greetings,

A blueprint is being discussed about the disk resize-down feature of the libvirt driver: https://blueprints.launchpad.net/nova/+spec/libvirt-resize-disk-down

The current implementation does not handle disk resize down and just skips that step during a resize down of the instance. I'm really convinced we can implement this feature by reusing the solid disk-resize-down work in the xenapi driver.

Criteria for allowing disk resize down:
+ The disk must have one partition
+ The fs must be ext3 or ext4

The implementation will be separated into several commits:
+ Move shared utility methods to a common module:
  - virt.xenapi.vm_utils._get_partitions to virt.disk.utils.get_partitions
  - virt.libvirt.utils.copy_image to virt.disk.utils.copy_image
  - virt.xenapi.vm_utils._repair_filesystem to virt.disk.utils.repair_filesystem
+ Disk resize down implementation

Notes:
- Another point we have to discuss: the current implementation just skips the fs resize if it is not supported. Is that a good choice? Should we raise an exception to inform the user that it is not possible to resize the instance? (If we do raise an exception, a task will be added to the TODO to handle this case for resize up before working on resize down.)
- The current workflow for a user is to confirm the resize when the state of the instance is VERIFY_RESIZE. I think we probably have to add a checklist of good practices for how to verify a resize in the manual: http://docs.openstack.org/user-guide/content/nova_cli_resize.html

Thanks a lot,
s.
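The eligibility criteria above could be expressed as a small guard like the following. This is a sketch only: the partition-tuple shape mirrors what xenapi's _get_partitions helper returns (number, start, size, fstype), and the function name is hypothetical, not existing nova code.

```python
SUPPORTED_FILESYSTEMS = ('ext3', 'ext4')


def can_resize_down(partitions):
    """Return True if the disk meets the proposed resize-down criteria:
    exactly one partition, formatted as ext3 or ext4.
    """
    if len(partitions) != 1:
        return False  # multi-partition disks are excluded
    _num, _start, _size, fstype = partitions[0]
    return fstype in SUPPORTED_FILESYSTEMS
```

The libvirt driver's resize path could call this before attempting the filesystem shrink, and either skip or raise depending on the outcome of the discussion in the notes above.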
[openstack-dev] [nova] bp proposal: New per-aggregate filters targeted for icehouse-3
Hi there, The deadline for blueprint approval is coming really quickly, and I understand that there is a lot of work to do, but I would like to get your attention on 3 new filters targeted for icehouse-3, with code already in review. - https://blueprints.launchpad.net/nova/+spec/per-aggregate-disk-allocation-ratio - https://blueprints.launchpad.net/nova/+spec/per-aggregate-max-instances-per-host - https://blueprints.launchpad.net/nova/+spec/per-aggregate-max-io-ops-per-host The main aim of these blueprints is to make these settings configurable per aggregate; such features are interesting when building a large cloud. Can you take a small amount of time to let me know how to move forward with them? Thanks a lot, s.
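The common pattern behind these per-aggregate filters is: read a setting from the metadata of the aggregates a host belongs to, and fall back to the global config default when none is set. A hedged sketch of that resolution step, assuming aggregates are plain dicts with a 'metadata' mapping and that the most conservative (minimum) value wins when a host is in several aggregates:

```python
def resolve_aggregate_value(host_aggregates, key, default):
    """Pick the per-aggregate value for `key`, or `default` if unset.

    `host_aggregates` is assumed to be the list of aggregates the host
    belongs to; the min() tie-breaking rule is an assumption, not a rule
    taken from the blueprints themselves.
    """
    values = []
    for aggregate in host_aggregates:
        raw = aggregate.get('metadata', {}).get(key)
        if raw is not None:
            values.append(float(raw))
    return min(values) if values else default
```

A filter such as per-aggregate-max-io-ops-per-host would then compare the host's current IO operation count against the resolved limit instead of a single global config value.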
Re: [openstack-dev] [nova] hugepage support
I have started an implementation, now in review: https://review.openstack.org/#/c/69148/ - Original Message - From: "sahid" To: "OpenStack Development Mailing List (not for usage questions)" Sent: Saturday, January 25, 2014 5:56:10 PM Subject: Re: [openstack-dev] [nova] hugepage support Hi Anil, I have checked the code and it looks like it is not possible to enable this feature for the guest. We are running KVM through libvirt, and libvirt supports this option: http://libvirt.org/formatdomain.html It could be an interesting feature, maybe via an option read from the image properties. There is something similar here: https://review.openstack.org/#/c/65028/ . What do you think? s. - Original Message - From: "Anil Gunturu" To: openstack-dev@lists.openstack.org Sent: Saturday, January 25, 2014 4:31:51 PM Subject: [openstack-dev] [nova] hugepage support Reposting with correct category in the subject. From: Anil Gunturu [mailto:anil.gunt...@riftio.com] Sent: Thursday, January 23, 2014 12:18 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] hugepage support Hi, Is it possible to enable hugepages in the guest OS for the VMs launched in OpenStack (with KVM hypervisor)? Specifically, is it possible to pass the "-mem-path" option when invoking QEMU? Thanks, Anil
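For reference, enabling hugepages in a libvirt guest is a matter of adding a memoryBacking/hugepages element to the domain XML; libvirt then launches qemu with -mem-path pointing at the hugetlbfs mount. The sketch below shows that XML transformation only; it is illustrative and is not nova's actual guest-config code.

```python
import xml.etree.ElementTree as ET


def add_hugepages(domain_xml):
    """Return the domain XML with <memoryBacking><hugepages/> added.

    Equivalent hand-written XML:
        <domain type='kvm'>
          ...
          <memoryBacking><hugepages/></memoryBacking>
        </domain>
    """
    root = ET.fromstring(domain_xml)
    if root.find('memoryBacking') is None:  # idempotent: add only once
        backing = ET.SubElement(root, 'memoryBacking')
        ET.SubElement(backing, 'hugepages')
    return ET.tostring(root, encoding='unicode')
```

Whether the flag should come from a flavor extra spec or an image property is exactly the open question in this thread.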
Re: [openstack-dev] [nova] hugepage support
Hi Anil, I have checked the code and it looks like it is not possible to enable this feature for the guest. We are running KVM through libvirt, and libvirt supports this option: http://libvirt.org/formatdomain.html It could be an interesting feature, maybe via an option read from the image properties. There is something similar here: https://review.openstack.org/#/c/65028/ . What do you think? s. - Original Message - From: "Anil Gunturu" To: openstack-dev@lists.openstack.org Sent: Saturday, January 25, 2014 4:31:51 PM Subject: [openstack-dev] [nova] hugepage support Reposting with correct category in the subject. From: Anil Gunturu [mailto:anil.gunt...@riftio.com] Sent: Thursday, January 23, 2014 12:18 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] hugepage support Hi, Is it possible to enable hugepages in the guest OS for the VMs launched in OpenStack (with KVM hypervisor)? Specifically, is it possible to pass the "-mem-path" option when invoking QEMU? Thanks, Anil
Re: [openstack-dev] [nova]can someone help me? when I use cmd "nova migration-list" error.
Perhaps a bug maintainer should update the status; the bug is not related to python-novaclient and it has not been triaged yet. Thanks a lot, s. - Original Message - From: "li zheming" To: "OpenStack Development Mailing List (not for usage questions)" Sent: Monday, January 20, 2014 4:52:27 AM Subject: Re: [openstack-dev] [nova]can someone help me? when I use cmd "nova migration-list" error. OK, thanks Jay. I thought it was an error in novaclient; it was my misunderstanding. Thank you very much! lizheming 2014/1/20 Jay Lau < jay.lau@gmail.com > It is being fixed: https://review.openstack.org/#/c/61717/ Thanks, Jay 2014/1/20 li zheming < lizhemin...@gmail.com > hi all: when I use the cmd nova migration-list, it returns an error like this:

openstack@devstack:/home$ nova migration-list
ERROR: 'unicode' object has no attribute 'iteritems'

I stepped through the code and found where it goes wrong, in python-novaclient/novaclient/base.py:

class Manager(utils.HookableMixin):
    ..
    def _list(self, url, response_key, obj_class=None, body=None):
        if body:
            _resp, body = self.api.client.post(url, body=body)
        else:
            _resp, body = self.api.client.get(url)
        if obj_class is None:
            obj_class = self.resource_class
        data = body[response_key]
        # NOTE(ja): keystone returns values as list as {'values': [ ... ]}
        # unlike other services which just return the list...
        if isinstance(data, dict):
            try:
                data = data['values']
            except KeyError:
                pass
        with self.completion_cache('human_id', obj_class, mode="w"):
            with self.completion_cache('uuid', obj_class, mode="w"):
                return [obj_class(self, res, loaded=True)
                        for res in data if res]

I set a breakpoint at "data = data['values']" and found that data is {u'objects': []}; it has no key named 'values', so the KeyError is caught and passed. Then in "for res in data if res", each res is the unicode string "objects", which causes the error in the next function. Have you met this issue? And does anyone know why the comment says "keystone returns values as list as {'values': [ ...
]}" but I think this is not relevant to keystone? Maybe I misunderstand this code; please give me more info about it. Thank you very much!
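The failure mode described above is that a dict payload without a 'values' key falls through the KeyError handler and then gets iterated as a mapping, yielding its key strings. A hedged sketch of the guard the fix needs (the function name is hypothetical, not the novaclient API):

```python
def extract_list(body, response_key):
    """Normalize a REST response body into a list of records.

    Keystone wraps lists as {'values': [...]}; other services return the
    list directly.  Iterating any other shape (e.g. {u'objects': []})
    would yield dict keys, so fail loudly instead.
    """
    data = body[response_key]
    if isinstance(data, dict):
        # unwrap the keystone-style envelope when present
        data = data.get('values', data)
    if not isinstance(data, list):
        raise TypeError('expected a list under %r, got %s'
                        % (response_key, type(data).__name__))
    return [item for item in data if item]
```

With this, the {u'objects': []} body from the migration-list call would raise a clear TypeError at the source instead of the confusing 'unicode' object error deeper in the stack.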
[openstack-dev] pep8 gating fails due to tools/config/check_uptodate.sh
Hello all, It looks like 100% of nova's pep8 gate runs are failing because of a reported bug; we probably need to mark it as Critical. https://bugs.launchpad.net/nova/+bug/1268614 Ivan Melnikov has pushed a patchset waiting for review: https://review.openstack.org/#/c/66346/ http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiRVJST1I6IEludm9jYXRpb25FcnJvcjogXFwnL2hvbWUvamVua2lucy93b3Jrc3BhY2UvZ2F0ZS1ub3ZhLXBlcDgvdG9vbHMvY29uZmlnL2NoZWNrX3VwdG9kYXRlLnNoXFwnXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjQzMjAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM4OTYzMTQzMzQ4OSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIifQ== s.