[Yahoo-eng-team] [Bug 1975637] [NEW] Ceph VM images leak on instance deletion if there are snapshots of that image
Public bug reported:

Description
===========
I'm using backy2 to back up instance images. For the sake of incremental backups, we keep a pending last-backed-up snapshot associated with each instance at all times. When an instance is deleted, the rbd delete call fails silently and leaves both the VM and snapshot behind forever. Ideally two things would be different:

1) The logs would reflect this failure
2) A config option would allow me to demand that all associated snaps are purged on instance deletion.

I thought I had a weirdo edge use-case, but I see at least one other user encountering this same leakage here: https://heiterbiswolkig.blogs.nde.ag/2019/03/07/orphaned-instances-part-2/

(I'm running version Wallaby but the code is the same on the current git head.)

Steps to reproduce
==================
* Create an instance backed with ceph/rbd
* Take a snapshot of that instance (in my case, I'm doing this out of band using rbd commands, not a nova API)
* Delete the instance
* Note that the volume is still present on the storage backend

Expected result
===============
* Log messages should announce a failure to delete, or
* (optionally) the server volume is actually deleted

Actual result
=============
* The logfile is silent about the failure to delete
* The server volume is leaked and lives on forever, invisible to Nova

Environment
===========
OpenStack Wallaby installed from the Debian BPO on Debian Bullseye

# dpkg --list | grep nova
ii  nova-common         2:23.1.0-2~bpo11+1   all   OpenStack Compute - common files
ii  nova-compute        2:23.1.0-2~bpo11+1   all   OpenStack Compute - compute node
ii  nova-compute-kvm    2:23.1.0-2~bpo11+1   all   OpenStack Compute - compute node (KVM)
ii  python3-nova        2:23.1.0-2~bpo11+1   all   OpenStack Compute - libraries
ii  python3-novaclient  2:17.4.0-1~bpo11+1   all   client library for OpenStack Compute API - 3.x

The same issue is present in the latest git code.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1975637
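Until nova handles this, the leaked images have to be cleaned up out of band. A rough sketch with the rbd CLI, assuming the image pool is named 'vms' and the disk follows the '<instance-uuid>_disk' naming convention used by the RBD image backend (both may differ per deployment):

# Inspect the leaked image and any snapshots still attached to it
$ rbd snap ls vms/<instance-uuid>_disk

# Remove all snapshots, then the image itself (this also discards the
# backy2 incremental-backup anchor for that instance, by design)
$ rbd snap purge vms/<instance-uuid>_disk
$ rbd rm vms/<instance-uuid>_disk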
[Yahoo-eng-team] [Bug 1924790] [NEW] default role documentation: who can assign roles?
Public bug reported:

I'm hoping that my cloud will soon be able to adopt the new default scoped role model documented at https://docs.openstack.org/keystone/latest/admin/service-api-protection.html

That document is good about detailing which roles can read and view existing role assignments, but I can't tell which users can or can't assign new roles. For example, if I give a user the admin role in a project, can that user add additional users to that project?

** Affects: keystone
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1924790
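For reference, the operation whose authorization is unclear is the ordinary role-assignment call; the user, project, and role names below are placeholders:

# Grant the 'member' role on a project -- the open question is which of
# the documented default roles is allowed to do this
$ openstack role add --user alice --project myproject member

# Listing existing assignments is the part the document does cover
$ openstack role assignment list --project myproject --names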
[Yahoo-eng-team] [Bug 1919369] [NEW] Instances panel shows some readable flavors as 'Not available '
Public bug reported:

As we move towards having 'reader' roles in nova, we can get into some interesting situations where a user can see a resource but not use it.

Today I have a case where I've added a private flavor to projects A and B, created a VM in project B, and then removed the flavor from project B. I belong to both projects, so in some contexts I can still see the flavor. In the Horizon instance view for project A, though, the flavor is shown as 'not available'.

That's reasonable behavior, but it happens to be unnecessary. The code change to make the flavor appear for that VM is trivial, and doesn't affect the ability to create VMs with the removed flavor (which would be bad).

Supporting this case also provides a potential solution to the issue raised in bug 1259262, phasing out a flavor without causing VMs to forget what size they are; flavors can be moved out of scope for a project (thus preventing their re-use for new instances) but still remain viewable by the project.

** Affects: horizon
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1919369
[Yahoo-eng-team] [Bug 1914641] [NEW] jinja rendering broken in latest git checkout
Public bug reported:

I use jinja templating for vendor data; it works with my .deb packaged version of cloud-init, 20.2-2~deb10u1.

Testing with the latest git checkout, I see a json parser choking on curly braces. That suggests that it's skipping the jinja rendering step, or trying to run it after json parsing, which won't work.

Here is the top part of my vendor data:

```
root@cloudinit-test:~# curl http://169.254.169.254/openstack/latest/vendor_data.json
{"domain": "codfw1dev.wikimedia.cloud", "cloud-init": "MIME-Version: 1.0\nContent-Type: multipart/mixed; boundary=\"boundary text\"\n\nThis is a multipart config in MIME format.\nIt contains a cloud-init config followed by\na first boot script.\n\n--boundary text\nMIME-Version: 1.0\nContent-Type: text/cloud-config; charset=\"us-ascii\"\n\n## template: jinja\n#cloud-config\n\nhostname: {{ds.meta_data.name}}\nfqdn: {{ds.meta_data.name}}.{{ds.meta_data.project_id}}.codfw1dev.wikimedia.cloud\n\n\n# /etc/block-ldap-key-lookup:\n# Prevent non-root logins while the VM is being setup\n# The ssh-key-ldap-lookup script rejects non-root user logins if this file\n# is present.\n#\n# /etc/rsyslog.d/60-puppet.conf:\n# Enable console logging for puppet\n#\n# /etc/systemd/system/serial-getty@ttyS0.service.d/override.conf:\n# Enable root console on serial0\n# (cloud-init will create any needed parent dirs)\nwrite_files:\n- content: \"VM is work in progress\"\n path: /etc/block-ldap-key-lookup\n- content: \"daemon.* |/dev/console\"\n path: /etc/rsyslog.d/60-puppet.conf\n- content: |\n[Service]\n ExecStart=\nExecStart=-/sbin/agetty --autologin root --noclear %I $TERM\n path: /etc/systemd/system/serial-getty@ttyS0.service.d/override.conf\n\n# resetting ttys0 so root is logged in\nruncmd:\n- [systemctl, enable, serial-getty@ttyS0.service]\n- [systemctl, restart, serial-getty@ttyS0.service]\n\n\nmanage_etc_hosts: true\n\npackages:\n- gpg\n- curl\n- nscd\n- lvm2\n- parted\n- puppet\n\ngrowpart:\nmode: false\n\n# You'll see that we're setting apt_preserve_sources_list twice here. That's\n# because there's a bug in cloud-init where it tries to reconcile the\n# two settings and if they're different the stage fails. That means that\n# if one of them is set differently from the default (True) then nothing\n# works.\napt_preserve_sources_list: False\napt:\npreserve_sources_list: False\n
```

And here are the errors:

```
2021-02-04 18:08:43,117 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 4 column 1: "while parsing a block mapping in "", line 4, column 1: hostname: {{ds.meta_data.name}} ^ expected , but found '' in "", line 5, column 28: fqdn: {{ds.meta_data.name}}.{{ds.meta_data.project_id}}.cod ... ^"

2021-02-04 18:08:43,131 - util.py[WARNING]: Failed loading yaml blob. Invalid format at line 4 column 1: "while parsing a block mapping in "", line 4, column 1: hostname: {{ds.meta_data.name}} ^ expected , but found '' in "", line 5, column 28: fqdn: {{ds.meta_data.name}}.{{ds.meta_data.project_id}}.cod ... ^"

2021-02-04 18:08:43,131 - util.py[WARNING]: Failed at merging in cloud config part from part-001
```

** Affects: cloud-init
     Importance: Undecided
         Status: New
https://bugs.launchpad.net/bugs/1914641
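As an aside for anyone reproducing this outside a full boot: newer cloud-init releases ship a render helper that applies only the jinja step, which makes it easy to see whether a '## template: jinja' part is rendered before the yaml parse. The paths below are the defaults and may need adjusting:

# Show the instance data available to jinja templates
$ cloud-init query --all

# Render a jinja-templated cloud-config against captured instance data;
# if the jinja pass is skipped, the {{ }} markers survive and the yaml
# parse fails exactly as in the warnings above
$ cloud-init devel render my-cloud-config.yaml \
    --instance-data /run/cloud-init/instance-data.json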
[Yahoo-eng-team] [Bug 1861926] [NEW] Horizon sessions live on after keystone token expires
Public bug reported:

If I'm in the process of using Horizon and my keystone token expires, Horizon keeps on 'working.' Anything that involves an API call fails, and I get an error notice about the failure, but the frames keep loading and display a patchwork of information (depending on what may be cached and what does and doesn't require an API call.)

I can reproduce this issue by deleting my token from the keystone database if I don't feel like waiting.

After some unpredictable amount of time, I'm kicked back to the login screen. I would expect the prompt to re-login to happen immediately, as soon as my token expires. Indeed, I'm pretty sure this was the behavior back in Ocata.

I'm running Horizon version 'train' with UUID tokens and backend APIs of version 'Pike,' using django.core.cache.backends.memcached.MemcachedCache.

** Affects: horizon
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1861926
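One operational mitigation (not a fix for the underlying behavior) is to keep the Django session from outliving the keystone token, so the next request forces a fresh login. A sketch of the settings involved, with illustrative values only:

# horizon local_settings.py -- session lifetime in seconds
SESSION_TIMEOUT = 3600

# keystone.conf -- token lifetime in seconds; keep this >= SESSION_TIMEOUT
[token]
expiration = 3600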
[Yahoo-eng-team] [Bug 1855506] [NEW] Incorrect django version in requirements.txt
Public bug reported:

requirements.txt contains these lines:

Django<2,>=1.11;python_version<'3.0' # BSD
Django<2.1,>=1.11;python_version>='3.0' # BSD

First of all, it seems weird that there are two conflicting lines for the same package. But I'm seeing a more serious issue. Throughout the code there are import lines like this:

openstack_dashboard/views.py:from django import urls

but django.urls is only present in django version 2 and later. In version 1.8, for example, that module appears to be django.conf.urls.

I may be misunderstanding something fundamental, but I think that requirements line needs to specify a lower bound of 2.0. I don't know how to reconcile that with there also being an upper bound of 2.

** Affects: horizon
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1855506
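For what it's worth, django.urls was introduced back in Django 1.10, so the >=1.11 floor should already satisfy the import; a quick check against whatever version is installed:

# Print the installed Django version and confirm that django.urls imports
$ python -c "import django; print(django.get_version()); from django import urls; print(urls.__name__)"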
[Yahoo-eng-team] [Bug 1771851] [NEW] Image panel doesn't check 'compute:create' policy
Public bug reported:

The Horizon image panel provides a 'Launch' button to create a server from a given image. The django code for this button has correct policy checks; the Angular code has none. That means that the 'Launch' button displays even if the user is not permitted to launch instances, resulting in a frustrating failure much later in the process.

The button should not display if the user is not permitted to create VMs.

** Affects: horizon
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1771851
[Yahoo-eng-team] [Bug 1611895] [NEW] Security groups don't work by default in newish kernels
Public bug reported:

I recently had some bad experiences running nova-compute on a linux 4.4-series kernel. Specifically, the security-group code properly configured iptables but the actual rules were completely bypassed -- EVERY port on EVERY instance was open to the outside world.

This is presumably due to the kernel change described below. I'm unclear on where responsibility sits for activating the proper modprobe; maybe this is something for packagers to care about and not strictly a nova bug.

$ git describe --contains 34666d467cbf1e2e3c7bb15a63eccfb582cdd71f
v3.18-rc1~115^2~111^2~2

netfilter: bridge: move br_netfilter out of the core

    Note that this is breaking compatibility for users that expect that
    bridge netfilter is going to be available after explicitly 'modprobe
    bridge' or via automatic load through brctl. However, the damage can
    be easily undone by modprobing br_netfilter. The bridge core also
    spots a message to provide a clue to people that didn't notice that
    this has been deprecated.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1611895
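The immediate workaround on affected compute nodes is to load the module by hand and confirm the bridge sysctls are active; the module and sysctl names are the standard upstream ones, while the persistence path may vary by distribution:

# Restore iptables filtering of bridged traffic
$ sudo modprobe br_netfilter

# Both should report 1 once the module is loaded
$ sysctl net.bridge.bridge-nf-call-iptables
$ sysctl net.bridge.bridge-nf-call-ip6tables

# Persist across reboots
$ echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf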
[Yahoo-eng-team] [Bug 1611871] [NEW] Timeouts in conductor when updating large sets of security group rules (liberty)
Public bug reported:

I have a project with 130+ instances in it. When I set a 'source group' security rule in that project, the rule is never applied on the compute nodes. nova-compute logs include timeout warnings like the one pasted below.

This timeout only happens in 'big' cases. If I add a single port to every instance in the project, everything's fine. If I add a 'source group' rule to a project with fewer instances, everything is also fine. It's only the n^2 case for large numbers of n that I get the timeout.

Increasing my rpc_response_timeout setting from the default of 60 makes the issue go away. From this I conclude that conductor is not choking, it just really takes longer than 60 seconds.

Openstack Liberty, kvm, running on Ubuntu Trusty servers with 3.13-series kernels.

Sample stack trace:

2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher [req-7af38b91-cfaa-4739-88ec-cbcf10142653 andrew tools - - -] Exception during message handling: Timed out waiting for a reply to message ID 77988fad9aa940aa929752826bde7cdc
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     executor_callback))
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     executor_callback)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 470, in decorated_function
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 89, in wrapped
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     payload)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 72, in wrapped
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return f(self, context, *args, **kw)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1387, in refresh_instance_security_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return _sync_refresh()
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 254, in inner
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return f(*args, **kwargs)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1382, in _sync_refresh
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     return self.driver.refresh_instance_security_rules(instance)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5074, in refresh_instance_security_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     self.firewall_driver.refresh_instance_security_rules(instance)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/firewall.py", line 434, in refresh_instance_security_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     self.do_refresh_instance_rules(instance)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/firewall.py", line 467, in do_refresh_instance_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     ipv4_rules, ipv6_rules = self.instance_rules(instance, network_info)
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/dist-packages/nova/virt/firewall.py", line 399, in instance_rules
2016-08-10 16:30:23.102 9877 ERROR oslo_messaging.rpc.dispatcher     ctxt, rule['grantee_group']))
[Yahoo-eng-team] [Bug 1610693] [NEW] Broken instances quota check in Liberty
Public bug reported:

I have recently upgraded my cluster to Liberty for all projects. Now, when I create new instances, I frequently get an incorrect quota warning from the instance creation workflow, despite having plenty of available quota:

"The requested instance cannot be launched as you only have 0 of your quota available."

The issue is happening within the _get_tenant_compute_usages() check. That function determines (correctly) that I am able to list instances in all projects, so passes 'all_tenants' to the nova api. That results in an api call that looks like this:

"GET /v2//servers/detail?all_tenants=True_id= HTTP/1.1"

Nova replies with a list of every instance in my entire cloud, 719 at last count. The call is very slow and, of course, 719 is many more than my instance quota for the current project, so Horizon determines that I am over quota.

Note that this issue didn't appear when I was running Liberty Horizon with Kilo Nova (so maybe this is a bug or change in the nova-api).

Best I can tell, the offending code in Horizon is still present in the git head.

** Affects: horizon
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1610693
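For comparison, the compute API lets an admin listing be scoped to a single project by pairing all_tenants with a tenant_id filter, which is presumably what the usage check wants here; the endpoint, token, and IDs below are placeholders:

# Return only the current project's servers, even with an admin token
$ curl -s -H "X-Auth-Token: $TOKEN" \
    "http://nova-api.example.com:8774/v2/$PROJECT_ID/servers/detail?all_tenants=True&tenant_id=$PROJECT_ID"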
[Yahoo-eng-team] [Bug 1566025] [NEW] Unable to delete security groups; security_group table 'deleted' field needs migration
Public bug reported:

My long-standing Nova installation has the following columns in the security_groups table:

+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| created_at  | datetime     | YES  |     | NULL    |                |
| updated_at  | datetime     | YES  |     | NULL    |                |
| deleted_at  | datetime     | YES  |     | NULL    |                |
| deleted     | tinyint(1)   | YES  | MUL | NULL    |                |
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| name        | varchar(255) | YES  |     | NULL    |                |
| description | varchar(255) | YES  |     | NULL    |                |
| user_id     | varchar(255) | YES  |     | NULL    |                |
| project_id  | varchar(255) | YES  |     | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+

A more recent install looks like this:

+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| created_at  | datetime     | YES  |     | NULL    |                |
| updated_at  | datetime     | YES  |     | NULL    |                |
| deleted_at  | datetime     | YES  |     | NULL    |                |
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| name        | varchar(255) | YES  |     | NULL    |                |
| description | varchar(255) | YES  |     | NULL    |                |
| user_id     | varchar(255) | YES  |     | NULL    |                |
| project_id  | varchar(255) | YES  | MUL | NULL    |                |
| deleted     | int(11)      | YES  |     | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+

Note that the 'deleted' field has changed types. It now stores a group ID upon deletion. But, the old table can't store that group ID because of the tinyint data type. This means that security groups cannot be deleted.

I haven't yet located the source of this regression, but presumably it happened when the table definition was changed to use models.SoftDeleteMixin, and the accompanying migration change was overlooked.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1566025
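A manual repair along the lines of what the missing migration would do, sketched for MySQL only -- test against a backup and compare the result with a freshly installed schema before trusting it:

-- Widen the soft-delete column so it can hold the row id on deletion,
-- matching the SoftDeleteMixin convention used by newer schemas
ALTER TABLE security_groups MODIFY deleted INT(11) DEFAULT NULL;

-- Rows soft-deleted under the old tinyint convention (deleted = 1)
-- can then be normalized to the new id-based convention
UPDATE security_groups SET deleted = id WHERE deleted = 1;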
[Yahoo-eng-team] [Bug 1513654] [NEW] scheduler: disk_filter permits scheduling on full drives
Public bug reported:

I use qcow images and have disk_allocation_ratio == 2.1 to allow large amounts of overcommitting of disk space. To quote the nova config reference:

> If the value is set to >1, we recommend keeping track of the free disk
> space, as the value approaching 0 may result in the incorrect
> functioning of instances using it at the moment.

Good advice, but 'keeping track' can be a bit impractical at times. I just now had the scheduler drop a large sized instance onto a server with a 98% full drive since the behavior of disk_allocation_ratio intentionally ignores the actual free space on the drive.

I propose that we add an additional config setting to the disk scheduler so that I can overschedule but can /also/ request that the scheduler stop piling things onto an already groaning server.

** Affects: nova
     Importance: Undecided
         Status: In Progress

https://bugs.launchpad.net/bugs/1513654
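For illustration, the ratio in question lives in nova.conf, and nova already exposes a reserved_host_disk_mb option that is in the same spirit as the proposal (hold some real disk back regardless of the overcommit ratio); the values below are examples only:

# nova.conf
[DEFAULT]
# allow 2.1x overcommit of disk when scheduling
disk_allocation_ratio = 2.1
# keep this much real disk (in MB) off limits regardless of the ratio
reserved_host_disk_mb = 20480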
[Yahoo-eng-team] [Bug 1513216] [NEW] Mismatched keystone api version produces cryptic 'Error: Openstack'
Public bug reported:

The 'openstack' cli tool defaults to keystone version 2.0. When pointed to a v3 endpoint, it fails like this:

$ openstack service list
ERROR: openstack

This can easily be resolved by setting OS_IDENTITY_API_VERSION=3 -- that's not obvious from the error message, though, and isn't even obvious from log- and code-diving.

I propose that we actually detect the api version mismatch and error out with a helpful message.

** Affects: keystone
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1513216
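The workaround mentioned above, spelled out; the auth URL is a placeholder and must itself point at a v3 endpoint:

$ export OS_AUTH_URL=http://keystone.example.com:5000/v3
$ export OS_IDENTITY_API_VERSION=3
$ openstack service list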
[Yahoo-eng-team] [Bug 1498039] [NEW] projects drop-down broken for large numbers of projects
Public bug reported:

I'm a member of several dozen projects -- more than can fit in a single screen's worth of drop-down. Right now I'm trying to view the VMs in a project called 'testlabs.'

- When I click on the 'project' drop-down up top, it displays the first 20 or so projects, and then a "More Projects" link.
- I select 'More Projects', and it drops me into the 'Projects' tab of the Identity panel. But, hey, it says 'Unable to retrieve project list'. I guess that's because I wasn't ADMIN in the project that was active... before? The one I'm trying to switch away from?
- I select a different project from the project drop-down... not 'testlabs' but just one that I happen to be admin in.
- I select 'More Projects' again, and now I'm back in the Identity panel but now I can actually see a project list.
- I find 'testlabs' in the list, click on the little arrow next to 'Manage Members' and there's a drop down there
- I select 'Set as Active Project' from that dropdown
- I click away from the Identity panel, back on the 'Project' panel
- NOW I can see my VMs.

Correct behavior is that the 'projects' pull down is just super long and scrolls. Ugly, but not as ugly as the status quo!

Running Kilo from the Ubuntu cloud archive.

** Affects: horizon
     Importance: Undecided
         Status: New

** Description changed:

  I'm a member of several dozen projects -- more than can fit in a single screen's worth of drop-down. Right now I'm trying to view the VMs in a project called 'testlabs.'

  - When I click on the 'project' drop-down up top, it displays the first 20 or so projects, and then a "More Projects" link.
  - I select 'More Projects', and it drops me into the 'Projects' tab of the Identity panel. But, hey, it says 'Unable to retrieve project list'. I guess that's because I wasn't ADMIN in the project that was active... before? The one I'm trying to switch away from?
  - I select a different project from the project drop-down... not 'testlabs' but just one that I happen to be admin in.
  - I select 'More Projects' again, and now I'm back in the Identity panel but now I can actually see a project list.
  - I find 'testlabs' in the list, click on the little arrow next to 'Manage Members' and there's a drop down there
  - I select 'Set as Active Project' from that dropdown
- - NOW I am where i want to be.
+ - I click away from the Identity panel, back on the 'Project' panel
+ - NOW I can see my VMs.

  Correct behavior is that the 'projects' pull down is just super long and scrolls. Ugly, but not as ugly as the status quo!

  Running Kilo from the Ubuntu cloud archive.

https://bugs.launchpad.net/bugs/1498039
[Yahoo-eng-team] [Bug 1498197] [NEW] No longer able to delete security group rules in kilo
Public bug reported:

Security groups and rules worked fine in Juno, but ever since my upgrade to Kilo I'm unable to delete rules.

andrew@labcontrol1001:~$ nova secgroup-delete-rule default tcp 666 666 10.0.0.0/8
+-------------+-----------+---------+------------+--------------+
| IP Protocol | From Port | To Port | IP Range   | Source Group |
+-------------+-----------+---------+------------+--------------+
| tcp         | 666       | 666     | 10.0.0.0/8 |              |
+-------------+-----------+---------+------------+--------------+
ERROR (ClientException): The server has either erred or is incapable of performing the requested operation. (HTTP 500)

Here's the interesting bit of the stack trace:

2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 951, in _execute_context
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack     context)
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack   File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 436, in do_execute
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack     cursor.execute(statement, parameters)
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack   File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack     self.errorhandler(self, exc, value)
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack   File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack     raise errorclass, errorvalue
2015-09-16 21:34:31.105 3179 TRACE nova.api.openstack DBError: (DataError) (1264, "Out of range value for column 'deleted' at row 1") 'UPDATE security_group_rules SET updated_at=updated_at, deleted_at=%s, deleted=id WHERE security_group_rules.deleted = %s AND security_group_rules.id = %s' (datetime.datetime(2015, 9, 16, 21, 34, 31, 99896), 0, 2769L)

I've no doubt that this feature works in a fresh install of Kilo, and that my db schema is messed up due to a faulty upgrade script.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1498197
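This looks like the security_group_rules twin of bug 1566025 above: the soft-delete UPDATE writes the row id into a column that is still tinyint(1). A manual fix in the same spirit, sketched for MySQL only -- back up first and compare against a fresh Kilo schema:

-- Widen the column so "deleted = id" fits, as the Kilo soft-delete code expects
ALTER TABLE security_group_rules MODIFY deleted INT(11) DEFAULT NULL;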
[Yahoo-eng-team] [Bug 1470179] [NEW] Instance metadata should include project_id
Public bug reported:

As per https://www.mail-archive.com/search?l=openst...@lists.openstack.org&q=subject:%22Re\%3A+\[Openstack\]+How+should+an+instance+learn+what+tenant+it+is+in\%3F%22&o=newest

It's weirdly hard for an instance to learn what project it's in. Let's just add project_id to instance metadata.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1470179
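For context, this is the document a guest can already fetch; the request is simply that a project_id key appear alongside the existing fields. The metadata address is the standard link-local one:

# What a guest sees today
$ curl -s http://169.254.169.254/openstack/latest/meta_data.json | python -m json.tool

# With the proposed field, a guest could then do something like
$ curl -s http://169.254.169.254/openstack/latest/meta_data.json \
    | python -c 'import json,sys; print(json.load(sys.stdin).get("project_id"))'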
[Yahoo-eng-team] [Bug 1470225] [NEW] Support deprecated image types
Public bug reported:

I frequently update the base Trusty images available to my users. After I do that, I want to discourage them from creating new servers based on the old images. If I remove the old images entirely or make them private, Horizon shows servers as having type 'unknown.'

I'd like Horizon to support a glance property of 'deprecated' for an image that should remain visible in Horizon but not be added to the image pulldown when an instance is created.

** Affects: horizon
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1470225
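Glance already accepts arbitrary image properties, so the operator half of this proposal needs no new plumbing; 'deprecated' is just the property name this report asks Horizon to start honoring, and the image id below is a placeholder:

# Mark an old image with the proposed property
$ openstack image set --property deprecated=true <image-id>

# Confirm the property shows up on the image
$ openstack image show <image-id>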
[Yahoo-eng-team] [Bug 1444469] [NEW] keystone should clean up expired tokens
Public bug reported:

As of Icehouse, at least, keystone doesn't ever clean up expired tokens. After a few years, my keystone database is ridiculously huge, causing query timeouts and such.

** Affects: keystone
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1444469
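For the record, keystone ships a manual cleanup command for exactly this; the complaint is that operators are left to discover and schedule it themselves. A typical arrangement:

# Purge expired tokens from the keystone database (safe to repeat)
$ keystone-manage token_flush

# Commonly run from cron, e.g. hourly as the keystone user:
# 0 * * * * keystone keystone-manage token_flush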