[openstack-dev] why don't we deal with claims when live migrating an instance?

2014-01-15 Thread Chris Friesen
When we create a new instance via _build_instance() or _build_and_run_instance(), in both cases we call instance_claim() to reserve and test for resources. During a cold migration I see us calling prep_resize() which calls resize_claim(). How come we don't need to do something like this
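The claim pattern being discussed can be sketched in miniature as follows; this is a toy illustration of the reserve-and-test idea, with invented names and fields, not Nova's actual resource tracker:

```python
# Toy sketch of the claim/test pattern behind instance_claim() and
# resize_claim(): reserve resources up front and fail fast if the host
# cannot satisfy the request. Names here are illustrative, not Nova's.

class HostResources:
    def __init__(self, free_ram_mb, free_vcpus):
        self.free_ram_mb = free_ram_mb
        self.free_vcpus = free_vcpus

    def claim(self, ram_mb, vcpus):
        """Test for and reserve resources in one step; raise if short."""
        if ram_mb > self.free_ram_mb or vcpus > self.free_vcpus:
            raise ValueError("insufficient resources for claim")
        self.free_ram_mb -= ram_mb
        self.free_vcpus -= vcpus

    def release(self, ram_mb, vcpus):
        """Give the reservation back, e.g. if the operation is aborted."""
        self.free_ram_mb += ram_mb
        self.free_vcpus += vcpus

host = HostResources(free_ram_mb=8192, free_vcpus=4)
host.claim(ram_mb=2048, vcpus=2)   # succeeds; 6144 MB / 2 vCPUs remain
```

The question in the thread is why live migration skips this step while cold migration (via resize_claim()) does not.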

Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.

2014-01-15 Thread Chris Friesen
On 12/26/2013 01:56 AM, cosmos cosmos wrote: Hello. My name is Rucia, from Samsung SDS. I have run into trouble with volume deleting. I am developing support for big data storage such as Hadoop on LVM. Deleting a Cinder LVM volume consumes the full disk I/O because of the high disk I/O of dd

Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.

2014-01-15 Thread Chris Friesen
On 01/15/2014 06:00 PM, Fox, Kevin M wrote: What about a configuration option on the volume for delete type? I can see some possible options: * None - Don't clear on delete. Its junk data for testing and I don't want to wait. * Zero - Return zero's from subsequent reads either by zeroing on

Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.

2014-01-15 Thread Chris Friesen
On 01/15/2014 06:30 PM, Jay S Bryant wrote: There is already an option that can be set in cinder.conf using 'volume_clear=none'. Is there a reason that option is not sufficient? That option would be for the cloud operator, since it would apply to all volumes on that cinder node. My
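For reference, the option under discussion is set on the node running cinder-volume; a minimal example, with values as documented for Cinder of that era:

```ini
# cinder.conf on the node running cinder-volume
[DEFAULT]
# Skip wiping volumes on delete. The default is 'zero' (dd from
# /dev/zero); 'shred' was also supported. Note this applies to every
# volume on this node, which is exactly the limitation raised above.
volume_clear = none
```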

Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Chris Friesen
On 12/12/2013 11:02 AM, Clint Byrum wrote: So I'm asking, is there a standard way to determine whether or not a nova-compute is definitely ready to have things scheduled on it? This can be via an API, or even by observing something on the nova-compute host itself. I just need a definitive

[openstack-dev] anyone aware of networking issues with grizzly live migration of kvm instances?

2013-12-09 Thread Chris Friesen
Hi, We've got a grizzly setup using quantum networking and libvirt/kvm with VIR_MIGRATE_LIVE set. I was live-migrating an instance back and forth between a couple of compute nodes. It worked fine for maybe half a dozen migrations and then after a migration I could no longer ping it. It

Re: [openstack-dev] [Nova] Blueprint: standard specification of guest CPU topology

2013-12-03 Thread Chris Friesen
On 12/03/2013 04:08 AM, Daniel P. Berrange wrote: On Tue, Dec 03, 2013 at 01:47:31AM -0800, Gary Kotton wrote: Hi, I think that this information should be used as part of the scheduling decision, that is hosts that are to be selected should be excluded if they do not have the necessary

Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime

2013-12-02 Thread Chris Friesen
On 12/02/2013 02:31 PM, Vishvananda Ishaya wrote: I'm going to reopen a can of worms, though. I think the most difficult part of the forklift will be moving stuff out of the existing databases into a new database. Do we really need to move it to a new database for the forklift? Chris

[openstack-dev] problems with rabbitmq on HA controller failure...anyone seen this?

2013-11-29 Thread Chris Friesen
Hi, We're currently running Grizzly (going to Havana soon) and we're running into an issue where if the active controller is ungracefully killed then nova-compute on the compute node doesn't properly connect to the new rabbitmq server on the newly-active controller node. I saw a bugfix in

Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime

2013-11-28 Thread Chris Friesen
On 11/28/2013 09:50 AM, Gary Kotton wrote: One option worth thinking about is to introduce a new scheduling driver to nova - this driver will interface with the external scheduler. This will let us define the scheduling API, model etc, without being in the current confines of Nova. This will

Re: [openstack-dev] [Nova] Is there a way for the VM to identify that it is getting booted in OpenStack

2013-11-27 Thread Chris Friesen
On 11/26/2013 07:48 PM, Vijay Venkatachalam wrote: Hi, Is there a way for the VM to identify that it is getting booted in OpenStack? As said in the below mail, once the VM knows it is booting in OpenStack it will alter the boot sequence. What does getting

Re: [openstack-dev] [heat] Is it time for a v2 Heat API?

2013-11-27 Thread Chris Friesen
On 11/27/2013 11:50 AM, Zane Bitter wrote: Even better would be if we had the keystone domain (instead of the tenant id) incorporated into the endpoint in the keystone catalog and then we could use the tenant^W project *name* in the URL and users would never have to deal with UUIDs and

[openstack-dev] curious, why wasn't nova commit 52f6981 backported to grizzly?

2013-11-27 Thread Chris Friesen
Hi, Just wondering why nova commit 52f6981 (Evacuated instance disk not deleted) wasn't backported to grizzly? The symptoms of this bug are that if you evacuate a server off a compute node that uses local storage then you can never move it back to that compute node because the old files are

[Openstack] intended mode of operation on server evacuation due to compute service failure

2013-11-25 Thread Chris Friesen
Hi, I'm trying to figure out how things are supposed to be done. Suppose I'm running compute nodes using shared instance storage (via NFS, for example). Now suppose I have a network issue which isolates a compute node. I evacuate the instances that were on that node, which causes them to

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-21 Thread Chris Friesen
On 11/21/2013 10:52 AM, Stephen Gran wrote: On 21/11/13 15:49, Chris Friesen wrote: On 11/21/2013 02:58 AM, Soren Hansen wrote: 2013/11/20 Chris Friesen chris.frie...@windriver.com: What about a hybrid solution? There is data that is only used by the scheduler--for performance reasons maybe

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-21 Thread Chris Friesen
On 11/21/2013 02:58 AM, Soren Hansen wrote: 2013/11/20 Chris Friesen chris.frie...@windriver.com: What about a hybrid solution? There is data that is only used by the scheduler--for performance reasons maybe it would make sense to store that information in RAM as described at https

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-20 Thread Chris Friesen
On 11/20/2013 10:06 AM, Soren Hansen wrote: 2013/11/18 Mike Spreitzer mspre...@us.ibm.com: There were some concerns expressed at the summit about scheduler scalability in Nova, and a little recollection of Boris' proposal to keep the needed state in memory. I also heard one guy say that he

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Chris Friesen
On 11/18/2013 06:47 PM, Joshua Harlow wrote: An idea related to this, what would need to be done to make the DB have the exact state that a compute node is going through (and therefore the scheduler would not make unreliable/racey decisions, even when there are multiple schedulers). It's not

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Chris Friesen
On 11/19/2013 12:35 PM, Clint Byrum wrote: Each scheduler process can own a different set of resources. If they each grab instance requests in a round-robin fashion, then they will fill their resources up in a relatively well balanced way until one scheduler's resources are exhausted. At that
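The partitioning scheme described above can be simulated in a few lines; this is a toy sketch of the idea, not Nova code:

```python
# Toy simulation of the partitioned-scheduler idea: each scheduler owns
# a disjoint slice of capacity, and requests are handed out round-robin,
# so the pools drain roughly evenly until one is exhausted.
from itertools import cycle

class Scheduler:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity   # e.g. free vCPUs this scheduler owns

    def try_place(self, size):
        if size <= self.capacity:
            self.capacity -= size
            return True
        return False               # this scheduler's pool is exhausted

def dispatch(requests, schedulers):
    """Round-robin requests; fall through to the next scheduler if full."""
    placed = []
    ring = cycle(schedulers)
    for size in requests:
        for _ in range(len(schedulers)):
            s = next(ring)
            if s.try_place(size):
                placed.append((size, s.name))
                break
        else:
            placed.append((size, None))  # no capacity anywhere
    return placed

scheds = [Scheduler("s1", 4), Scheduler("s2", 4)]
result = dispatch([2, 2, 2, 2, 2], scheds)
# The first four requests alternate s1/s2; the fifth finds both full.
```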

Re: [openstack-dev] [Nova] Does Nova really need an SQL database?

2013-11-19 Thread Chris Friesen
On 11/19/2013 12:27 PM, Joshua Harlow wrote: Personally I would prefer #3 from the below. #2 I think will still have to deal with consistency issues, just switching away from a DB doesn't make magical ponies and unicorns appear (in-fact it can potentially make the problem worse if its done

Re: [openstack-dev] [Neutron] Race condition between DB layer and plugin back-end implementation

2013-11-18 Thread Chris Friesen
On 11/18/2013 02:25 PM, Edgar Magana wrote: The problem that we are experiencing is that when concurrent calls to the same API are sent, the operations at the plug-in back-end take long enough to make the next concurrent API call get stuck at the DB transaction level, which creates a hung

Re: [openstack-dev] [nova] Core pinning

2013-11-13 Thread Chris Friesen
On 11/13/2013 11:40 AM, Jiang, Yunhong wrote: But, from a performance point of view, it is better to exclusively dedicate PCPUs to VCPUs and the emulator. In some cases you may want to guarantee that only one instance (and its VCPUs) is using certain PCPUs. By using core pinning you can optimize

Re: [Openstack] Is there a way to tell Nova not to count NFS-mounted instances directory to disk space on each node?

2013-11-07 Thread Chris Friesen
I think it would make sense to have an entry in the nova.conf file on a compute node to indicate that the instances are mounted on shared storage. This could be fed back into the database, used in resource reporting (like in this case where it affects the real amount of storage available),
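A sketch of what such an entry might look like; the option name below is purely hypothetical, since the message is proposing the idea rather than describing an existing flag:

```ini
# Hypothetical nova.conf entry on each compute node. The option name is
# illustrative only -- no such flag existed at the time; the thread is
# suggesting one so the disk accounting can stop double-counting the
# shared NFS mount.
[DEFAULT]
instances_on_shared_storage = true
```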

Re: [Openstack] Is there a way to tell Nova not to count NFS-mounted instances directory to disk space on each node?

2013-11-07 Thread Chris Friesen
On 11/07/2013 12:52 PM, Daniel Speichert wrote: Thank you both for your answers. What Chris describes is definitely a good approach. I think it's worth filing a wish bug. I can do it unless you want to describe your idea, Chris. If you want to run with it, go for it. :) I'd appreciate it if

Re: [Openstack] Entries Marked 'deleted' in 'instance_type' Table in Nova Database

2013-11-06 Thread Chris Friesen
On 11/06/2013 11:18 AM, Jay Pipes wrote: On 11/06/2013 11:42 AM, Craig E. Ward wrote: It looks like when flavors are modified in nova, the database row with the old data is marked deleted and a new row created. Is there a reason to keep these deleted rows around? Can it cause problems to

Re: [openstack-dev] Improvement of Cinder API wrt https://bugs.launchpad.net/nova/+bug/1213953

2013-11-05 Thread Chris Friesen
On 11/05/2013 01:27 AM, Avishay Traeger wrote: I think the proper fix is to make sure that Cinder is moving the volume into 'error' state in all cases where there is an error. Nova can then poll as long as its in the 'downloading' state, until it's 'available' or 'error'. Is there a reason
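The polling behaviour proposed above might look roughly like this; a toy sketch with an injected status function for testability, not Nova's actual code:

```python
import time

# Sketch of the proposed fix: keep polling while the volume is in a
# transient state, and stop on 'available' or 'error' rather than on a
# fixed timeout. get_status is injected; in Nova it would be a call to
# the Cinder API.
TRANSIENT = {"creating", "downloading"}

def wait_for_volume(get_status, poll_interval=1.0, sleep=time.sleep):
    while True:
        status = get_status()
        if status not in TRANSIENT:
            return status            # 'available', 'error', ...
        sleep(poll_interval)

statuses = iter(["downloading", "downloading", "available"])
final = wait_for_volume(lambda: next(statuses), sleep=lambda _: None)
# final == "available"
```

This only works if Cinder reliably moves the volume to 'error' on every failure, which is exactly the prerequisite the message points out.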

Re: [openstack-dev] Improvement of Cinder API wrt https://bugs.launchpad.net/nova/+bug/1213953

2013-11-05 Thread Chris Friesen
Wouldn't you still need variable timeouts? I'm assuming that copying multi-gig cinder volumes might take a while, even if it's local. (Or are you assuming copy-on-write?) Chris On 11/05/2013 01:43 AM, Caitlin Bestler wrote: Replication of snapshots is one solution to this. You create a

Re: [Openstack] nova-compute OpenStack piece on Android devices?

2013-11-05 Thread Chris Friesen
On 11/05/2013 03:21 AM, Moltchanov Boris wrote: Hi, I’ve seen that there are a few attempts in the OpenStack community to have its Swift APIs on Android devices, but I’ve not found anything about Nova, specifically “porting” nova-compute to Android mobile devices. Is there any evaluation

Re: [openstack-dev] Improvement of Cinder API wrt https://bugs.launchpad.net/nova/+bug/1213953

2013-11-04 Thread Chris Friesen
On 11/04/2013 03:49 PM, Solly Ross wrote: So, There's currently an outstanding issue with regards to a Nova shortcut command that creates a volume from an image and then boots from it in one fell swoop. The gist of the issue is that there is currently a set timeout which can time out before the

Re: [Openstack] Operation offload to the SAN. RE: Wiping of old cinder volumes

2013-11-04 Thread Chris Friesen
On 11/03/2013 08:39 PM, Qixiaozhen wrote: In my opinion, we should rethink the way of wiping the data in the volumes. Filling the device from /dev/zero with the “dd” command is the most primitive method. The standard SCSI command “write same” could be taken into consideration. Once the LBA was

Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Chris Friesen
On 11/01/2013 11:42 AM, Jiang, Yunhong wrote: Shawn, yes, there is 56 VM access every second, and for each VM access, the scheduler will invoke filter for each host, that means, for each VM access, the filter function will be invoked 10k times. So 56 * 10k = 560k, yes, half of 1M, but still big
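The back-of-the-envelope estimate in the message works out as follows:

```python
# 56 VM requests per second, each running the filter once per host
# across a 10k-host deployment.
vm_requests_per_sec = 56
hosts = 10_000
filter_calls_per_sec = vm_requests_per_sec * hosts  # 560,000 per second
```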

Re: [openstack-dev] When is it okay for submitters to say 'I don't want to add tests' ?

2013-10-31 Thread Chris Friesen
On 10/31/2013 06:04 AM, Rosa, Andrea (HP Cloud Services) wrote: A - there is no test suite at all, adding one is unreasonable B - this thing cannot be tested in this context (e.g. functional tests are defined in a different tree) C - this particular thing is very hard to test D - testing

Re: [openstack-dev] [Heat] Locking and ZooKeeper - a space odyssey

2013-10-30 Thread Chris Friesen
On 10/30/2013 01:34 PM, Joshua Harlow wrote: To me u just made state consistency be a lock by another name. A lock protects a region of code from being mutually accessed Personally I view a lock as protecting a set of data from being mutually accessed. The question to me becomes what
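That view can be illustrated with a tiny example: the lock belongs to the data, and every access, from any code path, goes through it:

```python
import threading

# Small illustration of the "a lock protects data" framing: the lock is
# associated with the counter's value, and all mutation of that data
# happens under it, regardless of which code calls in.
class Counter:
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, n):
        with self._lock:           # every access to _value goes through here
            self._value += n

    @property
    def value(self):
        with self._lock:
            return self._value

c = Counter()
threads = [threading.Thread(target=c.add, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# c.value is exactly 100; without the lock the += read-modify-write
# could interleave and lose updates.
```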

Re: [openstack-dev] [nova][scheduler] Instance Group Model and APIs - Updated document with an example request payload

2013-10-29 Thread Chris Friesen
On 10/29/2013 03:14 PM, Andrew Laski wrote: Having Nova call into Heat is backwards IMO. Agreed. If there are specific pieces of information that Nova can expose, or API capabilities to help with orchestration/placement that Heat or some other service would like to use then let's look at

Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-28 Thread Chris Friesen
On 10/28/2013 10:30 AM, Joshua Harlow wrote: I wish everything was so simple in distributed systems (like openstack) but there are real boundaries and limits to doing something like a kill -9 correctly while retaining the consistency of the resources in your cloud (any inconsistency costs

Re: [openstack-dev] [nova] Thoughts please on how to address a problem with multiple deletes leading to a nova-compute thread pool problem

2013-10-28 Thread Chris Friesen
On 10/28/2013 12:01 PM, Joshua Harlow wrote: But there is a difference here that I think needs to be clear. Releasing the resources from nova (in the current way its done) means another individual can take those resources and that causes inconsistencies (bad for deployer). I think we talked

Re: [Openstack] [Nova]How to setup shared storage for live migration based on IPSAN device?

2013-10-28 Thread Chris Friesen
On 10/28/2013 09:35 AM, Ray Sun wrote: Daniel, Thanks for your response. But I am still confused about using an iSCSI device: if I want to use it as shared storage, first I need to attach it to a node as block storage, then mount it on the other compute nodes using NFS. The problem is that this will

Re: [Openstack] [Nova]How to setup shared storage for live migration based on IPSAN device?

2013-10-28 Thread Chris Friesen
, meaning you can mount the iSCSI block on every compute node and you should be fine - Razique On Oct 28, 2013, at 8:54, Chris Friesen chris.frie...@windriver.com wrote: On 10/28/2013 09:35 AM, Ray Sun wrote: Daniel, Thanks for your response. But I am still confused about using an iSCSI device, if I want

Re: [openstack-dev] does the nova client python API support Keystone's token-based authentication?

2013-10-25 Thread Chris Friesen
On 10/25/2013 02:08 PM, openstack learner wrote: hi guys, Instead of username/password, does the nova client python API support Keystone's token-based authentication? Yes, but normal tokens expire, so the idea is that you authenticate with username/password, then get back a token that you

Re: [openstack-dev] dd performance for wipe in cinder

2013-10-11 Thread Chris Friesen
On 10/11/2013 03:20 AM, cosmos cosmos wrote: Hello. My name is Rucia, from Samsung SDS. Now I am having trouble with Cinder volume deleting. I am developing support for big data storage on LVM, but it takes too much time to delete a Cinder LVM volume because of dd. The Cinder volume is 200GB, for

Re: [openstack-dev] [cinder] dd performance for wipe in cinder

2013-10-11 Thread Chris Friesen
On 10/11/2013 09:02 AM, John Griffith wrote: As Matt pointed out there's an option to turn off secure-delete altogether. The reason for the volume_clear setting (aka secure delete) is that since we're allocating volumes via LVM from a shared VG there is the possibility that a user had a volume

[openstack-dev] [nova] odd behaviour from sqlalchemy

2013-10-11 Thread Chris Friesen
Hi, I'm using grizzly with sqlalchemy 0.7.9. I'm seeing some funny behaviour related to the automatic update of updated_at column for the Service class in the sqlalchemy model. I added a new column to the Service class, and I want to be able to update that column without triggering the

Re: [openstack-dev] [Openstack] Neutron support for passthrough of networking devices?

2013-10-10 Thread Chris Friesen
On 10/10/2013 01:19 AM, Prashant Upadhyaya wrote: Hi Chris, I note two of your comments -- When we worked on H release, we target for basic PCI support like accelerator card or encryption card etc. PU So I note that you are already solving the PCI pass through usecase somehow ? How ? If

Re: [Openstack] [openstack-dev] Neutron support for passthrough of networking devices?

2013-10-10 Thread Chris Friesen
On 10/10/2013 01:19 AM, Prashant Upadhyaya wrote: Hi Chris, I note two of your comments -- When we worked on H release, we target for basic PCI support like accelerator card or encryption card etc. PU So I note that you are already solving the PCI pass through usecase somehow ? How ? If

Re: [openstack-dev] [nova] automatically evacuate instances on compute failure

2013-10-08 Thread Chris Friesen
On 10/08/2013 03:20 PM, Alex Glikson wrote: Seems that this can be broken into 3 incremental pieces. First, would be great if the ability to schedule a single 'evacuate' would be finally merged (_https://blueprints.launchpad.net/nova/+spec/find-host-and-evacuate-instance_). Agreed. Then, it

Re: [openstack-dev] [nova] BUG? nova-compute should delete unused instance files on boot

2013-10-07 Thread Chris Friesen
On 10/07/2013 12:44 PM, Russell Bryant wrote: On 10/07/2013 02:28 PM, Chris Friesen wrote: I've been doing a lot of instance creation/deletion/evacuate and I've noticed that if I 1)create an instance 2) power off the compute node it was running on 3) delete the instance 4) boot up the compute

Re: [openstack-dev] [nova] RFC: adding on_shared_storage field to instance

2013-10-04 Thread Chris Friesen
On 10/04/2013 03:31 AM, Cristian Tomoiaga wrote: Hello Chris, Just a note regarding this. I was thinking on using local plus shared storage for an instance ( eg. root disk local and another disk as a cinder volume ). If I understand this correctly, flagging the instance as having local storage

Re: [openstack-dev] [nova] RFC: adding on_shared_storage field to instance

2013-10-04 Thread Chris Friesen
On 10/04/2013 12:06 PM, Caitlin Bestler wrote: You've covered some reasons why there might be an instance attribute, but you still need to deal with getting the information about the underlying storage services from those storage services. Don't make assumptions about what a storage service

Re: [openstack-dev] strange behaviour (possible bug) with nova evacuate

2013-10-03 Thread Chris Friesen
On 10/03/2013 05:45 AM, Pavel Kravchenco wrote: Hi Chris, You probably encountered this bug: *https://bugs.launchpad.net/nova/+bug/1156269* It's been fixed here: https://review.openstack.org/#/c/24600/ Yes, that looks like what I'm seeing. Thanks for the pointer. Btw, what code are you

Re: [openstack-dev] strange behaviour (possible bug) with nova evacuate

2013-10-03 Thread Chris Friesen
On 10/02/2013 11:42 PM, Lingxian Kong wrote: Hi Chris: After exploring the code, I think there is already cleanup on the original compute node: it will check that the instances reported by the driver are still associated with this host. If they are not, they will be destroyed. Please refer

[openstack-dev] [nova] RFC: adding on_shared_storage field to instance

2013-10-03 Thread Chris Friesen
I was wondering if there is any interest in adding an on_shared_storage field to the Instance class. This would be set once at instance creation time and we would then be able to avoid having the admin manually pass it in for the various API calls (evacuate/rebuild_instance/migration/etc.)
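A minimal sketch of the proposal, with invented names; the point is that the flag is recorded once at creation time, so later operations read it off the instance instead of taking it as an argument:

```python
from dataclasses import dataclass

# Toy sketch of the RFC: record on_shared_storage once when the instance
# is created, so operations like evacuate no longer need the admin to
# pass it in. Field and function names are illustrative only.
@dataclass
class Instance:
    uuid: str
    host: str
    on_shared_storage: bool   # set once at creation time

def evacuate(instance, target_host):
    # No on_shared_storage argument needed -- read it off the instance.
    if instance.on_shared_storage:
        rebuild_from_image = False  # disks are reachable from target_host
    else:
        rebuild_from_image = True   # local disks were lost with the host
    instance.host = target_host
    return rebuild_from_image

vm = Instance(uuid="abc", host="node-1", on_shared_storage=True)
needs_rebuild = evacuate(vm, "node-2")
```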

Re: [openstack-dev] [nova] RFC: adding on_shared_storage field to instance

2013-10-03 Thread Chris Friesen
On 10/03/2013 02:02 PM, Caitlin Bestler wrote: On October 3, 2013 12:44:50 PM Chris Friesen chris.frie...@windriver.com wrote: I was wondering if there is any interest in adding an on_shared_storage field to the Instance class. This would be set once at instance creation time and we would

[openstack-dev] strange behaviour (possible bug) with nova evacuate

2013-10-02 Thread Chris Friesen
Hi all, I posted this on the IRC channel but got no response, so I'll try here. Suppose I do the following: 1) create an instance (instance files not on shared storage) 2) kill its compute node and evacuate the instance to another node 3) boot up the original compute node 4) kill the second

[openstack-dev] [nova] automatically evacuate instances on compute failure

2013-09-25 Thread Chris Friesen
I'm interested in automatically evacuating instances in the case of a failed compute node. I found the following blueprint that covers exactly this case: https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically However, the comments there seem to indicate that the code

[Openstack] cgroups cpu share allocation in grizzly seems incorrect

2013-08-23 Thread Chris Friesen
I sent this to the openstack-dev list but got no response, so I'll repost here. In Grizzly regardless of the number of vCPUs the value of /sys/fs/cgroup/cpu/libvirt/qemu/instance-X/cpu.shares seems to be the same. If we were overloaded, this would give all instances the same cpu time
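The behaviour being argued for, shares proportional to vCPU count, can be illustrated as follows, assuming the usual cgroup base weight of 1024:

```python
# Sketch of the expected behaviour: cpu.shares scaling with the vCPU
# count, so under contention a 4-vCPU guest gets four times the CPU
# time of a 1-vCPU guest. The base weight of 1024 is the common cgroup
# default; the flat behaviour observed in Grizzly corresponds to
# ignoring vcpus entirely.
BASE_SHARES = 1024

def expected_cpu_shares(vcpus, base=BASE_SHARES):
    return vcpus * base

# Relative CPU time under full contention is proportional to shares:
guests = {"one_vcpu": 1, "four_vcpu": 4}
shares = {name: expected_cpu_shares(v) for name, v in guests.items()}
total = sum(shares.values())
fractions = {name: s / total for name, s in shares.items()}
# one_vcpu gets 20% of the CPU, four_vcpu gets 80%
```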

[openstack-dev] cgroups cpu share allocation in grizzly seems incorrect

2013-08-22 Thread Chris Friesen
I just noticed that in Grizzly regardless of the number of vCPUs the value of /sys/fs/cgroup/cpu/libvirt/qemu/instance-X/cpu.shares seems to be the same. If we were overloaded, this would give all instances the same cpu time regardless of the number of vCPUs in the instance. Is this design

Re: [Openstack] object-oriented design in nova--room for improvement?

2013-08-22 Thread Chris Friesen
On 08/21/2013 09:04 PM, Joshua Harlow wrote: There is always room for improvement I hope ;) +openstack-dev (I think where u wanted this to go). A question, are u thinking about organizing the 'metadata' associated with resources? If so it might be interesting to see if there could be a grand

Re: [Openstack] object-oriented design in nova--room for improvement?

2013-08-22 Thread Chris Friesen
On 08/22/2013 11:31 AM, Joshua Harlow wrote: I think that would make sense to. Would u want to try to prototype some code that might do this. That might help the nova core people see what your idea is. Although maybe they should chime in also (since I'm not sure if any other similar efforts
