Re: [Openstack] Nova and asynchronous instance launching
There wasn't a blueprint, but you can see the change here: https://review.openstack.org/#/c/7542/

Bandwidth is updated in a DB table outside of notifications. Notifications just pull the last data received and send it. With rapid state changes, I would expect that bandwidth_usage would mostly not differ between the messages, unless a bandwidth update in the background happens to sneak in during the middle of the events.

In any case, these state change events are reported as 'compute.instance.update'. For actions like 'rebuild', you'll get an 'exists' message when the action starts, but then you'll also see some instance.update events as the states switch. At least this is how I understand it.

Besides the code, your best resource for information about notification payloads, etc. is this: http://wiki.openstack.org/SystemUsageData

- Chris

On Jul 2, 2012, at 4:38 AM, Day, Phil wrote:

Hi Chris,

Thanks for the pointer on the new notification on state change stuff, I'd missed that change. Is there a blueprint or some such which describes the change?

In particular I'm trying to understand how the bandwidth_usage values fit in here. It seems that during a VM creation there would normally be a number of fairly rapid state changes, so re-calculating the bandwidth_usage figures might be quite expensive just to log a change in task_state from, say, Networking to Block Device Mapping. I was kind of expecting that to be more part of the compute.exists messages than the update.

Do we have something that catalogues the various notification messages and their payloads?

Thanks,
Phil

-----Original Message-----
From: Chris Behrens [mailto:cbehr...@codestud.com]
Sent: 02 July 2012 00:14
To: Day, Phil
Cc: Jay Pipes; Huang Zhiteng; openstack@lists.launchpad.net
Subject: Re: [Openstack] Nova and asynchronous instance launching

On Jul 1, 2012, at 3:04 PM, Day, Phil philip@hp.com wrote:

Rather than adding debug statements could we please add additional notification events (for example a notification event whenever task_state changes)

This has been in trunk for a month or maybe a little longer. FYI

- Chris

___
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
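A minimal Python sketch of the point made in this message: the state-change notification reads the last stored bandwidth figures instead of recalculating them. Only the 'compute.instance.update' event type and the bandwidth_usage name come from the thread; `BandwidthCache`, `record_usage`, `build_update_payload`, and the payload layout are hypothetical and are not Nova's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class BandwidthCache:
    """Stands in for the DB table that a background task keeps up to date."""
    _rows: dict = field(default_factory=dict)

    def record_usage(self, instance_id, bytes_in, bytes_out):
        # Called by the periodic bandwidth poller, never by the notifier.
        self._rows[instance_id] = {"bw_in": bytes_in, "bw_out": bytes_out}

    def last_known(self, instance_id):
        # Cheap lookup: returns whatever was stored last, or zeros.
        return dict(self._rows.get(instance_id, {"bw_in": 0, "bw_out": 0}))


def build_update_payload(cache, instance_id, old_task_state, new_task_state):
    """Build a state-change notification without recalculating bandwidth."""
    return {
        "event_type": "compute.instance.update",
        "instance_id": instance_id,
        "old_task_state": old_task_state,
        "new_task_state": new_task_state,
        # Hypothetical field layout: the last values the poller stored.
        "bandwidth_usage": cache.last_known(instance_id),
    }
```

Under these assumptions, two updates emitted in quick succession carry identical bandwidth figures unless the poller writes a new row in between, which matches the behaviour described above.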
Re: [Openstack] Nova and asynchronous instance launching
Hi Chris,

Thanks for the pointer on the new notification on state change stuff, I'd missed that change. Is there a blueprint or some such which describes the change?

In particular I'm trying to understand how the bandwidth_usage values fit in here. It seems that during a VM creation there would normally be a number of fairly rapid state changes, so re-calculating the bandwidth_usage figures might be quite expensive just to log a change in task_state from, say, Networking to Block Device Mapping. I was kind of expecting that to be more part of the compute.exists messages than the update.

Do we have something that catalogues the various notification messages and their payloads?

Thanks,
Phil
Re: [Openstack] Nova and asynchronous instance launching
On 06/29/2012 01:50 PM, David Kranz wrote:

An assumption is being made here that the user and cloud provider are unrelated. But I think there are many projects under development where a cloud-based service is being provided on top of an OpenStack infrastructure. In that use case, the direct user of OpenStack APIs and the cloud provider may be the same entity. It would be really nice if when an application fires up an instance that enters the error state, there was an api that could get the reason why it failed with as much information as the OpenStack code that set the instance state to ERROR had. If we are concerned that such information is sensitive and a public provider might not want to give it all to users, this could be an admin-only API. There are many variations of how the information is controlled.

Yeah, I think this is an excellent suggestion. To be clear, I responded earlier about adding more debug log statements to nova-network and nova-compute -- but I wasn't suggesting that as a user-facing solution to incident tracking :) I was only suggesting that more granular debug messages in logs can assist the operator.

Best,
-jay
Re: [Openstack] Nova and asynchronous instance launching
Rather than adding debug statements could we please add additional notification events (for example a notification event whenever task_state changes)?

Anyone that wants log file entries could then use the log_notifier, but those that want to get information like this back into a central system can then use rabbit_notifier.

Maybe we need some way of configuring filters on the notifier stream for those that want to decide which events should be logged, sent to MQ, or ignored altogether.

Phil

-----Original Message-----
From: openstack-bounces+philip.day=hp@lists.launchpad.net [mailto:openstack-bounces+philip.day=hp@lists.launchpad.net] On Behalf Of Jay Pipes
Sent: 29 June 2012 18:47
To: Huang Zhiteng
Cc: openstack@lists.launchpad.net
Subject: Re: [Openstack] Nova and asynchronous instance launching

On 06/29/2012 04:25 AM, Huang Zhiteng wrote:

Sounds like a performance issue. I think this symptom can be much eased if we spend some time fixing whatever bottleneck is causing this (slow AMQP, scheduler, or network)? Now that Nova API has got multiprocess enabled, we'd move to the next bottleneck in the long path of 'launching instance'. Devin, is it possible that you provide more details about this issue so that someone else can reproduce it?

Actually, Vish, David Kranz and I had a discussion about similar stuff on IRC yesterday. I think that an easy win for this would be to add much more fine-grained DEBUG logging statements in the various nova service pieces -- nova-compute, nova-network, etc.

Right now, there are areas that seem to look like performance or locking culprits (iptables save/restore for example), but because there aren't very fine-grained logging statements, it's tough to say whether:

a) A process (or greenthread) has simply yielded to another while it waits for something
b) A process is doing something that is blocking, or
c) A process is doing some other work but no log statements are being logged about that work, which makes it seem like some other work is taking much longer than it really is

This would be a really easy win for a beginner developer or someone looking for something to assist with -- simply add informative LOG.debug() statements at various points in the API call pipelines.

Best,
-jay
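The filter idea in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: `RULES`, `route_event`, and `dispatch` are hypothetical names, the destinations "log" and "mq" are plain labels rather than the real log_notifier or rabbit_notifier drivers, and the event types other than 'compute.instance.update' are made up for the example.

```python
from fnmatch import fnmatch

# Ordered rules: the first matching pattern wins. Destinations are labels
# only; a real deployment would map them to notifier drivers.
RULES = [
    ("compute.instance.update", "mq"),
    ("compute.instance.exists", "mq"),
    ("scheduler.*", "log"),
    ("*", "ignore"),
]


def route_event(event_type, rules=RULES):
    """Return the destination label for an event type."""
    for pattern, destination in rules:
        if fnmatch(event_type, pattern):
            return destination
    return "ignore"


def dispatch(notifications, sinks, rules=RULES):
    """Append each notification dict to the sink chosen by route_event."""
    for message in notifications:
        destination = route_event(message["event_type"], rules)
        if destination != "ignore":
            sinks[destination].append(message)
    return sinks
```

Keeping the rules as an ordered list of patterns means an operator could change what is logged, queued, or dropped through configuration alone, without touching the code that emits the events.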
Re: [Openstack] Nova and asynchronous instance launching
On Jul 1, 2012, at 3:04 PM, Day, Phil philip@hp.com wrote:

Rather than adding debug statements could we please add additional notification events (for example a notification event whenever task_state changes)

This has been in trunk for a month or maybe a little longer. FYI

- Chris
Re: [Openstack] Nova and asynchronous instance launching
On Fri, Jun 29, 2012 at 5:19 AM, Devin Carlen de...@openstack.org wrote:

On Jun 28, 2012, at 9:01 AM, Jay Pipes wrote:

On 06/27/2012 06:51 PM, Doug Davis wrote:

Consider the creation of a Job type of entity that will be returned from the original call - probably a 202. Then the client can check the Job to see how things are going. BTW - this pattern can be used for any async op, not just the launching of multiple instances since technically any op might be long-running (or queued) based on the current state of the system.

Note that much of the job of launching an instance is already asynchronous -- the initial call to create an instance really just creates an instance UUID and returns to the caller -- most of the actual work to create the instance is then done via messaging calls and the caller can continue to call for a status of her instance to check on it.

In this particular case, I believe Devin is referring to when you indicate you want to spawn a whole bunch of instances and in that case, things happen synchronously instead of asynchronously? Devin, is that correct? If so, it seems like returning a packet immediately that contains a list of the instance UUIDs that can be used for checking status is the best option?

Yep, exactly. The client still waits synchronously for the underlying RPC to complete.

Sounds like a performance issue. I think this symptom can be much eased if we spend some time fixing whatever bottleneck is causing this (slow AMQP, scheduler, or network)? Now that Nova API has got multiprocess enabled, we'd move to the next bottleneck in the long path of 'launching instance'. Devin, is it possible that you provide more details about this issue so that someone else can reproduce it?

Or am I missing something here?

-jay

--
Regards
Huang Zhiteng
Re: [Openstack] Nova and asynchronous instance launching
Note that I do distinguish between a 'real' async op (where you really return little more than a 202) and one that returns a skeleton of the resource being created - like instance.create() does now.

So the latter approach at least provides a way to poll on the resource status, so as to figure out if and when it becomes usable. In the happy-path, eventually the instance status transitions to ACTIVE and away we go.

However, considering the unhappy-path for a second, is there a place for surfacing some more context as to why the new instance unexpectedly went into the ERROR state? For example even just an indication that failure occurred in the scheduler (e.g. resource starvation) or on the target compute node. Is the thought that such information may be operationally sensitive, or just TMI for a typical cloud user?

Cheers,
Eoghan
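The polling approach described in this message might look like the sketch below. It assumes a caller-supplied `get_status` function (hypothetical) that returns the instance status string; `wait_for_instance` and its parameters are illustrative names, not part of any OpenStack client library.

```python
import time


def wait_for_instance(get_status, timeout=300.0, interval=2.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the instance reaches ACTIVE or ERROR, or time out.

    get_status is any zero-argument callable returning the current status
    string, e.g. a wrapper around a GET on the server resource.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("ACTIVE", "ERROR"):
            return status
        if clock() >= deadline:
            raise TimeoutError("instance still in %s after %.0fs" % (status, timeout))
        sleep(interval)
```

Note what the caller gets back on the unhappy path: only the string "ERROR", with no indication of whether the scheduler or the compute node failed, which is exactly the gap raised above.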
Re: [Openstack] Nova and asynchronous instance launching
Right - examining the current state isn't a good way to determine what happened with one particular request. This is exactly one of the reasons some providers create Jobs for all actions. Checking the resource later to see why something bad happened is fragile since other operations might have happened since then, erasing any error message type of state info. And relying on event/error logs is hard since correlating one particular action with a flood of events is tricky - especially in a multi-user environment where several actions could be underway at once.

If each action resulted in a Job URI being returned then the client can check that Job resource when it's convenient for them - and this could be quite useful in both happy and unhappy situations.

And to be clear, a Job doesn't necessarily need to be a full new resource, it could (under the covers) map to a grouping of event log entries, but the point is that from a client's perspective they have an easy mechanism (e.g. issue a GET to a single URI) that returns all of the info needed to determine what happened with one particular operation.

thanks
-Doug
__
STSM | Standards Architect | IBM Software Group
(919) 254-6905 | IBM 444-6905 | d...@us.ibm.com
The more I'm around some people, the more I like my dog.
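To make the Job idea in this message concrete, here is a small in-memory Python sketch. Everything in it is an assumption for illustration: `JobStore`, the `/jobs/<id>` URI shape, and the status strings are hypothetical, and nothing like this is claimed to exist in Nova.

```python
import uuid


class JobStore:
    """Groups the events of one operation under one Job URI."""

    def __init__(self):
        self._jobs = {}

    def create(self, action):
        # Return what an API might send back: 202 Accepted plus a Job URI.
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"action": action, "status": "PENDING", "events": []}
        return 202, "/jobs/" + job_id

    def _lookup(self, job_uri):
        return self._jobs[job_uri.rsplit("/", 1)[-1]]

    def record(self, job_uri, event, status=None):
        # Services append events to the job they are working on.
        job = self._lookup(job_uri)
        job["events"].append(event)
        if status is not None:
            job["status"] = status

    def get(self, job_uri):
        # The single GET a client issues to learn what happened.
        job = self._lookup(job_uri)
        return {"action": job["action"], "status": job["status"],
                "events": list(job["events"])}
```

Because every event is filed under the job that caused it, a later operation on the same resource cannot overwrite the record of an earlier failure, and concurrent actions by different users stay separate.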
Re: [Openstack] Nova and asynchronous instance launching
However, considering the unhappy-path for a second, is there a place for surfacing some more context as to why the new instance unexpectedly went into the ERROR state?

I assume the philosophy is that the API has validated the request as far as it can, and returned any meaningful error messages, etc. Anything that fails past that point is something going wrong from the cloud provider and there is nothing the user could have done to avoid the error, so any additional information won't help them.

However on the basis that up-front validation is seldom perfect, and things can change while a request is in flight, I think that being able to tell a user that, for example, their request failed because the image was deleted before it could be downloaded would be useful.

One approach might be to make the task_state more granular and use that to qualify the error. In general our users have found having the state shown as vm_state (task_state) was useful as it shows progress during things like building.

Phil
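A rough Python sketch of the "vm_state (task_state)" display and of using the last task_state to qualify an error, as suggested in this message. The instance is modelled as a plain dict, and `display_state`, `mark_error`, the `error_reason` key, and the state names in the usage example are all hypothetical.

```python
def display_state(instance):
    """Render the state as 'vm_state (task_state)', or just vm_state."""
    vm_state = instance["vm_state"]
    task_state = instance.get("task_state")
    return "%s (%s)" % (vm_state, task_state) if task_state else vm_state


def mark_error(instance, reason):
    """Return a copy in the error state that keeps the task it failed in."""
    failed = dict(instance)
    failed["vm_state"] = "error"
    # task_state is deliberately left untouched so it qualifies the error.
    failed["error_reason"] = reason
    return failed
```

For example, an instance that fails while in a hypothetical "image_download" task would then display as "error (image_download)", which tells the user roughly where things went wrong without exposing provider internals.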
Re: [Openstack] Nova and asynchronous instance launching
On 06/29/2012 04:25 AM, Huang Zhiteng wrote:

Sounds like a performance issue. I think this symptom can be much eased if we spend some time fixing whatever bottleneck is causing this (slow AMQP, scheduler, or network)? Now that Nova API has got multiprocess enabled, we'd move to the next bottleneck in the long path of 'launching instance'. Devin, is it possible that you provide more details about this issue so that someone else can reproduce it?

Actually, Vish, David Kranz and I had a discussion about similar stuff on IRC yesterday. I think that an easy win for this would be to add much more fine-grained DEBUG logging statements in the various nova service pieces -- nova-compute, nova-network, etc.

Right now, there are areas that seem to look like performance or locking culprits (iptables save/restore for example), but because there aren't very fine-grained logging statements, it's tough to say whether:

a) A process (or greenthread) has simply yielded to another while it waits for something
b) A process is doing something that is blocking, or
c) A process is doing some other work but no log statements are being logged about that work, which makes it seem like some other work is taking much longer than it really is

This would be a really easy win for a beginner developer or someone looking for something to assist with -- simply add informative LOG.debug() statements at various points in the API call pipelines.

Best,
-jay
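One way the fine-grained debug logging suggested in this message could be added is with a small timing helper, sketched below using only the Python standard library. `timed_step` is a hypothetical helper name, and the step name in the usage note is just an example taken from the iptables case mentioned above.

```python
import logging
import time
from contextlib import contextmanager

LOG = logging.getLogger(__name__)


@contextmanager
def timed_step(name, clock=time.monotonic):
    """Log at DEBUG when a step starts and how long it took to finish."""
    start = clock()
    LOG.debug("%s: starting", name)
    try:
        yield
    finally:
        # Logged even if the step raises, so stalls and failures both show up.
        LOG.debug("%s: finished in %.3fs", name, clock() - start)
```

Wrapping a suspect section, for instance `with timed_step("iptables-save"): ...`, leaves a start line and a duration line in the log. That helps separate the cases listed above: a long gap inside one step points at blocking or yielding, while gaps between steps point at work that is not being logged.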
Re: [Openstack] Nova and asynchronous instance launching
An assumption is being made here that the user and cloud provider are unrelated. But I think there are many projects under development where a cloud-based service is being provided on top of an OpenStack infrastructure. In that use case, the direct user of OpenStack APIs and the cloud provider may be the same entity.

It would be really nice if, when an application fires up an instance that enters the error state, there was an API that could get the reason why it failed with as much information as the OpenStack code that set the instance state to ERROR had. If we are concerned that such information is sensitive and a public provider might not want to give it all to users, this could be an admin-only API. There are many variations of how the information is controlled.

-David

If we are concerned that a public provider might not want to give some information to users, this could be an admin-only API.
Re: [Openstack] Nova and asynchronous instance launching
You don't really expect a client (think ec2-like-user) to analyze debug info, do you? I really think we need a nice consistent way for people to see what's going on with long-running operations. Debug info isn't that to me.

thanks
-Doug
__
STSM | Standards Architect | IBM Software Group
(919) 254-6905 | IBM 444-6905 | d...@us.ibm.com
The more I'm around some people, the more I like my dog.
Re: [Openstack] Nova and asynchronous instance launching
On 06/29/2012 05:45 PM, Doug Davis wrote:

You don't really expect a client (think ec2-like-user) to analyze debug info do you? I really think we need a nice consistent way for people to see what's going on with long-running operations. Debug info isn't that to me.

thanks
-Doug

Also, see: http://wiki.openstack.org/MailingListEtiquette particularly the first point, re: HTML email.

Cheers,
-jay
Re: [Openstack] Nova and asynchronous instance launching
True, that's all useful info, but I thought the original problem being addressed was how the end-user could know what's going on for long-running ops.

thanks
-Doug
__
STSM | Standards Architect | IBM Software Group
(919) 254-6905 | IBM 444-6905 | d...@us.ibm.com
The more I'm around some people, the more I like my dog.

Jay Pipes jaypi...@gmail.com
06/29/2012 06:03 PM
To Doug Davis/Raleigh/IBM@IBMUS
cc openstack@lists.launchpad.net, Huang Zhiteng winsto...@gmail.com
Subject Re: [Openstack] Nova and asynchronous instance launching

I'm not expecting a client to do anything, and I'm not sure where you got that from my response below... I'm talking about adding debug statements into the nova-compute/nova-network logs that an *operator* or *core developer* would use to determine which parts of the code are taking the most time.

-jay
Re: [Openstack] Nova and asynchronous instance launching
There's only 1 rpc call unless you're running Cactus or something. All schedulers have a loop... not the API. min-count is unfortunately special-cased right now to be a single call vs. cast, though. I was going to fix that real soon. The problem is the scheduler creating the DB records vs. the API in this case. I can expand on this when I'm not replying from a phone. :)

There are some other things that would be nice to do here with the API, but the call can change to a cast with no API behavior change (except for speeding up the response :)

- Chris

On Jun 27, 2012, at 12:53 PM, Devin Carlen de...@openstack.org wrote:

We filed a blueprint for this yesterday: https://blueprints.launchpad.net/nova/+spec/launch-instances-async

Currently if a user attempts to create a lot of instances with a single API call (using min_count) the request will hang for a long time while all RPC calls are completed. For a large number of instances this can take a very long time. The API should return immediately and asynchronously make RPC calls.

We are looking for creative ways to work around this problem, but in the meantime I'd like to hear from folks on what they think the preferred solution would be.

Devin
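The call vs. cast distinction in this message can be sketched with an in-memory stand-in for the message bus. This is not Nova's RPC layer: `FakeBus`, `fake_scheduler`, `create_with_call`, and `create_with_cast` are hypothetical names, and the sketch assumes (following the remark about who creates the DB records) that the API can only return immediately if it generates the instance IDs itself.

```python
import queue
import uuid


def fake_scheduler(method, count=0, instance_ids=None):
    # Pretend to schedule; create IDs only if the API did not supply them.
    return instance_ids or [str(uuid.uuid4()) for _ in range(count)]


class FakeBus:
    """In-memory stand-in for the message bus."""

    def __init__(self, handler):
        self.handler = handler
        self.pending = queue.Queue()

    def call(self, method, **kwargs):
        # Synchronous: the caller blocks until the handler has returned.
        return self.handler(method, **kwargs)

    def cast(self, method, **kwargs):
        # Asynchronous: enqueue and return at once; a worker drains later.
        self.pending.put((method, kwargs))

    def drain(self):
        # What a worker would do: process everything that was cast.
        results = []
        while not self.pending.empty():
            method, kwargs = self.pending.get()
            results.append(self.handler(method, **kwargs))
        return results


def create_with_call(bus, min_count):
    # The scheduler side creates the records, so the API must wait for them.
    return bus.call("run_instance", count=min_count)


def create_with_cast(bus, min_count):
    # The API side creates the IDs up front, so it can return immediately.
    instance_ids = [str(uuid.uuid4()) for _ in range(min_count)]
    bus.cast("run_instance", instance_ids=instance_ids)
    return instance_ids
```

In both variants the caller ends up with a list of instance IDs, so the response shape is the same; the cast variant simply hands the IDs back before any scheduling work has run.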
Re: [Openstack] Nova and asynchronous instance launching
On 06/27/2012 06:51 PM, Doug Davis wrote:

Consider the creation of a Job type of entity that will be returned from the original call - probably a 202. Then the client can check the Job to see how things are going. BTW - this pattern can be used for any async op, not just the launching of multiple instances since technically any op might be long-running (or queued) based on the current state of the system.

Note that much of the job of launching an instance is already asynchronous -- the initial call to create an instance really just creates an instance UUID and returns to the caller -- most of the actual work to create the instance is then done via messaging calls and the caller can continue to call for a status of her instance to check on it.

In this particular case, I believe Devin is referring to when you indicate you want to spawn a whole bunch of instances and in that case, things happen synchronously instead of asynchronously? Devin, is that correct? If so, it seems like returning a packet immediately that contains a list of the instance UUIDs that can be used for checking status is the best option?

Or am I missing something here?

-jay
Re: [Openstack] Nova and asynchronous instance launching
Understood, but I'd rather solve this more generically once instead of each possible async op doing its own thing. I like consistency :-) Note that I do distinguish between a 'real' async op (where you really return little more than a 202) and one that returns a skeleton of the resource being created - like instance.create() does now.

thanks
-Doug
__
STSM | Standards Architect | IBM Software Group
(919) 254-6905 | IBM 444-6905 | d...@us.ibm.com
The more I'm around some people, the more I like my dog.

On 06/28/2012 12:01 PM, Jay Pipes jaypi...@gmail.com wrote:

On 06/27/2012 06:51 PM, Doug Davis wrote:

Consider the creation of a Job type of entity that will be returned from the original call - probably a 202. Then the client can check the Job to see how things are going. BTW - this pattern can be used for any async op, not just the launching of multiple instances since technically any op might be long-running (or queued) based on the current state of the system.

Note that much of the job of launching an instance is already asynchronous -- the initial call to create an instance really just creates an instance UUID and returns to the caller -- most of the actual work to create the instance is then done via messaging calls and the caller can continue to call for a status of her instance to check on it.

In this particular case, I believe Devin is referring to when you indicate you want to spawn a whole bunch of instances and in that case, things happen synchronously instead of asynchronously? Devin, is that correct? If so, it seems like returning a packet immediately that contains a list of the instance UUIDs that can be used for checking status is the best option?

Or am I missing something here?

-jay
Re: [Openstack] Nova and asynchronous instance launching
On Jun 28, 2012, at 9:01 AM, Jay Pipes wrote:

On 06/27/2012 06:51 PM, Doug Davis wrote:

Consider the creation of a Job type of entity that will be returned from the original call - probably a 202. Then the client can check the Job to see how things are going. BTW - this pattern can be used for any async op, not just the launching of multiple instances since technically any op might be long-running (or queued) based on the current state of the system.

Note that much of the job of launching an instance is already asynchronous -- the initial call to create an instance really just creates an instance UUID and returns to the caller -- most of the actual work to create the instance is then done via messaging calls and the caller can continue to call for a status of her instance to check on it.

In this particular case, I believe Devin is referring to when you indicate you want to spawn a whole bunch of instances and in that case, things happen synchronously instead of asynchronously? Devin, is that correct? If so, it seems like returning a packet immediately that contains a list of the instance UUIDs that can be used for checking status is the best option?

Yep, exactly. The client still waits synchronously for the underlying RPC to complete. An immediate 202 would be a great way to deal with this.

Or am I missing something here?

-jay
[Openstack] Nova and asynchronous instance launching
We filed a blueprint for this yesterday: https://blueprints.launchpad.net/nova/+spec/launch-instances-async

Currently if a user attempts to create a lot of instances with a single API call (using min_count) the request will hang for a long time while all RPC calls are completed. For a large number of instances this can take a very long time. The API should return immediately and asynchronously make RPC calls. We are looking for creative ways to work around this problem, but in the meantime I'd like to hear from folks on what they think the preferred solution would be.

Devin
Re: [Openstack] Nova and asynchronous instance launching
Consider the creation of a Job type of entity that will be returned from the original call - probably a 202. Then the client can check the Job to see how things are going. BTW - this pattern can be used for any async op, not just the launching of multiple instances since technically any op might be long-running (or queued) based on the current state of the system.

thanks
-Doug
__
STSM | Standards Architect | IBM Software Group
(919) 254-6905 | IBM 444-6905 | d...@us.ibm.com
The more I'm around some people, the more I like my dog.

On 06/27/2012 03:53 PM, Devin Carlen de...@openstack.org wrote:

We filed a blueprint for this yesterday: https://blueprints.launchpad.net/nova/+spec/launch-instances-async

Currently if a user attempts to create a lot of instances with a single API call (using min_count) the request will hang for a long time while all RPC calls are completed. For a large number of instances this can take a very long time. The API should return immediately and asynchronously make RPC calls. We are looking for creative ways to work around this problem, but in the meantime I'd like to hear from folks on what they think the preferred solution would be.

Devin
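One possible shape for the generic Job entity proposed above is sketched below. This is only an assumption about what such an entity could look like: the Job class, its state names, start_job and get_job are hypothetical, a thread stands in for whatever would do the real work, and the in-memory _jobs dict stands in for persistent storage.

```python
import threading
import uuid

_jobs = {}  # job id -> Job; stands in for persistent storage


class Job:
    """Tracks one long-running operation so clients can poll its progress."""

    def __init__(self):
        self.id = str(uuid.uuid4())
        self.state = "pending"  # pending -> running -> done | failed
        self.result = None
        self.error = None
        self.thread = None  # kept only so this sketch can be joined in tests


def start_job(operation, *args):
    """Start `operation` in the background and return (202, job) at once."""
    job = Job()
    _jobs[job.id] = job

    def run():
        job.state = "running"
        try:
            job.result = operation(*args)
            job.state = "done"
        except Exception as exc:  # record the failure on the job itself
            job.error = str(exc)
            job.state = "failed"

    job.thread = threading.Thread(target=run, daemon=True)
    job.thread.start()
    return 202, job


def get_job(job_id):
    """What a client would call repeatedly to see how things are going."""
    return _jobs[job_id]
```

Because start_job accepts any callable, the same entry point could wrap launching many instances or any other operation that might be long-running or queued, which is the consistency argument made in the thread.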