Re: [Openstack] Reinstating Trey Morris for Nova Core

2013-01-24 Thread Yun Mao
+1

On Tue, Jan 22, 2013 at 7:25 PM, Vishvananda Ishaya
wrote:

> +1
>
> We mentioned previously that we would fast-track former core members back
> in.
> I guess we can wait a couple of days to see if anyone objects and then add
> him back.
>
> Vish
> On Jan 22, 2013, at 3:38 PM, Matt Dietz  wrote:
>
> > All,
> >
> >   I think Trey Morris has been doing really well on reviews again,
> so I'd
> > like to propose him to be reinstated for Nova core. Thoughts?
> >
> > -Dietz
> >
> >
> >
>
>
>
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Versioning for notification messages

2012-10-10 Thread Yun Mao
Just a thought - this sounds like what systems such as Google Protocol
Buffers are for, where multiple versions of structured data are
serialized/deserialized. Thanks,

Yun
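
P.S. To make the version-in-payload idea concrete, here is a minimal
sketch (plain Python, not nova code; the field names are made up): the
producer stamps each message and the consumer dispatches on the stamp,
failing loudly on versions it does not know.

import json

def make_notification(event_type, payload, version='1.0'):
    # Producer side: stamp every message with its format version.
    return json.dumps({'version': version,
                       'event_type': event_type,
                       'payload': payload})

def handle_notification(raw, handlers):
    # Consumer side: dispatch on the version; an unknown version is an
    # explicit error rather than a silent misparse.
    msg = json.loads(raw)
    try:
        handler = handlers[msg['version']]
    except KeyError:
        raise ValueError('unsupported notification version: %s'
                         % msg['version'])
    return handler(msg['event_type'], msg['payload'])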

On Wed, Oct 10, 2012 at 3:27 AM, Day, Phil  wrote:
> Hi All,
>
> I guess I may have mis-stated the problem a tad in talking about version 
> numbering.  The notification system is an outbound interface, and my interest 
> is in being able to write consumers with some guarantee that they won't be 
> broken as the notification message format evolves.
>
> Having a version number gives the client a way to know that it may now be 
> broken, but that's not really the same as having an interface with some 
> degree of guaranteed compatibility,
>
> Phil
>
> -Original Message-
> From: openstack-bounces+philip.day=hp@lists.launchpad.net 
> [mailto:openstack-bounces+philip.day=hp@lists.launchpad.net] On Behalf Of 
> David Ripton
> Sent: 09 October 2012 20:59
> To: openstack@lists.launchpad.net
> Subject: Re: [Openstack] Versioning for notification messages
>
> On 10/09/2012 01:07 PM, Day, Phil wrote:
>
>> What do people think about adding a version number to the notification
>> systems, so that consumers of notification messages are protected to
>> some extent from changes in the message contents ?
>>
>> For example, would it be enough to add a version number to the
>> messages - or should we have the version number as part of the topic
>> itself (so that the notification system can provide both a 1.0 and 1.1 
>> feed), etc ?
>
> Putting a version number in the messages is easy, and should work fine.
>   Of course it only really helps if someone writes clients that can deal with 
> multiple versions, or at least give helpful error messages when they get an 
> unexpected version.
>
> I think using separate topics for each version would be inefficient and 
> error-prone.
>
> Inefficient because you'd have to send out multiples of each message, some of 
> which would probably never be read.  Obviously, if you're sending out N 
> copies of each message then you expect only 1/N the queue performance.  
> Worse, if you're sending out N copies of each message but only 1 of them is 
> being consumed, your queue server is using a lot more memory than it needs 
> to, to hold onto old messages that nobody needs.
> (If you properly configure a high-water mark or timeout, then the old 
> messages will eventually be thrown away.  If you don't, then your queue 
> server will eventually consume way too much memory and start swapping, your 
> cloud will break, and someone will get paged at 2 a.m.)
>
> Error-prone because someone would end up reusing the notification queue code 
> for less idempotent/safe uses of queues, like internal API calls.
> And then client A would pick up the message from topic_v1, and client B would 
> pick up the same message from topic_v2, and they'd both perform the same API 
> operation, resulting in wasted resources in the best case and data corruption 
> in the worst case.
>
> --
> David Ripton   Red Hat   drip...@redhat.com
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] [nova] Proposal for Sean Dague to join nova-core

2012-07-25 Thread Yun Mao
+1

Yun

On Monday, July 23, 2012, Johannes Erdfelt wrote:

> On Fri, Jul 20, 2012, Vishvananda Ishaya  wrote:
> > When I was going through the list of reviewers to see who would be good
> > for nova-core a few days ago, I left one out. Sean has been doing a lot
> > of reviews lately[1] and did the refactor and cleanup of the driver
> > loading code. I think he would also be a great addition to nova-core.
>
> +1
>
> JE
>
>
>
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [Nova] The server instance changes from 'shutoff' to 'active' after created an instance snapshot

2012-07-11 Thread Yun Mao
Hi,

What you describe sounds like a bug. If the VM is running when you take
a snapshot, the VM is temporarily suspended, snapshotted, then resumed.
But if the VM is off when you take a snapshot, it should remain off
after the snapshot, not be set back to the ACTIVE state. Would you mind
filing a bug on Launchpad? Thanks,

Yun
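
P.S. For illustration only, a small sketch of the behavior I would
expect (the helper names are made up, not the actual manager code):
remember the vm_state before the snapshot and restore it afterwards
instead of unconditionally flipping the instance back to ACTIVE.

def snapshot_preserving_state(instance, take_snapshot):
    # instance is a dict-like record; take_snapshot is whatever actually
    # captures the image.
    previous_state = instance['vm_state']      # e.g. 'shutoff'
    instance['task_state'] = 'image_snapshot'
    try:
        take_snapshot(instance)
    finally:
        instance['vm_state'] = previous_state  # not hard-coded ACTIVE
        instance['task_state'] = None
    return instance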

On Wed, Jul 11, 2012 at 3:25 AM, Wangpan  wrote:
> Hi all,
>
> These are what I did (nova version is 'Essex'):
> 1. logon the instance
> 2. shutdown it using 'shutdown -h now'
> 3. wait a few minutes until the instance's vm_state, power_state change to
> 'shutoff', '5' respectively
> 4. run cmd 'nova image-create  ' to snapshot the server
> instance
> 5. the instance's vm_state changes to 'active', but the power_state doesn't
> change(still '5')
>
> My questions:
> 1. Can we create a snapshot with a 'shutoff' instance?
> 2. If we can do the above operation, what is the correct vm_state after it?
> still 'shutoff' or changes to 'active'?
> 3. I also saw the source codes in
> nova/nova/compute/manager.py:snapshot_instance(), the vm_state is update at
> first by
>self._instance_update(context,
>   instance_ref['id'],
>   power_state=current_power_state,
>   vm_state=vm_states.ACTIVE)
>and the task_state is changed to 'None' finally
>self._instance_update(context, instance_ref['id'],
> task_state=None)
>should we change back the vm_state like this, too?
>
> Thanks in Advance
>
> Best Regards,
> Wangpan
>
> 
> Youdao Dictionary - the best free all-in-one translation software!
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Timeout during image build (task Networking)

2012-06-20 Thread Yun Mao
Jay,

there is a tools/clean_file_locks.py that you might be able to take
advantage of.

Yun

On Wed, Jun 20, 2012 at 3:23 PM, Jay Pipes  wrote:
> Turns out my issue was a borked run of Tempest that left a
> nova-ensure_bridge.lock file around. After manually destroying this lock
> file, Tempest is running cleanly again.
>
> I'll look into adding a forcible removal of this lockfile to the unstack.sh
> script (which I personally use to reset my Devstack envs)
>
> Thanks,
> -jay
>
>
> On 06/19/2012 03:13 PM, Vishvananda Ishaya wrote:
>>
>> Sorry, paste fail on the last message.
>>
>> This seems like a likely culprit:
>>
>> https://review.openstack.org/#/c/8339/
>>
>> I'm guessing it only happens on concurrent builds? We probably need a
>> synchronized somewhere.
>>
>> Vish
>>
>> On Jun 19, 2012, at 12:03 PM, Jay Pipes wrote:
>>
>>> cc'ing Vish on this, as this is now occurring on every single devstack
>>> + Tempest run, for multiple servers.
>>>
>>> Vish, I am seeing the exact same issue as shown below. Instances end
>>> up in ERROR state and looking into the nova-network log, I find *no*
>>> errors at all, and yet looking at the nova-compute log, I see multiple
>>> timeout errors -- all of them trying to RPC while in the
>>> allocate_network method. Always the same method, always the same
>>> error, and no errors in nova-network or nova-api (other than just
>>> reporting a failed build)
>>>
>>> Any idea on something that may have crept in recently? This wasn't
>>> happening a week or so ago, AFAICT.
>>>
>>> Best,
>>> -jay
>>>
>>> On 06/18/2012 06:03 PM, Lillie Ross-CDSR11 wrote:

 I'm receiving RPC timeouts when trying to launch an instance. My
 installation is the Essex release running on Ubuntu 12.04.

 When I launch a test image, the launch fails. In my setup, Nova network
 runs on a controller node, and all compute instances run on separate,
 dedicated server nodes. The failure is repeatable. Upon examining the
 various logs, I see the following (see below). Any insight would be
 welcome.

 Regards,
 Ross

 From 'nova show ' I read the following:

 root@cirrus1:~# nova show test

 +-+-+
 | Property | Value |

 +-+-+
 | OS-DCF:diskConfig | MANUAL |
 | OS-EXT-SRV-ATTR:host | nova8 |
 | OS-EXT-SRV-ATTR:hypervisor_hostname | None |
 | OS-EXT-SRV-ATTR:instance_name | instance-0005 |
 | OS-EXT-STS:power_state | 0 |
 | OS-EXT-STS:task_state | networking |
 | OS-EXT-STS:vm_state | error |
 | accessIPv4 | |
 | accessIPv6 | |
 | config_drive | |
 | created | 2012-06-18T20:42:56Z |
 | fault | {u'message': u'Timeout', u'code': 500, u'created':
 u'2012-06-18T20:43:58Z'} |
 | flavor | m1.tiny |
 | hostId | 50272989300483e2b5e5236cd572fef3f9149ae60faa5f5660f8da54 |
 | id | d569b16f-10a8-4cb8-90a3-d5b664c2322d |
 | image | tty-linux |
 | key_name | admin |
 | metadata | {} |
 | name | test |
 | private_0 network | |
 | status | ERROR |
 | tenant_id | 1 |
 | updated | 2012-06-18T20:43:57Z |
 | user_id | 1 |

 +-+-+

 From the nova-network.log I see the following:

 2012-06-18 15:43:36 DEBUG nova.manager [-] Running periodic task
 VlanManager._disassociate_stale_fixed_ips from (pid=1381) periodic_tasks
 /usr/lib/python2.7/dist-packages
 /nova/manager.py:152
 2012-06-18 15:43:57 ERROR nova.rpc.common [-] Timed out waiting for RPC
 response: timed out
 2012-06-18 15:43:57 TRACE nova.rpc.common Traceback (most recent call
 last):
 2012-06-18 15:43:57 TRACE nova.rpc.common File
 "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in
 ensure
 2012-06-18 15:43:57 TRACE nova.rpc.common return method(*args, **kwargs)
 2012-06-18 15:43:57 TRACE nova.rpc.common File
 "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in
 _consume
 2012-06-18 15:43:57 TRACE nova.rpc.common return
 self.connection.drain_events(timeout=timeout)
 2012-06-18 15:43:57 TRACE nova.rpc.common File
 "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in
 drain_events
 2012-06-18 15:43:57 TRACE nova.rpc.common return
 self.transport.drain_events(self.connection, **kwargs)
 2012-06-18 15:43:57 TRACE nova.rpc.common File
 "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line
 238, in drain_events
 2012-06-18 15:43:57 TRACE nova.rpc.common return
 connection.drain_events(**kwargs)
 2012-06-18 15:43:57 TRACE nova.rpc.common File
 "/usr/lib/python2.7/d

Re: [Openstack] Deleting a volume stuck in "attaching" state?

2012-06-20 Thread Yun Mao
John,

A strategy we are working on in Nova (WIP) is to allow instance
termination no matter what. Perhaps a similar strategy could be
adopted for volumes too? Thanks,

Yun

On Wed, Jun 20, 2012 at 12:02 AM, John Griffith
 wrote:
> On Tue, Jun 19, 2012 at 7:40 PM, Lars Kellogg-Stedman
>  wrote:
>> I attempted to attach a volume to a running instance, but later
>> deleted the instance, leaving the volume stuck in the "attaching"
>> state:
>>
>>  # nova volume-list
>>  ++---+--+--+-+-+
>>  | ID |   Status  | Display Name | Size | Volume Type | Attached to |
>>  ++---+--+--+-+-+
>>  | 9  | attaching | None         | 1    | None        |             |
>>  ++---+--+--+-+-+
>>
>> It doesn't appear to be possible to delete this with "nova
>> volume-delete":
>>
>>  # nova volume-delete
>>   nova volume-delete 9
>>   ERROR: Invalid volume: Volume status must be available or error (HTTP 400)
>>
>> Other than directly editing the database (and I've had to do that an
>> awful lot already), how do I recover from this situation?
>>
>> --
>> Lars Kellogg-Stedman        |
>> Senior Technologist                                | 
>> http://ac.seas.harvard.edu/
>> Academic Computing                                 | 
>> http://code.seas.harvard.edu/
>> Harvard School of Engineering and Applied Sciences |
>>
>>
>
> Hi Lars,
>
> Unfortunately manipulating the database might be your best bet for
> now.  We do have plans to come up with another option in the Cinder
> project, but unfortunately that won't help you much right now.
>
> If somebody has a better method, I'm sure they'll speak up and reply
> to this email, but I think right now that's your best bet.
>
> Thanks,
> John
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API

2012-06-18 Thread Yun Mao
Hi David,

Yes, there is a plan to change that for Folsom. vm_state will contain only
stable states and task_state only transition states. See
http://wiki.openstack.org/VMState for the design rationale behind
(power_state, vm_state, task_state).

After the cleanup, vm_state will have

ACTIVE = 'active'  # VM is running
BUILDING = 'building'  # VM only exists in DB
PAUSED = 'paused'
SUSPENDED = 'suspended'  # VM is suspended to disk.
STOPPED = 'stopped'  # VM is powered off, the disk image is still there.
RESCUED = 'rescued'  # A rescue image is running with the original VM image
# attached.
RESIZED = 'resized'  # a VM with the new size is active. The user is expected
# to manually confirm or revert.
SOFT_DELETED = 'soft-delete'  # VM is marked as deleted but the disk images are
# still available to restore.
DELETED = 'deleted'  # VM is permanently deleted.
ERROR = 'error'

There is no SHUTOFF (it is merged with STOPPED), and VERIFY_RESIZE moves
from task state into vm state, renamed RESIZED. The BUILDING state is not my
favorite, but it's left there mostly for backward-compatibility reasons.
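
As an illustration of how a consumer (e.g. a test) would wait on these,
here is a rough sketch (get_server is a stand-in for whatever client
call returns the server record): polling should only ever target stable
vm_states, never the transient task_states.

import time

STABLE_VM_STATES = frozenset(['active', 'stopped', 'paused', 'suspended',
                              'rescued', 'resized', 'soft-delete',
                              'deleted', 'error'])

def wait_for_vm_state(get_server, server_id, wanted, timeout=300, poll=5):
    assert wanted in STABLE_VM_STATES
    deadline = time.time() + timeout
    while time.time() < deadline:
        server = get_server(server_id)
        if server['vm_state'] == wanted:
            return server
        if server['vm_state'] == 'error' and wanted != 'error':
            raise RuntimeError('server %s went to ERROR' % server_id)
        time.sleep(poll)
    raise RuntimeError('timed out waiting for vm_state %s' % wanted)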

This is still up for discussion and your input is welcome. Thanks,

Yun

On Mon, Jun 18, 2012 at 3:54 PM, David Kranz  wrote:
> Thanks, Yun. The problem is that the API calls give you status which is
> neither task state nor vm state. I think these are the stable states:
>
> ACTIVE, VERIFY_RESIZE, STOPPED, SHUTOFF, PAUSED, SUSPENDED, RESCUE, ERROR,
> DELETED
>
> Does that seem right to you, and is there a plan to change that set for
> Folsom?
>
>  -David
>
>
>
>
>
> On 6/18/2012 12:51 PM, Yun Mao wrote:
>>
>> Hi Jay et al,
>>
>> there is a patch in review here to overhaul the state machine:
>>
>> https://review.openstack.org/#/c/8254/
>>
>> All transient state in vm state will be moved to task state. Stable
>> state in task state (RESIZE_VERIFY) will be moved to vm state. There
>> is also a state transition diagram in dot format.
>>
>> Comments welcome. Thanks,
>>
>> All
>>
>> On Mon, Jun 18, 2012 at 12:26 PM, Jay Pipes  wrote:
>>>
>>> On 06/18/2012 12:01 PM, David Kranz wrote:
>>>>
>>>> There are a few tempest tests, and many in the old kong suite that is
>>>> still there, that wait for a server status that is something other than
>>>> ACTIVE or VERIFY_RESIZE. These other states, such as BUILD or REBOOT,
>>>> are transient so I don't understand why it is correct for code to poll
>>>> for those states. Am I missing something or do those tests have race
>>>> condition bugs?
>>>
>>>
>>> No, you are correct, and I have made some comments in recent code reviews
>>> to
>>> that effect.
>>>
>>> Here are all the task states:
>>>
>>> https://github.com/openstack/nova/blob/master/nova/compute/task_states.py
>>>
>>> Out of all those task states, I believe the only one safe to poll in a
>>> wait
>>> loop is RESIZE_VERIFY. All the others are prone to state transitions
>>> outside
>>> the control of the user.
>>>
>>> For the VM states:
>>>
>>> https://github.com/openstack/nova/blob/master/nova/compute/vm_states.py
>>>
>>> I consider the following to be non-racy, quiescent states:
>>>
>>> ACTIVE
>>> DELETED
>>> STOPPED
>>> SHUTOFF
>>> PAUSED
>>> SUSPENDED
>>> ERROR
>>>
>>> I consider the following to be racy states that should not be tested for:
>>>
>>> MIGRATING -- Instead, the final state should be checked for...
>>> RESIZING -- Instead, the RESIZE_VERIFY and RESIZE_CONFIRM task states
>>> should
>>> be checked
>>>
>>> I have absolutely no idea what the state termination is for the following
>>> VM
>>> states:
>>>
>>> RESCUED -- is this a permanent state? Is this able to be queried for in a
>>> consistent manner before it transitions to some further state?
>>>
>>> SOFT_DELETE -- I have no clue what the purpose or queryability of this
>>> state
>>> is, but would love to know...
>>>
>>> Best,
>>> -jay
>>>
>>
>
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [Openstack-qa-team] wait_for_server_status and Compute API

2012-06-18 Thread Yun Mao
Hi Jay et al,

there is a patch in review here to overhaul the state machine:

https://review.openstack.org/#/c/8254/

All transient states in vm_state will be moved to task_state. The stable
state in task_state (RESIZE_VERIFY) will be moved to vm_state. There
is also a state transition diagram in dot format.

Comments welcome. Thanks,

All

On Mon, Jun 18, 2012 at 12:26 PM, Jay Pipes  wrote:
> On 06/18/2012 12:01 PM, David Kranz wrote:
>>
>> There are a few tempest tests, and many in the old kong suite that is
>> still there, that wait for a server status that is something other than
>> ACTIVE or VERIFY_RESIZE. These other states, such as BUILD or REBOOT,
>> are transient so I don't understand why it is correct for code to poll
>> for those states. Am I missing something or do those tests have race
>> condition bugs?
>
>
> No, you are correct, and I have made some comments in recent code reviews to
> that effect.
>
> Here are all the task states:
>
> https://github.com/openstack/nova/blob/master/nova/compute/task_states.py
>
> Out of all those task states, I believe the only one safe to poll in a wait
> loop is RESIZE_VERIFY. All the others are prone to state transitions outside
> the control of the user.
>
> For the VM states:
>
> https://github.com/openstack/nova/blob/master/nova/compute/vm_states.py
>
> I consider the following to be non-racy, quiescent states:
>
> ACTIVE
> DELETED
> STOPPED
> SHUTOFF
> PAUSED
> SUSPENDED
> ERROR
>
> I consider the following to be racy states that should not be tested for:
>
> MIGRATING -- Instead, the final state should be checked for...
> RESIZING -- Instead, the RESIZE_VERIFY and RESIZE_CONFIRM task states should
> be checked
>
> I have absolutely no idea what the state termination is for the following VM
> states:
>
> RESCUED -- is this a permanent state? Is this able to be queried for in a
> consistent manner before it transitions to some further state?
>
> SOFT_DELETE -- I have no clue what the purpose or queryability of this state
> is, but would love to know...
>
> Best,
> -jay
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] nova state machine simplification and clarification

2012-06-08 Thread Yun Mao
Hi Sandy and Jay,

I've checked in a graphviz VM state transition diagram in patch set 3 of this review:

https://review.openstack.org/#/c/8254/

However since it's very complicated, the graph is too big after it's
rendered. Ideas are welcome. Thanks,

Yun

On Thu, May 24, 2012 at 3:10 PM, Jay Pipes  wrote:
> On 05/24/2012 10:46 AM, Yun Mao wrote:
>>
>> Sandy,
>>
>> I like the suggestion of graphvis, although I haven't used it for a
>> while. Is there a dir in nova appropriate to put .dot files? I was
>> hoping to get the proposal discussed a few round, and while it's
>> getting stabilized, we can work on the graphvis representation.
>> Thanks,
>
>
> Hi Yun!
>
> We use graphviz/dot representation in our /doc/source/architecture.rst [1]
> file to produce the Glance architecture diagram on glance.openstack.org [2].
> Works like a charm :)
>
> Cheers,
> -jay
>
> [1]
> https://raw.github.com/openstack/glance/master/doc/source/architecture.rst
> [2] http://glance.openstack.org/architecture.html
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Question on nova disk injection...

2012-06-05 Thread Yun Mao
Python is a scripting language. To get setuid to work, you usually have
to give the setuid permission to /usr/bin/python, which is a big no-no.

One workaround is to have a custom compiled program (e.g. written in C)
that takes a Python file as input, does all kinds of sanity checks, and
switches to the root user to execute Python. But in that case it's not
much more appealing than rootwrap.

my 2c.
Yun
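
P.S. To make the sudo point concrete, the pattern in disk/api.py is
roughly the following (a simplified sketch, not the actual nova helper):
the unprivileged Python process cannot open() a root-owned path, so the
data is piped to a privileged tee instead.

import subprocess

def write_file_as_root(path, contents):
    # Equivalent of the shell idiom "echo contents | sudo tee path".
    proc = subprocess.Popen(['sudo', 'tee', path],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    proc.communicate(contents)
    if proc.returncode != 0:
        raise RuntimeError('writing %s as root failed' % path)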

On Tue, Jun 5, 2012 at 5:42 PM, Joshua Harlow  wrote:
> Hi all,
>
> Just some questions that I had about how nova is doing disk injection and
> such.
>
> I was noticing that it the main disk/api.py does a lot of tee, cat and
> similar commands. Is there any reason it couldn’t just use the standard
> python open and write data and such.
>
> Is it because of sudo access (which is connected to rootwrap?), just
> wondering since it seems sort of odd that to write a file there a tee call
> has to be done with piped input, when python already has file operators and
> such...
>
> Thx!
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Understanding shutdown VM behavior

2012-06-01 Thread Yun Mao
In EC2, shutdown_terminate really exists to deal with instance-store
(local) vs EBS-backed (remote volume) instances. Once an instance is
stopped, all local state is cleaned up, so there is no way to bring back
a VM whose disk is local; that's why it's terminated automatically. For
EBS-backed instances, there is no local state, and starting a stopped
instance goes through the scheduler to find a compute node as if we were
creating a new instance.

My bet is that there are people who depend on this kind of EC2
compatibility: https://bugs.launchpad.net/nova/+bug/905270
Whether that's too much of a burden to support, I'm not sure...

A related problem with stopped instances is resource management.
Once the VM is stopped, it doesn't occupy resources (say, memory). This
is periodically reported back to the DB by the compute manager,
and the scheduler is aware of it. However, starting the instance again
goes directly to the compute node. So it's possible that you either end
up with overcommitted resources, because new VMs were scheduled while
the instance was shut down, or simply fail to start the VM due to lack
of resources.

Billing stopped instances is tricky too but it seems to be more
provider specific.

Personally I like the EC2 API behavior over OS API because it makes
resource management simpler. But I'd be happy to go with OS API
behavior if we can have a good story to the resource update problem.
In any case, having a single, well defined, consistent behavior is
better than what's happening now. :)

Thanks,

Yun

On Fri, Jun 1, 2012 at 3:45 AM, Vishvananda Ishaya
 wrote:
> I did some cleanup of stop and power_off in the review here.
>
> https://review.openstack.org/#/c/8021/
>
> I removed the weird  shutdown_terminate handling. Honestly I feel like
> that is compatibility we don't need. It should be up to the provider whether
> a stop_instances counts as a terminate. In my mind they are two different
> things.
>
> Comments welcome on the review.
>
> Vish
>
> On May 31, 2012, at 6:40 PM, Yun Mao wrote:
>
> shutdown, stop, are power_off are synonym in this discussion. They all
> mean to stop the VM from running, but keep the disk image and network,
> so that the VM could be started back on again.
>
> There are three ways to do it: 1) using EC2 stop-instance API. 2) use
> OS API stop-server. 3) inside the VM, execute halt or equivalent.
> However, the devil is in the details.
>
> In EC2 API, a shutdown_terminate flag is checked when a stop-instance
> call is issued. If it's true, then stop-instances actually means
> terminate instances. The flag is true by default unless there is block
> device mapping provided, and it doesn't appear to be configurable by a
> user.
>
> In OS API, it's defined in v1.1, neither the specification nor the
> implementation check the shutdown_terminate flag at all. It will
> always do stop instead of terminate.
>
> So, when shutdown_terminate is true (default), the OS API and the EC2
> API will behave differently. If we accept this, it might still be
> acceptable. After all they are different APIs and could have different
> behavior. But the pickle is the case where a user initiates a shutdown
> inside the VM. What's the expected behavior after it's detected?
> Should it respect the shutdown_terminate flag or work more like an OS
> API?  Right now when a shutdown in a VM is detected, the vm state is
> updated to SHUTOFF and that's pretty much it..
>
> To summarize, there are 3 ways of doing the same thing, each now has a
> different behavior. I'd vote to patch the code to be a little more
> consistent. But what should be the right behavior?
>
> Yun
>
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] Understanding shutdown VM behavior

2012-05-31 Thread Yun Mao
shutdown, stop, and power_off are synonyms in this discussion. They all
mean to stop the VM from running but keep the disk image and network,
so that the VM can be started again later.

There are three ways to do it: 1) using EC2 stop-instance API. 2) use
OS API stop-server. 3) inside the VM, execute halt or equivalent.
However, the devil is in the details.

In the EC2 API, a shutdown_terminate flag is checked when a stop-instance
call is issued. If it's true, then stop-instances actually means
terminate-instances. The flag is true by default unless a block device
mapping is provided, and it doesn't appear to be configurable by the
user.

In the OS API, stop is defined in v1.1; neither the specification nor the
implementation checks the shutdown_terminate flag at all. It will
always stop instead of terminate.

So, when shutdown_terminate is true (the default), the OS API and the EC2
API behave differently. If we accept this, it might still be
acceptable; after all, they are different APIs and could have different
behavior. But the pickle is the case where a user initiates a shutdown
inside the VM. What's the expected behavior once that is detected?
Should it respect the shutdown_terminate flag, or work more like the OS
API? Right now, when a shutdown inside a VM is detected, the vm_state is
updated to SHUTOFF and that's pretty much it.

To summarize, there are three ways of doing the same thing, and each now
has a different behavior. I'd vote to patch the code to be a little more
consistent. But what should the right behavior be?

Yun
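
P.S. As I read the code, the divergence boils down to roughly this
(a simplified sketch of current behavior, not a proposal; compute_api
stands in for the compute API object):

def ec2_stop_instances(compute_api, context, instance):
    # EC2 path: "stop" silently becomes "terminate" when the flag is set,
    # and the flag defaults to True unless block device mappings were given.
    if instance.get('shutdown_terminate', True):
        compute_api.delete(context, instance)
    else:
        compute_api.stop(context, instance)

def os_api_stop_server(compute_api, context, instance):
    # OS API path: always a plain stop; the flag is never consulted.
    compute_api.stop(context, instance)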

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] nova state machine simplification and clarification

2012-05-25 Thread Yun Mao
Hi,

the first (simple) step to simplify power_state is in gerrit for review.
https://review.openstack.org/#/c/7796/

The document is also migrated to wiki: http://wiki.openstack.org/VMState
Thanks,

Yun


On Fri, May 25, 2012 at 8:42 AM, Vaze, Mandar  wrote:
> Sorry for responding to old thread, I later realized that Yun had already 
> responded to Gabe's query.
>
> -Original Message-
> From: Vaze, Mandar
> Sent: Friday, May 25, 2012 6:10 PM
> To: 'Gabe Westmaas'; Mark Washenberger; openstack@lists.launchpad.net; 
> yun...@gmail.com
> Subject: RE: [Openstack] nova state machine simplification and clarification
>
>> I'd like to understand the difference between a soft and hard delete.
>
> soft_delete is invoked when "reclaim_instance_interval" flag is set to 
> non-zero.
> In this case, when delete command is fired, only the VM is powered off, and 
> vm_state is set to SOFT_DELETE Other resources (like network, volume, and 
> files created in instances_path etc) are cleaned up at later point via a 
> periodic task.
>
>> What exactly is a hard delete from the standpoint of a customer?  Is
>> this just a delete
> hard_delete is when vm is "destroyed" (As opposed to power off) and resources 
> are cleaned up immediately.
> This is the default configuration
>
> -Mandar
>
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] nova state machine simplification and clarification

2012-05-24 Thread Yun Mao
Sandy,

I like the suggestion of graphviz, although I haven't used it for a
while. Is there a directory in nova appropriate for .dot files? I was
hoping to get the proposal discussed for a few rounds, and while it's
stabilizing, we can work on the graphviz representation.
Thanks,

Yun

On Wed, May 23, 2012 at 11:28 AM, Sandy Walsh  wrote:
> Hi Yun,
>
> I like the direction you're going with this. Unifying these three enums would 
> be a great change. Honestly it's really just combing two enums (vm & task) 
> and using power state as a tool for reconciliation (actual != reality).
>
> Might I suggest using graphviz instead of a spreadsheet? That way we can keep 
> it under version control, have funky pictures and there are libraries for 
> parsing .dot files in Python. Also we can use the graphviz doc to actually 
> drive the state machine (via attributes on nodes/edges)
>
> I'd like to see more discussion on how reconciliation will be handled in the 
> event of a conflict.
>
> Cheers!
> -S
>
> 
> From:  Yun Mao [yun...@gmail.com]
> Sent: Thursday, May 17, 2012 10:16 AM
> To: openstack@lists.launchpad.net
> Subject: [Openstack] nova state machine simplification and clarification
>
> ...

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] rant on the resize API implementation

2012-05-18 Thread Yun Mao
According to 
http://docs.openstack.org/api/openstack-compute/2/content/Resize_Server-d1e3707.html

"The resize operation converts an existing server to a different
flavor, in essence, scaling the server up or down. The original server
is saved for a period of time to allow rollback if there is a problem.
All resizes should be tested and explicitly confirmed, at which time
the original server is removed. All resizes are automatically
confirmed after 24 hours if they are not explicitly confirmed or
reverted."

Whether this feature is useful in the cloud is out of scope for this
thread; I'd like to discuss the implementation. Currently, resize first
casts to the scheduler to pick the destination host, then shuts down the
VM, copies the disk image to the destination, starts the new VM, and
waits for user confirmation; confirming deletes the old VM image, while
reverting deletes the new VM.

Problem 1: the image is copied from source to destination via scp/ssh.
This probably means you need password-less ssh private keys set up among
all compute nodes, which seems like a security problem.

Problem 2: resize needs to boot VMs too, once at the destination and
once at the source in case of revert. These paths have their own
implementation and look slightly different from spawn, the default
create-instance call.

Problem 3: it's not clear what the semantics are when volumes are
attached to the VM before the resize. What should happen to the VM?

Without the resize API, a user can still get the same effect by first
making a snapshot of a VM and then starting a new VM from that snapshot
with a different flavor. It's not that much of a difference. If getting
rid of resize is not an option, I wonder if it makes more sense to
implement the resize function by calling the snapshot and create compute
APIs instead of doing it in the driver.
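
Roughly what I mean, using python-novaclient as a stand-in client (a
sketch only, glossing over quotas, networking and attached volumes):

def resize_via_snapshot(nova, server, new_flavor):
    # 1. snapshot the existing server
    image_id = nova.servers.create_image(server,
                                         '%s-resize-tmp' % server.name)
    # 2. boot a replacement from the snapshot with the new flavor
    replacement = nova.servers.create('%s-resized' % server.name,
                                      image_id, new_flavor)
    # 3. "confirm" then means deleting the old server and the temporary
    #    image; "revert" means deleting the replacement instead.
    return replacement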

Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] nova state machine simplification and clarification

2012-05-18 Thread Yun Mao
Gabe,

There is a reclaim_instance_interval flag on the API. If it's set to 0
(the default), every delete is a hard delete. Otherwise, deletes are
soft deletes, and the instance is automatically hard-deleted after the
configured interval. There is also an API extension, force_delete,
which is a hard delete no matter what.

Right now I *think* task_state is already exposed via some API
(extension?); otherwise the dashboard wouldn't be able to see it.

Thanks,

Yun
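
P.S. In other words, the API-side dispatch looks roughly like this
(sketch only; see nova/compute/api.py for the real logic):

def delete_instance(api, context, instance, reclaim_instance_interval):
    if reclaim_instance_interval:
        # Power off now; a periodic task hard-deletes the instance once
        # reclaim_instance_interval seconds have passed.
        api.soft_delete(context, instance)
    else:
        api.hard_delete(context, instance)

def force_delete_instance(api, context, instance):
    # API extension: hard delete regardless of the flag or current state.
    api.hard_delete(context, instance)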


On Fri, May 18, 2012 at 12:26 PM, Gabe Westmaas
 wrote:
> Also, with this proposal I'd be a lot more interested in exposing task
> state as a part of the API eventually.  This is helpful to communicate
> whether or not other actions would be allowed in certain states.  For
> example, right now we don't allow other actions when a server is
> snapshotting, but while the server is being snapshotted, the state is set
> to ACTIVE.  With these well thought out states, I think we could more
> safely expose those task states, and we would just have to be vigilant
> about adding new ones to make sure they make sense to expose to end users.
>
> Gabe
>
> On 5/18/12 10:20 AM, "Mark Washenberger" 
> wrote:
>
>>Hi Yun,
>>
>>This proposal looks very good to me. I am glad you included in it the
>>requirement that hard deletes can take place in any vm/task/power state.
>>
>>I however feel that a similar requirement exists for revert resize. It
>>should be possible to issue a RevertResize command for any task_state
>>(assuming that a resize is happening or has recently happened and is not
>>yet confirmed). The code to support this capability doesn't exist yet,
>>but I want to ask you: is it compatible with your proposal to allow
>>RevertResize in any task state?
>>
>>"Yun Mao"  said:
>>
>>> Hi,
>>>
>>> There are vm_states, task_states, and power_states for each VM. The
>>> use of them is complicated. Some states are confusing, and sometimes
>>> ambiguous. There also lacks a guideline to extend/add new state. This
>>> proposal aims to simplify things, explain and define precisely what
>>> they mean, and why we need them. A new user-friendly behavior of
>>> deleting a VM is also discussed.
>>>
>>> A TL;DR summary:
>>> * power_state is the hypervisor state, loaded "bottom-up" from compute
>>> worker;
>>> * vm_state reflects the stable state based on API calls, matching user
>>> expectation, revised "top-down" within API implementation.
>>> * task_state reflects the transition state introduced by in-progress
>>> API calls.
>>> * "hard" delete of a VM should always succeed no matter what.
>>> * power_state and vm_state may conflict with each other, which needs
>>> to be resolved case-by-case.
>>>
>>> It's not a definite guide yet and is up for discussion. I'd like to
>>> thank vishy and comstud for the early input. comstud: the task_state
>>> is different from when you looked at it. It's a lot closer to what's
>>> in the current code.
>>>
>>> The full text is here and is editable by anyone like etherpad.
>>>
>>>
>>>https://docs.google.com/document/d/1nlKmYld3xxpTv6Xx0Iky6L46smbEqg7-SWPu_
>>>o6VJws/edit?pli=1
>>>
>>> Thanks,
>>>
>>> Yun
>>>
>>>
>>
>>
>>
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] nova state machine simplification and clarification

2012-05-18 Thread Yun Mao
Hi Mark,

I haven't looked at resize related API calls very closely. But what
you are saying makes sense. revert_resize() should be able to preempt
an existing resize() call, which might get stuck. I'm not clear how
the leftovers will be garbage collected yet.

Yun

On Fri, May 18, 2012 at 10:20 AM, Mark Washenberger
 wrote:
> Hi Yun,
>
> This proposal looks very good to me. I am glad you included in it the 
> requirement that hard deletes can take place in any vm/task/power state.
>
> I however feel that a similar requirement exists for revert resize. It should 
> be possible to issue a RevertResize command for any task_state (assuming that 
> a resize is happening or has recently happened and is not yet confirmed). The 
> code to support this capability doesn't exist yet, but I want to ask you: is 
> it compatible with your proposal to allow RevertResize in any task state?
>
> "Yun Mao"  said:
>
>> Hi,
>>
>> There are vm_states, task_states, and power_states for each VM. The
>> use of them is complicated. Some states are confusing, and sometimes
>> ambiguous. There also lacks a guideline to extend/add new state. This
>> proposal aims to simplify things, explain and define precisely what
>> they mean, and why we need them. A new user-friendly behavior of
>> deleting a VM is also discussed.
>>
>> A TL;DR summary:
>> * power_state is the hypervisor state, loaded “bottom-up” from compute
>> worker;
>> * vm_state reflects the stable state based on API calls, matching user
>> expectation, revised “top-down” within API implementation.
>> * task_state reflects the transition state introduced by in-progress API 
>> calls.
>> * “hard” delete of a VM should always succeed no matter what.
>> * power_state and vm_state may conflict with each other, which needs
>> to be resolved case-by-case.
>>
>> It's not a definite guide yet and is up for discussion. I'd like to
>> thank vishy and comstud for the early input. comstud: the task_state
>> is different from when you looked at it. It's a lot closer to what's
>> in the current code.
>>
>> The full text is here and is editable by anyone like etherpad.
>>
>> https://docs.google.com/document/d/1nlKmYld3xxpTv6Xx0Iky6L46smbEqg7-SWPu_o6VJws/edit?pli=1
>>
>> Thanks,
>>
>> Yun
>>
>>
>
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] nova state machine simplification and clarification

2012-05-17 Thread Yun Mao
Hi,

There are vm_states, task_states, and power_states for each VM. Their
use is complicated. Some states are confusing and sometimes ambiguous,
and there is no guideline for extending or adding new states. This
proposal aims to simplify things, explain and define precisely what
they mean and why we need them. A new user-friendly behavior for
deleting a VM is also discussed.

A TL;DR summary:
* power_state is the hypervisor state, loaded “bottom-up” from compute worker;
* vm_state reflects the stable state based on API calls, matching user
expectation, revised “top-down” within API implementation.
* task_state reflects the transition state introduced by in-progress API calls.
* “hard” delete of a VM should always succeed no matter what.
* power_state and vm_state may conflict with each other, which needs
to be resolved case-by-case.

It's not a definitive guide yet and is up for discussion. I'd like to
thank vishy and comstud for their early input. comstud: the task_state
is different from when you looked at it; it's a lot closer to what's
in the current code.
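
As a rough illustration of how the three fields relate under this
proposal (a sketch, not existing code; the helper names are made up):

# power_state: reported bottom-up by the compute worker from the hypervisor
# vm_state:    stable, user-visible state revised top-down by the API
# task_state:  set while an API call is in progress, cleared when it ends

def begin_task(instance, task):
    if instance['task_state'] is not None:
        raise Exception('another operation is already in progress')
    instance['task_state'] = task            # e.g. 'rebooting'

def finish_task(instance, new_vm_state):
    instance['vm_state'] = new_vm_state      # e.g. 'active' or 'stopped'
    instance['task_state'] = None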

The full text is here and is editable by anyone like etherpad.

https://docs.google.com/document/d/1nlKmYld3xxpTv6Xx0Iky6L46smbEqg7-SWPu_o6VJws/edit?pli=1

Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] HELP: About Nova long time response when taking snapshot

2012-05-15 Thread Yun Mao
If you are using the Essex release, have you tried enabling the
libvirt_nonblocking option?

Yun

On Tue, May 15, 2012 at 2:18 AM, Sam Su  wrote:
> Hi,
>
> I have a multi-nodes openstack environment, including a control node running
> Glance, nova-api, nova-scheduler, nova-network, rabbitmq, mysql, keystone
> and dashboard services, and two compute nodes running nova-compute and
> nova-network services.
>
> When someone is taking a snapshot for his/her VMs, the Openstack system
> looks like very busy and it will take a long time (at least 3 to 4 minutes
> in this situation and regular time is in 30 seconds) to create a VM.
>
> I wonder is there any solution to optimize this system so that it can
> response quickly. it will be much appreciated if someone could give me some
> hints about this.
>
> Thanks,
> Sam
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [Nova] Blueprints for Folsom

2012-05-08 Thread Yun Mao
mine is here https://blueprints.launchpad.net/nova/+spec/task-management

it's not obsolete per se since I'm working on that feature branch:
https://github.com/maoy/nova/tree/orchestration
 but I'd be happy to merge it with other ideas. Thanks,

Yun

On Tue, May 8, 2012 at 7:18 PM, Sriram Subramanian
 wrote:
> Following orchestration related blueprints can be obsoleted since they were
> targeted towards design summit sessions.
>
>
>
> https://blueprints.launchpad.net/nova/+spec/transaction-orchestration
> (Sandy’s, for Essex summit)
>
> https://blueprints.launchpad.net/nova/+spec/nova-orchestration (mine, for
> Folsom summit)
>
>
>
> Both can be obsoleted/ deleted. We have the applicable specs in wiki.
>
>
>
> There is one more that Yun Mao submitted, which I am not able to locate. Yun
> – could you please update it as appropriately?
>
>
>
> Thanks,
>
> -Sriram
>
>
>
> From: openstack-bounces+sriram=computenext@lists.launchpad.net
> [mailto:openstack-bounces+sriram=computenext@lists.launchpad.net] On
> Behalf Of Vishvananda Ishaya
> Sent: Monday, May 07, 2012 4:16 PM
> To: openstack@lists.launchpad.net (openstack@lists.launchpad.net)
> Subject: [Openstack] [Nova] Blueprints for Folsom
>
>
>
> Hello everyone,
>
>
>
> The number of blueprints for nova has gotten entirely out-of-hand.  I've
> obsoleted about 40 blueprints and there are still about 150 blueprints for
> nova. Many of these are old, or represent features that are cool ideas, but
> haven't had any activity in a long time. I've attempted to target all of the
> relevant blueprints to Folsom.  You can see the progress here:
>
>
>
> https://blueprints.launchpad.net/nova/folsom
>
>
>
> I would like to get our nova blueprints cleaned up as much as possible.  In
> one week, I am going to mark all blueprints that are not targeted to folsom
> Obsolete. This will allow us to start over from a clean slate. So here is
> what I need from everyone:
>
>
>
> 1. If you see a blueprint on the main nova list that is not targeted to
> folsom that should stay around, please let me know ASAP or it will get
> deleted:
>
>
>
> https://blueprints.launchpad.net/nova
>
>
>
> 2. Operational Support Team, there are a bunch of blueprints that are not
> targeted to folsom, so please either target them or mark them obsolete by
> next week.
>
>
>
> 3. Orchestration Team, there are a whole bunch of blueprints relating to
> orchestration, some seem like duplicates, I'm tempted to just delete them
> all and start over with one or two simple blueprints with clear objectives.
>  I don't really know which ones are current, so please help with this.
>
>
>
> 4. There are a bunch of blueprints targeted to folsom that don't have people
> assigned yet. If you want to help with coding, there is a lot of opportunity
> there, so let me know if I can assign one of the blueprints to you.
>
>
>
> 5. If there is any work being done as a result of the summit that doesn't
> have an associated blueprint, please make one and let me know so I can
> target it.
>
>
>
> 6. If you are a blueprint assignee, please let me know when you can have the
> work completed so I can finish assigning blueprints to milestones
>
>
>
> I'm currently working on prioritizing the targeted blueprints, so hopefully
> we have a decent list of priorities by the meeting tomorrow.
>
>
>
> Thanks for the help,
>
> Vish

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] [orchestration] feature branch of a work in progress on orchestration

2012-05-07 Thread Yun Mao
Hi,

I've uploaded some code as work in progress towards what we discussed
at the Folsom summit, nova orchestration session. Where I'm going is
more or less described in this blueprint.
https://blueprints.launchpad.net/nova/+spec/task-management

The first step is to build a proof of concept based on a DB backend
for persistent storage, and then implement a ZooKeeper backend for high
performance. Right now it only works in testing, not in a real
deployment, since I haven't worked on the MySQL schema migration. But I
figure it's about time to get some early feedback. You can track the
branch for progress. The code is available here:

https://github.com/maoy/nova/tree/orchestration

Comments, collaborations are very welcome. You can also find me at
this week's orchestration meeting on Thursday. Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] need help on passing unit/integration tests

2012-05-07 Thread Yun Mao
Hi guys,

I can't get my master branch, freshly pulled from GitHub, to pass the
run_tests.sh script. The errors are as follows. Tried on Mac and Ubuntu
12.04. Any ideas? Thanks,

Yun


==
ERROR: test_json (nova.tests.test_log.JSONFormatterTestCase)
--
Traceback (most recent call last):
  File "/Users/maoy/git/nova/nova/tests/test_log.py", line 183, in test_json
data = json.loads(self.stream.getvalue())
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py",
line 307, in loads
return _default_decoder.decode(s)
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py",
line 319, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py",
line 338, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
 >> begin captured logging << 
test-json: DEBUG: This is a log line
- >> end captured logging << -

==
ERROR: test_json_exception (nova.tests.test_log.JSONFormatterTestCase)
--
Traceback (most recent call last):
  File "/Users/maoy/git/nova/nova/tests/test_log.py", line 207, in
test_json_exception
data = json.loads(self.stream.getvalue())
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py",
line 307, in loads
return _default_decoder.decode(s)
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py",
line 319, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File 
"/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/decoder.py",
line 338, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
 >> begin captured logging << 
test-json: ERROR: This is exceptional
Traceback (most recent call last):
  File "/Users/maoy/git/nova/nova/tests/test_log.py", line 203, in
test_json_exception
raise Exception('This is exceptional')
Exception: This is exceptional
- >> end captured logging << -

==
FAIL: test_deserialize_remote_exception
(nova.tests.rpc.test_common.RpcCommonTestCase)
--
Traceback (most recent call last):
  File "/Users/maoy/git/nova/nova/tests/rpc/test_common.py", line 98,
in test_deserialize_remote_exception
self.assertTrue('test message' in unicode(after_exc))
AssertionError

==
FAIL: test_deserialize_remote_exception_user_defined_exception
(nova.tests.rpc.test_common.RpcCommonTestCase)
--
Traceback (most recent call last):
  File "/Users/maoy/git/nova/nova/tests/rpc/test_common.py", line 127,
in test_deserialize_remote_exception_user_defined_exception
self.assertTrue('raise FakeUserDefinedException' in unicode(after_exc))
AssertionError

==
FAIL: test_call_converted_exception (nova.tests.rpc.test_kombu.RpcKombuTestCase)
--
Traceback (most recent call last):
  File "/Users/maoy/git/nova/nova/test.py", line 87, in _skipper
func(*args, **kw)
  File "/Users/maoy/git/nova/nova/tests/rpc/test_kombu.py", line 369,
in test_call_converted_exception
self.assertTrue(value in unicode(exc))
AssertionError:
 >> begin captured logging << 
nova.rpc.common: INFO: Connected to AMQP server on localhost:5672
nova.rpc.common: INFO: Connected to AMQP server on localhost:5672
nova.rpc.amqp: ERROR: Exception during message handling
2012-05-07 11:54:32 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-05-07 11:54:32 TRACE nova.rpc.amqp   File
"/Users/maoy/git/nova/nova/rpc/amqp.py", line 263, in _process_data
2012-05-07 11:54:32 TRACE nova.rpc.amqp rval =
node_func(context=ctxt, **node_args)
2012-05-07 11:54:32 TRACE nova.rpc.amqp   File
"/Users/maoy/git/nova/nova/tests/rpc/common.py", line 264, in
fail_converted
2012-05-07 11:54:32 TRACE nova.rpc.amqp raise
exception.ConvertedException(explanation=value)
2012-05-07 11:54:32 TRACE nova.rpc.amqp ConvertedException: This is
the exception message
2012-05-07 11:54:32 TRACE nova.rpc.amqp
nova.rpc.common: INFO: Connected to AMQP server on localhost:5672
nova.rpc

Re: [Openstack] profiling nova-api

2012-04-12 Thread Yun Mao
Hi Jay,

I will take a look later when I find a box with multiple cores to take
advantage it..

Agreed - cprofiling is not so useful this case. This would be a purely
performance benchmark. If implemented correctly, we should see a
notable gain. However, It will probably not that linear because we
might hit keystone's performance issue later. It looks like the second
CPU hogger behind nova-api. Is there a multiprocess keystone too?

Yun

On Thu, Apr 12, 2012 at 9:49 AM, Jay Pipes  wrote:
> Hi Yun!
>
> Thanks very much for sharing this information. Can I ask you to pull the
> code in this branch:
>
> https://review.openstack.org/#change,5762
>
> And retry your test? On my test machines, the work that Huang Zhiteng has
> done to convert Nova's servers to use multiple operating system processes,
> each with its own greenthread service pool, resulted in a massive throughput
> improvement, turning a quickly-bottlenecked system into a server that could
> scale nearly linearly with the number of worker processes.
>
> Be sure to set osapi_workers to the number of cores your machine has... and
> also note that your code profiling technique is unlikely to be effective
> since cProfile wouldn't track the forked child worker processes' stacks,
> AFAIK. Still interested to see if the time to execute the 300 API calls is
> dramatically reduced, though.
>
> Looking forward to any results you might have.
>
> Best,
> -jay
>
>
> On 04/11/2012 04:48 PM, Yun Mao wrote:
>>
>> Hi Stackers, I spent some time looking at nova-api today.
>>
>> Setup: everything-on-one-node devstack, essex trunk. I setup 1 user
>> with 10 tiny VMs.
>> Client: 3 python threads each doing a loop of "nova list" equivalent
>> for 100 times. So 300 API calls with concurrency=3.
>> how to profile: python -m cProfile -s time
>> /opt/stack/nova/bin/nova-api --flagfile=/etc/nova/nova.conf
>> --logdir=/var/log/nova --nodebug
>> The partial output is attached in the end.
>>
>> Observations:
>> * It takes about 60 seconds to finish. CPU of nova-api is around 70% to
>> 90%
>>
>> * Database access: Each "nova list" API call will issue 4 db APIs: 3
>> instance_get_all_by_filters(), 1
>> instance_fault_get_by_instance_uuids(), so 1200 db API calls total
>> (note: not necessarily 1200 SQL statements, could be more). The 900
>> instance_get_all_by_filters() calls took 30.2 seconds (i.e. 0.03s
>> each)! The 300 instance_fault_get_by_instance_uuids() calls only took
>> 1.129 seconds (0.004 each).
>>
>> You might think: MySQL sucks. Not so fast. Remember this is a tiny
>> database with only 10 VMs. Profile also shows that the actual
>> _mysql.connection.query() method only took 1.883 seconds in total. So,
>> we pretty much spend 29 seconds out of 60 seconds doing either
>> sqlalchemy stuff or our own wrapper. You can also see from the sheer
>> volume of sqlalchemy library calls involved.
>>
>> * the cfg.py library inefficiency. During 300 API calls,
>> common.cfg.ConfigOpts._get() is called 135005 times! and we paid 2.470
>> sec for that.
>>
>> Hopefully this is useful for whoever wants to improve the performance
>> of nova-api.
>>
>> Thanks,
>> Yun
>>
>> ===
>>
>>          23355694 function calls (22575841 primitive calls) in 77.874
>> seconds
>>
>>    Ordered by: internal time
>>
>>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>       812   25.725    0.032   25.725    0.032 {method 'poll' of
>> 'select.epoll' objects}
>>      2408    1.883    0.001    1.883    0.001 {method 'query' of
>> '_mysql.connection' objects}
>>     70380    1.667    0.000    7.187    0.000
>> expression.py:2263(corresponding_column)
>>    135005    1.254    0.000    2.470    0.000 cfg.py:1058(_get)
>>     41027    1.043    0.000    1.907    0.000 schema.py:542(__init__)
>>     38802    1.008    0.000    1.219    0.000 __init__.py:451(format)
>>    162206    0.821    0.000    0.821    0.000 util.py:883(values)
>>   1530666    0.773    0.000    0.774    0.000 {isinstance}
>> 135046/134960    0.716    0.000    1.919    0.000 cfg.py:1107(_substitute)
>>      1205    0.713    0.001    1.369    0.001 base.py:2106(__init__)
>>    183600    0.690    0.000    0.796    0.000
>> interfaces.py:954(_reduce_path)
>>     81002    0.687    0.000    2.492    0.000 compiler.py:312(visit_label)
>>     38802    0.650    0.000    6.087    0.000 log.py:227(format)
>>    319270    0.622    0.000    0.748    0.000 attributes.py:164(_

[Openstack] profiling nova-api

2012-04-11 Thread Yun Mao
Hi Stackers, I spent some time looking at nova-api today.

Setup: everything-on-one-node devstack, essex trunk. I set up 1 user
with 10 tiny VMs.
Client: 3 python threads, each doing a loop of a "nova list" equivalent
100 times. So 300 API calls with concurrency=3.
How to profile: python -m cProfile -s time
/opt/stack/nova/bin/nova-api --flagfile=/etc/nova/nova.conf
--logdir=/var/log/nova --nodebug
The partial output is attached at the end.

Observations:
* It takes about 60 seconds to finish. CPU of nova-api is around 70% to 90%

* Database access: Each "nova list" API call will issue 4 db APIs: 3
instance_get_all_by_filters(), 1
instance_fault_get_by_instance_uuids(), so 1200 db API calls total
(note: not necessarily 1200 SQL statements, could be more). The 900
instance_get_all_by_filters() calls took 30.2 seconds (i.e. 0.03s
each)! The 300 instance_fault_get_by_instance_uuids() calls only took
1.129 seconds (0.004 each).

You might think: MySQL sucks. Not so fast. Remember this is a tiny
database with only 10 VMs. The profile also shows that the actual
_mysql.connection.query() method only took 1.883 seconds in total. So
we pretty much spent 29 seconds out of 60 seconds doing either
sqlalchemy stuff or our own wrapper. You can also see this from the sheer
volume of sqlalchemy library calls involved.
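
(As an aside, one way to double-check how many raw SQL statements are
actually emitted, and how long they take, is SQLAlchemy's event hooks.
This is only a sketch: the engine URL below is illustrative, and it
assumes a SQLAlchemy release that has the event API.)

# Sketch only: count and time the raw SQL statements SQLAlchemy emits,
# to compare against the cProfile numbers. Engine URL is illustrative.
import time
from sqlalchemy import create_engine, event

engine = create_engine("mysql://nova:secret@localhost/nova")  # illustrative
stats = {"count": 0, "total": 0.0}

@event.listens_for(engine, "before_cursor_execute")
def _before(conn, cursor, statement, parameters, context, executemany):
    context._query_start = time.time()

@event.listens_for(engine, "after_cursor_execute")
def _after(conn, cursor, statement, parameters, context, executemany):
    stats["count"] += 1
    stats["total"] += time.time() - context._query_start

# ... drive the 300 "nova list" calls against this engine, then:
# print("%d statements, %.3f s spent in MySQL" % (stats["count"], stats["total"]))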

* The cfg.py library inefficiency: during 300 API calls,
common.cfg.ConfigOpts._get() is called 135005 times! And we paid 2.470
seconds for that.

Hopefully this is useful for whoever wants to improve the performance
of nova-api.

Thanks,
Yun

===

 23355694 function calls (22575841 primitive calls) in 77.874 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       812   25.725    0.032   25.725    0.032 {method 'poll' of 'select.epoll' objects}
      2408    1.883    0.001    1.883    0.001 {method 'query' of '_mysql.connection' objects}
     70380    1.667    0.000    7.187    0.000 expression.py:2263(corresponding_column)
    135005    1.254    0.000    2.470    0.000 cfg.py:1058(_get)
     41027    1.043    0.000    1.907    0.000 schema.py:542(__init__)
     38802    1.008    0.000    1.219    0.000 __init__.py:451(format)
    162206    0.821    0.000    0.821    0.000 util.py:883(values)
   1530666    0.773    0.000    0.774    0.000 {isinstance}
135046/134960    0.716    0.000    1.919    0.000 cfg.py:1107(_substitute)
      1205    0.713    0.001    1.369    0.001 base.py:2106(__init__)
    183600    0.690    0.000    0.796    0.000 interfaces.py:954(_reduce_path)
     81002    0.687    0.000    2.492    0.000 compiler.py:312(visit_label)
     38802    0.650    0.000    6.087    0.000 log.py:227(format)
    319270    0.622    0.000    0.748    0.000 attributes.py:164(__get__)
890242/884229    0.608    0.000    1.885    0.000 {getattr}
     40500    0.605    0.000    3.101    0.000 schema.py:955(_make_proxy)
120783/120738    0.603    0.000    0.605    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
     81000    0.601    0.000    2.156    0.000 interfaces.py:677(create_row_processor)
     63000    0.590    0.000    0.707    0.000 times.py:44(DateTime_or_None)
     98102    0.588    0.000    0.886    0.000 compiler.py:337(visit_column)
    658098    0.580    0.000    0.581    0.000 {method 'intersection' of 'set' objects}
    109802    0.562    0.000    0.562    0.000 expression.py:3625(_from_objects)
231610/1202    0.551    0.000    5.813    0.005 visitors.py:58(_compiler_dispatch)
    144002    0.510    0.000    0.693    0.000 compiler.py:622(_truncated_identifier)
135005/134960    0.485    0.000    4.872    0.000 cfg.py:860(__getattr__)
      2408    0.463    0.000    1.942    0.001 {built-in method fetch_row}
     71100    0.460    0.000    0.580    0.000 strategies.py:121(create_row_processor)
    299031    0.437    0.000    0.437    0.000 {_codecs.utf_8_decode}
      6000    0.437    0.000    1.799    0.000 models.py:93(iteritems)
36000/9000    0.409    0.000    4.791    0.001 mapper.py:2146(populate_state)
     81002    0.393    0.000    1.104    0.000 compiler.py:672(label_select_column)
45000/9000    0.390    0.000    5.148    0.001 mapper.py:2186(_instance)
      1202    0.366    0.000    5.797    0.005 compiler.py:701(visit_select)
231610/1202    0.347    0.000    5.817    0.005 base.py:714(process)
    172800    0.341    0.000    1.148    0.000 interfaces.py:651(_get_context_strategy)
     25868    0.339    0.000    0.339    0.000 {method 'write' of 'file' objects}
91800/52200    0.327    0.000    6.728    0.000 interfaces.py:673(setup)
     22801    0.324    0.000    0.324    0.000 {method 'strftime' of 'datetime.date' objects}
     78478    0.316    0.000    0.914    0.000 expression.py:2143(contains_column)
     25868    0.314    0.000    0.314    0.000 {method 'flush' of 'file' objects}
    299033    0.309    0.000    0.930    0.000 {method 'decode' of 'str' objects}
118969/118924    0.305    0.000    0.930    0.000 string.py:174(safe_substitute)
143683

Re: [Openstack] [Nova-orchestration] Preliminary analysis of SpiffWorkflow

2012-04-06 Thread Yun Mao
Hi Ziad,

thanks for the great work. Do we know how the states are persisted in
Spiff? Thanks,

Yun

On Fri, Apr 6, 2012 at 3:53 PM, Ziad Sawalha  wrote:
> Here's a link to my analysis so far:
> http://wiki.openstack.org/NovaOrchestration/WorkflowEngines/SpiffWorkflow
>
> It looks good, but I won't pass a final verdict until I have completed a
> working project in it. I have one in progress and will let ya know when
> it's done.
>
> Z
>
> On 4/3/12 4:56 PM, "Ziad Sawalha"  wrote:
>
>>Just confirming what Sandy said; I am playing around with SpiffWorkflow.
>>I'll post my findings when I'm done on the wiki under the Nova
>>Orchestration page.
>>
>>So far I've found some of the documentation lacking and concepts
>>confusing, which has resulted in a steep learning curve and made it
>>difficult to integrate into something like RabbitMQ (for long-running
>>tasks). But the thinking behind it (http://www.workflowpatterns.com/)
>>seems sound and I will continue to investigate it.
>>
>>Z
>>
>>On 3/29/12 5:56 PM, "Sriram Subramanian"  wrote:
>>
>>>Guys,
>>>
>>>Sorry for missing the meeting today. Thanks for the detailed summary/
>>>logs. I am cool with the action item : #action sriram to update the
>>>Orchestration session proposal. This is my understanding the logs of
>>>things to be updated in the blueprint:
>>>
>>>1) orchestration service provides state management with client side APIs
>>>2) add API design and state storage as topics for the orchestration
>>>session at the Summit
>>>3) add implementation plan as session topic
>>>
>>>Please correct me if I missed anything.
>>>
>>>Just to bring everyone to same page, here are the new links
>>>
>>>Folsom BluePrint:
>>>https://blueprints.launchpad.net/nova/+spec/nova-orchestration
>>>Folsom Session proposal:
>>>https://blueprints.launchpad.net/nova/+spec/nova-orchestration
>>>Wiki: http://wiki.openstack.org/NovaOrchestration (I will clean this up
>>>tonight)
>>>
>>>Maoy: Sandy's pointers are in this email thread (which n0ano meant to fwd
>>>you)
>>>Mikeyp: Moving the conversation to the main mailing list per your
>>>suggestion
>>>
>>>Thanks,
>>>_Sriram
>>>
>>>-Original Message-
>>>From: Dugger, Donald D [mailto:donald.d.dug...@intel.com]
>>>Sent: Thursday, March 29, 2012 12:52 PM
>>>To: Sriram Subramanian; Sandy Walsh
>>>Cc: Michael Pittaro (mik...@lahondaresearch.org)
>>>Subject: RE: [Nova-orchestration] Thoughts on Orchestration (was Re:
>>>Documentation on Caching)
>>>
>>>NP, I'll be on the IRC for whoever wants to talk.  Maybe we can try and
>>>do the sync you want via email, that's always been my favorite way to
>>>communicate (it allows you to focus thoughts and deals with timezones
>>>nicely).
>>>
>>>--
>>>Don Dugger
>>>"Censeo Toto nos in Kansa esse decisse." - D. Gale
>>>Ph: 303/443-3786
>>>
>>>
>>>-Original Message-
>>>From: Sriram Subramanian [mailto:sri...@computenext.com]
>>>Sent: Thursday, March 29, 2012 1:45 PM
>>>To: Sriram Subramanian; Sandy Walsh
>>>Cc: Dugger, Donald D; Michael Pittaro (mik...@lahondaresearch.org)
>>>Subject: RE: [Nova-orchestration] Thoughts on Orchestration (was Re:
>>>Documentation on Caching)
>>>
>>>I will most likely be running little late from my 12 - 1 meeting which
>>>doesn't seem to be ending anytime now :(
>>>
>>>I haven't gotten a chance to submit a branch yet. Hopefully by this week
>>>end (at least a bare bones)
>>>
>>>If you are available for offline sync later this week - I would
>>>appreciate that. Apologies for possibly missing the sync.
>>>
>>>Thanks,
>>>-Sriram
>>>
>>>-Original Message-
>>>From:
>>>nova-orchestration-bounces+sriram=computenext@lists.launchpad.net
>>>[mailto:nova-orchestration-bounces+sriram=computenext.com@lists.launchpad
>>>.
>>>net] On Behalf Of Sriram Subramanian
>>>Sent: Wednesday, March 28, 2012 2:44 PM
>>>To: Sandy Walsh
>>>Cc: nova-orchestrat...@lists.launchpad.net
>>>Subject: Re: [Nova-orchestration] Thoughts on Orchestration (was Re:
>>>Documentation on Caching)
>>>
>>>Thanks for the pointers Sandy. I will try to spend some cycles on the
>>>branch per your suggestion; we will also discuss more tomorrow.
>>>
>>>Yes, BP is not far off from last summit, and would like to flush out more
>>>for this summit.
>>>
>>>Thanks,
>>>-Sriram
>>>
>>>-Original Message-
>>>From: Sandy Walsh [mailto:sandy.wa...@rackspace.com]
>>>Sent: Wednesday, March 28, 2012 11:31 AM
>>>To: Sriram Subramanian
>>>Cc: Michael Pittaro; Dugger, Donald D (donald.d.dug...@intel.com);
>>>nova-orchestrat...@lists.launchpad.net
>>>Subject: Thoughts on Orchestration (was Re: Documentation on Caching)
>>>
>>>Ah, gotcha.
>>>
>>>I don't think the caching stuff will really affect the Orchestration
>>>layer all that much. Certainly the Cells stuff that comstud is working on
>>>should be considered.
>>>
>>>The BP isn't really too far off from what we discussed last summit.
>>>Although I would give more consideration to the stuff Redhat is thinking
>>>about and some of the efforts by HP and IBM with re

[Openstack] where nova-compute runs: KVM vs Xen

2012-04-05 Thread Yun Mao
Right now, if you use KVM via libvirt (the default case), on the
compute node, nova-compute runs on the host. If you use Xen via
xenapi, nova-compute runs on Dom-U. (I'll ignore Xen via libvirt since
no one really uses it.)

What's the fundamental design decision to make the distinction?
Presumably, it is not *that* hard to run nova-compute in a KVM VM,
since the libvirt control socket works on tcp. I can see updating
iptables rules would be painful but shouldn't we have the same problem
with Xen? Conversely, it's also not impossible to run nova-compute in
Dom-0. I understand running something in a VM is more secure in some
sense than running in Dom0. But shouldn't the same argument apply to
KVM's case as well?
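
(To make the "libvirt works over tcp" point concrete, here is a minimal
sketch with the python libvirt bindings. The host name is made up, and it
assumes libvirtd's TCP listener is enabled, which is not the default.)

# Minimal sketch: talk to a remote libvirtd over TCP, e.g. from nova-compute
# running inside a VM. "kvm-host" is a made-up name; this assumes libvirtd
# was started with its TCP listener enabled.
import libvirt

conn = libvirt.open("qemu+tcp://kvm-host/system")
try:
    for dom_id in conn.listDomainsID():      # IDs of running domains
        dom = conn.lookupByID(dom_id)
        print("%s: %s" % (dom.name(), dom.info()))  # [state, maxMem, mem, vcpus, cpuTime]
finally:
    conn.close()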

Your input is appreciated. Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [Nova-orchestration] Thoughts on Orchestration (was Re: Documentation on Caching)

2012-04-03 Thread Yun Mao
Hi Ziad,

Thanks for taking the effort. Do you know which of the 43
workflow patterns are relevant to us? I'm slightly concerned that
SpiffWorkflow might be overkill and bring unnecessary complexity
into the game. There was a discussion a while ago suggesting a
relatively simple sequential execution pattern:
https://lists.launchpad.net/nova-orchestration/msg00043.html

Thanks,

Yun

On Tue, Apr 3, 2012 at 5:56 PM, Ziad Sawalha  wrote:
> Just confirming what Sandy said; I am playing around with SpiffWorkflow.
> I'll post my findings when I'm done on the wiki under the Nova
> Orchestration page.
>
> So far I've found some of the documentation lacking and concepts
> confusing, which has resulted in a steep learning curve and made it
> difficult to integrate into something like RabbitMQ (for long-running
> tasks). But the thinking behind it (http://www.workflowpatterns.com/)
> seems sound and I will continue to investigate it.
>
> Z
>
> On 3/29/12 5:56 PM, "Sriram Subramanian"  wrote:
>
>>Guys,
>>
>>Sorry for missing the meeting today. Thanks for the detailed summary/
>>logs. I am cool with the action item : #action sriram to update the
>>Orchestration session proposal. This is my understanding the logs of
>>things to be updated in the blueprint:
>>
>>1) orchestration service provides state management with client side APIs
>>2) add API design and state storage as topics for the orchestration
>>session at the Summit
>>3) add implementation plan as session topic
>>
>>Please correct me if I missed anything.
>>
>>Just to bring everyone to same page, here are the new links
>>
>>Folsom BluePrint:
>>https://blueprints.launchpad.net/nova/+spec/nova-orchestration
>>Folsom Session proposal:
>>https://blueprints.launchpad.net/nova/+spec/nova-orchestration
>>Wiki: http://wiki.openstack.org/NovaOrchestration (I will clean this up
>>tonight)
>>
>>Maoy: Sandy's pointers are in this email thread (which n0ano meant to fwd
>>you)
>>Mikeyp: Moving the conversation to the main mailing list per your
>>suggestion
>>
>>Thanks,
>>_Sriram
>>
>>-Original Message-
>>From: Dugger, Donald D [mailto:donald.d.dug...@intel.com]
>>Sent: Thursday, March 29, 2012 12:52 PM
>>To: Sriram Subramanian; Sandy Walsh
>>Cc: Michael Pittaro (mik...@lahondaresearch.org)
>>Subject: RE: [Nova-orchestration] Thoughts on Orchestration (was Re:
>>Documentation on Caching)
>>
>>NP, I'll be on the IRC for whoever wants to talk.  Maybe we can try and
>>do the sync you want via email, that's always been my favorite way to
>>communicate (it allows you to focus thoughts and deals with timezones
>>nicely).
>>
>>--
>>Don Dugger
>>"Censeo Toto nos in Kansa esse decisse." - D. Gale
>>Ph: 303/443-3786
>>
>>
>>-Original Message-
>>From: Sriram Subramanian [mailto:sri...@computenext.com]
>>Sent: Thursday, March 29, 2012 1:45 PM
>>To: Sriram Subramanian; Sandy Walsh
>>Cc: Dugger, Donald D; Michael Pittaro (mik...@lahondaresearch.org)
>>Subject: RE: [Nova-orchestration] Thoughts on Orchestration (was Re:
>>Documentation on Caching)
>>
>>I will most likely be running little late from my 12 - 1 meeting which
>>doesn't seem to be ending anytime now :(
>>
>>I haven't gotten a chance to submit a branch yet. Hopefully by this week
>>end (at least a bare bones)
>>
>>If you are available for offline sync later this week - I would
>>appreciate that. Apologies for possibly missing the sync.
>>
>>Thanks,
>>-Sriram
>>
>>-Original Message-
>>From:
>>nova-orchestration-bounces+sriram=computenext@lists.launchpad.net
>>[mailto:nova-orchestration-bounces+sriram=computenext.com@lists.launchpad.
>>net] On Behalf Of Sriram Subramanian
>>Sent: Wednesday, March 28, 2012 2:44 PM
>>To: Sandy Walsh
>>Cc: nova-orchestrat...@lists.launchpad.net
>>Subject: Re: [Nova-orchestration] Thoughts on Orchestration (was Re:
>>Documentation on Caching)
>>
>>Thanks for the pointers Sandy. I will try to spend some cycles on the
>>branch per your suggestion; we will also discuss more tomorrow.
>>
>>Yes, BP is not far off from last summit, and would like to flush out more
>>for this summit.
>>
>>Thanks,
>>-Sriram
>>
>>-Original Message-
>>From: Sandy Walsh [mailto:sandy.wa...@rackspace.com]
>>Sent: Wednesday, March 28, 2012 11:31 AM
>>To: Sriram Subramanian
>>Cc: Michael Pittaro; Dugger, Donald D (donald.d.dug...@intel.com);
>>nova-orchestrat...@lists.launchpad.net
>>Subject: Thoughts on Orchestration (was Re: Documentation on Caching)
>>
>>Ah, gotcha.
>>
>>I don't think the caching stuff will really affect the Orchestration
>>layer all that much. Certainly the Cells stuff that comstud is working on
>>should be considered.
>>
>>The BP isn't really too far off from what we discussed last summit.
>>Although I would give more consideration to the stuff Redhat is thinking
>>about and some of the efforts by HP and IBM with respect to scheduling
>>(mostly HPC stuff). Unifying and/or understanding those efforts would be
>>important.
>>
>>That said, as with all things Open

Re: [Openstack] Caching strategies in Nova ...

2012-03-23 Thread Yun Mao
Got it. Thanks,

If I read your numbers correctly, there are 10 db api calls, with a total
time of 0.388 seconds.

This is certainly not lightning fast, but it's not really slow either, given
that the user expects the VM to be created in more than 10
seconds; <0.5 s latency is tolerable. If most of the time is spent on
the network round trip to the db, then I'd say that when we scale up a lot
in compute/VM numbers, the latency won't increase much.

One thing to note is that right now the DB APIs are all blocking
calls, so it could be tricky to get the performance numbers right when
measuring multiple concurrent requests.

Yun

On Fri, Mar 23, 2012 at 6:47 PM, Mark Washenberger
 wrote:
> Yun,
>
> I was working with a very small but fairly realistic setup. In this
> case I had only 3 Xen hosts, no more than 10 nova vms up at a time.
> And the environment was very nearly "fresh" so I believe the db
> tables were as small as they could be. I believe the utilization
> across the board in my setup was very low, and indeed the numbers
> were very consistent (I ran a large number of times, but didn't
> save all of the data :-(). Also, there were only 2 compute nodes
> running, but as the workflow only had rpc casts, I'm not sure that
> really mattered very much.
>
> The profile I gave was for vm creation. But I also ran tests for
> deletion, listing, and showing vms in the OS API.
>
> Networks were static throughout the process. Volumes were absent.
>
> "Yun Mao"  said:
>
>> Hi Mark,
>>
>> what workload and what setup do you have while you are profiling? e.g.
>> how many compute nodes do you have, how many VMs do you have, are you
>> creating/destroying/migrating VMs, volumes, networks?
>>
>> Thanks,
>>
>> Yun
>>
>> On Fri, Mar 23, 2012 at 4:26 PM, Mark Washenberger
>>  wrote:
>>>
>>>
>>> "Johannes Erdfelt"  said:
>>>
>>>>
>>>> MySQL isn't exactly slow and Nova doesn't have particularly large
>>>> tables. It looks like the slowness is coming from the network and how
>>>> many queries are being made.
>>>>
>>>> Avoiding joins would mean even more queries, which looks like it would
>>>> slow it down even further.
>>>>
>>>
>>> This is exactly what I saw in my profiling. More complex queries did
>>> still seem to take longer than less complex ones, but it was a second
>>> order effect compared to the overall volume of queries.
>>>
>>> I'm not sure that network was the culprit though, since my ping
>>> roundtrip time was small relative to the wall time I measured for each
>>> nova.db.api call.
>>>
>>>
>>> ___
>>> Mailing list: https://launchpad.net/~openstack
>>> Post to     : openstack@lists.launchpad.net
>>> Unsubscribe : https://launchpad.net/~openstack
>>> More help   : https://help.launchpad.net/ListHelp
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>>
>
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Caching strategies in Nova ...

2012-03-23 Thread Yun Mao
Hi Mark,

What workload and setup do you have while you are profiling? E.g.
how many compute nodes do you have, how many VMs do you have, and are you
creating/destroying/migrating VMs, volumes, or networks?

Thanks,

Yun

On Fri, Mar 23, 2012 at 4:26 PM, Mark Washenberger
 wrote:
>
>
> "Johannes Erdfelt"  said:
>
>>
>> MySQL isn't exactly slow and Nova doesn't have particularly large
>> tables. It looks like the slowness is coming from the network and how
>> many queries are being made.
>>
>> Avoiding joins would mean even more queries, which looks like it would
>> slow it down even further.
>>
>
> This is exactly what I saw in my profiling. More complex queries did
> still seem to take longer than less complex ones, but it was a second
> order effect compared to the overall volume of queries.
>
> I'm not sure that network was the culprit though, since my ping
> roundtrip time was small relative to the wall time I measured for each
> nova.db.api call.
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] zookeeper problem

2012-03-19 Thread Yun Mao
Hi,

As far as I know, OpenStack doesn't use zookeeper yet. Is this
something you are working on as an extra component? Nova/glance/keystone use
eventlet, which doesn't work well with the default zookeeper python
lib. We have had some success with this library I wrote:
https://github.com/maoy/python-evzookeeper. Hopefully that helps.


Yun

On Mon, Mar 19, 2012 at 11:26 AM, khabou imen  wrote:
> i've tried to change the ticktime value
> but i'm still getting "exceeded  deadline by 13ms"
> can any one advice me please
> --
> cordialement,
>  Imen Khabou,
> Elève Ingénieur en Informatique
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] git review complains about Contributor Agreement not signed

2012-03-14 Thread Yun Mao
Hi,

I have signed the agreement but I'm not sure how to make my git review
command realize that. Right now I got:


$ git review
fatal:  A Contributor Agreement must be completed before uploading:

  http://wiki.openstack.org/HowToContribute


fatal: The remote end hung up unexpectedly


Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet and OpenStack external libraries

2012-03-13 Thread Yun Mao
I took a hack at it and made all C-based libvirt calls go through tpool. It
wasn't that bad..

https://github.com/maoy/nova/commit/757bfcb239c62c23c8455a507389efe9c1a2676e

From my limited testing, it works fine. At least snapshot doesn't
block the compute node anymore.
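
(For anyone curious, the pattern is simply to push the blocking C-level call
onto eventlet's OS-thread pool so the hub keeps running. A rough sketch of the
pattern, not the actual patch above:)

# Rough sketch (not the actual patch): run blocking C-level libvirt calls in
# eventlet's OS-thread pool so green threads keep getting scheduled while the
# call is in flight.
import libvirt
from eventlet import tpool

conn = libvirt.open("qemu:///system")

# Wrap one blocking call explicitly...
dom_ids = tpool.execute(conn.listDomainsID)

# ...or proxy the whole connection so every method call goes through tpool.
# Note that objects it returns (e.g. virDomain) are not proxied automatically.
safe_conn = tpool.Proxy(conn)
for dom_id in safe_conn.listDomainsID():
    dom = safe_conn.lookupByID(dom_id)
    print(dom.name())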

Yun

On Tue, Mar 13, 2012 at 2:04 PM, Johannes Erdfelt  wrote:
> On Tue, Mar 13, 2012, Yun Mao  wrote:
>> There are two places in the current master branch that use tpool:
>> NWFilterFirewall and XenAPISession. Are they safe?
>
> I've looked at XenAPISession and it appears to be safe. It doesn't use
> logging nor any other locks.
>
> It does use other Python modules, but they appear to be fine too.
>
> I've never looked at NWFilterFirewall since I've been doing almost all
> of my development on xenapi.
>
>> I think if it's a pure C-based API call, then monkey patch should not
>> mess with it and it shouldn't try to reschedule among co-routines,
>> right?
>
> If it's 100% C, then it's most likely safe. There are ways that it can
> become unsafe, but it really needs to go out of it's way to do so.
>
>> After examining, the code, I see all libvirt-based calls are blocking,
>> and XenAPIs are non-blocking. This probably makes a huge difference in
>> a non-trivial deployment. However, libvirt-based KVM is probably the
>> most widely adopted choice right now, which is very strange..
>
> Yeah, it's not clear to me why only that one call in libvirt is handed
> off to a tpool thread. I'm not all that familiar with libvirt, nor do I
> use it, so I haven't looked into that.
>
> JE
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet and OpenStack external libraries

2012-03-13 Thread Yun Mao
Hi JE,

There are two places in the current master branch that use tpool:
NWFilterFirewall and XenAPISession. Are they safe?

I think if it's a pure C-based API call, then monkey patch should not
mess with it and it shouldn't try to reschedule among co-routines,
right?

After examining the code, I see all libvirt-based calls are blocking,
and XenAPI calls are non-blocking. This probably makes a huge difference in
a non-trivial deployment. However, libvirt-based KVM is probably the
most widely adopted choice right now, which is very strange..

Thanks,

Yun

On Mon, Mar 12, 2012 at 4:18 PM, Johannes Erdfelt  wrote:
> On Mon, Mar 12, 2012, Yun Mao  wrote:
>> My understanding is that if the answer to question3 is yes, then the
>> blocking call should be executed in tpool, although it's more likely
>> to have bugs in that case.
>
> Please be very careful with tpool. If the code being executed in the
> tpool thread ends up using a lock that can contend with code executing
> the main thread, you can end up with the tpool thread hanging.
>
> In particular, using logging can trigger this hang. You would need to
> audit the library to ensure it's safe to be used.
>
> This is one of the reasons I'd prefer to see Openstack move away from
> eventlet. It has a handful of problems that requires a high level of
> diligence to avoid properly.
>
> JE
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] eventlet and OpenStack external libraries

2012-03-12 Thread Yun Mao
Hi stackers,

A couple of days ago there was a long discussion of eventlet. I am
trying to summarize all external python dependencies for nova, glance
and keystone. I extracted the dependencies from devstack, but I realize
that the list is slightly different from tools/pip-requires. So I'm a little
confused on that, and it may not be entirely accurate.

For each dependency, I try to understand the following questions:
1) Is it pure python?

2) If it is pure python, does it have blocking operations (read file,
socket, acquire lock) such that the eventlet monkey patch has an impact?

3) If it is not pure python, does it have blocking operations in the C
implementation such that an eventlet green thread would block?

My understanding is that if the answer to question 3 is yes, then the
blocking call should be executed in tpool, although it's more likely
to have bugs in that case.

Right now I find that mysql, xattr, libvirt, sendfile, and sqlite2 are
the libraries with potential blocking calls. There are many empty
cells that I'm not very sure about. The result is in a Google doc and is
editable by anyone with the link. Feel free to edit. Once it
stabilizes, maybe I'll put it in a wiki.

Thanks,

Yun

https://docs.google.com/spreadsheet/ccc?key=0AsgKparJuTF2dDNJXzd3YTFDQzRubm50VmhtWDl6RUE

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet weirdness

2012-03-05 Thread Yun Mao
Hi Phil,

My understanding is that (forgetting Nova for a second), in a perfect
eventlet world, a green thread is either doing CPU-intensive
computing or waiting in system calls that are IO related. In the latter
case, the eventlet scheduler will suspend the green thread and switch
to another green thread that is ready to run.

Back to reality: as you mentioned, this is broken - some IO-bound
activity won't cause an eventlet switch. To me, the only way that
happens is for the same reason those MySQL calls are blocking - we
are using C-based modules that don't respect the monkey patch and never
yield. I suspect that all libvirt-based calls also belong to this
category.

Now if those blocking calls can finish in a very short time (as we
assume for DB calls), then I think inserting a sleep(0) after every
blocking call should be a quick fix to the problem. But if it's a long
blocking call like the snapshot case, we are probably screwed anyway
and need OS-thread-level parallelism or multiprocessing to make it
truly non-blocking. Thanks,

Yun
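
(A tiny self-contained illustration of the fairness point -- sleep(0) yields
to other green threads but adds no CPU capacity, so total wall time stays
roughly the same while short tasks stop starving. Nothing Nova-specific here.)

# Illustration: eventlet.sleep(0) is a cooperative yield. It does not add CPU
# capacity, so total wall time is unchanged, but short green threads stop
# starving behind a CPU-bound one.
import eventlet

def cpu_hog(iterations):
    for i in range(iterations):
        sum(range(10000))          # pretend CPU-bound work, never blocks on IO
        if i % 100 == 0:
            eventlet.sleep(0)      # without this, "short" waits until the hog is done

def short_task():
    print("short task: finally got scheduled")

pool = eventlet.GreenPool()
pool.spawn(cpu_hog, 2000)
pool.spawn(short_task)
pool.waitall()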

On Mon, Mar 5, 2012 at 10:43 AM, Day, Phil  wrote:
> Hi Yun,
>
> The point of the sleep(0) is to explicitly yield from a long running eventlet 
> so that other eventlets aren't blocked for a long period.   Depending on 
> how you look at that either means we're making an explicit judgement on 
> priority, or trying to provide a more equal sharing of run-time across 
> eventlets.
>
> It's not that things are CPU bound as such - more just that eventlets have 
> every few pre-emption points.    Even an IO bound activity like creating a 
> snapshot won't cause an eventlet switch.
>
> So in terms of priority we're trying to get to the state where:
>  - Important periodic events (such as service status) run when expected  (if 
> these take a long time we're stuffed anyway)
>  - User initiated actions don't get blocked by background system eventlets 
> (such as refreshing power-state)
> - Slow action from one user don't block actions from other users (the first 
> user will expect their snapshot to take X seconds, the second one won't 
> expect their VM creation to take X + Y seconds).
>
> It almost feels like the right level of concurrency would be to have a 
> task/process running for each VM, so that there is concurrency across 
> un-related VMs, but serialisation for each VM.
>
> Phil
>
> -Original Message-
> From: Yun Mao [mailto:yun...@gmail.com]
> Sent: 02 March 2012 20:32
> To: Day, Phil
> Cc: Chris Behrens; Joshua Harlow; openstack
> Subject: Re: [Openstack] eventlet weirdness
>
> Hi Phil, I'm a little confused. To what extent does sleep(0) help?
>
> It only gives the greenlet scheduler a chance to switch to another green 
> thread. If we are having a CPU bound issue, sleep(0) won't give us access to 
> any more CPU cores. So the total time to finish should be the same no matter 
> what. It may improve the fairness among different green threads but shouldn't 
> help the throughput. I think the only apparent gain to me is situation such 
> that there is 1 green thread with long CPU time and many other green threads 
> with small CPU time.
> The total finish time will be the same with or without sleep(0), but with 
> sleep in the first threads, the others should be much more responsive.
>
> However, it's unclear to me which part of Nova is very CPU intensive.
> It seems that most work here is IO bound, including the snapshot. Do we have 
> other blocking calls besides mysql access? I feel like I'm missing something 
> but couldn't figure out what.
>
> Thanks,
>
> Yun
>
>
> On Fri, Mar 2, 2012 at 2:08 PM, Day, Phil  wrote:
>> I didn't say it was pretty - Given the choice I'd much rather have a 
>> threading model that really did concurrency and pre-emption all the right 
>> places, and it would be really cool if something managed the threads that 
>> were started so that is a second conflicting request was received it did 
>> some proper tidy up or blocking rather than just leaving the race condition 
>> to work itself out (then we wouldn't have to try and control it by checking 
>> vm_state).
>>
>> However ...   In the current code base where we only have user space based 
>> eventlets, with no pre-emption, and some activities that need to be 
>> prioritised then forcing pre-emption with a sleep(0) seems a pretty small 
>> bit of untidy.   And it works now without a major code refactor.
>>
>> Always open to other approaches ...
>>
>> Phil
>>
>>
>> -Original Message-
>> From: openstack-bounces+philip.day=hp@lists.launchpad.net
>> [mailto:openstack-bo

Re: [Openstack] eventlet weirdness

2012-03-02 Thread Yun Mao
First I agree that having blocking DB calls is no big deal given the
way Nova uses mysql and reasonably powerful db server hardware.

However, I'd like to point out that the math below is misleading (the
average time for the non-blocking case is also miscalculated, but that's
not my point). The number that matters more in real life is
throughput. For the blocking case it's 3/30 = 0.1 requests per second.
For the non-blocking case it's 3/27 = 0.11 requests per second. That
means if there is a request coming in every 9 seconds constantly, the
blocking system will eventually explode but the non-blocking system can
still handle it. Therefore, the non-blocking one should be preferred.
Thanks,

Yun

>
> For example in the API server (before we made it properly multi-threaded) 
> with blocking db calls the server was essentially a serial processing queue - 
> each request was fully processed before the next.  With non-blocking db calls 
> we got a lot more apparent concurrencybut only at the expense of making all 
> of the requests equally bad.
>
> Consider a request takes 10 seconds, where after 5 seconds there is a call to 
> the DB which takes 1 second, and three are started at the same time:
>
> Blocking:
> 0 - Request 1 starts
> 10 - Request 1 completes, request 2 starts
> 20 - Request 2 completes, request 3 starts
> 30 - Request 3 competes
> Request 1 completes in 10 seconds
> Request 2 completes in 20 seconds
> Request 3 completes in 30 seconds
> Ave time: 20 sec
>
>
> Non-blocking
> 0 - Request 1 Starts
> 5 - Request 1 gets to db call, request 2 starts
> 10 - Request 2 gets to db call, request 3 starts
> 15 - Request 3 gets to db call, request 1 resumes
> 19 - Request 1 completes, request 2 resumes
> 23 - Request 2 completes,  request 3 resumes
> 27 - Request 3 completes
>
> Request 1 completes in 19 seconds  (+ 9 seconds)
> Request 2 completes in 24 seconds (+ 4 seconds)
> Request 3 completes in 27 seconds (- 3 seconds)
> Ave time: 20 sec
>
> So instead of worrying about making db calls non-blocking we've been working 
> to make certain eventlets non-blocking - i.e. add sleep(0) calls to long 
> running iteration loops - which IMO has a much bigger impact on the 
> performance of the apparent latency of the system. Thanks for the 
> explanation. Let me see if I understand this.

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet weirdness

2012-03-02 Thread Yun Mao
Hi Phil, I'm a little confused. To what extent does sleep(0) help?

It only gives the greenlet scheduler a chance to switch to another
green thread. If we are having a CPU bound issue, sleep(0) won't give
us access to any more CPU cores. So the total time to finish should be
the same no matter what. It may improve the fairness among different
green threads but shouldn't help the throughput. I think the only
apparent gain is the situation where there is 1 green thread
with long CPU time and many other green threads with small CPU time.
The total finish time will be the same with or without sleep(0), but
with sleep in the first thread, the others should be much more
responsive.

However, it's unclear to me which part of Nova is very CPU intensive.
It seems that most work here is IO bound, including the snapshot. Do
we have other blocking calls besides mysql access? I feel like I'm
missing something but couldn't figure out what.

Thanks,

Yun


On Fri, Mar 2, 2012 at 2:08 PM, Day, Phil  wrote:
> I didn't say it was pretty - Given the choice I'd much rather have a 
> threading model that really did concurrency and pre-emption all the right 
> places, and it would be really cool if something managed the threads that 
> were started so that is a second conflicting request was received it did some 
> proper tidy up or blocking rather than just leaving the race condition to 
> work itself out (then we wouldn't have to try and control it by checking 
> vm_state).
>
> However ...   In the current code base where we only have user space based 
> eventlets, with no pre-emption, and some activities that need to be 
> prioritised then forcing pre-emption with a sleep(0) seems a pretty small bit 
> of untidy.   And it works now without a major code refactor.
>
> Always open to other approaches ...
>
> Phil
>
>
> -Original Message-
> From: openstack-bounces+philip.day=hp@lists.launchpad.net 
> [mailto:openstack-bounces+philip.day=hp@lists.launchpad.net] On Behalf Of 
> Chris Behrens
> Sent: 02 March 2012 19:00
> To: Joshua Harlow
> Cc: openstack; Chris Behrens
> Subject: Re: [Openstack] eventlet weirdness
>
> It's not just you
>
>
> On Mar 2, 2012, at 10:35 AM, Joshua Harlow wrote:
>
>> Does anyone else feel that the following seems really "dirty", or is it just 
>> me.
>>
>> "adding a few sleep(0) calls in various places in the Nova codebase
>> (as was recently added in the _sync_power_states() periodic task) is
>> an easy and simple win with pretty much no ill side-effects. :)"
>>
>> Dirty in that it feels like there is something wrong from a design point of 
>> view.
>> Sprinkling "sleep(0)" seems like its a band-aid on a larger problem imho.
>> But that's just my gut feeling.
>>
>> :-(
>>
>> On 3/2/12 8:26 AM, "Armando Migliaccio"  
>> wrote:
>>
>> I knew you'd say that :P
>>
>> There you go: https://bugs.launchpad.net/nova/+bug/944145
>>
>> Cheers,
>> Armando
>>
>> > -Original Message-
>> > From: Jay Pipes [mailto:jaypi...@gmail.com]
>> > Sent: 02 March 2012 16:22
>> > To: Armando Migliaccio
>> > Cc: openstack@lists.launchpad.net
>> > Subject: Re: [Openstack] eventlet weirdness
>> >
>> > On 03/02/2012 10:52 AM, Armando Migliaccio wrote:
>> > > I'd be cautious to say that no ill side-effects were introduced. I
>> > > found a
>> > race condition right in the middle of sync_power_states, which I
>> > assume was exposed by "breaking" the task deliberately.
>> >
>> > Such a party-pooper! ;)
>> >
>> > Got a link to the bug report for me?
>> >
>> > Thanks!
>> > -jay
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet weirdness

2012-03-01 Thread Yun Mao
It seems that there used to be a db_pool in session.py, but it got removed
by this commit:

https://github.com/openstack/nova/commit/f3dd56e916232e38e74d9e2f24ce9a738cac63cf

due to this bug: https://bugs.launchpad.net/nova/+bug/838581

But I'm still confused by the discussion. Are we saying eventlet +
sqlalchemy + mysql pool is buggy, so instead we make every DB call a
blocking call? Thanks,

Yun

On Thu, Mar 1, 2012 at 2:45 PM, Yun Mao  wrote:
> There are plenty eventlet discussion recently but I'll stick my
> question to this thread, although it's pretty much a separate
> question. :)
>
> How is MySQL access handled in eventlet? Presumably it's external C
> library so it's not going to be monkey patched. Does that make every
> db access call a blocking call? Thanks,

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet weirdness

2012-03-01 Thread Yun Mao
There has been plenty of eventlet discussion recently, but I'll stick my
question in this thread, although it's pretty much a separate
question. :)

How is MySQL access handled in eventlet? Presumably it's an external C
library, so it's not going to be monkey patched. Does that make every
db access call a blocking call? Thanks,

Yun

On Wed, Feb 29, 2012 at 9:18 PM, Johannes Erdfelt  wrote:
> On Wed, Feb 29, 2012, Yun Mao  wrote:
>> Thanks for the explanation. Let me see if I understand this.
>>
>> 1. Eventlet will never have this problem if there is only 1 OS thread
>> -- let's call it main thread.
>
> In fact, that's exactly what Python calls it :)
>
>> 2. In Nova, there is only 1 OS thread unless you use xenapi and/or the
>> virt/firewall driver.
>> 3. The python logging module uses locks. Because of the monkey patch,
>> those locks are actually eventlet or "green" locks and may trigger a
>> green thread context switch.
>>
>> Based on 1-3, does it make sense to say that in the other OS threads
>> (i.e. not main thread), if logging (plus other pure python library
>> code involving locking) is never used, and we do not run a eventlet
>> hub at all, we should never see this problem?
>
> That should be correct. I'd have to double check all of the monkey
> patching that eventlet does to make sure there aren't other cases where
> you may inadvertently use eventlet primitives across real threads.
>
> JE
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] eventlet weirdness

2012-02-29 Thread Yun Mao
Thanks for the explanation. Let me see if I understand this.

1. Eventlet will never have this problem if there is only 1 OS thread
-- let's call it the main thread.
2. In Nova, there is only 1 OS thread unless you use xenapi and/or the
virt/firewall driver.
3. The python logging module uses locks. Because of the monkey patch,
those locks are actually eventlet or "green" locks and may trigger a
green thread context switch.

Based on 1-3, does it make sense to say that in the other OS threads
(i.e. not the main thread), if logging (plus other pure python library
code involving locking) is never used, and we do not run an eventlet
hub at all, we should never see this problem?
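
(A quick way to see point 3 for yourself -- after monkey patching, the lock
that logging allocates is an eventlet green lock rather than a real OS lock.
A minimal check, nothing Nova-specific:)

# Minimal check: after monkey patching, locks created through the threading
# module (which logging uses internally) are eventlet green locks, so touching
# them from another OS thread can trigger "cannot switch to a different thread".
import eventlet
eventlet.monkey_patch()

import logging
import threading

print(type(threading.Lock()))          # eventlet's green lock, not thread.lock
print(type(logging.Handler().lock))    # logging handlers allocate the same kind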

Thanks,

Yun

On Wed, Feb 29, 2012 at 5:24 PM, Johannes Erdfelt  wrote:
> On Wed, Feb 29, 2012, Yun Mao  wrote:
>> we sometimes notice this error message which prevent us from starting
>> nova services occasionally. We are using a somewhat modified diablo
>> stable release on Ubuntu 11.10. It may very well be the problem from
>> our patches but I'm wondering if you guys have any insight. In what
>> condition does this error occur? There is a similar bug in here:
>> https://bugs.launchpad.net/nova/+bug/831599
>>
>> but that doesn't offer much insight to me. Helps are very appreciated. 
>> Thanks,
>
> greenlet threads (used by eventlet) can't be scheduled across real
> threads. This usually isn't done explicitly, but can happen as a side
> effect if code uses locks. logging is one instance that I've run into.
>
> This generally hasn't been a problem with nova since it uses the
> eventlet monkey patching that makes it hard to generate real threads.
>
> There are two places (at least in trunk) where you need to be careful,
> both nova/virt/xenapi_conn.py and libvirt/firewall.py use tpool which
> does create a real thread in the background.
>
> If you use logging (and it's not the only source of this problem) then
> you can run into this eventlet message.
>
> JE
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] how to run selected tests

2012-02-29 Thread Yun Mao
Cool. Thanks Brad and Vish for the quick reply!

Yun

On Wed, Feb 29, 2012 at 3:57 PM, Vishvananda Ishaya
 wrote:
> ./run_tests.sh -N scheduler test_notify
> (replace -N with -V if you are using a virtual environment)
> You can also get very specific using : for class separation and . for path 
> and method separation
>
> for example:
> ./run_tests.sh -N 
> nova.tests.api.openstack.compute.contrib.test_floating_ips:FloatingIpSerializerTest.test_index_serializer
>
> also consider using -n for subsequent runs of run_tests.sh, it stops 
> run_tests.sh from recreating the database each time
>
> Vish
>
> On Feb 29, 2012, at 12:42 PM, Yun Mao wrote:
>
>> Greetings,
>>
>> What's the most convenient way to run a subset of the existing tests?
>> By default run_tests.sh tests everything. For example, I'd like to run
>> everything in test_scheduler plus test_notify.py, what's the best way
>> to do that? Thanks,
>>
>> Yun
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] eventlet weirdness

2012-02-29 Thread Yun Mao
Hi,

We sometimes notice this error message, which occasionally prevents us from
starting nova services. We are using a somewhat modified diablo
stable release on Ubuntu 11.10. It may very well be a problem from
our patches, but I'm wondering if you guys have any insight. In what
conditions does this error occur? There is a similar bug here:
https://bugs.launchpad.net/nova/+bug/831599

but that doesn't offer much insight to me. Help is very appreciated. Thanks,

Yun

2012-02-23 16:54:52,788 DEBUG nova.utils
[43f98259-6ba8-4e5d-bc0e-9eab978194e5 None None] backend  from (pid=6385)
__get_backend /opt/stack/nova/nova/utils.py:449
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line
336, in fire_timers
timer()
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line
56, in __call__
cb(*args, **kw)
  File "/usr/lib/python2.7/dist-packages/eventlet/semaphore.py", line
95, in _do_acquire
waiter.switch()
error: cannot switch to a different thread

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] how to run selected tests

2012-02-29 Thread Yun Mao
Greetings,

What's the most convenient way to run a subset of the existing tests?
By default, run_tests.sh tests everything. For example, I'd like to run
everything in test_scheduler plus test_notify.py; what's the best way
to do that? Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] devstack and stable/diablo

2012-02-21 Thread Yun Mao
Hi Anthony, the issues with the stable/diablo branch are mostly that left-over
VMs are not scrubbed cleanly, and that EXTRA_FLAGS doesn't work due to
a typo. Those bugs were all fixed in the master branch a while ago.
Thanks,

Yun

On Tue, Feb 21, 2012 at 2:51 PM, Anthony Young
 wrote:
>
> On Tue, Feb 21, 2012 at 11:23 AM, Yun Mao  wrote:
>>
>> What's the recommended way to play with stable/diablo with devstack?
>
>
> Ideally:
>
>> git checkout stable/diablo
>> ./stack.sh
>
> Which you are probably doing.
>
>>
>> We've been using the stable/diablo branch of devstack but stack.sh in
>> that branch is old that has some annoying small issues.
>
>
> It would be best to just submit bugs, or fix issues as they pop up and
> propose them to the stable/diablo branch.  What issues are you seeing?  I'm
> happy to help here.
>
>>
>> If I use the
>> master branch of devstack but replace stackrc with the stable/diablo
>> branch content, would that be  a problem? Is it recommended? Thanks,
>
>
> No, unfortunately the configuration between diablo and essex is different
> enough that this will not work.  I'm sure there are others out there that
> are also using stable/diablo devstack, so filing bugs and patches against
> that would be very helpful.
>
> Anthony
>
>>
>>
>> Yun
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] devstack and stable/diablo

2012-02-21 Thread Yun Mao
What's the recommended way to play with stable/diablo with devstack?
We've been using the stable/diablo branch of devstack, but stack.sh in
that branch is old and has some annoying small issues. If I use the
master branch of devstack but replace stackrc with the stable/diablo
branch content, would that be a problem? Is it recommended? Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Remove Zones code - FFE

2012-02-13 Thread Yun Mao
agreed..

-1 on shard, +1 on cluster

Yun

On Mon, Feb 13, 2012 at 7:59 PM, Martin Paulo  wrote:
> Please not 'shards'
> Sharding as a concept is so intertwined with databases IMHO that it
> will serve to confuse even more. Why not 'cluster'?
>
> Martin
>
> On 13 February 2012 09:50, Chris Behrens  wrote:
>> Sorry, I'm late.  Really getting down to the wire here. :)
>>
>> I've thrown up a version here: https://review.openstack.org/#change,4062
>>
>> I've not functionally tested it yet, but there's really good test coverage 
>> for the zones service itself.   I also have added a test_compute_zones which 
>> tests that all of the compute tests pass while using the new ComputeZonesAPI 
>> class.
>>
>> There's a couple bugs I note in the review and then I think I'm missing 
>> pushing some instance updates to the top in libvirt code.  And missing an 
>> update for instance deletes in the compute manager.  Going to hit those up 
>> today and finish this off.
>>
>> One other comment:  It's been suggested we not call this stuff 'Zones' 
>> anymore.  It gets confused with availability zones and so forth.  Since this 
>> is really a way to shard nova, it has been suggested to call this 'Shards'. 
>> :)   Not sure I dig that name completely, although it makes sense.  Thoughts?
>>
>> - Chris
>>
>>
>> On Feb 9, 2012, at 10:29 AM, Leandro Reox wrote:
>>
>>> Awesome Chris !!!
>>>
>>> Lean
>>>
>>> On Thu, Feb 9, 2012 at 3:26 PM, Alejandro Comisario 
>>>  wrote:
>>> Niceee !!
>>>
>>> Alejandro.
>>>
>>> On 02/09/2012 02:02 PM, Chris Behrens wrote:
 I should be pushing something up by end of day...  Even if it's not 
 granted an FFE, I'll have a need to keep my branch updated and working, so 
 I should at least always have a branch pushed up to a github account 
 somewhere until F1 opens up.  So, I guess worst case... there'll be a 
 branch somewhere for you to play with. :)

 - Chris


 On Feb 8, 2012, at 3:21 PM, Tom Fifield wrote:


> Just raising another deployment waiting on this new Zone implementation - 
> we currently have 2000 cores sitting idle in another datacentre that we 
> can use "better" if this is done.
>
> How can we help? ;)
>
> Regards,
>
> Tom
>
> On 02/08/2012 07:30 PM, Ziad Sawalha wrote:
>
>> We were working on providing the necessary functionality in Keystone but
>> stopped when we heard of the alternative solution. We could resume the
>> conversation about what is needed on the Keystone side and implement if
>> needed.
>>
>> Z
>>
>> From: Sandy Walsh <
>> sandy.wa...@rackspace.com
>> 
>> >
>> Date: Thu, 2 Feb 2012 01:49:58 +
>> To: Joshua McKenty <
>> jos...@pistoncloud.com
>> 
>> >, Vishvananda Ishaya
>> <
>> vishvana...@gmail.com 
>> >
>> Cc: "
>> openstack@lists.launchpad.net
>> " > 
>> >
>> Subject: Re: [Openstack] Remove Zones code - FFE
>>
>> Understood, timing is everything. I'll let Chris talk about expected
>> timing for the replacement. From a deployers side, nothing would really
>> change, just some configuration options ... but a replacement should be
>> available.
>>
>> I'm sure we could get it working pretty easily. The Keystone integration
>> was the biggest pita.
>>
>> I can keep this branch fresh with trunk for when we're ready to pull the
>> trigger.
>>
>> -S
>>
>> 
>> *From:* Joshua McKenty [
>> jos...@pistoncloud.com
>> 
>> ]
>> *Sent:* Wednesday, February 01, 2012 4:45 PM
>> *To:* Vishvananda Ishaya
>> *Cc:* Sandy Walsh;
>> openstack@lists.launchpad.net
>> 
>>
>> *Subject:* Re: [Openstack] Remove Zones code - FFE
>>
>> +1 to Vish's points. I know there are some folks coming online in the
>> Folsom timeline that can help out with the new stuff, but this feels a
>> bit like going backwards.
>>
>> --
>> Joshua McKenty, CEO
>> Piston Cloud Computing, Inc.
>> w: (650) 24-CLOUD
>> m: (650) 283-6846
>>
>> http://www.pistoncloud.com
>>
>>
>> "Oh, Westley, we'll never survive!"
>> "Nonsense. You're only saying that because no one ever has."
>>
>> On Wednesday, February 1, 2012 at 12:41 PM, Vishvananda Ishaya wrote:
>>
>>
>>> I am all for pulling this out, but I'm a bit concerned with the fact
>>> that we have nothing to replace it with. There are some groups still
>>> trying to use it. MercadoLibre is trying to use it, for example. I know
>>> you guys are trying to replace this with so

Re: [Openstack] ZeroMQ RPC Driver - FF-Exception request

2012-01-27 Thread Yun Mao
Sorry to bring back a rather quiet thread from 3 days ago.

How fast does the queueing component need to be? My observation from
Amazon EC2 us-east-1 is about 2 VMs provisioned per second on average.
Let's say there are 100 messages exchanged per VM for that workload
(which I believe is an overestimate), and the peak-time workload is
100x higher. Then we need a queue that can do 20,000 messages per
second at the peak rate. Either Rabbit or 0MQ should handle this very
easily. So I'm assuming performance is not a concern.
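
As a quick sanity check, the back-of-envelope numbers above work out like
this (a minimal Python sketch; every input is an assumption from the
paragraph, not a measured value):

    # Rough sanity check of the estimate above.
    vms_per_second = 2        # observed average EC2 provisioning rate
    msgs_per_vm = 100         # assumed messages exchanged per VM launch
    peak_multiplier = 100     # assumed peak-to-average ratio

    peak_rate = vms_per_second * msgs_per_vm * peak_multiplier
    print(peak_rate)          # 20000 messages per second at peak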

Now if we go completely brokerless for all messages, that's an obvious
gain, as we get rid of one source of failure. Can that be achieved? My
impression after quickly skimming through the 0MQ documentation is
that direct connections can be brokerless, but things more like
broadcast can't be. I may very well be wrong about this, and I would
appreciate it if someone could help explain.
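
To make the distinction concrete, here is a minimal pyzmq sketch (assuming
pyzmq is installed; the endpoints and payload are made up). A 1:1 PUSH/PULL
pair talks directly with no broker, while broadcast-style PUB/SUB still
needs one well-known endpoint that every subscriber connects to:

    import zmq

    ctx = zmq.Context()

    # Direct 1:1 messaging, no broker: the consumer binds, the producer
    # connects straight to it.
    pull = ctx.socket(zmq.PULL)
    pull.bind("tcp://127.0.0.1:5555")
    push = ctx.socket(zmq.PUSH)
    push.connect("tcp://127.0.0.1:5555")
    push.send(b"run_instance")
    print(pull.recv())            # b'run_instance'

    # Broadcast-style messaging: every subscriber has to connect to the same
    # PUB endpoint, so that endpoint acts as a central distribution point.
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5556")
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://127.0.0.1:5556")
    sub.setsockopt(zmq.SUBSCRIBE, b"")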

Thanks,

Yun

On Wed, Jan 25, 2012 at 11:56 AM, Alexis Richardson  wrote:
> Eric
>
> Understood ;-)
>
> I am all in favour of community experimentation.
>
> 1:1 messaging is a core use case for RabbitMQ.  Unlike regular
> queueing systems which use queues for shared topics, Rabbit is
> designed to support very large numbers of short lived queues as well
> as long lived queues.  These can be private or shared.  In other
> words: queues are buffers.  ZeroMQ goes one step further and
> co-locates the consumer with the buffer, and the routing logic with
> the producer.  The cases for which this is useful are discussed on the
> web site.
>
> alexis
>
>
>
> On Wed, Jan 25, 2012 at 4:49 PM, Eric Windisch  wrote:
>> Alexis,
>>
>> It is also obvious that the link I provided is a particularly biased source,
>> so it should be taken with a grain of salt. I only mentioned Qpid because
>> Nova just received a driver for it; I didn't know the differences in such
>> detail.
>>
>> One of the problems Nova has is that it registers N queues for N hosts, with
>> one host pulling from each queue (1:1). This is why ZeroMQ is a good fit,
>> whereby messages can be sent directly to those hosts. There are also a small
>> (but active) number of N to N queues which remain centralized and for which
>> running Rabbit or Qpid is a good fit.
>>
>> It would be an interesting exercise to allow the ZeroMQ driver to defer back to
>> the Kombu or Qpid driver for those messages which must remain centralized.
>>
>> --
>> Eric Windisch
>>
>> On Wednesday, January 25, 2012 at 1:18 AM, Alexis Richardson wrote:
>>
>> On Wed, Jan 25, 2012 at 4:46 AM, Eric Windisch 
>> wrote:
>>
>> Sorry, I had originally sent only to Yun Mao. Sending to list.
>>
>> ---
>>
>> Rather than attempt to answer this, I defer to the ZeroMQ guide. It should
>> be noted that the designers of AMQP, iMatix, designed and built ZeroMQ.
>> (RabbitMQ and Qpid implement AMQP.)
>>
>>
>> Hold on a second there...
>>
>> There has been a LOT of muddle and fud ("fuddle"?) around AMQP. Let
>> me try to clear that up.
>>
>> Qpid's default option is AMQP 0-10. While feature-rich, 0-10 is not
>> widely used and was described by the AMQP chairman as too long and
>> complicated not long after it was published. See also commentary on
>> the web, on the subject of its length. Rabbit does not and will not
>> support this version, and other folks have not implemented it either.
>>
>> WHEREAS --
>>
>> RabbitMQ implements AMQP 0-9-1, a 40-page spec. It's the one most people use.
>>
>> 0-9-1 is the version of AMQP that is used across a very large number
>> of use cases, and it is quite easy to implement. It was created by all
>> the implementers of AMQP that existed at the time of writing, including
>> Rabbit, Redhat, JPMorgan, and of course iMatix. Pieter @iMatix was
>> the spec editor, and did a fantastic job. 0-9-1 provides
>> interoperable messaging as witnessed by the large number (100s) of
>> clients and add-ons that have been created by the community. There
>> have also been several servers implemented, that all just work with
>> those clients. For example Kaazing, StormMQ, and OpenAMQ. I believe
>> that Qpid also supports it, which might be important for this
>> community (Redhat guys please note).
>>
>> This is what Pieter said: "Read AMQP/0.9.1, it is a beautiful, clean,
>> minimalist work essentially created by cooperation in the working
>> group to improve AMQP/0.8. I edited AMQP/0.9.1, based on a hundred or
>> more fixes made by the best individual brains in that group. Alexis is
>> right - 

Re: [Openstack] How to re-create a stack with devstack

2012-01-27 Thread Yun Mao
There is a hack on top of devstack that lets you restart those services
easily across reboots.

https://blueprints.launchpad.net/devstack/+spec/upstart

Yun

On Fri, Jan 27, 2012 at 1:18 AM, nandakumar raghavan
 wrote:
> Hi,
>
> I have a similar query. I had installed OpenStack using devstack on a freshly
> installed stand-alone machine (not a VM). The first time, once stack.sh
> completed, I was able to connect to the dashboard and all the services were
> up and running. Once I rebooted the box, all my settings were gone and I was
> not able to connect to the dashboard, as none of the services were running. I
> had to run stack.sh again before I could connect to the dashboard.
> Is an OpenStack installation done with devstack not persistent across
> reboots? Is running stack.sh again the only solution, or is there any other
> way I can do this?
>
> Thanks in advance.
>
> Regards,
> NandaKumar Raghavan
>
>
> On Fri, Jan 27, 2012 at 5:13 AM, Naveed Massjouni 
> wrote:
>>
>> Awesome authors indeed! Thanks.
>> -Naveed
>>
>> On Thu, Jan 26, 2012 at 6:31 PM, Vishvananda Ishaya
>>  wrote:
>> > looks like the awesome authors of devstack are now handling this for
>> > you:
>> >
>> > https://github.com/openstack-dev/devstack/blob/master/stack.sh#L931
>> >
>> > So the instances are destroyed on the second run.
>> >
>> > Vish
>> >
>> > On Jan 26, 2012, at 3:14 PM, Naveed Massjouni wrote:
>> >
>> > That's easy enough, thanks. Sometimes I forget to delete all my
>> > instances before blowing away screen and running ./stack.sh. Just
>> > curious, what happens to all those vm's? Am I building up an army of
>> > zombie vm's that are taking up resources? Or do they disappear into
>> > the ether?
>> > -Naveed
>> >
>> > On Thu, Jan 26, 2012 at 5:53 PM, Vishvananda Ishaya
>> >  wrote:
>> >
>> > There is another thread on this, but the quick answer is;
>> >
>> > killall screen
>> >
>> > ./stack.sh
>> >
>> >
>> > You should generally make sure that you have terminated all instances
>> > and
>> > deleted all volumes in advance or you could run into issues.  It is
>> > always
>> > safer to start from a clean vm, but the above should work in most cases
>> >
>> >
>> > If you would also like to grab new code:
>> >
>> > killall screen
>> >
>> > cd devstack
>> >
>> > git pull
>> >
>> > RECLONE=yes ./stack.sh
>> >
>> >
>> > Vish
>> >
>> >
>> > On Jan 26, 2012, at 12:58 PM, Naveed Massjouni wrote:
>> >
>> >
>> > I would like to know the proper way to blow away a stack and create a
>> >
>> > fresh stack with devstack. Currently, I hit ctrl-c and ctrl-d a bunch
>> >
>> > of times to close all the windows in the screen session. Then I run
>> >
>> > ./stack.sh again. Is this the best way? Is this documented somewhere?
>> >
>> > Thanks,
>> >
>> > Naveed
>> >
>> >
>> > ___
>> >
>> > Mailing list: https://launchpad.net/~openstack
>> >
>> > Post to     : openstack@lists.launchpad.net
>> >
>> > Unsubscribe : https://launchpad.net/~openstack
>> >
>> > More help   : https://help.launchpad.net/ListHelp
>> >
>> >
>> >
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] How to start/stop/restart services in devstack

2012-01-26 Thread Yun Mao
If you need to restart your services frequently without destroying your
existing data, you might want to take a look at the upstart patch for
devstack.

https://blueprints.launchpad.net/devstack/+spec/upstart

Yun

On Thu, Jan 26, 2012 at 2:30 PM, Joe Smithian  wrote:
> localadmin@k:~$ sudo screen -x
> There is no screen to be attached.
>
> localadmin@k:~$ killall screen
> screen: no process found
>
> Should I re-run stack.sh?
>
>
>
>
>
> On Thu, Jan 26, 2012 at 2:24 PM, Dean Troyer  wrote:
>> On Thu, Jan 26, 2012 at 1:02 PM, Joe Smithian  wrote:
>>> The devstack document doesn't explain how to start/stop services;
>>> maybe it's obvious for the devstack developers but not for a new user
>>> like me!  I can't use commands like "restart nova-api" because they
>>> are not installed.
>>
>> Devstack starts the OpenStack services running in the foreground in a
>> screen session.  Type 'screen -x' to attach to the session; there will
>> be a window for each service plus one shell window.  Stop each
>> service with a Ctrl-C.  Press up-arrow to see the command stack.sh
>> used to start it and execute that to restart the service.
>>
>>> I installed OpenStack using devsatck stack.sh script
>>> (http://devstack.org/) on Ubuntu 11.10. Installation was successful
>>> and I was able to login to Dahsboard; but it doesn't work anymore, I
>>> think after I changed the IP address of the machine and moved it to
>>> another network.
>>> Apache2 is running but the nova and keystone services are not running.
>>
>> If you had already started an instance, Nova probably moved your IP
>> from eth0 to br100.  You would need to manually update the br100
>> configuration.  You might also need to update some other configuration
>> bits (floating IPs, etc) if you changed networks and want to access
>> the instances from off the host.
>>
>> Your best bet here may be to just bite the bullet, 'killall screen', and
>> re-run stack.sh.  Of course this will re-initialize all of the
>> databases and kill running instances.
>>
>> dt
>>
>> --
>>
>> Dean Troyer
>> dtro...@gmail.com
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] ZeroMQ RPC Driver - FF-Exception request

2012-01-24 Thread Yun Mao
Hi I'm curious and unfamiliar with the subject. What's the benefit of
0MQ vs Kombu? Thanks,

Yun

On Tue, Jan 24, 2012 at 7:08 PM, Eric Windisch  wrote:
> Per today's meeting, I am proposing the ZeroMQ RPC driver for a
> feature-freeze exception.
>
> I am making good progress on this blueprint, it adds a new optional module
> and service without modifying any existing code or modules. I have been
> pushing to complete this work by E3, so I am close to completion, but cannot
> finish by tonight's deadline.
>
> The ZeroMQ driver will provide an alternative to Kombu (RabbitMQ) and QPID
> for messaging within Nova. Currently, the code passes unit tests but fails
> on smoketests. I expect to have the code viable for a merge proposal in less
> than a week, tomorrow if I'm smart, lucky, and the store doesn't sell out of
> RedBull. A two week grace would give me a nice buffer.
>
> Thanks,
> Eric Windisch
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] noob question: tests and the master branch

2012-01-05 Thread Yun Mao
I've always thought that whatever is committed to the master branch has
already passed the unit tests by default. But I saw some failed tests
when I checked out the master branch. Is it because I have a bad setting
on my Ubuntu 11.10, or is it not strictly enforced that everything must
pass run_tests.sh before getting in?

For example, here are the errors I have now with run_tests.sh:


==
FAIL: test_instance_set_to_error_on_uncaught_exception
(nova.tests.test_compute.ComputeTestCase)
--
Traceback (most recent call last):
  File "/reserve/maoy/git/nova/nova/tests/test_compute.py", line 840,
in test_instance_set_to_error_on_uncaught_exception
instance_uuid)
  File "/usr/lib/python2.7/unittest/case.py", line 471, in assertRaises
callableObj(*args, **kwargs)
  File "/reserve/maoy/git/nova/nova/exception.py", line 154, in wrapped
payload)
  File "/reserve/maoy/git/nova/nova/notifier/api.py", line 112, in notify
payload = utils.to_primitive(payload, convert_instances=True)
  File "/reserve/maoy/git/nova/nova/utils.py", line 708, in to_primitive
level=level)
  File "/reserve/maoy/git/nova/nova/utils.py", line 702, in to_primitive
level=level))
  File "/reserve/maoy/git/nova/nova/utils.py", line 723, in to_primitive
level=level + 1)
  File "/reserve/maoy/git/nova/nova/utils.py", line 708, in to_primitive
level=level)
  File "/reserve/maoy/git/nova/nova/utils.py", line 723, in to_primitive
level=level + 1)
  File "/reserve/maoy/git/nova/nova/utils.py", line 708, in to_primitive
level=level)
  File "/reserve/maoy/git/nova/nova/utils.py", line 713, in to_primitive
return to_primitive(dict(value.iteritems()),
  File "/usr/lib/python2.7/dist-packages/mox.py", line 985, in __call__
expected_method = self._VerifyMethodCall()
  File "/usr/lib/python2.7/dist-packages/mox.py", line 1032, in
_VerifyMethodCall
expected = self._PopNextMethod()
  File "/usr/lib/python2.7/dist-packages/mox.py", line 1018, in _PopNextMethod
raise UnexpectedMethodCallError(self, None)
UnexpectedMethodCallError: Unexpected method call Stub for >.iteritems() -> None
 >> begin captured logging << 
2012-01-05 15:35:56,944 AUDIT nova.compute.manager
[5cd8ce6d-c9ab-46c6-8a37-f61e2d3f978d fake fake] instance
badcb818-7c26-4c5d-9e6c-7ac19586fd91: starting...
2012-01-05 15:35:57,052 ERROR nova.compute.manager [-] Instance
badcb818-7c26-4c5d-9e6c-7ac19586fd91 failed network setup
(nova.compute.manager): TRACE: Traceback (most recent call last):
(nova.compute.manager): TRACE:   File
"/reserve/maoy/git/nova/nova/compute/manager.py", line 473, in
_allocate_network
(nova.compute.manager): TRACE: requested_networks=requested_networks)
(nova.compute.manager): TRACE:   File
"/usr/lib/python2.7/dist-packages/mox.py", line 993, in __call__
(nova.compute.manager): TRACE: raise expected_method._exception
(nova.compute.manager): TRACE: QuantumServerException
(nova.compute.manager): TRACE:
- >> end captured logging << -

==
FAIL: test_network_is_deallocated_on_spawn_failure
(nova.tests.test_compute.ComputeTestCase)
--
Traceback (most recent call last):
  File "/reserve/maoy/git/nova/nova/tests/test_compute.py", line 862,
in test_network_is_deallocated_on_spawn_failure
instance['uuid'])
  File "/usr/lib/python2.7/unittest/case.py", line 471, in assertRaises
callableObj(*args, **kwargs)
  File "/reserve/maoy/git/nova/nova/exception.py", line 154, in wrapped
payload)
  File "/reserve/maoy/git/nova/nova/notifier/api.py", line 112, in notify
payload = utils.to_primitive(payload, convert_instances=True)
  File "/reserve/maoy/git/nova/nova/utils.py", line 708, in to_primitive
level=level)
  File "/reserve/maoy/git/nova/nova/utils.py", line 702, in to_primitive
level=level))
  File "/reserve/maoy/git/nova/nova/utils.py", line 723, in to_primitive
level=level + 1)
  File "/reserve/maoy/git/nova/nova/utils.py", line 708, in to_primitive
level=level)
  File "/reserve/maoy/git/nova/nova/utils.py", line 713, in to_primitive
return to_primitive(dict(value.iteritems()),
  File "/usr/lib/python2.7/dist-packages/mox.py", line 985, in __call__
expected_method = self._VerifyMethodCall()
  File "/usr/lib/python2.7/dist-packages/mox.py", line 1032, in
_VerifyMethodCall
expected = self._PopNextMethod()
  File "/usr/lib/python2.7/dist-packages/mox.py", line 1018, in _PopNextMethod
raise UnexpectedMethodCallError(self, None)
UnexpectedMethodCallError: Unexpected method call Stub for >.iteritems() -> None
 >> begin captured logging << 
2012-01-05 15:35:59,230 AUDIT nova.compute.manager
[3106077e-df25-411e-b9

[Openstack] [Orchestration] Blueprint for transactional task management

2011-12-21 Thread Yun Mao
Greetings,

I have registered a blueprint for HA task management

https://blueprints.launchpad.net/nova/+spec/task-management

Tasks in Nova such as launching instances are complicated and error-prone.
Currently there is no systematic, reusable way to keep track of
distributed task executions. There is also no mechanism to know
which tasks are currently using which resources. Task management is
implicitly assumed to be VM state management. This blueprint proposes
to build a highly available service offering first-class APIs for task
and resource lock management.
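
To make the idea a bit more concrete, here is a purely hypothetical sketch
of what such an API could look like; none of these names exist in nova, and
the real design (which has to be highly available and persistent) may differ
substantially:

    import uuid

    class TaskManager(object):
        """Toy in-memory illustration; the real thing would be an HA service."""

        def __init__(self):
            self._tasks = {}
            self._locks = {}   # resource id -> task id holding the lock

        def start_task(self, name, resources):
            busy = [r for r in resources if r in self._locks]
            if busy:
                raise RuntimeError('resources already locked: %s' % busy)
            task_id = str(uuid.uuid4())
            for r in resources:
                self._locks[r] = task_id
            self._tasks[task_id] = {'name': name, 'state': 'running',
                                    'resources': list(resources)}
            return task_id

        def finish_task(self, task_id, state='done'):
            task = self._tasks[task_id]
            task['state'] = state
            for r in task['resources']:
                self._locks.pop(r, None)

    # Usage sketch:
    # tm = TaskManager()
    # tid = tm.start_task('run_instance', ['instance-0001'])
    # ... do the work ...
    # tm.finish_task(tid)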

It is very much a work in progress, but I finally have some time to write
it down and start coding on it. Comments are very welcome. Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] [Orchestration] Handling error events ... explicit vs. implicit

2011-12-07 Thread Yun Mao
Hi Sandy,

I'm wondering if it is possible to change the scheduler's rpc cast to an
rpc call. That way, exceptions should be magically propagated back
to the scheduler, right? Naturally, the scheduler can then find another node
to retry on, or decide to give up and report failure. If we need to
provision many instances, we can spawn a few green threads for that.
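
A rough sketch of what that could look like (purely illustrative: the
rpc.call signature and import path are from memory of the nova code of that
era, and the topic format and method name are assumptions):

    import eventlet
    eventlet.monkey_patch()

    from nova import rpc   # assumed import path

    def provision_on_host(context, host, instance_id):
        # call() blocks this green thread until the compute node replies,
        # so an exception raised remotely propagates back to the scheduler,
        # which can then retry on another host or give up.
        topic = 'compute.%s' % host
        return rpc.call(context, topic,
                        {'method': 'run_instance',
                         'args': {'instance_id': instance_id}})

    pool = eventlet.GreenPool(size=10)
    # for host, instance_id in placements:
    #     pool.spawn_n(provision_on_host, ctxt, host, instance_id)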

Yun

On Wed, Dec 7, 2011 at 10:26 AM, Sandy Walsh  wrote:
> For orchestration (and now the scheduler improvements) we need to know when 
> an operation fails ... and specifically, which resource was involved. In the 
> majority of the cases it's an instance_uuid we're looking for, but it could 
> be a security group id or a reservation id.
>
> With most of the compute.manager calls the resource id is the third parameter 
> in the call (after self & context), but there are some oddities. And 
> sometimes we need to know the additional parameters (like a migration id 
> related to an instance uuid). So simply enforcing parameter orders may be 
> insufficient and impossible to enforce programmatically.
>
> A little background:
>
> In nova, exceptions are generally handled in the RPC or middleware layers as 
> a logged event and life goes on. In an attempt to tie this into the 
> notification system, a while ago I added stuff to the wrap_exception 
> decorator. I'm sure you've seen this nightmare scattered around the code:
> @exception.wrap_exception(notifier=notifier, publisher_id=publisher_id())
>
> What started as a simple decorator now takes parameters and the code has 
> become nasty.
>
> But it works ... no matter where the exception was generated, the notifier 
> gets:
> *   compute.<the host it failed on>
> *   <the method that failed>
> *   and whatever arguments the method takes.
>
> So, we know what operation failed and the host it failed on, but someone 
> needs to crack the argument nut to get the goodies. It's a fragile coupling 
> from publisher to receiver.
>
> One less fragile alternative is to put a try/except block inside every 
> top-level nova.compute.manager method and send meaningful exceptions right 
> from the source. More fidelity, but messier code. Although "explicit is 
> better than implicit" keeps ringing in my head.
>
> Or, we make a general event parser that anyone can use ... but again, the 
> link between the actual method and the parser is fragile. The developers have 
> to remember to update both.
>
> Opinions?
>
> -S
>
>
>
>
>
>
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] turn a devstack into a deployment

2011-11-22 Thread Yun Mao
devstack makes setting up a dev environment such a breeze that I'd
rather not go back to packages and manual installation, if possible,
for a not-so-serious deployment environment.

So I wrote the script upstart.sh and a few templates. The basic idea
is that once you like what stack.sh has done to your system, you can
convert the running services into upstart services. The benefits: they run
like daemons instead of in screen, logs are stored in files, and they
start up automatically after a reboot.

It works with glance-*, nova-*, novnc, and keystone, but not with quantum
or swift yet.

put it on github https://github.com/maoy/devstack/tree/upstart

Usage: after you've done ./stack.sh
do: ./upstart.sh install
Now the services are installed. To use them, either reboot (which gets rid
of the services started by stack.sh and automatically starts the upstart
ones), or do killall screen, then ./upstart.sh start

Feedback is welcome. Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] rpms for rhel5.x to install Open stack Object Storage

2011-11-21 Thread Yun Mao
John,

There is OpenStack Object Storage, a.k.a. Swift, and there is also an object
store inside nova called nova-objectstore. The latter is deprecated.
See here:

https://answers.launchpad.net/nova/+question/156113

Yun

On Mon, Nov 21, 2011 at 8:19 AM, John Dickinson  wrote:
> I suspect there is a communication gap somewhere, but this is certainly not 
> the case. Openstack Object Storage (swift) is not deprecated. Glance provides 
> a bridge between nova and swift, but all three are important, active projects.
>
>
> Sudhaker,
>
> I know that rpms exist for swift, but I don't know where they live. (I should 
> find out--anyone know?)
>
> --John
>
>
>
> On Nov 21, 2011, at 6:18 AM, David Busby wrote:
>
>> Also, as I recall, Object Store is deprecated in favour of glance; at least 
>> this was the case in October during the training course.
>>
>> Added cc to openstack@lists.launchpad.net as I forgot in last email.
>>
>> On 21 Nov 2011, at 12:15, David Busby wrote:
>>
>>> HI Sudhakar,
>>>
>>> I do not believe there are any RPM packages being built or maintained for 
>>> 5.x due to the large list of dependencies, one of which is the libvirt 
>>> version required (the exact version escapes me for the moment).
>>>
>>> There are EPEL packages for 6.x in the works (and we would always welcome 
>>> another tester), and there are GridDynamics RPMS already available for 6.x 
>>> I believe.
>>>
>>>
>>> Cheers
>>>
>>> David
>>>
>>>
>>> On 21 Nov 2011, at 11:25, Sudhakar Maiya wrote:
>>>
 Hi,
     Can someone help with the prerequisites to install OpenStack Object 
 Storage on a RHEL system?

 Thanks & Regards
 Sudhakar


 ___
 Mailing list: https://launchpad.net/~openstack
 Post to     : openstack@lists.launchpad.net
 Unsubscribe : https://launchpad.net/~openstack
 More help   : https://help.launchpad.net/ListHelp
>>>
>>>
>>
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Keystone versioning and tarballs

2011-10-28 Thread Yun Mao
The problem looks to be that only users with the roles ['projectmanager',
'sysadmin'] can run instances. The demo user created by devstack only
has the "Member" role. I'm not sure how that maps to the roles described in
http://docs.openstack.org/diablo/openstack-compute/admin/content/users-and-projects.html

After switching to the admin user, it works fine.

Anyway, this keystone vs. old authentication situation is really confusing.
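
For illustration, the effective gate seems to amount to something like the
following (a hypothetical sketch: the role names come from the old nova docs
linked above, and the function and mapping are made up, not the actual EC2
API code):

    # Hypothetical role check gating an EC2 action.
    REQUIRED_ROLES = {'RunInstances': ['projectmanager', 'sysadmin']}

    def is_authorized(action, user_roles):
        allowed = REQUIRED_ROLES.get(action, [])
        return any(role in user_roles for role in allowed)

    print(is_authorized('RunInstances', ['Member']))     # False -> 401 Unauthorized
    print(is_authorized('RunInstances', ['sysadmin']))   # True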

On Thu, Oct 27, 2011 at 10:43 PM, Yun Mao  wrote:
> I think I'm close to figuring this out. You can take a look at the
> devstack scripts. In particular,
> https://github.com/cloudbuilders/devstack/blob/master/files/keystone_data.sh
>
> Then you can source openrc to get the EC2_* environment variables.
>
> However, it only works for euca-describe-instances and
> euca-describe-images, at least for me.
>
> When I tried euca-run-instances, the error is:
> $ euca-run-instances ami-0004
> Warning: failed to parse error message from AWS: :1:0: syntax error
> None: None
>
> The log on the nova-api daemon looks like this:
> 2011-10-27 18:29:22,288 DEBUG nova [-] HTTP PERF: 0.01362 seconds to
> GET 127.0.0.1:35357 /v2.0/tokens/bd9c6abd-eeb4-4ba9-b49e-7aafe790ef9c)
> from (pid=2774) getresponse
> /opt/stack/keystone/keystone/common/bufferedhttp.py:99
> 2011-10-27 18:29:22,301 DEBUG nova [-] HTTP PERF: 0.01282 seconds to
> GET 127.0.0.1:35357 /v2.0/tokens/bd9c6abd-eeb4-4ba9-b49e-7aafe790ef9c)
> from (pid=2774) getresponse
> /opt/stack/keystone/keystone/common/bufferedhttp.py:99
> 2011-10-27 18:29:22,302 DEBUG nova.api [-] action: RunInstances from
> (pid=2774) __call__ /opt/stack/nova/nova/api/ec2/__init__.py:240
> 2011-10-27 18:29:22,302 DEBUG nova.api [-] arg: ImageId         val:
> ami-000 from (pid=2774) __call__
> /opt/stack/nova/nova/api/ec2/__init__.py:242
> 2011-10-27 18:29:22,303 DEBUG nova.api [-] arg: MaxCount
>  val: 1 from (pid=2774) __call__
> /opt/stack/nova/nova/api/ec2/__init__.py:242
> 2011-10-27 18:29:22,303 DEBUG nova.api [-] arg: MinCount
>  val: 1 from (pid=2774) __call__
> /opt/stack/nova/nova/api/ec2/__init__.py:242
> 2011-10-27 18:29:22,303 DEBUG nova.api [-] arg: InstanceType
>  val: m1.small from (pid=2774) __call__
> /opt/stack/nova/nova/api/ec2/__init__.py:242
> 2011-10-27 18:29:22,303 AUDIT nova.api
> [4f056dc4-6515-4bd0-bd09-0c1584b9fc39 demo 2] Unauthorized request for
> controller=CloudController and action=RunInstances
> 2011-10-27 18:29:22,304 INFO nova.api
> [4f056dc4-6515-4bd0-bd09-0c1584b9fc39 demo 2] 0.60822s 127.0.0.1 POST
> /services/Cloud/ CloudController:RunInstances 401 [Boto/2.0 (linux2)]
> application/x-www-form-urlencoded text/plain
>
> Does anyone know what's going on? Thanks,
>
> Yun
>
> On Tue, Oct 25, 2011 at 8:51 AM, David Kranz  wrote:
>> Along the same lines,  how do you export the shell variables for euca-tools
>> with keystone since nova-manage to create the zipfile does not work?
>>
>>  -David
>>
>> On 10/24/2011 8:29 PM, Vishvananda Ishaya wrote:
>>
>> Speaking of keystone diablo tag, it is currently missing the following
>> commit:
>> https://github.com/openstack/keystone/commit/2bb474331d73e7c6d2a507cb097c50cfe65ad6b6
>> This commit is required for the ec2 api to work with keystone.  Seems like
>> we need to move the tag or create a keystone/stable branch and pull this in.
>> Vish
>> On Oct 24, 2011, at 12:03 AM, Mark McLoughlin wrote:
>>
>> Hey,
>>
>> I just noticed a few things when reviewing the Fedora packaging of
>> keystone:
>>
>>  - There's no diablo release tarball on https://launchpad.net/keystone
>>    like other projects
>>
>>  - The 2011.3 tag in git has version=1.0 in setup.py. Which versioning
>>    scheme is keystone going to follow?
>>
>>  - The version in master is non-numeric 'essex' rather than e.g.
>>    2011.3 or 1.1
>>
>> Thanks,
>> Mark.
>>
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>> ___
>> Mailing list: https://launchpad.net/~openstack
>> Post to     : openstack@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~openstack
>> More help   : https://help.launchpad.net/ListHelp
>>
>>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Keystone versioning and tarballs

2011-10-27 Thread Yun Mao
I think I'm close to figuring this out. You can take a look at the
devstack scripts. In particular,
https://github.com/cloudbuilders/devstack/blob/master/files/keystone_data.sh

Then you can source openrc to get the EC2_* environment variables.

However, it only works for euca-describe-instances and
euca-describe-images, at least for me.

When I tried euca-run-instances, the error is:
$ euca-run-instances ami-0004
Warning: failed to parse error message from AWS: :1:0: syntax error
None: None

The log on the nova-api daemon looks like this:
2011-10-27 18:29:22,288 DEBUG nova [-] HTTP PERF: 0.01362 seconds to
GET 127.0.0.1:35357 /v2.0/tokens/bd9c6abd-eeb4-4ba9-b49e-7aafe790ef9c)
from (pid=2774) getresponse
/opt/stack/keystone/keystone/common/bufferedhttp.py:99
2011-10-27 18:29:22,301 DEBUG nova [-] HTTP PERF: 0.01282 seconds to
GET 127.0.0.1:35357 /v2.0/tokens/bd9c6abd-eeb4-4ba9-b49e-7aafe790ef9c)
from (pid=2774) getresponse
/opt/stack/keystone/keystone/common/bufferedhttp.py:99
2011-10-27 18:29:22,302 DEBUG nova.api [-] action: RunInstances from
(pid=2774) __call__ /opt/stack/nova/nova/api/ec2/__init__.py:240
2011-10-27 18:29:22,302 DEBUG nova.api [-] arg: ImageId val:
ami-000 from (pid=2774) __call__
/opt/stack/nova/nova/api/ec2/__init__.py:242
2011-10-27 18:29:22,303 DEBUG nova.api [-] arg: MaxCount
 val: 1 from (pid=2774) __call__
/opt/stack/nova/nova/api/ec2/__init__.py:242
2011-10-27 18:29:22,303 DEBUG nova.api [-] arg: MinCount
 val: 1 from (pid=2774) __call__
/opt/stack/nova/nova/api/ec2/__init__.py:242
2011-10-27 18:29:22,303 DEBUG nova.api [-] arg: InstanceType
 val: m1.small from (pid=2774) __call__
/opt/stack/nova/nova/api/ec2/__init__.py:242
2011-10-27 18:29:22,303 AUDIT nova.api
[4f056dc4-6515-4bd0-bd09-0c1584b9fc39 demo 2] Unauthorized request for
controller=CloudController and action=RunInstances
2011-10-27 18:29:22,304 INFO nova.api
[4f056dc4-6515-4bd0-bd09-0c1584b9fc39 demo 2] 0.60822s 127.0.0.1 POST
/services/Cloud/ CloudController:RunInstances 401 [Boto/2.0 (linux2)]
application/x-www-form-urlencoded text/plain

Does anyone know what's going on? Thanks,

Yun

On Tue, Oct 25, 2011 at 8:51 AM, David Kranz  wrote:
> Along the same lines,  how do you export the shell variables for euca-tools
> with keystone since nova-manage to create the zipfile does not work?
>
>  -David
>
> On 10/24/2011 8:29 PM, Vishvananda Ishaya wrote:
>
> Speaking of keystone diablo tag, it is currently missing the following
> commit:
> https://github.com/openstack/keystone/commit/2bb474331d73e7c6d2a507cb097c50cfe65ad6b6
> This commit is required for the ec2 api to work with keystone.  Seems like
> we need to move the tag or create a keystone/stable branch and pull this in.
> Vish
> On Oct 24, 2011, at 12:03 AM, Mark McLoughlin wrote:
>
> Hey,
>
> I just noticed a few things when reviewing the Fedora packaging of
> keystone:
>
>  - There's no diablo release tarball on https://launchpad.net/keystone
>    like other projects
>
>  - The 2011.3 tag in git has version=1.0 in setup.py. Which versioning
>    scheme is keystone going to follow?
>
>  - The version in master is non-numeric 'essex' rather than e.g.
>    2011.3 or 1.1
>
> Thanks,
> Mark.
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>
> ___
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp
>
>

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] virtio for VM NICs

2011-10-27 Thread Yun Mao
Is there a reason that libvirt_use_virtio_for_bridges is not set to
True by default? Without virtio, the network performance in KVM is
ridiculously slow. Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] high availability deployment

2011-10-24 Thread Yun Mao
Hi stackers,

Is there a document somewhere that talks about the deployment strategy
for high availability? There seem to be a few single points of
failure in the nova architecture -- the controller, which has the API
and the scheduler; the rabbitmq server; and the mysql server.

Google helped me to reach this thread
http://www.mail-archive.com/openstack@lists.launchpad.net/msg03516.html,
which covers the rabbitmq and mysql part, although it appears that
rabbitmq will still lose messages during a failure in that setup. I'm
wondering if someone has tried to make the controller more available
during node failure? Thanks,

Yun

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp