Re: [Openstack] [nova] [cinder] Nova-volume vs. Cinder in Folsom

2012-07-12 Thread Blake Yeager
Wow, I have been out for the evening and I feel like I missed out on one of
the most critical threads I have seen on this mailing list in a while.

First of all, I think everyone needs to take a deep breath and remember
that we are all on the same team and part of the same community working to
make OpenStack successful.  We do need to challenge the community to hold
true to its commitments, but we need to do that in a constructive and
positive way.  A healthy meritocracy will encourage debate and not accept
mediocrity when there are better options to be found.

Reading this thread I think there is a very real criticism around the fact
that not enough emphasis has been placed on forward migration and general
compatibility between OpenStack deployments.  I do not think that the ship
has sailed and I think the Cinder project is a very real example of how we
can reverse that trend and start to take a different approach.  I was in
every session in San Francisco around Cinder and I voted for the separation
of the volume code from Nova.  However, in San Francisco it was decided
that we would maintain nova-volumes for the Folsom release and then
deprecate/remove that code in future releases.  To me this feels like a
very big change to the plan of record.  I understand the desire to tear out
nova-volumes sooner to make development easier but as was pointed out
earlier in this thread it is not practical for people that have active
deployments which currently leverage nova-volumes.

Speaking on behalf of HP, which is currently running an implementation that
relies on the current nova-volumes code, I do not think we should be forced
to cut over to Cinder in order to upgrade to Folsom.

Vish made a great suggestion which outlined the following 4 steps:

> 1) Leave the code in nova-volume for now.
> 2) Document and test a clear migration path to cinder.
> 3) Take the working example upgrade to the operators list and ask them for
> opinions.
> 4) Decide based on their feedback whether it is acceptable to cut the
> nova-volume code out for folsom.

I am skeptical that we will be able to cut the nova-volume code out of
Folsom, but I believe that we should let this process run its course.
Part of the maturation process has to include suggesting aggressive
changes, listening to the feedback from the community (especially
the operators), and then adjusting the plan based on that feedback.  No one
is saying that we can't change and evolve the code, only that it has to be
done according to a pre-defined schedule and that schedule has to
incorporate feedback from the community.

I am excited to see such passion from the community but we need to make
sure that passion is directed in a constructive manner.

-Blake


Re: [Openstack] Configure Rate limits on OS API

2012-01-10 Thread Blake Yeager
On Tue, Dec 27, 2011 at 2:33 PM, Nirmal Ranganathan wrote:

> You can configure those values thru the paste conf.
>
> [filter:ratelimit]
> paste.filter_factory =
> nova.api.openstack.limits:RateLimitingMiddleware.factory
> limits =("POST", "*", ".*", 10, MINUTE);("POST", "*/servers", "^/servers",
> 50, DAY);("PUT", "*", ".*", 10, MINUTE);("GET", "*changes-since*",
> ".*changes-since.*", 3, MINUTE);("DELETE", "*", ".*", 100, MINUTE)
>
>
Am I correct in assuming that this will only work with setting the global
limits?  Is there any way to specify different limits for different accounts
or groups of accounts?

-Blake
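
For anyone skimming the archive, my reading of each positional field in those
limit tuples is (HTTP verb, human-readable URI, URI regex, request count, time
unit), and as far as I can tell that paste file only sets defaults that apply
to every user, which is what the question above assumes. A small illustrative
parser, not taken from Nova itself:

from collections import namedtuple

# Illustrative field names; they follow my reading of the Essex-era
# RateLimitingMiddleware, not an official schema.
Limit = namedtuple('Limit', 'verb uri regex value unit')

def parse_limits(limits_str):
    """Split a paste-style 'limits' value such as
    ("POST", "*", ".*", 10, MINUTE);("PUT", "*", ".*", 10, MINUTE)
    into Limit tuples."""
    parsed = []
    for group in limits_str.split(';'):
        fields = [f.strip().strip('"')
                  for f in group.strip().strip('()').split(',')]
        verb, uri, regex, value, unit = fields
        parsed.append(Limit(verb, uri, regex, int(value), unit))
    return parsed

example = '("POST", "*", ".*", 10, MINUTE);("DELETE", "*", ".*", 100, MINUTE)'
for limit in parse_limits(example):
    print(limit)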


>
> On Mon, Dec 19, 2011 at 1:28 PM, Day, Phil  wrote:
>
>> Hi Folks,
>>
>>
>> Is there a file that can be used to configure the API rate limits for the
>> OS API on a per user basis ?
>>
>>
>> I can see where the default values are set in the code, but it looks as
>> if there should be a less brutal configuration mechanism to go along with
>> this ?
>>
>>
>> Thanks
>>
>> Phil 
>>
>
>
> --
> Nirmal
> 
> http://rnirmal.com
>


Re: [Openstack] swift essex status update

2012-01-12 Thread Blake Yeager
I can't tell you all how excited I am to see temp urls and to hear about
the work that is going on around object versioning.  It is really nice to
see some traction on these much requested features.

Great Job Swift Team!

-Blake

On Wed, Jan 11, 2012 at 10:20 AM, John Dickinson  wrote:

> A quick note to talk about what's been going on with swift since the
> diablo release.
>
> Swift 1.4.2 was the openstack diablo release. Since then, we've had three
> releases. We've done quite a few small bug fixes and some general polish, and
> I'd like to highlight some of the bigger improvements we've made.
>
> First, we've included a new tool in swift called swift-recon. This is a
> combination of scripts and middleware for the object-server, and it
> allows the swift cluster to report on its own health. For example, using
> swift-recon, you can find out the disk utilization in the cluster, socket
> utilization, load stats, async pending stats, replication stats, and
> unmounted disks info. It's a great tool that gives you good insight into
> important metrics in your swift cluster. Florian Hines designed and wrote
> this tool.
>
> On the bug-fixing front, we saw a memory leak error under high load at
> large scale. In short, the Python garbage collector was not always freeing
> memory associated with a socket when a client would disconnect early. This
> would cause the proxy servers to run out of memory after a few days of use.
> Greg Holt spent quite a bit of time finding and fixing this error.
>
> We've also included two new tools for managing production clusters
> (swift-orphans and swift-oldies). These tools are used to find potential
> issues with long-running swift processes. These tools were written by Greg
> Holt.
>
> That brings us to our current release. All of the above-mentioned changes
> are available in swift 1.4.5 (released earlier this week). I'd also like to
> highlight another exciting update that was just merged into swift today and
> will be included in the swift 1.4.6 release: temp urls and form uploading.
>
> With this new feature, you will be able to craft a temporary URL that
> grants a user limited access to your swift account. For example, you can
> craft a URL to your swift cluster that grants PUT access to a particular
> container for the next 30 minutes. You can use this in conjunction with
> HTML forms to directly upload content from a browser into swift (without
> having to proxy the data on your web servers). This feature has been
> requested by many and was written primarily by Greg Holt with input from
> David Goetz and Greg Lange.
>
> We're halfway through the openstack essex release cycle. I'm excited about
> the improvements we've made to swift, and I expect some more exciting
> things to come before our final essex release is made. As always, patches
> welcome!
>
> John Dickinson
> Swift Project Technical Lead
> notmyname on IRC
>
>
>
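
For readers who want to see what the temp URL mechanism described above looks
like in practice, here is a minimal sketch of how such a URL is typically
constructed with the tempurl middleware as the feature eventually shipped; the
cluster hostname, account path, and key below are placeholders:

import hmac
from hashlib import sha1
from time import time

# Placeholders: substitute your own cluster, account/container/object path,
# and the secret configured for the account (e.g. X-Account-Meta-Temp-URL-Key).
key = b'mysecretkey'
method = 'PUT'
expires = int(time() + 30 * 60)          # valid for the next 30 minutes
path = '/v1/AUTH_demo/container/object'

# The signature covers the method, the expiry time, and the object path.
body = '%s\n%s\n%s' % (method, expires, path)
signature = hmac.new(key, body.encode('utf-8'), sha1).hexdigest()

url = ('https://swift.example.com%s?temp_url_sig=%s&temp_url_expires=%s'
       % (path, signature, expires))
print(url)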


Re: [Openstack] Is there a security issue with qcow2 images ?

2012-01-25 Thread Blake Yeager
Just FYI for the mailing list, we found the bug and the resolution here:

https://bugs.launchpad.net/nova/+bug/853330

Cheers,
-Blake

On Wed, Jan 25, 2012 at 12:11 PM, Day, Phil  wrote:

> Hi Folks,
>
>
> I have a half remembered conversation from the Boston summit where someone
> said that there was a security issue in using qcow2 as a format for
> creating snapshot images.  
>
>
> Does that ring any bells with anyone, and if so can you expand on the
> potential issue please ?
>
>
> I think it was something to do with why base images have to be expanded
> after download, but I might be wrong on that.   I’m particularly interested
> in using qcow2 as an upload format for snapshots.
>
>
> Thanks
>
> Phil
>
>


Re: [Openstack] [Scaling][Orchestration] Zone changes. WAS: [Question #185840]: Multi-Zone finally working on ESSEX but cant "nova list" (KeyError: 'uuid') + doubts

2012-01-26 Thread Blake Yeager
Sandy,

I am excited to hear about the work that is going on around communication
between trusted zones and look forward to seeing what you have created.

In general, the scalability of Nova is an area where I think we need to put
additional emphasis.  Rackspace has done a lot of work on zones, but that work
doesn't seem to be receiving a lot of support from the rest of the community.

The OpenStack mission statement reads: "To produce the ubiquitous Open Source
cloud computing platform that will meet the needs of public and private cloud
providers regardless of size, by being simple to implement and massively
scalable."

I would challenge the community to ensure that scale is being given the
appropriate focus in upcoming releases, especially in Nova.  Perhaps we need
to start by setting very specific scale targets for a single Nova zone in
terms of nodes, instances, volumes, etc.  I did a quick search of the wiki
but I didn't find anything about scale targets.  Does anyone know if
something exists and I am just missing it?  Obviously scale will depend a
lot on your specific hardware and configuration, but we could start by
saying: with this minimum hardware spec and this configuration, we want to be
able to hit this scale.  Likewise it would be nice to publish some
statistics about the scale that we believe a given release can operate at
safely.  This would tie into some of the QA/Testing work that Jay & team
are working on.

Does anyone have other thoughts about how we ensure we are all working
toward building a massively scalable system?

-Blake

On Thu, Jan 26, 2012 at 9:20 AM, Sandy Walsh wrote:

> Zones is going through some radical changes currently.
>
> Specifically, we're planning to use direct Rabbit-to-Rabbit communication
> between trusted Zones to avoid the complication of changes to OS API,
> Keystone and novaclient.
>
> To the user deploying Nova not much will change, there may be a new
> service to deploy (a Zones service), but that would be all. To a developer,
> the code in OS API will greatly simplify and the Distributed Scheduler will
> be able to focus on single zone scheduling (vs doing both zone and host
> scheduling as it does today).
>
> We'll have more details soon, but we aren't planning on introducing the
> new stuff until we have a working replacement in place. The default Essex
> Scheduler now will largely be the same and the filters/weight functions
> will still carry forward, so any investments there won't be lost.
>
> Stay tuned, we're hoping to get all this in a new blueprint soon.
>
> Hope it helps,
> Sandy
>
> 
> From: boun...@canonical.com [boun...@canonical.com] on behalf of
> Alejandro Comisario [question185...@answers.launchpad.net]
> Sent: Thursday, January 26, 2012 8:50 AM
> To: Sandy Walsh
> Subject: Re: [Question #185840]: Multi-Zone finally working on ESSEX but
> cant   "nova list" (KeyError: 'uuid') + doubts
>
> Question #185840 on OpenStack Compute (nova) changed:
> https://answers.launchpad.net/nova/+question/185840
>
>Status: Answered => Open
>
> Alejandro Comisario is still having a problem:
> Sandy, Vish !
>
> Thanks for the replies ! let me get to the relevant points.
>
> #1 I totally agree with you guys, the policy for spawning instances
> maybe very special of each company strategy, but, as you can pass from
> "Fill First" to "Spread First" just adding a "reverse=True" on
> nova.scheduler.least_cost.weighted_sum" and
> "nova.scheduler.distributed_scheduler._schedule" maybe its a harmless
> addition to manipulate (since we are going to have a lot of zones across
> datacenters, and many different departments are going to create many
> instances to load-balance their applications, we really preffer
> SpreadFirst to make sure hight availability of the pools )
>
> #2 As we are going to test essex-3, i would like if you can tell me if
> the zones code from Chris Behrens is going to be added on Final Essex /
> Milestone 4, so we can keep testing other features, or you preffer us to
> load this as a bug to be fixed since maybe the code that broke is not
> going to have major changes.
>
> Kindest regards !
>
> --
> You received this question notification because you are a member of Nova
> Core, which is an answer contact for OpenStack Compute (nova).
>


Re: [Openstack] [Scaling][Orchestration] Zone changes. WAS: [Question #185840]: Multi-Zone finally working on ESSEX but cant "nova list" (KeyError: 'uuid') + doubts

2012-01-26 Thread Blake Yeager
Sandy,

This is exactly what I was looking for, thanks for sending it along.  I am
glad to see that you all are thinking about this and have some targets in
mind.

This weekend I will pull your numbers into a wiki page so we can begin to
expand and elaborate on other scale metrics around Nova.  In general I
think they look like reasonable numbers although the number of hosts per
zone looks a little low to me and the response time of 3 seconds seems a
little generous ;-)

Thanks again!

-Blake


On Thu, Jan 26, 2012 at 10:40 AM, Sandy Walsh wrote:

>  Thanks Blake ... all very valid points.
>
> Based on our discussions yesterday (the ink is still wet on the
> whiteboard) we've been kicking around numbers in the following ranges:
>
> 500-1000 hosts per zone (zone = single nova deployment. 1 db, 1 rabbit)
> 25-100 instances per host (minimum flavor)
> 3s api response time fully loaded (over that would be considered a
> failure). 'nova list' being the command that can bring down the house. But
> also 'nova boot' is another concern. We're always trying to get more async
> operations in there.
>
> Hosts per zone is a tricky one because we run into so many issues around
> network architecture, so your mileage may vary. Network is the limiting
> factor in this regard.
>
> All of our design decisions are being made with these metrics in mind.
>
> That said, we'd love to get more feedback on realistic metric expectations
> to ensure we're in the right church.
>
> Hope this is what you're looking for?
>
> -S
>
>
>  --
> From: Blake Yeager [blake.yea...@gmail.com]
> Sent: Thursday, January 26, 2012 12:13 PM
> To: Sandy Walsh
> Cc: openstack@lists.launchpad.net
> Subject: Re: [Openstack] [Scaling][Orchestration] Zone changes. WAS:
> [Question #185840]: Multi-Zone finally working on ESSEX but cant "nova
> list" (KeyError: 'uuid') + doubts
>
>  Sandy,
>
>  I am excited to hear about the work that is going on around
> communication between trusted zones and look forward to seeing what you
> have created.
>
>  In general, the scalability of Nova is an area where I think we need to
> put additional emphasis.  Rackspace has done a lot of work on zones, but
> they don't seem to be receiving a lot of support from the rest of the
> community.
>
>  The OpenStack mission statement reads: "To produce the
> ubiquitous Open Source cloud computing platform that
> will meet the needs of public and private cloud providers regardless of
> size, by being simple to implement and massively scalable."
>
>  I would challenge the community to ensure that scale is being given the
> appropriate focus in upcoming releases, especially Nova.  Perhaps we need
> to start by setting very specific scale targets for a single Nova zone in
> terms of nodes, instances, volumes, etc.  I did a quick search of the wiki
> but I didn't find anything about scale targets.  Does anyone know if
> something exists and I am just missing it?  Obviously scale will depend a
> lot on your specific hardware and configuration but we could start by
> saying with this minimum hardware spec and this configuration we want to be
> able to hit this scale.  Likewise it would be nice to publish some
> statistics about the scale that we believe a given release can operate at
> safely.  This would tie into some of the QA/Testing work that Jay & team
> are working on.
>
>  Does anyone have other thoughts about how we ensure we are all working
> toward building a massively scalable system?
>
>  -Blake
>
>  On Thu, Jan 26, 2012 at 9:20 AM, Sandy Walsh 
> wrote:
>
>> Zones is going through some radical changes currently.
>>
>> Specifically, we're planning to use direct Rabbit-to-Rabbit communication
>> between trusted Zones to avoid the complication of changes to OS API,
>> Keystone and novaclient.
>>
>> To the user deploying Nova not much will change, there may be a new
>> service to deploy (a Zones service), but that would be all. To a developer,
>> the code in OS API will greatly simplify and the Distributed Scheduler will
>> be able to focus on single zone scheduling (vs doing both zone and host
>> scheduling as it does today).
>>
>> We'll have more details soon, but we aren't planning on introducing the
>> new stuff until we have a working replacement in place. The default Essex
>> Scheduler now will largely be the same and the filters/weight functions
>> will still carry forward, so any investments there won't be lost.
>>
> Stay tuned, we're hoping to get all this in a new blueprint soon.

Re: [Openstack] quota question

2012-07-23 Thread Blake Yeager
>
> 
>
> (BTW, I'd like to point out the Boson proposal and thread…)
>

Good point, we also need to think through how distributed quotas will work
across multiple cells.  I can see a lot of overlap between these two use
cases: I want to limit users to a specific quota for a specific flavor and
I want to limit users to a specific quota for a given cell - both of which
would be independent of a user's overall quota.

IMHO, we need to address both of these use cases at the same time.
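
Purely as an illustration of what "scoped" quotas could mean (this is not the
Boson design or an existing Nova schema), a limit record that optionally
carries a flavor or cell scope would let both of these cases coexist with a
tenant's overall quota:

from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class QuotaLimit:
    """Illustrative only -- not an actual Nova/Boson data model."""
    project_id: str
    resource: str                     # e.g. 'instances', 'cores', 'ram_mb'
    limit: int
    flavor_id: Optional[str] = None   # present => limit applies to one flavor
    cell_name: Optional[str] = None   # present => limit applies to one cell

def applicable_limits(limits: Iterable[QuotaLimit], project_id: str,
                      flavor_id: Optional[str] = None,
                      cell_name: Optional[str] = None):
    """Yield every limit that constrains this request; the caller enforces
    the most restrictive one.  An unscoped row acts as the overall quota."""
    for quota in limits:
        if quota.project_id != project_id:
            continue
        if quota.flavor_id is not None and quota.flavor_id != flavor_id:
            continue
        if quota.cell_name is not None and quota.cell_name != cell_name:
            continue
        yield quota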

-Blake


Re: [Openstack] Large image snapshots

2012-07-26 Thread Blake Yeager
On Thu, Jul 26, 2012 at 8:06 AM, David Kranz  wrote:

> I am a bit ignorant about image formats and such. The size of the Ubuntu
> precise cloud image at
> http://uec-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img
> is about 221MB. If I boot that image with flavor m1.tiny and use
> image-create I get an image that is 2GB. If I do the same with flavor
> m1.large the resulting image is 10GB.  Is there a way to create snapshots
> that don't result in huge images?
>
>  -David
>
>
David,

This has to do with the disk format that you are using (for example raw vs.
qcow2).  Unless you are using a sparse disk format the snapshot will always
be the total size of the root partition. By default when Nova creates an
image it expands the root partition to 10GB so when you create a snapshot
of that partition it will also be 10GB.  By switching the disk format you
are using to a sparse format you should be able to avoid this behavior.
 The m1.tiny flavor is treated as a special case by Nova so that is why the
root partition and the corresponding snapshot are only 2GB.
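
A quick way to see the difference is to compare an image's virtual size with
the space it actually consumes on disk; here is a sketch using qemu-img's JSON
output (treat the field names as my recollection rather than gospel):

import json
import subprocess

def image_sizes(path):
    """Return (format, virtual size, bytes actually used) for a disk image.

    A qcow2 snapshot of a mostly empty 10GB root disk should report a large
    virtual-size but a much smaller actual-size; a raw snapshot consumes the
    full virtual size.
    """
    info = json.loads(subprocess.check_output(
        ['qemu-img', 'info', '--output=json', path]))
    return info['format'], info['virtual-size'], info['actual-size']

fmt, virtual, actual = image_sizes('snapshot.img')
print('%s image: %d bytes virtual, %d bytes used on disk'
      % (fmt, virtual, actual))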

-Blake


Re: [Openstack] [openstack-dev] Nova PTL candidacy

2012-09-06 Thread Blake Yeager
He also lives vicariously through himself.

On Thu, Sep 6, 2012 at 10:12 AM, Ravi Jagannathan wrote:

> +1 . Plus he has a cool name.
>
>
> On Thu, Sep 6, 2012 at 10:58 AM, Sam Su  wrote:
>
>> +1
>>
>>
>> On Wed, Sep 5, 2012 at 12:19 AM, Michael Still <
>> michael.st...@canonical.com> wrote:
>>
>>> On 09/05/2012 06:03 AM, Matt Joyce wrote:
>>> > Vish is also a pretty cool guy and doesn't afraid of anything.
>>>
>>> Vish does a great job -- many hours a day of code review and mentoring,
>>> puts up with criticism much more calmly than I think many would, and is
>>> a pleasure to work with.
>>>
>>> Mikal
>>>
>>>


Re: [Openstack] Suspend and locking instances

2010-12-14 Thread Blake Yeager
Looking at it from a service provider point of view, I think there are two
separate reasons to lock the state of an instance.

First is the issue of a legal violation on behalf of the customer.  Either they 
have not paid their bill and are in violation of their contract or they have 
been using the resource to perform illegal activity or activity that is 
expressly prohibited in the service provider's Acceptable Use Policy.  In
this case the service provider needs to be able to shut down the instance and  
prevent the customer from deleting the instance or restarting it.  (Deleting it 
could erase data that the service provider is legally required to maintain in 
the case of illegal activity.)  This action should only be performed via the
admin API and needs to be executable regardless of the state of the instance or 
any "state_lock" field.

The second use case is where a customer may want to lock an instance to prevent 
themselves from accidentally deleting it or shutting it down via the API.  This 
would prevent a broken script or program from running away and deleting 
important servers in a customer's account.  They would probably enable this 
lock on instances that contained critical infrastructure that they were 
expecting to keep running for a long time or didn't want to be accidentally 
deleted.  I could even envision an extreme setup where a customer is not able 
to take a server out of this locked state without interacting with the service 
provider's support team who could unlock it via the admin API.  Even when this
lock is in place the service provider needs to be able to override this lock 
and shut down and "boot_lock" the instance in the case of a legal violation
mentioned above.

I don't have much of an opinion as to whether it should be implemented using the
boot_lock inside the hypervisor or in the Nova code but I think it should be 
able to support both the use cases above.

-Blake


From: Matt Dietz <matt.di...@rackspace.com>
Date: Tue, 14 Dec 2010 22:10:48 +
To: "openstack@lists.launchpad.net" 
mailto:openstack@lists.launchpad.net>>
Subject: Re: [Openstack] Suspend and locking instances

Trey and I were discussing this earlier, and the suggestion below is what we 
came up with.

Regarding the boot lock functionality, this is a feature that Rackspace 
specifically requires. However, we don't want to implement it in a way that's 
restrictive or business specific, so we're proposing the idea of the state lock 
instead. As I understand it, a similar feature has been requested that 
basically prevents any actions from accidentally taking place against a given 
instance.

The idea would be to add an extra field to the Instance model that's a simple 
boolean, and said boolean could be checked by any code attempting to change 
state. This leads into a secondary discussion of possible state machine 
implementations. I'd prefer to not force anyone implementing a new instance 
action to handle the state_lock field manually, as it would probably result in 
the same 3 or 4 lines of code copy-pasted everywhere. However, I'm largely 
unaware of what the "good" solutions are to this problem. My knee-jerk answer 
is a decorator for state-changing methods with a set of "black-list" states that
are automatically rejected by the decorator.
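
A minimal sketch of that decorator idea, with the state_lock field, the
exception type, and the admin bypass all as illustrative placeholders rather
than the eventual Nova implementation:

import functools

class InstanceStateLocked(Exception):
    """Raised when a state-changing action hits a locked instance."""

def rejects_locked_instances(func):
    @functools.wraps(func)
    def wrapper(self, context, instance, *args, **kwargs):
        # state_lock is the hypothetical boolean column on the Instance model;
        # an admin context bypasses it so the provider can still intervene.
        if instance.get('state_lock') and not getattr(context, 'is_admin', False):
            raise InstanceStateLocked(instance.get('id'))
        return func(self, context, instance, *args, **kwargs)
    return wrapper

# Usage: decorate every compute action that changes state, so the check lives
# in one place instead of being copy-pasted into each method.
class ComputeAPI(object):
    @rejects_locked_instances
    def delete(self, context, instance):
        print('deleting %s' % instance['id'])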

From: Trey Morris <trey.mor...@rackspace.com>
Date: Tue, 14 Dec 2010 15:04:17 -0600
To: <openstack@lists.launchpad.net>
Subject: [Openstack] Suspend and locking instances

Regarding https://blueprints.launchpad.net/nova/+spec/xs-suspend

We need suspend as a power state (ram goes to disk, instance is shut down) as 
well as a locking suspend (ram goes to disk, instance is shut down, instance is 
no longer customer bootable) which make sense from a billing point of view. The 
issue is where this lock is going to be. For example, in Xenserver we can set 
whether an instance is bootable, but I'm unsure if this feature is supported in 
other hypervisors. Even if it were supported across the hypervisor board this 
would be a specific case of lock (locking the instance in a shutdown/suspend 
type state). Instead, I propose a state lock. When an instance is state locked, 
no functions which enact a change in state may be executed. State lock would be 
above the hypervisor level and require storage in a table. Being above the 
hypervisor level gives us the advantage of not having to implement boot lock 
functionality in each hypervisor api and we get a more general lock feature 
which can be used in more situations. I don't like the idea of all the compute 
worker functions having to check for state lock status because it's ugly as 
well as code duplication, but I'm not sure of a better way at the moment.

I'm looking for suggestions/comments on the general state lock idea vs 
hypervisor boot lock vs implementing instance locking at all within nova, as 
well as ideas for a clean implementation of a state lock.

Re: [Openstack] Physical host identification

2011-07-18 Thread Blake Yeager
The original purpose behind sharing the HostID with a customer was to allow
a customer to identify situations where they had two VMs placed on the same
host.  This knowledge is critical if a customer is trying to set up
two VMs in an HA configuration.  Because this use case does not require the
HostID to be globally unique, the HostID was hashed with the customer ID to
create a customer-specific identifier for each host.  (This is how Rackspace
Cloud Servers works today.)
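
As an illustration of that per-customer hashing (the hash function and field
order here are assumptions, not necessarily what Nova ships):

import hashlib

def tenant_scoped_host_id(project_id, host):
    # Hashing hostname + project id: the same tenant sees identical hostIDs
    # for VMs sharing a host, while different tenants cannot correlate hosts.
    return hashlib.sha224(
        (str(project_id) + str(host)).encode('utf-8')).hexdigest()

# Two instances owned by the same project on the same host collide:
assert tenant_scoped_host_id('proj-a', 'compute-17') == \
       tenant_scoped_host_id('proj-a', 'compute-17')
# The same physical host looks different to another project:
assert tenant_scoped_host_id('proj-b', 'compute-17') != \
       tenant_scoped_host_id('proj-a', 'compute-17')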

The two arguments that I have heard for not making HostIDs globally unique
are: one, it would allow customers (or partners) to figure out exactly how
many hosts a given cloud is running and two, it would potentially allow
customers to figure out if their VM is on the same host as another
customer's VM.  The first concern is specific to the cloud operator; some
operators may feel that the size of their cloud is material information that
needs to be protected, while other operators may go as far as to publish
those types of stats.  The second concern is around the potential for a
customer to either a) impact the performance of another customer's VM by
excessively taxing the host machine or b) if there was a hypervisor exploit
it would enable a customer to more easily target another customer's VM.  For
example, if you found out the HostID of your target VM you could then keep
spinning up instances until you got one on the same host, then attack.

I strongly feel that HostID should be unique per project (or tenant) and if
it isn't that way right now I think we should look at changing it.

Of course this doesn't address Glen's original question, which is around the
need for a cloud admin to have a globally unique host identifier for
targeting operations on a specific host.  That type of concept is probably
worth building in, but in my opinion should be separate from HostID (which
should be reserved for determining VM placement collisions).

-Blake

On Sat, Jul 16, 2011 at 10:47 AM, Jorge Williams <
jorge.willi...@rackspace.com> wrote:

> Right so we should really be hashing this with the tenant ID as well.
>
> -jOrGe W.
>
> On Jul 15, 2011, at 6:16 PM, Chris Behrens wrote:
>
> > I think it's sensitive because one could figure out how many hosts a SP
> has globally... which a SP might not necessarily want to reveal.
> >
> > - Chris
> >
> >
> > On Jul 15, 2011, at 3:34 PM, karim.allah.ah...@gmail.com wrote:
> >
> >> On Fri, Jul 15, 2011 at 11:31 PM, Chris Behrens <
> chris.behr...@rackspace.com> wrote:
> >> Nevermind.  Just found a comment in the API spec that says "hostID" is
> unique per account, not globally.  Hmmm...
> >>
> >> This is weird ! I can't find anything in the code that says so !! hostID
> is just a hashed version of the 'host' which is set as the 'hostname' of the
> physical machine and this isn't user sensitive. So, It's supposed to be a
> global thing !
> >>
> >> Can somebody explain how this is a user sensitive ?
> >>
> >>
> >>
> >> On Jul 15, 2011, at 2:27 PM, Chris Behrens wrote:
> >>
> >>> I see the v1.1 API spec talks about a 'hostId' item returned when you
> list your instances (section 4.1.1 in the spec).  These should be the same
> thing, IMO.
> >>>
> >>> I think you're right, though.  I don't believe we have any sort of
> 'hostId' today, since hosts just become available by attaching to AMQP.
> >>>
> >>> - Chris
> >>>
> >>> On Jul 15, 2011, at 1:16 PM, Glen Campbell wrote:
> >>>
>  I understand that we're all familiar with virtualization and its
> benefits. However, in the Real World, those of us who run clouds often need
> to work with physical devices. I've proposed a blueprint and spec for a
> /hosts admin API resource that would return information on physical hosts.
> However, I don't believe that there's any way for us to actually identify a
> specific server (I'm actually hoping I'm mistaken about this, because that
> would make my life easier).
> 
>  So, to get information about a specific host, you'd use /host/{id} —
> but what should go in the {id} slot?
> 
>  We'd also like to include this data elsewhere; for example, in error
> messages, it might help to know the physical device on which a server is
> created.
> 