Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

2014-10-17 Thread Joe Cropper
I’m glad to see this topic getting some focus once again.  :-)

From several of the administrators I talk with, when they think of putting a 
host into maintenance mode, the common requests I hear are:

1. Don’t schedule more VMs to the host
2. Provide an optional way to automatically migrate all (usually active) VMs 
off the host so that users’ workloads remain “unaffected” by the maintenance 
operation

#1 can easily be achieved, as has been mentioned several times, by simply 
disabling the compute service.  However, #2 involves a little more work, 
although certainly possible using all the operations provided by nova today 
(e.g., live migration, etc.).  I believe these types of discussions have come 
up several times over the past several OpenStack releases—certainly since 
Grizzly (i.e., when I started watching this space).

It seems that the general direction is to have the type of workflow needed for 
#2 outside of nova (which is certainly a valid stance).  To that end, it would 
be fairly straightforward to build some code that logically sits on top of 
nova that, when entering maintenance:

1. Prevents VMs from being scheduled to the host;
2. Maintains state about the maintenance operation (e.g., not in maintenance, 
migrations in progress, in maintenance, or error);
3. Provides mechanisms to, upon entering maintenance, dictate which VMs 
(active, all, none) to migrate and provides some throttling capabilities to 
prevent hundreds of parallel migrations on densely packed hosts (all done via a 
REST API; a rough sketch of such a wrapper follows below).
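
To make the shape of that concrete, here is a minimal sketch using 
python-novaclient; the client arguments, the throttle value, and the complete 
lack of error handling are illustrative assumptions, not a finished 
implementation:

    # Minimal sketch of a maintenance wrapper sitting on top of nova.
    # Assumes python-novaclient; names like enter_maintenance() and
    # MAX_PARALLEL are made up for illustration.
    import time

    from novaclient import client

    MAX_PARALLEL = 2  # throttle: live migrations to run at once

    def enter_maintenance(nova, host):
        # 1. stop scheduling new VMs to the host
        nova.services.disable(host, 'nova-compute')

        # 2. collect the active instances currently on the host
        servers = nova.servers.list(
            search_opts={'host': host, 'all_tenants': 1,
                         'status': 'ACTIVE'})

        # 3. live-migrate them off, a few at a time
        pending = list(servers)
        while pending:
            batch, pending = pending[:MAX_PARALLEL], pending[MAX_PARALLEL:]
            for server in batch:
                # let the scheduler pick the target host
                server.live_migrate(host=None, block_migration=False,
                                    disk_over_commit=False)
            # crude throttle; a real version would poll migration status
            time.sleep(30)

    nova = client.Client('2', 'admin', 'password', 'admin',
                         auth_url='http://keystone:5000/v2.0')
    enter_maintenance(nova, 'compute-01')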

If anyone has additional questions, comments, or would like to discuss some 
options, please let me know.  If interested, upon request, I could even share a 
video of how such cases might work.  :-)  My colleagues and I have given these 
use cases a lot of thought and consideration and I’d love to talk more about 
them (perhaps a small session in Paris would be possible).

- Joe

On Oct 17, 2014, at 4:18 AM, John Garbutt  wrote:

> On 17 October 2014 02:28, Matt Riedemann  wrote:
>> 
>> 
>> On 10/16/2014 7:26 PM, Christopher Aedo wrote:
>>> 
>>> On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov
>>>  wrote:
> 
>>>> On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum  wrote:
>>>>
>>>> The idea is not simply deny or hang requests from clients, but provide
>>>> them
>>>> "we are in maintenance mode, retry in X seconds"
>>>>
>>>>> You probably would want 'nova host-servers-migrate '
>>>>
>>>> yeah for migrations - but as far as I understand, it doesn't help with
>>>> disabling this host in scheduler - there can be a chance that some
>>>> workloads will be scheduled to the host.
>>> 
>>> 
>>> Regarding putting a compute host in maintenance mode using "nova
>>> host-update --maintenance enable", it looks like the blueprint and
>>> associated commits were abandoned a year and a half ago:
>>> https://blueprints.launchpad.net/nova/+spec/host-maintenance
>>> 
>>> It seems that "nova service-disable  nova-compute" effectively
>>> prevents the scheduler from trying to send new work there.  Is this
>>> the best approach to use right now if you want to pull a compute host
>>> out of an environment before migrating VMs off?
>>> 
>>> I agree with Tim and Mike that having something respond "down for
>>> maintenance" rather than ignore or hang would be really valuable.  But
>>> it also looks like that hasn't gotten much traction in the past -
>>> anyone feel like they'd be in support of reviving the notion of
>>> "maintenance mode"?
>>> 
>>> -Christopher
>>> 
>>> ___
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>> 
>> 
>> host-maintenance-mode is definitely a thing in nova compute via the os-hosts
>> API extension and the --maintenance parameter, the compute manager code is
>> here [1].  The thing is the only in-tree virt driver that implements it is
>> xenapi, and I believe when you put the host in maintenance mode it's
>> supposed to automatically evacuate the instances to some other host, but you
>> can't target the other host or tell the driver, from the API, which
>> instances you want to evacuate, e.g. all, none, running only, etc.
>> 
>> [1]
>> http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990
> 
> We should certainly make that more generic. It doesn't update the VM
> state, so it's really only admin focused in its current form.
> 
> The XenAPI logic only works when using XenServer pools with shared NFS
> storage, if my memory serves me correctly. Honestly, its a bit of code
> I have planned on removing, along with the rest of the pool support.
> 
> In terms of requiring DB downtime in Nova, the current efforts are
> focusing on avoiding downtime all together, via expand/contract style
> migrations, with a little help from objects to avoid data migrations.
> 
> That doesn't mean maintenance mode is not useful for other things,
> 

Re: [openstack-dev] DB Datasets CI broken

2014-10-17 Thread Michael Still
On Sat, Oct 18, 2014 at 11:02 AM, Jeremy Stanley  wrote:
> On 2014-10-18 10:45:23 +1100 (+1100), Michael Still wrote:
> [...]
>> Is it possible to add a verification step to nodepool so that it
>> doesn't mark a new image as ready unless it passes some basic sanity
>> checks?
>
> Back in the beforetime, when devstack-gate had scripts which managed
> the worker pool as scheduled Jenkins jobs, it would run DevStack
> exercises on a test boot of the new image before using it to boot
> real images. Of course you can imagine the number of perfectly good
> images which were thrown away because of nondeterministic bugs
> causing false negative results there, so we probably wouldn't want
> to duplicate that exactly, but perhaps something more lightweight
> would be a reasonable compromise.
>
> Anyway, I consider it a good feature request (others may disagree),
> just nobody's reimplemented it in nodepool to date.

Yeah, I'm starting to think along the lines of adding a simple sanity
check to the shell worker in turbo hipster before the real tests run.
Things like checking if the git directory exists, and contains a git
repo with the branches we need. We could run that pre-flight script (or a 
variant of it) on images before marking them as ready.
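
As a rough illustration of what that pre-flight script could look like (the 
repo paths and branch names below are made up for the example):

    # Illustrative pre-flight check for a freshly built image: verify the
    # cached git repos exist and contain the branches the tests will need.
    import os
    import subprocess
    import sys

    REQUIRED = {
        '/opt/git/openstack/nova': ['master', 'stable/icehouse'],
    }

    def branch_exists(repo, branch):
        with open(os.devnull, 'w') as devnull:
            return subprocess.call(
                ['git', 'rev-parse', '--verify', '--quiet', branch],
                cwd=repo, stdout=devnull) == 0

    def main():
        for repo, branches in REQUIRED.items():
            if not os.path.isdir(os.path.join(repo, '.git')):
                sys.exit('%s is missing or not a git repo' % repo)
            for branch in branches:
                if not branch_exists(repo, branch):
                    sys.exit('%s is missing branch %s' % (repo, branch))
        print('image looks sane')

    if __name__ == '__main__':
        main()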

For reference, what we think happened here is that the cache of SQL
databases baked into the image was rsynced from our master while
jhesketh was in the process of updating the SQL databases to a more
recent version of OpenStack.

Michael

-- 
Rackspace Australia

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [api] Request Validation - Stoplight

2014-10-17 Thread Sam Harwell
Hi Amit,

Keeping in mind this viewpoint is nothing but my own personal view, my 
recommendation would be to not mandate the use of a particular validation 
framework, but to instead define what kind of validation clients should expect 
the server to perform in general. For example, I would expect a service to 
return an error code and not perform any action if I called "Create server" but 
did not include a request body, but the actual manner in which that error is 
generated within the service does not matter from the client's perspective.
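
To put that expectation in concrete terms, the client-visible contract might 
look roughly like this (a framework-neutral sketch with made-up names, 
deliberately not tied to any particular validation library):

    # Framework-neutral sketch: reject a "create server" call with no body
    # before doing any work; how the service validates internally is an
    # implementation detail the client never sees.
    import json

    def create_server(raw_body):
        if not raw_body:
            return 400, {'badRequest': {'message': 'body is required'}}
        try:
            body = json.loads(raw_body)
        except ValueError:
            return 400, {'badRequest': {'message': 'malformed JSON body'}}
        if 'server' not in body:
            return 400, {'badRequest': {'message': "missing 'server'"}}
        # ... only now perform the actual create ...
        return 202, {'server': {'status': 'BUILD'}}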

This is not to say the API Working Group wouldn't help you evaluate the 
potential of Stoplight to meet the needs of a service. To the contrary, by 
clearly defining the expectations of a service's responses to requests, you'll 
have a great idea of exactly what to look for in your evaluation, and your 
final decision would be based on objective results.

Thank you,
Sam Harwell

From: Amit Gandhi [mailto:amit.gan...@rackspace.com]
Sent: Friday, October 17, 2014 12:32 PM
To: OpenStack Development Mailing List (not for usage questions)
Cc: r...@ryanpetrello.com
Subject: [openstack-dev] [api] Request Validation - Stoplight

Hi API Working Group

Last night at the OpenStack Meetup in Atlanta, a group of us discussed how 
request validation is being performed across various projects and how some 
teams are using pecan/WSME, warlock, jsonschema, etc.

Each of these libraries has its own pros and cons.  My understanding is that 
the API working group is in the early stages of looking into these various 
libraries and will likely provide guidance on this in the near future.

I would like to suggest another library to evaluate when making this decision.  
Some of our teams have started to use a library named "Stoplight" [1][2] in our 
projects.  For example, in the Poppy CDN project, we found it worked around 
some of the issues we had with warlock, such as correctly validating nested 
JSON [3].

Stoplight is an input validation framework for python.  It can be used to 
decorate any function (including routes in pecan or falcon) to validate its 
parameters.

Some good examples of how to use Stoplight can be found here [4].

Let us know your thoughts/interest and we would be happy to discuss further 
whether and how this would be valuable as a library for API request validation 
in OpenStack.


Thanks


Amit Gandhi
Senior Manager - Rackspace



[1] https://pypi.python.org/pypi/stoplight
[2] https://github.com/painterjd/stoplight
[3] 
https://github.com/stackforge/poppy/blob/master/poppy/transport/pecan/controllers/v1/services.py#L108
[4] 
https://github.com/painterjd/stoplight/blob/master/stoplight/tests/test_validation.py#L138

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer

2014-10-17 Thread Jim Mankovich

Chris,
See answers inline. I don't have any concrete answers as to how to deal
with some of the questions you brought up, but I do have some more detail
that may be useful to further the discussion.

On 10/17/2014 11:03 AM, Chris Dent wrote:

On Thu, 16 Oct 2014, Jim Mankovich wrote:

What I would like to propose is dropping the ipmi string from the 
name altogether and appending the Sensor ID to the name instead of to 
the Resource ID. So, transforming the above to the new naming would 
result in the following:



| Name                                     | Type  | Unit | Resource ID
| hardware.current.power_meter_(0x16)      | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15) | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0


[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then
we may as well do it, but I'm not sure it is. When I was writing the
consumer code for the notifications the names of the meters was a big
open question that was hard to resolve because of insufficient data
and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single
platform.

What about the scenario where you want to create an alarm that says
"If temperate gets over X on any system board on any of my hardware,
notify the authorities"? Will having the "_(0x##)" qualifier allow that
to work? I don't actually know, are those qualifiers standard in some
way or are they specific to different equipment? If they are different
having them in the meter name makes creating a useful alarm in a
heterogeneous environment a bit more of a struggle, doesn't it?


The "_(0x##)" is an ipmitool display artifact that is tacked onto the 
end of the Sensor ID

in order to provide more information beyond what Sensor ID has in it.
The ## is the sensor record ID which is specific to IPMI. Whether or
not a Sensor ID (sans _(0x##)) is unique is up to the vendor, but in 
general
I believe all vendors will likely name their sensors uniquely; 
otherwise, how can a
person differentiate textually what component in a platform the sensor 
represents?


Personally, I would like to see the _(0x##) removed from the Sensor ID string
(by the ipmitool driver) before it returns sensors to the Ironic conductor. I
just don't see any value in this extra info. This 0x## addition only helps if
a vendor used the exact same Sensor ID string for multiple sensors of the
same sensor type, i.e. multiple sensors of type "Temperature", each with the
exact same Sensor ID string of "CPU" instead of giving each Sensor ID string
a unique name like "CPU 1", "CPU 2", ...

Now, if you want to get deeper into the IPMI realm (which I don't really want
to advocate), the Entity ID Code actually tells you the component. From the
IPMI spec, section 43.14 Entity IDs:

"The Entity ID field is used for identifying the physical entity that a
sensor or device is associated with. If multiple sensors refer to the same
entity, they will have the same Entity ID field value. For example, if a
voltage sensor and a temperature sensor are both for a ‘Power Supply 1’
entity the Entity ID in their sensor data records would both be 10 (0Ah), per
the Entity ID table." FYI: Entity 10 (0Ah) means "power supply".


In a heterogeneous platform environment, the Sensor ID string is likely going
to be different per vendor, so your question "If temperature...on any system
board...on any hardware, notify the authorities" is going to be tough because
each vendor may name their "system board" differently. But I bet that vendors
use similar strings, so worst case, your alarm creation could require one
alarm definition per vendor.



Perhaps (if they are not standard) this would work:

| hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0


with both sensor_provider and whatever that qualifier is called in the
metadata?


I see generic naming as somewhat problematic. If you lump all the temperature
sensors for a platform under hardware.temperature, the consumer will always
need to query for the specific temperature sensor that it is interested in,
like "system board". The notion of having different samples from multiple
sensors under a single generic name seems harder to deal with to me. If you
have multiple temperature samples under the same generic meter name, how do
you figure out which temperature samples actually exist?



Then the name remains sufficiently generic to allow aggregates across
multiple systems, while still having the necessary info to narrow to
different sensors of the same type.

I understand that this proposed change is not backward compatible 
with the existing naming, but I don't really see a good solution that 
would retain backward compatibility.


I think we should strive to worry less about such things, especially
when it's just names in data fields. Not always possible, or even a
good idea, but sometimes it's a win.



Re: [openstack-dev] [infra] [all] config repository rename to system-config

2014-10-17 Thread Jeremy Stanley
As previously announced, we have now completed renaming of the
following Git repositories:

openstack-infra/config -> openstack-infra/system-config
stackforge/glance-formula -> stackforge/glance-salt-formula
stackforge/keystone-formula -> stackforge/keystone-salt-formula

You will want to update your git remotes on any existing clones of
these repositories accordingly. Something like...

cd git/openstack-infra/
mv config/ system-config
cd system-config/
git remote set-url origin \
https://review.openstack.org/p/openstack-infra/system-config
git remote set-url gerrit \
ssh://u...@review.openstack.org:29418/openstack-infra/system-config.git
cd

For users of Gertty, James Blair has provided the following example
recipe for updating its database...

sqlite3 ~/.gertty.db "update project
set name='openstack-infra/system-config'
where name='openstack-infra/config'"
sqlite3 ~/.gertty.db "update change
set id = replace(
id, 'openstack-infra%2Fconfig',
'openstack-infra%2Fsystem-config')
where id like 'openstack-infra%%2Fconfig%'"

-- 
Jeremy Stanley

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] DB Datasets CI broken

2014-10-17 Thread Jeremy Stanley
On 2014-10-18 10:45:23 +1100 (+1100), Michael Still wrote:
[...]
> Is it possible to add a verification step to nodepool so that it
> doesn't mark a new image as ready unless it passes some basic sanity
> checks?

Back in the beforetime, when devstack-gate had scripts which managed
the worker pool as scheduled Jenkins jobs, it would run DevStack
exercises on a test boot of the new image before using it to boot
real images. Of course you can imagine the number of perfectly good
images which were thrown away because of nondeterministic bugs
causing false negative results there, so we probably wouldn't want
to duplicate that exactly, but perhaps something more lightweight
would be a reasonable compromise.

Anyway, I consider it a good feature request (others may disagree),
just nobody's reimplemented it in nodepool to date.
-- 
Jeremy Stanley

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [api] API recommendation

2014-10-17 Thread Peter Balland
On Oct 16, 2014 8:24 AM, "Dean Troyer"  wrote:
>
>
>
> On Thu, Oct 16, 2014 at 4:57 AM, Salvatore Orlando 
wrote:
>>
>> From an API guideline viewpoint, I understand that
https://review.openstack.org/#/c/86938/ proposes the introduction of a
rather simple endpoint to query active tasks and filter them by resource
uuid or state, for example.
>
>
> That review/blueprint contains one thing that I want to address in more
detail below along with Sal's comment on persistence...
>
>>
>> While this is hardly questionable, I wonder if it might be worth
"typifying" the task, ie: adding a resource_type attribute, and/or allowing
to retrieve active tasks as a chile resource of an object, eg.: GET
/servers//tasks?state=running or if just for running tasks GET
/servers//active_tasks
>
>
> I'd prefer the filter approach, but more importantly, it should be the
_same_ structure as listing resources themselves.
>
> To note: here is another API design detail, specifying resource types in
the URL path:
>
> /server//foo
>
> vs
>
> //foo
>
> or what we have today, for example, in compute:
>
> //foo
>
>> The proposed approach for the multiple server create case also makes
sense to me. Other than "bulk" operations there are indeed cases where a
single API operation needs to perform multiple tasks. For instance, in
Neutron, creating a port implies L2 wiring, setting up DHCP info, and
securing it on the compute node by enforcing anti-spoof rules and security
groups. This means there will be 3/4 active tasks. For this reason I wonder
if it might be the case of differentiating between the concept of
"operation" and "tasks" where the former is the activity explicitly
initiated by the API consumer, and the latter are the activities which need
to complete to fulfil it. This is where we might leverage the already
proposed request_id attribute of the task data structure.
>
>
> I like the ability to track the fan-out, especially if I can get the
state of the entire set of tasks in a single round-trip.  This also makes
it easier to handle backout of failed requests without having to maintain a
lot of client-side state, or make a lot of round-trips.
>

Based on previous experience, I highly recommend maintaining separation
between tracking work at the API-call-level aggregate and other "subtasks."
In non-provisioning scenarios, tasks may fire independently of API
operations, so there wouldn't be an API handle to query on. It is great to
manage per-API-call-level tasks in the framework. The "other work" type
tasks are *much* more complicated beasts, deserving of their own design.
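
For the sake of discussion, the split I'm describing might surface to clients
as something like the following (purely illustrative field names, not an
agreed schema):

    # Purely illustrative: one client-initiated "operation" fanning out into
    # several internal tasks, all correlated by request_id.
    operation = {
        'request_id': 'req-9f1c42',
        'resource': {'type': 'port', 'id': 'a1b2c3'},
        'state': 'running',          # derived from the subtasks below
        'tasks': [
            {'id': 't-1', 'action': 'l2_wiring',       'state': 'done'},
            {'id': 't-2', 'action': 'dhcp_setup',      'state': 'running'},
            {'id': 't-3', 'action': 'security_groups', 'state': 'pending'},
        ],
    }

A client could then poll the operation (or filter tasks by request_id) and
back out cleanly if any subtask ends up in an error state, without keeping
much client-side state.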

>> Finally, a note on persistency. How long a completed task, successfully
or not should be stored for? Do we want to store them until the resource
they operated on is deleted?
>> I don't think it's a great idea to store them indefinitely in the DB.
Tying their lifespan to resources is probably a decent idea, but time-based
cleanup policies might also be considered (e.g.: destroy a task record 24
hours after its completion)
>
>
> I can envision an operator/user wanting to be able to pull a log of an
operation/task for not only cloud debugging (x failed to build, when/why?)
but also app-level debugging (concrete use case not ready at deadline).
This would require a minimum of life-of-resource + some-amount-of-time.
The time might also be variable, failed operations might actually need to
stick around longer.
>
> Even as an operator with access to backend logging, pulling these state
transitions out should not be hard, and should be available to the resource
owner (project).
>
> dt
>
> --
>
> Dean Troyer
> dtro...@gmail.com
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] DB Datasets CI broken

2014-10-17 Thread Michael Still
This was a bad image in nodepool. I've rebuilt the image and killed
our pool of workers running the old image and things seem to be ok
now. I'm in the process of enqueueing rechecks for every failed
turbo-hipster run now, but they'll take some time to all get executed.

Thanks for your patience everyone.

Is it possible to add a verification step to nodepool so that it
doesn't mark a new image as ready unless it passes some basic sanity
checks?

Thanks,
Michael

On Sat, Oct 18, 2014 at 8:44 AM, Michael Still  wrote:
> Hi,
>
> I've just noticed that the DB Datasets CI (the artist formerly known
> as turbo hipster) is failing for many patches. I'm looking into it
> now.
>
> Michael
>
> --
> Rackspace Australia



-- 
Rackspace Australia

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] DB Datasets CI broken

2014-10-17 Thread Michael Still
Hi,

I've just noticed that the DB Datasets CI (the artist formerly known
as turbo hipster) is failing for many patches. I'm looking into it
now.

Michael

-- 
Rackspace Australia

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] CI report : 04/10/2014 - 17/10/2014

2014-10-17 Thread Derek Higgins
Hi All,

   Nothing to report since the last report, 2 weeks of no breakages.

thanks,
Derek.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Jeremy Stanley
On 2014-10-17 16:17:39 +0100 (+0100), Louis Taylor wrote:
> This looks like a continuation of the old PYTHONHASHSEED bug:
> 
> https://launchpad.net/bugs/1348818

The underlying design choices in python-glanceclient's tests do
cause both problems (can't run with a random hash seed, but also
can't run under a different hash algorithm), and properly fixing one
will fix the other. Unfortunately there isn't an easy workaround for
the Python 3.4 testing issue, unlike bug 1348818.
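
For anyone picking this up, the general shape of the fix is to compare parsed
structures rather than serialized strings; a minimal sketch of the idea (not
the actual python-glanceclient test code):

    # Compare parsed data, not serialized strings, so the assertion no
    # longer depends on hash ordering. Illustrative only.
    import json
    import unittest

    from six.moves.urllib.parse import parse_qs, urlparse

    class OrderingSafeAssertions(unittest.TestCase):
        def test_query_params(self):
            expected = {'limit': ['20'], 'sort_key': ['name']}
            url = '/v1/images?sort_key=name&limit=20'  # any parameter order
            self.assertEqual(expected, parse_qs(urlparse(url).query))

        def test_json_body(self):
            expected = {'name': 'cirros', 'disk_format': 'qcow2'}
            body = '{"disk_format": "qcow2", "name": "cirros"}'
            self.assertEqual(expected, json.loads(body))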
-- 
Jeremy Stanley

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [api] Request Validation - Stoplight

2014-10-17 Thread Amit Gandhi
Hi API Working Group

Last night at the OpenStack Meetup in Atlanta, a group of us discussed how 
request validation is being performed across various projects and how some 
teams are using pecan/WSME, warlock, jsonschema, etc.

Each of these libraries has its own pros and cons.  My understanding is that 
the API working group is in the early stages of looking into these various 
libraries and will likely provide guidance on this in the near future.

I would like to suggest another library to evaluate when making this decision.  
Some of our teams have started to use a library named “Stoplight” [1][2] in our 
projects.  For example, in the Poppy CDN project, we found it worked around 
some of the issues we had with warlock, such as correctly validating nested 
JSON [3].

Stoplight is an input validation framework for python.  It can be used to 
decorate any function (including routes in pecan or falcon) to validate its 
parameters.

Some good examples of how to use Stoplight can be found here [4].
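
For anyone who has not looked at the examples yet, the general decorator
pattern looks roughly like this (a generic sketch of the idea, deliberately
not Stoplight's actual API; see [4] for real usage):

    # Generic sketch of decorator-based parameter validation; NOT
    # Stoplight's actual API, just the shape of the pattern.
    import functools

    class ValidationFailed(Exception):
        pass

    def validate(**checks):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # for brevity this only checks keyword arguments
                for name, check in checks.items():
                    if name in kwargs and not check(kwargs[name]):
                        raise ValidationFailed('invalid value for %r' % name)
                return func(*args, **kwargs)
            return wrapper
        return decorator

    def is_positive_int(value):
        return isinstance(value, int) and value > 0

    @validate(limit=is_positive_int)
    def list_services(limit=10):
        return 'listing %d services' % limit

    print(list_services(limit=5))   # ok
    # list_services(limit=-1)       # would raise ValidationFailed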

Let us know your thoughts/interest and we would be happy to discuss further 
whether and how this would be valuable as a library for API request validation 
in OpenStack.


Thanks


Amit Gandhi
Senior Manager – Rackspace



[1] https://pypi.python.org/pypi/stoplight
[2] https://github.com/painterjd/stoplight
[3] 
https://github.com/stackforge/poppy/blob/master/poppy/transport/pecan/controllers/v1/services.py#L108
[4] 
https://github.com/painterjd/stoplight/blob/master/stoplight/tests/test_validation.py#L138

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer

2014-10-17 Thread Chris Dent

On Thu, 16 Oct 2014, Jim Mankovich wrote:

What I would like to propose is dropping the ipmi string from the name 
altogether and appending the Sensor ID to the name  instead of to the 
Resource ID.   So, transforming the above to the new naming would result in 
the following:



| Name                                     | Type  | Unit | Resource ID
| hardware.current.power_meter_(0x16)      | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15) | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0


[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then
we may as well do it, but I'm not sure it is. When I was writing the
consumer code for the notifications the names of the meters was a big
open question that was hard to resolve because of insufficient data
and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single
platform.

What about the scenario where you want to create an alarm that says
"If temperate gets over X on any system board on any of my hardware,
notify the authorities"? Will having the "_(0x##)" qualifier allow that
to work? I don't actually know, are those qualifiers standard in some
way or are they specific to different equipment? If they are different
having them in the meter name makes creating a useful alarm in a
heterogeneous environment a bit more of a struggle, doesn't it?

Perhaps (if they are not standard) this would work:

 | hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0

with both sensor_provider and whatever that qualifier is called in the
metadata?

Then the name remains sufficiently generic to allow aggregates across
multiple systems, while still having the necessary info to narrow to
different sensors of the same type.

I understand that this proposed change is not backward compatible with the 
existing naming, but I don't really see a good solution that would retain 
backward compatibility.


I think we should strive to worry less about such things, especially
when it's just names in data fields. Not always possible, or even a
good idea, but sometimes it's a win.

--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-17 Thread Fox, Kevin M
docker exec would be awesome.

So... whats redhat's stance on docker upgrades here?

I'm running centos7, and docker's topped out at 
docker-0.11.1-22.el7.centos.x86_64.
(though redhat package versions don't always reflect the upstream version)

I tried running the docker 1.2 binary from docker.io but selinux flipped out on it.

how long before docker exec is actually a useful solution for debugging on such 
systems?

Thanks,
Kevin

From: Lars Kellogg-Stedman [l...@redhat.com]
Sent: Thursday, October 16, 2014 7:14 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [kolla] on Dockerfile patterns

On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote:
> You just need to find the pid of a process in the container (perhaps using
> docker inspect to go from container name -> pid) and then:
>  nsenter -t $pid -m -u -i -n -p -w

Note also that the 1.3 release of Docker ("any day now") will sport a
shiny new "docker exec" command that will provide you with the ability
to run commands inside the container via the docker client without
having to involve nsenter (or nsinit).

It looks like:

docker exec  ps -fe

Or:

docker exec -it  bash

--
Lars Kellogg-Stedman  | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] APIImpact flag for nova specs

2014-10-17 Thread Everett Toews
On Oct 15, 2014, at 5:52 AM, Christopher Yeoh  wrote:

We don't require new templates as part of nova-specs, and api changes don't 
necessarily change the api sample tpl files. We do ask for some jsonschema 
descriptions of the new API's input, but they work pretty well in the spec 
document itself. I agree it could be prone to spelling mistakes etc., though 
just being able to search for 'api' would be sufficient, and people who review 
specs could pick up missing or misspelled flags in the commit message (and it 
wouldn't necessarily need to be restricted to just APIImpact as possible flags).

+1 to APIImpact flag

That there could be misses is not a good reason to not do this. Which is to 
say, let’s do this.

Everett

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Ian Cordasco
I would also advise pinning the version of mccabe we’re using. Mccabe was
originally a proof-of-concept script that Ned Batchelder wrote and which
Tarek Ziade vendored into Flake8. After we split it out in v2 of Flake8,
we’ve found several (somewhat serious) reporting problems with the tool.
Currently the package owner on PyPI hasn’t granted me permissions to
release a new version of the package, but we have several fixes in the
repository: https://github.com/flintwork/mccabe. The changes are somewhat
drastic but they should reduce the average function/method’s complexity by
1 or 2 points. I’m going to bother Florent again to give me permission to
release the package since it has been far too long since a release has
been cut.

For what it’s worth, Florent doesn’t pay close attention to GitHub
notifications so chiming in (or creating) issues on mccabe to release a
new version will only spam *me*. So please don’t pile on to anything
existing or create a new one.

Cheers,
Ian

On 10/17/14, 12:39 AM, "Michael Davies"  wrote:

>On Fri, Oct 17, 2014 at 2:39 PM, Joe Gordon  wrote:
>
>First step in fixing this, put a cap on it:
>https://review.openstack.org/129125
>
>Thanks Joe - I've just put up a similar patch for Ironic:
>https://review.openstack.org/129132
>
>-- 
>Michael Davies   mich...@the-davies.net
>Rackspace Australia
>

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Louis Taylor
On Fri, Oct 17, 2014 at 03:01:22PM +, Jeremy Stanley wrote:
> Gah! You'd think *I* would know better at this point--sorry about
> that... I've now opened https://launchpad.net/bugs/1382582 to track
> this. Thanks for any assistance you're able to provide!

This looks like a continuation of the old PYTHONHASHSEED bug:

https://launchpad.net/bugs/1348818


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Jeremy Stanley
On 2014-10-17 16:57:59 +1300 (+1300), Fei Long Wang wrote:
> Thanks for the heads up. Is there a bug opened to track this? If
> not, I'm going to open one and dig into it. Cheers.

Gah! You'd think *I* would know better at this point--sorry about
that... I've now opened https://launchpad.net/bugs/1382582 to track
this. Thanks for any assistance you're able to provide!
-- 
Jeremy Stanley

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Nova] Turbo hipster problems

2014-10-17 Thread Gary Kotton
Hi,
Anyone aware why turbo hipster is failing with:

real-db-upgrade_nova_percona_user_002:th-percona Exception: 
[Errno 2] No such file or directory: '/var/lib/turbo-hipster/datasets_user_002' 
in 0s

Thanks
Gary
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Elections] Results of the TC Election

2014-10-17 Thread Anita Kuno
Please join me in congratulating the 6 newly elected members of the TC.

* Monty Taylor
* Sean Dague
* Doug Hellmann
* Russell Bryant
* Anne Gentle
* John Griffith

Full results:
http://civs.cs.cornell.edu/cgi-bin/results.pl?id=E_c105db929e6c11f4

Thank you to all candidates who stood for election; having a good group
of candidates helps engage the community in our democratic process.

Thank you to Mark McLoughlin, who served on the previous TC and chose
not to run for a seat this time.

Thank you to all who voted and who encouraged others to vote. We need to
ensure your voice is heard.

Thanks to my fellow election official, Tristan Cacqueray, I appreciate
your help and perspective.

Thank you for another great round.

Here's to Kilo,
Anita.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [novaclient] E12* rules

2014-10-17 Thread Andrey Kurilin
Hi everyone!

I'm working on enabling the E12* PEP8 rules in novaclient (status of my work
is listed below). IMO, PEP8 rules should be ignored only in extreme cases/for
important reasons, and we should decrease the number of ignored rules. This
helps to keep the code in a more strict, readable form, which is very
important when working in a community.

While working on rule E126, we started a discussion with Joe Gordon about the
need for these rules. I have no idea why they should be ignored, so I want to
know:
- Why should these rules be ignored?
- What do you think about enabling these rules?

Please, leave your opinion about E12* rules.
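
For anyone who hasn't looked these up recently, this is the kind of
continuation-line indentation the E12* rules police (illustrative; the exact
code each case maps to depends on the pep8 version):

    def some_function(a, b):
        return a + b

    first_argument, second_argument = 1, 2

    # flagged: continuation line under-indented for visual indent
    result = some_function(first_argument,
        second_argument)

    # accepted: aligned with the opening bracket (visual indent) ...
    result = some_function(first_argument,
                           second_argument)

    # ... or a clean hanging indent
    result = some_function(
        first_argument, second_argument)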

Already enabled rules:
  E121,E125 - https://review.openstack.org/#/c/122888/
  E122 - https://review.openstack.org/#/c/123830/
  E123 - https://review.openstack.org/#/c/123831/

Abandoned rule:
  E124 - https://review.openstack.org/#/c/123832/

Pending review:
  E126 - https://review.openstack.org/#/c/123850/
  E127 - https://review.openstack.org/#/c/123851/
  E128 - https://review.openstack.org/#/c/127559/
  E129 - https://review.openstack.org/#/c/123852/


-- 
Best regards,
Andrey Kurilin.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Proposal: A launchpad bug description template

2014-10-17 Thread Markus Zoeller
Thierry Carrez  wrote on 10/17/2014 11:28:56 AM:

> From: Thierry Carrez 
> To: openstack-dev@lists.openstack.org
> Date: 10/17/2014 11:31 AM
> Subject: Re: [openstack-dev] [QA] Proposal: A launchpad bug description 
template
> 
> Markus Zoeller wrote:
> > TL;DR: A proposal for a template for launchpad bug entries which asks 
> > for the minimal needed data to work on a bug.
> 
> Note that Launchpad doesn't support bug entry templates. You can display
> "bug reporting guidelines" which appear under the textbox, but that's
> about it.
> 
> Also note that the text is project-specific, so it needs to be entered
> in every "openstack" project. Depending on the exact nature of the
> project, I suspect the text should be different.
> 
> Regards,
> 
> -- 
> Thierry Carrez (ttx)

Thanks for the note on Launchpad's capabilities. Providing the information
in the "bug reporting guidelines" on Launchpad looks like a good place.
Currently, for Nova, there is "Please include the exact version of Nova with
which you're experiencing this issue." 

The wiki page about bugs [1] could be enhanced as well, and then we could let
Launchpad link to this wiki page. Maybe this would reduce the maintenance of
the template. Subsections could be introduced for project-specific debug
data.

[1] https://wiki.openstack.org/wiki/Bugs

Regards, 
Markus Zoeller 
IRC: markus_z


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Cell Initialization

2014-10-17 Thread Vineet Menon
Hi,

I was trying to create cells under OpenStack using devstack. My setup
contains 3 machines: one top-level cell and 2 compute cells.
I'm following this documentation:
http://docs.openstack.org/trunk/config-reference/content/section_compute-cells.html

Both of these cell instantiations are generating errors.
1. The first one doesn't generate any error logs unless I issue a command at
the parent, 'nova cell-show cell2'. At this point the top-level cell throws
the following error:

2014-10-17 12:03:34.888 ERROR oslo.messaging.rpc.dispatcher [-] Exception
> during message handling: Circular reference detected
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher Traceback
> (most recent call last):
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line
> 134, in _dispatch_and_reply
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
> incoming.message))
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py",
> line 72, in reply
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
> self._send_reply(conn, reply, failure, log_failure=log_failure)
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py",
> line 62, in _send_reply
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
> conn.direct_send(self.reply_q, rpc_common.serialize_msg(msg))
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/common.py", line
> 302, in serialize_msg
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
> _MESSAGE_KEY: jsonutils.dumps(raw_msg)}
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib/python2.7/site-packages/oslo/messaging/openstack/common/jsonutils.py",
> line 172, in dumps
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher return
> json.dumps(value, default=default, **kwargs)
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib64/python2.7/json/__init__.py", line 250, in dumps
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
> sort_keys=sort_keys, **kw).encode(obj)
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib64/python2.7/json/encoder.py", line 207, in encode
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher chunks =
> self.iterencode(o, _one_shot=True)
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
> "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher return
> _iterencode(o, 0)
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher ValueError:
> Circular reference detected
> 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
> 2014-10-17 12:03:34.890 ERROR oslo.messaging._drivers.common [-] Returning
> exception Circular reference detected to caller
> 2014-10-17 12:03:34.890 ERROR oslo.messaging._drivers.common [-]
> ['Traceback (most recent call last):\n', '  File
> "/usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line
> 134, in _dispatch_and_reply\nincoming.message))\n', '  File
> "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py",
> line 72, in reply\nself._send_reply(conn, reply, failure,
> log_failure=log_failure)\n', '  File
> "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py",
> line 62, in _send_reply\nconn.direct_send(self.reply_q,
> rpc_common.serialize_msg(msg))\n', '  File
> "/usr/lib/python2.7/site-packages/oslo/messaging/_drivers/common.py", line
> 302, in serialize_msg\n_MESSAGE_KEY: jsonutils.dumps(raw_msg)}\n', '
> File
> "/usr/lib/python2.7/site-packages/oslo/messaging/openstack/common/jsonutils.py",
> line 172, in dumps\nreturn json.dumps(value, default=default,
> **kwargs)\n', '  File "/usr/lib64/python2.7/json/__init__.py", line 250, in
> dumps\nsort_keys=sort_keys, **kw).encode(obj)\n', '  File
> "/usr/lib64/python2.7/json/encoder.py", line 207, in encode\nchunks =
> self.iterencode(o, _one_shot=True)\n', '  File
> "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode\nreturn
> _iterencode(o, 0)\n', 'ValueError: Circular reference detected\n']

This one seems to be similar to the bug reported here:
https://bugs.launchpad.net/nova/+bug/1312002
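
For reference, the ValueError in the trace above is simply what json.dumps()
raises when the payload ends up referencing itself; the keys below are made
up, but the failure mode is the same:

    import json

    msg = {'method': 'cell_show', 'args': {}}
    msg['args']['parent'] = msg      # the message references itself

    try:
        json.dumps(msg)
    except ValueError as exc:
        print(exc)                   # "Circular reference detected"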

2. In the second child cell initialization, the error crops up as soon as
I add the top-level cell in the child cell using the 'nova-manage' command.

2014-10-17 12:05:29.500 ERROR nova.cells.messaging
> [req-f74d05cf-061a-4488-bfcb-0cb1edec44e2 None None] Error locating next
> hop for message: Inconsistency in cell routing: destination is
> cell1!toplevel but routing_path is cell1!cell1
> 2014-10-17 12:05:29.500 TRACE nova.cells.mes

[openstack-dev] [Horizon] Template Blueprint

2014-10-17 Thread Ana Krivokapic

Hello Horizoners,

I would like to draw your attention to the excellent Template 
Blueprint [1] which David created. The aim of this is to create a 
template which will be used for all future blueprints. This way we can 
try to ensure that enough information/detail is provided in blueprints, 
as we have had problems with blueprints lacking in detail in the past.


Please take a minute to review [1] and add your comments to the 
whiteboard. We are hoping to finalize this and start using this 
template ASAP.


Thanks!


[1] https://blueprints.launchpad.net/horizon/+spec/template

--
Regards,

Ana Krivokapic
Software Engineer
OpenStack team
Red Hat Inc.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-17 Thread Florian Haas
On Fri, Oct 17, 2014 at 9:53 AM, Jastrzebski, Michal
 wrote:
>
>
>> -Original Message-
>> From: Florian Haas [mailto:flor...@hastexo.com]
>> Sent: Thursday, October 16, 2014 10:53 AM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [Nova] Automatic evacuate
>>
>> On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
>>  wrote:
>> > In my opinion flavor defining is a bit hacky. Sure, it will provide us
>> > functionality fairly quickly, but it will also strip us of the flexibility
>> > Heat would give. Healing can be done in several ways: simple destroy
>> > -> create (the basic convergence workflow so far), evacuate with or
>> > without shared storage, even rebuild the vm, and probably a few more when
>> > we put more thought into it.
>>
>> But then you'd also need to monitor the availability of *individual* guests,
>> and down the rabbit hole you go.
>>
>> So suppose you're monitoring a guest with a simple ping. And it stops
>> responding to that ping.
>
> I was more referring to monitoring the host (not the guest), and for sure
> not by ping. I was thinking of the current zookeeper servicegroup
> implementation; we might want to use corosync and write a servicegroup
> plugin for that. There are several choices for that, and each requires
> testing before we make any decision.
>
> There is also the fencing case, which we agree is important, and I think
> nova should be able to do that (since it does evacuate, it should also do
> fencing). But for working fencing we really need working host health
> monitoring, so I suggest we take baby steps here and solve one issue at a
> time. And that would be host monitoring.

You're describing all of the cases for which Pacemaker is the perfect
fit. Sorry, I see absolutely no point in teaching Nova to do that.

>> (1) Has it died?
>> (2) Is it just too busy to respond to the ping?
>> (3) Has its guest network stack died?
>> (4) Has its host vif died?
>> (5) Has the L2 agent on the compute host died?
>> (6) Has its host network stack died?
>> (7) Has the compute host died?
>>
>> Suppose further it's using shared storage (running off an RBD volume or
>> using an iSCSI volume, or whatever). Now you have almost as many recovery
>> options as possible causes for the failure, and some of those recovery
>> options will potentially destroy your guest's data.
>>
>> No matter how you twist and turn the problem, you need strongly consistent
>> distributed VM state plus fencing. In other words, you need a full blown HA
>> stack.
>>
>> > I'd rather use nova for low level task and maybe low level monitoring
>> > (imho nova should do that using servicegroup). But I'd use something
>> > more more configurable for actual task triggering like heat. That
>> > would give us framework rather than mechanism. Later we might want to
>> > apply HA on network or volume, then we'll have mechanism ready just
>> > monitoring hook and healing will need to be implemented.
>> >
>> > We can use scheduler hints to place resource on host HA-compatible
>> > (whichever health action we'd like to use), this will bit more
>> > complicated, but also will give us more flexibility.
>>
>> I apologize in advance for my bluntness, but this all sounds to me like 
>> you're
>> vastly underrating the problem of reliable guest state detection and
>> recovery. :)
>
> Guest health in my opinion is just a bit out of scope here. If we have a
> robust way of detecting host health, we can pretty much assume that if the
> host dies, the guests follow. There are ways to detect guest health (libvirt
> watchdog, ceilometer, the ping you mentioned), but that should be done
> somewhere else. And for sure not by evacuation.

You're making an important point here; you're asking for a "robust way
of detecting host health". I can guarantee you that the way of
detecting host health that you suggest (i.e. from within Nova) will
not be "robust" by HA standards for at least two years, if your patch
lands tomorrow.

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Jay Pipes

On 10/17/2014 05:10 AM, Chris Dent wrote:

On Fri, 17 Oct 2014, Daniel P. Berrange wrote:


IMHO this tool is of pretty dubious value. I mean that function is long
for sure, but it is by no means a serious problem in the Nova libvirt
codebase. The stuff it complains about in the libvirt/config.py file is
just an incredibly stupid thing to highlight.


I find a lot of the OpenStack code very hard to read. If it is very
hard to read it is very hard to maintain, whether that means fix or
improve.


Exactly, ++.


That said, the value I see in these kinds of tools is not
specifically in preventing complexity, but in providing entry points
for people who want to fix things. You don't know where to start
(because you haven't yet got the insight or experience): run
flake8 or pylint or some other tools, do what it tells you. In the
process you will:

* learn more about the code
* probably find bugs
* make an incremental improvement to something that needs it


Agreed.

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Summit scheduling - using our time together wisely.

2014-10-17 Thread Thierry Carrez
Clint Byrum wrote:
> * The Ops Summit is Wendesday/Thursday [3], which overlaps with these
>   sessions. I am keenly interested in gathering more contribution from
>   those already operating and deploying OpenStack. It can go both ways,
>   but I think it might make sense to have more ops-centric topics
>   discussed on Friday, when those participants might not be fully
>   wrapped up in the ops sessions.

The Ops Summit is actually on Monday and Thursday. Not on Wednesday.
You were wrong on the Internet.

-- 
Thierry Carrez (ttx)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Proposal: A launchpad bug description template

2014-10-17 Thread Thierry Carrez
Markus Zoeller wrote:
> TL;DR: A proposal for a template for launchpad bug entries which asks 
> for the minimal needed data to work on a bug.

Note that Launchpad doesn't support bug entry templates. You can display
"bug reporting guidelines" which appear under the textbox, but that's
about it.

Also note that the text is project-specific, so it needs to be entered
in every "openstack" project. Depending on the exact nature of the
project, I suspect the text should be different.

Regards,

-- 
Thierry Carrez (ttx)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

2014-10-17 Thread John Garbutt
On 17 October 2014 02:28, Matt Riedemann  wrote:
>
>
> On 10/16/2014 7:26 PM, Christopher Aedo wrote:
>>
>> On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov
>>  wrote:

>>> On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum  wrote:
>>>
>>> The idea is not simply deny or hang requests from clients, but provide
>>> them
>>> "we are in maintenance mode, retry in X seconds"
>>>
>>>> You probably would want 'nova host-servers-migrate '
>>>
>>> yeah for migrations - but as far as I understand, it doesn't help with
>>> disabling this host in scheduler - there can be a chance that some
>>> workloads will be scheduled to the host.
>>
>>
>> Regarding putting a compute host in maintenance mode using "nova
>> host-update --maintenance enable", it looks like the blueprint and
>> associated commits were abandoned a year and a half ago:
>> https://blueprints.launchpad.net/nova/+spec/host-maintenance
>>
>> It seems that "nova service-disable  nova-compute" effectively
>> prevents the scheduler from trying to send new work there.  Is this
>> the best approach to use right now if you want to pull a compute host
>> out of an environment before migrating VMs off?
>>
>> I agree with Tim and Mike that having something respond "down for
>> maintenance" rather than ignore or hang would be really valuable.  But
>> it also looks like that hasn't gotten much traction in the past -
>> anyone feel like they'd be in support of reviving the notion of
>> "maintenance mode"?
>>
>> -Christopher
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
> host-maintenance-mode is definitely a thing in nova compute via the os-hosts
> API extension and the --maintenance parameter, the compute manager code is
> here [1].  The thing is the only in-tree virt driver that implements it is
> xenapi, and I believe when you put the host in maintenance mode it's
> supposed to automatically evacuate the instances to some other host, but you
> can't target the other host or tell the driver, from the API, which
> instances you want to evacuate, e.g. all, none, running only, etc.
>
> [1]
> http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990

We should certainly make that more generic. It doesn't update the VM
state, so it's really only admin focused in its current form.

The XenAPI logic only works when using XenServer pools with shared NFS
storage, if my memory serves me correctly. Honestly, its a bit of code
I have planned on removing, along with the rest of the pool support.

In terms of requiring DB downtime in Nova, the current efforts are
focusing on avoiding downtime all together, via expand/contract style
migrations, with a little help from objects to avoid data migrations.
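
To make the expand/contract idea concrete, here is a hedged sketch of the
pattern (written with alembic-style operations and made-up column names
purely for illustration; it is not Nova's actual migration code):

    # Expand/contract, illustrated with alembic-style operations that the
    # migration tooling would invoke; not Nova's actual migration code.
    import sqlalchemy as sa
    from alembic import op

    def expand():
        # Old and new code can run side by side: only add things, never
        # break the schema the still-running services expect.
        op.add_column('instances',
                      sa.Column('new_flavor_id', sa.String(36),
                                nullable=True))

    def contract():
        # Run only after every service is upgraded and the data has been
        # back-filled (e.g. lazily, via objects), so no downtime is needed.
        op.drop_column('instances', 'old_flavor_id')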

That doesn't mean maintenance mode is not useful for other things,
like emergency patching of the hypervisor.

John

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer

2014-10-17 Thread Dmitry Tantsur

Hi Jim,

On 10/16/2014 07:23 PM, Jim Mankovich wrote:

All,

I would like to get some feedback on a proposal to change the
current sensor naming implemented in ironic and ceilometer.

I would like to provide vendor-specific sensors within the current
structure for IPMI sensors in ironic and ceilometer, but I have found
that the current implementation of sensor meters in ironic and
ceilometer is IPMI specific (from a meter naming perspective). As it
currently stands, it is not suitable for supporting sensor information
from a provider other than IPMI. Also, the current Resource ID naming
makes it difficult for a consumer of sensors to quickly find all the
sensors for a given Ironic Node ID, so I would like to propose changing
the Resource ID naming as well.

Currently, sensors sent by ironic to ceilometer get named by ceilometer
as "hardware.ipmi.SensorType", and the Resource ID is the Ironic
Node ID with a postfix containing the Sensor ID.  For details
pertaining to the issue with the Resource ID naming, see
https://bugs.launchpad.net/ironic/+bug/1377157, "ipmi sensor naming in
ceilometer is not consumer friendly"

Here is an example of what meters look like for sensors in ceilometer
with the current implementation:
| Name                      | Type  | Unit | Resource ID
| hardware.ipmi.current     | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0-power_meter_(0x16)
| hardware.ipmi.temperature | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0-16-system_board_(0x15)

What I would like to propose is dropping the ipmi string from the name
altogether and appending the Sensor ID to the name  instead of to the
Resource ID.   So, transforming the above to the new naming would result
in the following:
| Name                                     | Type  | Unit | Resource ID
| hardware.current.power_meter_(0x16)      | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15) | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0

+1

Very-very nit, feel free to ignore if inappropriate: maybe 
hardware.temperature.system_board.0x15 ? I.e. use separation with dots, 
do not use brackets?


This structure would provide the ability for a consumer to do a
ceilometer resource list using the Ironic Node ID as the Resource ID to
get all the sensors in a given platform. The consumer would then
iterate over each of the sensors to get the samples it wanted. In
order to retain the information as to who provided the sensors, I would
like to propose that a standard "sensor_provider" field be added to the
resource_metadata for every sensor, where the "sensor_provider" field
would have a string value indicating the driver that provided the sensor
information. This is where the string "ipmi", or a vendor-specific
string, would be specified.

+1


I understand that this proposed change is not backward compatible with
the existing naming, but I don't really see a good solution that would
retain backward compatibility.
For backward compatibility you could _also_ keep old ones (with ipmi in 
it) for IPMI sensors.




Any/All Feedback will be appreciated,
In this version it makes a lot of sense to me, +1 if Ceilometer folks 
are not against.



Jim




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Flavio Percoco
On 10/17/2014 05:57 AM, Fei Long Wang wrote:
> Hi Jeremy,
> 
> Thanks for the heads up. Is there a bug opened to track this? If not,
> I'm going to open one and dig into it. Cheers.

Hey Fei Long,

Thanks for taking care of this; please keep me in the loop.

@Jeremy: Thanks for the heads up.

Flavio

> 
> On 17/10/14 14:17, Jeremy Stanley wrote:
>> As part of an effort to deprecate our specialized testing platform
>> for Python 3.3, many of us have been working to confirm projects
>> which currently gate on 3.3 can also pass their same test sets under
>> Python 3.4 (which comes by default in Ubuntu Trusty). For the vast
>> majority of projects, the differences between 3.3 and 3.4 are
>> immaterial and no effort is required. For some, minor adjustments
>> are needed...
>>
>> For python-glanceclient, we have 22 failing tests in a tox -e py34
>> run. I spent the better part of today digging into them, and they
>> basically all stem from the fact that PEP 456 switches the unordered
>> data hash algorithm from FNV to SipHash in 3.4. The unit tests in
>> python-glanceclient frequently rely on trying to match
>> multi-parameter URL queries and JSON built from unordered data types
>> against predetermined string representations. Put simply, this just
>> doesn't work if you can't guarantee their ordering.
>>
>> I'm left with a dilemma--I don't really have time to fix all of
>> these (I started to go through and turn the fixture keys into format
>> strings embedding dicts filtered through urlencode() for example,
>> but it created as many new failures as it fixed), however I'd hate
>> to drop Py3K testing for software which currently has it no matter
>> how fragile. This is mainly a call for help to anyone with some
>> background and/or interest in python-glanceclient's unit tests to
>> get them working under Python 3.4, so that we can eliminate the
>> burden of maintaining special 3.3 test infrastructure.
> 
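
To illustrate the failure mode Jeremy describes, a contrived example
(not taken from the actual python-glanceclient tests):

import json
from urllib.parse import parse_qs, urlencode  # six.moves.urllib on py2

params = {'limit': 20, 'sort_key': 'name', 'sort_dir': 'asc'}

# Fragile: the exact string depends on dict iteration order, which
# changed between Python 3.3 (FNV) and 3.4 (SipHash, PEP 456).
print(urlencode(params))  # parameter order is not guaranteed

# Robust: compare parsed/normalised forms instead of raw strings.
assert parse_qs(urlencode(params)) == {'limit': ['20'],
                                       'sort_key': ['name'],
                                       'sort_dir': ['asc']}

# Likewise for JSON fixtures: compare loaded objects, or dump with
# sort_keys=True before comparing strings.
assert json.loads(json.dumps(params, sort_keys=True)) == params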


-- 
@flaper87
Flavio Percoco

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Chris Dent

On Fri, 17 Oct 2014, Daniel P. Berrange wrote:


IMHO this tool is of pretty dubious value. I mean, that function is long
for sure, but it is by no means a serious problem in the Nova libvirt
codebase. The stuff it complains about in the libvirt/config.py file is
just an incredibly stupid thing to highlight.


I find a lot of the OpenStack code very hard to read. If it is very
hard to read, it is very hard to maintain, whether that means fixing
or improving it.

That said, the value I see in these kinds of tools is not
specifically in preventing complexity, but in providing entry points
for people who want to fix things. You don't know where to start
(because you haven't yet got the insight or experience): run
flake8 or pylint or some other tool and do what it tells you. In the
process you will:

* learn more about the code
* probably find bugs
* make an incremental improvement to something that needs it

--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-17 Thread Chris Dent

On Thu, 16 Oct 2014, Lars Kellogg-Stedman wrote:


On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote:

You just need to find the pid of a process in the container (perhaps using
docker inspect to go from container name -> pid) and then:
 nsenter -t $pid -m -u -i -n -p -w


Note also that the 1.3 release of Docker ("any day now") will sport a


Yesterday:
http://blog.docker.com/2014/10/docker-1-3-signed-images-process-injection-security-options-mac-shared-directories/
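
For completeness, the two steps Angus mentions can be glued together
roughly like this (a sketch: the container name is made up, and it
assumes docker inspect's --format flag, nsenter, and root privileges):

import subprocess

def enter_container(name):
    # Resolve the container's init pid via docker inspect ...
    pid = subprocess.check_output(
        ['docker', 'inspect', '--format', '{{.State.Pid}}',
         name]).decode().strip()
    # ... then join its namespaces and start a shell there.
    subprocess.check_call(
        ['nsenter', '-t', pid, '-m', '-u', '-i', '-n', '-p', '-w',
         '/bin/sh'])

enter_container('nova-compute')  # hypothetical container name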


--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-17 Thread Jastrzebski, Michal


> -Original Message-
> From: Florian Haas [mailto:flor...@hastexo.com]
> Sent: Thursday, October 16, 2014 10:53 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Nova] Automatic evacuate
> 
> On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
>  wrote:
> > In my opinion flavor defining is a bit hacky. Sure, it will provide us
> > functionality fairly quickly, but it will also strip us of the
> > flexibility Heat would give. Healing can be done in several ways:
> > simple destroy -> create (the basic convergence workflow so far),
> > evacuate with or without shared storage, even rebuild the vm, and
> > probably a few more once we put more thought into it.
> 
> But then you'd also need to monitor the availability of *individual*
> guests, and down the rabbit hole you go.
> 
> So suppose you're monitoring a guest with a simple ping. And it stops
> responding to that ping.

I was referring more to monitoring the host (not the guest), and certainly
not by ping. I was thinking of the current zookeeper servicegroup
implementation; we might want to use corosync and write a servicegroup
plugin for that. There are several choices here, and each really requires
testing before we make any decision.

There is also the fencing case, which we agree is important, and I think
nova should be able to do that (since it does the evacuate, it should also
do the fencing). But for fencing to work we really need working host health
monitoring, so I suggest we take baby steps here and solve one issue at a
time. And that would be host monitoring.
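
To be concrete, the kind of external loop I mean would look roughly like
this (a sketch only: is_host_up() and fence() stand in for whatever
corosync/servicegroup and fencing mechanisms we pick, and the
python-novaclient calls are just one assumed way to drive the
evacuation):

import time

from novaclient import client as nova_client

def is_host_up(host):
    # Placeholder: the real check would query corosync / the
    # servicegroup driver, not ping.
    raise NotImplementedError

def fence(host):
    # Placeholder: power-fence the host (IPMI, PDU, ...) before
    # touching its instances.
    raise NotImplementedError

def watch_and_heal(nova, hosts, interval=10):
    while True:
        for host in hosts:
            if is_host_up(host):
                continue
            fence(host)  # never evacuate an unfenced host
            servers = nova.servers.list(
                search_opts={'host': host, 'all_tenants': 1})
            for server in servers:
                # Assumes shared storage; without it the instance gets
                # rebuilt from its image instead.
                nova.servers.evacuate(server, on_shared_storage=True)
        time.sleep(interval)

# Usage (credentials and host names are made up):
# nova = nova_client.Client('2', 'admin', 'secret', 'admin',
#                           'http://controller:5000/v2.0')
# watch_and_heal(nova, ['compute-1', 'compute-2'])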

> (1) Has it died?
> (2) Is it just too busy to respond to the ping?
> (3) Has its guest network stack died?
> (4) Has its host vif died?
> (5) Has the L2 agent on the compute host died?
> (6) Has its host network stack died?
> (7) Has the compute host died?
> 
> Suppose further it's using shared storage (running off an RBD volume or
> using an iSCSI volume, or whatever). Now you have almost as many recovery
> options as possible causes for the failure, and some of those recovery
> options will potentially destroy your guest's data.
> 
> No matter how you twist and turn the problem, you need strongly consistent
> distributed VM state plus fencing. In other words, you need a full blown HA
> stack.
> 
> > I'd rather use nova for low-level tasks and maybe low-level monitoring
> > (imho nova should do that using servicegroup). But I'd use something
> > more configurable, like heat, for the actual task triggering. That
> > would give us a framework rather than a mechanism. Later we might want
> > to apply HA to network or volume; then we'll have the mechanism ready,
> > and just the monitoring hook and healing will need to be implemented.
> >
> > We can use scheduler hints to place resources on HA-compatible hosts
> > (whichever health action we'd like to use); this will be a bit more
> > complicated, but it will also give us more flexibility.
> 
> I apologize in advance for my bluntness, but this all sounds to me like you're
> vastly underrating the problem of reliable guest state detection and
> recovery. :)

Guest health is, in my opinion, a bit out of scope here. If we have a
robust way of detecting host health, we can pretty much assume that if the
host dies, its guests follow. There are ways to detect guest health (the
libvirt watchdog, ceilometer, the ping you mentioned), but that should be
done somewhere else. And certainly not by evacuation.

> 
> > I agree that we all should meet in Paris and discuss that so we can
> > join our forces. This is one of bigger gaps to be filled imho.
> 
> Pretty much every user I've worked with in the last 2 years agrees.
> Granted, my view may be skewed as HA is typically what customers approach
> us for in the first place, but yes, this definitely needs a globally 
> understood
> and supported solution.
> 
> Cheers,
> Florian
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Matthew Gilliard
I like measuring code metrics, and I definitely support Joe's change
here. I think of McCabe complexity as a proxy for the testability and
readability of code, both of which are IMO real problems in the
nova codebase. If you are an experienced openstack dev you might find
the code easy to move around, but large and complex functions are
difficult for beginners to grok.

As an exercise, I took the method in libvirt/config.py and removed
everything except the flow-control keywords (i.e. the things that affect
the McCabe complexity): http://paste.openstack.org/show/121589/ - I
would find it difficult to hold all that in my head at once. It's
possible to argue that this is a false positive, but my experience is
that this tool finds code which needs improvement.
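
For anyone curious about what the metric actually counts, here is a
crude hand-rolled approximation (not the real mccabe plugin that
flake8 uses, just enough to show what those flow-control keywords
contribute to the score):

import ast

def rough_mccabe(source):
    # Very rough cyclomatic complexity: 1 + the number of branch points.
    branch_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                    ast.BoolOp)
    return 1 + sum(isinstance(node, branch_nodes)
                   for node in ast.walk(ast.parse(source)))

print(rough_mccabe('''
def example(x):
    if x > 0:
        for i in range(x):
            if i % 2:
                x += i
    return x
'''))  # prints 4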

That said, these should be descriptive metrics rather than
prescriptive targets. There are products which chart a codebase's
evolution over time, such as www.sonarsource.com, which are really
great for provoking thought and conversation about code quality. Now
that I'm interested, I'll have a look into it.

  Matthew

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] BGPVPN implementation discussions

2014-10-17 Thread Damon Wang
Good news, +1

2014-10-17 0:48 GMT+08:00 Mathieu Rohon :

> Hi all,
>
> as discussed during today's l3-meeting, we keep on working on BGPVPN
> service plugin implementation [1].
> MPLS encapsulation is now supported in OVS [2], so we would like to
> submit a design to leverage OVS capabilities. A first design proposal,
> based on the l3agent, can be found here:
>
>
> https://docs.google.com/drawings/d/1NN4tDgnZlBRr8ZUf5-6zzUcnDOUkWSnSiPm8LuuAkoQ/edit
>
> This solution is based on bagpipe [3] and its capacity to manipulate
> OVS based on advertised and learned routes.
>
> [1]https://blueprints.launchpad.net/neutron/+spec/neutron-bgp-vpn
> [2]https://raw.githubusercontent.com/openvswitch/ovs/master/FAQ
> [3]https://github.com/Orange-OpenSource/bagpipe-bgp
>
>
> Thanks
>
> Mathieu
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev