Re: [Openstack-operators] Lets talk capacity monitoring

2015-01-15 Thread Tim Bell
One good topic to try to pin down at the Ops meet up would be how we could do 
the flavour/aggregate/project/hypervisor mappings. We’ve got a local patch for 
some of this functionality, but it was not possible to get agreement upstream on 
the right way to do it 
(https://blueprints.launchpad.net/nova/+spec/multi-tenancy-isolation-only-aggregates).

We do a fair amount of ‘just how many of flavour X could I accept’ but when we 
add in the various combinations of availability zones and cells, it can be easy 
to run out in one corner of the cloud. There is also the tetris problem where 
one hypervisor has some memory left over, another one has some CPU but you 
can’t combine them (we’d need some hardware support to get that feature 
working…)
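
As a starting point for the simpler ‘how many of flavour X’ question, below is a 
rough sketch against the hypervisor statistics using python-novaclient. The 
credentials, the example flavour dimensions and the single-pool assumption are 
all placeholders, and it ignores aggregates, AZs, cells and overcommit ratios, 
which is exactly the hard part:

    # Rough estimate of how many more instances of one flavour would fit,
    # per hypervisor and in total. Ignores aggregates/AZs/cells and
    # overcommit ratios.
    from novaclient import client

    # Credentials/endpoint are placeholders.
    nova = client.Client(2, 'admin', 'secret', 'admin',
                         'http://keystone.example.com:5000/v2.0')

    flavour = {'vcpus': 4, 'ram_mb': 8192, 'disk_gb': 80}   # example flavour

    total = 0
    for hv in nova.hypervisors.list():
        fit = min((hv.vcpus - hv.vcpus_used) // flavour['vcpus'],
                  hv.free_ram_mb // flavour['ram_mb'],
                  hv.free_disk_gb // flavour['disk_gb'])
        fit = max(fit, 0)
        total += fit
        print("%-40s %d" % (hv.hypervisor_hostname, fit))
    print("total (single-pool estimate): %d" % total)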

Tim

From: matt [mailto:m...@nycresistor.com]
Sent: 16 January 2015 00:30
To: George Shuklin
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] Lets talk capacity monitoring

I've found histograms to be pretty useful in figuring out patterns during 
sizable time deltas... and anomaly detection there can highlight stuff you 
might want to check out ( ie raise the alert condition on that device ).

Here's an example of a histogram I did many many moons ago to track disk sizes from our 
nagios plugin that did dynamic disk free analytics.  I don't have any of the 
animated GIFs I made that showed fluctuations over days... but that was great 
from a human visual sense.

I suppose this could be further automated and refined; I haven't been focused 
on this area anymore, though.

-Matt

On Thu, Jan 15, 2015 at 3:08 PM, George Shuklin 
<george.shuk...@gmail.com> wrote:
On 01/15/2015 06:43 PM, Jesse Keating wrote:
We have a need to better manage the various openstack capacities across our 
numerous clouds. We want to be able to detect when capacity of one system or 
another is approaching the point where it would be a good idea to arrange to 
increase that capacity. Be it volume space, VCPU capability, object storage 
space, etc...

What systems are you folks using to monitor and react to such things?

In our case we are using standard metrics (ganglia) and monitoring (shinken). I 
have thoughts about 'capacity planning', but the problem is that you cannot 
separate payload from wasted resources. For example, when a snapshot is created, 
it eats space on the compute node (for some configurations) beyond the flavor 
limits. If an instance boots, _base is used too (and if an instance is booting 
from a big snapshot, it uses more space in _base than in /instances). CPU can be 
heavily used by many host-internal processes, and memory is shared with 
management software (which can be greedy too). IO can be overspent on 
snapshots/booting.

So we are using cumulative graphs for free space, cpu usage, memory usage. It 
does not cover flavor/aggregate/pinning-to-host-by-metadata cases, but overall 
gives some feeling about available free resources.


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Sam Morrison
We’ve had a lot of issues with Icehouse related to rabbitMQ. Basically the 
change from openstack.rpc to oslo.messaging broke things. These things are now 
fixed in oslo.messaging version 1.5.1; there is still an issue with heartbeats, 
and that patch is making its way through the review process now.

https://review.openstack.org/#/c/146047/ 


Cheers,
Sam


> On 16 Jan 2015, at 10:55 am, sridhar basam  wrote:
> 
> 
> If you are using ha queues, use a version of rabbitmq > 3.3.0. There was a 
> change in that version where consumption on queues was automatically enabled 
> when a master election for a queue happened. Previous versions only informed 
> clients that they had to reconsume on a queue. It was the clients' 
> responsibility to start consumption on a queue.
> 
> Make sure you enable tcp keepalives to a low enough value in case you have a 
> firewall device in between your rabbit server and its consumers.
> 
> Monitor consumers on your rabbit infrastructure using 'rabbitmqctl 
> list_queues name messages consumers'. Consumers on fanout queues is going to 
> depend on the number of services of any type you have in your environment.
> 
> Sri
> On Jan 15, 2015 6:27 PM, "Michael Dorman"  wrote:
> Here is the bug I’ve been tracking related to this for a while.  I haven’t 
> really kept up to speed with it, so I don’t know the current status.
> 
> https://bugs.launchpad.net/nova/+bug/856764 
> 
> 
> 
> From: Kris Lindgren <klindg...@godaddy.com>
> Date: Thursday, January 15, 2015 at 12:10 PM
> To: Gustavo Randich <gustavo.rand...@gmail.com>, OpenStack Operators 
> <openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
> connectivity
> 
> During the Atlanta ops meeting this topic came up and I specifically 
> mentioned adding a "no-op" or healthcheck ping to the rabbitmq stuff to 
> both nova & neutron.  The devs in the room looked at me like I was crazy, 
> but it was so that we could catch exactly the issues you described.  I am also 
> interested if anyone knows of a lightweight call that could be used to 
> verify/confirm rabbitmq connectivity as well.  I haven't been able to devote 
> time to dig into it.  Mainly because if one client is having issues - you 
> will notice other clients are having similar/silent errors and a restart of 
> all the things is the easiest way to fix, for us at least.
> 
>  
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy, LLC.
> 
> 
> From: Gustavo Randich <gustavo.rand...@gmail.com>
> Date: Thursday, January 15, 2015 at 11:53 AM
> To: "openstack-operators@lists.openstack.org" 
> <openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
> connectivity
> 
> Just to add one more background scenario, we also had similar problems trying 
> to load balance rabbitmq via F5 Big IP LTM. For that reason we don't use it 
> now. Our installation is a single rabbitmq instance and no intermediaries 
> (albeit network switches). We use Folsom and Icehouse, the problem being 
> perceived more in Icehouse nodes.
> 
> We are already monitoring message queue size, but we would like to pinpoint 
> in semi-realtime the specific hosts/racks/network paths experiencing the 
> "stale connection" before a user complains about an operation being stuck, or 
> even hosts with no such pending operations but already "disconnected" -- we 
> also could diagnose possible network causes and avoid massive service 
> restarting.
> 
> So, for now, if someone knows about a cheap and quick openstack operation 
> that triggers a message interchange between rabbitmq and nova-compute and a 
> way of checking the result it would be great.
> 
> 
> 
> 
> On Thu, Jan 15, 2015 at 1:45 PM, Kris G. Lindgren <klindg...@godaddy.com> wrote:
> We did have an issue using celery  on an internal application that we wrote - 
> but I believe it was fixed after much failover testing and code changes.  We 
> also use logstash via rabbitmq and haven't noticed any issues there either.
> 
> So this seems to be just openstack/oslo related.
> 
> We have tried a number of different configurations - all of them had their 
> issues.  We started out listing all the members in the cluster on the 
> rabbit_hosts line.  This worked most of the time without issue, until we 
> would restart one of the servers, then it seemed like the clients wouldn't 
> figure out they were disconnected and reconnect to the next host.  
> 
> In an attempt to solve that we moved to using haproxy to present a vip that 
> we configured in the rabbit_hosts line.  This created issues with long-lived 
> connection disconnects and a bunch of other issues.  In our production 
> envi

Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread sridhar basam
If you are using ha queues, use a version of rabbitmq > 3.3.0. There was a
change in that version where consumption on queues was automatically
enabled when a master election for a queue happened. Previous versions only
informed clients that they had to reconsume on a queue. It was the clients'
responsibility to start consumption on a queue.

Make sure you enable tcp keepalives to a low enough value in case you have
a firewall device in between your rabbit server and its consumers.
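
For reference, "low enough" usually means well below the firewall's idle 
timeout; on Linux the system-wide knobs are the net.ipv4.tcp_keepalive_* 
sysctls. A small Python sketch of the equivalent per-socket options, just to 
show which knobs are involved (the hostname and the timer values are examples, 
not recommendations):

    import socket

    # Illustration only: enable keepalives and start probing after 30s idle,
    # every 10s, giving up after 5 failed probes.
    # TCP_KEEPIDLE/KEEPINTVL/KEEPCNT are Linux-specific socket options.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    sock.connect(('rabbit.example.com', 5672))   # hostname is a placeholder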

Monitor consumers on your rabbit infrastructure using 'rabbitmqctl
list_queues name messages consumers'. Consumers on fanout queues is going
to depend on the number of services of any type you have in your
environment.
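
A minimal monitoring wrapper around that command might look like the sketch 
below; the 100-message threshold and the '_fanout_' queue-name check are 
assumptions to tune for your own deployment:

    import subprocess

    # Flag queues with no consumers or a growing backlog, based on
    # 'rabbitmqctl list_queues name messages consumers'.
    out = subprocess.check_output(
        ['rabbitmqctl', 'list_queues', 'name', 'messages', 'consumers']).decode()

    for line in out.splitlines():
        parts = line.split()
        if len(parts) != 3 or not parts[1].isdigit():
            continue   # skip the "Listing queues ..." banner lines
        name, messages, consumers = parts[0], int(parts[1]), int(parts[2])
        if consumers == 0 and '_fanout_' not in name:
            print('WARNING: %s has no consumers' % name)
        if messages > 100:
            print('WARNING: %s has %d messages queued' % (name, messages))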

Sri
 On Jan 15, 2015 6:27 PM, "Michael Dorman"  wrote:

>   Here is the bug I’ve been tracking related to this for a while.  I
> haven’t really kept up to speed with it, so I don’t know the current status.
>
>  https://bugs.launchpad.net/nova/+bug/856764
>
>
>   From: Kris Lindgren 
> Date: Thursday, January 15, 2015 at 12:10 PM
> To: Gustavo Randich , OpenStack Operators <
> openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq
> connectivity
>
>   During the Atlanta ops meeting this topic came up and I specifically
> mentioned adding a "no-op" or healthcheck ping to the rabbitmq stuff
> to both nova & neutron.  The devs in the room looked at me like I was
> crazy, but it was so that we could catch exactly the issues you described.
> I am also interested if anyone knows of a lightweight call that could be
> used to verify/confirm rabbitmq connectivity as well.  I haven't been able
> to devote time to dig into it.  Mainly because if one client is having
> issues - you will notice other clients are having similar/silent errors and
> a restart of all the things is the easiest way to fix, for us at least.
>  
>
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy, LLC.
>
>
>   From: Gustavo Randich 
> Date: Thursday, January 15, 2015 at 11:53 AM
> To: "openstack-operators@lists.openstack.org" <
> openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq
> connectivity
>
>Just to add one more background scenario, we also had similar problems
> trying to load balance rabbitmq via F5 Big IP LTM. For that reason we don't
> use it now. Our installation is a single rabbitmq instance and no
> intermediaries (albeit network switches). We use Folsom and Icehouse, the
> problem being perceived more in Icehouse nodes.
>
>  We are already monitoring message queue size, but we would like to
> pinpoint in semi-realtime the specific hosts/racks/network paths
> experiencing the "stale connection" before a user complains about an
> operation being stuck, or even hosts with no such pending operations but
> already "disconnected" -- we also could diagnose possible network causes
> and avoid massive service restarting.
>
>  So, for now, if someone knows about a cheap and quick openstack
> operation that triggers a message interchange between rabbitmq and
> nova-compute and a way of checking the result it would be great.
>
>
>
>
> On Thu, Jan 15, 2015 at 1:45 PM, Kris G. Lindgren 
> wrote:
>
>>  We did have an issue using celery  on an internal application that we
>> wrote - but I believe it was fixed after much failover testing and code
>> changes.  We also use logstash via rabbitmq and haven't noticed any issues
>> there either.
>>
>>  So this seems to be just openstack/oslo related.
>>
>>  We have tried a number of different configurations - all of them had
>> their issues.  We started out listing all the members in the cluster on the
>> rabbit_hosts line.  This worked most of the time without issue, until we
>> would restart one of the servers, then it seemed like the clients wouldn't
>> figure out they were disconnected and reconnect to the next host.
>>
>>  In an attempt to solve that we moved to using haproxy to present a vip
>> that we configured in the rabbit_hosts line.  This created issues with
>> long-lived connection disconnects and a bunch of other issues.  In our
>> production environment we moved to load balanced rabbitmq, but using a real
>> loadbalancer, and don’t have the weird disconnect issues.  However, anytime
>> we reboot/take down a rabbitmq host or pull a member from the cluster we
>> have issues, or if there is a network disruption we also have issues.
>>
>>  Thinking the best course of action is to move rabbitmq off on to its
>> own box and to leave it alone.
>>
>>  Does anyone have a rabbitmq setup that works well and doesn’t have
>> random issues when pulling nodes for maintenance?
>>   
>>
>> Kris Lindgren
>> Senior Linux Systems Engineer
>> GoDaddy, LLC.
>>
>>
>>   From: Joe Topjian 
>> Date: Thursday, January 15, 2015 at 9:29 AM
>> To: "Kris G. Lindgren" 
>> Cc: "openstack-operators@lists.op

Re: [Openstack-operators] Lets talk capacity monitoring

2015-01-15 Thread matt
I've found histograms to be pretty useful in figuring out patterns during
sizable time deltas... and anomaly detection there can highlight stuff you
might want to check out ( ie raise the alert condition on that device ).

Here's an example of a histogram I did many many moons ago to track disk sizes from
our nagios plugin that did dynamic disk free analytics.  I don't have any
of the animated GIFs I made that showed fluctuations over days... but that
was great from a human visual sense.

I suppose this could be further automated and refined; I haven't been
focused on this area anymore, though.
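
For anyone wanting to reproduce something similar, here is a small sketch with 
matplotlib, assuming you already have one disk-usage percentage per host pulled 
from your metrics system (the example data and file name are placeholders, and 
the collection side is left out):

    import matplotlib
    matplotlib.use('Agg')          # render to a file, no display needed
    import matplotlib.pyplot as plt

    # One disk-used percentage per host, e.g. scraped from nagios/ganglia.
    disk_used_pct = [12, 35, 38, 41, 44, 52, 57, 61, 63, 88, 91, 97]

    plt.hist(disk_used_pct, bins=range(0, 101, 10), edgecolor='black')
    plt.xlabel('disk used (%)')
    plt.ylabel('number of hosts')
    plt.title('Disk usage distribution across hosts')
    plt.savefig('disk_histogram.png')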

-Matt

On Thu, Jan 15, 2015 at 3:08 PM, George Shuklin 
wrote:

> On 01/15/2015 06:43 PM, Jesse Keating wrote:
>
>> We have a need to better manage the various openstack capacities across
>> our numerous clouds. We want to be able to detect when capacity of one
>> system or another is approaching the point where it would be a good idea to
>> arrange to increase that capacity. Be it volume space, VCPU capability,
>> object storage space, etc...
>>
>> What systems are you folks using to monitor and react to such things?
>>
>>
> In our case we are using standard metrics (ganglia) and monitoring
> (shinken). I have thoughts about 'capacity planning', but the problem is
> that you cannot separate payload from wasted resources. For example, when a
> snapshot is created, it eats space on the compute node (for some
> configurations) beyond the flavor limits. If an instance boots, _base is used
> too (and if an instance is booting from a big snapshot, it uses more space in
> _base than in /instances). CPU can be heavily used by many host-internal
> processes, and memory is shared with management software (which can be greedy
> too). IO can be overspent on snapshots/booting.
>
> So we are using cumulative graphs for free space, cpu usage, memory usage.
> It does not cover flavor/aggregate/pinning-to-host-by-metadata cases, but
> overall gives some feeling about available free resources.
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Michael Dorman
Here is the bug I’ve been tracking related to this for a while.  I haven’t 
really kept up to speed with it, so I don’t know the current status.

https://bugs.launchpad.net/nova/+bug/856764


From: Kris Lindgren <klindg...@godaddy.com>
Date: Thursday, January 15, 2015 at 12:10 PM
To: Gustavo Randich <gustavo.rand...@gmail.com>, OpenStack 
Operators <openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
connectivity

During the Atlanta ops meeting this topic came up and I specifically mentioned 
adding a "no-op" or healthcheck ping to the rabbitmq stuff to both nova & 
neutron.  The devs in the room looked at me like I was crazy, but it was so 
that we could catch exactly the issues you described.  I am also interested if 
anyone knows of a lightweight call that could be used to verify/confirm 
rabbitmq connectivity as well.  I haven't been able to devote time to dig into 
it.  Mainly because if one client is having issues - you will notice other 
clients are having similar/silent errors and a restart of all the things is the 
easiest way to fix, for us at least.


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.


From: Gustavo Randich <gustavo.rand...@gmail.com>
Date: Thursday, January 15, 2015 at 11:53 AM
To: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
connectivity

Just to add one more background scenario, we also had similar problems trying 
to load balance rabbitmq via F5 Big IP LTM. For that reason we don't use it 
now. Our installation is a single rabbitmq instance and no intermediaries 
(albeit network switches). We use Folsom and Icehouse, the problem being 
perceived more in Icehouse nodes.

We are already monitoring message queue size, but we would like to pinpoint in 
semi-realtime the specific hosts/racks/network paths experiencing the "stale 
connection" before a user complains about an operation being stuck, or even 
hosts with no such pending operations but already "disconnected" -- we also 
could diagnose possible network causes and avoid massive service restarting.

So, for now, if someone knows about a cheap and quick openstack operation that 
triggers a message interchange between rabbitmq and nova-compute and a way of 
checking the result it would be great.
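
Not an official health check, but one cheap operation that round-trips through 
the API, rabbit and the nova-compute on a specific host is fetching a single 
line of console output for an instance on that host. Below is a rough probe 
sketch with python-novaclient; the credentials are placeholders, it needs an 
admin account, and a truly hung AMQP connection may simply block, so you would 
still want an external timeout around each call:

    from novaclient import client

    # Credentials/endpoint are placeholders.
    nova = client.Client(2, 'admin', 'secret', 'admin',
                         'http://keystone.example.com:5000/v2.0')

    # Pick one instance per compute host; asking its host for a single line
    # of console log travels API -> rabbit -> nova-compute and back.
    probes = {}
    for server in nova.servers.list(search_opts={'all_tenants': 1}):
        host = getattr(server, 'OS-EXT-SRV-ATTR:host', None)
        if host and host not in probes:
            probes[host] = server

    for host, server in sorted(probes.items()):
        try:
            nova.servers.get_console_output(server, length=1)
            print('%s: ok' % host)
        except Exception as exc:
            print('%s: FAILED (%s)' % (host, exc))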




On Thu, Jan 15, 2015 at 1:45 PM, Kris G. Lindgren 
<klindg...@godaddy.com> wrote:
We did have an issue using celery  on an internal application that we wrote - 
but I believe it was fixed after much failover testing and code changes.  We 
also use logstash via rabbitmq and haven't noticed any issues there either.

So this seems to be just openstack/oslo related.

We have tried a number of different configurations - all of them had their 
issues.  We started out listing all the members in the cluster on the 
rabbit_hosts line.  This worked most of the time without issue, until we would 
restart one of the servers, then it seemed like the clients wouldn't figure out 
they were disconnected and reconnect to the next host.

In an attempt to solve that we moved to using haproxy to present a vip that we 
configured in the rabbit_hosts line.  This created issues with long-lived 
connection disconnects and a bunch of other issues.  In our production 
environment we moved to load balanced rabbitmq, but using a real loadbalancer, 
and don’t have the weird disconnect issues.  However, anytime we reboot/take 
down a rabbitmq host or pull a member from the cluster we have issues, or if 
there is a network disruption we also have issues.

Thinking the best course of action is to move rabbitmq off on to its own box 
and to leave it alone.

Does anyone have a rabbitmq setup that works well and doesn’t have random 
issues when pulling nodes for maintenance?


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.


From: Joe Topjian <j...@topjian.net>
Date: Thursday, January 15, 2015 at 9:29 AM
To: "Kris G. Lindgren" <klindg...@godaddy.com>
Cc: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
connectivity

Hi Kris,

 Our experience is pretty much the same on anything that is using rabbitmq - 
not just nova-compute.

Just to clarify: have you experienced this outside of OpenStack (or Oslo)?

We've seen similar issues with rabbitmq and OpenStack. We used to run rabbit 
through haproxy and tried a myriad of options like setting no timeouts, very 
very long timeouts, etc, but would always eventually see similar issues as 
described.

Last month, we reconfigured all OpenStack components to use the `rabbit_hosts` 

Re: [Openstack-operators] Small openstack (part 2), distributed glance

2015-01-15 Thread George Shuklin
We are not using centralized storage (all instances run on local 
drives), and I just can't express my happiness about this. Every time 
monitoring sends me '** PROBLEM ALERT bla-bla-bla', I know it's not a big 
deal. Just one server.


I do not want to turn gray prematurely. Just a quick glance at 
https://www.google.com/search?q=ceph+crash+corruption gives me a strong 
feeling that I don't want to centralize points of failure.


Btw: if I sell the nodes designated for Ceph as normal compute nodes, it 
will be more effective than selling only space from them (and buying more 
compute nodes for actual work).


On 01/16/2015 12:31 AM, Abel Lopez wrote:
That specific bottleneck can be solved by running glance on ceph, and 
running ephemeral instances also on ceph. Snapshots are a quick 
backend operation then. But you've built your installation on a house 
of cards.


On Thursday, January 15, 2015, George Shuklin 
<george.shuk...@gmail.com> wrote:


Hello everyone.

One more thing in the light of small openstack.

I really dislike the triple network load caused by current glance
snapshot operations. When a compute node does a snapshot, it works with
the files locally, then it sends them to glance-api, and (if the glance
API is linked to swift), glance sends them to swift. Basically,
for each 100GB disk there is 300GB of network operations. It is
especially painful for glance-api, which needs more CPU and
network bandwidth than we want to spend on it.

So idea: put glance-api on each compute node without cache.

To help each compute node go to the proper glance, the endpoint points
to an fqdn, and on each compute node that fqdn resolves to localhost
(where a glance-api lives). Plus a normal glance-api on the
API/controller node to serve dashboard/API clients.

I didn't test it yet.

Any ideas on possible problems/bottlenecks? And how many
glance-registry I need for this?

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Lets talk capacity monitoring

2015-01-15 Thread George Shuklin

On 01/15/2015 06:43 PM, Jesse Keating wrote:
We have a need to better manage the various openstack capacities 
across our numerous clouds. We want to be able to detect when capacity 
of one system or another is approaching the point where it would be a 
good idea to arrange to increase that capacity. Be it volume space, 
VCPU capability, object storage space, etc...


What systems are you folks using to monitor and react to such things?



In our case we are using standard metrics (ganglia) and monitoring 
(shinken). I have thoughts about 'capacity planning', but the problem is 
that you cannot separate payload from wasted resources. For example, 
when a snapshot is created, it eats space on the compute node (for some 
configurations) beyond the flavor limits. If an instance boots, _base is 
used too (and if an instance is booting from a big snapshot, it uses more 
space in _base than in /instances). CPU can be heavily used by many 
host-internal processes, and memory is shared with management software 
(which can be greedy too). IO can be overspent on snapshots/booting.


So we are using cumulative graphs for free space, cpu usage, memory 
usage. It does not cover flavor/aggregate/pinning-to-host-by-metadata 
cases, but overall gives some feeling about available free resources.
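
If anyone wants cloud-wide numbers straight from nova rather than ganglia, the 
aggregate hypervisor statistics give a similar rough feeling. A small sketch 
with python-novaclient (credentials are placeholders, and it ignores overcommit 
ratios and the flavor/aggregate issues mentioned above):

    from novaclient import client

    # Credentials/endpoint are placeholders.
    nova = client.Client(2, 'admin', 'secret', 'admin',
                         'http://keystone.example.com:5000/v2.0')

    stats = nova.hypervisors.statistics()   # same data as `nova hypervisor-stats`
    print('vcpus:  %d used / %d total' % (stats.vcpus_used, stats.vcpus))
    print('memory: %d MB used / %d MB total' % (stats.memory_mb_used, stats.memory_mb))
    print('disk:   %d GB free' % stats.free_disk_gb)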


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Small openstack (part 2), distributed glance

2015-01-15 Thread Abel Lopez
That specific bottleneck can be solved by running glance on ceph, and
running ephemeral instances also on ceph. Snapshots are a quick backend
operation then. But you've built your installation on a house of cards.

On Thursday, January 15, 2015, George Shuklin 
wrote:

> Hello everyone.
>
> One more thing in the light of small openstack.
>
> I really dislike the triple network load caused by current glance snapshot
> operations. When a compute node does a snapshot, it works with the files
> locally, then it sends them to glance-api, and (if the glance API is linked
> to swift), glance sends them to swift. Basically, for each 100GB disk there
> is 300GB of network operations. It is especially painful for glance-api,
> which needs more CPU and network bandwidth than we want to spend on it.
>
> So idea: put glance-api on each compute node without cache.
>
> To help each compute node go to the proper glance, the endpoint points to an
> fqdn, and on each compute node that fqdn resolves to localhost (where a
> glance-api lives). Plus a normal glance-api on the API/controller node to
> serve dashboard/API clients.
>
> I didn't test it yet.
>
> Any ideas on possible problems/bottlenecks? And how many glance-registry I
> need for this?
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Small openstack (part 2), distributed glance

2015-01-15 Thread George Shuklin

Hello everyone.

One more thing in the light of small openstack.

I really dislike the triple network load caused by current glance snapshot 
operations. When a compute node does a snapshot, it works with the files 
locally, then it sends them to glance-api, and (if the glance API is linked 
to swift), glance sends them to swift. Basically, for each 100GB disk there 
is 300GB of network operations. It is especially painful for glance-api, 
which needs more CPU and network bandwidth than we want to spend 
on it.


So idea: put glance-api on each compute node without cache.

To help each compute node go to the proper glance, the endpoint points to an 
fqdn, and on each compute node that fqdn resolves to localhost (where a 
glance-api lives). Plus a normal glance-api on the API/controller node to 
serve dashboard/API clients.


I didn't test it yet.

Any ideas on possible problems/bottlenecks? And how many glance-registry 
I need for this?


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] The state of nova-network to neutron migration

2015-01-15 Thread Kyle Mestery
On Thu, Jan 15, 2015 at 2:19 PM, Kris G. Lindgren 
wrote:

>   Is the fact that neutron security groups don’t provide the same level
> of isolation as nova security groups on you guys' radar?
>
>  Specifically talking about:
> https://bugs.launchpad.net/neutron/+bug/1274034
>
> That bug is actually tracked by a BP as well which is approved and marked
as Kilo-2 [1].



>  I am sure there are a few other things that nova is doing that neutron
> is currently not.
>

That may be true. The plan which was agreed to and executed on by the TC
during Juno is here [2]. If there are other features people want from
nova-network in neutron, I encourage them to file bugs or specs to track
them.

[1] https://blueprints.launchpad.net/neutron/+spec/arp-spoof-patch-ebtables
[2]
https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee/Neutron_Gap_Coverage


>
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy, LLC.
>
>   From: matt 
> Date: Thursday, January 15, 2015 at 1:10 PM
> To: Anita Kuno 
> Cc: Angus Lees , OpenStack Development Mailing List <
> openstack-...@lists.openstack.org>, "
> openstack-operators@lists.openstack.org" <
> openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] The state of nova-network to neutron
> migration
>
>   Will do.
>
> On Thu, Jan 15, 2015 at 12:08 PM, Anita Kuno  wrote:
>
>> On 01/14/2015 01:06 PM, matt wrote:
>> > Hey Mike!
>> >
>> > Thanks for this info.  Super helpful to me at least.  I am very
>> interested
>> > in hearing more about nova-network to neutron migrations.
>> >
>> > -Matt
>> >
>> Hello Matt:
>>
>> Please start attending the weekly meetings:
>> https://wiki.openstack.org/wiki/Meetings/Nova-nettoNeutronMigration
>> follow the logs from it:
>> http://eavesdrop.openstack.org/meetings/nova_net_to_neutron_migration/
>> as well as the logs for the neutron weekly meeting:
>> http://eavesdrop.openstack.org/meetings/networking/
>> and the nova weekly meeting:
>> http://eavesdrop.openstack.org/meetings/nova/
>>
>> After you have had a chance to update yourself, please ask any questions
>> either at one of the above meetings or do email me.
>>
>> Thank you Matt,
>> Anita.
>>
>> > On Tue, Jan 13, 2015 at 1:53 PM, Michael Still 
>> wrote:
>> >
>> >> Hi, I just wanted to make sure people know that a small group of us
>> >> got together in a hallway at linux.conf.au 2015 to talk about this. It
>> >> wasn't an attempt to exclude anyone, we just all happened to be in the
>> >> same place at the same time.
>> >>
>> >> To that end, we made some notes from the chat, which are at
>> >> https://etherpad.openstack.org/p/nova-network-migration-lca2015 .
>> >> Specific points to note are that Angus Lees has volunteered to help
>> >> Oleg with the spec for this, and that we'd very much like to see a
>> >> discussion of this at the Nova midcycle meetup in a couple of weeks.
>> >>
>> >> I'd also like to call out that there's a link to the mailing list
>> >> thread where we discussed CERN's concerns in the etherpad for
>> >> reference as well.
>> >>
>> >> Cheers,
>> >> Michael
>> >>
>> >> On Sat, Dec 20, 2014 at 3:59 AM, Anita Kuno 
>> wrote:
>> >>> Rather than waste your time making excuses let me state where we are
>> and
>> >>> where I would like to get to, also sharing my thoughts about how you
>> can
>> >>> get involved if you want to see this happen as badly as I have been
>> told
>> >>> you do.
>> >>>
>> >>> Where we are:
>> >>> * a great deal of foundation work has been accomplished to achieve
>> >>> parity with nova-network and neutron to the extent that those involved
>> >>> are ready for migration plans to be formulated and be put in place
>> >>> * a summit session happened with notes and intentions[0]
>> >>> * people took responsibility and promptly got swamped with other
>> >>> responsibilities
>> >>> * spec deadlines arose and in neutron's case have passed
>> >>> * currently a neutron spec [1] is a work in progress (and it needs
>> >>> significant work still) and a nova spec is required and doesn't have a
>> >>> first draft or a champion
>> >>>
>> >>> Where I would like to get to:
>> >>> * I need people in addition to Oleg Bondarev to be available to
>> help
>> >>> come up with ideas and words to describe them to create the specs in a
>> >>> very short amount of time (Oleg is doing great work and is a fabulous
>> >>> person, yay Oleg, he just can't do this alone)
>> >>> * specifically I need a contact on the nova side of this complex
>> >>> problem, similar to Oleg on the neutron side
>> >>> * we need to have a way for people involved with this effort to
>> find
>> >>> each other, talk to each other and track progress
>> >>> * we need to have representation at both nova and neutron weekly
>> >>> meetings to communicate status and needs
>> >>>
>> >>> We are at K-2 and our current status is insufficient to expect this
>> work
>> >>> will be accomplished by the end of K-3. I wi

Re: [Openstack-operators] The state of nova-network to neutron migration

2015-01-15 Thread Anita Kuno
On 01/16/2015 09:19 AM, Kris G. Lindgren wrote:
> Is the fact that neutron security groups don’t provide the same level of 
> isolation as nova security groups on you guys' radar?
> 
> Specifically talking about:  https://bugs.launchpad.net/neutron/+bug/1274034
> 
> I am sure there are a few other things that nova is doing that neutron is 
> currently not.
> 
> 
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy, LLC.
Can I convince you to attend the weekly nova-net to neutron migration
meeting and ensure that it is?

Thanks Kris,
Anita.
> 
> From: matt <m...@nycresistor.com>
> Date: Thursday, January 15, 2015 at 1:10 PM
> To: Anita Kuno <ante...@anteaya.info>
> Cc: Angus Lees <gusl...@gmail.com>, OpenStack 
> Development Mailing List <openstack-...@lists.openstack.org>, 
> "openstack-operators@lists.openstack.org" 
> <openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] The state of nova-network to neutron 
> migration
> 
> Will do.
> 
> On Thu, Jan 15, 2015 at 12:08 PM, Anita Kuno 
> <ante...@anteaya.info> wrote:
> On 01/14/2015 01:06 PM, matt wrote:
>> Hey Mike!
>>
>> Thanks for this info.  Super helpful to me at least.  I am very interested
>> in hearing more about nova-network to neutron migrations.
>>
>> -Matt
>>
> Hello Matt:
> 
> Please start attending the weekly meetings:
> https://wiki.openstack.org/wiki/Meetings/Nova-nettoNeutronMigration
> follow the logs from it:
> http://eavesdrop.openstack.org/meetings/nova_net_to_neutron_migration/
> as well as the logs for the neutron weekly meeting:
> http://eavesdrop.openstack.org/meetings/networking/
> and the nova weekly meeting:
> http://eavesdrop.openstack.org/meetings/nova/
> 
> After you have had a chance to update yourself, please ask any questions
> either at one of the above meetings or do email me.
> 
> Thank you Matt,
> Anita.
> 
>> On Tue, Jan 13, 2015 at 1:53 PM, Michael Still 
>> <mi...@stillhq.com> wrote:
>>
>>> Hi, I just wanted to make sure people know that a small group of us
>>> got together in a hallway at linux.conf.au 2015 to 
>>> talk about this. It
>>> wasn't an attempt to exclude anyone, we just all happened to be in the
>>> same place at the same time.
>>>
>>> To that end, we made some notes from the chat, which are at
>>> https://etherpad.openstack.org/p/nova-network-migration-lca2015 .
>>> Specific points to note are that Angus Lees has volunteered to help
>>> Oleg with the spec for this, and that we'd very much like to see a
>>> discussion of this at the Nova midcycle meetup in a couple of weeks.
>>>
>>> I'd also like to call out that there's a link to the mailing list
>>> thread where we discussed CERN's concerns in the etherpad for
>>> reference as well.
>>>
>>> Cheers,
>>> Michael
>>>
>>> On Sat, Dec 20, 2014 at 3:59 AM, Anita Kuno 
>>> <ante...@anteaya.info> wrote:
 Rather than waste your time making excuses let me state where we are and
 where I would like to get to, also sharing my thoughts about how you can
 get involved if you want to see this happen as badly as I have been told
 you do.

 Where we are:
 * a great deal of foundation work has been accomplished to achieve
 parity with nova-network and neutron to the extent that those involved
 are ready for migration plans to be formulated and be put in place
 * a summit session happened with notes and intentions[0]
 * people took responsibility and promptly got swamped with other
 responsibilities
 * spec deadlines arose and in neutron's case have passed
 * currently a neutron spec [1] is a work in progress (and it needs
 significant work still) and a nova spec is required and doesn't have a
 first draft or a champion

 Where I would like to get to:
 * I need people in addition to Oleg Bondarev to be available to help
 come up with ideas and words to describe them to create the specs in a
 very short amount of time (Oleg is doing great work and is a fabulous
 person, yay Oleg, he just can't do this alone)
 * specifically I need a contact on the nova side of this complex
 problem, similar to Oleg on the neutron side
 * we need to have a way for people involved with this effort to find
 each other, talk to each other and track progress
 * we need to have representation at both nova and neutron weekly
 meetings to communicate status and needs

 We are at K-2 and our current status is insufficient to expect this work
 will be accomplished by the end of K-3. I will be championing this work,
 in whatever state, so at least it doesn't fall off the map. If you would
 like to help this effort please get in contact. I will be thinking of
 ways to further this work and will be communicating to those who

Re: [Openstack-operators] [ha-guide] HA Guide update next steps

2015-01-15 Thread Sriram Subramanian
Matt, can u please send the link for the wiki page?


On Thu, Jan 15, 2015 at 7:17 AM, Matt Griffin 
wrote:

> Just a reminder that we're going to meet today (and every Thursday) from
> 3:00-3:30pm US Central.
> Like last time, let's chat in #openstack-haguide on freenode.
>
> A bit later today (before our meeting), I'll review the wiki so we can
> charge ahead as soon as possible on updating areas.
>
> Best,
> Matt
>
>
> ---
> Matt Griffin
> Director of Product Management
> Percona
> m: 1-214-727-4100
> skype: thebear78
>
>
> On Sat, Jan 10, 2015 at 12:00 PM, Matt Griffin 
> wrote:
>
>> Thanks Sriram and thanks for everyone's participation in the poll.
>> I picked Thursday from 3:00 PM - 3:30 PM US Central Time for the
>> OpenStack HA Guide Regular Meeting.
>> We'll start this Thursday, January 15, 2015.
>>
>> I'll do some cleanup to the pad [1] before our first meeting so hopefully
>> we can quickly arrive at owners and move forward.
>>
>> Best,
>> Matt
>>
>> [1] https://etherpad.openstack.org/p/openstack-haguide-update
>>
>>
>>
>> On Fri, Jan 9, 2015 at 12:30 AM, Sriram Subramanian <
>> sri...@sriramhere.com> wrote:
>>
>>> Dear Docs,
>>>
>>> I noted some discussions [1] on the operators mailing list about HA
>>> guide meetings. I also added my availability in the Doodle poll[2]. Thanks
>>> for starting this Matt.
>>>
>>> Since many may not be on the Docs list, I am resurfacing it here.
>>>
>>> Thanks,
>>> -Sriram
>>>
>>> 1.
>>> http://lists.openstack.org/pipermail/openstack-operators/2015-January/005810.html
>>> 2. https://doodle.com/4r9i2m7tyrz3aayv
>>>
>>
>>
>


-- 
Thanks,
-Sriram
425-610-8465
www.sriramhere.com | www.clouddon.com
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] The state of nova-network to neutron migration

2015-01-15 Thread Kris G. Lindgren
Is the fact that neutron security groups don’t provide the same level of 
isolation as nova security groups on you guys' radar?

Specifically talking about:  https://bugs.launchpad.net/neutron/+bug/1274034

I am sure there are a few other things that nova is doing that neutron is 
currently not.


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.

From: matt <m...@nycresistor.com>
Date: Thursday, January 15, 2015 at 1:10 PM
To: Anita Kuno <ante...@anteaya.info>
Cc: Angus Lees <gusl...@gmail.com>, OpenStack 
Development Mailing List <openstack-...@lists.openstack.org>, 
"openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] The state of nova-network to neutron 
migration

Will do.

On Thu, Jan 15, 2015 at 12:08 PM, Anita Kuno 
<ante...@anteaya.info> wrote:
On 01/14/2015 01:06 PM, matt wrote:
> Hey Mike!
>
> Thanks for this info.  Super helpful to me at least.  I am very interested
> in hearing more about nova-network to neutron migrations.
>
> -Matt
>
Hello Matt:

Please start attending the weekly meetings:
https://wiki.openstack.org/wiki/Meetings/Nova-nettoNeutronMigration
follow the logs from it:
http://eavesdrop.openstack.org/meetings/nova_net_to_neutron_migration/
as well as the logs for the neutron weekly meeting:
http://eavesdrop.openstack.org/meetings/networking/
and the nova weekly meeting:
http://eavesdrop.openstack.org/meetings/nova/

After you have had a chance to update yourself, please ask any questions
either at one of the above meetings or do email me.

Thank you Matt,
Anita.

> On Tue, Jan 13, 2015 at 1:53 PM, Michael Still 
> <mi...@stillhq.com> wrote:
>
>> Hi, I just wanted to make sure people know that a small group of us
>> got together in a hallway at linux.conf.au 2015 to 
>> talk about this. It
>> wasn't an attempt to exclude anyone, we just all happened to be in the
>> same place at the same time.
>>
>> To that end, we made some notes from the chat, which are at
>> https://etherpad.openstack.org/p/nova-network-migration-lca2015 .
>> Specific points to note are that Angus Lees has volunteered to help
>> Oleg with the spec for this, and that we'd very much like to see a
>> discussion of this at the Nova midcycle meetup in a couple of weeks.
>>
>> I'd also like to call out that there's a link to the mailing list
>> thread where we discussed CERN's concerns in the etherpad for
>> reference as well.
>>
>> Cheers,
>> Michael
>>
>> On Sat, Dec 20, 2014 at 3:59 AM, Anita Kuno 
>> <ante...@anteaya.info> wrote:
>>> Rather than waste your time making excuses let me state where we are and
>>> where I would like to get to, also sharing my thoughts about how you can
>>> get involved if you want to see this happen as badly as I have been told
>>> you do.
>>>
>>> Where we are:
>>> * a great deal of foundation work has been accomplished to achieve
>>> parity with nova-network and neutron to the extent that those involved
>>> are ready for migration plans to be formulated and be put in place
>>> * a summit session happened with notes and intentions[0]
>>> * people took responsibility and promptly got swamped with other
>>> responsibilities
>>> * spec deadlines arose and in neutron's case have passed
>>> * currently a neutron spec [1] is a work in progress (and it needs
>>> significant work still) and a nova spec is required and doesn't have a
>>> first draft or a champion
>>>
>>> Where I would like to get to:
>>> * I need people in addition to Oleg Bondarev to be available to help
>>> come up with ideas and words to describe them to create the specs in a
>>> very short amount of time (Oleg is doing great work and is a fabulous
>>> person, yay Oleg, he just can't do this alone)
>>> * specifically I need a contact on the nova side of this complex
>>> problem, similar to Oleg on the neutron side
>>> * we need to have a way for people involved with this effort to find
>>> each other, talk to each other and track progress
>>> * we need to have representation at both nova and neutron weekly
>>> meetings to communicate status and needs
>>>
>>> We are at K-2 and our current status is insufficient to expect this work
>>> will be accomplished by the end of K-3. I will be championing this work,
>>> in whatever state, so at least it doesn't fall off the map. If you would
>>> like to help this effort please get in contact. I will be thinking of
>>> ways to further this work and will be communicating to those who
>>> identify as affected by these decisions in the most effective methods of
>>> which I am capable.
>>>
>>> Thank you to all who have gotten us as far as well have gotten in this
>>> effort, it has been a long haul and you have all done great work. Let's
>>> keep going and finish this.
>>>
>>> Thank you,
>>> Anita.
>>>
>>> [0] ht

Re: [Openstack-operators] The state of nova-network to neutron migration

2015-01-15 Thread matt
Will do.

On Thu, Jan 15, 2015 at 12:08 PM, Anita Kuno  wrote:

> On 01/14/2015 01:06 PM, matt wrote:
> > Hey Mike!
> >
> > Thanks for this info.  Super helpful to me at least.  I am very
> interested
> > in hearing more about nova-network to neutron migrations.
> >
> > -Matt
> >
> Hello Matt:
>
> Please start attending the weekly meetings:
> https://wiki.openstack.org/wiki/Meetings/Nova-nettoNeutronMigration
> follow the logs from it:
> http://eavesdrop.openstack.org/meetings/nova_net_to_neutron_migration/
> as well as the logs for the neutron weekly meeting:
> http://eavesdrop.openstack.org/meetings/networking/
> and the nova weekly meeting:
> http://eavesdrop.openstack.org/meetings/nova/
>
> After you have had a chance to update yourself, please ask any questions
> either at one of the above meetings or do email me.
>
> Thank you Matt,
> Anita.
>
> > On Tue, Jan 13, 2015 at 1:53 PM, Michael Still 
> wrote:
> >
> >> Hi, I just wanted to make sure people know that a small group of us
> >> got together in a hallway at linux.conf.au 2015 to talk about this. It
> >> wasn't an attempt to exclude anyone, we just all happened to be in the
> >> same place at the same time.
> >>
> >> To that end, we made some notes from the chat, which are at
> >> https://etherpad.openstack.org/p/nova-network-migration-lca2015 .
> >> Specific points to note are that Angus Lees has volunteered to help
> >> Oleg with the spec for this, and that we'd very much like to see a
> >> discussion of this at the Nova midcycle meetup in a couple of weeks.
> >>
> >> I'd also like to call out that there's a link to the mailing list
> >> thread where we discussed CERN's concerns in the etherpad for
> >> reference as well.
> >>
> >> Cheers,
> >> Michael
> >>
> >> On Sat, Dec 20, 2014 at 3:59 AM, Anita Kuno 
> wrote:
> >>> Rather than waste your time making excuses let me state where we are
> and
> >>> where I would like to get to, also sharing my thoughts about how you
> can
> >>> get involved if you want to see this happen as badly as I have been
> told
> >>> you do.
> >>>
> >>> Where we are:
> >>> * a great deal of foundation work has been accomplished to achieve
> >>> parity with nova-network and neutron to the extent that those involved
> >>> are ready for migration plans to be formulated and be put in place
> >>> * a summit session happened with notes and intentions[0]
> >>> * people took responsibility and promptly got swamped with other
> >>> responsibilities
> >>> * spec deadlines arose and in neutron's case have passed
> >>> * currently a neutron spec [1] is a work in progress (and it needs
> >>> significant work still) and a nova spec is required and doesn't have a
> >>> first draft or a champion
> >>>
> >>> Where I would like to get to:
> >>> * I need people in addition to Oleg Bondarev to be available to
> help
> >>> come up with ideas and words to describe them to create the specs in a
> >>> very short amount of time (Oleg is doing great work and is a fabulous
> >>> person, yay Oleg, he just can't do this alone)
> >>> * specifically I need a contact on the nova side of this complex
> >>> problem, similar to Oleg on the neutron side
> >>> * we need to have a way for people involved with this effort to
> find
> >>> each other, talk to each other and track progress
> >>> * we need to have representation at both nova and neutron weekly
> >>> meetings to communicate status and needs
> >>>
> >>> We are at K-2 and our current status is insufficient to expect this
> work
> >>> will be accomplished by the end of K-3. I will be championing this
> work,
> >>> in whatever state, so at least it doesn't fall off the map. If you
> would
> >>> like to help this effort please get in contact. I will be thinking of
> >>> ways to further this work and will be communicating to those who
> >>> identify as affected by these decisions in the most effective methods
> of
> >>> which I am capable.
> >>>
> >>> Thank you to all who have gotten us as far as well have gotten in this
> >>> effort, it has been a long haul and you have all done great work. Let's
> >>> keep going and finish this.
> >>>
> >>> Thank you,
> >>> Anita.
> >>>
> >>> [0] https://etherpad.openstack.org/p/kilo-nova-nova-network-to-neutron
> >>> [1] https://review.openstack.org/#/c/142456/
> >>>
> >>> ___
> >>> OpenStack-operators mailing list
> >>> OpenStack-operators@lists.openstack.org
> >>>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >>
> >>
> >>
> >> --
> >> Rackspace Australia
> >>
> >> ___
> >> OpenStack-operators mailing list
> >> OpenStack-operators@lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >>
> >
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/open

Re: [Openstack-operators] The state of nova-network to neutron migration

2015-01-15 Thread Anita Kuno
On 01/14/2015 01:06 PM, matt wrote:
> Hey Mike!
> 
> Thanks for this info.  Super helpful to me at least.  I am very interested
> in hearing more about nova-network to neutron migrations.
> 
> -Matt
> 
Hello Matt:

Please start attending the weekly meetings:
https://wiki.openstack.org/wiki/Meetings/Nova-nettoNeutronMigration
follow the logs from it:
http://eavesdrop.openstack.org/meetings/nova_net_to_neutron_migration/
as well as the logs for the neutron weekly meeting:
http://eavesdrop.openstack.org/meetings/networking/
and the nova weekly meeting:
http://eavesdrop.openstack.org/meetings/nova/

After you have had a chance to update yourself, please ask any questions
either at one of the above meetings or do email me.

Thank you Matt,
Anita.

> On Tue, Jan 13, 2015 at 1:53 PM, Michael Still  wrote:
> 
>> Hi, I just wanted to make sure people know that a small group of us
>> got together in a hallway at linux.conf.au 2015 to talk about this. It
>> wasn't an attempt to exclude anyone, we just all happened to be in the
>> same place at the same time.
>>
>> To that end, we made some notes from the chat, which are at
>> https://etherpad.openstack.org/p/nova-network-migration-lca2015 .
>> Specific points to note are that Angus Lees has volunteered to help
>> Oleg with the spec for this, and that we'd very much like to see a
>> discussion of this at the Nova midcycle meetup in a couple of weeks.
>>
>> I'd also like to call out that there's a link to the mailing list
>> thread where we discussed CERN's concerns in the etherpad for
>> reference as well.
>>
>> Cheers,
>> Michael
>>
>> On Sat, Dec 20, 2014 at 3:59 AM, Anita Kuno  wrote:
>>> Rather than waste your time making excuses let me state where we are and
>>> where I would like to get to, also sharing my thoughts about how you can
>>> get involved if you want to see this happen as badly as I have been told
>>> you do.
>>>
>>> Where we are:
>>> * a great deal of foundation work has been accomplished to achieve
>>> parity with nova-network and neutron to the extent that those involved
>>> are ready for migration plans to be formulated and be put in place
>>> * a summit session happened with notes and intentions[0]
>>> * people took responsibility and promptly got swamped with other
>>> responsibilities
>>> * spec deadlines arose and in neutron's case have passed
>>> * currently a neutron spec [1] is a work in progress (and it needs
>>> significant work still) and a nova spec is required and doesn't have a
>>> first draft or a champion
>>>
>>> Where I would like to get to:
>>> * I need people in addition to Oleg Bondarev to be available to help
>>> come up with ideas and words to describe them to create the specs in a
>>> very short amount of time (Oleg is doing great work and is a fabulous
>>> person, yay Oleg, he just can't do this alone)
>>> * specifically I need a contact on the nova side of this complex
>>> problem, similar to Oleg on the neutron side
>>> * we need to have a way for people involved with this effort to find
>>> each other, talk to each other and track progress
>>> * we need to have representation at both nova and neutron weekly
>>> meetings to communicate status and needs
>>>
>>> We are at K-2 and our current status is insufficient to expect this work
>>> will be accomplished by the end of K-3. I will be championing this work,
>>> in whatever state, so at least it doesn't fall off the map. If you would
>>> like to help this effort please get in contact. I will be thinking of
>>> ways to further this work and will be communicating to those who
>>> identify as affected by these decisions in the most effective methods of
>>> which I am capable.
>>>
>>> Thank you to all who have gotten us as far as well have gotten in this
>>> effort, it has been a long haul and you have all done great work. Let's
>>> keep going and finish this.
>>>
>>> Thank you,
>>> Anita.
>>>
>>> [0] https://etherpad.openstack.org/p/kilo-nova-nova-network-to-neutron
>>> [1] https://review.openstack.org/#/c/142456/
>>>
>>> ___
>>> OpenStack-operators mailing list
>>> OpenStack-operators@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>>
>> --
>> Rackspace Australia
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
> 


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Fwd: HAPROXY 504 errors in HA conf

2015-01-15 Thread Pedro Sousa
False alarm, after more tests the issue persisted, so I switched to backup
mode in the other haproxy nodes and now everything works as expected.

Thanks
On 15/01/2015 12:13, "Pedro Sousa"  wrote:

> Hi all,
>
> the culprit was haproxy, I had "option httpchk" when I disabled this
> stopped having timeouts rebooting the servers.
>
> Thank you all.
>
>
> On Wed, Jan 14, 2015 at 5:29 PM, John Dewey  wrote:
>
>>  I would verify that the VIP failover is occurring.
>>
>> Your master should have the IP address.  If you shut down keepalived the
>> VIP should move to one of the others.   I generally set the state to MASTER
>> on all systems, and have one with a higher priority than the others (e.g.
>> 100 vs 150 on others).
>>
>> On Tuesday, January 13, 2015 at 12:18 PM, Pedro Sousa wrote:
>>
>> As expected If I reboot the Keepalived MASTER node, I get timeouts again,
>> so my understanding is that this happens when the VIP fails over to another
>> node. Anyone has explanation for this?
>>
>> Thanks
>>
>> On Tue, Jan 13, 2015 at 8:08 PM, Pedro Sousa  wrote:
>>
>> Hi,
>>
>> I think I found out the issue, as I have all the 3 nodes running
>> Keepalived as MASTER, when I reboot one of the servers, one of the VIPS
>> failsover to it, causing the timeout issues. So I left only one server as
>> MASTER and the other 2 as BACKUP, and If I reboot the BACKUP servers
>> everything will work fine.
>>
>> As a note aside, I don't know if this is some ARP issue because I have a
>> similar problem with Neutron L3 running in HA Mode. If I reboot the server
>> that is running as MASTER I loose connection to my floating IPS because the
>> switch doesn't know yet that the Mac Addr has changed. To everything start
>> working I have to ping an outside host  like google from an instance.
>>
>> Maybe someone could share some experience on this,
>>
>> Thank you for your help.
>>
>>
>>
>>
>> On Tue, Jan 13, 2015 at 7:18 PM, Pedro Sousa  wrote:
>>
>> Jesse,
>>
>> I see a lot of these messages in glance-api:
>>
>> 2015-01-13 19:16:29.084 29269 DEBUG
>> glance.api.middleware.version_negotiation
>> [29d94a9a-135b-4bf2-a97b-f23b0704ee15 eb7ff2b5f0f34f51ac9ea0f75b60065d
>> 2524b02b63994749ad1fed6f3a825c15 - - -] Unknown version. Returning version
>> choices. process_request
>> /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:64
>>
>> While running openstack-status (glance image-list)
>>
>> == Glance images ==
>> Error finding address for
>> http://172.16.21.20:9292/v1/images/detail?sort_key=name&sort_dir=asc&limit=20:
>> HTTPConnectionPool(host='172.16.21.20', port=9292): Max retries exceeded
>> with url: /v1/images/detail?sort_key=name&sort_dir=asc&limit=20 (Caused by
>> : '')
>>
>>
>> Thanks
>>
>>
>> On Tue, Jan 13, 2015 at 6:52 PM, Jesse Keating  wrote:
>>
>> On 1/13/15 10:42 AM, Pedro Sousa wrote:
>>
>> Hi
>>
>>
>> I've changed some haproxy confs, now I'm getting a different error:
>>
>> *== Nova networks ==*
>> *ERROR (ConnectionError): HTTPConnectionPool(host='172.16.21.20',
>> port=8774): Max retries exceeded with url:
>> /v2/2524b02b63994749ad1fed6f3a825c15/os-networks (Caused by <class
>> 'httplib.BadStatusLine'>: '')*
>> *== Nova instance flavors ==*
>>
>> If I restart my openstack services everything will start working.
>>
>> I'm attaching my new haproxy conf.
>>
>>
>> Thanks
>>
>>
>> Sounds like your services are losing access to something, like rabbit or
>> the database. What do your service logs show prior to restart? Are they
>> throwing any errors?
>>
>>
>> --
>> -jlk
>>
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>>
>>
>>
>>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Kris G. Lindgren
During the Atlanta ops meeting this topic came up and I specifically mentioned 
adding a "no-op" or healthcheck ping to the rabbitmq stuff in both nova & 
neutron.  The devs in the room looked at me like I was crazy, but it was so 
that we could catch exactly the issues you described.  I am also interested if 
anyone knows of a lightweight call that could be used to verify/confirm 
rabbitmq connectivity as well.  I haven't been able to devote time to dig into 
it.  Mainly because if one client is having issues - you will notice other 
clients are having similar/silent errors and a restart of all the things is the 
easiest way to fix, for us at least.
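
A bare AMQP publish/consume round trip is at least easy to script outside of 
oslo; a minimal sketch with kombu is below (the broker URL, credentials and 
queue name are placeholders). Note it only proves that the probing host can 
reach the broker and consume its own message - it says nothing about the 
health of a given nova-compute's long-lived connection, which is the part that 
actually goes stale.

    # Minimal RabbitMQ publish/consume round-trip probe (sketch).
    # Assumes kombu is installed; BROKER_URL is a placeholder.
    import time

    from kombu import Connection

    BROKER_URL = 'amqp://guest:guest@rabbit-host:5672//'

    def rabbit_roundtrip(timeout=5):
        start = time.time()
        with Connection(BROKER_URL, connect_timeout=timeout) as conn:
            queue = conn.SimpleQueue('monitoring.healthcheck')
            queue.put({'ping': start})
            msg = queue.get(block=True, timeout=timeout)  # raises Empty on timeout
            msg.ack()
            queue.close()
        return time.time() - start

    if __name__ == '__main__':
        print('rabbitmq round trip took %.3fs' % rabbit_roundtrip())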


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.


From: Gustavo Randich 
mailto:gustavo.rand...@gmail.com>>
Date: Thursday, January 15, 2015 at 11:53 AM
To: 
"openstack-operators@lists.openstack.org"
 
mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
connectivity

Just to add one more background scenario, we also had similar problems trying 
to load balance rabbitmq via F5 Big IP LTM. For that reason we don't use it 
now. Our installation is a single rabbitmq instance and no intermediaries 
(apart from network switches). We use Folsom and Icehouse, the problem being 
perceived more in Icehouse nodes.

We are already monitoring message queue size, but we would like to pinpoint in 
semi-realtime the specific hosts/racks/network paths experiencing the "stale 
connection" before a user complains about an operation being stuck, or even 
hosts with no such pending operations but already "disconnected" -- we also 
could diagnose possible network causes and avoid massive service restarting.

So, for now, if someone knows about a cheap and quick openstack operation that 
triggers a message interchange between rabbitmq and nova-compute and a way of 
checking the result it would be great.




On Thu, Jan 15, 2015 at 1:45 PM, Kris G. Lindgren 
mailto:klindg...@godaddy.com>> wrote:
We did have an issue using celery  on an internal application that we wrote - 
but I believe it was fixed after much failover testing and code changes.  We 
also use logstash via rabbitmq and haven't noticed any issues there either.

So this seems to be just openstack/oslo related.

We have tried a number of different configurations - all of them had their 
issues.  We started out listing all the members in the cluster on the 
rabbit_hosts line.  This worked most of the time without issue, until we would 
restart one of the servers, then it seemed like the clients wouldn't figure out 
they were disconnected and reconnect to the next host.

In an attempt to solve that we moved to using haproxy to present a vip that we 
configured in the rabbit_hosts line.  This created issues with long-lived 
connection disconnects and a bunch of other issues.  In our production 
environment we moved to load balanced rabbitmq, but using a real loadbalancer, 
and don’t have the weird disconnect issues.  However, anytime we reboot/take 
down a rabbitmq host or pull a member from the cluster we have issues, or if 
there is a network disruption we also have issues.

Thinking the best course of action is to move rabbitmq off on to its own box 
and to leave it alone.

Does anyone have a rabbitmq setup that works well and doesn’t have random 
issues when pulling nodes for maintenance?


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.


From: Joe Topjian mailto:j...@topjian.net>>
Date: Thursday, January 15, 2015 at 9:29 AM
To: "Kris G. Lindgren" mailto:klindg...@godaddy.com>>
Cc: 
"openstack-operators@lists.openstack.org"
 
mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
connectivity

Hi Kris,

 Our experience is pretty much the same on anything that is using rabbitmq - 
not just nova-compute.

Just to clarify: have you experienced this outside of OpenStack (or Oslo)?

We've seen similar issues with rabbitmq and OpenStack. We used to run rabbit 
through haproxy and tried a myriad of options like setting no timeouts, very 
very long timeouts, etc, but would always eventually see similar issues as 
described.

Last month, we reconfigured all OpenStack components to use the `rabbit_hosts` 
option with all nodes in our cluster listed. So far this has worked well, 
though I probably just jinxed myself. :)

We still have other services (like Sensu) using the same rabbitmq cluster and 
accessing it through haproxy. We've never had any issues there.

What's also strange is that I have another OpenStack deployment (from Folsom to 
Icehouse) with just a single rabbitmq server installed directly on the cloud 
controller (meaning: no nova-compute). I never have any rabbit issues in that 
cloud.

Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Gustavo Randich
Just to add one more background scenario, we also had similar problems
trying to load balance rabbitmq via F5 Big IP LTM. For that reason we don't
use it now. Our installation is a single rabbitmq instance and no
intermediaries (apart from network switches). We use Folsom and Icehouse, the
problem being perceived more in Icehouse nodes.

We are already monitoring message queue size, but we would like to pinpoint
in semi-realtime the specific hosts/racks/network paths experiencing the
"stale connection" before a user complains about an operation being stuck,
or even hosts with no such pending operations but already "disconnected" --
we also could diagnose possible network causes and avoid massive service
restarting.

So, for now, if someone knows about a cheap and quick openstack operation
that triggers a message interchange between rabbitmq and nova-compute and a
way of checking the result it would be great.
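
One candidate worth testing (not verified on every release): the server 
diagnostics call looks like it is implemented as a synchronous RPC to the 
compute host owning the instance, so timing it against one instance per host 
should surface a compute node whose rabbitmq connection has gone stale. A 
rough sketch with python-novaclient; the credentials, auth URL and hostname 
are placeholders and admin rights are assumed:

    # Rough sketch: time an RPC-backed API call against a given compute host.
    import time

    from novaclient import client as nova_client

    nova = nova_client.Client('2', 'admin', 'ADMIN_PASS', 'admin',
                              auth_url='http://controller:5000/v2.0')

    def probe_host(hostname):
        servers = nova.servers.list(
            search_opts={'host': hostname, 'all_tenants': 1})
        if not servers:
            return None  # no instance on this host to probe with
        start = time.time()
        # GET /servers/{id}/diagnostics -> synchronous RPC to that nova-compute
        nova.servers.diagnostics(servers[0])
        return time.time() - start

    print(probe_host('compute-01'))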




On Thu, Jan 15, 2015 at 1:45 PM, Kris G. Lindgren 
wrote:

>  We did have an issue using celery  on an internal application that we
> wrote - but I believe it was fixed after much failover testing and code
> changes.  We also use logstash via rabbitmq and haven't noticed any issues
> there either.
>
>  So this seems to be just openstack/oslo related.
>
>  We have tried a number of different configurations - all of them had
> their issues.  We started out listing all the members in the cluster on the
> rabbit_hosts line.  This worked most of the time without issue, until we
> would restart one of the servers, then it seemed like the clients wouldn't
> figure out they were disconnected and reconnect to the next host.
>
>  In an attempt to solve that we moved to using haproxy to present a vip
> that we configured in the rabbit_hosts line.  This created issues with
> long-lived connection disconnects and a bunch of other issues.  In our
> production environment we moved to load balanced rabbitmq, but using a real
> loadbalancer, and don't have the weird disconnect issues.  However, anytime
> we reboot/take down a rabbitmq host or pull a member from the cluster we
> have issues, or if there is a network disruption we also have issues.
>
>  Thinking the best course of action is to move rabbitmq off on to its own
> box and to leave it alone.
>
>  Does anyone have a rabbitmq setup that works well and doesn't have
> random issues when pulling nodes for maintenance?
>  
>
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy, LLC.
>
>
>   From: Joe Topjian 
> Date: Thursday, January 15, 2015 at 9:29 AM
> To: "Kris G. Lindgren" 
> Cc: "openstack-operators@lists.openstack.org" <
> openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq
> connectivity
>
>   Hi Kris,
>
> Our experience is pretty much the same on anything that is using
>> rabbitmq - not just nova-compute.
>>
>
>  Just to clarify: have you experienced this outside of OpenStack (or
> Oslo)?
>
>  We've seen similar issues with rabbitmq and OpenStack. We used to run
> rabbit through haproxy and tried a myriad of options like setting no
> timeouts, very very long timeouts, etc, but would always eventually see
> similar issues as described.
>
>  Last month, we reconfigured all OpenStack components to use the
> `rabbit_hosts` option with all nodes in our cluster listed. So far this has
> worked well, though I probably just jinxed myself. :)
>
>  We still have other services (like Sensu) using the same rabbitmq
> cluster and accessing it through haproxy. We've never had any issues there.
>
>  What's also strange is that I have another OpenStack deployment (from
> Folsom to Icehouse) with just a single rabbitmq server installed directly
> on the cloud controller (meaning: no nova-compute). I never have any rabbit
> issues in that cloud.
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Cinder api enpoint not found error while attach volume to instance

2015-01-15 Thread Jay Pipes

On 01/15/2015 06:22 AM, Geo Varghese wrote:

Hi Jay/Abel,

Thanks for your help.

Just fixed issue by changing following line in nova.conf

cinder_catalog_info=volumev2:cinderv2:publicURL

to

cinder_catalog_info=volume:cinder:publicURL

Now attachment successfuly done.

Do guys know how this fixed the issue?


Cool, good to hear you fixed your issue.

The cause of your issue was that your Keystone services table has 
"regionOne" for the cinderv2 service, instead of "RegionOne", and the 
catalog was being generated for both the volume and volumev2 endpoints 
with "RegionOne".


Best,
-jay


On Thu, Jan 15, 2015 at 12:01 PM, Geo Varghese mailto:gvargh...@aqorn.com>> wrote:

Hi Abel,

Oh okay, yes sure, the compute node can access the controller. I have
added it in /etc/hosts

The current error is that it couldn't find an endpoint.  Is this
related to anything you mentioned above?

On Thu, Jan 15, 2015 at 11:56 AM, Abel Lopez mailto:alopg...@gmail.com>> wrote:

I know it's "Available ", however that doesn't imply attachment.
Cinder uses iSCSI or NFS, to attach the volume to a running
instance on a compute node. If you're missing the required
protocol packages, the attachment will fail. You can have
"Available " volumes, and lack tgtadm (or nfs-utils if that's
your protocol).

Secondly, Is your compute node able to resolve "controller"?


On Wednesday, January 14, 2015, Geo Varghese
mailto:gvargh...@aqorn.com>> wrote:

Hi Abel,

Thanks for the reply.

I have created volume and its in available state. Please
check attached screenshot.



On Thu, Jan 15, 2015 at 11:34 AM, Abel Lopez
 wrote:

Do your compute nodes have the required iSCSI packages
installed?


On Wednesday, January 14, 2015, Geo Varghese
 wrote:

Hi Jay,

Thanks for the reply. Just pasting the details below

keystone catalog

Service: compute
    adminURL    : http://controller:8774/v2/e600ba9727924a3b97ede34aea8279c1
    id          : 02028b1f4c0849c68eb79f5887516299
    internalURL : http://controller:8774/v2/e600ba9727924a3b97ede34aea8279c1
    publicURL   : http://controller:8774/v2/e600ba9727924a3b97ede34aea8279c1
    region      : RegionOne

Service: network
    adminURL    : http://controller:9696
    id          : 32f687d4f7474769852d88932288b893
    internalURL : http://controller:9696
    publicURL   : http://controller:9696
    region      : RegionOne

Service: volumev2
    adminURL    : http://controller:8776/v2/e600ba9727924a3b97ede34aea8279c1
    id          : 5bca493cdde2439887d54fb805c4d2d4
    internalURL : http://controller:8776/v2/e600ba9727924a3b97ede34aea8279c1
    publicURL   : http://controller:8776/v2/e600ba9727924a3b97ede34aea8279c1
    region      : RegionOne

Service: image

Re: [Openstack-operators] Lets talk capacity monitoring

2015-01-15 Thread matt
I know we've been working on that on our commercial product side at Big
Switch with an analyzer... the issue I think you are going to run into is
getting insight into network upstream info from your top-of-rack and spine
switches.

Setting up a uniform access to ovs stats in the API or in an external API (
probably preferable ) is not a bad idea.

The way I see it, you might want to consider an external (not a full project
in OpenStack) api for aggregating ovs stats (use the message bus to cull
that data). Whether you want to try to make use of ceilometer or monaas is
really up to you; I'd recommend only using ceilometer if you are using
zones.  Then making a display panel pluggable to horizon is fairly
straightforward with d3.js.

Offhand I don't know of a specific project targeting this, but it could
shoehorn into projects like ceilometer or monaas.  Also, future
tap-as-a-service work that's starting to occur in juno now may be super
helpful as well.

This looks like it WILL exist... how it exists and what it ties into is
probably waiting on a bit more definition and execution commitment from
other projects.  Unless you happen to be using zones and want to augment
ceilometer.

-Matt

On Thu, Jan 15, 2015 at 9:25 AM, Mathieu Gagné  wrote:

> On 2015-01-15 11:43 AM, Jesse Keating wrote:
>
>> We have a need to better manage the various openstack capacities across
>> our numerous clouds. We want to be able to detect when capacity of one
>> system or another is approaching the point where it would be a good idea
>> to arrange to increase that capacity. Be it volume space, VCPU
>> capability, object storage space, etc...
>>
>> What systems are you folks using to monitor and react to such things?
>>
>>
> Thanks for bringing up the subject Jesse.
>
> I believe you are not the only one facing this challenge because I am too.
>
> I added the subject to the midcycle ops meetup (Capacity
> planning/monitoring) which I hope to be able to attend:
> https://etherpad.openstack.org/p/PHL-ops-meetup
>
>
> We are using host aggregates and have a complex combination of them.
> (imagine a Venn diagram)
>
> What we do is retrieving all:
> - hypervisor stats
> - host aggregates
>
> From there, we compute resource usage (vcpus, ram, disk) in any given host
> aggregate.
>
> This part is very challenging as we have to partially reimplement
> nova-scheduler logic to determine if a given hypervisor has different
> resource allocation ratios based on host aggregate attributes.
>
> The result is a table with resource usage percentages (and absolute
> numbers) for each host aggregate (and combination).
>
> Unfortunately, I can't share yet this first tool as my coworker very
> tightly integrated it to our internal monitoring tool and wouldn't work
> outside it. No promise but I'll try to find time to extract it and share it
> with you guys.
>
>
> We also coded a very primitive tool which takes a flavor name and computes
> available "slots" on each hypervisor (regardless of host aggregate
> memberships):
>
> https://gist.github.com/mgagne/bc54c3434a119246a88d
>
> This tool is not actively used in our monitoring due to the mentioned
> limitation, as we would again have to partially reimplement nova-scheduler
> logic to determine if a given flavor can (or cannot) be spawned on a given
> hypervisor and filter it out from the output if it can't accept the flavor.
> Furthermore, it does not take into account resource allocation ratios based
> on host aggregates.
>
> Hopefully, other people will join in and share their tools so we can all
> improve our OpenStack operations experience.
>
> --
> Mathieu
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Lets talk capacity monitoring

2015-01-15 Thread Mathieu Gagné

On 2015-01-15 11:43 AM, Jesse Keating wrote:

We have a need to better manage the various openstack capacities across
our numerous clouds. We want to be able to detect when capacity of one
system or another is approaching the point where it would be a good idea
to arrange to increase that capacity. Be it volume space, VCPU
capability, object storage space, etc...

What systems are you folks using to monitor and react to such things?



Thanks for bringing up the subject Jesse.

I believe you are not the only one facing this challenge because I am too.

I added the subject to the midcycle ops meetup (Capacity 
planning/monitoring) which I hope to be able to attend:

https://etherpad.openstack.org/p/PHL-ops-meetup


We are using host aggregates and have a complex combination of them 
(imagine a Venn diagram).


What we do is retrieving all:
- hypervisor stats
- host aggregates

From there, we compute resource usage (vcpus, ram, disk) in any given 
host aggregate.
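
For illustration, that retrieval and per-aggregate summing can be sketched 
with python-novaclient roughly as below (admin credentials and the auth URL 
are placeholders; the allocation-ratio handling discussed next is 
deliberately left as comments, because that is the hard part):

    # Sketch: sum hypervisor usage per host aggregate.
    from novaclient import client as nova_client

    nova = nova_client.Client('2', 'admin', 'ADMIN_PASS', 'admin',
                              auth_url='http://controller:5000/v2.0')

    # hypervisor_hostname usually matches the aggregate host entry on libvirt
    hypervisors = {h.hypervisor_hostname: h for h in nova.hypervisors.list()}

    for agg in nova.aggregates.list():
        vcpus = vcpus_used = ram = ram_used = 0
        for host in agg.hosts:
            h = hypervisors.get(host)
            if h is None:
                continue
            vcpus += h.vcpus            # * cpu_allocation_ratio would go here
            vcpus_used += h.vcpus_used
            ram += h.memory_mb          # * ram_allocation_ratio would go here
            ram_used += h.memory_mb_used
        if vcpus and ram:
            print('%-25s vcpu %5.1f%%  ram %5.1f%%' % (
                agg.name, 100.0 * vcpus_used / vcpus, 100.0 * ram_used / ram))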


This part is very challenging as we have to partially reimplement 
nova-scheduler logic to determine if a given hypervisor has different 
resource allocation ratios based on host aggregate attributes.


The result is a table with resource usage percentages (and absolute 
numbers) for each host aggregate (and combination).


Unfortunately, I can't share this first tool yet, as my coworker integrated 
it very tightly with our internal monitoring tool and it wouldn't work 
outside it. No promises, but I'll try to find time to extract it and share 
it with you guys.



We also coded a very primitive tool which takes a flavor name and 
computes available "slots" on each hypervisor (regardless of host 
aggregate memberships):


https://gist.github.com/mgagne/bc54c3434a119246a88d

This tool is not actively used in our monitoring due to the mentioned 
limitation, as we would again have to partially reimplement 
nova-scheduler logic to determine if a given flavor can (or cannot) be 
spawned on a given hypervisor and filter it out from the output if it 
can't accept the flavor. Furthermore, it does not take into account 
resource allocation ratios based on host aggregates.
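
For reference, the naive slot arithmetic such a tool boils down to is roughly 
the following (same caveats as above: no aggregate awareness and no 
allocation ratios; hyp and flavor are assumed to be novaclient hypervisor and 
flavor objects):

    # Naive "how many of flavor X still fit on this hypervisor" arithmetic.
    def flavor_slots(hyp, flavor):
        free_vcpus = hyp.vcpus - hyp.vcpus_used
        free_ram = hyp.memory_mb - hyp.memory_mb_used
        free_disk = hyp.local_gb - hyp.local_gb_used
        if flavor.vcpus <= 0 or flavor.ram <= 0:
            return 0
        slots = min(free_vcpus // flavor.vcpus, free_ram // flavor.ram)
        if flavor.disk:
            slots = min(slots, free_disk // flavor.disk)
        return max(slots, 0)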


Hopefully, other people will join in and share their tools so we can all 
improve our OpenStack operations experience.


--
Mathieu

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] metadata-api 500 errors

2015-01-15 Thread Alex Leonhardt
Hmmm, I did do some playing with security groups, but even the last network
(network5) that I created was unchanged from "default" - maybe it has
something to do with that, I'm testing a brand new tenant/project now with
a similar setup, except I'm setting up all the networking / networks first
this time.



On Thu Jan 15 2015 at 17:12:47 Edgar Magana 
wrote:

>  Did you change the security groups?
> I will try to compare the iptables configuration for each namespace
>
>  Edgar
>
>   From: Alex Leonhardt 
> Date: Thursday, January 15, 2015 at 9:03 AM
> To: Edgar Magana , openstack-operators <
> openstack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] metadata-api 500 errors
>
>  Hi Edgar,
>
> that's the crazy thing - so all the gre tunnels are up, I can see them in
> openvswitch and also can see that there are some openflow rules applied.
> I've craeted VMs on every hypervisor (including the controller, as it's a
> test install) on network1 (192.168.1.0), every VM (and that is still the
> case now) started there works just fine and gets the metadata as expected,
> the same for network2 (192.168.2.0).
>
>  the issue only appeared after I created network3 (192.168.3.0), VMs
> there (tried again all 3 hypervisors) get a 500 error instead of the
> expected metadata files/json. The same for any / all other networks I
> created after (network4 and 5).
>
>  On the VM all I can see is this:
>
> 2015-01-15 17:02:57,310 - url_helper.py[WARNING]: Calling 
> 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: 
> bad status code [500]
> 2015-01-15 17:02:58,509 - url_helper.py[WARNING]: Calling 
> 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [1/120s]: 
> bad status code [500]
>
>  Alex
>
>
> On Thu Jan 15 2015 at 16:53:31 Edgar Magana 
> wrote:
>
>>  Alex,
>>
>>  Did you follow the networking recommendations:
>>
>> http://docs.openstack.org/openstack-ops/content/network_troubleshooting.html
>>
>>  It will help you if you write out your own topology and complete a packet
>> trace to find out the issue.
>> Make sure all tunnels are established between your three nodes.
>>
>>  Thanks,
>>
>>  Edgar
>>
>>   From: Alex Leonhardt 
>> Date: Thursday, January 15, 2015 at 7:45 AM
>> To: openstack-operators 
>> Subject: [Openstack-operators] metadata-api 500 errors
>>
>>  hi,
>>
>>  i've got a test openstack install with 3 nodes, using gre tunneling --
>>
>>  initially it all worked fine, but, after creating > 2 networks, VMs in
>> networks 3,4,5 do not seem to get the metadata due to it erroring with 500
>> errors. whilst this is happening, VMs in networks 1 and 2 are still working
>> fine and can be provisioned OK.
>>
>>  anyone seen something similar or ideas on how to go about
>> troubleshooting this ? I got a tcpdump from the VM but as it does get to
>> the metadata api, am not sure where the issue is (especially since other
>> VMs in other Networks work just fine)
>>
>>  any ideas ?
>>
>>  Alex
>>
>>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] metadata-api 500 errors

2015-01-15 Thread Edgar Magana
Did you change the security groups?
I will try to compare the iptables configuration for each namespace

Edgar

From: Alex Leonhardt mailto:aleonhardt...@gmail.com>>
Date: Thursday, January 15, 2015 at 9:03 AM
To: Edgar Magana mailto:edgar.mag...@workday.com>>, 
openstack-operators 
mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] metadata-api 500 errors

Hi Edgar,

that's the crazy thing - so all the gre tunnels are up, I can see them in 
openvswitch and also can see that there are some openflow rules applied. I've 
created VMs on every hypervisor (including the controller, as it's a test 
install) on network1 (192.168.1.0), every VM (and that is still the case now) 
started there works just fine and gets the metadata as expected, the same for 
network2 (192.168.2.0).

the issue only appeared after I created network3 (192.168.3.0), VMs there 
(tried again all 3 hypervisors) get a 500 error instead of the expected 
metadata files/json. The same for any / all other networks I created after 
(network4 and 5).

On the VM all I can see is this:

2015-01-15 17:02:57,310 - url_helper.py[WARNING]: Calling 
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: bad 
status code [500]
2015-01-15 17:02:58,509 - url_helper.py[WARNING]: Calling 
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [1/120s]: bad 
status code [500]

Alex


On Thu Jan 15 2015 at 16:53:31 Edgar Magana 
mailto:edgar.mag...@workday.com>> wrote:
Alex,

Did you follow the networking recommendations:
http://docs.openstack.org/openstack-ops/content/network_troubleshooting.html

It will help you if you write out your own topology and complete a packet trace 
to find out the issue.
Make sure all tunnels are established between your three nodes.

Thanks,

Edgar

From: Alex Leonhardt mailto:aleonhardt...@gmail.com>>
Date: Thursday, January 15, 2015 at 7:45 AM
To: openstack-operators 
mailto:openstack-operators@lists.openstack.org>>
Subject: [Openstack-operators] metadata-api 500 errors

hi,

i've got a test openstack install with 3 nodes, using gre tunneling --

initially it all worked fine, but, after creating > 2 networks, VMs in networks 
3,4,5 do not seem to get the metadata due to it erroring with 500 errors. 
whilst this is happening, VMs in networks 1 and 2 are still working fine and 
can be provisioned OK.

anyone seen something similar or ideas on how to go about troubleshooting this 
? I got a tcpdump from the VM but as it does get to the metadata api, am not 
sure where the issue is (especially since other VMs in other Networks work just 
fine)

any ideas ?

Alex

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] metadata-api 500 errors

2015-01-15 Thread Alex Leonhardt
Hi Edgar,

that's the crazy thing - so all the gre tunnels are up, I can see them in
openvswitch and also can see that there are some openflow rules applied.
I've created VMs on every hypervisor (including the controller, as it's a
test install) on network1 (192.168.1.0), every VM (and that is still the
case now) started there works just fine and gets the metadata as expected,
the same for network2 (192.168.2.0).

the issue only appeared after I created network3 (192.168.3.0), VMs there
(tried again all 3 hypervisors) get a 500 error instead of the expected
metadata files/json. The same for any / all other networks I created after
(network4 and 5).

On the VM all I can see is this:

2015-01-15 17:02:57,310 - url_helper.py[WARNING]: Calling
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed
[0/120s]: bad status code [500]
2015-01-15 17:02:58,509 - url_helper.py[WARNING]: Calling
'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed
[1/120s]: bad status code [500]

Alex


On Thu Jan 15 2015 at 16:53:31 Edgar Magana 
wrote:

>  Alex,
>
>  Did you follow the networking recommendations:
>
> http://docs.openstack.org/openstack-ops/content/network_troubleshooting.html
>
>  It will help you if you write out your own topology and complete a packet
> trace to find out the issue.
> Make sure all tunnels are established between your three nodes.
>
>  Thanks,
>
>  Edgar
>
>   From: Alex Leonhardt 
> Date: Thursday, January 15, 2015 at 7:45 AM
> To: openstack-operators 
> Subject: [Openstack-operators] metadata-api 500 errors
>
>  hi,
>
>  i've got a test openstack install with 3 nodes, using gre tunneling --
>
>  initially it all worked fine, but, after creating > 2 networks, VMs in
> networks 3,4,5 do not seem to get the metadata due to it erroring with 500
> errors. whilst this is happening, VMs in networks 1 and 2 are still working
> fine and can be provisioned OK.
>
>  anyone seen something similar or ideas on how to go about
> troubleshooting this ? I got a tcpdump from the VM but as it does get to
> the metadata api, am not sure where the issue is (especially since other
> VMs in other Networks work just fine)
>
>  any ideas ?
>
>  Alex
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] metadata-api 500 errors

2015-01-15 Thread Edgar Magana
Alex,

Did you follow the networking recommendations:
http://docs.openstack.org/openstack-ops/content/network_troubleshooting.html

It will help you if you write out your own topology and complete a packet trace 
to find out the issue.
Make sure all tunnels are established between your three nodes.

Thanks,

Edgar

From: Alex Leonhardt mailto:aleonhardt...@gmail.com>>
Date: Thursday, January 15, 2015 at 7:45 AM
To: openstack-operators 
mailto:openstack-operators@lists.openstack.org>>
Subject: [Openstack-operators] metadata-api 500 errors

hi,

i've got a test openstack install with 3 nodes, using gre tunneling --

initially it all worked fine, but, after creating > 2 networks, VMs in networks 
3,4,5 do not seem to get the metadata due to it erroring with 500 errors. 
whilst this is happening, VMs in networks 1 and 2 are still working fine and 
can be provisioned OK.

anyone seen something similar or ideas on how to go about troubleshooting this 
? I got a tcpdump from the VM but as it does get to the metadata api, am not 
sure where the issue is (especially since other VMs in other Networks work just 
fine)

any ideas ?

Alex

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Edgar Magana
It seems that you got some answers already. Basically, it is whatever "SECURE" 
string you want to generate.
I will provide a future post about how to secure your Cloud, which means a 
best-practices way to generate these secure codes and keep them away from 
everybody, even your own folks!   :-)

Let's empower the operators!

Cheers!

Edgar

From: Anwar Durrani mailto:durrani.an...@gmail.com>>
Date: Thursday, January 15, 2015 at 1:48 AM
To: Edgar Magana mailto:edgar.mag...@workday.com>>
Cc: openstack-operators 
mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON 
CENTOS 6.5

Thanks Edgar for help, i have question in following section :


  *   Edit /etc/keystone/keystone.conf:

vim /etc/keystone/keystone.conf

[DEFAULT]
admin_token=ADMIN
log_dir=/var/log/keystone

[database]
connection = mysql://keystone:password@controller/keystone

admin_token=ADMIN --> does this mean a hex-generated token value OR just ADMIN as it is?


On Thu, Jan 15, 2015 at 2:54 PM, Edgar Magana 
mailto:edgar.mag...@workday.com>> wrote:
Go for Icehouse:
https://github.com/emagana/OpenStack-Icehouse-Install-Guide/blob/master/OpenStack-Icehouse-Installation.rst

Edgar

From: Anwar Durrani mailto:durrani.an...@gmail.com>>
Date: Thursday, January 15, 2015 at 1:12 AM
To: openstack-operators 
mailto:openstack-operators@lists.openstack.org>>
Subject: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 
6.5

Hello everyone,

I want to set up Havana. Does anyone have an installation guide for it?

Thanks

--
Thanks & regards,
Anwar M. Durrani
+91-8605010721





--
Thanks & regards,
Anwar M. Durrani
+91-8605010721


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Kris G. Lindgren
We did have an issue using celery  on an internal application that we wrote - 
but I believe it was fixed after much failover testing and code changes.  We 
also use logstash via rabbitmq and haven't noticed any issues there either.

So this seems to be just openstack/oslo related.

We have tried a number of different configurations - all of them had their 
issues.  We started out listing all the members in the cluster on the 
rabbit_hosts line.  This worked most of the time without issue, until we would 
restart one of the servers, then it seemed like the clients wouldn't figure out 
they were disconnected and reconnect to the next host.

In an attempt to solve that we moved to using haproxy to present a vip that we 
configured in the rabbit_hosts line.  This created issues with long-lived 
connection disconnects and a bunch of other issues.  In our production 
environment we moved to load balanced rabbitmq, but using a real loadbalancer, 
and don't have the weird disconnect issues.  However, anytime we reboot/take 
down a rabbitmq host or pull a member from the cluster we have issues, or if 
there is a network disruption we also have issues.

Thinking the best course of action is to move rabbitmq off on to its own box 
and to leave it alone.

Does anyone have a rabbitmq setup that works well and doesn't have random 
issues when pulling nodes for maintenance?


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.


From: Joe Topjian mailto:j...@topjian.net>>
Date: Thursday, January 15, 2015 at 9:29 AM
To: "Kris G. Lindgren" mailto:klindg...@godaddy.com>>
Cc: 
"openstack-operators@lists.openstack.org"
 
mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] Way to check compute <-> rabbitmq 
connectivity

Hi Kris,

 Our experience is pretty much the same on anything that is using rabbitmq - 
not just nova-compute.

Just to clarify: have you experienced this outside of OpenStack (or Oslo)?

We've seen similar issues with rabbitmq and OpenStack. We used to run rabbit 
through haproxy and tried a myriad of options like setting no timeouts, very 
very long timeouts, etc, but would always eventually see similar issues as 
described.

Last month, we reconfigured all OpenStack components to use the `rabbit_hosts` 
option with all nodes in our cluster listed. So far this has worked well, 
though I probably just jinxed myself. :)

We still have other services (like Sensu) using the same rabbitmq cluster and 
accessing it through haproxy. We've never had any issues there.

What's also strange is that I have another OpenStack deployment (from Folsom to 
Icehouse) with just a single rabbitmq server installed directly on the cloud 
controller (meaning: no nova-compute). I never have any rabbit issues in that 
cloud.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Lets talk capacity monitoring

2015-01-15 Thread Jesse Keating
We have a need to better manage the various openstack capacities across 
our numerous clouds. We want to be able to detect when capacity of one 
system or another is approaching the point where it would be a good idea 
to arrange to increase that capacity. Be it volume space, VCPU 
capability, object storage space, etc...


What systems are you folks using to monitor and react to such things?

--
-jlk

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openstack-Keystone error

2015-01-15 Thread Jesse Keating

On 1/15/15 2:17 AM, Anwar Durrani wrote:

I did the following steps earlier:


These steps don't mention doing the keystone-manage db_sync action. When 
you install keystone itself and configure it to connect to a sql 
service, and you have created a keystone database within the sql 
service, the next step is to "sync" the database which will create all 
the tables necessary for keystone to operate.
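
On a packaged install that sync step is typically a one-liner run as the 
keystone user, something like:

    su -s /bin/sh -c "keystone-manage db_sync" keystone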


http://docs.openstack.org/juno/install-guide/install/apt/content/keystone-install.html


--
-jlk

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Jesse Keating

On 1/15/15 1:48 AM, Anwar Durrani wrote:

Thanks Edgar for help, i have question in following section :

  *

Edit /etc/keystone/keystone.conf:

vim /etc/keystone/keystone.conf

[DEFAULT]
admin_token=ADMIN
log_dir=/var/log/keystone

[database]
connection = mysql://keystone:password@controller/keystone

admin_token=ADMIN --> does this mean a hex-generated token value OR just ADMIN as
it is?



admin_token is a string, whatever string you want, that can be used as a 
key to get in before you have real users set up. However, because it's a 
string, and because you end up using it from a shell, it is not wise to 
leave this string enabled. Once you've bootstrapped keystone it's advised 
to remove admin_token_auth from your paste pipeline and/or remove the 
admin_token definition from your keystone config.
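
If you do want a random value rather than a literal word while bootstrapping, 
the install guides typically generate one with something like:

    openssl rand -hex 10

and then set admin_token to that value in keystone.conf.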


To echo Edgar, but modify slightly, Havana is a dead end and not an 
advisable starting point. Icehouse has a short life ahead of it. Juno is 
the latest release and if you're starting fresh that's the one I would 
recommend you start with. It'll give you the longest time of support 
before you have to contemplate a version upgrade.


--
-jlk

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] New services disable reason

2015-01-15 Thread Alex Leonhardt
:) did that too just now. ta!

On Thu Jan 15 2015 at 16:28:36 Michael Dorman  wrote:

>   +1 as well, for the same reasons.  I also added my +1 to the review.
> Thanks!
>
>
>   From: Alex Leonhardt 
> Date: Thursday, January 15, 2015 at 1:02 AM
> To: Belmiro Moreira , OpenStack
> Operators 
> Subject: Re: [Openstack-operators] New services disable reason
>
>   Our install is still quite small and we take the risk of hitting that
> compute node whilst the service is starting,  but we haven't actually
> encountered that yet (probably a user base size issue) ...
>
> IMHO, having a reason why stuff is disabled avoids hours of confusion and
> trying to find the person who 'did it'.
>
> In terms of keeping track, I'd have thought that the dashboard admin panel
> can show you service states and I'd expect a disabled service to say
> 'disabled', but again we don't even use this feature at the moment.
>
> +1 from me :)
>
> Alex
>
> On Wed, 14 Jan 2015 19:34 Belmiro Moreira <
> moreira.belmiro.email.li...@gmail.com> wrote:
>
>> Hi,
>> as operators I would like to have your comments/suggestions on:
>> https://review.openstack.org/#/c/136645/1
>>
>>
>>  With a large number of nodes several services are disabled for
>> various reasons (in our case mainly hardware interventions).
>> To help operations we use the "disable reason" as a fast filter to identify
>> why the service is disabled.
>>
>>  At the same time, we add several new nodes (nova-compute) per week.
>>  At CERN, to avoid adding a service when the daemon starts for the first
>> time, nova is configured with:
>> enable_new_services=False
>> This is great, however no "disable reason" is set.
>> For us, having services disabled with no reason specified creates
>> additional checks.
>>
>>  How are others keeping track of disabled services?
>>
>>
>>  Belmiro
>> ---
>> CERN
>>  ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>   ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Joe Topjian
Hi Kris,

 Our experience is pretty much the same on anything that is using rabbitmq
> - not just nova-compute.
>

Just to clarify: have you experienced this outside of OpenStack (or Oslo)?

We've seen similar issues with rabbitmq and OpenStack. We used to run
rabbit through haproxy and tried a myriad of options like setting no
timeouts, very very long timeouts, etc, but would always eventually see
similar issues as described.

Last month, we reconfigured all OpenStack components to use the
`rabbit_hosts` option with all nodes in our cluster listed. So far this has
worked well, though I probably just jinxed myself. :)
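
For reference, that configuration looks something like the following in each 
service's config file (hostnames are placeholders; in the Icehouse/Juno era 
the option lives in [DEFAULT]):

    [DEFAULT]
    rabbit_hosts = rabbit01:5672,rabbit02:5672,rabbit03:5672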

We still have other services (like Sensu) using the same rabbitmq cluster
and accessing it through haproxy. We've never had any issues there.

What's also strange is that I have another OpenStack deployment (from
Folsom to Icehouse) with just a single rabbitmq server installed directly
on the cloud controller (meaning: no nova-compute). I never have any rabbit
issues in that cloud.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] metadata-api 500 errors

2015-01-15 Thread Alex Leonhardt
In case it helps - attached a screenshot of the topology from openstack
dashboard.

alex


On Thu Jan 15 2015 at 14:36:24 Alex Leonhardt 
wrote:

> hi,
>
> i've got a test openstack install with 3 nodes, using gre tunneling --
>
> initially it all worked fine, but, after creating > 2 networks, VMs in
> networks 3,4,5 do not seem to get the metadata due to it erroring with 500
> errors. whilst this is happening, VMs in networks 1 and 2 are still working
> fine and can be provisioned OK.
>
> anyone seen something similar or ideas on how to go about troubleshooting
> this ? I got a tcpdump from the VM but as it does get to the metadata api,
> am not sure where the issue is (especially since other VMs in other
> Networks work just fine)
>
> any ideas ?
>
> Alex
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Jesse Keating

On 1/15/15 7:34 AM, Gustavo Randich wrote:

Hi,

I'm experiencing some issues with nova-compute services not responding
to rabbitmq messages, despite the service reporting OK state via
periodic tasks. Apparently the TCP connection is open but in a stale or
unresponsive state. This happens sporadically when there is some not yet
understood network problem. Restarting nova-compute solves the problem.

Is there any way, preferably via openstack API, to probe service
responsiveness, i.e., that it consumes messages, so we can program an alert?



One strategy I've seen has been to monitor the queue sizes, and if they 
start growing beyond a boundary then we know something isn't consuming 
the messages correctly and can narrow down to which host is having issues.
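
A minimal sketch of that kind of check against the RabbitMQ management API is 
below (the management plugin is assumed to be enabled; URL, credentials and 
threshold are placeholders):

    # Alert on queues whose backlog exceeds a threshold.
    import requests

    MGMT_URL = 'http://rabbit-host:15672/api/queues'
    THRESHOLD = 100

    def queues_over_threshold():
        resp = requests.get(MGMT_URL, auth=('guest', 'guest'), timeout=10)
        resp.raise_for_status()
        return [(q['vhost'], q['name'], q.get('messages', 0))
                for q in resp.json() if q.get('messages', 0) > THRESHOLD]

    for vhost, name, depth in queues_over_threshold():
        print('ALERT %s/%s backlog=%d' % (vhost, name, depth))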


This isn't all that elegant though, so I'm interested as well to see if 
there is any way to trigger a particular nova process to send/consume a 
message.



--
-jlk

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] New services disable reason

2015-01-15 Thread Michael Dorman
+1 as well, for the same reasons.  I also added my +1 to the review.  Thanks!


From: Alex Leonhardt mailto:aleonhardt...@gmail.com>>
Date: Thursday, January 15, 2015 at 1:02 AM
To: Belmiro Moreira 
mailto:moreira.belmiro.email.li...@gmail.com>>,
 OpenStack Operators 
mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] New services disable reason


Our install is still quite small and we take the risk of hitting that compute 
node whilst the service is starting,  but we haven't actually encountered that 
yet (probably a user base size issue) ...

IMHO, having a reason why stuff is disabled avoids hours of confusion and 
trying to find the person who 'did it'.

In terms of keeping track, I'd have thought that the dashboard admin panel can 
show you service states and I'd expect a disabled service to say 'disabled', 
but again we don't even use this feature at the moment.

+1 from me :)

Alex

On Wed, 14 Jan 2015 19:34 Belmiro Moreira 
mailto:moreira.belmiro.email.li...@gmail.com>>
 wrote:
Hi,
as operators I would like to have your comments/suggestions on:
https://review.openstack.org/#/c/136645/1


With a large number of nodes several services are disabled for various 
reasons (in our case mainly hardware interventions).
To help operations we use the "disable reason" as a fast filter to identify why 
the service is disabled.

At the same time, we add several new nodes (nova-compute) per week.
At CERN, to avoid adding a service when the daemon starts for the first time, 
nova is configured with:
enable_new_services=False
This is great, however no "disable reason" is set.
For us, having services disabled with no reason specified creates additional 
checks.

How are others keeping track of disabled services?


Belmiro
---
CERN
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Kris G. Lindgren
+1 on this.

In general rabbitmq connectivity/failover is pretty terrible.  Services look to 
be connected to rabbitmq but in reality they aren't; monitoring on the server 
to see if it has an established connection to rabbitmq isn't enough. Our 
experience is pretty much the same on anything that is using rabbitmq - not 
just nova-compute.  The issue seems to be that a service can send messages, but 
it doesn't actually pull messages from the queue.  Also, when we restart a 
rabbit node in the cluster, connections typically have issues re-establishing 
and we need to restart most services to fix the issue.


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.



From: Gustavo Randich 
mailto:gustavo.rand...@gmail.com>>
Date: Thursday, January 15, 2015 at 8:34 AM
To: 
"openstack-operators@lists.openstack.org"
 
mailto:openstack-operators@lists.openstack.org>>
Subject: [Openstack-operators] Way to check compute <-> rabbitmq connectivity

Hi,

I'm experiencing some issues with nova-compute services not responding to 
rabbitmq messages, despite the service reporting OK state via periodic tasks. 
Apparently the TCP connection is open but in a stale or unresponsive state. 
This happens sporadically when there is some not yet understood network 
problem. Restarting nova-compute solves the problem.

Is there any way, preferably via openstack API, to probe service 
responsiveness, i.e., that it consumes messages, so we can program an alert?

Thanks in advance!

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] metadata-api 500 errors

2015-01-15 Thread Alex Leonhardt
hi,

i've got a test openstack install with 3 nodes, using gre tunneling --

initially it all worked fine, but, after creating > 2 networks, VMs in
networks 3,4,5 do not seem to get the metadata due to it erroring with 500
errors. whilst this is happening, VMs in networks 1 and 2 are still working
fine and can be provisioned OK.

anyone seen something similar or ideas on how to go about troubleshooting
this ? I got a tcpdump from the VM but as it does get to the metadata api,
am not sure where the issue is (especially since other VMs in other
Networks work just fine)

any ideas ?

Alex
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Way to check compute <-> rabbitmq connectivity

2015-01-15 Thread Gustavo Randich
Hi,

I'm experiencing some issues with nova-compute services not responding to
rabbitmq messages, despite the service reporting OK state via periodic
tasks. Apparently the TCP connection is open but in a stale or unresponsive
state. This happens sporadically when there is some not yet understood
network problem. Restarting nova-compute solves the problem.

Is there any way, preferably via openstack API, to probe service
responsiveness, i.e., that it consumes messages, so we can program an alert?

Thanks in advance!
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [ha-guide] HA Guide update next steps

2015-01-15 Thread Matt Griffin
Just a reminder that we're going to meet today (and every Thursday) from
3:00-3:30pm US Central.
Like last time, let's chat in #openstack-haguide on freenode.

A bit later today (before our meeting), I'll review the wiki so we can
charge ahead as soon as possible on updating areas.

Best,
Matt


---
Matt Griffin
Director of Product Management
Percona
m: 1-214-727-4100
skype: thebear78


On Sat, Jan 10, 2015 at 12:00 PM, Matt Griffin 
wrote:

> Thanks Sriram and thanks for everyone's participation in the poll.
> I picked Thursday from 3:00 PM - 3:30 PM US Central Time for the OpenStack
> HA Guide Regular Meeting.
> We'll start this Thursday, January 15, 2015.
>
> I'll do some cleanup to the pad [1] before our first meeting so hopefully
> we can quickly arrive at owners and move forward.
>
> Best,
> Matt
>
> [1] https://etherpad.openstack.org/p/openstack-haguide-update
>
>
>
> On Fri, Jan 9, 2015 at 12:30 AM, Sriram Subramanian  > wrote:
>
>> Dear Docs,
>>
>> I noted some discussions [1] on the operators mailing list about HA guide
>> meetings. I also added my availability in the Doodle poll[2]. Thanks for
>> starting this Matt.
>>
>> Since many may not be on the Docs list, I am resurfacing it here.
>>
>> Thanks,
>> -Sriram
>>
>> 1.
>> http://lists.openstack.org/pipermail/openstack-operators/2015-January/005810.html
>> 2. https://doodle.com/4r9i2m7tyrz3aayv
>>
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Fwd: HAPROXY 504 errors in HA conf

2015-01-15 Thread Pedro Sousa
Hi all,

the culprit was haproxy. I had "option httpchk"; when I disabled this I
stopped having timeouts when rebooting the servers.

Thank you all.


On Wed, Jan 14, 2015 at 5:29 PM, John Dewey  wrote:

>  I would verify that the VIP failover is occurring.
>
> Your master should have the IP address.  If you shut down keepalived the
> VIP should move to one of the others.   I generally set the state to MASTER
> on all systems, and have one with a higher priority than the others (e.g.
> 100 vs 150 on others).
>
> On Tuesday, January 13, 2015 at 12:18 PM, Pedro Sousa wrote:
>
> As expected, if I reboot the Keepalived MASTER node I get timeouts again,
> so my understanding is that this happens when the VIP fails over to another
> node. Does anyone have an explanation for this?
>
> Thanks
>
> On Tue, Jan 13, 2015 at 8:08 PM, Pedro Sousa  wrote:
>
> Hi,
>
> I think I found out the issue: as I have all 3 nodes running
> Keepalived as MASTER, when I reboot one of the servers one of the VIPs
> fails over to it, causing the timeout issues. So I left only one server as
> MASTER and the other 2 as BACKUP, and if I reboot the BACKUP servers
> everything works fine.
>
> As an aside, I don't know if this is some ARP issue because I have a
> similar problem with Neutron L3 running in HA mode. If I reboot the server
> that is running as MASTER I lose connection to my floating IPs because the
> switch doesn't yet know that the MAC address has changed. To get everything
> working again I have to ping an outside host like google from an instance.
>
> Maybe someone could share some experience on this,
>
> Thank you for your help.
>
>
>
>
> On Tue, Jan 13, 2015 at 7:18 PM, Pedro Sousa  wrote:
>
> Jesse,
>
> I see a lot of these messages in glance-api:
>
> 2015-01-13 19:16:29.084 29269 DEBUG
> glance.api.middleware.version_negotiation
> [29d94a9a-135b-4bf2-a97b-f23b0704ee15 eb7ff2b5f0f34f51ac9ea0f75b60065d
> 2524b02b63994749ad1fed6f3a825c15 - - -] Unknown version. Returning version
> choices. process_request
> /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:64
>
> While running openstack-status (glance image-list)
>
> == Glance images ==
> Error finding address for
> http://172.16.21.20:9292/v1/images/detail?sort_key=name&sort_dir=asc&limit=20:
> HTTPConnectionPool(host='172.16.21.20', port=9292): Max retries exceeded
> with url: /v1/images/detail?sort_key=name&sort_dir=asc&limit=20 (Caused by
> : '')
>
>
> Thanks
>
>
> On Tue, Jan 13, 2015 at 6:52 PM, Jesse Keating  wrote:
>
> On 1/13/15 10:42 AM, Pedro Sousa wrote:
>
> Hi
>
>
> I've changed some haproxy confs, now I'm getting a different error:
>
> *== Nova networks ==*
> *ERROR (ConnectionError): HTTPConnectionPool(host='172.16.21.20',
> port=8774): Max retries exceeded with url:
> /v2/2524b02b63994749ad1fed6f3a825c15/os-networks (Caused by <class 'httplib.BadStatusLine'>: '')*
> *== Nova instance flavors ==*
>
> If I restart my openstack services everything will start working.
>
> I'm attaching my new haproxy conf.
>
>
> Thanks
>
>
> Sounds like your services are losing access to something, like rabbit or
> the database. What do your service logs show prior to restart? Are they
> throwing any errors?
>
>
> --
> -jlk
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
>
>
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Cinder api enpoint not found error while attach volume to instance

2015-01-15 Thread Geo Varghese
Hi Jay/Abel,

Thanks for your help.

Just fixed issue by changing following line in nova.conf

cinder_catalog_info=volumev2:cinderv2:publicURL

to

cinder_catalog_info=volume:cinder:publicURL

Now attachment successfuly done.

Do guys know how this fixed the issue?


On Thu, Jan 15, 2015 at 12:01 PM, Geo Varghese  wrote:

> Hi Abel,
>
> Oh okay, yes sure, the compute node can access the controller. I have added it
> in /etc/hosts
>
> The current error is that it couldn't find an endpoint.  Is this related to
> anything you mentioned above?
>
> On Thu, Jan 15, 2015 at 11:56 AM, Abel Lopez  wrote:
>
>> I know it's "Available ", however that doesn't imply attachment. Cinder
>> uses iSCSI or NFS, to attach the volume to a running instance on a compute
>> node. If you're missing the required protocol packages, the attachment will
>> fail. You can have "Available " volumes, and lack tgtadm (or nfs-utils if
>> that's your protocol).
>>
>> Secondly, Is your compute node able to resolve "controller"?
>>
>>
>> On Wednesday, January 14, 2015, Geo Varghese  wrote:
>>
>>> Hi Abel,
>>>
>>> Thanks for the reply.
>>>
>>> I have created volume and its in available state. Please check attached
>>> screenshot.
>>>
>>>
>>>
>>> On Thu, Jan 15, 2015 at 11:34 AM, Abel Lopez  wrote:
>>>
 Do your compute nodes have the required iSCSI packages installed?


 On Wednesday, January 14, 2015, Geo Varghese 
 wrote:

> Hi Jay,
>
> Thanks for the reply. Just pasting the details below
>
> keystone catalog
> 
> Service: compute
>     adminURL    : http://controller:8774/v2/e600ba9727924a3b97ede34aea8279c1
>     id          : 02028b1f4c0849c68eb79f5887516299
>     internalURL : http://controller:8774/v2/e600ba9727924a3b97ede34aea8279c1
>     publicURL   : http://controller:8774/v2/e600ba9727924a3b97ede34aea8279c1
>     region      : RegionOne
> 
> Service: network
>     adminURL    : http://controller:9696
>     id          : 32f687d4f7474769852d88932288b893
>     internalURL : http://controller:9696
>     publicURL   : http://controller:9696
>     region      : RegionOne
> 
> Service: volumev2
>     adminURL    : http://controller:8776/v2/e600ba9727924a3b97ede34aea8279c1
>     id          : 5bca493cdde2439887d54fb805c4d2d4
>     internalURL : http://controller:8776/v2/e600ba9727924a3b97ede34aea8279c1
>     publicURL   : http://controller:8776/v2/e600ba9727924a3b97ede34aea8279c1
>     region      : RegionOne
> 
> Service: image
>     adminURL    : http://controller:9292
>     id          : 2e2294b9151e4fb9b6efccf33c62181b
>     internalURL : http://controller:9292
>     publicURL   : http://controller:9292
>     region      : RegionOne
> 
> Service: volume
>     adminURL    : http://controller:8776/v1/e600ba9727924a3b97ede34aea8279c1
>     id          : 0e29cfaa785e4e148c57601b182a5e26
>     internalURL : http://controller:8776/v1/e600ba9727924a3b97ede34aea8279c1
>     publicURL   : http://controller:8776/v1/e600ba9727924a3b97ede34aea8279c1
>     region      : RegionOne
> 
> Service: ec2

Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Vedprakash Nimbalkar
If you are doing the setup for yourself, do it on Ubuntu with the Juno release.

http://docs.openstack.org/havana/install-guide/install/yum/content/
On 15-Jan-2015 2:45 PM, "Anwar Durrani"  wrote:

> Hello everyone,
>
> I want to set up Havana. Does anyone have an installation guide for it?
>
> Thanks
>
> --
> Thanks & regards,
> Anwar M. Durrani
> +91-8605010721
> 
>
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openstack-Keystone error

2015-01-15 Thread Alex Leonhardt
this is probably the issue then:

2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi ProgrammingError:
(ProgrammingError) (1146, "Table 'keystone.token' doesn't exist") 'SELECT
token.id AS token_id, token.expires AS token_expires, token.extra AS
token_extra, token.valid AS token_valid, token.user_id AS token_user_id,
token.trust_id AS token_trust_id \nFROM token \nWHERE token.id = %s'
('2c0dc0032d675623f37a',)


you may need to run the keystone DB migration scripts first
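
For example, a minimal sketch on a RHEL/CentOS-style install (assuming the [database] connection in keystone.conf already points at the keystone database):

    # create/upgrade the keystone schema, including the missing token table
    keystone-manage db_sync
    # restart keystone so it picks up the schema
    service openstack-keystone restart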

alex


On Thu Jan 15 2015 at 09:09:57 Anwar Durrani 
wrote:

> Hi Alex, below is error in log file
>
> 2015-01-15 01:08:34.128 50243 ERROR keystone.common.wsgi [-]
> (ProgrammingError) (1146, "Table 'keystone.token' doesn't exist") 'SELECT
> token.id AS token_id, token.expires AS token_expires, token.extra AS
> token_extra, token.valid AS token_valid, token.user_id AS token_user_id,
> token.trust_id AS token_trust_id \nFROM token \nWHERE token.id = %s'
> ('2c0dc0032d675623f37a',)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi Traceback (most
> recent call last):
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/keystone/common/wsgi.py", line 430, in
> __call__
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi response =
> self.process_request(request)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/keystone/middleware/core.py", line 279,
> in process_request
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi auth_context
> = self._build_auth_context(request)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/keystone/middleware/core.py", line 259,
> in _build_auth_context
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
> token_data=self.token_provider_api.validate_token(token_id))
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/keystone/token/provider.py", line 225, in
> validate_token
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi token =
> self._validate_token(unique_id)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1013, in
> decorate
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
> should_cache_fn)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 640, in
> get_or_create
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
> async_creator) as value:
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 158, in
> __enter__
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi return
> self._enter()
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 98, in
> _enter
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi generated =
> self._enter_create(createdtime)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 149, in
> _enter_create
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi created =
> self.creator()
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 612, in
> gen_value
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi created_value
> = creator()
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1009, in
> creator
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi return
> fn(*arg, **kw)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/keystone/token/provider.py", line 318, in
> _validate_token
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi token_ref =
> self._persistence.get_token(token_id)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/keystone/token/persistence/core.py", line
> 76, in get_token
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi token_ref =
> self._get_token(unique_id)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1013, in
> decorate
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
> should_cache_fn)
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 640, in
> get_or_create
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
> async_creator) as value:
> 2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
> "/usr/lib/python2.7/site-packages/dogpile/core/dog

Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Alex Leonhardt
means generate something "secure" :)
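
For example, one common way to generate such a value (a sketch; any sufficiently random string works, as long as your clients use the same value):

    $ openssl rand -hex 10
    # then put the result in keystone.conf:
    #   [DEFAULT]
    #   admin_token=<the generated value>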

On Thu Jan 15 2015 at 09:50:34 Anwar Durrani 
wrote:

> Thanks Edgar for help, i have question in following section :
>
>
>-
>
>Edit /etc/keystone/keystone.conf:
>
>vim /etc/keystone/keystone.conf
>
>[DEFAULT]
>admin_token=ADMIN
>log_dir=/var/log/keystone
>
>[database]
>connection = mysql://keystone:password@controller/keystone
>
>
> admin_token=ADMIN -- > means hex generated token value OR just ADMIN as it
> is ?
>
>
> On Thu, Jan 15, 2015 at 2:54 PM, Edgar Magana 
> wrote:
>
>>  Go for Icehouse:
>>
>> https://github.com/emagana/OpenStack-Icehouse-Install-Guide/blob/master/OpenStack-Icehouse-Installation.rst
>>
>>  *Edgar*
>>
>>   From: Anwar Durrani 
>> Date: Thursday, January 15, 2015 at 1:12 AM
>> To: openstack-operators 
>> Subject: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON
>> CENTOS 6.5
>>
>>Hello everyone,
>>
>>  I want to setup Havana, Do anyone has installation guide for the same ?
>>
>>  Thanks
>>
>>  --
>>  Thanks & regards,
>> Anwar M. Durrani
>> +91-8605010721
>>  
>>
>>
>>
>
>
> --
> Thanks & regards,
> Anwar M. Durrani
> +91-8605010721
> 
>
>
>  ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Anwar Durrani
Thanks Edgar for the help. I have a question about the following section:


   -

   Edit /etc/keystone/keystone.conf:

   vim /etc/keystone/keystone.conf

   [DEFAULT]
   admin_token=ADMIN
   log_dir=/var/log/keystone

   [database]
   connection = mysql://keystone:password@controller/keystone


admin_token=ADMIN --> does this mean a hex-generated token value, or just
ADMIN as it is?


On Thu, Jan 15, 2015 at 2:54 PM, Edgar Magana 
wrote:

>  Go for Icehouse:
>
> https://github.com/emagana/OpenStack-Icehouse-Install-Guide/blob/master/OpenStack-Icehouse-Installation.rst
>
>  *Edgar*
>
>   From: Anwar Durrani 
> Date: Thursday, January 15, 2015 at 1:12 AM
> To: openstack-operators 
> Subject: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON
> CENTOS 6.5
>
>Hello everyone,
>
>  I want to setup Havana, Do anyone has installation guide for the same ?
>
>  Thanks
>
>  --
>  Thanks & regards,
> Anwar M. Durrani
> +91-8605010721
>  
>
>
>


-- 
Thanks & regards,
Anwar M. Durrani
+91-8605010721

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Edgar Magana
Go for Icehouse:
https://github.com/emagana/OpenStack-Icehouse-Install-Guide/blob/master/OpenStack-Icehouse-Installation.rst

Edgar

From: Anwar Durrani <durrani.an...@gmail.com>
Date: Thursday, January 15, 2015 at 1:12 AM
To: openstack-operators <openstack-operators@lists.openstack.org>
Subject: [Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

Hello everyone,

I want to setup Havana, Do anyone has installation guide for the same ?

Thanks

--
Thanks & regards,
Anwar M. Durrani
+91-8605010721


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] I WANT TO SETUP AND CONFIGURE HAVANA ON CENTOS 6.5

2015-01-15 Thread Anwar Durrani
Hello everyone,

I want to set up Havana. Does anyone have an installation guide for it?

Thanks

-- 
Thanks & regards,
Anwar M. Durrani
+91-8605010721

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openstack-Keystone error

2015-01-15 Thread Anwar Durrani
Hi Alex, below is the error in the log file

2015-01-15 01:08:34.128 50243 ERROR keystone.common.wsgi [-]
(ProgrammingError) (1146, "Table 'keystone.token' doesn't exist") 'SELECT
token.id AS token_id, token.expires AS token_expires, token.extra AS
token_extra, token.valid AS token_valid, token.user_id AS token_user_id,
token.trust_id AS token_trust_id \nFROM token \nWHERE token.id = %s'
('2c0dc0032d675623f37a',)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi Traceback (most
recent call last):
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/keystone/common/wsgi.py", line 430, in
__call__
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi response =
self.process_request(request)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/keystone/middleware/core.py", line 279,
in process_request
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi auth_context =
self._build_auth_context(request)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/keystone/middleware/core.py", line 259,
in _build_auth_context
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
token_data=self.token_provider_api.validate_token(token_id))
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/keystone/token/provider.py", line 225, in
validate_token
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi token =
self._validate_token(unique_id)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1013, in
decorate
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
should_cache_fn)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 640, in
get_or_create
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi async_creator)
as value:
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 158, in
__enter__
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi return
self._enter()
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 98, in
_enter
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi generated =
self._enter_create(createdtime)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 149, in
_enter_create
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi created =
self.creator()
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 612, in
gen_value
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi created_value
= creator()
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1009, in
creator
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi return
fn(*arg, **kw)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/keystone/token/provider.py", line 318, in
_validate_token
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi token_ref =
self._persistence.get_token(token_id)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/keystone/token/persistence/core.py", line
76, in get_token
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi token_ref =
self._get_token(unique_id)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1013, in
decorate
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi
should_cache_fn)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 640, in
get_or_create
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi async_creator)
as value:
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 158, in
__enter__
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi return
self._enter()
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 98, in
_enter
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi generated =
self._enter_create(createdtime)
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/core/dogpile.py", line 149, in
_enter_create
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi created =
self.creator()
2015-01-15 01:08:34.128 50243 TRACE keystone.common.wsgi   File
"/usr/lib/python2.7/site-packages/dogpile/cache/regio

Re: [Openstack-operators] Openstack-Keystone error

2015-01-15 Thread Alex Leonhardt
I don't think anyone should try to install OpenStack manually :) .. but check
the keystone logs for what caused the 500. Maybe the admin tenant/project
already exists?
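
For example (assuming the log_dir=/var/log/keystone setting quoted elsewhere in these threads; the exact file name depends on how keystone is started):

    # grep ERROR /var/log/keystone/*.log | tail -n 20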

On Thu, 15 Jan 2015 08:29 Anwar Durrani  wrote:

> Hi everyone,
>
> i am getting below error while running below command
>
> [root@localhost ~]# keystone tenant-create --name admin --description
> "Admin Tenant"
> An unexpected error prevented the server from fulfilling your request.
> (HTTP 500)
> [root@localhost ~]#
>
> Prior to run this command i have done following :
>
>
> * Create tenants, users, and roles*
>
> After you install the Identity service, create tenants (projects), users,
> and roles for your environment. You must use the temporary administration
> token that you created in the section called “Install and configure”
> 
> and manually configure the location (endpoint) of the Identity service
> before you run *keystone* commands.
>
> You can pass the value of the administration token to the *keystone*
> command with the --os-token option or set the temporary OS_SERVICE_TOKEN
> environment variable. Similarly, you can pass the location of the Identity
> service to the *keystone* command with the --os-endpoint option or set
> the temporary OS_SERVICE_ENDPOINT environment variable. This guide uses
> environment variables to reduce command length.
>
> For more information, see the Operations Guide - Managing Project and
> Users
> .
>
>
>
> *To configure prerequisites*
>
>1. Configure the administration token:
>$ export OS_SERVICE_TOKEN=1dd717043ad277e29edb
>$ export OS_SERVICE_TOKEN=294a4c8a8a475f9b9836
>2. Configure the endpoint:
>$ export OS_SERVICE_ENDPOINT=http://*controller*:35357/v2.0
>
>
> ​Please advise, how to fix this issue ?
>
> Thanks​
>
> --
> Thanks & regards,
> Anwar M. Durrani
> +91-8605010721
> 
>
>
>  ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Openstack-Keystone error

2015-01-15 Thread Anwar Durrani
Hi everyone,

I am getting the error below while running the following command:

[root@localhost ~]# keystone tenant-create --name admin --description
"Admin Tenant"
An unexpected error prevented the server from fulfilling your request.
(HTTP 500)
[root@localhost ~]#

Prior to running this command, I did the following:


* Create tenants, users, and roles*

After you install the Identity service, create tenants (projects), users,
and roles for your environment. You must use the temporary administration
token that you created in the section called “Install and configure”
and manually configure the location (endpoint) of the Identity service
before you run *keystone* commands.

You can pass the value of the administration token to the *keystone*
command with the --os-token option or set the temporary OS_SERVICE_TOKEN
environment variable. Similarly, you can pass the location of the Identity
service to the *keystone* command with the --os-endpoint option or set the
temporary OS_SERVICE_ENDPOINT environment variable. This guide uses
environment variables to reduce command length.
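
For example, the same thing as one-off command-line options instead of environment variables (a sketch using the token and endpoint values listed below):

    $ keystone --os-token 294a4c8a8a475f9b9836 \
        --os-endpoint http://controller:35357/v2.0 \
        tenant-create --name admin --description "Admin Tenant"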

For more information, see the Operations Guide - Managing Project and Users.



*To configure prerequisites*

   1. Configure the administration token:
   $ export OS_SERVICE_TOKEN=1dd717043ad277e29edb
   $ export OS_SERVICE_TOKEN=294a4c8a8a475f9b9836
   2. Configure the endpoint:
   $ export OS_SERVICE_ENDPOINT=http://*controller*:35357/v2.0


Please advise, how do I fix this issue?

Thanks

-- 
Thanks & regards,
Anwar M. Durrani
+91-8605010721

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] New services disable reason

2015-01-15 Thread Alex Leonhardt
Our install is still quite small and we accept the risk of hitting that
compute node whilst the service is starting, but we haven't actually
encountered that yet (probably a function of our user-base size) ...

IMHO, having a reason why stuff is disabled avoids hours of confusion and
trying to find the person who 'did it'.

In terms of keeping track, I'd have thought that the dashboard admin panel
can show you service states and I'd expect a disabled service to say
'disabled', but again we don't even use this feature at the moment.

+1 from me :)

Alex

On Wed, 14 Jan 2015 19:34 Belmiro Moreira <
moreira.belmiro.email.li...@gmail.com> wrote:

> Hi,
> as operators I would like to have your comments/suggestions on:
> https://review.openstack.org/#/c/136645/1
>
>
> With a large number of nodes several services are disabled because various
> reasons (in our case mainly hardware interventions).
> To help operations we use the "disable reason" as fast filter to identify
> why the service is disabled.
>
> At same time, we add several new nodes (nova-compute) per week.
> At CERN to avoid adding a service when the daemon starts for the first
> time nova is configured with:
> enable_new_services=False
> This is great, however no "disable reason" is set.
> For us having services disabled with no reason specified creates
> additional checks.
>
> How are others keeping track of disabled services?
>
>
> Belmiro
> ---
> CERN
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
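
For reference, the pieces being discussed above look roughly like this (a sketch only; "compute-042" is a placeholder host name, adjust reasons and hosts to your environment):

    # nova.conf on the controller: services added by newly started daemons begin disabled
    [DEFAULT]
    enable_new_services=False

    # disable/enable a compute service explicitly, recording a reason operators can filter on
    $ nova service-disable --reason "hardware intervention" compute-042 nova-compute
    $ nova service-enable compute-042 nova-compute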
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators