Re: [Openstack] [openstack-dev] [nova][api] Novaclient redirect endpoint https into http

2018-07-05 Thread Monty Taylor

On 07/05/2018 01:55 PM, melanie witt wrote:

+openstack-dev@

On Wed, 4 Jul 2018 14:50:26 +, Bogdan Katynski wrote:
But, I can not use nova command, endpoint nova have been redirected 
from https to http. Here: http://prntscr.com/k2e8s6 (command: nova 
--insecure service list)
First of all, it seems that the nova client is hitting /v2.1 instead 
of /v2.1/ URI and this seems to be triggering the redirect.


Since openstack CLI works, I presume it must be using the correct URL 
and hence it’s not getting redirected.


And this is error log: Unable to establish connection 
to http://192.168.30.70:8774/v2.1/: ('Connection aborted.', 
BadStatusLine("''",))
Looks to me that nova-api does a redirect to an absolute URL. I 
suspect SSL is terminated on the HAProxy and nova-api itself is 
configured without SSL so it redirects to an http URL.


In my opinion, nova would be more load-balancer friendly if it used a 
relative URI in the redirect but that’s outside of the scope of this 
question and since I don’t know the context behind choosing the 
absolute URL, I could be wrong on that.


Thanks for mentioning this. We do have a bug open in python-novaclient 
around a similar issue [1]. I've added comments based on this thread and 
will consult with the API subteam to see if there's something we can do 
about this in nova-api.


A similar thing came up the other day related to keystone and version 
discovery. Version discovery documents tend to return full urls - even 
though relative urls would make public/internal API endpoints work 
better. (also, sometimes people don't configure things properly and the 
version discovery url winds up being incorrect)


In shade/sdk - we actually construct a wholly-new discovery url based on 
the url used for the catalog and the url in the discovery document since 
we've learned that the version discovery urls are frequently broken.


This is problematic because SOMETIMES people have public urls deployed 
as a sub-url and internal urls deployed on a port - so you have:


Catalog:
public: https://example.com/compute
internal: https://compute.example.com:1234

Version discovery:
https://example.com/compute/v2.1

When we go to combine the catalog url and the versioned url, if the user 
is hitting internal, we produce 
https://compute.example.com:1234/compute/v2.1 - because we have no way 
of systematically knowing that /compute should also be stripped.
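
To make the failure concrete, here is a minimal sketch of that naive combination. It is not the actual shade/sdk code: the helper name `combine` is made up for illustration, and it simply keeps the scheme and host from the catalog url and the path from the discovery document.

```python
from urllib.parse import urlsplit, urlunsplit


def combine(catalog_url, discovered_url):
    """Keep scheme/host/port from the catalog entry, path from discovery.

    Hypothetical helper for illustration only; the real shade/sdk logic
    is more involved.
    """
    catalog = urlsplit(catalog_url)
    discovered = urlsplit(discovered_url)
    return urlunsplit((catalog.scheme, catalog.netloc, discovered.path, "", ""))


discovery_doc_url = "https://example.com/compute/v2.1"

# Public endpoint: the /compute prefix belongs there, so the result is right.
public = combine("https://example.com/compute", discovery_doc_url)

# Internal endpoint: nothing tells us /compute is a public-only prefix,
# so it leaks into the internal url.
internal = combine("https://compute.example.com:1234", discovery_doc_url)

print(public)
print(internal)
```

The second result still carries /compute, which is exactly the url that does not exist on the internal endpoint.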


VERY LONG WINDED WAY of saying 2 things:

a) Relative URLs would be *way* friendlier (and incidentally are 
supported by keystoneauth, openstacksdk and shade - and are written up 
as being a thing people *should* support in the documents about API 
consumption)


b) Can we get agreement that changing behavior to return or redirect to 
a relative URL would not be considered an api contract break? (it's 
possible the answer to this is 'no' - so it's a real question)
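
For what it's worth, this is roughly how a client resolves the two kinds of redirect target. It is only a sketch using Python's standard `urljoin`: the https request url is my assumption about what the client sends to the load balancer, and the two Location values are examples, not captured from a real nova-api.

```python
from urllib.parse import urljoin

# Assumed: the client talks https to the load balancer, without the trailing slash.
request_url = "https://192.168.30.70:8774/v2.1"

# Absolute Location built by a backend that only knows about plain http.
absolute = urljoin(request_url, "http://192.168.30.70:8774/v2.1/")

# Relative Location: the client keeps the scheme and host it already used.
relative = urljoin(request_url, "/v2.1/")

print(absolute)
print(relative)
```

With the relative form the client stays on https, so the SSL-terminating proxy in front never has to rewrite anything.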


Monty

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Recovering from full outage

2018-07-05 Thread Torin Woltjer
Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't 
working on my VMs. I created a new instance from an Ubuntu 18.04 image to test 
with; the hostname was not set to the name of the instance, and I could not log in 
as the users I had specified in the configuration.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/5/18 12:57 PM
To: torin.wolt...@granddial.com
Cc: "openstack@lists.openstack.org" , 
"openstack-operat...@lists.openstack.org" 

Subject: Re: [Openstack] Recovering from full outage
You should tcpdump inside the qdhcp namespace to see if the requests make it 
there, and also check iptables rules on the compute nodes for the return 
traffic.

On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer  
wrote:
Yes, I've done this. The VMs hang for a while waiting for DHCP and eventually 
come up with no addresses. neutron-dhcp-agent has been restarted on both 
controllers. The qdhcp netns's were all present; I stopped the service, removed 
the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, 
restarted all neutron services, noted the qdhcp netns's were recreated, 
restarted a VM again and it still fails to pull an IP address.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/5/18 10:38 AM
To: torin.wolt...@granddial.com
Subject: Re: [Openstack] Recovering from full outage
Did you restart the neutron-dhcp-agent and reboot the VMs?

On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer  
wrote:
The qrouter netns appears once the lock_path is specified, the neutron router 
is pingable as well. However, instances are not pingable. If I log in via 
console, the instances have not been given IP addresses, if I manually give 
them an address and route they are pingable and seem to work. So the router is 
working correctly but dhcp is not working.

No errors in any of the neutron or nova logs on controllers or compute nodes.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/5/18 8:53 AM
To: 
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage
There is no lock path set in my neutron configuration. Does it ultimately 
matter what it is set to as long as it is consistent? Does it need to be set on 
compute nodes as well as controllers?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 7:47 PM
To: torin.wolt...@granddial.com
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in the neutron’s config?

On Jul 3, 2018, at 17:34, Torin Woltjer  wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both 
controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/3/18 5:14 PM
To: 
Cc: "openstack-operat...@lists.openstack.org" 
, "openstack@lists.openstack.org" 

Subject: Re: [Openstack] Recovering from full outage
Running `openstack server reboot` on an instance just causes the instance to be 
stuck in a rebooting status. Most notable of the logs is neutron-server.log 
which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted 
controllers, and all of the agents show online.
http://paste.openstack.org/show/724921/
And all of the instances can be properly started, however I cannot ping any of 
the instances floating IPs or the neutron router. And when logging into an 
instance with the console, there is no IP address on any interface.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 11:50 AM
To: torin.wolt...@granddial.com
Subject: Re: [Openstack] Recovering from full outage
Try restarting them using "openstack server reboot" and also check the 
nova-compute.log and neutron agents logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer  
wrote:
We just suffered a power outage in our data center and I'm having trouble 
recovering the Openstack cluster. All of the nodes are back online, every 
instance shows active but `virsh list --all` on the compute nodes show that all 
of the VMs are actually shut down.

Re: [Openstack] [nova][api] Novaclient redirect endpoint https into http

2018-07-05 Thread melanie witt

+openstack-dev@

On Wed, 4 Jul 2018 14:50:26 +, Bogdan Katynski wrote:

But, I can not use nova command, endpoint nova have been redirected from https 
to http. Here: http://prntscr.com/k2e8s6 (command: nova --insecure service list)

First of all, it seems that the nova client is hitting /v2.1 instead of /v2.1/ 
URI and this seems to be triggering the redirect.

Since openstack CLI works, I presume it must be using the correct URL and hence 
it’s not getting redirected.

  
And this is error log: Unable to establish connection to http://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",))
  

Looks to me that nova-api does a redirect to an absolute URL. I suspect SSL is 
terminated on the HAProxy and nova-api itself is configured without SSL so it 
redirects to an http URL.

In my opinion, nova would be more load-balancer friendly if it used a relative 
URI in the redirect but that’s outside of the scope of this question and since 
I don’t know the context behind choosing the absolute URL, I could be wrong on 
that.


Thanks for mentioning this. We do have a bug open in python-novaclient 
around a similar issue [1]. I've added comments based on this thread and 
will consult with the API subteam to see if there's something we can do 
about this in nova-api.


-melanie

[1] https://bugs.launchpad.net/python-novaclient/+bug/1776928






Re: [Openstack] Recovering from full outage

2018-07-05 Thread George Mihaiescu
You should tcpdump inside the qdhcp namespace to see if the requests make
it there, and also check iptables rules on the compute nodes for the return
traffic.
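
As a rough sketch of what I mean, the helper below only builds the command line so you can review it before running it as root on the node hosting the DHCP agent. The `qdhcp-<network id>` namespace naming is what neutron normally uses; the function name, the `NETWORK_ID` placeholder and the choice of capturing on every interface are my own assumptions.

```python
import shlex


def dhcp_capture_command(network_id, interface="any"):
    """Build a tcpdump invocation that watches DHCP traffic in a qdhcp netns.

    Illustrative helper: it returns the argument list and does not run it.
    """
    namespace = f"qdhcp-{network_id}"
    return [
        "ip", "netns", "exec", namespace,
        "tcpdump", "-n", "-i", interface,
        "port", "67", "or", "port", "68",  # DHCP server and client ports
    ]


# NETWORK_ID is a placeholder; substitute the id of the affected network.
cmd = dhcp_capture_command("NETWORK_ID")
print(shlex.join(cmd))
```

If DHCP requests show up there but no replies leave, look at dnsmasq; if nothing shows up at all, the problem is between the VM tap device and the namespace.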


On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer 
wrote:

> Yes, I've done this. The VMs hang for a while waiting for DHCP and
> eventually come up with no addresses. neutron-dhcp-agent has been restarted
> on both controllers. The qdhcp netns's were all present; I stopped the
> service, removed the qdhcp netns's, noted the dhcp agents show offline by
> `neutron agent-list`, restarted all neutron services, noted the qdhcp
> netns's were recreated, restarted a VM again and it still fails to pull an
> IP address.
>
> *Torin Woltjer*
>
> *Grand Dial Communications - A ZK Tech Inc. Company*
>
> *616.776.1066 ext. 2006*
> *www.granddial.com*
>
> --
> *From*: George Mihaiescu 
> *Sent*: 7/5/18 10:38 AM
> *To*: torin.wolt...@granddial.com
> *Subject*: Re: [Openstack] Recovering from full outage
> Did you restart the neutron-dhcp-agent and reboot the VMs?
>
> On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer <
> torin.wolt...@granddial.com> wrote:
>
>> The qrouter netns appears once the lock_path is specified, the neutron
>> router is pingable as well. However, instances are not pingable. If I log
>> in via console, the instances have not been given IP addresses, if I
>> manually give them an address and route they are pingable and seem to work.
>> So the router is working correctly but dhcp is not working.
>>
>> No errors in any of the neutron or nova logs on controllers or compute
>> nodes.
>>
>>
>> *Torin Woltjer*
>>
>> *Grand Dial Communications - A ZK Tech Inc. Company*
>>
>> *616.776.1066 ext. 2006*
>> *www.granddial.com*
>>
>> --
>> *From*: "Torin Woltjer" 
>> *Sent*: 7/5/18 8:53 AM
>> *To*: 
>> *Cc*: openstack-operat...@lists.openstack.org,
>> openstack@lists.openstack.org
>> *Subject*: Re: [Openstack] Recovering from full outage
>> There is no lock path set in my neutron configuration. Does it ultimately
>> matter what it is set to as long as it is consistent? Does it need to be
>> set on compute nodes as well as controllers?
>>
>> *Torin Woltjer*
>>
>> *Grand Dial Communications - A ZK Tech Inc. Company*
>>
>> *616.776.1066 ext. 2006*
>> *www.granddial.com*
>>
>> --
>> *From*: George Mihaiescu 
>> *Sent*: 7/3/18 7:47 PM
>> *To*: torin.wolt...@granddial.com
>> *Cc*: openstack-operat...@lists.openstack.org,
>> openstack@lists.openstack.org
>> *Subject*: Re: [Openstack] Recovering from full outage
>>
>> Did you set a lock_path in the neutron’s config?
>>
>> On Jul 3, 2018, at 17:34, Torin Woltjer 
>> wrote:
>>
>> The following errors appear in the neutron-linuxbridge-agent.log on both
>> controllers: http://paste.openstack.org/show/724930/
>>
>> No such errors are on the compute nodes themselves.
>>
>> *Torin Woltjer*
>>
>> *Grand Dial Communications - A ZK Tech Inc. Company*
>>
>> *616.776.1066 ext. 2006*
>> *www.granddial.com*
>>
>> --
>> *From*: "Torin Woltjer" 
>> *Sent*: 7/3/18 5:14 PM
>> *To*: 
>> *Cc*: "openstack-operat...@lists.openstack.org" <
>> openstack-operat...@lists.openstack.org>, "openstack@lists.openstack.org"
>> 
>> *Subject*: Re: [Openstack] Recovering from full outage
>> Running `openstack server reboot` on an instance just causes the instance
>> to be stuck in a rebooting status. Most notable of the logs is
>> neutron-server.log which shows the following:
>> http://paste.openstack.org/show/724917/
>>
>> I realized that rabbitmq was in a failed state, so I bootstrapped it,
>> rebooted controllers, and all of the agents show online.

Re: [Openstack] Recovering from full outage

2018-07-05 Thread Torin Woltjer
Yes, I've done this. The VMs hang for a while waiting for DHCP and eventually 
come up with no addresses. neutron-dhcp-agent has been restarted on both 
controllers. The qdhcp netns's were all present; I stopped the service, removed 
the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, 
restarted all neutron services, noted the qdhcp netns's were recreated, 
restarted a VM again and it still fails to pull an IP address.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/5/18 10:38 AM
To: torin.wolt...@granddial.com
Subject: Re: [Openstack] Recovering from full outage
Did you restart the neutron-dhcp-agent and reboot the VMs?

On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer  
wrote:
The qrouter netns appears once the lock_path is specified, the neutron router 
is pingable as well. However, instances are not pingable. If I log in via 
console, the instances have not been given IP addresses, if I manually give 
them an address and route they are pingable and seem to work. So the router is 
working correctly but dhcp is not working.

No errors in any of the neutron or nova logs on controllers or compute nodes.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/5/18 8:53 AM
To: 
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage
There is no lock path set in my neutron configuration. Does it ultimately 
matter what it is set to as long as it is consistent? Does it need to be set on 
compute nodes as well as controllers?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 7:47 PM
To: torin.wolt...@granddial.com
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in the neutron’s config?

On Jul 3, 2018, at 17:34, Torin Woltjer  wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both 
controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/3/18 5:14 PM
To: 
Cc: "openstack-operat...@lists.openstack.org" 
, "openstack@lists.openstack.org" 

Subject: Re: [Openstack] Recovering from full outage
Running `openstack server reboot` on an instance just causes the instance to be 
stuck in a rebooting status. Most notable of the logs is neutron-server.log 
which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted 
controllers, and all of the agents show online.
http://paste.openstack.org/show/724921/
And all of the instances can be properly started, however I cannot ping any of 
the instances floating IPs or the neutron router. And when logging into an 
instance with the console, there is no IP address on any interface.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 11:50 AM
To: torin.wolt...@granddial.com
Subject: Re: [Openstack] Recovering from full outage
Try restarting them using "openstack server reboot" and also check the 
nova-compute.log and neutron agents logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer  
wrote:
We just suffered a power outage in our data center and I'm having trouble 
recovering the Openstack cluster. All of the nodes are back online, every 
instance shows active but `virsh list --all` on the compute nodes show that all 
of the VMs are actually shut down. Running `ip addr` on any of the nodes shows 
that none of the bridges are present and `ip netns` shows that all of the 
network namespaces are missing as well. So despite all of the neutron service 
running, none of the networking appears to be active, which is concerning. How 
do I solve this without recreating all of the networks?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


Re: [Openstack] NUMA some of the time?

2018-07-05 Thread Fabrizio Soppelsa
Greetings Toni,
Not sure I'm answering, but just as a quick note you can prioritize NUMA
and CPU-pinning by creating dedicated flavors [1] with something like:

openstack flavor set m1.largenuma --property hw:numa_cpus.0=0,1 \
    --property hw:numa_mem.0=2048
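
If you end up with several such flavors, a small helper can assemble the same command from a dict of extra specs. This is only a sketch: the function name is made up, it prints the command rather than running it, and my reading of [1] is that hw:numa_cpus.0 lists the guest vCPUs placed on guest NUMA node 0 and hw:numa_mem.0 is that node's memory in MB.

```python
import shlex


def flavor_set_command(flavor, properties):
    """Build an `openstack flavor set` argument list from extra specs.

    Illustrative helper: it only assembles the arguments, nothing is executed.
    """
    cmd = ["openstack", "flavor", "set", flavor]
    for key, value in properties.items():
        cmd += ["--property", f"{key}={value}"]
    return cmd


# Same extra specs as in the command above.
numa_props = {
    "hw:numa_cpus.0": "0,1",
    "hw:numa_mem.0": "2048",
}

cmd = flavor_set_command("m1.largenuma", numa_props)
print(shlex.join(cmd))
```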

Usually NUMA cores are reserved or shared only for specific fixed ops, and
I'm not sure what you mean by "throwing off more cores" in case of need.
Perhaps you need to look into something like Heat autoscaling?

Cheers,
Fabrizio

[1] https://docs.openstack.org/nova/pike/admin/cpu-topologies.html

On Wed, Jul 4, 2018 at 5:19 PM Toni Mueller  wrote:

>
> Hi,
>
> I am still trying to figure how to best utilise the small set of
> hardware, and discovered the NUMA configuration mechanism. It allows me
> to configure reserved cores for certain VMs, but it does not seem to
> allow me to say "you can share these cores, but VMs of, say, appropriate
> flavour take precedence and will throw you off these cores in case they
> need more power".
>
> How can I achieve that, dynamically?
>
> TIA!
>
>
> Thanks,
> Toni
>
>


Re: [Openstack] Recovering from full outage

2018-07-05 Thread Torin Woltjer
The qrouter netns appears once the lock_path is specified, and the neutron router 
is pingable as well. However, instances are not pingable. If I log in via 
console, the instances have not been given IP addresses; if I manually give 
them an address and route, they are pingable and seem to work. So the router is 
working correctly but DHCP is not working.
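
For anyone checking the same thing, below is a quick sketch of how one might verify that lock_path is present in a neutron.conf. The `[oslo_concurrency]` section and `lock_path` option are the standard oslo.concurrency names; the sample contents and the /var/lib/neutron/tmp value are only example values, not my actual configuration, and the function name is made up.

```python
import configparser

# Example contents only; a real neutron.conf has many more options.
SAMPLE_CONF = """
[DEFAULT]
core_plugin = ml2

[oslo_concurrency]
lock_path = /var/lib/neutron/tmp
"""


def find_lock_path(conf_text):
    """Return the lock_path under [oslo_concurrency], or None if it is absent."""
    # interpolation=None so '%' characters in other options are left alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section("oslo_concurrency"):
        return None
    return parser.get("oslo_concurrency", "lock_path", fallback=None)


print(find_lock_path(SAMPLE_CONF))
```

To check a real node, pass in the contents of its neutron.conf instead of the sample string.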

No errors in any of the neutron or nova logs on controllers or compute nodes.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/5/18 8:53 AM
To: 
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage
There is no lock path set in my neutron configuration. Does it ultimately 
matter what it is set to as long as it is consistent? Does it need to be set on 
compute nodes as well as controllers?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 7:47 PM
To: torin.wolt...@granddial.com
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in the neutron’s config?

On Jul 3, 2018, at 17:34, Torin Woltjer  wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both 
controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/3/18 5:14 PM
To: 
Cc: "openstack-operat...@lists.openstack.org" 
, "openstack@lists.openstack.org" 

Subject: Re: [Openstack] Recovering from full outage
Running `openstack server reboot` on an instance just causes the instance to be 
stuck in a rebooting status. Most notable of the logs is neutron-server.log 
which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted 
controllers, and all of the agents show online.
http://paste.openstack.org/show/724921/
And all of the instances can be properly started, however I cannot ping any of 
the instances floating IPs or the neutron router. And when logging into an 
instance with the console, there is no IP address on any interface.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 11:50 AM
To: torin.wolt...@granddial.com
Subject: Re: [Openstack] Recovering from full outage
Try restarting them using "openstack server reboot" and also check the 
nova-compute.log and neutron agents logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer  
wrote:
We just suffered a power outage in our data center and I'm having trouble 
recovering the Openstack cluster. All of the nodes are back online, every 
instance shows active but `virsh list --all` on the compute nodes show that all 
of the VMs are actually shut down. Running `ip addr` on any of the nodes shows 
that none of the bridges are present and `ip netns` shows that all of the 
network namespaces are missing as well. So despite all of the neutron service 
running, none of the networking appears to be active, which is concerning. How 
do I solve this without recreating all of the networks?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com



Re: [Openstack] Recovering from full outage

2018-07-05 Thread Torin Woltjer
There is no lock path set in my neutron configuration. Does it ultimately 
matter what it is set to as long as it is consistent? Does it need to be set on 
compute nodes as well as controllers?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 7:47 PM
To: torin.wolt...@granddial.com
Cc: openstack-operat...@lists.openstack.org, openstack@lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in the neutron’s config?

On Jul 3, 2018, at 17:34, Torin Woltjer  wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both 
controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: "Torin Woltjer" 
Sent: 7/3/18 5:14 PM
To: 
Cc: "openstack-operat...@lists.openstack.org" 
, "openstack@lists.openstack.org" 

Subject: Re: [Openstack] Recovering from full outage
Running `openstack server reboot` on an instance just causes the instance to be 
stuck in a rebooting status. Most notable of the logs is neutron-server.log 
which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted 
controllers, and all of the agents show online.
http://paste.openstack.org/show/724921/
And all of the instances can be properly started, however I cannot ping any of 
the instances floating IPs or the neutron router. And when logging into an 
instance with the console, there is no IP address on any interface.

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com


From: George Mihaiescu 
Sent: 7/3/18 11:50 AM
To: torin.wolt...@granddial.com
Subject: Re: [Openstack] Recovering from full outage
Try restarting them using "openstack server reboot" and also check the 
nova-compute.log and neutron agents logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer  
wrote:
We just suffered a power outage in our data center and I'm having trouble 
recovering the Openstack cluster. All of the nodes are back online, every 
instance shows active but `virsh list --all` on the compute nodes show that all 
of the VMs are actually shut down. Running `ip addr` on any of the nodes shows 
that none of the bridges are present and `ip netns` shows that all of the 
network namespaces are missing as well. So despite all of the neutron service 
running, none of the networking appears to be active, which is concerning. How 
do I solve this without recreating all of the networks?

Torin Woltjer

Grand Dial Communications - A ZK Tech Inc. Company

616.776.1066 ext. 2006
www.granddial.com
