Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Remo Mattei
Ouch, no deployment tools? Nevertheless, I will check the version I have on mine.

Remo

On 18 Sep 2017, at 19:43, Jean-Philippe Méthot wrote:

I use RDO Ocata without any deployment tool
Neutron version is openstack-neutron-10.0.3-1.el7.noarch

It's from August 28th.

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On 19 Sep 2017, at 11:00, Remo Mattei wrote:
> 
> Are you running RDO / Juju? What is the version?
> 
> Thanks 
> 
>> On 9/18/17 6:40 PM, Jean-Philippe Méthot wrote:
>> Hi,
>> 
>> Thank you for your reply. We did restart all neutron services, several 
>> times. We also restarted the servers but the issue is still there.
>> 
>> Best regards,
>> 
>> Jean-Philippe Méthot
>> Openstack system administrator
>> Administrateur système Openstack
>> PlanetHoster inc.
>> 
>> 
>> 
>> 
>>> On 19 Sep 2017, at 10:01, Remo Mattei wrote:
>>> 
>>> I saw something similar. Did you restart all the services after the
>>> upgrade? Just wondering. I saw another issue when I upgraded from 7.3 to
>>> 7.4 where it gave me a vif error; after all the servers rebooted, the
>>> problem was gone.
>>> 
>>> Let me know. 
>>> 
>>> On 18 Sep 2017, at 17:02, JP Japan wrote:
>>> 
>>> Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
>>> more info about our setup.
>>> 
>>> -It’s running latest Ocata with Openvswitch and network dedicated nodes.
>>> -The network nodes are L3HA
>>> -There’s no DVR here.
>>> 
 On 19 Sep 2017, at 08:51, JP Japan wrote:
 
 Hi,
 
 A few days ago, we made two big changes on our production infrastructure:
 we updated to the latest Ocata and we changed the outgoing port on our
 network node to an LACP port. We made the change by switching the port in
 br-ex in Open vSwitch to the new LACP-backed port. Ever since these two
 changes, which happened one right after the other, we've run into two
 issues, one of which has much worse consequences than the other:
 
 1.We can’t add floating ips to instances anymore. The interface says the 
 operation completed successfully, the database gets updated, but the IP 
 address doesn’t exist in the network namespace on the network nodes. 
 Strangely enough, the iptables rules in the NAT table do exist. The port 
 just doesn’t receive the new address. Adding the floating ip address 
 manually to the virtual interface with "ip netns exec *qrouter namespace 
 id* ip addr add *ip address* dev *virtual interface*" solves this, but is 
 in no way a permanent solution.
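 
 For illustration, here is a rough sketch of that workaround on a network
 node, with a hypothetical router UUID, a documentation IP (203.0.113.10)
 and a made-up qg- interface name standing in for the real values:
 
 # locate the router's namespace (named after the neutron router ID)
 ip netns list | grep qrouter
 # the NAT rules for the floating IP are present...
 ip netns exec qrouter-<router-uuid> iptables -t nat -S | grep 203.0.113.10
 # ...but the address itself is missing from the qg- interface
 ip netns exec qrouter-<router-uuid> ip addr show
 # add it by hand; this is lost again on the next router update
 ip netns exec qrouter-<router-uuid> ip addr add 203.0.113.10/32 dev qg-<port-prefix>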
 
 2.We’re getting an error message in the L3-agent whenever it starts 
 informing us it was unable to add some rules in iptables because there’s a 
 lock on xtables, while as far as we know, the L3-agent itself is the one 
 holding the lock. Here’s the error: 
 
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
 
 It’s not clear exactly how this is affecting the setup, as metadata is 
 still going through properly (most likely through the DHCP) but it’s quite 
 worrying.
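 
 The -w flag that the stderr suggests makes iptables wait for the xtables
 lock instead of failing. Purely to illustrate the flag (Neutron applies
 these rules itself via its iptables manager; this is not a manual fix we
 are applying), the rule from the log above would look like:
 
 iptables -w -t nat -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 \
     -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697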
 ___
 Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
 Post to : openstack@lists.openstack.org
 Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> 
>>> Jean-Philippe Méthot
>>> Openstack system administrator
>>> PlanetHoster inc.
>>> ___
>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to : openstack@lists.openstack.org
>>> Unsubscribe : 

Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Jean-Philippe Méthot
I use RDO Ocata without any deployment tool
Neutron version is openstack-neutron-10.0.3-1.el7.noarch

It's from August 28th.

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On 19 Sep 2017, at 11:00, Remo Mattei wrote:
> 
> Are you running RDO / Juju? What is the version?
> 
> Thanks 
> 
> On 9/18/17 6:40 PM, Jean-Philippe Méthot wrote:
>> Hi,
>> 
>> Thank you for your reply. We did restart all neutron services, several 
>> times. We also restarted the servers but the issue is still there.
>> 
>> Best regards,
>> 
>> Jean-Philippe Méthot
>> Openstack system administrator
>> Administrateur système Openstack
>> PlanetHoster inc.
>> 
>> 
>> 
>> 
>>> On 19 Sep 2017, at 10:01, Remo Mattei wrote:
>>> 
>>> I saw something similar. Did you restart all the services after the
>>> upgrade? Just wondering. I saw another issue when I upgraded from 7.3 to
>>> 7.4 where it gave me a vif error; after all the servers rebooted, the
>>> problem was gone.
>>> 
>>> Let me know. 
>>> 
>>> On 18 Sep 2017, at 17:02, JP Japan wrote:
>>> 
>>> Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
>>> more info about our setup.
>>> 
>>> -It’s running latest Ocata with Openvswitch and network dedicated nodes.
>>> -The network nodes are L3HA
>>> -There’s no DVR here.
>>> 
 On 19 Sep 2017, at 08:51, JP Japan wrote:
 
 Hi,
 
 A few days ago, we made two big changes on our production infrastructure:
 we updated to the latest Ocata and we changed the outgoing port on our
 network node to an LACP port. We made the change by switching the port in
 br-ex in Open vSwitch to the new LACP-backed port. Ever since these two
 changes, which happened one right after the other, we've run into two
 issues, one of which has much worse consequences than the other:
 
 1.We can’t add floating ips to instances anymore. The interface says the 
 operation completed successfully, the database gets updated, but the IP 
 address doesn’t exist in the network namespace on the network nodes. 
 Strangely enough, the iptables rules in the NAT table do exist. The port 
 just doesn’t receive the new address. Adding the floating ip address 
 manually to the virtual interface with "ip netns exec *qrouter namespace 
 id* ip addr add *ip address* dev *virtual interface*" solves this, but is 
 in no way a permanent solution.
 
 2.We’re getting an error message in the L3-agent whenever it starts 
 informing us it was unable to add some rules in iptables because there’s a 
 lock on xtables, while as far as we know, the L3-agent itself is the one 
 holding the lock. Here’s the error: 
 
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
 
 It’s not clear exactly how this is affecting the setup, as metadata is 
 still going through properly (most likely through the DHCP) but it’s quite 
 worrying.
 ___
 Mailing list: 
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
 
 Post to : openstack@lists.openstack.org 
 
 Unsubscribe : 
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
 
>>> 
>>> Jean-Philippe Méthot
>>> Openstack system administrator
>>> PlanetHoster inc.
>>> ___
>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
>>> 
>>> Post to : openstack@lists.openstack.org 
>>> 
>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
>>> 

[openstack-dev] [tricircle] PTG summary

2017-09-18 Thread Vega Cai
Hello folks,

After the discussion at the PTG last Friday, we plan the following work
items for the Queens cycle:

queens-1
LBaaS
Integration with Nova cell V2
Distinguish request from user and local neutron server
Default security group update

queens-2
Network deletion reliability
Security group delete
QoS

queens-3
Support new cross-neutron l3 networking model
Smoke test improvement

Please refer to the etherpad page[1] for details.

[1] https://etherpad.openstack.org/p/tricircle-queens-ptg

BR
Zhiyuan

-- 
BR
Zhiyuan
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all][ptg][os-upstream-institute] Poll on Interest in Upstream Institute at PTGs

2017-09-18 Thread Jay S Bryant

All,

I saw that there was some interest indicated in the feedback session 
with regard to having an OpenStack Upstream Institute session at 
Project Team Gatherings.  For those who are not familiar with the 
OpenStack Upstream Institute (OUI), it is a one- to one-and-a-half-day 
education session designed to help people who are new to OpenStack 
get started contributing to the community.  The detailed contents of the 
training may be seen on the OUI docs site.  [1]


A poll has been created [2] to understand how many people would have 
been interested in having the education at the Queens PTG in Denver and 
to also gauge the level of interest in having education at the Rocky PTG 
in Dublin.


The poll should only take a minute to complete and your time to share 
this information is greatly appreciated.  If you have any questions or 
concerns, feel free to ping me on IRC (jungleboyj) or via e-mail 
(jsbry...@electronicjungle.net).


Thanks!

Jay Bryant

IRC:  jungleboyj  e-mail: jsbry...@electronicjungle.net


[1]  https://docs.openstack.org/upstream-training/

[2]  https://www.surveymonkey.com/r/RSGMFNL


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Remo Mattei
Are you running RDO / Juju? What is the version?

Thanks

On 9/18/17 6:40 PM, Jean-Philippe Méthot wrote:
> Hi,
>
> Thank you for your reply. We did restart all neutron services, several
> times. We also restarted the servers but the issue is still there.
>
> Best regards,
>
> Jean-Philippe Méthot
> Openstack system administrator
> Administrateur système Openstack
> PlanetHoster inc.
>
>
>
>
>> On 19 Sep 2017, at 10:01, Remo Mattei wrote:
>>
>> I saw something similar. Did you restart all the services after the
>> upgrade? Just wondering. I saw another issue when I upgraded from 7.3
>> to 7.4 where it gave me a vif error; after all the servers rebooted,
>> the problem was gone.
>>
>> Let me know. 
>>
>> On 18 Sep 2017, at 17:02, JP Japan wrote:
>>
>> Sorry, I ended up sending the previous email a bit too quickly.
>> Here’s some more info about our setup.
>>
>> -It’s running latest Ocata with Openvswitch and network dedicated nodes.
>> -The network nodes are L3HA
>> -There’s no DVR here.
>>
>>> On 19 Sep 2017, at 08:51, JP Japan wrote:
>>>
>>> Hi,
>>>
>>> A few days ago, we made two big changes on our production
>>> infrastructure: we updated to the latest Ocata and we changed the
>>> outgoing port on our network node to an LACP port. We made the change
>>> by switching the port in br-ex in Open vSwitch to the new LACP-backed
>>> port. Ever since these two changes, which happened one right after the
>>> other, we've run into two issues, one of which has much worse
>>> consequences than the other:
>>>
>>> 1.We can’t add floating ips to instances anymore. The interface says
>>> the operation completed successfully, the database gets updated, but
>>> the IP address doesn’t exist in the network namespace on the network
>>> nodes. Strangely enough, the iptables rules in the NAT table do
>>> exist. The port just doesn’t receive the new address. Adding the
>>> floating ip address manually to the virtual interface with "ip netns
>>> exec *qrouter namespace id* ip addr add *ip address* dev *virtual
>>> interface*" solves this, but is in no way a permanent solution.
>>>
>>> 2.We’re getting an error message in the L3-agent whenever it starts
>>> informing us it was unable to add some rules in iptables because
>>> there’s a lock on xtables, while as far as we know, the L3-agent
>>> itself is the one holding the lock. Here’s the error: 
>>>
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
>>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
>>>
>>> It’s not clear exactly how this is affecting the setup, as metadata
>>> is still going through properly (most likely through the DHCP) but
>>> it’s quite worrying.
>>> ___
>>> Mailing list:
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to : openstack@lists.openstack.org
>>> 
>>> Unsubscribe :
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>
>> Jean-Philippe Méthot
>> Openstack system administrator
>> PlanetHoster inc.
>> ___
>> Mailing list:
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to : openstack@lists.openstack.org
>> 
>> Unsubscribe :
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Jean-Philippe Méthot
Hi,

I do not have this fix. It seems it's too recent for the latest RDO-Ocata. I will 
apply it; it should solve the iptables issue. I have a hunch it's not the cause 
of the missing floating IP issue though, but I will try.

Thank you for your help,

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On 19 Sep 2017, at 09:51, Ajay Kalambur (akalambu) wrote:
> 
> Do you have this fix?
> https://review.openstack.org/#/c/501317/ 
> 
> 
> 
> Ajay
> 
> From: JP Japan
> Date: Monday, September 18, 2017 at 5:02 PM
> To: "openstack@lists.openstack.org"
> Subject: Re: [Openstack] Floating IP not being added in namespace anymore
> 
> Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
> more info about our setup.
> 
> -It’s running latest Ocata with Openvswitch and network dedicated nodes.
> -The network nodes are L3HA
> -There’s no DVR here.
> 
>> On 19 Sep 2017, at 08:51, JP Japan wrote:
>> 
>> Hi,
>> 
>> A few days ago, we made two big changes on our production infrastructure:
>> we updated to the latest Ocata and we changed the outgoing port on our
>> network node to an LACP port. We made the change by switching the port in
>> br-ex in Open vSwitch to the new LACP-backed port. Ever since these two
>> changes, which happened one right after the other, we've run into two
>> issues, one of which has much worse consequences than the other:
>> 
>> 1.We can’t add floating ips to instances anymore. The interface says the 
>> operation completed successfully, the database gets updated, but the IP 
>> address doesn’t exist in the network namespace on the network nodes. 
>> Strangely enough, the iptables rules in the NAT table do exist. The port 
>> just doesn’t receive the new address. Adding the floating ip address 
>> manually to the virtual interface with "ip netns exec *qrouter namespace id* 
>> ip addr add *ip address* dev *virtual interface*" solves this, but is in no 
>> way a permanent solution.
>> 
>> 2.We’re getting an error message in the L3-agent whenever it starts 
>> informing us it was unable to add some rules in iptables because there’s a 
>> lock on xtables, while as far as we know, the L3-agent itself is the one 
>> holding the lock. Here’s the error: 
>> 
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
>> 
>> It’s not clear exactly how this is affecting the setup, as metadata is still 
>> going through properly (most likely through the DHCP) but it’s quite 
>> worrying.
>> ___
>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
>> 
>> Post to : openstack@lists.openstack.org 
>> 
>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
>> 
> 
> Jean-Philippe Méthot
> Openstack system administrator
> PlanetHoster inc.

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Jean-Philippe Méthot
Hi,

Thank you for your reply. We did restart all neutron services, several times. 
We also restarted the servers but the issue is still there.

Best regards,

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On 19 Sep 2017, at 10:01, Remo Mattei wrote:
> 
> I saw something similar. Did you restart all the services after the
> upgrade? Just wondering. I saw another issue when I upgraded from 7.3 to
> 7.4 where it gave me a vif error; after all the servers rebooted, the
> problem was gone.
> 
> Let me know. 
> 
> On 18 Sep 2017, at 17:02, JP Japan wrote:
> 
> Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
> more info about our setup.
> 
> -It’s running latest Ocata with Openvswitch and network dedicated nodes.
> -The network nodes are L3HA
> -There’s no DVR here.
> 
>> On 19 Sep 2017, at 08:51, JP Japan wrote:
>> 
>> Hi,
>> 
>> A few days ago, we made two big changes on our production infrastructure:
>> we updated to the latest Ocata and we changed the outgoing port on our
>> network node to an LACP port. We made the change by switching the port in
>> br-ex in Open vSwitch to the new LACP-backed port. Ever since these two
>> changes, which happened one right after the other, we've run into two
>> issues, one of which has much worse consequences than the other:
>> 
>> 1.We can’t add floating ips to instances anymore. The interface says the 
>> operation completed successfully, the database gets updated, but the IP 
>> address doesn’t exist in the network namespace on the network nodes. 
>> Strangely enough, the iptables rules in the NAT table do exist. The port 
>> just doesn’t receive the new address. Adding the floating ip address 
>> manually to the virtual interface with "ip netns exec *qrouter namespace id* 
>> ip addr add *ip address* dev *virtual interface*" solves this, but is in no 
>> way a permanent solution.
>> 
>> 2.We’re getting an error message in the L3-agent whenever it starts 
>> informing us it was unable to add some rules in iptables because there’s a 
>> lock on xtables, while as far as we know, the L3-agent itself is the one 
>> holding the lock. Here’s the error: 
>> 
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
>> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
>> 
>> It’s not clear exactly how this is affecting the setup, as metadata is still 
>> going through properly (most likely through the DHCP) but it’s quite 
>> worrying.
>> ___
>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
>> 
>> Post to : openstack@lists.openstack.org 
>> 
>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
>> 
> 
> Jean-Philippe Méthot
> Openstack system administrator
> PlanetHoster inc.
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
> 
> Post to : openstack@lists.openstack.org 
> 
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack 
> 

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


[openstack-dev] [os-upstream-institute] This week's meeting is cancelled

2017-09-18 Thread Ildiko Vancsa
Hi Training Team,

I hope all of you who attended the PTG had a good time and got home safe.

I’m out of timezone this week therefore this week’s meeting is cancelled.

As a reminder, we have upcoming training occasions in London (Sept. 26), 
Copenhagen (Oct. 18) and Sydney (Nov. 4-5). Please sign up on the staff wiki 
page if you can come and help out on any of these events: 
https://wiki.openstack.org/wiki/OpenStack_Upstream_Institute_Occasions

And also keep an eye on the open reviews: 
https://review.openstack.org/#/q/project:openstack/training-guides+status:open 
:)

Thanks,
Ildikó
(IRC: ildikov)
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Remo Mattei
I saw something similar. Did you restart all the services after the upgrade? 
Just wondering. I saw another issue when I upgraded from 7.3 to 7.4 where it 
gave me a vif error; after all the servers rebooted, the problem was gone.

Let me know. 

On 18 Sep 2017, at 17:02, JP Japan wrote:

Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
more info about our setup.

-It’s running latest Ocata with Openvswitch and network dedicated nodes.
-The network nodes are L3HA
-There’s no DVR here.

> On 19 Sep 2017, at 08:51, JP Japan wrote:
> 
> Hi,
> 
> A few days ago, we made two big changes on our production infrastructure:
> we updated to the latest Ocata and we changed the outgoing port on our
> network node to an LACP port. We made the change by switching the port in
> br-ex in Open vSwitch to the new LACP-backed port. Ever since these two
> changes, which happened one right after the other, we've run into two
> issues, one of which has much worse consequences than the other:
> 
> 1.We can’t add floating ips to instances anymore. The interface says the 
> operation completed successfully, the database gets updated, but the IP 
> address doesn’t exist in the network namespace on the network nodes. 
> Strangely enough, the iptables rules in the NAT table do exist. The port just 
> doesn’t receive the new address. Adding the floating ip address manually to 
> the virtual interface with "ip netns exec *qrouter namespace id* ip addr add 
> *ip address* dev *virtual interface*" solves this, but is in no way a 
> permanent solution.
> 
> 2.We’re getting an error message in the L3-agent whenever it starts informing 
> us it was unable to add some rules in iptables because there’s a lock on 
> xtables, while as far as we know, the L3-agent itself is the one holding the 
> lock. Here’s the error: 
> 
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
> 
> It’s not clear exactly how this is affecting the setup, as metadata is still 
> going through properly (most likely through the DHCP) but it’s quite worrying.
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Jean-Philippe Méthot
Openstack system administrator
PlanetHoster inc.
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread Ajay Kalambur (akalambu)
Do you have this fix?
https://review.openstack.org/#/c/501317/


Ajay

From: JP Japan
Date: Monday, September 18, 2017 at 5:02 PM
To: "openstack@lists.openstack.org"
Subject: Re: [Openstack] Floating IP not being added in namespace anymore

Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
more info about our setup.

-It’s running latest Ocata with Openvswitch and network dedicated nodes.
-The network nodes are L3HA
-There’s no DVR here.

On 19 Sep 2017, at 08:51, JP Japan wrote:

Hi,

A few days ago, we made two big changes on our production infrastructure: we
updated to the latest Ocata and we changed the outgoing port on our network
node to an LACP port. We made the change by switching the port in br-ex in
Open vSwitch to the new LACP-backed port. Ever since these two changes, which
happened one right after the other, we've run into two issues, one of which
has much worse consequences than the other:

1.We can’t add floating ips to instances anymore. The interface says the 
operation completed successfully, the database gets updated, but the IP address 
doesn’t exist in the network namespace on the network nodes. Strangely enough, 
the iptables rules in the NAT table do exist. The port just doesn’t receive the 
new address. Adding the floating ip address manually to the virtual interface 
with "ip netns exec *qrouter namespace id* ip addr add *ip address* dev 
*virtual interface*" solves this, but is in no way a permanent solution.

2.We’re getting an error message in the L3-agent whenever it starts informing 
us it was unable to add some rules in iptables because there’s a lock on 
xtables, while as far as we know, the L3-agent itself is the one holding the 
lock. Here’s the error:

2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager

It’s not clear exactly how this is affecting the setup, as metadata is still 
going through properly (most likely through the DHCP) but it’s quite worrying.
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : 
openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Jean-Philippe Méthot
Openstack system administrator
PlanetHoster inc.
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread JP Japan
Sorry, I ended up sending the previous email a bit too quickly. Here’s some 
more info about our setup.

-It’s running latest Ocata with Openvswitch and network dedicated nodes.
-The network nodes are L3HA
-There’s no DVR here.

> On 19 Sep 2017, at 08:51, JP Japan wrote:
> 
> Hi,
> 
> A few days ago, we made two big changes on our production infrastructure:
> we updated to the latest Ocata and we changed the outgoing port on our
> network node to an LACP port. We made the change by switching the port in
> br-ex in Open vSwitch to the new LACP-backed port. Ever since these two
> changes, which happened one right after the other, we've run into two
> issues, one of which has much worse consequences than the other:
> 
> 1.We can’t add floating ips to instances anymore. The interface says the 
> operation completed successfully, the database gets updated, but the IP 
> address doesn’t exist in the network namespace on the network nodes. 
> Strangely enough, the iptables rules in the NAT table do exist. The port just 
> doesn’t receive the new address. Adding the floating ip address manually to 
> the virtual interface with "ip netns exec *qrouter namespace id* ip addr add 
> *ip address* dev *virtual interface*" solves this, but is in no way a 
> permanent solution.
> 
> 2.We’re getting an error message in the L3-agent whenever it starts informing 
> us it was unable to add some rules in iptables because there’s a lock on 
> xtables, while as far as we know, the L3-agent itself is the one holding the 
> lock. Here’s the error: 
> 
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
> 2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
> 
> It’s not clear exactly how this is affecting the setup, as metadata is still 
> going through properly (most likely through the DHCP) but it’s quite worrying.
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

Jean-Philippe Méthot
Openstack system administrator
PlanetHoster inc.
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade TODAY 15:00 UTC - 23:59 UTC

2017-09-18 Thread Clark Boylan
On Mon, Sep 18, 2017, at 06:43 AM, Andreas Jaeger wrote:
> Just a friendly reminder that the upgrade will happen TODAY, Monday
> 18th, starting at 15:00 UTC. The infra team expects it to take 8
> hours, so until 23:59 UTC.

This work was functionally completed at 23:43 UTC. We are now running
Gerrit 2.13.9. There are some cleanup steps that need to be performed in
Infra land, mostly to get puppet running properly again.

You will also notice that newer Gerrit behaves in some new and exciting
ways. Most of these should be improvements like not needing to reapprove
changes that already have a +1 Workflow but also have a +1 Verified;
recheck should now work for these cases. If you find a new behavior that
looks like a bug please let us know, but we should also work to file
them upstream so that newer Gerrit can address them.

Feel free to ask us questions if anything else comes up.

Thank you to everyone who helped with the upgrade. It seems like these get
more and more difficult with each Gerrit release so all the help is
greatly appreciated.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack] Floating IP not being added in namespace anymore

2017-09-18 Thread JP Japan
Hi,

A few days ago, we made two big changes on our production infrastructure: we
updated to the latest Ocata and we changed the outgoing port on our network
node to an LACP port. We made the change by switching the port in br-ex in
Open vSwitch to the new LACP-backed port. Ever since these two changes, which
happened one right after the other, we've run into two issues, one of which
has much worse consequences than the other:

1.We can’t add floating ips to instances anymore. The interface says the 
operation completed successfully, the database gets updated, but the IP address 
doesn’t exist in the network namespace on the network nodes. Strangely enough, 
the iptables rules in the NAT table do exist. The port just doesn’t receive the 
new address. Adding the floating ip address manually to the virtual interface 
with "ip netns exec *qrouter namespace id* ip addr add *ip address* dev 
*virtual interface*" solves this, but is in no way a permanent solution.

2.We’re getting an error message in the L3-agent whenever it starts informing 
us it was unable to add some rules in iptables because there’s a lock on 
xtables, while as far as we know, the L3-agent itself is the one holding the 
lock. Here’s the error: 

2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Generated by iptables_manager
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager *nat
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager -I neutron-l3-agent-PREROUTING 7 -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager COMMIT
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager # Completed by iptables_manager
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager ; Stdout: ; Stderr: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager
2017-09-18 13:00:55.426 18575 ERROR neutron.callbacks.manager

It’s not clear exactly how this is affecting the setup, as metadata is still 
going through properly (most likely through the DHCP) but it’s quite worrying.___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


[OpenStack-Infra] Upcoming backwards incompatible change on Zuul master

2017-09-18 Thread James E. Blair
Hi,

We need to merge a backwards incompatible change to Zuul master.

The change is: https://review.openstack.org/482856 and it makes Gerrit
label entries case sensitive.  Unfortunately, some combinations of
Gerrit versions and underlying database configurations make this both
necessary and difficult to handle seamlessly.

This will affect both installations continuously delivered from git
master, as well as those that are upgraded to the latest releases.

The complexity of this situation leaves us few options other than to
make this change and minimize the impact by isolating it and providing
an upgrade plan.

The upgrade plan is the same regardless of whether you run Zuul
continuously deployed from master or releases.

Upgrade Procedure
-

The latest release of Zuul, as of this writing, is 2.5.2.  It treats all
Gerrit labels as case insensitive; however, if a label is capitalized in
Zuul's layout configuration with this version, typical gate pipelines
may not function correctly.  Therefore:

1) Prepare, *but do not merge*, a patch to change the case of all Gerrit
labels in layout.yaml.  Typically, this would mean changing instances of
"verified:" to "Verified:" or "workflow:" to "Workflow:" (a sketch of such
a change follows step 3 below).

2) Next Tuesday, September 26, we will merge
https://review.openstack.org/482856 and release version 2.6.0, which
switches to the case-sensitive behavior (and contains no other
substantive changes).  Once this change is merged to the master branch
and the new version of Zuul is released, prepare to upgrade.

3) Merge the change prepared in step 1, then upgrade to 2.6.0 (or the
master branch tip) immediately afterward and restart Zuul.
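
As a sketch of the step 1 change (illustrative only -- the label names,
values and the rest of the pipeline definition will vary per deployment),
a gate pipeline fragment in layout.yaml would go from:

  pipelines:
    - name: gate
      trigger:
        gerrit:
          - event: comment-added
            approval:
              - workflow: 1
      success:
        gerrit:
          verified: 2

to matching the case of the labels as Gerrit defines them:

  pipelines:
    - name: gate
      trigger:
        gerrit:
          - event: comment-added
            approval:
              - Workflow: 1
      success:
        gerrit:
          Verified: 2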

Thanks,

Jim

___
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra

[openstack-dev] [tripleo] Blueprints for Queens

2017-09-18 Thread Alex Schultz
Hey folks,

At the end of the PTG we did some work to triage the blueprints for
TripleO. The goal was to ensure that some of the items we talked about
were properly captured for the Queens cycle.  Please take a look at
the blueprints targeted towards queens[0] and update them with a
reasonable milestone for delivery (queens-1/queens-2 ideally).  If you
need to add additional blueprints, please triage appropriately.  I
will be using this list for tracking completion of features during
this cycle and reaching out to the assignee on the blueprints for
status.  Please make sure there is an assignee for your blueprint(s).

In addition to the Queens blueprints, we also created a future target
that can be used for tracking future efforts or things that won't be
worked on during the Queens cycle. It would be advisable to review
this list[1] and make sure we did not accidentally move out work that
will be completed in Queens.  The same goes for the Pike list[2] to
make sure they have all been properly updated to reflect that they
have been implemented or need to be moved.

Thanks,
-Alex

[0] https://blueprints.launchpad.net/tripleo/queens
[1] https://blueprints.launchpad.net/tripleo/future
[2] https://blueprints.launchpad.net/tripleo/pike

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack-operators] [nova] Queens PTG recap - everything else

2017-09-18 Thread Matt Riedemann
There was a whole lot of other stuff discussed at the PTG. The details 
are in [1]. I won't go into everything here, so I'm just highlighting 
some of the more concrete items that had owners or TODOs.


Ironic
--

The Ironic team came over on Wednesday afternoon. We talked a bit, had 
some laughs, it was a good time. Since I don't speak fluent baremetal, 
Dmitry Tantsur is going to recap those discussions in the mailing list. 
Thanks again, Dmitry.


Privsep
---

Michael Still has been going hog wild converting the nova libvirt driver 
code to use privsep instead of rootwrap. He has a series of changes 
tracked under this blueprint [2]. Most of the discussion was a refresh 
on privsep and a recap of what's already been merged and some discussion 
on outstanding patches. The goal for Queens is to get the entire libvirt 
driver converted and also try to get all of nova-compute converted, but 
we want to limit that to getting things merged early in the release to 
flush out bugs since a lot of these are weird, possibly untested code 
paths. There was also discussion of a kind of privsep heartbeat daemon 
to tell if it's running (even though it's not a separate service) but 
this is complicated and is not something we'll pursue for Queens.


Websockify security proxy framework
---

This is a long-standing security hardening feature [3] which has changed 
hands a few times and hasn't gotten much review. Sean Dague and Melanie 
Witt agreed to focus on reviewing this for Queens.


Certificate validation
--

This is another item that's been discussed since at least the Ocata 
summit but hasn't made much progress. Sean Dague agreed to help review 
this, and Eric Fried said he knew someone that could help review the 
security aspects of this change. Sean also suggested scheduling a 
hangout so the Johns Hopkins University team working on this can give a 
primer on the feature and what to look out for during review. We also 
suggested getting a scenario test written for this in the barbican 
tempest plugin, which runs as an experimental queue job for nova.


Notifications
-

Given the state of the Searchlight project and how we don't plan on 
using Searchlight as a global proxy for the compute REST API, we are not 
going to work on parity with versioned notifications there. There are 
some cleanups we still need to do in Nova for versioned notifications 
from a performance perspective. We also agreed that we aren't going to 
consider deprecating legacy unversioned notifications until we have 
parity with the versioned notifications, especially given legacy 
unversioned notification consumers have not yet moved to using the 
versioned notifications.


vGPU support


This depends on nested resource providers (like lots of other things). 
It was not clear from the discussion if this is static or dynamic 
support, e.g. can we hot plug vGPUs using Cyborg? I assume we will not 
support hot plugging at first. We also need improved functional testing 
of this space before we can make big changes.


Preemptible (spot) instances
-

This was continuing the discussion from the Boston forum session [5]. 
The major issue in Nova is that we don't want Nova to be in charge of 
orchestrating preempting instances when a request comes in for a "paid" 
instance. We agreed to start small where you can't burst over quota. 
Blazar also delivered some reservation features in Pike [6] which sound 
like they can be built on here; these also sound like expiration 
policies. Someone will have to prototype an external (to nova) "reaper" 
which will cull the preemptible instances based on some configurable 
policy. Honestly the notes here are confusing so we're going to need 
someone to drive this forward. That might mean picking up John Garbutt's 
draft spec for this (link not available right now).


Driver updates
--

Various teams from IBM gave updates on plans for their drivers in Queens.

PowerVM (in tree): the team is proposing a few more capabilities to the 
driver in Queens. Details are in the spec [7].


zDPM (out of tree): this out of tree driver has had two releases (ocata 
and pike) and is working on 3rd party CI. One issue they have with 
Tempest is they can only boot from volume.


zVM (out of tree): the team is working on refactoring some code into a 
library, similar to os-xenapi, os-powervm and oslo.vmware. They have CI 
running but are not yet reporting against nova changes.


Endpoint discovery
--

This is carry-over work from Ocata and Pike to standardize how Nova does 
endpoint discovery with other services, like 
keystone/placement/cinder/glance/neutron/ironic/barbican. The spec is 
here [8]. The dependent keystoneauth1 changes were released in Pike so 
we should be able to make quick progress on this early in Queens to 
flush out bugs.


Documentation
-

We talked about the 

Re: [openstack-dev] [nova] Queens PTG team photos

2017-09-18 Thread Jimmy McArthur
This was a common thread among many of the project pics. Pose with wigs was 
identical to normal pose, add wig. 




> On Sep 18, 2017, at 4:08 PM, David Medberry  wrote:
> 
> Are you sure Dan @get_offmylawn wasn't just photoshopped from one pic to the 
> other? 
> 
>> On Mon, Sep 18, 2017 at 10:27 AM, Matt Riedemann  wrote:
>> Here are the links to the Nova team photos from the PTG.
>> 
>> https://photos.app.goo.gl/JoYZyouzm0J670mH3
>> 
>> https://photos.app.goo.gl/YMo96j6KKc044XdG2
>> 
>> -- 
>> 
>> Thanks,
>> 
>> Matt
>> 
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Queens PTG team photos

2017-09-18 Thread David Medberry
Are you sure Dan @get_offmylawn wasn't just photoshopped from one pic to
the other?

On Mon, Sep 18, 2017 at 10:27 AM, Matt Riedemann 
wrote:

> Here are the links to the Nova team photos from the PTG.
>
> https://photos.app.goo.gl/JoYZyouzm0J670mH3
>
> https://photos.app.goo.gl/YMo96j6KKc044XdG2
>
> --
>
> Thanks,
>
> Matt
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone] Does the policy.json for trustsworks?

2017-09-18 Thread William M Edmonds


Adrian Turjak  wrote on 09/18/2017 01:39:20 AM:
>
> Bug submitted:
>
> https://bugs.launchpad.net/keystone/+bug/1717847

>
> Note that this is an odd one, since the current state (while unhelpful)
> is safe, fixing it has a chance of exposing an API to users that
> shouldn't be able to use it if operators don't update their policy file
> to match the new default we'd add.
>
>

I think we're actually mostly ok here. The one rule that looks off is the
one that I think you may have thought was correct... create_trust. I
updated the bug with reasoning. Please take a look and comment if I've
missed something or you've got further questions. Specific examples that
you've tried and got unexpected results would provide useful talking
points. Thanks!
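
For reference, the trust-related defaults in keystone's stock policy.json
look roughly like this (quoted from memory as a sketch -- check the file
shipped with your release before relying on it):

  "identity:create_trust": "user_id:%(trust.trustor_user_id)s",
  "identity:list_trusts": "",
  "identity:get_trust": "",
  "identity:delete_trust": "",

create_trust is the one rule scoped to the trustor; the empty rules mean
"any authenticated user" as far as oslo.policy is concerned, which is part
of what makes the behavior here confusing.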
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

2017-09-18 Thread Sean McGinnis
On Mon, Sep 18, 2017 at 12:20:24PM -0500, Jay S Bryant wrote:
> All,
> 
> I am adding the e-mails from the Inspur CI in the OpenStack 3rd Party CI
> Wiki in case they do not monitor the mailing list.
> 
> Inspur CI team,
> 
> Please see the e-mails below.  Your job is currently voting and should not
> be.  Please update your config to be non-voting.
> 
> Thank you!
> 
> Jay
> 

Sorry, I think I caused this by adding them to the cinder-ci group in an
attempt to get them to stop showing up even when Filter CI was toggled. I
noticed that this morning and had removed them again before the gerrit down
time. It should not show up anymore.

Although I still need to remember how they need to be registered to get them
out of the filtered comments.


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] pypy broken for many repos

2017-09-18 Thread Sean McGinnis
On Sun, Sep 17, 2017 at 08:32:00AM +0200, Andreas Jaeger wrote:
> Currently we use pypy for a couple of projects and many of these fail
> with the version of pypy that we use.
> 
> A common error is "RuntimeError: cryptography 1.9 is
> not compatible with PyPy < 5.3. Please upgrade PyPy to use this library."
> 
> Example:
> http://logs.openstack.org/51/503951/1/check/gate-python-neutronclient-pypy/206ac6a/
> 
> I propose in https://review.openstack.org/#/c/504748/ to remove pypy
> from those repos where it fails.
> 
> Alternative would be investigating what is broken and fix it. Anybody
> interested to do this?
> 
> Or should we remove the pypy jobs where they fail. I pushed
> https://review.openstack.org/504748 up and marked it as WIP, will wait
> for a week to see outcome of this discussion,
> 
> Andreas

I noticed this when we switched over to using cryptography. I think at the time
the consensus was - meh. IIRC, the issue is that we use an older version of
pypy. If system packages are available for a newer version, it probably would
be good to test that. But I have never seen pypy used in the wild, so I'm not
sure it would be worth the effort.

Maybe it's easier to just declare pypy unsupported for service projects?
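
(A stopgap for the incompatibility above, sketched as hypothetical
requirements lines rather than anything proposed in this thread: PEP 508
environment markers can cap cryptography only under PyPy.)

    # hypothetical requirements.txt entries, for illustration only
    cryptography<1.9; platform_python_implementation == 'PyPy'
    cryptography; platform_python_implementation != 'PyPy'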

Sean

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack] Disable distributed loadbalancers (LBaaSv2)?

2017-09-18 Thread Turbo Fredriksson
On 18 Sep 2017, at 14:50, Brian Haley  wrote:

> Sorry, due to the invasiveness of the changes it won't be backported to Newton

Bugger! That’s a shame :(. No way I can convince someone to do it,
for a (small) monetary donation?

> I think you should be able to remove the router interfaces on the external 
> and internal networks then remove the router

Not sure if I’m doing this correctly. I’m getting an error:

- snip -
bladeA01:~# neutron router-port-list tenant
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                            |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
| 1647ed58-2c64-486e-a41e-910bf91f0876 |      | fa:16:3e:48:e8:b1 | {"subnet_id": "67242897-47c9-47f0-a3cd-9d01c3825e07", "ip_address": "10.0.10.17"}    |
| 317c3cf0-3119-4606-9367-fb8d8319d908 |      | fa:16:3e:dc:95:1e | {"subnet_id": "b3d19d1a-387d-4316-b490-11c8cb98dfd1", "ip_address": "10.0.9.254"}    |
| 6c6f33e9-2a16-44e0-9970-d252ce7d120c |      | fa:16:3e:09:2a:21 | {"subnet_id": "ab4da704-0ed2-4e54-89e4-afc98b8bb631", "ip_address": "10.0.6.1"}      |
| 8f659e68-252f-4c35-bff8-62211983022a |      | fa:16:3e:a6:2c:53 | {"subnet_id": "67242897-47c9-47f0-a3cd-9d01c3825e07", "ip_address": "10.0.10.254"}   |
| 9ea245f7-c4a4-42e0-a23e-8109761c20b9 |      | fa:16:3e:ad:12:d6 | {"subnet_id": "336dc07c-83e7-4a64-a698-15d42b8824b1", "ip_address": "10.0.8.254"}    |
| d0960758-39c1-40ef-9023-84d24d533f93 |      | fa:16:3e:8a:34:19 | {"subnet_id": "b3d19d1a-387d-4316-b490-11c8cb98dfd1", "ip_address": "10.0.9.17"}     |
| ed1dee2e-f122-45cb-84a2-10f9deafee6a |      | fa:16:3e:c5:54:a7 | {"subnet_id": "336dc07c-83e7-4a64-a698-15d42b8824b1", "ip_address": "10.0.8.14"}     |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
bladeA01:~# neutron router-interface-delete tenant port=1647ed58-2c64-486e-a41e-910bf91f0876
Router dac1e4f4-dd02-4f97-bc77-952906e8daa7 does not have an interface with id 1647ed58-2c64-486e-a41e-910bf91f0876
Neutron server returns request_ids: ['req-3f3c985f-e4fb-473c-9911-56f8ebff2e58']
- snip -

On another one it said I couldn’t do that because it was in use “by one or
more floating ips”. Those I could possibly recreate, if I can just get it to
start deleting interfaces.
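
For what it’s worth, a teardown along the lines Brian suggested might look
like the following; this is an untested sketch, the subnet ID is taken from
the listing above, and the floating IPs have to be released first:

    neutron floatingip-list                         # find FIPs still in use
    neutron floatingip-disassociate FLOATINGIP_ID   # repeat for each one
    neutron router-interface-delete tenant 67242897-47c9-47f0-a3cd-9d01c3825e07
    # ... one per subnet, by subnet ID rather than port= ...
    neutron router-gateway-clear tenant
    neutron router-delete tenant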


I have all my compute nodes shut down at the moment, no point in taking them
up when the LBs don’t work. I rely heavily on LBs for my setup...

> Ocata supports DVR -> Centralized router migration, so you would only have to 
> go forward one release if you choose that path.

OS in Debian GNU/Linux is in somewhat of a … “limbo” right now. Not sure
what the status is of Ocata there...


signature.asc
Description: Message signed with OpenPGP
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [openstack-dev] [k8s][deployment][kolla-kubernetes][magnum][kuryr][zun][qa][api] Proposal for SIG-K8s

2017-09-18 Thread Hongbin Lu
Hi Chris,

Sorry I missed the meeting since I was not at the PTG last week. After a quick 
look at the mission of SIG-K8s, I think we (the OpenStack Zun team) have an 
item that fits well into this SIG, namely the k8s connector feature:

  https://blueprints.launchpad.net/zun/+spec/zun-connector-for-k8s

I added it to the etherpad and hope it will be well accepted by the SIG.

Best regards,
Hongbin

From: Chris Hoge [mailto:ch...@openstack.org]
Sent: September-15-17 12:25 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] 
[k8s][deployment][kolla-kubernetes][magnum][kuryr][qa][api] Proposal for SIG-K8s

Link to the etherpad for the upcoming meeting.

https://etherpad.openstack.org/p/queens-ptg-sig-k8s


On Sep 14, 2017, at 10:23 AM, Chris Hoge wrote:

This Friday, September 15 at the PTG we will be hosting an organizational
meeting for SIG-K8s. More information on the proposal, meeting time, and
remote attendance is in the openstack-sigs mailing list [1].

Thanks,
Chris Hoge
Interop Engineer
OpenStack Foundation

[1] 
http://lists.openstack.org/pipermail/openstack-sigs/2017-September/51.html
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: 
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack-operators] [tripleo] Making containerized service deployment the default

2017-09-18 Thread Mohammed Naser
On Mon, Sep 18, 2017 at 3:04 PM, Alex Schultz  wrote:
> Hey ops & devs,
>
> We talked about containers extensively at the PTG and one of the items
> that needs to be addressed is that currently we still deploy the
> services as bare metal services via puppet. For Queens we would like
> to switch the default to be containerized services.  With this switch
> we would also start the deprecation process for deploying services as
> bare metal services via puppet.  We still execute the puppet
> configuration as part of the container configuration process so the
> code will continue to be leveraged but we would be investing more in
> the continual CI of the containerized deployments and reducing the
> traditional scenario coverage.
>
> As we switch over to containerized services by default, we would also
> begin to reduce installed software on the overcloud images that we
> currently use.  We have an open item to better understand how we can
> switch away from the golden images to a traditional software install
> process during the deployment and make sure this is properly tested.
> In theory it should work today by switching the default for
> EnablePackageInstall[0] to true and configuring repositories, but this
> is something we need to verify.
>
> If anyone has any objections to this default switch, please let us know.

I think this is a great initiative.  It would be nice to share some of
the TripleO experience in containerized deployments so that we can use
Puppet for containerized deployments.  Perhaps we can work together on
adding some classes which can help deploy and configure containerized
services with Puppet.

>
> Thanks,
> -Alex
>
> [0] 
> https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-packages.yaml#L33-L36
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [openstack-dev] [Openstack-operators] [tripleo] Making containerized service deployment the default

2017-09-18 Thread Mohammed Naser
On Mon, Sep 18, 2017 at 3:04 PM, Alex Schultz  wrote:
> Hey ops & devs,
>
> We talked about containers extensively at the PTG and one of the items
> that needs to be addressed is that currently we still deploy the
> services as bare metal services via puppet. For Queens we would like
> to switch the default to be containerized services.  With this switch
> we would also start the deprecation process for deploying services as
> bare metal services via puppet.  We still execute the puppet
> configuration as part of the container configuration process so the
> code will continue to be leveraged but we would be investing more in
> the continual CI of the containerized deployments and reducing the
> traditional scenario coverage.
>
> As we switch over to containerized services by default, we would also
> begin to reduce installed software on the overcloud images that we
> currently use.  We have an open item to better understand how we can
> switch away from the golden images to a traditional software install
> process during the deployment and make sure this is properly tested.
> In theory it should work today by switching the default for
> EnablePackageInstall[0] to true and configuring repositories, but this
> is something we need to verify.
>
> If anyone has any objections to this default switch, please let us know.

I think this is a great initiative.  It would be nice to share some of
the TripleO experience in containerized deployments so that we can use
Puppet for containerized deployments.  Perhaps we can work together on
adding some classes which can help deploy and configure containerized
services with Puppet.

>
> Thanks,
> -Alex
>
> [0] 
> https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-packages.yaml#L33-L36
>
> ___
> OpenStack-operators mailing list
> openstack-operat...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack-operators] [publiccloud-wg] Extra meeting PublicCloudWorkingGroup

2017-09-18 Thread Tobias Rydberg

Hi everyone,

We will have an "extra" meeting on Wednesday at 1400 UTC in 
#openstack-publiccloud


Main purpose for this extra meeting will be to finalize the agenda for 
the meetup in London next week.


Agenda and etherpad: https://etherpad.openstack.org/p/publiccloud-wg
Meetup etherpad: 
https://etherpad.openstack.org/p/MEETUPS-2017-publiccloud-wg


Regards,
Tobias
Co-chair PublicCloud WG



smime.p7s
Description: S/MIME Cryptographic Signature
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[openstack-dev] [publiccloud-wg] Extra meeting PublicCloudWorkingGroup

2017-09-18 Thread Tobias Rydberg

Hi everyone,

We will have an "extra" meeting on Wednesday at 1400 UTC in 
#openstack-publiccloud


Main purpose for this extra meeting will be to finalize the agenda for 
the meetup in London next week.


Agenda and etherpad: https://etherpad.openstack.org/p/publiccloud-wg
Meetup etherpad: 
https://etherpad.openstack.org/p/MEETUPS-2017-publiccloud-wg


Regards,
Tobias
Co-chair PublicCloud WG



smime.p7s
Description: S/MIME Cryptographic Signature
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack-operators] [tripleo] Making containerized service deployment the default

2017-09-18 Thread Alex Schultz
Hey ops & devs,

We talked about containers extensively at the PTG and one of the items
that needs to be addressed is that currently we still deploy the
services as bare metal services via puppet. For Queens we would like
to switch the default to be containerized services.  With this switch
we would also start the deprecation process for deploying services as
bare metal services via puppet.  We still execute the puppet
configuration as part of the container configuration process so the
code will continue to be leveraged but we would be investing more in
the continual CI of the containerized deployments and reducing the
traditional scenario coverage.

As we switch over to containerized services by default, we would also
begin to reduce installed software on the overcloud images that we
currently use.  We have an open item to better understand how we can
switch away from the golden images to a traditional software install
process during the deployment and make sure this is properly tested.
In theory it should work today by switching the default for
EnablePackageInstall[0] to true and configuring repositories, but this
is something we need to verify.
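
(To make that verification step concrete: the switch in [0] would presumably
be flipped with an environment file along these lines. The parameter name
comes from the linked template; the file name and the repository setup are
assumptions.)

    # enable-package-install.yaml (hypothetical file name)
    parameter_defaults:
      EnablePackageInstall: true

Such a file would then be passed to the deploy command with
-e enable-package-install.yaml.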

If anyone has any objections to this default switch, please let us know.

Thanks,
-Alex

[0] 
https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-packages.yaml#L33-L36

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [openstack-dev] [puppet][tripleo] Gates broken, avoid rechecks

2017-09-18 Thread Jon Schlueter
Any progress on getting an elastic-recheck entry added to detect this failure?

Jon

On Mon, Sep 18, 2017 at 9:21 AM, Mohammed Naser  wrote:
> Hi everyone,
>
> Just a quick heads up that at the moment, there is an issue in the
> TripleO CI which is in the process of being fixed at the moment:
>
> https://bugs.launchpad.net/tripleo/+bug/1717545
>
> As certain Puppet modules gate for TripleO, please don't recheck
> changes that have failing jobs which start with `gate-tripleo-ci-` as
> they will fail anyways.
>
> I'll send an update email when things are fixed (or when you see that
> bug resolved).
>
> Thank you,
> Mohammed
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



-- 
Jon Schlueter
jschl...@redhat.com
IRC: jschlueter/yazug
Senior Software Engineer - OpenStack Productization Engineer

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] Making containerized service deployment the default

2017-09-18 Thread Alex Schultz
Hey ops & devs,

We talked about containers extensively at the PTG and one of the items
that needs to be addressed is that currently we still deploy the
services as bare metal services via puppet. For Queens we would like
to switch the default to be containerized services.  With this switch
we would also start the deprecation process for deploying services as
bare metal services via puppet.  We still execute the puppet
configuration as part of the container configuration process so the
code will continue to be leveraged but we would be investing more in
the continual CI of the containerized deployments and reducing the
traditional scenario coverage.

As we switch over to containerized services by default, we would also
begin to reduce installed software on the overcloud images that we
currently use.  We have an open item to better understand how we can
switch away from the golden images to a traditional software install
process during the deployment and make sure this is properly tested.
In theory it should work today by switching the default for
EnablePackageInstall[0] to true and configuring repositories, but this
is something we need to verify.

If anyone has any objections to this default switch, please let us know.

Thanks,
-Alex

[0] 
https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/tripleo-packages.yaml#L33-L36

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] TripleO/Ansible PTG session

2017-09-18 Thread James Slagle
On Mon, Sep 18, 2017 at 12:14 PM, Joshua Harlow  wrote:
> Was there any discussion at the PTG on how the newly released AWX[1] will
> affect tripleo/ansible (will it?) or ara or such? Thoughts there?

I think we'd investigate ARA for ansible log and results storage,
since we'd want to save the output.

As for AWX, there wasn't any significant discussion of which I'm
aware. We already have an API (Mistral).

However, the outputs we will get from Heat with "config download" will
be ansible playbooks. So, I think that leaves the door open for a
variety of eventual solutions that can interface with playbooks
(Mistral, AWX, cli, etc).

I don't imagine we'd do anything prohibitive that would prevent
someone from loading those playbooks into AWX if desired, or using
them directly from the cli. You just wouldn't be going through the
official API.

-- 
-- James Slagle
--

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [ironic] this week's priorities and subteam reports

2017-09-18 Thread Yeleswarapu, Ramamani
Hi,

We are glad to present this week's priorities and subteam report for Ironic. As 
usual, this is pulled directly from the Ironic whiteboard[0] and formatted.

This Week's Priorities (as of the weekly ironic meeting)

1. Decide on the priorities for the Queens cycle (dtantsur to post a review 
soon)
2. dtantsur or TheJulia to do a number of Pike releases
3. Refactoring of the way we access clients: 
https://review.openstack.org/#/q/topic:bug/1699547
4. Rolling upgrades missing bit: https://review.openstack.org/#/c/497666/
4.1. check object versions in dbsync tool: 
https://review.openstack.org/#/c/497703/
5. Switch to none auth for standalone mode: 
https://review.openstack.org/#/c/359061/


Next Pike Release
=
- status as of Sept 7, PM
- assuming it will be 9.1.1
- to fix race condition: https://bugs.launchpad.net/ironic/+bug/1715190
patches:
- on stable/pike, cherry-picked, Fix race condition in 
backfill_version_column(): https://review.openstack.org/#/c/501816/1
- on stable/pike, Add release note for next pike release: 
https://review.openstack.org/#/c/501783/
- optional on master & maybe backport: Update upgrade guide to use new pike 
release: https://review.openstack.org/#/c/501784/2


Bugs (dtantsur, vdrok, TheJulia)

- Stats (diff between 4 Sep 2017 and 18 Sep 2017)
- Ironic: 264 bugs (+13) + 258 wishlist items. 29 new (+8), 198 in progress 
(+7), 0 critical, 32 high (+1) and 35 incomplete (-1)
- Inspector: 13 bugs + 29 wishlist items. 3 new (+1), 10 in progress (-1), 0 
critical, 2 high (-1) and 3 incomplete
- Nova bugs with Ironic tag: 15. 0 new (-1), 0 critical, 2 high (+1)

CI refactoring and missing test coverage

- not considered a priority, it's a 'do it always' thing
- Standalone CI tests (vsaienk0)
- next patch to be reviewed, needed for 3rd party CI: 
https://review.openstack.org/#/c/429770/
- Missing test coverage (all)
- portgroups and attach/detach tempest tests: 
https://review.openstack.org/382476
- local boot with partition images: TODO 
https://bugs.launchpad.net/ironic/+bug/1531149
- adoption: https://review.openstack.org/#/c/344975/
- should probably be changed to use standalone tests
- root device hints: TODO
- node take over?
- resource classes integration tests: 
https://review.openstack.org/#/c/443628/

Essential Priorities

!!! this list is work-in-progress now !!!

Reference architecture guide (dtantsur)
---
- status as of 14 Aug 2017:
- Common bits: https://review.openstack.org/487410 needs a revision
- I guess this moves to Queens

Driver composition (dtantsur)
-
- spec: 
http://specs.openstack.org/openstack/ironic-specs/specs/approved/driver-composition-reform.html
- gerrit topic: https://review.openstack.org/#/q/status:open+topic:bug/1524745
- status as of 28 Aug 2017:
- documentation
- upgrade guide for the remaining drivers: TODO
- ilo: https://review.openstack.org/#/c/496480/
- idrac: (rpioso) TBD
- snmp: https://review.openstack.org/#/c/498541/ MERGED
- dev docs on writing hardware types: TODO
- new hardware types:
- apparently all merged in Pike
- API for hardware interface properties:
- proposed spec: https://review.openstack.org/#/c/471174/
- spec on the classic drivers deprecation: 
http://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/classic-drivers-future.html
 to be continued in Queens

High Priorities
===
!!! this list is work-in-progress now !!!

Rescue mode (stendulker/aparnav)

- spec: 
http://specs.openstack.org/openstack/ironic-specs/specs/approved/implement-rescue-mode.html
- code: https://review.openstack.org/#/q/topic:bug/1526449+status:open
- Status: 04 Sep 2017
- The nova patch for Rescue is abandoned, and the rescue tempest 
patch (https://review.openstack.org/#/c/452308/), which is dependent on the nova 
patch, is in merge conflict.
- any plans to revive the nova patch soon(ish)?
- (TheJulia) None that I'm aware of, but nova is going to expect the ironic 
work to be completed first.

Neutron event processing (vdrok, vsaienk0)
--
- spec at https://review.openstack.org/343684, ready for reviews
- WIP code at https://review.openstack.org/440778

Refactoring of code accessing other services (pas-ha)
-
- gerrit topic: https://review.openstack.org/#/q/topic:bug/1699547
- status as of 1 Aug 2017: ready for review
- discussed in ironic meeting; -2'd until Queens

Available clean steps API (rloo)

- spec had been approved in mitaka: 

[Openstack-operators] [nova][neutron] Queens PTG recap - nova/neutron

2017-09-18 Thread Matt Riedemann
There were a few nova/neutron interactions at the PTG, one on Tuesday 
[1] and one on Thursday [2].


Priorities
--

1. Neutron port binding extension for live migration: This was discussed 
at the Ocata summit in Barcelona [3] and resulted in a Neutron spec [4] 
and API definition in Pike. The point of this is to shorten the amount 
of network downtime when switching ports between the source and 
destination hosts during a live migration. Neutron would provide a new 
port binding API extension and if available, Nova would use that to bind 
ports on both the source and destination hosts during live migration and 
switch which one is active during post-migration. We discussed if this 
should be dependent on os-vif object negotiation and agreed both efforts 
could be worked concurrently and then we'll see if we should merge them 
at the end, mostly to avoid having to redo a bunch of work if vif 
negotiation comes later. We also discussed if we should make the port 
binding changes on the Nova side depend on moving port orchestration to 
conductor [5] and again agreed to work those separately and see how the 
port binding code looks if it's just started in the nova-compute 
service, mainly since we don't have an owner for [5]. Sean Mooney said 
he could work on the Nova changes for this. The nova spec [6], started 
by John Garbutt in Ocata, would need to get updated for Queens. Miguel 
Lavalle will drive the changes in Neutron.


2. Using os-vif for port binding negotiation: Sean Mooney and Rodolfo 
Alonso already have some proof of concept code for this. We will want to 
get the gate-tempest-dsvm-nova-os-vif-ubuntu-xenial-nv job to be voting 
with any of this code. We also said we could work this concurrently with 
the port binding for live migration work above.


3. Bandwidth-based scheduling: this has a spec already and some work was 
done in Neutron in Pike. There are multiple interested parties in this 
feature. This will depend on getting nested resource providers done in 
Nova, really within the first milestone. Rodolfo owns this as well.


Other discussion


There were several other use cases discussed in both [1] and [2] but for 
the most part they have dependencies on other work, or they don't have 
specs/designs/PoC code, or they don't have owners. So we on the Nova 
side aren't going to be focusing on those other items.


[1] https://etherpad.openstack.org/p/placement-nova-neutron-queens-ptg
[2] https://etherpad.openstack.org/p/nova-ptg-queens
[3] https://etherpad.openstack.org/p/ocata-nova-neutron-session
[4] 
https://specs.openstack.org/openstack/neutron-specs/specs/pike/portbinding_information_for_nova.html
[5] 
https://blueprints.launchpad.net/nova/+spec/prep-for-network-aware-scheduling-pike

[6] https://review.openstack.org/#/c/375580/

--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[openstack-dev] [tripleo] Pike Retrospective & Status reporting

2017-09-18 Thread Alex Schultz
Hey folks,

We started off our PTG with a retrospective for Pike. The output of
which can be viewed here[0][1].

One of the recurring themes from the retrospective and the PTG was the
need for better communication during the cycle.  One of the ideas that
was mentioned was adding a section to the weekly meeting calling for
current status from the various tripleo squads[2].  Starting next week
(Sept 26th), I would like for folks who are members of one of the
squads be able to provide a brief status or a link to the current
status during the weekly meeting.  There will be a spot added to the
agenda to do a status roll call.  It was mentioned that folks may
prefer to send a message to the ML and just be able to link to it
similar to what the CI squad currently does[3].  We'll give this a few
weeks and review how it works.

Additionally it might be a good time to re-evaluate the squad
breakdown as currently defined. I'm not sure we have anyone working on
python3 items.

Thanks,
-Alex

[0] http://people.redhat.com/aschultz/denver-ptg/tripleo-ptg-retro.jpg
[1] https://etherpad.openstack.org/p/tripleo-ptg-queens-pike-retrospective
[2] 
https://github.com/openstack/tripleo-specs/blob/master/specs/policy/squads.rst#squads
[3] 
http://lists.openstack.org/pipermail/openstack-dev/2017-September/121881.html

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][neutron] Queens PTG recap - nova/neutron

2017-09-18 Thread Matt Riedemann
There were a few nova/neutron interactions at the PTG, one on Tuesday 
[1] and one on Thursday [2].


Priorities
--

1. Neutron port binding extension for live migration: This was discussed 
at the Ocata summit in Barcelona [3] and resulted in a Neutron spec [4] 
and API definition in Pike. The point of this is to shorten the amount 
of network downtime when switching ports between the source and 
destination hosts during a live migration. Neutron would provide a new 
port binding API extension and if available, Nova would use that to bind 
ports on both the source and destination hosts during live migration and 
switch which one is active during post-migration. We discussed if this 
should be dependent on os-vif object negotiation and agreed both efforts 
could be worked concurrently and then we'll see if we should merge them 
at the end, mostly to avoid having to redo a bunch of work if vif 
negotiation comes later. We also discussed if we should make the port 
binding changes on the Nova side depend on moving port orchestration to 
conductor [5] and again agreed to work those separately and see how the 
port binding code looks if it's just started in the nova-compute 
service, mainly since we don't have an owner for [5]. Sean Mooney said 
he could work on the Nova changes for this. The nova spec [6], started 
by John Garbutt in Ocata, would need to get updated for Queens. Miguel 
Lavalle will drive the changes in Neutron.


2. Using os-vif for port binding negotiation: Sean Mooney and Rodolfo 
Alonso already have some proof of concept code for this. We will want to 
get the gate-tempest-dsvm-nova-os-vif-ubuntu-xenial-nv job to be voting 
with any of this code. We also said we could work this concurrently with 
the port binding for live migration work above.


3. Bandwidth-based scheduling: this has a spec already and some work was 
done in Neutron in Pike. There are multiple interested parties in this 
feature. This will depend on getting nested resource providers done in 
Nova, really within the first milestone. Rodolfo owns this as well.


Other discussion


There were several other use cases discussed in both [1] and [2] but for 
the most part they have dependencies on other work, or they don't have 
specs/designs/PoC code, or they don't have owners. So we on the Nova 
side aren't going to be focusing on those other items.


[1] https://etherpad.openstack.org/p/placement-nova-neutron-queens-ptg
[2] https://etherpad.openstack.org/p/nova-ptg-queens
[3] https://etherpad.openstack.org/p/ocata-nova-neutron-session
[4] 
https://specs.openstack.org/openstack/neutron-specs/specs/pike/portbinding_information_for_nova.html
[5] 
https://blueprints.launchpad.net/nova/+spec/prep-for-network-aware-scheduling-pike

[6] https://review.openstack.org/#/c/375580/

--

Thanks,

Matt

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo][barbican][sahara] start RPC service before launcher wait?

2017-09-18 Thread Ken Giusti
On Thu, Sep 14, 2017 at 7:33 PM, Adam Spiers  wrote:
>
> Hi Ken,
>
> Thanks a lot for the analysis, and sorry for the slow reply!
> Comments inline...
>
> Ken Giusti  wrote:
> > Hi Adam,
> >
> > I think there's a couple of problems here.
> >
> > Regardless of worker count, the service.wait() is called before
> > service.start().  And from looking at the oslo.service code, the 'wait()'
> > method is call after start(), then again after stop().  This doesn't match
> > up with the intended use of oslo.messaging.server.wait(), which should only
> > be called after .stop().
>
> Hmm, so are you saying that there might be a bug in oslo.service's
> usage of oslo.messaging, and that this Sahara bugfix was the wrong
> approach too?
>
> https://review.openstack.org/#/c/280741/1/sahara/cli/sahara_engine.py
>

Well, I don't think the explicit call to start() is going to help,
esp. if the number of workers is > 1, since the workers are forked and
need to call start() from their own process space.
In fact, if # of workers > 1 then you not only get an RPC server in
each worker process, you'll also end up with an extra RPC
server in the calling thread.

Take a look at a test service I've created for oslo.messaging:

https://pastebin.com/rSA6AD82

If you change the main code to call the new sequence, you'll end up
with 3 rpc servers (2 in the workers, one in the main process).

In that code I've made the wait() call a no-op if the server hasn't
been started first. And the stop method will call stop and wait on
the rpc server, which is the expected sequence as far as
oslo.messaging is concerned.

To me it seems that the bug is in oslo.service - calling wait() before
start() doesn't make sense.
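
For reference, the expected sequence maps onto oslo.messaging roughly as
follows; this is a minimal sketch with an illustrative endpoint and default
transport configuration, not the barbican or sahara code:

    import oslo_messaging
    from oslo_config import cfg

    class DemoEndpoint(object):
        def ping(self, ctxt):
            return 'pong'

    transport = oslo_messaging.get_transport(cfg.CONF)
    target = oslo_messaging.Target(topic='demo', server='server1')
    server = oslo_messaging.get_rpc_server(transport, target,
                                           [DemoEndpoint()],
                                           executor='eventlet')

    server.start()   # must happen before any wait()
    # ... serve requests ...
    server.stop()    # stop accepting new requests
    server.wait()    # block until in-flight requests finish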

> > Perhaps a bigger issue is that in the multi-threaded case all threads
> > appear to be calling start, wait, and stop on the same instance of the
> > service (oslo.messaging rpc server).  At least that's what I'm seeing in my
> > much-reduced test code:

I was wrong about this - I failed to notice that each service had
forked and was dealing with its own copy of the server.

> >
> > https://paste.fedoraproject.org/paste/-73zskccaQvpSVwRJD11cA
> >
> > The log trace shows multiple calls to start, wait, stop via different
> > threads to the same TaskServer instance:
> >
> > https://paste.fedoraproject.org/paste/dyPq~lr26sQZtMzHn5w~Vg
> >
> > Is that expected?
>
> Unfortunately in the interim, your pastes seem to have vanished - any
> chance you could repaste them?
>

Ugh - didn't keep a copy.  If you pull down that test code you can use
it to generate those traces.


> Thanks,
> Adam
>
> > On Mon, Jul 31, 2017 at 9:32 PM, Adam Spiers  wrote:
> > > Ken Giusti  wrote:
> > >> On Mon, Jul 31, 2017 at 10:01 AM, Adam Spiers  wrote:
> > >>> I recently discovered a bug where barbican-worker would hang on
> > >>> shutdown if queue.asynchronous_workers was changed from 1 to 2:
> > >>>
> > >>>https://bugs.launchpad.net/barbican/+bug/1705543
> > >>>
> > >>> resulting in a warning like this:
> > >>>
> > >>>WARNING oslo_messaging.server [-] Possible hang: stop is waiting for
> > >>> start to complete
> > >>>
> > >>> I found a similar bug in Sahara:
> > >>>
> > >>>https://bugs.launchpad.net/sahara/+bug/1546119
> > >>>
> > >>> where the fix was to call start() on the RPC service before making the
> > >>> launcher wait() on it, so I ported the fix to Barbican, and it seems
> > >>> to work fine:
> > >>>
> > >>>https://review.openstack.org/#/c/485755
> > >>>
> > >>> I noticed that both projects use ProcessLauncher; barbican uses
> > >>> oslo_service.service.launch() which has:
> > >>>
> > >>>    if workers is None or workers == 1:
> > >>>        launcher = ServiceLauncher(conf, restart_method=restart_method)
> > >>>    else:
> > >>>        launcher = ProcessLauncher(conf, restart_method=restart_method)
> > >>>
> > >>> However, I'm not an expert in oslo.service or oslo.messaging, and one
> > >>> of Barbican's core reviewers (thanks Kaitlin!) noted that not many
> > >>> other projects start the task before calling wait() on the launcher,
> > >>> so I thought I'd check here whether that is the correct fix, or
> > >>> whether there's something else odd going on.
> > >>>
> > >>> Any oslo gurus able to shed light on this?
> > >>>
> > >>
> > >> As far as an oslo.messaging server is concerned, the order of operations
> > >> is:
> > >>
> > >> server.start()
> > >> # do stuff until ready to stop the server...
> > >> server.stop()
> > >> server.wait()
> > >>
> > >> The final wait blocks until all requests that are in progress when stop()
> > >> is called finish and cleanup.
> > >
> > > Thanks - that makes sense.  So the question is, why would
> > > barbican-worker only hang on shutdown when there are multiple workers?
> > > Maybe the real bug is somewhere in oslo_service.service.ProcessLauncher
> > > and it's not calling start() correctly?




-- 
Ken 

Re: [openstack-dev] [TripleO] TripleO/Ansible PTG session

2017-09-18 Thread David Moreau Simard
On Mon, Sep 18, 2017 at 2:14 PM, Joshua Harlow  wrote:
> Was there any discussion at the PTG on how the newly released AWX[1] will
> affect tripleo/ansible (will it?) or ara or such? Thoughts there?
>
> [1] https://github.com/ansible/awx

I wasn't at this particular session (wish I could've cloned myself
several times to attend everything) but AWX doesn't change anything as
far as ARA is concerned.
The use cases are different.

One provides things like
ACL/RBAC/authentication/reporting/auditing/running/editing/etc but
also requires memcached, postgres, rabbitmq, a dedicated server and
running your playbooks through it.
The other one provides... reporting and requires 'pip install ara' on
your laptop :)
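
(For the curious, the laptop workflow is roughly the following; the
callback-plugin path and the report command vary across ARA versions, so
treat the exact invocations as assumptions:)

    pip install ara
    export ANSIBLE_CALLBACK_PLUGINS=$(python -m ara.setup.callback_plugins)
    ansible-playbook site.yml    # runs are recorded as they execute
    ara-manage runserver         # browse the recorded results locally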

The folks from Ansible and Tower have reached out to me about
collaborating and talked about integration with AWX/Tower.
I don't know yet what this means and what it will translate to but
I'll have an idea in the next few weeks.

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Update on Zuul v3 Rollout plans for PTG week

2017-09-18 Thread Monty Taylor

tl;dr - We didn't roll v3 out at the PTG, current plan is 2017-09-25.

On 09/09/2017 08:55 AM, Monty Taylor wrote:

Heya everybody!

You may or may not be aware that we've been targeting a go-live for Zuul 
v3 for Monday of the PTG. After our status meeting on the topic this 
afternoon we have decided to hold-off and do a bit more work on the 
migration itself while we're in Denver together.


Zuul v3 itself is in great shape, but it turns out a community of 
thousands of developers has produced a LOT of job content over the last 
7 years, and we want to hammer on the translation and migration plans a 
bit more before we unleash it on everyone. If all goes well we're still 
hoping to leave Denver having migrated all the things.


In case you didn't notice, we're still running v2.5. While Zuul v3 
itself is in great shape, we're still getting the job migration script 
solid to the point where we're comfortable the impact will be mimimal 
for most folks.


Since we're doing the Gerrit upgrade today, the new plan is to do the v3 
cutover next Monday, 2017-09-25.


We do not expect the cutover to take a long time - certainly not as long 
as the Gerrit upgrade.


We also had some really good interactions with folks about new job 
content at the PTG, so while we're not rolled out, the time was actually 
very productive.


We have a session planned Monday afternoon to go over v3 and what it 
means at 2:00PM in Vail. That session is still on, however, for those of 
you who either can't be there or prefer reading, we've also put a bunch 
of migration information into the Infra Manual:


   https://docs.openstack.org/infra/manual/zuulv3.html


Quick shout-out to everyone who came to the session. It was scheduled as 
a 30 minute session and wound up lasting 2.5 hours because of all of the 
EXCELLENT questions.


One that came up a few times that's worth bringing up here is:

"What about Third Party CI operators?"

Our STRONG recommendation is for the Third Party folks to hold off and 
not attempt to upgrade immediately. As soon as we're done migrating 
OpenStack we'll be focusing on fleshing out the documentation, and I'm 
sure we'll find some new bugs we need to squash. We'd like to ensure 
bugs have settled down, that there are good operational docs and at 
least a solid Third Party Migration FAQ before Third Party folks upgrade.


We also want to have puppet-openstackci updated properly for people who 
are using that.


Finally, we have written a migration script for OpenStack. It's MOSTLY 
configurable and should be usable by Third Party folks - but as the 
primary focus has been on OpenStack's transition, there are a few 
hard-coded things and a few places where assumptions were likely made. 
We'll want to work with our Third Party community on fixing any 
Infra-specific issues in that script before we recommend a wide rollout.


(PS. Please do not look at the code in the migration script. It is some 
of the ugliest and wonkiest code I've ever written and I wouldn't want 
you all to think less of me)


As we work over the first part of the week you may see Zuul running some 
jobs on your patches and leaving results. If you do - don't freak out, 
it's just us making sure things are working. We also may grab a few of 
you to verify that migrated versions of your jobs are working properly. 
Or you may not notice anything at all until the go-live happens.


We may still do that this week, so the advice holds true. If you see 
Zuul start to comment on patches in your project, don't freak out. 
HOWEVER - if Zuul does leave a comment this week and the results are 
different from what Jenkins leaves, it would be helpful to compare the 
output and let us know if the new Zuul jobs are doing the wrong things.


The content will not be identical, as v3 handles several things 
fundamentally differently. But if you have a job that thinks it's 
testing devstack with magnum enabled and that is not happening, then 
that's very important for us to figure out.


Thanks for your patience! It was great to talk to everyone we were able 
to connect with at the PTG. We're looking forward to rolling this out 
and getting a bunch of new tools in everyone's hands next week.


Thanks!
Monty

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] TripleO/Ansible PTG session

2017-09-18 Thread Joshua Harlow
Was there any discussion at the PTG on how the newly released AWX[1] 
will affect tripleo/ansible (will it?) or ara or such? Thoughts there?


[1] https://github.com/ansible/awx

James Slagle wrote:

On Wednesday at the PTG, TripleO held a session around our current use
of Ansible and how to move forward. I'll summarize the results of the
session. Feel free to add anything I forgot and provide any feedback
or questions.

We discussed the existing uses of Ansible in TripleO and how they
differ in terms of what they do and how they interact with Ansible. I
covered this in a previous email[1], so I'll skip over summarizing
those points again.

I explained a bit about the  "openstack overcloud config download"
approach implemented in Pike by the upgrades squad. This method
no-op's out the deployment steps during the actual Heat stack-update,
then uses the cli to query stack outputs to create actual Ansible
playbooks from those output values. The Undercloud is then used as the
Ansible runner to apply the playbooks to each Overcloud node.

I created a sequence diagram for this method and explained how it
would also work for initial stack deployment[2]:

https://slagle.fedorapeople.org/tripleo-ansible-arch.png

The high level proposal was to move in a direction where we'd use the
config download method for all Heat driven stack operations
(stack-create and stack-update).

We highlighted and discussed several key points about the method shown
in the diagram:

- The entire sequence and flow is driven via Mistral on the Undercloud
by default. This preserves the API layer and provides a clean reusable
interface for the CLI and GUI.

- It would still be possible to run ansible-playbook directly for
various use cases (dev/test/POC/demos). This preserves the quick
iteration via Ansible that is often desired.

- The remaining SoftwareDeployment resources in tripleo-heat-templates
need to be supported by config download so that the entire
configuration can be driven with Ansible, not just the deployment
steps. The success criteria for this point would be to illustrate
using an image that does not contain a running os-collect-config.

- The ceph-ansible implementation done in Pike could be reworked to
use this model. "config download" could generate playbooks that have
hooks for calling external playbooks, or those hooks could be
represented in the templates directly. The result would be the same
either way though in that Heat would no longer be triggering a
separate Mistral workflow just for ceph-ansible.

- We will need some centralized log storage for the ansible-playbook
results and should consider using ARA.

As it would be a lot of work to eventually make this method the
default, I don't expect or plan that we will complete all this work in
Queens. We can however start moving in this direction.

Specifically, I hope to soon add support to config download for the
rest of the SoftwareDeployment resources in tripleo-heat-templates as
that will greatly simplify the undercloud container installer. Doing
so will illustrate using the ephemeral heat-all process as simply a
means for generating ansible playbooks.

I plan to create blueprints this week for Queens and beyond. If you're
interested in this work, please let me know. I'm open to the idea of
creating an official squad for this work, but I'm not sure if it's
needed or not.

As not everyone was able to attend the PTG, please do provide feedback
about this plan as it should still be considered open for discussion.

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-July/119405.html
[2] https://slagle.fedorapeople.org/tripleo-ansible-arch.png
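
As a rough illustration of the "config download" flow described above (the
command name is from the session; the output directory, inventory, and
playbook name are assumptions):

    openstack overcloud config download --config-dir ~/overcloud-config
    ansible-playbook -i <inventory> \
        ~/overcloud-config/<stack>/deploy_steps_playbook.yaml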



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack-operators] [nova][cinder] Queens PTG recap - nova/cinder

2017-09-18 Thread Matt Riedemann
On Thursday morning at the PTG the Nova and Cinder teams got together to 
talk through some items. Details are in the etherpad [1].


Bug 1547142 [2]
---

This is a long-standing bug where Nova does not terminate connections 
when shelve offloading an instance. There was some confusion when this 
was originally reported about whether or not calling 
os-terminate_connection would fix the issue for all backends. The Cinder 
team said it should, and if not it's a bug in the volume drivers in 
Cinder. So we went ahead and rebased the fix [3] which is merged and 
making its way through the stable backports now. This fixes old-style 
attachments. For the new style attachments which get enabled in [4] 
we'll also have to make sure that we create a new volume attachment to 
keep the volume reserved but delete the old attachments for the old host 
connector.


New style volume attach dev update
--

The Cinder team gave an overview of the work completed in Pike and what 
is on-going in Queens for enabling Nova to use new-style volume 
attachments in Cinder, which are based on the 3.27 and 3.44 Cinder API 
microversions. This was also a chance to merge some patches in the 
Queens series and give background to the review teams, mostly on the 
Nova side.


There was general agreement to get the new-style attachment flows merged 
early in Queens so we can flush out bugs and start working on 
multi-attach support.


We also said that we would not work on migrating old style attachments 
to new style in Queens. We don't plan on removing the old flows in Nova 
anytime soon, and once we do we can start talking about migrating data then.
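
For context, the new-style flow is driven through Cinder's attachments API;
a hypothetical cinderclient session looks something like the following (the
exact flags are from memory, so treat them as assumptions):

    cinder --os-volume-api-version 3.44 attachment-create \
        --instance <server-uuid> <volume-uuid>
    cinder --os-volume-api-version 3.44 attachment-list
    cinder --os-volume-api-version 3.44 attachment-delete <attachment-uuid>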


Volume multi-attach
---

Most of the discussion here was around shared volume connections and how 
to model those out of the Cinder API so that Nova can know when it 
should perform a final disconnect_volume call on the host when detaching 
a volume. We agreed that Cinder needs a new API microversion to model 
this which we will then update [4] to rely on that new microversion 
before enabling new style attachments.


We also talked about whether or not we should allow boot from volume 
with an existing multi-attach volume. We decided to allow this but 
disable it via default policy. So there will be a new policy rule in 
both Nova and Cinder:


1. Nova: add a policy rule, disabled by default, to allow boot from 
volume with a multi-attach volume (see the sketch after this list).


2. Cinder: allow multi-attach volumes based on the storage backend 
support, allow multi-attach but only for read-only volumes, or disable 
creating multi-attach volumes altogether. I'm a bit fuzzy on the details 
here, but looking at the existing Cinder API code I don't see any policy 
checks for creating a multiattach volume at all, so this is probably 
something good to add anyway since not all Nova compute drivers are 
going to support multiattach volumes right away.
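
To make item 1 concrete, the Nova side might ship something like the
following policy entry; the rule name is invented for illustration, since
none had been defined at the time:

    {
        "os_compute_api:servers:create:boot_from_multiattach_volume":
            "rule:admin_api"
    }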


Ildiko Vancsa is updating the nova spec for multi-attach support for 
Queens with the new details.


Refreshing volume connection_info
-

This was based on a mailing list discussion [5] and the PTG discussion 
was already summarized in that thread [6].


Cinder ephemeral storage


This was a rehash of the Boston forum discussion [7]. We agreed to work 
on both the short term and long term options here.


The short-term option is adding an "is_bfv" attribute on flavors in 
Nova, which defaults to False, but if True would perform a simple boot 
from volume using the specified image and flavor disk details. Think of 
this like get-me-a-network but for boot from volume. Anything more 
detailed, like volume type, guest_format, disk_bus, ephemeral or swap 
disks, would have to be handled through the normal API usage we have 
today. Also, user-defined or image-defined block device mapping 
attributes in the request would supersede the flavor.
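
A hypothetical sketch of the short-term option from the CLI, assuming
"is_bfv" surfaces as a flavor property (the actual mechanism was still
undecided at the PTG):

    openstack flavor create --vcpus 2 --ram 4096 --disk 40 bfv.medium
    openstack flavor set bfv.medium --property is_bfv=True   # hypothetical
    openstack server create --flavor bfv.medium --image <image-uuid> demo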


The long-term option is Nova having a Cinder imagebackend driver 
for ephemeral storage. Chet Burgess has started looking at this, and it 
was recommended to look at the ScaleIO imagebackend as a template since 
they both have to solve problems with non-local storage. The good news 
is a Cinder ephemeral imagebackend driver in Nova would not need to deal 
with image caching, since Cinder can do that for us.


--

All in all I felt we had a really productive set of topics and 
discussions between the teams with everyone being on the same page and 
going the same direction, which is nice to see. Boring is good.


[1] https://etherpad.openstack.org/p/cinder-ptg-queens
[2] https://bugs.launchpad.net/nova/+bug/1547142
[3] https://review.openstack.org/257275
[4] https://review.openstack.org/#/c/330285/
[5] http://lists.openstack.org/pipermail/openstack-dev/2017-June/118040.html
[6] 
http://lists.openstack.org/pipermail/openstack-dev/2017-September/122170.html

[7] 

[openstack-dev] [Neutron] Weekly IRC meeting cancelled on September 18th

2017-09-18 Thread Miguel Lavalle
Dear Neutrinos,

Since this is the first day after the PTG, we are going to cancel today's
weekly IRC meeting at 2100 UTC. We will resume the weekly meetings on
Tuesday, September 26th, at 1400 UTC.

I will send a message soon with the summary of the PTG. In the meantime,
team members can watch and listen all the sessions from Wednesday and
Thursday here:

Wednesday morning: https://www.youtube.com/watch?v=lJm8vIwxGec
Wedensday afternoon: https://www.youtube.com/watch?v=LPxydx5ypAE
Thursday Morning: https://www.youtube.com/watch?v=zSHIpkR9Jxg
Neutron / Nova x-project: https://www.youtube.com/watch?v=kF5uat0MbuY
Thursday late afternoon: https://www.youtube.com/watch?v=lJm8vIwxGec

Best regards

Miguel
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][cinder] Queens PTG recap - nova/cinder

2017-09-18 Thread Matt Riedemann
On Thursday morning at the PTG the Nova and Cinder teams got together to 
talk through some items. Details are in the etherpad [1].


Bug 1547142 [2]
---

This is a long-standing bug where Nova does not terminate connections 
when shelve offloading an instance. There was some confusion when this 
was originally reported about whether or not calling 
os-terminate_connection would fix the issue for all backends. The Cinder 
team said it should, and if not it's a bug in the volume drivers in 
Cinder. So we went ahead and rebased the fix [3] which is merged and 
making its way through the stable backports now. This fixes old-style 
attachments. For the new style attachments which get enabled in [4] 
we'll also have to make sure that we create a new volume attachment to 
keep the volume reserved but delete the old attachments for the old host 
connector.


New style volume attach dev update
--

The Cinder team gave an overview of the work completed in Pike and what 
is on-going in Queens for enabling Nova to use new-style volume 
attachments in Cinder, which are based on the 3.27 and 3.44 Cinder API 
microversions. This was also a chance to merge some patches in the 
Queens series and give background to the review teams, mostly on the 
Nova side.


There was general agreement to get the new-style attachment flows merged 
early in Queens so we can flush out bugs and start working on 
multi-attach support.


We also said that we would not work on migrating old style attachments 
to new style in Queens. We don't plan on removing the old flows in Nova 
anytime soon, and once we do we can start talking about migrating data then.


Volume multi-attach
---

Most of the discussion here was around shared volume connections and how 
to model those out of the Cinder API so that Nova can know when it 
should perform a final disconnect_volume call on the host when detaching 
a volume. We agreed that Cinder needs a new API microversion to model 
this which we will then update [4] to rely on that new microversion 
before enabling new style attachments.


We also talked about whether or not we should allow boot from volume 
with an existing multi-attach volume. We decided to allow this but 
disable it via default policy. So there will be a new policy rule in 
both Nova and Cinder:


1. Nova: add a policy rule, disabled by default, to allow boot from 
volume with a multi-attach volume.


2. Cinder: allow multi-attach volumes based on the storage backend 
support, allow multi-attach but only for read-only volumes, or disable 
creating multi-attach volumes altogether. I'm a bit fuzzy on the details 
here, but looking at the existing Cinder API code I don't see any policy 
checks for creating a multiattach volume at all, so this is probably 
something good to add anyway since not all Nova compute drivers are 
going to support multiattach volumes right away.


Ildiko Vancsa is updating the nova spec for multi-attach support for 
Queens with the new details.


Refreshing volume connection_info
-

This was based on a mailing list discussion [5] and the PTG discussion 
was already summarized in that thread [6].


Cinder ephemeral storage


This was a rehash of the Boston forum discussion [7]. We agreed to work 
on both the short term and long term options here.


The short-term option is adding an "is_bfv" attribute on flavors in 
Nova, which defaults to False, but if True would perform a simple boot 
from volume using the specified image and flavor disk details. Think of 
this like get-me-a-network but for boot from volume. Anything more 
detailed, like volume type, guest_format, disk_bus, ephemeral or swap 
disks, would have to be handled through the normal API usage we have 
today. Also, user-defined or image-defined block device mapping 
attributes in the request would supersede the flavor.


The long-term option is Nova having a Cinder imagebackend driver 
for ephemeral storage. Chet Burgess has started looking at this, and it 
was recommended to look at the ScaleIO imagebackend as a template since 
they both have to solve problems with non-local storage. The good news 
is a Cinder ephemeral imagebackend driver in Nova would not need to deal 
with image caching, since Cinder can do that for us.


--

All in all I felt we had a really productive set of topics and 
discussions between the teams with everyone being on the same page and 
going the same direction, which is nice to see. Boring is good.


[1] https://etherpad.openstack.org/p/cinder-ptg-queens
[2] https://bugs.launchpad.net/nova/+bug/1547142
[3] https://review.openstack.org/257275
[4] https://review.openstack.org/#/c/330285/
[5] http://lists.openstack.org/pipermail/openstack-dev/2017-June/118040.html
[6] 
http://lists.openstack.org/pipermail/openstack-dev/2017-September/122170.html

[7] 

Re: [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

2017-09-18 Thread Jay S Bryant

All,

I am adding the e-mails from the Inspur CI in the OpenStack 3rd Party CI 
Wiki in case they do not monitor the mailing list.


Inspur CI team,

Please see the e-mails below.  Your job is currently voting and should 
not be.  Please update your config to be non-voting.


Thank you!

Jay



On 9/18/2017 12:07 PM, Ivan Kolodyazhny wrote:

Erlon,

Unfortunately, gerrit is unavailable now due to maintenance.

Lenny,

I didn't find anything related to INSPUR in the project-config.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Mon, Sep 18, 2017 at 5:13 PM, Erlon Cruz wrote:


Do you have the patch link? I don't see any report of it[1].



[1] http://ci-watch.tintri.com/project?project=cinder=7+days


On Mon, Sep 18, 2017 at 6:40 AM, Lenny Verkhovsky wrote:

You can do it in your Zuul layout.yaml configuration file

Ex:

- name: Nova-ML2-Sriov
  branch: ^(master|stable/ocata|stable/newton|stable/pike).*$
  voting: false
  skip-if:

*From:* Ivan Kolodyazhny [mailto:e...@e0ne.info]
*Sent:* Monday, September 18, 2017 1:05 PM
*To:* OpenStack Development Mailing List
*Cc:* wangyong2...@inspur.com; inspur...@inspur.com; jiaohao...@inspur.com
*Subject:* [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

Hi Team,

Looks like we accidentally made INSPUR-CI voting for Cinder.
For now, it puts -1 on all patches. According to [1], please
make it non-voting.

[1]

https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#When_thirdparty_CI_voting_will_be_required.3F




Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

2017-09-18 Thread Ivan Kolodyazhny
Erlon,

Unfortunately, gerrit is unavailable now due to the maintenance.

Lenny,

I didn't find anything related to INSPUR in the project-config.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Mon, Sep 18, 2017 at 5:13 PM, Erlon Cruz  wrote:

> Do you have the patch link? I don't see any report of it[1].
>
>
>
> [1] http://ci-watch.tintri.com/project?project=cinder=7+days
>
> On Mon, Sep 18, 2017 at 6:40 AM, Lenny Verkhovsky wrote:
>
>> You can do it in your Zuul layout.yaml configuration file
>>
>>
>>
>> Ex:
>>
>> - name: Nova-ML2-Sriov
>>   branch: ^(master|stable/ocata|stable/newton|stable/pike).*$
>>   voting: false
>>   skip-if:
>>
>>
>>
>>
>>
>> *From:* Ivan Kolodyazhny [mailto:e...@e0ne.info]
>> *Sent:* Monday, September 18, 2017 1:05 PM
>> *To:* OpenStack Development Mailing List
>> *Cc:* wangyong2...@inspur.com; inspur...@inspur.com;
>> jiaohao...@inspur.com
>> *Subject:* [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI
>>
>>
>>
>> Hi Team,
>>
>>
>>
>> Looks like we accidentally made INSPUR-CI voting for Cinder. For now, it
>> puts -1 for all patches. According to [1] please make it non-voting.
>>
>>
>>
>>
>>
>> [1] https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#When_thirdparty_CI_voting_will_be_required.3F
>> 
>>
>>
>>
>>
>> Regards,
>> Ivan Kolodyazhny,
>> http://blog.e0ne.info/
>> 
>>
>> 
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] Queens PTG team photos

2017-09-18 Thread Matt Riedemann

Here are the links to the Nova team photos from the PTG.

https://photos.app.goo.gl/JoYZyouzm0J670mH3

https://photos.app.goo.gl/YMo96j6KKc044XdG2

--

Thanks,

Matt

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] TripleO/Ansible PTG session

2017-09-18 Thread James Slagle
On Wednesday at the PTG, TripleO held a session around our current use
of Ansible and how to move forward. I'll summarize the results of the
session. Feel free to add anything I forgot and provide any feedback
or questions.

We discussed the existing uses of Ansible in TripleO and how they
differ in terms of what they do and how they interact with Ansible. I
covered this in a previous email[1], so I'll skip over summarizing
those points again.

I explained a bit about the "openstack overcloud config download"
approach implemented in Pike by the upgrades squad. This method
no-ops out the deployment steps during the actual Heat stack-update,
then uses the CLI to query stack outputs to create actual Ansible
playbooks from those output values. The Undercloud is then used as the
Ansible runner to apply the playbooks to each Overcloud node.

I created a sequence diagram for this method and explained how it
would also work for initial stack deployment[2]:

https://slagle.fedorapeople.org/tripleo-ansible-arch.png
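
As a rough sketch of that flow (hedged; Pike-era commands, and the inventory
helper and playbook name are assumptions based on the current implementation):

openstack overcloud config download --name overcloud --config-dir ~/config
ansible-playbook -i /usr/bin/tripleo-ansible-inventory \
    ~/config/deploy_steps_playbook.yaml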

The high level proposal was to move in a direction where we'd use the
config download method for all Heat driven stack operations
(stack-create and stack-update).

We highlighted and discussed several key points about the method shown
in the diagram:

- The entire sequence and flow is driven via Mistral on the Undercloud
by default. This preserves the API layer and provides a clean reusable
interface for the CLI and GUI.

- It would still be possible to run ansible-playbook directly for
various use cases (dev/test/POC/demos). This preserves the quick
iteration via Ansible that is often desired.

- The remaining SoftwareDeployment resources in tripleo-heat-templates
need to be supported by config download so that the entire
configuration can be driven with Ansible, not just the deployment
steps. The success criteria for this point would be to illustrate
using an image that does not contain a running os-collect-config.

- The ceph-ansible implementation done in Pike could be reworked to
use this model. "config download" could generate playbooks that have
hooks for calling external playbooks, or those hooks could be
represented in the templates directly. The result would be the same
either way though in that Heat would no longer be triggering a
separate Mistral workflow just for ceph-ansible.

- We will need some centralized log storage for the ansible-playbook
results and should consider using ARA.

As it would be a lot of work to eventually make this method the
default, I don't expect or plan that we will complete all this work in
Queens. We can however start moving in this direction.

Specifically, I hope to soon add support to config download for the
rest of the SoftwareDeployment resources in tripleo-heat-templates as
that will greatly simplify the undercloud container installer. Doing
so will illustrate using the ephemeral heat-all process as simply a
means for generating ansible playbooks.

I plan to create blueprints this week for Queens and beyond. If you're
interested in this work, please let me know. I'm open to the idea of
creating an official squad for this work, but I'm not sure if it's
needed or not.

As not everyone was able to attend the PTG, please do provide feedback
about this plan as it should still be considered open for discussion.

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-July/119405.html
[2] https://slagle.fedorapeople.org/tripleo-ansible-arch.png

-- 
-- James Slagle
--

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack-operators] [nova] Queens PTG recap - placement

2017-09-18 Thread Matt Riedemann
Placement related items came up a lot at the Queens PTG. Some on Tuesday 
[1], some on Wednesday [2], some on Thursday [3] and some on Friday [4].


Priorities for Queens
---------------------

The priorities for placement/scheduler related items in Queens are:

1. Migration allocations [5] - we realized late in Pike that the way we 
were tracking allocations across source and dest nodes during a move 
operation (cold migrate, live migrate, resize, evacuate) was confusing 
and error prone, and we had to "double up" allocations for the instance 
during the move. The idea here is to simplify the resource allocation 
modeling during a move operation by having the migration record be a 
consumer of resource allocations during the move, so we can keep the 
source/dest node allocations separate using the instance/migration 
records. This is mostly internal technical debt reduction and to 
simplify our accounting which should mean fewer bugs.
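
Illustratively, a hedged sketch of the Placement call this implies, with the
migration record's UUID as the consumer (the exact payload format and
microversion are assumptions, not the final design):

PUT /allocations/<migration-uuid>
{
    "allocations": [
        {"resource_provider": {"uuid": "<source-node-rp-uuid>"},
         "resources": {"VCPU": 4, "MEMORY_MB": 8192, "DISK_GB": 80}}
    ]
}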


2. Alternate hosts - this is the work to have the scheduler determine a 
set of alternative hosts for reschedules. This is important for cells v2 
where the cell conductor and nova-compute services can't reach the API 
database or scheduler, so reschedules need to happen within the cell 
given a list of pre-determined hosts chosen by the scheduler at the top. 
Ed Leafe has already started on some of this [6].


3. Nested resource providers [7] - this has been around for a while now 
but hasn't had the proper reviewer focus due to other priorities. We are 
making this a priority in Queens as it enables a lot of other use cases 
like bandwidth-aware scheduling and being able to eventually remove 
major chunks of the claims code in the ResourceTracker in the compute 
service. We agreed that in Queens we want to try and keep the scope of 
this small and focus on being able to model a simple SR-IOV PF/VF 
relationship. Modeling NUMA use cases will be post-Queens. We will need 
quite a bit of work on functional testing done along with this so that 
we have some fixtures and/or fake virt drivers in place to model things 
like CPU pinning, huge pages, NUMA, SR-IOV, etc which also verify 
allocations in Placement to know we are doing things correctly from the 
client perspective, similar to the functional tests added for verifying 
allocations during move operations in Pike.


General device management
-------------------------

This was a more forward looking discussion and the notes are in the 
etherpad [3]. This is not really slated for Queens work except to make 
sure that things we do in Queens don't limit what we can do for 
generically managing devices later, and is tied heavily to the nested 
resource providers work.


Other discussion
----------------

Traits - supporting required traits in a flavor is on-going and the spec 
is here [8].


Shared storage providers [9] - we have decided to defer working on this 
from Queens given other priorities. Modeling move allocations with 
migration records should help here though.


Modeling distance for (anti-)affinity use cases - this is being deferred 
from Queens. There are workarounds when running with multiple cells.


Limits and ordering in Placement - Chris Dent has proposed a spec [10] 
so that we can limit the size of a response when getting resource 
providers from Placement during scheduling and also optionally configure 
the behavior of how Placement orders the returned set, so you can pack 
or spread possible build candidates.
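
For illustration only, a request under that proposal might look something
like this (hedged; the exact parameter names depend on how the spec lands):

GET /allocation_candidates?resources=VCPU:2,MEMORY_MB:4096&limit=20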


OSC plugin - I'm trying to push this work forward. We have the plugin 
installed with devstack now and a functional CI job for the repo but 
need to move some of the patches forward that add the CLI functionality.


There was lots of other random stuff in [2] and [4], but for the most 
part it is not prioritized, spec'ed out, or clearly owned, so it is not 
really getting attention for Queens.


[1] https://etherpad.openstack.org/p/placement-nova-neutron-queens-ptg
[2] https://etherpad.openstack.org/p/nova-ptg-queens-placement
[3] 
https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management

[4] https://etherpad.openstack.org/p/nova-ptg-queens
[5] 
https://specs.openstack.org/openstack/nova-specs/specs/queens/approved/migration-allocations.html

[6] https://review.openstack.org/#/c/498830/
[7] 
https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/nested-resource-providers.html

[8] https://review.openstack.org/#/c/468797/
[9] https://bugs.launchpad.net/nova/+bug/1707256
[10] https://review.openstack.org/#/c/504540/

--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [openstack-dev] [zun][unit test] Any python utils can collect pci info?

2017-09-18 Thread Eric Fried
You may get a little help from the methods in nova.pci.utils.

If you're calling out to lspci or accessing sysfs, be aware of this
series [1] and do it via the new privsep mechanisms.

[1]
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:hurrah-for-privsep
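
Not authoritative, but a minimal sketch of that pattern (shelling out for PCI
details and checking sysfs behind an oslo.privsep entrypoint; the context and
function names here are illustrative, not existing Zun/Nova code):

import os
import subprocess

from oslo_privsep import capabilities
from oslo_privsep import priv_context

# Privsep context so an unprivileged service can still gather device info.
pci_privsep = priv_context.PrivContext(
    __name__,
    cfg_section='pci_privsep',
    pypath=__name__ + '.pci_privsep',
    capabilities=[capabilities.CAP_SYS_ADMIN],
)


@pci_privsep.entrypoint
def list_pci_devices():
    # 'lspci -D -vmmnk' emits one "Key:<tab>value" block per device.
    out = subprocess.check_output(['lspci', '-D', '-vmmnk']).decode()
    return [dict(line.split(':\t', 1) for line in block.splitlines()
                 if ':\t' in line)
            for block in out.strip().split('\n\n')]


@pci_privsep.entrypoint
def is_virtual_function(pci_addr):
    # A VF exposes a 'physfn' symlink to its parent PF in sysfs.
    return os.path.exists('/sys/bus/pci/devices/%s/physfn' % pci_addr)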

On 09/17/2017 09:41 PM, Hongbin Lu wrote:
> Hi Shunli,
> 
>  
> 
> I am not aware of any prevailing python utils for this. An alternative
> is to shell out to Linux commands to collect the information. After a quick
> search, it looks like xenapi [1] uses “lspci -vmmnk” to collect PCI device
> detail info and “ls /sys/bus/pci/devices//” to detect the
> PCI device type (PF or VF). FWIW, you might find it helpful to refer to the
> implementation of Nova’s xenapi driver for getting PCI resources [2].
> Hope it helps.
> 
>  
> 
> [1]
> https://github.com/openstack/os-xenapi/blob/master/os_xenapi/dom0/etc/xapi.d/plugins/xenhost.py#L593
> 
> [2]
> https://github.com/openstack/nova/blob/master/nova/virt/xenapi/host.py#L154
> 
>  
> 
> Best regards,
> 
> Hongbin
> 
>  
> 
> *From:* Shunli Zhou [mailto:shunli6...@gmail.com]
> *Sent:* September-17-17 9:35 PM
> *To:* openstack-dev@lists.openstack.org
> *Subject:* [openstack-dev] [zun][unit test] Any python utils can collect
> pci info?
> 
>  
> 
> Hi all,
> 
>  
> 
> For https://blueprints.launchpad.net/zun/+spec/support-pcipassthroughfilter
> this BP, Nova uses libvirt to collect the PCI device info. But for
> zun, libvirt seems to be a heavy dependency. Is there a python util that
> can be used to collect the PCI device detail info? Such as whether a
> network PCI device is a PF or a VF, the device capabilities, etc.
> 
>  
> 
> Note: with 'lspci -D -nnmm', there is some info that cannot be obtained.
> 
>  
> 
>  
> 
> Thanks
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack] MTU on Provider Networks

2017-09-18 Thread John Petrini
Great! Thank you both for the information.

John Petrini
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

2017-09-18 Thread Erlon Cruz
Do you have the patch link? I don't see any report of it[1].



[1] http://ci-watch.tintri.com/project?project=cinder=7+days

On Mon, Sep 18, 2017 at 6:40 AM, Lenny Verkhovsky wrote:

> You can do it in your Zuul layout.yaml configuration file
>
>
>
> Ex:
>
> - name: Nova-ML2-Sriov
>   branch: ^(master|stable/ocata|stable/newton|stable/pike).*$
>   voting: false
>   skip-if:
>
>
>
>
>
> *From:* Ivan Kolodyazhny [mailto:e...@e0ne.info]
> *Sent:* Monday, September 18, 2017 1:05 PM
> *To:* OpenStack Development Mailing List
> *Cc:* wangyong2...@inspur.com; inspur...@inspur.com; jiaohao...@inspur.com
> *Subject:* [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI
>
>
>
> Hi Team,
>
>
>
> Looks like we accidentally made INSPUR-CI voting for Cinder. For now, it
> puts -1 for all patches. According to [1] please make it non-voting.
>
>
>
>
>
> [1] https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#When_thirdparty_CI_voting_will_be_required.3F
> 
>
>
>
>
> Regards,
> Ivan Kolodyazhny,
> http://blog.e0ne.info/
> 
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack] Disable distributed loadbalancers (LBaaSv2)?

2017-09-18 Thread Brian Haley

On 09/16/2017 12:25 PM, Turbo Fredriksson wrote:

When I setup my OS cluster over a year ago, I chose to use
distributed LBaaSv2. That sounded like the most sensible
thing - redundancy is the primary goal with me choosing
OS in the first place!

However, it turned out that there’s a very grave bug in
OS - Neutron - (only just recently fixed - a few weeks ago and
only in the latest development code).

https://bugs.launchpad.net/neutron/+bug/1494003
https://bugs.launchpad.net/neutron/+bug/1493809
https://bugs.launchpad.net/neutron/+bug/1583694

I run Newton (and don’t want to risk everything by either
re-installing or upgrading - last time it took me two months
to get things working again!).

Doesn’t seem to be any backport of the fix to Newton :( :(.


Sorry, due to the invasiveness of the changes it won't be backported to 
Newton, only Pike will have this support.  It also might be slightly 
broken until very recent code in stable/pike...



Does anyone have an idea on how I can “hack” the DB
(MySQL) so that it isn’t distributed any more? The OS
command line tools won’t let you de-distribute one :(.

This should be “fairly” straight forward, for anyone that knows
the “inner workings” of Neutron. Simply “undo” whatever

 neutron router-create --distributed True --ha False rname

did. I can’t unfortunately delete the router and then recreate
it without destroying my whole setup, instances, networks,
etc, etc. Everything “hangs” off of that router...


I think you should be able to remove the router interfaces on the 
external and internal networks then remove the router, without removing 
any of the private networks, etc.  Then you can create it again with 
--distributed=False.  VMs might lose connectivity for a bit though.
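
Something like the following (a hedged sketch of those steps with Newton-era
CLI; the subnet and external network names are placeholders, and you should
plan for a short outage window):

neutron router-gateway-clear rname
neutron router-interface-delete rname private-subnet
neutron router-delete rname
neutron router-create --distributed=False --ha=False rname
neutron router-gateway-set rname ext-net
neutron router-interface-add rname private-subnet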


Ocata supports DVR -> Centralized router migration, so you would only 
have to go forward one release if you choose that path.


-Brian

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [openstack-dev] [all] review.openstack.org downtime and Gerrit upgrade TODAY 15:00 UTC - 23:59 UTC

2017-09-18 Thread Andreas Jaeger
Just a friendly reminder that the upgrade will happen TODAY, Monday
18th, starting at 15:00 UTC. The infra team expects it to take up to 8
hours, so until 23:59 UTC.

For details, see Clark's announcement below:

On 2017-09-15 23:52, Clark Boylan wrote:
> On Wed, Aug 2, 2017, at 03:57 PM, Clark Boylan wrote:
>> Hello,
>>
>> The Infra team is planning to upgrade review.openstack.org from Gerrit
>> 2.11 to Gerrit 2.13 on September 18, 2017. This downtime will begin at
>> 1500UTC and is expected to take many hours as we have to perform an
>> offline update of Gerrit's secondary indexes. The outage should be
>> complete by 2359UTC.
>>
>> This upgrade is a relatively minor one for users. You'll find that
>> mobile use of Gerrit is slightly better (though still not great). The
>> bug that forces us to reapply Approval votes rather than just rechecking
>> has also been addressed. If you'd like to test out Gerrit 2.13 you can
>> do so at https://review-dev.openstack.org.
>>
>> The date we have chosen is the Monday after the PTG. The
>> expectation/hope is that many people will still be traveling or
>> otherwise recovering from the PTG so demand for Gerrit will be low. By
>> doing it on Monday we also hope that there will be load on the service
>> the following day which should help shake out any issues quickly (in the
>> past we've done it on weekends then had to wait a couple days before
>> problems are noticed).
>>
>> If you have any concerns or feedback please let the Infra team know.
> 
> As a friendly reminder we are planning to move ahead with this upgrade
> on Monday at 1500UTC. We are reviewing the process and getting some
> final preparations done on our last day at the PTG.
> 
> Thank you for your patience,
> Clark

Andreas
-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi
  SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Felix Imendörffer, Jane Smithard, Graham Norton,
   HRB 21284 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [vitrage] No IRC meeting this week

2017-09-18 Thread Afek, Ifat (Nokia - IL/Kfar Sava)
Hi,

The IRC meeting on the coming Wednesday (September 20th) is canceled, as many 
Vitrage contributors will be on vacation. We will meet again next week, on 
September 27th.

Best Regards,
Ifat.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [puppet][tripleo] Gates broken, avoid rechecks

2017-09-18 Thread Mohammed Naser
Hi everyone,

Just a quick heads up that there is currently an issue in the
TripleO CI which is in the process of being fixed:

https://bugs.launchpad.net/tripleo/+bug/1717545

As certain Puppet modules gate for TripleO, please don't recheck
changes that have failing jobs which start with `gate-tripleo-ci-` as
they will fail anyway.

I'll send an update email when things are fixed (or when you see that
bug resolved).

Thank you,
Mohammed

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TC][Cyborg] Application For OpenStack Official Project

2017-09-18 Thread Zhipeng Huang
Hi everyone,

The Cyborg team would like to submit its application to become an official
OpenStack project under governance [1].

We would like to thank everyone from the TC who attended our new project
update session in Denver for the encouragement and kind words you provided.
The overview slides used at that meeting can be found at [2].

We look forward to your questions, and I (as well as the other team members)
will try our very best to answer them. We also look forward to a more
exciting future within the OpenStack community :)

[1] https://review.openstack.org/#/c/504940/
[2]
https://docs.google.com/presentation/d/1RyDDVMBsQndN-Qo_JInnHaj_oCY6zwT_3ky_gk1nJMo/edit?usp=sharing


-- 
Zhipeng (Howard) Huang

Standard Engineer
IT Standard & Patent/IT Product Line
Huawei Technologies Co,. Ltd
Email: huangzhip...@huawei.com
Office: Huawei Industrial Base, Longgang, Shenzhen

(Previous)
Research Assistant
Mobile Ad-Hoc Network Lab, Calit2
University of California, Irvine
Email: zhipe...@uci.edu
Office: Calit2 Building Room 2402

OpenStack, OPNFV, OpenDaylight, OpenCompute Aficionado
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova]notification update week 38

2017-09-18 Thread Balazs Gibizer

Hi,

Here is the status update / focus settings mail for w38.

Bugs
----
[Medium] https://bugs.launchpad.net/nova/+bug/1699115 api.fault
notification is never emitted
We still have to figure out what the expected behavior is here, based on:
http://lists.openstack.org/pipermail/openstack-dev/2017-June/118639.html
I think I will propose a patch to remove the api.fault notification to
help start the discussion.

[High] https://bugs.launchpad.net/nova/+bug/1706563
TestRPC.test_cleanup_notifier_null fails with timeout
[High] https://bugs.launchpad.net/nova/+bug/1685333 Fatal Python error:
Cannot recover from stack overflow. - in py35 unit test job
The first bug is just a duplicate of the second. It seems the TestRPC
test suite has a way to end up in infinite recursion.
I don't know of a way to reproduce it locally, or to change the gate
env so that python prints out the full stack trace to see where the
problematic call is. Also, adding extra log messages won't help, as a
timed-out test doesn't have its log messages printed to the logs. So
this bug is pretty stuck.
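
One possible local debugging aid (a hedged suggestion, not something wired
into the gate today) is to have the test process dump all thread stacks when
it exceeds a deadline:

import faulthandler

# Print tracebacks for all threads if we are still running after 60 seconds.
faulthandler.dump_traceback_later(60, exit=True)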

Versioned notification transformation
-------------------------------------
There are 3 transformation patches that only need a second +2:
* https://review.openstack.org/#/c/454023/ Transform servergroup.create 
notification
* https://review.openstack.org/#/c/483902/ Transform servergroup.delete 
notification
* https://review.openstack.org/#/c/396210/ Transform aggregate.add_host 
notification



Searchlight integration
-----------------------
As we discussed at the PTG, the Searchlight integration is not likely to
happen in the near future, so extending the nova notifications is not a
priority. This means that we are not planning to add the 'status' field
to the instance notifications. The other task in the current bp
https://blueprints.launchpad.net/nova/+spec/additional-notification-fields-for-searchlight-queens
is to avoid an unnecessary BDM DB query when we emit instance
notifications. We agreed that we want to do this, as it is a meaningful
optimization of the current code. Patches already proposed and waiting
for review:
https://review.openstack.org/#/q/topic:bp/additional-notification-fields-for-searchlight-queens

Small improvements
------------------

* https://review.openstack.org/#/q/topic:refactor-notification-samples
Factor out duplicated notification sample data
This is the start of a longer patch series to deduplicate notification
sample data. The third patch already shows how much sample data can be
deleted from the nova tree. We added a minimal hand-rolled json ref
implementation to the notification sample tests, as the existing python
json ref implementations are not well maintained.
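
For reference, a minimal hand-rolled resolver of that sort might look like
this (a hedged sketch, not the actual nova test code; it assumes file-based
refs with an optional '#/...' fragment):

import copy
import json
import os


def resolve_refs(node, base_dir):
    # Recursively expand {"$ref": "file.json#/path/to/key"} nodes.
    if isinstance(node, dict):
        if '$ref' in node:
            path, _, fragment = node['$ref'].partition('#')
            with open(os.path.join(base_dir, path)) as f:
                target = json.load(f)
            for key in filter(None, fragment.split('/')):
                target = target[key]
            return resolve_refs(copy.deepcopy(target), base_dir)
        return {k: resolve_refs(v, base_dir) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_refs(v, base_dir) for v in node]
    return node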


Weekly meeting
--------------
The next subteam meeting will be held on Tuesday, the 19th of September, at 
17:00 UTC on openstack-meeting-4.

https://www.timeanddate.com/worldclock/fixedtime.html?iso=20170919T17


Cheers,
gibi





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] Queens PTG: Thursday summary

2017-09-18 Thread Belmiro Moreira
Hi Brian,
Thanks for the session summaries.

We are really interested in the image lifecycle support.
Can you elaborate on how Searchlight would help solve this problem?

thanks,
Belmiro
CERN

On Fri, Sep 15, 2017 at 4:46 PM, Brian Rosmaita wrote:

> For those who couldn't attend, here's a quick synopsis of what was
> discussed yesterday.
>
> Please consult the etherpad for each session for details.  Feel free
> to put questions/comments on the etherpads, and then put an item on
> the agenda for the weekly meeting on Thursday 21 September, and we'll
> continue the discussion.
>
>
> Complexity removal
> --
> https://etherpad.openstack.org/p/glance-queens-ptg-complexity-removal
>
> In terms of a complexity contribution barrier, everyone agreed that
> the domain model is the largest factor.
>
> We also agreed that simplifying it is not something that could happen
> in the Queens cycle.  It's probably a two-cycle effort, one cycle to
> ensure sufficient test coverage, and one cycle to refactor.  Given the
> strategic planning session yesterday, we probably wouldn't want to
> tackle this until after the registry is completely removed, which is
> projected to happen in S.
>
>
> Image lifecycle support
> ---
> https://etherpad.openstack.org/p/glance-queens-ptg-lifecycle
>
> We sketched out several approaches, but trying to figure out a
> solution that would work across different types of deployments and
> various use cases gets complicated fast.  It would be better for
> deployers to use Searchlight to configure complex queries that could
> use all appropriate image metadata specified by the deployer.
>
> For interoperability, deployers could use the common image properties
> with suggested values on their public images.
>
> We looked at two particular approaches that might help operators.  The
> first would be introducing a kind of 'local_version' field that would
> be auto-incremented by Glance, the idea being that an image-list query
> that asked for the max value would yield the most recent version of
> that image.  One problem, however, is what other metadata would be
> used in the query, as there might be several versions of images with
> the same os_distro and os_version properties (for example, the base
> CentOS 7 image and the LAMP CentOS 7 image).
>
> The second approach is introducing a 'hidden' property which would
> cause the image to be hidden from any image list calls (except for the
> image owner or glance admin).  This has been requested before, but
> hasn't been enthusiastically endorsed because it leaves out several
> use cases.  But combined with Searchlight (with an updated glance
> plugin to understand the 'hidden' field), it might be the best
> solution.
>
>
> Should Glance be replaced?
> --
> https://etherpad.openstack.org/p/glance-queens-ptg-glance-removal
>
> The short answer is No.  Glance is the best way for deployments to
> provide the Images API v2.  The project team has recently regained the
> team:diverse-affiliation tag and is in a healthier state than it was
> immediately after the downsizing craze of 2017 that happened early in
> the Pike cycle.  The Glance project team is committed to the long term
> stability of Glance.
>
>
> glance_store
> 
> https://etherpad.openstack.org/p/glance-queens-ptg-glance_store
>
> We had a combined session with the Glare team, who also consume the
> glance_store library, and worked out a list of items to improve the
> library.
>
>
>
> Multiple same store type support
> 
> https://etherpad.openstack.org/p/glance-queens-ptg-multi-store
>
> This has been requested by operators, and the interoperable image
> import introduced in v2.6 of the Images API can be used to allow end
> users to request what store to use.  The Glance design will be
> consistent (to the largest extent possible) with Cinder (at least as
> far as configuration goes, to make it easy on operators).
>
>
>
> Queens Prioritization and Roadmapping
> -
> https://etherpad.openstack.org/p/glance-queens-ptg-roadmap
>
> See the etherpad for what we think we can get done.  I'll put up a
> patch for the Queens priorities to the glance-specs repo before the
> Glance meeting on Sept 21, and we can have a final discussion of any
> outstanding issues.
>
>
>
> If you missed the Wednesday summary, here it is:
> http://lists.openstack.org/pipermail/openstack-dev/2017-September/122156.html
>
> The scheduling etherpad has links to all the session etherpads:
> https://etherpad.openstack.org/p/glance-queens-ptg
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

Re: [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

2017-09-18 Thread Lenny Verkhovsky
You can do it in your Zuul layout.yaml configuration file

Ex:
- name: Nova-ML2-Sriov
  branch: ^(master|stable/ocata|stable/newton|stable/pike).*$
  voting: false
  skip-if:


From: Ivan Kolodyazhny [mailto:e...@e0ne.info]
Sent: Monday, September 18, 2017 1:05 PM
To: OpenStack Development Mailing List 
Cc: wangyong2...@inspur.com; inspur...@inspur.com; jiaohao...@inspur.com
Subject: [openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

Hi Team,

Looks like we accidentally made INSPUR-CI voting for Cinder. For now, it puts 
-1 for all patches. According to [1] please make it non-voting.


[1] 
https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#When_thirdparty_CI_voting_will_be_required.3F


Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [cinder][3rd-party ci] Voting INSPUR-CI

2017-09-18 Thread Ivan Kolodyazhny
Hi Team,

Looks like we accidentally made INSPUR-CI voting for Cinder. For now, it
puts -1 for all patches. According to [1] please make it non-voting.


[1]
https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#When_thirdparty_CI_voting_will_be_required.3F


Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [mistral] Cancelling team meeting - 09/18/2017

2017-09-18 Thread Renat Akhmerov
Hi,

There won’t be a team meeting today. Let’s meet next week and discuss the PTG summary 
and the current activities.

Thanks

Renat Akhmerov
@Nokia
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[Openstack] how to create pipeline works between two machines through Taskflow

2017-09-18 Thread lampahome
I tried to create a *remote pipeline* procedure with taskflow and saw this:
https://wiki.openstack.org/wiki/TaskFlow/Worker-based_Engine

But it only mentions how to create a worker on one side (the server side); it
doesn't show how to start the other side (the client side).

So I'm confused about how to build a complete client-server pipeline
through taskflow.
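
For what it's worth, a hedged sketch of both sides, following the
worker-based engine documentation (the broker URL, exchange/topic names, and
the 'mytasks' module are all placeholders):

# --- worker side (runs on the remote/server machine) ---
from taskflow.engines.worker_based import worker as wkr

worker = wkr.Worker(
    url='amqp://guest:guest@broker-host:5672//',
    exchange='demo-exchange',
    topic='demo-topic',
    tasks=['mytasks'],  # module providing the Task classes this worker runs
)
worker.run()

# --- engine side (the "client" that owns the flow) ---
import taskflow.engines
from taskflow.patterns import linear_flow as lf

from mytasks import StepOne, StepTwo  # hypothetical tasks shared with worker

flow = lf.Flow('pipeline').add(StepOne(), StepTwo())
engine = taskflow.engines.load(
    flow,
    engine='worker-based',
    url='amqp://guest:guest@broker-host:5672//',
    exchange='demo-exchange',
    topics=['demo-topic'],
)
engine.run()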
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] QEMU/KVM crash when mixing cpu_policy:dedicated and non-dedicated flavors?

2017-09-18 Thread Tomas Brännström
We use Fuel for deployment, with a fairly simple network configuration
(Controller/Network node are the same) and OpenDaylight as the neutron
driver. However, we also have SR-IOV configured for some nics, and there
might be something interesting here.

The instance was created with an SR-IOV port, and in the logs I see
"Assigning a pci device without numa affinity toinstance
389109a4-540e-48d9-82b1-873b02cb4d31 which has numa topology". Then shortly
after creation fails and the hypervisor seems to crash.

So today I tried to create an instance without SR-IOV and
hw:cpu_policy=dedicated, and it worked fine. Then I did the same but added an
SR-IOV port, and I get the same crash (though not across all nodes this
time...)

I assume we have some kind of misconfiguration somewhere, though the entire
hypervisor crashing doesn't seem correct either :-)

/Tomas
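
For anyone else hitting this, a hedged sketch of the aggregate-based
separation Steve describes below (host, aggregate and flavor names are
placeholders; the AggregateInstanceExtraSpecsFilter scheduler filter must be
enabled for the extra spec to take effect):

nova aggregate-create pinned-hosts
nova aggregate-set-metadata pinned-hosts pinned=true
nova aggregate-add-host pinned-hosts compute-1
nova flavor-key m1.pinned set hw:cpu_policy=dedicated
nova flavor-key m1.pinned set aggregate_instance_extra_specs:pinned=true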

On 17 September 2017 at 00:32, Steve Gordon  wrote:

> - Original Message -
> > From: "Tomas Brännström" 
> > To: openstack@lists.openstack.org
> > Sent: Friday, September 15, 2017 5:56:34 AM
> > Subject: [Openstack] QEMU/KVM crash when mixing cpu_policy:dedicated and
> non-dedicated flavors?
> >
> > Hi
> > I just noticed a strange (?) issue when I tried to create an instance
> with
> > a flavor with hw:cpu_policy=dedicated. The instance failed with error:
> >
> > Unable to read from monitor: Connection reset by peer', u'code': 500,
> > u'details': u'  File
> > "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1926,
> in
> > _do_build_and_run_instance\nfilter_properties)
> > File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line
> 2116,
> > in _build_and_run_instance\ninstance_uuid=instance.uuid,
> > reason=six.text_type(e))
> >
> > And all other instances were shut down, even those living on another
> > compute host than the new one was scheduled to. A quick googling reveals
> > that this could be due to the hypervisor crashing (though why would it
> > crash on unrelated compute hosts??).
>
> Are there any more specific messages in the system logs or elsewhere?
> Check /var/log/libvirt/* in particular, though I suspect it will be the
> original source of the above message it may have some additional useful
> information earlier.
>
> >
> > The only odd thing here that I can think of was that the existing
> instances
> > did -not- use dedicated cpu policy -- can there be problems like this
> when
> > attempting to mix dedicated and non-dedicated policies?
>
> The main problem if you mix them *on the same node* is that Nova won't
> account properly for this when placing guests, the current design assumes
> that a node will be used either for "normal" instances (with CPU
> overcommit) or "dedicated" instances (no CPU overcommit, pinning) and the
> two will be separated via the use of host aggregates and flavors. This in
> and of itself should not result in a QEMU crash though it may eventually
> result in issues w.r.t. balancing of scheduling/placement decisions. If
> instances on other nodes went down at the same time I'd be looking for a
> broader issue, what is your storage and networking setup like?
>
> -Steve
>
> > This was with Mitaka.
> >
> > /Tomas
> >
> > ___
> > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/
> openstack
> > Post to : openstack@lists.openstack.org
> > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/
> openstack
> >
>
> --
> Steve Gordon,
> Principal Product Manager,
> Red Hat OpenStack Platform
>
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack