Re: [Openstack-operators] [User-committee] [Forum] Moderators needed!

2017-04-28 Thread UKASICK, ANDREW
Hi Shamail.

Alan Meadows will still be doing the Cloud-Native Design/Refactoring across 
OpenStack session.  He just forgot to confirm that.
Please accept my message here as his confirmation.

-Andy


From: Shamail Tahir [mailto:itzsham...@gmail.com]
Sent: Friday, April 28, 2017 7:23 AM
To: openstack-operators ; OpenStack 
Development Mailing List (not for usage questions) 
; user-committee 

Subject: [User-committee] [Forum] Moderators needed!

Hi everyone,

Most of the proposed/accepted Forum sessions currently have moderators but 
there are six sessions that do not have a confirmed moderator yet. Please look 
at the list below and let us know if you would be willing to help moderate any 
of these sessions.

The topics look really interesting but it will be difficult to keep the 
sessions on the schedule if there is not an assigned moderator. We look forward 
to seeing you at the Summit/Forum in Boston soon!

Achieving Resiliency at Scales of 1000+

Feedback from users for I18n & translation - important part?

Neutron Pain Points

Making Neutron easy for people who want basic networking

High Availability in OpenStack

Cloud-Native Design/Refactoring across OpenStack



Thanks,
Doug, Emilien, Melvin, Mike, Shamail & Tom
Forum Scheduling Committee
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Kevin Benton
With the network down, does ovs-vsctl show that it is connected to the
controller?
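
A quick way to check that from an affected host -- a minimal sketch, assuming
standard OVS tooling and that br-int / br-ex are the bridges in question:

# which controller each bridge points at
ovs-vsctl get-controller br-int
ovs-vsctl get-controller br-ex

# connection state of every configured controller
ovs-vsctl --columns=target,is_connected list controller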

On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich 
wrote:

> Exactly, we access via a tagged interface, which is part of br-ex
>
> # ip a show vlan171
> 16: vlan171:  mtu 9000 qdisc noqueue
> state UNKNOWN group default qlen 1
> link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
> inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>valid_lft forever preferred_lft forever
> inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>valid_lft forever preferred_lft forever
>
> # ovs-vsctl show
> ...
> Bridge br-ex
> Controller "tcp:127.0.0.1:6633"
> is_connected: true
> Port "vlan171"
> tag: 171
> Interface "vlan171"
> type: internal
> ...
>
>
> On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton  wrote:
>
>> Ok, that's likely not the issue then. I assume the way you access each
>> host is via an IP assigned to an OVS bridge or an interface that somehow
>> depends on OVS?
>>
>> On Apr 28, 2017 12:04, "Gustavo Randich" 
>> wrote:
>>
>>> Hi Kevin, we are using the default listen address of loopback interface:
>>>
>>> # grep -r of_listen_address /etc/neutron
>>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address =
>>> 127.0.0.1
>>>
>>>
>>> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
>>> -vconsole:emer -vsyslog:err -vfile:info 
>>> --remote=punix:/var/run/openvswitch/db.sock
>>> --private-key=db:Open_vSwitch,SSL,private_key
>>> --certificate=db:Open_vSwitch,SSL,certificate
>>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>>> --log-file=/var/log/openvswitch/ovsdb-server.log
>>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:
>>>
 Are you using an of_listen_address value of an interface being brought
 down?

 On Apr 25, 2017 17:34, "Gustavo Randich" 
 wrote:

> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>
> This sounds very strange (to me): recently, after a switch outage, we
> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
> host and restart networking service to regain access. Then restart
> neutron-openvswitch-agent to regain access to VMs.
>
> At first glance we thought it was a problem with the NIC linux driver
> of the hosts not detecting link state correctly.
>
> Then we reproduced the issue simply bringing down physical interfaces
> for around 5 minutes, then up again. Same issue.
>
> And then we found that if instead of using native (ryu) OpenFlow
> interface in Neutron Openvswitch we used ovs-ofctl, the problem 
> disappears.
>
> Any clue?
>
> Thanks in advance.
>
>
> ___
> Mailing list: http://lists.openstack.org/cgi
> -bin/mailman/listinfo/openstack
> Post to : openst...@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi
> -bin/mailman/listinfo/openstack
>
>
>>>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Matt Riedemann

On 4/28/2017 11:19 AM, Eric Fried wrote:

If it's *just* glance we're making an exception for, I prefer #1 (don't
deprecate/remove [glance]api_servers).  It's way less code &
infrastructure, and it discourages others from jumping on the
multiple-endpoints bandwagon.  If we provide endpoint_override_list
(handwave), people will think it's okay to use it.


If SSL and URLs work like mdorman said they do, then this works for me. 
I like keeping it simple.


--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Gustavo Randich
Exactly, we access via a tagged interface, which is part of br-ex

# ip a show vlan171
16: vlan171:  mtu 9000 qdisc noqueue state
UNKNOWN group default qlen 1
link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
   valid_lft forever preferred_lft forever
inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
   valid_lft forever preferred_lft forever

# ovs-vsctl show
...
Bridge br-ex
Controller "tcp:127.0.0.1:6633"
is_connected: true
Port "vlan171"
tag: 171
Interface "vlan171"
type: internal
...


On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton  wrote:

> Ok, that's likely not the issue then. I assume the way you access each
> host is via an IP assigned to an OVS bridge or an interface that somehow
> depends on OVS?
>
> On Apr 28, 2017 12:04, "Gustavo Randich" 
> wrote:
>
>> Hi Kevin, we are using the default listen address of loopback interface:
>>
>> # grep -r of_listen_address /etc/neutron
>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address =
>> 127.0.0.1
>>
>>
>> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
>> -vconsole:emer -vsyslog:err -vfile:info 
>> --remote=punix:/var/run/openvswitch/db.sock
>> --private-key=db:Open_vSwitch,SSL,private_key
>> --certificate=db:Open_vSwitch,SSL,certificate
>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>> --log-file=/var/log/openvswitch/ovsdb-server.log
>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>
>> Thanks
>>
>>
>>
>>
>> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:
>>
>>> Are you using an of_listen_address value of an interface being brought
>>> down?
>>>
>>> On Apr 25, 2017 17:34, "Gustavo Randich" 
>>> wrote:
>>>
 (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)

 This sounds very strange (to me): recently, after a switch outage, we
 lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
 host and restart networking service to regain access. Then restart
 neutron-openvswitch-agent to regain access to VMs.

 At first glance we thought it was a problem with the NIC linux driver
 of the hosts not detecting link state correctly.

 Then we reproduced the issue simply bringing down physical interfaces
 for around 5 minutes, then up again. Same issue.

 And then we found that if instead of using native (ryu) OpenFlow
 interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.

 Any clue?

 Thanks in advance.


 ___
 Mailing list: http://lists.openstack.org/cgi
 -bin/mailman/listinfo/openstack
 Post to : openst...@lists.openstack.org
 Unsubscribe : http://lists.openstack.org/cgi
 -bin/mailman/listinfo/openstack


>>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Kevin Benton
Ok, that's likely not the issue then. I assume the way you access each host
is via an IP assigned to an OVS bridge or an interface that somehow depends
on OVS?

On Apr 28, 2017 12:04, "Gustavo Randich"  wrote:

> Hi Kevin, we are using the default listen address of loopback interface:
>
> # grep -r of_listen_address /etc/neutron
> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address =
> 127.0.0.1
>
>
> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
> -vconsole:emer -vsyslog:err -vfile:info 
> --remote=punix:/var/run/openvswitch/db.sock
> --private-key=db:Open_vSwitch,SSL,private_key
> --certificate=db:Open_vSwitch,SSL,certificate 
> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
> --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>
> Thanks
>
>
>
>
> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:
>
>> Are you using an of_listen_address value of an interface being brought
>> down?
>>
>> On Apr 25, 2017 17:34, "Gustavo Randich" 
>> wrote:
>>
>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>>
>>> This sounds very strange (to me): recently, after a switch outage, we
>>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>>> host and restart networking service to regain access. Then restart
>>> neutron-openvswitch-agent to regain access to VMs.
>>>
>>> At first glance we thought it was a problem with the NIC linux driver of
>>> the hosts not detecting link state correctly.
>>>
>>> Then we reproduced the issue simply bringing down physical interfaces
>>> for around 5 minutes, then up again. Same issue.
>>>
>>> And then we found that if instead of using native (ryu) OpenFlow
>>> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>>>
>>> Any clue?
>>>
>>> Thanks in advance.
>>>
>>>
>>> ___
>>> Mailing list: http://lists.openstack.org/cgi
>>> -bin/mailman/listinfo/openstack
>>> Post to : openst...@lists.openstack.org
>>> Unsubscribe : http://lists.openstack.org/cgi
>>> -bin/mailman/listinfo/openstack
>>>
>>>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] looking for feedback on proposals to improve logging

2017-04-28 Thread Doug Hellmann
Excerpts from Nematollah Bidokhti's message of 2017-04-27 22:30:34 +:
> Hi,
> 
> I have been working on the concept of fault management blueprint to increase 
> cloud resiliency. As part of this proposal, info such as logs, KPIs, health 
> checks and so on are critical since we are engaging in deep data analysis and 
> machine learning.
> 
> To ease the data analysis process there must be consistent logs. This comes 
> with having IDs and severity properties.
> 
> Cloud resiliency in general requires fast fault detection, isolation and 
> recovery. In addition, there are applications, such as NFV, that are sensitive 
> to fast fault detection and recovery. One approach is to have meaningful logs 
> where, by parsing the data, we can make real-time fault management decisions.
> 
> Similar to interrupts, I would like us to have logging hierarchy which can 
> help an automated fault management system to take accurate and appropriate 
> actions. The format of the logs is important since it will ease the ML 
> analysis later in the process.

I'm not sure what you mean by "logging hierarchy". Do you mean the
severity levels that we have (like INFO, WARNING, and ERROR), or
something else?

Are you using the JSON formatter to make the logs easier to parse for
the automated processing you're doing? If so, did you find that complex
to configure? If not, were you aware that was possible and if you were,
what caused you to decide not to use it? Does it not match your needs?
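
For anyone who hasn't tried it, wiring that up is usually just two pieces -- a
minimal sketch, assuming stock oslo.log plus a standard Python logging config
(paths and section names are only examples, and the remaining loggers/handlers
sections are elided):

# nova.conf (or any oslo.config-based service): point it at a logging config
[DEFAULT]
log_config_append = /etc/nova/logging.conf

# /etc/nova/logging.conf (excerpt): attach oslo.log's JSON formatter to
# whichever handler you already use
[formatter_json]
class = oslo_log.formatters.JSONFormatter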

Doug

> 
> Thanks,
> Nemat
> 
> -Original Message-
> From: Doug Hellmann [mailto:d...@doughellmann.com] 
> Sent: Wednesday, April 26, 2017 7:28 AM
> To: openstack-operators 
> Subject: [Openstack-operators] looking for feedback on proposals to improve 
> logging
> 
> I am looking for some feedback on two new proposals to add IDs to log 
> messages. Please see the thread on openstack-dev, and comment there or on the 
> specs referenced there.
> 
> http://lists.openstack.org/pipermail/openstack-dev/2017-April/115958.html
> 
> Thanks!
> Doug
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] looking for feedback on proposals to improve logging

2017-04-28 Thread Tim Bell

There is an interesting proposal developing as part of 
https://review.openstack.org/460112 which it would be great to get more 
feedback on from other operators.

My view is that OpenStack would benefit greatly from a troubleshooting guide. I 
remember my days with AIX (long ago) when a guide had the classic IBM 
information on “if you get this, this is why, you need to check this and here 
is how to fix it”.

Developing a framework for every message to have an ID would be a significant 
change and would require a major investment of effort.

The proposal outlined there is that we use the Python Exception framework, with 
the exception name as the basis for troubleshooting and googling for errors. While 
there is no guarantee of uniqueness, this gives a basis for developing the 
appropriate guides.

Feel free to provide your input on the review,
Tim

On 28/04/17 00:30, "Nematollah Bidokhti"  wrote:

Hi,

I have been working on the concept of fault management blueprint to 
increase cloud resiliency. As part of this proposal, info such as logs, KPIs, 
health checks and so on are critical since we are engaging in deep data 
analysis and machine learning.
>> 
>> To ease the data analysis process there must be consistent logs. This 
comes with having IDs and severity properties.
>> 
>> Cloud resiliency in general requires fast fault detection, isolation and 
recovery. In addition, there are applications, such as NFV, that are sensitive to 
fast fault detection and recovery. One approach is to have meaningful logs 
where, by parsing the data, we can make real-time fault management decisions.
>> 
>> Similar to interrupts, I would like us to have logging hierarchy which 
can help an automated fault management system to take accurate and appropriate 
actions. The format of the logs is important since it will ease the ML analysis 
later in the process.
>> 
>> Thanks,
>> Nemat

-Original Message-
From: Doug Hellmann [mailto:d...@doughellmann.com] 
Sent: Wednesday, April 26, 2017 7:28 AM
To: openstack-operators 
Subject: [Openstack-operators] looking for feedback on proposals to improve 
logging

I am looking for some feedback on two new proposals to add IDs to log 
messages. Please see the thread on openstack-dev, and comment there or on the 
specs referenced there.

http://lists.openstack.org/pipermail/openstack-dev/2017-April/115958.html

Thanks!
Doug

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Eric Fried
If it's *just* glance we're making an exception for, I prefer #1 (don't
deprecate/remove [glance]api_servers).  It's way less code &
infrastructure, and it discourages others from jumping on the
multiple-endpoints bandwagon.  If we provide endpoint_override_list
(handwave), people will think it's okay to use it.

Anyone aware of any other services that use multiple endpoints?

On 04/28/2017 10:46 AM, Mike Dorman wrote:
> Maybe we are talking about two different things here?  I’m a bit confused.
> 
> Our Glance config in nova.conf on HV’s looks like this:
> 
> [glance]
> api_servers=http://glance1:9292,http://glance2:9292,http://glance3:9292,http://glance4:9292
> glance_api_insecure=True
> glance_num_retries=4
> glance_protocol=http
> 
> So we do provide the full URLs, and there is SSL support.  Right?  I am 
> fairly certain we tested this to ensure that if one URL fails, nova goes on 
> to retry the next one.  That failure does not get bubbled up to the user 
> (which is ultimately the goal.)
> 
> I don’t disagree with you that the client side choose-a-server-at-random is 
> not a great load balancer.  (But isn’t this roughly the same thing that 
> oslo-messaging does when we give it a list of RMQ servers?)  For us it’s more 
> about the failure handling if one is down than it is about actually equally 
> distributing the load.
> 
> In my mind options One and Two are the same, since today we are already 
> providing full URLs and not only server names.  At the end of the day, I 
> don’t feel like there is a compelling argument here to remove this 
> functionality (that people are actively making use of.)
> 
> To be clear, I, and I think others, are fine with nova by default getting the 
> Glance endpoint from Keystone.  And that in Keystone there should exist only 
> one Glance endpoint.  What I’d like to see remain is the ability to override 
> that for nova-compute and to target more than one Glance URL for purposes of 
> fail over.
> 
> Thanks,
> Mike
> 
> 
> 
> 
> On 4/28/17, 8:20 AM, "Monty Taylor"  wrote:
> 
> Thank you both for your feedback - that's really helpful.
> 
> Let me say a few more words about what we're trying to accomplish here 
> overall so that maybe we can figure out what the right way forward is. 
> (it may be keeping the glance api servers setting, but let me at least 
> make the case real quick)
> 
>  From a 10,000 foot view, the thing we're trying to do is to get nova's 
> consumption of all of the OpenStack services it uses to be less special.
> 
> The clouds have catalogs which list information about the services - 
> public, admin and internal endpoints and whatnot - and then we're asking 
> admins to not only register that information with the catalog, but to 
> also put it into the nova.conf. That means that any updating of that 
> info needs to be an API call to keystone and also a change to nova.conf. 
> If we, on the other hand, use the catalog, then nova can pick up changes 
> in real time as they're rolled out to the cloud - and there is hopefully 
> a sane set of defaults we could choose (based on operator feedback like 
> what you've given) so that in most cases you don't have to tell nova 
> where to find glance _at_all_ because the cloud already knows where it 
> is. (nova would know to look in the catalog for the internal interface of 
> the image service - for instance - there's no need to ask an operator to 
> add to the config "what is the service_type of the image service we 
> should talk to" :) )
> 
> Now - glance, and the thing you like that we don't - is especially hairy 
> because of the api_servers list. The list, as you know, is just a list 
> of servers, not even of URLs. This  means it's not possible to configure 
> nova to talk to glance over SSL (which I know you said works for you, 
> but we'd like for people to be able to choose to SSL all their things) 
> We could add that, but it would be an additional pile of special config. 
> Because of all of that, we also have to attempt to make working URLs 
> from what is usually a list of IP addresses. This is also clunky and 
> prone to failure.
> 
> The implementation on the underside of the api_servers code is the 
> world's dumbest load balancer. It picks a server from the  list at 
> random and uses it. There is no facility for dealing with a server in 
> the list that stops working or for allowing rolling upgrades like there 
> would with a real load-balancer across the set. If one of the API 
> servers goes away, we have no context to know that, so just some of your 
> internal calls to glance fail.
> 
> Those are the issues - basically:
> - current config is special and fragile
> - impossible to SSL
> - unflexible/unpowerful de-facto software loadbalancer
> 
> Now - as is often the case - it turns out the

[Openstack-operators] [docs] OpenStack documentation: Forum session

2017-04-28 Thread Ildiko Vancsa
Hi All,

I’m reaching out to you to draw your attention to a Forum topic we proposed 
with Alex (OpenStack Manuals PTL): 
https://www.openstack.org/summit/boston-2017/summit-schedule/events/18939/openstack-documentation-the-future-depends-on-all-of-us


As the documentation team is heavily affected by the OSIC news we are looking 
for more involvement from both our user and developer community. The 
documentation team is looking for new contributors and also options to 
restructure some of the guides and how the team operates.

Considering the importance of good and reliable documentation we hope to have 
many of you joining us on Monday in Boston. You can add ideas or questions to 
the Forum etherpad to help us prepare: 
https://etherpad.openstack.org/p/doc-future

If you have any questions or comments in advance please respond to this thread 
or reach out to me (IRC: ildikov) or Alex (IRC: asettle).

Thanks and Best Regards,
Ildikó & Alex
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Gustavo Randich
Hi Kevin, we are using the default listen address of loopback interface:

# grep -r of_listen_address /etc/neutron
/etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address =
127.0.0.1


tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
-vconsole:emer -vsyslog:err -vfile:info
--remote=punix:/var/run/openvswitch/db.sock
--private-key=db:Open_vSwitch,SSL,private_key
--certificate=db:Open_vSwitch,SSL,certificate
--bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
--log-file=/var/log/openvswitch/ovsdb-server.log
--pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

Thanks
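
For context on the native-vs-ovs-ofctl comparison elsewhere in this thread, both
knobs live in the same agent file -- a minimal sketch, assuming the stock Mitaka
option names and locations:

# /etc/neutron/plugins/ml2/openvswitch_agent.ini
[ovs]
# address the native (ryu) OpenFlow driver listens on; 127.0.0.1 is the default
of_listen_address = 127.0.0.1
# OpenFlow driver: 'native' (ryu) is the default since Mitaka; switching to the
# 'ovs-ofctl' CLI driver is the workaround that made the problem disappear here
of_interface = ovs-ofctl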




On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton  wrote:

> Are you using an of_listen_address value of an interface being brought
> down?
>
> On Apr 25, 2017 17:34, "Gustavo Randich" 
> wrote:
>
>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>>
>> This sounds very strange (to me): recently, after a switch outage, we
>> lost connectivity to all our Mitaka hosts. We had to enter via iLO host by
>> host and restart networking service to regain access. Then restart
>> neutron-openvswitch-agent to regain access to VMs.
>>
>> At first glance we thought it was a problem with the NIC linux driver of
>> the hosts not detecting link state correctly.
>>
>> Then we reproduced the issue simply bringing down physical interfaces for
>> around 5 minutes, then up again. Same issue.
>>
>> And then we found that if instead of using native (ryu) OpenFlow
>> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>>
>> Any clue?
>>
>> Thanks in advance.
>>
>>
>> ___
>> Mailing list: http://lists.openstack.org/cgi
>> -bin/mailman/listinfo/openstack
>> Post to : openst...@lists.openstack.org
>> Unsubscribe : http://lists.openstack.org/cgi
>> -bin/mailman/listinfo/openstack
>>
>>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Mike Dorman
Maybe we are talking about two different things here?  I’m a bit confused.

Our Glance config in nova.conf on HV’s looks like this:

[glance]
api_servers=http://glance1:9292,http://glance2:9292,http://glance3:9292,http://glance4:9292
glance_api_insecure=True
glance_num_retries=4
glance_protocol=http

So we do provide the full URLs, and there is SSL support.  Right?  I am fairly 
certain we tested this to ensure that if one URL fails, nova goes on to retry 
the next one.  That failure does not get bubbled up to the user (which is 
ultimately the goal.)

I don’t disagree with you that the client side choose-a-server-at-random is not 
a great load balancer.  (But isn’t this roughly the same thing that 
oslo-messaging does when we give it a list of RMQ servers?)  For us it’s more 
about the failure handling if one is down than it is about actually equally 
distributing the load.

In my mind options One and Two are the same, since today we are already 
providing full URLs and not only server names.  At the end of the day, I don’t 
feel like there is a compelling argument here to remove this functionality 
(that people are actively making use of.)

To be clear, I, and I think others, are fine with nova by default getting the 
Glance endpoint from Keystone.  And that in Keystone there should exist only 
one Glance endpoint.  What I’d like to see remain is the ability to override 
that for nova-compute and to target more than one Glance URL for purposes of 
fail over.

Thanks,
Mike




On 4/28/17, 8:20 AM, "Monty Taylor"  wrote:

Thank you both for your feedback - that's really helpful.

Let me say a few more words about what we're trying to accomplish here 
overall so that maybe we can figure out what the right way forward is. 
(it may be keeping the glance api servers setting, but let me at least 
make the case real quick)

 From a 10,000 foot view, the thing we're trying to do is to get nova's 
consumption of all of the OpenStack services it uses to be less special.

The clouds have catalogs which list information about the services - 
public, admin and internal endpoints and whatnot - and then we're asking 
admins to not only register that information with the catalog, but to 
also put it into the nova.conf. That means that any updating of that 
info needs to be an API call to keystone and also a change to nova.conf. 
If we, on the other hand, use the catalog, then nova can pick up changes 
in real time as they're rolled out to the cloud - and there is hopefully 
a sane set of defaults we could choose (based on operator feedback like 
what you've given) so that in most cases you don't have to tell nova 
where to find glance _at_all_ because the cloud already knows where it 
is. (nova would know to look in the catalog for the internal interface of 
the image service - for instance - there's no need to ask an operator to 
add to the config "what is the service_type of the image service we 
should talk to" :) )

Now - glance, and the thing you like that we don't - is especially hairy 
because of the api_servers list. The list, as you know, is just a list 
of servers, not even of URLs. This  means it's not possible to configure 
nova to talk to glance over SSL (which I know you said works for you, 
but we'd like for people to be able to choose to SSL all their things) 
We could add that, but it would be an additional pile of special config. 
Because of all of that, we also have to attempt to make working URLs 
from what is usually a list of IP addresses. This is also clunky and 
prone to failure.

The implementation on the underside of the api_servers code is the 
world's dumbest load balancer. It picks a server from the  list at 
random and uses it. There is no facility for dealing with a server in 
the list that stops working or for allowing rolling upgrades like there 
would with a real load-balancer across the set. If one of the API 
servers goes away, we have no context to know that, so just some of your 
internal calls to glance fail.

Those are the issues - basically:
- current config is special and fragile
- impossible to SSL
- unflexible/unpowerful de-facto software loadbalancer

Now - as is often the case - it turns out the combo of those things is 
working very well for you -so we need to adjust our thinking on the 
topic a bit. Let me toss out some alternatives and see what you think:

Alternative One - Do Both things

We add the new "consume from catalog" and make it default. (and make it 
default to consuming the internal interface by default) We have to do 
that in parallel with the current glance api_servers setting anyway, 
because of deprecation periods, so the code to support both approaches 
will exist. Instead of then deprecating the api_servers list, we keep 
  

Re: [Openstack-operators] [Forum] Moderators needed!

2017-04-28 Thread Michał Jastrzębski
I can moderate the HA session if you want (although there is one listed in the
schedule?). Feel free to sign me up.

On 28 April 2017 at 06:07, Jay Pipes  wrote:
> On 04/28/2017 08:22 AM, Shamail Tahir wrote:
>>
>> Hi everyone,
>>
>> Most of the proposed/accepted Forum sessions currently have moderators
>> but there are six sessions that do not have a confirmed moderator yet.
>> Please look at the list below and let us know if you would be willing to
>> help moderate any of these sessions.
>
>
> 
>
>> Cloud-Native Design/Refactoring across OpenStack
>>
>> 
>
>
> Hi Shamail,
>
> The one above looks like Alan (cc'd) is the moderator. :)
>
> Despite him having an awkwardly over-sized brain -- which unfortunately will
> limit the number of other people that can fit in the room -- I do think Alan
> will be a good moderator.
>
> Best,
> -jay
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Mike Dorman
Ok.  That would solve some of the problem for us, but we’d still be losing the 
redundancy.  We could do some HAProxy tricks to route around downed services, 
but it wouldn’t handle the case when that one physical box is down.
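
For illustration, the sort of HAProxy trick being alluded to might look roughly
like this -- hostnames, ports and check parameters are assumptions, not a
recommendation:

# /etc/haproxy/haproxy.cfg (excerpt)
listen glance_api_internal
    bind 127.0.0.1:9292
    balance roundrobin
    # mark a backend down when its API stops answering
    option httpchk GET /
    server glance1 glance1:9292 check inter 2000 fall 3
    server glance2 glance2:9292 check inter 2000 fall 3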

Is there some downside to allowing endpoint_override to remain a list?   That 
piece seems orthogonal to the spec and IRC discussion referenced, which are 
more around the service catalog.  I don’t think anyone in this thread is 
arguing against the idea that there should be just one endpoint URL in the 
catalog.  But it seems like there are good reasons to allow multiples on the 
override setting (at least for glance in nova-compute.)

Thanks,
Mike



On 4/28/17, 8:05 AM, "Eric Fried"  wrote:

Blair, Mike-

There will be an endpoint_override that will bypass the service
catalog.  It still only takes one URL, though.

Thanks,
Eric (efried)

On 04/27/2017 11:50 PM, Blair Bethwaite wrote:
> We at Nectar are in the same boat as Mike. Our use-case is a little
> bit more about geo-distributed operations though - our Cells are in
> different States around the country, so the local glance-apis are
> particularly important for caching popular images close to the
> nova-computes. We consider these glance-apis as part of the underlying
> cloud infra rather than user-facing, so I think we'd prefer not to see
> them in the service-catalog returned to users either... is there going
> to be a (standard) way to hide them?
> 
> On 28 April 2017 at 09:15, Mike Dorman  wrote:
>> We make extensive use of the [glance]/api_servers list.  We configure 
that on hypervisors to direct them to Glance servers which are more “local” 
network-wise (in order to reduce network traffic across security 
zones/firewalls/etc.)  This way nova-compute can fail over in case one of the 
Glance servers in the list is down, without putting them behind a load 
balancer.  We also don’t run https for these “internal” Glance calls, to save 
the overhead when transferring images.
>>
>> End-user calls to Glance DO go through a real load balancer and then are 
distributed out to the Glance servers on the backend.  From the end-user’s 
perspective, I totally agree there should be one, and only one URL.
>>
>> However, we would be disappointed to see the change you’re suggesting 
implemented.  We would lose the redundancy we get now by providing a list.  Or 
we would have to shunt all the calls through the user-facing endpoint, which 
would generate a lot of extra traffic (in places where we don’t want it) for 
image transfers.
>>
>> Thanks,
>> Mike
>>
>>
>>
>> On 4/27/17, 4:02 PM, "Matt Riedemann"  wrote:
>>
>> On 4/27/2017 4:52 PM, Eric Fried wrote:
>> > Y'all-
>> >
>> >   TL;DR: Does glance ever really need/use multiple endpoint URLs?
>> >
>> >   I'm working on bp use-service-catalog-for-endpoints[1], which 
intends
>> > to deprecate disparate conf options in various groups, and 
centralize
>> > acquisition of service endpoint URLs.  The idea is to introduce
>> > nova.utils.get_service_url(group) -- note singular 'url'.
>> >
>> >   One affected conf option is [glance]api_servers[2], which 
currently
>> > accepts a *list* of endpoint URLs.  The new API will only ever 
return *one*.
>> >
>> >   Thus, as planned, this blueprint will have the side effect of
>> > deprecating support for multiple glance endpoint URLs in Pike, and
>> > removing said support in Queens.
>> >
>> >   Some have asserted that there should only ever be one endpoint 
URL for
>> > a given service_type/interface combo[3].  I'm fine with that - it
>> > simplifies things quite a bit for the bp impl - but wanted to make 
sure
>> > there were no loudly-dissenting opinions before we get too far 
down this
>> > path.
>> >
>> > [1]
>> > 
https://blueprints.launchpad.net/nova/+spec/use-service-catalog-for-endpoints
>> > [2]
>> > 
https://github.com/openstack/nova/blob/7e7bdb198ed6412273e22dea72e37a6371fce8bd/nova/conf/glance.py#L27-L37
>> > [3]
>> > 
http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2017-04-27.log.html#t2017-04-27T20:38:29
>> >
>> > Thanks,
>> > Eric Fried (efried)
>> > .
>> >
>> > 
__
>> > OpenStack Development Mailing List (not for usage questions)
>> > Unsubscribe: 
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>>
>> +openstack-operators
>>
>> --
>>
>>

Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Matt Riedemann

On 4/28/2017 9:05 AM, Eric Fried wrote:

Blair, Mike-

There will be an endpoint_override that will bypass the service
catalog.  It still only takes one URL, though.

Thanks,
Eric (efried)



Eric,

I think the answer we're getting from users (operators) is please don't 
remove the ability to define [glance]api_servers because we actually 
really rely on it for valid reasons.


Given that, isn't it reasonable to leave that backdoor option in the 
code, with it defaulting to off, but if specified we use that, otherwise 
we fallback to the 'normal' service catalog lookup path with a single 
image endpoint URL?
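
Sketching what that precedence might look like from the operator side -- note
that endpoint_override is the not-yet-merged option Eric mentioned, so its name
and behaviour below are assumptions, not the final interface:

[glance]
# backdoor kept for operators who need client-side failover (existing option):
#api_servers = http://glance1:9292,http://glance2:9292
# otherwise, a single URL that bypasses the catalog lookup (proposed):
#endpoint_override = https://image.internal.example.com:9292
# with neither set, nova would fall back to the Keystone catalog's image endpoint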


--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Eric Fried
Blair, Mike-

There will be an endpoint_override that will bypass the service
catalog.  It still only takes one URL, though.

Thanks,
Eric (efried)

On 04/27/2017 11:50 PM, Blair Bethwaite wrote:
> We at Nectar are in the same boat as Mike. Our use-case is a little
> bit more about geo-distributed operations though - our Cells are in
> different States around the country, so the local glance-apis are
> particularly important for caching popular images close to the
> nova-computes. We consider these glance-apis as part of the underlying
> cloud infra rather than user-facing, so I think we'd prefer not to see
> them in the service-catalog returned to users either... is there going
> to be a (standard) way to hide them?
> 
> On 28 April 2017 at 09:15, Mike Dorman  wrote:
>> We make extensive use of the [glance]/api_servers list.  We configure that 
>> on hypervisors to direct them to Glance servers which are more “local” 
>> network-wise (in order to reduce network traffic across security 
>> zones/firewalls/etc.)  This way nova-compute can fail over in case one of 
>> the Glance servers in the list is down, without putting them behind a load 
>> balancer.  We also don’t run https for these “internal” Glance calls, to 
>> save the overhead when transferring images.
>>
>> End-user calls to Glance DO go through a real load balancer and then are 
>> distributed out to the Glance servers on the backend.  From the end-user’s 
>> perspective, I totally agree there should be one, and only one URL.
>>
>> However, we would be disappointed to see the change you’re suggesting 
>> implemented.  We would lose the redundancy we get now by providing a list.  
>> Or we would have to shunt all the calls through the user-facing endpoint, 
>> which would generate a lot of extra traffic (in places where we don’t want 
>> it) for image transfers.
>>
>> Thanks,
>> Mike
>>
>>
>>
>> On 4/27/17, 4:02 PM, "Matt Riedemann"  wrote:
>>
>> On 4/27/2017 4:52 PM, Eric Fried wrote:
>> > Y'all-
>> >
>> >   TL;DR: Does glance ever really need/use multiple endpoint URLs?
>> >
>> >   I'm working on bp use-service-catalog-for-endpoints[1], which intends
>> > to deprecate disparate conf options in various groups, and centralize
>> > acquisition of service endpoint URLs.  The idea is to introduce
>> > nova.utils.get_service_url(group) -- note singular 'url'.
>> >
>> >   One affected conf option is [glance]api_servers[2], which currently
>> > accepts a *list* of endpoint URLs.  The new API will only ever return 
>> *one*.
>> >
>> >   Thus, as planned, this blueprint will have the side effect of
>> > deprecating support for multiple glance endpoint URLs in Pike, and
>> > removing said support in Queens.
>> >
>> >   Some have asserted that there should only ever be one endpoint URL 
>> for
>> > a given service_type/interface combo[3].  I'm fine with that - it
>> > simplifies things quite a bit for the bp impl - but wanted to make sure
>> > there were no loudly-dissenting opinions before we get too far down 
>> this
>> > path.
>> >
>> > [1]
>> > 
>> https://blueprints.launchpad.net/nova/+spec/use-service-catalog-for-endpoints
>> > [2]
>> > 
>> https://github.com/openstack/nova/blob/7e7bdb198ed6412273e22dea72e37a6371fce8bd/nova/conf/glance.py#L27-L37
>> > [3]
>> > 
>> http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2017-04-27.log.html#t2017-04-27T20:38:29
>> >
>> > Thanks,
>> > Eric Fried (efried)
>> > .
>> >
>> > 
>> __
>> > OpenStack Development Mailing List (not for usage questions)
>> > Unsubscribe: 
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>>
>> +openstack-operators
>>
>> --
>>
>> Thanks,
>>
>> Matt
>>
>> 
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: 
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [scientific] Lightning talks on Scientific OpenStack

2017-04-28 Thread George Mihaiescu
Thanks Stig,

I added a presentation to the schedule.


Cheers,
George



On Thu, Apr 27, 2017 at 3:49 PM, Stig Telfer 
wrote:

> Hi George -
>
> Sorry for the slow response.  The consensus was for 8 minutes maximum.
> That should be plenty for a lightning talk, and enables us to fit one more
> in.
>
> Best wishes,
> Stig
>
>
> > On 27 Apr 2017, at 20:29, George Mihaiescu  wrote:
> >
> > Hi Stig, will it be 10-minute sessions like in Barcelona?
> >
> > Thanks,
> > George
> >
> >> On Apr 26, 2017, at 03:31, Stig Telfer 
> wrote:
> >>
> >> Hi All -
> >>
> >> We have planned a session of lightning talks at the Boston summit to
> discuss topics specific for OpenStack and research computing applications.
> This was a great success at Barcelona and generated some stimulating
> discussion.  We are also hoping for a small prize for the best talk of the
> session!
> >>
> >> This is the event:
> >> https://www.openstack.org/summit/boston-2017/summit-
> schedule/events/18676
> >>
> >> If you’d like to propose a talk, please add a title and your name here:
> >> https://etherpad.openstack.org/p/Scientific-WG-boston
> >>
> >> Everyone is welcome.
> >>
> >> Cheers,
> >> Stig
> >>
> >>
> >> ___
> >> OpenStack-operators mailing list
> >> OpenStack-operators@lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Forum] Moderators needed!

2017-04-28 Thread Jay Pipes

On 04/28/2017 08:22 AM, Shamail Tahir wrote:

Hi everyone,

Most of the proposed/accepted Forum sessions currently have moderators
but there are six sessions that do not have a confirmed moderator yet.
Please look at the list below and let us know if you would be willing to
help moderate any of these sessions.





Cloud-Native Design/Refactoring across OpenStack



Hi Shamail,

The one above looks like Alan (cc'd) is the moderator. :)

Despite him having an awkwardly over-sized brain -- which unfortunately 
will limit the number of other people that can fit in the room -- I do 
think Alan will be a good moderator.


Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread John Garbutt
On 28 April 2017 at 12:17, Sean Dague  wrote:
> On 04/28/2017 12:50 AM, Blair Bethwaite wrote:
>> We at Nectar are in the same boat as Mike. Our use-case is a little
>> bit more about geo-distributed operations though - our Cells are in
>> different States around the country, so the local glance-apis are
>> particularly important for caching popular images close to the
>> nova-computes. We consider these glance-apis as part of the underlying
>> cloud infra rather than user-facing, so I think we'd prefer not to see
>> them in the service-catalog returned to users either... is there going
>> to be a (standard) way to hide them?
>
> In a situation like this, where Cells are geographically bounded, is
> there also a Region for that Cell/Glance?

So we had a slightly different case, where the Glance servers were
actually in the same rack as the hypervisors, so you had the
option to avoid inter-cell traffic for lots of the image downloads.

Another gain was not needing an expensive loadbalancer in front of
glance; looping around the API list spread the load enough (although a
loadbalancer doing a 302 redirect rather than being in the data path
worked rather well, though I think glanceclient wasn't too keen on
following the redirects on upload).

Thanks,
johnthetubaguy

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [Forum] Moderators needed!

2017-04-28 Thread Shamail Tahir
Hi everyone,

Most of the proposed/accepted Forum sessions currently have moderators but
there are six sessions that do not have a confirmed moderator yet. Please
look at the list below and let us know if you would be willing to help
moderate any of these sessions.

The topics look really interesting but it will be difficult to keep the
sessions on the schedule if there is not an assigned moderator. We look
forward to seeing you at the Summit/Forum in Boston soon!

Achieving Resiliency at Scales of 1000+

Feedback from users for I18n & translation - important part?

Neutron Pain Points

Making Neutron easy for people who want basic networking

High Availability in OpenStack

Cloud-Native Design/Refactoring across OpenStack



Thanks,
Doug, Emilien, Melvin, Mike, Shamail & Tom
Forum Scheduling Committee
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-04-28 Thread Sean Dague
On 04/28/2017 12:50 AM, Blair Bethwaite wrote:
> We at Nectar are in the same boat as Mike. Our use-case is a little
> bit more about geo-distributed operations though - our Cells are in
> different States around the country, so the local glance-apis are
> particularly important for caching popular images close to the
> nova-computes. We consider these glance-apis as part of the underlying
> cloud infra rather than user-facing, so I think we'd prefer not to see
> them in the service-catalog returned to users either... is there going
> to be a (standard) way to hide them?

In a situation like this, where Cells are geographically bounded, is
there also a Region for that Cell/Glance?

-Sean

-- 
Sean Dague
http://dague.net

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack] Strange: lost physical connectivity to compute hosts when using native (ryu) openflow interface

2017-04-28 Thread Kevin Benton
Are you using an of_listen_address value of an interface being brought
down?

On Apr 25, 2017 17:34, "Gustavo Randich"  wrote:

> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN / l2_population)
>
> This sounds very strange (to me): recently, after a switch outage, we lost
> connectivity to all our Mitaka hosts. We had to enter via iLO host by host
> and restart networking service to regain access. Then restart
> neutron-openvswitch-agent to regain access to VMs.
>
> At first glance we thought it was a problem with the NIC linux driver of
> the hosts not detecting link state correctly.
>
> Then we reproduced the issue simply bringing down physical interfaces for
> around 5 minutes, then up again. Same issue.
>
> And then we found that if instead of using native (ryu) OpenFlow
> interface in Neutron Openvswitch we used ovs-ofctl, the problem disappears.
>
> Any clue?
>
> Thanks in advance.
>
>
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/
> openstack
> Post to : openst...@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/
> openstack
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators