Re: [Openstack-operators] [openstack-dev] [Nova] [Cells] Stupid question: Cells v2 & AZs

2017-05-23 Thread Belmiro Moreira
Hi David,
AZs are basically aggregates.
In cells v2, aggregates are defined at the API level, so it will be possible
to have multiple AZs per cell and AZs that span different cells.
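For illustration, a minimal sketch of the "AZs are aggregates" point using
python-novaclient; the keystoneauth session setup, credentials, host names
and cell assignments below are placeholders:

    from keystoneauth1 import loading, session
    from novaclient import client

    loader = loading.get_plugin_loader('password')
    auth = loader.load_from_options(auth_url='http://controller:5000/v3',
                                    username='admin', password='secret',
                                    project_name='admin',
                                    user_domain_name='Default',
                                    project_domain_name='Default')
    nova = client.Client('2.1', session=session.Session(auth=auth))

    # The aggregate (and hence the AZ) lives at the API level in cells v2,
    # so the hosts added here may be mapped to different cells.
    agg = nova.aggregates.create('agg-az-a', availability_zone='az-a')
    nova.aggregates.add_host(agg, 'compute-01')  # e.g. mapped to cell1
    nova.aggregates.add_host(agg, 'compute-42')  # e.g. mapped to cell2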

Belmiro

On Wed, May 24, 2017 at 5:14 AM, David Medberry 
wrote:

> Hi Devs and Implementers,
>
> A question came up tonight in the Colorado OpenStack meetup regarding
> cells v2 and availability zones.
>
> Can a cell contain multiple AZs? (I assume this is yes.)
>
> Can an AZ contain multiple cells? (I assumed this is no, but now in
> thinking about it, that's probably not right.)
>
> What's the proper way to think about this? In general, I'm considering AZs
> primarily as a fault zone type of mechanism (though they can be used in
> other ways.)
>
> Is there a clear diagram/documentation about this?
>
> And consider this to be an Ocata/Pike and later only type of question.
>
> Thanks.
>
> -dave
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [Nova] [Cells] Stupid question: Cells v2 & AZs

2017-05-23 Thread David Medberry
Hi Devs and Implementers,

A question came up tonight in the Colorado OpenStack meetup regarding cells
v2 and availability zones.

Can a cell contain multiple AZs? (I assume this is yes.)

Can an AZ contain multiple cells? (I assumed this is no, but now in thinking
about it, that's probably not right.)

What's the proper way to think about this? In general, I'm considering AZs
primarily as a fault zone type of mechanism (though they can be used in
other ways.)

Is there a clear diagram/documentation about this?

And consider this to be an Ocata/Pike and later only type of question.

Thanks.

-dave
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: NOT Getting rid of the automated reschedule functionality

2017-05-23 Thread Matt Riedemann

On 5/23/2017 7:01 PM, Jay Pipes wrote:

On 05/23/2017 07:06 PM, Blair Bethwaite wrote:

Thanks Jay,

I wonder whether there is an easy-ish way to collect stats about the
sorts of errors deployers see in that catchall, so that when this
comes back around in a release or two there might be some less
anecdotal data available...?


Don't worry, Blair. I'm going to code up a backdoor'd 
call-home-to-my-personal-cloud-server thing inside the catch Exception: 
block that automatically sends me all the operator's failure information.


OK, just kidding. I'll probably just emit some lovely WARNING messages 
into your logs.


Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


It doesn't look like we record an instance fault in this case, which 
probably makes sense until you get a NoValidHost, but even then I see 
some code which looks like it's setting variables for creating an 
instance fault at some point, but I don't see where that actually 
happens if you get a NoValidHost due to MaxRetriesExceeded.


We do send an instance.create.error notification, if anyone is listening 
for notifications and recording them anywhere. That data could be mined.
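For anyone who wants to mine that data, a rough sketch of such a listener
using oslo.messaging follows; the transport URL, the 'versioned_notifications'
topic and the priorities handled are assumptions that depend on your
deployment's notification settings:

    from oslo_config import cfg
    import oslo_messaging

    class CreateErrorEndpoint(object):
        """Collect instance.create.error notifications, whatever the priority."""

        def _record(self, publisher_id, event_type, payload):
            if event_type == 'instance.create.error':
                print(publisher_id, payload)  # mine/store the failure details here

        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            self._record(publisher_id, event_type, payload)

        def error(self, ctxt, publisher_id, event_type, payload, metadata):
            self._record(publisher_id, event_type, payload)

    transport = oslo_messaging.get_notification_transport(
        cfg.CONF, url='rabbit://guest:guest@controller:5672/')
    targets = [oslo_messaging.Target(topic='versioned_notifications')]
    listener = oslo_messaging.get_notification_listener(
        transport, targets, [CreateErrorEndpoint()])
    listener.start()
    listener.wait()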


--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: NOT Getting rid of the automated reschedule functionality

2017-05-23 Thread Jay Pipes

On 05/23/2017 07:06 PM, Blair Bethwaite wrote:

Thanks Jay,

I wonder whether there is an easy-ish way to collect stats about the
sorts of errors deployers see in that catchall, so that when this
comes back around in a release or two there might be some less
anecdotal data available...?


Don't worry, Blair. I'm going to code up a backdoor'd 
call-home-to-my-personal-cloud-server thing inside the catch Exception: 
block that automatically sends me all the operator's failure information.


OK, just kidding. I'll probably just emit some lovely WARNING messages 
into your logs.


Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: NOT Getting rid of the automated reschedule functionality

2017-05-23 Thread Blair Bethwaite
Thanks Jay,

I wonder whether there is an easy-ish way to collect stats about the
sorts of errors deployers see in that catchall, so that when this
comes back around in a release or two there might be some less
anecdotal data available...?

Cheers,

On 24 May 2017 at 06:43, Jay Pipes  wrote:
> Hello Dear Operators,
>
> OK, we've heard you loud and (mostly) clear. We won't remove the automated
> rescheduling behavior from Nova. While we will be removing the primary cause
> of reschedules (resource overconsumption races), we cannot yet eliminate the
> catchall exception handling on the compute node that triggers a retry of the
> instance launch. Nor will we be able to fully ameliorate the
> affinity/anti-affinity last-minute violation problems for at least another
> release.
>
> So, we'll continue to support basic retries within the originally-selected
> cell.
>
> Best,
> -jay
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



-- 
Cheers,
~Blairo

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Routed provider networks...

2017-05-23 Thread Curtis
On Mon, May 22, 2017 at 1:47 PM, Chris Marino  wrote:

> Hello operators, I will be talking about the new routed provider network
> features in OpenStack at a Meetup next week and would
> like to get a better sense of how provider networks are currently being
> used and if anyone has deployed routed provider networks?
>

We use provider networks to essentially take neutron-l3 out of the
equation. Generally they are shared on all compute hosts, but usually there
aren't huge numbers of computes.

I have not deployed routed provider networks yet, but I think the premise
is great, and the /23 per rack is probably what we would do with routed
provider networks for multi-rack deployments. Combining routed provider
networks with Cells V2 (though I'm not sure how well the two work together,
and only once Cells V2 is completely done) could be quite powerful IMHO.


>
> A typical L2 provider network is deployed as VLANs to every host. But
> curious to know how many hosts or VMs an operator would allow on this
> network before you wanted to split into segments? Would you split hosts
> between VLANs, or trunk the VLANs to all hosts? How do you handle
> scheduling VMs across two provider networks?
>
> If you were to go with L3 provider networks, would it be L3 to the ToR, or
> L3 to the host?
>
> Are the new routed provider network features useful in their current form?
>
> Any experience you can share would be very helpful.
> CM
>
>
> ᐧ
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>


-- 
Blog: serverascode.com
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [MassivelyDistributed] IRC Meeting tomorrow15:00 UTC

2017-05-23 Thread lebre . adrien
Dear all, 

A gentle reminder for our meeting tomorrow. 
As usual, the agenda is available at: 
https://etherpad.openstack.org/p/massively_distributed_ircmeetings_2017 (line 
597)
Please feel free to add items.

Best, 
ad_rien_

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Routed provider networks...

2017-05-23 Thread Chris Marino
Kevin, should have been more clear

For the specific operator that is running L3 to host, with only a few /20
blocks...dynamic routing is absolutely necessary.

The /16 scenario you describe is totally fine without it.

CM



On Tue, May 23, 2017 at 2:40 PM, Kevin Benton  wrote:

> >Dynamic routing is absolutely necessary, though. Large blocks of RFC 1918
> addresses are scarce, even inside the DC.
>
> I just described a 65 thousand VM topology and it used a /16. Dynamic
> routing is not necessary or even helpful in this scenario if you plan on
> ever running close to your max server density.
>
> Routed networks allow you to size your subnets specifically to the
> maximum number of VMs you can support in a segment, so there is very little
> IP waste once you actually start to use your servers to run VMs.
>
> On Tue, May 23, 2017 at 6:38 AM, Chris Marino  wrote:
>
>> On Mon, May 22, 2017 at 9:12 PM, Kevin Benton  wrote:
>>
>>> The operators that were asking for the spec were using private IP space
>>> and that is probably going to be the most common use case for routed
>>> networks. Splitting a /21 up across the entire data center isn't really
>>> something you would want to do because you would run out of IPs quickly
>>> like you mentioned.
>>>
>>> The use case for routed networks is almost exactly like your Romana project.
>>> For example, you have a large chunk of IPs (e.g. 10.0.0.0/16) and
>>> you've set up the infrastructure so each rack gets a /23 with the ToR as the
>>> gateway which would buy you 509 VMs across 128 racks.
>>>
>>
>> Yes, it is. That's what brought me back to this. Working with an operator
>> that's using L2 provider networks today, but will bring L3 to host in their
>> new design.
>>
>> Dynamic routing is absolutely necessary, though. Large blocks of RFC 1918
>> addresses are scarce, even inside the DC. VRFs and/or NAT just not an
>> option.
>>
>> CM
>>
>>
>>>
>>>
>>> On May 22, 2017 2:53 PM, "Chris Marino"  wrote:
>>>
>>> Thanks Jon, very helpful.
>>>
>>> I think a more common use case for provider networks (in enterprise,
>>> AFAIK) is that they'd have a small number of /20 or /21 networks (VLANs)
>>> that they would trunk to all hosts. The /21s are part of the larger
>>> datacenter network with segment firewalls and access to other datacenter
>>> resources (no NAT). Each functional area would get their own network (i.e.
>>> QA, Prod, Dev, Test, etc.) but users would have access to only certain
>>> networks.
>>>
>>> For various reasons, they're moving to spine/leaf L3 networks and they
>>> want to use the same provider network CIDRs with the new L3 network. While
>>> technically this is covered by the use case described in the spec,
>>> splitting a /21 into segments (i.e. one for each rack/ToR) severely limits
>>> the scheduler (since each rack only gets a part of the whole /21).
>>>
>>> This can be solved with route advertisement/distribution and/or IPAM
>>> coordination w/Nova, but this isn't possible today. Which brings me back to
>>> my earlier question, how useful are routed provider networks?
>>>
>>> CM
>>> ᐧ
>>>
>>> On Mon, May 22, 2017 at 1:08 PM, Jonathan Proulx 
>>> wrote:
>>>

 Not sure if this is what you're looking for but...

 For my private cloud in research environment we have a public provider
 network available to all projects.

 This is externally routed and has basically been in the same config
 since Folsom (currently we're up to Mitaka).  It provides public ipv4
 addresses. DHCP is done in neutron (of course); the lower portion of
 the allocated subnet is excluded from the dynamic range.  We allow
 users to register DNS names in this range (through pre-existing
 custom, external IPAM tools) and to specify the fixed ip address when
 launching VMs.

 This network typically has 1k VMs running. We've assigned a /18 to it,
 which is obviously overkill.

 A few projects also have provider networks plumbed in to bridge their
 legacy physical networks into OpenStack.  For these there's no dynamic
 range and users must specify a fixed ip; these are generally considered
 "a bad idea" and were used to facilitate dumping VMs from old Xen
 infrastructures into OpenStack with minimal changes.

 These are old patterns I wouldn't necessarily suggest anyone
 replicate, but they are the truth of my world...

 -Jon

 On Mon, May 22, 2017 at 12:47:01PM -0700, Chris Marino wrote:
 :Hello operators, I will be talking about the new routed provider network
 :features in OpenStack at a Meetup next week and would
 :like to get a better sense of how provider networks are currently being
 :used and if anyone has deployed routed provider networks?
 :
 :A typical L2 provider network is deployed as VLANs to ev

Re: [Openstack-operators] Routed provider networks...

2017-05-23 Thread Kevin Benton
>Dynamic routing is absolutely necessary, though. Large blocks of RFC 1918
addresses are scarce, even inside the DC.

I just described a 65 thousand VM topology and it used a /16. Dynamic
routing is not necessary or even helpful in this scenario if you plan on
ever running close to your max server density.

Routed networks allow you to size your subnets specifically to the maximum
number of VMs you can support in a segment, so there is very little IP
waste once you actually start to use your servers to run VMs.
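For reference, a quick sketch of the arithmetic behind that figure, using the
10.0.0.0/16 example from the quoted mail below (each /23 loses its network,
broadcast and gateway addresses):

    import ipaddress

    supernet = ipaddress.ip_network('10.0.0.0/16')
    racks = list(supernet.subnets(new_prefix=23))
    print(len(racks))                  # 128 per-rack segments
    print(racks[0].num_addresses - 3)  # 509 usable VM addresses per /23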

On Tue, May 23, 2017 at 6:38 AM, Chris Marino  wrote:

> On Mon, May 22, 2017 at 9:12 PM, Kevin Benton  wrote:
>
>> The operators that were asking for the spec were using private IP space
>> and that is probably going to be the most common use case for routed
>> networks. Splitting a /21 up across the entire data center isn't really
>> something you would want to do because you would run out of IPs quickly
>> like you mentioned.
>>
>> The use case for routed networks is almost exactly like your Romana project.
>> For example, you have a large chunk of IPs (e.g. 10.0.0.0/16) and you've
>> set up the infrastructure so each rack gets a /23 with the ToR as the
>> gateway which would buy you 509 VMs across 128 racks.
>>
>
> Yes, it is. That's what brought me back to this. Working with an operator
> that's using L2 provider networks today, but will bring L3 to host in their
> new design.
>
> Dynamic routing is absolutely necessary, though. Large blocks of RFC 1918
> addresses are scarce, even inside the DC. VRFs and/or NAT just not an
> option.
>
> CM
>
>
>>
>>
>> On May 22, 2017 2:53 PM, "Chris Marino"  wrote:
>>
>> Thanks Jon, very helpful.
>>
>> I think a more common use case for provider networks (in enterprise,
>> AFAIK) is that they'd have a small number of /20 or /21 networks (VLANs)
>> that they would trunk to all hosts. The /21s are part of the larger
>> datacenter network with segment firewalls and access to other datacenter
>> resources (no NAT). Each functional area would get their own network (i.e.
>> QA, Prod, Dev, Test, etc.) but users would have access to only certain
>> networks.
>>
>> For various reasons, they're moving to spine/leaf L3 networks and they
>> want to use the same provider network CIDRs with the new L3 network. While
>> technically this is covered by the use case described in the spec,
>> splitting a /21 into segments (i.e. one for each rack/ToR) severely limits
>> the scheduler (since each rack only gets a part of the whole /21).
>>
>> This can be solved with route advertisement/distribution and/or IPAM
>> coordination w/Nova, but this isn't possible today. Which brings me back to
>> my earlier question, how useful are routed provider networks?
>>
>> CM
>> ᐧ
>>
>> On Mon, May 22, 2017 at 1:08 PM, Jonathan Proulx 
>> wrote:
>>
>>>
>>> Not sure if this is what you're looking for but...
>>>
>>> For my private cloud in research environment we have a public provider
>>> network available to all projects.
>>>
>>> This is externally routed and has basically been in the same config
>>> since Folsom (currently we're up to Mitaka).  It provides public ipv4
>>> addresses. DHCP is done in neutron (of course); the lower portion of
>>> the allocated subnet is excluded from the dynamic range.  We allow
>>> users to register DNS names in this range (through pre-existing
>>> custom, external IPAM tools) and to specify the fixed ip address when
>>> launching VMs.
>>>
>>> This network typically has 1k VMs running. We've assigned a /18 to it,
>>> which is obviously overkill.
>>>
>>> A few projects also have provider networks plumbed in to bridge their
>>> legacy physical networks into OpenStack.  For these there's no dynamic
>>> range and users must specify a fixed ip; these are generally considered
>>> "a bad idea" and were used to facilitate dumping VMs from old Xen
>>> infrastructures into OpenStack with minimal changes.
>>>
>>> These are old patterns I wouldn't necessarily suggest anyone
>>> replicate, but they are the truth of my world...
>>>
>>> -Jon
>>>
>>> On Mon, May 22, 2017 at 12:47:01PM -0700, Chris Marino wrote:
>>> :Hello operators, I will be talking about the new routed provider network
>>> :>> outed-networks.html>
>>> :features in OpenStack at a Meetup
>>> :next week and would
>>> :like to get a better sense of how provider networks are currently being
>>> :used and if anyone has deployed routed provider networks?
>>> :
>>> :A typical L2 provider network is deployed as VLANs to every host. But
>>> :curious to know how many hosts or VMs an operator would allow on
>>> this
>>> :network before you wanted to split into segments? Would you split hosts
>>> :between VLANs, or trunk the VLANs to all hosts? How do you handle
>>> :scheduling VMs across two provider networks?
>>> :
>>> :If you were to go with L3 provider networks, would it be L3 to the ToR,
>>> or
>>> :L3 to the host?
>>> :
>>

Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Mike Bayer


per IRC deliberation, I'll do a patch in oslo.db:

* detect incoming URL that is like "mysql://" with no driver (I can set 
this up for all DBs, though excluding sqlite:// since that one is used 
by tests)


* log a warning referring to the issue, rather than a hard cut-over ("the 
big issue we'll likely run into with that is if these are older installs, 
chances are they don't have pymysql installed")   (though technically 
oslo.db could check for this too...).  For MySQL I'll make sure it 
refers to "pymysql" in the message.


as far as what versions to backport etc., needs to be decided.
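For illustration only (not the actual oslo.db patch), a minimal sketch of
that detection using SQLAlchemy's URL helpers; the function name and log
wording are placeholders:

    import logging

    from sqlalchemy.engine import url as sa_url

    LOG = logging.getLogger(__name__)

    def warn_if_driverless_mysql(connection):
        """Warn when a mysql:// URL names no DBAPI driver."""
        backend, _, driver = sa_url.make_url(connection).drivername.partition('+')
        if backend == 'mysql' and not driver:
            LOG.warning("Connection string %r does not specify a DBAPI driver; "
                        "the default C driver blocks under eventlet. "
                        "Consider 'mysql+pymysql://'.", connection)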


On 05/23/2017 04:38 PM, Mike Bayer wrote:



On 05/23/2017 04:17 PM, Doug Hellmann wrote:



On May 23, 2017, at 4:01 PM, Doug Hellmann > wrote:


Excerpts from Sean McGinnis's message of 2017-05-23 11:38:34 -0500:


This sounds like something we could fix completely by dropping the
use of the offending library. I know there was a lot of work done
to get pymysql support in place. It seems like we can finish that by
removing support for the old library and redirecting mysql://
connections to use pymysql.

Doug



I think that may be ideal. If there are known issues with the library,
and we have a different and well tested alternative that we know works,
it's probably easier all around to just redirect internally to use
pymysql.


Now we just have to find the code that's doing the mapping to the
driver. It doesn't seem to be oslo.db. Is it sqlalchemy?


Mike, do you have any insight into the best approach for this?


OK so the way this works is:

1. if you are using SQLAlchemy by itself, and you send a URL that is 
"mysql://user:pass@host/dbname", that omits the "+driver" portion, a 
default driver is selected; in the case of MySQL it uses the driver that 
imports under "import MySQLdb".  This is either mysqlclient or the 
older MySQL-python which it replaces; these are native drivers written 
in C, and the eventlet monkeypatching we use does not manage to modify 
these to act in a non-blocking fashion, so you get new kinds of 
deadlocks you wouldn't normally get when using eventlet.


2. If you send a URL like we want nowadays, 
"mysql+pymsql://user:pass@hsot/dbname", you get the pure-Python PyMySQL 
driver we've standardized upon, which works under eventlet 
monkeypatching and you don't get weird deadlocks of this nature.


3. The database URLs are inside of the .conf files for all the services, 
individually, such as nova.conf, neutron.conf, etc.   These got there 
based on the installer that one used, and from that point on I don't 
think they change (it's possible that installers like tripleo might be 
able to alter the files when you do an upgrade).


So the reason things "work" for people is that their installation / 
upgrade process has ensured that a MySQL database URL for a process that 
uses eventlet is of the form "mysql+pymysql://".   If that hasn't 
happened somewhere, then we'd have this problem.   I think the problem 
first and foremost needs to be "fixed" at this layer, e.g. 
"mysql+pymsql://" should preferably be explicit for as long as we use 
pure SQLAlchemy database URLs in our config files (e.g. these should 
either be fully correct URLs, or we shouldn't be using URLs if some 
other layer makes decisions about the database connection string).


On the "database management" side of things, e.g. projects that use 
oslo.db, we can look into failing an immediate assertion if a database 
backend but no driver is specified, e.g. to disable SQLAlchemy's usual 
"default driver" selection logic.   This is the minimum we should 
probably do here, however this will make existing installations that 
"sort of work" right now to "not work" at all until the configuration is 
fixed.


If we truly want to implicitly force the driver to be "pymysql" if 
"mysql" is present without a driver, we can do that also, but that feels 
kind of wrong to me; there are all kinds of things that might need to 
happen to database URLs and it would be unfortunate if we started just 
hardcoding driver decisions in oslo.db without solving the issue of 
configuration in a more general sense, not to mention it's misleading to 
continue to have full SQLAlchemy URLs in the conf files that get 
silently altered by a middle tier - better would be that the format of 
the database configuration changes to not be confused with this.   More 
flexible would be if there were some kind of "registry"-oriented 
configuration so that connection URLs across many services could be more 
centralized (this is how ODBC works for example), but that is also 
another layer of complexity.


I'd mostly want to understand how we have "mysql://" URLs floating 
around as the installers / upgraders should have taken care of that 
issue some time ago.  Otherwise we shouldn't have full database URLs in 
our .conf files if a middle layer is going to silently change them anyway.


Simplest fix here is of course if someone has the old style URLs, just 
fix them to be "correct".  I'm mostly comfortable with adding assertions 
for this but not as much silently "fixing" URLs.

[Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: NOT Getting rid of the automated reschedule functionality

2017-05-23 Thread Jay Pipes

Hello Dear Operators,

OK, we've heard you loud and (mostly) clear. We won't remove the 
automated rescheduling behavior from Nova. While we will be removing the 
primary cause of reschedules (resource overconsumption races), we cannot 
yet eliminate the catchall exception handling on the compute node that 
triggers a retry of the instance launch. Nor will we be able to fully 
ameliorate the affinity/anti-affinity last-minute violation problems for 
at least another release.


So, we'll continue to support basic retries within the 
originally-selected cell.


Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Mike Bayer



On 05/23/2017 04:17 PM, Doug Hellmann wrote:



On May 23, 2017, at 4:01 PM, Doug Hellmann > wrote:


Excerpts from Sean McGinnis's message of 2017-05-23 11:38:34 -0500:


This sounds like something we could fix completely by dropping the
use of the offending library. I know there was a lot of work done
to get pymysql support in place. It seems like we can finish that by
removing support for the old library and redirecting mysql://
connections to use pymysql.

Doug



I think that may be ideal. If there are known issues with the library,
and we have a different and well tested alternative that we know works,
it's probably easier all around to just redirect internally to use
pymysql.


Now we just have to find the code that's doing the mapping to the
driver. It doesn't seem to be oslo.db. Is it sqlalchemy?


Mike, do you have any insight into the best approach for this?


OK so the way this works is:

1. if you are using SQLAlchemy by itself, and you send a URL that is 
"mysql://user:pass@host/dbname", that omits the "+driver" portion, a 
default driver is selected; in the case of MySQL it uses the driver that 
imports under "import MySQLdb".  This is either mysqlclient or the 
older MySQL-python which it replaces; these are native drivers written 
in C, and the eventlet monkeypatching we use does not manage to modify 
these to act in a non-blocking fashion, so you get new kinds of 
deadlocks you wouldn't normally get when using eventlet.


2. If you send a URL like we want nowadays, 
"mysql+pymsql://user:pass@hsot/dbname", you get the pure-Python PyMySQL 
driver we've standardized upon, which works under eventlet 
monkeypatching and you don't get weird deadlocks of this nature.


3. The database URLs are inside of the .conf files for all the services, 
individually, such as nova.conf, neutron.conf, etc.   These got there 
based on the installer that one used, and from that point on I don't 
think they change (it's possible that installers like tripleo might be 
able to alter the files when you do an upgrade).


So the reason things "work" for people is that their installation / 
upgrade process has ensured that a MySQL database URL for a process that 
uses eventlet is of the form "mysql+pymysql://".   If that hasn't 
happened somewhere, then we'd have this problem.   I think the problem 
first and foremost needs to be "fixed" at this layer, e.g. 
"mysql+pymsql://" should preferably be explicit for as long as we use 
pure SQLAlchemy database URLs in our config files (e.g. these should 
either be fully correct URLs, or we shouldn't be using URLs if some 
other layer makes decisions about the database connection string).


On the "database management" side of things, e.g. projects that use 
oslo.db, we can look into failing an immediate assertion if a database 
backend but no driver is specified, e.g. to disable SQLAlchemy's usual 
"default driver" selection logic.   This is the minimum we should 
probably do here, however this will make existing installations that 
"sort of work" right now to "not work" at all until the configuration is 
fixed.


If we truly want to implicitly force the driver to be "pymysql" if 
"mysql" is present without a driver, we can do that also, but that feels 
kind of wrong to me; there are all kinds of things that might need to 
happen to database URLs and it would be unfortunate if we started just 
hardcoding driver decisions in oslo.db without solving the issue of 
configuration in a more general sense, not to mention it's misleading to 
continue to have full SQLAlchemy URLs in the conf files that get 
silently altered by a middle tier - better would be that the format of 
the database configuration changes to not be confused with this.   More 
flexible would be if there were some kind of "registry"-oriented 
configuration so that connection URLs across many services could be more 
centralized (this is how ODBC works for example), but that is also 
another layer of complexity.


I'd mostly want to understand how we have "mysql://" URLs floating 
around as the installers / upgraders should have taken care of that 
issue some time ago.  Otherwise we shouldn't have full database URLs in 
our .conf files if a middle layer is going to silently change them anyway.


Simplest fix here is of course if someone has the old style URLs, just 
fix them to be "correct".  I'm mostly comfortable with adding assertions 
for this but not as much silently "fixing" URLs.





___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Doug Hellmann


> On May 23, 2017, at 4:01 PM, Doug Hellmann  wrote:
> 
> Excerpts from Sean McGinnis's message of 2017-05-23 11:38:34 -0500:
>>> 
>>> This sounds like something we could fix completely by dropping the
>>> use of the offending library. I know there was a lot of work done
>>> to get pymysql support in place. It seems like we can finish that by
>>> removing support for the old library and redirecting mysql://
>>> connections to use pymysql.
>>> 
>>> Doug
>>> 
>> 
>> I think that may be ideal. If there are known issues with the library,
>> and we have a different and well tested alternative that we know works,
>> it's probably easier all around to just redirect internally to use
>> pymysql.
> 
> Now we just have to find the code that's doing the mapping to the
> driver. It doesn't seem to be oslo.db. Is it sqlalchemy?

Mike, do you have any insight into the best approach for this?

Doug

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Jay Pipes

Thanks for the feedback, Curtis, appreciated!

On 05/23/2017 04:09 PM, Curtis wrote:

On Tue, May 23, 2017 at 1:20 PM, Edward Leafe  wrote:

On May 23, 2017, at 1:27 PM, James Penick  wrote:


  Perhaps this is a place where the TC and Foundation should step in and foster 
the existence of a porcelain API. Either by constructing something new, or by 
growing Nova into that thing.



Oh please, not Nova. The last word that comes to mind when thinking about Nova 
code is “porcelain”.



I keep seeing the word "porcelain", but I'm not sure what it means in
this context. Could someone help me out here and explain what that is?
:)


Here's where the term porcelain comes from:

https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain

Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread melanie witt

On Tue, 23 May 2017 20:58:20 +0100 (IST), Chris Dent wrote:


If we're talking big crazy changes: Why not take the "small VM
driver" (presumably nova-compute) out of Nova? What stays behind is
_already_ orchestration but missing some features and having a fair
few bugs.


I've suggested this a couple of times on the dev ML in replies to other 
threads in the past. We could either build a new porcelain API fresh and 
then whittle Nova down into a small VM driver or we could take the small 
VM driver out of Nova and mold Nova into the porcelain API.


New porcelain API would be a fresh start to "do it right" and would 
involve having people switch over to it. I think there would be 
sufficient motivation for operators to take on the effort of deploying 
it, considering there would be a lot of features their end users would 
want to get.


Removing the small VM driver from Nova would allow people to keep using 
what they know (Nova API) but would keep a lot of cruft with it. So I 
would tend to favor a new porcelain API.


We really need one, like yesterday IMHO.

-melanie

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Curtis
On Tue, May 23, 2017 at 1:20 PM, Edward Leafe  wrote:
> On May 23, 2017, at 1:27 PM, James Penick  wrote:
>
>>  Perhaps this is a place where the TC and Foundation should step in and 
>> foster the existence of a porcelain API. Either by constructing something 
>> new, or by growing Nova into that thing.
>
>
> Oh please, not Nova. The last word that comes to mind when thinking about 
> Nova code is “porcelain”.
>

I keep seeing the word "porcelain", but I'm not sure what it means in
this context. Could someone help me out here and explain what that is?
:)

For my $0.02 as an operator, most of the time when I see retries they are
all failures, but I haven't run clouds as big as a lot of the people on
this list. I have certainly seen IPMI fail intermittently (I have a
script that logs in to a bunch of service processors and restarts
them) and would very much like to use Ironic to manage large pools of
baremetal nodes, so I could see that being an issue.

As a user of cloud resources, though, I always use some kind of
automation tooling with some form of looping for retries, but
it's not always easy to get customers/users to use that kind of
tooling. For NFV workloads/clouds there will almost always be some kind of
higher level abstraction (e.g., as mentioned, MANO) managing the
resources and it can retry (though not all of them actually have that
functionality...yet).

So, as an operator and a user, I would personally be OK with Nova
dropping retries if keeping them significantly adds to the complexity of Nova. I
certainly would not abandon Ironic if Nova didn't retry. I do wonder
what custom code might be required in say a public cloud providing
Ironic nodes though.

Thanks,
Curtis.

>
> -- Ed Leafe
>
>
>
>
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Doug Hellmann
Excerpts from Sean McGinnis's message of 2017-05-23 11:38:34 -0500:
> > 
> > This sounds like something we could fix completely by dropping the
> > use of the offending library. I know there was a lot of work done
> > to get pymysql support in place. It seems like we can finish that by
> > removing support for the old library and redirecting mysql://
> > connections to use pymysql.
> > 
> > Doug
> > 
> 
> I think that may be ideal. If there are known issues with the library,
> and we have a different and well tested alternative that we know works,
> it's probably easier all around to just redirect internally to use
> pymysql.

Now we just have to find the code that's doing the mapping to the
driver. It doesn't seem to be oslo.db. Is it sqlalchemy?

Doug

> 
> The one thing I don't know is if there are any valid reasons for someone
> wanting to use mysql over pymysql.

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Chris Dent

On Tue, 23 May 2017, James Penick wrote:


I agree that a single entry point to OpenStack would be fantastic. If it
existed, scheduling, quota, etc would have moved out of Nova a long time
ago, and Nova at this point would be just a small VM driver. Unfortunately
such a thing does not yet exist, and Nova has the momentum and mind share
as -The- entry point for all things Compute in OpenStack.


[snip some reality]


Perhaps this is a place where the TC and Foundation should step in and
foster the existence of a porcelain API. Either by constructing something
new, or by growing Nova into that thing.


If we're talking big crazy changes: Why not take the "small VM
driver" (presumably nova-compute) out of Nova? What stays behind is
_already_ orchestration but missing some features and having a fair
few bugs.

Way back in April[1] ttx asserted:

One insight which I think we could take from this is that when a
smaller group of people "owns" a set of files, we raise quality
(compared to everyone owning everything). So the more we can
split the code along areas of expertise and smaller review
teams, the better. But I think that is also something we
intuitively knew.

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-April/115061.html

--
Chris Dent  ┬──┬◡ノ(° -°ノ)   https://anticdent.org/
freenode: cdent tw: @anticdent
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread James Penick
On Tue, May 23, 2017 at 12:20 PM, Edward Leafe  wrote:

>
>
> Oh please, not Nova. The last word that comes to mind when thinking about
> Nova code is “porcelain”.
>

Oh I dunno, porcelain is usually associated with so many everyday objects.

If we really push, we could see a movement in the right direction. Better
to use what we have, then wipe it all and flush so much hard work.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Edward Leafe
On May 23, 2017, at 1:27 PM, James Penick  wrote:

>  Perhaps this is a place where the TC and Foundation should step in and 
> foster the existence of a porcelain API. Either by constructing something 
> new, or by growing Nova into that thing.


Oh please, not Nova. The last word that comes to mind when thinking about Nova 
code is “porcelain”.


-- Ed Leafe






___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [publiccloud-wg] Tomorrows meeting PublicCloudWorkingGroup

2017-05-23 Thread Tobias Rydberg

Hi everyone,

First of all, really fun to see the interest for the group and the forum 
sessions we moderated in Boston. I hope that we can keep up that spirit 
and looking forward to a lot of participants in the bi-weekly meetings 
for this cycle.


So, reminder for tomorrows meeting for the PublicCloudWorkingGroup.
May 24th - 1400 UTC in IRC channel #openstack-meeting-3

Etherpad: https://etherpad.openstack.org/p/publiccloud-wg

Agenda
1. Recap Boston Summit
2. Goals for Sydney Summit
3. Other

Have a great day and see you all tomorrow!

Tobias
tob...@citynetwork.se




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread James Penick
On Tue, May 23, 2017 at 8:52 AM, Jay Pipes  wrote:
>
> If Heat was more widely deployed, would you feel this way? Would you
> reconsider having Heat as one of those "basic compute services" in
> OpenStack, then?
>
>
 (Caveat: I haven't looked at Heat in at least a year) I haven't deployed
heat in my environment yet, because as a template based orchestration
system it requires that you pass the correct template to construct or tear
down a stack. If you were to come along and remove part of that stack in
the interim, you throw everything into disarray, which then requires
cleanup.

 Also, I'm pretty sure my users would mostly hate needing to pass a file to
boot a single instance.

 As an example: In my environment I allow users to request a custom disk
layout for baremetal hosts, by passing a YAML file as metadata (yeah, yeah
I know). The result? They hate that they have to pass a file. To them disk
layout should be a first class object, similar to flavors. I've pushed back
hard against this: It's not clean, disk profiles should be the exception to
the norm, just keep the profile in a code repo. But the truth is I'm coming
around to their way of thinking.

 I'm forced to choose between Architectural Purity[1] and what my customers
actually need. In the end the people who actually use my product define it
inasmuch as I do. At some point I'll probably give in and implement the
thing they want, because from a broad perspective it makes sense to me,
even though it doesn't align with the state of Nova right now.

This is, unfortunately, one of the main problems stemming from OpenStack
> not having a *single* public API, with projects implementing parts of that
> single public API. You know, the thing I started arguing for about 6 years
> ago.
>
> If we had one single public porcelain API, we wouldn't even need to have
> this conversation. People wouldn't even know we'd changed implementation
> details behind the scenes and were doing retries at a slightly higher level
> than before. Oh well... we live and learn (maybe).
>
>
 I agree that a single entry point to OpenStack would be fantastic. If it
existed, scheduling, quota, etc would have moved out of Nova a long time
ago, and Nova at this point would be just a small VM driver. Unfortunately
such a thing does not yet exist, and Nova has the momentum and mind share
as -The- entry point for all things Compute in OpenStack.

 If the community aligns behind a new porcelain API, great! But until it's
ready, deployers, operators, and users need to run their businesses.
Removing functionality that impedes our ability to provide a stable IaaS
experience isn't acceptable to us. If the expectation is that deployers
will hack around this, then that's putting us in the position of struggling
even more to keep up with, or move to a current version of OpenStack.
Worse, that's anathema to cloud interop.

 Perhaps this is a place where the TC and Foundation should step in and
foster the existence of a porcelain API. Either by constructing something
new, or by growing Nova into that thing.

-James
[1] Insert choir of angels sound here
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Mitaka and gnocchi

2017-05-23 Thread gordon chung


On 23/05/17 10:16 AM, mate...@mailbox.org wrote:
> 2017-05-23 13:29:10.961 1931583 ERROR ceilometer.dispatcher.gnocchi
> [-] Failed to connect to Gnocchi.
> 2017-05-23 13:29:10.962 1931583 ERROR stevedore.extension [-] Could not
> load 'gnocchi': Unexpected exception for
> http//10.10.10.69:8000/v1/capabilities/: Invalid URL
> 'http//10.10.10.69:8000/v1/capabilities/': No schema supplied. Perhaps
> you meant http://http//10.10.10.69:8000/v1/capabilities/?

you missed ':' in http://

-- 
gord

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Tim Bell
One scenario would be to change the default and allow the exceptions to opt out 
(e.g. mysql+pymysql).

Tim

On 23.05.17, 19:08, "Matt Riedemann"  wrote:

On 5/23/2017 11:38 AM, Sean McGinnis wrote:
>>
>> This sounds like something we could fix completely by dropping the
>> use of the offending library. I know there was a lot of work done
>> to get pymysql support in place. It seems like we can finish that by
>> removing support for the old library and redirecting mysql://
>> connections to use pymysql.
>>
>> Doug
>>
> 
> I think that may be ideal. If there are known issues with the library,
> and we have a different and well tested alternative that we know works,
> it's probably easier all around to just redirect internally to use
> pymysql.
> 
> The one thing I don't know is if there are any valid reasons for someone
> wanting to use mysql over pymysql.
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 

The mysql library doesn't support python 3 and doesn't support eventlet, 
as far as I remember, which is why there was the push to adopt pymysql. 
But it's been years now so I can't remember exactly. I think Rackspace 
was still using the mysql backend for public cloud because of some 
straight to sql execution stuff they were doing for costly DB APIs [1] 
but I'd think that could be ported.

Anyway, +1 to dropping the mysql library and just rely on pymysql.

[1] https://review.openstack.org/#/c/243822/

-- 

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] CockroachDB for Core Services

2017-05-23 Thread Chris Apsey

All,

Now that CockroachDB has reached v1 and appears to be compatible with 
SQLAlchemy 
(https://www.cockroachlabs.com/blog/building-application-cockroachdb-sqlalchemy-2/), 
has anyone tried using it as the backend for any of the service 
databases?  Wondering how far away it is from production-ready in this 
particular role...
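For anyone experimenting, a minimal connectivity sketch; it assumes the
CockroachDB SQLAlchemy dialect described in the linked post is installed and
registers the cockroachdb:// URL scheme, and the host, port and database
names are placeholders:

    from sqlalchemy import create_engine

    # Placeholder URL; requires the CockroachDB SQLAlchemy dialect package.
    engine = create_engine('cockroachdb://root@localhost:26257/cinder')
    with engine.connect() as conn:
        print(conn.execute('SELECT version()').scalar())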


Thanks

--
v/r

Chris Apsey
bitskr...@bitskrieg.net
https://www.bitskrieg.net

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Eventbrite - Ops Meet-up Mexico City

2017-05-23 Thread Edgar Magana
Tom,

It is time to create the Eventbrite for the Ops Meet-up in Mexico City. I was 
told you are the man helping us in this area. I have also started the etherpad 
for the meet-up that we can use as a reference.
https://etherpad.openstack.org/p/MEX-ops-meetup

Let me know what you need to create the event. It has been decided to keep 
the price at $20 USD (MXN $370 - 
http://www.xe.com/currencyconverter/convert/?Amount=20&From=USD&To=MXN)

Thanks,

Edgar
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Matt Riedemann

On 5/23/2017 11:38 AM, Sean McGinnis wrote:


This sounds like something we could fix completely by dropping the
use of the offending library. I know there was a lot of work done
to get pymysql support in place. It seems like we can finish that by
removing support for the old library and redirecting mysql://
connections to use pymysql.

Doug



I think that may be ideal. If there are known issues with the library,
and we have a different and well tested alternative that we know works,
it's probably easier all around to just redirect internally to use
pymysql.

The one thing I don't know is if there are any valid reasons for someone
wanting to use mysql over pymysql.


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



The mysql library doesn't support python 3 and doesn't support eventlet, 
as far as I remember, which is why there was the push to adopt pymysql. 
But it's been years now so I can't remember exactly. I think Rackspace 
was still using the mysql backend for public cloud because of some 
straight to sql execution stuff they were doing for costly DB APIs [1] 
but I'd think that could be ported.


Anyway, +1 to dropping the mysql library and just rely on pymysql.

[1] https://review.openstack.org/#/c/243822/

--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [nova] Cells v2 FAQs

2017-05-23 Thread Matt Riedemann
FYI, I've started a series of changes to add FAQs to the nova devref 
about cells v2:


https://review.openstack.org/#/q/topic:cells-v2-docs

The first few questions are based on things that have come up in IRC.

If you have other things you'd like to see here, please reply to this 
thread, or better yet post a patch to the end of the series.


--

Thanks,

Matt

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Sean McGinnis
> 
> This sounds like something we could fix completely by dropping the
> use of the offending library. I know there was a lot of work done
> to get pymysql support in place. It seems like we can finish that by
> removing support for the old library and redirecting mysql://
> connections to use pymysql.
> 
> Doug
> 

I think that may be ideal. If there are known issues with the library,
and we have a different and well tested alternative that we know works,
it's probably easier all around to just redirect internally to use
pymysql.

The one thing I don't know is if there are any valid reasons for someone
wanting to use mysql over pymysql.


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Jay Pipes

On 05/23/2017 12:34 PM, Marc Heckmann wrote:

On Tue, 2017-05-23 at 11:44 -0400, Jay Pipes wrote:

On 05/23/2017 09:48 AM, Marc Heckmann wrote:

For the anti-affinity use case, it's really useful for smaller or medium
size operators who want to provide some form of failure domains to users
but do not have the resources to create AZ's at DC or even at rack or
row scale. Don't forget that as soon as you introduce AZs, you need to
grow those AZs at the same rate and have the same flavor offerings
across those AZs.

For the retry thing, I think enough people have chimed in to echo the
general sentiment.


The purpose of my ML post was around getting rid of retries, not the
usefulness of affinity groups. That seems to have been missed,
however.

Do you or David have any data on how often you've actually seen
retries
due to the last-minute affinity constraint violation in real world
production?


No I don't have any data unfortunately. Mostly because we haven't
advertised the feature to end users yet. We are only now in a position
to do so because previously there was a bug causing nova-scheduler to
grow in RAM usage if the required config flag to enable the feature was
on.


k.


I have however seen retries triggered on hypervisors for other reasons.
I can try to dig up why specifically if that would be useful. I will
add that we do not use Ironic at all.


Yeah, any data you can get about real-world retry causes would be 
awesome. Note that all "resource over-consumption" causes of retries 
will be going away once we do claims in the scheduler. So, really, we're 
looking for data on the *other* causes of retries.


Thanks much in advance!

-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Marc Heckmann
On Tue, 2017-05-23 at 11:44 -0400, Jay Pipes wrote:
> On 05/23/2017 09:48 AM, Marc Heckmann wrote:
> > For the anti-affinity use case, it's really useful for smaller or medium
> > size operators who want to provide some form of failure domains to users
> > but do not have the resources to create AZ's at DC or even at rack or
> > row scale. Don't forget that as soon as you introduce AZs, you need to
> > grow those AZs at the same rate and have the same flavor offerings 
> > across those AZs.
> > 
> > For the retry thing, I think enough people have chimed in to echo the
> > general sentiment.
> 
> The purpose of my ML post was around getting rid of retries, not the 
> usefulness of affinity groups. That seems to have been missed,
> however.
> 
> Do you or David have any data on how often you've actually seen
> retries 
> due to the last-minute affinity constraint violation in real world 
> production?

No I don't have any data unfortunately. Mostly because we haven't
advertised the feature to end users yet. We are only now in a position
to do so because previously there was a bug causing nova-scheduler to
grow in RAM usage if the required config flag to enable the feature was
on.

I have however seen retries triggered on hypervisors for other reasons.
I can try to dig up why specifically if that would be useful. I will
add that we do not use Ironic at all.

-m



> 
> Thanks,
> -jay
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operato
> rs
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Tim Bell
Would an automatic transform of mysql:// to mysql+pymysql:// be possible? Is 
there any reason to not have pymysql?
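For what it's worth, the transform itself is trivial if done on the raw
connection string before it reaches SQLAlchemy; a minimal sketch (just an
illustration, not an existing oslo.db feature):

    def force_pymysql(connection):
        """Rewrite a bare mysql:// URL to name the PyMySQL driver."""
        if connection.startswith('mysql://'):
            return 'mysql+pymysql://' + connection[len('mysql://'):]
        return connection

    assert force_pymysql('mysql://user:pass@host/nova') == \
        'mysql+pymysql://user:pass@host/nova'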

Tim

From: Arne Wiebalck 
Date: Tuesday, 23 May 2017 at 17:50
To: Sean McGinnis 
Cc: openstack-operators 
Subject: Re: [Openstack-operators] DB deadlocks due to connection string

As discussed on the Cinder channel, I’ve opened

https://bugs.launchpad.net/oslo.db/+bug/1692956

to see if oslo.db would be a good place to produce a warning when it detects
this potential misconfiguration.

Cheers,
 Arne

On 23 May 2017, at 17:25, Sean McGinnis  wrote:

Just wanted to put this out there to hopefully spread awareness and
prevent it from happening more.

We had a bug reported in Cinder of hitting a deadlock when attempting
to delete multiple volumes simultaneously:

https://bugs.launchpad.net/cinder/+bug/1685818

Some were seeing it, but others were not able to reproduce the error
in their environments.

What it came down to is the use of "mysql://" vs "mysql+pymysql://"
for the database connection string. Big thanks to Gerhard Muntingh
for noticing this difference.

Basically, when using "mysql://" for the connection string, that uses
blocking calls that prevent other "threads" from running at the same
time, causing these deadlocks.

This doesn't just impact Cinder, so I wanted to get the word out that
it may be worth checking your configurations and make sure you are
using "mysql+pymysql://" for your connections.

Sean


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

--
Arne Wiebalck
CERN IT


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Doug Hellmann
Excerpts from Arne Wiebalck's message of 2017-05-23 15:50:43 +:
> As discussed on the Cinder channel, I’ve opened
> 
> https://bugs.launchpad.net/oslo.db/+bug/1692956
> 
> to see if oslo.db would be a good place to produce a warning when it detects
> this potential misconfiguration.

This sounds like something we could fix completely by dropping the
use of the offending library. I know there was a lot of work done
to get pymysql support in place. It seems like we can finish that by
removing support for the old library and redirecting mysql://
connections to use pymysql.

Doug

> 
> Cheers,
>  Arne
> 
> On 23 May 2017, at 17:25, Sean McGinnis  wrote:
> 
> Just wanted to put this out there to hopefully spread awareness and
> prevent it from happening more.
> 
> We had a bug reported in Cinder of hitting a deadlock when attempting
> to delete multiple volumes simultaneously:
> 
> https://bugs.launchpad.net/cinder/+bug/1685818
> 
> Some were seeing it, but others were not able to reproduce the error
> in their environments.
> 
> What it came down to is the use of "mysql://" vs "mysql+pymysql://"
> for the database connection string. Big thanks to Gerhard Muntingh
> for noticing this difference.
> 
> Basically, when using "mysql://" for the connection string, that uses
> blocking calls that prevent other "threads" from running at the same
> time, causing these deadlocks.
> 
> This doesn't just impact Cinder, so I wanted to get the word out that
> it may be worth checking your configurations and make sure you are
> using "mysql+pymysql://" for your connections.
> 
> Sean
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Jay Pipes

On 05/22/2017 03:36 PM, Sean Dague wrote:

On 05/22/2017 02:45 PM, James Penick wrote:



 I recognize that large Ironic users expressed their concerns about
 IPMI/BMC communication being unreliable and not wanting to have
 users manually retry a baremetal instance launch. But, on this
 particular point, I'm of the opinion that Nova just do one thing and
 do it well. Nova isn't an orchestrator, nor is it intending to be a
 "just continually try to get me to this eventual state" system like
 Kubernetes.

Kubernetes is a larger orchestration platform that provides autoscale. I
don't expect Nova to provide autoscale, but

I agree that Nova should do one thing and do it really well, and in my
mind that thing is reliable provisioning of compute resources.
Kubernetes does autoscale among other things. I'm not asking for Nova to
provide Autoscale, I -AM- asking OpenStack's compute platform to
provision a discrete compute resource reliably. This means overcoming
common and simple error cases. As a deployer of OpenStack I'm trying to
build a cloud that wraps the chaos of infrastructure, and present a
reliable facade. When my users issue a boot request, I want to see it
fulfilled. I don't expect it to be a 100% guarantee across any possible
failure, but I expect (and my users demand) that my "Infrastructure as a
service" API make reasonable accommodation to overcome common failures.


Right, I think this hits my major queasiness with throwing the baby out with
the bathwater here. I feel like Nova's job is to give me a compute when
asked for computes. Yes, like malloc, things could fail. But honestly if
Nova can recover from that scenario, it should try to. The baremetal and
affinity cases are pretty good instances where Nova can catch and
recover, and not just export that complexity up.

It would make me sad to just export that complexity to users, and
instead of handling those cases internally make every SDK, app, and
simple script build its own retry loop.


If Heat was more widely deployed, would you feel this way? Would you 
reconsider having Heat as one of those "basic compute services" in 
OpenStack, then?


This is, unfortunately, one of the main problems stemming from OpenStack 
not having a *single* public API, with projects implementing parts of 
that single public API. You know, the thing I started arguing for about 
6 years ago.


If we had one single public porcelain API, we wouldn't even need to have 
this conversation. People wouldn't even know we'd changed implementation 
details behind the scenes and were doing retries at a slightly higher 
level than before. Oh well... we live and learn (maybe).


Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Arne Wiebalck
As discussed on the Cinder channel, I’ve opened

https://bugs.launchpad.net/oslo.db/+bug/1692956

to see if oslo.db would be a good place to produce a warning when it detects
this potential misconfiguration.

Cheers,
 Arne

On 23 May 2017, at 17:25, Sean McGinnis <sean.mcgin...@gmx.com> wrote:

Just wanted to put this out there to hopefully spread awareness and
prevent it from happening more.

We had a bug reported in Cinder of hitting a deadlock when attempting
to delete multiple volumes simultaneously:

https://bugs.launchpad.net/cinder/+bug/1685818

Some were seeing it, but others were not able to reproduce the error
in their environments.

What it came down to is the use of "mysql://" vs "mysql+pymysql://"
for the database connection string. Big thanks to Gerhard Muntingh
for noticing this difference.

Basically, when using "mysql://" for the connection string, that uses
blocking calls that prevent other "threads" from running at the same
time, causing these deadlocks.

This doesn't just impact Cinder, so I wanted to get the word out that
it may be worth checking your configurations and make sure you are
using "mysql+pymysql://" for your connections.

Sean


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

--
Arne Wiebalck
CERN IT

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] UTC 14:00 henceforth for Ops Meet Up planning

2017-05-23 Thread Chris Morgan
Many thanks to David for running this poll. We desperately needed a more
popular slot.

Minutes for today's meeting :

 Meeting ended Tue May 23 15:43:14 2017 UTC. Information about
MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
11:43 AM O Minutes:
http://eavesdrop.openstack.org/meetings/ops_meetup_team/2017/ops_meetup_team.2017-05-23-15.01.html
11:43 AM Minutes (text):
http://eavesdrop.openstack.org/meetings/ops_meetup_team/2017/ops_meetup_team.2017-05-23-15.01.txt
11:43 AM O Log:
http://eavesdrop.openstack.org/meetings/ops_meetup_team/2017/ops_meetup_team.2017-05-23-15.01.log.html

On Tue, May 23, 2017 at 11:42 AM, David Medberry 
wrote:

> Ah, excellent question sean.
>
> This is an IRC meeting that occurs weekly currently at 14:00 (starting
> next week) on Tuesdays in Freenode at #openstack-operators
> (the current one is just wrapping up)
>
>
> On Tue, May 23, 2017 at 9:34 AM, Sean McGinnis 
> wrote:
>
>> On Tue, May 23, 2017 at 09:16:13AM -0600, David Medberry wrote:
>> > I have picked "Tuesday, May 30, 2017 8:00 AM (Time zone: Mountain
>> Time)" as
>> > final option(s) for the Doodle poll "Ops Meetup Preferred Time."
>>
>> Hey David,
>>
>> Sorry, I'm sure this was stated elsewhere, but where is this meeting held?
>>
>> Thanks!
>> Sean
>>
>>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>


-- 
Chris Morgan 
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Jay Pipes

On 05/23/2017 09:48 AM, Marc Heckmann wrote:
For the anti-affinity use case, it's really useful for smaller or medium 
size operators who want to provide some form of failure domains to users 
but do not have the resources to create AZ's at DC or even at rack or 
row scale. Don't forget that as soon as you introduce AZs, you need to 
grow those AZs at the same rate and have the same flavor offerings 
across those AZs.


For the retry thing, I think enough people have chimed in to echo the 
general sentiment.


The purpose of my ML post was around getting rid of retries, not the 
usefulness of affinity groups. That seems to have been missed, however.


Do you or David have any data on how often you've actually seen retries 
due to the last-minute affinity constraint violation in real world 
production?


Thanks,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] UTC 14:00 henceforth for Ops Meet Up planning

2017-05-23 Thread David Medberry
Ah, excellent question sean.

This is an IRC meeting that occurs weekly currently at 14:00 (starting next
week) on Tuesdays in Freenode at #openstack-operators
(the current one is just wrapping up)


On Tue, May 23, 2017 at 9:34 AM, Sean McGinnis 
wrote:

> On Tue, May 23, 2017 at 09:16:13AM -0600, David Medberry wrote:
> > I have picked "Tuesday, May 30, 2017 8:00 AM (Time zone: Mountain Time)"
> as
> > final option(s) for the Doodle poll "Ops Meetup Preferred Time."
>
> Hey David,
>
> Sorry, I'm sure this was stated elsewhere, but where is this meeting held?
>
> Thanks!
> Sean
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] UTC 14:00 henceforth for Ops Meet Up planning

2017-05-23 Thread Sean McGinnis
On Tue, May 23, 2017 at 09:16:13AM -0600, David Medberry wrote:
> I have picked "Tuesday, May 30, 2017 8:00 AM (Time zone: Mountain Time)" as
> final option(s) for the Doodle poll "Ops Meetup Preferred Time."

Hey David,

Sorry, I'm sure this was stated elsewhere, but where is this meeting held?

Thanks!
Sean


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] DB deadlocks due to connection string

2017-05-23 Thread Sean McGinnis
Just wanted to put this out there to hopefully spread awareness and
prevent it from happening more.

We had a bug reported in Cinder of hitting a deadlock when attempting
to delete multiple volumes simultaneously:

https://bugs.launchpad.net/cinder/+bug/1685818

Some were seeing it, but others were not able to reproduce the error
in their environments.

What it came down to is the use of "mysql://" vs "mysql+pymysql://"
for the database connection string. Big thanks to Gerhard Muntingh
for noticing this difference.

Basically, when using "mysql://" for the connection string, that uses
blocking calls that prevent other "threads" from running at the same
time, causing these deadlocks.

This doesn't just impact Cinder, so I wanted to get the word out that
it may be worth checking your configurations and make sure you are
using "mysql+pymysql://" for your connections.

Sean
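
As a concrete illustration, here are the two forms in a service's [database]
section (host and password below are placeholders):

    [database]
    # C-based MySQL-Python driver: calls block the whole process under
    # eventlet and can lead to the deadlocks described above.
    # connection = mysql://cinder:CINDER_DBPASS@controller/cinder

    # Pure-Python PyMySQL driver: recommended.
    connection = mysql+pymysql://cinder:CINDER_DBPASS@controller/cinder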


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [scientific] Reminder: Scientific WG IRC meeting - Wednesday 0900 UTC

2017-05-23 Thread Stig Telfer
Hello -

We have a meeting on Wednesday at 0900 UTC in channel #openstack-meeting.  We’d 
like to cover the discussion on gathering OpenStack-related research papers, 
and round up on activities from the summit.

Today’s full agenda is here:
https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_May_24th_2017
 


Details of the meeting are here:
http://eavesdrop.openstack.org/#Scientific_Working_Group 


Cheers,
Stig
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] UTC 14:00 henceforth for Ops Meet Up planning

2017-05-23 Thread David Medberry
I have picked "Tuesday, May 30, 2017 8:00 AM (Time zone: Mountain Time)" as
final option(s) for the Doodle poll "Ops Meetup Preferred Time."

Follow this link to open the poll:
http://doodle.com/poll/bccipuc8kfm36xna
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Mitaka and gnocchi

2017-05-23 Thread mate200

I'm trying to connect ceilometer to gnocchi. I don't use any wsgi application and ceilometer is connecting directly to gnocchi-api. The thing is that I see these kinds of errors in ceilometer-collector.log:

2017-05-23 13:29:10.961 1931583 ERROR ceilometer.dispatcher.gnocchi [-] Failed to connect to Gnocchi.
2017-05-23 13:29:10.962 1931583 ERROR stevedore.extension [-] Could not load 'gnocchi': Unexpected exception for http//10.10.10.69:8000/v1/capabilities/: Invalid URL 'http//10.10.10.69:8000/v1/capabilities/': No schema supplied. Perhaps you meant http://http//10.10.10.69:8000/v1/capabilities/?
2017-05-23 13:29:10.962 1931583 WARNING stevedore.named [-] Could not load gnocchi
2017-05-23 13:29:10.963 1931583 WARNING ceilometer.dispatcher [-] Failed to load any dispatchers for ceilometer.dispatcher.meter
2017-05-23 13:29:10.963 1931583 DEBUG ceilometer.dispatcher [-] loading dispatchers from ceilometer.dispatcher.event _load_dispatcher_manager /usr/lib/python2.7/dist-packages/ceilometer/dispatcher/__init__.py:58

What may it be?

> On May 23, 2017 at 3:29 PM mate...@mailbox.org wrote:
> As for gnocchi 2.2.1 policy.json is needed :) Now I'm fighting with ceilometer.
>
> > On May 23, 2017 at 3:16 PM gordon chung  wrote:
> > On 23/05/17 06:35 AM, mate...@mailbox.org wrote:
> > > Sorry, I didn't notice policy.json in the first link. I'll check it.
> >
> > :)
> >
> > depending on gnocchi version, it might look internally for policy.json
> > if it doesn't find policy.json in /etc/gnocchi/
> >
> > --
> > gord
> > ___
> > OpenStack-operators mailing list
> > OpenStack-operators@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
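
For what it's worth, the "No schema supplied" error above shows the endpoint
configured as "http//10.10.10.69:8000", i.e. the colon after "http" is
missing. A minimal sketch of the relevant ceilometer.conf section, assuming
the Gnocchi endpoint is set directly via a URL option (depending on the
release the endpoint may instead be looked up in the Keystone catalog, so
treat the option name as an assumption):

    [dispatcher_gnocchi]
    # note the full scheme "http://", not "http//"
    url = http://10.10.10.69:8000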
 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

2017-05-23 Thread Marc Heckmann
For the anti-affinity use case, it's really useful for smaller or medium size 
operators who want to provide some form of failure domains to users but do not 
have the resources to create AZ's at DC or even at rack or row scale. Don't 
forget that as soon as you introduce AZs, you need to grow those AZs at the 
same rate and have the same flavor offerings across those AZs.
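
For context, this is the mechanism under discussion: users opt into affinity
or anti-affinity through a server group plus a scheduler hint. A rough
example, with made-up names:

    $ openstack server group create --policy anti-affinity web-group
    $ openstack server create --flavor m1.small --image cirros \
        --hint group=<server-group-uuid> web01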

For the retry thing, I think enough people have chimed in to echo the general 
sentiment.

-m


On Mon, 2017-05-22 at 16:30 -0600, David Medberry wrote:
I have to agree with James

My affinity and anti-affinity rules have nothing to do with NFV. Anti-affinity
is almost always a failure-domain solution. I'm not sure we have users actually
choosing affinity (though it would likely be for network speed issues and/or
some sort of badly architected need, or perceived need, for coupling).

On Mon, May 22, 2017 at 12:45 PM, James Penick <jpen...@gmail.com> wrote:


On Mon, May 22, 2017 at 10:54 AM, Jay Pipes <jaypi...@gmail.com> wrote:
Hi Ops,

Hi!


For class b) causes, we should be able to solve this issue when the placement 
service understands affinity/anti-affinity (maybe Queens/Rocky). Until then, we 
propose that, instead of raising a Reschedule when an affinity constraint is 
violated at the last minute due to a racing scheduler decision, we simply set 
the instance to an ERROR state.

Personally, I have only ever seen anti-affinity/affinity use cases in relation 
to NFV deployments, and in every NFV deployment of OpenStack there is a VNFM or 
MANO solution that is responsible for the orchestration of instances belonging 
to various service function chains. I think it is reasonable to expect the MANO 
system to be responsible for attempting a re-launch of an instance that was set 
to ERROR due to a last-minute affinity violation.

**Operators, do you agree with the above?**

I do not. My affinity and anti-affinity use cases reflect the need to build 
large applications across failure domains in a datacenter.

Anti-affinity: Most anti-affinity use cases relate to the ability to guarantee 
that instances are scheduled across failure domains, others relate to security 
compliance.

Affinity: Hadoop/Big data deployments have affinity use cases, where nodes 
processing data need to be in the same rack as the nodes which house the data. 
This is a common setup for large hadoop deployers.

I recognize that large Ironic users expressed their concerns about IPMI/BMC 
communication being unreliable and not wanting to have users manually retry a 
baremetal instance launch. But, on this particular point, I'm of the opinion 
that Nova just do one thing and do it well. Nova isn't an orchestrator, nor is 
it intending to be a "just continually try to get me to this eventual state" 
system like Kubernetes.

Kubernetes is a larger orchestration platform that provides autoscale. I don't 
expect Nova to provide autoscale, but

I agree that Nova should do one thing and do it really well, and in my mind 
that thing is reliable provisioning of compute resources. Kubernetes does 
autoscale among other things. I'm not asking for Nova to provide Autoscale, I 
-AM- asking OpenStack's compute platform to provision a discrete compute 
resource reliably. This means overcoming common and simple error cases. As a 
deployer of OpenStack I'm trying to build a cloud that wraps the chaos of 
infrastructure, and present a reliable facade. When my users issue a boot 
request, I want to see it fulfilled. I don't expect it to be a 100% guarantee 
across any possible failure, but I expect (and my users demand) that my 
"Infrastructure as a service" API make reasonable accommodation to overcome 
common failures.


If we removed Reschedule for class c) failures entirely, large Ironic deployers 
would have to train users to manually retry a failed launch or would need to 
write a simple retry mechanism into whatever client/UI that they expose to 
their users.

**Ironic operators, would the above decision force you to abandon Nova as the 
multi-tenant BMaaS facility?**


 I just glanced at one of my production clusters and found there are around 7K 
users defined, many of whom use OpenStack on a daily basis. When they issue a 
boot call, they expect that request to be honored. From their perspective, if 
they call AWS, they get what they ask for. If you remove reschedules you're not 
just breaking the expectation of a single deployer, but for my thousands of 
engineers who, every day, rely on OpenStack to manage their stack.

I don't have a "i'll take my football and go home" mentality. But if you remove 
the ability for the compute provisioning API to present a reliable facade over 
infrastructure, I have to go write something else, or patch it back in. Now 
it's even harder for me to get and stay current with OpenStack.

During the summit the agreement was, if I recall, that reschedules would happen 
within a cell, and not between the parent and cell. That was complet

Re: [Openstack-operators] Routed provider networks...

2017-05-23 Thread Chris Marino
On Mon, May 22, 2017 at 9:12 PM, Kevin Benton  wrote:

> The operators that were asking for the spec were using private IP space
> and that is probably going to be the most common use case for routed
> networks. Splitting a /21 up across the entire data center isn't really
> something you would want to do because you would run out of IPs quickly
> like you mentioned.
>
> The use case routed networks is almost exactly like your Romana project.
> For example, you have a large chunk of IPs (e.g. 10.0.0.0/16) and you've
> setup the infrastructure so each rack gets a /23 with the ToR as the
> gateway which would buy you 509 VMs across 128 racks.
>

Yes, it is. That's what brought me back to this. Working with an operator
that's using L2 provider networks today, but will bring L3 to host in their
new design.

Dynamic routing is absolutely necessary, though. Large blocks of RFC 1918
addresses are scarce, even inside the DC. VRFs and/or NAT are just not an
option.

CM
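
For anyone who wants to experiment with that layout, a rough sketch of the
routed provider network workflow with the openstack CLI (segment names, VLAN
IDs and CIDRs below are made up, and a recent python-openstackclient is
assumed):

    $ openstack network create --share --provider-physical-network provnet-rack1 \
        --provider-network-type vlan --provider-segment 101 multisegment1

    $ openstack network segment create --network multisegment1 \
        --physical-network provnet-rack2 --network-type vlan --segment 102 \
        segment-rack2

    $ openstack subnet create --network multisegment1 --network-segment \
        segment-rack2 --ip-version 4 --subnet-range 10.1.2.0/23 subnet-rack2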


>
>
> On May 22, 2017 2:53 PM, "Chris Marino"  wrote:
>
> Thanks Jon, very helpful.
>
> I think a more common use case for provider networks (in enterprise,
> AFAIK) is that they'd have a small number of /20 or /21 networks (VLANs)
> that they would trunk to all hosts. The /21s are part of the larger
> datacenter network with segment firewalls and access to other datacenter
> resources (no NAT). Each functional area would get their own network (i.e.
> QA, Prod, Dev, Test, etc.) but users would have access to only certain
> networks.
>
> For various reasons, they're moving to spine/leaf L3 networks and they
> want to use the same provider network CIDRs with the new L3 network. While
> technically this is covered by the use case described in the spec,
> splitting a /21 into segments (i.e. one for each rack/ToR) severely limits
> the scheduler (since each rack only gets a part of the whole /21).
>
> This can be solved with route advertisement/distribution and/or IPAM
> coordination w/Nova, but this isn't possible today. Which brings me back to
> my earlier question, how useful are routed provider networks?
>
> CM
>
> On Mon, May 22, 2017 at 1:08 PM, Jonathan Proulx 
> wrote:
>
>>
>> Not sure if this is what you're looking for but...
>>
>> For my private cloud in research environment we have a public provider
>> network available to all projects.
>>
>> This is externally routed and has basically been in the same config
>> since Folsom (currently we're upto Mitaka).  It provides public ipv4
>> addresses. DHCP is done in neutron (of course) the lower portion of
>> the allocated subnet is excluded from the dynamic range.  We allow
>> users to register DNS names in this range (through pre-exisiting
>> custom, external IPAM tools) and to specify the fixed ip address when
>> launching VMs.
>>
>> This network typically has 1k VMs running. We've assigned a /18 to it,
>> which is obviously overkill.
>>
>> A few projects also have provider networks plumbed in to bridge their
>> legacy physical networks into OpenStack.  For these there's no dynamic
>> range and users must specify fixed ip, these are generally considered
>> "a bad idea" and were used to facilitate dumping VMs from old Xen
>> infrastructures into OpenStack with minimal changes.
>>
>> These are old patterns I wouldn't necessarily suggest anyone
>> replicate, but they are the truth of my world...
>>
>> -Jon
>>
>> On Mon, May 22, 2017 at 12:47:01PM -0700, Chris Marino wrote:
>> :Hello operators, I will be talking about the new routed provider network
>> :> outed-networks.html>
>> :features in OpenStack at a Meetup
>> :next week and would
>> :like to get a better sense of how provider networks are currently being
>> :used and if anyone has deployed routed provider networks?
>> :
>> :A typical L2 provider network is deployed as VLANs to every host. But
:curious to know how many hosts or VMs an operator would allow on this
>> :network before you wanted to split into segments? Would you split hosts
>> :between VLANs, or trunk the VLANs to all hosts? How do you handle
>> :scheduling VMs across two provider networks?
>> :
>> :If you were to go with L3 provider networks, would it be L3 to the ToR,
>> or
>> :L3 to the host?
>> :
>> :Are the new routed provider network features useful in their current
>> form?
>> :
>> :Any experience you can share would be very helpful.
>> :CM
>> :
>> :
>>
>> :___
>> :OpenStack-operators mailing list
>> :OpenStack-operators@lists.openstack.org
>> :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>> --
>>
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
>
___
OpenStack-operators mailing list

Re: [Openstack-operators] Mitaka and gnocchi

2017-05-23 Thread mate200
As for gnocchi 2.2.1 policy.json is needed :) Now I'm fighting with ceilometer.



> On May 23, 2017 at 3:16 PM gordon chung  wrote:
> 
> 
> 
> 
> On 23/05/17 06:35 AM, mate...@mailbox.org wrote:
> > Sorry, I didn't notice policy.json in the first link. I'll check it.
> 
> :)
> 
> depending on gnocchi version, it might look internally for policy.json 
> if it doesn't find policy.json in /etc/gnocchi/
> 
> -- 
> gord
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Mitaka and gnocchi

2017-05-23 Thread gordon chung


On 23/05/17 06:35 AM, mate...@mailbox.org wrote:
> Sorry, I didn't notice policy.json in the first link. I'll check it.

:)

depending on gnocchi version, it might look internally for policy.json 
if it doesn't find policy.json in /etc/gnocchi/

-- 
gord
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Mitaka and gnocchi

2017-05-23 Thread mate200
Sorry, I didn't notice policy.json in the first link. I'll check it.


> On May 22, 2017 at 6:09 PM mate...@mailbox.org wrote:
> 
> 
> Hello Gordon,
> I'm trying to use the gnocchi client and it always tells me that I don't have
> sufficient privileges -
> 10.10.10.69 - - [22/May/2017 14:58:33] "GET /v1/resource/generic HTTP/1.1" 
> 403 77 Insufficient privileges (HTTP 403)
> I think it's because of the policy.json absence. The log gives me a clue - 
> 2017-05-22 14:40:19.808 21874 DEBUG oslo_policy.policy [-] The policy file 
> policy.json could not be found. load_rules
> /usr/local/lib/python2.7/dist-packages/oslo_policy/policy.py:535
> Where can I find one and am I right ?
> 
> 
> 
> 
> 
> On Tue, 2017-05-16 at 15:22 +, gordon chung wrote:
> > On 15/05/17 06:58 AM, mate...@mailbox.org wrote:
> > 
> > > It seems to me that I should add more settings to both config files, but
> > > the problem is that I cannot find information on that.
> > > 
> > 
> > here's the conf file from gate using mitaka+gnocchi2.2[1]. i would start 
> > there.
> > 
> > i've moved it to paste.openstack.org[2] as well since the log files will 
> > probably be cleared soon. maybe start with those configurations since it 
> > worked in gate.
> > 
> > 
> > [1] 
> > http://logs.openstack.org/46/450646/4/gate/gate-telemetry-dsvm-integration-ceilometer-ubuntu-trusty/61d8bb0/logs/etc/
> > [2] http://paste.openstack.org/show/609688/
> > 
> > -- 
> > gord
> > ___
> > OpenStack-operators mailing list
> > OpenStack-operators@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> -- 
> Best regards,
> Mate200
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] control guest VMs in isolated network

2017-05-23 Thread Volodymyr Litovka

Hi colleagues,

are there ways to control guest VMs which reside in an isolated network?

In general, two methods are available:

1. use Heat's SoftwareDeployment method
2. use Qemu Guest Agent

The first method requires access to Keystone/Heat (the os-collect-config 
agent authenticates against Keystone, receives the endpoint list and uses 
the public Heat endpoint to deploy changes), but, since the network is 
isolated, these addresses are inaccessible. It could work if Neutron were 
able to proxy the requests the way it does for the metadata server, but I 
didn't find this kind of functionality in Neutron's documentation or in 
other sources. And I don't want to attach another NIC to the VM for access 
to Keystone/Heat, since that violates the customer's rules (this is, by 
design, an isolated network with just a VPN connection to the premises). 
So the first question is: *can Neutron proxy requests to Keystone/Heat the 
way it does for the metadata service*?


The second method (using the QEMU guest agent) gives some control over the 
VM, but, again, I wasn't able to find how this can be achieved using Nova. 
There are some mentions of this functionality, but no details or examples. 
So the second question is: *does Nova support the QEMU guest agent and 
allow use of the available calls of the QEMU-ga protocol, including 
'guest-exec'*?
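
From what I could find so far, Nova only wires up the guest agent channel
when the image carries the hw_qemu_guest_agent=yes property, and uses the
agent internally (for example for filesystem freeze during snapshots); it
does not seem to expose guest-exec through its own API. From the compute
node the agent can still be reached directly, roughly like this (domain
name, pid and command are just examples):

    $ openstack image set --property hw_qemu_guest_agent=yes my-image

    # on the compute node hosting the instance
    $ virsh qemu-agent-command instance-0000abcd \
        '{"execute": "guest-exec", "arguments": {"path": "/usr/bin/uptime", "capture-output": true}}'
    $ virsh qemu-agent-command instance-0000abcd \
        '{"execute": "guest-exec-status", "arguments": {"pid": 1234}}'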


And maybe there are other methods, or other ways to use the methods 
mentioned above, to bypass the isolation while keeping it?


Thank you!

--
Volodymyr Litovka
  "Vision without Execution is Hallucination." -- Thomas Edison

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators