[Openstack-operators] osops-tools-monitoring Dependency problems

2018-10-19 Thread Tomáš Vondra
Hi!
I'm a long time user of monitoring-for-openstack, also known as oschecks.
Concretely, I used a version from 2015 with OpenStack python client
libraries from Kilo. Now I have upgraded them to Mitaka and it got broken.
Even the latest oschecks don't work. I didn't quite expect that, given that
there are several commits from this year e.g. by Nagasai Vinaykumar
Kapalavai and paramite. Can one of them or some other user step up and say
which version of the OpenStack clients oschecks works with? Ideally, it would be
written down in requirements.txt so that the setup is reproducible. Some
documentation of the minimal set of required parameters would also come in
handy.
Thanks a lot, Tomas from Homeatcloud

The error messages are as absurd as:
oschecks-check_glance_api --os_auth_url='http://10.1.101.30:5000/v2.0'
--os_username=monitoring --os_password=XXX --os_tenant_name=monitoring

CRITICAL: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/oschecks/utils.py", line 121, in safe_run
    method()
  File "/usr/lib/python2.7/dist-packages/oschecks/glance.py", line 29, in _check_glance_api
    glance = utils.Glance()
  File "/usr/lib/python2.7/dist-packages/oschecks/utils.py", line 177, in __init__
    self.glance.parser = self.glance.get_base_parser(sys.argv)
TypeError: get_base_parser() takes exactly 1 argument (2 given)

(I can see 4 parameters on the command line.)
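
The traceback looks like an incompatibility between oschecks and the installed
python-*client libraries (get_base_parser() changed its signature between
releases). Until the supported versions are documented, one pragmatic workaround
is to freeze a combination that is known to work and reinstall exactly that
elsewhere; a rough sketch, assuming a pip-based install:

# On a machine where oschecks still works, record the installed client versions.
pip freeze | grep -i client > oschecks-constraints.txt

# On the upgraded machine, reinstall that known-good combination.
pip install -r oschecks-constraints.txt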


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack] volume state (in-use/available) vs real work

2018-04-27 Thread Tomáš Vondra
Hi!
Whether this works or not depends on the backend. For example, on Fibre 
Channel, the VM would still see the same size, because the volume is still presented 
from the storage to the hypervisor with the same size. Changing it on the fly 
requires you to dig deep into Linux SCSI rescans and device-mapper resizes, and write 
some device sizes manually. Luckily, ext4 refuses to mount when the device is 
shorter than it should be.
A live migrate afterwards tends to help, though. The volume then gets presented 
correctly to the new hypervisor without manual magic.

To make it short: This use case is not automated on OpenStack side.
Tomas
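
For reference, the manual path on Fibre Channel looks roughly like the sketch below.
The SCSI addresses and the multipath map name are placeholders, and the exact steps
vary by backend, so treat it as an outline rather than a recipe:

# Rescan the SCSI paths backing the volume so the kernel sees the new size
# (host:bus:target:LUN numbers are placeholders).
echo 1 > /sys/class/scsi_device/6:0:0:5/device/rescan
echo 1 > /sys/class/scsi_device/6:0:1:5/device/rescan

# Grow the multipath map on top of the rescanned paths
# (map name is a placeholder; older multipathd versions need the -k"..." form).
multipathd resize map mpatha

# Finally grow the filesystem on the resized device, e.g. for ext4:
resize2fs /dev/mapper/mpatha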

-Original Message-
From: Volodymyr Litovka [mailto:doka...@gmx.com] 
Sent: Monday, April 23, 2018 10:59 AM
To: OpenStack Mailing List
Subject: [Openstack] volume state (in-use/available) vs real work

Hi colleagues,

In order to change (increase) a boot disk's size "on the fly", I can do 
the following sequence of commands without stopping the VM:

: openstack volume set --state available 
: openstack volume set --state in-use --size 32 

and, if properly configured, disk will be automatically resized by 
cloud-init during next reboot.

Is it dangerous to change the volume state to "available" while the VM is 
actively working? Which side effects can I face while doing this?

Thank you.

-- 
Volodymyr Litovka
   "Vision without Execution is Hallucination." -- Thomas Edison


___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack




Re: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey

2018-04-27 Thread Tomáš Vondra
Hi!

What we've got in our small public cloud:

scheduler_default_filters=AggregateInstanceExtraSpecsFilter,
AggregateImagePropertiesIsolation,
RetryFilter,
AvailabilityZoneFilter,
AggregateRamFilter,
AggregateDiskFilter,
AggregateCoreFilter,
ComputeFilter,
ImagePropertiesFilter,
ServerGroupAntiAffinityFilter,
ServerGroupAffinityFilter

#ComputeCapabilitiesFilter is off because of a conflict with 
AggregateInstanceExtraSpecsFilter: https://bugs.launchpad.net/nova/+bug/1279719

 

I really like to set resource limits using Aggregate metadata.

Also, Windows host isolation is done using image metadata. I have filed a bug 
somewhere that it does not work correctly with Boot from Volume. I believe it 
got pretty much ignored. That’s why we also use flavor metadata.
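
For illustration, the aggregate-based limits and the flavor pinning look roughly
like this (aggregate, host and flavor names as well as the ratio values are just
examples, not our production settings):

# Per-aggregate allocation ratios, read by AggregateCoreFilter/AggregateRamFilter.
openstack aggregate create general-purpose
openstack aggregate add host general-purpose compute01
openstack aggregate set --property cpu_allocation_ratio=4.0 \
  --property ram_allocation_ratio=1.5 general-purpose

# Pin a flavor to the aggregate via AggregateInstanceExtraSpecsFilter.
openstack aggregate set --property windows=true general-purpose
openstack flavor set --property aggregate_instance_extra_specs:windows=true m1.win.large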

 

Tomas from Homeatcloud

 

From: Massimo Sgaravatto [mailto:massimo.sgarava...@gmail.com] 
Sent: Saturday, April 21, 2018 7:49 AM
To: Simon Leinen
Cc: OpenStack Development Mailing List (not for usage questions); OpenStack 
Operators
Subject: Re: [Openstack-operators] [openstack-dev] [nova] Default scheduler 
filters survey

 

enabled_filters = 
AggregateInstanceExtraSpecsFilter,AggregateMultiTenancyIsolation,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,AggregateRamFilter,AggregateCoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

 

Cheers, Massimo

 

On Wed, Apr 18, 2018 at 10:20 PM, Simon Leinen  wrote:

Artom Lifshitz writes:
> To that end, we'd like to know what filters operators are enabling in
> their deployment. If you can, please reply to this email with your
> [filter_scheduler]/enabled_filters (or
> [DEFAULT]/scheduler_default_filters if you're using an older version)
> option from nova.conf. Any other comments are welcome as well :)

We have the following enabled on our semi-public (academic community)
cloud, which runs on Newton:

AggregateInstanceExtraSpecsFilter
AvailabilityZoneFilter
ComputeCapabilitiesFilter
ComputeFilter
ImagePropertiesFilter
PciPassthroughFilter
RamFilter
RetryFilter
ServerGroupAffinityFilter
ServerGroupAntiAffinityFilter

(sorted alphabetically) Recently we've also been trying

AggregateImagePropertiesIsolation

...but it looks like we'll replace it with our own because it's a bit
awkward to use for our purpose (scheduling Windows instance to licensed
compute nodes).
-- 
Simon.


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 



Re: [openstack-dev] [Openstack-operators] [nova] Default scheduler filters survey

2018-04-27 Thread Tomáš Vondra
Hi!

What we've got in our small public cloud:

scheduler_default_filters=AggregateInstanceExtraSpecsFilter,
AggregateImagePropertiesIsolation,
RetryFilter,
AvailabilityZoneFilter,
AggregateRamFilter,
AggregateDiskFilter,
AggregateCoreFilter,
ComputeFilter,
ImagePropertiesFilter,
ServerGroupAntiAffinityFilter,
ServerGroupAffinityFilter

#ComputeCapabilitiesFilter is off because of a conflict with 
AggregateInstanceExtraSpecsFilter: https://bugs.launchpad.net/nova/+bug/1279719

 

I really like to set resource limits using Aggregate metadata.

Also, Windows host isolation is done using image metadata. I have filed a bug 
somewhere that it does not work correctly with Boot from Volume. I believe it 
got pretty much ignored. That’s why we also use flavor metadata.

 

Tomas from Homeatcloud

 

From: Massimo Sgaravatto [mailto:massimo.sgarava...@gmail.com] 
Sent: Saturday, April 21, 2018 7:49 AM
To: Simon Leinen
Cc: OpenStack Development Mailing List (not for usage questions); OpenStack 
Operators
Subject: Re: [Openstack-operators] [openstack-dev] [nova] Default scheduler 
filters survey

 

enabled_filters = 
AggregateInstanceExtraSpecsFilter,AggregateMultiTenancyIsolation,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,AggregateRamFilter,AggregateCoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

 

Cheers, Massimo

 

On Wed, Apr 18, 2018 at 10:20 PM, Simon Leinen  wrote:

Artom Lifshitz writes:
> To that end, we'd like to know what filters operators are enabling in
> their deployment. If you can, please reply to this email with your
> [filter_scheduler]/enabled_filters (or
> [DEFAULT]/scheduler_default_filters if you're using an older version)
> option from nova.conf. Any other comments are welcome as well :)

We have the following enabled on our semi-public (academic community)
cloud, which runs on Newton:

AggregateInstanceExtraSpecsFilter
AvailabilityZoneFilter
ComputeCapabilitiesFilter
ComputeFilter
ImagePropertiesFilter
PciPassthroughFilter
RamFilter
RetryFilter
ServerGroupAffinityFilter
ServerGroupAntiAffinityFilter

(sorted alphabetically) Recently we've also been trying

AggregateImagePropertiesIsolation

...but it looks like we'll replace it with our own because it's a bit
awkward to use for our purpose (scheduling Windows instance to licensed
compute nodes).
-- 
Simon.


___
OpenStack-operators mailing list
openstack-operat...@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack-operators] [openstack-dev] [nova] about rebuild instance booted from volume

2018-03-14 Thread Tomáš Vondra
Hi!
I say delete! Delete them all!
Really, it's called delete_on_termination and should be ignored on Rebuild.
We have a VPS service implemented on top of OpenStack and do throw the old 
contents away on Rebuild. When the user has paid for the Backup service, they can 
restore a snapshot. Backup is implemented as a volume snapshot, then a clone of the 
volume, then an upload to image (Glance is on a different disk array).
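
Roughly, the backup flow is this (volume, snapshot and image names are
placeholders; --force is needed because the root volume stays attached):

openstack volume snapshot create --volume vps-root --force vps-root-snap
openstack volume create --snapshot vps-root-snap --size 20 vps-root-copy
openstack image create --volume vps-root-copy vps-root-backup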

I also sometimes multi-attach a volume manually to a service node and just dd 
an image onto it. If it were to be implemented this way, then there would be no 
deleting of a volume with delete_on_termination, just overwriting. But the effect 
is the same.

IMHO you can have snapshots of volumes that have been deleted. Just some 
backends like our 3PAR don't allow it, but it's not disallowed in the API 
contract.
Tomas from Homeatcloud

-Original Message-
From: Saverio Proto [mailto:ziopr...@gmail.com] 
Sent: Wednesday, March 14, 2018 3:19 PM
To: Tim Bell; Matt Riedemann
Cc: OpenStack Development Mailing List (not for usage questions); 
openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] [openstack-dev] [nova] about rebuild 
instance booted from volume

My idea is that if delete_on_termination flag is set to False the Volume should 
never be deleted by Nova.

my 2 cents

Saverio

2018-03-14 15:10 GMT+01:00 Tim Bell :
> Matt,
>
> To add another scenario and make things even more difficult (sorry), if the 
> original volume has snapshots, I don't think you can delete it.
>
> Tim
>
>
> -Original Message-
> From: Matt Riedemann 
> Reply-To: "OpenStack Development Mailing List (not for usage 
> questions)" 
> Date: Wednesday, 14 March 2018 at 14:55
> To: "openstack-...@lists.openstack.org" 
> , openstack-operators 
> 
> Subject: Re: [openstack-dev] [nova] about rebuild instance booted from 
> volume
>
> On 3/14/2018 3:42 AM, 李杰 wrote:
> >
> >  This is the spec about  rebuild a instance booted from
> > volume.In the spec,there is a
> >question about if we should delete the old root_volume.Anyone who
> > is interested in
> >booted from volume can help to review this. Any suggestion is
> > welcome.Thank you!
> >The link is here.
> >Re:the rebuild spec:https://review.openstack.org/#/c/532407/
>
> Copying the operators list and giving some more context.
>
> This spec is proposing to add support for rebuild with a new image for
> volume-backed servers, which today is just a 400 failure in the API
> since the compute doesn't support that scenario.
>
> With the proposed solution, the backing root volume would be deleted and
> a new volume would be created from the new image, similar to how boot
> from volume works.
>
> The question raised in the spec is whether or not nova should delete the
> root volume even if its delete_on_termination flag is set to False. The
> semantics get a bit weird here since that flag was not meant for this
> scenario, it's meant to be used when deleting the server to which the
> volume is attached. Rebuilding a server is not deleting it, but we would
> need to replace the root volume, so what do we do with the volume we're
> replacing?
>
> Do we say that delete_on_termination only applies to deleting a server
> and not rebuild and therefore nova can delete the root volume during a
> rebuild?
>
> If we don't delete the volume during rebuild, we could end up leaving a
> lot of volumes lying around that the user then has to clean up,
> otherwise they'll eventually go over quota.
>
> We need user (and operator) feedback on this issue and what they would
> expect to happen.
>
> --
>
> Thanks,
>
> Matt
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operator
> s

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators




Re: [openstack-dev] [Openstack-operators] [nova] about rebuild instance booted from volume

2018-03-14 Thread Tomáš Vondra
Hi!
I say delete! Delete them all!
Really, it's called delete_on_termination and should be ignored on Rebuild.
We have a VPS service implemented on top of OpenStack and do throw the old 
contents away on Rebuild. When the user has paid for the Backup service, they can 
restore a snapshot. Backup is implemented as a volume snapshot, then a clone of the 
volume, then an upload to image (Glance is on a different disk array).

I also sometimes multi-attach a volume manually to a service node and just dd 
an image onto it. If it were to be implemented this way, then there would be no 
deleting of a volume with delete_on_termination, just overwriting. But the effect 
is the same.

IMHO you can have snapshots of volumes that have been deleted. Just some 
backends like our 3PAR don't allow it, but it's not disallowed in the API 
contract.
Tomas from Homeatcloud

-Original Message-
From: Saverio Proto [mailto:ziopr...@gmail.com] 
Sent: Wednesday, March 14, 2018 3:19 PM
To: Tim Bell; Matt Riedemann
Cc: OpenStack Development Mailing List (not for usage questions); 
openstack-operat...@lists.openstack.org
Subject: Re: [Openstack-operators] [openstack-dev] [nova] about rebuild 
instance booted from volume

My idea is that if delete_on_termination flag is set to False the Volume should 
never be deleted by Nova.

my 2 cents

Saverio

2018-03-14 15:10 GMT+01:00 Tim Bell :
> Matt,
>
> To add another scenario and make things even more difficult (sorry), if the 
> original volume has snapshots, I don't think you can delete it.
>
> Tim
>
>
> -Original Message-
> From: Matt Riedemann 
> Reply-To: "OpenStack Development Mailing List (not for usage 
> questions)" 
> Date: Wednesday, 14 March 2018 at 14:55
> To: "openstack-dev@lists.openstack.org" 
> , openstack-operators 
> 
> Subject: Re: [openstack-dev] [nova] about rebuild instance booted from 
> volume
>
> On 3/14/2018 3:42 AM, 李杰 wrote:
> >
> >  This is the spec about  rebuild a instance booted from
> > volume.In the spec,there is a
> >question about if we should delete the old root_volume.Anyone who
> > is interested in
> >booted from volume can help to review this. Any suggestion is
> > welcome.Thank you!
> >The link is here.
> >Re:the rebuild spec:https://review.openstack.org/#/c/532407/
>
> Copying the operators list and giving some more context.
>
> This spec is proposing to add support for rebuild with a new image for
> volume-backed servers, which today is just a 400 failure in the API
> since the compute doesn't support that scenario.
>
> With the proposed solution, the backing root volume would be deleted and
> a new volume would be created from the new image, similar to how boot
> from volume works.
>
> The question raised in the spec is whether or not nova should delete the
> root volume even if its delete_on_termination flag is set to False. The
> semantics get a bit weird here since that flag was not meant for this
> scenario, it's meant to be used when deleting the server to which the
> volume is attached. Rebuilding a server is not deleting it, but we would
> need to replace the root volume, so what do we do with the volume we're
> replacing?
>
> Do we say that delete_on_termination only applies to deleting a server
> and not rebuild and therefore nova can delete the root volume during a
> rebuild?
>
> If we don't delete the volume during rebuild, we could end up leaving a
> lot of volumes lying around that the user then has to clean up,
> otherwise they'll eventually go over quota.
>
> We need user (and operator) feedback on this issue and what they would
> expect to happen.
>
> --
>
> Thanks,
>
> Matt
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> ___
> OpenStack-operators mailing list
> openstack-operat...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operator
> s

___
OpenStack-operators mailing list
openstack-operat...@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [Openstack-operators] How are you handling billing/chargeback?

2018-03-13 Thread Tomáš Vondra
Hi!

We at Homeatcloud have rolled our own engine taking data from Ceilometer 
events. However, CloudKitty didn't exist back then. Now we would probably use 
it to calculate the rating AND roll our own engine for billing and invoice 
printing.

Tomas

 

From: Flint WALRUS [mailto:gael.ther...@gmail.com] 
Sent: Monday, March 12, 2018 9:41 PM
To: Lars Kellogg-Stedman
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] How are you handling billing/chargeback?

 

Hi lars, personally using an internally crafted service.

It’s one of my main regret with Openstack, lack of a decent billing system.

Le lun. 12 mars 2018 à 20:22, Lars Kellogg-Stedman  a écrit :

Hey folks,

I'm curious what folks out there are using for chargeback/billing in
your OpenStack environment.

Are you doing any sort of chargeback (or showback)?  Are you using (or
have you tried) CloudKitty?  Or some other existing project?  Have you
rolled your own instead?

I ask because I am helping out some folks get a handle on the
operational side of their existing OpenStack environment, and they are
interested in but have not yet deployed some sort of reporting
mechanism.

Thanks,

--
Lars Kellogg-Stedman  | larsks @ {irc,twitter,github}
http://blog.oddbit.com/|

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



Re: [Openstack] [ironic] how to prevent ironic user to controle ipmi through OS?

2018-01-29 Thread Tomáš Vondra
How about HPE iLO, does anyone know a way to disable access from the OS?

 

From: Tyler Bishop [mailto:tyler.bis...@beyondhosting.net] 
Sent: Sunday, January 28, 2018 2:01 AM
To: Guo James
Cc: openstack
Subject: Re: [Openstack] [ironic] how to prevent ironic user to controle ipmi 
through OS?

 

On Dell DRAC you can disable IPMI/RAC control at the device for OS 
configuration.

 

With Supermicro IPMI you just need to create a random user and random password 
that is not "admin".

 

 

_

Tyler Bishop

Founder EST 2007

 


 

O: 513-299-7108 x10

M: 513-646-5809

http://BeyondHosting.net

 

 

This email is intended only for the recipient(s) above and/or otherwise 
authorized personnel. The information contained herein and attached is 
confidential and the property of Beyond Hosting. Any unauthorized copying, 
forwarding, printing, and/or disclosing any information related to this email 
is prohibited. If you received this message in error, please contact the sender 
and destroy all copies of this email and any attachment(s).

 

  _  

From: "Guo James" 
To: xief...@sina.com, "openstack" 
Sent: Wednesday, January 10, 2018 10:16:34 PM
Subject: Re: [Openstack] [ironic] how to prevent ironic user to controle ipmi 
through OS?

 

An ironic user can change the IPMI address so that OpenStack ironic loses control of 
the bare metal node.

I think that is unacceptable.

It seems that we should make the ironic image without root privileges.

 

From: xief...@sina.com [mailto:xief...@sina.com] 
Sent: Thursday, January 11, 2018 9:12 AM
To: Guo James; openstack
Subject: Re: [Openstack] [ironic] how to prevent ironic user to controle ipmi 
through OS?

 

If you cannot get the username and password of the OS, you cannot modify the IPMI 
configuration even though you got the ironic user info.

 

 

- Original Message -
From: Guo James 
To: "openstack@lists.openstack.org" 
Subject: [Openstack] [ironic] how to prevent ironic user to controle ipmi through OS?
Date: 2018-01-10 17:21


I notice that after an ironic user gets a bare metal node successfully, he can 
access IPMI through the IPMI device although he can't access IPMI through the LAN.
How can this situation be prevented?
If he modifies the IPMI configuration, that will be a mess.
___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack




Re: [Openstack-operators] thierry's longer dev cycle proposal

2017-12-15 Thread Tomáš Vondra
The thread on the dev list is already too long for my liking. I hope there will 
be a TL;DR in the dev mailing list digest.
Tomas

-Original Message-
From: arkady.kanev...@dell.com [mailto:arkady.kanev...@dell.com] 
Sent: Thursday, December 14, 2017 3:40 AM
To: mrhills...@gmail.com; fu...@yuggoth.org; 
openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] thierry's longer dev cycle proposal

It is a sign of the maturity of OpenStack. With lots of deployments, most of 
them in production, the emphasis is shifting from rapid functionality additions 
to stability, manageability, and long-term operability.

-Original Message-
From: Melvin Hillsman [mailto:mrhills...@gmail.com] 
Sent: Wednesday, December 13, 2017 5:29 PM
To: Jeremy Stanley ; openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] thierry's longer dev cycle proposal

I think this is a good opportunity to allow some stress relief to the developer 
community and offer space for more discussions with operators where some 
operators do not feel like they are bothering/bugging developers. I believe 
this is the main gain for operators; my personal opinion. In general I think 
the opportunity costs/gains are worth it for this and it is the responsibility 
of the community to make the change be useful as you mentioned in your original 
thread Thierry. It is not a silver bullet for all of the issues folks have with 
the way things are done but I believe that if it does not hurt things and 
offers even a slight gain in some area it makes sense.

Any change is not going to satisfy/dis-satisfy 100% of the constituents.

-- 
Kind regards,

Melvin Hillsman
mrhills...@gmail.com
mobile: +1 (832) 264-2646
irc: mrhillsman

On 12/13/17, 4:39 PM, "Jeremy Stanley"  wrote:

On 2017-12-13 22:35:41 +0100 (+0100), Thierry Carrez wrote:
[...]
> It's not really fait accompli, it's just a proposal up for discussion at
> this stage. Which is the reason why I started the thread on -dev -- to
> check the sanity of the change from a dev perspective first. If it makes
> things harder and not simpler on that side, I don't expect the TC to
> proceed.
[...]

With my TC hat on, regardless of what impression the developer
community has on this, I plan to take subsequent operator and
end-user/app-dev feedback into account as well before making any
binding decisions (and expect other TC members feel the same).
-- 
Jeremy Stanley
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack] Computer Science Master's Thesis Topics?

2017-12-01 Thread Tomáš Vondra
Hi Felix!
I'm a PhD student at CTU FEE. My own topic was simulation of automatic scaling 
using both the event-driven approach and queue network modelling. The second 
part of the thesis was on the prediction of web server workload in time. If you 
wanted, you could continue and simulate some predictive autoscaling algorithms. 
The framework is already there, and I have tons of literature on how to 
continue, but no time, as I'm working in the public cloud industry now.

It would also be interesting to extend the prediction, which relies on daily 
seasonality, and do the other approach: short-term anomaly detection.

One could also apply performance engineering models to machine learning, to 
better estimate the length of big data processing jobs.

I also have some free practical systems engineering topics from clouds and big 
data, such as:
Compare Hadoop access methods - jump box vs. Apache Knox
Automatic server configuration integrated with network management
Compare an enterprise disk array with software defined storage
Install and benchmark alternate network layers in OpenStack

That's about all I have ready at the time.
Tomas

-Original Message-
From: Felix Fischer [mailto:m...@felixfischer.org] 
Sent: Thursday, November 30, 2017 9:02 PM
To: openstack@lists.openstack.org
Subject: [Openstack] Computer Science Master's Thesis Topics?

Hey Folks,

I'm Felix and I'm looking for a topic for my master's thesis in computer 
science. I studied at the Humboldt University of Berlin and currently I'm 
working as an OpenStack consultant. I was hoping some of you could point me to 
areas or even specific topics related to OpenStack which need further 
scientific research. My topic needs to be about a scientific research question 
and cannot be a programming task or something similar to that. Maybe some of 
you have a problem for me which needs solving or can point me directions where 
to look at or who to ask?

Thank you in advance and kind regards,

Felix Fischer

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack




Re: [Openstack] Compute Node shutdown how to prevent instance suspend

2017-11-02 Thread Tomáš Vondra
Hi!

When I need to reboot a compute node (because of some driver lock-up or similar 
problems), I first stop nova-compute so that it does not report the SHUTOFF 
state of the VMs to the database, and resume_guests_state_on_host_boot=true then 
actually starts them again after the reboot. Then I press the power switch on the 
VMs using virsh shutdown, one by one, and only after that do I reboot the node. 
This is a script I scrounged somewhere and modified so that it does not take too much time:

Tomas

 

root@cmp03:~# cat vm-shutdown
#!/bin/bash
# file: /usr/local/sbin/vm-shutdown
# Description: shut down active virtual machines
debug=1
#fake=1

# Get the list of active virtual machines
vmList="`virsh list | (
while read vmID vmName vmStatus
do
  if [ -n "$vmName" -a "$vmName" != "Name" -a "$vmName" != "Domain-0" ]
  then
    [ -z "$vmList" ] && vmList="$vmName" || vmList="$vmList $vmName"
  fi
done
echo $vmList )`"

# Check that there are some active VMs
if [ -n "$vmList" ]; then
  # Shut down VMs with verification
  for vmName in $vmList
  do
    # send the initial ACPI shutdown request
    [ -n "$debug" ] && echo -n "Attempting to shutdown $vmName "
    [ -z "$fake" ] && virsh shutdown $vmName
    # wait a limited time for the VM to stop running
    count=30
    while $( virsh list | grep $vmName >/dev/null ) && [ $count -gt 0 ]
    do
      sleep 1
      let count=count-1
      [ -n "$debug" ] && echo -n "."
    done
    # report current status
    ( virsh list | grep $vmName >/dev/null ) && echo " failed!" || echo "down."
    # if still running, destroy it (hard power-off)
    if ( virsh list | grep $vmName >/dev/null )
    then
      [ -n "$debug" ] && echo -n "Attempting to destroy $vmName "
      [ -z "$fake" ] && virsh destroy $vmName
      # wait a limited time for the VM to stop running
      count=30
      while $( virsh list | grep $vmName >/dev/null ) && [ $count -gt 0 ]
      do
        sleep 1
        let count=count-1
        [ -n "$debug" ] && echo -n "."
      done
      # report current status
      ( virsh list | grep $vmName >/dev/null ) && echo " failed!" || echo "down."
    fi
  done
fi
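
The whole sequence then looks roughly like this (the nova-compute service name
depends on your distribution and deployment tool):

service nova-compute stop      # keep the SHUTOFF state out of the Nova database
/usr/local/sbin/vm-shutdown    # ACPI-shutdown (or destroy) all running guests
reboot                         # resume_guests_state_on_host_boot=true starts the VMs again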

 

From: Tzach Shefi [mailto:tsh...@redhat.com] 
Sent: Thursday, November 02, 2017 9:55 AM
To: Chris
Cc: openstack@lists.openstack.org
Subject: Re: [Openstack] Compute Node shutdown how to prevent instance suspend

 

Hi, 

 

A better Q would be: why do you shut down a compute node to begin with?

I mean if you need to, you should do so in an orderly fashion: basically evacuate
instances or shut instances down manually, and put the compute node in maintenance mode.

On rebooting the compute node, remove it from maintenance mode, and turn on instances or
migrate them back to this compute node should you need to.

Or delete them if you wish.

 

There is this nova option:

resume_guests_state_on_host_boot=true

 

But it doesn't delete or shut down instances; rather, it turns them on 
automatically once the compute host resumes.

It might also work for you; probably not, but I am just mentioning it anyway.

 

I don't know of an option to stop/delete instance on compute node shutdown. 

 

Another option to check: maybe you could shelve suspended instances and then later 
delete them.

 

Shelving stops the instance and takes a snapshot of it. Then depending on the 
value of the shelved_offload_time config option, the instance is deleted from 
the hypervisor (0), never deleted (-1), or deleted after some period of time (> 
0). Note that it's just destroying the backing instance on the hypervisor, the 
actual instance in the nova database is not deleted. Then you can later 
unshelve the instance:

 

 

This might help, but note that if you mess with KVM without updating Nova you 
might be left hanging elsewhere :)

https://ask.fedoraproject.org/en/question/8796/make-libvirt-to-shutdown-my-guests-not-suspend/

 

 

 

 

On Thu, Nov 2, 2017 at 9:03 AM, Chris  wrote:

Hello,

When we shut down a compute node the instances running on it get suspended. 
This generates some difficulties, as some applications like RabbitMQ don't like 
to be suspended. Is there a way to change this behavior so that the running 
instances get killed or shut down instead?

Thanks in advance.

Cheers,
Chris

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack





 

-- 

Tzach Shefi

Senior Quality Engineer, RHCSA

  Red Hat 



  tsh...@redaht.com   M: +972-54-4701080   IM: tshefi



 

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack-operators] [nova] Should we allow passing new user_data during rebuild?

2017-10-06 Thread Tomáš Vondra
Dear Clint,
maybe you misunderstood a little, or I didn't write it explicitly. We use 
OpenStack for providing a VPS service, yes. But the VPS users do not get access 
to OpenStack directly; instead, they use our Customer Portal, which does the 
orchestration. The whole point is to make the service as easy as possible for them 
to use and not expose them to the complexity of the cloud. As I said, we 
couldn't use Rebuild because the VPSs have Volumes. We do use Resize because it is 
there. But we could just as well use more low-level cloud primitives. The user does 
not care in this case. How does, e.g., WHMCS do it? That is stock software 
that you can use to provide VPS over OpenStack.
Tomas from Homeatcloud

-Original Message-
From: Clint Byrum [mailto:cl...@fewbar.com] 
Sent: Thursday, October 05, 2017 6:50 PM
To: openstack-operators
Subject: Re: [Openstack-operators] [nova] Should we allow passing new user_data 
during rebuild?

No offense is intended, so please forgive me for the possibly incendiary nature 
of what I'm about to write:

VPS is the predecessor of cloud (and something I love very much, and rely on 
every day!), and encourages all the bad habits that a cloud disallows. At small 
scale, it's the right thing, and that's why I use it for my small scale needs. 
Get a VM, put your stuff on it, and keep it running forever.

But at scale, VMs in clouds go away. They get migrated, rebooted, turned off, 
and discarded, often. Most clouds are terrible for VPS compared to VPS hosting 
environments.

I'm glad it's working for you. And I think rebuild and resize will stay and 
improve to serve VPS style users in interesting ways. I'm learning now who our 
users are today, and I'm confident we should make sure everyone who has taken 
the time to deploy and care for OpenStack should be served by expanding rebuild 
to meet their needs.

You can all consider this my white flag. :)

Excerpts from Tomáš Vondra's message of 2017-10-05 10:22:14 +0200:
> In our cloud, we offer the possibility to reinstall the same or another OS on 
> a VPS (Virtual Private Server). Unfortunately, we couldn’t use the rebuild 
> function because of the VPS‘s use of Cinder for root disk. We create a new 
> instance and inject the same User Data so that the new instance has the same 
> password and key as the last one. It also has the same name, and the same 
> floating IP is attached. I believe it even has the same IPv6 through some 
> Neutron port magic.
> 
> BTW, you wouldn’t believe how often people use the Reinstall feature.
> 
> Tomas from Homeatcloud
> 
>  
> 
> From: Belmiro Moreira [mailto:moreira.belmiro.email.li...@gmail.com]
> Sent: Wednesday, October 04, 2017 5:34 PM
> To: Chris Friesen
> Cc: openstack-operators@lists.openstack.org
> Subject: Re: [Openstack-operators] [nova] Should we allow passing new 
> user_data during rebuild?
> 
>  
> 
> In our cloud rebuild is the only way for a user to keep the same IP. 
> Unfortunately, we don't offer floating IPs, yet.
> 
> Also, we use the user_data to bootstrap some actions in new instances 
> (puppet, ...).
> 
> Considering all the use-cases for rebuild it would be great if the user_data 
> can be updated at rebuild time.
> 
>  
> 
> On Wed, Oct 4, 2017 at 5:15 PM, Chris Friesen  
> wrote:
> 
> On 10/03/2017 11:12 AM, Clint Byrum wrote:
> 
> My personal opinion is that rebuild is an anti-pattern for cloud, and 
> should be frozen and deprecated. It does nothing but complicate Nova 
> and present challenges for scaling.
> 
> That said, if it must stay as a feature, I don't think updating the 
> user_data should be a priority. At that point, you've basically 
> created an entirely new server, and you can already do that by 
> creating an entirely new server.
> 
> 
> If you've got a whole heat stack with multiple resources, and you realize 
> that you messed up one thing in the template and one of your servers has the 
> wrong personality/user_data, it can be useful to be able to rebuild that one 
> server without affecting anything else in the stack.  That's just a 
> convenience though.
> 
> Chris
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators




Re: [Openstack-operators] [nova] Should we allow passing new user_data during rebuild?

2017-10-05 Thread Tomáš Vondra
In our cloud, we offer the possibility to reinstall the same or another OS on a 
VPS (Virtual Private Server). Unfortunately, we couldn’t use the rebuild 
function because of the VPS‘s use of Cinder for root disk. We create a new 
instance and inject the same User Data so that the new instance has the same 
password and key as the last one. It also has the same name, and the same 
floating IP is attached. I believe it even has the same IPv6 through some 
Neutron port magic.
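
Under the hood it is roughly this sequence (names, sizes and addresses are
placeholders; our portal drives it through the API rather than the CLI):

openstack server delete vps-123
openstack volume create --image centos-7 --size 20 vps-123-root
openstack server create --flavor vps.small --volume vps-123-root \
  --user-data userdata-vps-123.txt --nic net-id=<tenant-net> vps-123
openstack server add floating ip vps-123 198.51.100.10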

BTW, you wouldn’t believe how often people use the Reinstall feature.

Tomas from Homeatcloud

 

From: Belmiro Moreira [mailto:moreira.belmiro.email.li...@gmail.com] 
Sent: Wednesday, October 04, 2017 5:34 PM
To: Chris Friesen
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] [nova] Should we allow passing new user_data 
during rebuild?

 

In our cloud rebuild is the only way for a user to keep the same IP. 
Unfortunately, we don't offer floating IPs, yet.

Also, we use the user_data to bootstrap some actions in new instances (puppet, 
...).

Considering all the use-cases for rebuild it would be great if the user_data 
can be updated at rebuild time.

 

On Wed, Oct 4, 2017 at 5:15 PM, Chris Friesen  
wrote:

On 10/03/2017 11:12 AM, Clint Byrum wrote:

My personal opinion is that rebuild is an anti-pattern for cloud, and
should be frozen and deprecated. It does nothing but complicate Nova
and present challenges for scaling.

That said, if it must stay as a feature, I don't think updating the
user_data should be a priority. At that point, you've basically created an
entirely new server, and you can already do that by creating an entirely
new server.


If you've got a whole heat stack with multiple resources, and you realize that 
you messed up one thing in the template and one of your servers has the wrong 
personality/user_data, it can be useful to be able to rebuild that one server 
without affecting anything else in the stack.  That's just a convenience though.

Chris




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 



Re: [Openstack-operators] cinder/nova issues

2017-08-24 Thread Tomáš Vondra
Hi!

If this was OpenStack Kilo and HPE 3PAR over Fibre Channel, I would tell you 
that the volume extend operation is designed to work with detached volumes 
only. Hence you need cinder reset-state. At least in our case, it does not 
update the SCSI devices and multipath setup. The volume continues to work with 
the old size. We do a live migrate operation afterwards to disconnect the 
storage from one node and connect to another. Even resize to the same node 
works. However, os-brick was introduced in Liberty, so the case may be 
different.

Tomas
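
For completeness, the sequence we use on Kilo with 3PAR is roughly this
(volume and server names are placeholders):

cinder reset-state --state available <volume>
cinder extend <volume> 100
cinder reset-state --state in-use <volume>
# the attached paths still show the old size, so move the VM to refresh them
nova live-migration <server>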

 

From: Adam Dibiase [mailto:adibi...@digiumcloud.com] 
Sent: Wednesday, August 23, 2017 9:06 PM
To: Sean McGinnis
Cc: openstack-operators@lists.openstack.org
Subject: Re: [Openstack-operators] cinder/nova issues

 

Thanks Sean. I filed a bug report to track this: Bug #1712651. I would agree 
with you on connectivity issues with the Netapp if it happened on all volume 
extensions, but this happens in one scenario only.




Thanks, 

 

Adam

 

 

 

 

On Wed, Aug 23, 2017 at 2:04 PM, Sean McGinnis  wrote:

Hey Adam,

There have been some updates since Liberty to improve handling in the os-brick
library that handles the local device management. But with this showing the
paths down, I wonder if there's something else going on there between the
NetApp box and the Nova compute host.

Could you file a bug to track this? I think you could just copy and paste the
content of your original email since it captures a lot of great info.

https://bugs.launchpad.net/cinder/+filebug

We can tag it with netapp so maybe it will get some attention there.

Thanks,
Sean

On Wed, Aug 23, 2017 at 01:01:24PM -0400, Adam Dibiase wrote:
> Greetings,
>
> I am having an issue with nova starting an instance that is using a root
> volume that cinder has extended. More specifically, a volume that has been
> extended past the max resize limit of our Netapp filer. I am running
> Liberty and upgraded cinder packages to 7.0.3 from 7.0.0 to take advantage
> of this functionality. From what I can gather, it uses sub-lun cloning to
> get past the hard limit set by Netapp when cloning past 64G (starting from
> a 4G volume).
>
> *Environment*:
>
>- Release: Liberty
>- Filer:   Netapp
>- Protocol: Fiberchannel
>- Multipath: yes
>
>
>
> *Steps to reproduce: *
>
>- Create new instance
>- stop instance
>- extend the volume by running the following commands:
>   - cinder reset-state --state available (volume-ID or name)
>   - cinder extend (volume-ID or name) 100
>   - cinder reset-state --state in-use (volume-ID or name)
>- start instance with either nova start or nova reboot --hard  --same
>result
>
>
> I can see that the instance's multipath status is good before the resize...
>
> *360a98000417643556a2b496d58665473 dm-17 NETAPP  ,LUN *
>
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
>
> |-+- policy='round-robin 0' prio=-1 status=active
>
> | |- 6:0:1:5 sdy   65:128 active undef  running
>
> | `- 7:0:0:5 sdz   65:144 active undef  running
>
> `-+- policy='round-robin 0' prio=-1 status=enabled
>
>   |- 6:0:0:5 sdx   65:112 active undef  running
>
>   `- 7:0:1:5 sdaa  65:160 active undef  running
>
>
> Once the volume is resized, the lun goes to a failed state and it does not
> show the new size:
>
>
> *360a98000417643556a2b496d58665473 dm-17 NETAPP  ,LUN *
>
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
>
> |-+- policy='round-robin 0' prio=-1 status=enabled
>
> | |- 6:0:1:5 sdy   65:128 failed undef  running
>
> | `- 7:0:0:5 sdz   65:144 failed undef  running
>
> `-+- policy='round-robin 0' prio=-1 status=enabled
>
>   |- 6:0:0:5 sdx   65:112 failed undef  running
>
>   `- 7:0:1:5 sdaa  65:160 failed undef  running
>
>
> Like I said, this only happens on volumes that have been extended past 64G.
> Smaller sizes to not have this issue. I can only assume that the original
> lun is getting destroyed after the clone process and that is cause of the
> failed state. Why is it not picking up the new one and attaching it to the
> compute node?  Is there something I am missing?
>
> Thanks in advance,
>
> Adam

> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] custom build image is slow

2017-08-01 Thread Tomáš Vondra
Hi!

How big are the actual image files? Because qcow2 is a sparse format, it does 
not store zeroes. If the free space in one image is zeroed out, it will convert 
much faster. If that is the problem, use "dd if=/dev/zero of=temp;sync;rm temp" 
or zerofree.

Tomas
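
A quick way to check, and to shrink the image again (file names are placeholders):

# Compare the virtual size with the actual disk usage of the file.
qemu-img info custom.qcow2

# Inside the guest, zero out the free space (or run zerofree on the unmounted
# filesystem), then re-convert; the conversion drops the zeroed clusters.
dd if=/dev/zero of=/tmp/zerofill bs=1M; sync; rm /tmp/zerofill
qemu-img convert -O qcow2 custom.qcow2 custom-sparse.qcow2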

 

From: Paras pradhan [mailto:pradhanpa...@gmail.com] 
Sent: Monday, July 31, 2017 11:54 PM
To: openstack-operators@lists.openstack.org
Subject: [Openstack-operators] custom build image is slow

 

Hello

 

I have two qcow2 images uploaded to glance. One is the CentOS 7 cloud image 
downloaded from centos.org.  The other one is custom built using the CentOS 7 DVD.  
When I create cinder volumes from them, volume creation from the custom-built 
image is very, very slow.

 

 

CenOS qcow2:

 

2017-07-31 21:42:44.287 881609 INFO cinder.image.image_utils 
[req-ea2d7b12-ae9e-45b2-8b4b-ea8465497d5a 
e090e605170a778610438bfabad7aa7764d0a77ef520ae392e2b59074c9f88cf 
490910c1d4e1486d8e3a62d7c0ae698e - d67a18e70dd9467db25b74d33feaad6d default] 
Converted 8192.00 MB image at 253.19 MB/s

 

Custom built qcow2:

INFO cinder.image.image_utils [req-032292d8-1500-474d-95c7-2e8424e2b864 
e090e605170a778610438bfabad7aa7764d0a77ef520ae392e2b59074c9f88cf 
490910c1d4e1486d8e3a62d7c0ae698e - d67a18e70dd9467db25b74d33feaad6d default] 
Converted 10240.00 MB image at 32.22 MB/s

 

I used the following command to create the qcow2 file

qemu-img create -f qcow2 custom.qcow2 10G

 

What am I missing ?

 

Thanks
Paras.

 

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack] dhcp namespace and vert pairs?

2017-07-18 Thread Tomáš Vondra
Hi!
I think that the DHCP server does not work for the laptop because it only 
serves MAC:IP pairs configured by Neutron. You can confirm this by stealing a MAC 
address configured for an instance. Stop the instance and try dhclient ;-).

However, I'm not enough of a Neutron guru to guarantee that you will be 
able to communicate everywhere when you patch in like this. Especially with Neutron 
DVR, there is some filtering and rewriting being done in Open vSwitch.
Tomas
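
If you do want the laptop to be served, one option might be to register a port for
it so that dnsmasq gets a MAC:IP mapping (network name and MAC address below are
placeholders):

openstack port create --network private --mac-address 52:54:00:12:34:56 laptop-port
openstack port show laptop-port -c fixed_ips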

-Original Message-
From: Ali Volkan Atli [mailto:volkan.a...@argela.com.tr] 
Sent: Monday, July 17, 2017 12:46 PM
To: openstack@lists.openstack.org
Subject: [Openstack] dhcp namespace and vert pairs?


I have a VM launched using OpenStack and a laptop added directly into 
integration bridge using ovs-vsctl add-port option. Also I used ./stack script 
from devstack github and my local.conf as below:

argela@cloud:~$ sudo ovs-vsctl add-port br-int eno4 tag=1

argela@cloud:~$ sudo ovs-vsctl show
...
Bridge br-int
Port "tapb22e0bb6-c6"
tag: 1
Interface "tapb22e0bb6-c6"
type: internal
Port "eno4"
tag: 1
Interface "eno4"
   ...

root@cloud:~/devstack# cat local.conf
[[local|localrc]]
ADMIN_PASSWORD=admin
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD

FLAT_INTERFACE=eno1,eno3,eno4,eno5,eno6,eno7,eno8

# Fixed and floating subnets
FIXED_RANGE=10.254.1.0/24
FLOATING_RANGE="192.168.111.0/24"

When I run dhclient in the instance launched from OpenStack, I can see the 
bootp/dhcp messages in the dhcp network namespace, but when I try to run dhclient 
on the external laptop, I can only see the discover message in the dhcp namespace; 
the laptop does not get any response. So the VM can get an IP address but the laptop 
cannot. I checked the iptables rules and flow entries in OvS but I could not 
understand why the laptop cannot get a response from the dhcp namespace.

stack@cloud:~/devstack$ ip netns list
qrouter-b1285ebf-d7f6-4af5-bf13-54356b073ca2
qdhcp-f0d79126-a5f2-46a6-90a9-b0e2f805f93d

dhcp namespace iptable is as follows:

root@cloud:~/devstack# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N neutron-dhcp-age-FORWARD
-N neutron-dhcp-age-INPUT
-N neutron-dhcp-age-OUTPUT
-N neutron-dhcp-age-local
-N neutron-filter-top
-A INPUT -j neutron-dhcp-age-INPUT
-A FORWARD -j neutron-filter-top
-A FORWARD -j neutron-dhcp-age-FORWARD
-A OUTPUT -j neutron-filter-top
-A OUTPUT -j neutron-dhcp-age-OUTPUT
-A neutron-filter-top -j neutron-dhcp-age-local

One more question. I can see that there are namespaces created by OpenStack, 
qrouter and qdhcp. I know that if I want to connect a namespace to OvS, I 
need to create veth pairs (e.g. ip link add veth0 type veth peer name veth1), 
and then assign one peer to a namespace (e.g. ip link set veth1 netns blue) and 
the other to OvS. But for OpenStack I also cannot find any veth pairs. How did 
OpenStack connect the dhcp namespace to OvS? How can I find out which "veth 
peer" the dhcp namespaces use?

Hope someone answers. Thanks in advance.

- Volkan

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack




Re: [Openstack] Mirantis Openstack - Instances lose IP in certain nodes

2017-06-29 Thread Tomáš Vondra
Hi!

What kind of networking plugin have you deployed with Fuel?

Do you mean fixed IP or Floating IP?

Tomas

 

From: Raja T Nair [mailto:rtn...@gmail.com] 
Sent: Tuesday, June 27, 2017 8:54 PM
To: openstack@lists.openstack.org
Subject: [Openstack] Mirantis Openstack - Instances lose IP in certain nodes

 

Hello All,

In my Mirantis cluster, recently VMs in some nodes have started to lose IPs.

After a reboot of the node, none of the VMs there gets an IP.

Version details below:

[root@fuel ~]# cat /etc/fuel/version.yaml 
VERSION:
  feature_groups:
- mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "301"
  build_id: "301"

Any help to troubleshoot this situation is highly appreciated.

Regards,

Raja.



-- 

:^)

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack] Floating IP issues in multiple cloud installs.

2017-06-20 Thread Tomáš Vondra
Hi!
Do you have Neutron DVR enabled? Multiple Network nodes?
Tomas

-Original Message-
From: Brian Haley [mailto:haleyb@gmail.com] 
Sent: Tuesday, June 20, 2017 3:30 AM
To: Ken D'Ambrosio
Cc: Openstack
Subject: Re: [Openstack] Floating IP issues in multiple cloud installs.

On 06/19/2017 08:51 AM, Ken D'Ambrosio wrote:
> Hi, all.  We've got two Canonical Newton installs using VLANs and 
> we're having intermittent issues we simply can't figure out.  (Note 
> that a third installation using flat networks is not having this 
> issue.) Floating IPs set up and work... sporadically.
> 
> * Stateful connections (e.g., SSH) often drop after seconds of use to 
> both the FIP and when SSH'd in from the
> * We see RSTs in our TCP dumps
> * Pings work for a while, then don't.
> * We see lots of ARP requests -- even one right after another -- to 
> resolve hosts on the internal subnets:
> 05:43:25.859448 ARP, Request who-has 80.0.0.3 tell 80.0.0.1, length 28
> 05:43:25.859563 ARP, Reply 80.0.0.3 is-at fa:16:3e:28:af:77, length 28
> 05:43:25.964417 ARP, Request who-has 80.0.0.3 tell 80.0.0.1, length 46
> 05:43:25.964572 ARP, Reply 80.0.0.3 is-at fa:16:3e:28:af:77, length 28
> 05:43:26.963989 ARP, Request who-has 80.0.0.3 tell 80.0.0.1, length 46
> 05:43:26.964156 ARP, Reply 80.0.0.3 is-at fa:16:3e:28:af:77, length 28

Was that run with '-i any' or on a single interface?  I would check the ARP 
cache to make sure things entries are in a complete/reachable state. 
  Or even syslog for any other errors.

> 80.0.0.1 is the qrouter.  I can't imagine why it asked -- and was 
> ACK'd in each case -- three times in just over a second.  In 
> hindsight, I should have checked to have seen if the ACK showed up in 
> the qrouter's ARP table.  Next time...

I'd also specify -e to tcpdump to see the MACs involved.  Possibly there is 
something else configured with the same IP on the VLAN (shouldn't happen, but 
worth checking).

-Brian

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack




Re: [Openstack-operators] [nova] instance partition size discrepancy

2017-04-26 Thread Tomáš Vondra
Have you tried
# resize2fs /dev/vda
?
Alternatively, if you use images with cloud-init and initramfs-growroot 
installed, it should work out of the box.
Tomas
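
If the root filesystem sits on a partition rather than on the whole disk, the
partition usually has to grow first. A rough sketch, assuming /dev/vda1 and the
growpart tool from cloud-utils:

growpart /dev/vda 1
resize2fs /dev/vda1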

-Original Message-
From: Carlos Konstanski [mailto:ckonstan...@pippiandcarlos.com] 
Sent: Wednesday, April 26, 2017 12:02 AM
To: openstack-operators@lists.openstack.org
Subject: [Openstack-operators] [nova] instance partition size discrepancy

I'm having an issue where the instance thinks its root filesystem is much 
smaller than the size of the volume that I used to create it. Not only that, 
the OS cannot decide on whether it thinks the size is right or wrong.

See the following pastebin:
https://paste.pound-python.org/show/eNt8nLNLhHAL5OYICqbs/

Notice that everything shows the size as 20 GB except df, which shows it as 2.8 
GB. I ran the previous instance out of space before spinning up this new one, 
so 2.8 seems to be the winner (though wrong).

Figured I'd check to see if this is a known issue while I dig deeper.

Carlos Konstanski

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators




Re: [Openstack] poor bandwidth across instances running on same host

2017-04-18 Thread Tomáš Vondra
Sorry to shatter your expectations, but those numbers are perfectly OK.

I was testing on an HPE DL380 Gen9 with an Intel Xeon E5-2630 v3, and I got these 
speeds between two KVM VMs on the same host using netperf:

28186 Mb/s with linux bridge

18552 Mb/s with OpenVSwitch with the full Neutron setup with iptables.

 

How much would you like to achieve? I got 38686 on dev lo on the physical 
server and 47894 on a VM. You could turn to OVS with DPDK as the data path, but I 
doubt it will do much. SR-IOV might, but I never tried any of this. I’m 
satisfied with the speed for my purposes.

Tomas from Homeatcloud
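
For reference, a netperf TCP stream test between the two VMs looks roughly like
this (the peer address is a placeholder):

# on the receiving VM
netserver

# on the sending VM
netperf -H 192.168.1.105 -t TCP_STREAM -l 30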

 

From: Manuel Sopena Ballesteros [mailto:manuel...@garvan.org.au] 
Sent: Tuesday, April 18, 2017 9:11 AM
To: openstack@lists.openstack.org
Subject: [Openstack] poor bandwidth across instances running on same host

 

Hi all,

 

I created 2 instances on the same compute node and tested the bandwidth between 
them, surprisingly iperf tells me I got 16.1Gbits/sec only. Then I changed the 
firewall from hybrid iptables to ovs, the bandwidth improved a little bit to 
17.5Gbits/sec but still far from expected. 

 

Ml2_config.ini config file

 

[root@nova-compute ~]# docker exec -t neutron_openvswitch_agent vi 
/var/lib/kolla/config_files/ml2_config.ini
network_vlan_ranges =
 
[ml2_type_flat]
flat_networks = physnet1
 
[ml2_type_vxlan]
vni_ranges = 1:1000
vxlan_group = 239.1.1.1
 
[securitygroup]
firewall_driver = openvswitch
#firewall_driver = 
neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
 
[agent]
tunnel_types = vxlan
l2_population = true
arp_responder = true
 
[ovs]
bridge_mappings = physnet1:br-ex
ovsdb_connection = tcp:129.94.72.54:6640
local_ip = 10.1.0.12

 

 

 

ovs config

 

[root@nova-compute ~]# docker exec openvswitch_vswitchd ovs-vsctl show
306d62c4-8e35-45e0-838e-53ebe81f1d06
    Bridge br-ex
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "eno50336512"
            Interface "eno50336512"
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "vxlan-0a01000b"
            Interface "vxlan-0a01000b"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="10.1.0.12", out_key=flow, remote_ip="10.1.0.11"}
        Port br-tun
            Interface br-tun
                type: internal
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port "tapa26ee521-3b"
            tag: 2
            Interface "tapa26ee521-3b"
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
        Port "tap1f76851b-ea"
            tag: 2
            Interface "tap1f76851b-ea"
 
 
 

Iperf results

 
[centos@centos7 ~]$ iperf -c 192.168.1.105

Client connecting to 192.168.1.105, TCP port 5001
TCP window size: 45.0 KByte (default)

[  3] local 192.168.1.101 port 48522 connected with 192.168.1.105 port 5001
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec  20.3 GBytes  17.5 Gbits/sec

 

 

 

Ovs info

 

[root@nova-compute ~]# docker exec openvswitch_vswitchd modinfo openvswitch

filename:   
/lib/modules/3.10.0-514.el7.x86_64/kernel/net/openvswitch/openvswitch.ko

license:GPL

description:Open vSwitch switching datapath

rhelversion:7.3

srcversion: B31AE95554C9D9A0067F935

depends:
nf_conntrack,nf_nat,libcrc32c,nf_nat_ipv6,nf_nat_ipv4,nf_defrag_ipv6

intree: Y

vermagic:   3.10.0-514.el7.x86_64 SMP mod_unload modversions

signer: CentOS Linux kernel signing key

sig_key:D4:88:63:A7:C1:6F:CC:27:41:23:E6:29:8F:74:F0:57:AF:19:FC:54

sig_hashalgo:   sha256

 

 

As far as I know the communication is VM <-> OVS <-> VM and the linux bridge is 
not involved.

 

What could be throttling the network traffic and what can I do to improve 
performance?

 

Thank you very much

 

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research 
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 8507 | E: 

Re: [Openstack] VMs lost IP after running for few days on OpenStack Liberty.

2017-04-18 Thread Tomáš Vondra
Hi Dan,

that normally should not happen. Presuming you are running Neutron, it may be 
one of three things: either your DHCP agent died and the instances could not 
renew their IPs, or the DHCP client inside them died, or, as a third variant, 
your RabbitMQ bus is broken and the DHCP agents don't have up-to-date data.
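
A few quick checks that usually narrow it down (a rough sketch; Liberty-era CLI
assumed, run on the controller/network node unless noted):

# is the DHCP agent alive and reporting in?
neutron agent-list
# does the dnsmasq namespace for the affected network still exist on the
# network node?
ip netns | grep qdhcp
ps aux | grep dnsmasq
# inside an affected VM: try to renew the lease by hand
dhclient -v eth0
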

Tomas

 

From: Dan Dong [mailto:dongda...@gmail.com] 
Sent: Monday, April 10, 2017 2:48 PM
To: openstack@lists.openstack.org
Subject: [Openstack] VMs lost IP after running for few days on OpenStack 
Liberty.

 

Hi, Experts,

  Recently I found that VMs on my OpenStack Liberty cloud will lose their private 
IP address after running normally for 1~2 days ("# ip addr" will show no IP 
associated with eth0, for example), and thus cannot be reached over SSH anymore. 
But after a console login to the instances from Horizon and a network restart 
(# service network restart), everything returns to normal and they can be reached 
over SSH again (eth0 gets its IP back). In nova-compute.log I see messages 
like "NeutronClientException: Authentication required". But the keystone 
service runs normally and no other services are affected.

  Any hint about the problem? Thanks a lot!

 Cheers, Dan

 

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack-operators] [scientific] Resource reservation requirements (Blazar) - Forum session

2017-04-04 Thread Tomáš Vondra
Hi!
Did someone mention automation changing the spot instance capacity? I wrote an 
article in 2013 that proposes exactly that. The model forecasts the workload 
curve of the majority traffic, which is presumed to be interactive, and the 
rest may be used for batch traffic. The forecasting method used is SARIMA and it 
is usable up to a few days in advance. Would anybody be interested in trying the 
forecast on data from their cloud?
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.671.7397&rep=rep1&type=pdf
Tomas Vondra, dept. of Cybernetics, CTU FEE

-Original Message-
From: Blair Bethwaite [mailto:blair.bethwa...@gmail.com] 
Sent: Tuesday, April 04, 2017 12:08 AM
To: Jay Pipes
Cc: openstack-oper.
Subject: Re: [Openstack-operators] [scientific] Resource reservation 
requirements (Blazar) - Forum session

Hi Jay,

On 4 April 2017 at 00:20, Jay Pipes  wrote:
> However, implementing the above in any useful fashion requires that 
> Blazar be placed *above* Nova and essentially that the cloud operator 
> turns off access to Nova's  POST /servers API call for regular users. 
> Because if not, the information that Blazar acts upon can be simply 
> circumvented by any user at any time.

That's something of an oversimplification. A reservation system outside of Nova 
could manipulate Nova host-aggregates to "cordon off"
infrastructure from on-demand access (I believe Blazar already uses this 
approach), and it's not much of a jump to imagine operators being able to 
twiddle the available reserved capacity in a finite cloud so that reserved 
capacity can be offered to the subset of users/projects that need (or perhaps 
have paid for) it. Such a reservation system would even be able to backfill 
capacity between reservations. At the end of the reservation the system 
cleans up any remaining instances and preps for the next reservation.
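
For the curious, the aggregate-based "cordoning" can be sketched roughly like
this (the names and the property key are made up, and it assumes
AggregateInstanceExtraSpecsFilter is enabled in the Nova scheduler):

# put the reservable hypervisors into their own aggregate
openstack aggregate create --property reserved=true reservation-pool
openstack aggregate add host reservation-pool compute-03
# tie a "reserved" flavor to that aggregate via the scheduler filter (keeping
# ordinary flavors off those hosts needs additional filter configuration)
openstack flavor set --property aggregate_instance_extra_specs:reserved=true r1.large
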

There are a couple of problems with putting this outside of Nova though.
The main issue is that pre-emptible/spot type instances can't be accommodated 
within the on-demand cloud capacity. You could have the reservation system 
implementing this feature, but that would then put other scheduling constraints 
on the cloud in order to be effective (e.g., there would need to be automation 
changing the size of the on-demand capacity so that the maximum pre-emptible 
capacity was always available). The other issue (admittedly minor, but still a
consideration) is that it's another service - personally I'd love to see Nova 
support these advanced use-cases directly.

--
Cheers,
~Blairo

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Flavors

2017-03-16 Thread Tomáš Vondra
We at Homeatcloud.com do exactly this in our VPS service. The user can 
configure the VPS with any combination of CPU, RAM, and disk. However, a) the 
configurations are all about 10% of the size of the physical machines and b) the 
disks are in a SAN array, provisioned as volumes. So I give the users some 
flexibility and can better see what configurations they actually want, and build 
new hypervisors with that in mind. They mostly want up to 4 GB RAM anyway, so 
it's not a big deal.

Tomas Vondra

 

From: Adam Lawson [mailto:alaw...@aqorn.com] 
Sent: Thursday, March 16, 2017 5:57 PM
To: Jonathan D. Proulx
Cc: OpenStack Operators
Subject: Re: [Openstack-operators] Flavors

 

One way I know some providers work around this when using OpenStack is by 
fronting the VM request with some code in the web server that checks whether the 
requested spec has an existing flavor. If so, use the flavor; if not, use an 
admin account that creates a new flavor and assigns it to that user's request, 
then removes it when the build is complete. This naturally impacts your control 
over hardware efficiency but it makes your scenario possible (for better or for 
worse). I also hate being forced to do what someone else decided was going to 
be best for me. That's my decision and thankfully with OpenStack, this kind of 
thing is rather easy to do.
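
Roughly, that per-request flavor dance could look like the sketch below
(made-up names, run under an admin credential):

# create a one-off private flavor matching the requested spec
openstack flavor create --vcpus 4 --ram 8192 --disk 40 --private custom-4c-8g-40g
# grant the requesting project access to it
openstack flavor set --project <project-id> custom-4c-8g-40g
# ... boot the instance, then clean up ...
openstack flavor delete custom-4c-8g-40g
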

 

//adam





Adam Lawson

 

Principal Architect, CEO

Office: +1-916-794-5706

 

On Thu, Mar 16, 2017 at 7:52 AM, Jonathan D. Proulx  wrote:


I have always hated flavors and so do many of my users.

On Wed, Mar 15, 2017 at 03:22:48PM -0700, James Downs wrote:
:On Wed, Mar 15, 2017 at 10:10:00PM +, Fox, Kevin M wrote:
:> I think the really short answer is something like: It greatly simplifies 
scheduling and billing.
:
:The real answer is that once you buy hardware, it's in a fixed ratio of 
CPU/Ram/Disk/IOPS, etc.

This, while apparently reasonable, is BS (at least in private cloud
space).  What users request and what they actually use are wildly
divergent.

*IF* usage of claimed resources were at or near optimal then this might
be true.  But if people are claiming 32G of RAM because that's how much
you assigned to a 16 vCPU instance type but really just need 16
threads with 2G or 4G, then your packing still sucks.

I'm mostly bound on memory, so I mostly have my users select on that
basis, and I over-provide and over-provision CPU since that can be
effectively shared between VMs, whereas memory needs to be dedicated
(well, mostly)

I'm sure I've ranted about this before, but as you see from the other
responses we seem to be in the minority position, so mostly I rant at
the walls while my office mates look on perplexed (actually they're
pretty used to it by now and ignore me :) )

-Jon


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack] Migrating cinder volumes to a new cluster

2017-02-09 Thread Tomáš Vondra
Hi Jim!

I think that the plan is good. I think Cinder should not delete anything unless 
told to do so. You have to create the CPG space manually before Cinder can 
start using it, so it does no initialization. Do you have the opportunity to test 
it, on some other CPG perhaps? We have the same setup with 3PAR here, with one 
miniature cluster for testing. If you don't have the opportunity, I could, e.g., 
drop the Cinder database there and try importing a few volumes back.
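
If hand-editing the Cinder database feels too risky, "cinder manage" can adopt
an existing backend volume into the new cluster instead; a rough sketch, with
the host@backend#pool string and volume names as placeholders:

# find the exact host@backend#pool string the new cluster's cinder-volume uses
cinder get-pools
# adopt one existing 3PAR volume under the new cluster's management
cinder manage --id-type source-name --name imported-vol-01 \
    new-cinder@3parfc#SOME_CPG osv-existing-3par-volume-name
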

Tomas

 

From: Jimmy Colestock [mailto:jcolest...@gmail.com] 
Sent: Friday, February 03, 2017 4:54 PM
To: Openstack
Subject: [Openstack] Migrating cinder volumes to a new cluster

 

Hello All, 

 

I want to migrate about 100 instances to a new cluster that are all backed by 
cinder volumes on an HP 3Par.   

My plan is: 

 

1. Create the new cluster.

2. Connect cinder to the existing 3Par cpg 

3. Manually update the cinder volume info in new cluster.

4. Terminate the instance in the old cluster

5. Launch an instance in the new cluster, attaching to the previously used 
volume. 

 

Has anyone tried anything like this before?  My biggest concern is attaching to 
the same CPG and making sure cinder doesn't re-initialize the space or otherwise 
delete any of the existing volumes. 

 

Thanks in advance for any thoughts.. 

 

JC

 


Jim Colestock

jcolest...@gmail.com

https://www.linkedin.com/in/jcolestock/

 

___
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


Re: [Openstack-operators] [nova] Do you use os-instance-usage-audit-log?

2017-01-13 Thread Tomáš Vondra
Hi Matt,
I've looked at my Nova config and yes, I have it on. We do billing using
Ceilometer data and I think compute.instance.exists is consumed as well. The
Ceilometer event retention is set to 6 months and the database size is in
single gigabytes. The Nova database table task_log only contains the fact that
the audit job ran successfully and is 6 MB in size. It was not pruned for more
than a year.
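
For reference, enabling the audit on the computes is just the two nova.conf
options below; and if the table ever did need pruning, something like the
hypothetical MySQL one-liner after them would do it (direct database surgery,
not something Nova ships):

[DEFAULT]
instance_usage_audit = True
instance_usage_audit_period = month

# hypothetical monthly cron job pruning task_log rows older than six months
mysql nova -e "DELETE FROM task_log WHERE period_ending < DATE_SUB(NOW(), INTERVAL 6 MONTH);"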
Tomas

-Original Message-
From: Matt Riedemann [mailto:mrie...@linux.vnet.ibm.com] 
Sent: Thursday, January 12, 2017 12:09 AM
To: openstack-operators@lists.openstack.org
Subject: [Openstack-operators] [nova] Do you use
os-instance-usage-audit-log?

Nova's got this REST API [1] which pulls task_log data from the nova
database if the 'instance_usage_audit' config option value is True on any
compute host.

That table is populated in a periodic task from all computes that have it
enabled and by default it 'audits' instances created in the last month (the
time window is adjustable via the 'instance_get_active_by_window_joined'
config option).

The periodic task also emits a 'compute.instance.exists' notification for
each instance on that compute host which falls into the audit period. I'm
fairly certain that notification is meant to be consumed by Ceilometer which
is going to store it in its own time-series database.

It just so happens that Nova is also storing this audit data in its own
database, and never cleaning it up - the only way in-tree to move that data
out of the nova.task_log table is to archive it into shadow tables, but that
doesn't cut down on the bloat in your database. That
os-instance-usage-audit-log REST API is relying on the nova database though.

So my question is, is anyone using this in any shape or form, either via the
Nova REST API or Ceilometer? Or are you using it in one form but not the
other (maybe only via Ceilometer)? If you're using it, how are you
controlling the table growth, i.e. are you deleting records over a certain
age from the nova database using a cron job?

Mike Bayer was going to try and find some large production data sets to see
how many of these records are in a big and busy production DB that's using
this feature, but I'm also simply interested in how people use this, if it's
useful at all, and if there is interest in somehow putting a limit on the
data, i.e. we could add a config option to nova to only store records in the
task_log table under a certain max age.

[1] 
http://developer.openstack.org/api-ref/compute/#server-usage-audit-log-os-in
stance-usage-audit-log

-- 

Thanks,

Matt Riedemann


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] RabbitMQ 3.6.x experience?

2017-01-09 Thread Tomáš Vondra
We have upgraded to RabbitMQ 3.6, and it resulted in one node crashing about 
every week with out-of-memory errors. To avoid this, we had to turn off the 
message rate collection, so no throughput graphs until it gets fixed. Avoid 
this version if you can.
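
For reference, the knob in question is the management plugin's rates_mode
setting; in a classic rabbitmq.config (Erlang-term format) the change amounts
to roughly this sketch:

[
  {rabbitmq_management, [
    %% disable message-rate collection entirely (no throughput graphs)
    {rates_mode, none}
  ]}
].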

Tomas

 

From: Sam Morrison [mailto:sorri...@gmail.com] 
Sent: Monday, January 09, 2017 3:55 AM
To: Matt Fischer
Cc: OpenStack Operators
Subject: Re: [Openstack-operators] RabbitMQ 3.6.x experience?

 

We’ve been running 3.6.5 for sometime now and it’s working well.

 

3.6.1 - 3.6.3 are unusable, we had lots of issues with stats DB and other 
weirdness. 

 

Our setup is a 3 physical node cluster with around 9k connections, averaging 
around 300 messages/sec delivered. We have the stats sample rate set to 
default and it is working fine.

 

Yes we did have to restart the cluster to upgrade.

 

Cheers,

Sam

 

 

 

On 6 Jan 2017, at 5:26 am, Matt Fischer  wrote:

 

MIke,

 

I did a bunch of research and experiments on this last fall. We are running 
Rabbit 3.5.6 on our main cluster and 3.6.5 on our Trove cluster which has 
significantly less load (and criticality). We were going to upgrade to 3.6.5 
everywhere but in the end decided not to, mainly because there was little 
perceived benefit at the time. Our main issue is unchecked memory growth at 
random times. I ended up making several config changes to the stats collector 
and then we also restart it after every deploy and that solved it (so far). 

 

I'd say these were my main reasons for not going to 3.6 for our control nodes:

*   In 3.6.x they re-wrote the stats processor to make it parallel. In 
every 3.6 release since then, Pivotal has fixed bugs in this code. Then finally 
they threw up their hands and said "we're going to make a complete rewrite in 
3.7/4.x" (you need to look through issues on Github to find this discussion)
*   Out of the box with the same configs 3.6.5 used more memory than 3.5.6; 
since this was our main issue, I consider this a negative.
*   Another issue is the ancient version of erlang we have with Ubuntu 
Trusty (which we are working on) which made upgrades more complex/impossible 
depending on the version.

Given those negatives, the main one being that I didn't think there would be 
too many more fixes to the parallel statsdb collector in 3.6, we decided to 
stick with 3.5.6. In the end the devil we know is better than the devil we 
don't and I had no evidence that 3.6.5 would be an improvement.

 

I did decide to leave Trove on 3.6.5 because this would give us some bake-in 
time; if 3.5.x became untenable we'd at least have had it up and running in 
production and some data on it.

 

If statsdb is not a concern for you, I think this changes the math and maybe 
you should use 3.6.x. I would however recommend at least going to 3.5.6; it's 
been better than 3.3/3.4 was.

 

No matter what you do definitely read all the release notes. There are some 
upgrades which require an entire cluster shutdown. The upgrade to 3.5.6 did not 
require this IIRC.

 

Here's the hiera for our rabbit settings which I assume you can translate:

 

rabbitmq::cluster_partition_handling: 'autoheal'

rabbitmq::config_variables:

  'vm_memory_high_watermark': '0.6'

  'collect_statistics_interval': 3

rabbitmq::config_management_variables:

  'rates_mode': 'none'

rabbitmq::file_limit: '65535'

 

Finally, if you do upgrade to 3.6.x please report back here with your results 
at scale!

 

 

On Thu, Jan 5, 2017 at 8:49 AM, Mike Dorman  wrote:

We are looking at upgrading to the latest RabbitMQ in an effort to ease some 
cluster failover issues we’ve been seeing.  (Currently on 3.4.0)

 

Anyone been running 3.6.x?  And what has been your experience?  Any gotchas to 
watch out for?

 

Thanks,

Mike

 


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators