[Openstack-operators] How to create a floating IP pool using nova-network?

2016-07-06 Thread ????????
hi everyone,
How can I create a floating IP pool using nova-network? Thanks.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Getting help with Python API

2016-07-06 Thread Matt Fischer
When you make the API calls you're going to get back a list of python objects
which you need to iterate over. I believe some APIs will let you ask for
specific fields only, but this is simple enough:

from keystoneclient.v2_0 import client

keystone = client.Client(username=Username, password=Password,
                         tenant_name=Tenant,
                         auth_url='http://%s:5000/v2.0' % (Ip))
tenants = keystone.tenants.list()
tenant_ids = []
for tenant in tenants:
    tenant_ids.append(tenant.id)

Then you can process the tenant_ids list; of course, once you've done that
you've lost the context of the object (like the name of the tenant).

For stuff like this the docs are your best bet along with introspecting the
objects using the interactive python shell:

http://docs.openstack.org/developer/python-keystoneclient/
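
If the end goal is to feed those IDs into the quota API, a rough sketch along
these lines may help. Treat it as a hedged example rather than a recipe: the
novaclient constructor arguments vary between releases, and only CPU/RAM are
shown because disk quotas live in Cinder rather than Nova:

from keystoneclient.v2_0 import client as ks_client
from novaclient import client as nova_client

keystone = ks_client.Client(username=Username, password=Password,
                            tenant_name=Tenant,
                            auth_url='http://%s:5000/v2.0' % (Ip))
nova = nova_client.Client('2', Username, Password, Tenant,
                          auth_url='http://%s:5000/v2.0' % (Ip))

for tenant in keystone.tenants.list():
    quotas = nova.quotas.get(tenant.id)
    # keeping the object around preserves both the id and the name
    print('%s (%s): cores=%s ram=%s' % (tenant.id, tenant.name,
                                        quotas.cores, quotas.ram))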

On Wed, Jul 6, 2016 at 12:53 PM, Jared Wilkinson 
wrote:

> Hey folks,
>
>
>
> I am in the process of teaching myself Python and I have done a couple of
> free online courses but still have a fairly rudimentary understanding of
> Python, especially when it comes to using the OpenStack SDK. I was hoping
> you guys could point me to a mailing list for getting help with using the
> Python API calls in custom scripts. Specifically, something as simple as,
> return a list of all the tenant IDs (only), send that to nova quota-show,
> and return just the CPU, memory, and storage metrics.
>
>
>
> I can get the list of tenants from Keystone, but I can’t get it to just
> send me the list of IDs (we are using v2 of identity). I have read the
> documentation, but nothing there seems to help (or I am not quite strong
> enough in my Python skills to understand it) for this specific instance.
>
>
>
> Any direction would be appreciated.
>
>
> Thanks,
>
> Jared
>
>
>
> *Jared Wilkinson* | Senior Infrastructure Engineer – Team Lead
>
> jwilkin...@ebsco.com | (W) 205/981-4018 | (M) 205/259-9802
>
> 5724 US Highway 280 East, Birmingham, AL 35242, USA
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] PCI Passthrough issues

2016-07-06 Thread Blair Bethwaite
Jon,

Awesome, thanks for sharing. We've just run into an issue with SRIOV
VF passthrough that sounds like it might be the same problem (device
disappearing after a reboot), but haven't yet investigated deeply -
this will help with somewhere to start!

By the way, the nouveau mention was because we had missed it on some
K80 hypervisors recently and seen passthrough apparently work, but
then the NVIDIA drivers would not build in the guest as they claimed
they could not find a supported device (despite the GPU being visible
on the PCI bus). I have also heard passing mention of requiring qemu
2.3+ but don't have any specific details of the related issue.

Cheers,

On 7 July 2016 at 08:13, Jonathan Proulx  wrote:
> On Wed, Jul 06, 2016 at 12:32:26PM -0400, Jonathan D. Proulx wrote:
> :
> :I do have an odd remaining issue where I can run cuda jobs in the vm
> :but snapshots fail and after pause (for snapshotting) the pci device
> :can't be reattached (which is where i think it deletes the snapshot
> :it took).  Got same issue with 3.16 and 4.4 kernels.
> :
> :Not very well categorized yet, but I'm hoping it's because the VM I
> :was hacking on had its libvirt.xml written out with the older qemu
> :maybe?  It had been through a couple reboots of the physical system
> :though.
> :
> :Currently building a fresh instance and bashing more keys...
>
> After an ugly bout of bashing I've solved my failing snapshot issue,
> which I'll post here in hopes of saving someone else the trouble.
>
> Short version:
>
> add "/dev/vfio/vfio rw," to  /etc/apparmor.d/abstractions/libvirt-qemu
> add "ulimit -l unlimited" to /etc/init/libvirt-bin.conf
>
> Longer version:
>
> What was happening.
>
> * send snapshot request
> * instance pauses while snapshot is pending
> * instance attempts to resume
> * fails to reattach pci device
>   * nova-compute.log
> Exception during message handling: internal error: unable to execute QEMU 
> command 'device_add': Device initialization failed
>
>   * qemu/.log
> vfio: failed to open /dev/vfio/vfio: Permission denied
> vfio: failed to setup container for group 48
> vfio: failed to get group 48
> * snapshot disappears
> * instance resumes but without passed through device (hard reboot
> reattaches)
>
> seeing permission denied I thought this would be an easy fix, but:
>
> # ls -l /dev/vfio/vfio
> crw-rw-rw- 1 root root 10, 196 Jul  6 14:05 /dev/vfio/vfio
>
> so I'm guessing I'm in apparmor hell. I tried adding "/dev/vfio/vfio
> rw," to  /etc/apparmor.d/abstractions/libvirt-qemu, rebooting the
> hypervisor and trying again, which gets me a different libvirt error
> set:
>
> VFIO_MAP_DMA: -12
> vfio_dma_map(0x5633a5fa69b0, 0x0, 0xa, 0x7f4e7be0) = -12 (Cannot 
> allocate memory)
>
> kern.log (and thus dmesg) showing:
> vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
>
> Getting rid of this one required inserting 'ulimit -l unlimited' into
> /etc/init/libvirt-bin.conf in the 'script' section:
>
> 
> script
> [ -r /etc/default/libvirt-bin ] && . /etc/default/libvirt-bin
> ulimit -l unlimited
> exec /usr/sbin/libvirtd $libvirtd_opts
> end script
>
>
> -Jon
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



-- 
Cheers,
~Blairo

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [User-committee] Seeking feedback: Active User Contributor (AUC) eligibility requirements

2016-07-06 Thread Shamail


> On Jul 6, 2016, at 5:44 PM, Lauren Sell  wrote:
> 
> Hi Rocky,
> 
> Agreed! I think that’s the intent of the first bullet — official user group 
> organizers.

Completely agree, they are a vital part of the community and often are the 
first aspect of community that newcomers experience.  The first bullet is 
intended to ensure they are included in AUC.
> 
> Cheers,
> Lauren
> 
> 
>> On Jul 6, 2016, at 4:42 PM, Rochelle Grober  
>> wrote:
>> 
>> Umm, I see a major contribution area not included here:
>> 
>> OpenStack community meetup organizers.  I know some sink a large amount of 
>> time scheduling and organizing at least one a month, some two or more.  
>> These organizers are critical for getting info out on OpenStack and 
>> familiarizing their local tech communities with the OpenStack project.  I 
>> hope they are included somewhere for their contributions.
>> 
>> --Rocky
>> 
>> -Original Message-
>> From: Jonathan D. Proulx [mailto:j...@csail.mit.edu] 
>> Sent: Thursday, June 30, 2016 12:00 PM
>> To: Shamail Tahir
>> Cc: openstack-operators; user-committee
>> Subject: Re: [User-committee] Seeking feedback: Active User Contributor 
>> (AUC) eligibility requirements
>> 
>> 
>> I'm surprised this hasn't generated more feedback, though I'd
>> generally take that as positive.
>> 
>> List seems good to me.
>> 
>> The self-nomination + confirmation by the UC is a good catch-all, especially in
>> the beginning when we're unlikely to have thought of everything.  We
>> can always expand the criteria later if the 'misc' bucket of UC confirmations
>> gets too big and we identify patterns.
>> 
>> Thanks all!
>> -Jon
>> 
>> On Wed, Jun 29, 2016 at 04:52:00PM -0400, Shamail Tahir wrote:
>> :Hi everyone,
>> :
>> :The AUC Recognition WG has been hard at work on milestone-4 of our plan
>> :which is to identify the eligibility criteria for each community
>> :contributor role that is covered by AUC.  We had a great mix of community
>> :people involved in defining these thresholds but we wanted to also open
>> :this up for broader community feedback before we propose them to the user
>> :committee.  AUC is a new concept and we hope to make iterative improvements
>> :going forward... you can consider the guidelines below as "version 1" and I
>> :am certain they will evolve as lessons are learned.  Thank you in advance
>> :for your feedback!
>> :
>> :*  Official User Group organizers
>> :
>> :o   Listed as an organizer or coordinator for an official OpenStack user
>> :group
>> :
>> :*  Active members of official UC Working Groups
>> :
>> :o   Attend 25% of the IRC meetings and have spoken more than 25 times OR
>> :have spoken more than 100 times regardless of attendance count over the
>> :last six months
>> :
>> :o   WG that do not use IRC for their meetings will depend on the meeting
>> :chair(s) to identify active participation from attendees
>> :
>> :*  Ops meetup moderators
>> :
>> :o   Moderate a session at the operators meetup over the last six
>> :months AND/OR
>> :
>> :o   Host the operators meetup (limit 2 people from the hosting
>> :organization) over the last six months
>> :
>> :*  Contributions to any repository under UC governance (ops
>> :repositories, user stories repository, etc.)
>> :
>> :o   Submitted two or more patches to a UC governed repository over the last
>> :six months
>> :
>> :*  Track chairs for OpenStack Summits
>> :
>> :o   Identified track chair for the upcoming OpenStack Summit (based on when
>> :data is gathered) [this is a forward-facing metric]
>> :
>> :*  Contributors to Superuser (articles, interviews, user stories, etc.)
>> :
>> :o   Listed as author in at least one publication at superuser.openstack.org
>> :over the last six months
>> :
>> :*  Submission for eligibility to AUC review panel
>> :
>> :o   No formal criteria, anyone can self-nominate, and nominations will be
>> :reviewed per guidance established in milestone-5
>> :
>> :*  Active moderators on ask.openstack
>> :
>> :o   Listed as moderator on Ask OpenStack and have over 500 karma
>> :
>> :There is additional information available in the etherpad[1] the AUC
>> :recognition WG has been using for this task which includes Q&A (question
>> :and answers) between team members.
>> :
>> :[1] https://etherpad.openstack.org/p/uc-recog-metrics
>> :
>> :-- 
>> :Thanks,
>> :Shamail Tahir
>> :t: @ShamailXD
>> :tz: Eastern Time
>> 
>> :___
>> :User-committee mailing list
>> :user-commit...@lists.openstack.org
>> :http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee
>> 
>> 
>> -- 
>> 
>> ___
>> User-committee mailing list
>> user-commit...@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee
>> 
>> ___
>> User-committee mailing list
>> user-commit...@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/user-commi

Re: [Openstack-operators] PCI Passthrough issues

2016-07-06 Thread Jonathan Proulx
On Wed, Jul 06, 2016 at 12:32:26PM -0400, Jonathan D. Proulx wrote:
:
:I do have an odd remaining issue where I can run cuda jobs in the vm
:but snapshots fail and after pause (for snapshotting) the pci device
:can't be reattached (which is where i think it deletes the snapshot
:it took).  Got same issue with 3.16 and 4.4 kernels.
:
:Not very well categorized yet, but I'm hoping it's because the VM I
:was hacking on had its libvirt.xml written out with the older qemu
:maybe?  It had been through a couple reboots of the physical system
:though.
:
:Currently building a fresh instance and bashing more keys...

After an ugly bout of bashing I've solved my failing snapshot issue,
which I'll post here in hopes of saving someone else the trouble.

Short version:

add "/dev/vfio/vfio rw," to  /etc/apparmor.d/abstractions/libvirt-qemu
add "ulimit -l unlimited" to /etc/init/libvirt-bin.conf

Longer version:

What was happening.

* send snapshot request
* instance pauses while snapshot is pending
* instance attempts to resume
* fails to reattach pci device
  * nova-compute.log
Exception during message handling: internal error: unable to execute QEMU 
command 'device_add': Device initialization failed

  * qemu/.log
vfio: failed to open /dev/vfio/vfio: Permission denied
vfio: failed to setup container for group 48
vfio: failed to get group 48
* snapshot disappears
* instance resumes but without passed through device (hard reboot
reattaches)

seeing permission denied I thought this would be an easy fix, but:

# ls -l /dev/vfio/vfio
crw-rw-rw- 1 root root 10, 196 Jul  6 14:05 /dev/vfio/vfio

so I'm guessing I'm in apparmor hell. I tried adding "/dev/vfio/vfio
rw," to  /etc/apparmor.d/abstractions/libvirt-qemu, rebooting the
hypervisor and trying again, which gets me a different libvirt error
set:

VFIO_MAP_DMA: -12
vfio_dma_map(0x5633a5fa69b0, 0x0, 0xa, 0x7f4e7be0) = -12 (Cannot 
allocate memory)

kern.log (and thus dmesg) showing:
vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded

Getting rid of this one required inserting 'ulimit -l unlimited' into
/etc/init/libvirt-bin.conf in the 'script' section:


script
[ -r /etc/default/libvirt-bin ] && . /etc/default/libvirt-bin
ulimit -l unlimited
exec /usr/sbin/libvirtd $libvirtd_opts
end script
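
A quick way to confirm the restarted libvirtd actually picked up the new
memlock limit (just a sanity-check sketch, not part of the fix itself):

import subprocess

pid = subprocess.check_output(['pidof', 'libvirtd']).decode().split()[0]
with open('/proc/%s/limits' % pid) as limits:
    for line in limits:
        # expect the 'Max locked memory' row to read 'unlimited' now
        if 'locked memory' in line.lower():
            print(line.rstrip())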


-Jon

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [User-committee] Seeking feedback: Active User Contributor (AUC) eligibility requirements

2016-07-06 Thread Lauren Sell
Hi Rocky,

Agreed! I think that’s the intent of the first bullet — official user group 
organizers.

Cheers,
Lauren


> On Jul 6, 2016, at 4:42 PM, Rochelle Grober  
> wrote:
> 
> Umm, I see a major contribution area not included here:
> 
> OpenStack community meetup organizers.  I know some sink a large amount of 
> time scheduling and organizing at least one a month, some two or more.  These 
> organizers are critical for getting info out on OpenStack and familiarizing 
> their local tech communities with the OpenStack project.  I hope they are 
> included somewhere for their contributions.
> 
> --Rocky
> 
> -Original Message-
> From: Jonathan D. Proulx [mailto:j...@csail.mit.edu] 
> Sent: Thursday, June 30, 2016 12:00 PM
> To: Shamail Tahir
> Cc: openstack-operators; user-committee
> Subject: Re: [User-committee] Seeking feedback: Active User Contributor (AUC) 
> eligibility requirements
> 
> 
> I'm surprised this hasn't generated more feedback, though I'd
> generally take that as positive.
> 
> List seems good to me.
> 
> The self-nomination + confirmation by the UC is a good catch-all, especially in
> the beginning when we're unlikely to have thought of everything.  We
> can always expand the criteria later if the 'misc' bucket of UC confirmations
> gets too big and we identify patterns.
> 
> Thanks all!
> -Jon
> 
> On Wed, Jun 29, 2016 at 04:52:00PM -0400, Shamail Tahir wrote:
> :Hi everyone,
> :
> :The AUC Recognition WG has been hard at work on milestone-4 of our plan
> :which is to identify the eligibility criteria for each community
> :contributor role that is covered by AUC.  We had a great mix of community
> :people involved in defining these thresholds but we wanted to also open
> :this up for broader community feedback before we propose them to the user
> :committee.  AUC is a new concept and we hope to make iterative improvements
> :going forward... you can consider the guidelines below as "version 1" and I
> :am certain they will evolve as lessons are learned.  Thank you in advance
> :for your feedback!
> :
> :*  Official User Group organizers
> :
> :o   Listed as an organizer or coordinator for an official OpenStack user
> :group
> :
> :*  Active members of official UC Working Groups
> :
> :o   Attend 25% of the IRC meetings and have spoken more than 25 times OR
> :have spoken more than 100 times regardless of attendance count over the
> :last six months
> :
> :o   WG that do not use IRC for their meetings will depend on the meeting
> :chair(s) to identify active participation from attendees
> :
> :*  Ops meetup moderators
> :
> :o   Moderate a session at the operators meetup over the last six
> :months AND/OR
> :
> :o   Host the operators meetup (limit 2 people from the hosting
> :organization) over the last six months
> :
> :*  Contributions to any repository under UC governance (ops
> :repositories, user stories repository, etc.)
> :
> :o   Submitted two or more patches to a UC governed repository over the last
> :six months
> :
> :*  Track chairs for OpenStack Summits
> :
> :o   Identified track chair for the upcoming OpenStack Summit (based on when
> :data is gathered) [this is a forward-facing metric]
> :
> :*  Contributors to Superuser (articles, interviews, user stories, etc.)
> :
> :o   Listed as author in at least one publication at superuser.openstack.org
> :over the last six months
> :
> :*  Submission for eligibility to AUC review panel
> :
> :o   No formal criteria, anyone can self-nominate, and nominations will be
> :reviewed per guidance established in milestone-5
> :
> :*  Active moderators on ask.openstack
> :
> :o   Listed as moderator on Ask OpenStack and have over 500 karma
> :
> :There is additional information available in the etherpad[1] the AUC
> :recognition WG has been using for this task which includes Q&A (question
> :and answers) between team members.
> :
> :[1] https://etherpad.openstack.org/p/uc-recog-metrics
> :
> :-- 
> :Thanks,
> :Shamail Tahir
> :t: @ShamailXD
> :tz: Eastern Time
> 
> :___
> :User-committee mailing list
> :user-commit...@lists.openstack.org
> :http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee
> 
> 
> -- 
> 
> ___
> User-committee mailing list
> user-commit...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee
> 
> ___
> User-committee mailing list
> user-commit...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [User-committee] Seeking feedback: Active User Contributor (AUC) eligibility requirements

2016-07-06 Thread Rochelle Grober
Umm, I see a major contribution area not included here:

OpenStack community meetup organizers.  I know some sink a large amount of time 
scheduling and organizing at least one a month, some two or more.  These 
organizers are critical for getting info out on OpenStack and familiarizing 
their local tech communities with the OpenStack project.  I hope they are 
included somewhere for their contributions.

--Rocky

-Original Message-
From: Jonathan D. Proulx [mailto:j...@csail.mit.edu] 
Sent: Thursday, June 30, 2016 12:00 PM
To: Shamail Tahir
Cc: openstack-operators; user-committee
Subject: Re: [User-committee] Seeking feedback: Active User Contributor (AUC) 
eligibility requirements


I'm surprised this hasn't generated more feedback, though I'd
generally take that as positive.

List seems good to me.

The self-nomination + confirmation by the UC is a good catch-all, especially in
the beginning when we're unlikely to have thought of everything.  We
can always expand the criteria later if the 'misc' bucket of UC confirmations
gets too big and we identify patterns.

Thanks all!
-Jon

On Wed, Jun 29, 2016 at 04:52:00PM -0400, Shamail Tahir wrote:
:Hi everyone,
:
:The AUC Recognition WG has been hard at work on milestone-4 of our plan
:which is to identify the eligibility criteria for each community
:contributor role that is covered by AUC.  We had a great mix of community
:people involved in defining these thresholds but we wanted to also open
:this up for broader community feedback before we propose them to the user
:committee.  AUC is a new concept and we hope to make iterative improvements
:going forward... you can consider the guidelines below as "version 1" and I
:am certain they will evolve as lessons are learned.  Thank you in advance
:for your feedback!
:
:*  Official User Group organizers
:
:o   Listed as an organizer or coordinator for an official OpenStack user
:group
:
:*  Active members of official UC Working Groups
:
:o   Attend 25% of the IRC meetings and have spoken more than 25 times OR
:have spoken more than 100 times regardless of attendance count over the
:last six months
:
:o   WG that do not use IRC for their meetings will depend on the meeting
:chair(s) to identify active participation from attendees
:
:*  Ops meetup moderators
:
:o   Moderate a session at the operators meetup over the last six
:months AND/OR
:
:o   Host the operators meetup (limit 2 people from the hosting
:organization) over the last six months
:
:*  Contributions to any repository under UC governance (ops
:repositories, user stories repository, etc.)
:
:o   Submitted two or more patches to a UC governed repository over the last
:six months
:
:*  Track chairs for OpenStack Summits
:
:o   Identified track chair for the upcoming OpenStack Summit (based on when
:data is gathered) [this is a forward-facing metric]
:
:*  Contributors to Superuser (articles, interviews, user stories, etc.)
:
:o   Listed as author in at least one publication at superuser.openstack.org
:over the last six months
:
:*  Submission for eligibility to AUC review panel
:
:o   No formal criteria, anyone can self-nominate, and nominations will be
:reviewed per guidance established in milestone-5
:
:*  Active moderators on ask.openstack
:
:o   Listed as moderator on Ask OpenStack and have over 500 karma
:
:There is additional information available in the etherpad[1] the AUC
:recognition WG has been using for this task which includes Q&A (question
:and answers) between team members.
:
:[1] https://etherpad.openstack.org/p/uc-recog-metrics
:
:-- 
:Thanks,
:Shamail Tahir
:t: @ShamailXD
:tz: Eastern Time

:___
:User-committee mailing list
:user-commit...@lists.openstack.org
:http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee


-- 

___
User-committee mailing list
user-commit...@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova] Rabbit-mq 3.4 crashing (anyone else seen this?)

2016-07-06 Thread Rochelle Grober
repository is:  http://git.openstack.org/cgit/openstack/osops-tools-contrib/

FYI, there are also:  osops-tools-generic, osops-tools-logging, 
osops-tools-monitoring, osops-example-configs and osops-coda

Wish I could help more,

--Rocky

-Original Message-
From: Joshua Harlow [mailto:harlo...@fastmail.com] 
Sent: Tuesday, July 05, 2016 10:44 AM
To: Matt Fischer
Cc: openstack-...@lists.openstack.org; OpenStack Operators
Subject: Re: [openstack-dev] [Openstack-operators] [nova] Rabbit-mq 3.4 
crashing (anyone else seen this?)

Ah, those sets of commands sound pretty nice to run periodically,

Sounds like a useful script that could be placed in the ops tools repo 
(I forget where this repo exists at, but pretty sure it does exist?).

Some other oddness though is that this issue seems to go away when we 
don't run cross-release; do you see that also?

Another hypothesis was that the following fix may be triggering part of 
this @ https://bugs.launchpad.net/oslo.messaging/+bug/1495568

So if we have some queues being set up as auto-delete and some
being set up with expiry, perhaps the combination of these causes
more work for the management database (and therefore it eventually
falls behind and falls over).
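
A minimal sketch of what such a periodic check could look like, pieced
together from the rabbitmqctl runbook Matt describes below (the memory
threshold, the polling interval, and the exact 'status' output format are
assumptions, not something we run today):

import re
import subprocess
import time

THRESHOLD_BYTES = 2 * 1024 ** 3  # arbitrary 2 GiB limit, tune per cluster

def mgmt_db_bytes():
    # 'rabbitmqctl status' reports memory as an Erlang proplist,
    # e.g. {mgmt_db,123456}; the regex below assumes that shape
    status = subprocess.check_output(['rabbitmqctl', 'status']).decode()
    match = re.search(r'\{mgmt_db,\s*(\d+)\}', status)
    return int(match.group(1)) if match else 0

def bounce_mgmt_db():
    # restart only the management plugin, per the runbook quoted below
    for call in ('application:stop(rabbitmq_management).',
                 'application:start(rabbitmq_management).'):
        subprocess.check_call(['rabbitmqctl', 'eval', call])

while True:
    if mgmt_db_bytes() > THRESHOLD_BYTES:
        bounce_mgmt_db()
    time.sleep(300)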

Matt Fischer wrote:
> Yes! This happens often but I'd not call it a crash, just the mgmt db
> gets behind then eats all the memory. We've started monitoring it and
> have runbooks on how to bounce just the mgmt db. Here are my notes on that:
>
> restart rabbitmq mgmt server - this seems to clear the memory usage.
>
> rabbitmqctl eval 'application:stop(rabbitmq_management).'
> rabbitmqctl eval 'application:start(rabbitmq_management).'
>
> run GC on rabbit_mgmt_db:
> rabbitmqctl eval
> '(erlang:garbage_collect(global:whereis_name(rabbit_mgmt_db)))'
>
> status of rabbit_mgmt_db:
> rabbitmqctl eval 'sys:get_status(global:whereis_name(rabbit_mgmt_db)).'
>
> Rabbitmq mgmt DB how much memory is used:
> /usr/sbin/rabbitmqctl status | grep mgmt_db
>
> Unfortunately I didn't see that an upgrade would fix for sure and any
> settings changes to reduce the number of monitored events also require a
> restart of the cluster. The other issue with an upgrade for us is the
> ancient version of erlang shipped with trusty. When we upgrade to Xenial
> we'll upgrade erlang and rabbit and hope it goes away. I'll also
> probably tweak the settings on retention of events then too.
>
> Also for the record the GC doesn't seem to help at all.
>
> On Jul 5, 2016 11:05 AM, "Joshua Harlow"  > wrote:
>
> Hi ops and dev-folks,
>
> We over at godaddy (running rabbitmq with openstack) have been
> hitting an issue that has been causing the `rabbit_mgmt_db` to consume
> nearly all of the process's memory (after a given amount of time),
>
> We've been thinking that this bug (or bugs?) may have existed for a
> while and our dual-version-path (where we upgrade the control plane
> and then slowly/eventually upgrade the compute nodes to the same
> version) has somehow triggered this memory leaking bug/issue since
> it has happened most prominently on our cloud which was running
> nova-compute at kilo and the other services at liberty (thus using
> the versioned objects code path more frequently due to needing
> translations of objects).
>
> The rabbit we are running is 3.4.0 on CentOS Linux release 7.2.1511
> with kernel 3.10.0-327.4.4.el7.x86_64 (do note that upgrading to
> 3.6.2 seems to make the issue go away),
>
> # rpm -qa | grep rabbit
>
> rabbitmq-server-3.4.0-1.noarch
>
> The logs that seem relevant:
>
> ```
> **
> *** Publishers will be blocked until this alarm clears ***
> **
>
> =INFO REPORT 1-Jul-2016::16:37:46 ===
> accepting AMQP connection <0.23638.342> (127.0.0.1:51932
>  -> 127.0.0.1:5671 )
>
> =INFO REPORT 1-Jul-2016::16:37:47 ===
> vm_memory_high_watermark clear. Memory used:29910180640
> allowed:47126781542
> ```
>
> This happens quite often, the crashes have been affecting our cloud
> over the weekend (which made some dev/ops not so happy especially
> due to the july 4th mini-vacation),
>
> Looking to see if anyone else has seen anything similar?
>
> For those interested this is the upstream bug/mail that I'm also
> seeing about getting confirmation from the upstream users/devs
> (which also has erlang crash dumps attached/linked),
>
> https://groups.google.com/forum/#!topic/rabbitmq-users/FeBK7iXUcLg
>
> Thanks,
>
> -Josh
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> 
> http://lists.openstack.org/cgi-bin/mailman/lis

Re: [Openstack-operators] Next Ops Midcycle NYC August 25-26

2016-07-06 Thread Chris Morgan
For the purposes of the mid-cycle meeting this August, I can be the point
of contact:

If anyone needs my full contact info, just email me here and I will reply
with full details. Sorry for the delay with this.

Chris

On Fri, Jun 24, 2016 at 10:42 PM, Chris Morgan  wrote:

> Hi there
>   Sorry for the delay. I am hoping to see our organizer asap Monday when
> she's back from a trip and to get an answer for this. I don't know who the
> official organizer is but I would think we could do this to just be a
> credible org that can confirm details like you mention.
>
> Chris (Bloomberg)
>
> Sent from my iPhone
>
> > On Jun  23, 2016, at 4:14 PM, Saverio Proto  wrote:
> >
> > Hello there :)
> >
> > is anyone from the openstack foundation or from bloomberg that can
> > help out with this ?
> >
> > I share this for anyone that needs visa.
> >
> > for Austin we had something like this:
> > https://www.openstack.org/summit/austin-2016/austin-and-travel/
> > https://openstackfoundation.formstack.com/forms/visa_form_austin_summit
> >
> > anyone that needs to apply for Visa will need a 'US point of contact
> > information'.
> >
> > Basically, if the organizer of the Ops Midcycle is officially the
> > openstack foundation or bloomberg, I need to enter in my visa
> > application the following info:
> >
> > Organization name
> > Address in the US
> > Phone number
> > Email
> >
> > It must be a phone number and an email where, in case there is a check,
> > somebody can tell "yes of course, this guy exists and is coming to the
> > conference" :)
> >
> > How do we sort this out?
> >
> > thank you
> >
> > Saverio
> >
> > ___
> > OpenStack-operators mailing list
> > OpenStack-operators@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>



-- 
Chris Morgan 
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [app-catalog] App Catalog IRC meeting Thursday July 7th

2016-07-06 Thread Christopher Aedo
Join us Thursday for our weekly meeting, scheduled for July 7th at
17:00UTC in #openstack-meeting-3

The agenda can be found here, and please add to if you want to discuss
something with the Community App Catalog team:
https://wiki.openstack.org/wiki/Meetings/app-catalog

Tomorrow we will be talking more about our plan to implement GLARE as a
back-end for the Community App Catalog, and what we'll need to merge
in the next few weeks to make this a reality.

Hope to see you there tomorrow!

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Getting help with Python API

2016-07-06 Thread Jared Wilkinson
Hey folks,

I am in the process of teaching myself Python and I have done a couple of free 
online courses but still have a fairly rudimentary understanding of Python, 
especially when it comes to using the OpenStack SDK. I was hoping you guys 
could point me to a mailing list for getting help with using the Python API 
calls in custom scripts. Specifically, something as simple as: return a list of
all the tenant IDs (only), send that to nova quota-show, and return
just the CPU, memory, and storage metrics.

I can get the list of tenants from Keystone, but I can’t get it to just send me 
the list of IDs (we are using v2 of identity). I have read the documentation, 
but nothing there seems to help (or I am not quite strong enough in my Python 
skills to understand it) for this specific instance.

Any direction would be appreciated.

Thanks,
Jared

Jared Wilkinson | Senior Infrastructure Engineer – Team Lead
jwilkin...@ebsco.com | (W) 205/981-4018 | (M) 205/259-9802
5724 US Highway 280 East, Birmingham, AL 35242, USA
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Internal integration (best?) practices

2016-07-06 Thread Joshua Harlow

Hi folks (operators and devs),

I was digging into some of GoDaddy's code yesterday/every day and it got me
thinking about how other operators are handling customized internal
integrations and what some of the common patterns are for how these
integrations are typically performed (overall it doesn't feel like there
are that many patterns, or that many different ways people can be doing
them). It got me thinking that there really isn't a best-practices
document that I know of for these types of common integrations (perhaps
this can be the start of one?) that operators and developers could provide
feedback on (because perhaps there is a new or to-be-new API that
replaces an older style of integration).


This hopefully falls into the pattern (and/or is similar to) that is 
described in:


https://specs.openstack.org/openstack/nova-specs/specs/newton/approved/api-no-more-extensions.html#alternatives-for-extension-owners

'''
Please bring forward the conversation to the wider community anyway. 
There are a lot of OpenStack deploys, and issues that you think are 
specific to your environment may not be. And in conversation with the 
upstream operator and developer communities we could probably come up 
with a generic facility that supports your use.

'''

So let me describe some of the internal integrations we've had to do and
see if these sound familiar to what others have had to do; it'd be
nice to share how those integrations were performed (and even if the
integration that was done is similar or done in a different manner,
that's cool too):


(1) One requirement we have is to register booted virtual machines with an
in-house dns solution and to delete the registered dns names when
deletion is triggered.


The steps are: on reception of a 'compute.instance.create.end' event
(the program that does this listens on nova's notifications queue), logic
proceeds that will (direct comment from the routine here):


'Creates A and PTR records based off of the instance's hostname and 
fixed ip.'


This goes about and has further logic to realize this statement (its 
error robustness is a different discussion, but I digress) and the 
'compute.instance.delete.end' event does the reverse:


'Removes instance's DNS entries. Will remove A and PTR records.'

Now, sidestepping the choice of using the notifications queue (and having
a consumer on that) for this, which IMHO has issues in that it's out of
band, has no way to stop a VM from being created, or to set a state on the
VM like ERROR_DNS_SETUP if it fails, and so on (i.e. if creating a dns record
fails, how does the delete know that it shouldn't try to delete, and
such...), I was wondering how others are handling this kind of action. I
believe there might be a more native integration with neutron possible
here to solve this instead?
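
For reference, a minimal sketch of this style of out-of-band consumer using
oslo.messaging's notification listener; the transport URL, the topic name,
and the dns_client helper module are placeholders for illustration, not our
actual code:

import oslo_messaging
from oslo_config import cfg

import dns_client  # hypothetical in-house DNS helper


class DnsEndpoint(object):
    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        hostname = payload.get('hostname')
        fixed_ips = payload.get('fixed_ips', [])
        if event_type == 'compute.instance.create.end':
            # creates A and PTR records from the instance's hostname/fixed ip
            dns_client.register(hostname, fixed_ips)
        elif event_type == 'compute.instance.delete.end':
            # removes the instance's A and PTR records
            dns_client.remove(hostname)


transport = oslo_messaging.get_notification_transport(
    cfg.CONF, url='rabbit://guest:guest@localhost:5672/')
targets = [oslo_messaging.Target(topic='notifications')]
listener = oslo_messaging.get_notification_listener(
    transport, targets, [DnsEndpoint()], executor='threading')
listener.start()
listener.wait()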


(2) Requirement that for all created instances we register into a 
home-grown cmdb solution with the 'fixed_ips' that the instance was 
built with, the environment it was built with, information about the 
instances image (family) and so-on (the following is a more complete 
gathering of info that we send):


server = sr_manager.build_server_object(
environmentID=cmdb_env_id,
hostname=instance.hostname,
serialNumber=instance.instance_id,
fqdn=fqdn,
rack=_short_hostname(instance.host),
operatingSystemIdentifier=os_spec,
operatingSystemFamily=os_family,
ramMB=instance.memory_mb,
teamID=cmdb_team_id,
teamName=cmdb_team_name,
drivesGB=instance.disk_gb,
...
nics=cmdb_nics
)

So, similar to (1), we have also plugged into the notifications queue with
the same queue consumer and hook-in to do this out-of-band system
registration (which is really an eventual/best-effort population of
the home-grown cmdb, due to the issues mentioned above about robustness and
failures). I was wondering what others are doing for this kind of in-house
cmdb (which most companies have) and how they are doing VM
registration into it (are people even bothering to do that?); oh btw,
on removal it triggers a removal/retirement process from that same
system using a function like:


sr_manager.retire_server(instance.hostname, instance.user_id)

(3) On deletion (only), the same consumer connects to corporate
LDAP and deletes the instance hostname from that system, and also removes the
instance's registration from a variant of
https://fedorahosted.org/spacewalk/ (the registration of the instance
into these systems happens on instance boot); so this got me wondering
how others register/deregister virtual machines created into such
systems (we can't be the only ones using corporate LDAP/spacewalk in this
manner)?


(4) A few more, but this is long enough for now ;)

-Josh



___
OpenStack-operators mailing list
OpenStack-operators@list

Re: [Openstack-operators] [puppet] [desginate] An update on the state of puppet-designate (and designate in RDO)

2016-07-06 Thread David Moreau Simard
I drafted some tentative release notes that summarizes the work that
has been done so far [1].

I asked input from #openstack-dns but would love if users could chime
in on a deprecation currently in review [2].

This change also makes it so designate will stop maintaining a
directory in /var/lib/designate/bind9.
This directory was introduced in puppet-designate in 2013 and
doesn't seem relevant anymore according to upstream and designate
documentation.

[1]: 
http://docs-draft.openstack.org/04/338404/1/check/gate-puppet-designate-releasenotes/273e921//releasenotes/build/html/unreleased.html#id1
[2]: https://review.openstack.org/#/c/337951/

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


On Wed, Jul 6, 2016 at 11:51 AM, David Moreau Simard  wrote:
> Thanks Matt, if you don't mind I might add you to some puppet reviews.
>
> David Moreau Simard
> Senior Software Engineer | Openstack RDO
>
> dmsimard = [irc, github, twitter]
>
>
> On Tue, Jul 5, 2016 at 10:22 PM, Matt Fischer  wrote:
>> We're using Designate but still on Juno. We're running puppet from around
>> then, summer of 2015. We'll likely try to upgrade to Mitaka at some point
>> but Juno Designate "just works" so it's been low priority. Look forward to
>> your efforts here.
>>
>> On Tue, Jul 5, 2016 at 7:47 PM, David Moreau Simard  wrote:
>>>
>>> Hi !
>>>
>>> tl;dr
>>> puppet-designate is undergoing some significant updates to bring it
>>> up to par right now.
>>> While I will try to ensure it is well tested and backwards compatible,
>>> things *could* break. Would like feedback.
>>>
>>> I cc'd -operators because I'm interested in knowing if there are any
>>> users of puppet-designate right now: which distro and release of
>>> OpenStack?
>>>
>>> I'm a RDO maintainer and I took interest in puppet-designate because
>>> we did not have any proper test coverage for designate in RDO
>>> packaging until now.
>>>
>>> The RDO community mostly relies on collaboration with installation and
>>> deployment projects such as Puppet OpenStack to test our packaging.
>>> We can, in turn, provide some level of guarantee that packages built
>>> out of trunk branches (and eventually stable releases) should work.
>>> The idea is to make puppet-designate work with RDO, then integrate it
>>> in the puppet-openstack-integration CI scenarios and we can leverage
>>> that in RDO CI afterwards.
>>>
>>> Both puppet-designate and designate RDO packaging were unfortunately
>>> in quite a sad state after not being maintained very well and a lot of
>>> work was required to even get basic tests to pass.
>>> The good news is that it didn't work with RDO before and now it does,
>>> for newton.
>>> Testing coverage has been improved and will be improved even further
>>> for both RDO and Ubuntu Cloud Archive.
>>>
>>> If you'd like to follow the progress of the work, the reviews are
>>> tagged with the topic "designate-with-rdo" [1].
>>>
>>> Let me know if you have any questions !
>>>
>>> [1]: https://review.openstack.org/#/q/topic:designate-with-rdo
>>>
>>> David Moreau Simard
>>> Senior Software Engineer | Openstack RDO
>>>
>>> dmsimard = [irc, github, twitter]
>>>
>>> ___
>>> OpenStack-operators mailing list
>>> OpenStack-operators@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Bridge of controller failed to start

2016-07-06 Thread mobile parmenides

Hi,


I am trying to deploy OpenStack following the guide "OpenStack Installation
Guide for Ubuntu" on Ubuntu 14.04 LTS. I have two nodes: one is the controller,
the other a compute node. But I have got stuck at neutron. After finishing
configuration, the output of the 'neutron agent-list' command shows that the
bridge agent on the controller cannot start.


+----+--------------------+------------+-------------------+-------+
| id | agent_type         | host       | availability_zone | alive |
+----+--------------------+------------+-------------------+-------+
|    | DHCP agent         | controller | nova              | :-)   |
|    | Metadata agent     | controller |                   | :-)   |
|    | L3 agent           | controller | nova              | :-)   |
|    | Linux bridge agent | compute    |                   | :-)   |
+----+--------------------+------------+-------------------+-------+

I have checked the '/etc/neutron/plugins/ml2/linuxbridge_agent.ini' file, which
is in accordance with the guide.


[linux_bridge]
physical_interface_mappings = provider:eth0

[securitygroup]
enable_security_group = True
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

[vxlan]
enable_vxlan = True
local_ip = 192.168.6.2
l2_population = True

Then, I checked the log file '/var/log/neutron/neutron-linuxbridge-agent.log', 
and there is an error message repeated many times as follows:

2016-07-07 00:28:01.872 6 ERROR neutron Traceback (most recent call last):
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/bin/neutron-linuxbridge-agent", line 10, in 
2016-07-07 00:28:01.872 6 ERROR neutronsys.exit(main())
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py",
 line 21, in main
2016-07-07 00:28:01.872 6 ERROR neutronagent_main.main()
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 898, in main
2016-07-07 00:28:01.872 6 ERROR neutronmanager = 
LinuxBridgeManager(bridge_mappings, interface_mappings)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 76, in __init__
2016-07-07 00:28:01.872 6 ERROR neutronself.check_vxlan_support()
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 628, in check_vxlan_support
2016-07-07 00:28:01.872 6 ERROR neutronif self.vxlan_ucast_supported():
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 591, in vxlan_ucast_supported
2016-07-07 00:28:01.872 6 ERROR neutrontest_iface = 
self.ensure_vxlan(seg_id)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 286, in ensure_vxlan
2016-07-07 00:28:01.872 6 ERROR neutronreturn None
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-07-07 00:28:01.872 6 ERROR neutronself.force_reraise()
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2016-07-07 00:28:01.872 6 ERROR neutronsix.reraise(self.type_, 
self.value, self.tb)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 276, in ensure_vxlan
2016-07-07 00:28:01.872 6 ERROR neutron**args)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 243, in 
add_vxlan
2016-07-07 00:28:01.872 6 ERROR neutronself._as_root([], 'link', cmd)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 94, in 
_as_root
2016-07-07 00:28:01.872 6 ERROR neutron
log_fail_as_error=self.log_fail_as_error)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 103, in 
_execute
2016-07-07 00:28:01.872 6 ERROR neutron
log_fail_as_error=log_fail_as_error)
2016-07-07 00:28:01.872 6 ERROR neutron  File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/

[Openstack-operators] (no subject)

2016-07-06 Thread mobile parmenides
Hi,


I am trying to deploy OpenStack following the guide "OpenStack Installation
Guide for Ubuntu" on Ubuntu 14.04 LTS. I have two nodes: one is the controller,
the other a compute node. But I have got stuck at neutron. After finishing
configuration, the output of the 'neutron agent-list' command shows that the
bridge agent on the controller cannot start.


+----+--------------------+------------+-------------------+-------+
| id | agent_type         | host       | availability_zone | alive |
+----+--------------------+------------+-------------------+-------+
|    | DHCP agent         | controller | nova              | :-)   |
|    | Metadata agent     | controller |                   | :-)   |
|    | L3 agent           | controller | nova              | :-)   |
|    | Linux bridge agent | compute    |                   | :-)   |
+----+--------------------+------------+-------------------+-------+

I have checked the '/etc/neutron/plugins/ml2/linuxbridge_agent.ini' file, which
is in accordance with the guide.


[linux_bridge]
physical_interface_mappings = provider:eth0

[securitygroup]
enable_security_group = True
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

[vxlan]
enable_vxlan = True
local_ip = 192.168.6.2
l2_population = True

Then, I checked the log file '/var/log/neutron/neutron-linuxbridge-agent.log', 
and there is an error message repeated many times as follows:

2016-07-07 00:28:01.872 6 ERROR neutron Traceback (most recent call last):
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/bin/neutron-linuxbridge-agent", line 10, in 
2016-07-07 00:28:01.872 6 ERROR neutron sys.exit(main())
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py",
 line 21, in main
2016-07-07 00:28:01.872 6 ERROR neutron agent_main.main()
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 898, in main
2016-07-07 00:28:01.872 6 ERROR neutron manager = 
LinuxBridgeManager(bridge_mappings, interface_mappings)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 76, in __init__
2016-07-07 00:28:01.872 6 ERROR neutron self.check_vxlan_support()
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 628, in check_vxlan_support
2016-07-07 00:28:01.872 6 ERROR neutron if self.vxlan_ucast_supported():
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 591, in vxlan_ucast_supported
2016-07-07 00:28:01.872 6 ERROR neutron test_iface = 
self.ensure_vxlan(seg_id)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 286, in ensure_vxlan
2016-07-07 00:28:01.872 6 ERROR neutron return None
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-07-07 00:28:01.872 6 ERROR neutron self.force_reraise()
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2016-07-07 00:28:01.872 6 ERROR neutron six.reraise(self.type_, 
self.value, self.tb)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py",
 line 276, in ensure_vxlan
2016-07-07 00:28:01.872 6 ERROR neutron **args)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 243, in 
add_vxlan
2016-07-07 00:28:01.872 6 ERROR neutron self._as_root([], 'link', cmd)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 94, in 
_as_root
2016-07-07 00:28:01.872 6 ERROR neutron 
log_fail_as_error=self.log_fail_as_error)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 103, in 
_execute
2016-07-07 00:28:01.872 6 ERROR neutron 
log_fail_as_error=log_fail_as_error)
2016-07-07 00:28:01.872 6 ERROR neutron   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 140, in 
exe

Re: [Openstack-operators] PCI Passthrough issues

2016-07-06 Thread Jonathan D. Proulx

Joe, seems to have been mostly solved with the qemu upgrade.  Since I
plan on being on Mitaka before blessing the gpu instances with the
'production' label I'm OK with that.

Blair, I reflexively blacklist nouveau drivers about 5 ways in my
installer and six in puppet :)

I do have an odd remaining issue where I can run cuda jobs in the vm
but snapshots fail and after pause (for snapshotting) the pci device
can't be reattached (which is where i think it deletes the snapshot
it took).  Got same issue with 3.16 and 4.4 kernels.

Not very well categorized yet, but I'm hoping it's because the VM I
was hacking on had its libvirt.xml written out with the older qemu
maybe?  It had been through a couple reboots of the physical system
though.

Currently building a fresh instance and bashing more keys...

Thanks all,

-Jon

On Thu, Jul 07, 2016 at 12:35:33AM +1000, Blair Bethwaite wrote:
:Hi Jon,
:
:Do you have the nouveau driver/module loaded in the host by any
:chance? If so, blacklist, reboot, repeat.
:
:Whilst we're talking about this. Has anyone had any luck doing this
:with hosts having a PCI-e switch across multiple GPUs?
:
:Cheers,
:
:On 6 July 2016 at 23:27, Jonathan D. Proulx  wrote:
:> Hi All,
:>
:> Trying to pass through some Nvidia K80 GPUs to some instances and have
:> gotten to the place where Nova seems to be doing the right thing gpu
:> instances scheduled on the 1 gpu hypervisor I have and for inside the
:> VM I see:
:>
:> root@gpu-x1:~# lspci | grep -i k80
:> 00:06.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
:>
:> And I can install the nvidia-361 driver and get
:>
:> # ls /dev/nvidia*
:> /dev/nvidia0  /dev/nvidiactl  /dev/nvidia-uvm  /dev/nvidia-uvm-tools
:>
:> Once I load up cuda-7.5 and build the examples, none of them run,
:> claiming there's no cuda device.
:>
:> # ./matrixMul
:> [Matrix Multiply Using CUDA] - Starting...
:> cudaGetDevice returned error no CUDA-capable device is detected (code 38), 
line(396)
:> cudaGetDeviceProperties returned error no CUDA-capable device is detected 
(code 38), line(409)
:> MatrixA(160,160), MatrixB(320,160)
:> cudaMalloc d_A returned error no CUDA-capable device is detected (code 38), 
line(164)
:>
:> I'm not familiar with cuda really but I did get some example code
:> running on the physical system for burn-in over the weekend (since
:> reinstalled, so no nvidia driver on the hypervisor).
:>
:> Following various online examples  for setting up pass through I set
:> the kernel boot line on the hypervisor to:
:>
:> # cat /proc/cmdline
:> BOOT_IMAGE=/boot/vmlinuz-3.13.0-87-generic 
root=UUID=d9bc9159-fedf-475b-b379-f65490c71860 ro console=tty0 
console=ttyS1,115200 intel_iommu=on iommu=pt rd.modules-load=vfio-pci nosplash 
nomodeset intel_iommu=on iommu=pt rd.modules-load=vfio-pci nomdmonddf nomdmonisw
:>
:> Puzzled that I apparently have the device but it is apparently
:> nonfunctional, where do I even look from here?
:>
:> -Jon
:>
:>
:> ___
:> OpenStack-operators mailing list
:> OpenStack-operators@lists.openstack.org
:> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
:
:
:
:-- 
:Cheers,
:~Blairo

-- 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [puppet] [desginate] An update on the state of puppet-designate (and designate in RDO)

2016-07-06 Thread David Moreau Simard
Thanks Matt, if you don't mind I might add you to some puppet reviews.

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


On Tue, Jul 5, 2016 at 10:22 PM, Matt Fischer  wrote:
> We're using Designate but still on Juno. We're running puppet from around
> then, summer of 2015. We'll likely try to upgrade to Mitaka at some point
> but Juno Designate "just works" so it's been low priority. Look forward to
> your efforts here.
>
> On Tue, Jul 5, 2016 at 7:47 PM, David Moreau Simard  wrote:
>>
>> Hi !
>>
>> tl;dr
>> puppet-designate is undergoing some significant updates to bring it
>> up to par right now.
>> While I will try to ensure it is well tested and backwards compatible,
>> things *could* break. Would like feedback.
>>
>> I cc'd -operators because I'm interested in knowing if there are any
>> users of puppet-designate right now: which distro and release of
>> OpenStack?
>>
>> I'm a RDO maintainer and I took interest in puppet-designate because
>> we did not have any proper test coverage for designate in RDO
>> packaging until now.
>>
>> The RDO community mostly relies on collaboration with installation and
>> deployment projects such as Puppet OpenStack to test our packaging.
>> We can, in turn, provide some level of guarantee that packages built
>> out of trunk branches (and eventually stable releases) should work.
>> The idea is to make puppet-designate work with RDO, then integrate it
>> in the puppet-openstack-integration CI scenarios and we can leverage
>> that in RDO CI afterwards.
>>
>> Both puppet-designate and designate RDO packaging were unfortunately
>> in quite a sad state after not being maintained very well and a lot of
>> work was required to even get basic tests to pass.
>> The good news is that it didn't work with RDO before and now it does,
>> for newton.
>> Testing coverage has been improved and will be improved even further
>> for both RDO and Ubuntu Cloud Archive.
>>
>> If you'd like to follow the progress of the work, the reviews are
>> tagged with the topic "designate-with-rdo" [1].
>>
>> Let me know if you have any questions !
>>
>> [1]: https://review.openstack.org/#/q/topic:designate-with-rdo
>>
>> David Moreau Simard
>> Senior Software Engineer | Openstack RDO
>>
>> dmsimard = [irc, github, twitter]
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] PCI Passthrough issues

2016-07-06 Thread Blair Bethwaite
Hi Jon,

Do you have the nouveau driver/module loaded in the host by any
chance? If so, blacklist, reboot, repeat.

Whilst we're talking about this. Has anyone had any luck doing this
with hosts having a PCI-e switch across multiple GPUs?

Cheers,

On 6 July 2016 at 23:27, Jonathan D. Proulx  wrote:
> Hi All,
>
> Trying to pass through some Nvidia K80 GPUs to some instances and have
> gotten to the place where Nova seems to be doing the right thing gpu
> instances scheduled on the 1 gpu hypervisor I have and for inside the
> VM I see:
>
> root@gpu-x1:~# lspci | grep -i k80
> 00:06.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
>
> And I can install the nvidia-361 driver and get
>
> # ls /dev/nvidia*
> /dev/nvidia0  /dev/nvidiactl  /dev/nvidia-uvm  /dev/nvidia-uvm-tools
>
> Once I load up cuda-7.5 and build the examples, none of them run,
> claiming there's no cuda device.
>
> # ./matrixMul
> [Matrix Multiply Using CUDA] - Starting...
> cudaGetDevice returned error no CUDA-capable device is detected (code 38), 
> line(396)
> cudaGetDeviceProperties returned error no CUDA-capable device is detected 
> (code 38), line(409)
> MatrixA(160,160), MatrixB(320,160)
> cudaMalloc d_A returned error no CUDA-capable device is detected (code 38), 
> line(164)
>
> I'm not familiar with cuda really but I did get some example code
> running on the physical system for burn-in over the weekend (since
> reinstalled, so no nvidia driver on the hypervisor).
>
> Following various online examples  for setting up pass through I set
> the kernel boot line on the hypervisor to:
>
> # cat /proc/cmdline
> BOOT_IMAGE=/boot/vmlinuz-3.13.0-87-generic 
> root=UUID=d9bc9159-fedf-475b-b379-f65490c71860 ro console=tty0 
> console=ttyS1,115200 intel_iommu=on iommu=pt rd.modules-load=vfio-pci 
> nosplash nomodeset intel_iommu=on iommu=pt rd.modules-load=vfio-pci 
> nomdmonddf nomdmonisw
>
> Puzzled that I apparently have the device but it is apparently
> nonfunctional, where do I even look from here?
>
> -Jon
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



-- 
Cheers,
~Blairo

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Migration to LDAP / default domain questions

2016-07-06 Thread Ben Morrice

Hello,

We have a small private OpenStack deployment with 300 VMs across 2 regions.
We currently use the Keystone v2.0 API and all accounts are stored in SQL.


We would like to move keystone to authenticate users from LDAP 
(identity), whilst still having the service accounts stored in SQL 
(migrating to Keystone v3 in the process).


In our testing environment we have configured domain-specific drivers to 
support the above configuration, with the 'default' domain being SQL and 
a separate domain 'ldap' for credentials from LDAP.


Usernames are the same for accounts in both 'default' and 'ldap'.
Assignments would still reside in SQL.
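
For reference, a rough sketch of the domain-specific driver layout we are
testing (file names and LDAP attributes below are illustrative, not our
exact config):

# /etc/keystone/keystone.conf
[identity]
domain_specific_drivers_enabled = true
domain_config_dir = /etc/keystone/domains

# /etc/keystone/domains/keystone.ldap.conf  (config for the 'ldap' domain)
[identity]
driver = ldap
[ldap]
url = ldap://ldap.example.org
user_tree_dn = ou=people,dc=example,dc=org
user_objectclass = inetOrgPerson

The 'default' domain simply keeps using the SQL identity driver from the
main keystone.conf, and assignments stay in SQL.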

This setup works for the creation of new resources; however, any 
resources defined in the old domain ('default') are obviously not 
available in the 'ldap' domain.


Has anyone migrated resources between domains? There doesn't appear to 
be any OpenStack tooling to support this (?).


Or is the solution to simply configure the ldap domain named as 
'default' and the SQL domain named as something like 'services' ?


--
Kind regards,

Ben Morrice

__
Ben Morrice | e: ben.morr...@epfl.ch | t: +41-21-693-9670
EPFL ENT CBS BBP
Biotech Campus
Chemin des Mines 9
1202 Geneva
Switzerland


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] PCI Passthrough issues

2016-07-06 Thread Joe Topjian
Hi Jon,

We were also running into issues with the K80s.

For our GPU nodes, we've gone with a 4.2 or 4.4 kernel. PCI Passthrough
works much better in those releases. (I ran into odd issues with 4.4 and
NFS, downgraded to 4.2 after a few hours of banging my head, problems went
away, not a scientific solution :)

After that, make sure vfio is loaded:

$ lsmod | grep vfio

Then start with the "deviceQuery" CUDA sample. We've found deviceQuery to
be a great check to see if the instance has full/correct access to the
card. If deviceQuery prints a report within 1-2 seconds, all is well. If
there is a lag, something is off.
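
A rough example of that check, assuming the default CUDA 7.5 samples
location inside the guest:

$ cd /usr/local/cuda-7.5/samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery   # should report the K80 and end with "Result = PASS" almost instantly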

In our case for the K80s, that final "something" was qemu. We came across
this[1] wiki page (search for K80) and started digging into qemu. tl;dr:
upgrading to the qemu packages found in the Ubuntu Mitaka cloud archive
solved our issues.
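
On Ubuntu 14.04 that upgrade is roughly the following (the exact package
set may differ in your environment):

$ sudo add-apt-repository cloud-archive:mitaka
$ sudo apt-get update
$ sudo apt-get install qemu-kvm qemu-system-x86
# then restart (or cold-migrate) instances so they pick up the new qemu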

Hope that helps,
Joe

1: https://pve.proxmox.com/wiki/Pci_passthrough


On Wed, Jul 6, 2016 at 7:27 AM, Jonathan D. Proulx 
wrote:

> Hi All,
>
> Trying to pass through some Nvidia K80 GPUs to some instances and I have
> gotten to the place where Nova seems to be doing the right thing: GPU
> instances are scheduled on the one GPU hypervisor I have, and inside the
> VM I see:
>
> root@gpu-x1:~# lspci | grep -i k80
> 00:06.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
>
> And I can install the nvidia-361 driver and get
>
> # ls /dev/nvidia*
> /dev/nvidia0  /dev/nvidiactl  /dev/nvidia-uvm  /dev/nvidia-uvm-tools
>
> Once I load up cuda-7.5 and build the examples, none of them run,
> claiming there's no CUDA device.
>
> # ./matrixMul
> [Matrix Multiply Using CUDA] - Starting...
> cudaGetDevice returned error no CUDA-capable device is detected (code 38),
> line(396)
> cudaGetDeviceProperties returned error no CUDA-capable device is detected
> (code 38), line(409)
> MatrixA(160,160), MatrixB(320,160)
> cudaMalloc d_A returned error no CUDA-capable device is detected (code
> 38), line(164)
>
> I'm not really familiar with CUDA, but I did get some example code
> running on the physical system for burn-in over the weekend (since
> reinstalled, so no nvidia driver on the hypervisor).
>
> Following various online examples for setting up passthrough, I set
> the kernel boot line on the hypervisor to:
>
> # cat /proc/cmdline
> BOOT_IMAGE=/boot/vmlinuz-3.13.0-87-generic
> root=UUID=d9bc9159-fedf-475b-b379-f65490c71860 ro console=tty0
> console=ttyS1,115200 intel_iommu=on iommu=pt rd.modules-load=vfio-pci
> nosplash nomodeset intel_iommu=on iommu=pt rd.modules-load=vfio-pci
> nomdmonddf nomdmonisw
>
> Puzzled that I apparently have the device but it is apparently
> nonfunctional, where do I even look from here?
>
> -Jon
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] PCI Passthrough issues

2016-07-06 Thread Jonathan D. Proulx
Hi All,

Trying to pass through some Nvidia K80 GPUs to some instances and I have
gotten to the place where Nova seems to be doing the right thing: GPU
instances are scheduled on the one GPU hypervisor I have, and inside the
VM I see:

root@gpu-x1:~# lspci | grep -i k80
00:06.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

And I can install the nvidia-361 driver and get

# ls /dev/nvidia*
/dev/nvidia0  /dev/nvidiactl  /dev/nvidia-uvm  /dev/nvidia-uvm-tools

Once I load up cuda-7.5 and build the examples, none of them run,
claiming there's no CUDA device.

# ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
cudaGetDevice returned error no CUDA-capable device is detected (code 38), 
line(396)
cudaGetDeviceProperties returned error no CUDA-capable device is detected (code 
38), line(409)
MatrixA(160,160), MatrixB(320,160)
cudaMalloc d_A returned error no CUDA-capable device is detected (code 38), 
line(164)

I'm not really familiar with CUDA, but I did get some example code
running on the physical system for burn-in over the weekend (since
reinstalled, so no nvidia driver on the hypervisor).

Following various online examples for setting up passthrough, I set
the kernel boot line on the hypervisor to:

# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.13.0-87-generic 
root=UUID=d9bc9159-fedf-475b-b379-f65490c71860 ro console=tty0 
console=ttyS1,115200 intel_iommu=on iommu=pt rd.modules-load=vfio-pci nosplash 
nomodeset intel_iommu=on iommu=pt rd.modules-load=vfio-pci nomdmonddf nomdmonisw

Puzzled that I apparently have the device but it is apparently
nonfunctional, where do I even look from here?

-Jon


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Re: Re: Quota exceeded for resources:['security_group'].

2016-07-06 Thread ????????
Thanks very much!




------ Original message ------
From: "Kris G. Lindgren";
Date: Tuesday, July 5, 2016 8:28
To: <821696...@qq.com>; "vincent.legoll"; "openstack-operators";
Subject: Re: [Openstack-operators] Re: Quota exceeded for resources:['security_group'].



If you are using neutron, you also need to update the quotas for the
tenant in neutron as well.
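
Something along these lines (untested sketch, reusing the tenant id from
the example quoted below):

neutron quota-update --tenant-id 6cb156a82d0f486a9f50132be9438eb6 \
    --security-group 300 --security-group-rule 600
neutron quota-show --tenant-id 6cb156a82d0f486a9f50132be9438eb6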
   
 
 ___
 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy
 
 
 
 
 
 
From: <821696...@qq.com>
Date: Tuesday, July 5, 2016 at 1:50 AM
To: "vincent.legoll", openstack-operators
Subject: [Openstack-operators] Re: Quota exceeded for resources:['security_group'].
 
 
 
Thank you.

I tried it several times, and then configured:

nova quota-update --security-groups 300 --security-group-rules 60 6cb156a82d0f486a9f50132be9438eb6
nova quota-show | grep security_group
| security_groups | 300   |
| security_group_rules| 60|

but the error still occurs.

 
 
------ Original message ------
From: "vincent.legoll";
Date: Tuesday, July 5, 2016 2:54
To: "openstack-operators";
Subject: Re: [Openstack-operators] Quota exceeded for resources:['security_group'].
 
 
 
 Hello,
 
On 05/07/2016 06:02, the original poster wrote:
> when I create a cluster:
>  openstack dataprocessing cluster create --json my_cluster_create_vmdk.json
[...]
> How to deal with it, thanks!
 
 Look at the actual quotas:
 
 $ nova quota-show | grep security_group
 | security_groups | 10|
 | security_group_rules| 20|
 
 Then grow them a bit:
 
 $ nova quota-update --security-groups 11 --security-group-rules 21 
 
There are equivalent unified client (openstack) commands; they should be easy to find.
 
 Hope this helps
 
 -- 
 Vincent Legoll
 EGI FedCloud task force
 Cloud Computing at IdGC
 France Grilles / CNRS / IPHC
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] How to config floating ip pool use nova? thanks

2016-07-06 Thread ????????
hi everyone,
How do I configure a floating IP pool using nova? Thanks.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [scientific-wg] Lustre war stories

2016-07-06 Thread Blair Bethwaite
Hi Álvaro, hi David -

NB: adding os-ops.

David, we have some real-time Lustre war stories we can share and can
hopefully provide some positive conclusions come Barcelona. I've
given an overview of what we're doing below. Are there any specifics
you were interested in when you raised Lustre in the meeting?

Our present approach leans on SRIOV and has worked with both
nova-network and now Neutron, and actually though this would work with
the Mellanox Neutron ML2 driver we are not using OpenStack Networking
to orchestrate this yet. It achieves the plumbing required to get a
parallel filesystem integrated into a typical virtualised OpenStack
deployment, but it does not "cloudify" the parallel filesystem in
any way (for this you really need the filesystem to have some concept
of multi-tenancy and/or strong client isolation) and has quite limited
applicability to one or a small number of trusted users/projects.

Currently we have a single high-performance data network per cluster,
that is a high-bandwidth and RDMA capable Ethernet fabric. Our
Mellanox NICs (not sure if other vendors have similar features?) allow
us to restrict SRIOV virtual functions (VFs) to specific VLANs, so we
tie them to the data VLAN and then use PCI passthrough (care of
private Nova instance-types) to give guests a PCI VF plugged straight
into that network. Guests need to load appropriate drivers and
configure their own L3. Lustre servers are traditional bare-metal
affairs sitting at the bottom of that subnet.
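
One common way to do that VF-to-VLAN pinning on the hypervisor (device
name and ids below are illustrative, not necessarily our exact setup) is
plain iproute2:

# pin virtual function 3 of the PF enp4s0f0 to data VLAN 148
ip link set enp4s0f0 vf 3 vlan 148
ip link show enp4s0f0   # the vf 3 line should now show "vlan 148"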

We have one deployment like this which has been running Lustre over
TCP for about 12 months. That seems to work pretty well except that we
are in the midst of investigating high rx_errors on the servers and
discards on the switches, which seem like they might be
causing/related to Lustre write checksum errors that we see a lot of -
these don't seem to be fatal or data corrupting but rather Lustre
transport level errors which might cause write errors to propagate to
clients, but we're unsure... That particular problem does not seem to
be inherently related to our host configs or use of SRIOV though, more
likely a fabric config issue.

We have a second slightly larger deployment with a similar
configuration, the most notable difference for that one is that it is
using the o2iblnd (o2ib Lustre Network Driver), i.e., Lustre is
configured as for IB but is really running on RoCE. We plan to extract
some performance comparisons from this over coming weeks (there I
would like to compare with both TCP over SRIOV and TCP over
linux-bridge). Probably the main issue with this setup so far is the
need to build Lustre modules against both kernel and Mellanox OFED -
normally compute nodes like this stay very static, but now that they
are cloud instances there is a natural push towards more frequent
updating, and there is not a great deal of clarity about which
combinations are currently supported.
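
For anyone reproducing this, the guest-side LNet selection is just a
module option; a rough sketch (interface names are illustrative):

# /etc/modprobe.d/lustre.conf in a guest using the o2ib LND over the VF
options lnet networks="o2ib0(eth1)"
# or plain TCP over the same interface:
# options lnet networks="tcp0(eth1)"

after which "lctl list_nids" should show the node on the expected network.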

Cheers,

On 6 July 2016 at 21:31, Álvaro López García  wrote:
> On 06 Jul 2016 (11:58), Stig Telfer wrote:
>> Hi Dave -
>
> Hi all,
>
>> I’d like to introduce Wojciech and Matt from across town at the
>> University.  Wojciech and Matt work on managing and developing the
>> Lustre storage here at Cambridge University.  Right now we are just
>> getting started on integrating Lustre into OpenStack but Blair (also
>> copied) has a similar setup up and running already at Monash
>> University in Melbourne.
>>
>> Parallel filesystems in general are an activity area for the
>> Scientific Working Group, so we try to keep people in touch about what
>> works and what is possible.
>
> Yes, it would be awesome to get some other user stories about parallel
> filesystems and share our approaches, concerns and hacks.
>
>> I’m aware that Lustre is also used in this way in CSC in Finland and
>> (from today’s discussion) Álvaro has a similar configuration using
>> GPFS at his university in Spain.
>
> A bit of context. We are CSIC are operating two separate computing
> infrastructures, one HPC node part of the Spanish supercompting network,
> and one HTC cluster, plugged to the European Grid Infrastructure. Access
> policies for both systems are completely different, and they  are being
> used by a variety of disciplines (high energy physics, astrophysics,
> cosmology, engineering, bio, etc.). Both systems rely on GPFS: the HPC
> node leverages Infiniband for the storage network, whereas the HTC one
> uses 10GbE or 1GbE.
>
> In the IRC meeting I said that these filesystems were not shared, but I
> was wrong. All the filesystems are shared across both infrastructures,
> with the particularity that the HPC filesystems are only shared as
> read-only outside the HPC node (they are shared over Ethernet).
>
> The interesting part is that we are running a complete SGE cluster (HTC)
> on top of OpenStack, with access to GPFS. The HTC cluster is subject to
> periodic updates due to middleware upgrades; therefore we were running a
> completely virtualized cluster so that we could pe