Re: [Openstack-operators] pci passthrough & numa affinity

2018-05-24 Thread Jonathan D. Proulx
On Thu, May 24, 2018 at 06:34:06PM -0500, Eric Fried wrote:
:How long are you willing to wait?
:
:The work we're doing to use Placement from Nova ought to allow us to
:model both of these things nicely from the virt driver, and request them
:nicely from the flavor.
:
:By the end of Rocky we will have laid a large percentage of the
:groundwork to enable this. This is all part of the road to what we've
:been calling "generic device management" (GDM) -- which we hope will
:eventually let us remove most/all of the existing PCI passthrough code.
:
:I/we would be interested in hearing more specifics of your requirements
:around this, as it will help inform the GDM roadmap.  And of course,
:upstream help & contributions would be very welcome.

Sounds like good work.

My use case is not yet very clear.  I do have some upcoming
discussions with users around requirements and funding, so being able
to say "this is on the roadmap and could be accelerated with
developer hours" is useful.  I expect patience is what will come of
that, but it is very good to know where to go when I get some clarity
and if I get some resources.

-Jon

:
:Thanks,
:efried
:
:On 05/24/2018 05:19 PM, Jonathan D. Proulx wrote:
:> On Fri, May 25, 2018 at 07:59:16AM +1000, Blair Bethwaite wrote:
:> :Hi Jon,
:> :
:> :Following up to the question you asked during the HPC on OpenStack
:> :panel at the summit yesterday...
:> :
:> :You might have already seen Daniel Berrange's blog on this topic:
:> 
:https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/
:> :? He essentially describes how you can get around the issue of the
:> :naive flat pci bus topology in the guest - exposing numa affinity of
:> :the PCIe root ports requires newish qemu and libvirt.
:> 
:> Thanks for the pointer; not sure if I've seen that one.  I've seen a
:> few ways to do the mapping manually.  I would have been quite
:> surprised if nova did this, so I am poking at libvirt.xml outside
:> nova for now.
:> 
:> :However, best I can tell there is no way to do this with Nova today.
:> :Are you interested in working together on a spec for this?
:> 
:> I'm not yet convinced it's worth the bother; that's the crux of the
:> question I'm investigating.  Is this worth the effort?  There's also
:> the meta-question: "do I have time to find out?" :)
:> 
:> :The other related feature of interest here (newer though - no libvirt
:> :support yet, I think) is gpu cliques
:> :(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b);
:> :it would be really nice to have a way to set these up through Nova once
:> :libvirt supports it.
:> 
:> Thanks,
:> -Jon
:> 
:> 
:> 
:

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] pci passthrough & numa affinity

2018-05-24 Thread Eric Fried
How long are you willing to wait?

The work we're doing to use Placement from Nova ought to allow us to
model both of these things nicely from the virt driver, and request them
nicely from the flavor.

By the end of Rocky we will have laid a large percentage of the
groundwork to enable this. This is all part of the road to what we've
been calling "generic device management" (GDM) -- which we hope will
eventually let us remove most/all of the existing PCI passthrough code.

I/we would be interested in hearing more specifics of your requirements
around this, as it will help inform the GDM roadmap.  And of course,
upstream help & contributions would be very welcome.

Thanks,
efried

On 05/24/2018 05:19 PM, Jonathan D. Proulx wrote:
> On Fri, May 25, 2018 at 07:59:16AM +1000, Blair Bethwaite wrote:
> :Hi Jon,
> :
> :Following up to the question you asked during the HPC on OpenStack
> :panel at the summit yesterday...
> :
> :You might have already seen Daniel Berrange's blog on this topic:
> :https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/
> :? He essentially describes how you can get around the issue of the
> :naive flat pci bus topology in the guest - exposing numa affinity of
> :the PCIe root ports requires newish qemu and libvirt.
> 
> Thanks for the pointer; not sure if I've seen that one.  I've seen a
> few ways to do the mapping manually.  I would have been quite
> surprised if nova did this, so I am poking at libvirt.xml outside
> nova for now.
> 
> :However, best I can tell there is no way to do this with Nova today.
> :Are you interested in working together on a spec for this?
> 
> I'm not yet convinced it's worth the bother; that's the crux of the
> question I'm investigating.  Is this worth the effort?  There's also
> the meta-question: "do I have time to find out?" :)
> 
> :The other related feature of interest here (newer though - no libvirt
> :support yet, I think) is gpu cliques
> :(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b);
> :it would be really nice to have a way to set these up through Nova once
> :libvirt supports it.
> 
> Thanks,
> -Jon
> 
> 
> 



Re: [Openstack-operators] pci passthrough & numa affinity

2018-05-24 Thread Jonathan D. Proulx
On Fri, May 25, 2018 at 07:59:16AM +1000, Blair Bethwaite wrote:
:Hi Jon,
:
:Following up to the question you asked during the HPC on OpenStack
:panel at the summit yesterday...
:
:You might have already seen Daniel Berrange's blog on this topic:
:https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/
:? He essentially describes how you can get around the issue of the
:naive flat pci bus topology in the guest - exposing numa affinity of
:the PCIe root ports requires newish qemu and libvirt.

Thanks for the pointer; not sure if I've seen that one.  I've seen a
few ways to do the mapping manually.  I would have been quite
surprised if nova did this, so I am poking at libvirt.xml outside
nova for now.

:However, best I can tell there is no way to do this with Nova today.
:Are you interested in working together on a spec for this?

I'm not yet convinced it's worth the bother; that's the crux of the
question I'm investigating.  Is this worth the effort?  There's also
the meta-question: "do I have time to find out?" :)

:The other related feature of interest here (newer though - no libvirt
:support yet, I think) is gpu cliques
:(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b);
:it would be really nice to have a way to set these up through Nova once
:libvirt supports it.

Thanks,
-Jon




[Openstack-operators] [nova] Need some feedback on the proposed heal_allocations CLI

2018-05-24 Thread Matt Riedemann
I've written a nova-manage placement heal_allocations CLI [1], which was 
a TODO from the PTG in Dublin, as a step toward getting existing 
CachingScheduler users to migrate off of it (the CachingScheduler is 
deprecated).


During the CERN cells v1 upgrade talk it was pointed out that CERN was 
able to go from placement-per-cell to centralized placement in Ocata 
because the nova-computes in each cell would automatically recreate the 
allocations in Placement in a periodic task, but that code is gone once 
you're upgraded to Pike or later.


In various other talks during the summit this week, we've talked about 
upgrade scenarios where, for instance, placement is down for some reason 
during an upgrade, a user deletes an instance, and the allocation doesn't 
get cleaned up from placement, so it continues counting against resource 
usage on that compute node even though the server instance in nova is 
gone. This CLI could be expanded to help clean up situations like that, 
e.g. provide it a specific server ID and the CLI can figure out if it 
needs to clean things up in placement.


So there are plenty of things we can build into this, but the patch is 
already quite large. I expect we'll also be backporting this to stable 
branches to help operators upgrade and fix allocation issues. An inline 
code comment already lists several things to build into it later.


My question is: is this good enough for a first iteration, or is there 
something critical missing before we can merge it, like the automatic 
marker tracking mentioned in the code (which will probably be a 
non-trivial amount of code to add)? I could really use some operator 
feedback here: take a look at what the CLI is already capable of, and if 
it's not going to be useful in this iteration, let me know what's 
missing and I can add that to the patch.


[1] https://review.openstack.org/#/c/565886/
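To make the idea concrete, the core of what the CLI does can be sketched like this. This is a simplified illustration with hypothetical data shapes, not Nova's actual code; the real CLI reads instances from the cell databases and PUTs allocations to the placement API:

```python
def heal_allocations(instances, allocations, max_count=None):
    """Find instances with no placement allocation and create one from
    the instance's flavor-derived resources (illustrative sketch only).

    instances:   list of dicts like {"uuid": ..., "vcpus": ..., "ram": ...}
    allocations: dict mapping instance uuid -> allocation record
    max_count:   optional batch limit, in the spirit of a --max-count flag
    """
    healed = []
    for inst in instances:
        if inst["uuid"] in allocations:
            continue  # already has an allocation; nothing to heal
        # The real CLI would PUT this to the placement API instead of
        # mutating a local dict.
        allocations[inst["uuid"]] = {
            "resources": {"VCPU": inst["vcpus"], "MEMORY_MB": inst["ram"]}
        }
        healed.append(inst["uuid"])
        if max_count is not None and len(healed) >= max_count:
            break  # stop after the requested batch size
    return healed
```

The batching limit matters for the backport/stable use case, since operators may be healing thousands of instances in one pass.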

--

Thanks,

Matt



[Openstack-operators] pci passthrough & numa affinity

2018-05-24 Thread Blair Bethwaite
Hi Jon,

Following up to the question you asked during the HPC on OpenStack
panel at the summit yesterday...

You might have already seen Daniel Berrange's blog on this topic:
https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/
? He essentially describes how you can get around the issue of the
naive flat pci bus topology in the guest - exposing numa affinity of
the PCIe root ports requires newish qemu and libvirt.
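For reference, the approach in that blog boils down to adding a pcie-expander-bus controller pinned to a host NUMA node in the libvirt domain XML, and hanging root ports (and the passed-through device) off it. A hand-written fragment, not a complete domain definition; the busNr and node values are illustrative:

```xml
<!-- Expander bus that advertises NUMA node 1 to the guest (q35 machine
     type; libvirt fills in index/address details when omitted) -->
<controller type='pci' model='pcie-expander-bus'>
  <target busNr='180'>
    <node>1</node>
  </target>
</controller>
<!-- A root port behind the expander bus; the assigned device's
     <address> would then reference this port's bus -->
<controller type='pci' model='pcie-root-port'/>
```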

However, best I can tell there is no way to do this with Nova today.
Are you interested in working together on a spec for this?

The other related feature of interest here (newer though - no libvirt
support yet, I think) is gpu cliques
(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b);
it would be really nice to have a way to set these up through Nova once
libvirt supports it.

-- 
Cheers,
~Blairo



Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Jonathan D. Proulx
On Thu, May 24, 2018 at 01:26:07PM -0700, Melvin Hillsman wrote:
:   I think a great model we have as a community in general is: if people
:   show up to do the work and it is not something crazy, get out of their
:   way; at least that is how I think of it. I apologize if my bringing up
:   the other repos came across as opposed to that. I tried to be clear
:   that I wanted feedback from Doug in the hope that, as we move forward
:   in general, we keep removing roadblocks, if any exist, in parallel to
:   great work like what Chris is driving here. On that front, please do
:   what works best for those doing the work.

No worries I feel the love :)

Going to go forward implementing this as SIG + repo, which seems the
lightest way forward; we can always adapt and evolve.

-Jon

:   On Thu, May 24, 2018 at 7:26 AM, Jonathan D. Proulx
:   <[1]j...@csail.mit.edu> wrote:
:
: On Thu, May 24, 2018 at 07:07:10AM -0700, Doug Hellmann wrote:
: :I know you wanted to avoid lots of governance overhead, so I want
: :to just mention that establishing a SIG is meant to be a painless
: :and light-weight way to declare that a group of interested people
: :exists so that others can find them and participate in the work
: :[1]. It shouldn't take much effort to do the setup, and any ongoing
: :communication is something you would presumably by doing anyway
: :among a group of people trying to collaborate on a project like
: :this.
: Yeah I can see SIG as a useful structure too.  I'm just more
: familiar with UC "teams" because of my personal history.
: I do think SIG -vs- team would impact repo naming, and I'm still
: going over the creation doc, so I'll let this simmer here at least
: until YVR lunch time to see if there's consensus or controversy in
: the potential contributor community.  Lacking either, I think I
: will default to SIG-ops-docs.
: Thanks,
: -Jon
: :
: :Let me know if you have any questions or concerns about the
: process.
:
:   :
:   :Doug
:   :
:   :[1] [2]https://governance.openstack.org/sigs/#process-to-create-a-sig
:   :
:   :>
:   :> > of repositories under osops-
:   :> >
:   :> > [3]https://github.com/openstack-infra/project-config/blob/
:   master/gerrit/projects.yaml#L5673-L5703
:   :> >
:   :> > Generally active:
:   :> > osops-tools-contrib
:   :> > osops-tools-generic
:   :> > osops-tools-monitoring
:   :> >
:   :> >
:   :> > Probably dead:
:   :> > osops-tools-logging
:   :> > osops-coda
:   :> > osops-example-configs
:   :> >
:   :> > Because you are more familiar with how things work, is there a way
:   to
:   :> > consolidate these vs coming up with another repo like osops-docs
:   or
:   :> > whatever in this case? And second, is there already governance
:   clearance to
:   :> > publish based on the following - [4]https://launchpad.net/osops -
:   which is
:   :> > where these repos originated.
:   :>
:   :> I don't really know what any of those things are, or whether it
:   :> makes sense to put this new content there. I assumed we would make
:   :> a repo with a name like "operations-guide", but that's up to Chris
:   :> and John.  If they think reusing an existing repository makes sense,
:   :> that would be OK with me, but it's cheap and easy to set up a new
:   :> one, too.
:   :>
:   :> My main concern is that we remove the road blocks, now that we have
:   :> people interested in contributing to this documentation.
:   :>
:   :> >
:   :> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker <[5]eu...@arcor.de>
:   wrote:
:   :> >
:   :> > > Hi Chris,
:   :> > >
:   :> > > thanks for summarizing our session today in Vancouver. As I18n
:   :> > > PTL and one of the Docs Cores I put Petr in Cc. He is currently
:   :> > > Docs PTL, but unfortunately not on-site.
:   :> > > I also couldn't get the full history of the story, and the idea
:   :> > > is not to start finger pointing. As usual we move forward, and
:   :> > > there are some interesting things to know about what happened.
:   :> > > First of all: there is no "Docs-Team" anymore. If you look at
:   :> > > [1], there are mostly part-time contributors like me, or people
:   :> > > who are more involved in other projects and therefore busy.
:   :> > > Because of that, responsibility for documentation content has
:   :> > > moved completely to the project teams. Each repo has a user
:   :> > > guide, admin guide, deployment guide, and so on. The small
:   :> > > Documentation Team only provides tooling and gives advice on how
:   :> > > to write and publish a document. So it's up to you to re-use the
:   :> > > old repo at [2] or set up a new one. I would recommend using the
:   :> > > best of both worlds. There is a very good toolset in place for
:   :> > > testing and publishing documents.
:   :> > > There are also various text editors for 

Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Melvin Hillsman
I think a great model we have as a community in general is: if people show
up to do the work and it is not something crazy, get out of their way; at
least that is how I think of it. I apologize if my bringing up the other
repos came across as opposed to that. I tried to be clear that I wanted
feedback from Doug in the hope that, as we move forward in general, we
keep removing roadblocks, if any exist, in parallel to great work like
what Chris is driving here. On that front, please do what works best for
those doing the work.

On Thu, May 24, 2018 at 7:26 AM, Jonathan D. Proulx 
wrote:

> On Thu, May 24, 2018 at 07:07:10AM -0700, Doug Hellmann wrote:
>
> :I know you wanted to avoid lots of governance overhead, so I want
> :to just mention that establishing a SIG is meant to be a painless
> :and light-weight way to declare that a group of interested people
> :exists so that others can find them and participate in the work
> :[1]. It shouldn't take much effort to do the setup, and any ongoing
> :communication is something you would presumably by doing anyway
> :among a group of people trying to collaborate on a project like
> :this.
>
> Yeah I can see SIG as a useful structure too.  I'm just more familiar
> with UC "teams" because of my personal history.
>
> I do think SIG -vs- team would impact repo naming, and I'm still going
> over the creation doc, so I'll let this simmer here at least until YVR lunch
> time to see if there's consensus or controversy in the potential
> contributor community.  Lacking either, I think I will default to
> SIG-ops-docs.
>
> Thanks,
> -Jon
>
> :
> :Let me know if you have any questions or concerns about the process.
> :
> :Doug
> :
> :[1] https://governance.openstack.org/sigs/#process-to-create-a-sig
> :
> :>
> :> > of repositories under osops-
> :> >
> :> > https://github.com/openstack-infra/project-config/blob/
> master/gerrit/projects.yaml#L5673-L5703
> :> >
> :> > Generally active:
> :> > osops-tools-contrib
> :> > osops-tools-generic
> :> > osops-tools-monitoring
> :> >
> :> >
> :> > Probably dead:
> :> > osops-tools-logging
> :> > osops-coda
> :> > osops-example-configs
> :> >
> :> > Because you are more familiar with how things work, is there a way to
> :> > consolidate these vs coming up with another repo like osops-docs or
> :> > whatever in this case? And second, is there already governance
> clearance to
> :> > publish based on the following - https://launchpad.net/osops - which
> is
> :> > where these repos originated.
> :>
> :> I don't really know what any of those things are, or whether it
> :> makes sense to put this new content there. I assumed we would make
> :> a repo with a name like "operations-guide", but that's up to Chris
> :> and John.  If they think reusing an existing repository makes sense,
> :> that would be OK with me, but it's cheap and easy to set up a new
> :> one, too.
> :>
> :> My main concern is that we remove the road blocks, now that we have
> :> people interested in contributing to this documentation.
> :>
> :> >
> :> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker 
> wrote:
> :> >
> :> > > Hi Chris,
> :> > >
> :> > > thanks for summarizing our session today in Vancouver. As I18n PTL
> :> > > and one of the Docs Cores I put Petr in Cc. He is currently Docs
> :> > > PTL, but unfortunately not on-site.
> :> > > I also couldn't get the full history of the story, and the idea is
> :> > > not to start finger pointing. As usual we move forward, and there
> :> > > are some interesting things to know about what happened.
> :> > > First of all: there is no "Docs-Team" anymore. If you look at [1],
> :> > > there are mostly part-time contributors like me, or people who are
> :> > > more involved in other projects and therefore busy. Because of
> :> > > that, responsibility for documentation content has moved completely
> :> > > to the project teams. Each repo has a user guide, admin guide,
> :> > > deployment guide, and so on. The small Documentation Team only
> :> > > provides tooling and gives advice on how to write and publish a
> :> > > document. So it's up to you to re-use the old repo at [2] or set up
> :> > > a new one. I would recommend using the best of both worlds. There
> :> > > is a very good toolset in place for testing and publishing
> :> > > documents. There are also various text editors with rst extensions
> :> > > available, such as vim, notepad++, or online services. I understand
> :> > > the concerns, and when people are sad because their patches are
> :> > > ignored for months. But it's always a question of responsibility
> :> > > and how people can spend their time.
> :> > > I would be available to help. As I18n PTL I could imagine an
> :> > > OpenStack Operations Guide being available in different languages
> :> > > and portable in different formats, as with Sphinx. For us as the
> :> > > translation team it's a good
> 

Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Jonathan D. Proulx
On Thu, May 24, 2018 at 07:07:10AM -0700, Doug Hellmann wrote:

:I know you wanted to avoid lots of governance overhead, so I want
:to just mention that establishing a SIG is meant to be a painless
:and light-weight way to declare that a group of interested people
:exists so that others can find them and participate in the work
:[1]. It shouldn't take much effort to do the setup, and any ongoing
:communication is something you would presumably by doing anyway
:among a group of people trying to collaborate on a project like
:this.

Yeah I can see SIG as a useful structure too.  I'm just more familiar
with UC "teams" because of my personal history.

I do think SIG -vs- team would impact repo naming, and I'm still going
over the creation doc, so I'll let this simmer here at least until YVR lunch
time to see if there's consensus or controversy in the potential
contributor community.  Lacking either, I think I will default to
SIG-ops-docs.

Thanks,
-Jon

:
:Let me know if you have any questions or concerns about the process.
:
:Doug
:
:[1] https://governance.openstack.org/sigs/#process-to-create-a-sig
:
:> 
:> > of repositories under osops-
:> > 
:> > 
https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703
:> > 
:> > Generally active:
:> > osops-tools-contrib
:> > osops-tools-generic
:> > osops-tools-monitoring
:> > 
:> > 
:> > Probably dead:
:> > osops-tools-logging
:> > osops-coda
:> > osops-example-configs
:> > 
:> > Because you are more familiar with how things work, is there a way to
:> > consolidate these vs coming up with another repo like osops-docs or
:> > whatever in this case? And second, is there already governance clearance to
:> > publish based on the following - https://launchpad.net/osops - which is
:> > where these repos originated.
:> 
:> I don't really know what any of those things are, or whether it
:> makes sense to put this new content there. I assumed we would make
:> a repo with a name like "operations-guide", but that's up to Chris
:> and John.  If they think reusing an existing repository makes sense,
:> that would be OK with me, but it's cheap and easy to set up a new
:> one, too.
:> 
:> My main concern is that we remove the road blocks, now that we have
:> people interested in contributing to this documentation.
:> 
:> > 
:> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker  wrote:
:> > 
:> > > Hi Chris,
:> > >
:> > > thanks for summarizing our session today in Vancouver. As I18n PTL and
:> > > one of the Docs Cores I put Petr in Cc. He is currently Docs PTL, but
:> > > unfortunately not on-site.
:> > > I also couldn't get the full history of the story, and the idea is not
:> > > to start finger pointing. As usual we move forward, and there are some
:> > > interesting things to know about what happened.
:> > > First of all: there is no "Docs-Team" anymore. If you look at [1],
:> > > there are mostly part-time contributors like me, or people who are
:> > > more involved in other projects and therefore busy. Because of that,
:> > > responsibility for documentation content has moved completely to the
:> > > project teams. Each repo has a user guide, admin guide, deployment
:> > > guide, and so on. The small Documentation Team only provides tooling
:> > > and gives advice on how to write and publish a document. So it's up to
:> > > you to re-use the old repo at [2] or set up a new one. I would
:> > > recommend using the best of both worlds. There is a very good toolset
:> > > in place for testing and publishing documents. There are also various
:> > > text editors with rst extensions available, such as vim, notepad++, or
:> > > online services. I understand the concerns, and when people are sad
:> > > because their patches are ignored for months. But it's always a
:> > > question of responsibility and how people can spend their time.
:> > > I would be available to help. As I18n PTL I could imagine an OpenStack
:> > > Operations Guide being available in different languages and portable
:> > > in different formats, as with Sphinx. For us as the translation team
:> > > it's a good possibility to get feedback about the quality and to
:> > > understand the requirements, also for other documents.
:> > > So let's move on.
:> > >
:> > > kind regards
:> > >
:> > > Frank
:> > >
:> > > [1] https://review.openstack.org/#/admin/groups/30,members
:> > > [2] https://github.com/openstack/operations-guide
:> > >
:> > >
:> > > Am 2018-05-24 03:38, schrieb Chris Morgan:
:> > >
:> > >> Hello Everyone,
:> > >>
:> > >> In the Ops Community documentation working session today in Vancouver,
:> > >> we made some really good progress (etherpad here:
:> > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of
:> > >> the good stuff is yet written down).
:> > >>
:> > >> In short, we're going to course correct on maintaining the Operators
:> > >> Guide, the HA Guide and Architecture Guide, not edit-in-place via the
:> > >> wiki and instead try still 

Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Jonathan D. Proulx

My intention, based on my current understanding, would be to create a git
repo called "osops-docs", as this fits the current naming and the initial
document we intend to put there, plus the others we may adopt from the
docs team.

My understanding is that they don't want to keep this type of
documentation due to a much reduced team size, and prefer it live with
subject matter experts. Is that correct?  If not, I'm not personally
opposed to trying this under docs.  We'll need to maintain enough
contributors and reviewers to make the workflow go in either location,
and that, not where it lives, is my understanding of the basic issue.

This naming would also match the other repos, which could be consolidated
into one "osops" repo to rule them all.  That may make sense, as I think
there's significant overlap in the set of people who might contribute,
but that can be a parallel conversation.

Doug, looking at the new-project docs, I think most of it is clear enough
to me.  Since it's not code I can skip all the PyPI stuff, yes? The repo
creation seems pretty clear, and I can steal the CI stuff from similar
projects.  I'm a little unclear on the StoryBoard bit; I've not done much
contribution lately and haven't used StoryBoard.  Is that relevant (or at
least relevant at first) for this use case?  If it is, I probably have
more questions.

I agree governance can also be a parallel discussion.  I don't have
strong opinions there, but based on participants and content it seems
like a "UC" thing... < shrug />

-Jon



Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Doug Hellmann
Excerpts from Doug Hellmann's message of 2018-05-23 22:58:40 -0700:
> Excerpts from Melvin Hillsman's message of 2018-05-23 22:26:02 -0700:
> > Great to see this moving. I have some questions/concerns based on your
> > statement Doug about docs.openstack.org publishing and do not want to
> > detour the conversation but ask for feedback. Currently there are a number
> 
> I'm just unclear on that, but don't consider it a blocker. We will sort
> out whatever governance or policy change is needed to let this move
> forward.

When I talked with Petr about it, he pointed to the Security SIG
and Security Guide as a parallel precedent for this. IIRC, yesterday
Adam mentioned that the Self-Healing SIG was also going to be
managing some documentation, so we have two examples.

Looking at https://governance.openstack.org/sigs/, I don't see
another existing SIG that it would make sense to join, so, I think
to deal with the publishing rights we would want set up a SIG for
something like "Operator Documentation," which gives you some
flexibility on exactly what content is managed.

I know you wanted to avoid lots of governance overhead, so I want
to just mention that establishing a SIG is meant to be a painless
and light-weight way to declare that a group of interested people
exists so that others can find them and participate in the work
[1]. It shouldn't take much effort to do the setup, and any ongoing
communication is something you would presumably by doing anyway
among a group of people trying to collaborate on a project like
this.

Let me know if you have any questions or concerns about the process.

Doug

[1] https://governance.openstack.org/sigs/#process-to-create-a-sig

> 
> > of repositories under osops-
> > 
> > https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703
> > 
> > Generally active:
> > osops-tools-contrib
> > osops-tools-generic
> > osops-tools-monitoring
> > 
> > 
> > Probably dead:
> > osops-tools-logging
> > osops-coda
> > osops-example-configs
> > 
> > Because you are more familiar with how things work, is there a way to
> > consolidate these vs coming up with another repo like osops-docs or
> > whatever in this case? And second, is there already governance clearance to
> > publish based on the following - https://launchpad.net/osops - which is
> > where these repos originated.
> 
> I don't really know what any of those things are, or whether it
> makes sense to put this new content there. I assumed we would make
> a repo with a name like "operations-guide", but that's up to Chris
> and John.  If they think reusing an existing repository makes sense,
> that would be OK with me, but it's cheap and easy to set up a new
> one, too.
> 
> My main concern is that we remove the road blocks, now that we have
> people interested in contributing to this documentation.
> 
> > 
> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker  wrote:
> > 
> > > Hi Chris,
> > >
> > > thanks for summarizing our session today in Vancouver. As I18n PTL and
> > > one of the Docs Cores I put Petr in Cc. He is currently Docs PTL, but
> > > unfortunately not on-site.
> > > I also couldn't get the full history of the story, and the idea is not
> > > to start finger pointing. As usual we move forward, and there are some
> > > interesting things to know about what happened.
> > > First of all: there is no "Docs-Team" anymore. If you look at [1], there
> > > are mostly part-time contributors like me, or people who are more
> > > involved in other projects and therefore busy. Because of that,
> > > responsibility for documentation content has moved completely to the
> > > project teams. Each repo has a user guide, admin guide, deployment
> > > guide, and so on. The small Documentation Team only provides tooling and
> > > gives advice on how to write and publish a document. So it's up to you
> > > to re-use the old repo at [2] or set up a new one. I would recommend
> > > using the best of both worlds. There is a very good toolset in place for
> > > testing and publishing documents. There are also various text editors
> > > with rst extensions available, such as vim, notepad++, or online
> > > services. I understand the concerns, and when people are sad because
> > > their patches are ignored for months. But it's always a question of
> > > responsibility and how people can spend their time.
> > > I would be available to help. As I18n PTL I could imagine an OpenStack
> > > Operations Guide being available in different languages and portable in
> > > different formats, as with Sphinx. For us as the translation team it's a
> > > good possibility to get feedback about the quality and to understand the
> > > requirements, also for other documents.
> > > So let's move on.
> > >
> > > kind regards
> > >
> > > Frank
> > >
> > > [1] https://review.openstack.org/#/admin/groups/30,members
> > > [2] https://github.com/openstack/operations-guide
> > >
> > >
> > > On 2018-05-24 

Re: [Openstack-operators] attaching network cards to VMs taking a very long time

2018-05-24 Thread Saverio Proto
Glad to hear it!
Always monitor rabbitmq queues to identify bottlenecks !! :)

Cheers

Saverio
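As a sketch of the kind of check Saverio suggests, the snippet below flags RabbitMQ queues with a growing backlog. The queue names, the threshold, and the helper name are hypothetical; the input format matches the output of `rabbitmqctl list_queues name messages consumers`.

```python
def find_backlogged_queues(listing: str, threshold: int = 100):
    """Return (queue, depth) pairs whose message count exceeds threshold,
    deepest first. `listing` is rabbitmqctl list_queues output."""
    backlogged = []
    for line in listing.strip().splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[1].isdigit():
            continue  # skip the "Listing queues ..." header and malformed lines
        name, depth = parts[0], int(parts[1])
        if depth > threshold:
            backlogged.append((name, depth))
    return sorted(backlogged, key=lambda p: p[1], reverse=True)

# Hypothetical sample output from: rabbitmqctl list_queues name messages consumers
sample = """\
Listing queues ...
q-agent-notifier-port-update_fanout_abc 1542 1
notifications.info 7 2
compute.node-01 0 1
"""
print(find_backlogged_queues(sample))
```

A deep, consumer-starved queue (like the fanout queue above) is the usual sign that agents are not keeping up with the message rate.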

On Thu 24 May 2018 at 11:07, Radu Popescu | eMAG, Technology <
radu.pope...@emag.ro> wrote:

> Hi,
>
> did the change yesterday. Had no issue this morning with neutron not being
> able to move fast enough. Still, we had some storage issues, but that's
> another thing.
> Anyway, I'll leave it like this for the next few days and report back in
> case I get the same slow neutron errors.
>
> Thanks a lot!
> Radu
>
> On Wed, 2018-05-23 at 10:08 +, Radu Popescu | eMAG, Technology wrote:
>
> Hi,
>
> actually, I didn't know about that option. I'll enable it right now.
> Testing is done every morning at about 4:00AM ..so I'll know tomorrow
> morning if it changed anything.
>
> Thanks,
> Radu
>
> On Tue, 2018-05-22 at 15:30 +0200, Saverio Proto wrote:
>
> Sorry email went out incomplete.
>
> Read this:
>
> https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/
>
>
> make sure that OpenStack rootwrap is configured to work in daemon mode
>
>
> Thank you
>
>
> Saverio
>
>
>
> 2018-05-22 15:29 GMT+02:00 Saverio Proto:
>
> Hello Radu,
>
>
> do you have OpenStack rootwrap configured to work in daemon mode?
>
>
> please read this article:
>
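For reference, daemon mode looks roughly like this in the configuration files. This is a minimal sketch; the option names come from nova and neutron of that era, and the paths are examples to adjust for your deployment.

```ini
# nova.conf (compute nodes)
[DEFAULT]
use_rootwrap_daemon = True

# neutron.conf (network/agent nodes)
[agent]
root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
```

The daemon avoids forking a new sudo+python process for every privileged command, which is where much of the per-port setup time goes under load.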
>
> 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology:
>
> Hi,
>
> so, nova says the VM is ACTIVE and actually boots with no network. We are
> setting some metadata that we use later on and have cloud-init for different
> tasks.
> So, the VM is up and the OS is running, but the network only starts working
> after a random amount of time, which can reach around 45 minutes. It's not
> happening to all VMs in that test (around 300), but it's happening to a fair
> amount - around 25%.
>
> I can see the callback coming a few seconds after the neutron openvswitch
> agent says it has completed the setup. My question is, why is it taking so
> long for the neutron openvswitch agent to configure the port? I can see the
> port up in both the host OS and openvswitch. I would assume it's doing the
> whole namespace and iptables setup. But still, 30 minutes? Seems a lot!
>
> Thanks,
> Radu
>
> On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote:
>
> We have other scheduled tests that perform end-to-end (assign floating IP,
> ssh, ping outside) and never had an issue.
> I think we turned it off because the callback code was initially buggy and
> nova would wait forever while things were in fact ok, but I'll change
> "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run
> another large test, just to confirm.
>
> We usually run these large tests after a version upgrade to test the APIs
> under load.
>
> On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann wrote:
>
> On 5/17/2018 9:46 AM, George Mihaiescu wrote:
>
> and large rally tests of 500 instances complete with no issues.
>
> Sure, except you can't ssh into the guests.
>
> The whole reason the vif plugging fatal/timeout and callback code was added
> is that the upstream CI was unstable without it. The server would report as
> ACTIVE but the ports weren't wired up, so ssh would fail. Having an ACTIVE
> guest that you can't actually do anything with is kind of pointless.
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
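The two nova.conf options George quotes above would be set on the compute nodes; a minimal sketch, using the values from the thread:

```ini
# nova.conf (compute nodes)
[DEFAULT]
# Fail the boot (ERROR state) if neutron never sends network-vif-plugged.
vif_plugging_is_fatal = True
# Seconds to wait for the network-vif-plugged event before giving up.
vif_plugging_timeout = 300
```

With `vif_plugging_is_fatal = False`, a timeout instead lets the instance go ACTIVE with potentially unwired ports, which is exactly the failure mode Matt describes.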
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
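The wait-for-callback behaviour discussed in the thread above can be sketched as a toy (this is an illustration, not nova's actual code): the compute side blocks until neutron reports the port is wired, and on timeout either fails the boot or proceeds with a warning depending on the fatal setting.

```python
import threading

class VifPlugWaiter:
    """Toy model of waiting for neutron's network-vif-plugged notification."""

    def __init__(self, vif_plugging_is_fatal=True, vif_plugging_timeout=300):
        self._plugged = threading.Event()
        self.fatal = vif_plugging_is_fatal
        self.timeout = vif_plugging_timeout

    def network_vif_plugged(self):
        # Called when the neutron agent reports the port is wired up.
        self._plugged.set()

    def wait(self):
        # Report ACTIVE only once the port is really usable, or ERROR if the
        # event never arrives and plugging failures are fatal.
        if self._plugged.wait(self.timeout):
            return "ACTIVE"
        return "ERROR" if self.fatal else "ACTIVE (no network!)"

# Healthy case: the callback arrives well before the timeout.
waiter = VifPlugWaiter(vif_plugging_is_fatal=True, vif_plugging_timeout=1)
threading.Timer(0.1, waiter.network_vif_plugged).start()
print(waiter.wait())  # ACTIVE
```

The non-fatal branch is the "ACTIVE but unreachable" guest the thread warns about; the fatal branch surfaces the slow agent as a visible boot failure instead.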


Re: [Openstack-operators] attaching network cards to VMs taking a very long time

2018-05-24 Thread Radu Popescu | eMAG, Technology
Hi,

did the change yesterday. Had no issue this morning with neutron not being able 
to move fast enough. Still, we had some storage issues, but that's another 
thing.
Anyway, I'll leave it like this for the next few days and report back in case I 
get the same slow neutron errors.

Thanks a lot!
Radu


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Melvin Hillsman
Sure definitely, that's why I said I was not trying to detour the
conversation, but rather asking for feedback. Definitely agree things
should continue to plow forward and Chris has been doing an excellent job
here and I think it is awesome that he is continuing to push this.

On Wed, May 23, 2018 at 10:58 PM, Doug Hellmann 
wrote:

> Excerpts from Melvin Hillsman's message of 2018-05-23 22:26:02 -0700:
> > Great to see this moving. I have some questions/concerns based on your
> > statement Doug about docs.openstack.org publishing and do not want to
> > detour the conversation but ask for feedback. Currently there are a
> number
>
> I'm just unclear on that, but don't consider it a blocker. We will sort
> out whatever governance or policy change is needed to let this move
> forward.
>
> > of repositories under osops-
> >
> > https://github.com/openstack-infra/project-config/blob/
> master/gerrit/projects.yaml#L5673-L5703
> >
> > Generally active:
> > osops-tools-contrib
> > osops-tools-generic
> > osops-tools-monitoring
> >
> >
> > Probably dead:
> > osops-tools-logging
> > osops-coda
> > osops-example-configs
> >
> > Because you are more familiar with how things work, is there a way to
> > consolidate these vs coming up with another repo like osops-docs or
> > whatever in this case? And second, is there already governance clearance
> to
> > publish based on the following - https://launchpad.net/osops - which is
> > where these repos originated.
>
> I don't really know what any of those things are, or whether it
> makes sense to put this new content there. I assumed we would make
> a repo with a name like "operations-guide", but that's up to Chris
> and John.  If they think reusing an existing repository makes sense,
> that would be OK with me, but it's cheap and easy to set up a new
> one, too.
>
> My main concern is that we remove the road blocks, now that we have
> people interested in contributing to this documentation.
>
> >
> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker  wrote:
> >
> > > Hi Chris,
> > >
> > > thanks for summarizing our session today in Vancouver. As I18n PTL and
> > > one of the Docs Cores I put Petr in Cc. He is currently Docs PTL, but
> > > unfortunately not on-site.
> > > I also couldn't get the full history of the story, and the idea is not to
> > > start finger pointing. As usual we move forward, and there are some
> > > interesting things to know about what happened.
> > > First of all: there is no "Docs Team" anymore. If you look at [1], it is
> > > mostly part-time contributors like me, or people who are more involved in
> > > other projects and therefore busy. Because of that, the responsibility for
> > > documentation content has moved completely to the project teams. Each repo
> > > has a user guide, admin guide, deployment guide, and so on. The small
> > > Documentation Team provides only tooling and gives advice on how to write
> > > and publish a document. So it's up to you to re-use the old repo at [2] or
> > > set up a new one. I would recommend using the best of both worlds. There
> > > is a very good toolset in place for testing and publishing documents.
> > > There are also various text editors with rst extensions available, such as
> > > vim, notepad++ or online services. I understand the concerns, and why
> > > people are sad when their patches are ignored for months. But it's always
> > > a question of responsibility and of how people can spend their time.
> > > I would be available to help. As I18n PTL I could imagine an OpenStack
> > > Operations Guide being available in different languages and portable to
> > > different formats, as Sphinx allows. For us as the translation team it's
> > > a good opportunity to get feedback about the quality and to understand the
> > > requirements, also for other documents.
> > > So let's move on.
> > >
> > > kind regards
> > >
> > > Frank
> > >
> > > [1] https://review.openstack.org/#/admin/groups/30,members
> > > [2] https://github.com/openstack/operations-guide
> > >
> > >
> > > On 2018-05-24 03:38, Chris Morgan wrote:
> > >
> > >> Hello Everyone,
> > >>
> > >> In the Ops Community documentation working session today in Vancouver,
> > >> we made some really good progress (etherpad here:
> > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all
> of
> > >> the good stuff is yet written down).
> > >>
> > >> In short, we're going to course correct on maintaining the Operators
> > >> Guide, the HA Guide and the Architecture Guide: rather than editing in
> > >> place via the wiki, we will keep maintaining them as code, but with a
> > >> different, new set of owners, possibly in a new Ops-focused repo.
> > >> There was a strong consensus that a) code workflow >> wiki workflow
> > >> and that b) openstack core docs tools are just fine.
> > >>
> > >> There is a lot still to be decided on how where and when, but we do
> > >> have an offer of a rewrite 

Re: [Openstack-operators] Ops Community Documentation - first anchor point

2018-05-24 Thread Doug Hellmann
Excerpts from Melvin Hillsman's message of 2018-05-23 22:26:02 -0700:
> Great to see this moving. I have some questions/concerns based on your
> statement Doug about docs.openstack.org publishing and do not want to
> detour the conversation but ask for feedback. Currently there are a number

I'm just unclear on that, but don't consider it a blocker. We will sort
out whatever governance or policy change is needed to let this move
forward.

> of repositories under osops-
> 
> https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703
> 
> Generally active:
> osops-tools-contrib
> osops-tools-generic
> osops-tools-monitoring
> 
> 
> Probably dead:
> osops-tools-logging
> osops-coda
> osops-example-configs
> 
> Because you are more familiar with how things work, is there a way to
> consolidate these vs coming up with another repo like osops-docs or
> whatever in this case? And second, is there already governance clearance to
> publish based on the following - https://launchpad.net/osops - which is
> where these repos originated.

I don't really know what any of those things are, or whether it
makes sense to put this new content there. I assumed we would make
a repo with a name like "operations-guide", but that's up to Chris
and John.  If they think reusing an existing repository makes sense,
that would be OK with me, but it's cheap and easy to set up a new
one, too.

My main concern is that we remove the road blocks, now that we have
people interested in contributing to this documentation.

> 
> On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker  wrote:
> 
> > Hi Chris,
> >
> > thanks for summarizing our session today in Vancouver. As I18n PTL and one
> > of the Docs Cores I put Petr in Cc. He is currently Docs PTL, but
> > unfortunately not on-site.
> > I also couldn't get the full history of the story, and the idea is not to
> > start finger pointing. As usual we move forward, and there are some
> > interesting things to know about what happened.
> > First of all: there is no "Docs Team" anymore. If you look at [1], it is
> > mostly part-time contributors like me, or people who are more involved in
> > other projects and therefore busy. Because of that, the responsibility for
> > documentation content has moved completely to the project teams. Each repo
> > has a user guide, admin guide, deployment guide, and so on. The small
> > Documentation Team provides only tooling and gives advice on how to write
> > and publish a document. So it's up to you to re-use the old repo at [2] or
> > set up a new one. I would recommend using the best of both worlds. There
> > is a very good toolset in place for testing and publishing documents.
> > There are also various text editors with rst extensions available, such as
> > vim, notepad++ or online services. I understand the concerns, and why
> > people are sad when their patches are ignored for months. But it's always
> > a question of responsibility and of how people can spend their time.
> > I would be available to help. As I18n PTL I could imagine an OpenStack
> > Operations Guide being available in different languages and portable to
> > different formats, as Sphinx allows. For us as the translation team it's
> > a good opportunity to get feedback about the quality and to understand the
> > requirements, also for other documents.
> > So let's move on.
> >
> > kind regards
> >
> > Frank
> >
> > [1] https://review.openstack.org/#/admin/groups/30,members
> > [2] https://github.com/openstack/operations-guide
> >
> >
> > On 2018-05-24 03:38, Chris Morgan wrote:
> >
> >> Hello Everyone,
> >>
> >> In the Ops Community documentation working session today in Vancouver,
> >> we made some really good progress (etherpad here:
> >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of
> >> the good stuff is yet written down).
> >>
> >> In short, we're going to course correct on maintaining the Operators
> >> Guide, the HA Guide and the Architecture Guide: rather than editing in
> >> place via the wiki, we will keep maintaining them as code, but with a
> >> different, new set of owners, possibly in a new Ops-focused repo.
> >> There was a strong consensus that a) code workflow >> wiki workflow
> >> and that b) openstack core docs tools are just fine.
> >>
> >> There is a lot still to be decided on how where and when, but we do
> >> have an offer of a rewrite of the HA Guide, as long as the changes
> >> will be allowed to actually land, so we expect to actually start
> >> showing some progress.
> >>
> >> At the end of the session, people wanted to know how to follow along
> >> as various people work out how to do this... and so for now that place
> >> is this very email thread. The idea is that if the code for those
> >> documents moves to a different repo, or if new contributors turn up, or
> >> if a new version appears, we will announce/discuss it here until such
> >> time as we have a better home for this initiative.
> >>
> >>