Re: [openstack-dev] [octavia] enabling new topologies

2016-06-13 Thread Stephen Balukoff
Hey Sergey,

In-line comments below:

On Sun, Jun 5, 2016 at 8:07 AM, Sergey Guenender  wrote:

>
> Hi Stephen, please find my reply next to your points below.
>
> Thank you,
> -Sergey.
>
>
> On 01/06/2016 20:23, Stephen Balukoff wrote:
> > Hey Sergey--
> >
> > Apologies for the delay in my response. I'm still wrapping my head
> > around your option 2 suggestion and the implications it might have for
> > the code base moving forward. I think, though, that I'm against your
> > option 2 proposal and in favor of option 1 (which, yes, is more work
> > initially) for the following reasons:
> >
> > A. We have a precedent in the code tree with how the stand-alone and
> > active-standby topologies are currently being handled. Yes, this does
> > entail various conditionals and branches in tasks and flows-- which is
> > not really that ideal, as it means the controller worker needs to have
> > more specific information on how topologies work than I think any of us
> > would like, and this adds some rigidity to the implementation (meaning
> > 3rd party vendors may have more trouble interfacing at that level)...
> > but it's actually "not that bad" in many ways, especially given we don't
> > anticipate supporting a large or variable number of topologies.
> > (stand-alone, active-standby, active-active... and then what? We've been
> > doing this for a number of years and nobody has mentioned any radically
> > new topologies they would like in their load balancing. Things like
> > auto-scale are just a specific case of active-active).
>
> Just as you say, two topologies are being handled as of now by only one
> set of flows. Option two goes along the same lines: instead of adding new
> flows for active-active, it suggests that minor adjustments to existing
> flows can also satisfy active-active.
>

My point was that I think the distributor and amphora roles are different
enough that they ought to have separate drivers, separate flows, etc.
almost entirely. There's not much difference between a stand-alone amphora
and an amphora in an active-standby topology. However, there's a huge
difference between both of these and a distributor (which will have its own
back-end API, for example).
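
To make that concrete, here is a rough, purely hypothetical sketch of the
kind of separate driver surface I have in mind for the distributor. None of
these class or method names are actual Octavia code; they only illustrate
that a distributor driver would expose different operations than the
amphora (haproxy) driver does:

# Illustrative sketch only -- DistributorBaseDriver and its methods are
# made-up names, not part of Octavia.
import abc


class DistributorBaseDriver(abc.ABC):

    @abc.abstractmethod
    def create_distributor(self, load_balancer):
        """Provision whatever implements the distribution function.

        Could boot a service VM, or be a near no-op for a pure
        network routing implementation.
        """

    @abc.abstractmethod
    def register_amphorae(self, distributor, amphorae):
        """Tell the distributor which back-facing amphorae receive flows."""

    @abc.abstractmethod
    def unregister_amphorae(self, distributor, amphorae):
        """Remove amphorae from the distribution set."""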


>
> > B. If anything, Option 2 builds more (and less obvious) rigidity into the
> > implementation than option 1. For example, it makes the assumption that
> > the distributor is necessarily an amphora or service VM, whereas we have
> > already heard that some will implement the distributor as a pure network
> > routing function that isn't going to be managed the same way other
> > amphorae are.
>
> This is a good point. Looking at the code, I see there are comments
> mentioning the intent to share amphorae between several load balancers.
> Although probably not straightforward to implement, it might be a good idea
> one day; the fact is, though, that amphorae have not been shared between
> load balancers for a few years now.
>
> Personally, when developing something complex, I believe in taking baby
> steps. If the virtual, non-shared distributor (which is promised by the AA
> blueprint anyway) is the smallest step towards a working active-active,
> then I guess it should be the first step taken.
>

The AA blueprint has yet to be approved (and there appear to be a *lot* of
comments on the latest revision). But yes-- in general you need to walk
before you can run. But instead of torturing analogies, let me say this:
Assumptions about design are reflected in the code. So, I generally like to
do my best getting the design right... and then any baby steps taken should
be evaluated against that end design to ensure they don't introduce
assumptions that will make it difficult to get there.



>
> Unless, of course, it precludes implementing subsequent, more complex
> topologies.
>
> My belief is it doesn't have to. The proposed change alone (splitting
> amphorae into sub-clusters to be used by the many for-loops) doesn't force
> any special direction on its own. Any future topology may leave its
> "front-facing amphorae" set equal to its "back-facing amphorae" set, which
> brings it back to the current style of for-loop handling.
>

See, I disagree that an amphora and a distributor are even really similar.
The idea that a distributor is just a front-facing amphora is, I think,
fundamentally false. Especially if distributors are implemented with a
direct-return topology (as the blueprint under evaluation describes),
they're almost nothing alike. The distributor service VM will not be
running haproxy and will be running its own unique API, specifically because
it fulfills a vastly different role in the topology than the amphorae do.


>
> > C. Option 2 seems like it's going to have a lot more permutations that
> > would need testing to ensure that code changes don't break existing /
> > potentially supported functionality. Option 1 keeps the distributor and
> > amphorae management code separate, which means tests should be more
> > straight-forward, and any breaking changes which slip through potentially
> > break less stuff. Make sense?

Re: [openstack-dev] [octavia] enabling new topologies

2016-06-05 Thread Sergey Guenender

> Hi Sergey,  Welcome to working on Octavia!

Thanks, glad to be joining! :^)
Please read a further explanation of my proposal down below.

> I'm not sure I fully understand your proposals, but I can give my
> thoughts/opinion on the challenge for Active/Active.
>
> In general I agree with Stephen.
>
> The intention of using TaskFlow is to facilitate code reuse across
> similar but different code flows.
>
> For an Active/Active provisioning request I envision it as a new flow
> that is loaded as opposed to the current standalone and Active/Standby
> flow.  I would expect it would include many existing tasks (for example,
> plug_network) that may be required for the requested action.  This new
> flow will likely include a number of concurrent sub-flows using these
> existing tasks.
>
> I do expect that the "distributor" will need to be a new "element".
> Because the various stakeholders are considering implementing this
> function in different ways, we agreed that an API and driver would be
> developed for interactions with the distributor.  This should also
> take into account that there may be some deployments where
> distributors are not shared.

I too expect a new model element to represent the distributor, including
its own API.


A virtual distributor does seem to share some behavior with an amphora.

For instance, consider the "create load balancer" flow:
 * get_create_load_balancer_flow gets-or-creates a few nova instances,
   waits till they boot and marks them in the DB
 * get_new_LB_networking_subflow allocates and plugs the VIP on both the
   Neutron and amphora sides; security group handling included
 * when needed, get_vrrp_subflow creates a VRRP group on the LB and
   configures/starts it on the amphorae
 * the amphorae get connected to the members' networks
 * if listeners are defined on the LB:
   * haproxy services get configured and started, including the "peers"
     configuration
   * VIP network connections get the Neutron security groups' blessing

All parts of this flow seem to apply to the active-active topology too.

My intent is to try and reuse most of this rather involved flow by
treating distributors as members of both the "front-facing" and the
"VRRP-running" amphora sets, while the original amphorae would be
treated as both "back-facing" (for haproxy configuration, member-network
plugging, etc.) and "front-facing" (for VIP network plugging/processing).
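
To make this a bit more concrete, here is a rough sketch of how an existing
per-amphora loop could be pointed at a semantic subset instead of the whole
cluster. The property names (front_facing_amphorae, back_facing_amphorae)
and the helper callables are placeholders I made up for illustration, not
the actual Octavia task or driver signatures:

# Illustrative only -- property and helper names below are hypothetical.

def plug_vip_network(load_balancer, plug_one_amp):
    # Today a loop like this runs over load_balancer.amphorae; with the
    # proposed split it runs only over the front-facing subset, which for
    # stand-alone/active-standby is simply all of the amphorae.
    for amp in load_balancer.front_facing_amphorae:
        plug_one_amp(amp)


def configure_haproxy(load_balancer, configure_one_amp):
    # haproxy (including the "peers" section) only matters for the
    # back-facing subset; distributors would never appear in this set.
    for amp in load_balancer.back_facing_amphorae:
        configure_one_amp(amp)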


If this leads to changing a lot of existing code or changing it 
non-trivially, I'll drop this idea, as my hope is to have less code 
review, not more.


> I still need to review the latest version of the Act/Act spec to
> understand where that was left after my first round of comments and
> our mid-cycle discussions.
>
> Michael

Thanks,
-Sergey.




Re: [openstack-dev] [octavia] enabling new topologies

2016-06-05 Thread Sergey Guenender


Hi Stephen, please find my reply next to your points below.

Thank you,
-Sergey.


On 01/06/2016 20:23, Stephen Balukoff wrote:
> Hey Sergey--
>
> Apologies for the delay in my response. I'm still wrapping my head
> around your option 2 suggestion and the implications it might have for
> the code base moving forward. I think, though, that I'm against your
> option 2 proposal and in favor of option 1 (which, yes, is more work
> initially) for the following reasons:
>
> A. We have a precedent in the code tree with how the stand-alone and
> active-standby topologies are currently being handled. Yes, this does
> entail various conditionals and branches in tasks and flows-- which is
> not really that ideal, as it means the controller worker needs to have
> more specific information on how topologies work than I think any of us
> would like, and this adds some rigidity to the implementation (meaning
> 3rd party vendors may have more trouble interfacing at that level)...
> but it's actually "not that bad" in many ways, especially given we don't
> anticipate supporting a large or variable number of topologies.
> (stand-alone, active-standby, active-active... and then what? We've been
> doing this for a number of years and nobody has mentioned any radically
> new topologies they would like in their load balancing. Things like
> auto-scale are just a specific case of active-active).

Just as you say, two topologies are being handled as of now by only one
set of flows. Option two goes along the same lines: instead of adding
new flows for active-active, it suggests that minor adjustments to
existing flows can also satisfy active-active.


> B. If anything, Option 2 builds more (and less obvious) rigidity into the
> implementation than option 1. For example, it makes the assumption that
> the distributor is necessarily an amphora or service VM, whereas we have
> already heard that some will implement the distributor as a pure network
> routing function that isn't going to be managed the same way other
> amphorae are.

This is a good point. Looking at the code, I see there are comments
mentioning the intent to share amphorae between several load balancers.
Although probably not straightforward to implement, it might be a good
idea one day; the fact is, though, that amphorae have not been shared
between load balancers for a few years now.


Personally, when developing something complex, I believe in taking baby
steps. If the virtual, non-shared distributor (which is promised by the
AA blueprint anyway) is the smallest step towards a working
active-active, then I guess it should be the first step taken.


Unless, of course, it precludes implementing subsequent, more complex
topologies.


My belief is it doesn't have to. The proposed change alone (splitting
amphorae into sub-clusters to be used by the many for-loops) doesn't
force any special direction on its own. Any future topology may leave
its "front-facing amphorae" set equal to its "back-facing amphorae" set,
which brings it back to the current style of for-loop handling.


> C. Option 2 seems like it's going to have a lot more permutations that
> would need testing to ensure that code changes don't break existing /
> potentially supported functionality. Option 1 keeps the distributor and
> amphorae management code separate, which means tests should be more
> straight-forward, and any breaking changes which slip through
> potentially break less stuff. Make sense?

It certainly does.

My intent is that the simplest active-active implementation promised by
the blueprint can be achieved with only minor changes to existing code.
If the required changes turn out not to be small, or if this simplistic
approach in some way impedes future work, we can drop this option.



> Stephen
>
>
> On Sun, May 29, 2016 at 7:12 AM, Sergey Guenender wrote:
>
> I'm working with the IBM team implementing the Active-Active N+1
> topology [1].
>
> I've been commissioned with the task to help integrate the code
> supporting the new topology while a) making as few code changes and
> b) reusing as much code as possible.
>
> To make sure the changes to existing code are future-proof, I'd like
> to implement them outside AA N+1, submit them on their own and let
> the AA N+1 base itself on top of it.
>
> --TL;DR--
>
> what follows is a description of the challenges I'm facing and the
> way I propose to solve them. Please skip down to the end of the
> email to see the actual questions.
>
> --The details--
>
> I've been studying the code for a few weeks now to see where the
> best places for minimal changes might be.
>
> Currently I see two options:
>
> 1. introduce a new kind of entity (the distributor) and make
> sure it's being handled on any of the 6 levels of controller worker
> code (endpoint, controller worker, *_flows, *_tasks, *_driver)
>
> 2. leave most of the code layers intact by building on the fact that
> distributor will inherit most of the controller worker logic of amphora

[openstack-dev] [octavia] enabling new topologies

2016-06-02 Thread Sergey Guenender
Stephen, Michael, thank you for having a look.

I'll respond to every issue you mentioned when I get to work on Sunday.

Until then, in case you don't mind inspecting a small diff, just to
clarify my point, please have a look at a rather straightforward change,
which
1. exemplifies pretty much all I'm currently proposing (just splitting
amphorae into semantic sub-clusters to facilitate code reuse)
2. should, I hope, provide everything needed (and thus a frictionless
review) for the virtual non-shared distributor of the active-active topology
3. is quite transparent to other topologies, including a future
active-active shared, hardware, or what-have-you, because it's fully
compliant with existing code

https://github.com/sgserg/octavia/commit/030e786ce4966bbf24e73c00364f167596aef004

Needless to say, I wouldn't expect anything like this to be merged until
we see an end-to-end working (virtual, private-distributor) AA N+1 create-lb
proof of concept (one that doesn't break the existing topologies).

I'm not married to this idea; it's just something I came up with after
spending a few weeks in front of the code, trying to imagine how the simplest
active-active use-case would go about performing the same tasks (VRRP,
VIP plugging, etc.).

-Sergey.



Re: [openstack-dev] [octavia] enabling new topologies

2016-06-02 Thread Michael Johnson
Hi Sergey,  Welcome to working on Octavia!

I'm not sure I fully understand your proposals, but I can give my
thoughts/opinion on the challenge for Active/Active.

In general I agree with Stephen.

The intention of using TaskFlow is to facilitate code reuse across
similar but different code flows.

For an Active/Active provisioning request I envision it as a new flow
that is loaded as opposed to the current standalone and Active/Standby
flow.  I would expect it would include many existing tasks (for example,
plug_network) that may be required for the requested action.  This new
flow will likely include a number of concurrent sub-flows using these
existing tasks.
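
Purely as an illustration of what I mean (the task class and flow names
here are made up, not Octavia's real flows), such a flow could be composed
along these lines with TaskFlow:

# Rough sketch only: PlugNetwork and the flow names are placeholders.
from taskflow import task
from taskflow.patterns import linear_flow, unordered_flow


class PlugNetwork(task.Task):
    """Stand-in for an existing per-amphora task such as plug_network."""

    def execute(self, amphora_id):
        print("plugging networks for %s" % amphora_id)


def get_active_active_create_flow(amphora_ids, distributor_id):
    create_flow = linear_flow.Flow('create-lb-active-active')

    # Reuse existing per-amphora tasks inside a concurrent sub-flow.
    amp_subflow = unordered_flow.Flow('plug-amphorae')
    for amp_id in amphora_ids:
        amp_subflow.add(PlugNetwork('plug-%s' % amp_id,
                                    inject={'amphora_id': amp_id}))
    create_flow.add(amp_subflow)

    # The distributor would get its own step, driven through its own driver.
    create_flow.add(PlugNetwork('plug-%s' % distributor_id,
                                inject={'amphora_id': distributor_id}))
    return create_flow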

I do expect that the "distributor" will need to be a new "element".
Because the various stakeholders are considering implementing this
function in different ways, we agreed that an API and driver would be
developed for interactions with the distributor.  This should also
take into account that there may be some deployments where
distributors are not shared.

I still need to review the latest version of the Act/Act spec to
understand where that was left after my first round of comments and
our mid-cycle discussions.

Michael


On Wed, Jun 1, 2016 at 10:23 AM, Stephen Balukoff  wrote:
> Hey Sergey--
>
> Apologies for the delay in my response. I'm still wrapping my head around
> your option 2 suggestion and the implications it might have for the code
> base moving forward. I think, though, that I'm against your option 2
> proposal and in favor of option 1 (which, yes, is more work initially) for
> the following reasons:
>
> A. We have a precedent in the code tree with how the stand-alone and
> active-standby topologies are currently being handled. Yes, this does entail
> various conditionals and branches in tasks and flows-- which is not really
> that ideal, as it means the controller worker needs to have more specific
> information on how topologies work than I think any of us would like, and
> this adds some rigidity to the implementation (meaning 3rd party vendors may
> have more trouble interfacing at that level)...  but it's actually "not that
> bad" in many ways, especially given we don't anticipate supporting a large
> or variable number of topologies. (stand-alone, active-standby,
> active-active... and then what? We've been doing this for a number of years
> and nobody has mentioned any radically new topologies they would like in
> their load balancing. Things like auto-scale are just a specific case of
> active-active).
>
> B. If anything, Option 2 builds more (and less obvious) rigidity into the
> implementation than option 1. For example, it makes the assumption that the
> distributor is necessarily an amphora or service VM, whereas we have already
> heard that some will implement the distributor as a pure network routing
> function that isn't going to be managed the same way other amphorae are.
>
> C. Option 2 seems like it's going to have a lot more permutations that would
> need testing to ensure that code changes don't break existing / potentially
> supported functionality. Option 1 keeps the distributor and amphorae
> management code separate, which means tests should be more straight-forward,
> and any breaking changes which slip through potentially break less stuff.
> Make sense?
>
> Stephen
>
>
> On Sun, May 29, 2016 at 7:12 AM, Sergey Guenender  wrote:
>>
>> I'm working with the IBM team implementing the Active-Active N+1 topology
>> [1].
>>
>> I've been commissioned with the task to help integrate the code supporting
>> the new topology while a) making as few code changes and b) reusing as much
>> code as possible.
>>
>> To make sure the changes to existing code are future-proof, I'd like to
>> implement them outside AA N+1, submit them on their own and let the AA N+1
>> base itself on top of it.
>>
>> --TL;DR--
>>
>> what follows is a description of the challenges I'm facing and the way I
>> propose to solve them. Please skip down to the end of the email to see the
>> actual questions.
>>
>> --The details--
>>
>> I've been studying the code for a few weeks now to see where the best
>> places for minimal changes might be.
>>
>> Currently I see two options:
>>
>>1. introduce a new kind of entity (the distributor) and make sure it's
>> being handled on any of the 6 levels of controller worker code (endpoint,
>> controller worker, *_flows, *_tasks, *_driver)
>>
>>2. leave most of the code layers intact by building on the fact that
>> distributor will inherit most of the controller worker logic of amphora
>>
>>
>> In Active-Active topology, very much like in Active/StandBy:
>> * top level of distributors will have to run VRRP
>> * the distributors will have a Neutron port made on the VIP network
>> * the distributors' neutron ports on VIP network will need the same
>> security groups
>> * the amphorae facing the pool member networks still require:
>>   * ports on the pool member networks
>>   * "peers" HAProxy configuration for real-time state exchange
>>   * VIP network connections with the right security groups

Re: [openstack-dev] [octavia] enabling new topologies

2016-06-01 Thread Stephen Balukoff
Hey Sergey--

Apologies for the delay in my response. I'm still wrapping my head around
your option 2 suggestion and the implications it might have for the code
base moving forward. I think, though, that I'm against your option 2
proposal and in favor of option 1 (which, yes, is more work initially) for
the following reasons:

A. We have a precedent in the code tree with how the stand-alone and
active-standby topologies are currently being handled. Yes, this does
entail various conditionals and branches in tasks and flows-- which is not
really that ideal, as it means the controller worker needs to have more
specific information on how topologies work than I think any of us would
like, and this adds some rigidity to the implementation (meaning 3rd party
vendors may have more trouble interfacing at that level)...  but it's
actually "not that bad" in many ways, especially given we don't anticipate
supporting a large or variable number of topologies. (stand-alone,
active-standby, active-active... and then what? We've been doing this for a
number of years and nobody has mentioned any radically new topologies they
would like in their load balancing. Things like auto-scale are just a
specific case of active-active).

B. If anything, Option 2 builds more (and less obvious) rigidity into the
implementation than option 1. For example, it makes the assumption that the
distributor is necessarily an amphora or service VM, whereas we have
already heard that some will implement the distributor as a pure network
routing function that isn't going to be managed the same way other amphorae
are.

C. Option 2 seems like it's going to have a lot more permutations that
would need testing to ensure that code changes don't break existing /
potentially supported functionality. Option 1 keeps the distributor and
amphorae management code separate, which means tests should be more
straight-forward, and any breaking changes which slip through potentially
break less stuff. Make sense?

Stephen


On Sun, May 29, 2016 at 7:12 AM, Sergey Guenender  wrote:

> I'm working with the IBM team implementing the Active-Active N+1 topology
> [1].
>
> I've been commissioned with the task to help integrate the code supporting
> the new topology while a) making as few code changes and b) reusing as much
> code as possible.
>
> To make sure the changes to existing code are future-proof, I'd like to
> implement them outside AA N+1, submit them on their own and let the AA N+1
> base itself on top of it.
>
> --TL;DR--
>
> what follows is a description of the challenges I'm facing and the way I
> propose to solve them. Please skip down to the end of the email to see the
> actual questions.
>
> --The details--
>
> I've been studying the code for a few weeks now to see where the best
> places for minimal changes might be.
>
> Currently I see two options:
>
>1. introduce a new kind of entity (the distributor) and make sure it's
> being handled on any of the 6 levels of controller worker code (endpoint,
> controller worker, *_flows, *_tasks, *_driver)
>
>2. leave most of the code layers intact by building on the fact that
> distributor will inherit most of the controller worker logic of amphora
>
>
> In Active-Active topology, very much like in Active/StandBy:
> * top level of distributors will have to run VRRP
> * the distributors will have a Neutron port made on the VIP network
> * the distributors' neutron ports on VIP network will need the same
> security groups
> * the amphorae facing the pool member networks still require:
>   * ports on the pool member networks
>   * "peers" HAProxy configuration for real-time state exchange
>   * VIP network connections with the right security groups
>
> The fact that existing topologies lack the notion of a distributor, plus an
> inspection of the 30-or-so existing references to amphora clusters, swayed me
> towards the second option.
>
> The easiest way to make use of existing code seems to be by splitting
> load-balancer's amphorae into three overlapping sets:
> 1. The front-facing - those connected to the VIP network
> 2. The back-facing - subset of front-facing amphorae, also connected to
> the pool members' networks
> 3. The VRRP-running - subset of front-facing amphorae, making sure the VIP
> routing remains highly available
>
> At the code-changes level:
> * the three sets can be simply added as properties of
>   common.data_model.LoadBalancer
> * the existing amphorae cluster references would switch to using one of
>   these properties, for example:
>   * the VRRP sub-flow would loop over only the VRRP amphorae
>   * the network driver, when plugging the VIP, would loop over the
>     front-facing amphorae
>   * when connecting to the pool members' networks,
>     network_tasks.CalculateDelta would only loop over the back-facing amphorae
>
>
> In terms of backwards compatibility, Active-StandBy topology would have
> the 3 sets equal and contain both of its amphorae.
>
> An even more future-proof approach might be to implement the sets-getters
> as selector methods, supporting operation on subsets of each kind of
> amphorae.

[openstack-dev] [octavia] enabling new topologies

2016-05-29 Thread Sergey Guenender
I'm working with the IBM team implementing the Active-Active N+1 topology 
[1].

I've been commissioned with the task to help integrate the code supporting 
the new topology while a) making as few code changes and b) reusing as 
much code as possible.

To make sure the changes to existing code are future-proof, I'd like to 
implement them outside AA N+1, submit them on their own and let the AA N+1 
base itself on top of it.

--TL;DR--

what follows is a description of the challenges I'm facing and the way I 
propose to solve them. Please skip down to the end of the email to see the 
actual questions.

--The details--

I've been studying the code for a few weeks now to see where the best 
places for minimal changes might be.

Currently I see two options:

   1. introduce a new kind of entity (the distributor) and make sure it's 
being handled on any of the 6 levels of controller worker code (endpoint, 
controller worker, *_flows, *_tasks, *_driver)

   2. leave most of the code layers intact by building on the fact that 
distributor will inherit most of the controller worker logic of amphora


In Active-Active topology, very much like in Active/StandBy:
* top level of distributors will have to run VRRP
* the distributors will have a Neutron port made on the VIP network
* the distributors' neutron ports on VIP network will need the same 
security groups
* the amphorae facing the pool member networks still require:
  * ports on the pool member networks
  * "peers" HAProxy configuration for real-time state exchange
  * VIP network connections with the right security groups

The fact that existing topologies lack the notion of a distributor, plus an
inspection of the 30-or-so existing references to amphora clusters, swayed
me towards the second option.

The easiest way to make use of existing code seems to be by splitting 
load-balancer's amphorae into three overlapping sets:
1. The front-facing - those connected to the VIP network
2. The back-facing - subset of front-facing amphorae, also connected to 
the pool members' networks
3. The VRRP-running - subset of front-facing amphorae, making sure the VIP 
routing remains highly available

At the code-changes level (a rough sketch follows this list):
* the three sets can be simply added as properties of
  common.data_model.LoadBalancer
* the existing amphorae cluster references would switch to using one of
  these properties, for example:
  * the VRRP sub-flow would loop over only the VRRP amphorae
  * the network driver, when plugging the VIP, would loop over the
    front-facing amphorae
  * when connecting to the pool members' networks,
    network_tasks.CalculateDelta would only loop over the back-facing amphorae
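
As a rough sketch only (the property names and the per-amphora role flags
below are assumptions of mine for illustration, not the actual data model
code), the properties could look something like this, which also preserves
the backwards-compatibility property mentioned next:

# Hypothetical sketch of the proposed subset properties on the LoadBalancer
# data model; the "role" attributes on the amphorae are made up here, and
# default to True so that existing topologies see all three sets equal to
# the full amphora list.

class LoadBalancer(object):

    def __init__(self, amphorae=None):
        self.amphorae = amphorae or []

    @property
    def front_facing_amphorae(self):
        # Everything plugged into the VIP network: distributors plus, in
        # existing topologies, all regular amphorae.
        return [a for a in self.amphorae if getattr(a, 'front_facing', True)]

    @property
    def back_facing_amphorae(self):
        # Amphorae that run haproxy and plug into pool member networks.
        return [a for a in self.amphorae if getattr(a, 'back_facing', True)]

    @property
    def vrrp_amphorae(self):
        # The subset keeping the VIP routing highly available via VRRP.
        return [a for a in self.amphorae if getattr(a, 'runs_vrrp', True)]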

In terms of backwards compatibility, Active-StandBy topology would have 
the 3 sets equal and contain both of its amphorae.

An even more future-proof approach might be to implement the sets-getters
as selector methods, supporting operation on subsets of each kind of
amphorae. For instance, when growing/shrinking the back-facing amphora
cluster, only the added/removed amphorae would need to be processed.
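
A minimal sketch of that selector idea (again with made-up names, building
on the hypothetical properties above):

# Same subset as the plain property, but callers may pass a predicate so
# that, e.g., a "grow" operation only touches the amphorae it just added.

def select_back_facing(load_balancer, selector=None):
    amps = [a for a in load_balancer.amphorae
            if getattr(a, 'back_facing', True)]
    if selector is None:
        return amps
    return [a for a in amps if selector(a)]


# Example call (added_ids would come from the grow operation itself):
# new_amps = select_back_facing(lb, selector=lambda a: a.id in added_ids)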

Finally (thank you for your patience, dear reader), my question is: if any 
of the above makes sense, and to facilitate the design/code review, what 
would be the best way to move forward?

Should I create a mini-blueprint describing the changes and implement it?
Should I just open a bug for it and supply a fix?

Thanks,
-Sergey.

[1] https://review.openstack.org/#/c/234639
