Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-22 Thread Joshua Harlow

And for another recent one that came out yesterday:

Interesting to read for those who are using mongodb + openstack...

https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-reads

-Josh

Joshua Harlow wrote:

Joshua Harlow wrote:

Kevin Benton wrote:

>Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that its
backend specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT to resiliency of state on node restarts,
partitions, etc?


Of course... we tried to make it 'completely' transparent, but in
reality certain backends (zookeeper which uses a paxos-like algorithm
and redis with sentinel support...) are better (more resilient, more
consistent, handle partitions/restarts better...) than others (memcached
is after all just a distributed cache). This is just the nature of the
game...



And for some more reading fun:

https://aphyr.com/posts/315-call-me-maybe-rabbitmq

https://aphyr.com/posts/291-call-me-maybe-zookeeper

https://aphyr.com/posts/283-call-me-maybe-redis

https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

... (aphyr.com has a lot of these neat posts)...



The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.


Sounds good to me.




On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlo...@outlook.com> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the
agents.
That would be nice to get ride of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not,
who does
a given node ask to know if an agent is online or offline when
making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that its
backend specific and tooz supports varying backends.
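For reference, a minimal sketch of that model with tooz (the zake:// in-memory
test driver and the group/member names are purely illustrative):

    # Minimal tooz liveness sketch; zake:// is an in-memory test driver, swap
    # in zookeeper://host:2181, memcached://host or redis://host for real use.
    from tooz import coordination

    coord = coordination.get_coordinator('zake://', b'l2-agent-host-1')
    coord.start()

    group = b'neutron-l2-agents'
    try:
        coord.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass
    coord.join_group(group).get()

    # Call this periodically (e.g. from the agent's report loop); how liveness
    # is kept alive (TTL refresh, ephemeral znode, ...) is backend specific.
    coord.heartbeat()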


However, before (what I assume is) the large code change to
implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the
master branch
a few months ago, processing a heartbeat took an order of
magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent
(~300ms). A
few query optimizations might buy us a lot more headroom before
we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1]
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes







Kevin Benton wrote:


One of the most common is the heartbeat from each agent.
However, I
don't think we can't eliminate them because they are used
to determine
if the agents are still alive for scheduling purposes. Did
you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active
members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1]
http://docs.openstack.org/developer/tooz/compatibility.html#grouping

[2]
https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
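And a rough sketch of the consuming/scheduling side of that recipe - reading
the live members and registering join/leave callbacks (again, illustrative
names and the zake:// test driver only):

    # Rough sketch of the scheduler side: read live members and react to
    # joins/leaves.
    from tooz import coordination

    coord = coordination.get_coordinator('zake://', b'neutron-server-1')
    coord.start()

    group = b'neutron-l2-agents'
    try:
        coord.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass

    def on_join(event):
        print('agent came online: %s' % event.member_id)

    def on_leave(event):
        print('agent went away, reschedule its resources: %s' % event.member_id)

    coord.watch_join_group(group, on_join)
    coord.watch_leave_group(group, on_leave)

    alive = coord.get_members(group).get()   # member ids currently considered alive

    coord.run_watchers()   # must be called periodically to fire the callbacks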







Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-17 Thread Kevin Benton
I think before we go much further with any of these major refactors, we
need hard numbers that show where the scaling issues are. We can have an
extremely efficient messaging system that will not make any impact in the
grand scheme of things if there are other things that unnecessarily take up
an order of magnitude more resources.

For example, I dug into the router retrieval code that l3 agents use and it
was resulting in 80+ SQL queries to retrieve ~10 routers.[1] If we end up
refactoring the entire messaging layer to save 5% of our query load, it's
just not worth it.


1. https://bugs.launchpad.net/neutron/+bug/1445412
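(For anyone wanting to reproduce that kind of measurement, a quick way to count
queries per operation is a SQLAlchemy event hook - a generic sketch, not the
exact profiling setup used for the bug above:)

    # Rough sketch: count SQL statements issued while running one operation.
    from sqlalchemy import create_engine, event, text

    engine = create_engine('sqlite://')       # stand-in for the Neutron DB engine
    query_count = {'n': 0}

    @event.listens_for(engine, 'before_cursor_execute')
    def count_queries(conn, cursor, statement, parameters, context, executemany):
        query_count['n'] += 1

    # ... run the code under test here, e.g. one sync_routers handler call ...
    with engine.connect() as conn:
        conn.execute(text('SELECT 1'))        # placeholder for the real workload

    print('queries issued: %d' % query_count['n'])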

On Fri, Apr 17, 2015 at 8:42 AM, Attila Fazekas  wrote:

>
>
>
>
> - Original Message -
> > From: "joehuang" 
> > To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev@lists.openstack.org>
> > Sent: Friday, April 17, 2015 9:46:12 AM
> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> >
> > Hi, Attila,
> >
> > only address the issue of agent status/liveness management is not enough
> for
> > Neutron scalability. The concurrent dynamic load impact on large scale (
> for
> > example 100k managed nodes with the dynamic load like security group rule
> > update, routers_updated, etc ) should also be taken into account too. So
> > even if is agent status/liveness management improved in Neutron, that
> > doesn't mean the scalability issue totally being addressed.
> >
>
> This story is not about the heartbeat.
> https://bugs.launchpad.net/neutron/+bug/1438159
>
> What I am looking for is managing lot of nodes, with minimal `controller`
> resources.
>
> The actual required system changes like (for example regarding to vm boot)
> per/sec
> is relative low, even if you have many nodes and vms. - Consider the
> instances average lifetime -
>
> The `bug` is for the resources what the agents are related and querying
> many times,
> BTW: I am thinking about several alternatives and other variants.
>
> In neutron case a `system change` can affect multiple agents
> like security group rule change.
>
> It seams possible to have all agents to `query` a resource only once,
> and being notified by any subsequent change `for free`. (IP, sec group
> rule, new neighbor)
>
> This is the scenario when the message brokers can shine and scale,
> and it also offloads lot of work from the DB.
>
>
> > And on the other hand, Nova already supports several segregation
> concepts,
> > for example, Cells, Availability Zone... If there are 100k nodes to be
> > managed by one OpenStack instances, it's impossible to work without
> hardware
> > resources segregation. It's weird to put agent liveness manager in
> > availability zone(AZ in short) 1, but all managed agents in AZ 2. If AZ
> 1 is
> > power off, then all agents in AZ2 lost management.
> >
> >
> > The benchmark is already here for scalability "test report for million
> ports
> > scalability of Neutron "
> >
> http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
> >
> > The cascading may be not perfect, but at least it provides a feasible
> way if
> > we really want scalability.
> > 
> > I am also working to evolve OpenStack to a world no need to worry about
> > "OpenStack Scalability Issue" based on cascading:
> >
> > "Tenant level virtual OpenStack service over hybrid or federated or
> multiple
> > OpenStack based clouds":
> >
> > There are lots of OpenStack based clouds, each tenant will be allocated
> with
> > one cascading OpenStack as the virtual OpenStack service, and single
> > OpenStack API endpoint served for this tenant. The tenant's resources
> can be
> > distributed or dynamically scaled to multi-OpenStack based clouds, these
> > clouds may be federated with KeyStone, or using shared KeyStone, or  even
> > some OpenStack clouds built in AWS or Azure, or VMWare vSphere.
> >
> >
> > Under this deployment scenario, unlimited scalability in a cloud can be
> > achieved, no unified cascading layer, tenant level resources
> orchestration
> > among multi-OpenStack clouds fully distributed(even geographically). The
> > database and load for one casacding OpenStack is very very small, easy
> for
> > disaster recovery or backup. Multiple tenant may share one cascading
> > OpenStack to reduce resource waste, but the principle is to keep the
> > cascading OpenStack as thin as possible.
> &

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-17 Thread Attila Fazekas




- Original Message -
> From: "joehuang" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Friday, April 17, 2015 9:46:12 AM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> Hi, Attila,
> 
> only address the issue of agent status/liveness management is not enough for
> Neutron scalability. The concurrent dynamic load impact on large scale ( for
> example 100k managed nodes with the dynamic load like security group rule
> update, routers_updated, etc ) should also be taken into account too. So
> even if is agent status/liveness management improved in Neutron, that
> doesn't mean the scalability issue totally being addressed.
> 

This story is not about the heartbeat.
https://bugs.launchpad.net/neutron/+bug/1438159

What I am looking for is managing a lot of nodes with minimal `controller`
resources.

The actual rate of required system changes per second (for example, related to
VM boot) is relatively low, even if you have many nodes and VMs - consider the
instances' average lifetime.

The `bug` is about the resources the agents are related to and query many
times. BTW: I am thinking about several alternatives and other variants.

In the Neutron case a `system change`, such as a security group rule change,
can affect multiple agents.

It seems possible to have all agents `query` a resource only once and be
notified of any subsequent change `for free` (IP, sec group rule, new
neighbor).

This is the scenario where message brokers can shine and scale, and it also
offloads a lot of work from the DB.


> And on the other hand, Nova already supports several segregation concepts,
> for example, Cells, Availability Zone... If there are 100k nodes to be
> managed by one OpenStack instances, it's impossible to work without hardware
> resources segregation. It's weird to put agent liveness manager in
> availability zone(AZ in short) 1, but all managed agents in AZ 2. If AZ 1 is
> power off, then all agents in AZ2 lost management.
> 
>
> The benchmark is already here for scalability "test report for million ports
> scalability of Neutron "
> http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
> 
> The cascading may be not perfect, but at least it provides a feasible way if
> we really want scalability.
> 
> I am also working to evolve OpenStack to a world no need to worry about
> "OpenStack Scalability Issue" based on cascading:
> 
> "Tenant level virtual OpenStack service over hybrid or federated or multiple
> OpenStack based clouds":
> 
> There are lots of OpenStack based clouds, each tenant will be allocated with
> one cascading OpenStack as the virtual OpenStack service, and single
> OpenStack API endpoint served for this tenant. The tenant's resources can be
> distributed or dynamically scaled to multi-OpenStack based clouds, these
> clouds may be federated with KeyStone, or using shared KeyStone, or  even
> some OpenStack clouds built in AWS or Azure, or VMWare vSphere.
>
> 
> Under this deployment scenario, unlimited scalability in a cloud can be
> achieved, no unified cascading layer, tenant level resources orchestration
> among multi-OpenStack clouds fully distributed(even geographically). The
> database and load for one casacding OpenStack is very very small, easy for
> disaster recovery or backup. Multiple tenant may share one cascading
> OpenStack to reduce resource waste, but the principle is to keep the
> cascading OpenStack as thin as possible.
>
> You can find the information here:
> https://wiki.openstack.org/wiki/OpenStack_cascading_solution#Use_Case
> 
> Best Regards
> Chaoyi Huang ( joehuang )
> 
> -Original Message-
> From: Attila Fazekas [mailto:afaze...@redhat.com]
> Sent: Thursday, April 16, 2015 3:06 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> 
> 
> - Original Message -
> > From: "joehuang" 
> > To: "OpenStack Development Mailing List (not for usage questions)"
> > 
> > Sent: Sunday, April 12, 2015 3:46:24 AM
> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> > 
> > 
> > 
> > As Kevin talking about agents, I want to remind that in TCP/IP stack,
> > port ( not Neutron Port ) is a two bytes field, i.e. port ranges from
> > 0 ~ 65535, supports maximum 64k port number.
> > 
> > 
> > 
> > " above 100k managed node " means more than 100k L2 agents/L3
> > agents... will be alive under Neutron.
> &

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-17 Thread joehuang
Hi, Attila,

Addressing only agent status/liveness management is not enough for Neutron
scalability. The concurrent dynamic load at large scale (for example, 100k
managed nodes with dynamic load such as security group rule updates,
routers_updated, etc.) should also be taken into account. So even if agent
status/liveness management is improved in Neutron, that doesn't mean the
scalability issue is fully addressed.

On the other hand, Nova already supports several segregation concepts, for
example Cells and Availability Zones. If 100k nodes are to be managed by one
OpenStack instance, it's impossible to work without hardware resource
segregation. It's weird to put the agent liveness manager in availability zone
(AZ for short) 1 but all managed agents in AZ 2: if AZ 1 is powered off, then
all agents in AZ 2 lose management.

The benchmark for scalability is already here: "test report for million ports
scalability of Neutron"
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

The cascading approach may not be perfect, but at least it provides a feasible
way if we really want scalability.

I am also working, based on cascading, to evolve OpenStack toward a world with
no need to worry about the "OpenStack scalability issue":

"Tenant-level virtual OpenStack service over hybrid, federated or multiple
OpenStack-based clouds":

There are lots of OpenStack-based clouds; each tenant will be allocated one
cascading OpenStack as its virtual OpenStack service, with a single OpenStack
API endpoint served for that tenant. The tenant's resources can be distributed
or dynamically scaled across multiple OpenStack-based clouds; these clouds may
be federated with Keystone, use a shared Keystone, or even be OpenStack clouds
built on AWS, Azure, or VMware vSphere.

Under this deployment scenario, unlimited scalability in a cloud can be
achieved: there is no unified cascading layer, and tenant-level resource
orchestration among multiple OpenStack clouds is fully distributed (even
geographically). The database and load for one cascading OpenStack are very
small, making disaster recovery and backup easy. Multiple tenants may share one
cascading OpenStack to reduce resource waste, but the principle is to keep the
cascading OpenStack as thin as possible.

You can find the information here:
https://wiki.openstack.org/wiki/OpenStack_cascading_solution#Use_Case

Best Regards
Chaoyi Huang ( joehuang )

-Original Message-
From: Attila Fazekas [mailto:afaze...@redhat.com] 
Sent: Thursday, April 16, 2015 3:06 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?





- Original Message -
> From: "joehuang" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Sunday, April 12, 2015 3:46:24 AM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> As Kevin talking about agents, I want to remind that in TCP/IP stack, 
> port ( not Neutron Port ) is a two bytes field, i.e. port ranges from 
> 0 ~ 65535, supports maximum 64k port number.
> 
> 
> 
> " above 100k managed node " means more than 100k L2 agents/L3 
> agents... will be alive under Neutron.
> 
> 
> 
> Want to know the detail design how to support 99.9% possibility for 
> scaling Neutron in this way, and PoC and test would be a good support for 
> this idea.
> 

Would you consider something as PoC which uses the technology in similar way, 
with a similar port - security problem, but with a lower level API than neutron 
using currently ?

Is it an acceptable flaw:
If you kill -9 the q-svc 1 times at the `right` millisec the rabbitmq 
memory usage increases by ~1MiB ? (Rabbit usually eats ~10GiB under pressure) 
The memory can be freed without broker restart, it also gets freed on agent 
restart.


> 
> 
> "I'm 99.9% sure, for scaling above 100k managed node, we do not really 
> need to split the openstack to multiple smaller openstack, or use 
> significant number of extra controller machine."
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [blak...@gmail.com]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> Which periodic updates did you have in mind to eliminate? One of the 
> few remaining ones I can think of is sync_routers but it would be 
> great if you can enumerate the ones you observed because eliminating 
> overhead in agents is something I've been working on as well.
> 
> One of the most common

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-16 Thread joehuang
Hi, Neil,

The api_workers / rpc_workers configuration for the cascading layer can be
found in the test report; it's based on the community Juno version, and some
issues found are listed at the end of the report.

A simulator is used for the cascaded OpenStack, so there is no configuration
for it in the test. For the api_workers/rpc_workers configuration of one
OpenStack Neutron supporting 1152 nodes, you can refer to the article
http://www.openstack.cn/p2932.html or
http://www.csdn.net/article/2014-12-19/2823077, but unfortunately they are
written in Chinese and give no detailed worker numbers.
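(For reference, those knobs live in neutron.conf; the values below are purely
illustrative, since the report does not give the exact numbers:)

    # neutron.conf - illustrative values only; the report does not state the
    # exact numbers that were used.
    [DEFAULT]
    # number of separate API worker processes
    api_workers = 8
    # number of RPC worker processes handling agent traffic
    rpc_workers = 8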

Best Regards
Chaoyi Huang ( Joe Huang )


-Original Message-
From: Neil Jerram [mailto:neil.jer...@metaswitch.com] 
Sent: Thursday, April 16, 2015 5:15 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Thanks Joe, I really appreciate these numbers.

For an individual (cascaded) Neutron, then, your testing showed that it could 
happily handle 1000 compute hosts.  Apart from the cascading on the northbound 
side, was that otherwise unmodified from vanilla OpenStack?  Do you recall any 
particular config settings that were needed to achieve that?  (e.g. api_workers 
and rpc_workers)

Regards,
Neil


On 16/04/15 03:03, joehuang wrote:
> "In case it's helpful to see all the cases together, sync_routers (from the 
> L3 agent) was also mentioned in other part of this thread.  Plus of course 
> the liveness reporting from all agents."
>
> In the test report [1], which shows Neutron can supports up to million level 
> ports and 100k level physical hosts, the scalability is done by one cascading 
> Neutron to manage 100 cascaded Neutrons through current Neutron restful API. 
> For normal Neutron, each compute node will host L2 agent/OVS, L3 agent/DVR. 
> In the cascading Neutron layer, the L2 agent is modified to interact with 
> regarding cascaded Neutron but not OVS, the L3 agent(DVR) is modified to 
> interact with regarding cascaded Neutron but not linux route. That's why we 
> call the cascaded Neutron is the backend of Neutron.
>
> Therefore, there are only 100 compute nodes (or say agent ) required in the 
> cascading layer, each compute node will manage one cascaded Neutron. Each 
> cascaded Neutron can manage up to 1000 nodes (there is already report and 
> deployment and lab test can support this). That's the scalability to 100k 
> nodes.
>
> Because the cloud is splited into two layer (100 nodes in the cascading 
> layer, 1000 nodes in each cascaded layer ), even current mechanism can meet 
> the demand for sync_routers and liveness reporting from all agents, or L2 
> population, DVR router update...etc.
>
> The test report [1] at least prove that the layered architecture idea is 
> feasible for Neutron scalability, even up to million level ports and 100k 
> level nodes. The extra benefit for the layered architecture is that each 
> cascaded Neutron can leverage different backend technology implementation, 
> for example, one is ML2+OVS, another is OVN or ODL or Calico...
>
> [1]test report for million ports scalability of Neutron 
> http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascadi
> ng-solution-to-support-1-million-v-ms-in-100-data-centers
>
> Best Regards
> Chaoyi Huang ( Joe Huang )
>
> -Original Message-
> From: Neil Jerram [mailto:neil.jer...@metaswitch.com]
> Sent: Wednesday, April 15, 2015 9:46 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
> Hi again Joe, (+ list)
>
> On 11/04/15 02:00, joehuang wrote:
>> Hi, Neil,
>>
>> See inline comments.
>>
>> Best Regards
>>
>> Chaoyi Huang
>>
>> ________________
>> From: Neil Jerram [neil.jer...@metaswitch.com]
>> Sent: 09 April 2015 23:01
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>>
>> Hi Joe,
>>
>> Many thanks for your reply!
>>
>> On 09/04/15 03:34, joehuang wrote:
>>> Hi, Neil,
>>>
>>>From theoretic, Neutron is like a "broadcast" domain, for example, 
>>> enforcement of DVR and security group has to touch each regarding host 
>>> where there is VM of this project resides. Even using SDN controller, the 
>>> "touch" to regarding host is inevitable. If there are plenty of physical 
>>> hosts, for example, 10k, inside one Neutron, it's very hard to overcome the 
>>> "broadcast storm" issue under concurrent operation, that's the bottleneck 
>>> for scalability of Neutron.
>

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-16 Thread Neil Jerram

Thanks Joe, I really appreciate these numbers.

For an individual (cascaded) Neutron, then, your testing showed that it 
could happily handle 1000 compute hosts.  Apart from the cascading on 
the northbound side, was that otherwise unmodified from vanilla 
OpenStack?  Do you recall any particular config settings that were 
needed to achieve that?  (e.g. api_workers and rpc_workers)


Regards,
Neil


On 16/04/15 03:03, joehuang wrote:

"In case it's helpful to see all the cases together, sync_routers (from the L3 
agent) was also mentioned in other part of this thread.  Plus of course the liveness 
reporting from all agents."

In the test report [1], which shows Neutron can supports up to million level 
ports and 100k level physical hosts, the scalability is done by one cascading 
Neutron to manage 100 cascaded Neutrons through current Neutron restful API. 
For normal Neutron, each compute node will host L2 agent/OVS, L3 agent/DVR. In 
the cascading Neutron layer, the L2 agent is modified to interact with 
regarding cascaded Neutron but not OVS, the L3 agent(DVR) is modified to 
interact with regarding cascaded Neutron but not linux route. That's why we 
call the cascaded Neutron is the backend of Neutron.

Therefore, there are only 100 compute nodes (or say agent ) required in the 
cascading layer, each compute node will manage one cascaded Neutron. Each 
cascaded Neutron can manage up to 1000 nodes (there is already report and 
deployment and lab test can support this). That's the scalability to 100k nodes.

Because the cloud is splited into two layer (100 nodes in the cascading layer, 
1000 nodes in each cascaded layer ), even current mechanism can meet the demand 
for sync_routers and liveness reporting from all agents, or L2 population, DVR 
router update...etc.

The test report [1] at least prove that the layered architecture idea is 
feasible for Neutron scalability, even up to million level ports and 100k level 
nodes. The extra benefit for the layered architecture is that each cascaded 
Neutron can leverage different backend technology implementation, for example, 
one is ML2+OVS, another is OVN or ODL or Calico...

[1]test report for million ports scalability of Neutron 
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

Best Regards
Chaoyi Huang ( Joe Huang )

-Original Message-
From: Neil Jerram [mailto:neil.jer...@metaswitch.com]
Sent: Wednesday, April 15, 2015 9:46 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Hi again Joe, (+ list)

On 11/04/15 02:00, joehuang wrote:

Hi, Neil,

See inline comments.

Best Regards

Chaoyi Huang


From: Neil Jerram [neil.jer...@metaswitch.com]
Sent: 09 April 2015 23:01
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Hi Joe,

Many thanks for your reply!

On 09/04/15 03:34, joehuang wrote:

Hi, Neil,

   From theoretic, Neutron is like a "broadcast" domain, for example, enforcement of DVR and 
security group has to touch each regarding host where there is VM of this project resides. Even using SDN 
controller, the "touch" to regarding host is inevitable. If there are plenty of physical hosts, for 
example, 10k, inside one Neutron, it's very hard to overcome the "broadcast storm" issue under 
concurrent operation, that's the bottleneck for scalability of Neutron.


I think I understand that in general terms - but can you be more
specific about the broadcast storm?  Is there one particular message
exchange that involves broadcasting?  Is it only from the server to
agents, or are there 'broadcasts' in other directions as well?

[[joehuang]] for example, L2 population, Security group rule update, DVR route 
update. Both direction in different scenario.


Thanks.  In case it's helpful to see all the cases together, sync_routers (from 
the L3 agent) was also mentioned in other part of this thread.  Plus of course 
the liveness reporting from all agents.


(I presume you are talking about control plane messages here, i.e.
between Neutron components.  Is that right?  Obviously there can also
be broadcast storm problems in the data plane - but I don't think
that's what you are talking about here.)

[[joehuang]] Yes, controll plane here.


Thanks for confirming that.


We need layered architecture in Neutron to solve the "broadcast
domain" bottleneck of scalability. The test report from OpenStack
cascading shows that through layered architecture "Neutron
cascading", Neutron can supports up to million level ports and 100k
level physical hosts. You can find the report here:
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascad
ing-solution-to-support-1-million-v-ms-in-10

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-16 Thread Attila Fazekas




- Original Message -
> From: "joehuang" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Sunday, April 12, 2015 3:46:24 AM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> As Kevin talking about agents, I want to remind that in TCP/IP stack, port (
> not Neutron Port ) is a two bytes field, i.e. port ranges from 0 ~ 65535,
> supports maximum 64k port number.
> 
> 
> 
> " above 100k managed node " means more than 100k L2 agents/L3 agents... will
> be alive under Neutron.
> 
> 
> 
> Want to know the detail design how to support 99.9% possibility for scaling
> Neutron in this way, and PoC and test would be a good support for this idea.
> 

Would you consider as a PoC something which uses the technology in a similar
way, with a similar port-security problem, but with a lower-level API than
Neutron currently uses?

Is it an acceptable flaw that if you kill -9 the q-svc once at the `right`
millisecond, the rabbitmq memory usage increases by ~1MiB? (Rabbit usually eats
~10GiB under pressure.) The memory can be freed without a broker restart; it
also gets freed on agent restart.


> 
> 
> "I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine."
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [blak...@gmail.com]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> Which periodic updates did you have in mind to eliminate? One of the few
> remaining ones I can think of is sync_routers but it would be great if you
> can enumerate the ones you observed because eliminating overhead in agents
> is something I've been working on as well.
> 
> One of the most common is the heartbeat from each agent. However, I don't
> think we can't eliminate them because they are used to determine if the
> agents are still alive for scheduling purposes. Did you have something else
> in mind to determine if an agent is alive?
> 
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas < afaze...@redhat.com >
> wrote:
> 
> 
> I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine.
> 
> The problem is openstack using the right tools SQL/AMQP/(zk),
> but in a wrong way.
> 
> For example.:
> Periodic updates can be avoided almost in all cases
> 
> The new data can be pushed to the agent just when it needed.
> The agent can know when the AMQP connection become unreliable (queue or
> connection loose),
> and needs to do full sync.
> https://bugs.launchpad.net/neutron/+bug/1438159
> 
> Also the agents when gets some notification, they start asking for details
> via the
> AMQP -> SQL. Why they do not know it already or get it with the notification
> ?
> 
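A rough sketch of that "push updates, full resync when the AMQP connection
becomes unreliable" idea, using kombu's consumer mixin directly for
illustration (Neutron really goes through oslo.messaging; the exchange/queue
names and the two helper functions are made-up placeholders):

    # Incremental updates pushed over AMQP, with a full resync whenever the
    # connection is (re)established instead of periodic polling.
    from kombu import Connection, Exchange, Queue
    from kombu.mixins import ConsumerMixin

    updates = Queue('agent-host-1-updates',
                    Exchange('neutron-updates', type='fanout'))

    def apply_update(body):
        print('incremental update pushed by the server: %r' % (body,))

    def do_full_sync():
        print('connection (re)established -> full sync instead of polling')

    class AgentConsumer(ConsumerMixin):
        def __init__(self, connection):
            self.connection = connection

        def get_consumers(self, Consumer, channel):
            return [Consumer(queues=[updates], callbacks=[self.on_update])]

        def on_update(self, body, message):
            apply_update(body)
            message.ack()

        def on_connection_revived(self):
            # Called on the first connect and after every reconnect: local
            # state may be stale, so pull everything once.
            do_full_sync()

    AgentConsumer(Connection('amqp://guest:guest@localhost//')).run()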
> 
> - Original Message -
> > From: "Neil Jerram" < neil.jer...@metaswitch.com >
> > To: "OpenStack Development Mailing List (not for usage questions)" <
> > openstack-dev@lists.openstack.org >
> > Sent: Thursday, April 9, 2015 5:01:45 PM
> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> > 
> > Hi Joe,
> > 
> > Many thanks for your reply!
> > 
> > On 09/04/15 03:34, joehuang wrote:
> > > Hi, Neil,
> > > 
> > > From theoretic, Neutron is like a "broadcast" domain, for example,
> > > enforcement of DVR and security group has to touch each regarding host
> > > where there is VM of this project resides. Even using SDN controller, the
> > > "touch" to regarding host is inevitable. If there are plenty of physical
> > > hosts, for example, 10k, inside one Neutron, it's very hard to overcome
> > > the "broadcast storm" issue under concurrent operation, that's the
> > > bottleneck for scalability of Neutron.
> > 
> > I think I understand that in general terms - but can you be more
> > specific about the broadcast storm? Is there one particular message
> > exchange that involves broadcasting? Is it only from the server to
> > agents, or are there 'broadcasts' in other directions as well?
> > 
> > (I presume you are talking about control plane messages here, i.e.
> > between Neutron components. Is that right? Obviou

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-15 Thread joehuang
"In case it's helpful to see all the cases together, sync_routers (from the L3 
agent) was also mentioned in other part of this thread.  Plus of course the 
liveness reporting from all agents."

In the test report [1], which shows Neutron can support up to a million ports
and 100k physical hosts, scalability is achieved by one cascading Neutron
managing 100 cascaded Neutrons through the current Neutron RESTful API. In a
normal Neutron, each compute node hosts an L2 agent/OVS and an L3 agent/DVR. In
the cascading Neutron layer, the L2 agent is modified to interact with the
corresponding cascaded Neutron instead of OVS, and the L3 agent (DVR) is
modified to interact with the corresponding cascaded Neutron instead of Linux
routing. That's why we call the cascaded Neutron the backend of Neutron.

Therefore, only 100 compute nodes (or rather agents) are required in the
cascading layer; each compute node manages one cascaded Neutron. Each cascaded
Neutron can manage up to 1000 nodes (there are already reports, deployments and
lab tests supporting this). That's how scalability to 100k nodes is reached.

Because the cloud is split into two layers (100 nodes in the cascading layer,
1000 nodes in each cascaded layer), even the current mechanism can meet the
demand for sync_routers and liveness reporting from all agents, as well as L2
population, DVR router updates, etc.

The test report [1] at least proves that the layered architecture idea is
feasible for Neutron scalability, even up to a million ports and 100k nodes.
The extra benefit of the layered architecture is that each cascaded Neutron can
use a different backend technology implementation, for example one with
ML2+OVS, another with OVN or ODL or Calico...
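(Purely as a conceptual illustration of "the cascading-layer agent talks to
another Neutron's REST API instead of programming OVS" - this is not the actual
cascading code, and the endpoint, credentials and ID mapping are made up:)

    # Conceptual sketch only: a cascading-layer "agent" that, instead of
    # programming local OVS flows, mirrors the port into a cascaded Neutron
    # over the normal REST API.
    from neutronclient.v2_0 import client

    cascaded = client.Client(username='neutron',
                             password='secret',
                             tenant_name='service',
                             auth_url='http://cascaded-pod-1:5000/v2.0')

    def handle_port_update(port):
        # The cascaded Neutron's own agents do the real OVS/DVR programming;
        # mapping cascading-layer IDs to cascaded-layer IDs is glossed over.
        cascaded.create_port({'port': {
            'network_id': port['network_id'],
            'mac_address': port['mac_address'],
            'fixed_ips': port['fixed_ips'],
        }})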

[1]test report for million ports scalability of Neutron 
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

Best Regards
Chaoyi Huang ( Joe Huang )

-Original Message-
From: Neil Jerram [mailto:neil.jer...@metaswitch.com] 
Sent: Wednesday, April 15, 2015 9:46 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Hi again Joe, (+ list)

On 11/04/15 02:00, joehuang wrote:
> Hi, Neil,
>
> See inline comments.
>
> Best Regards
>
> Chaoyi Huang
>
> 
> From: Neil Jerram [neil.jer...@metaswitch.com]
> Sent: 09 April 2015 23:01
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
> Hi Joe,
>
> Many thanks for your reply!
>
> On 09/04/15 03:34, joehuang wrote:
>> Hi, Neil,
>>
>>   From theoretic, Neutron is like a "broadcast" domain, for example, 
>> enforcement of DVR and security group has to touch each regarding host where 
>> there is VM of this project resides. Even using SDN controller, the "touch" 
>> to regarding host is inevitable. If there are plenty of physical hosts, for 
>> example, 10k, inside one Neutron, it's very hard to overcome the "broadcast 
>> storm" issue under concurrent operation, that's the bottleneck for 
>> scalability of Neutron.
>
> I think I understand that in general terms - but can you be more 
> specific about the broadcast storm?  Is there one particular message 
> exchange that involves broadcasting?  Is it only from the server to 
> agents, or are there 'broadcasts' in other directions as well?
>
> [[joehuang]] for example, L2 population, Security group rule update, DVR 
> route update. Both direction in different scenario.

Thanks.  In case it's helpful to see all the cases together, sync_routers (from 
the L3 agent) was also mentioned in other part of this thread.  Plus of course 
the liveness reporting from all agents.

> (I presume you are talking about control plane messages here, i.e.
> between Neutron components.  Is that right?  Obviously there can also 
> be broadcast storm problems in the data plane - but I don't think 
> that's what you are talking about here.)
>
> [[joehuang]] Yes, controll plane here.

Thanks for confirming that.

>> We need layered architecture in Neutron to solve the "broadcast 
>> domain" bottleneck of scalability. The test report from OpenStack 
>> cascading shows that through layered architecture "Neutron 
>> cascading", Neutron can supports up to million level ports and 100k 
>> level physical hosts. You can find the report here: 
>> http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascad
>> ing-solution-to-support-1-million-v-ms-in-100-data-centers
>
> Many thanks, I will take a look at this.

It was very interesting, thanks.  And by following through your links I also

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-15 Thread Joshua Harlow

Neil Jerram wrote:

Hi again Joe, (+ list)

On 11/04/15 02:00, joehuang wrote:

Hi, Neil,

See inline comments.

Best Regards

Chaoyi Huang


From: Neil Jerram [neil.jer...@metaswitch.com]
Sent: 09 April 2015 23:01
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Hi Joe,

Many thanks for your reply!

On 09/04/15 03:34, joehuang wrote:

Hi, Neil,

From theoretic, Neutron is like a "broadcast" domain, for example,
enforcement of DVR and security group has to touch each regarding
host where there is VM of this project resides. Even using SDN
controller, the "touch" to regarding host is inevitable. If there are
plenty of physical hosts, for example, 10k, inside one Neutron, it's
very hard to overcome the "broadcast storm" issue under concurrent
operation, that's the bottleneck for scalability of Neutron.


I think I understand that in general terms - but can you be more
specific about the broadcast storm? Is there one particular message
exchange that involves broadcasting? Is it only from the server to
agents, or are there 'broadcasts' in other directions as well?

[[joehuang]] for example, L2 population, Security group rule update,
DVR route update. Both direction in different scenario.


Thanks. In case it's helpful to see all the cases together, sync_routers
(from the L3 agent) was also mentioned in other part of this thread.
Plus of course the liveness reporting from all agents.


(I presume you are talking about control plane messages here, i.e.
between Neutron components. Is that right? Obviously there can also be
broadcast storm problems in the data plane - but I don't think that's
what you are talking about here.)

[[joehuang]] Yes, controll plane here.


Thanks for confirming that.


We need layered architecture in Neutron to solve the "broadcast
domain" bottleneck of scalability. The test report from OpenStack
cascading shows that through layered architecture "Neutron
cascading", Neutron can supports up to million level ports and 100k
level physical hosts. You can find the report here:
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers



Many thanks, I will take a look at this.


It was very interesting, thanks. And by following through your links I
also learned more about Nova cells, and about how some people question
whether we need any kind of partitioning at all, and should instead
solve scaling/performance problems in other ways... It will be
interesting to see how this plays out.

I'd still like to see more information, though, about how far people
have scaled OpenStack - and in particular Neutron - as it exists today.
Surely having a consensus set of current limits is an important input
into any discussion of future scaling work.


+2 to this...

Shooting for the moon (although nice in theory) is not so useful when 
you can't even get up a hill ;)




For example, Kevin mentioned benchmarking where the Neutron server
processed a liveness update in <50ms and a sync_routers in 300ms.
Suppose, the liveness update time was 50ms (since I don't know in detail
what that < means) and agents report liveness every 30s. Does that mean
that a single Neutron server can only support 600 agents?

I'm also especially interested in the DHCP agent, because in Calico we
have one of those on every compute host. We've just run tests which
appeared to be hitting trouble from just 50 compute hosts onwards, and
apparently because of DHCP agent communications. We need to continue
looking into that and report findings properly, but if anyone already
has any insights, they would be much appreciated.

Many thanks,
Neil

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-15 Thread Neil Jerram

Hi again Joe, (+ list)

On 11/04/15 02:00, joehuang wrote:

Hi, Neil,

See inline comments.

Best Regards

Chaoyi Huang


From: Neil Jerram [neil.jer...@metaswitch.com]
Sent: 09 April 2015 23:01
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Hi Joe,

Many thanks for your reply!

On 09/04/15 03:34, joehuang wrote:

Hi, Neil,

  From theoretic, Neutron is like a "broadcast" domain, for example, enforcement of DVR and 
security group has to touch each regarding host where there is VM of this project resides. Even using SDN 
controller, the "touch" to regarding host is inevitable. If there are plenty of physical hosts, for 
example, 10k, inside one Neutron, it's very hard to overcome the "broadcast storm" issue under 
concurrent operation, that's the bottleneck for scalability of Neutron.


I think I understand that in general terms - but can you be more
specific about the broadcast storm?  Is there one particular message
exchange that involves broadcasting?  Is it only from the server to
agents, or are there 'broadcasts' in other directions as well?

[[joehuang]] for example, L2 population, Security group rule update, DVR route 
update. Both direction in different scenario.


Thanks.  In case it's helpful to see all the cases together, 
sync_routers (from the L3 agent) was also mentioned in other part of 
this thread.  Plus of course the liveness reporting from all agents.



(I presume you are talking about control plane messages here, i.e.
between Neutron components.  Is that right?  Obviously there can also be
broadcast storm problems in the data plane - but I don't think that's
what you are talking about here.)

[[joehuang]] Yes, controll plane here.


Thanks for confirming that.


We need layered architecture in Neutron to solve the "broadcast domain" bottleneck of 
scalability. The test report from OpenStack cascading shows that through layered architecture 
"Neutron cascading", Neutron can supports up to million level ports and 100k level 
physical hosts. You can find the report here: 
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers


Many thanks, I will take a look at this.


It was very interesting, thanks.  And by following through your links I 
also learned more about Nova cells, and about how some people question 
whether we need any kind of partitioning at all, and should instead 
solve scaling/performance problems in other ways...  It will be 
interesting to see how this plays out.


I'd still like to see more information, though, about how far people 
have scaled OpenStack - and in particular Neutron - as it exists today. 
 Surely having a consensus set of current limits is an important input 
into any discussion of future scaling work.


For example, Kevin mentioned benchmarking where the Neutron server 
processed a liveness update in <50ms and a sync_routers in 300ms. 
Suppose, the liveness update time was 50ms (since I don't know in detail 
what that < means) and agents report liveness every 30s.  Does that mean 
that a single Neutron server can only support 600 agents?
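(Back of the envelope: 30 s / 0.05 s = 600 heartbeats per reporting interval if
they are processed strictly one at a time, which is where the 600-agent figure
comes from; multiple api/rpc workers or faster processing would presumably
raise that roughly linearly.)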


I'm also especially interested in the DHCP agent, because in Calico we 
have one of those on every compute host.  We've just run tests which 
appeared to be hitting trouble from just 50 compute hosts onwards, and 
apparently because of DHCP agent communications.  We need to continue 
looking into that and report findings properly, but if anyone already 
has any insights, they would be much appreciated.


Many thanks,
Neil

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-15 Thread joehuang
Hi, Joshua,

This is a long discussion thread, may we come back to the scalability topic? 

As you confirmed, Tooz only addresses the issue of agent status management; it
does not solve the concurrent dynamic load impact at large scale (for example,
100k managed nodes with dynamic load such as security group rule updates,
routers_updated, etc.).

So even if Tooz is implemented in Neutron, that doesn't mean the scalability
issue is fully addressed.

So what are the goal and the whole picture for addressing Neutron scalability?
Tooz would then help complete that picture.
 
Best Regards
Chaoyi Huang ( Joe Huang )

-Original Message-
From: Joshua Harlow [mailto:harlo...@outlook.com] 
Sent: Tuesday, April 14, 2015 11:33 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] Re: [neutron] Neutron scaling datapoints?

Daniel Comnea wrote:
> Joshua,
>
> those are old and have been fixed/ documented on Consul side.
> As for ZK, i have nothing against it, just wish you good luck running 
> it in a multi cross-DC setup :)

Totally fair, although I start to question a cross-DC setup of things, and why 
that's needed in this (and/or any) architecture, but to each their own ;)

>
> Dani
>
> On Mon, Apr 13, 2015 at 11:37 PM, Joshua Harlow <harlo...@outlook.com> wrote:
>
> Did the following get addressed?
>
> https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul
>
> Seems like quite a few things got raised in that post about etcd/consul.
>
> Maybe they are fixed, idk...
>
> https://aphyr.com/posts/291-call-me-maybe-zookeeper though worked
> as expected (and without issue)...
>
> I quote:
>
> '''
> Recommendations
>
> Use Zookeeper. It’s mature, well-designed, and battle-tested.
> Because the consequences of its connection model and linearizability
> properties are subtle, you should, wherever possible, take advantage
> of tested recipes and client libraries like Curator, which do their
> best to correctly handle the complex state transitions associated
> with session and connection loss.
> '''
>
> Daniel Comnea wrote:
>
> My $2 cents:
>
> I like the 3rd party backend however instead of ZK wouldn't
> Consul [1]
> fit better due to lighter/ out of box multi DC awareness?
>
> Dani
>
> [1] Consul - https://www.consul.io/
>
>
> On Mon, Apr 13, 2015 at 9:51 AM, Wangbibo <wangb...@huawei.com> wrote:
>
>  Hi Kevin,
>
>  Totally agree with you that heartbeat from each agent is something
>  that we cannot eliminate currently. Agent status depends on it, and
>  further scheduler and HA depend on agent status.
>
>  I proposed a Liberty spec for introducing an open framework/pluggable
>  agent status drivers.[1][2]  It allows us to use some other 3rd
>  party backend to monitor agent status, such as zookeeper, memcached.
>  Meanwhile, it guarantees backward compatibility so that users could
>  still use the db-based status monitoring mechanism as their default
>  choice.
>
>  Based on that, we may do further optimization on the issues Attila and
>  you mentioned. Thanks.
>
>  [1] BP -
>  https://blueprints.launchpad.net/neutron/+spec/agent-group-and-status-drivers
>
>  [2] Liberty Spec proposed -
>  https://review.openstack.org/#/c/168921/
>
>  Best,
>
>  Robin
>
>  *From:* Kevin Benton [mailto:blak...@gmail.com]
>  *Sent:* April 11, 2015 12:35
>  *To:* OpenStack Development Mailing List (not for u

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-13 Thread joehuang
Tooz provides a mechanism for grouping agents and for agent status/liveness
management. Multiple coordinator services may be required in a large-scale
deployment, especially at the 100k-node level. We can't assume that one
coordinator service is enough to manage all nodes, which means Tooz may need to
support multiple coordination backends.

And Nova already supports several segregation concepts, for example Cells,
Availability Zones, Host Aggregates... Where will the coordination backend
reside? How do we group agents? It's weird to put the coordinator in
availability zone (AZ for short) 1 but all managed agents in AZ 2: if AZ 1 is
powered off, then all agents in AZ 2 lose management. Do we need a segregation
concept for agents, reuse the Nova concepts, or build a mapping between them?
Especially if multiple coordination backends will work under one Neutron.

Best Regards
Chaoyi Huang ( Joe Huang )

-Original Message-
From: Joshua Harlow [mailto:harlo...@outlook.com] 
Sent: Monday, April 13, 2015 11:11 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

joehuang wrote:
> Hi, Kevin and Joshua,
>
> As my understanding, Tooz only addresses the issue of agent status 
> management, but how to solve the concurrent dynamic load impact on 
> large scale ( for example 100k managed nodes with the dynamic load 
> like security goup rule update, routers_updated, etc )

Yes, that is correct, let's not confuse status/liveness management with 
updates... since IMHO they are two very different things (the latter can be 
eventually consistent IMHO while the liveness 'question' probably should not 
be...).

>
> And one more question is, if we have 100k managed nodes, how to do the 
> partition? Or all nodes will be managed by one Tooz service, like 
> Zookeeper? Can Zookeeper manage 100k nodes status?

I can get u some data/numbers from some studies I've seen, but what u are 
talking about is highly specific as to what u are doing with zookeeper... There 
is no one solution for all the things IMHO; choose what's best from your 
tool-belt for each problem...

>
> Best Regards
>
> Chaoyi Huang ( Joe Huang )
>
> *From:*Kevin Benton [mailto:blak...@gmail.com]
> *Sent:* Monday, April 13, 2015 3:52 AM
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
>>Timestamps are just one way (and likely the most primitive), using 
>>redis
> (or memcache) key/value and expiry are another (and letting memcache 
> or redis expire using its own internal algorithms), using zookeeper 
> ephemeral nodes[1] are another... The point being that its backend 
> specific and tooz supports varying backends.
>
> Very cool. Is the backend completely transparent so a deployer could 
> choose a service they are comfortable maintaining, or will that change 
> the properties WRT to resiliency of state on node restarts, partitions, etc?
>
> The Nova implementation of Tooz seemed pretty straight-forward, 
> although it looked like it had pluggable drivers for service management 
> already.
> Before I dig into it much further I'll file a spec on the Neutron side 
> to see if I can get some other cores onboard to do the review work if 
> I push a change to tooz.
>
> On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlo...@outlook.com> wrote:
>
> Kevin Benton wrote:
>
> So IIUC tooz would be handling the liveness detection for the agents.
> That would be nice to get ride of that logic in Neutron and just 
> register callbacks for rescheduling the dead.
>
> Where does it store that state, does it persist timestamps to the DB 
> like Neutron does? If so, how would that scale better? If not, who 
> does a given node ask to know if an agent is online or offline when 
> making a scheduling decision?
>
>
> Timestamps are just one way (and likely the most primitive), using 
> redis (or memcache) key/value and expiry are another (and letting 
> memcache or redis expire using its own internal algorithms), using 
> zookeeper ephemeral nodes[1] are another... The point being that its 
> backend specific and tooz supports varying backends.
>
>
> However, before (what I assume is) the large code change to implement 
> tooz, I would like to quantify that the heartbeats are actually a 
> bottleneck. When I was doing some profiling of them on the master 
> branch a few months ago, processing a heartbeat took an order of 
> magnitude less time (<50ms) than the 'sync routers' task of the l3 
> agent (~300ms). A few query optimizations might buy us a lot more 
> headroom before we have to fall back to large refactors.
>
>
> S

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-13 Thread joehuang

-Original Message-
From: Attila Fazekas [mailto:afaze...@redhat.com] 
Sent: Monday, April 13, 2015 3:19 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?


- Original Message -
> From: "joehuang" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Sunday, April 12, 2015 1:20:48 PM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> Hi, Kevin,
> 
> 
> 
> I assumed that all agents are connected to same IP address of 
> RabbitMQ, then the connection will exceed the port ranges limitation.
> 
https://news.ycombinator.com/item?id=1571300

"TCP connections are identified by the (src ip, src port, dest ip, dest port) 
tuple."

"The server doesn't need multiple IPs to handle > 65535 connections. All the 
server connections to a given IP are to the same port. For a given client, the 
unique key for an http connection is (client-ip, PORT, server-ip, 80). The only 
number that can vary is PORT, and that's a value on the client. So, the client 
is limited to 65535 connections to the server. But, a second client could also 
have another 65K connections to the same server-ip:port."


[[joehuang]] Sorry, it has been a long time since I wrote a socket-based app; I
may be mistaken about the HTTP server spawning a thread to handle a new
connection. I'll check again.

> 
> For a RabbitMQ cluster, for sure the client can connect to any one of 
> member in the cluster, but in this case, the client has to be designed 
> in fail-safe
> manner: the client should be aware of the cluster member failure, and 
> reconnect to other survive member. No such mechnism has been 
> implemented yet.
> 
> 
> 
> Other way is to use LVS or DNS based like load balancer, or something else.
> If you put one load balancer ahead of a cluster, then we have to take 
> care of the port number limitation, there are so many agents will 
> require connection concurrently, 100k level, and the requests can not be 
> rejected.
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [blak...@gmail.com]
> Sent: 12 April 2015 9:59
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> The TCP/IP stack keeps track of connections as a combination of IP + 
> TCP port. The two byte port limit doesn't matter unless all of the 
> agents are connecting from the same IP address, which shouldn't be the 
> case unless compute nodes connect to the rabbitmq server via one IP 
> address running port address translation.
> 
> Either way, the agents don't connect directly to the Neutron server, 
> they connect to the rabbit MQ cluster. Since as many Neutron server 
> processes can be launched as necessary, the bottlenecks will likely 
> show up at the messaging or DB layer.
> 
> On Sat, Apr 11, 2015 at 6:46 PM, joehuang < joehu...@huawei.com > wrote:
> 
> 
> 
> 
> 
> As Kevin talking about agents, I want to remind that in TCP/IP stack, 
> port ( not Neutron Port ) is a two bytes field, i.e. port ranges from 
> 0 ~ 65535, supports maximum 64k port number.
> 
> 
> 
> " above 100k managed node " means more than 100k L2 agents/L3 
> agents... will be alive under Neutron.
> 
> 
> 
> Want to know the detail design how to support 99.9% possibility for 
> scaling Neutron in this way, and PoC and test would be a good support for 
> this idea.
> 
> 
> 
> "I'm 99.9% sure, for scaling above 100k managed node, we do not really 
> need to split the openstack to multiple smaller openstack, or use 
> significant number of extra controller machine."
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [ blak...@gmail.com ]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> Which periodic updates did you have in mind to eliminate? One of the 
> few remaining ones I can think of is sync_routers but it would be 
> great if you can enumerate the ones you observed because eliminating 
> overhead in agents is something I've been working on as well.
> 
> One of the most common is the heartbeat from each agent. However, I 
> don't think we can't eliminate them because they are used to determine 
> if the agents are still alive for scheduling purposes. Did you have 
> something else in mind to determine if an agent is alive?
> 
> On Fri, Apr 10, 2015

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-13 Thread Attila Fazekas




- Original Message -
> From: "joehuang" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Sunday, April 12, 2015 1:20:48 PM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> Hi, Kevin,
> 
> 
> 
> I assumed that all agents are connected to same IP address of RabbitMQ, then
> the connection will exceed the port ranges limitation.
> 
https://news.ycombinator.com/item?id=1571300

"TCP connections are identified by the (src ip, src port, dest ip, dest port) 
tuple."

"The server doesn't need multiple IPs to handle > 65535 connections. All the 
server connections to a given IP are to the same port. For a given client, the 
unique key for an http connection is (client-ip, PORT, server-ip, 80). The only 
number that can vary is PORT, and that's a value on the client. So, the client 
is limited to 65535 connections to the server. But, a second client could also 
have another 65K connections to the same server-ip:port."

> 
> For a RabbitMQ cluster, for sure the client can connect to any one of member
> in the cluster, but in this case, the client has to be designed in fail-safe
> manner: the client should be aware of the cluster member failure, and
> reconnect to other survive member. No such mechnism has been implemented
> yet.
> 
> 
> 
> Other way is to use LVS or DNS based like load balancer, or something else.
> If you put one load balancer ahead of a cluster, then we have to take care
> of the port number limitation, there are so many agents will require
> connection concurrently, 100k level, and the requests can not be rejected.
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [blak...@gmail.com]
> Sent: 12 April 2015 9:59
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> The TCP/IP stack keeps track of connections as a combination of IP + TCP
> port. The two byte port limit doesn't matter unless all of the agents are
> connecting from the same IP address, which shouldn't be the case unless
> compute nodes connect to the rabbitmq server via one IP address running port
> address translation.
> 
> Either way, the agents don't connect directly to the Neutron server, they
> connect to the rabbit MQ cluster. Since as many Neutron server processes can
> be launched as necessary, the bottlenecks will likely show up at the
> messaging or DB layer.
> 
> On Sat, Apr 11, 2015 at 6:46 PM, joehuang < joehu...@huawei.com > wrote:
> 
> 
> 
> 
> 
> As Kevin talking about agents, I want to remind that in TCP/IP stack, port (
> not Neutron Port ) is a two bytes field, i.e. port ranges from 0 ~ 65535,
> supports maximum 64k port number.
> 
> 
> 
> " above 100k managed node " means more than 100k L2 agents/L3 agents... will
> be alive under Neutron.
> 
> 
> 
> Want to know the detail design how to support 99.9% possibility for scaling
> Neutron in this way, and PoC and test would be a good support for this idea.
> 
> 
> 
> "I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine."
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [ blak...@gmail.com ]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> Which periodic updates did you have in mind to eliminate? One of the few
> remaining ones I can think of is sync_routers but it would be great if you
> can enumerate the ones you observed because eliminating overhead in agents
> is something I've been working on as well.
> 
> One of the most common is the heartbeat from each agent. However, I don't
> think we can't eliminate them because they are used to determine if the
> agents are still alive for scheduling purposes. Did you have something else
> in mind to determine if an agent is alive?
> 
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas < afaze...@redhat.com >
> wrote:
> 
> 
> I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine.
> 
> The problem is openstack using the right tools SQL/AMQP/(zk),
> but in a wrong way.
> 
> For example.:
> Periodic updates can be avoided almost in 

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Attila Fazekas




- Original Message -
> From: "Kevin Benton" 
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Sent: Sunday, April 12, 2015 4:17:29 AM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> So IIUC tooz would be handling the liveness detection for the agents. That
> would be nice to get ride of that logic in Neutron and just register
> callbacks for rescheduling the dead.
> 
> Where does it store that state, does it persist timestamps to the DB like
> Neutron does? If so, how would that scale better? If not, who does a given
> node ask to know if an agent is online or offline when making a scheduling
> decision?
> 
You might find interesting the proposed solution in this bug:
https://bugs.launchpad.net/nova/+bug/1437199

> However, before (what I assume is) the large code change to implement tooz, I
> would like to quantify that the heartbeats are actually a bottleneck. When I
> was doing some profiling of them on the master branch a few months ago,
> processing a heartbeat took an order of magnitude less time (<50ms) than the
> 'sync routers' task of the l3 agent (~300ms). A few query optimizations
> might buy us a lot more headroom before we have to fall back to large
> refactors.
> Kevin Benton wrote:
> 
> 
> 
> One of the most common is the heartbeat from each agent. However, I
> don't think we can't eliminate them because they are used to determine
> if the agents are still alive for scheduling purposes. Did you have
> something else in mind to determine if an agent is alive?
> 
> Put each agent in a tooz[1] group; have each agent periodically heartbeat[2],
> have whoever needs to schedule read the active members of that group (or use
> [3] to get notified via a callback), profit...
> 
> Pick from your favorite (supporting) driver at:
> 
> http://docs.openstack.org/developer/tooz/compatibility.html
> 
> [1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
> [2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
> [3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes
> 
> 


Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

Joshua Harlow wrote:

Kevin Benton wrote:

>Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that its
backend specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT to resiliency of state on node restarts,
partitions, etc?


Of course... we tried to make it 'completely' transparent, but in
reality certain backends (zookeeper which uses a paxos-like algorithm
and redis with sentinel support...) are better (more resilient, more
consistent, handle partitions/restarts better...) than others (memcached
is after all just a distributed cache). This is just the nature of the
game...



And for some more reading fun:

https://aphyr.com/posts/315-call-me-maybe-rabbitmq

https://aphyr.com/posts/291-call-me-maybe-zookeeper

https://aphyr.com/posts/283-call-me-maybe-redis

https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

... (aphyr.com has alot of these neat posts)...



The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.


Sounds good to me.




On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow mailto:harlo...@outlook.com>> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the
agents.
That would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not,
who does
a given node ask to know if an agent is online or offline when
making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that its
backend specific and tooz supports varying backends.


However, before (what I assume is) the large code change to
implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the
master branch
a few months ago, processing a heartbeat took an order of
magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent
(~300ms). A
few query optimizations might buy us a lot more headroom before
we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes





Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

joehuang wrote:

Hi, Kevin and Joshua,

In my understanding, Tooz only addresses the issue of agent status
management; but how do we handle the concurrent dynamic load impact at large
scale (for example 100k managed nodes with dynamic load like
security group rule updates, routers_updated, etc.)?


Yes, that is correct; let's not confuse status/liveness management with 
updates... since IMHO they are two very different things (the latter can 
be eventually consistent IMHO, while the liveness 'question' probably 
should not be...).




And one more question: if we have 100k managed nodes, how do we do the
partitioning? Or will all nodes be managed by one Tooz service, like
Zookeeper? Can Zookeeper manage the status of 100k nodes?


I can get u some data/numbers from some studies I've seen, but what u 
are talking about is highly specific as to what u are doing with 
zookeeper... There is no one solution for all the things IMHO; choose 
what's best from your tool-belt for each problem...




Best Regards

Chaoyi Huang ( Joe Huang )

From: Kevin Benton [mailto:blak...@gmail.com]
Sent: Monday, April 13, 2015 3:52 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?


Timestamps are just one way (and likely the most primitive), using redis

(or memcache) key/value and expiry are another (and letting memcache or
redis expire using its own internal algorithms), using zookeeper
ephemeral nodes[1] are another... The point being that its backend
specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT to resiliency of state on node restarts, partitions, etc?

The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.

On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow mailto:harlo...@outlook.com>> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the agents.
That would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not, who does
a given node ask to know if an agent is online or offline when making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using redis
(or memcache) key/value and expiry are another (and letting memcache or
redis expire using its own internal algorithms), using zookeeper
ephemeral nodes[1] are another... The point being that its backend
specific and tooz supports varying backends.


However, before (what I assume is) the large code change to implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the master branch
a few months ago, processing a heartbeat took an order of magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
few query optimizations might buy us a lot more headroom before we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the
latter).

[1]
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes


Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membersh

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

Kevin Benton wrote:

 >Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that its
backend specific and tooz supports varying backends.

Very cool. Is the backend completely transparent so a deployer could
choose a service they are comfortable maintaining, or will that change
the properties WRT to resiliency of state on node restarts, partitions, etc?


Of course... we tried to make it 'completely' transparent, but in 
reality certain backends (zookeeper which uses a paxos-like algorithm 
and redis with sentinel support...) are better (more resilient, more 
consistent, handle partitions/restarts better...) than others (memcached 
is after all just a distributed cache). This is just the nature of the 
game...




The Nova implementation of Tooz seemed pretty straight-forward, although
it looked like it had pluggable drivers for service management already.
Before I dig into it much further I'll file a spec on the Neutron side
to see if I can get some other cores onboard to do the review work if I
push a change to tooz.


Sounds good to me.




On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow mailto:harlo...@outlook.com>> wrote:

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the
agents.
That would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not,
who does
a given node ask to know if an agent is online or offline when
making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using
redis (or memcache) key/value and expiry are another (and letting
memcache or redis expire using its own internal algorithms), using
zookeeper ephemeral nodes[1] are another... The point being that its
backend specific and tooz supports varying backends.


However, before (what I assume is) the large code change to
implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the
master branch
a few months ago, processing a heartbeat took an order of
magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent
(~300ms). A
few query optimizations might buy us a lot more headroom before
we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes




Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread joehuang
Hi, Kevin and Joshua,

In my understanding, Tooz only addresses the issue of agent status management; 
but how do we handle the concurrent dynamic load impact at large scale (for 
example 100k managed nodes with dynamic load like security group rule 
updates, routers_updated, etc.)?

And one more question: if we have 100k managed nodes, how do we do the 
partitioning? Or will all nodes be managed by one Tooz service, like Zookeeper? 
Can Zookeeper manage the status of 100k nodes?

Best Regards
Chaoyi Huang ( Joe Huang )

From: Kevin Benton [mailto:blak...@gmail.com]
Sent: Monday, April 13, 2015 3:52 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

>Timestamps are just one way (and likely the most primitive), using redis (or 
>memcache) key/value and expiry are another (and letting memcache or redis 
>expire using its own internal algorithms), using zookeeper ephemeral nodes[1] 
>are another... The point being that its backend specific and tooz supports 
>varying backends.

Very cool. Is the backend completely transparent so a deployer could choose a 
service they are comfortable maintaining, or will that change the properties 
WRT to resiliency of state on node restarts, partitions, etc?

The Nova implementation of Tooz seemed pretty straight-forward, although it 
looked like it had pluggable drivers for service management already. Before I 
dig into it much further I'll file a spec on the Neutron side to see if I can 
get some other cores onboard to do the review work if I push a change to tooz.


On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow 
mailto:harlo...@outlook.com>> wrote:
Kevin Benton wrote:
So IIUC tooz would be handling the liveness detection for the agents.
That would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not, who does
a given node ask to know if an agent is online or offline when making a
scheduling decision?

Timestamps are just one way (and likely the most primitive), using redis (or 
memcache) key/value and expiry are another (and letting memcache or redis 
expire using its own internal algorithms), using zookeeper ephemeral nodes[1] 
are another... The point being that its backend specific and tooz supports 
varying backends.

However, before (what I assume is) the large code change to implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the master branch
a few months ago, processing a heartbeat took an order of magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
few query optimizations might buy us a lot more headroom before we have
to fall back to large refactors.

Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the latter).

[1] 
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes
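
For illustration, a minimal sketch of the ephemeral-node idea from [1] using the
kazoo client (the ZooKeeper address, znode paths and agent name below are
made-up examples, not anything Neutron ships):

    from kazoo.client import KazooClient

    AGENT_ID = 'l2-agent-host-17'              # made-up agent identifier

    zk = KazooClient(hosts='127.0.0.1:2181')   # assumed local ZooKeeper
    zk.start()

    # Registration: the ephemeral flag ties the znode's lifetime to this
    # session, so a crashed agent disappears without timestamp bookkeeping.
    zk.create('/neutron-demo/agents/%s' % AGENT_ID, b'alive',
              ephemeral=True, makepath=True)

    # A scheduler can simply list the currently-live agents...
    print(zk.get_children('/neutron-demo/agents'))

    # ...or get called back on membership changes instead of polling.
    @zk.ChildrenWatch('/neutron-demo/agents')
    def on_members_changed(children):
        print('live agents: %s' % children)

    zk.stop()   # ends the session; the ephemeral znode is removed automatically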

Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Kevin Benton
>Timestamps are just one way (and likely the most primitive), using redis
(or memcache) key/value and expiry are another (and letting memcache or
redis expire using its own internal algorithms), using zookeeper ephemeral
nodes[1] are another... The point being that its backend specific and tooz
supports varying backends.

Very cool. Is the backend completely transparent so a deployer could choose
a service they are comfortable maintaining, or will that change the
properties WRT to resiliency of state on node restarts, partitions, etc?

The Nova implementation of Tooz seemed pretty straight-forward, although it
looked like it had pluggable drivers for service management already. Before
I dig into it much further I'll file a spec on the Neutron side to see if I
can get some other cores onboard to do the review work if I push a change
to tooz.


On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow  wrote:

> Kevin Benton wrote:
>
>> So IIUC tooz would be handling the liveness detection for the agents.
>> That would be nice to get ride of that logic in Neutron and just
>> register callbacks for rescheduling the dead.
>>
>> Where does it store that state, does it persist timestamps to the DB
>> like Neutron does? If so, how would that scale better? If not, who does
>> a given node ask to know if an agent is online or offline when making a
>> scheduling decision?
>>
>
> Timestamps are just one way (and likely the most primitive), using redis
> (or memcache) key/value and expiry are another (and letting memcache or
> redis expire using its own internal algorithms), using zookeeper ephemeral
> nodes[1] are another... The point being that its backend specific and tooz
> supports varying backends.
>
>
>> However, before (what I assume is) the large code change to implement
>> tooz, I would like to quantify that the heartbeats are actually a
>> bottleneck. When I was doing some profiling of them on the master branch
>> a few months ago, processing a heartbeat took an order of magnitude less
>> time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
>> few query optimizations might buy us a lot more headroom before we have
>> to fall back to large refactors.
>>
>
> Sure, always good to avoid prematurely optimizing things...
>
> Although this is relevant for u I think anyway:
>
> https://review.openstack.org/#/c/138607/ (same thing/nearly same in
> nova)...
>
> https://review.openstack.org/#/c/172502/ (a WIP implementation of the
> latter).
>
> [1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#
> Ephemeral+Nodes
>
>
>> Kevin Benton wrote:
>>
>>
>> One of the most common is the heartbeat from each agent. However, I
>> don't think we can't eliminate them because they are used to determine
>> if the agents are still alive for scheduling purposes. Did you have
>> something else in mind to determine if an agent is alive?
>>
>>
>> Put each agent in a tooz[1] group; have each agent periodically
>> heartbeat[2], have whoever needs to schedule read the active members of
>> that group (or use [3] to get notified via a callback), profit...
>>
>> Pick from your favorite (supporting) driver at:
>>
>> http://docs.openstack.org/developer/tooz/compatibility.html
>>
>> [1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
>> [2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
>> [3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes



-- 
Kevin Benton

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Kevin Benton
>I assumed that all agents are connected to same IP address of RabbitMQ,
then the connection will exceed the port ranges limitation.

Only if the clients are all using the same IP address. If connections
weren't scoped by source IP, busy servers would be completely unreliable
because clients would keep having source port collisions.

For example, the following is a netstat output from a server with two
connections to a service running on port 4000 with both clients using
source port 5: http://paste.openstack.org/show/203211/
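
To make the same point concrete, here is a small self-contained Python sketch
(loopback sockets only; nothing Neutron- or RabbitMQ-specific) showing that the
kernel keys each connection on the full (src ip, src port, dst ip, dst port)
tuple, so one server port can carry connections from many client endpoints:

    import socket

    server = socket.socket()
    server.bind(('127.0.0.1', 0))          # pick an ephemeral server port
    server.listen(5)
    server_addr = server.getsockname()

    clients, accepted = [], []
    for _ in range(2):
        c = socket.socket()
        c.connect(server_addr)             # same destination ip:port every time
        clients.append(c)
        conn, peer = server.accept()
        accepted.append((conn, peer))

    # Every accepted connection shares the same local (server) endpoint but has
    # a distinct remote (client) endpoint, so the 4-tuples differ.
    for conn, peer in accepted:
        print('local %s <- remote %s' % (conn.getsockname(), peer))

    for c in clients:
        c.close()
    for conn, _ in accepted:
        conn.close()
    server.close()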

>the client should be aware of the cluster member failure, and reconnect to
other survive member. No such mechnism has been implemented yet.

If I understand what you are suggesting, it already has been implemented
that way. The neutron agents and servers can be configured with multiple
rabbitmq servers and they will cycle through the list whenever there is a
failure.

The only downside to that approach is that every neutron agent and server
has to be configured with every rabbitmq server address. This gets tedious
to manage if you want to add cluster members dynamically so using a load
balancer can help relieve that.
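
For illustration only, a minimal sketch of that style of configuration using an
oslo.messaging transport URL with a comma-separated host list (the broker
hostnames and credentials below are made up, and the exact options available
can vary between releases):

    from oslo_config import cfg
    import oslo_messaging

    # Two brokers in one URL: the client cycles through the list on failure,
    # so no single broker IP or load balancer is required.
    url = 'rabbit://guest:guest@rabbit1:5672,guest:guest@rabbit2:5672/'

    parsed = oslo_messaging.TransportURL.parse(cfg.CONF, url)
    print([(h.hostname, h.port) for h in parsed.hosts])   # both brokers known

    transport = oslo_messaging.get_transport(cfg.CONF, url)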

Hi, Kevin,



I assumed that all agents are connected to the same IP address of
RabbitMQ, in which case the connections would exceed the port range limitation.



For a RabbitMQ cluster, the client can certainly connect to any member of
the cluster, but in this case the client has to be designed in a
fail-safe manner: the client should be aware of a cluster member's failure,
and reconnect to another surviving member. No such mechanism has
been implemented yet.



Another way is to use an LVS- or DNS-based load balancer, or something else.
If you put one load balancer in front of the cluster, then we have to take care
of the port number limitation: there are so many agents that will require
connections concurrently, at the 100k level, and the requests cannot be rejected.



Best Regards



Chaoyi Huang ( joehuang )


 --
From: Kevin Benton [blak...@gmail.com]
Sent: 12 April 2015 9:59
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

  The TCP/IP stack keeps track of connections as a combination of IP + TCP
port. The two byte port limit doesn't matter unless all of the agents are
connecting from the same IP address, which shouldn't be the case unless
compute nodes connect to the rabbitmq server via one IP address running
port address translation.

 Either way, the agents don't connect directly to the Neutron server, they
connect to the rabbit MQ cluster. Since as many Neutron server processes
can be launched as necessary, the bottlenecks will likely show up at the
messaging or DB layer.

On Sat, Apr 11, 2015 at 6:46 PM, joehuang  wrote:

>  As Kevin talking about agents, I want to remind that in TCP/IP stack,
> port ( not Neutron Port ) is a two bytes field, i.e. port ranges from 0 ~
> 65535, supports maximum 64k port number.
>
>
>
> " above 100k managed node " means more than 100k L2 agents/L3 agents...
> will be alive under Neutron.
>
>
>
> Want to know the detail design how to support 99.9% possibility for
> scaling Neutron in this way, and PoC and test would be a good support for
> this idea.
>
>
>
> "I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine."
>
>
>
> Best Regards
>
>
>
> Chaoyi Huang ( joehuang )
>
>
>  --
> From: Kevin Benton [blak...@gmail.com]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
>Which periodic updates did you have in mind to eliminate? One of the
> few remaining ones I can think of is sync_routers but it would be great if
> you can enumerate the ones you observed because eliminating overhead in
> agents is something I've been working on as well.
>
>  One of the most common is the heartbeat from each agent. However, I
> don't think we can't eliminate them because they are used to determine if
> the agents are still alive for scheduling purposes. Did you have something
> else in mind to determine if an agent is alive?
>
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas 
> wrote:
>
>> I'm 99.9% sure, for scaling above 100k managed node,
>> we do not really need to split the openstack to multiple smaller
>> openstack,
>> or use significant number of extra controller machine.
>>
>> The problem is openstack using the right tools SQL/AMQP/(zk),
>> but in a wrong way.
>>
>> For example.:
>> Perio

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread Joshua Harlow

Kevin Benton wrote:

So IIUC tooz would be handling the liveness detection for the agents.
That would be nice to get rid of that logic in Neutron and just
register callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB
like Neutron does? If so, how would that scale better? If not, who does
a given node ask to know if an agent is online or offline when making a
scheduling decision?


Timestamps are just one way (and likely the most primitive), using redis 
(or memcache) key/value and expiry are another (and letting memcache or 
redis expire using its own internal algorithms), using zookeeper 
ephemeral nodes[1] are another... The point being that its backend 
specific and tooz supports varying backends.
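
As an illustration of the "key/value and expiry" option mentioned above, a
minimal sketch using the redis-py client (the key names, TTL and agent id are
made up; this is not Neutron code):

    import time
    import redis

    AGENT_ID = 'l3-agent-host-42'   # made-up agent identifier
    TTL = 30                        # seconds before an unrefreshed agent counts as dead

    r = redis.StrictRedis(host='localhost', port=6379)

    def heartbeat_once():
        # Setting the key with ex=TTL lets redis expire it on its own; no
        # timestamp comparison is needed on the scheduler side.
        r.set('agent-alive:%s' % AGENT_ID, '1', ex=TTL)

    def agent_is_alive(agent_id):
        return bool(r.exists('agent-alive:%s' % agent_id))

    if __name__ == '__main__':
        heartbeat_once()
        print(agent_is_alive(AGENT_ID))   # True while the TTL has not elapsed
        time.sleep(TTL + 1)
        print(agent_is_alive(AGENT_ID))   # False once redis has expired the key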




However, before (what I assume is) the large code change to implement
tooz, I would like to quantify that the heartbeats are actually a
bottleneck. When I was doing some profiling of them on the master branch
a few months ago, processing a heartbeat took an order of magnitude less
time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
few query optimizations might buy us a lot more headroom before we have
to fall back to large refactors.


Sure, always good to avoid prematurely optimizing things...

Although this is relevant for u I think anyway:

https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...

https://review.openstack.org/#/c/172502/ (a WIP implementation of the 
latter).


[1] 
https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes




Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes





Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-12 Thread joehuang
Hi, Kevin,



I assumed that all agents are connected to the same IP address of RabbitMQ, in 
which case the connections would exceed the port range limitation.



For a RabbitMQ cluster, the client can certainly connect to any member of the 
cluster, but in this case the client has to be designed in a fail-safe manner: 
the client should be aware of a cluster member's failure, and reconnect to 
another surviving member. No such mechanism has been implemented yet.



Another way is to use an LVS- or DNS-based load balancer, or something else. If 
you put one load balancer in front of the cluster, then we have to take care of 
the port number limitation: there are so many agents that will require 
connections concurrently, at the 100k level, and the requests cannot be rejected.



Best Regards



Chaoyi Huang ( joehuang )




From: Kevin Benton [blak...@gmail.com]
Sent: 12 April 2015 9:59
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

The TCP/IP stack keeps track of connections as a combination of IP + TCP port. 
The two byte port limit doesn't matter unless all of the agents are connecting 
from the same IP address, which shouldn't be the case unless compute nodes 
connect to the rabbitmq server via one IP address running port address 
translation.

Either way, the agents don't connect directly to the Neutron server, they 
connect to the rabbit MQ cluster. Since as many Neutron server processes can be 
launched as necessary, the bottlenecks will likely show up at the messaging or 
DB layer.

On Sat, Apr 11, 2015 at 6:46 PM, joehuang 
mailto:joehu...@huawei.com>> wrote:

As Kevin is talking about agents, I want to point out that in the TCP/IP stack 
a port (not a Neutron port) is a two-byte field, i.e. ports range from 0 to 
65535, supporting a maximum of 64k port numbers.



" above 100k managed node " means more than 100k L2 agents/L3 agents... will be 
alive under Neutron.



I would like to know the detailed design for how to support the "99.9%" claim 
for scaling Neutron in this way; a PoC and tests would be good support for this 
idea.



"I'm 99.9% sure, for scaling above 100k managed node,
we do not really need to split the openstack to multiple smaller openstack,
or use significant number of extra controller machine."



Best Regards



Chaoyi Huang ( joehuang )




From: Kevin Benton [blak...@gmail.com<mailto:blak...@gmail.com>]
Sent: 11 April 2015 12:34
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Which periodic updates did you have in mind to eliminate? One of the few 
remaining ones I can think of is sync_routers but it would be great if you can 
enumerate the ones you observed because eliminating overhead in agents is 
something I've been working on as well.

One of the most common is the heartbeat from each agent. However, I don't think 
we can eliminate them, because they are used to determine if the agents are 
still alive for scheduling purposes. Did you have something else in mind to 
determine if an agent is alive?

On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas 
mailto:afaze...@redhat.com>> wrote:
I'm 99.9% sure, for scaling above 100k managed node,
we do not really need to split the openstack to multiple smaller openstack,
or use significant number of extra controller machine.

The problem is that openstack is using the right tools SQL/AMQP/(zk),
but in the wrong way.

For example.:
Periodic updates can be avoided almost in all cases

The new data can be pushed to the agent just when it is needed.
The agent can know when the AMQP connection becomes unreliable (queue or 
connection loss), and then needs to do a full sync.
https://bugs.launchpad.net/neutron/+bug/1438159

Also, when the agents get some notification, they start asking for details via 
AMQP -> SQL. Why do they not know it already, or get it with the notification?
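
For illustration, a rough sketch of the push-plus-full-resync pattern being
described; 'connection', 'fetch_everything' and 'apply' are hypothetical
placeholders, not Neutron or oslo.messaging APIs:

    def agent_loop(connection, fetch_everything, apply):
        needs_full_sync = True          # first start always resyncs
        while True:
            if needs_full_sync:
                apply(fetch_everything())       # one RPC/SQL round-trip
                needs_full_sync = False
            try:
                # The notification carries the new state, so no follow-up
                # "ask the server for details" round-trip is required.
                notification = connection.consume(timeout=30)
                if notification is not None:
                    apply(notification.payload)
            except ConnectionError:
                # Queue or connection was lost: updates may have been missed,
                # so schedule a full sync after reconnecting.
                connection.reconnect()
                needs_full_sync = True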


- Original Message -
> From: "Neil Jerram" 
> mailto:neil.jer...@metaswitch.com>>
> To: "OpenStack Development Mailing List (not for usage questions)" 
> mailto:openstack-dev@lists.openstack.org>>
> Sent: Thursday, April 9, 2015 5:01:45 PM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
> Hi Joe,
>
> Many thanks for your reply!
>
> On 09/04/15 03:34, joehuang wrote:
> > Hi, Neil,
> >
> >  From theoretic, Neutron is like a "broadcast" domain, for example,
> >  enforcement of DVR and security group has to touch each regarding host
> >  where there is VM of this project resides. Even using SDN controller, the
> >  "touch" to regarding host is inevitable. If there are plenty of physical
> >  hosts, for example, 10k, inside one Neutron, it's very hard to overcome

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-11 Thread Kevin Benton
So IIUC tooz would be handling the liveness detection for the agents. That
would be nice to get rid of that logic in Neutron and just register
callbacks for rescheduling the dead.

Where does it store that state, does it persist timestamps to the DB like
Neutron does? If so, how would that scale better? If not, who does a given
node ask to know if an agent is online or offline when making a scheduling
decision?

However, before (what I assume is) the large code change to implement tooz,
I would like to quantify that the heartbeats are actually a bottleneck.
When I was doing some profiling of them on the master branch a few months
ago, processing a heartbeat took an order of magnitude less time (<50ms)
than the 'sync routers' task of the l3 agent (~300ms). A few query
optimizations might buy us a lot more headroom before we have to fall back
to large refactors.
Kevin Benton wrote:

>
> One of the most common is the heartbeat from each agent. However, I
> don't think we can't eliminate them because they are used to determine
> if the agents are still alive for scheduling purposes. Did you have
> something else in mind to determine if an agent is alive?
>

Put each agent in a tooz[1] group; have each agent periodically
heartbeat[2], have whoever needs to schedule read the active members of
that group (or use [3] to get notified via a callback), profit...

Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] http://docs.openstack.org/developer/tooz/tutorial/group_
membership.html#watching-group-changes




Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-11 Thread Kevin Benton
The TCP/IP stack keeps track of connections as a combination of IP + TCP
port. The two byte port limit doesn't matter unless all of the agents are
connecting from the same IP address, which shouldn't be the case unless
compute nodes connect to the rabbitmq server via one IP address running
port address translation.

Either way, the agents don't connect directly to the Neutron server, they
connect to the rabbit MQ cluster. Since as many Neutron server processes
can be launched as necessary, the bottlenecks will likely show up at the
messaging or DB layer.

On Sat, Apr 11, 2015 at 6:46 PM, joehuang  wrote:

>  As Kevin talking about agents, I want to remind that in TCP/IP stack,
> port ( not Neutron Port ) is a two bytes field, i.e. port ranges from 0 ~
> 65535, supports maximum 64k port number.
>
>
>
> " above 100k managed node " means more than 100k L2 agents/L3 agents...
> will be alive under Neutron.
>
>
>
> Want to know the detail design how to support 99.9% possibility for
> scaling Neutron in this way, and PoC and test would be a good support for
> this idea.
>
>
>
> "I'm 99.9% sure, for scaling above 100k managed node,
> we do not really need to split the openstack to multiple smaller openstack,
> or use significant number of extra controller machine."
>
>
>
> Best Regards
>
>
>
> Chaoyi Huang ( joehuang )
>
>
>  --
> From: Kevin Benton [blak...@gmail.com]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
>   Which periodic updates did you have in mind to eliminate? One of the
> few remaining ones I can think of is sync_routers but it would be great if
> you can enumerate the ones you observed because eliminating overhead in
> agents is something I've been working on as well.
>
>  One of the most common is the heartbeat from each agent. However, I
> don't think we can't eliminate them because they are used to determine if
> the agents are still alive for scheduling purposes. Did you have something
> else in mind to determine if an agent is alive?
>
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas 
> wrote:
>
>> I'm 99.9% sure, for scaling above 100k managed node,
>> we do not really need to split the openstack to multiple smaller
>> openstack,
>> or use significant number of extra controller machine.
>>
>> The problem is openstack using the right tools SQL/AMQP/(zk),
>> but in a wrong way.
>>
>> For example.:
>> Periodic updates can be avoided almost in all cases
>>
>> The new data can be pushed to the agent just when it needed.
>> The agent can know when the AMQP connection become unreliable (queue or
>> connection loose),
>> and needs to do full sync.
>> https://bugs.launchpad.net/neutron/+bug/1438159
>>
>> Also the agents when gets some notification, they start asking for
>> details via the
>> AMQP -> SQL. Why they do not know it already or get it with the
>> notification ?
>>
>>
>> - Original Message -
>> > From: "Neil Jerram" 
>>  > To: "OpenStack Development Mailing List (not for usage questions)" <
>> openstack-dev@lists.openstack.org>
>> > Sent: Thursday, April 9, 2015 5:01:45 PM
>> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>> >
>> > Hi Joe,
>> >
>> > Many thanks for your reply!
>> >
>> > On 09/04/15 03:34, joehuang wrote:
>> > > Hi, Neil,
>> > >
>> > >  From theoretic, Neutron is like a "broadcast" domain, for example,
>> > >  enforcement of DVR and security group has to touch each regarding
>> host
>> > >  where there is VM of this project resides. Even using SDN
>> controller, the
>> > >  "touch" to regarding host is inevitable. If there are plenty of
>> physical
>> > >  hosts, for example, 10k, inside one Neutron, it's very hard to
>> overcome
>> > >  the "broadcast storm" issue under concurrent operation, that's the
>> > >  bottleneck for scalability of Neutron.
>> >
>> > I think I understand that in general terms - but can you be more
>> > specific about the broadcast storm?  Is there one particular message
>> > exchange that involves broadcasting?  Is it only from the server to
>> > agents, or are there 'broadcasts' in other directions as well?
>> >
>> > (I presume you are talking about 

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-11 Thread joehuang
As Kevin is talking about agents, I want to point out that in the TCP/IP stack 
a port (not a Neutron port) is a two-byte field, i.e. ports range from 0 to 
65535, supporting a maximum of 64k port numbers.



" above 100k managed node " means more than 100k L2 agents/L3 agents... will be 
alive under Neutron.



I would like to know the detailed design for how to support the "99.9%" claim 
for scaling Neutron in this way; a PoC and tests would be good support for this 
idea.



"I'm 99.9% sure, for scaling above 100k managed node,
we do not really need to split the openstack to multiple smaller openstack,
or use significant number of extra controller machine."



Best Regards



Chaoyi Huang ( joehuang )




From: Kevin Benton [blak...@gmail.com]
Sent: 11 April 2015 12:34
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Which periodic updates did you have in mind to eliminate? One of the few 
remaining ones I can think of is sync_routers but it would be great if you can 
enumerate the ones you observed because eliminating overhead in agents is 
something I've been working on as well.

One of the most common is the heartbeat from each agent. However, I don't think 
we can eliminate them, because they are used to determine if the agents are 
still alive for scheduling purposes. Did you have something else in mind to 
determine if an agent is alive?

On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas 
mailto:afaze...@redhat.com>> wrote:
I'm 99.9% sure, for scaling above 100k managed node,
we do not really need to split the openstack to multiple smaller openstack,
or use significant number of extra controller machine.

The problem is that openstack is using the right tools SQL/AMQP/(zk),
but in the wrong way.

For example.:
Periodic updates can be avoided almost in all cases

The new data can be pushed to the agent just when it is needed.
The agent can know when the AMQP connection becomes unreliable (queue or 
connection loss), and then needs to do a full sync.
https://bugs.launchpad.net/neutron/+bug/1438159

Also, when the agents get some notification, they start asking for details via 
AMQP -> SQL. Why do they not know it already, or get it with the notification?


- Original Message -
> From: "Neil Jerram" 
> mailto:neil.jer...@metaswitch.com>>
> To: "OpenStack Development Mailing List (not for usage questions)" 
> mailto:openstack-dev@lists.openstack.org>>
> Sent: Thursday, April 9, 2015 5:01:45 PM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
> Hi Joe,
>
> Many thanks for your reply!
>
> On 09/04/15 03:34, joehuang wrote:
> > Hi, Neil,
> >
> >  From theoretic, Neutron is like a "broadcast" domain, for example,
> >  enforcement of DVR and security group has to touch each regarding host
> >  where there is VM of this project resides. Even using SDN controller, the
> >  "touch" to regarding host is inevitable. If there are plenty of physical
> >  hosts, for example, 10k, inside one Neutron, it's very hard to overcome
> >  the "broadcast storm" issue under concurrent operation, that's the
> >  bottleneck for scalability of Neutron.
>
> I think I understand that in general terms - but can you be more
> specific about the broadcast storm?  Is there one particular message
> exchange that involves broadcasting?  Is it only from the server to
> agents, or are there 'broadcasts' in other directions as well?
>
> (I presume you are talking about control plane messages here, i.e.
> between Neutron components.  Is that right?  Obviously there can also be
> broadcast storm problems in the data plane - but I don't think that's
> what you are talking about here.)
>
> > We need layered architecture in Neutron to solve the "broadcast domain"
> > bottleneck of scalability. The test report from OpenStack cascading shows
> > that through layered architecture "Neutron cascading", Neutron can
> > supports up to million level ports and 100k level physical hosts. You can
> > find the report here:
> > http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
>
> Many thanks, I will take a look at this.
>
> > "Neutron cascading" also brings extra benefit: One cascading Neutron can
> > have many cascaded Neutrons, and different cascaded Neutron can leverage
> > different SDN controller, maybe one is ODL, the other one is OpenContrail.
> >
> > Cascading Neutron---
> >  / \
> > --cascaded Neutron--   --cascaded Neutron-
> >   

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-11 Thread Joshua Harlow

Kevin Benton wrote:


One of the most common is the heartbeat from each agent. However, I
don't think we can eliminate them, because they are used to determine
if the agents are still alive for scheduling purposes. Did you have
something else in mind to determine if an agent is alive?


Put each agent in a tooz[1] group; have each agent periodically 
heartbeat[2], have whoever needs to schedule read the active members of 
that group (or use [3] to get notified via a callback), profit...


Pick from your favorite (supporting) driver at:

http://docs.openstack.org/developer/tooz/compatibility.html

[1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
[2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
[3] 
http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes
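
For illustration, a minimal sketch of this pattern against the tooz API (the
group and member names are made up, and the in-memory 'zake' test driver is
assumed just so the snippet runs without a real backend):

    from tooz import coordination

    coord = coordination.get_coordinator('zake://', b'l3-agent-host-7')
    coord.start()

    group = b'neutron-agents'
    try:
        coord.create_group(group).get()
    except coordination.GroupAlreadyExist:
        pass
    coord.join_group(group).get()

    # An agent would call heartbeat() periodically; the backend expires
    # members that stop doing so.
    coord.heartbeat()

    # A scheduler reads the live members instead of comparing DB timestamps.
    print(coord.get_members(group).get())

    coord.leave_group(group).get()
    coord.stop()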





Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-10 Thread Kevin Benton
Which periodic updates did you have in mind to eliminate? One of the few
remaining ones I can think of is sync_routers but it would be great if you
can enumerate the ones you observed because eliminating overhead in agents
is something I've been working on as well.

One of the most common is the heartbeat from each agent. However, I don't
think we can eliminate them, because they are used to determine if the
agents are still alive for scheduling purposes. Did you have something else
in mind to determine if an agent is alive?
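
For context, the liveness check those heartbeats feed is conceptually just a
timestamp comparison. A rough sketch, ignoring Neutron's actual models, config
options and helper functions (the 75 second threshold is an assumption for
illustration):

# Illustrative only: the gist of timestamp-based agent liveness, independent
# of Neutron's real schema and helpers.
from datetime import datetime, timedelta

AGENT_DOWN_TIME = timedelta(seconds=75)  # assumed threshold for this sketch

def agent_is_alive(last_heartbeat, now=None):
    """An agent is considered alive if it has reported recently enough."""
    now = now or datetime.utcnow()
    return now - last_heartbeat < AGENT_DOWN_TIME

# A scheduler only considers agents for which agent_is_alive(...) is True,
# which is why the heartbeats can't simply be dropped.
print(agent_is_alive(datetime.utcnow() - timedelta(seconds=30)))  # True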

On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas  wrote:

> I'm 99.9% sure that, to scale above 100k managed nodes,
> we do not really need to split OpenStack into multiple smaller OpenStacks,
> or use a significant number of extra controller machines.
>
> The problem is that OpenStack uses the right tools (SQL/AMQP/zk),
> but in the wrong way.
>
> For example: periodic updates can be avoided in almost all cases.
>
> The new data can be pushed to the agent only when it is needed.
> The agent can know when the AMQP connection becomes unreliable (queue or
> connection loss), and then it needs to do a full sync.
> https://bugs.launchpad.net/neutron/+bug/1438159
>
> Also, when the agents get a notification, they start asking for the details
> via AMQP -> SQL. Why don't they already know them, or get them with the
> notification?
>
>
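As an aside on the "push the data with the notification" point quoted above:
with oslo.messaging that is essentially a fanout cast that carries the full
payload, so agents don't have to turn around and query the server. A
hypothetical sketch follows; the topic, method and payload names are invented
for illustration and are not Neutron's real RPC interface:

# Hypothetical sketch of pushing the data with the notification itself,
# using oslo.messaging fanout casts. Names are made up for illustration.
import time

from oslo_config import cfg
import oslo_messaging

# The in-memory 'fake' driver keeps the sketch self-contained; a real
# deployment would point this at its rabbit:// (or other) transport_url.
transport = oslo_messaging.get_transport(cfg.CONF, url='fake://')

class AgentUpdatesEndpoint(object):
    # Agent side: apply the pushed data directly, so there is no follow-up
    # "ask the server for details" round trip over AMQP -> SQL.
    def port_updated(self, ctxt, port):
        print('applying update for port %s' % port['id'])

server = oslo_messaging.get_rpc_server(
    transport,
    oslo_messaging.Target(topic='agent-updates', server='compute-1'),
    [AgentUpdatesEndpoint()])
server.start()  # in a real agent this runs for the life of the process

# Server side: broadcast the *full* updated object to every listening agent.
client = oslo_messaging.RPCClient(
    transport, oslo_messaging.Target(topic='agent-updates', fanout=True))
client.cast({}, 'port_updated',
            port={'id': 'abc123', 'status': 'ACTIVE',
                  'security_groups': ['default']})

time.sleep(0.5)  # give the in-memory driver a moment to deliver
server.stop()
server.wait()
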
> - Original Message -
> > From: "Neil Jerram" 
> > To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev@lists.openstack.org>
> > Sent: Thursday, April 9, 2015 5:01:45 PM
> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> >
> > Hi Joe,
> >
> > Many thanks for your reply!
> >
> > On 09/04/15 03:34, joehuang wrote:
> > > Hi, Neil,
> > >
> > >  In theory, Neutron is like a "broadcast" domain: for example,
> > >  enforcement of DVR and security groups has to touch every host where a
> > >  VM of the project resides. Even with an SDN controller, that "touch" of
> > >  each host is inevitable. If there are plenty of physical hosts, for
> > >  example 10k, inside one Neutron, it's very hard to overcome the
> > >  "broadcast storm" issue under concurrent operation; that's the
> > >  bottleneck for the scalability of Neutron.
> >
> > I think I understand that in general terms - but can you be more
> > specific about the broadcast storm?  Is there one particular message
> > exchange that involves broadcasting?  Is it only from the server to
> > agents, or are there 'broadcasts' in other directions as well?
> >
> > (I presume you are talking about control plane messages here, i.e.
> > between Neutron components.  Is that right?  Obviously there can also be
> > broadcast storm problems in the data plane - but I don't think that's
> > what you are talking about here.)
> >
> > > We need a layered architecture in Neutron to solve the "broadcast domain"
> > > scalability bottleneck. The test report for OpenStack cascading shows
> > > that, through the layered architecture "Neutron cascading", Neutron can
> > > support up to a million ports and 100k physical hosts. You can
> > > find the report here:
> > > http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
> >
> > Many thanks, I will take a look at this.
> >
> > > "Neutron cascading" also brings extra benefit: One cascading Neutron
> can
> > > have many cascaded Neutrons, and different cascaded Neutron can
> leverage
> > > different SDN controller, maybe one is ODL, the other one is
> OpenContrail.
> > >
> > > Cascading Neutron---
> > >  / \
> > > --cascaded Neutron--   --cascaded Neutron-
> > > |  |
> > > -ODL--   OpenContrail
> > >
> > >
> > > And furthermore, if Neutron cascading is used across multiple data
> > > centers, the DCI controller (data center inter-connection controller)
> > > can also be used under the cascading Neutron, to provide NaaS (network
> > > as a service) across data centers.
> > >
> > >            ------------ Cascading Neutron ------------
> > >           /                    |                      \
> > >   --cascaded Neutron--  --DCI controller--     --cascaded Neutron--
> > >            |                   |                      |
> > >         --ODL--                |               --OpenContrail--
> > >                                |
> > >   --(Data center 1)--  --(DCI networking)--    --(Data center 2)--

Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-10 Thread joehuang
Hi, Neil, 

See inline comments.

Best Regards

Chaoyi Huang


From: Neil Jerram [neil.jer...@metaswitch.com]
Sent: 09 April 2015 23:01
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

Hi Joe,

Many thanks for your reply!

On 09/04/15 03:34, joehuang wrote:
> Hi, Neil,
>
> In theory, Neutron is like a "broadcast" domain: for example, enforcement
> of DVR and security groups has to touch every host where a VM of the
> project resides. Even with an SDN controller, that "touch" of each host is
> inevitable. If there are plenty of physical hosts, for example 10k, inside
> one Neutron, it's very hard to overcome the "broadcast storm" issue under
> concurrent operation; that's the bottleneck for the scalability of Neutron.

I think I understand that in general terms - but can you be more
specific about the broadcast storm?  Is there one particular message
exchange that involves broadcasting?  Is it only from the server to
agents, or are there 'broadcasts' in other directions as well?

[[joehuang]] For example: L2 population, security group rule updates, DVR route
updates. Broadcasts happen in both directions, depending on the scenario.

(I presume you are talking about control plane messages here, i.e.
between Neutron components.  Is that right?  Obviously there can also be
broadcast storm problems in the data plane - but I don't think that's
what you are talking about here.)

[[joehuang]] Yes, the control plane here.


> We need a layered architecture in Neutron to solve the "broadcast domain"
> scalability bottleneck. The test report for OpenStack cascading shows
> that, through the layered architecture "Neutron cascading", Neutron can
> support up to a million ports and 100k physical hosts. You can find the
> report here:
> http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

Many thanks, I will take a look at this.

> "Neutron cascading" also brings extra benefit: One cascading Neutron can have 
> many cascaded Neutrons, and different cascaded Neutron can leverage different 
> SDN controller, maybe one is ODL, the other one is OpenContrail.
>
> Cascading Neutron---
>  / \
> --cascaded Neutron--   --cascaded Neutron-
> |  |
> -ODL--   OpenContrail
>
>
> And furthermore, if Neutron cascading is used across multiple data centers,
> the DCI controller (data center inter-connection controller) can also be
> used under the cascading Neutron, to provide NaaS (network as a service)
> across data centers.
>
>            ------------ Cascading Neutron ------------
>           /                    |                      \
>   --cascaded Neutron--  --DCI controller--     --cascaded Neutron--
>            |                   |                      |
>         --ODL--                |               --OpenContrail--
>                                |
>   --(Data center 1)--  --(DCI networking)--    --(Data center 2)--
>
> Is it possible for us to discuss this in OpenStack Vancouver summit?

Most certainly, yes.  I will be there from mid Monday afternoon through
to end Friday.  But it will be my first summit, so I have no idea yet as
to how I might run into you - please can you suggest!

[[joehuang]] I will also attend the summit for the whole week, sometimes in the
OPNFV sessions, sometimes in the OpenStack ones. Let me see how we can meet.

> Best Regards
> Chaoyi Huang ( Joe Huang )

Regards,
Neil



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-09 Thread Neil Jerram

Hi Mike,

Many thanks for your reply!

On 08/04/15 17:56, Mike Spreitzer wrote:

Are you looking at scaling the numbers of tenants, Neutron routers, and
tenant networks as you scale hosts and guests?  I think this is a
plausible way to grow.  The compartmentalization that comes with
growing those things may make a difference in results.


Are you thinking of control plane or data plane limits?  In my email I 
was thinking of control plane points, such as


- how many compute host agents can communicate with the Neutron server

- how many Neutron server instances or threads are needed

- whether there are any limits associated with the Neutron DB (unlikely 
I guess).


Does the use of tenant networks and routers affect those points, in your 
experience?  That would be less obvious to me than simply how many 
compute hosts or Neutron servers there are.


On the data plane side - if that was more what you meant - I can 
certainly see the limits there and how they are alleviated by using 
tenant networks and routers, in the L2 model.  FWIW, my project Calico 
[1] tries to avoid those by not providing a L2 domain at all - which can 
make sense for workloads that only require or provide IP services - and 
instead routing data through the fabric.


To answer your question, then, no, I wasn't thinking of scaling tenant 
networks and routers, per your suggestion, because Calico doesn't do 
things that way (or alternatively because Calico already routes 
everywhere), and because I didn't think that would be relevant to the 
control plane scaling that I had in mind.  But I may be missing 
something, so please do say if so.


Many thanks,
Neil


[1] http://www.projectcalico.org/



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-09 Thread Neil Jerram

Hi Joe,

Many thanks for your reply!

On 09/04/15 03:34, joehuang wrote:

Hi, Neil,

In theory, Neutron is like a "broadcast" domain: for example, enforcement of DVR and security
groups has to touch every host where a VM of the project resides. Even with an SDN controller,
that "touch" of each host is inevitable. If there are plenty of physical hosts, for example
10k, inside one Neutron, it's very hard to overcome the "broadcast storm" issue under concurrent
operation; that's the bottleneck for the scalability of Neutron.


I think I understand that in general terms - but can you be more 
specific about the broadcast storm?  Is there one particular message 
exchange that involves broadcasting?  Is it only from the server to 
agents, or are there 'broadcasts' in other directions as well?


(I presume you are talking about control plane messages here, i.e. 
between Neutron components.  Is that right?  Obviously there can also be 
broadcast storm problems in the data plane - but I don't think that's 
what you are talking about here.)



We need a layered architecture in Neutron to solve the "broadcast domain" scalability
bottleneck. The test report for OpenStack cascading shows that, through the layered
architecture "Neutron cascading", Neutron can support up to a million ports and 100k
physical hosts. You can find the report here:
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers


Many thanks, I will take a look at this.


"Neutron cascading" also brings extra benefit: One cascading Neutron can have 
many cascaded Neutrons, and different cascaded Neutron can leverage different SDN 
controller, maybe one is ODL, the other one is OpenContrail.

Cascading Neutron---
 / \
--cascaded Neutron--   --cascaded Neutron-
|  |
-ODL--   OpenContrail


And furthermore, if Neutron cascading is used across multiple data centers, the DCI
controller (data center inter-connection controller) can also be used under the
cascading Neutron, to provide NaaS (network as a service) across data centers.

           ------------ Cascading Neutron ------------
          /                    |                      \
  --cascaded Neutron--  --DCI controller--     --cascaded Neutron--
           |                   |                      |
        --ODL--                |               --OpenContrail--
                               |
  --(Data center 1)--  --(DCI networking)--    --(Data center 2)--

Is it possible for us to discuss this in OpenStack Vancouver summit?


Most certainly, yes.  I will be there from mid Monday afternoon through 
to end Friday.  But it will be my first summit, so I have no idea yet as 
to how I might run into you - please can you suggest!



Best Regards
Chaoyi Huang ( Joe Huang )


Regards,
Neil



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-08 Thread joehuang
Hi, Neil,

In theory, Neutron is like a "broadcast" domain: for example, enforcement
of DVR and security groups has to touch every host where a VM of the
project resides. Even with an SDN controller, that "touch" of each host is
inevitable. If there are plenty of physical hosts, for example 10k,
inside one Neutron, it's very hard to overcome the "broadcast storm" issue
under concurrent operation; that's the bottleneck for the scalability of Neutron.

We need a layered architecture in Neutron to solve the "broadcast domain"
scalability bottleneck. The test report for OpenStack cascading shows that,
through the layered architecture "Neutron cascading", Neutron can support up
to a million ports and 100k physical hosts. You can find the report here:
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers

"Neutron cascading" also brings extra benefit: One cascading Neutron can have 
many cascaded Neutrons, and different cascaded Neutron can leverage different 
SDN controller, maybe one is ODL, the other one is OpenContrail.

Cascading Neutron---
/ \
--cascaded Neutron--   --cascaded Neutron-
   |  |
-ODL--   OpenContrail


And furthermore, if Neutron cascading is used across multiple data centers, the DCI
controller (data center inter-connection controller) can also be used under the
cascading Neutron, to provide NaaS (network as a service) across data centers.

           ------------ Cascading Neutron ------------
          /                    |                      \
  --cascaded Neutron--  --DCI controller--     --cascaded Neutron--
           |                   |                      |
        --ODL--                |               --OpenContrail--
                               |
  --(Data center 1)--  --(DCI networking)--    --(Data center 2)--

Is it possible for us to discuss this in OpenStack Vancouver summit?

Best Regards
Chaoyi Huang ( Joe Huang )


-Original Message-
From: Neil Jerram [mailto:neil.jer...@metaswitch.com] 
Sent: Thursday, April 09, 2015 12:27 AM
To: openstack-dev@lists.openstack.org
Subject: [openstack-dev] [neutron] Neutron scaling datapoints?

My team is working on experiments looking at how far the Neutron server will 
scale, with increasing numbers of compute hosts and VMs.  Does anyone have any 
datapoints on this that they can share?  Or any clever hints?

I'm already aware of the following ones:

https://javacruft.wordpress.com/2014/06/18/168k-instances/
 Icehouse
 118 compute hosts
 80 Neutron server processes (10 per core on each of 8 cores, on the  
controller node)
 27,000 VMs - but only after disabling all security/iptables

http://www.opencontrail.org/openstack-neutron-at-scale/
 1000 hosts
 5000 VMs
 3 Neutron servers (via a load balancer)
 But doesn't describe if any specific configuration is needed for this.
 (Other than using OpenContrail! :-))

Many thanks!
 Neil



Re: [openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-08 Thread Mike Spreitzer
Are you looking at scaling the numbers of tenants, Neutron routers, and 
tenant networks as you scale hosts and guests?  I think this is a 
plausible way to grow.  The compartmentalization that comes with growing
those things may make a difference in results.

Thanks,
Mike



From:   Neil Jerram 
To: 
Date:   04/08/2015 12:29 PM
Subject:[openstack-dev] [neutron] Neutron scaling datapoints?



My team is working on experiments looking at how far the Neutron server
will scale, with increasing numbers of compute hosts and VMs.  Does
anyone have any datapoints on this that they can share?  Or any clever
hints?

I'm already aware of the following ones:

https://javacruft.wordpress.com/2014/06/18/168k-instances/
 Icehouse
 118 compute hosts
 80 Neutron server processes (10 per core on each of 8 cores, on the
 controller node)
 27,000 VMs - but only after disabling all security/iptables

http://www.opencontrail.org/openstack-neutron-at-scale/
 1000 hosts
 5000 VMs
 3 Neutron servers (via a load balancer)
 But doesn't describe if any specific configuration is needed for this.
 (Other than using OpenContrail! :-))

Many thanks!
 Neil



[openstack-dev] [neutron] Neutron scaling datapoints?

2015-04-08 Thread Neil Jerram
My team is working on experiments looking at how far the Neutron server
will scale, with increasing numbers of compute hosts and VMs.  Does
anyone have any datapoints on this that they can share?  Or any clever
hints?

I'm already aware of the following ones:

https://javacruft.wordpress.com/2014/06/18/168k-instances/
 Icehouse
 118 compute hosts
 80 Neutron server processes (10 per core on each of 8 cores, on the
 controller node)
 27,000 VMs - but only after disabling all security/iptables

http://www.opencontrail.org/openstack-neutron-at-scale/
 1000 hosts
 5000 VMs
 3 Neutron servers (via a load balancer)
 But doesn't describe if any specific configuration is needed for this.
 (Other than using OpenContrail! :-))

Many thanks!
 Neil
