Public bug reported:

The conductor is consuming messages from a single queue, which has performance limitations for several reasons:
- per-queue lock
- some brokers also limit parts of the message handling to a single CPU thread per queue
- multiple broker instances need to synchronise the queue content, which causes additional delays due to the TCP request/response times

The single-queue limitation is far more restrictive than the limits imposed by a
single mysql server, and the gap is even worse when you consider slave
reads.

This can be worked around by explicitly or implicitly distributing the rpc
calls across multiple queues; a sketch follows.
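
A minimal sketch of the implicit variant, assuming oslo.messaging; the pool
size and the conductor.N topic naming are illustrative assumptions, not
current nova behaviour:

    import random

    from oslo_config import cfg
    import oslo_messaging as messaging

    # Hypothetical pool of topics; each topic maps to its own broker
    # queue, so no single queue serialises all conductor traffic.
    TOPIC_POOL = ['conductor.%d' % i for i in range(8)]

    transport = messaging.get_transport(cfg.CONF)

    def call_conductor(context, method, **kwargs):
        # A random pick distributes the load implicitly; hashing the
        # host name instead would pin each client to one queue.
        target = messaging.Target(topic=random.choice(TOPIC_POOL))
        client = messaging.RPCClient(transport, target)
        return client.call(context, method, **kwargs)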

The message broker also provides message durability properties which are not
needed just for an rpc_call, so we spend resources on something we do not
actually need.

For TCP/HTTP traffic load balancing we have a great many tools, and even
hardware-assisted options are available, providing virtually unlimited
scalability.
At the TCP level it is also possible to exclude the load balancer node(s) from
the response traffic.

Why HTTP?
Basically any protocol which can do a request/response exchange with arbitrary
types and sizes of data, with keep-alive connections and an ssl option, could
be used.
HTTP is a simple and well-known protocol with a great many existing load
balancing tools; a sketch of what such a call could look like follows.
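
A minimal client-side sketch, assuming a hypothetical /rpc endpoint on the
conductor and a JSON payload format (both are illustrative assumptions, not
an existing interface):

    import json

    import requests

    # A Session reuses the TCP connection (keep-alive); an https://
    # URL would give the ssl option without client code changes.
    session = requests.Session()

    def http_rpc_call(url, method, args):
        payload = {'method': method, 'args': args}
        resp = session.post(url, data=json.dumps(payload),
                            headers={'Content-Type': 'application/json'},
                            timeout=60)
        resp.raise_for_status()
        return resp.json()['result']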

Why not have the agents do a regular API call?
Regular API calls need to do a policy check, which in this case is not
required: every authenticated user can be considered an admin.

The conductor clients need to use at least a single shared key configured on
every nova host.
This has security similar to what openstack already has with the brokers:
basically all nova nodes have credentials for one rabbitmq virtual host,
configured in /etc/nova/nova.conf. If any of those credentials is stolen, it
provides access to the whole virtual host. A sketch of shared-key request
signing follows.
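
A sketch of how a single shared key could authenticate the calls, assuming an
HMAC signature sent alongside each request body (the scheme is an illustrative
assumption):

    import hashlib
    import hmac

    # The same key would be configured on every nova host, much like
    # the rabbitmq credentials are today.
    SHARED_KEY = b'shared key from /etc/nova/nova.conf'

    def sign(body):
        return hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()

    def verify(body, signature):
        # constant-time comparison, to avoid timing side channels
        return hmac.compare_digest(sign(body), signature)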

NOTE: HTTPS can be used with certificate or kerberos based
authentication as well.
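
For the certificate case, a client-side sketch with requests, reusing the
session from the earlier sketch (the paths are placeholders):

    session.cert = ('/etc/nova/pki/client.crt',
                    '/etc/nova/pki/client.key')
    session.verify = '/etc/nova/pki/ca.pem'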


I think that for `rpc_calls` which are served by the agents, using AMQP is
still the better option; this bug is just about the situation when the
conductor itself serves the rpc_call(s).

NOTE: The 1 million msg/sec rabbitmq benchmark was done with 186 queues, in
a way which does not hit the single-queue limitations.

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1438113

Title:
  Use plain HTTP listeners in the conductor
