Signed-off-by: Guido Trotter <[email protected]>
---
 doc/design-2.2.rst |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/doc/design-2.2.rst b/doc/design-2.2.rst
index ab0a8bd..c18e7a7 100644
--- a/doc/design-2.2.rst
+++ b/doc/design-2.2.rst
@@ -33,6 +33,68 @@ As for 2.1 we divide the 2.2 design into three areas:
 Core changes
 ------------
 
+Master Daemon Scaling improvements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Current state and shortcomings
+++++++++++++++++++++++++++++++
+
+Currently the Ganeti master daemon is based on four sets of threads:
+
+- The main thread (1) just accepts connections on the master socket
+- The client worker pool (16 threads) handles those connections,
+  one thread per connected socket, parses luxi requests, and sends data
+  back to the clients
+- The job queue worker pool (25) executes the actual jobs submitted by
+  the clients
+- The rpc worker pool (10) interacts with the nodes via http-based-rpc
+
+This means that every masterd currently runs 52 threads to do its job.
+Being able to reduce this number would make the master a lot simpler.
+Also, even with this many threads, masterd suffers from quite a few
+scalability issues:
+
+- Since the 16 client worker threads handle one connection each, it's
+  very easy to exhaust them, just by connecting to masterd 16 times.
+  While we could perhaps make those pools resizable, increasing the
+  number of threads won't help with lock contention.
+- Some luxi operations (in particular REQ_WAIT_FOR_JOB_CHANGE) make the
+  relevant client thread block on its job for a relatively long time.
+  This makes it easier to exhaust the 16 client threads.
+- The luxi lock is quite heavily contended, and certain easily
+  reproducible workloads show that it's very easy to put masterd in
+  trouble: for example, running ~15 background instance reinstall jobs
+  results in a master daemon that, even without having exhausted the
+  client worker threads, can't answer simple job list requests or
+  submit more jobs.
+
+Proposed changes
+++++++++++++++++
+
+In order to fix the above issues, for Ganeti 2.2, we propose the
+following core changes:
+
+- The main thread of masterd is moved to asyncore (so it can share the
+  mainloop code with all other ganeti daemons) and handles all client
+  connections.
+- The REQ_WAIT_FOR_JOB_CHANGE luxi request is changed to be
+  subscription-based, so that the executing thread isn't tied up
+  while waiting for changes to arrive.
+- The job queue lock is reviewed to decrease its contention, making the
+  job queue more interactive.
+
+With these changes it should be possible to interact with the master
+daemon even when it's under heavy load, and it will also be simpler to
+add core functionality such as an asynchronous rpc client and internal
+timers to avoid master client timeouts (luxi-level keepalives).
+
+The first two changes alone should be enough to reduce the size of the
+client worker pool from 16 to at most 4 or 5 threads (although the
+optimal number needs to be determined in practice), and if the rpc
+client can be made asynchronous as well, masterd should need far fewer
+threads, and thus become easier to understand, debug, and scale.
+
+
 Remote procedure call timeouts
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-- 
1.7.1
