Signed-off-by: Guido Trotter <[email protected]>
---
 doc/design-2.2.rst |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/doc/design-2.2.rst b/doc/design-2.2.rst
index ab0a8bd..c18e7a7 100644
--- a/doc/design-2.2.rst
+++ b/doc/design-2.2.rst
@@ -33,6 +33,68 @@ As for 2.1 we divide the 2.2 design into three areas:
 Core changes
 ------------
 
+Master Daemon Scaling improvements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Current state and shortcomings
+++++++++++++++++++++++++++++++
+
+Currently the Ganeti master daemon is based on four sets of threads:
+
+- The main thread (1) just accepts connections on the master socket
+- The client worker pool (16) handles those connections, one thread
+  per connected socket, parses luxi requests, and sends data back to
+  the clients
+- The job queue worker pool (25) executes the actual jobs submitted by
+  the clients
+- The rpc worker pool (10) interacts with the nodes via http-based-rpc
+
+This means that every masterd currently runs 52 threads to do its job.
+Being able to reduce this number would make the master a lot simpler.
+Also, even with this large number of threads, masterd suffers from
+several scalability issues:
+
+- Since the 16 client worker threads handle one connection each, it's
+  very easy to exhaust them by just connecting to masterd 16 times.
+  While we could perhaps make those pools resizable, increasing the
+  number of threads won't help with lock contention.
+- Some luxi operations (in particular REQ_WAIT_FOR_JOB_CHANGE) make the
+  relevant client thread block on its job for a relatively long time.
+  This makes it easier to exhaust the 16 client threads.
+- The luxi lock is quite heavily contended, and certain easily
+  reproducible workloads show that it's very easy to put masterd in
+  trouble: for example, running ~15 background instance reinstall jobs
+  results in a master daemon that, even without having exhausted the
+  client worker threads, can't answer simple job list requests or
+  submit more jobs.
+
+Proposed changes
+++++++++++++++++
+
+In order to fix the above issues, for Ganeti 2.2, we propose the
+following core changes:
+
+- The main thread of masterd is moved to asyncore (so it can share the
+  mainloop code with all other ganeti daemons) and handles all client
+  connections.
+- The REQ_WAIT_FOR_JOB_CHANGE luxi request is changed to be
+  subscription-based, so that the executing thread doesn't have to be
+  hogged while changes arrive.
+- The job queue lock is reviewed to decrease its contention, making the
+  job queue more interactive.
+
+With these changes it should be possible to interact with the master
+daemon even when it's under heavy load, and it will also be simpler to
+add core functionality such as an asynchronous rpc client and internal
+timers to avoid master client timeouts (luxi level keepalives).
+
+The first two changes alone should be enough to reduce the size of the
+client worker pool from 16 to ~4-5 threads maximum (although the ideal
+number needs to be tested in practice). If the rpc client can be moved
+to be asynchronous as well, masterd should need far fewer threads, and
+thus become easier to understand, debug, and scale.
+
+
 Remote procedure call timeouts
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-- 
1.7.1
