On Tue, May 18, 2010 at 04:44:15PM +0100, Guido Trotter wrote:
> Signed-off-by: Guido Trotter <[email protected]>
> ---
>  doc/design-2.2.rst |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 62 insertions(+), 0 deletions(-)
>
> diff --git a/doc/design-2.2.rst b/doc/design-2.2.rst
> index ab0a8bd..c18e7a7 100644
> --- a/doc/design-2.2.rst
> +++ b/doc/design-2.2.rst
> @@ -33,6 +33,68 @@ As for 2.1 we divide the 2.2 design into three areas:
>  Core changes
>  ------------
>
> +Master Daemon Scaling improvements
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Current state and shortcomings
> +++++++++++++++++++++++++++++++
> +
> +Currently the Ganeti master daemon is based on four sets of threads:
> +
> +- The main thread (1) just accepts connections on the master socket
> +- The client worker pool (16 threads) handles those connections, one
> +  thread per connected socket, parses luxi requests, and sends data
> +  back to the clients
> +- The job queue worker pool (25 threads) executes the actual jobs
> +  submitted by the clients
> +- The rpc worker pool (10 threads) interacts with the nodes via
> +  http-based rpc
> +
> +This means that every masterd currently runs 52 threads to do its job.
> +Being able to reduce this number would make the master a lot simpler.
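As an aside, the event-driven model the proposal moves toward (one thread
multiplexing the master socket and all client sockets, instead of one thread
per client) can be sketched with Python's stdlib `selectors` module — used
here purely as a stand-in for asyncore; all names and the trivial ACK
protocol are invented for the sketch and are not Ganeti code:

```python
import selectors
import socket

# Illustrative sketch only: a single thread dispatches both new
# connections and client requests, so the number of threads no longer
# grows with the number of connected clients.
sel = selectors.DefaultSelector()

def accept_client(server_sock):
    # The listening socket is readable: accept and watch the new client.
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle_request)

def handle_request(conn):
    # A client socket is readable: read one request, reply, stay registered.
    data = conn.recv(4096)
    if not data:                      # client closed the connection
        sel.unregister(conn)
        conn.close()
        return
    conn.sendall(b"ACK " + data)      # stand-in for luxi request handling

def run_once(timeout=1.0):
    # One iteration of the main loop: dispatch whatever is ready.
    for key, _events in sel.select(timeout):
        key.data(key.fileobj)

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))
server.listen(5)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept_client)
```

The point of the model is that idle connections cost a selector entry, not a
worker thread, so "connect 16 times" no longer starves the daemon.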
I think you mean reducing the number of thread *sets*, not threads,
would simplify the architecture.

> +Also, even with this big number of threads masterd suffers from quite a
> +few scalability issues:
> +
> +- Since the 16 client worker threads handle one connection each, it's
> +  very easy to exhaust them, by just connecting to masterd 16 times.
> +  While we could perhaps make those pools resizable, increasing the
> +  number of threads won't help with lock contention.
> +- Some luxi operations (in particular REQ_WAIT_FOR_JOB_CHANGE) make the
> +  relevant client thread block on its job for a relatively long time.
> +  This makes it easier to finish the 16 client threads.

s/finish/exhaust/

> +- The luxi lock is quite heavily contended, and certain easily

Hmm, what luxi lock?

> +  reproducible workloads show that it's very easy to put masterd in
> +  trouble: for example running ~15 background instance reinstall jobs
> +  results in a master daemon that, even without having exhausted the
> +  client worker threads, can't answer simple job list requests or
> +  submit more jobs.

I'd like to understand better how this happens. Do you have more info?

> +Proposed changes
> +++++++++++++++++
> +
> +In order to fix the above issues, for Ganeti 2.2, we propose the
> +following core changes:
> +
> +- The main thread of masterd is moved to asyncore (so it can share the
> +  mainloop code with all other ganeti daemons) and handles all client
> +  connections.
> +- The REQ_WAIT_FOR_JOB_CHANGE luxi request is changed to be
> +  subscription-based, so that the executing thread doesn't have to be
> +  hogged while changes arrive.

What do you mean "subscription-based"?

> +- The job queue lock is reviewed to decrease its contention, making the
> +  job queue more interactive.
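To make "subscription-based" concrete, here is one possible reading, sketched
in plain Python: the job keeps a list of subscriber callbacks and fires them
on each status change, so no client worker thread has to sit blocked inside
REQ_WAIT_FOR_JOB_CHANGE waiting for the change to happen. All class and
method names below are hypothetical, not the actual luxi implementation:

```python
# Hypothetical sketch of subscription-based job-change notification:
# the caller registers a callback and its thread returns to the pool
# immediately, instead of blocking until the job changes state.
class Job:
    def __init__(self, job_id):
        self.job_id = job_id
        self.status = "queued"
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callback to be run on every status change."""
        self._subscribers.append(callback)

    def set_status(self, new_status):
        """Change the job status and notify all subscribers."""
        self.status = new_status
        for callback in list(self._subscribers):
            callback(self)

changes = []
job = Job(42)
job.subscribe(lambda j: changes.append(j.status))
job.set_status("running")
job.set_status("success")
# changes is now ["running", "success"], collected without any
# dedicated waiter thread being blocked on the job
```

Whether the notification is delivered as a callback, a queued luxi message, or
something else is exactly the kind of detail the design doc should spell out.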
> +
> +With these changes it should be possible to interact with the master
> +daemon even when it's under heavy load, and it will also be simpler to
> +add core functionality such as: asynchronous rpc client, internal timers
> +to avoid master client timeouts (luxi level keepalives).
> +
> +Only the first two changes should be enough to reduce the size of the
> +client worker pool from 16 to ~4/5 threads maximum (although the perfect
> +number needs to be tested in practice) and if the rpc client can be
> +moved to be asynchronous as well, masterd should become a lot smaller in
> +number of threads, and thus also easier to understand, debug, and scale.

Hmm… What gain is there in reducing this number?

iustin
