TNonblockingServer: Refactor to allow multiple IO
-------------------------------------------------
Key: THRIFT-1442
URL: https://issues.apache.org/jira/browse/THRIFT-1442
Project: Thrift
Issue Type: Improvement
Components: C++ - Library
Reporter: Dave Watson
Priority: Minor
Attachments: 0001-TNonblockingServer-Refactor-to-allow-multiple-IO-thr.patch

From 04fc8cbc24c64e1b68a23a1df2c46056785c269d Mon Sep 17 00:00:00 2001
From: Mark Rabkin <mrab...@fb.com>
Date: Tue, 18 May 2010 22:16:09 +0000
Subject: [PATCH 01/56] TNonblockingServer: Refactor to allow multiple IO threads, not just one

davejwatson: This diff adds multiple IO threads to TNonblockingServer. We use it extensively, and it is well tested apart from possible merge errors. This diff reverts THRIFT-1184, which allowed re-use of an existing event base. With multiple IO threads, re-using a single event base doesn't make sense, so this seemed OK.

Summary:
The diff creates multiple IO threads at startup. For simplicity, the number of threads is fixed at server start and cannot change. The first thread (id = 0) also doubles as the listen/accept thread, so there is still only a single thread doing accepts. The other IO threads listen only on their notification pipe and the actual connection sockets. Also for simplicity, each accepted connection is assigned in round-robin fashion to the next IO thread and lives on that IO thread permanently.

Note that there are only trivial changes to TConnection to get it to work, so most of the tricky state transitions and buffer management are unchanged. There is still a single worker pool for processing tasks, so that code is unchanged as well.

The trickiest part of the diff, and the part requiring the most careful review, is the new synchronization code in TNonblockingServer that manages the connection stack and the counters of active/inactive connections. We now lock a mutex when incrementing/decrementing server counters, which is less than ideal for extremely high-QPS servers -- should I switch to atomic ops?
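To make the round-robin assignment and the atomic-ops question concrete, here is a minimal sketch of what lock-free bookkeeping could look like. It is not the code from the patch: the class name ConnectionCounters and all of its members are hypothetical, and it covers only the counters. The connection stack itself would still need a mutex or a different structure.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the server's bookkeeping; not the actual
// TNonblockingServer implementation.
class ConnectionCounters {
 public:
  explicit ConnectionCounters(std::size_t numIOThreads)
      : numIOThreads_(numIOThreads) {
    assert(numIOThreads > 0);  // the thread count is fixed at server start
  }

  // Picks the IO thread for a newly accepted connection, round-robin.
  // With a single accept thread a plain counter would also work; the
  // atomic only matters if more than one thread ever calls this.
  std::size_t nextIOThread() {
    return nextThread_.fetch_add(1, std::memory_order_relaxed) % numIOThreads_;
  }

  // Called from whichever IO thread owns the connection.
  void connectionOpened() { active_.fetch_add(1, std::memory_order_relaxed); }
  void connectionClosed() { active_.fetch_sub(1, std::memory_order_relaxed); }

  std::size_t activeConnections() const {
    return active_.load(std::memory_order_relaxed);
  }

 private:
  const std::size_t numIOThreads_;
  std::atomic<std::size_t> nextThread_{0};
  std::atomic<std::size_t> active_{0};
};
```

Relaxed ordering is enough here only because the counters are treated as statistics. If another thread made decisions that depend on a counter together with other shared state, stronger ordering or the existing mutex would be the safer choice.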
One important change here is that while connections are created and initialized by the listen thread (IO thread #0), they may now be assigned to an event_base owned by a different IO thread. To work safely, TConnection::init() no longer calls setFlags(); instead, it immediately calls TConnection::notifyIOThread(). This results in a notification-fd event in the correct event base, which then calls TConnection::transition(), which sets up the correct read flags. This means that a TConnection now calls notifyIOThread() once upon creation and assignment to its event_base, and again after each processing call completes.

TNonblockingServer: Allow high-priority scheduler for IO threads

Summary:
Adds a boolean option to TNonblockingServer to use sched_setscheduler() to set a high scheduling priority for the IO thread(s). This is a POSIX API and should be safe on most operating systems. Please let me know if this is a known terrible idea, but we're experimenting to see if this helps in the situation where we have 40 worker threads and 1 IO thread, and the IO thread doesn't get scheduled nearly often enough.

Reviewers: dreiss, edhall, putivsky

Test Plan:
Need to work with Ed to run his capacity-loadtesting scripts to verify performance.
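For reference, a rough sketch of how the high-priority option might call sched_setscheduler(). This is an illustration under assumptions, not the code from the patch: the function names are hypothetical, and SCHED_FIFO with a mid-range priority is just one plausible choice. The call typically fails with EPERM unless the process has the needed privileges, so the caller should treat failure as non-fatal.

```cpp
#include <cassert>
#include <cerrno>
#include <sched.h>

// Hypothetical helper: a priority halfway between the policy's minimum
// and maximum, so we do not claim the very top of the real-time range.
int midpointPriority(int policy) {
  const int lo = sched_get_priority_min(policy);
  const int hi = sched_get_priority_max(policy);
  assert(lo <= hi);
  return lo + (hi - lo) / 2;
}

// Hypothetical wrapper, meant to be called from the IO thread itself.
// Returns 0 on success, otherwise the errno value from the failed call.
int setIOThreadHighPriority(bool enable) {
  sched_param sp{};          // zero priority, as SCHED_OTHER requires
  int policy = SCHED_OTHER;  // the default time-sharing policy
  if (enable) {
    policy = SCHED_FIFO;     // assumption: real-time FIFO policy
    sp.sched_priority = midpointPriority(policy);
  }
  // pid 0 means "the caller". On Linux this affects the calling thread;
  // POSIX describes it as applying to the calling process.
  if (sched_setscheduler(0, policy, &sp) == 0) {
    return 0;
  }
  return errno;
}
```

A server would probably log a warning and carry on with default scheduling when the return value is nonzero, since an unprivileged process cannot switch to a real-time policy on most systems.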