[ https://issues.apache.org/jira/browse/THRIFT-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavlin Radoslavov updated THRIFT-1442:
--------------------------------------

    Attachment: thrift-jira-1442-nonblocking-server-single-thread-2013-07-04.patch

Regenerating the patch for the non-blocking single thread issue.
The patch is for the head of the git master tree as of 2013-07-04.

Any chance this patch will be evaluated and merged?
It has been almost a year since the issue was reopened and I submitted
a patch to fix it. I am beginning to lose state about the issue and the
implementation details.

> TNonblockingServer: Refactor to allow multiple IO Threads
> ---------------------------------------------------------
>
>                 Key: THRIFT-1442
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1442
>             Project: Thrift
>          Issue Type: Improvement
>          Components: C++ - Library
>            Reporter: Dave Watson
>            Assignee: Dave Watson
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-TNonblockingServer-Refactor-to-allow-multiple-IO-thr.patch, nonblocking_unsigned.patch, thrift-jira-1442-nonblocking-server-single-thread-2013-07-04.patch
>
>
> From 04fc8cbc24c64e1b68a23a1df2c46056785c269d Mon Sep 17 00:00:00 2001
> From: Mark Rabkin <mrab...@fb.com>
> Date: Tue, 18 May 2010 22:16:09 +0000
> Subject: [PATCH 01/56] TNonblockingServer: Refactor to allow multiple IO threads, not just one
>
> davejwatson: This diff adds multiple IO threads to TNonblockingServer.
> We use it extensively, it's pretty well tested other than merge errors.
> This diff reverts THRIFT-1184, which allowed re-use of an existing
> event base. With multiple IO threads, re-using a single event base doesn't
> make sense, so this seemed ok.
>
> Summary:
> The diff creates multiple IO threads at startup -- the number of threads in
> this diff is fixed at server start and cannot change, for simplicity. The
> first thread (id = 0) also doubles as the listen/accept thread, so there is
> still only a single thread doing accepts. The other IO threads listen only
> on their notification pipe and the actual connection sockets.
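
For readers following along, the thread layout described in the summary above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: `IoThreadInfo` and `makeIoThreadLayout` are hypothetical names invented here, and the real TNonblockingServer also owns an event_base and a notification pipe per thread, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the Thrift implementation. It only models the
// roles described in the summary: a fixed number of IO threads, of which
// thread 0 also acts as the listen/accept thread.
struct IoThreadInfo {
  std::size_t id;           // position in the pool, fixed at server start
  bool acceptsConnections;  // true only for IO thread 0
};

// Build the layout once at startup; the count cannot change afterwards.
inline std::vector<IoThreadInfo> makeIoThreadLayout(std::size_t numIoThreads) {
  assert(numIoThreads > 0);  // at least the listen thread must exist
  std::vector<IoThreadInfo> layout;
  layout.reserve(numIoThreads);
  for (std::size_t i = 0; i < numIoThreads; ++i) {
    // Only the first thread listens on the server socket; the others would
    // watch just their notification pipe and their connection sockets.
    layout.push_back(IoThreadInfo{i, i == 0});
  }
  return layout;
}
```

With `makeIoThreadLayout(4)`, only the entry with id 0 has `acceptsConnections` set, which matches the "single thread doing accepts" point in the description.
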
> Also, for simplicity, each accepted connection is simply assigned in a
> round-robin fashion to the next IO thread and lives on that IO thread
> permanently.
>
> Note that there are only trivial changes to TConnection to get it to work, so
> most of the tricky state transitions and buffer management are unchanged.
> There is still a single worker pool for processing tasks, so that code is
> unchanged as well.
>
> The trickiest part of the diff requiring the most careful review is the new
> synchronization code in the TNonblockingServer to manage the connection stack
> and counters of active/inactive connections. We now lock a mutex when
> incrementing/decrementing server counters, which is less than ideal for
> extremely high-QPS servers -- should I switch to atomic ops?
>
> One important change here is that while connections are created and
> initialized by the listen thread (IO thread #0), they may now be assigned to
> an event_base owned by a different IO thread. To work safely,
> TConnection::init() no longer calls setFlags() - instead, it immediately
> calls TConnection::notifyIOThread(). This results in a notification-fd event
> in the correct event base, which then calls TConnection::transition(), which
> sets up the correct read flags. This means that a TConnection now calls
> notifyIOThread() once upon creation and assignment to its event_base, and
> again after each processing call completes.
>
> TNonblockingServer: Allow high-priority scheduler for IO threads
>
> Summary:
> Adds a boolean option to TNonblockingServer to use sched_setscheduler() to set
> a high scheduling priority for the IO thread(s) -- this is a POSIX API and
> should be safe on most operating systems.
>
> Please let me know if this is a known terrible idea, but we're experimenting
> to see if this helps the situation where we have 40 worker threads and 1 IO
> thread and the IO thread doesn't get scheduled nearly often enough.
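
On the round-robin assignment and the mutex-versus-atomic-ops question raised in the summary above, here is a minimal sketch of what the atomic variant might look like. It is an assumption-laden illustration, not the patch's code: `ConnectionBookkeeping` and its methods are hypothetical names, it assumes a C++11 compiler with `std::atomic`, and it covers only the counters. The connection stack mentioned in the description would still need its own synchronization.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sketch; none of these names exist in TNonblockingServer.
class ConnectionBookkeeping {
 public:
  explicit ConnectionBookkeeping(std::size_t numIoThreads)
      : numIoThreads_(numIoThreads), nextIoThread_(0), activeConnections_(0) {
    assert(numIoThreads > 0);
  }

  // Pick the IO thread for a newly accepted connection, round-robin.
  // Assumed to be called only from the listen thread (IO thread #0),
  // so the cursor itself is a plain integer without locking.
  std::size_t assignIoThread() {
    std::size_t chosen = nextIoThread_;
    nextIoThread_ = (nextIoThread_ + 1) % numIoThreads_;
    return chosen;
  }

  // These may be called from any IO thread. Atomic increments replace the
  // mutex for the counter updates; relaxed ordering is enough here because
  // the value is only used as a statistic, not to publish other data.
  void connectionOpened() {
    activeConnections_.fetch_add(1, std::memory_order_relaxed);
  }
  void connectionClosed() {
    activeConnections_.fetch_sub(1, std::memory_order_relaxed);
  }
  std::size_t activeConnections() const {
    return activeConnections_.load(std::memory_order_relaxed);
  }

 private:
  const std::size_t numIoThreads_;
  std::size_t nextIoThread_;                    // listen thread only
  std::atomic<std::size_t> activeConnections_;  // shared across IO threads
};
```

With three IO threads, successive calls to `assignIoThread()` yield 0, 1, 2, 0, and so on. Whether this is actually faster than the mutex at high QPS would have to be measured; the sketch only shows that the counter updates themselves do not require a lock.
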
>
> Reviewers: dreiss,edhall,putivsky
>
> Test Plan: Need to work with Ed to run his capacity-loadtesting scripts to
> verify performance.