Hello All,

it is possible that you are seeing the same problem that I observed with my ulevpoll backend for the epoll syscall.

The kernel's wait queue processing does not distinguish between POLLIN and POLLOUT when both events are reported through a single wait queue. For poll and select, the layer that reports events back to userspace rechecks the condition: it calls the poll file operation of the device/socket, finds that there is no real event, and the kernel does not return to userspace, so the behavior visible to userspace is correct. Some cycles are wasted on the unneeded context switch in the kernel and on the recheck, but not much more.

The problem appears in the epoll case, because the kernel does not perform the full recheck before it sets POLLIN on the epoll fd itself. This means that userspace is woken up. The call that obtains the list of active events does perform the full check of the conditions, and because no match is found, the kernel returns 0.

In my case the problem showed up when I enabled debugging that prints the result of each epoll wait call. I had registered fd 0 (console input) for POLLIN only, to be able to terminate the program or send it commands from the terminal. When printf wrote debugging information to the console, the write condition changed and caused POLLIN to be set on the epoll fd, even though fd 0 had been registered only for POLLIN. Because the epoll fd was active, the kernel finished the wait, but when the events were read there was no real event. My debugging code then wrote a report about that to the console, which set POLLIN on the epoll fd again => busy loop.

The problem was analyzed by Davide Libenzi, who provided a solution that allows the kernel to distinguish correctly between event types. It corrects the epoll behavior and has a significant positive effect on performance for some socket usage scenarios, even ones unrelated to epoll.

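The situation described above can be sketched in C roughly as follows. This is only an illustration, not the ulevpoll or libevent code: the helper names `watch_for_input` and `poll_once` and the `empty_wakeups` counter are hypothetical. The point it shows is that a pass in which epoll_wait reports 0 events is only counted, and nothing is written to the console in that branch, so the loop cannot retrigger itself on an affected kernel.

```c
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical helper: watch `fd` for read readiness (EPOLLIN) only,
 * the way fd 0 was registered in the scenario described above. */
static int watch_for_input(int epfd, int fd)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: one pass of the wait loop.
 * Returns whatever epoll_wait returned (-1 on error, with errno set). */
static int poll_once(int epfd, struct epoll_event *events, int max_events,
                     int timeout_ms, unsigned long *empty_wakeups)
{
    int n = epoll_wait(epfd, events, max_events, timeout_ms);

    if (n == 0) {
        /* Timeout, or a wakeup for which no registered event matched.
         * Only count it: no printf and no other I/O on registered fds
         * here, because a console write in this branch is what fed
         * the busy loop. */
        ++*empty_wakeups;
    }
    return n;
}
```

A caller could print the accumulated `empty_wakeups` value later, for example after a pass that returned real events or at exit, instead of logging every empty pass.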
The patch is integrated in mainline kernel 2.6.30+:

37e5540b3c9d838eb20f2ca8ea2eb8072271e403
PATCH: epoll keyed wakeups: make sockets use keyed wakeups

http://thread.gmane.org/gmane.linux.kernel/786236
http://article.gmane.org/gmane.linux.kernel/790696/match=epoll

You can easily check whether this is the cause of your troubles by running the same code on a 2.6.30+ kernel. If you need correct behavior on older kernels as well, it can be problematic. Basically, you must not do any I/O on, or make any changes related to, any of the FDs registered in epoll when an event count of 0 is reported.

Best wishes,

Pavel Pisa
e-mail: [email protected]
www: http://cmp.felk.cvut.cz/~pisa
university: http://dce.felk.cvut.cz/
company: http://www.pikron.com/

On Thursday 29 April 2010 18:35:30 Nick Mathewson wrote:
> On Thu, Apr 29, 2010 at 5:19 AM, Sebastian Sjöberg <[email protected]> wrote:
> > Hi,
> >
> > I've encountered a problem with openssl bufferevents where libevent
> > reports fd:s as writeable but no action is being taken.
> >
> > [...]
> >
> > There is no problem when I'm connecting without tls so I think this is an
> > issue with openssl bufferevents and my guess is that somehow the write
> > events that openssl bufferevents sets up sometimes doesn't get removed or
> > disabled properly.
> >
> > Is this an issue that someone else has seen and does anyone have any
> > pointers on how to debug this problem?
>
> I haven't run into this myself yet, but the openssl code is relatively
> new, and probably has some bugs left.
>
> To clarify, it seems that the problem is that Libevent bufferevent
> openssl code never deletes the relevant read events, even though it
> isn't actually interested in reading? Or the problem is that epoll is
> returning immediately but not making any events active?
>
> If it's the first problem, I'd try adding debugging messages to the
> points in bufferevent_openssl that call event_add, event_del, and
> _bufferevent_add_event, along with debugging statements to display the
> return values of SSL_read and SSL_write, to see at what point we're
> supposed to be deleting the relevant read event but not really doing
> it.
>
> If it's the second problem, I'd start by testing whether stuff begins
> to work when you set the EVENT_NOEPOLL environment variable. If so,
> then the bug is probably with the epoll backend -- or at least, it
> requires the epoll backend to appear. To debug this, I'd add
> debugging messages to the loop in epoll_dispatch that calls
> evmap_io_active to tell me whenever it decided not to call
> evmap_io_active, and I'd have evmap_io_active tell me whenever it made
> 0 events become active.
>
> With any luck, the debugging output should help figure out exactly
> what's going wrong here.
>
> I'm afraid I'm about to be away from the internet for tomorrow and the
> weekend, so I won't be able to help much more until early next week.
> Good luck!
>
> yrs,
> --
> Nick
> ***********************************************************************
> To unsubscribe, send an e-mail to [email protected] with
> unsubscribe libevent-users in the body.
