Hi guys,

we have a set of JIRA issues referring to a well-known bug in Java 5, 6 and 7 (up to b55 for Java 7). Basically, there is a nasty bug in the select() method. The issue is described in http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933 :

" This is an issue with poll (and epoll) on Linux. If a file descriptor for a connected socket is polled with a request event mask of 0, and if the connection is abruptly terminated (RST) then the poll wakes up with the POLLHUP (and maybe POLLERR) bit set in the returned event set. The implication of this behaviour is that Selector will wakeup and as the interest set for the SocketChannel is 0 it means there aren't any selected events and the select method returns 0."

I have baked a small patch against this problem. The idea is to check whether select(timeout) returns too quickly. It would have been easier if we simply used select() in the IoProcessor, but sadly we use this timeout so that idle sessions can be detected in this loop (a major mistake, IMO). However...

Here is the proposed solution :

               for (;;) {
                   long t0 = System.currentTimeMillis();
                   int selected = select(SELECT_TIMEOUT);

                   long t1 = System.currentTimeMillis();
                   if (selected == 0) {
                       // Nothing selected, and we came back in less than 100 ms :
                       // we are probably hitting the bug
                       if ((t1 - t0) < 100) {
                           // Switch the selectors
                           registerNewSelector();
                       }
                   }

                   // process the selected keys now ...
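
The body of registerNewSelector() is not shown here. As a rough sketch of what "switching the selectors" could look like (this is an assumption, not the code in the branch ; SelectorSwap and rebuild() are names made up for the illustration) : open a new Selector, register every channel of the old one on it with the same interest ops and attachment, then close the old one.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical helper, for illustration only.
final class SelectorSwap {
    /**
     * Moves all the valid registrations from oldSelector to a new Selector,
     * closes oldSelector and returns the replacement. Must be called from the
     * thread that owns the select loop, with no concurrent registration going on.
     */
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();

        for (SelectionKey key : oldSelector.keys()) {
            if (!key.isValid()) {
                // The key was cancelled or its channel closed : nothing to move
                continue;
            }

            SelectableChannel channel = key.channel();
            // A channel can be registered on more than one selector,
            // so we can register on the new one before dropping the old one
            channel.register(newSelector, key.interestOps(), key.attachment());
        }

        // Closing the old selector cancels all its remaining keys
        oldSelector.close();
        return newSelector;
    }
}
```

The caller would then replace its selector field with the returned instance before going back to select().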


OK, so far so good, but it's not enough. Another reason select(SELECT_TIMEOUT) might return early is that some other thread called selector.wakeup(). We have to deal with that. I have added a flag, false by default and set to true by the wakeup() method, so that we can be sure we are really hitting the NIO bug. The code looks like :

           for (;;) {
               try {
                   long t0 = System.currentTimeMillis();
                   int selected = select(SELECT_TIMEOUT);

                   synchronized(wakeupCalled) {
                       long t1 = System.currentTimeMillis();
                       if (selected == 0) {
                           // Only suspect the NIO bug if nobody called wakeup()
                           if ( ! wakeupCalled.get()) {
                               if ((t1 - t0) < 100) {
                                   registerNewSelector();
                               }
                           }
                       }
                       // Reset the flag for the next iteration
                       wakeupCalled.getAndSet(false);
                   }

                   nSessions += handleNewSessions();

and in the wakeup() method :

   protected void wakeup() {
       synchronized(wakeupCalled) {
           wakeupCalled.getAndSet(true);
           selector.wakeup();
       }
   }
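
To see why the flag is needed, here is a small standalone sketch (not part of the patch ; WakeupDemo and selectWithWakeup() are hypothetical names, and it uses Java 8 syntax for brevity) showing that a plain wakeup() from another thread makes select(timeout) return 0 well before the timeout, which looks exactly like the bug if we only check the elapsed time.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical demo class, for illustration only.
final class WakeupDemo {
    /**
     * Runs select(timeoutMillis) on an empty selector while another thread
     * calls wakeup() after wakeupAfterMillis.
     * Returns { number of selected keys, elapsed time in ms }.
     */
    static long[] selectWithWakeup(long timeoutMillis, long wakeupAfterMillis)
            throws IOException, InterruptedException {
        try (Selector selector = Selector.open()) {
            Thread waker = new Thread(() -> {
                try {
                    Thread.sleep(wakeupAfterMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                // A legitimate wakeup, unrelated to the NIO bug
                selector.wakeup();
            });

            long t0 = System.nanoTime();
            waker.start();
            int selected = selector.select(timeoutMillis);
            long elapsedMillis = (System.nanoTime() - t0) / 1_000_000L;

            // Wait for the waker before the selector gets closed
            waker.join();
            return new long[] { selected, elapsedMillis };
        }
    }
}
```

Calling WakeupDemo.selectWithWakeup(5000, 50) should come back with 0 selected keys long before the 5 second timeout, so without the wakeupCalled flag such a return could be mistaken for the bug and trigger a needless selector switch.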

I have created a branch (select-fix) for that. Please test it and give me some feedback !

Thanks !

--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org